expression_ann (dCGP-ANN)

Important

This Cartesian Genetic Program can encode an Artificial Neural Network. Weights and biases are added to the acyclic graph, together with extra methods to perform backpropagation (in parallel), to visualize the network and more.

class dcgpy.expression_ann(inputs, outputs, rows, cols, levels_back, arity = 2, kernels, n_eph = 0, seed = randint)

A dCGPANN expression

Constructs a dCGPANN expression operating on double

Parameters
  • inputs (int) – number of inputs

  • outputs (int) – number of outputs

  • rows (int) – number of rows of the cartesian representation of the expression as an acyclic graph.

  • cols (int) – number of columns of the cartesian representation of the expression as an acyclic graph.

  • levels_back (int) – number of levels-back allowed. This, essentially, controls the minimum number of allowed operations in the formula. If uncertain set it to cols + 1

  • arity (int or list) – arity of the kernels. Assumed equal for all columns unless it is specified by a list. The list must contain a number of entries equal to the number of columns.

  • kernels (List[dcgpy.kernel_double]) – kernel functions

  • n_eph (int) – Number of ephemeral constants. Their values and their symbols can be set via dedicated methods.

  • seed (int) – random seed to generate mutations and chromosomes

eph_val

Values of the ephemeral constants.

Type

list(double)

eph_symb

Symbols used for the ephemeral constants.

Type

list(str)

Examples:

>>> from dcgpy import *
>>> dcgp = expression_ann(1, 1, 1, 10, 11, 2, kernel_set_double(["sig", "tanh", "ReLu"])(), 0, 32)
>>> print(dcgp)
...
>>> num_out = dcgp([1.2])
>>> sym_out = dcgp(["x"])
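
Continuing the example above, the sketch below illustrates the eph_val and eph_symb attributes; it assumes an expression constructed with one ephemeral constant via the n_eph argument of the signature above (a hypothetical setup), whose default value and symbol are chosen by the library.

>>> ann = expression_ann(1, 1, 1, 10, 11, 2, kernel_set_double(["sig", "tanh", "ReLu"])(), 1, 32)
>>> ann.eph_val   # list with the value of the single ephemeral constant
>>> ann.eph_symb  # list with its symbol
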
get_bias(node_id)

Gets a bias.

Note

Convention adopted for node numbering: http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf

Parameters

node_id (int) – the id of the node

Returns

The value of the bias (a float)

Raises

ValueError – if node_id is not valid

get_biases()

Gets all biases
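
A minimal sketch of the two getters, assuming the dcgpann instance constructed in the visualize example at the end of this page (inputs=3, rows=15, cols=8); under the node numbering convention linked above, the three input nodes come first, so id 3 should address the first internal node.

>>> b3 = dcgpann.get_bias(3)       # bias of the first internal node (id 3 follows the 3 inputs)
>>> biases = dcgpann.get_biases()  # all biases, presumably rows*cols values (mirroring set_biases)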

get_weight(node_id, input_id)
get_weight(idx) → float

Gets a weight. Two overloads are available. You can get the weight specifying the node and the input id (that needs to be less than the arity), or directly specifying its position in the weight vector.

Note

Convention adopted for node numbering: http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf

Parameters
  • node_id (int) – the id of the node

  • input_id (int) – the id of the node input (0 for the first one up to arity-1)

  • idx (int) – the position of the weight in the weight vector (second overload)

Returns

The value of the weight (float)

Raises

ValueError – if node_id or input_id or idx are not valid

get_weights()

Gets all weights
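
A sketch of the two overloads and of get_weights, again assuming the dcgpann instance from the visualize example at the end of this page (arity 4, so input_id runs from 0 to 3); the flat index of the second overload is taken to address the vector returned by get_weights.

>>> w = dcgpann.get_weight(3, 0)     # weight on the first input of node 3
>>> w0 = dcgpann.get_weight(0)       # first entry of the flat weight vector
>>> weights = dcgpann.get_weights()  # all weights, rows*cols*arity values in total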

n_active_weights(unique=False)

Computes the number of weights influencing the result. This will also be the number of weights that are updated when calling sgd. The number of active weights, together with the number of active nodes, defines the complexity of the expression encoded in the chromosome.

Parameters

unique (bool) – when True, weights connecting the same two nodes are counted only once.
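
For instance, on the dcgpann instance from the visualize example at the end of this page (the exact counts depend on the chromosome):

>>> dcgpann.n_active_weights()            # every weight feeding an active node
>>> dcgpann.n_active_weights(unique=True) # parallel connections between the same two nodes counted once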

randomise_biases(mean=0, std=0.1, seed=randomint)

Randomises all the values for the biases using a normal distribution.

Parameters
  • mean (float) – the mean of the normal distribution.

  • std (float) – the standard deviation of the normal distribution.

  • seed (int) – the random seed to use.

randomise_weights(mean=0, std=0.1, seed=randomint)

Randomises all the values for the weights using a normal distribution.

Parameters
  • mean (float) – the mean of the normal distribution.

  • std (float) – the standard deviation of the normal distribution.

  • seed (int) – the random seed to use.
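
A typical (re)initialization before training, assuming an expression_ann instance named dcgpann; mean, std and seed are the keyword arguments documented above.

>>> dcgpann.randomise_weights(mean=0.0, std=0.1, seed=42)
>>> dcgpann.randomise_biases(mean=0.0, std=0.01, seed=42)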

set_bias(node_id, bias)

Sets a bias.

Note

Convention adopted for node numbering: http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf

Parameters
  • node_id (int) – the id of the node whose bias is being set

  • bias (float) – the new value of the bias

Raises

ValueError – if node_id is not valid

set_biases(biases)

Sets all biases.

Parameters

biases (List[float]) – the new values of the biases

Raises

ValueError – if the input vector dimension is not valid (r*c)
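
A sketch of both setters for the dcgpann instance from the visualize example at the end of this page (rows=15, cols=8, hence 120 biases expected by set_biases):

>>> dcgpann.set_bias(3, 0.5)              # overwrite the bias of node 3 (the first internal node)
>>> dcgpann.set_biases([0.0] * (15 * 8))  # one value per internal node (r*c)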

set_output_f(name)

Sets the nonlinearities of all nodes connected to the output nodes. This is useful when, for example, the dCGPANN is used for a regression task where output values are expected in [-1, 1], and hence the output layer should have a sigmoid or tanh nonlinearity, or in a classification task where one wants a softmax layer, obtained by putting a sum in all output neurons.

Parameters

name (string) – the kernel name

Raises

ValueError – if name is not one of the kernels in the expression.
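
For example, with the nonlinearities used in the visualize example at the end of this page ("sig", "ReLu", "tanh"), a tanh output layer for regression can be set as follows; passing a name that is not among the expression kernels raises ValueError.

>>> dcgpann.set_output_f("tanh")  # all output nodes now use the tanh nonlinearity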

set_weight(node_id, input_id, weight)
set_weight(idx, weight) → None

Sets a weight. Two overloads are available. You can set the weight specifying the node and the input id (that needs to be less than the arity), or directly specifying its position in the weight vector.

Note

Convention adopted for node numbering: http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf

Parameters
  • node_id (int) – the id of the node whose weight is being set

  • input_id (int) – the id of the node input (0 for the first one up to arity-1)

  • weight (float) – the new value of the weight

  • idx (int) – the idx of weight to be set

Raises

ValueError – if node_id or input_id or idx are not valid

set_weights(weights)

Sets all weights.

Parameters

weights (List[float]) – the new values of the weights

Raises

ValueError – if the input vector dimension is not valid (r*c*arity)
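
The setters mirror the getters; the sketch assumes the dcgpann instance from the visualize example at the end of this page (rows=15, cols=8, arity=4, hence a flat weight vector of 480 entries).

>>> dcgpann.set_weight(3, 0, 0.25)             # node/input overload: first input of node 3
>>> dcgpann.set_weight(0, 0.25)                # flat-index overload: first entry of the weight vector
>>> dcgpann.set_weights([0.1] * (15 * 8 * 4))  # full vector, length r*c*arity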

sgd(points, labels, lr, batch_size, loss_type, parallel=0, shuffle=True)

Performs one epoch of mini-batch (stochastic) gradient descent updating the weights and biases using the points and labels to decrease the loss.

Parameters
  • points (2D NumPy float array or list of lists of float) – the input data

  • labels (2D NumPy float array or list of lists of float) – the output labels (supervised signal)

  • lr (float) – the learning rate

  • batch_size (int) – the batch size

  • loss_type (str) – the loss, one of “MSE” for Mean Square Error and “CE” for Cross-Entropy.

  • parallel (int) – sets the grain for parallelism: 0 means no parallelism, while n divides the data into n parts that are processed in parallel threads

  • shuffle (bool) – when True it shuffles the points and labels before performing one epoch of training.

Returns

The average error across the batches (float); note that this is only a proxy for the real loss on the whole data set.

Return type

float

Raises

ValueError – if points or labels are malformed or if loss_type is not one of the available types.
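
A hedged end-to-end sketch of one training epoch on random data, assuming the dcgpann instance from the visualize example at the end of this page (3 inputs, 4 outputs); the data here is purely illustrative and the returned value is the average batch error described above.

>>> import numpy as np
>>> X = np.random.randn(100, 3)  # 100 points with 3 inputs each
>>> Y = np.random.randn(100, 4)  # matching labels with 4 outputs each
>>> loss = dcgpann.sgd(X, Y, 0.1, 32, "MSE")  # lr=0.1, batch_size=32, default parallel and shuffle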

simplify(in_sym, subs_weights=False, erc=[])

Simplifies the symbolic output of dCGP expressions

Returns the simplified dCGP expression for each output

Note

This method requires the sympy and pyaudi modules to be installed in your Python system

Parameters
  • in_sym (a List[str]) – input symbols (its length must match the number of inputs)

  • subs_weights (a bool) – indicates whether to substitute the weight symbols with their values

  • erc (a List[float]) – values of the ephemeral random constants (if empty their symbolic representation is used instead)

Returns

A List[str] containing the simplified expressions

Raises
  • ValueError – if the length of in_sym does not match the number of inputs

  • ValueError – if the length of erc is larger than the number of inputs

  • ImportError – if modules sympy or pyaudi are not installed in your Python system

Examples

>>> import dcgpy
>>> ex = dcgpy.expression_weighted_gdual_double(3,2,3,3,2,2,dcgpy.kernel_set_gdual_double(["sum","diff"])(),0)
>>> print(ex.simplify(['x','c0','c1'],True,[1,2])[0])
x + 6
visualize(show_connections=True, fill_color='w', show_nonlinearities=False, active_connection_alpha=0.1, inactive_connection_alpha=0.01, legend=True)

Visualizes the dCGPANN expression

Parameters
  • show_connections (bool) – shows active connections between nodes

  • show_inactive (bool) – shows also the inactive connections between nodes

  • active_connection_alpha (float) – the alpha used to show active connections

  • inactive_connection_alpha (float) – the alpha used to show inactive connections

  • fill_color (str or RGB values) – the fill color of all nodes

  • show_nonlinearities (bool) – color codes the nodes with the contained nonlinearity

  • legend (bool) – shows a legend (only when show_nonlinearities is also True)

Examples:

>>> import dcgpy
>>> nonlinearities = dcgpy.kernel_set_double(["sig", "ReLu", "tanh"])
>>> dcgpann = dcgpy.expression_ann(inputs=3, outputs=4, rows=15, cols=8, levels_back=2, arity=4, kernels=nonlinearities(), seed=32)
>>> dcgpann.randomise_weights()
>>> dcgpann.visualize(show_nonlinearities=False)