expression_ann (dCGP-ANN)
Important
This Cartesian Genetic Program is able to encode an Artificial Neural Network. Weights and biases are added to the acyclic graph, as well as extra methods to perform backpropagation (in parallel), to visualize the network, and more.
- class dcgpy.expression_ann(inputs, outputs, rows, cols, levels_back, arity=2, kernels, n_eph=0, seed=randint)
A dCGPANN expression
Constructs a CGP expression operating on double
- Parameters
  - inputs (int) – number of inputs
  - outputs (int) – number of outputs
  - rows (int) – number of rows of the cartesian representation of the expression as an acyclic graph.
  - cols (int) – number of columns of the cartesian representation of the expression as an acyclic graph.
  - levels_back (int) – number of levels-back allowed. This, essentially, controls the minimum number of allowed operations in the formula. If uncertain set it to cols + 1.
  - arity (int or list) – arity of the kernels. Assumed equal for all columns unless specified by a list. The list must contain a number of entries equal to the number of columns.
  - kernels (List[dcgpy.kernel_double]) – kernel functions
  - n_eph (int) – number of ephemeral constants. Their values and their symbols can be set via dedicated methods.
  - seed (int) – random seed to generate mutations and chromosomes
- eph_val
  Values of the ephemeral constants.
  - Type: list(double)
- eph_symb
  Symbols used for the ephemeral constants.
  - Type: list(str)
Examples:
>>> from dcgpy import *
>>> dcgp = expression_ann(1, 1, 1, 10, 11, 2, kernel_set_double(["sig", "tanh", "ReLu"])(), 0, 32)
>>> print(dcgp)
...
>>> num_out = dcgp([0.5])
>>> sym_out = dcgp(["x"])
- get_bias(node_id)
Gets a bias.
Note
Convention adopted for node numbering: http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf
- Parameters
  - node_id (int) – the id of the node
- Returns
  The value of the bias (a float)
- Raises
  ValueError – if node_id is not valid
- get_biases()
Gets all biases
- get_weight(node_id, input_id)
- get_weight(idx)
Gets a weight. Two overloads are available. You can get the weight specifying the node and the input id (that needs to be less than the arity), or directly specifying its position in the weight vector.
Note
Convention adopted for node numbering: http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf
- Parameters
  - node_id (int) – the id of the node
  - input_id (int) – the id of the node input (0 for the first one up to arity-1)
  - idx (int) – the position of the weight in the weight vector
- Returns
  The value of the weight (float)
- Raises
  ValueError – if node_id, input_id or idx are not valid
- get_weights()
Gets all weights
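The mapping between a (node_id, input_id) pair and a flat position idx in the weight vector is not spelled out here. The sketch below shows one plausible convention, consistent with the documented weight-vector size r*c*arity; it is an assumption for illustration, not dcgpy's guaranteed internal layout (node numbering follows Miller's convention linked above).

```python
# Illustrative only: assuming input nodes (ids 0..n_inputs-1) carry no
# weights, and each of the r*c function nodes stores its incoming
# weights contiguously, a (node_id, input_id) pair would map to a flat
# index like this. This matches the r*c*arity vector size documented
# for set_weights, but is an assumption, not dcgpy's actual code.
def weight_index(node_id, input_id, n_inputs, arity):
    assert input_id < arity, "input_id must be less than the arity"
    return (node_id - n_inputs) * arity + input_id
```
Under this convention, with 3 inputs and arity 2, the first function node (id 3) owns flat indices 0 and 1, the next node (id 4) owns 2 and 3, and so on.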
- n_active_weights(unique=False)
Computes the number of weights influencing the result. This will also be the number of weights that are updated when calling sgd. The number of active weights, as well as the number of active nodes, define the complexity of the expression expressed by the chromosome.
- Parameters
  - unique (bool) – when True, weights are counted only once if connecting the same two nodes.
- randomise_biases(mean=0, std=0.1, seed=randomint)
Randomises all the values for the biases using a normal distribution.
- Parameters
  - mean (float) – the mean of the normal distribution.
  - std (float) – the standard deviation of the normal distribution.
  - seed (int) – the random seed to use.
- randomise_weights(mean=0, std=0.1, seed=randomint)
Randomises all the values for the weights using a normal distribution.
- Parameters
  - mean (float) – the mean of the normal distribution.
  - std (float) – the standard deviation of the normal distribution.
  - seed (int) – the random seed to use.
- set_bias(node_id, bias)
Sets a bias.
Note
Convention adopted for node numbering: http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf
- Parameters
  - node_id (int) – the id of the node whose bias is being set
  - bias (float) – the new value of the bias
- Raises
ValueError – if node_id is not valid
- set_biases(biases)
Sets all biases.
- Parameters
  - biases (List[float]) – the new values of the biases
- Raises
  ValueError – if the input vector dimension is not valid (r*c)
- set_output_f(name)
Sets the nonlinearities of all nodes connected to the output nodes. This is useful when, for example, the dCGPANN is used for a regression task where output values are expected in [-1, 1], and hence the output layer should have a sigmoid or tanh nonlinearity, or in a classification task where one wants to have a softmax layer, obtained by having a sum in all output neurons.
- Parameters
  - name (string) – the kernel name
- Raises
  ValueError – if name is not one of the kernels in the expression.
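To make the softmax remark concrete: after set_output_f("sum"), the output neurons emit raw sums (logits), and the softmax normalization itself would be applied outside the expression. A plain NumPy sketch of that post-processing step (not a dcgpy API):

```python
import numpy as np

# Hypothetical post-processing, assuming the dCGPANN's output kernels
# were set to "sum": the raw summed outputs act as logits and softmax
# turns them into class probabilities.
def softmax(logits):
    z = np.asarray(logits, dtype=float)
    z = z - z.max()          # shift logits for numerical stability
    e = np.exp(z)
    return e / e.sum()       # probabilities summing to 1
```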
- set_weight(node_id, input_id, weight)
- set_weight(idx, weight)
Sets a weight. Two overloads are available. You can set the weight specifying the node and the input id (that needs to be less than the arity), or directly specifying its position in the weight vector.
Note
Convention adopted for node numbering: http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf
- Parameters
  - node_id (int) – the id of the node whose weight is being set
  - input_id (int) – the id of the node input (0 for the first one up to arity-1)
  - weight (float) – the new value of the weight
  - idx (int) – the index of the weight to be set
- Raises
  ValueError – if node_id, input_id or idx are not valid
- set_weights(weights)
Sets all weights.
- Parameters
  - weights (List[float]) – the new values of the weights
- Raises
  ValueError – if the input vector dimension is not valid (r*c*arity)
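As a quick sanity check of the vector dimensions documented for set_biases (r*c) and set_weights (r*c*arity) — plain arithmetic, not a dcgpy call:

```python
# Expected vector lengths per the ValueError conditions documented
# above: r*c biases (one per function node) and r*c*arity weights
# (one per node input).
def expected_sizes(rows, cols, arity):
    return rows * cols, rows * cols * arity

# e.g. for a 15x8 grid with arity 4
n_biases, n_weights = expected_sizes(15, 8, 4)  # 120 biases, 480 weights
```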
- sgd(points, labels, lr, batch_size, loss_type, parallel=0, shuffle=True)
Performs one epoch of mini-batch (stochastic) gradient descent updating the weights and biases using the points and labels to decrease the loss.
- Parameters
  - points (2D NumPy float array or list of lists of float) – the input data
  - labels (2D NumPy float array or list of lists of float) – the output labels (supervised signal)
  - lr (float) – the learning rate
  - batch_size (int) – the batch size
  - loss_type (str) – the loss, one of "MSE" for Mean Square Error and "CE" for Cross-Entropy.
  - parallel (int) – sets the grain for parallelism. 0 -> no parallelism; n -> divides the data into n parts and processes them in parallel threads
  - shuffle (bool) – when True it shuffles the points and labels before performing one epoch of training.
- Returns
  The average error across the batches (float). Note that this is only a proxy for the real loss on the whole data set.
- Raises
  ValueError – if points or labels are malformed or if loss_type is not one of the available types.
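The epoch that sgd performs can be pictured with the following minimal NumPy sketch on a plain linear model. This is illustrative only: dcgpy updates the network's weights and biases via backpropagation through the graph, and the function name here is hypothetical.

```python
import numpy as np

# A minimal sketch of one epoch of mini-batch SGD on a linear model
# (MSE loss), mirroring the loop that sgd() performs over the network
# parameters. Names and model are illustrative, not dcgpy internals.
def sgd_epoch(points, labels, w, b, lr=0.1, batch_size=4, shuffle=True, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    order = np.arange(len(points))
    if shuffle:
        rng.shuffle(order)               # shuffle before the epoch
    batch_losses = []
    for start in range(0, len(points), batch_size):
        batch = order[start:start + batch_size]
        x, y = points[batch], labels[batch]
        err = x @ w + b - y              # residuals on this mini-batch
        batch_losses.append(np.mean(err ** 2))
        # gradient step for the MSE loss
        w -= lr * 2 * x.T @ err / len(batch)
        b -= lr * 2 * err.mean()
    # averaging per-batch losses: only a proxy for the full-data loss
    return np.mean(batch_losses), w, b
```
Note how the returned value averages the per-batch losses computed *during* the epoch, which is why it is only a proxy for the real loss on the whole data set.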
- simplify(in_sym, subs_weights=False, erc=[])
Simplifies the symbolic output of dCGP expressions
Returns the simplified dCGP expression for each output
Note
This method requires the sympy and pyaudi modules to be installed in your Python system.
- Parameters
  - in_sym (List[str]) – input symbols (its length must match the number of inputs)
  - subs_weights (bool) – indicates whether to substitute the weight symbols with their values
  - erc (List[float]) – values of the ephemeral random constants (if empty their symbolic representation is used instead)
- Returns
  A List[str] containing the simplified expressions
- Raises
ValueError – if the length of in_sym does not match the number of inputs
ValueError – if the length of erc is larger than the number of inputs
ImportError – if modules sympy or pyaudi are not installed in your Python system
Examples
>>> ex = dcgpy.expression_weighted_gdual_double(3,2,3,3,2,2,dcgpy.kernel_set_gdual_double(["sum","diff"])(),0)
>>> print(ex.simplify(['x','c0','c1'],True,[1,2])[0])
x + 6
- visualize(show_connections=True, fill_color='w', show_nonlinearities=False, active_connection_alpha=0.1, inactive_connection_alpha=0.01, legend=True)
Visualizes the dCGPANN expression
- Parameters
  - show_connections (bool) – shows active connections between nodes
  - show_inactive (bool) – shows also inactive connections between nodes
  - active_connection_alpha (float) – the alpha used to show active connections
  - inactive_connection_alpha (float) – the alpha used to show inactive connections
  - fill_color (str or RGB values) – the fill color of all nodes
  - show_nonlinearities (bool) – color codes the nodes with the contained nonlinearity
  - legend (bool) – shows a legend (only when show_nonlinearities is also True)
Examples:
>>> from dcgpy import *
>>> nonlinearities = dcgpy.kernel_set_double(["sig", "ReLu", "tanh"])
>>> dcgpann = dcgpy.expression_ann(inputs=3, outputs=4, rows=15, cols=8, levels_back=2, arity=4, kernels=nonlinearities(), seed=32)
>>> dcgpann.randomise_weights()
>>> dcgpann.visualize(show_nonlinearities=False)