expression_ann (dCGP-ANN)

This class represents an Artificial Neural Network Cartesian Genetic Program. Each node connection is associated with a weight and each node with a bias. Only a subset of the kernel functions is allowed, including the most used nonlinearities in ANN research: tanh, sig, ReLU, ELU and ISRU. The resulting expression can represent any feed-forward neural network, but also other less obvious architectures. Weights and biases of the expression can be trained using the efficient backpropagation algorithm. gduals are not allowed for this class: they correspond to forward-mode automatic differentiation, which is inefficient for deep networks because its cost grows with the number of weights and biases being differentiated.
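As a minimal sketch of the weight-and-bias idea described above, the function below computes the output of a single node, assuming the node applies its nonlinearity to the weighted sum of its inputs plus its bias. node_output is a hypothetical helper written only for illustration, not part of dcgp, and tanh is just one of the allowed kernels.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a dcgp function): output of one node, assuming
// it computes tanh(sum_i w[i] * x[i] + b).
double node_output(const std::vector<double> &x, const std::vector<double> &w, double b)
{
    assert(x.size() == w.size()); // one weight per incoming connection
    double s = b;                 // start from the node bias
    for (std::size_t i = 0; i < x.size(); ++i) {
        s += w[i] * x[i];         // add each weighted input
    }
    return std::tanh(s);          // apply the nonlinearity
}
```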

Figure: a small artificial neural network represented using the dCGP-ANN approach (a weighted dCGP expression).

class expression_ann : public dcgp::expression<double>

A dCGP-ANN expression.

This class represents an artificial neural network as a differentiable Cartesian Genetic Program. It adds weights, biases and backward automatic differentiation to the class dcgp::expression.

Public Types

enum class kernel_type

Allowed kernels (for backpropagation to work)

Values:

enumerator SIG

Sigmoid.

enumerator TANH

Hyperbolic tangent.

enumerator RELU

Rectified linear unit.

enumerator ELU

Exponential linear unit.

enumerator ISRU

Inverse square root unit.

enumerator SUM

Simple sum of inputs.

enumerator SIN_NU

Non-unary sine.

enumerator COS_NU

Non-unary cosine.

enumerator GAUSSIAN_NU

Non-unary Gaussian.

enumerator INV_SUM

Negative of the input sum.

enumerator ABS

Absolute value of inputs.

enumerator STEP

Step function.
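For orientation, the sketch below writes out textbook definitions of some of the kernels listed above. It assumes that a non-unary kernel first sums its inputs and then applies the function, and that ELU and ISRU use a unit scale parameter; the exact forms used by dcgp may differ. All function names are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum of the node inputs (assumed to be already multiplied by their weights).
double sum_inputs(const std::vector<double> &in)
{
    assert(!in.empty()); // a node needs at least one input
    double s = 0.0;
    for (double v : in) {
        s += v;
    }
    return s;
}

// Textbook definitions; the unit scale in ELU and ISRU is an assumption.
double sig(double x) { return 1.0 / (1.0 + std::exp(-x)); }
double relu(double x) { return x > 0.0 ? x : 0.0; }
double elu(double x) { return x > 0.0 ? x : std::exp(x) - 1.0; }
double isru(double x) { return x / std::sqrt(1.0 + x * x); }

// "Non-unary" sine, assumed here to mean the sine of the summed inputs.
double sin_nu(const std::vector<double> &in) { return std::sin(sum_inputs(in)); }
```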

Public Functions

inline expression_ann(unsigned n, unsigned m, unsigned r, unsigned c, unsigned l, std::vector<unsigned> arity, std::vector<kernel<double>> f, unsigned seed = dcgp::random_device::next())

Constructor.

Constructs a dCGP-ANN expression.

Parameters
  • n[in] number of inputs (independent variables).

  • m[in] number of outputs (dependent variables).

  • r[in] number of rows of the cartesian representation of the network as an acyclic graph.

  • c[in] number of columns of the cartesian representation of the network as an acyclic graph.

  • l[in] number of levels-back allowed. This, essentially, controls the minimum number of allowed operations in the network. If uncertain, set it to c + 1.

  • arity[in] arities of the basis functions for each column.

  • f[in] function set. An std::vector of dcgp::kernel<expression::type>. Can only contain allowed functions.

  • seed[in] seed for the random number generator (initial expression and mutations depend on this).
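To make the role of l concrete, the sketch below computes which node ids a node in a given column could connect to. It assumes the usual Cartesian Genetic Programming convention: ids 0 to n - 1 are the inputs, internal nodes are numbered column by column starting from n, and a node may connect to any node in the previous l columns, plus the inputs when its column index is smaller than l. connection_range is a hypothetical helper, not a dcgp function.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: inclusive range [first, second] of node ids that a
// node in column `col` (0-based) may take as input, under the assumed
// numbering described in the text.
std::pair<unsigned, unsigned> connection_range(unsigned n, unsigned r, unsigned l, unsigned col)
{
    assert(n >= 1u && r >= 1u && l >= 1u);
    // Highest admissible id: the last node of the previous column
    // (or the last input when col == 0).
    const unsigned upper = n + col * r - 1u;
    // Lowest admissible id: the first node l columns back, or the first
    // input when fewer than l columns precede this one.
    const unsigned lower = (col >= l) ? n + r * (col - l) : 0u;
    return {lower, upper};
}
```

Under this convention, setting l to c + 1 lets every node reach every earlier column as well as the inputs.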

inline expression_ann(unsigned n = 1u, unsigned m = 1u, unsigned r = 1u, unsigned c = 1u, unsigned l = 1u, unsigned arity = 2u, std::vector<kernel<double>> f = kernel_set<double>({"sum"})(), unsigned seed = dcgp::random_device::next())

Constructor.

Constructs a dCGP-ANN expression.

Parameters
  • n[in] number of inputs (independent variables).

  • m[in] number of outputs (dependent variables).

  • r[in] number of rows of the cartesian representation of the network as an acyclic graph.

  • c[in] number of columns of the cartesian representation of the network as an acyclic graph.

  • l[in] number of levels-back allowed. This, essentially, controls the minimum number of allowed operations in the network. If uncertain, set it to c + 1.

  • arity[in] uniform arity for all basis functions.

  • f[in] function set. An std::vector of dcgp::kernel<expression::type>. Can only contain allowed functions.

  • seed[in] seed for the random number generator (initial expression and mutations depend on this).

inline virtual std::vector<double> operator()(const std::vector<double> &point) const override

Evaluates the dCGP-ANN expression.

This evaluates the dCGP-ANN expression. This method overrides the base class method. Note: this method and the following one cannot be templated, as they are virtual in the base class.

Parameters

[point] – an std::vector containing the values at which the dCGP-ANN expression is to be evaluated.

Returns

The value of the output (an std::vector)

inline virtual std::vector<std::string> operator()(const std::vector<std::string> &point) const override

Evaluates the dCGP-ANN expression.

This evaluates the dCGP-ANN expression. This method overrides the base class method.

Parameters

[point] – an std::vector containing the symbol names.

Returns

The symbolic value of the output (an std::vector)

template<typename U, enable_double_string<U> = 0>
inline std::vector<U> operator()(const std::initializer_list<U> &point) const

Evaluates the dCGP-ANN expression.

This evaluates the dCGP-ANN expression. This template can be instantiated with U being double, in which case the method computes the numerical value of the output, or with U being a string, in which case the instantiated method produces a symbolic representation of the output.

Parameters

[point] – an initializer list containing the values at which the dCGP-ANN expression is to be evaluated (doubles or strings).

Returns

The value of the output (an std::vector)
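The template above is restricted to double and string arguments. The definition of enable_double_string is not shown here; the sketch below shows one common way such a constraint can be written with std::enable_if. The alias enable_double_string_sketch and the struct toy_expression are hypothetical stand-ins, and the overloads simply echo their input.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical constraint: yields int only when U is double or std::string,
// otherwise the template is removed from overload resolution.
template <typename U>
using enable_double_string_sketch =
    typename std::enable_if<std::is_same<U, double>::value || std::is_same<U, std::string>::value, int>::type;

struct toy_expression {
    // Stand-ins for the two vector overloads; they just return their input.
    std::vector<double> operator()(const std::vector<double> &p) const { return p; }
    std::vector<std::string> operator()(const std::vector<std::string> &p) const { return p; }

    // Initializer-list overload: forwards to the matching vector overload.
    template <typename U, enable_double_string_sketch<U> = 0>
    std::vector<U> operator()(const std::initializer_list<U> &p) const
    {
        return (*this)(std::vector<U>(p));
    }
};
```

String arguments are best written as std::string explicitly in this sketch, since a braced list of string literals deduces U as const char*, which the constraint rejects.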

inline void d_loss(double &value, std::vector<double> &gweights, std::vector<double> &gbiases, const std::vector<double> &point, const std::vector<double> &prediction, const expression<double>::loss_type loss_e) const

Accumulates the loss and its gradient (of a single point)

Accumulates the loss and its gradient with respect to weights and biases. The values are added to the arguments passed in, so calling this method in a loop over many data points accumulates the total batch values.

Parameters
  • [value] – The loss accumulated so far; the loss on this point is added to it.

  • [gweights] – The loss gradient w.r.t. weights accumulated so far.

  • [gbiases] – The loss gradient w.r.t. biases accumulated so far.

  • [point] – The input data (single point)

  • [prediction] – The expected output for this point (single point)

  • [loss_e] – The loss type. Must be loss_type::MSE for Mean Square Error (regression) or loss_type::CE for Cross Entropy (classification)
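The accumulate-into-arguments pattern described above can be sketched on a toy model. The example assumes a single linear unit y = w · x + b with a squared-error loss; accumulate_loss is a hypothetical function that mirrors only the calling convention of d_loss, not its implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds the squared error of one data point, and its gradient with respect
// to the weights and the bias, to the running totals passed by reference.
void accumulate_loss(double &value, std::vector<double> &gweights, double &gbias,
                     const std::vector<double> &w, double b,
                     const std::vector<double> &x, double label)
{
    assert(w.size() == x.size() && gweights.size() == w.size());
    double y = b;
    for (std::size_t i = 0; i < w.size(); ++i) {
        y += w[i] * x[i];                // forward pass of the linear unit
    }
    const double err = y - label;
    value += err * err;                  // running loss
    for (std::size_t i = 0; i < w.size(); ++i) {
        gweights[i] += 2.0 * err * x[i]; // d(err^2)/dw[i]
    }
    gbias += 2.0 * err;                  // d(err^2)/db
}
```

Calling the function once per data point, with the same value, gweights and gbias, leaves the batch totals in those variables.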

inline std::tuple<double, std::vector<double>, std::vector<double>> d_loss(const std::vector<std::vector<double>> &points, const std::vector<std::vector<double>> &labels, expression<double>::loss_type loss_e, unsigned parallel = 0u)

Evaluates the loss and its gradient (on a batch)

Returns the loss and its gradient with respect to weights and biases.

Parameters
  • [points] – The input data (a batch).

  • [labels] – The labels, i.e. the expected outputs (a batch).

  • [loss_e] – The loss type. Must be loss_type::MSE for Mean Square Error (regression) or loss_type::CE for Cross Entropy (classification)

  • [parallel] – Sets the grain for parallelism. 0 means no parallelism; n divides the data into n parts and processes them in parallel threads.

Returns

the loss, the gradient of the loss w.r.t. all weights (including inactive ones) and the gradient of the loss w.r.t. all biases.

inline double sgd(std::vector<std::vector<double>> &points, std::vector<std::vector<double>> &labels, double lr, unsigned batch_size, const std::string &loss_s, unsigned parallel = 0u, bool shuffle = true)

Stochastic gradient descent.

Performs one “epoch” of stochastic gradient descent using the loss selected by loss_s.

Parameters
  • [points] – The input data (a batch). When shuffle is true, it is randomly shuffled (together with labels) by the call to sgd.

  • [labels] – The labels, i.e. the expected outputs (a batch). When shuffle is true, they are randomly shuffled (together with points) by the call to sgd.

  • [lr] – The learning rate.

  • [batch_size] – The batch size.

  • [loss_s] – A string defining the loss type. Can be one of “MSE” (mean squared error) or “CE” (cross-entropy)

  • [parallel] – Sets the grain for parallelism. 0 means no parallelism; n divides the data into n parts and processes them in parallel threads.

  • [shuffle] – when true it shuffles the points and labels before performing one epoch of training.

Throws

std::invalid_argument – if the sizes of points and labels do not match or are zero, or if lr is not positive.

Returns

The average error across the batches. Note: this will not be equal to the error on the whole data set, as weights get updated after each batch. It is a useful indicator, though, and it is free to compute.
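The structure of one epoch, including the joint shuffle and the averaged per-batch loss, can be sketched on a toy model. The example assumes a one-dimensional linear model y = w * x + b trained with mean squared error; linear_model and sgd_epoch are hypothetical names and do not reproduce the dcgp implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

struct linear_model {
    double w = 0.0;
    double b = 0.0;
};

// One epoch of mini-batch SGD with MSE. Returns the average of the
// per-batch losses, each computed before that batch's update.
double sgd_epoch(linear_model &m, std::vector<double> &xs, std::vector<double> &ys, double lr,
                 std::size_t batch_size, std::mt19937 &rng, bool shuffle = true)
{
    assert(xs.size() == ys.size() && !xs.empty() && lr > 0.0 && batch_size > 0u);
    if (shuffle) {
        // Apply the same random permutation to points and labels.
        std::vector<std::size_t> idx(xs.size());
        std::iota(idx.begin(), idx.end(), std::size_t{0});
        std::shuffle(idx.begin(), idx.end(), rng);
        std::vector<double> xs2(xs.size()), ys2(ys.size());
        for (std::size_t i = 0; i < idx.size(); ++i) {
            xs2[i] = xs[idx[i]];
            ys2[i] = ys[idx[i]];
        }
        xs.swap(xs2);
        ys.swap(ys2);
    }
    double total = 0.0;
    std::size_t n_batches = 0u;
    for (std::size_t start = 0; start < xs.size(); start += batch_size) {
        const std::size_t end = std::min(start + batch_size, xs.size());
        double loss = 0.0, gw = 0.0, gb = 0.0;
        for (std::size_t i = start; i < end; ++i) {
            const double err = m.w * xs[i] + m.b - ys[i];
            loss += err * err;
            gw += 2.0 * err * xs[i];
            gb += 2.0 * err;
        }
        const double n = static_cast<double>(end - start);
        m.w -= lr * gw / n; // update after each batch
        m.b -= lr * gb / n;
        total += loss / n;
        ++n_batches;
    }
    return total / static_cast<double>(n_batches);
}
```

Because the parameters change after every batch, the returned value is only an indicator of the loss on the whole data set, as noted above.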

inline void set_output_f(const std::string &name)

Sets the output nonlinearities.

Sets the nonlinearities of all nodes connected to the output nodes. This is useful when, for example, the dCGP-ANN is used for a regression task where output values are expected in [-1, 1] and hence the output layer should have some sigmoid or tanh nonlinearity.

Parameters

name[in] the name of the kernel (nonlinearity)

Throws

std::invalid_argument – if name is invalid.

inline unsigned n_active_weights(bool unique = false) const

Computes the number of weights influencing the result.

Computes the number of weights influencing the result. This is also the number of weights that are updated when calling sgd. The number of active weights, together with the number of active nodes, defines the complexity of the expression encoded by the chromosome.

Parameters

unique[in] when true weights are counted only once if connecting the same two nodes.
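The effect of unique can be illustrated as follows, assuming each active weight is identified by the pair (source node id, target node id) it connects. The type connection and the function count_weights are hypothetical and only illustrate the counting rule.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// A connection is identified by (source node id, target node id).
using connection = std::pair<unsigned, unsigned>;

// Counts the active weights; with unique == true, repeated connections
// between the same two nodes are counted once.
std::size_t count_weights(const std::vector<connection> &active, bool unique = false)
{
    for (const connection &c : active) {
        assert(c.first < c.second); // feed-forward: sources precede targets
    }
    if (!unique) {
        return active.size();
    }
    return std::set<connection>(active.begin(), active.end()).size();
}
```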

inline void set_weight(unsigned node_id, unsigned input_id, const double &w)

Sets a weight.

Sets a connection weight to a new value

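Parameters
  • [node_id] – the id of the node whose weight is being set.

  • [input_id] – the id of the node input the weight belongs to.

  • [w] – the new value of the weight.

Throws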

std::invalid_argument – if the node_id or input_id are not valid
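A (node_id, input_id) pair can be related to a position in a flat weight vector. The sketch below assumes a uniform arity, node ids that start at n for the first internal node, and weights stored node by node; this layout is an assumption made for illustration and may not match the internal layout used by dcgp. weight_index is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Maps (node_id, input_id) to a flat index under the assumed layout:
// n inputs, r * c internal nodes, `arity` weights per node.
std::size_t weight_index(unsigned n, unsigned r, unsigned c, unsigned arity,
                         unsigned node_id, unsigned input_id)
{
    assert(arity > 0u);
    if (node_id < n || node_id >= n + r * c) {
        throw std::invalid_argument("node_id is not a valid internal node");
    }
    if (input_id >= arity) {
        throw std::invalid_argument("input_id exceeds the node arity");
    }
    return static_cast<std::size_t>(node_id - n) * arity + input_id;
}
```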

inline void set_weight(std::vector<double>::size_type idx, const double &w)

Sets a weight.

Sets a connection weight to a new value

Parameters
  • [idx] – index of the weight to be changed.

  • [w] – value of the weight to be changed.

Throws

std::invalid_argument – if idx is not valid

inline void set_weights(const std::vector<double> &ws)

Sets all weights.

Sets all the connection weights at once

Parameters

ws[in] an std::vector containing all the weights to set

Throws

std::invalid_argument – if the input vector dimension is not valid.

inline double get_weight(unsigned node_id, unsigned input_id) const

Gets a weight.

Gets the value of a connection weight

Parameters
  • [node_id] – the id of the node.

  • [input_id] – the id of the node input the weight belongs to.

Throws

std::invalid_argument – if the node_id or input_id are not valid

Returns

the value of the weight

inline double get_weight(std::vector<double>::size_type idx) const

Gets a weight.

Gets the value of a connection weight

Parameters

idx[in] index of the weight

inline const std::vector<double> &get_weights() const

Gets the weights.

Gets the values of all the weights.

Returns

an std::vector containing all the weights

inline void randomise_weights(double mean = 0, double std = 0.1, std::random_device::result_type seed = random_number)

Randomises all weights.

Sets all weights to normally distributed random values.

Parameters
  • [mean] – the mean of the normal distribution.

  • [std] – the standard deviation of the normal distribution.

  • [seed] – the seed to generate the new weights (by default it is randomly generated).
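Drawing normally distributed values from a seeded generator can be sketched with the standard library. randomise_values is a hypothetical free function and std::mt19937 is an assumed choice of generator; within one program and standard library, the same seed reproduces the same values.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Overwrites every element with a draw from N(mean, stddev^2), using a
// generator seeded with `seed`.
void randomise_values(std::vector<double> &values, double mean, double stddev, unsigned seed)
{
    assert(stddev > 0.0); // std::normal_distribution requires a positive deviation
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(mean, stddev);
    for (double &v : values) {
        v = dist(gen);
    }
}
```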

inline void set_bias(typename std::vector<double>::size_type idx, const double &w)

Sets a bias.

Sets a node bias to a new value

Parameters
  • [idx] – index of the bias to be changed.

  • [w] – value of the new bias.

inline void set_biases(const std::vector<double> &bs)

Sets all biases.

Sets all the nodes biases at once

Parameters

bs[in] an std::vector containing all the biases to set

Throws

std::invalid_argument – if the input vector dimension is not valid (r*c)

inline double get_bias(typename std::vector<double>::size_type idx) const

Gets a bias.

Gets the value of a bias

Parameters

idx[in] index of the bias

inline const std::vector<double> &get_biases() const

Gets the biases.

Gets the values of all the biases

Returns

an std::vector containing all the biases

inline void randomise_biases(double mean = 0, double std = 0.1, std::random_device::result_type seed = random_number)

Randomises all biases.

Sets all biases to normally distributed random values.

Parameters
  • mean[in] the mean of the normal distribution

  • std[in] the standard deviation of the normal distribution

  • seed[in] the seed to generate the new biases (by default it is randomly generated)

template<typename Archive>
inline void serialize(Archive &ar, unsigned)

Object serialization.

This method will save this object into, or load it from, the archive ar.

Parameters

ar – target archive.

Throws

unspecified – any exception thrown by the serialization of the expression and of primitive types.

Friends

inline friend std::ostream &operator<<(std::ostream &os, const expression_ann &d)

Overloaded stream operator.

Writes a human-readable representation of the expression to the stream os.

Returns

the stream os, to which a human-readable representation of the expression has been written.