expression_ann (dCGP-ANN)
This class represents an Artificial Neural Network Cartesian Genetic Program. Each node connection is associated with a weight and each node with a bias. Only a subset of the kernel functions is allowed, including the nonlinearities most used in ANN research: tanh, sig, ReLU, ELU and ISRU. The resulting expression can represent any feed-forward neural network, as well as other, less obvious architectures. Weights and biases of the expression can be trained using the efficient backpropagation algorithm. (gduals are not allowed for this class: they correspond to forward-mode automatic differentiation, which is very inefficient for deep networks.)
-
class expression_ann : public dcgp::expression<double>
A dCGP-ANN expression.
This class represents an artificial neural network as a differentiable Cartesian Genetic Program. It adds weights, biases and backward-mode automatic differentiation to the class dcgp::expression.
Public Types
-
enum class kernel_type
Allowed kernels (for backpropagation to work)
Values:
-
enumerator SIG
Sigmoid.
-
enumerator TANH
Hyperbolic tangent.
-
enumerator RELU
Rectified linear unit.
-
enumerator ELU
Exponential linear unit.
-
enumerator ISRU
Inverse square root unit (ISRU).
-
enumerator SUM
Simple sum of inputs.
-
enumerator SIN_NU
Non-unary sine.
-
enumerator COS_NU
Non-unary cosine.
-
enumerator GAUSSIAN_NU
Non-unary Gaussian.
-
enumerator INV_SUM
Negative of the input sum.
-
enumerator ABS
Absolute value of inputs.
-
enumerator STEP
Step function.
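As a rough illustration of what the nonlinear kernels compute, the sketch below gives textbook scalar forms of sig, ReLU, ELU and ISRU (tanh is available as std::tanh) and applies one of them to a weighted sum of inputs plus a bias. The function names and the unit scale constants are assumptions made for this example; the exact definitions used by the dcgp kernels may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Textbook scalar nonlinearities (unit scale constants assumed).
inline double sig(double x) { return 1.0 / (1.0 + std::exp(-x)); }
inline double relu(double x) { return x > 0.0 ? x : 0.0; }
inline double elu(double x) { return x > 0.0 ? x : std::exp(x) - 1.0; }
inline double isru(double x) { return x / std::sqrt(1.0 + x * x); }

// Hypothetical node: weighted sum of the inputs plus a bias, passed through f.
inline double node_output(const std::vector<double> &w, const std::vector<double> &x,
                          double b, double (*f)(double))
{
    assert(w.size() == x.size());
    double s = b;
    for (std::size_t i = 0; i < w.size(); ++i) {
        s += w[i] * x[i];
    }
    return f(s);
}
```

For instance, node_output({1.0, 2.0}, {3.0, 4.0}, -1.0, relu) evaluates relu(1*3 + 2*4 - 1).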
Public Functions
-
inline expression_ann(unsigned n, unsigned m, unsigned r, unsigned c, unsigned l, std::vector<unsigned> arity, std::vector<kernel<double>> f, unsigned seed = dcgp::random_device::next())
Constructor.
Constructs a dCGP-ANN expression with a separate arity for each column.
- Parameters
n – [in] number of inputs (independent variables).
m – [in] number of outputs (dependent variables).
r – [in] number of rows of the Cartesian representation of the network as an acyclic graph.
c – [in] number of columns of the Cartesian representation of the network as an acyclic graph.
l – [in] number of levels-back allowed (how many preceding columns a node may connect to). This essentially controls the minimum number of allowed operations in the network. If uncertain, set it to c + 1.
arity – [in] arities of the basis functions for each column.
f – [in] function set. An std::vector of dcgp::kernel<expression::type>. Can only contain allowed functions.
seed – [in] seed for the random number generator (initial expression and mutations depend on this).
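To get a feel for how these parameters determine the size of the network, the sketch below counts the trainable parameters of an r by c grid. It assumes one bias per node and one weight per node input, which follows from the class description but is not taken from the dcgp source; the helper names are hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// One bias per node of the r x c grid.
inline unsigned n_biases(unsigned r, unsigned c)
{
    return r * c;
}

// Assumption: one weight per node input, so each of the r nodes
// in column j contributes arity[j] weights.
inline unsigned n_weights(unsigned r, const std::vector<unsigned> &arity)
{
    assert(!arity.empty());
    return r * std::accumulate(arity.begin(), arity.end(), 0u);
}
```

With r = 2 and per-column arities {3, 2, 2}, this gives 6 biases and 14 weights.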
-
inline expression_ann(unsigned n = 1u, unsigned m = 1u, unsigned r = 1u, unsigned c = 1u, unsigned l = 1u, unsigned arity = 2u, std::vector<kernel<double>> f = kernel_set<double>({"sum"})(), unsigned seed = dcgp::random_device::next())
Constructor.
Constructs a dCGP-ANN expression with the same arity for all basis functions.
- Parameters
n – [in] number of inputs (independent variables).
m – [in] number of outputs (dependent variables).
r – [in] number of rows of the Cartesian representation of the network as an acyclic graph.
c – [in] number of columns of the Cartesian representation of the network as an acyclic graph.
l – [in] number of levels-back allowed (how many preceding columns a node may connect to). This essentially controls the minimum number of allowed operations in the network. If uncertain, set it to c + 1.
arity – [in] uniform arity for all basis functions.
f – [in] function set. An std::vector of dcgp::kernel<expression::type>. Can only contain allowed functions.
seed – [in] seed for the random number generator (initial expression and mutations depend on this).
-
inline virtual std::vector<double> operator()(const std::vector<double> &point) const override
Evaluates the dCGP-ANN expression.
Computes the numerical value of the dCGP-ANN expression at a point. This method overrides the base class method. Note: this function and the following one cannot be templated, as they are virtual in the base class.
- Parameters
point – [in] an std::vector containing the values where the dCGP-ANN expression has to be computed
- Returns
The value of the output (an std::vector)
-
inline virtual std::vector<std::string> operator()(const std::vector<std::string> &point) const override
Evaluates the dCGP-ANN expression.
Computes a symbolic representation of the dCGP-ANN expression. This method overrides the base class method.
- Parameters
point – [in] an std::vector containing the symbol names.
- Returns
The symbolic value of the output (an std::vector)
-
template<typename U, enable_double_string<U> = 0>
inline std::vector<U> operator()(const std::initializer_list<U> &point) const
Evaluates the dCGP-ANN expression.
This template can be instantiated with U being double, in which case the numerical value of the output is computed, or with U being a string, in which case a symbolic representation of the output is produced.
- Parameters
point – [in] an initializer list containing the values where the dCGP-ANN expression has to be computed (doubles or strings)
- Returns
The value of the output (an std::vector)
-
inline void d_loss(double &value, std::vector<double> &gweights, std::vector<double> &gbiases, const std::vector<double> &point, const std::vector<double> &prediction, const expression<double>::loss_type loss_e) const
Accumulates the loss and its gradient (for a single point)
Accumulates the loss and its gradient with respect to weights and biases. The results are added to the values passed in, so calling this in a loop over many data points accumulates the totals for the whole batch.
- Parameters
value – [in,out] The running loss; the loss of this point is added to it
gweights – [in,out] The running loss gradient w.r.t. weights
gbiases – [in,out] The running loss gradient w.r.t. biases
point – [in] The input data (single point)
prediction – [in] The expected output for the point, i.e. its label (single point)
loss_e – [in] The loss type. Must be loss_type::MSE for Mean Square Error (regression) or loss_type::CE for Cross Entropy (classification)
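The accumulate-into-the-arguments pattern described above can be sketched on a much simpler model. The example below uses a hypothetical single linear node y = w·x + b with a plain squared error; it is not the dcgp backpropagation code, and any normalisation dcgp applies to the loss is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds the squared error of one point, and its gradient, to the running totals.
// Hypothetical model: y = sum_i w[i] * x[i] + b.
inline void accumulate_sq_error(double &value, std::vector<double> &gweights, double &gbias,
                                const std::vector<double> &w, double b,
                                const std::vector<double> &x, double target)
{
    assert(w.size() == x.size() && gweights.size() == w.size());
    double y = b;
    for (std::size_t i = 0; i < w.size(); ++i) {
        y += w[i] * x[i];
    }
    const double err = y - target;
    value += err * err; // loss contribution of this point
    for (std::size_t i = 0; i < w.size(); ++i) {
        gweights[i] += 2.0 * err * x[i]; // d(err^2)/d(w[i])
    }
    gbias += 2.0 * err; // d(err^2)/d(b)
}
```

Starting from zero-initialised totals and calling the function once per data point yields the batch loss and batch gradient.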
-
inline std::tuple<double, std::vector<double>, std::vector<double>> d_loss(const std::vector<std::vector<double>> &points, const std::vector<std::vector<double>> &labels, expression<double>::loss_type loss_e, unsigned parallel = 0u)
Evaluates the loss and its gradient (on a batch)
Returns the loss and its gradient with respect to weights and biases.
- Parameters
points – [in] The input data (a batch).
labels – [in] The expected outputs, i.e. the labels (a batch).
loss_e – [in] The loss type. Must be loss_type::MSE for Mean Square Error (regression) or loss_type::CE for Cross Entropy (classification)
parallel – [in] Sets the grain for parallelism: 0 means no parallelism; n divides the data into n parts and processes them in parallel threads.
- Returns
The loss, the gradient of the loss w.r.t. all weights (including inactive ones) and the gradient of the loss w.r.t. all biases.
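The effect of the parallel argument can be pictured as splitting the batch into contiguous index ranges, each of which could then be handed to its own thread. The sketch below shows only that partitioning step; how dcgp actually divides the data is not specified here, and the helper name split_batch is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Splits [0, size) into `parts` contiguous half-open ranges whose lengths
// differ by at most one. parts == 0 is treated as a single range (no split).
inline std::vector<std::pair<std::size_t, std::size_t>> split_batch(std::size_t size, unsigned parts)
{
    if (parts == 0u) {
        parts = 1u;
    }
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    const std::size_t base = size / parts;
    const std::size_t extra = size % parts;
    std::size_t begin = 0;
    for (unsigned p = 0; p < parts; ++p) {
        const std::size_t len = base + (p < extra ? 1u : 0u);
        ranges.emplace_back(begin, begin + len);
        begin += len;
    }
    assert(begin == size);
    return ranges;
}
```

Because the loss and its gradient are sums over data points, the partial results of the ranges can simply be added together afterwards.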
-
inline double sgd(std::vector<std::vector<double>> &points, std::vector<std::vector<double>> &labels, double lr, unsigned batch_size, const std::string &loss_s, unsigned parallel = 0u, bool shuffle = true)
Stochastic gradient descent.
Performs one “epoch” of stochastic gradient descent using the loss selected by loss_s.
- Parameters
points – [in,out] The input data (a batch). When shuffle is true, it is randomly shuffled (together with labels) by the call to sgd.
labels – [in,out] The expected outputs, i.e. the labels (a batch). When shuffle is true, they are randomly shuffled (together with points) by the call to sgd.
lr – [in] The learning rate.
batch_size – [in] The batch size.
loss_s – [in] A string defining the loss type. Can be one of “MSE” (mean squared error) or “CE” (cross-entropy).
parallel – [in] Sets the grain for parallelism: 0 means no parallelism; n divides the data into n parts and processes them in parallel threads.
shuffle – [in] When true, the points and labels are shuffled before performing one epoch of training.
- Throws
std::invalid_argument – if the data and label sizes do not match or are zero, or if lr is not positive.
- Returns
The average error across the batches. Note: this will not be equal to the error on the whole data set, as the weights are updated after each batch. It is a useful indicator, though, and it comes at no extra computational cost.
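The structure of one such epoch (optional joint shuffle, one update per mini-batch, average of the per-batch losses as return value) can be sketched on a deliberately tiny model. The example assumes a hypothetical one-parameter model y = w * x with squared error; the function name, the seed argument and the batch-size check are choices made for this sketch, not part of the dcgp API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <stdexcept>
#include <vector>

// One epoch of mini-batch SGD on y = w * x. Returns the average of the
// per-batch mean losses, each measured before that batch's update.
inline double sgd_epoch(std::vector<double> &xs, std::vector<double> &ys, double &w, double lr,
                        std::size_t batch_size, bool shuffle = true, unsigned seed = 0u)
{
    if (xs.size() != ys.size() || xs.empty()) {
        throw std::invalid_argument("data and labels must match in size and be non-empty");
    }
    if (lr <= 0.0) throw std::invalid_argument("learning rate must be positive");
    if (batch_size == 0u) throw std::invalid_argument("batch size must be positive");
    if (shuffle) {
        // Apply the same random permutation to points and labels.
        std::vector<std::size_t> idx(xs.size());
        std::iota(idx.begin(), idx.end(), std::size_t{0});
        std::mt19937 rng(seed);
        std::shuffle(idx.begin(), idx.end(), rng);
        std::vector<double> xs2(xs.size()), ys2(ys.size());
        for (std::size_t i = 0; i < idx.size(); ++i) {
            xs2[i] = xs[idx[i]];
            ys2[i] = ys[idx[i]];
        }
        xs.swap(xs2);
        ys.swap(ys2);
    }
    double loss_sum = 0.0;
    std::size_t n_batches = 0;
    for (std::size_t start = 0; start < xs.size(); start += batch_size) {
        const std::size_t end = std::min(start + batch_size, xs.size());
        double loss = 0.0, grad = 0.0;
        for (std::size_t i = start; i < end; ++i) {
            const double err = w * xs[i] - ys[i];
            loss += err * err;
            grad += 2.0 * err * xs[i];
        }
        const double n = static_cast<double>(end - start);
        w -= lr * grad / n; // update after each batch
        loss_sum += loss / n;
        ++n_batches;
    }
    assert(n_batches > 0u);
    return loss_sum / static_cast<double>(n_batches);
}
```

Since w changes after every batch, the returned average mixes losses measured at different parameter values, which is why it only approximates the loss on the whole data set.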
-
inline void set_output_f(const std::string &name)
Sets the output nonlinearities.
Sets the nonlinearities of all nodes connected to the output nodes. This is useful when, for example, the dCGP-ANN is used for a regression task where output values are expected in [-1, 1], so that the output layer should have a sigmoidal nonlinearity such as tanh.
- Parameters
name – [in] the name of the kernel (nonlinearity)
- Throws
std::invalid_argument – if name is invalid.
-
inline unsigned n_active_weights(bool unique = false) const
Computes the number of weights influencing the result.
Computes the number of weights influencing the result. This is also the number of weights that are updated when calling sgd. The number of active weights, together with the number of active nodes, defines the complexity of the expression encoded by the chromosome.
- Parameters
unique – [in] when true weights are counted only once if connecting the same two nodes.
-
inline void set_weight(unsigned node_id, unsigned input_id, const double &w)
Sets a weight.
Sets a connection weight to a new value
- Parameters
node_id – [in] the id of the node whose weight is being set (convention adopted for node numbering http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf)
input_id – [in] the id of the node input (0 for the first one up to arity-1)
w – [in] the new value of the weight
- Throws
std::invalid_argument – if the node_id or input_id are not valid
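The relation between the (node_id, input_id) addressing and a single flat weight index can be sketched as follows. The layout used here is an assumption made for illustration only: ids 0 to n-1 denote the inputs, the r*c grid nodes follow column by column, and weights are stored node after node, input after input. The dcgp implementation may organise its storage differently.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Flat index of the weight on input `input_id` of node `node_id`
// under the ASSUMED layout described above (hypothetical helper).
inline std::size_t weight_index(unsigned node_id, unsigned input_id, unsigned n, unsigned r,
                                const std::vector<unsigned> &arity)
{
    assert(r > 0u);
    const unsigned c = static_cast<unsigned>(arity.size());
    if (node_id < n || node_id >= n + r * c) {
        throw std::invalid_argument("invalid node_id");
    }
    const unsigned col = (node_id - n) / r;
    const unsigned row = (node_id - n) % r;
    if (input_id >= arity[col]) {
        throw std::invalid_argument("invalid input_id");
    }
    std::size_t idx = 0;
    for (unsigned j = 0; j < col; ++j) {
        idx += static_cast<std::size_t>(r) * arity[j]; // weights of the preceding columns
    }
    return idx + static_cast<std::size_t>(row) * arity[col] + input_id;
}
```

With n = 2 inputs, r = 2 rows and arities {3, 2}, node 4 is the first node of the second column, so its input 1 maps to flat index 7 under this layout.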
-
inline void set_weight(std::vector<double>::size_type idx, const double &w)
Sets a weight.
Sets a connection weight to a new value
- Parameters
idx – [in] index of the weight to be changed.
w – [in] new value of the weight.
- Throws
std::invalid_argument – if idx is not valid
-
inline void set_weights(const std::vector<double> &ws)
Sets all weights.
Sets all the connection weights at once
- Parameters
ws – [in] an std::vector containing all the weights to set
- Throws
std::invalid_argument – if the input vector dimension is not valid.
-
inline double get_weight(unsigned node_id, unsigned input_id) const
Gets a weight.
Gets the value of a connection weight
- Parameters
node_id – [in] the id of the node (convention adopted for node numbering http://ppsn2014.ijs.si/files/slides/ppsn2014-tutorial3-miller.pdf)
input_id – [in] the id of the node input (0 up to node arity-1)
- Throws
std::invalid_argument – if the node_id or input_id are not valid
- Returns
the value of the weight
-
inline double get_weight(std::vector<double>::size_type idx) const
Gets a weight.
Gets the value of a connection weight
- Parameters
idx – [in] index of the weight
-
inline const std::vector<double> &get_weights() const
Gets the weights.
Gets the values of all the weights.
- Returns
an std::vector containing all the weights
-
inline void randomise_weights(double mean = 0, double std = 0.1, std::random_device::result_type seed = random_number)
Randomises all weights.
Sets every weight to a value drawn from a normal distribution.
- Parameters
mean – [in] the mean of the normal distribution.
std – [in] the standard deviation of the normal distribution.
seed – [in] the seed used to generate the new weights (by default it is randomly generated).
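A seeded normal initialisation of this kind can be sketched with the standard library alone. The helper below is hypothetical and uses std::mt19937 as the engine, which is an assumption of this example rather than a statement about the generator dcgp uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Returns `count` values drawn from a normal distribution N(mean, stddev^2).
// The same seed always reproduces the same values on a given platform.
inline std::vector<double> random_weights(std::size_t count, double mean = 0.0,
                                          double stddev = 0.1, unsigned seed = 0u)
{
    assert(stddev > 0.0);
    std::mt19937 rng(seed);
    std::normal_distribution<double> dist(mean, stddev);
    std::vector<double> ws(count);
    for (double &w : ws) {
        w = dist(rng);
    }
    return ws;
}
```

Fixing the seed is what makes a training run reproducible; the small default standard deviation keeps the initial weights close to zero.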
-
inline void set_bias(typename std::vector<double>::size_type idx, const double &w)
Sets a bias.
Sets a node bias to a new value
- Parameters
idx – [in] index of the bias to be changed.
w – [in] value of the new bias.
-
inline void set_biases(const std::vector<double> &bs)
Sets all biases.
Sets all the nodes biases at once
- Parameters
bs – [in] an std::vector containing all the biases to set
- Throws
std::invalid_argument – if the input vector dimension is not valid (r*c)
-
inline double get_bias(typename std::vector<double>::size_type idx) const
Gets a bias.
Gets the value of a bias
- Parameters
idx – [in] index of the bias
-
inline const std::vector<double> &get_biases() const
Gets the biases.
Gets the values of all the biases
- Returns
an std::vector containing all the biases
-
inline void randomise_biases(double mean = 0, double std = 0.1, std::random_device::result_type seed = random_number)
Randomises all biases.
Sets every bias to a value drawn from a normal distribution.
- Parameters
mean – [in] the mean of the normal distribution
std – [in] the standard deviation of the normal distribution
seed – [in] the seed used to generate the new biases (by default it is randomly generated)
Friends
-
inline friend std::ostream &operator<<(std::ostream &os, const expression_ann &d)
Overloaded stream operator.
Writes a formatted, human-readable representation of the expression to the stream.
- Returns
A reference to the stream os, to which the human-readable representation of the expression has been written.