expression (dCGP)

This class represents a Cartesian Genetic Program. Since that is, essentially, an artificial genetic encoding for a mathematical expression, we named the templated class expression. The class template can be instantiated using the types double or gdual<T>. In the case of double, the class would basically reproduce a canonical CGP expression. In the case of gdual<T> the class would operate in the differential algebra of truncated Taylor polynomials with coefficients in T, and thus provide also any order derivative information on the program (i.e. the Taylor expansion of the program output with respect to its inputs).

dCGP expression

A dCGP expression

template<typename T>
class expression

A dCGP expression.

This class represents a mathematical expression as encoded using CGP and contains algorithms to compute its value (numerical and symbolical) and its derivatives as well as to mutate the expression.

Template Parameters

T – expression type. Can be double, or a gdual type.

Subclassed by dcgp::expression_weighted< T >

Public Types

enum class loss_type

Loss types.

Values:

enumerator MSE

Mean Squared Error.

enumerator CE

Public Functions

inline expression(unsigned n, unsigned m, unsigned r, unsigned c, unsigned l, std::vector<unsigned> arity, std::vector<kernel<T>> f, unsigned n_eph, unsigned seed = dcgp::random_device::next())

Constructor.

Constructs a dCGP expression with variable arity

Parameters
  • n[in] number of inputs (independent variables).

  • m[in] number of outputs (dependent variables).

  • r[in] number of rows of the cartesian representation of the expression as an acyclic graph.

  • c[in] number of columns of the cartesian representation of the expression as an acyclic graph.

  • l[in] number of levels-back allowed. This, essentially, controls the minimum number of allowed operations in the formula. If uncertain set it to c + 1

  • arity[in] arities of the basis functions for each column.

  • f[in] function set. An std::vector of dcgp::kernel<expression::type>.

  • n_eph[in] Number of ephemeral constants. Their values and their symbols can be set via the dedicate methods.

  • seed[in] seed for the random number generator (initial expression and mutations depend on this).

inline expression(unsigned n = 1u, unsigned m = 1u, unsigned r = 1u, unsigned c = 1u, unsigned l = 1u, unsigned arity = 1u, std::vector<kernel<T>> f = kernel_set<T>({"sum"})(), unsigned n_eph = 0u, unsigned seed = dcgp::random_device::next())

Constructor.

Constructs a dCGP expression with uniform arity

Parameters
  • n[in] number of inputs (independent variables).

  • m[in] number of outputs (dependent variables).

  • r[in] number of rows of the cartesian representation of the expression as an acyclic graph.

  • c[in] number of columns of the cartesian representation of the expression as an acyclic graph.

  • l[in] number of levels-back allowed. This, essentially, controls the minimum number of allowed operations in the formula. If uncertain set it to c + 1

  • arity[in] arity of the basis functions.

  • f[in] function set. An std::vector of dcgp::kernel<expression::type>.

  • n_eph[in] Number of ephemeral constants. Their values and their symbols can be set via the dedicate methods.

  • seed[in] seed for the random number generator (initial expression and mutations depend on this).

inline virtual ~expression()

Virtual destructor.

expression(const expression&) = default

Copy constructor.

expression(expression&&) = default

Copy assignment operator.

expression &operator=(const expression&) = default

Move constructor.

expression &operator=(expression&&) = default

Move assignment operator.

inline virtual std::vector<T> operator()(const std::vector<T> &point) const

Evaluates the dCGP expression.

This evaluates the dCGP expression.

Parameters

[point] – an std::vector containing the values where the dCGP expression has to be computed (doubles, gduals or strings)

Returns

The value of the function (an std::vector)

inline std::vector<double> operator()(const std::initializer_list<double> &in) const

Evaluates the dCGP expression (from initializer list)

This evaluates the dCGP expression from an initializer list.

Parameters

in[in] an initializer list containing the values where the dCGP expression has to be computed (doubles, gduals or strings)

Returns

The value of the function (an std::vector)

inline virtual std::vector<std::string> operator()(const std::vector<std::string> &in) const

Evaluates the dCGP expression (symbolic)

This evaluates the symbolic form of a dCGP expression from symbols.

Parameters

in[in] an initializer list containing the symbols to use to construct the dCGP expression.

Returns

The value of the function (an std::vector)

inline std::vector<std::string> operator()(const std::initializer_list<std::string> &in) const

Evaluates the dCGP expression (symbolic from initializer list)

This evaluates the symbolic form of a dCGP expression from an initializer list of symbols.

Parameters

in[in] an initializer list containing the symbols to use to construct the dCGP expression.

Returns

The value of the function (an std::vector)

inline T loss(const std::vector<T> &point, const std::vector<T> &prediction, loss_type loss_e) const

Evaluates the model loss (single data point)

Returns the model loss over a single point of data of the dCGP output.

Parameters
  • [point] – The input data (single point)

  • [prediction] – The predicted output (single point)

  • [loss_e] – The loss type. Can be “MSE” for Mean Square Error (regression) or “CE” for Cross Entropy (classification)

Returns

the computed loss

inline T loss(const std::vector<std::vector<T>> &points, const std::vector<std::vector<T>> &labels, const std::string &loss_s, unsigned parallel = 0u) const

Evaluates the model loss (on a batch)

Evaluates the model loss over a batch.

Parameters
  • [points] – The input data (a batch).

  • [labels] – The predicted outputs (a batch).

  • [loss_s] – The loss type. Can be “MSE” for Mean Square Error (regression) or “CE” for Cross Entropy (classification)

  • [parallel] – sets the grain for parallelism. 0 -> no parallelism n -> divides the data into n parts and evaluates them in parallel threads.

Returns

the loss

inline void set(const std::vector<unsigned> &xu)

Sets the chromosome.

Sets a given chromosome as genotype for the expression and updates the active nodes and active genes information accordingly

Parameters

xu[in] the new cromosome

Throws

std::invalid_argument – if the chromosome is out of bounds or has the wrong size.

template<class InputIt>
inline void set_from_range(InputIt begin, InputIt end)

Sets the chromosome from range.

Sets a given chromosome as genotype for the expression and updates the active nodes and active genes information accordingly

Parameters
  • begin[in] iterator to the first element of the range

  • end[in] iterator to the end element of the range

Throws

std::invalid_argument – if the chromosome is out of bounds or has the wrong size.

inline void set_f_gene(unsigned node_id, unsigned f_id)

Sets the function gene of a node.

Sets for a valid node (i.e. not an input node) a new kernel

Parameters
  • node_id[in] the id of the node

  • f_id[in] the id of the kernel

Throws

std::invalid_argument – if the node_id or f_id are invalid.

inline void set_eph_val(const std::vector<T> &eph_val)

Sets the values of ephemeral constants.

Sets the values of ephemeral constants

Parameters

eph_val[in] the values of the ephemeral constants.

Throws

std::invalid_argument – if the size of eph_val is not equal to the number of ephemeral constants.

inline void set_eph_symb(const std::vector<std::string> &eph_symb)

Sets the values of ephemeral constants.

Sets the values of ephemeral constants

Parameters

eph_symb[in] the symbols to use for the ephemeral constants.

Throws

std::invalid_argument – if the size of eph_symb is not equal to the number of ephemeral constants.

inline const std::vector<T> &get_eph_val() const

Gets the values of ephemeral constants.

Gets the values of ephemeral constants

Returns

the values of ephemeral constants.

inline const std::vector<std::string> &get_eph_symb() const

Gets the symbols of ephemeral constants.

Gets the symbols of ephemeral constants

Returns

the symbols of ephemeral constants.

inline const std::vector<unsigned> &get() const

Gets the chromosome.

Gets the chromosome encoding the current expression

Returns

The chromosome

inline const std::vector<unsigned> &get_lb() const

Gets the lower bounds.

Gets the lower bounds for the genes

Returns

An std::vector containing the lower bound for each gene

inline const std::vector<unsigned> &get_ub() const

Gets the upper bounds.

Gets the upper bounds for the genes

Returns

An std::vector containing the upper bound for each gene

inline const std::vector<unsigned> &get_active_genes() const

Gets the active genes.

Gets the idx of the active genes in the current chromosome (numbering is from 0)

Returns

An std::vector containing the idx of the active genes in the current chromosome

inline const std::vector<unsigned> &get_active_nodes() const

Gets the active nodes.

Gets the idx of the active nodes in the current chromosome. The numbering starts from 0 at the first input node to then follow PPSN tutorial from Miller

Returns

An std::vector containing the idx of the active nodes

inline unsigned get_n() const

Gets the number of inputs.

Gets the number of inputs of the dCGP expression

Returns

the number of inputs

inline unsigned get_m() const

Gets the number of outputs.

Gets the number of outputs of the dCGP expression

Returns

the number of outputs

inline unsigned get_r() const

Gets the number of rows.

Gets the number of rows of the dCGP expression

Returns

the number of rows

inline unsigned get_c() const

Gets the number of columns.

Gets the number of columns of the dCGP expression

Returns

the number of columns

inline unsigned get_l() const

Gets the number of levels-back.

Gets the number of levels-back allowed for the dCGP expression

Returns

the number of levels-back

inline const std::vector<unsigned> &get_arity() const

Gets the arity.

Gets the arity of the basis functions of the dCGP expression

Returns

the arity

inline unsigned get_arity(unsigned node_id) const

Gets the arity of a particular node.

Gets the arity of a particular node

Parameters

node_id[in] id of the node

Returns

the arity of that node

inline const std::vector<kernel<T>> &get_f() const

Gets the function set.

Gets the set of functions used in the dCGP expression

Returns

an std::vector of kernels

inline const std::vector<unsigned> &get_gene_idx() const

Gets gene_idx.

Gets gene_idx, a vector containing the indexes in the chromosome where nodes start expressing.

Returns

an std::vector containing the indexes of the chromosome expressing each node

inline void mutate(unsigned idx)

Mutates randomly one gene.

Mutates exactly one gene within its allowed bounds.

Parameters

idx[in] index of the gene to me mutated

Throws

std::invalid_argument – if idx is too large

inline void mutate(std::vector<unsigned> idxs)

Mutates multiple genes at once.

Mutates multiple genes within their allowed bounds.

Parameters

idxs[in] vector of indexes of the genes to me mutated

Throws

std::invalid_argument – if idx is too large

inline void mutate_random(unsigned N)

Mutates N random genes.

Mutates a specified number of random genes within their bounds

Parameters

N[in] number of genes to be mutated

inline void mutate_inactive(unsigned N = 1u)

Mutates inactive genes randomly up to N.

Mutates inactive random genes within their bounds up to N. The guarantee to actually mutate N would cost and is deemed unnecessary.

Parameters

N[in] maximum number of inactive genes to be mutated

inline void mutate_active(unsigned N = 1u)

Mutates active genes.

Mutates N active genes within their allowed bounds. The mutation can affect function genes, input genes and output genes.

Parameters

N[in] Number of active genes to be mutated

inline void mutate_active_fgene(unsigned N = 1u)

Mutates active function genes.

Mutates N active function genes within their allowed bounds.

Parameters

N[in] Number of active function genes to be mutated

inline void mutate_active_cgene(unsigned N = 1u)

Mutates active connection genes.

Mutates N active connection genes within their allowed bounds.

Parameters

N[in] Number of active connection genes to be mutated

inline void mutate_ogene(unsigned N = 1u)

Mutates active output genes.

Mutates N times random active output genes within their allowed bounds.

Parameters

N[in] Number of output genes to be mutated

inline void seed(long seed)

Sets the internal seed.

Sets the internal seed used to perform mutations and other things.

inline bool is_active_node(const unsigned node_id) const

Checks if a given node is active.

Parameters

node_id[in] the node to be checked

Returns

True if the node node_id is active in the CGP expression.

inline bool is_active_gene(const unsigned idx) const

Checks if a given gene is active.

Parameters

idx[in] the idx of the gene to be checked

Returns

True if the gene idx is active in the CGP expression.

inline void set_phenotype_correction(pc_fun_type pc)

Sets the phenotype correction.

A phenotype correction is a correction applied to the expression output that depends on the expression itself and on its inputs. Indicating with g the expression, the overall output, after a phenotype expression is applied, will be of the generic form y = pc(x, g)

Parameters

pc – callable to be applied to the CGP expression.

inline void unset_phenotype_correction()

Unsets the phenotype correction.

template<typename Archive>
inline void serialize(Archive &ar, unsigned)

Object serialization.

This method will save/load this into the archive ar.

Parameters

ar – target archive.

Throws

unspecified – any exception thrown by the serialization of the expression and of primitive types.

Friends

inline friend std::ostream &operator<<(std::ostream &os, const expression &d)

Overloaded stream operator.

Will return a formatted string containing a human readable representation of the class

Returns

std::string containing a human-readable representation of the problem.