Evolutionary Strategy (UDA)

class dcgpy.es4cgp(gen=1, max_mut=4, ftol=0, learn_constants=False, seed=random)

Evolutionary strategies are popular global optimization meta-heuristics essentially based on the following simple pseudo-algorithm:

Start from a population (pop) of dimension N
while i < gen
    Mutation: create a new population pop2 mutating N times the best individual
    Evaluate all new chromosomes in pop2
    Reinsertion: set pop to contain the best N individuals taken from pop and pop2
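
The pseudo-algorithm above can be sketched in plain Python as follows. This is only an illustrative toy, not the dcgpy implementation: the names evolve, sphere and flip_one are hypothetical, and the chromosome is assumed to be a plain list of integers minimised against a sphere function.

```python
import random

def evolve(fitness, mutate, pop, gen, rng):
    """Minimise `fitness` following the mutate / evaluate / reinsert loop."""
    n = len(pop)
    for _ in range(gen):
        best = min(pop, key=fitness)
        # Mutation: create pop2 by mutating the best individual N times.
        pop2 = [mutate(best, rng) for _ in range(n)]
        # Evaluation and reinsertion: rank pop and pop2 together by fitness
        # and keep the best N individuals.
        pop = sorted(pop + pop2, key=fitness)[:n]
    return pop

def sphere(x):
    # Toy objective (an assumption): sum of squares, minimum at the origin.
    return sum(v * v for v in x)

def flip_one(x, rng):
    # Hypothetical mutation: shift one randomly chosen gene by +1 or -1.
    y = list(x)
    i = rng.randrange(len(y))
    y[i] += rng.choice((-1, 1))
    return y

rng = random.Random(42)
start = [[rng.randint(-10, 10) for _ in range(5)] for _ in range(4)]
final = evolve(sphere, flip_one, start, gen=200, rng=rng)
```

Because reinsertion always keeps the best individuals seen so far, the best fitness in the population can never get worse from one generation to the next.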

The success of such a search strategy hinges on the quality of its mutation operator. For chromosomes that encode a Cartesian Genetic Program (CGP), it makes sense to have mutation act on active genes only (that is, on the part of the chromosome that is actually expressed in the final CGP / formula / model). This couples the optimization problem (say, a symbolic regression problem) to its solution strategy, which does not prevent the use of general-purpose optimization algorithms but makes them inefficient (e.g. a generic evolutionary strategy would have a mutation operator that is agnostic of the existence of active genes).
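
To make the notion of active genes concrete, here is a minimal sketch under a simplified encoding. It assumes each node is a triple [function, connection, connection] and that the program has a single output gene. The encoding, the constants N_IN and N_FUNCS, and the helpers active_nodes and mutate_active are all hypothetical; dcgpy's internal representation may differ.

```python
import random

N_IN = 2      # number of program inputs (toy assumption)
N_FUNCS = 3   # size of the kernel set, e.g. sum, diff, mul (toy assumption)

def active_nodes(nodes, output):
    """Return the ids of the nodes reachable from the output gene."""
    active, stack = set(), [output]
    while stack:
        idx = stack.pop()
        if idx < N_IN or idx in active:  # ids below N_IN are inputs, not nodes
            continue
        active.add(idx)
        _, a, b = nodes[idx - N_IN]
        stack.extend((a, b))
    return active

def mutate_active(nodes, output, max_mut, rng):
    """Mutate up to max_mut genes, drawn from the active nodes only."""
    new = [list(n) for n in nodes]
    act = sorted(active_nodes(nodes, output))
    if not act:  # the output is wired directly to an input
        return new
    for _ in range(max_mut):
        idx = rng.choice(act)
        gene = rng.randrange(3)
        if gene == 0:
            new[idx - N_IN][0] = rng.randrange(N_FUNCS)   # function gene
        else:
            new[idx - N_IN][gene] = rng.randrange(idx)    # earlier input or node
    return new

# Node ids are 2, 3, 4, 5; the output reads node 4, which uses node 2 and input 1.
nodes = [[0, 0, 1], [2, 0, 0], [1, 2, 1], [2, 3, 3]]
output = 4
mutated = mutate_active(nodes, output, max_mut=4, rng=random.Random(0))
```

In this example nodes 3 and 5 are never expressed, so a mutation spent on them would leave the model unchanged; restricting mutation to nodes 2 and 4 avoids that waste.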

Note

ES4CGP is tailored to solve dcgpy.symbolic_regression problems and will not work on other problem types.

This class provides an evolutionary strategy tailored to solve dcgpy.symbolic_regression problems by leveraging knowledge of the genetic structure of Cartesian Genetic Programs (i.e. it is able to mutate only active genes).

Parameters

gen (int) – number of generations.
max_mut (int) – maximum number of active genes to be mutated.
ftol (float) – the algorithm will exit when the loss is below this tolerance.
learn_constants (bool) – when true, a Gaussian mutation is applied to the ephemeral constants (std = 0.1).
seed (int) – seed used by the internal random number generator (default is random).
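
As a rough illustration of what learn_constants implies, the sketch below perturbs each ephemeral constant with Gaussian noise of standard deviation 0.1. It assumes the mutation is additive with zero mean, which the text above does not state explicitly; mutate_constants is a hypothetical helper, not part of dcgpy.

```python
import random

def mutate_constants(constants, rng, std=0.1):
    """Add zero-mean Gaussian noise to every constant (assumed additive)."""
    return [c + rng.gauss(0.0, std) for c in constants]

rng = random.Random(0)
constants = [-1.22497]
new_constants = mutate_constants(constants, rng)
```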

Raises

unspecified – any exception thrown by failures at the intersection between C++ and Python (e.g., type conversion errors, mismatched function signatures, etc.)
ValueError – if max_mut is 0 or ftol is negative.

Note

If a bfe_mp is set via set_bfe(), the algorithm cannot be used in an archipelago, because nested parallelism would lead to AssertionError: daemonic processes are not allowed to have children.

get_log()

Returns a log containing relevant parameters recorded during the last call to evolve(). The log frequency depends on the verbosity parameter (by default nothing is logged), which can be set by calling the method set_verbosity() on an algorithm constructed with an es4cgp. A verbosity of N implies a log line every N generations.

Returns

at each logged epoch, the values Gen, Fevals, Best, Constants, Model, where:

Gen (int), generation number.
Fevals (int), number of function evaluations made.
Best (float), the best fitness found.
Constants (list), the current values for the ephemeral constants.
Model (string), the string representation of the current best model.

Return type

list of tuples

Examples

>>> import dcgpy
>>> from pygmo import *
>>>
>>> algo = algorithm(dcgpy.es4cgp(gen = 2000, max_mut = 4, ftol = 1e-4, learn_constants=True))
>>> X, Y = dcgpy.generate_koza_quintic()
>>> udp = dcgpy.symbolic_regression(X, Y, 1, 20, 21, 2, dcgpy.kernel_set_double(["sum", "diff", "mul"])(), 1, False, 0)
>>> pop = population(udp, 4)
>>> algo.set_verbosity(200)
>>> pop = algo.evolve(pop) 
Gen:        Fevals:          Best:   Constants:   Model:
   0              0        7398.14   [-1.22497]   [x0*c1**2 + c1**2] ...
 200            800        233.979   [-1.34118]   [x0*(x0 - x0**2 + x0**3 + (x0 - x0**2)** ...
 400           1600        4.26131   [-1.15376]   [x0*(x0 + x0*(c1 + x0**2) - x0**2 + (x0  ...
 600           2400        4.26126   [-1.15198]   [x0*(x0 + x0*(c1 + x0**2) - x0**2 + (x0  ...
 800           3200        4.26126   [-1.15198]   [x0*(x0 + x0*(c1 + x0**2) - x0**2 + (x0  ...
1000           4000        4.26126   [-1.15198]   [x0*(x0 + x0*(c1 + x0**2) - x0**2 + (x0  ...
1200           4800        4.26126   [-1.15198]   [x0*(x0 + x0*(c1 + x0**2) - x0**2 + (x0  ...
1400           5600        4.26126   [-1.15198]   [x0*(x0 + x0*(c1 + x0**2) - x0**2 + (x0  ...
1600           6400       0.664691   [-1.12614]   [x0*(x0 + x0*(c1 + x0**2) - (c1 + x0**2) ...
1800           7200       0.664691   [-1.12614]   [x0*(x0 + x0*(c1 + x0**2) - (c1 + x0**2) ...
2000           8000       0.664689   [-1.12548]   [x0*(x0 + x0*(c1 + x0**2) - (c1 + x0**2) ...
Exit condition -- generations = 2000
>>> uda = algo.extract(dcgpy.es4cgp)
>>> uda.get_log() 
[(0, 0, 7398.139620548432, array([-1.22496858]), '[x0*c1**2 + c1**2]'), ...

See also the docs of the relevant C++ method dcgp::es4cgp::get_log().
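
Since each log entry is a tuple (Gen, Fevals, Best, Constants, Model), it can be unpacked with ordinary Python. The sketch below uses the first three rows of the printed output above, with rounded values, model strings truncated as printed, and plain lists in place of NumPy arrays; with a real run, log would instead come from uda.get_log().

```python
# Rows copied from the printed output above (values rounded and model
# strings truncated exactly as printed); a real log comes from uda.get_log().
log = [
    (0, 0, 7398.14, [-1.22497], "[x0*c1**2 + c1**2]"),
    (200, 800, 233.979, [-1.34118], "[x0*(x0 - x0**2 + x0**3 + (x0 - x0**2)** ..."),
    (400, 1600, 4.26131, [-1.15376], "[x0*(x0 + x0*(c1 + x0**2) - x0**2 + (x0  ..."),
]

# Column-wise views, e.g. for plotting the convergence history.
gens = [entry[0] for entry in log]
best = [entry[2] for entry in log]

# Entry with the lowest loss, unpacked into its five fields.
gen, fevals, loss, constants, model = min(log, key=lambda entry: entry[2])
```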

get_seed()

Returns the random seed used internally by this UDA.

Returns: the random seed of the algorithm
Return type: int

set_bfe(b)

Sets the batch function evaluation scheme to be used by the algorithm.

Parameters: b (bfe) – the batch function evaluation object
Raises: unspecified – any exception thrown by the underlying C++ method