Multi-Objective Memetic Evolutionary Strategy (UDA)

MOMES
class dcgpy.momes4cgp(gen=1, max_mut=1, seed=random)

Symbolic regression tasks seek good mathematical models to represent input data. By increasing the model complexity it is always (theoretically) possible to fit any input data almost perfectly. As a consequence, model complexity must be traded off against accuracy, so that symbolic regression is, ultimately, a two-objective optimization problem.

In this class we offer a UDA (User Defined Algorithm for the pygmo optimization suite) which extends dcgpy.mes4cgp to a multi-objective problem. The resulting algorithm is outlined by the following pseudo-algorithm:

  • Start from a population (pop) of dimension N

  • while i < gen

      • Mutation: create a new population pop2 by mutating the best individual N times

      • Life long learning: apply one step of a second order Newton method to each individual (only the continuous part is affected)

      • Reinsertion: set pop to contain the best N individuals taken from pop and pop2 according to non-dominated sorting.
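The reinsertion step is plain non-dominated sorting, for which pygmo already ships multi-objective utilities. Below is a minimal sketch (not the dcgp implementation, which is in C++) of how the best N individuals could be picked from the merged fitness vectors of pop and pop2 using pygmo.select_best_N_mo; the fitness values are made up for illustration:

    import numpy as np
    import pygmo as pg

    # Made-up merged fitness vectors of pop and pop2: each row is
    # (loss, complexity), the two objectives of a symbolic regression UDP.
    merged_f = np.array([[0.80, 12.], [0.50, 30.], [0.50, 14.],
                         [1.20,  5.], [0.30, 40.], [0.90,  9.]])

    N = 3  # number of individuals to keep

    # select_best_N_mo ranks the points by non-dominated sorting (crowding
    # distance breaks ties in the last front) and returns the indices of the
    # N best rows.
    best_idx = pg.select_best_N_mo(merged_f, N)
    print(best_idx)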

Note

MOMES4CGP is tailored to solve dcgpy.symbolic_regression problems and will not work on other problem types.

Parameters
  • gen (int) – number of generations.

  • max_mut (int) – maximum number of active genes to be mutated.

  • seed (int) – seed used by the internal random number generator (default is random).

Raises
  • unspecified – any exception thrown by failures at the intersection between C++ and Python (e.g., type conversion errors, mismatched function signatures, etc.)

  • ValueError – if max_mut is 0.
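As a quick illustration of the second failure mode, constructing the UDA with max_mut set to 0 raises at construction time (the exact error message is the library's own):

    import dcgpy

    # max_mut must be strictly positive; 0 makes the constructor throw.
    try:
        uda = dcgpy.momes4cgp(gen=10, max_mut=0)
    except ValueError as err:
        print("construction failed:", err)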

get_log()

Returns a log containing relevant parameters recorded during the last call to evolve(). The log frequency depends on the verbosity parameter (by default nothing is logged), which can be set by calling the method set_verbosity() on an algorithm constructed with a momes4cgp. A verbosity of N implies a log line every N generations.

Returns

at each logged epoch, the values Gen, Fevals, Best loss, Ndf size, Compl., where:

  • Gen (int), generation number.

  • Fevals (int), number of function evaluations made.

  • Best loss (float), the best loss found.

  • Ndf size (int), number of models in the non-dominated front.

  • Compl. (int), minimum complexity across the models in the non-dominated front.

Return type

list of tuples

Examples

>>> import dcgpy
>>> from pygmo import *
>>>
>>> algo = algorithm(dcgpy.momes4cgp(gen = 90, max_mut = 2))
>>> X, Y = dcgpy.generate_koza_quintic()
>>> udp = dcgpy.symbolic_regression(X, Y, 1, 20, 21, 2, dcgpy.kernel_set_double(["sum", "diff", "mul"])(), 1, False, 0)
>>> pop = population(udp, 100)
>>> algo.set_verbosity(10)
>>> pop = algo.evolve(pop) 
Gen:        Fevals:     Best loss: Ndf size:   Compl.:
   0              0        6.07319         3        92
  10           1000        2.15419         5        10
  20           2000        1.92403         8        33
  30           3000       0.373663        12        72
  40           4000        0.36954        13        72
  50           5000       0.235749        16        73
  60           6000       0.235749        12        73
  70           7000       0.235749        13        73
  80           8000       0.217968        12        75
  90           9000       0.217968        12        75
 100          10000       0.217968        12        75
 110          11000       0.217968        14        75
 120          12000       0.217968        14        75
 130          13000       0.217968        13        75
 140          14000       0.162293        12        52
Exit condition -- generations = 140
>>> uda = algo.extract(dcgpy.momes4cgp)
>>> uda.get_log() 
[(0, 0, 6.0731942123423, 3, 92), ...

See also the docs of the relevant C++ method dcgp::momes4cgp::get_log().
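As a follow-up to the example above, the final trade-offs between loss and complexity can be inspected directly from the evolved population using pygmo's multi-objective utilities. This is only a sketch under the same setup (fewer generations for brevity); it assumes the symbolic_regression UDP's pretty() helper to print a readable form of each model, and the exact models found will vary from run to run:

    import dcgpy
    import pygmo as pg

    # Same setup as the example above, with fewer generations for brevity.
    X, Y = dcgpy.generate_koza_quintic()
    udp = dcgpy.symbolic_regression(X, Y, 1, 20, 21, 2,
                                    dcgpy.kernel_set_double(["sum", "diff", "mul"])(), 1, False, 0)
    algo = pg.algorithm(dcgpy.momes4cgp(gen=20, max_mut=2))
    algo.set_verbosity(5)
    pop = pg.population(udp, 100)
    pop = algo.evolve(pop)

    # The log is a plain list of tuples and can be split into columns directly.
    uda = algo.extract(dcgpy.momes4cgp)
    gens, fevals, best_loss, ndf_size, compl = zip(*uda.get_log())

    # Indices of the first (non-dominated) front of (loss, complexity) pairs.
    ndf = pg.non_dominated_front_2d(pop.get_f())
    for i in ndf:
        loss, complexity = pop.get_f()[i]
        print(udp.pretty(pop.get_x()[i]), "loss:", loss, "complexity:", complexity)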

get_seed()

This method will return the random seed used internally by this uda.

Returns

the random seed used by this UDA

Return type

int
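A minimal sketch showing that the seed fixed at construction (an arbitrary value here) is the one reported back by get_seed():

    import dcgpy

    # Construct the UDA with an explicit seed and read it back.
    uda = dcgpy.momes4cgp(gen=5, max_mut=2, seed=123)
    print(uda.get_seed())  # 123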