Find an exact model including one parameter using a memetic approach
In this second tutorial we show how to find a model for our input data when we also want to learn some constants.
Constants can, in general, be learned via two main techniques:

1. evolutionary (common and standard practice in GP)
2. memetic (original with dCGP)
The difference is that the evolutionary approach cannot find the correct and exact values of the constants, only approximate them, while the memetic approach can. In this tutorial we follow the memetic approach 2; the evolutionary approach 1 was used in the previous tutorial. We use the problem P1 from the dcgp::gym, that is x^5 - pi*x^3 + x.
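Before running the full example it can be useful to look at the raw data. The following standalone sketch (not part of the tutorial code) loads P1 via gym::generate_P1, the same call used in the listing below, and prints each point next to the ground truth x^5 - pi*x^3 + x; assuming the labels are generated without noise, the two columns should coincide.

#include <cmath>
#include <iostream>
#include <vector>

#include <dcgp/gym.hpp>

int main()
{
    // Load the P1 data set (assumed ground truth: y = x^5 - pi*x^3 + x).
    std::vector<std::vector<double>> X, Y;
    dcgp::gym::generate_P1(X, Y);

    // Print each input next to its label and the ground-truth value.
    const double pi = std::acos(-1.);
    for (decltype(X.size()) i = 0u; i < X.size(); ++i) {
        double x = X[i][0];
        double truth = std::pow(x, 5) - pi * std::pow(x, 3) + x;
        std::cout << "x = " << x << ", y = " << Y[i][0]
                  << ", x^5 - pi*x^3 + x = " << truth << "\n";
    }
    return 0;
}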
Code:
#include <vector>

#include <boost/algorithm/string.hpp>
#include <pagmo/algorithm.hpp>
#include <pagmo/io.hpp>
#include <pagmo/population.hpp>
#include <pagmo/problem.hpp>

#include <dcgp/algorithms/mes4cgp.hpp>
#include <dcgp/gym.hpp>
#include <dcgp/kernel_set.hpp>
#include <dcgp/problems/symbolic_regression.hpp>
#include <dcgp/symengine.hpp>

using namespace dcgp;
using namespace boost::algorithm;

int main()
{
    // We load the data (using problem P1 from the gym)
    std::vector<std::vector<double>> X, Y;
    gym::generate_P1(X, Y);

    // We instantiate a symbolic regression problem with only one ephemeral constant.
    auto n_eph = 1u;
    symbolic_regression udp(X, Y, 1u, 15u, 16u, 2u, kernel_set<double>({"sum", "diff", "mul", "pdiv"})(), n_eph);

    // We init a population with four individuals.
    pagmo::population pop{udp, 4u};

    // We instantiate the memetic solver, setting 1000 maximum generations, at most 4 active mutations and an ftol of 1e-8.
    dcgp::mes4cgp uda(1000u, 4u, 1e-8);
    pagmo::algorithm algo{uda};
    algo.set_verbosity(50u);

    // We solve
    pop = algo.evolve(pop);

    // We print on screen the best found
    auto idx = pop.best_idx();
    auto prettier = udp.prettier(pop.get_x()[idx]);
    trim_left_if(prettier, is_any_of("["));
    trim_right_if(prettier, is_any_of("]"));

    pagmo::print("\nBest fitness: ", pop.get_f()[idx], "\n");
    pagmo::print("Chromosome: ", pop.get_x()[idx], "\n");
    pagmo::print("Pretty Formula: ", udp.pretty(pop.get_x()[idx]), "\n");
    pagmo::print("Prettier Formula: ", prettier, "\n");
    pagmo::print("Expanded Formula: ", SymEngine::expand(SymEngine::Expression(prettier)), "\n");
    return 0;
}
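The evolution above is stochastic. As a minimal sketch for reproducible runs, and assuming the trailing seed arguments exposed by pagmo::population and dcgp::mes4cgp, the two construction lines in the listing could be seeded explicitly:

    // Drop-in replacement for the population / solver construction above (assumed seed arguments):
    unsigned seed = 42u;                      // any fixed value
    pagmo::population pop{udp, 4u, seed};     // seeded initial population
    dcgp::mes4cgp uda(1000u, 4u, 1e-8, seed); // seeded memetic solver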
Output:
Note: the actual output is non-deterministic. Sometimes, with a bit of luck :) the problem is solved exactly (the loss goes to zero). The following output reports one of these occurrences. Note that with the traditional evolutionary approach this result is incredibly hard to obtain.
Gen:        Fevals:          Best:    Constants:    Formula:
   0              0        4009.59    [-2.41031]    [(c1 + x0)*x0] ...
  50            200       0.978909    [0.238557]    [x0**2*c1*(-x0 + x0**4) - (c1 + x0 + x0* ...
 100            400        0.84565    [0.240548]    [x0**2*c1*(-x0 + x0**4) - (c1 + 2*x0)] ...
 150            600       0.761757    [0.240032]    [-2*x0 + x0**2*c1*(-x0 + x0**4)] ...
 200            800      0.0170582    [-1.16484]    [(-x0 + x0**3)*(c1 + x0**2) - x0**3] ...
 250           1000      0.0170582    [-0.164837]   [(-x0 + x0**3)*(c1 + x0**2) - (-x0 + 2*x ...
 300           1200      0.0170582    [-0.164837]   [(-x0 + x0**3)*(c1 + x0**2) - (-x0 + 2*x ...
 350           1400      0.0170582    [-0.164837]   [(-x0 + x0**3)*(c1 + x0**2) - (-x0 + 2*x ...
 357           1428    2.17578e-29    [-1.14159]    [x0**3*(c1 + x0**2) - (-x0 + 2*x0**3)] ...
Exit condition -- ftol < 1e-08
Best fitness: [2.17578e-29]
Chromosome: [-1.14159, 2, 0, 0, 2, ... ]
Pretty Formula: [(((c1+(x0*x0))*((x0*x0)*x0))-(((x0*x0)*x0)+(((x0*x0)*x0)-x0)))]
Prettier Formula: x0**3*(c1 + x0**2) - (-x0 + 2*x0**3)
Expanded Formula: x0 + x0**3*c1 - 2*x0**3 + x0**5
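As a final sanity check (not part of the tutorial code), the expanded formula can be rewritten as x0 + (c1 - 2)*x0**3 + x0**5, so matching the target x^5 - pi*x^3 + x requires c1 = 2 - pi = -1.14159..., which is exactly the constant reported above. A minimal standalone verification:

#include <cmath>
#include <iostream>

int main()
{
    // Expanded formula: x0 + c1*x0**3 - 2*x0**3 + x0**5 = x0 + (c1 - 2)*x0**3 + x0**5.
    // The target is x^5 - pi*x^3 + x, hence we expect c1 - 2 == -pi.
    const double pi = std::acos(-1.);
    double c1 = -1.14159; // learned constant, as truncated in the printed output
    std::cout << "c1 - 2 = " << c1 - 2 << "\n"; // approx -3.14159
    std::cout << "-pi    = " << -pi << "\n";
    return 0;
}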