Learning constants in a symbolic regression task (deprecated)

One of the long-standing “skeletons in the closet” of GP techniques is the problem of finding constants. It is widely acknowledged that the ephemeral random constant approach, de facto the main solution proposed for this problem, is far from satisfactory.

Using dCGP, we are here able to successfully learn constants as well as expressions during evolution, thanks to the hybridization of the evolutionary strategy with a second-order gradient descent approach that learns the exact value of the ephemeral constants, thus avoiding having to build such values by applying kernel functions to the constants.
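The core idea can be sketched in plain Python, without dcgpy: whenever a candidate expression is linear in its constant c, the quadratic error is an exact quadratic in c, so a single Newton (second-order) step lands on the optimal value. The sketch below approximates the derivatives with finite differences; dCGP obtains them exactly, and for free, via the gdual algebra. The function name and the toy model are illustrative only.

```python
import numpy as np

def newton_step_constant(c, x, yt, model):
    """One second-order (Newton) step on the constant c of a candidate model."""
    err = lambda cc: np.sum((model(x, cc) - yt) ** 2)  # quadratic error in c
    h = 1e-4
    # Central differences for d(err)/dc and d2(err)/dc2
    # (dCGP gets these exactly through the gdual algebra)
    d1 = (err(c + h) - err(c - h)) / (2 * h)
    d2 = (err(c + h) - 2 * err(c) + err(c - h)) / h**2
    return c - d1 / d2 if d2 != 0 else c

x = np.linspace(1, 3, 10)
yt = 2.0 * x                    # data generated with the "true" constant c = 2
model = lambda x, c: c * x      # candidate expression with one constant
c = newton_step_constant(0.0, x, yt, model)  # one step recovers c = 2
```

Since the error is exactly quadratic in c here, the single step is exact up to floating-point noise, regardless of the starting guess.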

NOTE: since v1.4 symbolic regression is performed via dedicated classes rather than by directly manipulating a dcgpy.expression.

Let's first import dcgpy and pyaudi and set things up so as to compute our CGP using the type “gdual”, thus getting all derivatives for free.

[1]:
from dcgpy import expression_gdual_vdouble as expression
from dcgpy import kernel_set_gdual_vdouble as kernel_set
from pyaudi import gdual_vdouble as gdual
import pyaudi
from matplotlib import pyplot as plt
import numpy as np
from random import randint
%matplotlib inline

The kernel functions

[2]:
# note the use of the protected division "pdiv" (not necessary here)
# note the call operator (returns the list of kernels)
kernels = kernel_set(["sum", "mul", "diff","pdiv"])()
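For reference, protected division replaces an undefined x/0 with a finite fallback, so every randomly generated program stays evaluable over the whole input grid. A common GP convention (Koza's) returns 1 on a zero denominator; the sketch below assumes that convention and is not necessarily dcgpy's exact definition:

```python
def pdiv(x, y):
    # Protected division: return a finite fallback (here 1.0, Koza's
    # convention) instead of dividing by zero; dcgpy's exact rule may differ
    return x / y if y != 0 else 1.0
```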

The ES-(1+\(\lambda\)) algorithm

[41]:
def run_experiment(dCGP, offsprings, max_gen, x, yt, screen_output):
    # The offsprings chromosome, fitness and constant
    chromosome = [1] * offsprings
    fitness = [1] * offsprings
    constant = [1] * offsprings
    # Init the best as the initial random dCGP
    best_chromosome = dCGP.get()
    best_constant = 1.
    fit, _ = err2(dCGP, x, yt, best_constant)
    best_fitness = sum(fit.constant_cf)
    # Main loop over generations
    for g in range(max_gen):
        for i in range(offsprings):
            dCGP.set(best_chromosome)
            dCGP.mutate_active(i+1)
            fit, constant[i] = err2(dCGP, x, yt, best_constant)
            fitness[i] = sum(fit.constant_cf)
            chromosome[i] = dCGP.get()
        for i in range(offsprings):
            if fitness[i] <= best_fitness:
                # accept equal fitness too: neutral mutations are kept so the
                # search can drift across fitness plateaus
                strictly_better = fitness[i] < best_fitness
                best_chromosome = chromosome[i]
                best_fitness = fitness[i]
                best_constant = constant[i]
                dCGP.set(best_chromosome)
                if strictly_better and screen_output:
                    print("New best found: gen: ", g, " value: ", fitness[i], dCGP.simplify(["x","c"]), "c =", best_constant)

        if best_fitness < 1e-14:
            break
    return g, best_chromosome, best_constant
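Stripped of the dCGP specifics, the same (1+\(\lambda\)) selection logic fits in a few lines. This toy version hill-climbs an integer towards a target with random ±1 mutations; candidates that are not worse are accepted (which, in the real run, is what allows neutral drift across plateaus). All names here are illustrative:

```python
import random

def es_1_plus_lambda(fitness, mutate, x0, offsprings=4, max_gen=1000):
    # Generic ES-(1+lambda): mutate the best individual, keep any candidate
    # that is not worse (<=), stop once the fitness target is reached
    best, best_f = x0, fitness(x0)
    for g in range(max_gen):
        for _ in range(offsprings):
            cand = mutate(best)
            f = fitness(cand)
            if f <= best_f:
                best, best_f = cand, f
        if best_f == 0:
            break
    return best, best_f

random.seed(0)
target = 20
sol, f = es_1_plus_lambda(
    fitness=lambda x: abs(x - target),
    mutate=lambda x: x + random.choice([-1, 1]),
    x0=0,
)
```

In `run_experiment` above, `mutate` corresponds to `dCGP.mutate_active` and the fitness is the quadratic error after the constant has been learned by the Newton step.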

The test problems

As target functions, we define four different problems of increasing complexity:

P1: \(x^5 - \pi x^3 + x\)

P2: \(x^5 - \pi x^3 + \frac{2\pi}x\)

P3: \(\frac{e x^5 + x^3}{x + 1}\)

P4: \(\sin(\pi x) + \frac 1x\)

Note how \(\pi\) and \(e\) appear in the expressions.

[45]:
# The following functions create the target values for a gridded input x for different test problems
def data_P1(x):
    return x**5 - np.pi*x**3 + x
def data_P2(x):
    return x**5 - np.pi*x**3 + 2*np.pi / x
def data_P3(x):
    return (np.e*x**5 + x**3)/(x + 1)
def data_P4(x):
    return pyaudi.sin(np.pi * x) + 1./x
def data_P5(x): # defined for completeness, but not used in this notebook
    return np.e * x**5 - np.pi*x**3 + np.sqrt(2) * x
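Before wrapping the grid in gduals, the target functions can be sanity-checked on plain NumPy floats; e.g. for P1, \(f(1) = 1 - \pi + 1 = 2 - \pi\). A minimal check (repeating the P1 definition so the snippet is self-contained):

```python
import numpy as np

def data_P1(x):
    return x**5 - np.pi * x**3 + x

x = np.linspace(1, 3, 10)   # the same grid used for P1 below
yt = data_P1(x)             # yt[0] is 2 - pi, about -1.1416
```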

The error functions

[5]:
# This is the quadratic error of a dCGP expression when the constant value is cin. The error is computed
# over the input points xin; the target values are contained in yt. Both are gduals of order 0, as we are
# not interested in expanding the program w.r.t. them.
def err(dCGP, xin, yt, cin):
    c = gdual([cin], "c", 2)
    y = dCGP([xin,c])[0]
    return (y-yt)**2

# This is the quadratic error of the expression when the constant value is learned using a one-step,
# second-order method.
def err2(dCGP, xin, yt, cin):
    c = gdual([cin], "c", 2)
    y = dCGP([xin,c])[0]
    error = err(dCGP,xin,yt,cin)
    dc = sum(error.get_derivative({"dc":1}))
    dc2 = sum(error.get_derivative({"dc":2}))
    if dc2 != 0:
        learned_constant = c - dc/dc2
        y = dCGP([xin, learned_constant])[0]
    else:
        learned_constant = c
    return (y-yt)**2, learned_constant.constant_cf[0]

Problem P1: \(x^5 - \pi x^3 + x\)

[17]:
x = np.linspace(1,3,10)
x = gdual(x)
yt = data_P1(x)
[18]:
# We run nexp experiments and accumulate statistic for the ERT
nexp = 100
offsprings = 4
max_gen=1000
res = []
kernels = kernel_set(["sum", "mul", "diff","pdiv"])()
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = randint(0,1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings,max_gen,x,yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen-1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)
restart:         gen:    expression:
5                163     ['((((x*x)-c)/(x/((x*x)*(x*x))))+x)']  a.k.a  [-c*x**3 + x**5 + x] c =  3.141592653589793
8                452     ['(x+((((((x/c)*(c*x))-(c+c))*((x/c)*(c*x)))-((x/c)*(c*x)))*x))']  a.k.a  [-2*c*x**3 + x**5 - x**3 + x] c =  1.0707963267948972
12               309     ['(((x*x)*(((c-(x*x))*(x*x))/(((c-(x*x))*(x*x))*x)))-(((c-(x*x))*(x*x))*x))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.141592653589793
15               764     ['(((x/c)*c)+((x*(x*x))*((x*x)-c)))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.1415926535897927
18               193     ['(((x*(x*x))*((x*x)+c))-(c-(c+x)))']  a.k.a  [c*x**3 + x**5 + x] c =  -3.1415926535897927
36               334     ['(x+((x*x)*(((((x*x)-c)+x)*x)-(x*x))))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.1415926535897927
54               826     ['((((x*x)*x)*((c+(x*x))+c))+x)']  a.k.a  [2*c*x**3 + x**5 + x] c =  -1.5707963267948968
57               154     ['(((x+((x*x)+c))-((x*x)+c))+(((x*x)+c)*((x*x)*x)))']  a.k.a  [c*x**3 + x**5 + x] c =  -3.1415926535897927
58               312     ['((((x*x)+c)*(x*(x*x)))-((x*(x*x))+((x*(x*x))-x)))']  a.k.a  [c*x**3 + x**5 - 2*x**3 + x] c =  -1.1415926535897931
66               378     ['(((((x*x)-c)*(x*x))*x)+x)']  a.k.a  [-c*x**3 + x**5 + x] c =  3.1415926535897927
72               68      ['(((c/x)/((c/x)/x))-((((x*c)*x)*x)-((((x*c)*x)*x)/((c/x)/x))))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.141592653589794
77               67      ['(x+((x*x)*(x*((x*x)+c))))']  a.k.a  [c*x**3 + x**5 + x] c =  -3.1415926535897927
81               891     ['(((x*(x/(x/x)))+(x*((x*(x/(x/x)))*((x*(x/(x/x)))+(c+(x/x))))))-((x*(x/(x/x)))-x))']  a.k.a  [c*x**3 + x**5 + x**3 + x] c =  -4.141592653589793
83               815     ['(((x*x)*(((x*x)-c)*x))+((x*c)/c))']  a.k.a  [-c*x**3 + x**5 + x] c =  3.1415926535897927
97               210     ['((((x*x)/c)+((x*x)*(x*x)))*(x+(c/x)))']  a.k.a  [c*x**3 + x**5 + x + x**3/c] c =  -2.782160919529689
[8]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)
ERT Expected run time = avg. number of dCGP evaluations needed:  19902.4444444
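The ERT estimator used above charges the generations spent in unsuccessful restarts to the successful ones: total generations over all restarts, divided by the number of successes, times the offspring per generation. A self-contained version of the computation, with made-up generation counts:

```python
import numpy as np

max_gen = 1000
offsprings = 4
# Toy per-restart generation counts; runs that reached max_gen - 1 failed
res = np.array([163, 452, 999, 309, 999, 764])
successes = np.sum(res < (max_gen - 1))      # 4 successful restarts
ert = np.sum(res) / successes * offsprings   # avg. dCGP evaluations per success
```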

Problem P2: \(x^5 - \pi x^3 + \frac{2\pi}{x}\)

[9]:
x = np.linspace(0.1,5,10) # we include points close to zero here to favour learning of 1/x
x = gdual(x)
yt = data_P2(x)

[10]:
# We run nexp experiments and accumulate statistic for the ERT
nexp = 100
offsprings = 4
max_gen=5000
res = []
kernels = kernel_set(["sum", "mul", "diff","pdiv"])()
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = randint(0,1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings,max_gen,x,yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen-1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)
restart:         gen:    expression:
2                4663    ['(((x-((((x*x)-c)*x)/(x*x)))+((((x*x)-c)*x)*(x*x)))+(x-((((x*x)-c)*x)/(x*x))))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.1415926535897913
6                1520    ['((x*((x*x)*((x*x)+c)))+((((x*x)-c)-((x*x)+c))/x))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897927
15               2945    ['(((x*((c*x)/c))*((x*(x*((c*x)/c)))+(c*x)))-((c+c)*(c/(c*x))))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897927
18               1937    ['((((x*x)*x)*(c+(x*x)))-(((x/(x*x))*c)+((x/(x*x))*c)))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.141592653589793
20               3489    ['(x*((c/(x*x))-(((x*x)*(c-(x*x)))-(c/(x*x)))))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.1415926535897905
21               3427    ['((((c+(x*x))*((x*x)*x))+x)-(((c+(x*x))*(x/(x*x)))-(x-((c+(x*x))*(x/(x*x))))))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897944
36               1999    ['((((x*x)*(x*c))*((x*x)/c))-(((x*x)*(x*c))-((c+c)/x)))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.141592653589792
44               3911    ['((((c+(x*x))*(x*x))-((c/(x*x))+(c/(x*x))))*x)']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.141592653589789
52               4993    ['((((x-x)/(c+c))-((x*(x*x))*(c-(x*x))))+((c+c)/x))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.141592653589792
56               3723    ['(((x-((c+(x*x))/x))+(x-((c+(x*x))/x)))+((c+(x*x))*(x*(x*x))))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897913
68               3551    ['(((((x*x)-c)-(c+(x*x)))/x)+((c+(x*x))*((x*x)*x)))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897936
69               1716    ['(((((x*x)-c)*(x*x))*x)+((c+c)/x))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.141592653589794
75               315     ['(((((x*x)*((x*x)+c))/(c/x))*((c/x)*x))-((c/x)+(c/x)))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.1415926535897922
76               3441    ['((((x*x)*((x*x)+c))*x)-((c/x)+(c/x)))']  a.k.a  [c*x**3 - 2*c/x + x**5] c =  -3.141592653589797
79               1366    ['(((c/x)+(c/x))+((((x*x)-c)*x)/((c/x)/((x*x)*(c/x)))))']  a.k.a  [-c*x**3 + 2*c/x + x**5] c =  3.1415926535897927
[11]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)
ERT Expected run time = avg. number of dCGP evaluations needed:  124776.266667

Problem P3: \(\frac{e x^5 + x^3}{x + 1}\)

[12]:
x = np.linspace(-0.9,1,10)
x = gdual(x)
yt = data_P3(x)

[13]:
# We run nexp experiments and accumulate statistic for the ERT
nexp = 100
offsprings = 4
max_gen=5000
res = []
kernels = kernel_set(["sum", "mul", "diff","pdiv"])()
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = randint(0,1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings,max_gen,x,yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen-1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)
restart:         gen:    expression:
1                2610    ['((((x*x)-((x*x)*(c*(x*x))))*(x*x))/(x+(x*x)))']  a.k.a  [-c*x**6/(x**2 + x) + x**4/(x**2 + x)] c =  -2.7182818284590438
5                3618    ['(((x*(x*(c*x)))*(x/((c*x)+(x*(c*x)))))*(((x-c)/(x-c))+(x*(c*x))))']  a.k.a  [c**2*x**6/(c*x**2 + c*x) + c*x**4/(c*x**2 + c*x)] c =  2.7182818284590478
10               3956    ['(((x*x)*((x*x)+(((x*x)*c)*(x*x))))/((x*x)+x))']  a.k.a  [c*x**6/(x**2 + x) + x**4/(x**2 + x)] c =  2.7182818284590455
14               4942    ['((((((c*x)*x)/c)*(((c/c)+((c*x)*x))*((c*x)*x)))*(c/c))/((c*x)+((c*x)*x)))']  a.k.a  [c**2*x**6/(c*x**2 + c*x) + c*x**4/(c*x**2 + c*x)] c =  2.718281828459043
27               4107    ['((((c*(x*x))*(x*x))+(x*x))*(((c*(x*x))*(x*x))/((x*(c*(x*x)))+((x*(c*(x*x)))*x))))']  a.k.a  [c**2*x**8/(c*x**4 + c*x**3) + c*x**6/(c*x**4 + c*x**3)] c =  2.7182818284590287
77               1921    ['((x*((x*x)+(x*(((x*x)*c)*x))))/(((x*((x*x)+(x*(((x*x)*c)*x))))/(x*((x*x)+(x*(((x*x)*c)*x)))))+x))']  a.k.a  [c*x**5/(x + 1) + x**3/(x + 1)] c =  2.7182818284590455
78               3011    ['(((x-((x*(c*x))*x))/((x*(c*x))+((x*(c*x))/x)))*((x*(c*x))*x))']  a.k.a  [-c**2*x**6/(c*x**2 + c*x) + c*x**4/(c*x**2 + c*x)] c =  -2.7182818284590495
[14]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)
ERT Expected run time = avg. number of dCGP evaluations needed:  279469.714286

Problem P4: \(\sin(\pi x) + \frac 1x\)

[42]:
x = np.linspace(-1,1,10)
x = gdual(x)
yt = data_P4(x)
[43]:
# We run nexp experiments and accumulate statistic for the ERT
nexp = 100
offsprings = 4
max_gen=5000
res = []
kernels = kernel_set(["sum", "mul", "diff","pdiv","sin"])()
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed = randint(0,1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings,max_gen,x,yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen-1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)
restart:         gen:    expression:
5                2914    ['(sin((x/c))+(((x-(x/c))/((x*(x-(x/c)))/c))/(x/(x/c))))']  a.k.a  [sin(x/c) + 1/x] c =  0.3183098859546274
12               808     ['((c/((x*c)+((x*c)-(x*c))))+sin((x+((x*c)+((x*c)-(x*c))))))']  a.k.a  [sin(c*x + x) + 1/x] c =  2.1415926536812546
17               1434    ['(sin(((x*c)+x))+(c/(x*c)))']  a.k.a  [sin(c*x + x) + 1/x] c =  2.1415926536986984
28               1079    ['((c/(x*c))+sin((x+((x*c)+(x*c)))))']  a.k.a  [sin(2*c*x + x) + 1/x] c =  1.0707963267948972
29               2113    ['(sin((((x/c)/c)+(x+x)))+((x/x)/x))']  a.k.a  [sin(2*x + x/c**2) + 1/x] c =  -0.935932260872573
35               3739    ['(((c/c)/x)+sin(((c-(c-x))+((x*(c+(c/c)))+x))))']  a.k.a  [sin(c*x + 3*x) + 1/x] c =  0.14159265368124904
36               127     ['((c/(x*c))-(sin(((x*c)-c))*((c/(x*c))/(c/(x*c)))))']  a.k.a  [-sin(c*x - c) + 1/x] c =  3.1415926536189205
39               805     ['(sin(((x*c)+(x*c)))+(c/(x*c)))']  a.k.a  [sin(2*c*x) + 1/x] c =  1.5707963267952043
46               4412    ['(sin(((x/c)+x))+((sin(((x/c)+x))+x)/((sin(((x/c)+x))+x)*x)))']  a.k.a  [sin(x + x/c) + 1/x] c =  0.4669422068457747
53               2197    ['((((c/x)*(c/x))*(((c/(c/x))/c)/c))+sin((((c/(c/x))/c)/c)))']  a.k.a  [sin(x/c**2) + 1/x] c =  0.5641895835477555
54               4082    ['((((c+(sin((x*c))*sin((x*c))))/(sin((x*c))*sin((x*c))))-c)/((((c+(sin((x*c))*sin((x*c))))/(sin((x*c))*sin((x*c))))-c)/((c/(x*c))+sin((x*c)))))']  a.k.a  [sin(c*x) + 1/x] c =  3.1415926535902257
59               601     ['(sin((x*c))+(c/(x*c)))']  a.k.a  [sin(c*x) + 1/x] c =  3.1415926537675927
64               4031    ['(sin((x+(x*c)))+(c/(x*c)))']  a.k.a  [sin(c*x + x) + 1/x] c =  2.1415926536986984
66               1069    ['((x/(x*x))+sin(((x+x)+(x/sin(c)))))']  a.k.a  [sin(2*x + x/sin(c)) + 1/x] c =  2.0741512360814633
73               978     ['((c/(x*c))+sin((x*c)))']  a.k.a  [sin(c*x) + 1/x] c =  3.141592653745218
74               2798    ['(sin((c*x))+(((c*x)/(c*x))/x))']  a.k.a  [sin(c*x) + 1/x] c =  3.141592653590368
81               2771    ['(sin((x+(x*c)))+(c/(x*c)))']  a.k.a  [sin(c*x + x) + 1/x] c =  2.1415926536986984
83               2865    ['(sin(((x-(x*c))-(x*c)))+(c/(x*c)))']  a.k.a  [-sin(2*c*x - x) + 1/x] c =  -1.0707963269360328
90               3688    ['(sin(((c*(x*c))-x))+(c/(x*c)))']  a.k.a  [sin(c**2*x - x) + 1/x] c =  2.035090330583488
[44]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)
ERT Expected run time = avg. number of dCGP evaluations needed:  94195.7894737