Learning constants in a symbolic regression task (deprecated)
One of the long-standing "skeletons in the closet" of GP techniques is the constant-finding problem. It is widely acknowledged that the ephemeral random constant approach, de facto the main solution proposed for this problem, is far from satisfactory.
Using dCGP, we are here able to successfully learn constants as well as expressions during evolution, thanks to the hybridization of the evolutionary strategy with a second-order gradient descent step that learns the exact value of the ephemeral constant, thus avoiding having to build such a value by applying kernel functions to constants.
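The mechanism is easy to see on a toy case stripped of dCGP entirely: whenever the constant c enters the expression linearly, the quadratic error E(c) is exactly quadratic in c, and a single Newton step c ← c − E′(c)/E″(c) recovers the optimum in one shot. A minimal pure-Python sketch (the expression x² + c and all names here are made up for illustration, not dcgpy code):

```python
import math

xs = [0.5 * i for i in range(1, 11)]     # grid of input points
ts = [x**2 + math.pi for x in xs]        # targets generated with c_true = pi

def dE(c):
    # first derivative of E(c) = sum_i (x_i^2 + c - t_i)^2
    return sum(2.0 * (x**2 + c - t) for x, t in zip(xs, ts))

def d2E(c):
    # second derivative: 2 per data point
    return 2.0 * len(xs)

c = 1.0                                  # initial guess for the constant
c = c - dE(c) / d2E(c)                   # single second-order (Newton) step
print(c)                                 # E is quadratic in c, so this lands exactly on pi
```

In dCGP the derivatives E′ and E″ come for free from the gdual arithmetic instead of being written by hand.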
NOTE: since v1.4, symbolic regression is performed via dedicated classes rather than by directly manipulating dcgpy.expression.
Let's first import dcgpy and pyaudi and set things up so as to compute our CGP using the type "gdual", thus getting all derivatives for free.
[1]:
from dcgpy import expression_gdual_vdouble as expression
from dcgpy import kernel_set_gdual_vdouble as kernel_set
from pyaudi import gdual_vdouble as gdual
import pyaudi
from matplotlib import pyplot as plt
import numpy as np
from random import randint
%matplotlib inline
The kernel functions
[2]:
# note the use of the protected division "pdiv" (not necessary here)
# note the call operator (returns the list of kernels)
kernels = kernel_set(["sum", "mul", "diff","pdiv"])()
The ES-(1+\(\lambda\)) algorithm
[41]:
def run_experiment(dCGP, offsprings, max_gen, x, yt, screen_output):
    # The offsprings' chromosome, fitness and constant
    chromosome = [1] * offsprings
    fitness = [1] * offsprings
    constant = [1] * offsprings
    # Init the best as the initial random dCGP
    best_chromosome = dCGP.get()
    best_constant = 1.
    fit, _ = err2(dCGP, x, yt, best_constant)
    best_fitness = sum(fit.constant_cf)
    # Main loop over generations
    for g in range(max_gen):
        # Mutate the best to produce the offspring
        for i in range(offsprings):
            dCGP.set(best_chromosome)
            dCGP.mutate_active(i + 1)
            fit, constant[i] = err2(dCGP, x, yt, best_constant)
            fitness[i] = sum(fit.constant_cf)
            chromosome[i] = dCGP.get()
        # Select the new best
        for i in range(offsprings):
            if fitness[i] <= best_fitness:
                if fitness[i] != best_fitness:
                    best_chromosome = chromosome[i]
                    best_fitness = fitness[i]
                    best_constant = constant[i]
                    dCGP.set(best_chromosome)
                    if screen_output:
                        print("New best found: gen: ", g, " value: ", fitness[i], dCGP.simplify(["x","c"]), "c =", best_constant)
        if best_fitness < 1e-14:
            break
    return g, best_chromosome, best_constant
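The loop above is a standard ES-(1+λ): λ mutants of the best chromosome are produced with mutate_active, and the best is replaced by any offspring that does at least as well. A self-contained toy version of the same scheme, run on a plain integer genome instead of a dCGP chromosome (all names and the Hamming-distance fitness are illustrative, not dcgpy API):

```python
import random

def one_plus_lambda(fitness_fn, parent, lam=4, max_gen=1000, seed=42):
    """Minimal (1+lambda) ES on a list-of-integers genome (illustrative only)."""
    rng = random.Random(seed)
    best, best_f = parent[:], fitness_fn(parent)
    for g in range(max_gen):
        # produce lam mutated copies of the current best
        children = []
        for _ in range(lam):
            child = best[:]
            child[rng.randrange(len(child))] = rng.randint(0, 9)
            children.append((fitness_fn(child), child))
        # select: replace the parent whenever a child does at least as well
        # (accepting ties allows neutral drift over fitness plateaus)
        for f, child in children:
            if f <= best_f:
                best, best_f = child, f
        if best_f == 0:
            break
    return best, best_f, g

target = [3, 1, 4, 1, 5, 9]
hamming = lambda v: sum(a != b for a, b in zip(v, target))
best, best_f, gens = one_plus_lambda(hamming, [0] * len(target))
print(best, best_f)
```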
The test problems
As target functions, we define four different problems of increasing complexity:
P1: \(x^5 - \pi x^3 + x\)
P2: \(x^5 - \pi x^3 + \frac{2\pi}x\)
P3: \(\frac{e x^5 + x^3}{x + 1}\)
P4: \(\sin(\pi x) + \frac 1x\)
Note how \(\pi\) and \(e\) appear in the expressions.
[45]:
# The following functions create the target values for a gridded input x for the different test problems
def data_P1(x):
    return x**5 - np.pi * x**3 + x

def data_P2(x):
    return x**5 - np.pi * x**3 + 2 * np.pi / x

def data_P3(x):
    return (np.e * x**5 + x**3) / (x + 1)

def data_P4(x):
    return pyaudi.sin(np.pi * x) + 1. / x

def data_P5(x):
    return np.e * x**5 - np.pi * x**3 + np.sqrt(2) * x
The error functions
[5]:
# This is the quadratic error of a dCGP expression when the constant value is cin. The error is computed
# over the input points xin (of type gdual, order 0, as we are not interested in expanding the program w.r.t. these).
# The target values are contained in yt (also of type gdual, order 0).
def err(dCGP, xin, yt, cin):
    c = gdual([cin], "c", 2)
    y = dCGP([xin, c])[0]
    return (y - yt)**2

# This is the quadratic error of the expression when the constant value is learned using a one-step,
# second-order (Newton) method.
def err2(dCGP, xin, yt, cin):
    c = gdual([cin], "c", 2)
    y = dCGP([xin, c])[0]
    error = err(dCGP, xin, yt, cin)
    dc = sum(error.get_derivative({"dc": 1}))
    dc2 = sum(error.get_derivative({"dc": 2}))
    if dc2 != 0:
        # One Newton step on the constant
        learned_constant = c - dc / dc2
        y = dCGP([xin, learned_constant])[0]
    else:
        learned_constant = c
    return (y - yt)**2, learned_constant.constant_cf[0]
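Worth noting: when the constant enters the expression linearly (as in \(c x^3\)), the error is exactly quadratic in c, so the single Newton step inside err2 lands on the optimal constant in one shot; when it enters nonlinearly (as \(\sin(cx)\) in P4), the step only improves the estimate, and its repetition across generations does the rest. A plain-Python check of the linear case, using centered finite differences as a stand-in for the gdual derivatives (toy data, hypothetical names):

```python
import math

xs = [0.2 * i for i in range(1, 11)]     # grid of input points
c_true = -math.pi
ts = [c_true * x**3 for x in xs]         # targets for y = c*x^3 with c = -pi

def E(c):
    # quadratic error of y = c*x^3 against the targets
    return sum((c * x**3 - t)**2 for x, t in zip(xs, ts))

h = 1e-3
c = 0.5                                          # initial guess
dc  = (E(c + h) - E(c - h)) / (2 * h)            # E'(c), centered difference
dc2 = (E(c + h) - 2 * E(c) + E(c - h)) / h**2    # E''(c), centered difference
c_new = c - dc / dc2                             # one Newton step
print(c_new)                                     # ~ -pi, up to finite-difference error
```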
Problem P1: \(x^5 - \pi x^3 + x\)
[17]:
x = np.linspace(1,3,10)
x = gdual(x)
yt = data_P1(x)
[18]:
# We run nexp experiments and accumulate statistics for the ERT
nexp = 100
offsprings = 4
max_gen = 1000
res = []
kernels = kernel_set(["sum", "mul", "diff", "pdiv"])()
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed=randint(0, 1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings, max_gen, x, yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen - 1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)
restart: gen: expression:
5 163 ['((((x*x)-c)/(x/((x*x)*(x*x))))+x)'] a.k.a [-c*x**3 + x**5 + x] c = 3.141592653589793
8 452 ['(x+((((((x/c)*(c*x))-(c+c))*((x/c)*(c*x)))-((x/c)*(c*x)))*x))'] a.k.a [-2*c*x**3 + x**5 - x**3 + x] c = 1.0707963267948972
12 309 ['(((x*x)*(((c-(x*x))*(x*x))/(((c-(x*x))*(x*x))*x)))-(((c-(x*x))*(x*x))*x))'] a.k.a [-c*x**3 + x**5 + x] c = 3.141592653589793
15 764 ['(((x/c)*c)+((x*(x*x))*((x*x)-c)))'] a.k.a [-c*x**3 + x**5 + x] c = 3.1415926535897927
18 193 ['(((x*(x*x))*((x*x)+c))-(c-(c+x)))'] a.k.a [c*x**3 + x**5 + x] c = -3.1415926535897927
36 334 ['(x+((x*x)*(((((x*x)-c)+x)*x)-(x*x))))'] a.k.a [-c*x**3 + x**5 + x] c = 3.1415926535897927
54 826 ['((((x*x)*x)*((c+(x*x))+c))+x)'] a.k.a [2*c*x**3 + x**5 + x] c = -1.5707963267948968
57 154 ['(((x+((x*x)+c))-((x*x)+c))+(((x*x)+c)*((x*x)*x)))'] a.k.a [c*x**3 + x**5 + x] c = -3.1415926535897927
58 312 ['((((x*x)+c)*(x*(x*x)))-((x*(x*x))+((x*(x*x))-x)))'] a.k.a [c*x**3 + x**5 - 2*x**3 + x] c = -1.1415926535897931
66 378 ['(((((x*x)-c)*(x*x))*x)+x)'] a.k.a [-c*x**3 + x**5 + x] c = 3.1415926535897927
72 68 ['(((c/x)/((c/x)/x))-((((x*c)*x)*x)-((((x*c)*x)*x)/((c/x)/x))))'] a.k.a [-c*x**3 + x**5 + x] c = 3.141592653589794
77 67 ['(x+((x*x)*(x*((x*x)+c))))'] a.k.a [c*x**3 + x**5 + x] c = -3.1415926535897927
81 891 ['(((x*(x/(x/x)))+(x*((x*(x/(x/x)))*((x*(x/(x/x)))+(c+(x/x))))))-((x*(x/(x/x)))-x))'] a.k.a [c*x**3 + x**5 + x**3 + x] c = -4.141592653589793
83 815 ['(((x*x)*(((x*x)-c)*x))+((x*c)/c))'] a.k.a [-c*x**3 + x**5 + x] c = 3.1415926535897927
97 210 ['((((x*x)/c)+((x*x)*(x*x)))*(x+(c/x)))'] a.k.a [c*x**3 + x**5 + x + x**3/c] c = -2.782160919529689
[8]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)
ERT Expected run time = avg. number of dCGP evaluations needed: 19902.4444444
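The ERT computed above charges the generations spent in unsuccessful restarts to the successful ones: the total number of generations over all restarts is divided by the number of restarts that converged (those stopping before max_gen − 1), then multiplied by the λ = 4 error evaluations per generation. With made-up numbers:

```python
# ERT on toy data: 3 restarts, one of which hits the generation budget and fails.
max_gen = 1000
offsprings = 4
res = [120, 999, 300]                             # generations used by each restart
successes = sum(r < (max_gen - 1) for r in res)   # restarts that converged: 2
ert = sum(res) / successes * offsprings           # evaluations charged per success
print(ert)                                        # (120 + 999 + 300) / 2 * 4 = 2838.0
```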
Problem P2: \(x^5 - \pi x^3 + \frac{2\pi}x\)
[9]:
x = np.linspace(0.1,5,10) # we include points close to zero here to favour learning of 1/x
x = gdual(x)
yt = data_P2(x)
[10]:
# We run nexp experiments and accumulate statistics for the ERT
nexp = 100
offsprings = 4
max_gen = 5000
res = []
kernels = kernel_set(["sum", "mul", "diff", "pdiv"])()
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed=randint(0, 1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings, max_gen, x, yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen - 1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)
restart: gen: expression:
2 4663 ['(((x-((((x*x)-c)*x)/(x*x)))+((((x*x)-c)*x)*(x*x)))+(x-((((x*x)-c)*x)/(x*x))))'] a.k.a [-c*x**3 + 2*c/x + x**5] c = 3.1415926535897913
6 1520 ['((x*((x*x)*((x*x)+c)))+((((x*x)-c)-((x*x)+c))/x))'] a.k.a [c*x**3 - 2*c/x + x**5] c = -3.1415926535897927
15 2945 ['(((x*((c*x)/c))*((x*(x*((c*x)/c)))+(c*x)))-((c+c)*(c/(c*x))))'] a.k.a [c*x**3 - 2*c/x + x**5] c = -3.1415926535897927
18 1937 ['((((x*x)*x)*(c+(x*x)))-(((x/(x*x))*c)+((x/(x*x))*c)))'] a.k.a [c*x**3 - 2*c/x + x**5] c = -3.141592653589793
20 3489 ['(x*((c/(x*x))-(((x*x)*(c-(x*x)))-(c/(x*x)))))'] a.k.a [-c*x**3 + 2*c/x + x**5] c = 3.1415926535897905
21 3427 ['((((c+(x*x))*((x*x)*x))+x)-(((c+(x*x))*(x/(x*x)))-(x-((c+(x*x))*(x/(x*x))))))'] a.k.a [c*x**3 - 2*c/x + x**5] c = -3.1415926535897944
36 1999 ['((((x*x)*(x*c))*((x*x)/c))-(((x*x)*(x*c))-((c+c)/x)))'] a.k.a [-c*x**3 + 2*c/x + x**5] c = 3.141592653589792
44 3911 ['((((c+(x*x))*(x*x))-((c/(x*x))+(c/(x*x))))*x)'] a.k.a [c*x**3 - 2*c/x + x**5] c = -3.141592653589789
52 4993 ['((((x-x)/(c+c))-((x*(x*x))*(c-(x*x))))+((c+c)/x))'] a.k.a [-c*x**3 + 2*c/x + x**5] c = 3.141592653589792
56 3723 ['(((x-((c+(x*x))/x))+(x-((c+(x*x))/x)))+((c+(x*x))*(x*(x*x))))'] a.k.a [c*x**3 - 2*c/x + x**5] c = -3.1415926535897913
68 3551 ['(((((x*x)-c)-(c+(x*x)))/x)+((c+(x*x))*((x*x)*x)))'] a.k.a [c*x**3 - 2*c/x + x**5] c = -3.1415926535897936
69 1716 ['(((((x*x)-c)*(x*x))*x)+((c+c)/x))'] a.k.a [-c*x**3 + 2*c/x + x**5] c = 3.141592653589794
75 315 ['(((((x*x)*((x*x)+c))/(c/x))*((c/x)*x))-((c/x)+(c/x)))'] a.k.a [c*x**3 - 2*c/x + x**5] c = -3.1415926535897922
76 3441 ['((((x*x)*((x*x)+c))*x)-((c/x)+(c/x)))'] a.k.a [c*x**3 - 2*c/x + x**5] c = -3.141592653589797
79 1366 ['(((c/x)+(c/x))+((((x*x)-c)*x)/((c/x)/((x*x)*(c/x)))))'] a.k.a [-c*x**3 + 2*c/x + x**5] c = 3.1415926535897927
[11]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)
ERT Expected run time = avg. number of dCGP evaluations needed: 124776.266667
Problem P3: \(\frac{e x^5 + x^3}{x + 1}\)
[12]:
x = np.linspace(-0.9,1,10)
x = gdual(x)
yt = data_P3(x)
[13]:
# We run nexp experiments and accumulate statistics for the ERT
nexp = 100
offsprings = 4
max_gen = 5000
res = []
kernels = kernel_set(["sum", "mul", "diff", "pdiv"])()
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed=randint(0, 1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings, max_gen, x, yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen - 1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)
restart: gen: expression:
1 2610 ['((((x*x)-((x*x)*(c*(x*x))))*(x*x))/(x+(x*x)))'] a.k.a [-c*x**6/(x**2 + x) + x**4/(x**2 + x)] c = -2.7182818284590438
5 3618 ['(((x*(x*(c*x)))*(x/((c*x)+(x*(c*x)))))*(((x-c)/(x-c))+(x*(c*x))))'] a.k.a [c**2*x**6/(c*x**2 + c*x) + c*x**4/(c*x**2 + c*x)] c = 2.7182818284590478
10 3956 ['(((x*x)*((x*x)+(((x*x)*c)*(x*x))))/((x*x)+x))'] a.k.a [c*x**6/(x**2 + x) + x**4/(x**2 + x)] c = 2.7182818284590455
14 4942 ['((((((c*x)*x)/c)*(((c/c)+((c*x)*x))*((c*x)*x)))*(c/c))/((c*x)+((c*x)*x)))'] a.k.a [c**2*x**6/(c*x**2 + c*x) + c*x**4/(c*x**2 + c*x)] c = 2.718281828459043
27 4107 ['((((c*(x*x))*(x*x))+(x*x))*(((c*(x*x))*(x*x))/((x*(c*(x*x)))+((x*(c*(x*x)))*x))))'] a.k.a [c**2*x**8/(c*x**4 + c*x**3) + c*x**6/(c*x**4 + c*x**3)] c = 2.7182818284590287
77 1921 ['((x*((x*x)+(x*(((x*x)*c)*x))))/(((x*((x*x)+(x*(((x*x)*c)*x))))/(x*((x*x)+(x*(((x*x)*c)*x)))))+x))'] a.k.a [c*x**5/(x + 1) + x**3/(x + 1)] c = 2.7182818284590455
78 3011 ['(((x-((x*(c*x))*x))/((x*(c*x))+((x*(c*x))/x)))*((x*(c*x))*x))'] a.k.a [-c**2*x**6/(c*x**2 + c*x) + c*x**4/(c*x**2 + c*x)] c = -2.7182818284590495
[14]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)
ERT Expected run time = avg. number of dCGP evaluations needed: 279469.714286
Problem P4: \(\sin(\pi x) + \frac 1x\)
[42]:
x = np.linspace(-1,1,10)
x = gdual(x)
yt = data_P4(x)
[43]:
# We run nexp experiments and accumulate statistics for the ERT
# note the addition of the "sin" kernel, needed for this problem
nexp = 100
offsprings = 4
max_gen = 5000
res = []
kernels = kernel_set(["sum", "mul", "diff", "pdiv", "sin"])()
print("restart: \t gen: \t expression:")
for i in range(nexp):
    dCGP = expression(inputs=2, outputs=1, rows=1, cols=15, levels_back=16, arity=2, kernels=kernels, seed=randint(0, 1233456))
    g, best_chromosome, best_constant = run_experiment(dCGP, offsprings, max_gen, x, yt, screen_output=False)
    res.append(g)
    dCGP.set(best_chromosome)
    if g < (max_gen - 1):
        print(i, "\t\t", res[i], "\t", dCGP(["x","c"]), " a.k.a ", dCGP.simplify(["x","c"]), "c = ", best_constant)
res = np.array(res)
restart: gen: expression:
5 2914 ['(sin((x/c))+(((x-(x/c))/((x*(x-(x/c)))/c))/(x/(x/c))))'] a.k.a [sin(x/c) + 1/x] c = 0.3183098859546274
12 808 ['((c/((x*c)+((x*c)-(x*c))))+sin((x+((x*c)+((x*c)-(x*c))))))'] a.k.a [sin(c*x + x) + 1/x] c = 2.1415926536812546
17 1434 ['(sin(((x*c)+x))+(c/(x*c)))'] a.k.a [sin(c*x + x) + 1/x] c = 2.1415926536986984
28 1079 ['((c/(x*c))+sin((x+((x*c)+(x*c)))))'] a.k.a [sin(2*c*x + x) + 1/x] c = 1.0707963267948972
29 2113 ['(sin((((x/c)/c)+(x+x)))+((x/x)/x))'] a.k.a [sin(2*x + x/c**2) + 1/x] c = -0.935932260872573
35 3739 ['(((c/c)/x)+sin(((c-(c-x))+((x*(c+(c/c)))+x))))'] a.k.a [sin(c*x + 3*x) + 1/x] c = 0.14159265368124904
36 127 ['((c/(x*c))-(sin(((x*c)-c))*((c/(x*c))/(c/(x*c)))))'] a.k.a [-sin(c*x - c) + 1/x] c = 3.1415926536189205
39 805 ['(sin(((x*c)+(x*c)))+(c/(x*c)))'] a.k.a [sin(2*c*x) + 1/x] c = 1.5707963267952043
46 4412 ['(sin(((x/c)+x))+((sin(((x/c)+x))+x)/((sin(((x/c)+x))+x)*x)))'] a.k.a [sin(x + x/c) + 1/x] c = 0.4669422068457747
53 2197 ['((((c/x)*(c/x))*(((c/(c/x))/c)/c))+sin((((c/(c/x))/c)/c)))'] a.k.a [sin(x/c**2) + 1/x] c = 0.5641895835477555
54 4082 ['((((c+(sin((x*c))*sin((x*c))))/(sin((x*c))*sin((x*c))))-c)/((((c+(sin((x*c))*sin((x*c))))/(sin((x*c))*sin((x*c))))-c)/((c/(x*c))+sin((x*c)))))'] a.k.a [sin(c*x) + 1/x] c = 3.1415926535902257
59 601 ['(sin((x*c))+(c/(x*c)))'] a.k.a [sin(c*x) + 1/x] c = 3.1415926537675927
64 4031 ['(sin((x+(x*c)))+(c/(x*c)))'] a.k.a [sin(c*x + x) + 1/x] c = 2.1415926536986984
66 1069 ['((x/(x*x))+sin(((x+x)+(x/sin(c)))))'] a.k.a [sin(2*x + x/sin(c)) + 1/x] c = 2.0741512360814633
73 978 ['((c/(x*c))+sin((x*c)))'] a.k.a [sin(c*x) + 1/x] c = 3.141592653745218
74 2798 ['(sin((c*x))+(((c*x)/(c*x))/x))'] a.k.a [sin(c*x) + 1/x] c = 3.141592653590368
81 2771 ['(sin((x+(x*c)))+(c/(x*c)))'] a.k.a [sin(c*x + x) + 1/x] c = 2.1415926536986984
83 2865 ['(sin(((x-(x*c))-(x*c)))+(c/(x*c)))'] a.k.a [-sin(2*c*x - x) + 1/x] c = -1.0707963269360328
90 3688 ['(sin(((c*(x*c))-x))+(c/(x*c)))'] a.k.a [sin(c**2*x - x) + 1/x] c = 2.035090330583488
[44]:
mean_gen = sum(res) / sum(res<(max_gen-1))
print("ERT Expected run time = avg. number of dCGP evaluations needed: ", mean_gen * offsprings)
ERT Expected run time = avg. number of dCGP evaluations needed: 94195.7894737