1 Slightly beyond Turing's computability for studying Genetic Programming. Olivier Teytaud, Tao, Inria, Lri, UMR CNRS 8623, Univ. Paris-Sud, Pascal, Digiteo

2 Outline
- What is genetic programming
- Formal analysis of genetic programming
- Why is there nothing else than genetic programming?
  - Computability point of view
  - Complexity point of view

3 What is Genetic Programming (GP)?
GP = mining Turing-equivalent spaces of functions.
Typical example: symbolic regression.
Inputs:
- x1, x2, x3, …, xN in {0,1}*
- y1, y2, y3, …, yN in {0,1}, with yi = f(xi)
- (xi, yi) assumed independently identically distributed (unknown probability distribution)
Goal: find g such that E|g(x) - y| + C · E[Time(g, x)] is as small as possible.
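A minimal sketch (not from the slides) of how this objective could be estimated on the sample; the constant C, the name score, and the use of wall-clock time via time.perf_counter as a stand-in for Time(g, x) are illustrative assumptions.

```python
import time

C = 0.01  # assumed trade-off constant between error and run time

def score(g, xs, ys):
    """Empirical estimate of E|g(x) - y| + C * E[Time(g, x)] on the sample."""
    total_err, total_time = 0.0, 0.0
    for x, y in zip(xs, ys):
        start = time.perf_counter()
        prediction = g(x)                       # run the candidate program on x
        total_time += time.perf_counter() - start
        total_err += abs(prediction - y)        # 0/1 outputs, so this is the error
    n = len(xs)
    return total_err / n + C * total_time / n
```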

4 How does GP work?
GP = evolutionary algorithm.
Evolutionary algorithm:
- P = initial population
- While (my favorite criterion):
  - Selection = best functions in P according to some score
  - Mutations = random perturbations of programs in the Selection
  - Cross-over = merging of programs in the Selection
  - P ≈ Selection + Mutations + Cross-over
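A minimal sketch of this loop, assuming the score function sketched earlier and hypothetical random_program, mutate, and crossover helpers for whatever program representation is used (trees, bitstrings, …); none of these names come from the slides.

```python
import random

def genetic_programming(xs, ys, random_program, mutate, crossover,
                        pop_size=50, n_selected=10, budget=100):
    """Skeleton of the evolutionary loop: selection, mutation, cross-over."""
    population = [random_program() for _ in range(pop_size)]
    for _ in range(budget):  # "my favorite criterion": here, a fixed iteration budget
        # Selection: best programs in P according to the score (lower is better).
        selected = sorted(population, key=lambda g: score(g, xs, ys))[:n_selected]
        # Mutations: random perturbations of programs in the Selection.
        mutations = [mutate(random.choice(selected)) for _ in range(pop_size // 2)]
        # Cross-over: merging of pairs of programs in the Selection.
        crossovers = [crossover(*random.sample(selected, 2)) for _ in range(pop_size // 2)]
        population = selected + mutations + crossovers   # P ≈ Selection + Mutations + Cross-over
    return min(population, key=lambda g: score(g, xs, ys))
```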

5 How does GP work? (same algorithm as slide 4) Does it work?

6 How does GP work? (same algorithm as slide 4) Does it work? Definitely yes, for robust and multimodal optimization in complex domains (trees, bitstrings, …).

7 How does GP work? (same algorithm as slide 4) Does it work?

8 How does GP work? (same algorithm as slide 4) Which score? A nice question for mathematicians.

9 Why study GP?
- GP is studied by many people: 5440 articles in the GP bibliography [5], more than 880 authors.
- GP seemingly works: human-competitive results, http://www.genetic-programming.com/humancompetitive.html
- There is nothing else for mining Turing-equivalent spaces of programs.
- Probably better than random search.
- Not so many mathematical foundations in GP.
- Not so many open problems in computability, in particular with applications.

10 Outline
- What is genetic programming
- Formal analysis of genetic programming
- Why is there nothing else than genetic programming?
  - Computability point of view
  - Complexity point of view

11 Formalization of GP
What is typical of GP?
- No halting criterion: we stop when time is exhausted.
- No use of prior knowledge; no use of f, even when you know it.
People (often) do not like GP because:
- It is slow and has no halting criterion.
- It uses the yi = f(xi) and not f (different from automatic code generation).
→ Are these two elements necessary?

12 Iterative algorithms

13 Black-box?

14 Formalization of GP
Summary: GP uses only the f(xi) and the Time(f, xi). GP never halts: it outputs O1, O2, O3, … Can we do better?

15 Outline
- What is genetic programming
- Formal analysis of genetic programming
- Why is there nothing else than genetic programming?
  - Computability point of view
  - Complexity point of view

16 Known results
Whenever f is available (and not only the f(xi)), computing O such that
- O ≡ f
- O optimal for size (or speed, or space, …)
is not possible (i.e. there is no Turing machine performing that task for all f).
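One standard way to make this precise (my formalization, not necessarily the paper's exact statement), writing $\varphi_e$ for the function computed by program $e$ and $|e|$ for its size: there is no total computable $\Phi$ such that, for all $e$,

$$\varphi_{\Phi(e)} = \varphi_e \quad\text{and}\quad |\Phi(e)| = \min\{\,|e'| : \varphi_{e'} = \varphi_e\,\}.$$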

17 A first (easy) good reason for GP.
Whenever f is available (and not only the f(xi)), computing O1, O2, …, such that
- Op ≡ f for p sufficiently large
- lim size(Op) optimal
is possible, with proved convergence rates, e.g. by bloat penalization:
- while (true):
  - select the best program P for a compromise between relevance on the n first examples and a penalization of size, e.g. Sum_{i<n} |P(xi) - yi| + C(|P|, n)
  - n = n+1
(see details of the proof and of the algorithm in the paper)

18 A first (easy) good reason for GP.
Whenever f is not available (only the f(xi) are), computing O1, O2, …, such that
- Op ≡ f for p sufficiently large
- lim size(Op) optimal
is possible, with proved convergence rates, e.g. by bloat penalization:
- consider a population of programs; set n = 1
- while (true):
  - select the best program P for a compromise between relevance on the n first examples and a penalization of size, e.g. Sum_{i<n} |P(xi) - yi| + C(|P|, n)
  - n = n+1
(see details of the proof and of the algorithm in the paper)
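A hedged sketch of this anytime procedure; the candidates population, the size function, and the concrete penalty used for C(|P|, n) here are illustrative assumptions, not the paper's exact parametrization.

```python
import math

def bloat_penalty(size, n):
    # Illustrative choice of C(|P|, n): shrinks as n grows, so asymptotically only
    # relevance matters, while for finite n larger programs are penalized.
    return size * math.sqrt(math.log(n + 1) / n)

def anytime_gp(xs, ys, candidates, size):
    """Anytime loop: never halts; yields O1, O2, O3, ... as n grows."""
    n = 1
    while True:
        def penalized(P):
            err = sum(abs(P(x) - y) for x, y in zip(xs[:n], ys[:n]))  # Sum_{i<n} |P(xi) - yi|
            return err + bloat_penalty(size(P), n)                     # + C(|P|, n)
        yield min(candidates, key=penalized)  # best compromise on the first n examples
        n += 1
```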

19 A first (easy) good reason for GP.
- Asymptotically (only!), finding an optimal function O ≡ f is possible.
- No halting criterion is possible (avoids the use of an oracle in 0').

20 Outline
- What is genetic programming
- Formal analysis of genetic programming
- Why is there nothing else than genetic programming?
  - Computability point of view
  - Complexity point of view

21 Outline
- What is genetic programming
- Formal analysis of genetic programming
- Why is there nothing else than genetic programming?
  - Computability point of view
  - Complexity point of view:
    - Kolmogorov's complexity with bounded time
    - Application to genetic programming

22 Kolmogorov's complexity
- Kolmogorov's complexity of x: minimum size of a program generating x.
- Kolmogorov's complexity of x with time at most T: minimum size of a program generating x in time at most T.
- Kolmogorov's complexity in bounded time is computable.
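A toy sketch of why the bounded-time variant is computable: there are finitely many programs up to a given size, and each simulation is cut off after T steps. The bit-string program space, the max_size cap, and the run(program, max_steps) interpreter are assumptions for illustration.

```python
from itertools import product

def bounded_kolmogorov(x, T, run, max_size=20):
    """Smallest bit-string program that outputs x within T steps, searching
    sizes 1..max_size; returns None if no such program is found.
    `run(program, max_steps)` is an assumed interpreter returning the program's
    output, or None if it does not halt within max_steps steps."""
    for size in range(1, max_size + 1):
        for bits in product("01", repeat=size):   # finitely many programs of this size
            program = "".join(bits)
            if run(program, T) == x:              # halts within T steps and prints x
                return size                       # first hit is the minimum size
    return None
```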

23 Outline
- What is genetic programming
- Formal analysis of genetic programming
- Why is there nothing else than genetic programming?
  - Computability point of view
  - Complexity point of view:
    - Kolmogorov's complexity with bounded time
    - Application to genetic programming

24 Kolmogorov's complexity and genetic programming
- GP uses expensive simulations of programs.
- Can we get rid of the simulation time, e.g. by using f not only as a black box?
- Essentially, no:
  - Example of GP problem: find O as small as possible with ETime(O, x) < T_n, |O| < S_n, and O(x) = y.
  - If T_n = Ω(2^n) and S_n = O(log(n)), this requires time at least T_n / polynomial(n).
  - Just simulating all programs shorter than S_n and "faster" than T_n is possible in time polynomial(n) · T_n.
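A worked count behind the last bullet, under the slide's assumption S_n = O(log(n)): there are at most 2^{S_n+1} programs of size at most S_n, which is polynomial in n, and each is simulated for at most T_n steps, so the exhaustive search costs

$$2^{S_n+1}\cdot T_n \;=\; 2^{O(\log n)}\cdot T_n \;=\; \mathrm{poly}(n)\cdot T_n.$$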

25 Outline
- What is genetic programming
- Formal analysis of genetic programming
- Why is there nothing else than genetic programming?
  - Computability point of view
  - Complexity point of view:
    - Kolmogorov's complexity with bounded time
    - Application to genetic programming
- Conclusion

26 Conclusion
Summary:
- GP is typically solving, approximately, problems in 0'.
- There is a lot of work about approximating NP-complete problems, but not a lot about 0'.
- We provide a theoretical analysis of GP.
Conclusions:
- GP uses expensive simulations, but the simulation cost cannot be removed anyway.
- GP has no halting criterion, but no halting criterion can be found.
- Also, « bloat » penalization ensures consistency → this point proposes a parametrization of the usual algorithms.


