Slightly beyond Turing's computability for studying Genetic Programming
Olivier Teytaud (Tao, Inria; Lri, UMR CNRS 8623, Univ. Paris-Sud; Pascal; Digiteo)

Outline
- What is genetic programming?
- Formal analysis of Genetic Programming
- Why is there nothing other than Genetic Programming?
  - Computability point of view
  - Complexity point of view

What is Genetic Programming (GP)?
GP = mining Turing-equivalent spaces of functions.
Typical example: symbolic regression.
Inputs:
- x1, x2, x3, ..., xN in {0,1}*
- y1, y2, y3, ..., yN in {0,1}, with yi = f(xi)
- the pairs (xi, yi) are assumed independent and identically distributed (unknown probability distribution)
Goal: find g such that E|g(x) - y| + C E[Time(g, x)] is as small as possible.
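As an illustration (not from the slides), here is a minimal Python sketch of how this objective can be estimated from the examples: empirical error plus a time penalty weighted by C. The candidate program, the examples, and the value of C are all hypothetical.

import time

C = 1e-6  # hypothetical weight on execution time

def score(candidate, examples, C=C):
    """Empirical estimate of E|g(x) - y| + C * E[Time(g, x)]."""
    total_error, total_time = 0.0, 0.0
    for x, y in examples:
        start = time.perf_counter()
        prediction = candidate(x)
        total_time += time.perf_counter() - start
        total_error += abs(prediction - y)
    n = len(examples)
    return total_error / n + C * (total_time / n)

# Toy usage: candidates map bitstrings in {0,1}* to {0,1}.
examples = [("0110", 0), ("111", 1), ("10", 1), ("0000", 0)]
g = lambda bits: bits.count("1") % 2   # one candidate program (parity)
print(score(g, examples))              # zero error, tiny time penalty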

How does GP work?
GP = evolutionary algorithm.
Evolutionary algorithm:
- P = initial population
- While (my favorite criterion):
  - Selection = best functions in P according to some score
  - Mutations = random perturbations of programs in the Selection
  - Cross-over = merging of programs in the Selection
  - P ≈ Selection + Mutations + Cross-over

Does it work? Definitely yes, for robust and multimodal optimization in complex domains (trees, bitstrings, ...).

Which score? A nice question for mathematicians.
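To make the loop concrete, here is a minimal, self-contained Python sketch of such an evolutionary loop for symbolic regression over a toy expression language. The representation, operators, and budgets are illustrative assumptions, not the algorithm analysed in the slides.

import random

# Toy GP for symbolic regression: evolve an expression approximating a target.
OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b, "-": lambda a, b: a - b}
TERMINALS = ["x", 1.0, 2.0]

def random_expr(depth=3):
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    op = random.choice(list(OPS))
    return (op, random_expr(depth - 1), random_expr(depth - 1))

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left, x), evaluate(right, x))

def score(expr, samples):
    # Empirical error only; these toy programs always terminate quickly.
    return sum(abs(evaluate(expr, x) - y) for x, y in samples)

def mutate(expr):
    return random_expr(2) if random.random() < 0.5 else expr

def crossover(a, b):
    # Crude merge: keep a's operator and left subtree, take b's right subtree.
    if isinstance(a, tuple) and isinstance(b, tuple):
        return (a[0], a[1], b[2])
    return random.choice([a, b])

def gp(samples, pop_size=50, generations=40):
    population = [random_expr() for _ in range(pop_size)]
    for _ in range(generations):                       # "my favorite criterion"
        selection = sorted(population, key=lambda e: score(e, samples))[:10]
        mutations = [mutate(random.choice(selection)) for _ in range(20)]
        crossovers = [crossover(*random.sample(selection, 2)) for _ in range(20)]
        population = selection + mutations + crossovers
    return min(population, key=lambda e: score(e, samples))

target = lambda x: x * x + x
samples = [(x, target(x)) for x in range(-5, 6)]
best = gp(samples)
print(best, score(best, samples))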

Why study GP?
- GP is studied by many people:
  - 5440 articles in the GP bibliography [5]
  - more than 880 authors
- GP seemingly works:
  - human-competitive results: http://www.genetic-programming.com/humancompetitive.html
- Nothing else exists for mining Turing-equivalent spaces of programs
- Probably better than random search
- Not so many mathematical foundations in GP
- Not so many open problems in computability, in particular with applications

Formalization of GP
What is GP, typically?
- No halting criterion: we stop when time is exhausted.
- No use of prior knowledge; no use of f, even when you know it.
People (often) do not like GP because:
- it is slow and has no halting criterion;
- it uses the yi = f(xi) and not f itself (different from automatic code generation).
Are these two elements necessary?

Iterative algorithms

Black-box ?

Formalization of GP
Summary: GP uses only the f(xi) and the Time(f, xi). GP never halts: it outputs O1, O2, O3, ...
Can we do better?

Known results
Even when f itself is available (and not only the f(xi)), computing O such that
- O ≡ f, and
- O is optimal for size (or speed, or space, ...)
is not possible: there is no Turing machine performing that task for all f.

A first (easy) good reason for GP
Even when f is not available (only the examples f(xi) are), computing O1, O2, ... such that
- Op ≡ f for p sufficiently large, and
- lim size(Op) is optimal
is possible, with proved convergence rates, e.g. by bloat penalization:
- consider a population of programs; set n = 1
- while (true):
  - select the best program P for a compromise between relevance on the n first examples and a penalization of size, e.g.
    Sum_{i < n} |P(xi) - yi| + C(|P|, n)
  - n = n + 1
(see details of the proof and of the algorithm in the paper)
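Here is a minimal Python sketch of the bloat-penalization schedule above. The candidate set, the sizes, and the concrete penalty C(|P|, n) = |P| * log(n + 1) are my own illustrative choices; the slide leaves C abstract.

import math

def parity(bits):    return bits.count("1") % 2
def always0(bits):   return 0
def first_bit(bits): return int(bits[0])

candidates = [(parity, 3), (always0, 1), (first_bit, 2)]   # (program, size |P|)

def C(size, n):
    return size * math.log(n + 1)   # hypothetical penalty function

def select(n, examples):
    """argmin over P of  sum_{i<n} |P(xi) - yi|  +  C(|P|, n)."""
    def penalized(pair):
        prog, size = pair
        return sum(abs(prog(x) - y) for x, y in examples[:n]) + C(size, n)
    return min(candidates, key=penalized)

# Labelled examples: parity of the binary writing of 1..32 (the unknown f).
examples = [(format(i, "b"), format(i, "b").count("1") % 2) for i in range(1, 33)]

n = 1
while n <= len(examples):            # the slide's "while (true)", truncated here
    prog, size = select(n, examples)
    if n % 4 == 0:
        print(f"n={n:2d}  selected: {prog.__name__}")
    n += 1
# Small programs win at first; once enough examples are seen, the consistent
# (slightly larger) program is selected and stays selected.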

A first (easy) good reason for GP
- Asymptotically (only!), finding an optimal function O ≡ f is possible.
- No halting criterion is possible (not requiring one avoids the use of an oracle in 0').

Outline
- What is genetic programming?
- Formal analysis of Genetic Programming
- Why is there nothing other than Genetic Programming?
  - Computability point of view
  - Complexity point of view:
    - Kolmogorov complexity with bounded time
    - Application to genetic programming

Kolmogorov complexity
- Kolmogorov complexity of x: the minimum size of a program generating x.
- Kolmogorov complexity of x with time at most T: the minimum size of a program generating x in time at most T.
- Kolmogorov complexity with bounded time is computable.
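In symbols (standard definitions, with U a fixed universal machine; the notation is mine, not the slides'):

K(x) = \min \{ |p| : U(p) = x \}
K^{T}(x) = \min \{ |p| : U(p) = x \text{ in at most } T \text{ steps} \}

The bounded version K^T is computable: only the finitely many programs of each candidate length need to be simulated, for at most T steps each.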

Kolmogorov complexity and genetic programming
- GP uses expensive simulations of programs.
- Can we get rid of the simulation time, e.g. by using f not only as a black box?
- Essentially, no:
  - Example of GP problem: find O as small as possible with
    - E[Time(O, x)] < T_n,
    - |O| < S_n,
    - O(x) = y.
  - If T_n = Ω(2^n) and some S_n = O(log(n)), this requires time at least T_n / polynomial(n).
  - Just simulating all programs shorter than S_n and "faster" than T_n is possible in time polynomial(n) · T_n.
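To illustrate the last point (and why bounded-time Kolmogorov complexity is computable), here is a hedged Python sketch that exhaustively simulates every program of size at most S over a toy instruction set, each within a budget of T steps. The instruction set and its cost model are my own illustrative choices; the point is only the cost structure, roughly (number of programs) * T.

from itertools import product

# Toy instruction set on an integer accumulator (starting at 0), with an
# arbitrary cost model: 'i' (acc += 1) costs 1 step, 'd' (acc *= 2) and
# 's' (acc = acc**2) cost `acc` steps (think: repeated additions).
def run(program, max_steps):
    acc, steps = 0, 0
    for op in program:
        cost = 1 if op == "i" else max(acc, 1)
        steps += cost
        if steps > max_steps:
            return None, steps          # time budget exhausted
        if op == "i":
            acc += 1
        elif op == "d":
            acc *= 2
        elif op == "s":
            acc = acc * acc
    return acc, steps

def bounded_kolmogorov(target, S, T):
    """Size of the smallest program (at most S instructions) producing `target`
    within T steps, by exhaustive simulation: at most (#programs) * T work."""
    for size in range(S + 1):
        for program in product("ids", repeat=size):
            value, _ = run(program, T)
            if value == target:
                return size, "".join(program)
    return None

print(bounded_kolmogorov(64, S=8, T=50))   # a short program is found within budget
print(bounded_kolmogorov(64, S=8, T=5))    # too few steps: no program of size <= 8 qualifies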

Outline
- What is genetic programming?
- Formal analysis of Genetic Programming
- Why is there nothing other than Genetic Programming?
  - Computability point of view
  - Complexity point of view:
    - Kolmogorov complexity with bounded time
    - Application to genetic programming
- Conclusion

Conclusion
Summary:
- GP typically solves, approximately, problems in 0'.
- There is a lot of work on approximating NP-complete problems, but not a lot on 0'.
- We provide a theoretical analysis of GP.
Conclusions:
- GP uses expensive simulations, but the simulation cost cannot be removed anyway.
- GP has no halting criterion, but no halting criterion can be found.
- Also, "bloat" penalization ensures consistency; this suggests a parametrization of the usual algorithms.