Convergence in Genetic Programming

Slides:



Advertisements
Similar presentations
A new crossover technique in Genetic Programming Janet Clegg Intelligent Systems Group Electronics Department.
Advertisements

CS 206 Introduction to Computer Science II 09 / 22 / 2008 Instructor: Michael Eckmann.
Doug Downey, adapted from Bryan Pardo, Machine Learning EECS 349 Machine Learning Genetic Programming.
Intro to AI Genetic Algorithm Ruth Bergman Fall 2002.
CS 206 Introduction to Computer Science II 02 / 11 / 2009 Instructor: Michael Eckmann.
CS 447 Advanced Topics in Artificial Intelligence Fall 2002.
What is Neutral? Neutral Changes and Resiliency Terence Soule Department of Computer Science University of Idaho.
Intro to AI Genetic Algorithm Ruth Bergman Fall 2004.
Genetic Programming. Agenda What is Genetic Programming? Background/History. Why Genetic Programming? How Genetic Principles are Applied. Examples of.
CS 61B Data Structures and Programming Methodology Aug 11, 2008 David Sun.
Genetic Programming.
Genetic Programming Chapter 6. A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Genetic Programming GP quick overview Developed: USA.
CS 206 Introduction to Computer Science II 09 / 30 / 2009 Instructor: Michael Eckmann.
B-trees (Balanced Trees) A B-tree is a special kind of tree, similar to a binary tree. However, It is not a binary search tree. It is not a binary tree.
Problems Premature Convergence Lack of genetic diversity Selection noise or variance Destructive effects of genetic operators Cloning Introns and Bloat.
What is Genetic Programming? Genetic programming is a model of programming which uses the ideas (and some of the terminology) of biological evolution to.
Genetic Programming on General Purpose Graphics Processing Units (GPGPGPU) Muhammad Iqbal Evolutionary Computation Research Group School of Engineering.
L ECTURE 3 Notes are modified from EvoNet Flying Circus slides. Found at: www2.cs.uh.edu/~ceick/ai/EC1.ppt.
1/27 Discrete and Genetic Algorithms in Bioinformatics 許聞廉 中央研究院資訊所.
GENETIC ALGORITHMS.  Genetic algorithms are a form of local search that use methods based on evolution to make small changes to a popula- tion of chromosomes.
Lecture 11COMPSCI.220.FS.T Balancing an AVLTree Two mirror-symmetric pairs of cases to rebalance the tree if after the insertion of a new key to.
Introduction to Genetic Algorithms. Genetic Algorithms We’ve covered enough material that we can write programs that use genetic algorithms! –More advanced.
2101INT – Principles of Intelligent Systems Lecture 11.
Genetic Programming A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Chapter 6.
Automated discovery in math Machine learning techniques (GP, ILP, etc.) have been successfully applied in science Machine learning techniques (GP, ILP,
RMIT UNIVERSITY ASPGPAnalysis of Genetic Programming Runs1 Vic Ciesielski and Xiang Li {vc, School of Computer Science and Information.
Sporadic model building for efficiency enhancement of the hierarchical BOA Genetic Programming and Evolvable Machines (2008) 9: Martin Pelikan, Kumara.
Genetic Programming Using Simulated Natural Selection to Automatically Write Programs.
Genetics in EACirc DESCRIPTION OF THE COMPONENTS BASED ON EVOLUTION.
Genetic Programming. What is Genetic Programming? GP for Symbolic Regression Other Representations for GP Example of GP for Knowledge Discovery Outline.
Exploratory Data Analysis
Genetic Algorithms 11-Oct-17.
Genetic Algorithms.
Genetic Programming.
Haploid-Diploid Evolutionary Algorithms
Chapter 14 Genetic Algorithms.
Selected Topics in CI I Genetic Programming Dr. Widodo Budiharto 2014.
An Evolutionary Approach
Evolutionary Algorithms Jim Whitehead
Introduction Genetic programming falls into the category of evolutionary algorithms. Genetic algorithms vs. genetic programming. Concept developed by John.
Evolution Strategies Evolutionary Programming
W.B. Langdon Computer Science University College, London
Chapter 5. Greedy Algorithms
Binary Search Trees A binary search tree is a binary tree
Presented by: Dr Beatriz de la Iglesia
B+-Trees.
Multiway search trees and the (2,4)-tree
A Distributed Genetic Algorithm for Learning Evaluation Parameters in a Strategy Game Gary Matthias.
B+ Tree.
CSC 380: Design and Analysis of Algorithms
(edited by Nadia Al-Ghreimil)
Haploid-Diploid Evolutionary Algorithms
Monday, April 16, 2018 Announcements… For Today…
Optimization and Learning via Genetic Programming
Artificial Intelligence Chapter 4. Machine Evolution
Assessing Normality and Data Transformations
Trees & Forests D. J. Foreman.
GENETIC ALGORITHMS & MACHINE LEARNING
Artificial Intelligence Chapter 4. Machine Evolution
EE368 Soft Computing Genetic Algorithms.
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Genetic Programming Chapter 6.
Genetic Programming.
Volume 5, Issue 4, Pages e4 (October 2017)
Genetic Programming Chapter 6.
Artificial Intelligence CIS 342
CSE 373 Data Structures and Algorithms
Genetic Programming Chapter 6.
Beyond Classical Search
CSC 380: Design and Analysis of Algorithms
Presentation transcript:

Convergence in Genetic Programming W. B. Langdon Computer Science. University College, London Humies $10000 Human-Competitive Results 500 trees 6.5.2017

Convergence in Genetic Programming and Long-Term Evolution Experiments Evolving Bacteria 60,000 generations v. evolving programs 100,000 generations LTEE continuous innovation v convergence Results quadratic tree growth differences from crossover only theory converge of binary trees random drift stops bloats W. B. Langdon, UCL

Genetic Programming and Long-Term Evolution Experiments Evolving Bacteria 60,000 generations v. evolving programs 100,000 generations LTEE continuous innovation v convergence Results Increase in code (bloat), end of bloat Theory some true, some less so W. B. Langdon, UCL

Long-Term Evolution Experiment Mean fitness of nine E. coli populations from the LTEE Evolving Bacteria 60,000 generations Even after 60000 gens fitness still improving Richard Lenski pulls frozen bacteria cultures out of a freezer 15 Oct 2009 R. E. Lenski et al. 2015. Sustained fitness gains and variability in fitness trajectories in the long-term evolution experiment with Escherichia coli. Proc. Royal Soc.

6 Multiplexor GP bench mark. Six inputs: Use two (D4 D5) as binary number to connect corresponding data lines (D0-D3) to the output Test on all 26=64 possible combinations Fitness score (0-64) is number correct

Impact of Subtrees Subtree like whole tree. Output of subtree is via its root node Intron: subtree which has no effect on overall fitness. I.e. its output does not impact on root node of whole tree. Constant subtree always has same output, i.e. same output on all 64 test cases. Remaining effective code has an impact on root node. Typically it is next root node W. B. Langdon, UCL

Example Intron: AND Function Left: two input AND node. Right: same but input B is always 0. So output always 0. Input A has no effect. Subtree A is always ignored, even in child. (NB no side effects)

Constants Two constants: always 0 and always 1 (FFFFFFFFFFFFFFFF). E.g. evolve by negating input and ANDing with same input (AND D0 (NOR D0 D0)) = 0 Constants help form introns but may be disrupted by crossover. However large subtrees which always output either 0 or 1 tend to be resilient to crossover

Evolution of Program Size Note evolution continues after 1st solution found in generation 22 and even after 1st population where everyone has maximum fitness (generation 312). GP+EM (1)1 pp95-119

Evolution of Program Size Note evolution continues even after 1st population where everyone has maximum fitness (generation 312) but falls as well as rises.

6-Mux Fitness Convergence Theory y = 2popsize(1-(1-x/popsize)7) matches experiment

Testing Theory Theory assumes crossover only (no selection). In EuroGP2007 distribution of sizes converged to limit rapidly. Selection caused by a few runts modifies size distribution

Convergence in Genetic Programming GP genotypes typically do not converge. Even after many generations every tree in the population is different, BUT… Every (or almost all) trees give the same answers (phenotypic convergence) Effective code, i.e. code to solve problem, does converge. Effective code other runs converges differently W. B. Langdon, UCL

Convergence of typical Effective Code Only 111 instructions of 15,495 are effective Gen 500 Only 141 instructions of 16,831 are effective Tree drawing code lisp2dot.awk

Convergence of Effective Code Effective code only. Yellow highly converged. Black unique code Circular lattice code gp2lattice.awk

Evolved Trees Random Shapes Plot whole tree (different population) W. B. Langdon, UCL

Shapes of Evolved Trees Both whole trees × and subtrees lie near Flajolet Depth ≈2 π 𝑠𝑖𝑧𝑒 2 ½ limit1 for random trees W. B. Langdon, UCL

Bloat limited by Gambler’s ruin Tiny fraction of disrupted (low fitness) children sufficient to drive evolution towards every bigger trees. As trees get bigger chance of hitting protected effective code near root node falls. In a finite population eventually no child will be disrupted. Size, without fitness, difference just wanders at random. Crossover cannot escape from population of tiny trees. So we have a lower limit on the random fluctuation. I.e. a Gambler’s ruin. But wondering towards lower limit will re-establish the conditions for bloat. Very approximate limit on tree size: tree size ≈ number of trees × core code size

Bloat limited by Gambler’s ruin tree size ≈ number of trees × core code size tree size ≈ 50 × 497 ≈ 25000 Across ten runs and 100,000 generation, median mean size 42,507 (smallest tree in pop size=10,513) In all ten runs the whole population repeatedly collapses towards smaller trees

Questions Next: Why quadratic increase in size < gen 350 Existing theory differences from crossover only limit Formalise random drift Which types of GP will converge like this? W. B. Langdon, UCL

END Humies $10000 Human-Competitive Results http://www.cs.ucl.ac.uk/staff/W.Langdon/ http://www.epsrc.ac.uk/ W. B. Langdon, UCL 21

W. B. Langdon CREST Department of Computer Science Genetic Programming W. B. Langdon CREST Department of Computer Science

Artificial Evolution of Programs Biological fitness =average number of children GP fitness =score on 6-mux problem The Genetic Programming Cycle. From GP and Data Structures.

Creating new child programs: crossover Crossover is symmetric. That is, on average size after crossover = size before crossover W. B. Langdon, UCL

GPquick GPquick C++, written by Andy Singleton Submachine code GP two bytes per tree node Submachine code GP Boolean (bit) problems. AND, NAND, OR, NOR operate simultaneously in parallel on bits in word (e.g. 32 or 64 bits) 64 bit computer can do 64 test cases in parallel W. B. Langdon, UCL

Genetic Programming to solve 6-Mux Terminals (tree leafs) D0,D1,D2,D3 D4,D5 Function set: 2 input gates→ binary trees AND, NAND, OR, NOR. No side effects Generational population of 500 trees Tournament selection: choose best of 7 100% subtree crossover Initially hard limit on tree size (106) W. B. Langdon, UCL

6-Mux Fitness Convergence Plot smoothed by taking running average over 30 generations

Runts Drive Evolution Don’t plot ratio if less than 5 data

Importance of Mothers Size of poor fitness children closely related to parent who they inherit root from (mum).

Importance of Mothers Although many runts are smaller than their mum, many mothers of runs are smaller than average. Selection removes all low fitness children, Since these are smaller than average, the average size increases

A few runts drive size increase Many mothers of runts are smaller than average (blue) Selection removes all low fitness children (runts) Since these are smaller than average Although there is noise, on average size increases

Testing Theory Same as testing theory plot but do every generation Colour only part of histogram ≥ 3σ Small tree and large tree tails ok (not coloured)

Conclusions Studied long term evolution (>>any other GP) 100s gens where everyone has same fitness No selection to drive size increase Gambler’s ruin with size falling as well as rising Evolved effective code surrounded by ring of sacrificial constants and introns Trees and subtrees resemble random trees But still differences from crossover only limit W. B. Langdon, UCL

The Genetic Programming Bibliography http://www.cs.bham.ac.uk/~wbl/biblio/ 11504 references RSS Support available through the Collection of CS Bibliographies. A web form for adding your entries. Co-authorship community. Downloads A personalised list of every author’s GP publications. blog Search the GP Bibliography at http://liinwww.ira.uka.de/bibliography/Ai/genetic.programming.html