1 Genetic Programming Primer Ami Hauptman
2 Outline Intro GP definition Human Competitive Results Representation Advantages GP operators The GP Individual The GP experiment Example – symbolic regression Additional Concepts
3 Intro Representation is a crucial issue in problem-solving Part of the solution Generic GAs Linear (bit-vector) solutions Fixed length Problems: Often difficult and unnatural Limited search; Length may be unknown Examples – TSP; AI; … Very few “real world” implementations
4 Towards a Variant A: Computer programs are fundamental models in CS B: “Computer programs are the best model for computer programs” Conclusion… (A ∧ B) Why not implement GA individuals as programs? Implemented in…
5 Genetic Programming [GP] Evolutionary search in program space Tree-structured representation Non tree-based program representations exist (e.g. Machine-Code GA) Introduced by Cramer [1985] Developed by Koza [1992] Active and highly successful field Conferences Patents And especially…
6 Human-Competitive Results
7 Increasing in the field of genetic and evolutionary computation. An automatically created result is “human-competitive” if it is, for example: a patent (invention); a new scientific result; a solution to a long-standing problem; or it wins / holds its own in a regulated competition with human contestants (= live players or human-written programs)
8 Human-Competitive Results Over 40 to date. Examples: Evolved antenna for use by NASA (Lohn, 2004); Automatic quantum computer programming (Spector, 2004); Several analog electronic circuits (Koza et al., mid 90s till today): amplifiers, computational circuits, … MAJOR motivation: Since 2004, a yearly contest WITH CASH PRIZES!
10 Representations Individuals as LISP programs (S-expressions) Why trees?
11 Advantages of trees Before genetic operators: Powerful representation, e.g. (+ 1 2 (IF (> TIME 10) 3 4)); Simple: no local vars, types, …; Recursive: size may vary; Main advantage: Individual = Solution, which saves precious time
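To make the "individual is the solution" point concrete, here is a minimal sketch that stores the slide's S-expression as nested Python tuples and evaluates it recursively. The tuple encoding, the `FUNCTIONS` table, and the `env` dictionary for sensor values such as TIME are illustrative assumptions, not part of the original slides.

```python
import operator

# Hypothetical function set: name -> Python callable (an assumption for this sketch).
FUNCTIONS = {"+": lambda *args: sum(args), ">": operator.gt}

def evaluate(node, env):
    """Recursively evaluate an S-expression stored as nested tuples."""
    if isinstance(node, tuple):
        head, *args = node
        if head == "IF":
            # Evaluate the condition, then only the selected branch.
            cond, then_branch, else_branch = args
            chosen = then_branch if evaluate(cond, env) else else_branch
            return evaluate(chosen, env)
        return FUNCTIONS[head](*(evaluate(a, env) for a in args))
    if isinstance(node, str):
        return env[node]  # terminal: a variable or "sensor"
    return node           # terminal: a constant

# The slide's example: (+ 1 2 (IF (> TIME 10) 3 4))
EXPR = ("+", 1, 2, ("IF", (">", "TIME", 10), 3, 4))
```

Calling `evaluate(EXPR, {"TIME": 12})` runs the tree directly; no decoding step from a bit string is needed.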
13 Genetic Operators Reproduction – as before Crossover, Mutation - now tree operators S-expressions closed under most (tree) operators Generate VALID individuals Unlike C programs…
14 GP Operators - Crossover BINARY 1) Randomly select a node in each of the 2 parents 2) Swap the underlying sub-trees, subject to depth constraints (if applicable) [Figure: FATHER and MOTHER trees producing OFFSPRING1 and OFFSPRING2]
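The two crossover steps can be sketched as follows, assuming trees are nested tuples of the form `(function, child, child, ...)` with plain values as leaves. The helper names and the `max_depth=17` default are assumptions for illustration; when an offspring violates the depth constraint, this sketch simply keeps the corresponding parent, which is one of several possible policies.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def depth(tree):
    """Depth of a tree; a lone terminal has depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max((depth(c) for c in tree[1:]), default=0)

def crossover(father, mother, rng, max_depth=17):
    """Pick one random node per parent and swap the subtrees below them."""
    pf = rng.choice(list(paths(father)))
    pm = rng.choice(list(paths(mother)))
    child1 = replace(father, pf, get(mother, pm))
    child2 = replace(mother, pm, get(father, pf))
    # Depth constraint: fall back to the parent if an offspring is too deep.
    if depth(child1) > max_depth:
        child1 = father
    if depth(child2) > max_depth:
        child2 = mother
    return child1, child2
```

Because whole subtrees are exchanged, both offspring are again valid S-expressions, and the total number of nodes across the pair is preserved whenever neither offspring is rejected.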
15 GP Operators - Mutation UNARY 1) Select mutation point 2) Remove entire sub-tree and grow a new one Again – depth constraints
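A minimal sketch of subtree mutation under the same nested-tuple assumption: a node is chosen uniformly by preorder index, its subtree is discarded, and a new one is grown with whatever depth budget remains at that level. The primitive sets, the 0.3 terminal probability, and `max_depth=4` are hypothetical values chosen for the example.

```python
import random

FUNCS = {"+": 2, "-": 2, "*": 2}  # hypothetical function set: name -> arity
TERMS = ["x", 1, 2]               # hypothetical terminal set

def grow(rng, max_depth):
    """Grow a random subtree no deeper than max_depth."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMS)
    name = rng.choice(sorted(FUNCS))
    return (name,) + tuple(grow(rng, max_depth - 1) for _ in range(FUNCS[name]))

def size(tree):
    """Number of nodes in the tree."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(size(c) for c in tree[1:])

def depth(tree):
    """Depth of a tree; a lone terminal has depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(c) for c in tree[1:])

def mutate_at(node, n, make_subtree, level=0):
    """Copy node, replacing its n-th node (preorder) with make_subtree(level)."""
    if n == 0:
        return make_subtree(level)
    n -= 1  # skip the current function node
    children = []
    for child in node[1:]:
        s = size(child)
        if 0 <= n < s:
            children.append(mutate_at(child, n, make_subtree, level + 1))
        else:
            children.append(child)
        n -= s
    return (node[0],) + tuple(children)

def mutate(tree, rng, max_depth=4):
    """Select a mutation point, remove its subtree, and grow a new one."""
    point = rng.randrange(size(tree))
    # The new subtree may only use the depth left below the mutation point.
    return mutate_at(tree, point, lambda level: grow(rng, max_depth - level))
```

Passing the remaining depth to `grow` enforces the depth constraint by construction, so no offspring has to be rejected afterwards.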
16 Genome What’s inside the tree… Genome contains functions and terminals. Functions: internal nodes; varying complexity; examples: +, -, And, Or, If, IFLTE, ADFs. Terminals: leaves; constants or “sensors”; actions; generally more domain dependent; examples: variables, constants, zero-argument functions, ERCs
17 Example 1 Logic expression AND Y X OR NOT X Z
18 Example 2 Series of instructions Progn2 IF Wall_is_near 90 Rotate_Right Advance_5 Advance_10
19 More realistic trees… Tree 0: (If3 (Or2 (Not (Or2 (And2 OppPieceAttUnprotected NotMyKingInCheck) (Or2 NotMyPieceAttUnprotected 100*Increase))) (And2 (Or3 (And2 OppKingStuck NotMyPieceAttUnprotected) (And2 OppPieceAttUnprotected OppKingStuck) (And *MateInOne OppKingInCheckPieceBehind NotMyKingStuck)) (Or2 (Not NotMyKingStuck) OppKingInCheck))) NumMyPiecesUNATT (If3 (< (If3 (Or2 NotMyPieceAttUnprotected NotMyKingInCheck) (If3 NotMyPieceAttUnprotected #NotMovesOppKing OppKingInCheckPieceBehind) (If3 OppKingStuck OppKingInCheckPieceBehind -1000*MateInOne)) (If3 (And2 100*Increase 1000*Mate?) (If3 (< NumMyPiecesUNATT (If3 NotMyPieceAttUnprotected -1000*MateInOne OppKingProxEdges)) (If3 (< MyKingDistEdges #NotMovesOppKing) (If *MateInOne -1000*MateInOne NotMyPieceATT) (If3 100*Increase #MovesMyKing OppKingInCheckPieceBehind)) NumOppPiecesATT) (If3 NotMyKingStuck OppKingProxEdges))) (If3 OppKingInCheck (If3 (Or2 NotMyPieceAttUnprotected NotMyKingInCheck) (If3 (< MyKingDistEdges #NotMovesOppKing) (If *MateInOne -1000*MateInOne NotMyPieceATT) (If3 100*Increase #MovesMyKing OppKingInCheckPieceBehind)) NumOppPiecesATT) (If3 (And *MateInOne NotMyPieceAttUnprotected 100*Increase) (If3 (< NumMyPiecesUNATT (If3 NotMyPieceAttUnprotected -1000*MateInOne OppKingProxEdges)) (If3 (< MyKingDistEdges #NotMovesOppKing) (If *MateInOne *MateInOne NotMyPieceATT) (If3 100*Increase #MovesMyKing OppKingInCheckPieceBehind)) NumOppPiecesATT) -1000*MateInOne)) (If3 (< (If3 100*Increase MyKingDistEdges 100*Increase) (If3 OppKingStuck OppKingInCheckPieceBehind -1000*MateInOne)) (If3 (And2 NotMyPieceAttUnprotected -1000*MateInOne) (If3 (< NumMyPiecesUNATT (If3 NotMyPieceAttUnprotected -1000*MateInOne OppKingProxEdges)) (If3 (< MyKingDistEdges #NotMovesOppKing) (If *MateInOne -1000*MateInOne NotMyPieceATT) (If3 100*Increase #MovesMyKing OppKingInCheckPieceBehind)) NumOppPiecesATT) (If3 OppPieceAttUnprotected NumMyPiecesUNATT MyFork))))) GP-EndChess
21 The GP experiment Preparatory steps: Determine set of terminals & functions Determine fitness measure Determine run parameters Population size Operator probabilities Tree depths …… Set up end of run criteria Run the experiment start with initial population:
22 Creating the Initial Population 3 Methods (max depth = m). Grow: for each node, randomly select a function or a terminal; trees of varying depths. Full: all trees have the same depth. Ramped-half-and-half: divide the population into max_depth-1 parts; for each part, create half by grow and half by full; preferred by Koza
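The three initialization methods can be sketched in a few lines. The primitive sets below are hypothetical, the terminal probability used by grow is one common choice rather than a fixed rule, and the depth ramp from 2 up to max_depth (giving max_depth-1 parts) is an assumption about how the parts are defined.

```python
import random

FUNCS = {"+": 2, "-": 2, "*": 2}  # hypothetical function set: name -> arity
TERMS = ["x", 1]                  # hypothetical terminal set

def make_tree(rng, depth, full):
    """Full: only functions until depth runs out. Grow: function or terminal at each node."""
    p_terminal = len(TERMS) / (len(TERMS) + len(FUNCS))
    if depth == 0 or (not full and rng.random() < p_terminal):
        return rng.choice(TERMS)
    name = rng.choice(sorted(FUNCS))
    return (name,) + tuple(make_tree(rng, depth - 1, full) for _ in range(FUNCS[name]))

def depth(tree):
    """Depth of a tree; a lone terminal has depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(c) for c in tree[1:])

def ramped_half_and_half(rng, pop_size, max_depth):
    """Cycle through depths 2..max_depth, alternating grow and full."""
    depths = range(2, max_depth + 1)  # max_depth - 1 parts
    population = []
    for i in range(pop_size):
        d = depths[(i // 2) % len(depths)]
        population.append(make_tree(rng, d, full=(i % 2 == 1)))
    return population
```

Each consecutive pair of individuals shares a depth limit, one built by grow and one by full, so the initial population mixes shapes and sizes.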
23 Flowchart of GP
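A rough sketch of a standard generational loop of the kind such a flowchart usually depicts. It is written generically, with the operators passed in as functions, so it is not tied to a tree encoding. Tournament selection of size 2 and the probabilities `p_cross=0.9` and `p_mut=0.01` are assumptions; the slides do not specify them.

```python
import random

def evolve(population, fitness, crossover, mutate, rng,
           p_cross=0.9, p_mut=0.01, max_gens=50, target=0.0):
    """Generational loop. Fitness is an error measure: lower is better."""
    def select():
        # Tournament of size 2 (an assumed selection scheme).
        a, b = rng.choice(population), rng.choice(population)
        return a if fitness(a) <= fitness(b) else b

    for gen in range(max_gens):
        best = min(population, key=fitness)
        if fitness(best) <= target:  # end-of-run criterion
            return best, gen
        next_pop = []
        while len(next_pop) < len(population):
            r = rng.random()
            if r < p_cross:
                next_pop.extend(crossover(select(), select(), rng))
            elif r < p_cross + p_mut:
                next_pop.append(mutate(select(), rng))
            else:
                next_pop.append(select())  # reproduction: plain copy
        population = next_pop[:len(population)]
    return min(population, key=fitness), max_gens
```

For a quick check the loop can be run on integers instead of trees, for example with `fitness = lambda v: abs(v - 42)`, an averaging crossover, and a mutation that adds or subtracts 1.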
24 Example - Symbolic regression Independent variable X Dependent variable Y
25 Preparatory Steps Objective: Find a computer program with one input (independent variable X) whose output equals the given data. 1) Terminal set: T = {X, Random-Constants} 2) Function set: F = {+, -, *, %} 3) Fitness: The sum of the absolute values of the differences between the candidate program’s output and the given data, where (-1 < X < 1) 4) Parameters: Population size M = 4 5) Termination: An individual emerges whose sum of absolute errors is less than 0.1
26 Notes Complexity of function = ? Size of solution unknown. More points: more difficult. Depth constraints: can limit complexity; needed anyway. Typically size of pop >= 50. The “%” function is “/” protected against a zero 2nd argument; needed to avoid BAD individuals
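The fitness measure and the protected division can be sketched as below. Returning 1 when the divisor is zero is a common convention and is assumed here, as are the nested-tuple encoding and the 19 sample points; the data is sampled from x^2 + x + 1 purely for illustration.

```python
def pdiv(a, b):
    """Protected division '%': defined for every input, so no individual crashes."""
    return 1.0 if b == 0 else a / b  # value on zero divisor is an assumed convention

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "%": pdiv,
}

def run(tree, x):
    """Evaluate a binary expression tree at input x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](run(left, x), run(right, x))
    return x if tree == "x" else tree  # variable or constant

def fitness(tree, cases):
    """Sum of absolute errors over the fitness cases (0 is a perfect fit)."""
    return sum(abs(run(tree, x) - y) for x, y in cases)

# Hypothetical data: 19 points with -1 < x < 1, sampled from x^2 + x + 1.
XS = [i / 10 for i in range(-9, 10)]
CASES = [(x, x * x + x + 1) for x in XS]
```

With these cases, the termination criterion from the preparatory steps becomes `fitness(tree, CASES) < 0.1`.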
27 Generation 0 4 Random Individuals (later: creation) More complex than actual transformed
28 Fitness f(x) = x^2 + x + 1 [Figure: fitness plots; surviving labels: x + 1, x]
29 Generation 1 Copy of (a) Mutant of (c) picking “2” as mutation point First offspring of crossover of (a) and (b) picking “+” of parent (a) and left-most “x” of parent (b) as crossover points Second offspring of crossover of (a) and (b) picking “+” of parent (a) and left-most “x” of parent (b) as crossover points
30 Generation 1 - Fitness IDEAL
32 Additional Concepts Competitive Evaluation Ephemeral Random constants (ERCs) Strongly Typed GP (STGP)
33 Competitive Evaluation (not only in GP) Important in games! Fitness depends on peers; it is relative. E.g. compete against k peers. Optional: count the score for both players. Easier progress at the start, since opponents are weak. Avoids over-generalization and early convergence. Still need an absolute measure
34 ERCs Sometimes constants are not known in advance, e.g. in symbolic regression. ERC (terminal) node: initialized to a random constant, which then stays fixed. Mutate-ERC operator (low prob.): changes the node to another constant. Typically a range is specified for the constants
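A minimal sketch of competitive evaluation against k random peers. The `play` function, its scoring scale (1 for a win, 0 for a loss), and the `count_both` flag are assumptions made for this example.

```python
import random

def competitive_fitness(population, play, k, rng, count_both=True):
    """Score every individual by games against k randomly chosen peers.

    play(a, b) returns a's score in [0, 1]; b is credited with 1 - score
    when count_both is set.
    """
    scores = [0.0] * len(population)
    for i, individual in enumerate(population):
        others = [j for j in range(len(population)) if j != i]
        for j in rng.sample(others, k):
            s = play(individual, population[j])
            scores[i] += s
            if count_both:
                scores[j] += 1.0 - s
    return scores
```

The resulting scores are only relative to the current population, which is why an absolute measure is still needed to track progress across generations.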
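One way to sketch ERC nodes and the Mutate-ERC operator, assuming nested-tuple trees. The `ERC` class, the range [-1, 1], and the per-node probability `p=0.05` are hypothetical choices.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class ERC:
    """Ephemeral random constant: drawn once at creation, then fixed."""
    value: float

def new_erc(rng, low=-1.0, high=1.0):
    """Create an ERC terminal with a random value from the allowed range."""
    return ERC(rng.uniform(low, high))

def mutate_ercs(tree, rng, p=0.05, low=-1.0, high=1.0):
    """Mutate-ERC: with low probability p, redraw each constant in the tree."""
    if isinstance(tree, ERC):
        return new_erc(rng, low, high) if rng.random() < p else tree
    if isinstance(tree, tuple):
        return (tree[0],) + tuple(mutate_ercs(c, rng, p, low, high) for c in tree[1:])
    return tree  # other terminals are left untouched
```

Keeping the operator separate from subtree mutation lets constants be tuned without disturbing the structure of the tree.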
35 Strongly Typed Genetic Programming (STGP) Montana [1995] Node data types may vary; casting not enough Impose structural constraints Assigning (pseudo) types to functions & terminals Sometimes more than one Typically actual (implementation) types are all real numbers
36 STGP Example Include both Boolean and Float terminals. Boolean: True, False, zero-arg predicates. Float: ERCs, float zero-arg functions. Some are used as both: “Query” (an important predicate; returns a large float if true). If function: children are Boolean, Float, Float. Query can be at each child location of If
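The structural constraints can be sketched by tagging every primitive with a return type and argument types, then only ever choosing primitives whose return type matches what the parent expects. The signatures below, apart from If taking (Boolean, Float, Float), are assumptions for illustration; the terminal names are borrowed from the chess tree shown earlier.

```python
import random

# Hypothetical typed primitives: name -> (return type, argument types).
FUNCS = {
    "If": ("float", ("bool", "float", "float")),
    "And": ("bool", ("bool", "bool")),
    "<": ("bool", ("float", "float")),
    "+": ("float", ("float", "float")),
}
TERMS = {
    "bool": ["True", "False", "OppKingInCheck"],
    "float": ["MyKingDistEdges", 1.0],
}

def grow_typed(rng, want, depth):
    """Grow a random tree whose root returns the type named by want."""
    options = [name for name, (ret, _) in FUNCS.items() if ret == want]
    if depth == 0 or not options or rng.random() < 0.3:
        return rng.choice(TERMS[want])
    name = rng.choice(options)
    return (name,) + tuple(grow_typed(rng, t, depth - 1) for t in FUNCS[name][1])

def type_of(tree):
    """Return the type of a tree, raising TypeError on any ill-typed node."""
    if isinstance(tree, tuple):
        ret, arg_types = FUNCS[tree[0]]
        if tuple(type_of(c) for c in tree[1:]) != arg_types:
            raise TypeError(f"ill-typed arguments for {tree[0]}")
        return ret
    for type_name, terminals in TERMS.items():
        if tree in terminals:
            return type_name
    raise TypeError(f"unknown terminal {tree!r}")
```

Crossover and mutation need the same check: a subtree may only be swapped in where its return type matches the slot it fills.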