Different Varieties of Genetic Programming
Je-Gun Joung
Some of the Many Different Structures Used for GP
9.1 GP with Tree Genomes
- Mutation Operators Applied in Tree-based GP
[Figure: each operator is shown as a before/after pair of trees, starting from the tree for (x-1)^2 + (x-1)^3]
- Point Mutation
- Permutation
- Hoist
- Expansion Mutation
- Collapse Subtree Mutation
- Subtree Mutation
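The following is a minimal sketch of four of the listed operators, not the implementation behind the slides. It assumes trees are nested tuples `(op, left, right)` with leaves `'x'` and `1`, that point mutation swaps a node for another of the same class, that permutation swaps the arguments of a node, that hoist keeps a subtree as the new individual, and that collapse replaces a subtree with a terminal. All function names and the exact shape of `PARENT` are hypothetical. Expansion and subtree mutation are left out because they additionally need a random tree generator.

```python
import random

# Assumed encoding: a tree is a nested tuple (op, left, right); leaves are 'x' or 1.
FUNCTIONS = ['+', '-', '*']
TERMINALS = ['x', 1]

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (put(tree[i], path[1:], new),) + tree[i + 1:]

def point_mutation(tree, rng):
    """Swap one node for a random node of the same class (may pick the same symbol)."""
    path = rng.choice(list(paths(tree)))
    node = get(tree, path)
    if isinstance(node, tuple):
        return put(tree, path, (rng.choice(FUNCTIONS),) + node[1:])
    return put(tree, path, rng.choice(TERMINALS))

def permutation(tree, rng):
    """Swap the two arguments of one function node (tree must not be a bare leaf)."""
    inner = [p for p in paths(tree) if isinstance(get(tree, p), tuple)]
    path = rng.choice(inner)
    op, left, right = get(tree, path)
    return put(tree, path, (op, right, left))

def hoist(tree, rng):
    """Return a randomly chosen subtree as the new individual."""
    return get(tree, rng.choice(list(paths(tree))))

def collapse_subtree(tree, rng):
    """Replace a randomly chosen non-leaf subtree with a random terminal."""
    inner = [p for p in paths(tree) if isinstance(get(tree, p), tuple)]
    return put(tree, rng.choice(inner), rng.choice(TERMINALS))

def evaluate(tree, x):
    """Evaluate a tree for a given value of x."""
    if tree == 'x':
        return x
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    a, b = evaluate(left, x), evaluate(right, x)
    return a + b if op == '+' else a - b if op == '-' else a * b

# One possible tree for (x-1)^2 + (x-1)^3; the figure's exact shape is an assumption.
XM1 = ('-', 'x', 1)
PARENT = ('+', ('*', XM1, XM1), ('*', XM1, ('*', XM1, XM1)))
```

Each operator returns a new tree and leaves the parent untouched, for example `permutation(PARENT, random.Random(0))`.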
Crossover Operators Applied in Tree-based GP
- Subtree Exchange Crossover
- Selfcrossover
- Module Crossover
[Figure: crossover operators applied within tree-based GP]
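Subtree exchange crossover can be sketched as follows, assuming trees are nested tuples `(op, left, right)` and that crossover points are chosen uniformly over all nodes of each parent. The helper names are hypothetical, and selfcrossover and module crossover are not covered.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (put(tree[i], path[1:], new),) + tree[i + 1:]

def size(tree):
    """Number of nodes in the tree."""
    return sum(1 for _ in paths(tree))

def subtree_exchange(parent_a, parent_b, rng):
    """Pick one node in each parent uniformly and swap the subtrees rooted there."""
    path_a = rng.choice(list(paths(parent_a)))
    path_b = rng.choice(list(paths(parent_b)))
    sub_a, sub_b = get(parent_a, path_a), get(parent_b, path_b)
    return put(parent_a, path_a, sub_b), put(parent_b, path_b, sub_a)

# Two small example parents (hypothetical).
A = ('+', ('*', 'x', 'x'), ('-', 'x', 1))
B = ('*', ('-', 'x', 1), 'x')
child_a, child_b = subtree_exchange(A, B, random.Random(1))
```

Because the two subtrees only trade places, the total number of nodes across both children equals that of the parents.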
9.2 GP with Linear Genomes
- Linear GP acts on linear genomes, such as program code represented by bit strings or code for register machines.
- The influence of a change in a linear structure can be expected to follow the linear order in which the instructions are executed.
- In tree-based GP, all operators select nodes uniformly from a tree.
- In linear GP, all operators select nodes uniformly from a sequence.
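As an illustration of operators that select positions uniformly from a sequence, here is a minimal sketch of a segment-exchanging crossover on instruction lists. It is one possible linear operator, not necessarily the one used by any system named in these slides. The function name and the string encoding of instructions are assumptions.

```python
import random

def linear_crossover(parent_a, parent_b, rng):
    """Swap a uniformly chosen contiguous segment between two instruction lists."""
    a1, a2 = sorted(rng.randrange(len(parent_a) + 1) for _ in range(2))
    b1, b2 = sorted(rng.randrange(len(parent_b) + 1) for _ in range(2))
    child_a = parent_a[:a1] + parent_b[b1:b2] + parent_a[a2:]
    child_b = parent_b[:b1] + parent_a[a1:a2] + parent_b[b2:]
    return child_a, child_b

# Hypothetical parents: each instruction is kept as an opaque string.
P1 = ['x=x-1', 'y=x*x', 'x=x*y', 'y=x+y']
P2 = ['y=x+x', 'x=y*y', 'y=y-x']
c1, c2 = linear_crossover(P1, P2, random.Random(0))
```

The children may differ in length from their parents, since the two segments are chosen independently.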
9.2.1 Evolutionary Program Induction with Introns
- Wineberg and Oppacher [1994] have formulated an evolutionary programming method they call EPI (evolutionary program induction).
- They use fixed-length strings to code their individuals and a GA-like crossover.
- The code is constructed to maintain a fixed structure within the chromosome that allows similar alleles to compete against each other at a locus during
9.2.2 Developmental Genetic Programming
- Developmental genetic programming (DGP) is an extension of GP by a developmental step.
- In tree-based GP, the space of genotypes (search space) is usually identical to the space of phenotypes (solution space).
- DGP maps binary sequences (genotypes) through a developmental process into separate phenotypes.
The Genotype-phenotype Mapping
[Figure: a genotype in the unconstrained search space is taken by the genotype-phenotype mapping (GPM), which implements the constraints, to a phenotype in the constrained solution space]
9.2.3 An Example: Evolution in C
- Symbolic function regression
An Example Result
- Runs lasted for 50 generations at most, with a population size of 500 individuals.
- In one experimental run, the genotype yields the raw symbol sequence T*(a)*R)aE+C)E)SRDT)vSqE*
- Repairing transforms this illegal sequence into {T((a)*R(a+m)+(S(D((v+q+D}
- Since this sequence is unfinished, repairing terminates by completing it into {T((a)*R(a+m))+(S(D((v+q+D(m)))))}
- Finally, editing produces
  double ind(double m, double v, double a) {return T((a)*R(a+m))+(S(D((v+q+D(m))))); }
- A C compiler takes over to generate an executable that is valid on the underlying hardware platform.
- This executable is the final phenotype encoded by the genotype.
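To make the idea of decoding followed by repairing concrete, here is a toy sketch. It is not the mapping used in the example above: the 3-bit codon table is invented, and the repair step only balances parentheses (it drops unmatched closing parentheses and closes those still open), whereas the repair in the example enforces the full syntax. All names are hypothetical.

```python
# Hypothetical 3-bit codon table; the slides do not give the real one.
CODON_TABLE = ['a', 'm', 'v', '+', '*', '(', ')', 'S']

def decode(bits, width=3):
    """Translate a bit string into raw symbols, one symbol per complete codon."""
    usable = len(bits) - len(bits) % width
    return [CODON_TABLE[int(bits[i:i + width], 2)] for i in range(0, usable, width)]

def repair_parentheses(symbols):
    """Toy repair: drop unmatched ')' and close every '(' still open at the end."""
    out, depth = [], 0
    for s in symbols:
        if s == ')':
            if depth == 0:
                continue          # illegal here, so skip it
            depth -= 1
        elif s == '(':
            depth += 1
        out.append(s)
    return out + [')'] * depth    # complete the unfinished sequence

genotype = '110' '111' '101' '000' '011' '001'
raw = decode(genotype)            # [')', 'S', '(', 'a', '+', 'm']
phenotype_text = ''.join(repair_parentheses(raw))   # 'S(a+m)'
```

Every bit string decodes to some symbol sequence, so the search space stays unconstrained, while the repair step is what pushes the result toward legal programs.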
9.2.4 Machine Language
(x-1)^2 + (x-1)^3
1: x=x-1
2: y=x*x
3: x=x*y
4: y=x+y
- Figure 9.13 [figure labels: x, *, *, +, y]
[Figure: the representation of (x-1)^2 + (x-1)^3 in a tree-based genome]
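The four-instruction program for (x-1)^2 + (x-1)^3 can be checked with a small register-machine interpreter. This is a sketch under assumptions: the tuple encoding `(destination, operator, source1, source2)` and the names `PROGRAM` and `run` are hypothetical, and the result is taken to be the final value of register y.

```python
import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

# Assumed encoding: (destination, operator, source1, source2).
PROGRAM = [
    ('x', '-', 'x', 1),    # 1: x = x - 1
    ('y', '*', 'x', 'x'),  # 2: y = x * x
    ('x', '*', 'x', 'y'),  # 3: x = x * y
    ('y', '+', 'x', 'y'),  # 4: y = x + y
]

def run(program, x):
    """Execute the instructions in order on registers x and y; return y."""
    regs = {'x': x, 'y': 0}
    for dest, op, a, b in program:
        va = regs[a] if isinstance(a, str) else a   # register name or constant
        vb = regs[b] if isinstance(b, str) else b
        regs[dest] = OPS[op](va, vb)
    return regs['y']
```

After instruction 2, y holds (x-1)^2. Instruction 3 overwrites x with (x-1)^3, and instruction 4 adds the two. The linear form computes x-1 once and reuses it.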
The Reasons for Using Machine Code in GP, as Opposed to Higher-level Languages
- The most efficient optimization can be done at the machine code level.
- High-level tools might simply not be available for a target processor.
- It could be more convenient to let the computer evolve small pieces of machine code programs itself rather than learning to master machine code programming.
Reasons for Using Binary Machine Code
- The GP algorithm can be made very fast by having the individual programs in the population in binary machine code.
- The system is also much more memory efficient than a tree-based GP system.
- An additional advantage is that memory consumption is stable during evolution with no need for garbage collection.
The JB Language
0 = BLOCK (group statements)
1 = LOOP
2 = SET
3 = ZERO (clear)
4 = INCREMENT
[Figure: an individual genome with annotations: Block stat. 1 stat. 2; register1 = 0; repeat stat. 1, register2; register1 = register1+1]
The GEMS System
- One of the most extensive systems for evolution of machine code is the GEMS system [Crepeau, 1995].
- The system includes an almost complete interpreter for the Z-80 8-bit microprocessor.
- The Z-80 has 691 different instructions, and GEMS implements 660 of them.
- It has so far been used to evolve a “hello world” program consisting of 58 instructions.
The Crossover of GEMS
9.2.5 An Example: Evolution in Machine Language
9.3 GP with Graph Genomes
- PADO
  The graph-based GP system PADO (Parallel Algorithm Discovery and Orchestration) [Teller and Veloso, 1995].
  Each program has a stack and an indexed memory for its own use of intermediate values and for communication.
  There are also the following special nodes in a program:
  - Start node
  - Stop node
  - Subprogram calling nodes
  - Library subprogram calling nodes
The Representation of a Program and Subprogram in PADO
- Fig 9.19 [figure labels: START, STOP, Main Program, Subprogram (private or public), Stack, Indexed Memory]
9.3.2 Cellular Encoding
9.4 Other Genomes
- STROGANOFF
  Iba, Sato, and deGaris [1995] have introduced a more complicated structure into the nodes of a tree that could represent a program.
  They base their approach on the well-known Group Method of Data Handling (GMDH).
  Understanding STructured Representation On Genetic Algorithms for Nonlinear Function Fitting (STROGANOFF) therefore requires understanding GMDH first.
  The STROGANOFF method applies GP crossover and mutation to a population of polynomial nodes.
Group Method of Data Handling (GMDH)
[Figure: a GMDH tree with polynomial nodes P1 to P4 over the input variables X1 to X5]
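A tree of polynomial nodes can be evaluated bottom-up, as in the sketch below. It assumes each node is the two-input quadratic polynomial commonly used in GMDH, a0 + a1*z1 + a2*z2 + a3*z1*z2 + a4*z1^2 + a5*z2^2. The slides themselves do not state the node form. The tree shape, coefficient values and function names are hypothetical, and the coefficients are set by hand rather than fitted to data.

```python
def gmdh_node(coeffs, z1, z2):
    """Assumed node form: a complete quadratic polynomial in two inputs."""
    a0, a1, a2, a3, a4, a5 = coeffs
    return a0 + a1 * z1 + a2 * z2 + a3 * z1 * z2 + a4 * z1 * z1 + a5 * z2 * z2

def evaluate(tree, inputs):
    """tree is either an input name such as 'X1' or a tuple (coeffs, left, right)."""
    if isinstance(tree, str):
        return inputs[tree]
    coeffs, left, right = tree
    return gmdh_node(coeffs, evaluate(left, inputs), evaluate(right, inputs))

# Hypothetical two-node tree: P2 computes X1 + X2, P1 computes P2 * X3.
P2 = ((0, 1, 1, 0, 0, 0), 'X1', 'X2')
P1 = ((0, 0, 0, 1, 0, 0), P2, 'X3')
result = evaluate(P1, {'X1': 1, 'X2': 2, 'X3': 4})   # (1 + 2) * 4
```

Since every node has the same two-input form, any subtree can stand in for any input. That is what lets tree crossover and mutation be applied to such structures.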
Crossover of trees of GMDH
[Figure: parent and offspring trees with polynomial nodes P1, P2, P4, Pa, Pb, Pc over the input variables X1 to X5]
Different Mutations of trees of GMDH
[Figure: GMDH trees with polynomial nodes P1 to P4 over the input variables X1 to X5, in panels (a) to (d)]
9.4.2 GP Using Context-Free Grammars
- By the use of a context-free grammar, typing and syntax are automatically assured throughout the evolutionary process.
- A context-free grammar can be considered a four-tuple (N, Σ, P, S) of non-terminals N, terminals Σ, production rules P, and a start symbol S ∈ N.
  Definition 9.2 A terminal of a context-free grammar is a symbol for which no production rule exists in the grammar.
  Definition 9.3 A production rule is a substitution of the kind X → Y, where X ∈ N and Y ∈ (N ∪ Σ)*.
A Grammatical Structure
[Figure: a derivation tree rooted at S, expanding through B nodes with the operators +, - and * down to T nodes for x and 1]
S : the start symbol
B : a binary expression
T : a terminal
x and 1 : variables and a constant
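A derivation from such a grammar can be sketched as below. The production rules are an assumption read off the symbols in the legend (S, B, T, the operators +, - and *, and the terminals x and 1). The depth limit and all function names are hypothetical as well.

```python
import random

# Assumed productions: S -> B;  B -> + B B | - B B | * B B | T;  T -> x | 1
GRAMMAR = {
    'S': [['B']],
    'B': [['+', 'B', 'B'], ['-', 'B', 'B'], ['*', 'B', 'B'], ['T']],
    'T': [['x'], ['1']],
}

def derive(symbol, rng, depth=0, max_depth=4):
    """Expand symbol into a list of terminals by applying random production rules."""
    if symbol not in GRAMMAR:          # terminal: no production rule exists
        return [symbol]
    rules = GRAMMAR[symbol]
    if depth >= max_depth:             # force termination via the shortest rule
        rules = [min(rules, key=len)]
    out = []
    for s in rng.choice(rules):
        out.extend(derive(s, rng, depth + 1, max_depth))
    return out

def eval_prefix(tokens, x):
    """Evaluate a prefix expression, consuming tokens from the front of the list."""
    tok = tokens.pop(0)
    if tok == 'x':
        return x
    if tok == '1':
        return 1
    a = eval_prefix(tokens, x)
    b = eval_prefix(tokens, x)
    return a + b if tok == '+' else a - b if tok == '-' else a * b

expr = derive('S', random.Random(0))   # a prefix expression as a token list
value = eval_prefix(list(expr), 2)     # evaluate a copy at x = 2
```

Every derived token list is a well-formed prefix expression. This is the sense in which the grammar assures syntax without a separate repair step.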
9.4.3 Genetic Programming of L-Systems
- Lindenmayer systems (also known as L-systems) [Lindenmayer, 1968][Prusinkiewicz and Lindenmayer, 1990] have been introduced independently into the area of genetic programming by different researchers [Koza, 1993][Jacob, 1994][Hemmi et al., 1994].
- L-systems were invented for the purpose of modeling biological structure formation.
- The rewriting of all non-terminals in parallel is important in this respect.
- L-systems in their simplest form (0L-systems) are context-free grammars whose production rules are applied not sequentially but simultaneously to the growing tree of non-terminals.
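Simultaneous rewriting can be illustrated with a minimal 0L-system on strings. The rule set is Lindenmayer's classic algae example (A → AB, B → A), chosen here only for illustration; it does not come from the slides. The function names are hypothetical.

```python
def rewrite(word, rules):
    """One derivation step: every symbol is replaced at the same time."""
    return ''.join(rules.get(ch, ch) for ch in word)

def derive(axiom, rules, steps):
    """Apply the parallel rewriting step the given number of times."""
    word = axiom
    for _ in range(steps):
        word = rewrite(word, rules)
    return word

ALGAE = {'A': 'AB', 'B': 'A'}   # each entry maps a predecessor to its successor

words = [derive('A', ALGAE, n) for n in range(5)]
# words == ['A', 'AB', 'ABA', 'ABAAB', 'ABAABABA']
```

Each step builds the new word from the old one in a single pass, so no rule ever sees the output of another rule from the same step. That is the difference from applying the productions sequentially.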
Context-free L-system Individual Encoding a Production Rule System of Lindenmayer type
[Figure: a 0L-System individual with an axiom and LRule entries, each a pred/succ pair]