Genetic Algorithm with Limited Convergence 1 Simple Selectorecombinative GAs Scale poorely on hard problems (multimodal, deceptive, high degree of subsolution interaction, noise,...), largely the result of their mixing behaviour Inability of SGA to correctly identify and adequately mix the appropriate BBs in subsequent generations Exponential computation complexity of SGA Crossover operators or other exchange emchanisms are needed such that adapt to the problem at hand Linkage adaptation
Genetic Algorithm with Limited Convergence 2 Messy Genetic Algorithms (mGAs) Inspiration from the nature evolution starts from the simplest forms of life mGAs depart from SGA in four ways: messy codings messy operators separation of processing into three heterogeneous phases epoch-wise iteration to improve the complexity of solution
Genetic Algorithm with Limited Convergence 3 mGA’s codings Tagged alleles Variable-length strings: (name 1, allele 1 ) … (name N, allele N ) ((4,0) (1,1) (2,0) (4,1) (4,1) (5,1)) Over-specification multiple gene instances (gene 4) majority voting – would express deceptive genes too readily first-come first-served (left to right expression) - positional priority Underspecification missing gene instances (gene 3) average schema value – variance is too high competitive template – solution locally optimal with respect to k-bit perturbations
Genetic Algorithm with Limited Convergence 4 Messy operators: cut & splice Cut – divides a single string into two parts Splice – joins the head of one string with the tail of the other one When short strings are mated – probability of cut is small mostly the string will be just spliced the strings’ length is doubled When long string are mated – probability of cut is large one-point crossover
Genetic Algorithm with Limited Convergence 5 Three heterogeneous phases Initialization Enumerative initialization of the population with all sub-strings of a certain length k<<l ( l k )2 k O(l k ) computations Guaranteed that all BBs of certain size are present in the population Primordial phase Only selection used to dope the population with good BBs Good linkage groups are selected before their alleles are allowed to be mixed Juxtapositional phase selection + cut&splice Mixing of the BBs
Genetic Algorithm with Limited Convergence 6 Fast messy genetic algorithms Probabilistically complete enumeration Population of strings of length l’ close to l is generated Assumption: each string contains many different BBs of length k<<l Building block filtering extracts highly-fit and effectively linked BBs repeats (1) selection and (2) gene deletion only O(l) computations to converge Extended thresholding tournaments are held only between strings that have a threshold number of genes in common fmGA vs mGA 150-bit long problem, 30 5-bit deceptive function 1.9 10 5 vs. 5.9 10 8 evaluations
Genetic Algorithm with Limited Convergence 7 Gene expression messy GA - gemGA Messy ??? No variable-length strings No under- or over-specification No left-to-right expression Messy use of heterogeneous phases of processing in gemGA Linkage learning phase - first identifies linkage groups Mixing phase – selection + recombination exchanges good allele combinations within those groups to find optimal solution
Genetic Algorithm with Limited Convergence 8 gemGA: The idea Linkage learning phase Transcription I (antimutation) Each string undergoes l one-bit perturbations Improvements are ignored ?!? (bit does not belong to optimal BB) Changes that degrade the structure are marked as possible linkage groups candidates Ex.: two 3-bit deceptive BBs marked not marked (degrades) (improves) Transcription II Identifies the exact relations among the genes by checking nonlinearities IF f(X’ i ) + f(X’ j ) != f(X’ ij ) THEN link(i,j)