Genetic Algorithms
CIS 488/588
Bruce R. Maxim
UM-Dearborn
11/15/2018
Genetic Algorithms
- What are they?
  - Evolutionary algorithms that make use of operations like mutation, recombination, and selection
- Uses?
  - Difficult search problems
  - Optimization problems
  - Machine learning
  - Adaptive rule-bases
Theory of Evolution
- Every organism has unique attributes that can be transmitted to its offspring
- Offspring are unique and have attributes from each parent
- Selective breeding can be used to manage changes from one generation to the next
- Nature applies certain pressures that cause individuals to evolve over time
Evolutionary Pressures
- Environment
  - Creatures must work to survive by finding resources like food and water
- Competition
  - Creatures within the same species compete with each other on similar tasks (e.g. finding a mate)
- Rivalry
  - Different species affect each other by direct confrontation (e.g. hunting) or indirectly by fighting for the same resources
Natural Selection
- Creatures that are not good at completing tasks like hunting or mating have fewer chances of having offspring
- Creatures that are successful in completing basic tasks are more likely to transmit their attributes to the next generation, since more creatures will be born that can survive and pass on these attributes
Genetics
- Genome (class)
  - Sequence of genes describing the overall structure of the genetic information for a particular species
- Genomics
  - Study of the meaning of the genes for a particular species
- Alleles
  - Values that can be assigned to a given gene
- Genotype (instance)
  - Sequence of alleles
Physical Properties
- Phenetics
  - Study of physical properties and morphology of creatures independent of genetic information
- Phenome
  - General structure of a creature's body and attributes
- Phenotype
  - Particular instance of a phenome realized as a unique creature
  - Product of genotype and environmental forces
Conversions
- In the real world, mapping between genotypes and phenotypes is hard
- In AI work it can be done by defining a convenient function or even designing encodings by hand
- It is often easier to adapt the genetic operators to work directly on the data structure used to represent the phenotype than to encode and decode phenotypes
Genetic Algorithmic Process
- Potential solutions for a problem domain are encoded using a machine representation (e.g. bit strings) that supports variation and selection operations
- Mating and mutation operations produce a new generation of solutions from parent encodings
- A fitness function judges which individuals are “best” suited (e.g. the most appropriate problem solution) for “survival”
Initialization
- The initial population must be a representative sample of the search space
- Random initialization can be a good idea (if the sample is large enough)
- The random number generator must not be biased (e.g. use the Mersenne Twister)
- Can reuse or seed the population with existing genotypes based on algorithms, expert opinion, or previous evolutionary cycles
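Random initialization might be sketched as follows, assuming genotypes are fixed-length bit strings and using the standard library's Mersenne Twister (`std::mt19937`). The function name `randomPopulation` and its parameters are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: build `count` random bit strings of `length` bits each.
// The caller owns the generator so a run can be reproduced from its seed.
std::vector<std::string> randomPopulation(std::size_t count,
                                          std::size_t length,
                                          std::mt19937& rng) {
    assert(length > 0);
    std::bernoulli_distribution coin(0.5);  // each bit is 0 or 1 with equal odds
    std::vector<std::string> population;
    population.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        std::string genotype(length, '0');
        for (char& bit : genotype) {
            if (coin(rng)) {
                bit = '1';
            }
        }
        population.push_back(genotype);
    }
    return population;
}
```

Seeding with known genotypes would amount to appending them to the returned vector before the first evaluation.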
Evaluation
- Each member of the population can be seen as a candidate solution to a problem
- The fitness function determines the quality of each solution
- The fitness function takes a phenotype and returns a floating point number as its score
- It is problem dependent, so it can be very simple
- It can be a bottleneck if it is not carefully thought out (there are no magic ways to create them)
Selection
- Want to give preference to “better” individuals when adding to the mating pool
- Mating can be sexual or asexual
- If the entire population ends up being selected, it may be desirable to conduct a tournament to order the individuals in the population
- Would like to keep the best in the mating pool and drop the worst (elitism)
- Elitism is a trade-off with search space completeness
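One common way to run a tournament is k-way tournament selection, sketched below. This is an assumption about what the slide has in mind, not a statement of the course's method; `tournamentSelect` and `rounds` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draw `rounds` random contestants (with replacement)
// and return the index of the fittest one. More rounds means stronger
// preference for "better" individuals.
std::size_t tournamentSelect(const std::vector<double>& fitness,
                             std::size_t rounds,
                             std::mt19937& rng) {
    assert(!fitness.empty() && rounds > 0);
    std::uniform_int_distribution<std::size_t> pick(0, fitness.size() - 1);
    std::size_t best = pick(rng);
    for (std::size_t i = 1; i < rounds; ++i) {
        std::size_t challenger = pick(rng);
        if (fitness[challenger] > fitness[best]) {
            best = challenger;
        }
    }
    return best;
}
```

With `rounds` equal to 1 this degenerates to uniform random choice, which is the opposite extreme from strict elitism.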
Crossover - 1
- In sexual reproduction the genetic codes of both parents are combined to create offspring
- Asexual crossover has no impact on the mating pool
- Would like to keep a 60/40 split between parent contributions
- 95/5 splits negate the benefits of crossover (too much like asexual reproduction)
Crossover - 2
- If we have selected two strings A = 11111 and B = 00000
- We might choose a uniformly random site (e.g. position 3) and trade bits
- This would create two new strings A’ = 11100 and B’ = 00011
- These new strings might then be added to the mating pool if they are “fit”
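The bit-trading step in this example can be sketched as a one-point crossover on strings. The function name `onePointCrossover` is hypothetical; the crossover site is passed in explicitly so the slide's example can be reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: keep the first `site` bits of each parent and
// swap everything after that position.
std::pair<std::string, std::string> onePointCrossover(const std::string& a,
                                                      const std::string& b,
                                                      std::size_t site) {
    assert(a.size() == b.size() && site <= a.size());
    std::string childA = a.substr(0, site) + b.substr(site);
    std::string childB = b.substr(0, site) + a.substr(site);
    return {childA, childB};
}
```

Calling `onePointCrossover("11111", "00000", 3)` yields the pair 11100 and 00011 from the slide.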
Mutation
- Mutations happen at the genome level (rarely and not good) and the genotype level (better for the GA process)
- Mutation is important for maintaining diversity in the genetic code
- In humans, mutation was responsible for the evolution of intelligence
- Example: the occasional (low-probability) alteration of a bit position in a string
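The bit-alteration example might look like this, assuming each bit is flipped independently with a small probability. The name `mutate` and the `rate` parameter are hypothetical.

```cpp
#include <cassert>
#include <random>
#include <string>

// Hypothetical helper: flip each bit of the genotype independently
// with probability `rate` and return the mutated copy.
std::string mutate(std::string genotype, double rate, std::mt19937& rng) {
    assert(rate >= 0.0 && rate <= 1.0);
    std::bernoulli_distribution flip(rate);
    for (char& bit : genotype) {
        if (flip(rng)) {
            bit = (bit == '0') ? '1' : '0';
        }
    }
    return genotype;
}
```

A rate of 0 leaves the string untouched and a rate of 1 inverts every bit; useful rates sit close to the low end.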
Operators
- Selection and mutation
  - When used together give us a genetic algorithm equivalent to a parallel, noise-tolerant hill climbing algorithm
- Selection, crossover, and mutation
  - Provide an insurance policy against losing population diversity and avoid some of the pitfalls of ordinary “hill climbing”
Replacement
- Determine when to insert new offspring into the mating pool and which individuals to drop based on fitness
- Steady-state evolution keeps the same number of individuals in the population, so each new offspring is processed one at a time and fit individuals can remain for a long time
- In generational evolution, the offspring are placed into a new population with all other offspring (genetic code only survives in the kids)
Genetic Algorithm
Set time t = 0
Initialize population P(t)
While termination condition not met
    Evaluate fitness of each member of P(t)
    Select members from P(t) based on fitness
    Produce offspring from the selected pairs
    Replace members of P(t) with better offspring
    Set time t = t + 1
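A minimal sketch of this loop, under several assumptions that are not in the slides: genotypes are bit strings, the toy fitness simply counts 1 bits, parents are chosen by two-way tournament, and termination is a fixed generation count. All names (`fitnessOf`, `selectParent`, `evolve`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Toy fitness (assumed for illustration only): number of 1 bits.
int fitnessOf(const std::string& genotype) {
    return static_cast<int>(std::count(genotype.begin(), genotype.end(), '1'));
}

// Select: return the fitter of two randomly chosen members.
const std::string& selectParent(const std::vector<std::string>& pop,
                                std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, pop.size() - 1);
    const std::string& a = pop[pick(rng)];
    const std::string& b = pop[pick(rng)];
    return fitnessOf(a) >= fitnessOf(b) ? a : b;
}

// Run the loop for `generations` steps and return the best genotype found.
std::string evolve(std::size_t popSize, std::size_t bits, int generations,
                   std::mt19937& rng) {
    assert(popSize >= 2 && bits >= 2);
    std::bernoulli_distribution coin(0.5);
    std::bernoulli_distribution flip(1.0 / static_cast<double>(bits));
    std::uniform_int_distribution<std::size_t> site(1, bits - 1);

    // Initialize population P(0) with random bit strings.
    std::vector<std::string> pop(popSize, std::string(bits, '0'));
    for (std::string& genotype : pop) {
        for (char& bit : genotype) {
            if (coin(rng)) bit = '1';
        }
    }

    auto lessFit = [](const std::string& x, const std::string& y) {
        return fitnessOf(x) < fitnessOf(y);
    };

    for (int t = 0; t < generations; ++t) {
        for (std::size_t i = 0; i < popSize; ++i) {
            // Produce one offspring: one-point crossover, then bit-flip mutation.
            const std::string& mum = selectParent(pop, rng);
            const std::string& dad = selectParent(pop, rng);
            std::size_t cut = site(rng);
            std::string child = mum.substr(0, cut) + dad.substr(cut);
            for (char& bit : child) {
                if (flip(rng)) bit = (bit == '0') ? '1' : '0';
            }
            // Replace the current worst member only if the child is better.
            auto worst = std::min_element(pop.begin(), pop.end(), lessFit);
            if (fitnessOf(child) > fitnessOf(*worst)) *worst = child;
        }
    }
    return *std::max_element(pop.begin(), pop.end(), lessFit);
}
```

Fitness is recomputed on demand here for brevity; a real fitness function would normally be evaluated once per individual and cached.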
Why use genetic algorithms?
- They can solve hard problems
- Easy to interface genetic algorithms to existing simulations and models
- GAs are extensible
- GAs are easy to hybridize
- GAs work by sampling, so populations can be sized to detect differences with specified error rates
- Use little problem-specific code
TSP
- To use a genetic algorithm to solve the traveling salesman problem we could begin by creating a population of candidate solutions
- We need to define mutation, crossover, and selection methods to aid in evolving a solution from this population
TSP
- For crossover we might take two paths (P1 and P2), break them at arbitrary points, and define new solutions Left1+Right2 and Left2+Right1
- For mutation we might randomly switch two cities in an existing path
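These two operators can be sketched on tours stored as vectors of city indices. The names `Tour`, `spliceCrossover`, `swapMutation`, and `isValidTour` are hypothetical. Note that splicing Left1+Right2 can repeat or omit cities, so the sketch includes a validity check as one possible constraint to apply afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Tour = std::vector<int>;  // city indices 0..n-1 in visiting order

// Crossover: left part of `p1` up to `cut`, then the right part of `p2`.
Tour spliceCrossover(const Tour& p1, const Tour& p2, std::size_t cut) {
    assert(p1.size() == p2.size() && cut <= p1.size());
    Tour child(p1.begin(), p1.begin() + static_cast<std::ptrdiff_t>(cut));
    child.insert(child.end(),
                 p2.begin() + static_cast<std::ptrdiff_t>(cut), p2.end());
    return child;
}

// Mutation: exchange the cities at positions i and j.
Tour swapMutation(Tour tour, std::size_t i, std::size_t j) {
    assert(i < tour.size() && j < tour.size());
    std::swap(tour[i], tour[j]);
    return tour;
}

// Constraint check: every city 0..n-1 appears exactly once.
bool isValidTour(const Tour& tour) {
    Tour sorted = tour;
    std::sort(sorted.begin(), sorted.end());
    for (std::size_t k = 0; k < sorted.size(); ++k) {
        if (sorted[k] != static_cast<int>(k)) return false;
    }
    return true;
}
```

Swap mutation always preserves validity, while a spliced child may need to be rejected or repaired before it joins the population.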
Evolve Algorithm for TSP
Set up initial population
For G generations
    Create M mutations and add them to the population
    Subject mutations to population constraints and determine their relative fitness
    Create C crossovers and add them to the population
    Subject crossovers to population constraints and determine their relative fitness
Premature Convergence - 1
- Occasionally a gene takes over because it is so much fitter than all others (genetic drift)
- If this is the best solution, that may be OK (if not, you may never find the optimal solution if this happens too soon)
- In large populations genetic drift is less likely to happen
- Using higher mutation rates can combat genetic drift
Premature Convergence - 2
- High levels of randomness are not always helpful to a GA
- To prevent genetic drift
  - You might have several small populations and cross-breed individuals from them
  - Take a game-of-life approach: pretend individuals live on a 2D grid and only allow breeding between neighbors (spatial organizational structure)
Slow Convergence
- Some GAs will simply fail to converge
- Similar to the plateau problem in hill climbing (need to add noise to fitness values to make them converge)
- Can increase elitism to encourage fitter individuals to spread their genes (at the risk of premature convergence)
- Increasing the level of random mutations sometimes helps
Parameters
- GAs require lots of parameters (mutation rate, crossover type, population size, fitness scaling policy)
- Can make use of a hierarchy of GAs, with a master GA setting the parameters for an ordinary GA
- Parameterless GAs have default values chosen for parameters so that human interaction is not needed for fine tuning
Domain Knowledge
- GAs do not exploit domain knowledge unless the knowledge engineer (KE) designs special policies and operators
- During initialization there can be a bias toward certain genotypes selected by the domain expert
- Can use gene-dependent mutation rates and heuristic crossover split points
- The choice of representation can affect the size and search efficiency of the problem space
GA Strengths
- Do well at avoiding local minima and can often find near-optimal solutions, since search is not restricted to small search areas
- Easy to extend by creating custom operators
- Perform well for global optimizations
- Work required to choose representations and conversion routines is acceptable
GA Weaknesses
- Do not take advantage of domain knowledge
- Not very efficient at local optimization (fine tuning solutions)
- Randomness inherent in GAs makes them hard to predict (solutions can take a long time to stumble upon)
- Require entire populations to work (takes lots of time and memory) and may not work well for real-time applications
Evolvee
- Uses existing representations (like NN)
- Realism is relatively poor
- Simple tasks (e.g. attack behaviors) do not pose any problems for it (not found in current archive)
Actions and Parameters
- Limited action set needed
- Look parameter: direction
  - Single value: up, ahead, down
- Move parameter: weights
  - Vector (projectile, collision point, impact location)
- Fire parameter:
- Jump parameter:
Sequences
- Contained in simple arrays of actions and times
- Times can be associated with actions in two ways
  - Time offset relative to previous action
  - Absolute time since start of sequence
- The order of sequences in an array is not important (this allows symmetric solutions but avoids the cost of sorting actions before evolution is complete)
Random Generation - 1
- Each time offset will be a randomly generated value within the maximum sequence length
- Action type can be encoded as a symbol randomly chosen from the set of all possible actions
- Parameter values are action specific, so they need to be chosen after the action is selected and must fall within the range allowed for that action
Random Generation - 2
- The length of all action sequences can also be generated randomly (with a maximum upper bound)
- The sequences of actions will be housed in a dynamic array
- Start time of the first action in a sequence can be reset to zero
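Random sequence generation could be sketched as below. The `TimedAction` layout, the single float `parameter`, its range of -1 to 1, and the name `randomSequence` are all assumptions made for illustration; the slides do not specify the actual data layout.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

enum class ActionType { Look, Move, Fire, Jump };

// Hypothetical gene layout: one timed action with a placeholder parameter.
struct TimedAction {
    float time;       // offset relative to the previous action
    ActionType type;  // symbol chosen from the set of possible actions
    float parameter;  // stand-in for action-specific data (assumed range [-1, 1])
};

// Build a random sequence of 1..maxLength actions in a dynamic array.
std::vector<TimedAction> randomSequence(std::size_t maxLength, float maxTime,
                                        std::mt19937& rng) {
    assert(maxLength > 0 && maxTime > 0.0f);
    std::uniform_int_distribution<std::size_t> lengthDist(1, maxLength);
    std::uniform_real_distribution<float> timeDist(0.0f, maxTime);
    std::uniform_int_distribution<int> typeDist(0, 3);
    std::uniform_real_distribution<float> paramDist(-1.0f, 1.0f);

    std::vector<TimedAction> sequence(lengthDist(rng));
    for (TimedAction& action : sequence) {
        action.time = timeDist(rng);
        // The type is drawn first; a fuller version would pick the
        // parameter range based on this type.
        action.type = static_cast<ActionType>(typeDist(rng));
        action.parameter = paramDist(rng);
    }
    sequence.front().time = 0.0f;  // first action starts at time zero
    return sequence;
}
```

Because the result is a plain `std::vector`, the one-point crossover and length-changing mutation described on the following slides reduce to ordinary array splicing and resizing.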
Crossover
- Simple one-point crossover
- Randomly split two move sequences from parents and swap subarrays to create two new children
- Fairly easy to program using arrays
Mutation
- A low-probability mutation might be to change the length of a sequence
  - Empty spaces can be filled with random actions
  - Excess actions are simply ignored
- A low-probability mutation might be to replace individual actions within existing sequences
- Gene storage time follows a normal (Gaussian) distribution
Evolution
- Population size will remain constant
- Evolution happens on request
- If an individual with unassigned fitness exists, choose it; otherwise choose two parents with probabilities proportional to their fitness for crossover/mutation
- Individuals are removed from the population using random selection based on inverse fitness
- To diversify the population, remove the poorer of two similar behaviors
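Fitness-proportional parent choice and inverse-fitness removal can both be sketched with `std::discrete_distribution`. The names `pickParent` and `pickVictim` are hypothetical, and taking 1 / fitness as the removal weight is one assumed reading of "inverse fitness" that requires strictly positive fitness values.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Choose a parent index with probability proportional to its fitness.
std::size_t pickParent(const std::vector<double>& fitness, std::mt19937& rng) {
    assert(!fitness.empty());
    std::discrete_distribution<std::size_t> wheel(fitness.begin(), fitness.end());
    return wheel(rng);
}

// Choose an index to remove with probability proportional to 1 / fitness
// (assumed interpretation; fitness values must be strictly positive).
std::size_t pickVictim(const std::vector<double>& fitness, std::mt19937& rng) {
    assert(!fitness.empty());
    std::vector<double> inverse;
    inverse.reserve(fitness.size());
    for (double f : fitness) {
        assert(f > 0.0);
        inverse.push_back(1.0 / f);
    }
    std::discrete_distribution<std::size_t> wheel(inverse.begin(), inverse.end());
    return wheel(rng);
}
```

Replacing the victim with the new offspring keeps the population size constant, which matches the steady-state scheme described on this slide.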
GA Module Interfaces
- Exported GA interface
  void ReportFitness(const float f);
- Evolvable interface methods
  void Crossover(const Individual& a, const Individual& b, const Individual& c);
  void Mutate(const Individual& a,
  void Randomize(const Individual& a);
  void Allocate(vector<Individual>& population);
  void Deallocate(vector<Individual>& population);
  void Evaluate(const Individual& a);
Computing Fitness
- Rocket Jumping
  - Assign rewards only for upward movement when the animat is not touching the floor, to avoid rewarding running up the stairs
  - Reward high jumps a lot more than lower jumps
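One way to express this reward is sketched below, assuming the animat's height and floor contact are sampled at regular intervals. The name `jumpFitness`, the two sample vectors, and the choice of squaring the total rise to favor high jumps are all assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fitness: sum the upward movement made while airborne,
// then square it so that one high jump outscores several small ones.
double jumpFitness(const std::vector<double>& height,
                   const std::vector<bool>& onFloor) {
    assert(height.size() == onFloor.size());
    double gained = 0.0;
    for (std::size_t i = 1; i < height.size(); ++i) {
        double rise = height[i] - height[i - 1];
        // Ignore descent, and ignore climbing while touching the floor
        // (for example, running up stairs).
        if (rise > 0.0 && !onFloor[i]) {
            gained += rise;
        }
    }
    return gained * gained;  // squaring is an assumed weighting
}
```

Under this weighting, doubling the airborne rise quadruples the reward, while a trace that climbs stairs with constant floor contact scores zero.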
Computing Fitness
- Dodging Fire
  - Provide 0 reward when hit and high reward when the animat escapes with no damage
  - Must include distance of the dodging movement away from the point of impact to avoid rewarding “standing still”
  - Damage to the animat must also be measured and subtracted from the fitness value
  - Use time as a 4th dimension to resolve ties
Kanga
- Makes use of a genetic algorithm
- Learns its jumping and dodging behaviors during the game
- Fitness function provides rewards on a per-jump or per-dodge basis
Evaluation - 1
- Learns to jump fairly quickly
- Multiple jumps are no problem
- Dodging behavior is also learned quickly
- Any balanced combination of vector weights (estimated point of impact, closest collision point, projectile attributes) that causes movement to safety works well
- Approach is sub-optimal but acceptable
Evaluation - 2
- Continuous fitness values are more helpful to the genetic algorithm than Boolean success indicators
- Scheme reveals how well it is possible to evolve behaviors using genetic operators
- The representation is better suited to modeling sequences than either decision trees or fuzzy rules
- Representation is incompatible with rule-based schemes