Algorithms and their Applications CS2004 ( ) Dr Stephen Swift 12.1 An Introduction to Genetic Algorithms
Slide 2: Previously On CS – Part 1
For the first part of the module we looked at the traditional and foundational parts of algorithms. We have looked at:
- Concepts of Computation and Algorithms
- Comparing algorithms
- Some mathematical foundation
- The Big-Oh notation
- Computational Complexity
- Data Structures
- Sorting Algorithms
- Graphs and Graph Algorithms
Slide 3: Previously On CS – Part 2
We then moved focus to Heuristic Search Algorithms:
- Concepts: Fitness, Representation, Search Space
- Methods: Hill Climbing, Stochastic Hill Climbing, Random Restart Hill Climbing, Simulated Annealing, Tabu Search, Iterated Local Search
Slide 4: Previously On CS – Part 2
For the next three lectures we are going to look at some more esoteric Heuristic Search Algorithms:
- Evolutionary Algorithms, e.g. Genetic Algorithms and Evolutionary Programming
- Swarm Algorithms, e.g. Ant Colony Optimisation and Particle Swarm Optimisation
For the final two lectures we will look at some more applications:
- Bin Packing
- Data Clustering
- The Travelling Salesman Problem
Slide 5: Genetic Algorithms
- Genetic Algorithms (GAs) are a powerful tool
- They can perform numerical optimisation and AI search, as well as other tasks
- Using them correctly comes down to experience
- GAs can help in areas where there seems to be no solution
- GAs can usually find a partial answer
- Other methods may well do better!
Slide 6: Are GAs Controversial?
- Last year I had several students walk out after five minutes of this lecture
- Why? Probably because I discuss the concepts of Evolution
- This is NOT a lecture on Evolution
- Evolution neutral!
- But we need to understand the concepts to understand Genetic Algorithms…
Slide 7: Biological Evolution – Part 1
- Genetic Algorithms “mimic” evolution
- Evolution is the change of a gene pool over time
- A gene is a biological hereditary unit that is passed on (usually unaltered) for many generations
- Genes are contained within the nucleus of a cell, within Chromosomes
- Most organisms have multiple chromosomes
Slide 8: Biological Evolution – Part 2
- The Gene Pool is the set of all genes for a species
- Evolutionary theory states “That if the environment changes, the Gene Pool must change for survival”
- This process is called adaptation
- This is apparently happening all of the time
- E.g. the 1918 Spanish Flu pandemic:
  - Approximately 1/3 of the population was infected
  - Approximately 1/7 of those infected died
  - The virus was extremely lethal in the initial stages of the outbreak
  - It had to adapt (become more infectious and less lethal) or it would have “killed” itself...
Slide 9: The Process
- Genes mutate through random change
- Individuals are selected/survive, through Natural Selection
- Populations evolve and breed through recombination
- Charles Darwin developed the basic idea in 1859
- The subject has advanced a lot since then
Slide 10: Evolution on Trial
- Evolution Theory is VERY controversial:
  - God/Creationism/Intelligent Design
  - Second Law of Thermodynamics
  - Probability Models
  - Observational Anomalies, e.g. the bat
- Whether the theory is true or false will have no bearing on these lectures, since Genetic Algorithms work!
Slide 11: History of Genetic Algorithms
- Developed by John Henry Holland in the early 1970s
- MIT, IBM, Michigan
- He was one of the first PhD students in computer science
- He hated programming!
- We will look at Holland’s original GA and then look at some of the advances
Slide 12: Genes and Chromosomes
- The technique uses biological metaphors
- Each gene is a binary digit
- A chromosome is a single string of genes (haploid)
- A solution to a problem is encoded as a chromosome
- The encoding is called the representation
- It must cover the whole search space
- A Fitness Function is needed to rate how good a solution a chromosome represents
- We should be very familiar with these concepts by now...
Slide 13: An Example Problem
A transaction processing system performs two types of task for a customer, X and Y. At most 100 task Xs and 20 task Ys can be processed per day. The customer has specified that at least twice as many task Xs as task Ys must be processed per day. There is also only enough time to process 100 tasks a day in total. The charge to the customer is €5 for a task X and €200 for a task Y. The objective is to process the number of task Xs and task Ys that meets the requirements while making the most money out of the customer.
Slide 14: Simplifying the Problem
Let Z be the objective function to be maximised
(This is a constraint satisfaction problem)
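The example problem can be written out as follows. This is a sketch read directly from the problem statement; the non-negativity and whole-number conditions are assumptions that the statement implies but does not spell out.

```latex
\begin{aligned}
\text{maximise}\quad   & Z = 5X + 200Y \\
\text{subject to}\quad & 0 \le X \le 100, \\
                       & 0 \le Y \le 20, \\
                       & X \ge 2Y, \\
                       & X + Y \le 100, \\
                       & X, Y \in \mathbb{Z}.
\end{aligned}
```

For instance, X = 80 and Y = 20 satisfies every constraint and gives Z = 5(80) + 200(20) = 4400.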
Slide 15: Chromosome Example
- We could solve this with a Genetic Algorithm
- The representation could be as follows:
[Figure: a chromosome with its bits labelled X and Y]
(Note there may be invalid chromosomes)
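One possible representation for the example problem can be sketched as below. The bit widths (7 bits for X, 5 bits for Y), the function names, and the choice to score invalid chromosomes as 0 are all assumptions made for illustration; the slides do not fix them.

```python
X_BITS = 7  # assumption: 7 bits cover 0..127, enough for X <= 100
Y_BITS = 5  # assumption: 5 bits cover 0..31, enough for Y <= 20


def decode(chromosome):
    """Split a 12-character bit string into the integers (X, Y)."""
    x = int(chromosome[:X_BITS], 2)
    y = int(chromosome[X_BITS:X_BITS + Y_BITS], 2)
    return x, y


def is_valid(chromosome):
    """Check the four constraints of the example problem."""
    x, y = decode(chromosome)
    return x <= 100 and y <= 20 and x >= 2 * y and x + y <= 100


def fitness(chromosome):
    """Income Z = 5X + 200Y; invalid chromosomes score 0 (hypothetical choice)."""
    if not is_valid(chromosome):
        return 0
    x, y = decode(chromosome)
    return 5 * x + 200 * y
```

Because 7 bits can encode values up to 127 and 5 bits up to 31, some bit strings decode to values outside the constraints, which is why invalid chromosomes can occur.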
Slide 16: Population and Generation
- An organism is a chromosome
- The population is the number of chromosomes “alive” at any one time
- The term generations is the number of times breeding has occurred
Slide 17: Crossover and Mutation
Two concepts are defined, which are referred to as Genetic Operators:
- Crossover
  - This is analogous to recombination or breeding
  - Typically genetic material from two parents is combined to create children
- Mutation
  - This is analogous to biological mutation
  - Genes are randomly changed
Slide 18: Mutation
- Each bit (gene) of a chromosome is given a chance (probability) MP of inverting
- A ‘1’ becomes a ‘0’
- A ‘0’ becomes a ‘1’
[Figure: a chromosome with the mutated bits highlighted]
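Bit-flip mutation as described above could be sketched like this. Chromosomes are assumed to be strings of '0' and '1', and the function name and the optional `rng` argument are illustrative choices, not part of the slides.

```python
import random


def mutate(chromosome, mp, rng=random):
    """Return a copy of the chromosome with each bit flipped with probability mp."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < mp else bit
        for bit in chromosome
    )
```

With `mp = 0` the chromosome comes back unchanged, and with `mp = 1` every bit is inverted.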
Slide 19: Crossover – One Point
- Each chromosome (with n genes) moves to the crossover pool with probability CP
- The chromosomes in the pool are randomly paired up (A and B)
- Each pair creates two children (C and D)
- A random number p between 2 and n-1 is generated for each parent pair
- Genes 1..p of D are copied from genes 1..p of A
- Genes p+1..n of C are copied from genes p+1..n of A
- Genes 1..p of C are copied from genes 1..p of B
- Genes p+1..n of D are copied from genes p+1..n of B
- Parents and children go back to the population
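The child construction for a single parent pair can be sketched as follows. It assumes chromosomes are equal-length bit strings with at least 3 genes, so that a cut point between 2 and n-1 exists; the function name and `rng` argument are illustrative.

```python
import random


def one_point_crossover(a, b, rng=random):
    """Create children (C, D) from parents (A, B) using a single cut point p."""
    n = len(a)
    p = rng.randint(2, n - 1)   # inclusive range 2..n-1, as on the slide
    c = b[:p] + a[p:]           # C: genes 1..p from B, genes p+1..n from A
    d = a[:p] + b[p:]           # D: genes 1..p from A, genes p+1..n from B
    return c, d
```

Python slices are 0-based, so `a[:p]` covers genes 1..p and `a[p:]` covers genes p+1..n in the slide's numbering.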
Slide 20: Crossover – Uniform
- One point crossover was used by Holland
- Uniform crossover is a more powerful extension
- For each gene, there is a 50% chance that child C gets the gene from parent A and a 50% chance that it gets it from parent B
- Child D gets the gene that child C does not
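A minimal sketch of uniform crossover for one parent pair, again assuming equal-length bit strings; the function name and `rng` argument are illustrative.

```python
import random


def uniform_crossover(a, b, rng=random):
    """For each gene, C takes it from A or B with equal chance; D takes the other."""
    c, d = [], []
    for gene_a, gene_b in zip(a, b):
        if rng.random() < 0.5:
            c.append(gene_a)
            d.append(gene_b)
        else:
            c.append(gene_b)
            d.append(gene_a)
    return "".join(c), "".join(d)
```

Unlike one-point crossover, genes that sit next to each other in a parent are no more likely to stay together than genes that are far apart.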
Slide 21: Crossover Example
[Figure: one-point and uniform crossover, showing which genes of Child 1 and Child 2 come from Parent 1 (P1) and Parent 2 (P2)]
Slide 22: Roulette
- A new population is formed, equal in size to the original population
- The chance of a chromosome surviving is proportional to its fitness relative to the total fitness of the population
- A chromosome may be chosen to survive zero or more times
- Survival of the Fittest via a biased Roulette Wheel!
- There are many other types of selection
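Roulette wheel selection could be sketched with the standard library's weighted sampling. This assumes non-negative fitness values; the fallback to a uniform choice when every fitness is zero is an added assumption, as are the function name and `rng` argument.

```python
import random


def roulette_select(population, fitnesses, rng=random):
    """Draw a new population of the same size, with replacement,
    picking each chromosome with probability fitness / total fitness."""
    if sum(fitnesses) == 0:
        # Assumption: with no fitness information, choose uniformly.
        return [rng.choice(population) for _ in population]
    return rng.choices(population, weights=fitnesses, k=len(population))
```

Sampling with replacement is what allows a chromosome to survive zero or more times.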
Slide 23: GAs - Parameters
- NG: Number of Generations
- PS: Population Size
- CP: Crossover Probability
- MP: Mutation Probability
- n: The number of bits (genes) making up each Chromosome
Slide 24: Holland’s Algorithm
Input: The GA parameters NG, PS, CP, MP and n; the Fitness Function
1) Generate PS random Chromosomes of length n
2) For i = 1 to NG
3)    Crossover Population, with chance CP per Chromosome
4)    Mutate all the Population, with chance MP per gene
5)    Kill off all Invalid Chromosomes
6)    Survival of Fittest, e.g. Roulette Wheel
7) End For
Output: The best solution to the problem is the Chromosome in the last generation (the NGth population) which has the best fitness value
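The numbered steps above can be sketched in Python as follows. This is one possible reading, not a definitive implementation: the function name `holland_ga`, the `is_valid` callback, the `seed` argument, re-seeding with random chromosomes when every chromosome is invalid, and the uniform fallback when total fitness is zero are all assumptions. Fitness values are assumed to be non-negative and n at least 3.

```python
import random


def holland_ga(fitness, is_valid, ng, ps, cp, mp, n, seed=None):
    """Sketch of the seven steps; chromosomes are strings of '0' and '1'."""
    rng = random.Random(seed)

    def random_chromosome():
        return "".join(rng.choice("01") for _ in range(n))

    def mutate(c):
        return "".join(
            ("1" if g == "0" else "0") if rng.random() < mp else g for g in c
        )

    population = [random_chromosome() for _ in range(ps)]      # step 1
    for _ in range(ng):                                        # step 2
        # Step 3: each chromosome joins the pool with chance CP, the pool
        # is paired at random, and children are added to the population.
        pool = [c for c in population if rng.random() < cp]
        rng.shuffle(pool)
        for a, b in zip(pool[0::2], pool[1::2]):
            p = rng.randint(2, n - 1)
            population += [b[:p] + a[p:], a[:p] + b[p:]]
        population = [mutate(c) for c in population]           # step 4
        population = [c for c in population if is_valid(c)]    # step 5
        if not population:
            # Assumption: restart from random chromosomes if none survive.
            population = [random_chromosome() for _ in range(ps)]
        weights = [fitness(c) for c in population]             # step 6
        if sum(weights) > 0:
            population = rng.choices(population, weights=weights, k=ps)
        else:
            population = [rng.choice(population) for _ in range(ps)]
    return max(population, key=fitness)                        # output
```

As a usage example, `holland_ga(lambda c: c.count("1"), lambda c: True, ng=30, ps=20, cp=0.7, mp=0.05, n=10, seed=1)` runs the sketch on a toy problem where fitness is simply the number of 1 bits and every chromosome is valid.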
Slide 25: Where are they Used?
- Search space is irregular
- Fitness function is noisy
- Task does not require an exact global maximum, just a good fast approximation
- No other method can help
Slide 26: Why Do They Work?
- Under what circumstances do they converge?
- Proof is very complex
- Not a proper proof
- The new correct proof is highly mathematical
- We will look at Holland’s proof, which is now out of date and old-fashioned but simple!
- Only an outline will be given
Slide 27: Definition: Schema
- A Schema is a template
- Schemata incorporate the wild card character ‘*’, a don’t care symbol
- The 4 bit schema 1*00 matches 1100 and 1000
- The value of a schema is the average fitness of all the chromosomes in a population that match that schema
- The order of a schema is the number of fixed (non-‘*’) positions it contains; its defining length is the distance between its first and last fixed positions
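Schema matching and the associated measures could be sketched as below. The sketch uses the usual textbook definitions, stated here as assumptions: order is the number of fixed (non-'*') positions, and defining length is the distance between the first and last fixed positions. All function names are illustrative.

```python
def matches(schema, chromosome):
    """True if every fixed position of the schema agrees with the chromosome."""
    return len(schema) == len(chromosome) and all(
        s in ("*", c) for s, c in zip(schema, chromosome)
    )


def order(schema):
    """Number of fixed (non-wildcard) positions."""
    return sum(1 for s in schema if s != "*")


def defining_length(schema):
    """Distance between the first and last fixed positions (0 if none)."""
    fixed = [i for i, s in enumerate(schema) if s != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


def schema_value(schema, population, fitness):
    """Average fitness of the chromosomes matching the schema, or None if none match."""
    matching = [c for c in population if matches(schema, c)]
    if not matching:
        return None
    return sum(fitness(c) for c in matching) / len(matching)
```

For the schema 1*00 from the slide, the order is 3 and the defining length is 3.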
Slide 28: Definition: Epistasis
- A gene locus is the position of a gene
- A gene can assume a number of values called alleles (0 or 1 for a simple GA)
- Epistasis is the measure of how important two loci are for determining the scoring function, relative to the distance between them
- In biological systems, a gene is epistatic if its presence suppresses the effect of another gene at another locus
Slide 29: The Schema Theorem
Short, low order, above average schemata receive increasing occurrences in subsequent generations of a Genetic Algorithm
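A common textbook form of the theorem, for one-point crossover and roulette selection, is given below using the slide parameters CP, MP and n. The remaining symbols are not defined on the slides and are introduced here as assumptions: m(H,t) is the number of chromosomes matching schema H in generation t, f(H,t) is the value of H, \bar{f}(t) is the population's average fitness, \delta(H) is the defining length of H and o(H) is its order.

```latex
E\left[ m(H, t+1) \right] \;\ge\;
  m(H, t) \, \frac{f(H, t)}{\bar{f}(t)}
  \left[ 1 - CP \, \frac{\delta(H)}{n - 1} - o(H) \, MP \right]
```

The ratio f(H,t)/\bar{f}(t) rewards above average schemata, while the bracketed term is closest to 1 when \delta(H) is small (short) and o(H) is small (low order), which is where the wording of the theorem comes from.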
Slide 30: Building Blocks
A Genetic Algorithm seeks near optimal performance through the juxtaposition of short, low order, above average schemata. These are called Building Blocks
Slide 31: Deception
- This is why a GA sometimes does not work
- Deception is defined to be a special case of epistasis
- It is rather like the Schema Theorem failing
- At the start, a building block might be high scoring, but towards the end it might be low scoring
Slide 32: Summary of Theory
- The random initial population creates random schemata
- Crossover and mutation aid the creation of schemata
- Survival of the fittest gets rid of low scoring schemata
- Hence the population’s average score tends to increase as the generation number increases
Slide 33: Parameters
- Population size: [10,100], depending on the problem
- Generations: [100,1000], depending on the problem
- Chromosome size: dependent on the problem
  - As small as possible (not too small)
  - Virtually zero invalid organisms
- Mutation rate: % (1/n)
- Crossover rate: 50%-100%
Slide 34: Other Parts/Features
- Inversion
- Niche methods
- Variable crossover and mutation
- Adaptive operators
- Carry forward methods (elitism)
- Multiple chromosomes
- Gray codes
- Floating point representation
Slide 35: Co-Evolution – Part 1
- The Fitness function is a GA
- The solution competes against the fitness function
- Predator prey scenario: an “Evolutionary Arms Race”
- The increasing fitness function kills off the solutions, forcing them to adapt more efficiently
Slide 36: Co-Evolution – Part 2
[Figure: solutions (s) matched against fitness functions (f) under three schemes: All vs. All, All vs. Best, All vs. Random]
Slide 37: The Laboratory
The laboratory will involve applying a GA to the Scales problem
Slide 38: Next Lecture
- We will look at using a GA to solve an example problem
- We will also look at other aspects of Evolutionary Computation