Genetic Algorithm
Outline
- Motivation
- Genetic algorithms
- An illustrative example
- Hypothesis space search
Motivation
- Evolution is known to be a successful, robust method for adaptation within biological systems.
- GAs can search spaces of hypotheses containing complex interacting parts.
- GAs are easily parallelized and can take advantage of the decreasing cost of powerful computer hardware.
Introduction of GAs
A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems. Genetic algorithms are categorized as global search heuristics. They are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology, such as inheritance, mutation, selection, and crossover (also called recombination).
Introduction of GAs Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes or the genotype or the genome) of candidate solutions (called individuals) to an optimization problem evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible.
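As a minimal illustration of the binary encoding just described, the sketch below converts between integers and fixed-width bit strings; the helper names `encode` and `decode` are assumptions for illustration, not from the slides.

```python
def encode(value, n_bits=5):
    """Encode a non-negative integer as a fixed-width bit string."""
    return format(value, f"0{n_bits}b")

def decode(bits):
    """Decode a bit string back into an integer."""
    return int(bits, 2)

# A chromosome representing the candidate solution 13, using 5 bits:
chromosome = encode(13)   # '01101'
```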
Introduction of GAs
The evolution usually starts from a population of randomly generated individuals and proceeds in generations. In each generation, the fitness of every individual in the population is evaluated; multiple individuals are selected from the current population (based on their fitness) and modified (recombined and possibly mutated) to form a new population.
Introduction of GAs The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached.
Biological Background: Chromosomes
- Genetic information is stored in the chromosomes.
- Each chromosome is built of DNA.
- Chromosomes in humans form pairs; there are 23 pairs.
- A chromosome is divided into parts called genes.
- Genes code for properties.
- Every gene has a unique position on the chromosome: its locus.
Biological Background: Reproduction
- During reproduction, "errors" occur.
- Due to these "errors", genetic variation exists.
- The most important "errors" are:
  - Recombination (crossover)
  - Mutation
Prototype of GA
- Fitness: a predefined numerical measure for the problem at hand
- Population: the algorithm operates by iteratively updating a pool of hypotheses, called the population
Genetic Algorithm
GA(Fitness, Fitness_threshold, p, r, m)
- Fitness: a function that assigns an evaluation score, given a hypothesis
- Fitness_threshold: a threshold specifying the termination criterion
- p: the number of hypotheses to be included in the population
- r: the fraction of the population to be replaced by crossover at each step
- m: the mutation rate
Genetic Algorithm
- Initialize population: P ← generate p hypotheses at random
- Evaluate: for each h in P, compute Fitness(h)
- While max_h Fitness(h) < Fitness_threshold, create a new generation P_S:
  1. Select
  2. Crossover
  3. Mutate
  4. Update: P ← P_S
  5. Evaluate: update Fitness(h) for each h in P
- Return the hypothesis from P that has the highest fitness
Genetic Algorithm
- Select: probabilistically select (1-r)·p members of P to add to P_S, where the probability of selecting h_i is proportional to its fitness: Pr(h_i) = Fitness(h_i) / Σ_j Fitness(h_j)
- Crossover: probabilistically select r·p/2 pairs of hypotheses from P, according to Pr(h_i); for each pair, produce two offspring and add them to P_S
- Mutate: choose m percent of the members of P_S with uniform probability and invert one randomly selected bit in each one's representation
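The prototype above can be sketched as runnable Python. Single-point crossover, the `max_gens` safety cap, and the parameter values in the example call are assumptions for illustration; only the overall structure (select, crossover, mutate, update) comes from the slides.

```python
import random

def ga(fitness, fitness_threshold, p, r, m, n_bits, max_gens=200):
    """Sketch of GA(Fitness, Fitness_threshold, p, r, m) over bit strings."""
    pop = ["".join(random.choice("01") for _ in range(n_bits)) for _ in range(p)]
    for _ in range(max_gens):
        fits = [fitness(h) for h in pop]
        if max(fits) >= fitness_threshold:
            break
        # Select: (1-r)*p survivors, chosen with probability proportional to fitness
        survivors = random.choices(pop, weights=fits, k=int((1 - r) * p))
        # Crossover: r*p/2 pairs, each producing two offspring (single-point)
        offspring = []
        for _ in range(int(r * p / 2)):
            a, b = random.choices(pop, weights=fits, k=2)
            point = random.randrange(1, n_bits)
            offspring += [a[:point] + b[point:], b[:point] + a[point:]]
        pop = survivors + offspring  # choose p and r so the sizes add up to p
        # Mutate: flip one random bit in an m fraction of the new population
        for i in random.sample(range(len(pop)), k=int(m * len(pop))):
            j = random.randrange(n_bits)
            pop[i] = pop[i][:j] + ("1" if pop[i][j] == "0" else "0") + pop[i][j + 1:]
    return max(pop, key=fitness)

# Example: maximize f(s) = s^2 over 5-bit strings (optimum is 11111 = 31)
best = ga(lambda h: int(h, 2) ** 2, 31 ** 2, p=20, r=0.6, m=0.1, n_bits=5)
```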
Genetic Algorithm
- Hypotheses in GAs are often represented by bit strings, so that they can be easily manipulated by genetic operators.
- Rule preconditions and postconditions can be encoded this way, e.g. IF Wind = Strong THEN PlayTennis = Yes.
Bit String
- Outlook takes the values Sunny, Overcast, or Rain, encoded with one bit per value: the substring 011 represents the constraint Outlook = Overcast or Rain.
- Wind takes the values Strong or Weak, encoded as a 2-bit substring: 10 represents Wind = Strong.
- Rule: (Outlook = Overcast or Rain) AND (Wind = Strong)
- Representation: Outlook 011, Wind 10, i.e. the bit string 011 10
Bit String, cont.
- Rule: IF Wind = Strong THEN PlayTennis = Yes
- Representation: Outlook 111, Wind 10, PlayTennis 10, i.e. the bit string 111 10 10
- Note: the string representing the rule contains a substring for each attribute in the hypothesis space, so an unconstrained attribute (such as Outlook here) is encoded as all 1s.
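This one-bit-per-value encoding can be sketched as follows. The attribute names and values come from the slides; the function name `encode_rule` and the dict-based representation are assumptions for illustration.

```python
# One substring per attribute, one bit per attribute value,
# 1 = value allowed by the rule; an unconstrained attribute is all 1s.
ATTRIBUTES = {
    "Outlook": ["Sunny", "Overcast", "Rain"],
    "Wind": ["Strong", "Weak"],
    "PlayTennis": ["Yes", "No"],
}

def encode_rule(constraints):
    """constraints maps attribute -> set of allowed values; an absent
    attribute is unconstrained (a "don't care" substring of all 1s)."""
    parts = []
    for attr, values in ATTRIBUTES.items():
        allowed = constraints.get(attr, set(values))
        parts.append("".join("1" if v in allowed else "0" for v in values))
    return " ".join(parts)

# IF Wind = Strong THEN PlayTennis = Yes  ->  '111 10 10'
rule = encode_rule({"Wind": {"Strong"}, "PlayTennis": {"Yes"}})
```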
Genetic Operators
- Crossover: produces two new offspring from two parent strings by copying selected bits from each parent.
  - A crossover mask determines which parent contributes each bit position.
  - For single-point crossover, the crossover point n is chosen at random.
  - For uniform crossover, the crossover mask is generated as a random bit string.
- Mutation: produces small random changes to a parent by choosing a single bit at random and changing its value.
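The mask-based operators can be sketched as below. The convention that offspring 1 takes the first parent's bit wherever the mask is 1 is an assumption (the opposite convention works equally well), and the helper names are illustrative.

```python
import random

def crossover(a, b, mask):
    """Offspring 1 takes a's bit where mask is 1, b's bit where mask is 0;
    offspring 2 is the complement."""
    o1 = "".join(x if m == "1" else y for x, y, m in zip(a, b, mask))
    o2 = "".join(y if m == "1" else x for x, y, m in zip(a, b, mask))
    return o1, o2

def single_point_mask(n, point):
    return "1" * point + "0" * (n - point)

def two_point_mask(n, p1, p2):
    return "1" * p1 + "0" * (p2 - p1) + "1" * (n - p2)

def uniform_mask(n):
    return "".join(random.choice("01") for _ in range(n))

def mutate(h):
    """Flip one randomly chosen bit."""
    i = random.randrange(len(h))
    return h[:i] + ("1" if h[i] == "0" else "0") + h[i + 1:]

o1, o2 = crossover("11101001000", "00001010101", single_point_mask(11, 5))
# o1 = '11101010101', o2 = '00001001000'
```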
Genetic Operators
[Figure: initial strings and the offspring produced by single-point, two-point, and uniform crossover]
Mutation
[Figure: a single randomly chosen bit of the parent string is flipped to produce the offspring]
Hypothesis Space Search
- The GA search is less likely to fall into the kind of local minima that can plague gradient descent methods.
- Crowding: an individual that is more highly fit than others in the population quickly reproduces, so copies of it and very similar individuals take over a large fraction of the population. Remedies include:
  - Altering the selection function
  - Fitness sharing
  - Restricting the kinds of individuals allowed to recombine to form offspring
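Fitness sharing, one of the crowding remedies just listed, can be sketched as follows. The binary niche-count form and the `radius` parameter are simplifying assumptions; real implementations often use a graded sharing function instead.

```python
def hamming(a, b):
    """Number of differing bit positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def shared_fitness(h, population, raw_fitness, radius=2):
    """Discount h's fitness by its niche count: the number of population
    members within `radius` Hamming distance of h (h itself included)."""
    niche = sum(1 for other in population if hamming(h, other) <= radius)
    return raw_fitness(h) / niche
```

With this discount, a cluster of near-identical highly fit individuals splits its fitness among its members, slowing takeover of the population.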
Population Evolution and the Schema Theorem
- Each schema represents the set of bit strings matching the indicated pattern of 0s, 1s, and *s.
  - 0*10 represents the set of bit strings that includes exactly 0010 and 0110.
- An individual bit string can be viewed as a representative of each of the different schemas it matches.
  - 0010 can be thought of as a representative of 2^4 = 16 distinct schemas, including 00**, 0*10, ****, etc.
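Schema matching is easy to make concrete; the sketch below checks membership and enumerates all schemas a string represents (the helper name `matches` is an assumption).

```python
from itertools import product

def matches(schema, bits):
    """A string is an instance of a schema if every non-* position agrees."""
    return all(s == "*" or s == b for s, b in zip(schema, bits))

assert matches("0*10", "0010") and matches("0*10", "0110")
assert not matches("0*10", "0011")

# 0010 matches 2**4 = 16 schemas: each position is either its own bit or '*'
schemas = {"".join(c) for c in product(*((b, "*") for b in "0010"))}
```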
Schema Theorem
- Let m(s, t) denote the number of instances of schema s in the population at time t.
- The schema theorem describes the expected value of m(s, t+1).
- Let us start by considering just the effect of the selection step:
  - f(h): the fitness of the individual bit string h
  - f̄(t): the average fitness of all individuals in the population at time t
Schema Theorem
- n: the total number of individuals in the population
- The probability of selecting hypothesis h in a single selection step is
  Pr(h) = f(h) / (n · f̄(t))
- h ∈ s ∩ p_t means h is both a representative of schema s and a member of the population at time t.
Schema Theorem
- Let û(s, t) denote the average fitness of instances of schema s in the population at time t.
- The probability that a single selection step picks a representative of schema s is
  Pr(h ∈ s) = Σ_{h ∈ s ∩ p_t} f(h) / (n · f̄(t)) = û(s, t) · m(s, t) / (n · f̄(t))
Schema Theorem
- Thus, the expected number of instances of s resulting from the n independent selection steps is n times this probability:
  E[m(s, t+1)] = (û(s, t) / f̄(t)) · m(s, t)
- When single-point crossover and mutation are also considered, the full schema theorem gives a lower bound:
  E[m(s, t+1)] ≥ (û(s, t) / f̄(t)) · m(s, t) · (1 − p_c · d(s)/(l − 1)) · (1 − p_m)^{o(s)}
  where p_c is the crossover probability, p_m the per-bit mutation probability, l the string length, d(s) the defining length of s, and o(s) the number of defined (non-*) bits in s.
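The selection-only formula can be checked numerically. The small example population below is an illustration (it happens to use 5-bit strings with fitness f(s) = s², as in the case study that follows); the helper names are assumptions.

```python
def matches(schema, bits):
    return all(s == "*" or s == b for s, b in zip(schema, bits))

def expected_instances(schema, population, fitness):
    """E[m(s, t+1)] under selection only: u_hat(s,t) / f_bar(t) * m(s,t)."""
    fits = [fitness(h) for h in population]
    f_bar = sum(fits) / len(population)          # average population fitness
    instance_fits = [f for h, f in zip(population, fits) if matches(schema, h)]
    m_st = len(instance_fits)                    # m(s, t)
    u_hat = sum(instance_fits) / m_st            # avg fitness of s's instances
    return u_hat / f_bar * m_st

pop = ["01101", "11000", "01000", "10011"]
e = expected_instances("1****", pop, lambda h: int(h, 2) ** 2)
```

Here the schema 1**** has two instances with above-average fitness, so its expected count grows (e ≈ 3.2 > 2).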
Case Study I
- Fitness: f(s) = s², for s < 32 (5-bit strings)
- Initial population:
  s1 = 13 (01101), s2 = 24 (11000), s3 = 8 (01000), s4 = 19 (10011)
- Fitness values:
  f(s1) = f(13) = 13² = 169
  f(s2) = f(24) = 24² = 576
  f(s3) = f(8)  = 8²  = 64
  f(s4) = f(19) = 19² = 361
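The case study selects parents by fitness-proportionate (roulette-wheel) selection: each random value r in [0, 1) picks the individual whose cumulative fitness share covers r. A sketch, using the slide's population and random values (the function name is an assumption):

```python
def roulette_select(population, fitness, r):
    """Return the individual whose cumulative fitness share covers r."""
    fits = [fitness(s) for s in population]
    total = sum(fits)                 # 169 + 576 + 64 + 361 = 1170 here
    cumulative = 0.0
    for s, f in zip(population, fits):
        cumulative += f / total
        if r < cumulative:
            return s
    return population[-1]             # guard against rounding at r close to 1

pop = [13, 24, 8, 19]
f = lambda s: s * s
# fitness shares: 13 -> 0.144, 24 -> 0.492, 8 -> 0.055, 19 -> 0.309
picked = [roulette_select(pop, f, r) for r in (0.450, 0.110, 0.572)]
# picked == [24, 13, 24], matching the slide's s1', s2', s3'
```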
Case Study I

Index | s     | Fitness | Percent | # selections
s1    | 01101 | 169     | 14.4%   | 1
s2    | 11000 | 576     | 49.2%   | 2
s3    | 01000 | 64      | 5.5%    | 0
s4    | 10011 | 361     | 30.9%   | 1
Sum   |       | 1170    |         |

- Four random values r1 = 0.450, r2 = 0.110, r3 = 0.572, and r4
- Selection: s1' = 11000 (24), s2' = 01101 (13), s3' = 11000 (24), s4' = 10011 (19)
- Crossover (last two positions, pairs 1-2 and 3-4):
  s1'' = 11001 (25), s2'' = 01100 (12)
  s3'' = 11011 (27), s4'' = 10000 (16)
Case Study I

[Table: fitness and selection counts for this generation]

- Selection: s1' = 11001 (25), s2' = 01100 (12), s3' = 11011 (27), s4' = 10000 (16)
- Crossover (last three positions, pairs 1-2 and 3-4):
  s1'' = 11100 (28), s2'' = 01001 (9)
  s3'' = 11000 (24), s4'' = 10011 (19)
Case Study I

[Table: fitness and selection counts for this generation]

- Selection: s1' = 11100 (28), s2' = 11100 (28), s3' = 11000 (24), s4' = 10011 (19)
- Crossover (last two positions, pairs 1-4 and 2-3):
  s1'' = 11111 (31), s2'' = 11100 (28)
  s3'' = 11000 (24), s4'' = 10000 (16)
Case Study II
- Use 3 bits to represent one variable; therefore 6 bits encode two variables.
- For example, 110001 means x1 = 6 and x2 = 1.
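The decoding step can be sketched in a couple of lines; the helper name `decode_pair` is an assumption.

```python
def decode_pair(bits):
    """Split a 6-bit chromosome into two 3-bit variables x1 and x2."""
    assert len(bits) == 6
    return int(bits[:3], 2), int(bits[3:], 2)

x1, x2 = decode_pair("110001")   # x1 = 6, x2 = 1
```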
Case Study II
- Initialize population: [the four initial chromosomes were lost in extraction]
- Fitness function: [formula lost in extraction]
Case Study II

[Table: Index, P(0), x1, x2, Fitness, Percent, # selections; fitness sum = 1431]
Case Study II

[Table: Index, selection, pair (1-2 and 3-4), crossover position, crossover results]
Case Study II

[Table: Index, crossover results, mutation site, mutation results]
[Table: Index, P(0), x1, x2, Fitness, Percent; fitness sum = 2351]