Machine Learning Evolutionary Algorithms

Slides:



Advertisements
Similar presentations
Genetic Algorithms Chapter 3. A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Genetic Algorithms GA Quick Overview Developed: USA in.
Advertisements

Genetic Algorithms Chapter 3.
1 Lecture 8: Genetic Algorithms Contents : Miming nature The steps of the algorithm –Coosing parents –Reproduction –Mutation Deeper in GA –Stochastic Universal.
EvoNet Flying Circus Introduction to Evolutionary Computation Brought to you by (insert your name) The EvoNet Training Committee The EvoNet Flying Circus.
Evolutionary Computational Intelligence
Doug Downey, adapted from Bryan Pardo, Machine Learning EECS 349 Fall 2007 Machine Learning Genetic Algorithms.
Intro to AI Genetic Algorithm Ruth Bergman Fall 2002.
Evolutionary Computational Intelligence
Intro to AI Genetic Algorithm Ruth Bergman Fall 2004.
Chapter 6: Transform and Conquer Genetic Algorithms The Design and Analysis of Algorithms.
Prepared by Barış GÖKÇE 1.  Search Methods  Evolutionary Algorithms (EA)  Characteristics of EAs  Genetic Programming (GP)  Evolutionary Programming.
Genetic Algorithm.
SOFT COMPUTING (Optimization Techniques using GA) Dr. N.Uma Maheswari Professor/CSE PSNA CET.
Introduction Chapter 1. A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Introduction Contents The basic EC metaphor Historical perspective.
Evolution Strategies Evolutionary Programming Genetic Programming Michael J. Watts
Introduction to Evolutionary Algorithms Lecture 1 Jim Smith University of the West of England, UK May/June 2012.
Genetic algorithms Prof Kang Li
Introduction Chapter 1. A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Introduction with additions/modification by Christoph F. Eick.
Genetic algorithms Charles Darwin "A man who dares to waste an hour of life has not discovered the value of life"
Genetic Algorithms 虞台文. Content Evolutional Algorithms Genetic Algorithms Main Components of Genetic Algorithms – Encoding – Fitness Function – Recombination.
Introduction to GAs: Genetic Algorithms How to apply GAs to SNA? Thank you for all pictures and information referred.
Introduction to Evolutionary Algorithms Session 4 Jim Smith University of the West of England, UK May/June 2012.
An Introduction to Genetic Algorithms Lecture 2 November, 2010 Ivan Garibay
Genetic Algorithms Chapter 3. A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Genetic Algorithms GA Quick Overview Developed: USA in.
Crossovers and Mutation Richard P. Simpson. Genotype vs. Phenotype The genotype of a chromosome is just the basic data structure (it bits in a binary.
Evolutionary Computing Chapter 2. / 24 Chapter 2: Evolutionary Computing: the Origins Historical perspective Biological inspiration: – Darwinian evolution.
Evolutionary Computing Dialects Presented by A.E. Eiben Free University Amsterdam with thanks to the EvoNet Training Committee and its “Flying Circus”
Genetic Algorithms. 2 Overview Introduction To Genetic Algorithms (GAs) GA Operators and Parameters Genetic Algorithms To Solve The Traveling Salesman.
GENETIC ALGORITHM Basic Algorithm begin set time t = 0;
D Nagesh Kumar, IIScOptimization Methods: M8L5 1 Advanced Topics in Optimization Evolutionary Algorithms for Optimization and Search.
An Introduction to Genetic Algorithms Lecture 2 November, 2010 Ivan Garibay
Genetic Algorithms. Underlying Concept  Charles Darwin outlined the principle of natural selection.  Natural Selection is the process by which evolution.
Genetic Algorithm Dr. Md. Al-amin Bhuiyan Professor, Dept. of CSE Jahangirnagar University.
Artificial Intelligence By Mr. Ejaz CIIT Sahiwal Evolutionary Computation.
Genetic Algorithms. Solution Search in Problem Space.
Genetic Algorithms And other approaches for similar applications Optimization Techniques.
 Presented By: Abdul Aziz Ghazi  Roll No:  Presented to: Sir Harris.
Genetic Algorithms Chapter 3.
Introduction to Genetic Algorithms
Chapter 14 Genetic Algorithms.
Genetic Algorithms Author: A.E. Eiben and J.E. Smith
Genetic Algorithm in TDR System
Genetic Algorithm (GA)
Genetic Algorithms.
Evolutionary Programming
Introduction to Evolutionary Computing
Evolution Strategies Evolutionary Programming
Artificial Intelligence Methods (AIM)
Introduction to Genetic Algorithm (GA)
CSC 380: Design and Analysis of Algorithms
Genetic Algorithms Chapter 3.
Basics of Genetic Algorithms (MidTerm – only in RED material)
Introduction Chapter 1.
Genetic Algorithms Chapter 3.
Genetic Algorithms Chapter 3.
Genetic Algorithms Chapter 3.
GENETIC ALGORITHMS & MACHINE LEARNING
Genetic Algorithms Chapter 3.
Basics of Genetic Algorithms
EE368 Soft Computing Genetic Algorithms.
Evolutionary Programming
Boltzmann Machine (BM) (§6.4)
A Gentle introduction Richard P. Simpson
Introduction Chapter 1.
Genetic Algorithms Chapter 3.
Genetic Algorithms Chapter 3.
Genetic Algorithm Soft Computing: use of inexact t solution to compute hard task problems. Soft computing tolerant of imprecision, uncertainty, partial.
Beyond Classical Search
Population Based Metaheuristics
CSC 380: Design and Analysis of Algorithms
Presentation transcript:

Machine Learning Evolutionary Algorithms

You are here

What is Evolutionary Computation? An abstraction from the theory of biological evolution that is used to create optimization procedures or methodologies, usually implemented on computers, that are used to solve problems.

Brief History : the ancestors 1948, Turing: proposes “genetical or evolutionary search” 1962, Bremermann optimization through evolution and recombination 1964, Rechenberg introduces evolution strategies 1965, L. Fogel, Owens and Walsh introduce evolutionary programming 1975, Holland introduces genetic algorithms 1992, Koza introduces genetic programming

Darwinian Evolution Survival of the fittest All environments have finite resources (i.e., can only support a limited number of individuals) Life forms have basic instinct/ lifecycles geared towards reproduction Therefore some kind of selection is inevitable Those individuals that compete for the resources most effectively have increased chance of reproduction Note: fitness in natural evolution is a derived, secondary measure, i.e., we (humans) assign a high fitness to individuals with many offspring

Darwinian Evolution:Summary Population consists of diverse set of individuals Combinations of traits that are better adapted tend to increase representation in population Individuals are “units of selection” Variations occur through random changes yielding constant source of diversity, coupled with selection means that: Population is the “unit of evolution” Note the absence of “guiding force”

The Concept of Natural Selection Limited number of resources Competition results in struggle for existence Success depends on fitness -- fitness of an individual: how well-adapted an individual is to their environment. This is determined by their genes (blueprints for their physical and other characteristics). Successful individuals are able to reproduce and pass on their genes

Crossing-over Chromosome pairs align and duplicate Inner pairs link at a centromere and swap parts of themselves After crossing-over one of each pair goes into each gamete

Recombination (Crossing-Over) Image from http://esg-www.mit.edu:8001/bio/mg/meiosis.html

New person cell (zygote) Fertilisation Sperm cell from Father Egg cell from Mother New person cell (zygote)

Mutation Occasionally some of the genetic material changes very slightly during this process (replication error) This means that the child might have genetic material information not inherited from either parent This can be catastrophic: offspring in not viable (most likely) neutral: new feature not influences fitness advantageous: strong new feature occurs Redundancy in the genetic code forms a good way of error checking

Genetic code All proteins in life on earth are composed of sequences built from 20 different amino acids DNA is built from four nucleotides in a double helix spiral: purines A,G; pyrimidines T,C Triplets of these from codons, each of which codes for a specific amino acid Much redundancy: purines complement pyrimidines the DNA contains much rubbish 43=64 codons code for 20 amino acids genetic code = the mapping from codons to amino acids For all natural life on earth, the genetic code is the same !

Motivations for Evolutionary Computation The best problem solver known in nature is: the (human) brain that created “the wheel, New York, wars and so on” (after Douglas Adams’ Hitch-Hikers Guide) the evolution mechanism that created the human brain (after Darwin’s Origin of Species) Answer 1  neurocomputing Answer 2  evolutionary computing

Problem type 1 : Optimisation We have a model of our system and seek inputs that give us a specified goal e.g. time tables for university, call center, or hospital design specifications, etc etc

EC A population of individuals exists in an environment with limited resources Competition for those resources causes selection of those fitter individuals that are better adapted to the environment These individuals act as seeds for the generation of new individuals through recombination and mutation The new individuals have their fitness evaluated and compete (possibly also with parents) for survival. Over time Natural selection causes a rise in the fitness of the population

General Scheme of EC

Pseudo-code for typical EA

What are the different types of EAs Historically different flavours of EAs have been associated with different representations: Binary strings : Genetic Algorithms Real-valued vectors : Evolution Strategies Trees: Genetic Programming Finite state Machines: Evolutionary Programming

Evolutionary Algorithms Parameters of EAs may differ from one type to another. Main parameters: Population size Maximum number of generations Selection factor Mutation rate Cross-over rate There are six main characteristics of EAs Representation Selection Recombination Mutation Fitness Function Survivor Decision

Example: Discrete Representation (Binary alphabet) Representation of an individual can be using discrete values (binary, integer, or any other system with a discrete set of values). Following is an example of binary representation. CHROMOSOME GENE

Evaluation (Fitness) Function Represents the requirements that the population should adapt to Called also quality function or objective function Assigns a single real-valued fitness to each phenotype which forms the basis for selection So the more discrimination (different values) the better Typically we talk about fitness being maximised Some problems may be best posed as minimisation problems

Population Holds (representations of) possible solutions Usually has a fixed size and is a multiset of genotypes Some sophisticated EAs also assert a spatial structure on the population e.g., a grid. Selection operators usually take whole population into account i.e., reproductive probabilities are relative to current generation

Parent Selection Mechanism Assigns variable probabilities of individuals acting as parents depending on their fitness Usually probabilistic high quality solutions more likely to become parents than low quality but not guaranteed even worst in current population usually has non-zero probability of becoming a parent

Mutation Acts on one genotype and delivers another Element of randomness is essential and differentiates it from other unary heuristic operators May guarantee connectedness of search space and hence convergence proofs

Recombination Merges information from parents into offspring Choice of what information to merge is stochastic Most offspring may be worse, or the same as the parents Hope is that some are better by combining elements of genotypes that lead to good traits

Survivor Selection replacement Most EAs use fixed population size so need a way of going from (parents + offspring) to next generation Often deterministic Fitness based : e.g., rank parents+offspring and take best Age based: make as many offspring as parents and delete all parents Sometimes do combination

Example: Fitness proportionate selection Expected number of times fi is selected for mating is: Better (fitter) individuals have: more space more chances to be selected Best Worst

Example: Tournament selection Select k random individuals, without replacement Take the best k is called the size of the tournament

Example: Ranked based selection Individuals are sorted on their fitness value from best to worse. The place in this sorted list is called rank. Instead of using the fitness value of an individual, the rank is used by a function to select individuals from this sorted list. The function is biased towards individuals with a high rank (= good fitness).

Example: Ranked based selection Fitness: f(A) = 5, f(B) = 2, f(C) = 19 Rank: r(A) = 2, r(B) = 3, r(C) = 1 Function: h(A) = 3, h(B) = 5, h(C) = 1 Proportion on the roulette wheel: p(A) = 11.1%, p(B) = 33.3%, p(C) = 55.6% *skip*

Initialisation / Termination Initialisation usually done at random, Need to ensure even spread and mixture of possible allele values Can include existing solutions, or use problem-specific heuristics, to “seed” the population Termination condition checked every generation Reaching some (known/hoped for) fitness Reaching some maximum allowed number of generations Reaching some minimum level of diversity Reaching some specified number of generations without fitness improvement

Algorithm performance Never draw any conclusion from a single run use statistical measures (averages, medians) from a sufficient number of independent runs From the application point of view design perspective: find a very good solution at least once production perspective: find a good solution at almost every run More detailed explanation of the actual problem tackled by the EA. Is the problem static or dynamic? Is the evaluation function subjective ? or noisy ? The more diagrams or pictures here the better - show how the EA search works within the context of the application area 3

Genetic Algorithms

GA Overview Developed: USA in the 1970’s Early names: J. Holland, K. DeJong, D. Goldberg Typically applied to: discrete optimization Attributed features: not too fast good heuristic for combinatorial problems Special Features: Traditionally emphasizes combining information from good parents (crossover) many variants, e.g., reproduction models, operators

Genetic algorithms Holland’s original GA is now known as the simple genetic algorithm (SGA) Other GAs use different: Representations Mutations Crossovers Selection mechanisms

SGA technical summary tableau Representation Binary strings Recombination N-point or uniform Mutation Bitwise bit-flipping with fixed probability Parent selection Fitness-Proportionate Survivor selection All children replace parents Speciality Emphasis on crossover

SGA reproduction cycle Select parents for the mating pool (size of mating pool = population size) Shuffle the mating pool For each consecutive pair apply crossover with probability pc , otherwise copy parents For each offspring apply mutation (bit-flip with probability pm independently for each bit) Replace the whole population with the resulting offspring

SGA operators: 1-point crossover Choose a random point on the two parents Split parents at this crossover point Create children by exchanging tails Pc typically in range (0.6, 0.9)

SGA operators: mutation Alter each gene independently with a probability pm pm is called the mutation rate Typically between 1/pop_size and 1/ chromosome_length

SGA operators: Selection Main idea: better individuals get higher chance Chances proportional to fitness Implementation: roulette wheel technique Assign to each individual a part of the roulette wheel Spin the wheel n times to select n individuals A C 1/6 = 17% 3/6 = 50% B 2/6 = 33% fitness(A) = 3 fitness(B) = 1 fitness(C) = 2

Simple problem: max x2 over {0,1,…,31} GA approach: An example Simple problem: max x2 over {0,1,…,31} GA approach: Representation: binary code, e.g. 01101  13 Population size: 4 1-point xover, bitwise mutation Roulette wheel selection Random initialisation We show one generational cycle done by hand

x2 example: selection

X2 example: crossover

X2 example: mutation

Shows many shortcomings, e.g. The simple GA Shows many shortcomings, e.g. Representation is too restrictive Mutation & crossovers only applicable for bit-string & integer representations Selection mechanism sensitive for converging populations with close fitness values Generational population model (step 5 in SGA repr. cycle) can be improved with explicit survivor selection

Two-point Crossover Two points are chosen in the strings The material falling between the two points is swapped in the string for the two offspring Example:

n-point crossover Choose n random crossover points Split along those points Glue parts, alternating between parents Generalisation of 1 point (still some positional bias)

Uniform crossover Assign 'heads' to one parent, 'tails' to the other Flip a coin for each gene of the first child Make an inverse copy of the gene for the second child Inheritance is independent of position

Cycle crossover example Step 1: identify cycles Step 2: copy alternate cycles into offspring

Crossover OR mutation? Exploration: Discovering promising areas in the search space, i.e. gaining information on the problem Exploitation: Optimising within a promising area, i.e. using information There is co-operation AND competition between them Crossover is explorative, it makes a big jump to an area somewhere “in between” two (parent) areas Mutation is exploitative, it creates random small diversions, thereby staying near (in the area of ) the parent

Other representations Gray coding of integers (still binary chromosomes) Gray coding is a mapping that means that small changes in the genotype cause small changes in the phenotype (unlike binary coding). “Smoother” genotype-phenotype mapping makes life easier for the GA Nowadays it is generally accepted that it is better to encode numerical variables directly as Integers Floating point variables

Floating point mutations Non-uniform mutations: Many methods proposed, such as time-varying range of change etc. Most schemes are probabilistic but usually only make a small change to value Most common method is to add random deviate to each variable separately, taken from N(0, ) Gaussian distribution and then curtail to range Standard deviation  controls amount of change (2/3 of drawingns will lie in range (-  to + )

Multiparent recombination Recall that we are not constricted by the practicalities of nature Noting that mutation uses 1 parent, and “traditional” crossover 2, the extension to a>2 is natural to examine Three main types: Based on allele frequencies, e.g., p-sexual voting generalising uniform crossover Based on segmentation and recombination of the parents, e.g., diagonal crossover generalising n-point crossover Based on numerical operations on real-valued alleles, e.g., center of mass crossover, generalising arithmetic recombination operators

Fitness Based Competition Selection can occur in two places: Selection from current generation to take part in mating (parent selection) Selection from parents + offspring to go into next generation (survivor selection) Selection operators work on whole individual i.e. they are representation-independent Distinction between selection operators: define selection probabilities algorithms: define how probabilities are implemented

Tournament Selection All methods above rely on global population statistics Could be a bottleneck esp. on parallel machines Relies on presence of external fitness function which might not exist: e.g. evolving game players Informal Procedure: Pick k members at random then select the best of these Repeat to select more individuals

Tournament Selection 2 Probability of selecting i will depend on: Rank of i Size of sample k higher k increases selection pressure Whether contestants are picked with replacement Picking without replacement increases selection pressure Whether fittest contestant always wins (deterministic) or this happens with probability p For k = 2, time for fittest individual to take over population is the same as linear ranking with s = 2 • p

Survivor Selection Most of methods above used for parent selection Survivor selection can be divided into two approaches: Age-Based Selection e.g. SGA In SSGA can implement as “delete-random” (not recommended) or as first-in-first-out (a.k.a. delete-oldest) Fitness-Based Selection Using one of the methods above

Example application of order based GAs: JSSP Precedence constrained job shop scheduling problem J is a set of jobs. O is a set of operations M is a set of machines Able  O  M defines which machines can perform which operations Pre  O  O defines which operation should precede which Dur :  O  M  IR defines the duration of o  O on m  M The goal is now to find a schedule that is: Complete: all jobs are scheduled Correct: all conditions defined by Able and Pre are satisfied Optimal: the total duration of the schedule is minimal

Precedence constrained job shop scheduling GA Representation: individuals are permutations of operations Permutations are decoded to schedules by a decoding procedure take the first (next) operation from the individual look up its machine (here we assume there is only one) assign the earliest possible starting time on this machine, subject to machine occupation precedence relations holding for this operation in the schedule created so far fitness of a permutation is the duration of the corresponding schedule (to be minimized) use any suitable mutation and crossover use roulette wheel parent selection on inverse fitness Generational GA model for survivor selection use random initialisation

An Example Application to Transportation System Design Taken from the ACM Student Magazine Undergraduate project of Ricardo Hoar & Joanne Penner Vehicles on a road system Modelled as individual agents using ANT technology (Also an AI technique) Want to increase traffic flow Uses a GA approach to evolve solutions to: The problem of timing traffic lights Optimal solutions only known for very simple road systems Details not given about bit-string representation But traffic light switching times are real-valued numbers over a continuous space

Evolution strategies

ES quick overview Developed: Germany in the 1970’s Early names: I. Rechenberg, H.-P. Schwefel Typically applied to: numerical optimisation Attributed features: fast good optimizer for real-valued optimisation relatively much theory Special: self-adaptation of (mutation) parameters standard

Basic Evolution Strategy 1. Generate some random individuals 2. Select the p best individuals based on some selection algorithm (fitness function) 3. Use these p individuals to generate c children 4. Go to step 2, until the desired result is achieved (i.e. little difference between generations)

ES (cont) Recombination & Mutation: These operands are very similar to the those of GA. The main difference is that mutation amount is not constant in ES. There are additional parameters, sigma, which are used as the mutation amount of the original mutation amount. So, the mutation amount also gets closer to the optimum value. As recombination function, three main functions can be used: Arithmetic mean of the parents Geometric mean of the parents Discrete cross-over method.

Encoding Individuals are encoded as vectors of real numbers (object parameters) op = (o1, o2, o3, … , om) The strategy parameters control the mutation of the object parameters sp = (s1, s2, s3, … , sm) These two parameters constitute the individual’s chromosome

Object Parameter Mutation affected by the strategy parameters (another vector of real numbers) op = (8, 12, 31, … ,5) opmutated = (8.2, 11.9, 31.3, …, 5.7) sp = (.1, .3, .2, …, .5)

Example: Mutation for real valued representation Perturb values by adding some random noise Often, a Gaussian/normal distribution N(0,) is used, where 0 is the mean value  is the standard deviation and x’i = xi + N(0,i) for each parameter

Discrete Recombination Similar to crossover of genetic algorithms Equal probability of receiving each parameter from each parent (8, 12, 31, … ,5) (2, 5, 23, … , 14) (2, 12, 31, … , 14)

Intermediate Recombination Often used to adapt the strategy parameters Each child parameter is the mean value of the corresponding parent parameters (8, 12, 31, … ,5) (2, 5, 23, … , 14) (5, 8.5, 27, … , 9.5)

ES technical summary tableau Representation Real-valued vectors Recombination Discrete or intermediary Mutation Gaussian perturbation Parent selection Uniform random Survivor selection (,) or (+) Specialty Self-adaptation of mutation step sizes

Another historical example: the jet nozzle experiment Task: to optimize the shape of a jet nozzle Approach: random mutations to shape + selection Initial shape Final shape

Evolution Process p parents produce c children in each generation Four types of processes: p,c p/r,c p+c p/r+c

p,c p parents produce c children using mutation only (no recombination) The fittest p children become the parents for the next generation Parents are not part of the next generation c  p p/r,c is the above with recombination

p+c p parents produce c children using mutation only (no recombination) The fittest p individuals (parents or children) become the parents of the next generation p/r+c is the above with recombination

When to Use GA’s vs. ES’s Genetic Algorithms Evolution Strategies More important to find optimal solution (GA’s more likely to find global maximum; usually slower) Problem parameters can be represented as bit strings (computational problems) Evolution Strategies “Good enough” solution acceptable (ES’s usually faster; can readily find local maximum) Problem parameters are real numbers (engineering problems)

Quiz The following data set will be used to learn a decision tree for predicting whether students are lazy (L) or diligent (D) based on their weight (Normal or Underweight), their eye color (Amber or Violet) and the number of eyes they have (2 or 3 or 4). The following numbers may be helpful as you answer this problem without using a calculator: log2 0.1 =−3.32,log2 0.2 =−2.32,log2 0.3 =−1.73,log2 0.4 =−1.32,log2 0.5 =−1. Draw the full decision tree learned for this data What is the training set error

Weight Eye Color Num. Eyes Output N A 2 L N V 2 L U V 3 L U A 4 D N A 4 D N V 4 D U A 3 D