
Machine Learning CS 165B, Spring 2012

Course outline
Introduction (Ch. 1)
Concept learning (Ch. 2)
Decision trees (Ch. 3)
Ensemble learning
Neural Networks (Ch. 4)
Linear classifiers
Support Vector Machines
Bayesian Learning (Ch. 6)
Genetic Algorithms (Ch. 9)
Instance-based Learning (Ch. 8)
Clustering
Computational learning theory (Ch. 7)

Genetic Algorithms – History
Pioneered by John Holland in the 1970s
Became popular in the late 1980s
Based on ideas from Darwinian evolution
Can be used to solve a variety of problems that are not easy to solve using other techniques
Particularly well suited for hard problems where little is known about the underlying search space
Widely used in business, science, and engineering

Classes of Search Techniques
Search techniques fall into two broad families:
– Enumerative techniques: BFS, DFS, Dynamic Programming
– Guided random search techniques: Tabu Search, Hill Climbing, Simulated Annealing, Evolutionary Algorithms (which include Genetic Algorithms and Genetic Programming)

Background: Biological Evolution
Biological analogy for learning:
– Lamarck: species adapt over time and pass this adaptation on to their offspring
– Darwin: consistent, heritable variation among individuals in a population; natural selection of the fittest
– Mendel: a mechanism for inheriting traits; the genotype → phenotype mapping (the "code")
– Epigenetics: non-genetic inheritance

Evolution in the real world
Each cell of a living thing contains chromosomes – strings of DNA
Each chromosome contains a set of genes – blocks of DNA
Each gene determines some aspect of the organism (like eye colour)
A collection of genes is sometimes called a genotype
A collection of aspects (like eye colour) is sometimes called a phenotype
Reproduction involves recombination of genes from parents and then small amounts of mutation (errors) in copying
The fitness of an organism is how much it can reproduce before it dies
Evolution is based on "survival of the fittest"

Start with a Dream…
Suppose you have a problem
You don't know how to solve it
What can you do?
Can you use a computer to somehow find a solution for you?
This would be nice! Can it be done?

A dumb solution
A "blind generate and test" algorithm:
Repeat
  Generate a random possible solution
  Test the solution and see how good it is
Until solution is good enough
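A minimal Python sketch of this blind generate-and-test loop; the bit-string encoding, the one-max scoring function, and the threshold are placeholders chosen for the example, not from the slides:

```python
import random

def random_solution(n_bits=20):
    """Generate one random candidate encoded as a bit string."""
    return [random.randint(0, 1) for _ in range(n_bits)]

def score(candidate):
    """Toy score: count of 1-bits (stands in for a real evaluation)."""
    return sum(candidate)

def generate_and_test(threshold=18, max_tries=100000):
    """Repeat: generate a random solution, test it, stop when good enough."""
    best = None
    for _ in range(max_tries):
        candidate = random_solution()
        if best is None or score(candidate) > score(best):
            best = candidate
        if score(best) >= threshold:
            break
    return best

print(generate_and_test())
```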

Can we use this dumb idea?
Sometimes – yes:
– if there are only a few possible solutions
– and you have enough time
– then such a method could be used
For most problems – no:
– many possible solutions
– with no time to try them all
– so this method cannot be used

A "less-dumb" idea (GA)
Generate a set of random solutions
Repeat
  Test each solution in the set (rank them)
  Remove some bad solutions from the set
  Duplicate some good solutions; make small changes to some of them
Until best solution is good enough

GA(Fitness, Fitness_threshold, p, r, m)
– Fitness: function that assigns an evaluation score to a hypothesis
– Fitness_threshold: termination criterion
– p: number of hypotheses in a generation
– r: fraction of the population to be replaced
– m: mutation rate
Initialize: P ← generate p random hypotheses
Evaluate: for each h in P, compute Fitness(h)
While max_h Fitness(h) < Fitness_threshold
– Create a new generation (5 steps: Select, Crossover, Mutate, Update, Evaluate)
Return the hypothesis in P that has the highest Fitness

Create a New Generation P_S
– Select: probabilistically select (1 − r)·p members of P to be the non-replaced ("survivor") part of P_S
Add r·p new members to these survivors:
– Crossover: probabilistically select r·p / 2 pairs of hypotheses from P. For each pair ⟨h1, h2⟩, produce two offspring by applying the Crossover operator. Add all offspring to P_S
– Mutate: invert a randomly selected bit in m% of randomly chosen members of P_S
– (Apply other operators, e.g., Invert: "switch" portions of each hypothesis)
– Update: P ← P_S
– Evaluate: for each h in P, compute Fitness(h)
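To make the five steps concrete, here is a minimal Python sketch of the prototypical GA loop with the same parameters p, r, and m; the bit-string length, the one-max fitness function, the roulette-wheel helper, and the generation cap are illustrative assumptions, not part of the slides:

```python
import random

def ga(fitness, fitness_threshold, p=50, r=0.6, m=0.05, n_bits=20, max_gens=200):
    """Prototypical GA: select survivors, crossover, mutate, update, evaluate."""
    # Initialize: p random bit-string hypotheses
    P = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(p)]

    def select_one(pop, scores):
        # Fitness-proportionate (roulette-wheel) selection of a single member
        total = sum(scores)
        pick = random.uniform(0, total)
        acc = 0.0
        for h, s in zip(pop, scores):
            acc += s
            if acc >= pick:
                return h
        return pop[-1]

    for _ in range(max_gens):
        scores = [fitness(h) for h in P]                  # Evaluate
        if max(scores) >= fitness_threshold:
            break
        # Select: (1 - r) * p members survive into the next generation
        PS = [list(select_one(P, scores)) for _ in range(int((1 - r) * p))]
        # Crossover: r * p / 2 pairs produce r * p offspring
        for _ in range(int(r * p / 2)):
            h1, h2 = select_one(P, scores), select_one(P, scores)
            point = random.randint(1, n_bits - 1)         # single-point crossover
            PS.append(h1[:point] + h2[point:])
            PS.append(h2[:point] + h1[point:])
        # Mutate: flip one random bit in a fraction m of the members of PS
        for h in random.sample(PS, max(1, int(m * len(PS)))):
            i = random.randrange(n_bits)
            h[i] = 1 - h[i]
        P = PS                                            # Update
    return max(P, key=fitness)

# Toy usage: evolve a string of all 1s ("one-max")
best = ga(fitness=sum, fitness_threshold=20)
print(best, sum(best))
```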

How to encode a hypothesis/solution?
Obviously this depends on the problem!
GAs often encode solutions as fixed-length "bit strings"
Each bit represents some aspect of the proposed solution to the problem
For GAs to work, we need to be able to "test" any string and get a "score" indicating how "good" that solution is

Representing hypotheses
Hypothesis representation specifics:
– "1"s in all attribute value positions implies "don't care"
– A single "1": only one attribute value is possible
– "0"s in all positions: no attribute value is possible
Can represent rule sets using concatenation of rules
Can construct symbolic strings (computer programs) instead of bit strings

Fitness landscapes

Representing hypotheses about concepts
Hypotheses: represent as conjunctive rules / fixed-length bit strings
Enumerate attributes and possible attribute values
One approach: for each value a of attribute A, use a 1 if A = a is possible, a 0 if not
– Multiple possible values of A means multiple 1s
– A different attribute means a separate substring
Do the same for the target (postcondition)
Example: IF Wind = Strong THEN PlayTennis = Yes
– Outlook (Sunny Overcast Rain): 111 (don't care)
– Wind (Strong Weak): 10
– PlayTennis (Yes No): 10
– Bit string: 111 10 10
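A small Python sketch of this encoding; the attribute orderings (Sunny/Overcast/Rain, Strong/Weak, Yes/No) follow the slide, while the dictionary layout and helper name are assumptions of ours:

```python
# Attribute values, in a fixed order, for the PlayTennis example
ATTRIBUTES = {
    "Outlook": ["Sunny", "Overcast", "Rain"],
    "Wind": ["Strong", "Weak"],
    "PlayTennis": ["Yes", "No"],
}

def encode_rule(constraints):
    """Encode a rule as a bit string: 1 = value allowed, 0 = value excluded.
    An unconstrained attribute gets all 1s ("don't care")."""
    bits = []
    for attr, values in ATTRIBUTES.items():
        allowed = constraints.get(attr, values)      # no constraint -> all values
        bits.extend("1" if v in allowed else "0" for v in values)
    return "".join(bits)

# IF Wind = Strong THEN PlayTennis = Yes  ->  111 10 10
print(encode_rule({"Wind": ["Strong"], "PlayTennis": ["Yes"]}))   # "1111010"
```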

Typical operators
(Figure: initial strings, crossover masks, and resulting offspring for each operator)
– Single-point crossover
– Two-point crossover
– Uniform crossover
– Point mutation
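The four operators can be sketched in a few lines of Python; the example parent strings are arbitrary, and the mask-based uniform crossover is one common way to realize the operator shown in the figure:

```python
import random

def single_point_crossover(p1, p2):
    """Cut both parents at one random point and swap the tails."""
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def two_point_crossover(p1, p2):
    """Swap the segment between two random cut points."""
    i, j = sorted(random.sample(range(1, len(p1)), 2))
    return (p1[:i] + p2[i:j] + p1[j:],
            p2[:i] + p1[i:j] + p2[j:])

def uniform_crossover(p1, p2):
    """Each bit position is taken from either parent according to a random mask."""
    mask = [random.randint(0, 1) for _ in p1]
    c1 = [a if m else b for a, b, m in zip(p1, p2, mask)]
    c2 = [b if m else a for a, b, m in zip(p1, p2, mask)]
    return c1, c2

def point_mutation(h):
    """Flip one randomly chosen bit."""
    h = list(h)
    i = random.randrange(len(h))
    h[i] = 1 - h[i]
    return h

p1, p2 = [1, 1, 1, 0, 1, 0, 0], [0, 0, 0, 1, 0, 1, 1]
print(single_point_crossover(p1, p2))
print(two_point_crossover(p1, p2))
print(uniform_crossover(p1, p2))
print(point_mutation(p1))
```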

Selecting most fit hypotheses
Fitness-proportionate selection:
– Also called "roulette wheel selection"
Crowding problem possible under proportionate selection:
– One genetic structure becomes dominant
– Low diversity in the population
Alternatives to simple proportionate selection:
– Tournament selection (for a more diverse population):
  Pick h1, h2 at random with uniform probability
  With probability p, select the more fit of the two
– Rank selection:
  Sort all hypotheses by fitness
  Probability of selection is proportional to rank, not fitness
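A minimal Python sketch of the three selection schemes; the tournament probability of 0.75 and the toy population are illustrative assumptions:

```python
import random

def roulette_wheel(pop, fitness):
    """Fitness-proportionate selection: Pr(h) = f(h) / sum of all fitnesses."""
    scores = [fitness(h) for h in pop]
    pick = random.uniform(0, sum(scores))
    acc = 0.0
    for h, s in zip(pop, scores):
        acc += s
        if acc >= pick:
            return h
    return pop[-1]

def tournament(pop, fitness, p_fitter=0.75):
    """Pick two members at random; choose the fitter one with probability p_fitter."""
    h1, h2 = random.sample(pop, 2)
    better, worse = (h1, h2) if fitness(h1) >= fitness(h2) else (h2, h1)
    return better if random.random() < p_fitter else worse

def rank_selection(pop, fitness):
    """Sort by fitness; selection probability proportional to rank, not fitness."""
    ranked = sorted(pop, key=fitness)          # worst first -> rank 1
    ranks = range(1, len(ranked) + 1)
    pick = random.uniform(0, sum(ranks))
    acc = 0.0
    for h, r in zip(ranked, ranks):
        acc += r
        if acc >= pick:
            return h
    return ranked[-1]

pop = [[random.randint(0, 1) for _ in range(8)] for _ in range(6)]
print(roulette_wheel(pop, sum), tournament(pop, sum), rank_selection(pop, sum))
```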

Example: GABIL
Representation: encode rule sets as bit strings:
– IF a1 = T ∧ a2 = F THEN c = T; IF a2 = T THEN c = F
  a1 a2 c   a1 a2 c
  10 01 1   11 10 0
– Encode multiple rules; the length of a hypothesis grows with the number of rules
Fitness based on predictive correctness of hypotheses:
– Fitness(h) = (percent correct(h))²
Genetic operators:
– Need to accommodate variable-length rule sets → two-point crossover
– Standard mutation

Two-Point Crossover
Variable-length rule sets:
– A hypothesis with two rules has the form a1 a2 c a1 a2 c
– A hypothesis with three rules has the form a1 a2 c a1 a2 c a1 a2 c
The result of crossover must be a well-formed (WF) bit-string hypothesis
Idea: find corresponding positions at which to do crossover
– Choose the number of crossover points
– Choose the points in the first parent randomly
– Choose the points in the second parent so that the offspring are made up of WF rules

Crossover with variable-length bit strings
Two (initially equal-length) hypotheses:
     a1 a2 c   a1 a2 c
h1:  10 01 1   11 10 0
h2:  01 11 0   10 01 0
Choose crossover points for h1, e.g., after bits 1 and 8
Now restrict crossover points in h2 to those that produce bit strings with well-defined semantics, e.g., ⟨1, 3⟩, ⟨1, 8⟩, ⟨6, 8⟩
Example: if we choose ⟨1, 3⟩, the result is
     a1 a2 c
h3:  11 10 0
     a1 a2 c   a1 a2 c   a1 a2 c
h4:  00 01 1   11 11 0   10 01 0

GABIL results
92% correctness in its basic form
Can extend to many variants:
– Add new genetic operators, also applied probabilistically:
  AddAlternative: generalize the constraint on a_i by changing a 0 to a 1
  DropCondition: generalize the constraint on a_i by changing every bit to 1
– Performance improves to 95%
– Add two bits to each hypothesis to determine whether these operators may be applied to it:
  a1 a2 c   a1 a2 c   AA DC
– The learning strategy itself also evolves!
Typical parameters: r = 0.6, m = 0.001
Performance of GABIL is comparable to symbolic rule/tree learning methods

Example: Traveling Salesman Problem (TSP)
The traveling salesman must visit every city in his territory exactly once and then return to the starting point; given the cost of travel between all cities, how should he plan his itinerary for minimum total cost of the entire tour?
TSP is NP-complete

TSP solution by GA
A vector v = (i1 i2 … in) represents a tour (v is a permutation of {1, 2, …, n})
The fitness f of a solution is the inverse cost of the corresponding tour
Initialization: use either some heuristics or a random sample of permutations of {1, 2, …, n}
Selection: fitness-proportionate
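A small sketch of this representation and fitness in Python; the city coordinates are hypothetical, chosen only to make the example runnable:

```python
import math
import random

# Hypothetical city coordinates, just to make the example concrete
CITIES = [(0, 0), (1, 5), (2, 3), (5, 2), (6, 6), (8, 3), (7, 0), (3, 7), (4, 1)]

def tour_cost(tour):
    """Total length of the closed tour (returning to the starting city)."""
    return sum(math.dist(CITIES[tour[i]], CITIES[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def fitness(tour):
    """Fitness is the inverse of the tour cost, as on the slide."""
    return 1.0 / tour_cost(tour)

def random_tour(n=len(CITIES)):
    """Random permutation of the n cities."""
    tour = list(range(n))
    random.shuffle(tour)
    return tour

t = random_tour()
print(t, round(tour_cost(t), 2), round(fitness(t), 4))
```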

TSP Crossover
Build offspring by choosing a sub-sequence of a tour from one parent while preserving the relative order of cities from the other parent (and feasibility)
Example: p1 = (1 2 3 4 5 6 7 8 9) and p2 = (4 5 2 1 8 7 6 9 3), with cut points after positions 3 and 7
First, the segments between the cut points are copied into the offspring:
o1 = (x x x 4 5 6 7 x x) and o2 = (x x x 1 8 7 6 x x)

TSP Crossover
Next, starting from the second cut point of one parent, the cities from the other parent are copied in the same order
The sequence of cities in the second parent, starting from its second cut point, is 9 – 3 – 4 – 5 – 2 – 1 – 8 – 7 – 6
After removing the cities already in the first offspring we get 9 – 3 – 2 – 1 – 8
This sequence is placed in the first offspring, starting from its second cut point: o1 = (2 1 8 4 5 6 7 9 3)
Similarly, the second offspring is o2 = (3 4 5 1 8 7 6 9 2)
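The order-preserving crossover described above can be sketched as follows; the cut points (after positions 3 and 7) are chosen to reproduce the worked example:

```python
def order_crossover(p1, p2, cut1, cut2):
    """Order crossover (OX): keep p1[cut1:cut2], then fill the remaining
    positions, starting after the second cut point, with the cities of p2
    taken in order starting after p2's second cut point, skipping cities
    that already appear in the copied segment."""
    n = len(p1)
    child = [None] * n
    child[cut1:cut2] = p1[cut1:cut2]                       # copied segment
    # p2's cities read in order, starting from its second cut point (wrapping)
    order = [p2[(cut2 + i) % n] for i in range(n)]
    fillers = [c for c in order if c not in child[cut1:cut2]]
    # Place them into the empty slots, also starting after the second cut point
    slots = [(cut2 + i) % n for i in range(n) if child[(cut2 + i) % n] is None]
    for slot, city in zip(slots, fillers):
        child[slot] = city
    return child

p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [4, 5, 2, 1, 8, 7, 6, 9, 3]
print(order_crossover(p1, p2, 3, 7))   # -> [2, 1, 8, 4, 5, 6, 7, 9, 3]
print(order_crossover(p2, p1, 3, 7))   # -> [3, 4, 5, 1, 8, 7, 6, 9, 2]
```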

TSP Inversion
The sub-string between two randomly selected points in the path is reversed
Such a simple inversion guarantees that the resulting offspring is a legal tour
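A one-function sketch of the inversion operator; the example tour and cut points are hypothetical:

```python
import random

def inversion(tour, i=None, j=None):
    """Reverse the sub-string between two (randomly selected) points;
    the result is always another legal tour."""
    if i is None or j is None:
        i, j = sorted(random.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

# Hypothetical example: reverse the segment between positions 2 and 5
print(inversion([1, 2, 3, 4, 5, 6, 7, 8, 9], 2, 5))   # -> [1, 2, 6, 5, 4, 3, 7, 8, 9]
```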

Hypothesis space search by GA
Local minima are not much of a problem:
– Big jumps in the search space
Crowding is a problem:
– Various strategies to overcome over-representation of successful individuals:
  Tournament/rank selection
  Fitness sharing – the fitness of an individual is reduced by the presence of similar individuals

Population evolution and schemas
How can we characterize the evolution of the population in a GA?
Schema: a string containing 0s, 1s, and *s (* = don't care)
– E.g., 0*10
– A schema has many instances: 0*10 represents 0010 and 0110
– An individual represents many schemas: 0010 is an instance of 2^4 = 16 schemas
Characterize the population by
– m(s, t) = number of instances of schema s in the population at time t
The schema theorem characterizes E[m(s, t+1)] in terms of
– m(s, t)
– the operators of the genetic algorithm: selection, recombination (crossover), mutation

Characterizing Population Change
m(s, t) = number of instances of schema s in the population at time t
Find the expected value of m(s, t+1), E[m(s, t+1)], in terms of
– m(s, t)
– the breeding parameters
The schema theorem provides a lower bound on E[m(s, t+1)]:
– First consider the case of selection only
– Then add the effects of crossover and mutation

Selection only
f(h) = fitness of hypothesis (bit string) h
f̄(t) = average fitness of the population of n individuals at time t
m(s, t) = number of instances of schema s at time t
u(s, t) = average fitness of the instances of s at time t
n = size of the population
p_t = set of bit strings at time t (i.e., the population)
Probability of selecting h in one selection step:
  Pr(h) = f(h) / Σ_{i=1..n} f(h_i) = f(h) / (n · f̄(t))
Probability, in one selection, of getting an instance of s:
  Pr(h ∈ s) = Σ_{h ∈ s ∩ p_t} f(h) / (n · f̄(t)) = u(s, t) · m(s, t) / (n · f̄(t))

Schema Theorem
After n selections (an entire new generation), with selection only:
  E[m(s, t+1)] = (u(s, t) / f̄(t)) · m(s, t)
BUT crossover and mutation may reduce the number of instances of s:
  E[m(s, t+1)] ≥ (u(s, t) / f̄(t)) · m(s, t) · (1 − p_c · d(s) / (l − 1)) · (1 − p_m)^o(s)
where
  p_c = probability of the single-point crossover operator
  p_m = probability of the mutation operator
  l = length of the individual bit strings
  o(s) = number of defined (non-"*") bits in s
  d(s) = distance between the leftmost and rightmost defined bits in s
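A small Python sketch of the quantities appearing in the schema theorem; the toy population, the schema, and the values of u(s,t), f̄(t), p_c, and p_m are illustrative assumptions:

```python
def matches(schema, h):
    """True if bit string h is an instance of schema ('*' matches 0 and 1)."""
    return all(s == '*' or s == b for s, b in zip(schema, h))

def o(schema):
    """o(s): number of defined (non-*) positions."""
    return sum(c != '*' for c in schema)

def d(schema):
    """d(s): distance between leftmost and rightmost defined positions."""
    defined = [i for i, c in enumerate(schema) if c != '*']
    return defined[-1] - defined[0] if defined else 0

def schema_bound(m, u, f_bar, p_c, p_m, schema):
    """Lower bound on E[m(s, t+1)] from the schema theorem."""
    l = len(schema)
    return (u / f_bar) * m * (1 - p_c * d(schema) / (l - 1)) * (1 - p_m) ** o(schema)

pop = ["0010", "0110", "1111", "0011"]
s = "0*1*"
m = sum(matches(s, h) for h in pop)
print(m, o(s), d(s), schema_bound(m, u=1.2, f_bar=1.0, p_c=0.6, p_m=0.001, schema=s))
```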

Interpretation of the Schema theorem
Interpret E[m(s, t+1)] as
– proportional to the average fitness of the schema
– inversely proportional to the fitness of the average individual
More fit schemata tend to grow in influence under selection, crossover, and mutation, especially if they:
– have a small number of defined bits (small o(s))
– have their defined bits bundled closely together (small d(s))
The GA keeps good chunks together:
– GAs explore the search space using short, low-order schemata which, subsequently, are used for information exchange during crossover
– Building block hypothesis: a genetic algorithm seeks near-optimal performance through the juxtaposition of short, low-order, high-performance schemata, called the building blocks

Genetic Programming
Analogous ideas to the basic GA, but the elements are programs
Fitness: obtained by executing the program on a set of training data
Maintain the population size
Genetic algorithm: evolution of programs
Example: programs represented as parse trees, e.g., the tree for sin(x) + √(x² + y)

Crossover
Crossover exchanges randomly chosen subtrees between two parent program trees, producing two offspring programs.
(Figure: two parent parse trees and the two offspring created by swapping subtrees)
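A minimal sketch of subtree crossover on programs represented as nested lists; the first tree corresponds to sin(x) + √(x² + y), the second parent and the helper names are made up for illustration:

```python
import random

# A program is a nested list: [operator, arg1, arg2, ...]; leaves are symbols/numbers.
prog1 = ["+", ["sin", "x"], ["sqrt", ["+", ["^", "x", 2], "y"]]]
prog2 = ["*", ["+", "x", "y"], ["^", "x", 2]]

def subtrees(tree, path=()):
    """Yield (path, subtree) pairs for every node of the tree."""
    yield path, tree
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    copy = list(tree)
    copy[path[0]] = replace(tree[path[0]], path[1:], new)
    return copy

def crossover(t1, t2):
    """Swap a randomly chosen subtree of each parent program."""
    path1, sub1 = random.choice(list(subtrees(t1)))
    path2, sub2 = random.choice(list(subtrees(t2)))
    return replace(t1, path1, sub2), replace(t2, path2, sub1)

o1, o2 = crossover(prog1, prog2)
print(o1)
print(o2)
```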

Example: Block Problem
Goal: stack blocks to spell the word UNIVERSAL
Need an appropriate representation:
– A key issue that determines the efficiency of learning
– Need representations of terminal arguments (a "natural" representation, if possible) and primitive functions
(Figure: blocks labeled U, N, I, V, E, R, S, A, L on the table and in a stack)

Example: Block Problem
Terminals:
– CS (current stack): name of the top block on the stack, or F (False) if there is no current stack
– TB (top correct block): name of the topmost correct block on the stack (it and the blocks below it are in correct order)
– NN (next necessary): name of the next block needed above TB in the stack, or F if no more blocks are needed

Primitive Functions
(MS x) (move to stack): if block x is on the table, move x to the top of the stack and return T; otherwise, do nothing and return F
(MT x) (move to table): if block x is in the stack, move the block at the top of the stack to the table and return T; otherwise, return F
(EQ x y) (equal): return T if x equals y, and F otherwise
(NOT x): return T if x = F, else return F
(DU x y) (do until): execute expression x repeatedly until expression y returns T

Learned Program
Trained to fit 166 training problems
Using a population of 300 programs, a solution was found after 10 generations:
  (EQ (DU (MT CS) (NOT CS)) (DU (MS NN) (NOT NN)))
where
– (EQ x y): equal
– (DU x y): do until
– (MT x): move to table
– CS: current stack
– (NOT x): not x
– (MS x): move to stack
– NN: next necessary

Other examples of genetic programming
Design of electronic filter circuits:
– Discovery of circuits competitive with the best human designs
Image classification

Summary of evolutionary programming
Conducts a randomized, hill-climbing search through the space of hypotheses
Analogy to biological evolution
Learning as an optimization problem (optimize fitness)
Can be parallelized easily