Genetic Algorithms. Schematic of a neural network application to identify metabolites by mass spectrometry (MS), developed by Dr. Lars Kangas. The input to the genetic algorithm is a measure of fitness from comparison of in silico and experimental MS. The output is "chromosomes" translated into weights for a neural network that is part of the model for metabolite MS.

Very brief history of genetic algorithms: Genetic algorithms were developed by John Holland in the 1960s and 70s, author of "Adaptation in Natural and Artificial Systems". A more recent book on the subject is "An Introduction to Genetic Algorithms" by Melanie Mitchell (MIT Press, Cambridge, MA, 2002).

Natural adaptation: Populations of organisms are subjected to environmental stress. Fitness is manifested by the ability to survive and reproduce. Fitness is passed to offspring by genes that are organized on chromosomes. If environmental conditions change, evolution creates a new population with different characteristics that optimize fitness under the new conditions.

Basic tools of evolution: Recombination (crossover) occurs during reproduction; the chromosome of the offspring is a mixture of chromosomes from the parents. Mutation changes a single gene within a chromosome; to be expressed, the organism must survive and pass the modified chromosome to offspring.

Artificial adaptation: Represent a candidate solution to a problem by a chromosome. Define a fitness function on the domain of all chromosomes. Define the probabilities of crossover and mutation. Select 2 chromosomes for reproduction based on their fitness. Produce new chromosomes by crossover and mutation. Evaluate the fitness of the new chromosomes. This completes one "generation".
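The steps above can be sketched as a minimal GA generation loop. This is a sketch, not the slides' code; the default crossover and mutation rates here are placeholders, and the fitness function is supplied by the caller:

```python
import random

def one_generation(population, fitness, p_cross=0.75, p_mut=0.002):
    """Produce the next generation from `population` (lists of 0/1 bits)."""
    scores = [fitness(c) for c in population]
    total = sum(scores)
    probs = [s / total for s in scores]

    def pick():  # fitness-proportionate (roulette-wheel) selection
        r, acc = random.random(), 0.0
        for chrom, p in zip(population, probs):
            acc += p
            if r <= acc:
                return chrom
        return population[-1]

    nxt = []
    while len(nxt) < len(population):
        a, b = pick()[:], pick()[:]          # copy two selected parents
        if random.random() < p_cross:        # single-point crossover
            cut = random.randrange(1, len(a))
            a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
        for child in (a, b):
            for i in range(len(child)):      # point mutation, bit flip
                if random.random() < p_mut:
                    child[i] = 1 - child[i]
            nxt.append(child)
    return nxt[:len(population)]
```

Repeating `one_generation` until the population's fitness is nearly uniform implements the loop described above.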

Artificial adaptation continued: Over successive generations, create a population of solutions with high fitness. Repeat the whole process several times and merge the best solutions. Simple example: find the position of the maximum of a normal distribution with mean of 16 and standard deviation of 4.

Fitness function: f(x) = exp(-(x - 16)^2 / (2 * 4^2)), the normal distribution with mean 16 and standard deviation 4 (up to normalization).

Problem setup: Chromosome = binary representation of integers between 0 and 31 (requires 5 bits). 0 to 31 covers the range where fitness is significantly different from zero. Fitness of a chromosome = value of the fitness function f(x), where x is the decimal equivalent of the 5-bit binary. Crossover probability (rate) = 0.75. Mutation probability (rate) = . Size of population, n = 4.

Method to select chromosomes for refinement: Calculate fitness f(x_i) for each chromosome in the population. Assign each chromosome a discrete probability p_i = f(x_i) / Σ_j f(x_j). Use the p_i to design a roulette wheel: divide the number line between 0 and 1 into segments of length p_i in a specified order. Get r, a random number uniformly distributed between 0 and 1, and choose the chromosome of the line segment containing r.
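A minimal sketch of this roulette-wheel (fitness-proportionate) selection:

```python
import random

def roulette_select(chromosomes, fitnesses):
    """Pick one chromosome with probability p_i = f(x_i) / sum_j f(x_j)."""
    total = sum(fitnesses)
    r = random.uniform(0.0, total)   # same as laying the p_i out on [0, 1]
    acc = 0.0
    for chrom, f in zip(chromosomes, fitnesses):
        acc += f                     # walk the wheel's segments
        if r <= acc:
            return chrom
    return chromosomes[-1]           # guard against floating-point round-off
```

Fitter chromosomes own longer segments of the wheel, so they are selected more often without excluding weaker ones entirely.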

1st generation: 5-bit binary numbers chosen randomly.

00100 = 4, fitness = , p_1 =
01001 = 9, fitness = , p_2 =
11011 = 27, fitness = , p_3 =
11111 = 31, fitness = , p_4 =
Σ_i f(x_i) =

Assume the pair with the largest 2 probabilities (01001 and 11011) is selected for replication.

Crossover is selected to induce change; mutation is rejected as the method to induce change. Assume a mixing point (locus) is chosen between the first and second bit.

Evaluate fitness of the new population:

00100 = 4, fitness = , p_1 =
01011 = 11, fitness = , p_2 =
11001 = 25, fitness = , p_3 =
11111 = 31, fitness = , p_4 =
Σ_i f(x_i) = about 2 times that of the 1st generation

Repeat until the fitness of the population is almost uniform. Values of all chromosomes should be near 16.
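The worked example above can be checked in code. Fitness here is assumed to be the unnormalized Gaussian with mean 16 and σ = 4 from the problem setup; with that assumption the total fitness of the new population comes out roughly double that of the first generation, matching the slide:

```python
import math

def fitness(x):
    """Unnormalized normal distribution with mean 16, sigma 4 (assumed)."""
    return math.exp(-(x - 16) ** 2 / (2 * 4 ** 2))

def crossover(p1, p2, locus):
    """Single-point crossover of bit strings, cutting after `locus` bits."""
    return p1[:locus] + p2[locus:], p2[:locus] + p1[locus:]

# Parents chosen by the roulette wheel, locus between first and second bit:
c1, c2 = crossover("01001", "11011", 1)
x1, x2 = int(c1, 2), int(c2, 2)          # decimal values of the children

gen1 = [4, 9, 27, 31]                    # 1st generation
gen2 = [4, x1, x2, 31]                   # children replace their parents
ratio = sum(map(fitness, gen2)) / sum(map(fitness, gen1))
```

Running this gives children 01011 (11) and 11001 (25) and a fitness ratio of about 2.2.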

Crowding: In the initial chromosome population of this example, chromosome 01001 (the fittest) has 86% of the selection probability. This can potentially lead to an imbalance of fitness over diversity and limit the ability of the GA to explore new regions of the search space. Solution: penalize the choice of similar chromosomes for mating.

 and  are the mean and standard deviation of fitness in the population In early generations, selection pressure should be low to enable wider coverage of search space (large  ) In later generations selection pressure should be higher to encourage convergence to optimum solution (small  ) Sigma scaling allows variable selection pressure Sigma scaling of fitness f(x)

Positional bias: Single-point crossover lets nearby loci stay together in children. Uniform (discrete) crossover is one of several methods to avoid positional bias.

Genetic algorithm for real-valued variables: Real-valued variables can be converted to binary representation, as in the example of finding the maximum of a normal distribution, but this results in loss of significance unless one uses a large number of bits. Arithmetic crossover: given parents x = (x_1, ..., x_n) and y = (y_1, ..., y_n), choose the k-th gene at random; the children are copies of the parents with x'_k = α x_k + (1 - α) y_k and y'_k = α y_k + (1 - α) x_k, where 0 < α < 1.
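Arithmetic crossover as described can be sketched as follows (α drawn uniformly when not supplied):

```python
import random

def arithmetic_crossover(p1, p2, alpha=None):
    """Blend one randomly chosen gene of two real-valued parents.
    At locus k the children get a*p1[k] + (1-a)*p2[k] and
    a*p2[k] + (1-a)*p1[k]; all other genes are copied unchanged."""
    k = random.randrange(len(p1))
    a = random.random() if alpha is None else alpha
    c1, c2 = p1[:], p2[:]
    c1[k] = a * p1[k] + (1 - a) * p2[k]
    c2[k] = a * p2[k] + (1 - a) * p1[k]
    return c1, c2
```

Because the two blends are complementary, the sum of the gene values at the crossed locus is preserved across the pair of children.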

More methods for real-valued variables. Discrete crossover: with uniform probability, each gene of the child chromosome is chosen to be the gene of one or the other parent chromosome at the same locus. Normally distributed mutation: choose a random number from a normal distribution with zero mean and a standard deviation comparable to the size of genes (e.g. σ = 1 for genes scaled between -1 and +1); add it to a randomly chosen gene, and re-scale if needed.
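Both operators can be sketched briefly. One assumption here: instead of re-scaling after mutation, the mutated gene is clipped back into range, which is a simpler variant of the same idea:

```python
import random

def discrete_crossover(p1, p2):
    """Each child gene copied from one parent or the other, equal probability."""
    return [random.choice((a, b)) for a, b in zip(p1, p2)]

def gaussian_mutate(chrom, sigma=1.0, lo=-1.0, hi=1.0):
    """Add N(0, sigma) noise to one randomly chosen gene, then clip to
    [lo, hi] (a stand-in for the slide's 're-scale if needed')."""
    k = random.randrange(len(chrom))
    out = chrom[:]
    out[k] = min(hi, max(lo, out[k] + random.gauss(0.0, sigma)))
    return out
```

Discrete crossover treats each locus independently, which is exactly why it avoids the positional bias of single-point crossover.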

Using GA in training of ANN. ANN with 11 weights: 8 to the hidden layer, 3 to the output. Weights: w_1A, w_1B, w_2A, w_2B, w_3A, w_3B, w_0A, w_0B, w_AZ, w_BZ, w_0Z.

Chromosome for weight optimization by GA: the 11 weights, scaled to values between -1 and +1. Use the crossover and mutation methods for real numbers to modify the chromosome. Fitness function: mean squared deviation between output and target.
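A sketch of how such a chromosome might be evaluated. The 3-2-1 network layout and weight names follow the slide; the sigmoid activation and the negation of MSE (so that higher fitness is better) are assumptions:

```python
import math

def feed_forward(w, x):
    """Tiny 3-input, 2-hidden, 1-output network using the slide's 11 weights:
    8 into hidden units A and B (including biases w_0A, w_0B),
    3 into output Z (including bias w_0Z)."""
    (w1A, w1B, w2A, w2B, w3A, w3B, w0A, w0B, wAZ, wBZ, w0Z) = w
    sig = lambda s: 1.0 / (1.0 + math.exp(-s))   # assumed activation
    a = sig(w1A * x[0] + w2A * x[1] + w3A * x[2] + w0A)
    b = sig(w1B * x[0] + w2B * x[1] + w3B * x[2] + w0B)
    return sig(wAZ * a + wBZ * b + w0Z)

def chromosome_fitness(chromosome, data):
    """Chromosome = the 11 weights in [-1, +1]. Fitness = negated mean
    squared deviation between network output and target."""
    mse = sum((feed_forward(chromosome, x) - t) ** 2 for x, t in data)
    return -mse / len(data)
```

Each new chromosome produced by crossover and mutation is fed forward in this way to obtain its fitness.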

Use feed forward to determine the fitness of this new chromosome

Genetic algorithm for attribute selection Find the best subset of attributes for data mining GA is well suited to this task since, with diversity, it can explore many combinations of attributes.

WEKA's GA applied to attribute selection. Default values: population size = 20, crossover probability = 0.6, mutation probability = . Example: breast-cancer classification, Wisconsin Breast Cancer Database (breast-cancer.arff): 683 instances, 9 numerical attributes, 2 target classes (benign = 2, malignant = 4).

Tumor characteristics: 1. clump-thickness, 2. uniform-cell-size, 3. uniform-cell-shape, 4. marg-adhesion, 5. single-cell-size, 6. bare-nuclei, 7. bland-chromatin, 8. normal-nucleoli, 9. mitoses.

Examples from the dataset (the severity scores are the attributes; the last number in a row is the class label):
5,1,1,1,2,1,3,1,1,2
5,4,4,5,7,10,3,2,1,2
3,1,1,1,2,2,3,1,1,2
6,8,8,1,3,4,3,7,1,2
4,1,1,3,2,1,3,1,1,2
8,10,10,8,7,10,9,7,1,4
1,1,1,1,2,10,3,1,1,2
2,1,2,1,2,1,3,1,1,2
2,1,1,1,2,1,1,1,5,2
4,2,1,1,2,1,2,1,1,2

Chromosomes have 9 binary genes; gene k = 1 means the k-th severity score is included. Fitness: accuracy of naïve Bayes classification.
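The chromosome-to-subset mapping can be sketched as below. The `accuracy_of` callable is a hypothetical stand-in for the naïve Bayes evaluation that WEKA performs internally:

```python
def select_attributes(rows, mask):
    """Keep column k of each data row iff gene k of the binary mask is 1."""
    return [[v for v, g in zip(row, mask) if g == 1] for row in rows]

def subset_fitness(mask, rows, labels, accuracy_of):
    """Fitness of a 9-gene chromosome = classification accuracy on the
    reduced attribute set. `accuracy_of(reduced_rows, labels)` is a
    placeholder for training and scoring the naive Bayes classifier."""
    if not any(mask):
        return 0.0    # empty subset: nothing to classify on
    return accuracy_of(select_attributes(rows, mask), labels)
```

The GA then searches over the 2^9 possible masks for the subset that maximizes this fitness.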

Background on Bayesian classification

Bayes' rule for binary classification:

P(C|x) = p(x|C) P(C) / p(x)
posterior = class likelihood × prior / normalization

(Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press)

Assign the client to the class with the higher posterior. With normalization, assign to class C when P(C|x) > 0.5; P(C|x) = 0.5 is a discriminant in attribute space.

Bayes' rule for binary classification (continued): The prior is information relevant to classifying that is independent of the attributes. The class likelihood is the probability that a member of class C will have attribute x.

Example: Bayes' rule for loan approval. Prior = the risk tolerance of the bank (determined from loan-approval history). Class likelihood = is x like other high-risk applications?

Normalized Bayes' rule for binary classification: P(C|x) = p(x|C) P(C) / [p(x|C) P(C) + p(x|C') P(C')], where C' denotes the complement class. Normalization is generally not necessary for classification.

Bayes' Rule: K > 2 classes:

P(C_i|x) = p(x|C_i) P(C_i) / Σ_k p(x|C_k) P(C_k)

Estimate priors and class likelihoods from the data set. With class labels r_i^t, the estimators are

P̂(C_i) = Σ_t r_i^t / N
m_i = Σ_t r_i^t x^t / Σ_t r_i^t
S_i = Σ_t r_i^t (x^t - m_i)(x^t - m_i)^T / Σ_t r_i^t

The number of examples in a class is the estimate of its prior. Assume members of a class are Gaussian distributed; the mean and covariance parameterize the class likelihood.

Naïve Bayes classification: Assume the x_i are independent, so the off-diagonals of Σ are 0 and p(x|C) is the product of the probabilities for each component of x:

p(x|C_i) = Π_j p(x_j|C_i)

Each class has a set of means and variances for the components of the attributes in that class.
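A minimal sketch of the Gaussian naïve Bayes computation described above (per-class priors, means, and variances for each attribute; this is an illustration, not WEKA's implementation — the variance floor and log-space evaluation are assumed implementation choices):

```python
import math
from collections import defaultdict

def fit_naive_bayes(xs, ys):
    """Estimate per-class prior, attribute means, and attribute variances."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[y].append(x)
    model = {}
    for c, rows in groups.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        varis = [max(1e-9, sum((v - m) ** 2 for v in col) / n)
                 for col, m in zip(zip(*rows), means)]   # floor avoids /0
        model[c] = (n / len(xs), means, varis)
    return model

def predict(model, x):
    """Class with the highest prior * product of per-attribute Gaussian
    likelihoods, computed in log space for numerical stability."""
    def log_post(c):
        prior, means, varis = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, varis))
    return max(model, key=log_post)
```

This is the classifier whose accuracy serves as the GA's fitness in the attribute-selection example.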

Attribute selection using WEKA's genetic algorithm method: Open the file breast-cancer.arff. Check attribute 10 (class) to see the number of examples in each class.

Class distribution: benign vs. malignant.

Attribute selection using WEKA's genetic algorithm method: Open the file breast-cancer.arff. Click on attribute 10 (class) to see the number of examples in each class. Click on any other attribute.

Clump thickness: distribution of attribute scores (1 - 10) over the examples in the dataset. Severity of clump thickness is positively correlated with malignancy.

Baseline performance measures use naïve Bayes classifier

Under the Select-Attributes tab of the Weka Explorer, press the Choose button under Attribute Evaluator. Under Attribute Selection, find WrapperSubsetEval.

Click on WrapperSubsetEval to bring up the dialog box, which shows ZeroR as the default classifier. Find the Naïve Bayes classifier and click OK. The evaluator has now been selected.

Under the Select-Attributes tab of the Weka Explorer, press the Choose button under Search Method and find Genetic Search (see the package manager in Weka 3.7). Start the search with default settings, including "Use full training set".

How is a subset related to a chromosome? Fitness function: a linear scaling of the error rate of naïve Bayes classification such that the highest error rate corresponds to a fitness of zero.

Results with Weka 3.6: any subset that includes the 9th attribute has low fitness.

Increasing the number of generations to 100 does not change the attributes selected. The 9th attribute, "mitoses", has been deselected. Return to the Preprocess tab, remove "mitoses", and reclassify.

Performance with the reduced attribute set is slightly improved: misclassified malignant cases decreased by 2.

Weka has other attribute-selection techniques; see the Weka documentation for the theory. "Information gain" (InfoGainAttributeEval) is an alternative to WrapperSubsetEval with GA search. Ranker is the only Search Method that can be used with InfoGainAttributeEval.