The Performance of Evolutionary Artificial Neural Networks in Ambiguous and Unambiguous Learning Situations
Melissa K. Carroll
October 2004

Artificial Neural Networks and Supervised Learning

Backpropagation and Associated Parameters: Gain
- Activation function: used to compute a neuron's output from its inputs
- Sigmoid function: as gain increases, the slope of the neuron's activation function increases
(Diagram: sigmoid curves for gain = 1 in red, gain = 2 in blue, and gain = 0.5 in green. Source: orr/classes/cs449/Maple/ActivationFuncs/active.html)
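The slide does not give the formula; a minimal sketch in Python, assuming the standard gain-scaled logistic sigmoid (names are illustrative, not from the thesis):

    import numpy as np

    def sigmoid(x, gain=1.0):
        # Logistic sigmoid with a gain term: 1 / (1 + exp(-gain * x)).
        # A larger gain steepens the curve (its slope at x = 0 is gain / 4).
        return 1.0 / (1.0 + np.exp(-gain * x))

    # Reproducing the three gains from the diagram's legend:
    for gain in (0.5, 1.0, 2.0):
        print(gain, sigmoid(np.array([-2.0, 0.0, 2.0]), gain))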

Effects of Learning Rate
(Diagram source: te.edu/~gorr/classes/cs449/linear2.html)

Methods to Ensure or Speed Up Convergence that Often Work
- Adjust the architecture: add more layers or more neurons per layer
- Adjust the topology, i.e. the connections between neurons
- Add a bias neuron that always outputs 1
  - No learning can occur with backprop when a neuron is outputting 0
  - Equivalent to shifting the range of the activation function, which reduces the number of neurons outputting 0
- Add a momentum term to the weight-adjustment equations: smoothes learning, allowing a high learning rate without divergence (see the sketch after this slide)
The ANN programmer must manipulate all of these parameters using expert knowledge.
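The slides do not show the weight-adjustment equations; this sketch assumes the usual momentum formulation, delta_w(t) = -lr * gradient + momentum * delta_w(t-1) (all names hypothetical):

    import numpy as np

    def momentum_update(w, grad, prev_delta, lr=0.5, momentum=0.9):
        # The previous step "smoothes" the current one, damping oscillations
        # so a higher learning rate can be used without divergence.
        delta = -lr * grad + momentum * prev_delta
        return w + delta, delta

    w, delta = np.zeros(3), np.zeros(3)
    for grad in (np.array([1.0, -2.0, 0.5]),) * 5:   # a constant toy gradient
        w, delta = momentum_update(w, grad, delta)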

Introduction: Genetic Algorithms (GAs)
- Another set of adaptive algorithms derived from a natural process (evolution)
- Organisms possess chromosomes made up of genes encoding traits
- There is variability among organisms
- Some individuals will naturally be able to reproduce more in a particular environment, making them more "fit" for that environment
- By definition, the genes of the more fit individuals become more numerous in the population
- The population is thus skewed towards individuals more fit for the given environment
- The forces of variability then act on these genes, leading to new, more "fit" discoveries

The Genetic Algorithm
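This slide was presumably a flow diagram. As a rough stand-in, here is a minimal generational GA loop in Python; the truncation selection and the operators are assumptions for illustration, not the thesis' exact scheme:

    import random

    def run_ga(fitness, random_individual, crossover, mutate,
               pop_size=50, generations=40):
        # Initialize a random population, then repeat selection,
        # crossover, and mutation for a fixed number of generations.
        population = [random_individual() for _ in range(pop_size)]
        for _ in range(generations):
            ranked = sorted(population, key=fitness, reverse=True)
            parents = ranked[:pop_size // 2]        # truncation selection
            population = [mutate(crossover(*random.sample(parents, 2)))
                          for _ in range(pop_size)]
        return max(population, key=fitness)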

Designing and Training ANNs with GAs: Rationale
- Designing the right ANN for a particular task requires manipulating all of the parameters described previously, which takes expertise and much trial and error (and sometimes luck!)
- GAs are optimizers and can optimize these parameters
- Traditional training algorithms like backpropagation tend to get stuck in local minima of multimodal or "hilly" error curves, missing the global minimum; GAs perform a "global search" and are hence more likely to find the global minimum (illustrated in the sketch below)
(Diagram source: classes/cs449/momrate.html)
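A toy illustration of the local-minimum problem; the error curve is made up for demonstration and nothing here comes from the thesis:

    import numpy as np

    def error(w):
        # A "hilly" one-dimensional error curve with two minima.
        return np.sin(3 * w) + 0.1 * w ** 2

    def gradient_descent(w, lr=0.01, steps=500):
        for _ in range(steps):
            w -= lr * (3 * np.cos(3 * w) + 0.2 * w)   # derivative of error(w)
        return w

    local = gradient_descent(1.0)    # slides into the nearest, local minimum
    sampled = min(np.random.uniform(-3, 3, 200), key=error)  # broad, global-style search
    print(error(local), error(sampled))  # the broad search finds a lower error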

Designing and Training ANNs with GAs: Implementation
- Direct (matrix) encoding
- Some classes of GAs for evolving ANNs:
  - Darwinian
  - Hybrid Darwinian
  - Baldwinian
  - Lamarckian
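A minimal sketch of what direct (matrix) encoding typically means: the network's weight matrices are flattened into a single chromosome of real-valued genes (function names are hypothetical):

    import numpy as np

    def encode(weight_matrices):
        # Chromosome = all weight matrices flattened into one gene vector.
        return np.concatenate([w.ravel() for w in weight_matrices])

    def decode(chromosome, shapes):
        # Rebuild the matrices from the flat chromosome, given their shapes.
        mats, i = [], 0
        for rows, cols in shapes:
            mats.append(chromosome[i:i + rows * cols].reshape(rows, cols))
            i += rows * cols
        return mats

    genes = encode([np.ones((2, 3)), np.zeros((3, 1))])
    layers = decode(genes, [(2, 3), (3, 1)])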

Introduction: Wisconsin Card Sorting Test (WCST)
- A psychological task requiring adaptive thinking: it measures flexibility in thought, and is therefore interesting for testing properties of ANN learning
- Requires the subject to resolve ambiguities:
  - Which card was the correct card when negative feedback is given?
  - Which rule was the current rule when a stimulus card matches a target card on more than one dimension?

Purpose and Implementation

Hypotheses Regarding Learning
- A highly accurate network trained on the unambiguous pattern should produce output identical to the training set
  - The accuracy rate of the rule-to-card network should be 100%
- A calculus proof led to the prediction that a network trained on the ambiguous pattern would output, at each node, the probability of the corresponding rule being the current rule
  - Accuracy rates should be 100%, 50%, and 33.3% for input patterns with 1, 2, and 3 associated target patterns, respectively
  - The minimum attainable error rate for the ambiguous pattern is therefore very high
- When the whole model is combined, it will be interesting to see whether the networks can generalize to data not seen in training
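The slide does not reproduce the proof; the standard argument behind such predictions is that a network trained to minimize squared error converges to the conditional mean of its targets:

    \hat{y}(x) = \arg\min_{y} \; E\big[\|t - y\|^2 \mid x\big] = E[\,t \mid x\,]

So for an input with k equally likely one-hot target patterns, each associated output node settles at 1/k, matching the predicted 100%, 50%, and 33.3% rates for k = 1, 2, 3.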

Experiment Performed
Compare the performance of six GAs and one non-GA algorithm. Algorithms tested:
- Non-GA "brute force" algorithm: try all combinations of parameters
- Darwinian, evolution-only (Pure Darwinian)
- Darwinian with additional backpropagation training (Hybrid Darwinian)
- Baldwinian, evolving architecture only
- Baldwinian, evolving architecture and weights
- Lamarckian
- One "made up" algorithm: "Reverse Baldwinian"
Motivation for the Reverse Baldwinian: produce greater variability and evaluate fitness over longer training periods without increasing computation time. (The Baldwinian/Lamarckian distinction is sketched below.)
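For readers unfamiliar with the terms, the usual distinction is whether weights learned during fitness evaluation are written back into the genome. A self-contained toy sketch, in which a gradient-descent stand-in replaces backpropagation (nothing here is the thesis' actual code):

    import numpy as np

    def local_search(weights, steps=25, lr=0.1):
        # Stand-in for backprop: gradient descent on a toy error ||w||^2.
        w = weights.copy()
        for _ in range(steps):
            w -= lr * 2 * w
        return w, float(np.sum(w ** 2))

    def evaluate(chromosome, lamarckian=False):
        trained, error = local_search(chromosome)
        fitness = 1.0 / (1.0 + error)   # learning shapes fitness (Baldwin effect)
        # Lamarckian: the learned weights replace the genome;
        # Baldwinian: the genome is left unchanged.
        genome = trained if lamarckian else chromosome
        return fitness, genome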

Hypotheses Regarding Algorithm Performance
- Good chance the GAs would outperform the non-GA algorithm, but some doubts due to known problems with GAs
- Hybrid Darwinian more effective than Pure Darwinian, based on previous research
- Baldwinian and Lamarckian more effective than Darwinian, based on previous research
- Lamarckian more effective than Baldwinian, due to the relatively short runs (approx. 40 generations)

Results and Discussion

Learning Performance
(Chart: accuracy of the best networks found by the best and second-worst algorithms on the unambiguous rule-to-card pattern)
(Chart: accuracy of the best networks found by the best and second-worst algorithms on the ambiguous card-to-rule pattern)

Sample Output of Best Card-to-Rule Learner

Nature of Learning Ambiguous Pattern

Parameters of Best Non-GA Nets

Lowest Error Rate Found by All Algorithms
** Algorithms did not include the additional 1000 training epochs; error values are the lowest attained by any of the networks produced by the GA run alone.

Performance of Pure Darwinian Algorithm

Sample Output of Best Pure Darwinian Net on Card-to-Rule Pattern

Did Evolution Work At All?
- Fitness graphs generally show an increase in fitness over generations
- T-tests show that the selection mechanism selected more fit individuals
- The best Lamarckian nets were still "better" than the best non-GA net after equivalent amounts of training
- T-tests show that error rates of nets during the Lamarckian run were significantly better than error rates for random nets at equivalent time points for the unambiguous pattern
  - However, results were the reverse for the ambiguous pattern
  - Due to the nature of the paired t-test performed, these results can't easily be explained by the theory that the assessment time point is critical

To Evolve or Not To Evolve
- General reasons why evolution may not have been appropriate in this case (in addition to those specific to the ambiguous pattern):
  - The patterns may have been easy to learn; backpropagation often outperforms GAs on weight training for easy patterns
  - Crossover is often not effective when using a matrix encoding scheme
- Although one GA did outperform the non-GA algorithm, the difference was almost irrelevant since both were highly successful
- The non-GA algorithm is easier to program and almost five times faster to run

Suggestions for Future Work
- Attempt to combine and train the entire ANN model
- Manipulate GA parameters, such as mutation rate, crossover rate, population size, and number of generations
- Try different selection mechanisms
- Use a different encoding scheme
- Experiment with a new fitness function for the ambiguous pattern
- Test different GAs, or other evolutionary algorithms altogether
- Investigate ambiguous patterns further, including the role of momentum in their non-linear learning curves

What Does It All Mean?
- Learning power of ANNs: the ANNs learned two sub-tasks that are difficult for many humans
- Ambiguous patterns may be more difficult to design and train with GAs; training them may require special modifications, such as eliminating the momentum term
- Additional support for existing theories based on prior research:
  - GAs are not as effective on easy-to-learn patterns
  - Hybrid algorithms generally outperform evolution-only algorithms
- Clarifying the properties of ANNs and GAs is tremendously useful for engineering and may also elucidate properties of natural processes