
1 The Performance of Evolutionary Artificial Neural Networks in Ambiguous and Unambiguous Learning Situations Melissa K. Carroll October, 2004

2 Artificial Neural Networks and Supervised Learning

3 Backpropagation and Associated Parameters: Gain
- Activation function: used to compute the output of a neuron from its inputs.
- Sigmoid function with gain g: f(x) = 1 / (1 + e^(-g*x)).
- As gain increases, the slope of the neurons' activation function increases.
- Diagram legend: red: gain = 1; blue: gain = 2; green: gain = 0.5.
- Diagram source: http://www.willamette.edu/~gorr/classes/cs449/Maple/ActivationFuncs/active.html
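A minimal sketch of the gain parameter in Python (the function name and sample inputs are illustrative, not from the slide):

```python
import math

def sigmoid(x, gain=1.0):
    """Sigmoid activation 1 / (1 + e^(-gain*x)); the slope at x = 0 grows with gain."""
    return 1.0 / (1.0 + math.exp(-gain * x))

# Higher gain -> steeper transition around x = 0, matching the diagram's legend.
for g in (0.5, 1.0, 2.0):
    print(g, [round(sigmoid(x, g), 3) for x in (-2, -1, 0, 1, 2)])
```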

4 Effects of Learning Rate
- Diagram source: http://www.willamette.edu/~gorr/classes/cs449/linear2.html
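The diagram itself is not reproduced here; the usual effect it depicts can be sketched with gradient descent on a toy quadratic error E(w) = w^2 (entirely my example, not the slide's):

```python
def descend(lr, w=1.0, steps=10):
    """Gradient descent on E(w) = w**2, whose gradient is 2*w."""
    for _ in range(steps):
        w -= lr * 2 * w
    return w

print(descend(0.1))   # small rate: slow, steady progress toward the minimum at 0
print(descend(0.45))  # larger rate: much faster convergence
print(descend(1.1))   # too large: every step overshoots and the error diverges
```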

5 Methods to Ensure or Speed Up Convergence that Often Work
- Adjust architecture: add more layers or more neurons per layer.
- Adjust topology, i.e., the connections between neurons.
- Add a bias neuron that always outputs 1:
  - No learning can occur with backprop when a neuron is outputting 0.
  - Equivalent to shifting the range of the activation function.
  - Reduces the number of neurons outputting 0.
- Add a momentum term to the weight-adjustment equations (see the sketch below): smooths learning to allow a high learning rate without divergence.
- The ANN programmer must manipulate all of these parameters using expert knowledge.
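A sketch of the momentum idea on the same toy error surface (the names lr and alpha are conventional choices, not taken from the slides): the previous weight change is folded into the current one, and a learning rate that diverges on its own now converges.

```python
def descend_momentum(lr, alpha, w=1.0, steps=30):
    """Momentum update dw(t) = -lr*grad + alpha*dw(t-1) on E(w) = w**2."""
    dw = 0.0
    for _ in range(steps):
        dw = -lr * (2 * w) + alpha * dw  # gradient of w**2 is 2*w
        w += dw
    return w

print(descend_momentum(lr=1.1, alpha=0.0))  # no momentum: this rate diverges
print(descend_momentum(lr=1.1, alpha=0.5))  # same rate converges with momentum
```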

6 Introduction: Genetic Algorithms (GAs)
- Another set of adaptive algorithms derived from a natural process (evolution).
- Organisms possess chromosomes made up of genes encoding for traits.
- There is variability among organisms.
- Some individuals will naturally be able to reproduce more in a particular environment, making them more "fit" for that environment.
- By definition, the genes of the more fit individuals become more numerous in the population.
- The population is skewed towards individuals more fit for the given environment.
- Forces of variability then act on these genes, leading to new, more "fit" discoveries.

7 The Genetic Algorithm
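The slide's flowchart is not reproduced; below is a minimal Python sketch of the canonical loop it presumably depicts, with a toy bit-counting fitness function and made-up parameter values:

```python
import random

def ga(pop_size=20, genes=8, generations=40, p_mut=0.05):
    """Canonical GA loop: evaluate, select, cross over, mutate, repeat."""
    def fitness(c):
        return sum(c)  # toy fitness: number of 1-bits in the chromosome
    pop = [[random.randint(0, 1) for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        new_pop = []
        for _ in range(pop_size):
            # tournament selection: fitter of 3 random individuals, twice
            a = max(random.sample(pop, 3), key=fitness)
            b = max(random.sample(pop, 3), key=fitness)
            cut = random.randrange(1, genes)  # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation with probability p_mut per gene
            child = [g ^ (random.random() < p_mut) for g in child]
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

print(ga())  # typically converges to (or near) the all-1s chromosome
```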

8 Designing and Training ANNs with GAs: Rationale
- Designing the right ANN for a particular task requires manipulating all of the parameters described previously, which takes expertise and much trial and error (and sometimes luck!).
- GAs are optimizers and can optimize these parameters.
- Traditional training algorithms like backpropagation tend to get stuck in local minima of multimodal or "hilly" error curves, missing the global minimum.
- GAs perform a "global search" and are hence more likely to find the global minimum.
- Diagram source: http://www.willamette.edu/~gorr/classes/cs449/momrate.html

9 Designing and Training ANNs with GAs: Implementation
- Direct (matrix) encoding (see the sketch below).
- Some classes of GAs for evolving ANNs:
  - Darwinian
  - Hybrid Darwinian
  - Baldwinian
  - Lamarckian
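A hedged sketch of what direct (matrix) encoding can look like (the layer sizes and flattening order are illustrative assumptions, not the thesis's actual scheme): each connection weight becomes one gene, and the network's weight matrices are laid end-to-end in a single real-valued chromosome that the GA can cross over and mutate.

```python
import random

LAYERS = [3, 4, 2]  # hypothetical layer sizes: inputs, hidden, outputs

def random_chromosome():
    """One gene per connection weight, all matrices flattened end-to-end."""
    n = sum(a * b for a, b in zip(LAYERS, LAYERS[1:]))
    return [random.uniform(-1, 1) for _ in range(n)]

def decode(chrom):
    """Rebuild the per-layer weight matrices from the flat chromosome."""
    mats, i = [], 0
    for a, b in zip(LAYERS, LAYERS[1:]):
        mats.append([chrom[i + r * b : i + (r + 1) * b] for r in range(a)])
        i += a * b
    return mats

w = decode(random_chromosome())
print(len(w), len(w[0]), len(w[0][0]))  # 2 matrices: 3x4, then 4x2
```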

10 Introduction: Wisconsin Card Sorting Test (WCST)
- A psychological task requiring adaptive thinking: it measures flexibility in thought, and is therefore interesting for testing properties of ANN learning.
- Requires the subject to resolve ambiguities:
  - Which card was the correct card when negative feedback is given?
  - Which rule was the current rule when a stimulus card matches a target card on more than one dimension? (See the illustration below.)
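A small illustration of the second ambiguity (the card attributes below are made up in WCST style): when a stimulus matches a target on two dimensions, a correct sort does not reveal which rule is current.

```python
# Hypothetical WCST-style cards: (color, shape, number).
stimulus = ("red", "triangle", 2)
target = ("red", "triangle", 3)

dimensions = ("color", "shape", "number")
matched = [d for d, s, t in zip(dimensions, stimulus, target) if s == t]
print("stimulus matches this target on:", matched)  # ['color', 'shape']
# Placing the stimulus on this target is correct under either the color rule
# or the shape rule, so positive feedback does not identify the current rule.
```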

11 Purpose and Implementation

12 [image-only slide]

13 [image-only slide]

14 Hypotheses Regarding Learning
- A highly accurate network trained on the unambiguous pattern should produce output identical to the training set: the accuracy rate of the rule-to-card network should be 100%.
- A calculus proof led to the prediction that a network trained on the ambiguous pattern would output, at each node, the probability of the corresponding rule being the current rule (see the sketch below):
  - Accuracy rates should be 100%, 50%, and 33.3% for input patterns with 1, 2, and 3 associated target patterns, respectively.
  - The minimum error rate for the ambiguous pattern is therefore a very high 0.22916.
- When the whole model is combined, it will be interesting to see whether the networks can generalize to data not seen in training.
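A sketch of the standard argument behind that prediction (the slide does not reproduce the thesis's proof, so this is the textbook version): for a fixed input whose admissible one-hot targets occur with probabilities p_i, the expected squared error at an output node is minimized by the conditional mean of the node's target t_i ∈ {0, 1}, which equals the probability that the node's rule is current.

```latex
E(y) = \sum_i p_i \,(y - t_i)^2, \qquad
\frac{dE}{dy} = 2 \sum_i p_i \,(y - t_i) = 0
\;\Rightarrow\; y^{*} = \sum_i p_i \, t_i = P(t = 1).
```

With three equally likely targets, y* = 1/3 at each ambiguous node, matching the 33.3% accuracy figure above.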

15 Experiment Performed
- Compare the performance of six GAs and one non-GA algorithm.
- Algorithms tested:
  - Non-GA "brute force" algorithm: try all combinations of parameters
  - Darwinian, evolution-only (Pure Darwinian)
  - Darwinian with additional backpropagation training (Hybrid Darwinian)
  - Baldwinian, evolving architecture only
  - Baldwinian, evolving architecture and weights
  - Lamarckian
  - One "made up" algorithm: "Reverse Baldwinian"
- Motivation for Reverse Baldwinian: produce greater variability and evaluate fitness over longer training periods without increasing computation time.

16 Hypotheses Regarding Algorithm Performance
- Good chance the GAs would outperform the non-GA algorithm, but some doubts due to known problems with GAs.
- Hybrid Darwinian more effective than Pure Darwinian, based on previous research.
- Baldwinian and Lamarckian more effective than Darwinian, based on previous research.
- Lamarckian more effective than Baldwinian, due to the relatively short runs (approx. 40 generations).

17 Results and Discussion

18 Learning Performance
- Chart: Accuracy of Best Networks Found by Best and Second-Worst Algorithms on Unambiguous Rule-to-Card Pattern
- Chart: Accuracy of Best Networks Found by Best and Second-Worst Algorithms on Ambiguous Card-to-Rule Pattern

19 Sample Output of Best Card-to-Rule Learner

20 Nature of Learning Ambiguous Pattern

21 Parameters of Best Non-GA Nets

22 Lowest Error Rate Found by All Algorithms
** Marked algorithms did not include the additional 1000 training epochs; their error values are the lowest attained by any of the networks produced by the GA run alone.

23 Performance of Pure Darwinian Algorithm

24 Sample Output of Best Pure Darwinian Net on Card-to-Rule Pattern

25 [image-only slide]

26 [image-only slide]

27 [image-only slide]

28 [image-only slide]

29 Did Evolution Work At All?
- Fitness graphs generally show an increase in fitness over generations.
- T-tests show that the selection mechanism selected more fit individuals.
- The best Lamarckian nets were still "better" than the best non-GA net after equivalent amounts of training.
- T-tests show that error rates of nets during the Lamarckian run were significantly better than error rates for random nets at equivalent time points for the unambiguous pattern.
  - However, results were the reverse for the ambiguous pattern.
  - Due to the nature of the paired t-test performed, these results can't easily be explained by the theory that the assessment time point is critical.

30 To Evolve or Not To Evolve
- General reasons why evolution may not have been appropriate in this case (in addition to those specific to the ambiguous pattern):
  - The patterns may have been easy to learn; backpropagation often outperforms GAs on weight training for easy patterns.
  - Crossover is often not effective when using a matrix encoding scheme.
- Although one GA did outperform the non-GA algorithm, the difference was almost irrelevant since both were highly successful.
- The non-GA algorithm is easier to program and almost five times faster to run.

31 Suggestions for Future Work
- Attempt to combine and train the entire ANN model.
- Manipulate GA parameters, such as mutation rate, crossover rate, population size, and number of generations.
- Try different selection mechanisms.
- Use a different encoding scheme.
- Experiment with a new fitness function for the ambiguous pattern.
- Test different GAs, or other evolutionary algorithms altogether.
- Investigate ambiguous patterns further, including the role of momentum in their non-linear learning curves.

32 What Does It All Mean?
- Learning power of ANNs: the ANNs learned two sub-tasks that are difficult for many humans.
- Ambiguous patterns may be more difficult to design and train with GAs; training them may require special modifications, such as eliminating the momentum term.
- Additional support for existing theories based on prior research:
  - GAs are not as effective on easy-to-learn patterns.
  - Hybrid algorithms generally outperform evolution-only algorithms.
- Clarifying the properties of ANNs and GAs is tremendously useful for engineering and may also elucidate properties of natural processes.

