Stochastic Optimization and Simulated Annealing Psychology 85-419/719 January 25, 2001.

In Previous Lecture... Discussed constraint satisfaction networks, having:
– Units, weights, and a "goodness" function
Updating states involves computing the input from other units:
– Guaranteed to locally increase goodness
– Not guaranteed to reach the globally best (maximum-goodness) state
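As a reminder of that goodness function, here is a minimal sketch assuming the standard PDP form (pairwise weight contributions plus external input); the exact form used in the course handout may differ, and the names W and ext are placeholders:

```python
import numpy as np

def goodness(state, W, ext):
    """Goodness of a binary state vector: the sum over pairs of w_ij * a_i * a_j
    plus the external-input contribution ext . a. W is assumed symmetric with a
    zero diagonal; the 0.5 factor avoids counting each pair twice."""
    return 0.5 * state @ W @ state + ext @ state
```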

The General Problem: Local Optima [Figure: goodness plotted as a function of activation state, with local optima marked alongside the true (global) optimum]

How To Solve the Problem of Local Optima?
– Exhaustive search? Nah. Takes too long: n binary units have 2^n possible states
– Random re-starts? Seems wasteful.
– How about something that generally goes in the right direction, with some randomness?

Sometimes It Isn’t Best To Always Go Straight Towards The Goal
– Rubik’s Cube: undo some moves in order to make progress
– Baseball: the sacrifice fly
– Navigation: move away from the goal to get around obstacles

Randomness Can Help Us Escape Bad Solutions [Figure: goodness landscape over activation state, illustrating escape from a local optimum]

So, How Random Do We Want to Be? We can take a cue from physical systems.
In metallurgy, a metal can reach a very strong (stable) state by:
– Melting it, which scrambles its molecular structure
– Gradually cooling it
– The resulting molecular structure is very stable
New terminology: reduce energy (which is kind of like the negative of goodness)

Simulated Annealing The odds that a unit is on are a function of:
– The input to the unit, net
– The temperature, T
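The rule the next two slides pick apart is the logistic function of net/T; a minimal sketch for binary 0/1 units (the function names are placeholders):

```python
import math
import random

def prob_on(net, T):
    """Probability that a unit's output is 1: 1 / (1 + e^(-net/T))."""
    return 1.0 / (1.0 + math.exp(-net / T))

def stochastic_update(net, T):
    """Turn the unit on with probability prob_on(net, T), off otherwise."""
    return 1 if random.random() < prob_on(net, T) else 0
```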

Picking it Apart... As net increases, the probability that the output is 1 increases
– e is raised to -net/T; so as net gets big, e^(-net/T) goes to zero, and the probability goes to 1/(1+0) = 1

The Temperature Term When T is big, the exponent -net/T goes to zero
– e (or anything nonzero) to the zero power is 1
– So the probability that the output is 1 goes to 1/(1+1) = 0.5

The Temperature Term (2) When T gets small, the exponent gets big in magnitude
– The effect of net becomes amplified: the probability is pushed toward 0 or 1

Different Temperatures... [Figure: probability that the output is 1 (from 0 to 1, with 0.5 marked) as a function of net input, with curves for high, medium, and low temperature]
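To see the same effect numerically, the prob_on sketch above can be evaluated at a few net inputs and temperatures (the values 10, 1, and 0.1 are arbitrary illustrative choices):

```python
# Higher T flattens the curve toward 0.5; lower T sharpens it toward a step.
for T in (10.0, 1.0, 0.1):
    probs = [round(prob_on(net, T), 3) for net in (-2.0, 0.0, 2.0)]
    print("T =", T, ": P(on) at net = -2, 0, +2 ->", probs)
```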

Ok, So At What Rate Do We Reduce Temperature?
– In general, T must be decreased very slowly to guarantee convergence to the global optimum
– In practice, we can get away with a more aggressive annealing schedule...
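A sketch of both kinds of schedule: the slow logarithmic decay associated with the theoretical guarantee, and the aggressive geometric decay typically used in practice (the constants c, T0, and alpha are placeholder values):

```python
import math

def logarithmic_schedule(c=10.0, steps=10):
    """Slow decay, T_k = c / log(1 + k): the sort of schedule the
    convergence guarantee requires."""
    return [c / math.log(1 + k) for k in range(1, steps + 1)]

def geometric_schedule(T0=10.0, alpha=0.9, steps=10):
    """Aggressive decay, T_k = T0 * alpha^k: what is usually used in practice."""
    return [T0 * alpha ** k for k in range(steps)]
```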

Putting it Together...
– We can represent facts, etc. as units
– Knowledge about these facts is encoded as weights
– Network processing fills in gaps, makes inferences, forms interpretations
– Stable attractors form; the weights and input sculpt these attractors
– Stability (and goodness) is enhanced by randomness in the updating process
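Putting those pieces into one relaxation loop, here is a sketch of asynchronous stochastic updating under a geometric annealing schedule (assuming symmetric weights W with a zero diagonal, external inputs ext, and binary 0/1 units; the parameter values are illustrative):

```python
import numpy as np

def settle(W, ext, T0=10.0, alpha=0.9, sweeps=50, rng=None):
    """Anneal a network of binary units: start from a random state, update
    units one at a time with the logistic rule, and lower the temperature
    after each full sweep. Returns the final (hopefully high-goodness) state."""
    rng = np.random.default_rng() if rng is None else rng
    n = W.shape[0]
    state = rng.integers(0, 2, size=n).astype(float)   # random starting pattern
    T = T0
    for _ in range(sweeps):
        for i in rng.permutation(n):      # asynchronous, random-order updates
            net = W[i] @ state + ext[i]
            p_on = 1.0 / (1.0 + np.exp(-net / T))
            state[i] = 1.0 if rng.random() < p_on else 0.0
        T *= alpha                        # cool down between sweeps
    return state
```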

Stable Attractors Can Be Thought Of As Memories
– How many stable patterns can be remembered by a network with N units?
– There are 2^N possible patterns…
– … but only about 0.15*N of them will be stable
– To remember 100 things, we need about 100/0.15 ≈ 667 units! (then again, the brain has about 10^12 neurons…)
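A quick check of that capacity arithmetic, using the 0.15*N estimate the slide quotes:

```python
import math

def units_needed(n_patterns, capacity_per_unit=0.15):
    """Units required to hold n_patterns, if only about 0.15*N patterns are stable."""
    return math.ceil(n_patterns / capacity_per_unit)

print(units_needed(100))   # -> 667
```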

Human Performance, When Damaged (some examples)
– Category coordinate errors: naming a CAT as a DOG
– Superordinate errors: naming a CAT as an ANIMAL
– Visual errors (deep dyslexics): naming SYMPATHY as SYMPHONY, or naming SYMPATHY as ORCHESTRA

The Attractors We’ve Talked About Can Be Useful In Understanding This [Figure: two panels, each showing the inputs CAT and COT and the output “CAT”; one illustrates normal performance, the other a visual error] (see Plaut, Hinton, & Shallice)

Properties of Human Memory
– Details tend to go first, more general things next; forgetting is not all-or-nothing
– Things tend to be forgotten based on: salience, recency, complexity, age of acquisition?

Do These Networks Have These Properties? Sort of.
– Graceful degradation: features vanish as a function of the strength of the input to them
– Complexity: more complex / arbitrary patterns can be more difficult to retain
– Salience, recency, age of acquisition? Depends on the learning rule. Stay tuned

Next Time: Psychological Implications: The IAC Model of Word Perception
– Optional reading: McClelland and Rumelhart ’81 (handout)
– Rest of this class: lab session. Help installing software, help with homework.