The Essence of PDP: Local Processing, Global Outcomes
PDP Class, January 16, 2013

Goodness of Network States and their Probabilities
– Goodness of a network state
– How networks maximize goodness
– The Hopfield network and Rumelhart’s continuous version
– Stochastic networks: the Boltzmann Machine, and the relationship between goodness and probability
– Equilibrium, ergodicity, and annealing
– Exploring the relationship between goodness and probability in an ensemble of networks

Network Goodness and How to Increase it
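The goodness referred to throughout is the standard quantity from the PDP framework, stated here for reference (the exact form used in class may fold the bias terms into the external input):

    G(S) = \sum_{i<j} w_{ij} a_i a_j + \sum_i ext_i a_i + \sum_i bias_i a_i

Each pair of units contributes once, and external inputs and biases contribute linearly. Higher-goodness states are ones that satisfy more of the constraints encoded in the weights.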

The Hopfield Network
– Assume symmetric weights.
– Units have binary states [+1, -1].
– Units are set into initial states.
– Choose a unit to update at random.
– If net > 0, set the unit’s state to +1; else set it to -1.
– Goodness always increases… or stays the same. (A minimal sketch of this procedure follows below.)
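A minimal sketch of the update loop, assuming a NumPy weight matrix W (symmetric, zero diagonal) and a state vector a of +1/-1 values; all names here are illustrative:

    import numpy as np

    def goodness(W, a):
        # G = sum over pairs i < j of w_ij * a_i * a_j; the 0.5 corrects for
        # counting each symmetric pair twice in the quadratic form.
        return 0.5 * a @ W @ a

    def hopfield_settle(W, a, steps=1000, rng=None):
        # Asynchronous threshold updates; goodness never decreases.
        rng = rng or np.random.default_rng()
        for _ in range(steps):
            i = rng.integers(len(a))         # choose a unit at random
            net = W[i] @ a                   # net input to unit i
            a[i] = 1 if net > 0 else -1      # deterministic threshold update
        return a

Because each update can only raise or preserve G, the network settles into a local maximum of the goodness landscape.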

Rumelhart’s Continuous Version
– Unit states have values between 0 and 1.
– Units are updated asynchronously.
– Update is gradual, according to the rule:
  \Delta a_i = net_i (1 - a_i) if net_i > 0, else \Delta a_i = net_i a_i
– There are separate scaling parameters for external and internal input:
  net_i = istr \sum_j w_{ij} a_j + estr ext_i
(A sketch of this update follows below.)
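A sketch of one such gradual update in the same NumPy style as above; a is a float vector in [0, 1], ext the external input vector, and the parameter names istr and estr follow the slide:

    def continuous_step(W, a, ext, istr=0.4, estr=0.4, rng=None):
        # The chosen unit moves part of the way toward 1 when its net input
        # is positive, and part of the way toward 0 when it is negative.
        rng = rng or np.random.default_rng()
        i = rng.integers(len(a))
        net = istr * (W[i] @ a) + estr * ext[i]   # scaled internal + external
        if net > 0:
            a[i] += net * (1.0 - a[i])
        else:
            a[i] += net * a[i]
        return a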

The Cube Network
– Positive weights have value +1; negative weights have value -1.5.
– ‘External input’ is implemented as a positive bias of 0.5 to all units.
– These values are all scaled by the istr parameter when the program calculates goodness (istr = 0.4); see the sketch below.
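The cube’s actual connectivity lives in the simulator; as a hedged illustration of how the istr scaling enters the goodness calculation for any such weight matrix W and uniform bias:

    def scaled_goodness(W, a, bias=0.5, istr=0.4):
        # Pairwise weight terms plus the constant positive bias to all
        # units, with every term scaled by istr as in the program.
        return istr * (0.5 * a @ W @ a + bias * a.sum())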

Goodness Landscape of Cube Network

Rumelhart’s Room Schema Model
– Units for attributes/objects found in rooms.
– Data: lists of attributes found in rooms; no room labels.
– Weights and biases (shown on the next slide).
– Modes of use in simulation:
  – Clamp one or more units, then let the network settle (a sketch follows below).
  – Clamp all units and let the network calculate the Goodness of a state (‘pattern’ mode).
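A sketch of the first mode, reusing the update rule from continuous_step above; clamped is assumed to be a dict from unit index to clamped activation:

    def settle_with_clamps(W, a, clamped, steps=2000, rng=None):
        # Clamped units keep their values; the free units settle around them.
        rng = rng or np.random.default_rng()
        for i, v in clamped.items():
            a[i] = v
        free = [i for i in range(len(a)) if i not in clamped]
        for _ in range(steps):
            i = free[rng.integers(len(free))]
            net = W[i] @ a
            a[i] += net * (1.0 - a[i]) if net > 0 else net * a[i]
        return a

Clamping the ‘oven’ unit, for example, should pull the rest of the network toward a kitchen-like pattern of activation.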

Weights for all units

Goodness Landscape for Some Rooms

Slices through the landscape with three different starting points

The Boltzmann Machine: The Stochastic Hopfield Network
– Units have binary states [0, 1]; update is asynchronous.
– The activation function is stochastic:
  p(a_i = 1) = 1 / (1 + e^{-net_i / T})
– Assume processing is ergodic, that is, it is possible to get from any state to any other state. Then, when the network reaches equilibrium, the relative probability and relative goodness of two states are related as follows:
  P(S_a) / P(S_b) = e^{(G_a - G_b)/T}
– More generally, at equilibrium we have the Probability-Goodness Equation:
  P(S) = e^{G(S)/T} / \sum_{S'} e^{G(S')/T}, or \log P(S) = G(S)/T - \log \sum_{S'} e^{G(S')/T}
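The stochastic update in the same style as the earlier sketches; states are now 0/1 and T is the temperature:

    def boltzmann_step(W, a, T, rng=None):
        # The chosen unit turns on with probability logistic(net / T),
        # independent of its current state.
        rng = rng or np.random.default_rng()
        i = rng.integers(len(a))
        net = W[i] @ a
        p_on = 1.0 / (1.0 + np.exp(-net / T))
        a[i] = 1 if rng.random() < p_on else 0
        return a

At high T the unit is nearly indifferent to its net input; as T falls toward 0, the rule approaches the deterministic Hopfield update.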

Simulated Annealing
– Start with a high temperature; this makes it easy to jump from state to state.
– Gradually reduce the temperature (one possible schedule is sketched below).
– In the limit of infinitely slow annealing, the network is guaranteed to end up in the best possible state (or in one of them, if two or more are equally good).
– Thus, the best possible interpretation can always be found (if you are patient)!
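A sketch of one common cooling scheme, reusing boltzmann_step from above; the geometric schedule and its endpoint values are illustrative, not the class’s prescribed schedule:

    def anneal(W, a, T_start=2.0, T_end=0.05, sweeps=200, rng=None):
        # Each sweep makes len(a) random single-unit updates, then cools.
        rng = rng or np.random.default_rng()
        decay = (T_end / T_start) ** (1.0 / sweeps)
        T = T_start
        for _ in range(sweeps):
            for _ in range(len(a)):
                boltzmann_step(W, a, T, rng=rng)
            T *= decay
        return a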

Exploring Probability Distributions over States
– Imagine settling to a non-zero temperature, such as T = 0.5. At this temperature, there is still some probability of being in a state that is less than perfect.
– Consider an ensemble of networks:
  – At equilibrium (i.e., after enough cycles, possibly with annealing), the relative frequencies of the different states will approximate the relative probabilities given by the Probability-Goodness Equation.
– You will have an opportunity to explore this situation in the homework assignment; a sketch of such an ensemble simulation follows below.
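One way to see the Probability-Goodness Equation empirically, again reusing boltzmann_step; n_nets and burn_in are illustrative values:

    from collections import Counter

    def ensemble_frequencies(W, n, T=0.5, n_nets=1000, burn_in=500, rng=None):
        # Run an ensemble of networks at fixed temperature T and tally how
        # often each global state occurs once they have had time to settle.
        rng = rng or np.random.default_rng()
        counts = Counter()
        for _ in range(n_nets):
            a = rng.integers(0, 2, size=n)        # random initial 0/1 state
            for _ in range(burn_in):
                boltzmann_step(W, a, T, rng=rng)
            counts[tuple(int(x) for x in a)] += 1
        return {s: c / n_nets for s, c in counts.items()}

The resulting frequencies should track P(S) proportional to e^{G(S)/T}: states with higher goodness occur more often, and lowering T sharpens the distribution around the best states.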