III. Recurrent Neural Networks: A. The Hopfield Network

Presentation transcript:

III. Recurrent Neural Networks

A. The Hopfield Network

Typical Artificial Neuron (diagram): inputs, connection weights, threshold, output

Typical Artificial Neuron (diagram): linear combination, net input (local field), activation function

Equations
Net input: h_i = Σ_j w_ij s_j − θ_i
New neural state: s_i' = σ(h_i)

Hopfield Network
– Symmetric weights: w_ij = w_ji
– No self-action: w_ii = 0
– Zero threshold: θ = 0
– Bipolar states: s_i ∈ {−1, +1}
– Discontinuous bipolar activation function: σ(h) = +1 if h > 0, σ(h) = −1 if h < 0

What to do about h = 0? There are several options:
– σ(0) = +1
– σ(0) = −1
– σ(0) = −1 or +1 with equal probability
– h_i = 0 ⇒ no state change (s_i' = s_i)
Not much difference, but be consistent. The last option is slightly preferable, since it is symmetric.
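As an illustrative sketch of this update rule (Python/NumPy rather than the NetLogo demos used below; the random test network and function name are invented for the example), one asynchronous step might look like this:

    import numpy as np

    def async_update(s, W, rng):
        # One asynchronous update: pick a random neuron i, compute its local
        # field h_i = sum_j w_ij s_j (zero threshold), and set s_i = sigma(h_i).
        # h_i = 0 leaves the state unchanged (the symmetric convention above).
        i = rng.integers(len(s))
        h = W[i] @ s
        if h > 0:
            s[i] = +1.0
        elif h < 0:
            s[i] = -1.0
        return s

    rng = np.random.default_rng(0)
    n = 4
    W = rng.standard_normal((n, n))
    W = (W + W.T) / 2                      # symmetric weights: w_ij = w_ji
    np.fill_diagonal(W, 0.0)               # no self-action: w_ii = 0
    s = rng.choice([-1.0, +1.0], size=n)   # bipolar states
    for _ in range(20):
        s = async_update(s, W, rng)
    print(s)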

Positive Coupling: positive sense (sign), large strength

Negative Coupling: negative sense (sign), large strength

Weak Coupling: either sense (sign), little strength

State = −1 & Local Field < 0 (h < 0)

State = −1 & Local Field > 0 (h > 0)

State Reverses (h > 0)

State = +1 & Local Field > 0 (h > 0)

State = +1 & Local Field < 0 (h < 0)

State Reverses (h < 0)

NetLogo Demonstration of Hopfield State Updating: run Hopfield-update.nlogo

Hopfield Net as Soft Constraint Satisfaction System
– States of neurons as yes/no decisions
– Weights represent soft constraints between decisions
  – hard constraints must be respected
  – soft constraints have degrees of importance
– Decisions change to better respect constraints
– Is there an optimal set of decisions that best respects all constraints?

Demonstration of Hopfield Net Dynamics I: run Hopfield-dynamics.nlogo

Convergence
– Does such a system converge to a stable state? Under what conditions does it converge?
– There is a sense in which each step relaxes the “tension” in the system.
– But could a relaxation of one neuron lead to greater tension in other places?

Quantifying “Tension”
– If w_ij > 0, then s_i and s_j want to have the same sign (s_i s_j = +1)
– If w_ij < 0, then s_i and s_j want to have opposite signs (s_i s_j = −1)
– If w_ij = 0, their signs are independent
– Strength of interaction varies with |w_ij|
Define disharmony (“tension”) D_ij between neurons i and j: D_ij = −s_i w_ij s_j
– D_ij < 0 ⇒ they are happy
– D_ij > 0 ⇒ they are unhappy

Total Energy of System
The “energy” of the system is the total “tension” (disharmony) in it:
E{s} = Σ_{i<j} D_ij = −½ Σ_i Σ_{j≠i} s_i w_ij s_j = −½ s^T W s

Review of Some Vector Notation
– Column vector: x = (x_1, …, x_n)^T
– Inner product: x^T y = Σ_i x_i y_i
– Outer product: x y^T is the n×n matrix with entries x_i y_j
– Quadratic form: x^T M y = Σ_i Σ_j x_i M_ij y_j

Another View of Energy
The energy measures the number of neurons whose states are in disharmony with their local fields (i.e., of opposite sign):
E{s} = −½ Σ_i s_i h_i, where h_i = Σ_{j≠i} w_ij s_j

Do State Changes Decrease Energy?
Suppose that neuron k changes state. Change of energy:
ΔE = E{s'} − E{s} = −(s_k' − s_k) Σ_{j≠k} w_kj s_j = −Δs_k h_k
Since the update sets s_k' = σ(h_k), Δs_k has the same sign as h_k (or is zero), so ΔE ≤ 0.

Energy Does Not Increase
– In each step in which a neuron is considered for update: E{s(t + 1)} − E{s(t)} ≤ 0
– Energy cannot increase
– Energy decreases if any neuron changes
– Must it stop?
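This claim is easy to check numerically. The sketch below (illustrative Python with an arbitrary random symmetric weight matrix, not part of the original slides) computes E{s} = −½ s^T W s before and after each asynchronous update and asserts that it never increases:

    import numpy as np

    def energy(s, W):
        # Total "tension": E{s} = -1/2 * s^T W s  (w_ii = 0, zero threshold)
        return -0.5 * s @ W @ s

    rng = np.random.default_rng(1)
    n = 8
    W = rng.standard_normal((n, n))
    W = (W + W.T) / 2
    np.fill_diagonal(W, 0.0)
    s = rng.choice([-1.0, +1.0], size=n)

    for _ in range(200):
        E_before = energy(s, W)
        i = rng.integers(n)
        h = W[i] @ s
        if h != 0:
            s[i] = np.sign(h)                       # asynchronous update of one neuron
        assert energy(s, W) <= E_before + 1e-12     # energy never increases
    print("final energy:", energy(s, W))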

Proof of Convergence in Finite Time
There is a minimum possible energy:
– The number of possible states s ∈ {−1, +1}^n is finite
– Hence E_min = min { E(s) | s ∈ {±1}^n } exists
Must show it is reached in a finite number of steps.

Steps Are of a Certain Minimum Size
If neuron k flips, |ΔE| = |Δs_k h_k| = 2|h_k| > 0. Since there are only finitely many states, the nonzero values that |h_k| can take have a positive minimum, so each flip decreases the energy by at least a fixed amount; hence only finitely many flips can occur before a stable state is reached.

Conclusion
– If we do asynchronous updating, the Hopfield net must reach a stable, minimum-energy state in a finite number of updates
– This does not imply that it is a global minimum

Lyapunov Functions
A way of showing the convergence of discrete- or continuous-time dynamical systems. For a discrete-time system:
– we need a Lyapunov function E (the “energy” of the state)
– E must be bounded below (E{s} > E_min)
– ΔE ≤ (ΔE)_max < 0 (the energy decreases by at least a certain minimum amount each step)
– then the system will converge in finite time
Problem: finding a suitable Lyapunov function

Example: Limit Cycle with Synchronous Updating (w > 0)
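The limit cycle in this example can be reproduced with a tiny sketch (illustrative Python; the value w = 1 is an arbitrary positive coupling). Under synchronous updating each neuron copies the sign of the other's previous state, so a run started from (+1, −1) oscillates forever instead of settling:

    import numpy as np

    w = 1.0                              # positive coupling between the two neurons
    W = np.array([[0.0, w],
                  [w, 0.0]])
    s = np.array([+1.0, -1.0])           # the two neurons start in disagreement

    for t in range(6):
        h = W @ s                        # both local fields use the *old* state
        s = np.where(h >= 0, 1.0, -1.0)  # synchronous update of both neurons at once
        print(t, s)
    # The state alternates between [-1, +1] and [+1, -1]: a 2-cycle, not a fixed point.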

The Hopfield Energy Function Is Even
– A function f is odd if f(−x) = −f(x) for all x
– A function f is even if f(−x) = f(x) for all x
Observe: E{−s} = −½ (−s)^T W (−s) = −½ s^T W s = E{s}, so the energy is even in the state.

Conceptual Picture of Descent on Energy Surface (fig. from Solé & Goodwin)

Energy Surface (fig. from Haykin, Neural Networks)

Energy Surface + Flow Lines (fig. from Haykin, Neural Networks)

Flow Lines: Basins of Attraction (fig. from Haykin, Neural Networks)

Bipolar State Space

Basins in Bipolar State Space: energy-decreasing paths

Demonstration of Hopfield Net Dynamics II: run initialized Hopfield.nlogo

Storing Memories as Attractors (fig. from Solé & Goodwin)

Example of Pattern Restoration (five-frame sequence of figures from Arbib 1995)

Example of Pattern Completion (five-frame sequence of figures from Arbib 1995)

Example of Association (five-frame sequence of figures from Arbib 1995)

Applications of Hopfield Memory
– Pattern restoration
– Pattern completion
– Pattern generalization
– Pattern association

Hopfield Net for Optimization and for Associative Memory
For optimization:
– we know the weights (couplings)
– we want to know the minima (solutions)
For associative memory:
– we know the minima (retrieval states)
– we want to know the weights

Hebb’s Rule
“When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.” —Donald Hebb (The Organization of Behavior, 1949, p. 62)

Example of Hebbian Learning: Pattern Imprinted

Example of Hebbian Learning: Partial Pattern Reconstruction

Mathematical Model of Hebbian Learning for One Pattern
For simplicity, we will include self-coupling:
w_ij = x_i x_j, i.e., W = x x^T

A Single Imprinted Pattern Is a Stable State
– Suppose W = x x^T
– Then h = W x = x x^T x = n x, since x^T x = Σ_i x_i² = n
– Hence, if the initial state is s = x, then the new state is s' = sgn(n x) = x
– There may be other stable states (e.g., −x)
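A quick numerical check of this argument (illustrative Python; the five-component pattern is arbitrary):

    import numpy as np

    x = np.array([+1.0, -1.0, -1.0, +1.0, +1.0])   # an arbitrary bipolar pattern
    n = len(x)
    W = np.outer(x, x)                   # W = x x^T (self-coupling kept for simplicity)

    h = W @ x                            # h = x (x^T x) = n x
    assert np.allclose(h, n * x)
    assert np.array_equal(np.sign(h), x)           # sgn(n x) = x: the pattern is stable
    assert np.array_equal(np.sign(W @ (-x)), -x)   # ...and so is its complement -x
    print("h =", h)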

Questions
– How big is the basin of attraction of the imprinted pattern?
– How many patterns can be imprinted?
– Are there unneeded spurious stable states?
These issues will be addressed in the context of multiple imprinted patterns.

Imprinting Multiple Patterns
Let x^1, x^2, …, x^p be patterns to be imprinted. Define the sum-of-outer-products matrix:
W = (1/n) Σ_{k=1}^{p} x^k (x^k)^T, i.e., w_ij = (1/n) Σ_{k=1}^{p} x_i^k x_j^k
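Putting the pieces together, here is an illustrative sketch of imprinting and recall under these definitions (Python; the helper names, pattern count, and noise level are arbitrary choices, and the diagonal is zeroed here to drop self-coupling, which the later slide on spurious attractors warns about): a few random patterns are imprinted with the sum-of-outer-products rule, one is corrupted, and asynchronous updates restore it.

    import numpy as np

    def imprint(patterns, n):
        # Sum-of-outer-products (Hebbian) weights: W = (1/n) * sum_k x^k (x^k)^T,
        # with the diagonal zeroed here to remove self-coupling.
        W = sum(np.outer(x, x) for x in patterns) / n
        np.fill_diagonal(W, 0.0)
        return W

    def recall(W, s, rng, sweeps=20):
        # Repeated asynchronous updates; at low load this settles on the
        # imprinted pattern nearest the starting state.
        s = s.copy()
        n = len(s)
        for _ in range(sweeps * n):
            i = rng.integers(n)
            h = W[i] @ s
            if h != 0:
                s[i] = np.sign(h)
        return s

    rng = np.random.default_rng(42)
    n, p = 100, 5                                    # load alpha = p/n = 0.05
    patterns = [rng.choice([-1.0, +1.0], size=n) for _ in range(p)]
    W = imprint(patterns, n)

    cue = patterns[0].copy()
    flipped = rng.choice(n, size=10, replace=False)
    cue[flipped] *= -1                               # corrupt 10% of the bits
    restored = recall(W, cue, rng)
    print("overlap with the stored pattern:", int(restored @ patterns[0]), "of", n)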

Definition of Covariance
Consider samples (x_1, y_1), (x_2, y_2), …, (x_N, y_N):
C_xy = ⟨(x_k − x̄)(y_k − ȳ)⟩ = ⟨x_k y_k⟩ − x̄ ȳ, where x̄ = ⟨x_k⟩ and ȳ = ⟨y_k⟩

Weights & the Covariance Matrix
Sample pattern vectors: x^1, x^2, …, x^p. Covariance of the i-th and j-th components:
c_ij = ⟨x_i^k x_j^k⟩ − ⟨x_i^k⟩ ⟨x_j^k⟩
So, apart from the mean terms, the Hebbian weights w_ij = (1/n) Σ_k x_i^k x_j^k are proportional to the covariance of the i-th and j-th pattern components.

Characteristics of Hopfield Memory
– Distributed (“holographic”): every pattern is stored in every location (weight)
– Robust: correct retrieval in spite of noise or error in patterns; correct operation in spite of considerable weight damage or noise

Demonstration of Hopfield Net: run the Malasri Hopfield Demo

Stability of Imprinted Memories
Suppose the state is one of the imprinted patterns, x^m. Then:
h_i = Σ_j w_ij x_j^m = (1/n) Σ_k x_i^k (x^k · x^m) = x_i^m + (1/n) Σ_{k≠m} x_i^k (x^k · x^m)
The first term reproduces the pattern; the second is crosstalk from the other imprinted patterns.

Interpretation of Inner Products
– x^k · x^m = n if they are identical (highly correlated)
– x^k · x^m = −n if they are complementary (highly correlated, reversed)
– x^k · x^m = 0 if they are orthogonal (largely uncorrelated)
– x^k · x^m measures the crosstalk between patterns k and m
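These magnitudes are easy to check empirically (illustrative Python; n = 1000 is an arbitrary choice). For independent random bipolar patterns the inner product is typically of order √n, i.e., small compared with the extremes ±n:

    import numpy as np

    rng = np.random.default_rng(7)
    n = 1000
    x = rng.choice([-1, +1], size=n)
    y = rng.choice([-1, +1], size=n)

    print(x @ x)       # n: identical patterns, perfectly correlated
    print(x @ (-x))    # -n: complementary patterns, perfectly anti-correlated
    print(x @ y)       # typically of order sqrt(n), here a few tens: largely uncorrelated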

Cosines and Inner Products
u · v = ‖u‖ ‖v‖ cos θ_uv, so the inner product measures how closely the two vectors are aligned.

Conditions for Stability
Bit i of pattern x^m is stable if the crosstalk cannot flip the sign of its local field, i.e., if
x_i^m h_i = 1 + (1/n) x_i^m Σ_{k≠m} x_i^k (x^k · x^m) > 0.

Sufficient Conditions for Instability (Case 1)

Sufficient Conditions for Instability (Case 2)

Sufficient Conditions for Stability: the crosstalk with the sought pattern must be sufficiently small.

Capacity of Hopfield Memory
– Depends on the patterns imprinted
– If they are orthogonal, p_max = n, but then every state is stable ⇒ trivial basins
– So p_max < n
– Let the load parameter be α = p / n

Single Bit Stability Analysis
– For simplicity, suppose the x^k are random
– Then the inner products x^k · x^m are sums of n random ±1 values
– Binomial distribution ≈ Gaussian, in the range −n, …, +n, with mean μ = 0 and variance σ² = n
– Probability that such a sum exceeds t: P ≈ ½ [1 − erf(t / √(2n))]
[See “Review of Gaussian (Normal) Distributions” on course website]

Approximation of Probability
Applying this to the crosstalk term gives the single-bit error probability
P_error ≈ ½ [1 − erf(√(n / 2p))] = ½ [1 − erf(1 / √(2α))].

Probability of Bit Instability (fig. from Hertz et al., Introduction to the Theory of Neural Computation)

Tabulated Probability of Single-Bit Instability

P_error    p_max / n
0.1%       0.105
0.36%      0.138
1%         0.185
5%         0.37
10%        0.61

(table from Hertz et al., Introduction to the Theory of Neural Computation)
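Assuming the single-bit error estimate P_error ≈ ½ [1 − erf(1/√(2α))] given above, the tabulated loads can be checked in a few lines (illustrative Python):

    import math

    def p_error(alpha):
        # P_error = 1/2 * (1 - erf(1 / sqrt(2 * alpha))), with load alpha = p / n
        return 0.5 * (1.0 - math.erf(1.0 / math.sqrt(2.0 * alpha)))

    for alpha in (0.105, 0.138, 0.185, 0.37, 0.61):
        print(f"alpha = {alpha:5.3f}  ->  P_error ~ {p_error(alpha):.4f}")
    # Prints approximately 0.001, 0.0036, 0.01, 0.05, 0.1, matching the table rows.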

Spurious Attractors
Mixture states:
– sums or differences of odd numbers of retrieval states
– number increases combinatorially with p
– shallower, smaller basins
– basins of mixtures swamp basins of retrieval states ⇒ overload
– useful as combinatorial generalizations?
– self-coupling generates spurious attractors
Spin-glass states:
– not correlated with any finite number of imprinted patterns
– occur beyond overload because the weights are effectively random

Basins of Mixture States

Fraction of Unstable Imprints (n = 100) (fig. from Bar-Yam)

Number of Stable Imprints (n = 100) (fig. from Bar-Yam)

Number of Imprints with Basins of Indicated Size (n = 100) (fig. from Bar-Yam)

Summary of Capacity Results
– Absolute limit: p_max < α_c n ≈ 0.138 n
– If a small number of errors in each pattern is permitted: p_max ∝ n
– If all or most patterns must be recalled perfectly: p_max ∝ n / log n
– Recall: all this analysis is based on random patterns
– Unrealistic, but sometimes can be arranged
Next: III.B