III. Recurrent Neural Networks

A. The Hopfield Network

Typical Artificial Neuron
- inputs, connection weights, threshold, output

Typical Artificial Neuron
- linear combination of the inputs gives the net input (local field)
- an activation function applied to the net input gives the output

Equations
- Net input: h_i = Σ_j w_ij s_j - θ_i
- New neural state: s'_i = σ(h_i)

Hopfield Network
- Symmetric weights: w_ij = w_ji
- No self-action: w_ii = 0
- Zero threshold: θ = 0
- Bipolar states: s_i ∈ {-1, +1}
- Discontinuous bipolar activation function: σ(h) = sgn(h) = +1 if h > 0, -1 if h < 0

What to do about h = 0? There are several options:
- σ(0) = +1
- σ(0) = -1
- σ(0) = -1 or +1 with equal probability
- h_i = 0 implies no state change (s'_i = s_i)
Not much difference, but be consistent. The last option is slightly preferable, since it is symmetric; it is the rule used in the sketch below.
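As a concrete illustration (not part of the original slides), here is a minimal NumPy sketch of this update rule; the function name and array conventions are my own. It uses the last tie-breaking option: a unit whose local field is exactly zero keeps its current state.

```python
import numpy as np

def update_unit(s, W, k):
    """Update unit k of the bipolar state vector s (entries +/-1) in place.

    W is the symmetric weight matrix with zero diagonal; thresholds are zero.
    """
    h = W[k] @ s                   # local field h_k = sum_j w_kj * s_j
    if h > 0:
        s[k] = 1
    elif h < 0:
        s[k] = -1
    # h == 0: no state change (the symmetric tie-breaking option chosen above)
    return s
```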
Positive Coupling
- positive sense (sign), large strength

Negative Coupling
- negative sense (sign), large strength

Weak Coupling
- either sense (sign), little strength

State = -1 & Local Field < 0 (h < 0)

State = -1 & Local Field > 0 (h > 0)

State Reverses (h > 0)

State = +1 & Local Field > 0 (h > 0)

State = +1 & Local Field < 0 (h < 0)

State Reverses (h < 0)
NetLogo Demonstration of Hopfield State Updating
Run Hopfield-update.nlogo

Hopfield Net as Soft Constraint Satisfaction System
- States of neurons represent yes/no decisions
- Weights represent soft constraints between decisions
  - hard constraints must be respected
  - soft constraints have degrees of importance
- Decisions change to better respect the constraints
- Is there an optimal set of decisions that best respects all constraints?

Demonstration of Hopfield Net Dynamics I
Run Hopfield-dynamics.nlogo

Convergence
- Does such a system converge to a stable state?
- Under what conditions does it converge?
- There is a sense in which each step relaxes the "tension" in the system
- But could the relaxation of one neuron lead to greater tension elsewhere?
Quantifying "Tension"
- If w_ij > 0, then s_i and s_j want to have the same sign (s_i s_j = +1)
- If w_ij < 0, then s_i and s_j want to have opposite signs (s_i s_j = -1)
- If w_ij = 0, their signs are independent
- The strength of the interaction varies with |w_ij|
- Define the disharmony ("tension") D_ij between neurons i and j: D_ij = -s_i w_ij s_j
- D_ij < 0 means they are happy; D_ij > 0 means they are unhappy

Total Energy of System
The "energy" of the system is the total "tension" (disharmony) in it:
E{s} = (1/2) Σ_i Σ_{j≠i} D_ij = -(1/2) Σ_i Σ_{j≠i} s_i w_ij s_j = -(1/2) s^T W s

Review of Some Vector Notation
- column vectors: x = (x_1, …, x_n)^T
- inner product: u^T v = Σ_i u_i v_i
- outer product: u v^T = the n×n matrix with entries u_i v_j
- quadratic form: x^T W x = Σ_i Σ_j x_i w_ij x_j

Another View of Energy
The energy measures how much the neurons' states are in disharmony with their local fields (i.e., of opposite sign):
E{s} = -(1/2) Σ_i s_i h_i, where h_i = Σ_j w_ij s_j
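A small numerical check (illustrative code, names assumed) that the two expressions for the energy agree: the sum of pairwise disharmonies equals the sum over neurons of their disagreement with their local fields, provided W is symmetric with zero diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
A = rng.normal(size=(n, n))
W = (A + A.T) / 2                  # symmetric weights
np.fill_diagonal(W, 0.0)           # no self-action
s = rng.choice([-1, 1], size=n)    # bipolar state

E_pairs = -0.5 * s @ W @ s         # E = (1/2) sum_ij D_ij = -(1/2) s^T W s
h = W @ s                          # local fields
E_fields = -0.5 * s @ h            # E = -(1/2) sum_i s_i h_i

assert np.isclose(E_pairs, E_fields)
```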
Do State Changes Decrease Energy?
Suppose that neuron k changes state. Change of energy:
ΔE = E{s'} - E{s} = -(s'_k - s_k) h_k
If the state actually changes, then s'_k = -s_k = sgn(h_k), so ΔE = -2 s'_k h_k = -2|h_k| < 0.

Energy Does Not Increase
- In each step in which a neuron is considered for update: E{s(t + 1)} - E{s(t)} ≤ 0
- Energy cannot increase
- Energy decreases if any neuron changes state
- Must it stop?

Proof of Convergence in Finite Time
There is a minimum possible energy:
- the number of possible states s ∈ {-1, +1}^n is finite
- hence E_min = min { E(s) | s ∈ {-1, +1}^n } exists
Must show that a minimum is reached in a finite number of steps.

Steps are of a Certain Minimum Size
If a neuron changes state, |ΔE| = 2|h_k|. Over the finite state space the nonzero local fields have a smallest magnitude, so every energy-decreasing step is at least that large; the energy cannot decrease forever.

Conclusion
- If we do asynchronous updating, the Hopfield net must reach a stable, (locally) minimum-energy state in a finite number of updates
- This does not imply that it is a global minimum
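A sketch of the asynchronous dynamics just described (illustrative code): sweep the units in random order, flip each one toward its local field, and stop when a full sweep changes nothing. By the energy argument above, the loop terminates in a stable state.

```python
import numpy as np

def run_to_attractor(s, W, rng=None):
    """Asynchronously update bipolar state s under symmetric, zero-diagonal W
    until no unit changes; the result is a stable (locally minimal-energy) state."""
    rng = rng or np.random.default_rng()
    s = s.copy()
    changed = True
    while changed:
        changed = False
        for k in rng.permutation(len(s)):      # asynchronous, random order
            h = W[k] @ s
            new = s[k] if h == 0 else (1 if h > 0 else -1)
            if new != s[k]:
                s[k] = new
                changed = True
    return s
```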
Lyapunov Functions
A way of showing the convergence of discrete- or continuous-time dynamical systems. For a discrete-time system we need a Lyapunov function E (an "energy" of the state) such that:
- E is bounded below (E{s} ≥ E_min)
- ΔE ≤ (ΔE)_max < 0 (the energy decreases by at least a certain minimum amount on each state-changing step)
- then the system will converge in finite time
Problem: finding a suitable Lyapunov function.

Example Limit Cycle with Synchronous Updating (w > 0)

The Hopfield Energy Function is Even
- A function f is odd if f(-x) = -f(x), for all x
- A function f is even if f(-x) = f(x), for all x
- Observe: E{-s} = -(1/2)(-s)^T W (-s) = -(1/2) s^T W s = E{s}
Conceptual Picture of Descent on Energy Surface (fig. from Solé & Goodwin)

Energy Surface (fig. from Haykin, Neural Networks)

Energy Surface + Flow Lines (fig. from Haykin, Neural Networks)

Flow Lines and Basins of Attraction (fig. from Haykin, Neural Networks)

Bipolar State Space

Basins in Bipolar State Space: energy-decreasing paths

Demonstration of Hopfield Net Dynamics II
Run initialized Hopfield.nlogo
Storing Memories as Attractors (fig. from Solé & Goodwin)

Example of Pattern Restoration (sequence of five figures, from Arbib 1995)

Example of Pattern Completion (sequence of five figures, from Arbib 1995)

Example of Association (sequence of five figures, from Arbib 1995)
Applications of Hopfield Memory
- Pattern restoration
- Pattern completion
- Pattern generalization
- Pattern association

Hopfield Net for Optimization and for Associative Memory
- For optimization: we know the weights (couplings) and want to find the minima (solutions)
- For associative memory: we know the minima (retrieval states) and want to find the weights

Hebb's Rule
"When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."
(Donald Hebb, The Organization of Behavior, 1949, p. 62)

Example of Hebbian Learning: Pattern Imprinted

Example of Hebbian Learning: Partial Pattern Reconstruction

Mathematical Model of Hebbian Learning for One Pattern
For simplicity, we will include self-coupling: w_ij = x_i x_j, i.e., W = x x^T.

A Single Imprinted Pattern is a Stable State
- Suppose W = x x^T
- Then h = Wx = x x^T x = n x, since x^T x = Σ_i x_i² = n
- Hence, if the initial state is s = x, then the new state is s' = sgn(n x) = x
- There may be other stable states (e.g., -x)
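A quick numerical confirmation (illustrative) of the calculation above: with W = x x^T the local field at x is exactly n·x, so applying sgn returns x.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
x = rng.choice([-1, 1], size=n)    # one bipolar pattern
W = np.outer(x, x)                 # Hebbian weights, self-coupling included

h = W @ x                          # = x (x^T x) = n x
assert np.array_equal(h, n * x)
assert np.array_equal(np.sign(h), x)   # the imprinted pattern is a fixed point
```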
Questions
- How big is the basin of attraction of the imprinted pattern?
- How many patterns can be imprinted?
- Are there unneeded (spurious) stable states?
- These issues will be addressed in the context of multiple imprinted patterns

Imprinting Multiple Patterns
Let x^1, x^2, …, x^p be the patterns to be imprinted. Define the sum-of-outer-products matrix:
W = (1/n) Σ_{k=1}^{p} x^k (x^k)^T, i.e., w_ij = (1/n) Σ_k x_i^k x_j^k
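A short sketch of this imprinting rule (illustrative helper, name assumed); here the diagonal is zeroed afterwards so that the resulting W satisfies the w_ii = 0 convention of the Hopfield definition.

```python
import numpy as np

def imprint(patterns):
    """Sum-of-outer-products (Hebbian) weights from a (p, n) array of bipolar patterns:
    W = (1/n) * sum_k outer(x^k, x^k), with the diagonal then set to zero."""
    X = np.asarray(patterns)
    p, n = X.shape
    W = X.T @ X / n                # equals (1/n) * sum of outer products
    np.fill_diagonal(W, 0.0)       # enforce no self-action
    return W
```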
Definition of Covariance
Consider samples (x_1, y_1), (x_2, y_2), …, (x_N, y_N):
Cov(x, y) = (1/N) Σ_{k=1}^{N} (x_k - x̄)(y_k - ȳ), where x̄ and ȳ are the sample means

Weights & the Covariance Matrix
Sample pattern vectors: x^1, x^2, …, x^p. The covariance of the i-th and j-th components across the patterns is
C_ij = (1/p) Σ_{k=1}^{p} (x_i^k - x̄_i)(x_j^k - x̄_j),
so for zero-mean bipolar patterns the Hebbian weights w_ij are proportional to the covariances of the components.

Characteristics of Hopfield Memory
- Distributed ("holographic"): every pattern is stored in every location (weight)
- Robust: correct retrieval in spite of noise or errors in the patterns; correct operation in spite of considerable weight damage or noise
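A standalone sketch illustrating the robustness claim (illustrative code, not from the slides): imprint a few random patterns, corrupt one of them, and let asynchronous updating restore it. At this low load (α = 0.05) the corrupted bits are typically all recovered.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 5
patterns = rng.choice([-1, 1], size=(p, n))

W = patterns.T @ patterns / n            # Hebbian imprinting, as above
np.fill_diagonal(W, 0.0)

s = patterns[0].copy()                   # start from a stored pattern...
flip = rng.choice(n, size=10, replace=False)
s[flip] *= -1                            # ...with 10% of its bits corrupted

changed = True
while changed:                           # asynchronous updates to a stable state
    changed = False
    for k in rng.permutation(n):
        h = W[k] @ s
        new = s[k] if h == 0 else (1 if h > 0 else -1)
        if new != s[k]:
            s[k] = new
            changed = True

print("bits matching the stored pattern:", int(np.sum(s == patterns[0])), "of", n)
```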
Demonstration of Hopfield Net
Run Malasri Hopfield Demo

Stability of Imprinted Memories
Suppose the state is one of the imprinted patterns, x^m. Then:
h = W x^m = (1/n) Σ_k x^k (x^k)^T x^m = x^m + (1/n) Σ_{k≠m} (x^k · x^m) x^k
The first term reproduces the pattern; the second is crosstalk from the other imprints.

Interpretation of Inner Products
- x^k · x^m = n if the patterns are identical (highly correlated)
- x^k · x^m = -n if they are complementary (highly correlated, reversed)
- x^k · x^m = 0 if they are orthogonal (largely uncorrelated)
- x^k · x^m measures the crosstalk between patterns k and m
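A numerical illustration (assumed setup): for random bipolar patterns the cross inner products x^k · x^m scatter around 0 with standard deviation about √n, while every self-overlap is exactly n.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 1000, 10
X = rng.choice([-1, 1], size=(p, n))     # random bipolar patterns

G = X @ X.T                              # Gram matrix of inner products x^k . x^m
print(np.diag(G))                        # self-overlaps: all exactly n
crosstalk = G[~np.eye(p, dtype=bool)]
print(crosstalk.mean(), crosstalk.std()) # roughly 0 and sqrt(n) ~ 31.6
```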
Cosines and Inner Products
u · v = ‖u‖ ‖v‖ cos θ, where θ is the angle between u and v

Conditions for Stability
The imprinted pattern x^m is stable if every bit agrees in sign with its local field: x_i^m h_i > 0 for all i; equivalently, the crosstalk at each bit must not overwhelm the +1 contribution from the pattern itself.

Sufficient Conditions for Instability (Case 1)

Sufficient Conditions for Instability (Case 2)

Sufficient Conditions for Stability
The crosstalk with the sought pattern must be sufficiently small.

Capacity of Hopfield Memory
- Depends on the patterns imprinted
- If they are orthogonal, p_max = n, but then every state is stable, so the basins are trivial
- So p_max < n
- Let the load parameter be α = p / n
Single Bit Stability Analysis
- For simplicity, suppose the x^k are random
- Then the crosstalk terms x^k · x^m are sums of n random ±1 values
- Binomial distribution ≈ Gaussian on the range -n, …, +n with mean 0 and variance σ² = n
- Probability that such a sum exceeds a threshold t: P{sum > t} ≈ (1/2)[1 - erf(t / √(2n))]
[See "Review of Gaussian (Normal) Distributions" on the course website]
Approximation of Probability
Applying this Gaussian approximation to the crosstalk at a single bit gives the single-bit error probability as a function of the load α = p/n:
P_error ≈ (1/2)[1 - erf(1 / √(2α))]
Probability of Bit Instability (fig. from Hertz & al., Introduction to the Theory of Neural Computation)

Tabulated Probability of Single-Bit Instability
P_error   α_max = p_max / n
0.1%      0.105
0.36%     0.138
1%        0.185
5%        0.37
10%       0.61
(table from Hertz & al., Introduction to the Theory of Neural Computation)
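The tabulated loads can be checked against the Gaussian approximation above, P_error ≈ (1/2)[1 - erf(1/√(2α))]; a quick standard-library computation (illustrative):

```python
from math import erf, sqrt

def p_error(alpha):
    """Single-bit instability probability under the Gaussian crosstalk approximation."""
    return 0.5 * (1.0 - erf(1.0 / sqrt(2.0 * alpha)))

for alpha in (0.105, 0.138, 0.185, 0.37, 0.61):
    print(f"alpha = {alpha:5.3f}  ->  P_error ~ {100 * p_error(alpha):.2f}%")
# Prints values close to the 0.1%, 0.36%, 1%, 5%, 10% rows of the table.
```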
Spurious Attractors
- Mixture states:
  - sums or differences of odd numbers of retrieval states
  - their number increases combinatorially with p
  - shallower, smaller basins
  - the basins of mixture states swamp the basins of the retrieval states, which leads to overload
  - useful as combinatorial generalizations?
  - self-coupling generates additional spurious attractors
- Spin-glass states:
  - not correlated with any finite number of imprinted patterns
  - occur beyond overload, because the weights are then effectively random

Basins of Mixture States

Fraction of Unstable Imprints (n = 100) (fig. from Bar-Yam)

Number of Stable Imprints (n = 100) (fig. from Bar-Yam)

Number of Imprints with Basins of Indicated Size (n = 100) (fig. from Bar-Yam)
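Figures like these can be approximated with a small Monte Carlo experiment (illustrative sketch; random patterns, n = 100 as in the figures). An imprint is counted as unstable if any of its bits is misaligned with its local field.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100

for p in (5, 10, 14, 20, 30):
    X = rng.choice([-1, 1], size=(p, n))     # p random bipolar patterns
    W = X.T @ X / n                          # Hebbian weights
    np.fill_diagonal(W, 0.0)
    H = X @ W                                # row k holds the local fields at pattern k
    unstable = np.any(X * H <= 0, axis=1)    # any bit misaligned with its field?
    print(f"p = {p:2d} (alpha = {p/n:.2f}): unstable imprints = {int(unstable.sum())} of {p}")
```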
Summary of Capacity Results
- Absolute limit: p_max < α_c n ≈ 0.138 n
- If a small number of errors in each pattern is permitted: p_max ∝ n
- If all or most patterns must be recalled perfectly: p_max ∝ n / log n
- Recall: all of this analysis is based on random patterns
- Unrealistic, but sometimes it can be arranged
Next: III B