Ch7: Hopfield Neural Model
The HNM is a kind of BAM. There are two versions of Hopfield memory:
1. Discrete: (a) sequential, (b) parallel
2. Continuous

7.1. Discrete Hopfield Memory
Recall the auto-BAM. Mathematically, given the training set {a_1, …, a_M}, the weight matrix is formed from the outer products of the exemplars: W = Σ_{m=1}^{M} a_m a_mᵀ.
HNM Architecture: The HNM has no self-loop connection for each unit. Each unit i has an external input signal I_i.
Weight matrix: form W from the outer products of the exemplars as in the auto-BAM, then force the diagonal elements of W to be zero (no self-loop). Input: v = [v_1, …, v_N]ᵀ. Output: v′ = [v′_1, …, v′_N]ᵀ. The threshold θ_i to be defined is different from that of the BAM (where it equals 0).
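A minimal sketch of the weight-matrix construction just described, assuming bipolar exemplars stored as the rows of a NumPy array; the function and variable names are illustrative, not from the slides:

```python
import numpy as np

def hopfield_weights(patterns: np.ndarray) -> np.ndarray:
    """Discrete Hopfield weight matrix from bipolar exemplars.

    patterns: shape (M, N) with entries in {-1, +1}.
    Returns W = sum_m a_m a_m^T with the diagonal forced to zero (no self-loops).
    """
    W = patterns.T @ patterns        # sum of outer products, shape (N, N)
    np.fill_diagonal(W, 0)           # remove self-loop connections
    return W
```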
Energy function: E(v) = −½ vᵀWv − Iᵀv + θᵀv
  I_i : external input (w = 1)
  θ_i : threshold, viewed as a negative (inhibitory) input (w = 1)

7.1.1 Sequential (Asynchronous) Hopfield Model
Given a set of M binary patterns of N components, define the weight matrix W (outer-product sum with zero diagonal) and the threshold vector θ = [θ_1, …, θ_N]ᵀ.
□ Energy Function (the quadratic form above, with a minus sign): if the stored exemplars are orthogonal, every exemplar corresponds to a local minimum of E. (Figure: feature space vs. energy space.) A particular exemplar is retrieved by looking for its corresponding local minimum in the energy space using a descent approach.
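A short sketch, under the usual assumptions (bipolar exemplars of length N, zero thresholds and external inputs, W built as above with zero diagonal), of why mutually orthogonal exemplars are stable states:

```latex
% For orthogonal bipolar exemplars, a_m^\top a_k = N\,\delta_{mk}, so at state a_k:
\[
\mathrm{net} = W a_k = \Big(\sum_{m=1}^{M} a_m a_m^{\top} - M I\Big) a_k
             = (N - M)\, a_k .
\]
% For M < N every component of net agrees in sign with a_k, so sgn(net) = a_k;
% flipping any single bit changes E by +2(N - M) > 0, hence each exemplar is a
% fixed point of the update and a local minimum of E.
```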
Let net_i = Σ_j w_ij v_j + I_i − θ_i, i.e., the net input to unit i. Consider a one-bit change, say v_i → v_i′. The resulting energy change is ΔE = −(v_i′ − v_i)·net_i.  (A)
To decrease the energy, the change in v_i should be consistent with net_i in sign.
□ Algorithm (Sequential Hopfield Model)
Input a probe vector v(0).
i. Compute, in a sequential fashion, the net input net_i = Σ_j w_ij v_j + I_i − θ_i.
ii. Update v_i ← sgn(net_i), keeping v_i unchanged if net_i = 0.
iii. Repeat until none of the elements changes state.
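A runnable Python sketch of this sequential algorithm, assuming bipolar states and the conventions above (a tie net_i = 0 leaves the bit unchanged); the names are illustrative:

```python
import numpy as np

def sequential_recall(W, v0, I=None, theta=None, max_sweeps=100):
    """Asynchronous (sequential) Hopfield recall with cyclic update ordering.

    W: (N, N) symmetric weight matrix with zero diagonal.
    v0: initial bipolar state in {-1, +1}^N.
    I, theta: optional external inputs and thresholds (default 0).
    """
    v = np.array(v0, dtype=int)
    N = len(v)
    I = np.zeros(N) if I is None else I
    theta = np.zeros(N) if theta is None else theta
    for _ in range(max_sweeps):
        changed = False
        for i in range(N):                       # i. net input, one unit at a time
            net = W[i] @ v + I[i] - theta[i]
            new = v[i] if net == 0 else (1 if net > 0 else -1)   # ii. update
            if new != v[i]:
                v[i] = new
                changed = True
        if not changed:                          # iii. stop when no bit changed
            return v
    return v
```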
□ Convergence proof: from (A), ΔE ≤ 0 on any bit-change, so E never increases and the state converges to an equilibrium.
□ Local minimum and attractors
Local minimum: a point whose energy level is ≦ that of every nearest neighbor.
Attractor: an equilibrium state.
※ A local minimum must be an attractor, while the reverse is not necessarily true.
○ Example 1: Two training pattern vectors
Weight matrix: W = a_1 a_1ᵀ + a_2 a_2ᵀ − 2I,
where the −2I term nullifies the diagonal elements (M = 2 exemplars).
Threshold vector: θ. Suppose an input vector is given. By cyclic update ordering:
i. First iteration (k = 0): initial vector = the given input.
a. 1st bit (i = 1): compute the net input net_1 and update the state; the 1st bit is updated.
b. 2nd bit (i = 2): compute the net input net_2 and update the state; the 2nd bit is unchanged.
c. 3rd bit (i = 3): unchanged.
d. 4th bit (i = 4): unchanged.
The above can simply be performed as
1. Compute the net-input vector net = Wv + I − θ. 2. Update the state vector accordingly.
ii. Second iteration: 1. Compute the net input. 2. Update the state.
iii. Terminate.
※ A different update ordering may retrieve a different output.
○ Example 2: The convergent state depends on the order of update. Two patterns are stored; the weight matrix is formed from them as before.
Threshold vector: θ.
※ The output can be obtained by following the energy-descending directions on the hypercube of states.
i. Energy level for [ ]
ii. Energy level for [ ]. There can be more than one direction in which the energy level descends; the selection of the path is determined by the order of updating bits.
。 Start with a state whose energy is −1.
Two paths lead to lower energy: ( ) → ( ) with energy −2, or ( ) → ( ) with energy −2, depending on whether the left-most or right-most bit is updated first.
7.1.2 Parallel (Synchronous) Hopfield Model
□ Weights: W = Σ_m a_m a_mᵀ. ※ The diagonal weights are not set to zero (i.e., each unit has a self-loop). Thresholds: θ = [θ_1, …, θ_N]ᵀ.
。 Algorithm: during the kth iteration:
i. Compute the net inputs in parallel: net_i(k) = Σ_j w_ij v_j(k) + I_i − θ_i, i = 1, …, N.
ii. Update the states in parallel: v_i(k+1) = sgn(net_i(k)), keeping v_i unchanged if net_i(k) = 0.
Repeat until none of the elements changes state.
□ Convergence: at the kth parallel iteration, evaluate the energy function E(v(k)). The energy-level change due to one iteration is ΔE = E(v(k+1)) − E(v(k)).
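A corresponding Python sketch of the parallel model, assuming bipolar exemplars; unlike the sequential version, the diagonal of W is left in place, so W is a nonnegative definite sum of outer products:

```python
import numpy as np

def parallel_recall(patterns, v0, theta=None, max_iters=100):
    """Synchronous (parallel) Hopfield recall.

    patterns: (M, N) bipolar exemplars; the diagonal of W is NOT zeroed,
    so W = P^T P is a symmetric, nonnegative definite sum of outer products.
    """
    W = patterns.T @ patterns
    v = np.array(v0, dtype=int)
    N = len(v)
    theta = np.zeros(N) if theta is None else theta
    for _ in range(max_iters):
        net = W @ v - theta                                   # i. all net inputs at once
        new = np.where(net > 0, 1, np.where(net < 0, -1, v))  # ii. keep state on ties
        if np.array_equal(new, v):                            # converged: no change
            return new
        v = new
    return v
```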
∵ W is a nonnegative definite matrix (W is formed by a sum of outer products, so it is symmetric and nonnegative definite). Combining (1) and (2), ΔE ≤ 0 at every iteration, so the parallel model converges.
□ Remarks
A local/global minimum must be an attractor; an attractor is not necessarily a local/global minimum.
There are many more spurious attractors in the parallel model than in the sequential model.
The parallel model does not get trapped in a local minimum as easily as the sequential model (∵ even if a state is one bit away from a local minimum, it may not be trapped by that attractor, because more than one bit can change in one iteration).
The parallel model appears to outperform the sequential model in terms of the percentage of correct retrievals.

Capacities of Hopfield and Hamming Networks
Capacity: the number of distinct patterns that can be stored in the network.
□ If a neural network contains N neurons, the capacity C of the network is at most on the order of N / ln N.
Proof: Given p patterns
Idea: (i) For a stored pattern, if the probability that any bit changes is sufficiently low, the pattern is considered a good attractor. (ii) If all p patterns are good attractors, the network is said to have capacity p; otherwise, the capacity is lower than p.
。 Work with the bipolar representation. Consider an input exemplar a^(k); ignore the thresholds and external inputs, and let net_i be the net input to unit i.
Multiply the net input by a_i^(k): the signal term is positive, and bit i changes when and only when the crosstalk from the other stored patterns outweighs it (i.e., a_i^(k)·net_i < 0). 。 Define the bit-error-rate P_e as the probability of such a change.
Suppose the bits of the stored patterns are independent and equiprobable (±1). If Np is large, then from the central limit theorem the crosstalk term is approximately Gaussian.
Suppose the total error probability is required to be < ε (a criterion of stability and discernibility). The error probability is accumulated over each pattern and each neuron (bit), i.e., over Np bits in total. This leads to an inequality bounding p: take the logarithm, and for large N the N-dependent term dominates, which gives the capacity bound.
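A sketch of the capacity argument these slides outline, under the usual assumptions (random, independent, equiprobable bipolar bits; thresholds ignored); the constant in the final bound depends on the exact stability criterion, so the result should be read as an order-of-magnitude statement rather than a quote of the slides:

```latex
% Signal-plus-crosstalk decomposition of the net input at bit i of exemplar a^{(k)}:
\[
a_i^{(k)}\,\mathrm{net}_i
  = N + \underbrace{\sum_{m\ne k} a_i^{(k)} a_i^{(m)} \big(a^{(m)\top} a^{(k)}\big)}_{\text{crosstalk } C_i}.
\]
% C_i is (roughly) a sum of N(p-1) independent \pm 1 terms, so for Np large the
% central limit theorem gives C_i \approx \mathcal{N}(0, Np), and the bit-error-rate is
\[
P_e = P\big(a_i^{(k)}\,\mathrm{net}_i < 0\big) = P\big(C_i < -N\big)
    \approx \Phi\!\big(-\sqrt{N/p}\big) \le e^{-N/(2p)} .
\]
% Requiring the total error over all Np bits to stay below \varepsilon:
\[
Np\, e^{-N/(2p)} < \varepsilon
\;\Longrightarrow\; \frac{N}{2p} > \ln\frac{Np}{\varepsilon}
\;\Longrightarrow\; p \lesssim \frac{N}{2\ln N}
\quad (\text{for large } N \text{ the } \ln N \text{ term dominates}).
\]
```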
□ Central Limit Theorem
Let X_1, …, X_n be iid r.v.'s with mean μ and variance σ². Then (X_1 + ⋯ + X_n − nμ)/(σ√n) converges in distribution to N(0, 1) as n → ∞.
□ Change of variable formula: let g be a differentiable, strictly increasing or strictly decreasing function and X a continuous r.v. Then Y = g(X) has density f_Y(y) = f_X(g⁻¹(y))·|d g⁻¹(y)/dy|.
7.2. Continuous Hopfield Memory
□ Resembles an actual neuron, having a continuous, graded output.
□ An electronic circuit using amplifiers, resistors, and capacitors can be built (e.g., in VLSI). A PE consists of an amplifier and an inverting amplifier used to simulate an inhibitory signal. A resistor R_ij = 1/|w_ij|, where W = [w_ij] is the weight matrix, is placed at the intersection connecting units i and j.
Total input current = external current + linking current − leakage current, i.e., I_j + Σ_i w_ji v_i − u_j/R_j, where 1/R_j is the total conductance seen by unit j.
Treat the circuit as a transient RC circuit, in which charging the capacitor is due to the net-input current, i.e., C_j du_j/dt equals the total input current above.  (A)
□ Energy function (c : capacitance): define E for the continuous network as sketched below.
From (A), show that E is a Lyapunov function; the argument also uses (C) and (B).
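A sketch of the standard continuous Hopfield equations this slide appears to follow (Hopfield's 1984 circuit model); u_i is the internal (capacitor) voltage, v_i = g(u_i) the amplifier output, and the labels (A)-(C) in the slides may map onto these lines differently:

```latex
% Circuit equation of motion and output nonlinearity:
\[
C_i \frac{du_i}{dt} = \sum_j w_{ij} v_j + I_i - \frac{u_i}{R_i}, \qquad
v_i = g(u_i), \quad g \text{ monotonically increasing}.
\]
% Energy function:
\[
E = -\tfrac{1}{2}\sum_i \sum_j w_{ij} v_i v_j - \sum_i I_i v_i
    + \sum_i \frac{1}{R_i}\int_0^{v_i} g^{-1}(v)\,dv .
\]
% Lyapunov property (W symmetric):
\[
\frac{dE}{dt}
 = -\sum_i \frac{dv_i}{dt}\Big(\sum_j w_{ij} v_j + I_i - \frac{u_i}{R_i}\Big)
 = -\sum_i C_i\, g'(u_i)\Big(\frac{du_i}{dt}\Big)^{2} \le 0 .
\]
```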
Let the output function be v_i = g_i(u_i), where the g_i are monotonically increasing functions.
7.3.3 The Traveling-Salesperson Problem
Constraints: 1. visit each city, 2. visit each city only once, 3. minimize the total distance. Brute force: (n − 1)!/2 distinct tours, after fixing the starting city and identifying the 2 traversal directions.
□ Hopfield solution
i. For each city, a set of n PEs representing the n possible positions of that city in the tour.
Example tour solution: B A E C D → A: 01000, B: 10000, C: 00010, D: 00001, E: 00100.
ii. Entries of the matrix v = [v_{x,i}], where x indexes the city and i the position (matrix representation of the tour).
iii. Energy function. Criteria:
(a) each city is visited only once,
(b) each position on the tour is occupied only once,
(c) all cities are included,
(d) the total distance is shortest.
The energy function has four corresponding terms (1)-(4); d_{xy} denotes the distance between cities x and y.
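For reference, the standard Hopfield-Tank form that these four criteria usually take; this is an assumed reconstruction of the slide's expression, with v_{x,i} the output of the PE for city x at position i, d_{xy} the city distances, A, B, C, D penalty constants, and position indices taken modulo n:

```latex
\[
\begin{aligned}
E ={}& \frac{A}{2}\sum_x \sum_i \sum_{j \ne i} v_{x,i}\, v_{x,j}      && \text{(1) one position per city (row)}\\
  +{}& \frac{B}{2}\sum_i \sum_x \sum_{y \ne x} v_{x,i}\, v_{y,i}      && \text{(2) one city per position (column)}\\
  +{}& \frac{C}{2}\Big(\sum_x \sum_i v_{x,i} - n\Big)^{2}             && \text{(3) exactly } n \text{ units on}\\
  +{}& \frac{D}{2}\sum_x \sum_{y \ne x} \sum_i d_{xy}\, v_{x,i}\,
        \big(v_{y,i+1} + v_{y,i-1}\big)                               && \text{(4) total tour length.}
\end{aligned}
\]
```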
When the network is stabilized, ideally
Term 1: each row of the matrix contains a single 1.
Term 2: each column of the matrix contains a single 1.
Term 3: each row and each column contain at most one 1.
Term 4: a pair of cities x, y contributes nothing when they are not in sequence on the tour and contributes d_{xy} when they are in sequence; hence term 4 is minimized by the shortest tour.
iv. Weight matrix: defined in terms of inhibitions between the PEs.
(A) Inhibition term from criterion (a): units in the same row (same city) inhibit one another.
(B) Inhibition term from criterion (b): units in the same column (same position) inhibit one another.
(C) Inhibition term from criterion (c): −C, a constant (global inhibition).
(D) Inhibition term from criterion (d): if j = i − 1 or i + 1, then x and y are adjacent cities on the tour; two cities far apart should receive a large inhibition.
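Collecting (A)-(D) in the standard Hopfield-Tank form (again an assumed reconstruction using the notation above, with δ the Kronecker delta and positions taken modulo n):

```latex
\[
w_{(x,i),(y,j)} =
  - A\,\delta_{xy}\,(1-\delta_{ij})                        % (A) same city, different positions
  - B\,\delta_{ij}\,(1-\delta_{xy})                        % (B) same position, different cities
  - C                                                       % (C) global inhibition
  - D\, d_{xy}\,\big(\delta_{j,i+1} + \delta_{j,i-1}\big).  % (D) adjacent positions, weighted by distance
\]
```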
Pattern of inhibitory connections
Unit a illustrates the inhibition between units on a single row. Unit b illustrates the inhibition between units on a single column. Unit c shows the inhibition of units in adjacent columns. The global inhibition is not shown.
。 Evolution of the network
The PEs evolve according to the continuous Hopfield equation of motion, where du/dt is obtained from the gradient of the TSP energy function. Discretize the dynamics in time, writing the update either with a single 1-D index or with the 2-D (city, position) index pair.
Substituting the energy gradient into the discretized equation of motion gives the explicit update for each u_{x,i}.
Update: u_{x,i}(k+1) = u_{x,i}(k) + Δt·(du_{x,i}/dt). Output: v_{x,i} = g(u_{x,i}).
。 Example: n = 10 cities. Select A = B = 500, C = 200, D = 500. Initialize the u_{x,i}.
1. Initialize the u_{x,i}.
4. Update: u_{x,i}(k+1) = u_{x,i}(k) + Δt·(du_{x,i}/dt).
6. Repeat steps 1-5 until the stopping criteria are satisfied.
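The steps missing from these last slides (2, 3, and 5) are filled in below with the standard Hopfield-Tank procedure, so this Python sketch is an assumption-laden reconstruction rather than the slides' exact algorithm; A = B = 500, C = 200, D = 500 and n = 10 come from the example above, while `dt`, `u0_scale`, and the sigmoid gain are illustrative choices:

```python
import numpy as np

def tsp_hopfield(d, A=500.0, B=500.0, C=200.0, D=500.0,
                 dt=1e-5, u0_scale=0.02, n_iters=2000, seed=0):
    """Continuous Hopfield network for the TSP (Hopfield-Tank style sketch).

    d: (n, n) symmetric distance matrix with zero diagonal.
    Returns the (n, n) output matrix v[x, i] (city x at tour position i).
    """
    rng = np.random.default_rng(seed)
    n = d.shape[0]
    u = u0_scale * (rng.random((n, n)) - 0.5)      # small random internal states
    for _ in range(n_iters):
        v = 0.5 * (1.0 + np.tanh(u / 0.02))        # sigmoid output v = g(u)
        row_sum = v.sum(axis=1, keepdims=True)     # per-city (row) activation
        col_sum = v.sum(axis=0, keepdims=True)     # per-position (column) activation
        total = v.sum()
        v_next = np.roll(v, -1, axis=1)            # outputs at position i+1 (wraps)
        v_prev = np.roll(v, 1, axis=1)             # outputs at position i-1
        dudt = (-u                                  # leakage term (-u / tau, tau = 1)
                - A * (row_sum - v)                 # (1): other units in the same row
                - B * (col_sum - v)                 # (2): other units in the same column
                - C * (total - n)                   # (3): keep exactly n units on
                - D * (d @ (v_next + v_prev)))      # (4): distance to adjacent positions
        u = u + dt * dudt
    return 0.5 * (1.0 + np.tanh(u / 0.02))
```

A tour can be read off the converged output matrix by taking, in each column (position), the city with the largest output.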