Download presentation
1
Bioinspired Computing Lecture 16
Associative Memories with Artificial Neural Networks Netta Cohen
2
Last time Today Biologically realistic architecture
Dynamic interactive behaviour Natural learning protocols Biologically-inspired associative memories Also… steps away from biologically realistic model Unsupervised learning Applications Recurrent neural nets: Attractor neural nets:
3
Recurrent Nets: Pros & Cons
Biologically-realistic architecture/performance Complex self-sustained activity Distributed representations Dynamic interactions with environment Powerful computation Noise tolerance Graceful degradation Hard to formalise in information processing terms Hard to visualise activity Hard to train with no guarantee of convergence No guaranteed solution Cons Pros attractor neural nets are a special case of recurrent nets.
4
Training Methods for Recurrent Networks
Genetic algorithms Usually in a controller Fitness evaluated on the controller not the network Requires sophisticated genome Backpropagation through time (BPTT) Understand principle
6
+1 -1
7
Training Methods for Recurrent Networks
Jets and Sharks: interesting dynamic behaviour but weights set by hand Looking for an automatic, ‘grounded’ way of training networks Hopfield (today)! Elman (also today … turn a ffd net into something recurrent)
8
Associative Memory The imprinting and recollection of memories is an important component of what we do & how we process information. If we were to model these processes, here are a few conditions we might want to include in our model: Store and reliably recall multiple independent memories. Given only partial input, recall complete information, or Given noisy input, recall noise-free prototype information. Learn new memories in a biologically realistic manner. Recall memories fast enough (before next input is received) Once recalled, maintain attention or memory long enough (for information processing & transmission elsewhere in brain).
9
Attractor Neural Nets In RNNs, the state of the system is dictated both by internal dynamics & the system response to inputs from the environment. state space cross-section of energy landscape In some cases, trajectories in state-space can be guaranteed to lead to one of several stable states (fixed points, cycles or generic attractors).
10
Attractor Neural Nets (cont.)
Dynamical systems such as RNNs could serve as models of associative memory if it was possible to encode each memory in a specific stable state or attractor. In 1982, John Hopfield realised that by imposing a couple of restrictions on the architecture of the nets, he could guarantee the existence of attractors, such that every initial condition would necessarily evolve to a stable solution, where it would stay. This is tantamount to the requirement that the above picture be described in terms of an energy landscape.
11
Attractor Neural Nets: Architecture
All connections are symmetric wij = wji wij wji j No self-connectedness “Gerard Toulouse has called Hopfield’s use of symmetric connections a ‘clever step backwards from biological realism’. The cleverness arises from the existence of an energy function.”* i wii=0 The existence of an energy function provides us with: A formalism of the process of memory storage and recall A tool to visualise the activity (both learning and recall) A straightforward way to train the net Once trained, a guaranteed solution (recall of the correct memory). * Hertz, Krogh & Palmer Introduction to the theory of neural computation (1990).
12
How Does It Work? Nodes are modelled by conventional binary MP neurons. Each neuron serves both as an input and output unit. (There are no hidden units.) States are given by the pattern of activity of the neurons (e.g. 101 for a network with three neurons). The number of neuron sets the maximum length for a bit-string of memory. Different patterns can be simultaneously stored in the network. The number of independent patterns that can be remembered is less than or equal to the number of nodes. Memory recall corresponds to a trajectory taking the system from some initial state (input) to the local energy minimum (closest association) . Each step along the (recall) trajectory results in the same or a lower energy. Since energy is bounded from below, a solution is guaranteed for every input.
13
A working example Weight matrix
1 2 3 4 5 Input (t=0) (neuron i) Weight matrix (neuron j) t=1 1 1 1 1 t=2 t=3 t=4 t ... ∑ xi wij i 1 3 -1 threshold = 0 Exercise: repeat this example with an initial input of [ ].
14
More general examples Stability:
The stable pattern reached in the working example represents a fixed point in the dynamics. While stable solutions are guaranteed, not all stable solutions are fixed point solutions. State update rule: This example used a “synchronous updating” method. Asynchronous (sequential or random) updating methods can also be implemented.
15
Asynchronous random updating
Pick any node Update it Repeat Example difference between asynchronous/synchronous: Hopfield network with two nodes: W =1 Initial state -1, 1
16
0,1 or -1,1 No essential difference!
Suppose we use weights w_ij for nodes that are -1,1
17
Trajectories in energy landscape
Where does energy come in? The formalism we need to answer this question comes from physics and requires slight modifications to our notation: 1) neuron coding [0 1] [-1 1] 2) Threshold now becomes a “sign” function: sign(input) = x=1 if input>0 old x if input=0 x= -1 if input<0 { Spin Glass 3) Asynchronous update
18
From energies to MP neurons
Define the energy of node i as Ei = - xi inputi where inputi to node i is the weighted sum over all neurons inputi = ∑ xj wij j This is called the mean field approximation: the “magnetic field” at each node (each spin) corresponds to a weighted average over all the fields generated by all other spins. When a specific node senses this field, it wants to align with the mean field, thus reducing its energy. This is the update rule: xi sign(inputi) We have just re-discovered that the MP update rule exactly corresponds to magnetic field alignment in Spin Glasses!
19
Ansynchronous Random updating
Easy to show that updating minimizes energy Example given …
20
Attractor Neural Nets The restrictions imposed on the recurrent net are now: j wij wji All connections are symmetric wij = wji No self-connectedness “Gerard Toulouse has called Hopfield’s use of symmetric connections a ‘clever step backward from biological realism’. The cleverness arises from the existence of an energy function.”* i wii=0 The existence of an energy function provides us with: A formalism of the process of memory storage and recall A tool to visualise the activity (both learning and recall) A straightforward way to train the net Once trained, a guaranteed solution (recall of the correct memory).
21
Training the Net We need to find the set of weights that encode a single pattern p of length N bits as a minimum energy solution. The minimum energy is obtained when the output of node i exactly matches the inputs to that node. Plug in a guess for w: so that OK
22
Training the Net (cont.)
This weight assignment is remarkably reminiscent of Hebbian learning: If two nodes are spiking at the same time, then the weight connecting them is strengthened. Here anti-correlated nodes result in negative (inhibitory) weights. For gradual learning Only patterns introduced repeatedly will result in the formation of new memories; noise will be ignored. Now generalising for M memories or patterns: Generalised Hebb Rule
23
Does it work? This applet demonstrates: The distributed representation
The ability to perform powerful computation High storage capacity (7 100-bit patterns in 100 neurons) High fidelity and noise tolerance Graceful degradation (for more memories) Eliminated features: Dynamic inputs in training Intrinsic background activity from where no longer available.
24
Adding extra training patterns
For a single training pattern it is clear that the pattern is an energy minimum (why?) For two training patterns, each of them is still an energy minimum, slightly perturbed by the other (explicit example) General condition p << N Consequence resistance to damage No localisation of memory: ‘holographic’
25
Storage Capacity How many memories can be stored in the network? To store M memories, each of length N bits, in a network of N neurons, we first ask how many stable patterns can be reached? In 1987, McEliece et al derived an upper limit for the number of memories that can be stored accurately: M = N/(2 logN). e.g. for N = 100 neurons, M = 11 distinct memories, each 100 bits long can be faithfully stored and recalled. To write out these 11 distinct memories, would take 1100 bits! In general, the coding efficiency of the network can be summarised as 2 log N neurons per pattern (each N bits long). This enormous capacity is paid for by a potentially lengthy recall process. McEliece et al., (1987) IEEE Trans. Inf. Theor. IT-33:
26
Applications Hopfield nets have obvious applications for any problem that can be posed in terms of optimisation in the sense of maximising or minimising some function, that can be likened to an energy function. The distance matching problem: The travelling salesman problem: given points match pairs shortest length shortest find path given points
27
What about the brain (pros)?
Hopfield nets maintain some very attractive features from recurrent net architectures. However, the imposition of symmetric weights was a conscious move away from biological realism and toward engineering-like reliability. In contrast, Hopfield nets seem more biologically realistic in disallowing self-connected neurons. Hebbian-like learning is also a great appeal of Hopfield nets, capturing several important principles: (1) unsupervised learning (2) natural synaptic plasticity (3) No necessary distinction between training & testing. (4) robustness to details of training procedure
28
What about the brain (cons)?
While we now have dynamics in training and in recall, we might still ask is this dynamics realistic in the brain? 1) In the memory recall stage: we consider inputs one at a time, waiting for the association to be made before proceeding to the next pattern. Is this how the brain works? 2) The aspiration of every Hopfield net is to arrive at a stable solution. Is this a realistic representation of association, or cognition in general in the brain? In other words, do we represent solutions to problems by relaxing to stable states of activity in the brain, or does the brain represent solutions according to very different, dynamic paradigms that handle continuous inputs and actively resist falling into the abyss of equilibrium?
29
What about the brain? (cont.)
How memories are implanted in our brains remains an exciting research question. While Hopfield nets no longer participate in this discourse, their formative role in shaping our intuition about associative memories remains admirable.
30
Next time… Final lecture about neural networks (for the time being) Reading John Hopfield (1982) “Neural Networks and Physical Systems with Emergent Collective Computational Properties”, Proc. Nat. Acad. Sci. 79: A highly accessible introduction to the subject, incl. both non-technical and technical approaches can be found at: Some food for thought: a popular article on CNN: “Researchers: It’s easy to plant false memories”, CNN.com, Feb 16, 2003.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.