Memory
Hopfield Network
A content-addressable memory implemented as an attractor network.
Hopfield Network
General Case: Lyapunov function
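Since the slides include no code, a minimal runnable sketch of a Hopfield network may help here: content-addressable recall from a corrupted cue, with the energy (Lyapunov) function decreasing along the dynamics. The Hebbian storage rule, network size, and noise level are standard illustrative choices, not taken from the slides.

```python
import numpy as np

def train(patterns):
    """Hebbian weights for patterns in {-1,+1}^N (zero diagonal)."""
    n = patterns.shape[1]
    w = patterns.T @ patterns / n
    np.fill_diagonal(w, 0.0)
    return w

def energy(w, s):
    """Hopfield energy E = -1/2 s.W.s, the network's Lyapunov function."""
    return -0.5 * s @ w @ s

def recall(w, s, n_sweeps=10, seed=0):
    """Asynchronous updates: each flip can only lower the energy."""
    rng = np.random.default_rng(seed)
    s = s.copy()
    for _ in range(n_sweeps):
        for i in rng.permutation(len(s)):
            s[i] = 1 if w[i] @ s >= 0 else -1   # sign of the local field
    return s

# Usage: store one pattern, recall it from a cue with 20% of bits flipped.
rng = np.random.default_rng(1)
p = rng.choice([-1, 1], size=(1, 100))
w = train(p)
cue = p[0].copy()
cue[:20] *= -1
s = recall(w, cue)
print(energy(w, s) < energy(w, cue), np.array_equal(s, p[0]))  # True True
```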
Neurophysiology
Mean Field Approximation
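The slide's equations are not reproduced here; as a hedged sketch, the mean-field description reduces the network to an excitatory rate E and an inhibitory rate I obeying Wilson-Cowan-style equations (the subscripted couplings are my notation for the C_E, C_I of the following slides, and f is a sigmoidal gain):

$$\tau_E \frac{dE}{dt} = -E + f\!\left(C_{EE}E - C_{EI}I + I_{\mathrm{aff}}\right), \qquad \tau_I \frac{dI}{dt} = -I + f\!\left(C_{IE}E - C_{II}I\right).$$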
Null Cline Analysis
What are the fixed points?
[Figure series: phase-plane analysis in the (E, I) plane. The excitatory null cline and the inhibitory null cline are plotted; the fixed points are their intersections. The excitatory null cline has stable branches separated by an unstable branch, and the figures mark both stable fixed points and an unstable fixed point.]
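A minimal numerical sketch of the analysis: compute each null cline over a grid of E and locate the fixed points where they cross. The sigmoid and all coupling values are illustrative assumptions chosen to give three crossings, not the slide's parameters.

```python
import numpy as np
from scipy.optimize import brentq

# Illustrative Wilson-Cowan-style dynamics (parameters are assumptions):
#   tau dE/dt = -E + f(cEE*E - cEI*I + I_aff)
#   tau dI/dt = -I + f(cIE*E - cII*I)
def f(x):
    return 1.0 / (1.0 + np.exp(-8.0 * (x - 0.5)))

def f_inv(y):
    return 0.5 + np.log(y / (1.0 - y)) / 8.0

cEE, cEI, cIE, cII, I_aff = 1.6, 0.5, 1.0, 0.1, 0.0
E = np.linspace(1e-3, 1.0 - 1e-3, 2000)

# Excitatory null cline (dE/dt = 0): E = f(cEE*E - cEI*I + I_aff),
# solved for I by inverting the sigmoid.
I_e = (cEE * E + I_aff - f_inv(E)) / cEI

# Inhibitory null cline (dI/dt = 0): I = f(cIE*E - cII*I), solved
# numerically for I at each value of E.
I_i = np.array([brentq(lambda I, e=e: f(cIE * e - cII * I) - I, 0.0, 1.0)
                for e in E])

# Fixed points sit where the two null clines cross; with these values
# there are three (rest, unstable middle, high-activity memory state).
d = I_e - I_i
print(E[np.nonzero(np.sign(d[:-1]) != np.sign(d[1:]))[0]])
```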
Binary Memory
[Figure series in the (E, I) phase plane:]
Storing: transiently decrease inhibition (C_I); then, back at rest, the network sits at the high-activity (memory) fixed point.
Reset: transiently increase inhibition; then, back at rest, the network sits at the resting fixed point.
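A runnable sketch of this store/reset protocol, reduced to one population with the inhibitory drive lumped into a single time-varying term g(t) (decreasing g plays the role of decreasing C_I). The rate equation dE/dt = -E + f(E - g) is bistable at the baseline g = 0; all numbers are illustrative assumptions.

```python
import numpy as np

def f(x):  # sigmoidal gain, same illustrative shape as above
    return 1.0 / (1.0 + np.exp(-8.0 * (x - 0.5)))

dt, tau, steps = 0.01, 0.1, 4000
g = np.zeros(steps)
g[500:1000] = -0.3    # store: transiently decrease inhibition
g[2500:3000] = +0.3   # reset: transiently increase inhibition

E = 0.02              # start near the resting fixed point
trace = np.empty(steps)
for t in range(steps):
    E += dt / tau * (-E + f(E - g[t]))
    trace[t] = E

print(round(trace[400], 2))   # ~0.02: rest
print(round(trace[2000], 2))  # ~0.98: memory persists after g returns to 0
print(round(trace[3900], 2))  # ~0.02: reset brings the network back to rest
```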
Networks of Spiking Neurons
Problems with the previous approach:
1. Spiking neurons have monotonic I-f curves (which saturate, but only at very high firing rates).
2. How do you store more than one memory?
3. What is the role of spontaneous activity?
Networks of Spiking Neurons
[Figure: activation function, firing rate R(I_j) versus input current I_j.]
A memory network must be able to store a value in the absence of any input:
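The equation itself is omitted from the slide; a hedged reconstruction using the symbols of the following slides (recurrent gain c, I-f curve R, afferent input I_aff) is the standard recurrent rate dynamics

$$\tau \frac{dI_i}{dt} = -I_i + c\,R(I_i) + I_{\mathrm{aff}},$$

so a stored value is a self-consistent state that survives once the input is removed, i.e. a fixed point of $I^* = c\,R(I^*)$.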
Networks of Spiking Neurons
[Figure: recurrent feedback cR(I_i) plotted against I_i, together with the afferent input I_aff.]
Networks of Spiking Neurons
With a non-saturating activation function and no inhibition, the neurons must be spontaneously active for the network to admit a nonzero stable state. [Figure: cR(I_i) versus I_i, with the nonzero stable state I_2*.]
Networks of Spiking Neurons
To get several stable fixed points, we need inhibition. [Figure: cR(I_i) versus I_i, with an unstable fixed point separating the stable fixed points (e.g. I_2*).]
Networks of Spiking Neurons
Clamping the input with an inhibitory afferent input I_aff. [Figure: cR(I_i) and I_aff versus I_i.]
Clamping the input with an excitatory afferent input I_aff. [Figure: cR(I_i) + I_aff versus I_i, with stable state I_2*.]
Networks of Spiking Neurons IjIj R(Ij)R(Ij)
Major problem: the memory state has a high firing rate while the resting state is at zero. In reality, there is spontaneous activity at 0-10 Hz and the memory state is around … Hz (not 100 Hz).
Solution: you don't want to know (but it involves a careful balance of excitation and inhibition)…
Line Attractor Networks
Continuous attractor: a line attractor or, more generally, an N-dimensional attractor.
Useful for storing analog values.
Unfortunately, it is virtually impossible to get a neuron to store a value proportional to its activity.
Line Attractor Networks
Storing analog values: difficult with this scheme… [Figure: cR(I_i) versus I_i.]
Implications for transmitting rates and for integration… [Figure: cR(I_i) versus I_i.]
Line Attractor Networks
Head direction cells. [Figure: activity versus preferred head direction (deg); tuning curves for the current head direction H.]
Line Attractor Networks
Attractor network with a population code and translation-invariant weights. [Figure: activity versus preferred head direction (deg); a stable bump of activity at the head direction H.]
Line Attractor Networks
Computing the weights:
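The defining equation is omitted; presumably the weights are chosen so that the desired activity profile A is a fixed point of the dynamics for every position θ₀ of the bump. With translation-invariant weights w_ij = w(θ_i − θ_j) this reads (my notation, a hedged reconstruction):

$$I_i = \sum_j w(\theta_i - \theta_j)\, R(I_j), \qquad I_i = A(\theta_i - \theta_0) \;\; \text{for all } \theta_0,$$

which can be solved for w, for instance component by component in the Fourier domain.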
Line Attractor Networks
The problem with the previous approach is that the weights tend to oscillate. Instead, we minimize a cost function; the solution is given in closed form.
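A hedged reconstruction of the omitted cost and solution, in my notation: collect the desired input currents for every bump position as the columns of a matrix A, and the corresponding firing rates as the columns of R. A ridge-regularized least-squares fit

$$E(W) = \lVert A - WR \rVert^2 + \lambda \lVert W \rVert^2$$

has the closed-form minimizer

$$W = A R^{\mathsf T}\left(R R^{\mathsf T} + \lambda\, \mathbb{1}\right)^{-1},$$

where the penalty λ suppresses the large oscillatory components of the unregularized solution.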
Line Attractor Networks
Updating the memory: a bias in the weights turns the network into an integrator of velocity inputs, etc.
Line Attractor Networks
How do we know that the fixed points are stable? With symmetric weights, the network has a Lyapunov function (Cohen, Grossberg 1982):
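The function is not reproduced on the slide; one standard form, assuming the dynamics $\tau\, dI_i/dt = -I_i + \sum_j w_{ij} R(I_j)$ with symmetric weights and a monotonically increasing R, is

$$L = -\frac{1}{2}\sum_{i,j} w_{ij}\, R(I_i)\, R(I_j) + \sum_i \int_0^{I_i} x\, R'(x)\, dx,$$

whose time derivative

$$\frac{dL}{dt} = -\tau \sum_i R'(I_i)\left(\frac{dI_i}{dt}\right)^{2} \le 0$$

vanishes only at the fixed points, so the dynamics can only descend toward them.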
Line Attractor Networks
Line attractor: the set of stable points forms a line in activity space.
Limitations:
Requires symmetric weights…
Neutrally stable along the attractor: unavoidable drift.
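A runnable sketch of such a network: a ring of rate neurons with translation-invariant weights (local excitation plus global inhibition). A transient cue creates a bump that persists at the cued position after the cue is removed; since every position is equally stable, noise makes the bump drift, which is the limitation noted above. All parameter values are illustrative assumptions.

```python
import numpy as np

N, J0, J1 = 128, -3.0, 10.0             # global inhibition, local excitation
theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
W = (J0 + J1 * np.cos(theta[:, None] - theta[None, :])) / N

def step(h, ext, dt=0.02, tau=1.0):
    r = np.clip(h, 0.0, 1.0)            # rectified, saturating rate
    return h + dt / tau * (-h + W @ r + ext), r

def bump_position(r):
    """Population-vector readout of the stored angle."""
    return np.arctan2(r @ np.sin(theta), r @ np.cos(theta))

h = np.zeros(N)
cue = 1.5 * np.cos(theta - 2.0)         # transient cue centered at 2.0 rad
for t in range(3000):
    h, r = step(h, cue if t < 500 else 0.0)

print(bump_position(r))  # ~2.0: the bump persists where the cue left it
```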
Memorized Saccades
Double-saccade task: while the subject fixates (+), two targets T1 and T2 are flashed at retinal positions R1 and R2; after they disappear, the subject makes two memorized saccades S1 and S2.
S1 = R1
S2 = R2 − S1 (the second saccade must be corrected for the eye movement already made)
[Figure series: fixation cross, targets T1 and T2, and the vectors R1, R2, S1, S2.]
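A worked example of the vector update, with hypothetical numbers: if the targets appear at retinal positions R1 = (10°, 0°) and R2 = (15°, 5°), then

$$S_1 = R_1 = (10^\circ, 0^\circ), \qquad S_2 = R_2 - S_1 = (15^\circ, 5^\circ) - (10^\circ, 0^\circ) = (5^\circ, 5^\circ),$$

so the second saccade differs from where T2 appeared on the retina.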
Memorized Saccades
[Figure panels A, B, E: activity as a function of horizontal and vertical retinal position (deg).]
Neural Integrator
Oculomotor theory
Evidence integration for decision making
Transmitting rates in multilayer networks
Maximum likelihood estimation
Semantic Memory
Memory for words is sensitive to semantics (not just spelling).
Experiment: subjects are first trained to remember a list of words. A few hours later, they are presented with a list of words and have to pick out the ones they were supposed to remember. Many mistakes involve words semantically related to the remembered words.
Semantic Memory
Usual solution: semantic networks (nodes: words; links: semantic similarities) and spreading activation.
Problem 1: the same word can have several meanings (e.g. bank). This is not captured by a semantic network.
Problem 2: some interactions between words are negative, even when the words have no semantic relationship (e.g. doctor and hockey).
Semantic Memory
Bayesian approach (Griffiths, Steyvers, Tenenbaum, Psych Rev 06).
Documents are bags of words (we ignore word ordering).
Generative model for a document: each document has a gist, which is a mixture of topics; a topic in turn defines a probability distribution over words.
Semantic Memory
Bayesian approach: generative model for a document.
g → z → w (gist → topics → words)
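A runnable sketch of this generative process; the two topics, the six-word vocabulary, and all probabilities are made-up illustrations, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = ["bank", "money", "loan", "meadow", "river", "sheep"]
# P(w|z) for two topics (hypothetical values):
topics = {
    "finance":     np.array([0.30, 0.30, 0.30, 0.02, 0.03, 0.05]),
    "countryside": np.array([0.15, 0.02, 0.03, 0.30, 0.25, 0.25]),
}

def generate_document(gist, n_words=10):
    """gist: dict topic -> P(z|g), the mixing proportions."""
    names = list(gist)
    words = []
    for _ in range(n_words):
        z = rng.choice(names, p=[gist[t] for t in names])  # draw a topic
        words.append(rng.choice(vocab, p=topics[z]))       # draw a word
    return words

# A document whose gist is mostly finance (e.g. a wheat-market report):
print(generate_document({"finance": 0.9, "countryside": 0.1}))
```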
Semantic Memory
z = topics: finance, English countryside, etc.
Gist: mixture of topics; P(z|g) gives the mixing proportions.
Some documents might be 0.9 finance, 0.1 English countryside (e.g. the wheat market): P(z=finance|g_1) = 0.9, P(z=English countryside|g_1) = 0.1.
Others might be 0.2 finance, 0.8 English countryside (e.g. a Lloyds CEO buys a mansion): P(z=finance|g_2) = 0.2, P(z=English countryside|g_2) = 0.8.
Semantic Memory
Topic z_1 = finance. Words: P(w|z_1): 0.01 bank, money, 0.0 meadow…
Topic z_2 = English countryside. Words: P(w|z_2): bank, money, meadow…
The gist is shared within a document, but the topic can vary from one sentence (or even one word) to the next.
Semantic Memory
Problem: we only observe the words, not the topics or the gist…
How do we know how many topics and gists to pick to account for a corpus of words, and how do we estimate their probabilities?
To pick the number of topics and gists: Chinese restaurant process, Dirichlet process, and hierarchical Dirichlet process; MCMC sampling.
Use techniques like EM to learn the probabilities of the latent variables (topics and gists).
However, a human is still needed to label the topics…
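As a concrete instance of the MCMC inference mentioned above, here is a minimal collapsed Gibbs sampler for an LDA-style topic model with a fixed number of topics K (the CRP/HDP machinery that picks K automatically is omitted). The toy corpus, vocabulary, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_lda(docs, V, K=2, alpha=0.1, beta=0.01, n_iter=200):
    """docs: list of word-id lists.  Returns topic-word counts."""
    n_dk = np.zeros((len(docs), K))      # topic counts per document
    n_kw = np.zeros((K, V))              # word counts per topic
    n_k = np.zeros(K)                    # total counts per topic
    z = []                               # current topic of every token
    for d, doc in enumerate(docs):       # random initialization
        zd = rng.integers(K, size=len(doc))
        z.append(zd)
        for w, k in zip(doc, zd):
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]              # remove this token's assignment
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + beta * V)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k              # resample and restore the counts
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    return n_kw

# Toy corpus over the vocabulary [bank, money, meadow, sheep]:
docs = [[0, 1, 0, 1, 0], [2, 3, 2, 3, 0], [0, 1, 1, 0], [3, 2, 3, 2]]
print(gibbs_lda(docs, V=4))  # topics typically separate {bank,money} / {meadow,sheep}
```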
Semantic Memory
[Table: the most probable words in Topic 1, Topic 2, and Topic 3.]
Semantic Memory
Problems we may want to solve:
Prediction, P(w_{n+1}|w): what is the next word?
Disambiguation, P(z|w): what is the mixture of topics in this document?
Gist extraction, P(g|w): what is the probability distribution over gists?
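For instance, prediction follows from the conditional independencies of the g → z → w chain (a sketch, not the paper's own derivation; the sum over g becomes an integral if the gist is continuous):

$$P(w_{n+1}\mid \mathbf{w}) = \sum_g \sum_z P(w_{n+1}\mid z)\, P(z\mid g)\, P(g\mid \mathbf{w}).$$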
Semantic Memory
What we need is a representation of the joint distribution P(w,z,g).
Semantic Memory
P(w,z,g) is given by the generative model: following the chain g → z → w,
P(w,z,g) = P(w|z) P(z|g) P(g).
Semantic Memory
Explains the semantic interference in list memory: recall will tend to favor words that are semantically related to the studied words through the topics and gists.
Captures the fact that a given word can have different meanings (topics and gists) depending on the context.
[Figure: predicted next word versus word being observed, for the topics Countryside and Finance. Annotation: "money" is less likely to be seen if the topic is countryside.]