Memory: the Hopfield network, content-addressable memory, and attractor networks.

Presentation transcript:

Memory

Hopfield Network: a content-addressable, attractor network.

Hopfield Network
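
A minimal runnable sketch of a binary Hopfield network used as a content-addressable memory, assuming the classic Hebbian (outer-product) storage rule and asynchronous updates; the pattern count, pattern size, and corruption level below are illustrative, not taken from the slides:

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian (outer-product) storage of +/-1 patterns; self-connections set to zero."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, state, n_steps=500, rng=None):
    """Asynchronous updates: repeatedly pick one unit and align it with its local field."""
    rng = np.random.default_rng() if rng is None else rng
    s = state.copy()
    for _ in range(n_steps):
        i = rng.integers(len(s))
        s[i] = 1 if W[i] @ s >= 0 else -1
    return s

# Illustrative usage: store two random patterns, then recall from a corrupted cue.
rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(2, 50))
W = train_hopfield(patterns)
cue = patterns[0].copy()
cue[:10] *= -1                                   # corrupt 10 of the 50 bits
overlap = np.mean(recall(W, cue, rng=rng) == patterns[0])
print(overlap)                                   # typically close to 1.0
```

Starting the dynamics at a partial or noisy pattern and letting the network relax to the nearest stored pattern is what makes the memory content addressable.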

General Case: Lyapunov function
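
The formula itself is not reproduced in this transcript; as a sketch, for binary units s_i in {-1, +1}, symmetric weights w_ij = w_ji with w_ii = 0, and thresholds theta_i, the standard Hopfield energy, which never increases under asynchronous updates, is:

```latex
E(\mathbf{s}) \;=\; -\tfrac{1}{2} \sum_{i \neq j} w_{ij}\, s_i s_j \;+\; \sum_i \theta_i\, s_i .
```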

Neurophysiology

Mean Field Approximation
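
A hedged sketch of the kind of mean-field rate description that the null-cline slides below rely on, written with the couplings C_E (recurrent excitation) and C_I (inhibition) that appear there; the gain function f, the drives h_E and h_I, and the time constants tau_E and tau_I are assumed notation, not from the slides:

```latex
\tau_E \,\frac{dE}{dt} = -E + f\!\big(C_E\,E - C_I\,I + h_E\big),
\qquad
\tau_I \,\frac{dI}{dt} = -I + f\!\big(C_E\,E - C_I\,I + h_I\big).
```

Setting dE/dt = 0 or dI/dt = 0 defines the excitatory and inhibitory null clines analyzed on the next slides.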

Null Cline Analysis: what are the fixed points? (figure: E-I circuit with excitatory and inhibitory couplings C_E and C_I)

Null Cline Analysis (figure sequence: constructing the excitatory null cline in the E-I plane; it has stable upper and lower branches separated by an unstable branch, i.e., stable and unstable fixed points of the E dynamics)

Null Cline Analysis (figure sequence: constructing the inhibitory null cline in the E-I plane; along it the I dynamics has a stable fixed point)

Null Cline Analysis (figure: excitatory and inhibitory null clines overlaid; the fixed points of the full network are their intersections)
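
A small numerical sketch of how such null clines can be traced, using E-I rate equations of the form sketched above; the sigmoid gain and all parameter values are illustrative choices, not taken from the slides:

```python
import numpy as np

def f(x):
    """Sigmoid gain function (illustrative choice)."""
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative parameters: recurrent excitation C_E, inhibition C_I, drives h_E, h_I.
C_E, C_I, h_E, h_I = 12.0, 10.0, -3.0, -4.0

def dE(E, I):                      # excitatory rate of change
    return -E + f(C_E * E - C_I * I + h_E)

def dI(E, I):                      # inhibitory rate of change
    return -I + f(C_E * E - C_I * I + h_I)

def nullcline(deriv, E_grid, I_lo=0.0, I_hi=1.0, iters=60):
    """For each E, find the I where deriv(E, I) = 0 by bisection (NaN if no crossing)."""
    out = []
    for E in E_grid:
        lo, hi = I_lo, I_hi
        if deriv(E, lo) * deriv(E, hi) > 0:
            out.append(np.nan)
            continue
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if deriv(E, lo) * deriv(E, mid) <= 0:
                hi = mid
            else:
                lo = mid
        out.append(0.5 * (lo + hi))
    return np.array(out)

E_grid = np.linspace(0.0, 1.0, 200)
E_nullcline = nullcline(dE, E_grid)   # I values on the excitatory null cline
I_nullcline = nullcline(dI, E_grid)   # I values on the inhibitory null cline
# The network's fixed points are where the two curves intersect; their stability
# follows from linearizing the dynamics at each intersection.
```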

Binary Memory (figure: E-I phase plane for the circuit with couplings C_E and C_I)

Binary Memory. Storing: decrease inhibition (C_I). (figure: trajectory in the E-I phase plane)

Binary Memory. Storing: inhibition back to rest. (figure: the network settles in the high-activity stable state)

Binary Memory. Reset: increase inhibition. (figure: trajectory in the E-I phase plane)

Binary Memory. Reset: inhibition back to rest. (figure: the network returns to the low-activity stable state)

Networks of Spiking Neurons
Problems with the previous approach:
1. Spiking neurons have monotonic I-f curves (which saturate, but only at very high firing rates).
2. How do you store more than one memory?
3. What is the role of spontaneous activity?

Networks of Spiking Neurons

(figure: activation function R(I_j) plotted against the input current I_j)

A memory network must be able to store a value in the absence of any input:
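
The equation itself is not reproduced in this transcript; in the notation used on the following slides (current I_i, recurrent gain c, activation function R, afferent input I_aff) and with an assumed time constant tau, one standard single-population form is:

```latex
\tau \,\frac{dI_i}{dt} = -I_i + c\,R(I_i) + I_{\mathrm{aff}},
\qquad \text{memory state: } I_i^{*} = c\,R(I_i^{*}) \ \text{with } I_{\mathrm{aff}} = 0 .
```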

Networks of Spiking Neurons

(figure: recurrent drive cR(I_i) and afferent input I_aff as functions of the current I_i)

Networks of Spiking Neurons. With a non-saturating activation function and no inhibition, the neurons must be spontaneously active for the network to admit a nonzero stable state. (figure: cR(I_i) plotted against I_i, with a nonzero fixed point I_2*)

Networks of Spiking Neurons. To get several stable fixed points, we need inhibition. (figure: fixed points along I_i, including I_2*, with stable fixed points separated by an unstable fixed point)

Networks of Spiking Neurons. Clamping the input: an inhibitory afferent current I_aff. (figure: effect of inhibitory I_aff on the fixed points along I_i)

Networks of Spiking Neurons. Clamping the input: an excitatory afferent current I_aff. (figure: cR(I_i) shifted by excitatory I_aff, with fixed point I_2*)

Networks of Spiking Neurons (figure: activation function R(I_j) plotted against the input current I_j)

Major problem: in this scheme the memory state has a high firing rate and the resting state is at zero. In reality, there is spontaneous activity at 0-10 Hz and the memory state fires well below 100 Hz. Solution: you don't want to know (but it involves a careful balance of excitation and inhibition)...

Line Attractor Networks
Continuous attractor: a line attractor or, more generally, an N-dimensional attractor.
Useful for storing analog values.
Unfortunately, it is virtually impossible to get a neuron to store a value proportional to its activity.

Line Attractor Networks. Storing analog values is difficult with this scheme... (figure: cR(I_i) plotted against I_i)

Line Attractor Networks. Implications for transmitting rates and for integration... (figure: cR(I_i) plotted against I_i)

Line Attractor Networks: head direction cells. (figure: activity plotted against preferred head direction (deg); a tuning curve centered on the current head direction)

Line Attractor Networks: an attractor network with a population code and translation-invariant weights. (figure: activity plotted against preferred head direction (deg))
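
A hedged sketch of such a head-direction (ring) attractor with translation-invariant weights, assuming a cosine weight profile and sigmoidal rate dynamics; the connectivity and all parameter values below are illustrative, not taken from the slides:

```python
import numpy as np

N = 100
prefs = np.linspace(0, 2 * np.pi, N, endpoint=False)        # preferred head directions

# Translation-invariant weights: each entry depends only on the difference in preference.
J0, J1 = -1.0, 6.0                                           # illustrative couplings
W = (J0 + J1 * np.cos(prefs[:, None] - prefs[None, :])) / N

def f(u):
    """Steep sigmoid gain (illustrative choice)."""
    return 1.0 / (1.0 + np.exp(-20.0 * (u - 0.5)))

def simulate(r0, steps=3000, dt=0.05, tau=1.0):
    """Rate dynamics tau * dr/dt = -r + f(W r), run with no external input."""
    r = r0.copy()
    for _ in range(steps):
        r += (dt / tau) * (-r + f(W @ r))
    return r

# Cue a bump of activity around 1.0 rad, then remove the cue and let the network hold it.
rng = np.random.default_rng(1)
cue = np.exp(2.0 * (np.cos(prefs - 1.0) - 1.0)) + 0.05 * rng.standard_normal(N)
bump = simulate(cue)
decoded = np.angle(np.sum(bump * np.exp(1j * prefs)))        # population-vector readout
print(decoded)                                               # stays near the cued 1.0 rad
```

Because the weights depend only on differences in preferred direction, a copy of the bump centered on any direction is also a steady state, which is what makes the set of memories approximately a continuous attractor.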

Line Attractor Networks Computing the weights:

Line Attractor Networks The problem with the previous approach is that the weights tend to oscillate. Instead, we minimize: The solution is:

Line Attractor Networks. Updating of the memory: a bias in the weights, an integrator of velocity, etc.

Line Attractor Networks How do we know that the fixed points are stable? With symmetric weights, the network has a Lyapunov function (Cohen, Grossberg 1982):
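
The formula is not reproduced in this transcript; as a sketch, for rate dynamics of the form tau du_i/dt = -u_i + sum_j w_ij g(u_j) + h_i with a monotonically increasing gain g (these dynamics are an assumption, not taken from the slide), one standard Lyapunov function is:

```latex
L = -\frac{1}{2}\sum_{i,j} w_{ij}\, V_i V_j \;-\; \sum_i h_i\, V_i \;+\; \sum_i \int_0^{V_i} g^{-1}(V)\, dV,
\qquad V_i = g(u_i),
```

which is non-increasing along trajectories whenever the weights are symmetric, w_ij = w_ji.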

Line Attractor Networks
Line attractor: the set of stable points forms a line in activity space.
Limitations:
Requires symmetric weights...
Neutrally stable along the attractor: unavoidable drift.

Memorized Saccades (figure: fixation cross with two flashed targets, T1 and T2)

Memorized Saccades (figure: retinal target vectors R1 and R2 and saccade vectors S1 and S2, with S1 = R1 and S2 = R2 - S1)

Memorized Saccades (figure, panels A and B: activity plotted over horizontal and vertical retinal position (deg))

Neural Integrator
Oculomotor theory
Evidence integrator for decision making
Transmitting rates in multilayer networks
Maximum likelihood estimator

Semantic Memory
Memory for words is sensitive to semantics, not just spelling.
Experiment: subjects are first trained to remember a list of words. A few hours later, they are presented with a list of words and have to pick out the ones they were supposed to remember. Many mistakes involve words semantically related to the remembered words.

Semantic Memory
Usual solution: semantic networks (nodes: words, links: semantic similarities) and spreading activation.
Problem 1: the same word can have several meanings (e.g., bank). This is not captured by a semantic network.
Problem 2: some interactions between words are negative, even when the words have no semantic relationship (e.g., doctor and hockey).

Semantic Memory
Bayesian approach (Griffiths, Steyvers, and Tenenbaum, Psych. Rev. 06).
Documents are bags of words (word ordering is ignored).
Generative model for a document: each document has a gist, which is a mixture of topics; a topic in turn defines a probability distribution over words.

Semantic Memory: Bayesian approach. Generative model for a document: gist → topics → words (g → z → w).

Semantic Memory
z = topics, e.g., finance, English countryside, etc.
Gist: a mixture of topics; P(z|g) gives the mixing proportions.
Some documents might be 0.9 finance, 0.1 English countryside (e.g., the wheat market): P(z=finance|g_1) = 0.9, P(z=English countryside|g_1) = 0.1.
Others might be 0.2 finance, 0.8 English countryside (e.g., Lloyds CEO buys a mansion): P(z=finance|g_2) = 0.2, P(z=English countryside|g_2) = 0.8.

Semantic Memory: Bayesian approach. Generative model for a document: gist → topics → words (g → z → w).

Semantic Memory
Topic z_1 = finance. Words, P(w|z_1): bank 0.01, money, meadow 0.0, ...
Topic z_2 = English countryside. Words, P(w|z_2): bank, money, meadow, ...
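
A small runnable sketch of the generative process described on these slides (gist, then topics, then words), with made-up vocabularies and probabilities in the spirit of the finance / countryside example; none of the numbers below come from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Each topic z defines a probability distribution over words, P(w|z).
vocab = ["bank", "money", "loan", "meadow", "river", "sheep"]
topics = {
    "finance":     np.array([0.30, 0.30, 0.30, 0.02, 0.03, 0.05]),
    "countryside": np.array([0.15, 0.02, 0.03, 0.30, 0.30, 0.20]),
}

def generate_document(gist, n_words=20):
    """The gist gives mixing proportions P(z|g); each word draws a topic, then a word."""
    names = list(topics)
    mix = np.array([gist[name] for name in names])
    words = []
    for _ in range(n_words):
        z = names[rng.choice(len(names), p=mix)]        # draw a topic from the gist
        w = vocab[rng.choice(len(vocab), p=topics[z])]  # draw a word from that topic
        words.append(w)
    return words

# A "wheat market" style document: mostly finance, a little countryside.
print(generate_document({"finance": 0.9, "countryside": 0.1}))
# A "Lloyds CEO buys a mansion" style document: mostly countryside.
print(generate_document({"finance": 0.2, "countryside": 0.8}))
```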

The gist is shared within a document, but the topics can vary from one sentence (or even one word) to the next.

Semantic Memory
Problem: we only observe the words, not the topics or the gist...
How do we know how many topics and how many gists to pick to account for a corpus of words, and how do we estimate their probabilities?
To pick the number of topics and gists: Chinese restaurant process, Dirichlet process, and hierarchical Dirichlet process, with MCMC sampling.
Use techniques like EM to learn the probabilities of the latent variables (topics and gists).
However, a human is still needed to label the topics...

Semantic Memory (figure: lists of high-probability words in Topic 1, Topic 2, and Topic 3)

Semantic Memory: Bayesian approach. Generative model for a document: gist → topics → words (g → z → w).

Semantic Memory
Problems we may want to solve:
Prediction, P(w_{n+1}|w): what is the next word?
Disambiguation, P(z|w): what is the mixture of topics in this document?
Gist extraction, P(g|w): what is the probability distribution over gists?
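
Under the generative model above, each of these quantities follows from Bayes' rule and marginalization over the latent variables; for example, assuming the next word's topic is drawn from the same gist as the observed words, prediction can be written as:

```latex
P(w_{n+1} \mid \mathbf{w}) \;=\; \sum_{z_{n+1}} P(w_{n+1} \mid z_{n+1}) \int P(z_{n+1} \mid g)\, P(g \mid \mathbf{w})\, dg .
```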

Semantic Memory What we need is a representation of P(w,z,g)

Semantic Memory. P(w,z,g) is given by the generative model: gist → topics → words (g → z → w).
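
Concretely, for a document w = (w_1, ..., w_n) with one topic assignment z_i per word (per-word topics are consistent with the earlier remark that topics can vary from word to word), the g → z → w model factorizes as:

```latex
P(\mathbf{w}, \mathbf{z}, g) \;=\; P(g)\, \prod_{i=1}^{n} P(z_i \mid g)\, P(w_i \mid z_i).
```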

Semantic Memory
This explains semantic interference in word lists: predictions will tend to favor words that are semantically related to the list through the topics and gists.
It also captures the fact that a given word can have different meanings (topics and gists) depending on the context.

(figure: predicting the next word from the word being observed under the Countryside and Finance topics; "money" is less likely to be seen if the topic is countryside)