1
Complementary Learning Systems
McClelland, McNaughton & O’Reilly, 1995
McClelland & Goddard, 1996
Anthony Cate
March 22, 2001
2
Hippocampal Amnesic Syndrome
Patient HM
Bilateral medial temporal lobectomy
Still alive
3
Patient HM - Deficits
Explicit memory for events and episodes
New facts (arbitrary)
Paired Associate Learning: “locomotive-dishtowel”
4
Patient HM – Preserved Abilities
Motor skill acquisition
Implicit learning - priming
Memories from before damage, with a qualification…
5
Temporally Graded Retrograde Amnesia
6
The sea monster in your head:
7
Hippocampal Anatomy
Archicortex: phylogenetically older than neocortex, with fewer layers
Much smaller than neocortex, especially in humans
8
Inputs from diffuse areas of neocortex converge on hippocampus (via the Entorhinal Cortex)
9
Hippocampus forms a simple circuit:
10
Hippocampus encodes arbitrary conjunctions
11
Hippocampus reproduces patterns during sleep
12
The McC, McN & O’R model
Information processing takes place via the propagation of activation among neurons in the neocortical system
13
The McC, McN & O’R model
Implicit learning results from small changes to connections among neurons active in each episode of processing
Single processing episodes produce item-specific effects
Skills arise through accumulation of episodes
14
The McC, McN & O’R model
New arbitrary associations are based on learning in the hippocampal system that occurs alongside processing in the cortex
Bidirectional connections between neocortex and hippocampal system
Large connection changes in hippocampus for fast learning
Recall of new associations depends on pattern completion in the hippocampus
15
The McC, McN & O’R model
Consolidation of a memory occurs through the accumulation of small weight changes in neocortical connections
These changes are spread over time
They result from repeated reinstatements of the hippocampal memory to the neocortex
This may happen when a memory is retrieved from the hippocampal system, and during sleep.
16
Discovery of Structure
How the cortex learns about the world: Discovery of Structure via Interleaved Learning
Rumelhart 1990: train a multi-layer network on a set of propositions about things in the world
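As a rough illustration (not Rumelhart’s actual 1990 simulation), a minimal sketch of this kind of network might look like the following: item and relation inputs are mapped through a “concept representation” layer and a hidden layer to attribute outputs, and the whole proposition set is trained by backprop in an interleaved fashion. The items, relations, attributes, layer sizes, and learning rate are all toy assumptions.

```python
# A toy Rumelhart-style semantic network (illustrative items, sizes, and rate;
# not the original 1990 simulation): item and relation inputs are mapped through
# a "concept representation" layer and a hidden layer to attribute outputs, and
# the whole proposition set is trained in an interleaved fashion by backprop.
import numpy as np

rng = np.random.default_rng(0)

items      = ["canary", "robin", "salmon", "sunfish", "oak", "pine"]
relations  = ["isa", "can", "has"]
attributes = ["bird", "fish", "tree", "fly", "swim", "grow", "wings", "gills", "bark"]

propositions = [
    ("canary",  "isa", {"bird"}), ("canary",  "can", {"fly", "grow"}),  ("canary", "has", {"wings"}),
    ("robin",   "isa", {"bird"}), ("robin",   "can", {"fly", "grow"}),
    ("salmon",  "isa", {"fish"}), ("salmon",  "can", {"swim", "grow"}), ("salmon", "has", {"gills"}),
    ("sunfish", "isa", {"fish"}), ("sunfish", "can", {"swim", "grow"}),
    ("oak",     "isa", {"tree"}), ("oak",     "can", {"grow"}),         ("oak",    "has", {"bark"}),
    ("pine",    "isa", {"tree"}), ("pine",    "can", {"grow"}),
]

def one_hot(name, names):
    v = np.zeros(len(names))
    v[names.index(name)] = 1.0
    return v

def target_vector(attrs):
    return np.array([1.0 if a in attrs else 0.0 for a in attributes])

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

n_concept, n_hidden, lr = 8, 15, 0.05          # slow learning rate
W_ic = rng.normal(0, 0.1, (len(items), n_concept))
W_ch = rng.normal(0, 0.1, (n_concept + len(relations), n_hidden))
W_ha = rng.normal(0, 0.1, (n_hidden, len(attributes)))

for epoch in range(3000):                      # interleaved: every proposition, every epoch
    for item, rel, attrs in propositions:
        i, r, t = one_hot(item, items), one_hot(rel, relations), target_vector(attrs)
        c = sigmoid(i @ W_ic)                  # concept representation units
        h = sigmoid(np.concatenate([c, r]) @ W_ch)
        a = sigmoid(h @ W_ha)                  # predicted attributes
        d_a = (a - t) * a * (1 - a)            # backprop of squared error
        d_h = (d_a @ W_ha.T) * h * (1 - h)
        d_c = (d_h @ W_ch.T)[:n_concept] * c * (1 - c)
        W_ha -= lr * np.outer(h, d_a)
        W_ch -= lr * np.outer(np.concatenate([c, r]), d_h)
        W_ic -= lr * np.outer(i, d_c)

for item in items:                             # items from the same category get similar codes
    print(item, np.round(sigmoid(one_hot(item, items) @ W_ic), 2))
```

With interleaved presentation and the slow rate, the printed concept-layer codes should gradually differentiate so that birds resemble birds, fish resemble fish, and trees resemble trees, which is the graded structure discussed on the next slides.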
17
An intuitive view of conceptual structure:
18
Rumelhart 1990
19
How does the model discover this structure?
The concept representation units exploit similarities between patterns
The most efficient way to reduce error is by grouping inputs with similar outputs on the same concept unit
20
With interleaved learning at a slow rate, concept unit representations differentiate
21
These patterns of activation describe a hierarchical structure!
22
Points from Rumelhart’s model
Interleaved learning allows a network to discover structure in an environment of patterns
A “hierarchical” relationship can be found in this structure, based on the graded similarity of particular patterns.
23
Rumelhart’s model depends on:
Interleaved learning
A slow learning rate
A set of inputs that overlap in similarity
What if learning had to take place in a situation where none of these applied?
24
Failures of this architecture (why a neocortex alone is insufficient):
One-shot learning (= high learning rate)
Focused learning (= non-interleaved training)
25
Catastrophic Interference!
26
Modeling Paired Associate Learning
McCloskey & Cohen 1989
AB-AC paradigm (train the A-B pairs first, then A-C pairs using the same A cues):
List A:       List B:       List C:
locomotive    dishtowel     seltzer
table         pinecone      headphones
weasel        jacket        waterfall
…             …             …
27
A simple model for paired associate learning:
28
The model completely fails to retain the AB associations once the AC associations are learned.
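To make the failure concrete, here is a minimal sketch in the spirit of the McCloskey & Cohen demonstration. The random vector codes, network sizes, list-context units, and training lengths are illustrative assumptions, not the original model.

```python
# Minimal sketch of AB-AC interference: a small backprop network learns A-B
# pairs to low error, is then trained only on A-C pairs, and A-B recall is
# re-tested. Codes, sizes, and the two "list context" units are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_word, n_ctx, n_hid, n_out, lr = 20, 2, 30, 20, 0.2

n_pairs = 8
A = rng.integers(0, 2, (n_pairs, n_word)).astype(float)   # cue words (e.g. "locomotive")
B = rng.integers(0, 2, (n_pairs, n_out)).astype(float)    # first-list associates ("dishtowel")
C = rng.integers(0, 2, (n_pairs, n_out)).astype(float)    # second-list associates ("seltzer")

def with_context(cues, ctx):
    # Append a list-context code so the two lists are distinguishable in principle.
    return np.hstack([cues, np.tile(ctx, (len(cues), 1))])

AB_cues = with_context(A, np.array([1.0, 0.0]))
AC_cues = with_context(A, np.array([0.0, 1.0]))

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
W1 = rng.normal(0, 0.3, (n_word + n_ctx, n_hid))
W2 = rng.normal(0, 0.3, (n_hid, n_out))

def train(cues, targets, epochs):
    global W1, W2
    for _ in range(epochs):
        for x, t in zip(cues, targets):
            h = sigmoid(x @ W1)
            y = sigmoid(h @ W2)
            d_y = (y - t) * y * (1 - y)        # backprop of squared error
            d_h = (d_y @ W2.T) * h * (1 - h)
            W2 -= lr * np.outer(h, d_y)
            W1 -= lr * np.outer(x, d_h)

def recall_error(cues, targets):
    return np.abs(sigmoid(sigmoid(cues @ W1) @ W2) - targets).mean()

train(AB_cues, B, epochs=2000)                 # focused training on the A-B list
print("A-B error after A-B training:", recall_error(AB_cues, B))
train(AC_cues, C, epochs=2000)                 # then focused training on the A-C list
print("A-B error after A-C training:", recall_error(AB_cues, B))   # A-B recall collapses
```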
29
Catastrophic interference is a general phenomenon that occurs in complex models as well.
Training a network on a set of patterns with backprop and similar algorithms guarantees optimal weights for that set
Guarantees nothing about other patterns
30
Catastrophic interference in Rumelhart’s model:
31
Catastrophic interference can be overcome by changing a model’s architecture or the set of training patterns
Use patterns with less overlap
Weight changes for each pattern won’t affect the network’s performance on other patterns
32
But maybe we don’t want to make these changes.
Problem with less overlapping patterns:
No way to extract structure, because by definition there is no structure in the pattern set
Structure = covariations
33
Another problem with less overlap:
Less generalization
A novel input will never be very similar to trained patterns, so the network cannot produce an appropriate output
We never see exactly the same thing twice
34
How arbitrary are real world exceptions and events?
“Pint”: most of the phonemes are regular
JFK assassination example: most of what we know about an event is drawn from existing associations in memory
35
The role of the hippocampus
Encodes patterns in a sparse, non-overlapping manner
Trains cortex by reinstating these patterns repeatedly over time.
36
A simple model:
In this model, hippocampal & cortical representations are the same
Consolidation rate “C” determines the rate at which hippocampal units feed patterns to cortical units
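One possible reading of this simple model, sketched with made-up rates: the hippocampus reinstates a stored pattern to the cortex with probability C per day, each reinstatement slightly strengthens the cortical trace, and the hippocampal trace decays. Lesioning the hippocampus at different delays then yields the temporally graded retrograde amnesia from the earlier slides.

```python
# A minimal sketch of consolidation at rate C (an assumption-laden reading of
# the slide's "simple model", not the paper's actual simulation). All rates
# below are illustrative.
import numpy as np

rng = np.random.default_rng(2)
C           = 0.1     # consolidation rate: chance per day of a hippocampal reinstatement
cortex_lr   = 0.05    # small cortical strength gain per reinstatement
hippo_decay = 0.02    # daily decay of the hippocampal trace

def trace_strengths(days_before_lesion):
    hippo, cortex = 1.0, 0.0
    for _ in range(days_before_lesion):
        if rng.random() < C:                       # hippocampal reinstatement event
            cortex += cortex_lr * (1.0 - cortex)   # cortical trace creeps toward 1.0
        hippo = max(0.0, hippo - hippo_decay)      # hippocampal trace fades
    return hippo, cortex

for days in (1, 10, 30, 60, 120):
    hippo, cortex = trace_strengths(days)
    print(f"memory {days:3d} days old | intact recall ~ {max(hippo, cortex):.2f} | "
          f"after hippocampal lesion ~ {cortex:.2f}")
```

Older memories have had more reinstatements, so they survive the simulated lesion better, while very recent memories are lost: a temporally graded pattern.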
37
The role of the hippocampus
Sparse coding allows a high learning rate, since learning one pattern won’t interfere with weights for another pattern
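A toy illustration of this point (sizes and coding densities are arbitrary assumptions): with one-shot Hebbian learning of several cue-target pairs, recall of the first pair survives when the codes are sparse and essentially non-overlapping, but is corrupted when the codes are dense and overlapping.

```python
# Toy demonstration: one-shot Hebbian learning of several cue -> target pairs,
# comparing recall of the first pair with dense, overlapping codes versus
# sparse, essentially non-overlapping codes. Sizes and densities are arbitrary.
import numpy as np

rng = np.random.default_rng(4)
n = 200

def random_code(density):
    return (rng.random(n) < density).astype(float)

def hebbian_store(pairs):
    # One big Hebbian weight change per pair, i.e. a high learning rate.
    W = np.zeros((n, n))
    for cue, target in pairs:
        W += np.outer(target, cue)
    return W

def recall_accuracy(W, cue, target):
    out = (W @ cue > 0).astype(float)      # threshold the weighted sums
    return (out == target).mean()

for density, label in [(0.5, "dense, overlapping codes "), (0.03, "sparse, separated codes  ")]:
    pairs = [(random_code(density), random_code(density)) for _ in range(5)]
    W = hebbian_store(pairs)
    cue1, tgt1 = pairs[0]
    print(label, "recall of first pair:", recall_accuracy(W, cue1, tgt1))
```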
38
A problem with sparse coding:
There are many fewer cells in the hippocampus than in the neocortex
How can the hippocampus encode patterns in a way that is both sparse and compressed?
39
Sparse coding and hidden unit activity:
Coarse coding Sparse coding Really sparse coding
40
Really, really sparse coding:
41
Random, conjunctive coding
k-winners-take-all: the weights between the k most active units and a randomly selected hidden unit are increased
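One way to read this scheme, sketched with assumed layer sizes and k: give each hippocampal unit random connections from the input layer, let only the k most strongly driven units fire for a given pattern, and optionally strengthen the weights from the active inputs to those winners. Two highly overlapping inputs then receive noticeably less overlapping hippocampal codes.

```python
# A minimal sketch of random, conjunctive coding with k-winners-take-all (one
# reading of the slide; layer sizes, k, and the Hebbian step are assumptions).
import numpy as np

rng = np.random.default_rng(3)
n_ec, n_dg, k = 100, 1000, 20          # "entorhinal" inputs, "dentate" units, number of winners

W = rng.random((n_ec, n_dg))           # random input -> hippocampal connections

def kwta_code(pattern, learn=False, lr=0.1):
    act = pattern @ W                      # each hippocampal unit sums a random conjunction of inputs
    winners = np.argsort(act)[-k:]         # the k most active units win; the rest are silenced
    code = np.zeros(n_dg)
    code[winners] = 1.0
    if learn:                              # optional Hebbian step: active inputs -> winners strengthened
        W[np.ix_(pattern > 0, winners)] += lr
    return code

def overlap(a, b):
    return (a * b).sum() / max(a.sum(), 1.0)

# Two highly overlapping cortical input patterns (they differ on only 10 of 100 units)...
p1 = (rng.random(n_ec) < 0.3).astype(float)
p2 = p1.copy()
flip = rng.choice(n_ec, size=10, replace=False)
p2[flip] = 1.0 - p2[flip]

c1, c2 = kwta_code(p1), kwta_code(p2)
print("overlap of input patterns:   ", overlap(p1, p2))   # high
print("overlap of hippocampal codes:", overlap(c1, c2))   # lower: pattern separation
```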
42
Implementing this in a model of the hippocampal system:
Assume that cortex employs componential coding -- efficient, good for generalization:
Letters from a 9x9 pixel array can be encoded with 13 feature units, instead of 81 pixel units
43
Much overlap between patterns
With componential coding, about 34% of the 13 input units will be active to represent a given letter
With the 9x9 = 81, one-pixel-per-unit system, about 28% of input units are active
Much overlap between patterns
This is good for generalization
Bad for pattern separation
44
Really, really sparse coding:
Have every possible combination of 3 input units correspond to 1 hidden unit
For an input consisting of 5 letters (81 pixels each, 405 input units in all), there are over 10 million such triples!
A 5-letter input would activate about 230,000 of these triples
Only 2% of possible input triples active – much more sparse
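The slide’s counts can be checked directly, assuming 5 letters of 81 pixels each (405 input units) with about 28% of them active:

```python
# Quick check of the slide's counts, assuming 5 letters x 81 pixels = 405 input
# units, of which about 28% (~113) are active for a given 5-letter string.
from math import comb

total_units  = 5 * 81                       # 405 input units
active_units = round(0.28 * total_units)    # ~113 active units

all_triples    = comb(total_units, 3)       # every possible 3-unit conjunction
active_triples = comb(active_units, 3)      # triples whose 3 units are all active

print(all_triples)                          # ~11 million ("over 10 million")
print(active_triples)                       # ~230,000
print(active_triples / all_triples)         # ~0.02 (about 2% of triples active)
```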
46
Sparse hippocampal representation
“Compressed” representation from cortex
Cortical representation
47
How this model maps onto anatomy:
48
Plausibility of this encoding scheme:
Many more cells in the Dentate Gyrus (part of the hippocampal formation) than in the Entorhinal Cortex
Autoassociator (via recurrent collaterals) and Hebbian pattern associator also present in the basic hippocampal circuit
49
In short:
Representations in cortex compressed via connections with the Entorhinal Cortex
These coarse, componential representations are made sparse (random, conjunctive coding) via connections with the hippocampus proper
First compress, then sparsify
50
Problems with these models:
Does the hippocampus really encode all kinds of arbitrary associations, or just spatial maps?
Cortical learning implies that only a prototype is stored in memory -- no information about individual training events in cortex
51
In summary: Important pairs of concepts:
Interleaved / Focused (training)
Slow / Fast (learning rates)
Coarse / Sparse (representations)