1
Simple recurrent networks.
2
Today
What’s so great about backpropagation learning?
Temporal structure and simple recurrent networks: basic concepts
Examples applied in language and cognitive neuroscience
3
What’s so great about backpropagation learning?
4
[Figure: a single output unit (3) receiving Input 1 and Input 2 through weights w1 and w2, plus a bias weight b on a constant input of 1]
$a_3 = b + a_1 w_1 + a_2 w_2$
5
[Figure: small feed-forward network with input units 1 and 2, hidden units 3 and 4, output unit 5, and weights w1–w6]
$net_j = \sum_i a_i w_{ij}$
$a_j = \frac{1}{1 + e^{-net_j}}$
$\Delta w_{ij} = \alpha \,(t_j - a_j) \times a_j(1 - a_j) \times a_i$
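As a concrete reading of these formulas, here is a minimal NumPy sketch (my own illustration, not code from the slides) of the logistic unit and the output-layer delta-rule update; the shapes, learning rate, and toy example are illustrative:

```python
# Logistic unit and delta-rule update, matching the equations above.
import numpy as np

def forward(a_in, w):
    """net_j = sum_i a_i * w_ij ;  a_j = 1 / (1 + exp(-net_j))"""
    net = a_in @ w
    return 1.0 / (1.0 + np.exp(-net))

def delta_rule_update(w, a_in, a_out, target, alpha=0.1):
    """delta w_ij = alpha * (t_j - a_j) * a_j * (1 - a_j) * a_i"""
    delta = (target - a_out) * a_out * (1.0 - a_out)   # error x derivative
    return w + alpha * np.outer(a_in, delta)

# toy example: two inputs, one output unit
a_in  = np.array([1.0, 0.0])
w     = np.zeros((2, 1))
a_out = forward(a_in, w)
w     = delta_rule_update(w, a_in, a_out, target=np.array([1.0]))
```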
6
[Figure: the four input patterns 00, 01, 10, 11 plotted in input space and in the hidden space spanned by units 3 and 4; the hidden layer rearranges the patterns]
7
So backpropagation in a deep network (1 or more hidden layers) allows the network to learn to re-represent the structure provided in the input in ways that suit the task…
8
Rumelhart and Todd (1993)
[Figure: network architecture with Item inputs (pine, oak, rose, daisy, robin, canary, sunfish, salmon) and Context inputs (ISA, is…, can…, has…) feeding through internal layers to Attribute outputs (living, pretty, green, big, small, bright, dull…, grow, fly, sing, swim…, leaves, roots, petals, wings, feathers, scales, gills…)]
9
[Figure: learned internal representations of pine, oak, rose, daisy, sunfish, salmon, robin, canary after 150 epochs of training]
10
Rumelhart and Todd (1993)
[Figure: the same Item (pine … salmon) and Context (ISA, is…, can…, has…) to Attributes (living … gills…) architecture, shown again]
11
Patterns that emerge in the hidden layer will be influenced by overlap in both the input units and the output units…
Generalization of learning from one item to another depends on how similar the items are in the learned internal representation…
So backprop learning provides one way of thinking about how appropriately structured internal representations might be acquired.
12
Surprise test! Spell MISSISSIPPI
13
Human behavior occurs over time
Speaking and comprehending
Making a cup of coffee
Walking
Seeing
Essentially everything
How does any information processing system cope with time?
14
[Figure: a slot-based scheme for spelling: positions 1–11, each with a full A–Z letter bank, mapping visual input to spoken output]
15
[Figure: the words LOG, SUN, GLAD, SWAM, SPLIT arranged in letter slots two ways: left-justified vs. vowel-centred]
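To make the difference between the two alignment schemes concrete, here is a small sketch; the slot count and padding character are my own choices:

```python
# Two ways of assigning a word's letters to a fixed bank of slots:
# flush left, or centred on the word's first vowel.
def left_justified(word, n_slots=7):
    return list(word) + ["_"] * (n_slots - len(word))

def vowel_centred(word, n_slots=7, centre=3):
    first_vowel = next(i for i, ch in enumerate(word) if ch in "AEIOU")
    slots = ["_"] * n_slots
    for i, ch in enumerate(word):
        slots[centre - first_vowel + i] = ch   # align the vowel to the centre slot
    return slots

for w in ["LOG", "SUN", "GLAD", "SWAM", "SPLIT"]:
    print(left_justified(w), vowel_centred(w))
```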
16
“Jordan” network
[Figure: the network’s output at t−1 is copied back (COPY, t−1) as additional input]
17
Simple Recurrent Network (SRN)
[Figure: SRN architecture: the current input (t) and a copy of the hidden representation at t−1 feed the hidden units, which produce the current output (t)]
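A minimal sketch of the SRN step in the diagram, assuming an Elman-style network with logistic units; the layer sizes, weight scales, and names below are illustrative:

```python
# Elman-style SRN: the previous hidden state is copied and fed back
# alongside the current input at every time step.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SRN:
    def __init__(self, n_in, n_hid, n_out, rng=np.random.default_rng(0)):
        self.W_in  = rng.normal(0, 0.5, (n_in, n_hid))   # input  -> hidden
        self.W_ctx = rng.normal(0, 0.5, (n_hid, n_hid))  # copy   -> hidden
        self.W_out = rng.normal(0, 0.5, (n_hid, n_out))  # hidden -> output
        self.context = np.zeros(n_hid)                   # hidden rep at t-1

    def step(self, x):
        h = sigmoid(x @ self.W_in + self.context @ self.W_ctx)
        y = sigmoid(h @ self.W_out)
        self.context = h.copy()      # "COPY (t-1)" for the next time step
        return y

net = SRN(n_in=26, n_hid=20, n_out=26)
y = net.step(np.eye(26)[0])          # present one letter as a one-hot vector
```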
18
XOR in time!
[Figure: an SRN applied to a bit stream, with the hidden layer copied back from t−1 (COPY, t−1)]
19
The network can combine the current input with a “memory” of the previous input to predict the next input…
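One concrete version of this is the temporal XOR stream: two random bits are always followed by their XOR, so only every third bit is predictable. A sketch of how such a stream might be generated (illustrative, not the original stimuli):

```python
# Temporal XOR stream: bit t+2 is the XOR of bits t and t+1.
import numpy as np

def xor_stream(n_triples, rng=np.random.default_rng(1)):
    bits = []
    for _ in range(n_triples):
        a, b = rng.integers(0, 2, size=2)
        bits += [a, b, a ^ b]        # the third bit is fully determined
    return np.array(bits, dtype=float)

stream = xor_stream(1000)
# Train an SRN to predict stream[t+1] from stream[t]; prediction error should
# drop on every third bit once the hidden copy carries the first bit forward.
```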
20
What about “longer” memory, more complex patterns?
ba dii guuu
21
[Figure: prediction error plotted against time step]
22
Simple Recurrent Network (SRN)
[Figure: the SRN architecture again: current input (t), copied hidden representation (t−1), hidden units, current output (t)]
23
SRN can “hold on” to prior information over many time steps!
24
But that’s not all…
25
[Figure: prediction error plotted against time step]
26
[Figure: value of the “consonantal” feature over the sequence (high = consonantal)]
27
SRN “knows” about consonants, sort of!
28
Learning word boundaries
15 words
200 sentences (4–9 words in length)
Sentences broken into a continuous stream of letters
Each letter is a random bit vector
Task: predict the next letter
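A sketch of how such training data might be built, with made-up words and random letter codes standing in for the original stimuli:

```python
# Concatenate sentences into one unbroken letter stream; each letter is a
# fixed random bit vector, and the target is simply the next letter's code.
import numpy as np

rng = np.random.default_rng(2)
letters = sorted(set("manydogseatbiscuits"))             # hypothetical alphabet
codes = {ch: rng.integers(0, 2, size=5).astype(float)    # random bit vector per letter
         for ch in letters}

sentences = ["many dogs eat", "dogs eat biscuits"]       # hypothetical sentences
stream = "".join(s.replace(" ", "") for s in sentences)  # no word boundaries kept

X = np.stack([codes[ch] for ch in stream[:-1]])   # input: current letter
Y = np.stack([codes[ch] for ch in stream[1:]])    # target: next letter
# An SRN trained on (X, Y) shows high error at word onsets and falling error
# within words; the error profile itself marks the word boundaries.
```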
29
Maybe prediction error can serve as a cue for parsing the speech stream (and other sequential behaviors!)
30
What about word categories?
32
Localist representation of 31 words
A series of 2–3-word sentences strung together: woman smash plate cat move man break car boy move girl…
Task: predict the next word…
33
Simple Recurrent Network (SRN)
[Figure: the SRN architecture again: current input (t), copied hidden representation (t−1), hidden units, current output (t)]
35
ZOG break car Zog eat Zog like girl ZOG chase mouse…
37
Words in context…
38
The SRN acquires representational structure based on implicit temporal prediction…
Items are represented as similar when they are preceded or followed by similar distributions of items.
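The similarity analysis behind this claim can be sketched as follows, assuming we have logged each word’s hidden-layer vector during a test pass; the `hidden_log` structure and the toy vectors below are hypothetical:

```python
# Average the hidden-layer vector recorded each time a word occurs, then
# compare words by the cosine similarity of their averaged vectors.
import numpy as np
from collections import defaultdict

def average_reps(hidden_log):
    sums, counts = defaultdict(float), defaultdict(int)
    for word, h in hidden_log:
        sums[word] = sums[word] + h
        counts[word] += 1
    return {w: sums[w] / counts[w] for w in sums}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# toy illustration with made-up hidden vectors
log = [("cat", np.array([0.9, 0.1])), ("dog", np.array([0.8, 0.2])),
       ("plate", np.array([0.1, 0.9]))]
reps = average_reps(log)
print(cosine(reps["cat"], reps["dog"]))   # words with similar contexts cluster
```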
39
Example in cognitive neuroscience
Hypothesis from the literature: events are parsed when transition probabilities are low… Alternative hypothesis…
41
Example in cognitive neuroscience
Hypothesis from the literature: events are parsed when transition probabilities are low… Alternative hypothesis: events are parsed at transitions between communities.
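As a hedged illustration of why the two hypotheses come apart, here is a sketch of a random walk on a two-community graph in which every transition has the same probability (0.25); the graph is my own simplification of community-structure designs:

```python
# Random walk on a graph with two communities where all transitions are
# equally probable, so low transition probability cannot mark boundaries,
# but switches between communities still can.
import numpy as np

rng = np.random.default_rng(3)

# Two communities of five nodes (0-4 and 5-9): fully connected within a
# community except that the two "boundary" nodes are not linked to each
# other; each boundary node instead links to a boundary node of the other
# community. Every node has exactly four neighbours, so every p = 0.25.
edges = {n: [m for m in range(5) if m != n] for n in range(5)}
edges.update({n: [m for m in range(5, 10) if m != n] for n in range(5, 10)})
edges[0].remove(4); edges[4].remove(0); edges[5].remove(9); edges[9].remove(5)
edges[0].append(9); edges[9].append(0); edges[4].append(5); edges[5].append(4)

community = {n: "A" if n < 5 else "B" for n in range(10)}
node, walk = 0, [0]
for _ in range(30):
    node = int(rng.choice(edges[node]))    # uniform choice: flat transition probs
    walk.append(node)

switches = [i for i in range(1, len(walk))
            if community[walk[i]] != community[walk[i - 1]]]
print(walk, switches)   # boundaries align with community switches, not with
                        # any unusually low transition probability
```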
45
So SRNs…
Can implicitly learn the structure of temporally extended sequences…
Can provide graded cues for “segmenting” such sequences…
Can learn similarities among items that participate in such sequences, based on the similarity of their temporal contexts.
46
Example applications
Working memory: n-back, serial order, AX task
Language parsing
Sequential word processing
Sentence processing
Routine sequential action