Simple recurrent networks.


1 Simple recurrent networks.

2 Today
What's so great about backpropagation learning?
Temporal structure and simple recurrent networks: basic concepts
Example applications in language and cognitive neuroscience

3 What’s so great about backpropagation learning?

4 [Figure: Input 1 and Input 2 feed Output unit 3 through weights w1 and w2, plus a bias weight b from a unit fixed at 1.]
$a_3 = b_w + a_1 w_1 + a_2 w_2$

5 [Figure: a 2-2-1 network; input units 1 and 2 feed hidden units 3 and 4 through weights w1–w4, and the hidden units feed output unit 5 through weights w5 and w6.]
$net_j = \sum_i a_i w_{ij}$
$a_j = \frac{1}{1 + e^{-net_j}}$
$\Delta w_{ij} = \alpha \, (t_j - a_j) \, a_j (1 - a_j) \, a_i$
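As a concrete illustration of these formulas, here is a minimal numpy sketch of the forward pass and the output-layer weight update shown above (the 2-2-1 layout, learning rate, and initial weights are illustrative assumptions, not values from the slides):

```python
import numpy as np

def logistic(net):
    # a_j = 1 / (1 + exp(-net_j))
    return 1.0 / (1.0 + np.exp(-net))

# Illustrative 2-2-1 network: inputs (units 1, 2) -> hidden (3, 4) -> output (5).
rng = np.random.default_rng(0)
W_ih = rng.uniform(-0.5, 0.5, size=(2, 2))   # w1..w4: input -> hidden
W_ho = rng.uniform(-0.5, 0.5, size=(2, 1))   # w5, w6: hidden -> output
alpha = 0.5                                  # learning rate

x = np.array([1.0, 0.0])   # input activations a_i
t = np.array([1.0])        # target t_j for the output unit

# Forward pass: net_j = sum_i a_i * w_ij, then the logistic squash.
hidden = logistic(x @ W_ih)
output = logistic(hidden @ W_ho)

# Output-layer update from the slide:
# delta_w_ij = alpha * (t_j - a_j) * a_j * (1 - a_j) * a_i
delta = (t - output) * output * (1 - output)
W_ho += alpha * np.outer(hidden, delta)
```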

6 [Figure: the four input patterns 00, 01, 10, 11 plotted in input space, and the corresponding activations of hidden Units 3 and 4 plotted in hidden space.]

7 So backpropagation in a deep network (1 or more hidden layers) allows the network to learn to re-represent the structure provided in the input in ways that suit the task…

8 Rumelhart and Todd (1993)
[Figure: network with Item input units (pine, oak, rose, daisy, robin, canary, sunfish, salmon) and Context input units (ISA, is…, can…, has…) feeding Attribute output units (living, pretty, green, big, small, bright, dull, grow, fly, sing, swim, leaves, roots, petals, wings, feathers, scales, gills…).]

9 [Figure: the learned item representations at Epoch 150 for Pine, Oak, Rose, Daisy, Sunfish, Salmon, Robin, Canary.]

10 Rumelhart and Todd (1993)
[Figure: the same network, with Item units (pine, oak, rose, daisy, robin, canary, sunfish, salmon), Context units (ISA, is…, can…, has…), and Attribute units (living, pretty, green, big, small, bright, dull, move, fly, sing, swim, leaves, roots, petals, wings, feathers, scales, gills…).]

11 Patterns that emerge in the hidden layer will be influenced by overlap in both the input units and the output units… Generalization of learning from one item to another depends on how similar the items are in the learned internal representation… So backprop learning provides one way of thinking about how appropriately structured internal representations might be acquired.

12 Surprise test! Spell MISSISSIPPI

13 Human behavior occurs over time
Speaking and comprehending, making a cup of coffee, walking, seeing… essentially everything. How does any information-processing system cope with time?

14 [Figure: eleven numbered slots (1, 2, 3, 4, 5, 6, … 11), each holding a full bank of letter units A, B, C, D, …, Z, for the visual input and for the spoken output.]

15 [Figure: the words LOG, SUN, GLAD, SWAM, SPLIT arranged in letter slots two ways: left-justified vs. vowel-centred.]

16 “Jordan” network
[Figure: a COPY (t-1) connection feeds the previous time step's output back in as input.]

17 Simple Recurrent Network (SRN)
[Figure: the current input (t) and a copy of the hidden representation at t-1 both feed the hidden units, which produce the current output (t).]
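To make the architecture concrete, here is a minimal numpy sketch of an Elman-style SRN forward pass (layer sizes, weight ranges, and the class name SRN are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

class SRN:
    """Minimal Elman-style simple recurrent network (illustrative sizes)."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, (n_in, n_hidden))       # input -> hidden
        self.W_ctx = rng.uniform(-0.5, 0.5, (n_hidden, n_hidden))  # context -> hidden
        self.W_out = rng.uniform(-0.5, 0.5, (n_hidden, n_out))     # hidden -> output
        self.context = np.zeros(n_hidden)  # copy of the hidden rep at t-1

    def step(self, x):
        # Hidden units see the current input (t) and the copied hidden rep (t-1).
        hidden = logistic(x @ self.W_in + self.context @ self.W_ctx)
        output = logistic(hidden @ self.W_out)
        self.context = hidden.copy()  # the "COPY (t-1)" link
        return hidden, output

# Usage: feed a sequence one element at a time.
net = SRN(n_in=5, n_hidden=10, n_out=5)
for x in np.eye(5):                  # a toy sequence of one-hot inputs
    hidden, prediction = net.step(x)
```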

18 XOR in time!
[Figure: an SRN with a COPY (t-1) loop, predicting the next bit of a binary (1/0) sequence.]
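A sketch of how the "XOR in time" training stream can be built, assuming the standard temporal-XOR setup (two random bits followed by their XOR, so only every third bit is predictable from what came before):

```python
import numpy as np

rng = np.random.default_rng(1)

def temporal_xor_stream(n_triples):
    # Each triple is: random bit, random bit, XOR of the two.
    bits = []
    for _ in range(n_triples):
        b1, b2 = rng.integers(0, 2, size=2)
        bits.extend([b1, b2, b1 ^ b2])
    return np.array(bits)

stream = temporal_xor_stream(1000)
# Training pairs for next-bit prediction: input = bits[t], target = bits[t+1].
inputs, targets = stream[:-1], stream[1:]
```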

19 Network can combine current input plus
“memory” of previous input to predict next input…

20 What about “longer” memory, more complex patterns?
ba dii guuu
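A sketch of how a "ba dii guuu" stream can be generated, assuming the usual setup in which each consonant is followed by a fixed number of copies of its vowel (b followed by a, d by ii, g by uuu), so the consonants are unpredictable but the vowels that follow them are not:

```python
import random

# Assumed consonant-to-vowel mapping: b -> "a", d -> "ii", g -> "uuu".
SYLLABLES = {"b": "a", "d": "ii", "g": "uuu"}

def cv_stream(n_syllables, seed=2):
    rng = random.Random(seed)
    letters = []
    for _ in range(n_syllables):
        c = rng.choice("bdg")       # consonant chosen at random (unpredictable)
        letters.append(c)
        letters.extend(SYLLABLES[c])  # vowels fully determined by the consonant
    return "".join(letters)

print(cv_stream(10))   # e.g. "badiiguuuba..."
```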

21 [Figure: prediction error plotted against time step.]

22 Simple Recurrent Network (SRN)
[Figure: the current input (t) and a copy of the hidden representation at t-1 both feed the hidden units, which produce the current output (t).]

23 SRN can “hold on” to prior information over many time steps!

24 But that’s not all…

25 [Figure: prediction error plotted against time step.]

26 [Figure: plot of the "Consonantal" feature, with "High" marked on the axis.]

27 SRN “knows” about consonants, sort of!

28 Learning word boundaries
15 words; 200 sentences (4–9 words in length); sentences broken into letter sequences; each letter coded as a random bit vector; task: predict the next letter.
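A minimal sketch of this data construction; the particular 15 words and the 5-bit letter codes below are assumptions for illustration, not the actual training set:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy lexicon standing in for the 15 words on the slide (actual words assumed).
WORDS = ["many", "years", "ago", "a", "boy", "and", "girl", "lived", "by",
         "the", "sea", "they", "played", "happily", "together"]

# Each letter gets a fixed random bit vector (5 bits here, an assumption).
letters = sorted(set("".join(WORDS)))
letter_codes = {ch: rng.integers(0, 2, size=5).astype(float) for ch in letters}

# 200 sentences of 4-9 words, concatenated with no word-boundary markers.
stream = ""
for _ in range(200):
    n = rng.integers(4, 10)
    stream += "".join(rng.choice(WORDS, size=n))

# Next-letter prediction pairs: input = code of letter t, target = code of letter t+1.
X = np.array([letter_codes[ch] for ch in stream[:-1]])
Y = np.array([letter_codes[ch] for ch in stream[1:]])
```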

29 Maybe predictive error can serve as a cue for parsing
the speech stream (and other sequential behaviors!)

30 What about word categories?

31

32 Localist representation of 31 words
Series of 2–3-word sentences strung together: woman smash plate cat move man break car boy move girl… Predict next word…
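A minimal sketch of the localist coding and the next-word training pairs; only the example fragment above is used here, standing in for the full 31-word corpus:

```python
import numpy as np

# Localist (one-hot) coding: each word is a vector with a single 1.
corpus = "woman smash plate cat move man break car boy move girl".split()
vocab = sorted(set(corpus))          # stand-in for the full 31-word vocabulary
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word, size=len(vocab)):
    v = np.zeros(size)
    v[index[word]] = 1.0
    return v

# Next-word prediction pairs: input = word t, target = word t+1.
X = np.array([one_hot(w) for w in corpus[:-1]])
Y = np.array([one_hot(w) for w in corpus[1:]])
```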

33 Simple Recurrent Network (SRN)
[Figure: the current input (t) and a copy of the hidden representation at t-1 both feed the hidden units, which produce the current output (t).]

34

35 ZOG break car Zog eat Zog like girl ZOG chase mouse…

36

37 Words in context…

38 SRN acquires representational structure
based on implicit temporal prediction… Items come to be represented as similar when they are preceded or followed by similar distributions of items.
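A sketch of the kind of analysis this claim rests on: record the hidden vector produced each time a word is presented, average per word, then compare the averages. The recorded states below are random placeholders standing in for states from a trained SRN; the word list is illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
words = ["woman", "man", "girl", "boy", "cat", "mouse", "break", "smash"]
# Placeholder: 20 recorded occurrences x 10 hidden units per word.
recorded = {w: rng.random((20, 10)) for w in words}

# Average hidden representation per word.
avg = {w: states.mean(axis=0) for w, states in recorded.items()}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Pairwise similarity matrix: words used in similar temporal contexts should
# end up with similar averaged hidden representations.
sim = np.array([[cosine(avg[a], avg[b]) for b in words] for a in words])
```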

39 Example in cognitive neuroscience
Hypothesis from literature: Events are parsed when transition probabilities are low… Alternative hypothesis….

40

41 Example in cognitive neuroscience
Hypothesis from literature: Events are parsed when transition probabilities are low… Alternative hypothesis: events are parsed at transitions between communities
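A sketch of how such a sequence can be generated, assuming a community-structured random walk in the spirit of this paradigm; the node counts and wiring below are an illustrative reconstruction, not the exact stimuli:

```python
import numpy as np

rng = np.random.default_rng(5)

# Three communities of five nodes. Within a community every pair is connected
# except the two "boundary" nodes; each boundary node instead links to the
# next community. Every node then has exactly four neighbours, so each step of
# the walk has the same transition probability (1/4), including at community
# boundaries.
communities = [list(range(0, 5)), list(range(5, 10)), list(range(10, 15))]
neighbours = {n: set() for n in range(15)}
for comm in communities:
    first, last = comm[0], comm[-1]
    for a in comm:
        for b in comm:
            if a != b and {a, b} != {first, last}:
                neighbours[a].add(b)
for a, b in [(4, 5), (9, 10), (14, 0)]:   # bridge edges between communities
    neighbours[a].add(b)
    neighbours[b].add(a)

def random_walk(start, steps):
    # Each step picks uniformly among the current node's neighbours.
    node, path = start, [start]
    for _ in range(steps):
        node = int(rng.choice(sorted(neighbours[node])))
        path.append(node)
    return path

walk = random_walk(0, 1000)
```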

42

43

44

45 So SRNs… Can implicitly learn structure of temporally extended sequences… Can provide graded cues to “segmenting” such sequences… Can learn similarities amongst items that participate in such sequences, based on the similarity of their temporal contexts.

46 Example applications
Working memory: n-back, serial order, AX task
Language parsing
Sequential word processing
Sentence processing
Routine sequential action

