Simple recurrent networks.


Today
- What's so great about backpropagation learning?
- Temporal structure and simple recurrent networks: basic concepts
- Examples applied in language and cognitive neuroscience

What’s so great about backpropagation learning?

[Figure: a single output unit (3) receiving connections from two input units (1 and 2) with weights $w_1$ and $w_2$, plus a bias unit with constant activation 1 and bias weight $b$.]
$a_3 = b + a_1 w_1 + a_2 w_2$

[Figure: a two-layer network with input units 1 and 2, hidden units 3 and 4, and output unit 5, connected by weights $w_1$ through $w_6$.]
Net input: $net_j = \sum_i a_i w_{ij}$
Logistic activation: $a_j = \frac{1}{1 + e^{-net_j}}$
Delta-rule weight change: $\Delta w_{ij} = \alpha\,(t_j - a_j)\,a_j(1 - a_j)\,a_i$
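A minimal sketch of these equations in Python/NumPy (the layer sizes, initial weights, learning rate, and target value are illustrative, not taken from the slides):

```python
import numpy as np

def logistic(net):
    # a_j = 1 / (1 + exp(-net_j))
    return 1.0 / (1.0 + np.exp(-net))

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 2, 2, 1                 # e.g. units 1-2, 3-4, and 5 in the figure
W_ih = rng.normal(0, 0.5, (n_in, n_hid))     # w1..w4
W_ho = rng.normal(0, 0.5, (n_hid, n_out))    # w5, w6
alpha = 0.1                                  # learning rate

a_in = np.array([1.0, 0.0])                  # activations of the input units
a_hid = logistic(a_in @ W_ih)                # net_j = sum_i a_i w_ij, then squash
a_out = logistic(a_hid @ W_ho)

t = np.array([1.0])                          # target for the output unit
# Delta rule for the output weights: alpha * (t_j - a_j) * a_j * (1 - a_j) * a_i
delta_out = (t - a_out) * a_out * (1 - a_out)
W_ho += alpha * np.outer(a_hid, delta_out)
```

The update shown is for the output-layer weights only; full backpropagation additionally passes the error term back through the hidden units to adjust $w_1$ through $w_4$.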

[Figure: the four binary input patterns (00, 01, 10, 11) plotted in the input space and again in the hidden space (Unit 3 vs. Unit 4), showing how the hidden layer re-represents the inputs.]

So backpropagation in a deep network (one or more hidden layers) allows the network to learn to re-represent the structure provided in the input in ways that suit the task…

Rumelhart and Todd (1993)
[Figure: the semantic network architecture. Item inputs (pine, oak, rose, daisy, robin, canary, sunfish, salmon) and Context inputs (ISA, is…, can…, has…) feed through hidden layers to Attribute outputs (living, pretty, green, big, small, bright, dull, grow, fly, sing, swim, leaves, roots, petals, wings, feathers, scales, gills, …).]

[Figure: the hidden-layer representations of the eight items (Pine, Oak, Rose, Daisy, Sunfish, Salmon, Robin, Canary) at Epoch 150 of training.]

Rumelhart and Todd (1993)
[Figure: the same architecture again, with the Item and Context input layers and the Attribute output layer (living, pretty, green, big, small, bright, dull, move, fly, sing, swim, leaves, roots, petals, wings, feathers, scales, gills, …) labeled.]

The patterns that emerge in the hidden layer are influenced by overlap in both the input units and the output units… Generalization of learning from one item to another depends on how similar the items are in the learned internal representation… So backprop learning provides one way of thinking about how appropriately structured internal representations might be acquired.

Surprise test! Spell MISSISSIPPI

Human behavior occurs over time:
- Speaking and comprehending
- Making a cup of coffee
- Walking
- Seeing
- Essentially everything
How does any information processing system cope with time?

[Figure: a slot-based reading model: the visual input and the spoken output are each coded as 11 ordered positions (1, 2, 3, …, 11), with a full bank of letter units (A, B, C, D, …, Z) at every position.]

[Figure: two ways of assigning words to slots, illustrated with LOG, SUNG, GLAD, SWAM, and SPLIT: left-justified (positions 1-5) versus vowel-centred (positions −3 to +1, with the vowel at position 0).]

The “Jordan” network
[Figure: a recurrent network in which a copy of the previous time step's output (COPY, t−1) is fed back as part of the input.]

Simple Recurrent Network (SRN)
[Figure: the current input (t), together with a copy of the hidden representation from t−1, drives the hidden units, which produce the current output (t); the hidden pattern is copied back to serve as context on the next time step.]
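A minimal sketch of one SRN step in Python/NumPy, assuming logistic hidden and output units and a simple copy-back context; the layer sizes, weights, and token sequence are illustrative:

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 5, 8, 5
W_in  = rng.normal(0, 0.5, (n_in, n_hid))    # current input (t)   -> hidden
W_ctx = rng.normal(0, 0.5, (n_hid, n_hid))   # copy of hidden (t-1) -> hidden
W_out = rng.normal(0, 0.5, (n_hid, n_out))   # hidden -> current output (t)

def srn_step(x, context):
    """One time step: combine the current input with the copied-back context."""
    hidden = logistic(x @ W_in + context @ W_ctx)
    output = logistic(hidden @ W_out)
    return output, hidden                    # hidden becomes the next context

context = np.zeros(n_hid)                    # hidden rep at t-1, initially empty
for token in [0, 2, 1, 4]:                   # a short localist input sequence
    x = np.zeros(n_in)
    x[token] = 1.0
    y, context = srn_step(x, context)
```

Training proceeds as in an ordinary backprop network: at each step the output is compared with a target (typically the next element of the sequence) and the weights are adjusted, with the copied-back context treated as just another input.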

XOR in time!
[Figure: an SRN with copy-back (COPY, t−1) connections applied to a temporal version of the XOR problem.]
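In the temporal version of XOR (following Elman's setup, as I understand it), pairs of random bits are each followed by their XOR, and the network must predict the next bit at every step; only every third bit is predictable. A sketch of generating such a stream:

```python
import numpy as np

rng = np.random.default_rng(2)

def temporal_xor_stream(n_pairs):
    """Bit stream in which every third bit is the XOR of the preceding two."""
    bits = []
    for _ in range(n_pairs):
        a, b = rng.integers(0, 2, size=2)
        bits.extend([a, b, a ^ b])
    return np.array(bits, dtype=float)

stream = temporal_xor_stream(1000)
# Next-bit prediction: input is bits[t], target is bits[t+1].
# Only every third target is determined by the past, so prediction
# error should drop at those positions once the SRN has learned the rule.
inputs, targets = stream[:-1], stream[1:]
```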

The network can combine the current input with a “memory” of the previous input to predict the next input…

What about “longer” memory and more complex patterns? For example, a stream built from the syllables ba, dii, and guuu.

[Figure: prediction error plotted against time step for the syllable-stream task.]

[The SRN architecture figure again: current input (t) plus the copied hidden representation from t−1 drive the hidden units and the current output (t).]

SRN can “hold on” to prior information over many time steps!

But that’s not all…

[Figure: prediction error over time steps, with the error on the “consonantal” feature plotted separately (high vs. low).]

SRN “knows” about consonants, sort of!

Learning word boundaries
- 15 words
- 200 sentences (4-9 words in length)
- Sentences broken into a continuous stream of letters
- Each letter represented as a random bit vector
- Task: predict the next letter
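A sketch of how such a training set might be constructed (the word list, bit-vector length, and sentence generator below are invented for illustration; the slides do not give them):

```python
import numpy as np

rng = np.random.default_rng(3)

# A toy lexicon standing in for the 15 words (the actual list is not given).
lexicon = ["many", "years", "ago", "a", "boy", "and", "girl", "lived",
           "by", "the", "sea", "they", "played", "happily", "dog"]
letters = sorted({ch for w in lexicon for ch in w})
# Each letter gets a fixed random bit vector, as on the slide (length is arbitrary here).
letter_codes = {ch: rng.integers(0, 2, size=5).astype(float) for ch in letters}

def make_stream(n_sentences=200):
    """Concatenate random 4-9 word sentences into one unbroken letter stream."""
    stream = []
    for _ in range(n_sentences):
        sentence = rng.choice(lexicon, size=rng.integers(4, 10))
        stream.extend(ch for word in sentence for ch in word)
    return stream

stream = make_stream()
X = np.array([letter_codes[ch] for ch in stream[:-1]])   # letter at t
Y = np.array([letter_codes[ch] for ch in stream[1:]])    # target: letter at t+1
# Train an SRN on (X, Y); per-letter prediction error should peak at word
# onsets, since the first letter of a new word is hard to predict.
```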

Maybe predictive error can serve as a cue for parsing the speech stream (and other sequential behaviors!)

What about word categories?

- Localist representation of 31 words
- A series of 2-3 word sentences strung together: woman smash plate cat move man break car boy move girl…
- Task: predict the next word…
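A sketch of the localist coding and next-word targets (the vocabulary below is an abbreviated stand-in for the 31 words):

```python
import numpy as np

vocab = ["woman", "smash", "plate", "cat", "move", "man", "break",
         "car", "boy", "girl"]                       # stand-in for the 31 words
word_to_ix = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Localist code: exactly one unit on for each word."""
    v = np.zeros(len(vocab))
    v[word_to_ix[word]] = 1.0
    return v

stream = "woman smash plate cat move man break car boy move girl".split()
X = np.array([one_hot(w) for w in stream[:-1]])      # word at t
Y = np.array([one_hot(w) for w in stream[1:]])       # target: word at t+1
# An SRN trained on (X, Y) is never told which words are nouns or verbs;
# any category structure has to be inferred from word order alone.
```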

[The SRN architecture figure once more, now with words as the localist inputs and outputs.]

A novel word in familiar contexts: Zog break car, Zog eat, Zog like girl, Zog chase mouse…

Words in context…

The SRN acquires representational structure through implicit temporal prediction: items come to be represented as similar when they are preceded and followed by similar distributions of items.
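One common way to see this (a sketch of the general analysis, not necessarily the one behind the slides) is to average the hidden vector the network produces each time a word occurs and then compare those averages across words. The SRN weights here are random placeholders purely to show the pipeline; in practice they would come from the trained network:

```python
import numpy as np
from collections import defaultdict

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(4)
vocab = ["woman", "smash", "plate", "cat", "move", "man", "break",
         "car", "boy", "girl"]
word_to_ix = {w: i for i, w in enumerate(vocab)}
stream = ("woman smash plate cat move man break car boy move girl "
          "man smash plate woman move cat break car girl move boy").split()

n_in, n_hid = len(vocab), 16
W_in  = rng.normal(0, 0.5, (n_in, n_hid))    # placeholder weights (untrained)
W_ctx = rng.normal(0, 0.5, (n_hid, n_hid))

# Average the hidden vector produced each time a word occurs.
sums = defaultdict(lambda: np.zeros(n_hid))
counts = defaultdict(int)
context = np.zeros(n_hid)
for w in stream:
    x = np.zeros(n_in)
    x[word_to_ix[w]] = 1.0
    context = logistic(x @ W_in + context @ W_ctx)
    sums[w] += context
    counts[w] += 1
mean_hidden = {w: sums[w] / counts[w] for w in counts}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# With a trained network, words used in similar contexts (e.g. the animate
# nouns) should show higher pairwise similarity than words from different
# categories; clustering these vectors recovers the category structure.
print(cosine(mean_hidden["woman"], mean_hidden["man"]))
```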

Example in cognitive neuroscience
Hypothesis from the literature: events are parsed when transition probabilities are low…
Alternative hypothesis: events are parsed at transitions between communities.

So SRNs…
- Can implicitly learn the structure of temporally extended sequences
- Can provide graded cues for “segmenting” such sequences
- Can learn similarities among the items that participate in such sequences, based on the similarity of their temporal contexts

Example applications
- Working memory: n-back, serial order, the AX task
- Language parsing
- Sequential word processing
- Sentence processing
- Routine sequential action