Computational models of cognitive control (II) Matthew Botvinick Princeton Neuroscience Institute and Department of Psychology, Princeton University.

Presentation transcript:

Computational models of cognitive control (II) Matthew Botvinick Princeton Neuroscience Institute and Department of Psychology, Princeton University

Banishing the homunculus
Decision-making in control:
Not only, “How does control shape decision-making?”
But also, “How are ‘control states’ selected?”
And, “How are they updated over time?”

1. Routine sequential action
Botvinick & Plaut, Psychological Review, 2004; Botvinick, Proceedings of the Royal Society B; Botvinick, TICS, 2008

‘Routine sequential action’
- Action on familiar objects
- Well-defined sequential structure
- Concrete goals
- Highly routine
- Everyday tasks

Computational models of cognitive control (II) Matthew Botvinick Princeton Neuroscience Institute and Department of Psychology, Princeton University ?!

Hierarchical structure
[Diagram: MAKE INSTANT COFFEE at the top; ADD GROUNDS, ADD CREAM, ADD SUGAR below; ADD SUGAR FROM SUGAR PACK and ADD SUGAR FROM SUGAR BOWL below that; primitive actions PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP at the bottom]

Hierarchical models of action
[Diagram: MAKE INSTANT COFFEE; ADD GROUNDS, ADD CREAM, ADD SUGAR; ADD SUGAR FROM SUGARBOWL / PACKET; primitive actions PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP]
- Hierarchical structure of task built directly into architecture (e.g., Cooper & Shallice, 2000; Estes, 1972; Houghton, 1990; MacKay, 1987; Rumelhart & Norman, 1982)
- Schemas as primitive elements
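A minimal sketch of the schema idea (hypothetical names, not the Cooper & Shallice implementation): each schema node either is a primitive action or expands into an ordered set of subschemas, so the task hierarchy is written directly into the data structure.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Schema:
        """A node in an action-schema hierarchy; leaves are primitive actions."""
        name: str
        children: List["Schema"] = field(default_factory=list)

        def expand(self) -> List[str]:
            if not self.children:                   # a leaf is a primitive action
                return [self.name]
            steps = []
            for child in self.children:             # a parent runs its subschemas in order
                steps.extend(child.expand())
            return steps

    add_grounds = Schema("ADD GROUNDS", [Schema("PICK-UP"), Schema("POUR"), Schema("PUT-DOWN")])
    add_sugar   = Schema("ADD SUGAR",   [Schema("TEAR"),    Schema("POUR"), Schema("STIR")])
    make_coffee = Schema("MAKE INSTANT COFFEE", [add_grounds, add_sugar])
    print(make_coffee.expand())                     # flat sequence of primitive actions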

An alternative approach
[Diagram: recurrent processing chain p_t, a_t, s_t → p_{t+1}, a_{t+1}, s_{t+1} → p_{t+2}, a_{t+2}, s_{t+2}]

[Diagram: p_t, a_t, s_t → p_{t+1}, a_{t+1}, s_{t+1} → p_{t+2}, a_{t+2}, s_{t+2}]
- p, s, a = patterns of activation over simple processing units
- Weighted, excitatory/inhibitory connections
- Weights adjusted through gradient-descent learning in target task domains

Recurrent neural networks
- Feedback as well as feedforward connections
- Allow preservation of information over time
- Demonstrated capacity to learn sequential behaviors (e.g., Cleeremans, 1993; Elman, 1990)
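As a rough illustration of the mechanism (a generic Elman-style simple recurrent network, not the Botvinick & Plaut model itself; all sizes are arbitrary): a copy of the previous hidden state feeds back as context, so the network can carry information about earlier steps forward in time.

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hid, n_out = 10, 20, 10                       # arbitrary layer sizes

    W_in  = rng.normal(scale=0.1, size=(n_hid, n_in))     # input -> hidden
    W_ctx = rng.normal(scale=0.1, size=(n_hid, n_hid))    # previous hidden (context) -> hidden
    W_out = rng.normal(scale=0.1, size=(n_out, n_hid))    # hidden -> output

    def step(x, h_prev):
        """One time step: hidden state depends on the current input and the prior hidden state."""
        h = np.tanh(W_in @ x + W_ctx @ h_prev)            # recurrent (feedback) contribution
        y = W_out @ h                                     # output, e.g. the next action
        return y, h

    h = np.zeros(n_hid)
    for t in range(5):                                    # process a short input sequence
        x = rng.integers(0, 2, size=n_in).astype(float)
        y, h = step(x, h)                                 # h preserves information over time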

The model
[Diagram: perceptual input → internal representation → action → environment, which in turn supplies the next perceptual input]

[Example gaze/action sequence: Fixate(Blue), Fixate(Green), Fixate(Top), PickUp, Fixate(Table), PutDown, Fixate(Green), PickUp]
Ballard, Hayhoe, Pook & Rao (1996), BBS

Model architecture
[Diagram: perceptual input (viewed object, held object); actions (manipulative, perceptual); environment]

Routine sequential action: Task domain
- Hierarchically structured
- Actions/subtasks may appear in multiple contexts
- Environmental cues alone sometimes insufficient to guide action selection
- Subtasks that may be executed in variable order
- Subtask disjunctions

[Task-structure diagram: coffee and tea sequences (grounds, cream, steep tea, drink) running from Start to End]

Representations
[Diagram: example object representation (sugar - packet), manipulative actions, perceptual actions]

[Training examples, shown across several slides: input patterns paired with target/output patterns]

Model behavior

[Diagram: the trained coffee and tea sequence variants (Start → … → drink → End) with training frequencies of 15%, 18%, 12%, 10%, 20%, and 25%]

Slips of action (after Reason)
- Occur at decision (or fork) points
- Sequence errors involve subtask omissions, repetitions, and lapses
- Lapses show effect of relative task frequency

[Model architecture diagram, as above]

Sample of behavior (annotations: grounds; sugar (pack); drink; cream omitted):
pick-up coffee-pack, pull-open coffee-pack, pour coffee-pack into cup, put-down coffee-pack,
pick-up spoon, stir cup, put-down spoon,
pick-up sugar-pack, tear-open sugar-pack, pour sugar-pack into cup, put-down sugar-pack,
pick-up spoon, stir cup, put-down spoon,
pick-up cup*, sip cup, say-done

[Plot: percentage of trials error-free (0–100%) at each step in the coffee sequence, grouped into subtasks 1–4]

[Plot: percentage of trials showing omissions/anticipations, repetitions/perseverations, and intrusions/lapses, as a function of noise level (variance)]

[Diagram: odds of a lapse into coffee-making at each step of tea-making (steep tea, sugar, cream), as a function of the tea : coffee training ratio]

Action disorganization syndrome (after Schwartz and colleagues)
- Fragmentation of sequential structure (independent actions)
- Specific error types
- Omission effect

[Model architecture diagram, as above]

Sample of behavior (annotations: sugar repeated; cream omitted; disrupted subtask; subtask fragment):
pick-up coffee-pack, pull-open coffee-pack, put-down coffee-pack*,
pick-up coffee-pack, pour coffee-pack into cup, put-down coffee-pack,
pick-up spoon, stir cup, put-down spoon,
pick-up sugar-pack, tear-open sugar-pack, pour sugar-pack into cup, put-down sugar-pack,
pick-up cup*, put-down cup,
pull-off sugarbowl lid*, put-down lid,
pick-up spoon, scoop sugarbowl with spoon, put-down spoon*,
pick-up cup*, sip cup, say-done

Empirical data: Schwartz et al., Neuropsychology
[Plot: proportion of independents as a function of noise (variance)]

From Schwartz et al., Neuropsychology
[Plot: sequence errors and omission errors (per opportunity) as a function of noise (variance)]

Internal representations

[Plots: internal representations across steps of the coffee (grounds, cream, drink) and tea (steep tea, drink) sequences]

Etiology of a slip
[Diagram: drink, steep tea]

[Diagram: tea representation vs. coffee representation]

[Diagrams: coffee and tea representations when coffee is more frequent vs. when tea is more frequent]

[Architecture diagram: Input, Peripheral (input), Intermediate (input), Apex, Intermediate (output), Peripheral (output), Output]

Store-Ignore-Recall (SIR) task
[Example trial: items “nine”, “eight”, “four”, “seven”, “eight”, with a recall cue R]

[Architecture diagram, as above]

Conclusions
- Architectural hierarchy is not necessary for hierarchically structured behavior (or to understand action errors). Recurrent connectivity combined with graded, distributed representation is sufficient.
- Nonetheless, if architectural hierarchy is present, it can lead to a graded division of labor, according to which units furthest from sensory and motor peripheries specialize in coding information pertaining to temporal context.
- This may give us a way of explaining why the prefrontal cortex seems to be involved in routine sequential behavior.

2. Hierarchical reinforcement learning Botvinick, Niv & Barto, Cognition, in press. Botvinick, TICS, 2008

Reinforcement Learning
1. States
2. Actions
3. Transition function
4. Reward function
Policy?
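A minimal sketch of these four ingredients plus a policy, for a toy chain of states (all names and numbers are illustrative, not from the talk):

    import random

    states  = ["s0", "s1", "s2", "goal"]
    actions = ["left", "right"]

    # Transition function: (state, action) -> next state (deterministic here for simplicity)
    T = {("s0", "left"): "s0",  ("s0", "right"): "s1",
         ("s1", "left"): "s0",  ("s1", "right"): "s2",
         ("s2", "left"): "s1",  ("s2", "right"): "goal"}

    # Reward function: reward delivered on arriving in the next state
    def R(s, a, s_next):
        return 1.0 if s_next == "goal" else 0.0

    # Policy: a mapping from states to actions; here random, and learning
    # (the question mark on the slide) is the problem of improving it.
    def policy(s):
        return random.choice(actions)

    s = "s0"
    while s != "goal":
        a = policy(s)
        s_next = T[(s, a)]
        print(s, a, R(s, a, s_next))
        s = s_next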

[Diagram: action strengths, state values, prediction error]
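These three quantities fit the standard actor-critic arrangement; a minimal sketch (toy sizes, not the specific model from the talk): the critic holds state values V, the actor holds action strengths H, and both are adjusted by the reward prediction error.

    import numpy as np

    n_states, n_actions = 3, 2
    V = np.zeros(n_states)                 # state values (critic)
    H = np.zeros((n_states, n_actions))    # action strengths (actor)
    alpha, gamma = 0.1, 0.95

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def actor_critic_step(s, a, r, s_next):
        """One update driven by the reward prediction error."""
        delta = r + gamma * V[s_next] - V[s]   # prediction error
        V[s]    += alpha * delta               # critic: update state value
        H[s, a] += alpha * delta               # actor: update action strength
        return delta

    rng = np.random.default_rng(0)
    a = rng.choice(n_actions, p=softmax(H[0]))   # sample an action from the actor
    print(actor_critic_step(0, a, r=1.0, s_next=1))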

Adapted from Sutton et al., AI, 1999

Hierarchical Reinforcement Learning
Options (after Sutton, Precup & Singh, 1999): o = ⟨I, π, β⟩, an initiation set, an option policy, and a termination condition
[Stroop diagram: GREEN / RED stimuli, “green” / “red” responses, color-naming vs. word-reading; adapted from Cohen et al., Psych. Rev., 1990]
“Policy abstraction”
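A minimal sketch of these definitions in code (illustrative, not the authors' implementation): an option bundles an initiation set I, an internal policy π, and a termination condition β, and its value can be updated with an SMDP-style rule once it has run to termination.

    from dataclasses import dataclass
    from typing import Callable, Dict, Set, Tuple

    @dataclass
    class Option:
        name: str
        initiation: Set[int]                  # I: states in which the option may be invoked
        policy: Callable[[int], int]          # pi: state -> primitive action while the option runs
        termination: Callable[[int], float]   # beta: state -> probability of terminating

    def smdp_q_update(Q: Dict[Tuple[int, str], float], option_names, s, o, r_cum, k, s_end,
                      alpha=0.1, gamma=0.95):
        """Update Q(s, o) after option o runs k steps from state s, accruing
        discounted reward r_cum and terminating in state s_end (SMDP Q-learning)."""
        best_next = max(Q.get((s_end, name), 0.0) for name in option_names)
        q_old = Q.get((s, o), 0.0)
        Q[(s, o)] = q_old + alpha * (r_cum + gamma ** k * best_next - q_old)

    # Hypothetical option: walk to the doorway (state 3) from any of states 0-2.
    go_to_door = Option("go-to-door", initiation={0, 1, 2},
                        policy=lambda s: 0,
                        termination=lambda s: 1.0 if s == 3 else 0.0)
    Q = {}
    smdp_q_update(Q, [go_to_door.name], s=0, o=go_to_door.name, r_cum=0.5, k=4, s_end=3)
    print(Q)   # {(0, 'go-to-door'): 0.05}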

From Humphreys & Forde, Cog. Neuropsych., 2001

cf. Luchins, Psychol. Monogr., 1942

The Option Discovery Problem
- Genetic algorithms (Elfwing, 2003)
- Frequently visited states (Pickett & Barto, 2002; Thrun & Schwartz, 1996; see the sketch after this list)
- Graph partitioning (Menache et al., 2002; Mannor et al., 2004; Simsek et al., 2005)
- Intrinsic motivation (Simsek & Barto, 2005)
- Other possibilities: impasses (Soar); social transmission
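A rough sketch of one idea from the list above, frequently visited states as candidate subgoals (illustrative, not any of the cited algorithms): count how often states occur on successful trajectories and nominate the most visited non-goal states, which tend to be bottlenecks such as doorways.

    from collections import Counter

    def propose_subgoals(successful_trajectories, goal_state, top_k=2):
        """Most frequently visited non-goal states across successful trajectories."""
        counts = Counter(s for traj in successful_trajectories
                           for s in traj if s != goal_state)
        return [s for s, _ in counts.most_common(top_k)]

    # Toy example: the "doorway" state D lies on every successful path to G.
    trajs = [["A", "B", "D", "G"], ["C", "B", "D", "G"], ["A", "D", "G"]]
    print(propose_subgoals(trajs, goal_state="G"))   # 'D' (the doorway) ranks first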

Extension 1: Support for representing option identifiers

White & Wise, Exp Br Res, 1999 (See also: Asaad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999…)

Miller & Cohen, Ann. Rev. Neurosci, 2001

From Curtis & D’Esposito, TICS, 2003, after Funahashi et al., J. Neurophysiol., 1989.

Koechlin, Attn & Perf., 2008

Extension 2: Option-specific policies

O’Reilly & Frank, Neural Computation, 2006

Aldridge & Berridge, J Neurosci, 1998

Extension 3: Option-specific state values

Schoenbaum et al., J Neurosci. See also: O’Doherty, Critchley, Deichmann & Dolan, 2003

Extension 4: Temporal scope of the prediction error
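A minimal worked form of this extension (the standard SMDP-style error, stated here as an assumption about what the slide illustrates): when an option runs for k steps before terminating, the prediction error spans the whole episode of option execution rather than a single step:

    delta = r_{t+1} + gamma * r_{t+2} + ... + gamma^{k-1} * r_{t+k} + gamma^k * V(s_{t+k}) - V(s_t)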

Schoenbaum, Roesch & Stalnaker, TICS, 2006

Roesch, Taylor & Schoenbaum, Neuron, 2006

Daw, NIPS, 2003

3. Goal-directed behavior Botvinick & An, submitted.

Niv, Joel & Dayan, TICS (2006)
[Diagrams: model-based evaluation as forward search through a decision tree, using the transition function T and reward function R, with numeric values at the outcome leaves]
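As a rough sketch of this kind of model-based evaluation (toy names and values, not from the talk): because the agent keeps an explicit transition model T and reward model R, action values are recomputed by search at choice time, which is what makes behavior immediately sensitive to outcome devaluation.

    # Toy one-step task: each action leads deterministically to an outcome.
    T = {"left": "food", "right": "water"}     # transition model
    R = {"food": 4.0, "water": 2.0}            # reward model (outcome values)

    def evaluate(T, R):
        """'Tree search' one step deep: value of each action under the current models."""
        return {a: R[outcome] for a, outcome in T.items()}

    print(evaluate(T, R))      # {'left': 4.0, 'right': 2.0} -> choose left
    R["food"] = 0.0            # devalue food (e.g., satiety or pairing with illness)
    print(evaluate(T, R))      # values recomputed at choice time -> now choose right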

Blodgett, 1929: Latent learning

Tolman & Honzik, 1930: Detour behavior

Niv, Joel & Dayan, TICS (2006) Devaluation

White & Wise, Exp Br Res, 1999 (See also: Asaad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999; Miller & Cohen, 2001…)

Miller & Cohen, Ann. Rev. Neurosci, 2001

Padoa-Schioppa & Assad, Nature, 2006

Gopnik, et al., Psych Rev, 2004

[Diagram: reward function R and transition function T]

?

Redish data… Johnson & Redish, J. Neurosci., 2007

Botvinick & An, submitted

Cf. Tatman & Shachter, 1990

Cf. Verma & Rao, 2006

Policy query
Reward query
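A rough sketch of the two query types in a planning-as-inference setting (cf. Verma & Rao, 2006); the tiny model and its numbers are illustrative assumptions, not Botvinick & An's network: a reward query asks how likely reward is under a candidate action, while a policy query conditions on obtaining reward and infers the action.

    import numpy as np

    actions = ["left", "right"]
    p_action = np.array([0.5, 0.5])            # prior over the decision (policy) node
    p_reward_given_a = np.array([0.2, 0.8])    # illustrative P(reward = 1 | action)

    # Reward query: probability of reward under each candidate action
    print(dict(zip(actions, p_reward_given_a)))

    # Policy query: clamp reward = 1 and infer the action by Bayes' rule
    posterior = p_action * p_reward_given_a
    posterior /= posterior.sum()
    print(dict(zip(actions, posterior)))       # P(action | reward = 1)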

[Diagram: outcome values +1 / 0 and +2 / -3]

Collaborators: James An, Andy Barto, Todd Braver, Deanna Barch, Jonathan Cohen, Andrew Ledvina, Joseph McGuire, David Plaut, Yael Niv