Computational models of cognitive control (II) Matthew Botvinick Princeton Neuroscience Institute and Department of Psychology, Princeton University.



Banishing the homunculus. Decision-making in control: not only, “How does control shape decision-making?” but also, “How are ‘control states’ selected?” and, “How are they updated over time?”

1. Routine sequential action (Botvinick & Plaut, Psychological Review, 2004; Botvinick, Proceedings of the Royal Society B, 2007; Botvinick, TICS, 2008)

‘Routine sequential action’: action on familiar objects; well-defined sequential structure; concrete goals; highly routine; everyday tasks.


Hierarchical structure [Figure: task hierarchy. Top level: MAKE INSTANT COFFEE. Subtasks: ADD GROUNDS, ADD CREAM, ADD SUGAR. Sugar variants: ADD SUGAR FROM SUGARPACK, ADD SUGAR FROM SUGARBOWL. Actions: PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP.]

Hierarchical models of action: hierarchical structure of task built directly into architecture (e.g., Cooper & Shallice, 2000; Estes, 1972; Houghton, 1990; MacKay, 1987; Rumelhart & Norman, 1982); schemas as primitive elements. [Figure: task hierarchy. Top level: MAKE INSTANT COFFEE. Subtasks: ADD GROUNDS, ADD CREAM, ADD SUGAR, ADD SUGAR FROM SUGARBOWL / PACKET. Actions: PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP.]

An alternative approach [Figure: network unrolled over time steps t, t+1, t+2, with patterns p, s, and a at each step.]

[Figure: network unrolled over time steps t, t+1, t+2, with patterns p, s, and a at each step.] p, s, a = patterns of activation over simple processing units. Weighted, excitatory/inhibitory connections. Weights adjusted through gradient-descent learning in target task domains.

Recurrent neural networks: feedback as well as feedforward connections; allow preservation of information over time; demonstrated capacity to learn sequential behaviors (e.g., Cleeremans, 1993; Elman, 1990).
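How recurrent feedback preserves information over time can be illustrated with a minimal sketch of a simple recurrent network step. This is not the model from the talk: the weights are random and untrained, and the layer sizes, the tanh and logistic activations, and all function names are assumptions chosen for illustration. The names p, s, and a follow the slide notation.

```python
import math
import random

def srn_step(p, s_prev, w_in, w_rec, w_out):
    """Advance one time step: (p_t, s_{t-1}) -> (s_t, a_t)."""
    # Internal state combines current input with the previous state (feedback).
    s = [
        math.tanh(
            sum(w * x for w, x in zip(w_in[j], p))
            + sum(w * x for w, x in zip(w_rec[j], s_prev))
        )
        for j in range(len(s_prev))
    ]
    # Action units read out from the internal state (logistic activation).
    a = [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, s)))) for row in w_out]
    return s, a

def random_matrix(rows, cols, rng, scale=1.0):
    """Untrained random weights (assumption: a trained model would learn these)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def final_action(sequence, w_in, w_rec, w_out):
    """Run the network over a sequence of inputs and return the last action pattern."""
    s = [0.0] * len(w_rec)
    a = None
    for p in sequence:
        s, a = srn_step(p, s, w_in, w_rec, w_out)
    return a

rng = random.Random(0)
N_IN, N_STATE, N_ACT = 3, 4, 2  # hypothetical layer sizes
w_in = random_matrix(N_STATE, N_IN, rng)
w_rec = random_matrix(N_STATE, N_STATE, rng)
w_out = random_matrix(N_ACT, N_STATE, rng)
no_rec = [[0.0] * N_STATE for _ in range(N_STATE)]  # feedback removed

x, y, z = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

# Same final input z, different histories (x vs. y).
with_context_a = final_action([x, z], w_in, w_rec, w_out)
with_context_b = final_action([y, z], w_in, w_rec, w_out)
no_context_a = final_action([x, z], w_in, no_rec, w_out)
no_context_b = final_action([y, z], w_in, no_rec, w_out)
```

With recurrent weights the two histories yield different final action patterns; with the feedback weights zeroed, the final action depends only on the current input, so the two histories become indistinguishable.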

The model [Figure: loop linking environment, perceptual input, internal representation, and action.]

[Figure: Fixate(Blue), Fixate(Green), Fixate(Top), PickUp, Fixate(Table), PutDown, Fixate(Green), PickUp.] Ballard, Hayhoe, Pook & Rao, BBS, 1996.

Model architecture [Figure: environment; perceptual input (viewed object, held object); action (manipulative, perceptual).]

Routine sequential action: task domain. Hierarchically structured; actions/subtasks may appear in multiple contexts; environmental cues alone sometimes insufficient to guide action selection; subtasks that may be executed in variable order; subtask disjunctions.

[Figure: tea and coffee task sequences from Start to End; step labels: grounds, cream, steep tea, drink.]

Representations [Figure labels: sugar-packet; manipulative actions; perceptual actions.]

[Figure: input and target/output patterns.]

Model behavior

[Figure: coffee and tea sequence variants from Start to End, with associated percentages: 15%, 18%, 12%, 10%, 20%, 25%.]

Slips of action (after Reason): occur at decision (or fork) points; sequence errors involve subtask omissions, repetitions, and lapses; lapses show effect of relative task frequency.

[Figure: model architecture: environment; perceptual input (viewed object, held object); action (manipulative, perceptual).]

Sample of behavior (subtasks: grounds, sugar (pack), drink; cream omitted): pick-up coffee-pack, pull-open coffee-pack, pour coffee-pack into cup, put-down coffee-pack, pick-up spoon, stir cup, put-down spoon, pick-up sugar-pack, tear-open sugar-pack, pour sugar-pack into cup, put-down sugar-pack, pick-up spoon, stir cup, put-down spoon, pick-up cup*, sip cup, say-done.

[Figure: percentage of trials error-free (0 to 100) by step in coffee sequence, across subtasks 1 to 4.]

[Figure: percentage of trials (0 to 80) by noise level (variance: 0.02, 0.1, 0.2, 0.3) for three error types: omissions / anticipations, repetitions / perseverations, intrusions / lapses.]

[Figure: odds of lapse into coffee-making (0 to 0.16) by tea : coffee ratio (5:1, 1:1, 1:5), shown alongside the tea and coffee task sequences from Start to End.]

Action disorganization syndrome (after Schwartz and colleagues): fragmentation of sequential structure (independent actions); specific error types; omission effect.

[Figure: model architecture: environment; perceptual input (viewed object, held object); action (manipulative, perceptual).]

Sample of behavior (annotations: disrupted subtask; subtask fragment; sugar repeated; cream omitted): pick-up coffee-pack, pull-open coffee-pack, put-down coffee-pack*, pick-up coffee-pack, pour coffee-pack into cup, put-down coffee-pack, pick-up spoon, stir cup, put-down spoon, pick-up sugar-pack, tear-open sugar-pack, pour sugar-pack into cup, put-down sugar-pack, pick-up cup*, put-down cup, pull-off sugarbowl lid*, put-down lid, pick-up spoon, scoop sugarbowl with spoon, put-down spoon*, pick-up cup*, sip cup, say-done.

Empirical data: Schwartz, et al., Neuropsychology, 1991. [Figure: proportion of independents (0 to 0.7) by noise (variance, 0 to 0.5).]

From: Schwartz, et al., Neuropsychology, 1998. [Figure: errors per opportunity (0 to 70) by noise (variance: 0.04, 0.1, 0.2, 0.3) for sequence errors and omission errors.]

Internal representations

[Figure: plot of internal representations for the steps grounds, cream, steep tea, and drink.]

Etiology of a slip [Figure: tea representation and coffee representation at the steep tea and drink steps; panels: coffee more frequent, tea more frequent.]

[Figure: layered architecture: Input, Peripheral (input), Intermediate (input), Apex, Intermediate (output), Peripheral (output), Output.]

Store-Ignore-Recall (SIR) task [Figure: inputs 9, 8, 4, 7, R; outputs “nine”, “eight”, “four”, “seven”, “eight”.]
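The target behavior in the SIR example can be written down as a small reference function. This is a sketch under assumptions: the transcript does not preserve which digit carried the store cue, so the trial labels below (8 marked as the stored item) are inferred from the fact that the recall response is “eight”; the function and variable names are hypothetical.

```python
DIGIT_NAMES = {
    0: "zero", 1: "one", 2: "two", 3: "three", 4: "four",
    5: "five", 6: "six", 7: "seven", 8: "eight", 9: "nine",
}

def sir_targets(trials):
    """Return the target spoken output for each trial.

    Each trial is (kind, digit) with kind in {"store", "ignore", "recall"}.
    On store and ignore trials the target is the name of the current digit;
    on recall trials it is the name of the most recently stored digit.
    """
    stored = None
    outputs = []
    for kind, digit in trials:
        if kind == "recall":
            if stored is None:
                raise ValueError("recall cue before any store trial")
            outputs.append(DIGIT_NAMES[stored])
        elif kind in ("store", "ignore"):
            if kind == "store":
                stored = digit  # must be held across the intervening trials
            outputs.append(DIGIT_NAMES[digit])
        else:
            raise ValueError(f"unknown trial kind: {kind}")
    return outputs

# Assumed labeling of the slide's sequence 9, 8, 4, 7, R.
trials = [("ignore", 9), ("store", 8), ("ignore", 4), ("ignore", 7), ("recall", None)]
targets = sir_targets(trials)
# targets == ["nine", "eight", "four", "seven", "eight"]
```

The point of the task is visible in the loop: the stored digit has to survive two intervening naming responses, which is the kind of temporal context the conclusions attribute to units far from the sensory and motor peripheries.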

[Figure: layered architecture: Input, Peripheral (input), Intermediate (input), Apex, Intermediate (output), Peripheral (output), Output.]

Conclusions Architectural hierarchy is not necessary for hierarchically structured behavior (or to understand action errors). Recurrent connectivity combined with graded, distributed representation is sufficient. Nonetheless, if architectural hierarchy is present, it can lead to a graded division of labor, according to which units furthest from sensory and motor peripheries specialize in coding information pertaining to temporal context. This may give us a way of explaining why the prefrontal cortex seems to be involved in routine sequential behavior.

2. Hierarchical reinforcement learning (Botvinick, Niv & Barto, Cognition, in press; Botvinick, TICS, 2008)

Reinforcement Learning: 1. States; 2. Actions; 3. Transition function; 4. Reward function. Policy?

Action strengths State values Prediction error
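The three quantities on this slide fit together in a standard actor-critic scheme with temporal-difference learning. The following is a minimal sketch, not the specific model from the talk: the one-state task, the learning rates, the discount factor, and all names are assumptions made for illustration.

```python
import math
import random

def softmax_choice(strengths, rng):
    """Sample an action with probability proportional to exp(action strength)."""
    actions = list(strengths)
    weights = [math.exp(strengths[a]) for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

def actor_critic_update(V, W, s, a, r, s_next, alpha=0.1, beta=0.1, gamma=0.9):
    """Apply one temporal-difference update; s_next=None marks a terminal step."""
    future = 0.0 if s_next is None else gamma * V[s_next]
    delta = r + future - V[s]   # prediction error
    V[s] += alpha * delta       # critic: state value
    W[s][a] += beta * delta     # actor: action strength
    return delta

# Hypothetical one-step task: "right" pays 1, "left" pays 0, then the episode ends.
REWARD = {"left": 0.0, "right": 1.0}
V = {"start": 0.0}
W = {"start": {"left": 0.0, "right": 0.0}}

rng = random.Random(0)
for _ in range(200):
    action = softmax_choice(W["start"], rng)
    actor_critic_update(V, W, "start", action, REWARD[action], None)
```

A positive prediction error strengthens the action just taken and raises the value of the state it was taken in; a negative one does the opposite. Over trials the strength of "right" grows above that of "left".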

Adapted from Sutton et al., AI, 1999

Hierarchical Reinforcement Learning. Options O: ⟨I, π, β⟩ (after Sutton, Precup & Singh, 1999). “Policy abstraction”. [Figure: network with inputs GREEN and RED, outputs “green” and “red”, and task units Color-naming and Word-reading; adapted from Cohen et al., Psych. Rev., 1990.]
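In the options framework of Sutton, Precup & Singh (1999), an option consists of an initiation set I, a policy π, and a termination condition β. A minimal sketch of that data structure follows; the corridor environment, the option name, and the helper functions are hypothetical and serve only to show how the three components are used.

```python
import random
from dataclasses import dataclass
from typing import Callable, FrozenSet

@dataclass(frozen=True)
class Option:
    name: str
    initiation_set: FrozenSet[int]        # I: states where the option may start
    policy: Callable[[int], int]          # pi: state -> primitive action
    termination: Callable[[int], float]   # beta: state -> probability of stopping

def run_option(option, state, step, rng, max_steps=100):
    """Follow the option's policy until its termination condition fires."""
    if state not in option.initiation_set:
        raise ValueError(f"{option.name} cannot be initiated in state {state}")
    trajectory = [state]
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        trajectory.append(state)
        if rng.random() < option.termination(state):
            break
    return trajectory

def corridor_step(state, action):
    """Hypothetical environment: positions 0..5 along a corridor."""
    return max(0, min(5, state + action))

go_to_door = Option(
    name="go-to-door",
    initiation_set=frozenset(range(0, 5)),
    policy=lambda s: +1,                          # always step toward position 5
    termination=lambda s: 1.0 if s == 5 else 0.0,
)

trajectory = run_option(go_to_door, 2, corridor_step, random.Random(0))
# trajectory == [2, 3, 4, 5]
```

Selecting an option commits the agent to the option's own policy for several steps, which is the sense of "policy abstraction" on the slide.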


From Humphreys & Forde, Cog. Neuropsych., 2001

cf. Luchins, Psychol. Monogr., 1942

The Option Discovery Problem. Genetic algorithms (Elfwing, 2003); frequently visited states (Pickett & Barto, 2002; Thrun & Schwartz, 1996); graph partitioning (Menache et al., 2002; Mannor et al., 2004; Simsek et al., 2005); intrinsic motivation (Simsek & Barto, 2005). Other possibilities: impasses (Soar); social transmission.


Extension 1: Support for representing option identifiers

White & Wise, Exp Br Res, 1999 (See also: Assad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999…)

Miller & Cohen, Ann. Rev. Neurosci, 2001

From Curtis & D’Esposito, TICS, 2003, after Funahashi et al., J. Neurophysiol., 1989.

Koechlin, Attn & Perf., 2008

Extension 2: Option-specific policies

O’Reilly & Frank, Neural Computation, 2006

Aldridge & Berridge, J Neurosci, 1998

Extension 3: Option-specific state values

Schoenbaum, et al. J Neurosci. 1999 See also: O’Doherty, Critchley, Deichmann, Dolan, 2003

Extension 4: Temporal scope of the prediction error

Schoenbaum, Roesch & Stalnaker, TICS, 2006

Roesch, Taylor & Schoenbaum, Neuron, 2006

Daw, NIPS, 2003

3. Goal-directed behavior Botvinick & An, submitted.

Niv, Joel & Dayan, TICS (2006) [Figure: decision tree annotated with transition function T and reward function R; outcome values 4, 0, 2, 3.]

Latent learning (Blodgett, 1929)

Detour behavior (Tolman & Honzik, 1930)

Devaluation (Niv, Joel & Dayan, TICS, 2006)
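Devaluation is diagnostic of goal-directed behavior because a change in outcome value alters choice immediately, without new experience of the actions themselves. A minimal sketch of this, using exhaustive lookahead over an explicit transition function T and reward function R: the outcome values 4, 0, 2, 3 come from the decision-tree slides, while the two-level tree shape, the assignment of values to leaves, the state names, and the devalued amount are assumptions.

```python
def plan(T, R, state="root"):
    """Return (best achievable value, action sequence) by exhaustive lookahead."""
    children = T.get(state)
    if not children:                      # leaf: collect the outcome value
        return R[state], []
    best_value, best_actions = None, None
    for action, next_state in children.items():
        value, rest = plan(T, R, next_state)
        if best_value is None or value > best_value:
            best_value, best_actions = value, [action] + rest
    return best_value, best_actions

# Transition function: state -> {action: next state} (assumed tree shape).
T = {
    "root": {"left": "L", "right": "R"},
    "L": {"left": "LL", "right": "LR"},
    "R": {"left": "RL", "right": "RR"},
}
# Reward function over outcomes (assumed left-to-right assignment of 4, 0, 2, 3).
R = {"LL": 4, "LR": 0, "RL": 2, "RR": 3}

value_before, actions_before = plan(T, R)

# Devalue the preferred outcome (hypothetical new value); T is unchanged.
R_devalued = dict(R, LL=1)
value_after, actions_after = plan(T, R_devalued)
```

Before devaluation the planner heads left-left for the outcome worth 4; afterwards it switches to right-right for the outcome worth 3, even though no action value was relearned. A habit system driven by cached action strengths would need further experience to make the same switch.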

White & Wise, Exp Br Res, 1999 (See also: Assad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999; Miller & Cohen, 2001…)

Miller & Cohen, Ann. Rev. Neurosci, 2001

Padoa-Schioppa & Assad, Nature, 2006

Gopnik, et al., Psych Rev, 2004

[Figure: graphical model linking reward function R, policy π, and transition function T.]

Redish data… Johnson & Redish, J. Neurosci., 2007

Botvinick & An, submitted

Cf. Tatman & Shachter, 1990

Cf. Verma & Rao, 2006

Policy query

Reward query

Policy query Reward query

[Figures: decision trees with outcome values (4, 0, 2, 3), (2, 0, 4, 1), (4, 0, 2, 3, with an added −2), and (+1 / 0, +2 / −3).]

Collaborators: James An, Andy Barto, Todd Braver, Deanna Barch, Jonathan Cohen, Andrew Ledvina, Joseph McGuire, David Plaut, Yael Niv
