1
Computational models of cognitive control (II) Matthew Botvinick Princeton Neuroscience Institute and Department of Psychology, Princeton University
2
Banishing the homunculus
3
Decision-making in control:
4
Banishing the homunculus Decision-making in control: Not only, “How does control shape decision-making?”
5
Banishing the homunculus Decision-making in control: Not only, “How does control shape decision-making?” But also, “How are ‘control states’ selected?”
6
Banishing the homunculus Decision-making in control: Not only, “How does control shape decision-making?” But also, “How are ‘control states’ selected?” And, “How are they updated over time?”
8
1. Routine sequential action Botvinick & Plaut, Psychological Review, 2004 Botvinick, Proceedings of the Royal Society, B, 2007. Botvinick, TICS, 2008
9
‘Routine sequential action’ Action on familiar objects Well-defined sequential structure Concrete goals Highly routine Everyday tasks
10
Computational models of cognitive control (II) Matthew Botvinick Princeton Neuroscience Institute and Department of Psychology, Princeton University ?!
11
Hierarchical structure
[Task hierarchy: MAKE INSTANT COFFEE → ADD GROUNDS, ADD CREAM, ADD SUGAR; ADD SUGAR → ADD SUGAR FROM SUGAR PACK, ADD SUGAR FROM SUGAR BOWL; primitive actions: PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP]
12
Hierarchical models of action
[Hierarchy diagram: MAKE INSTANT COFFEE → ADD GROUNDS, ADD CREAM, ADD SUGAR; ADD SUGAR → ADD SUGAR FROM SUGAR BOWL / PACKET; primitives: PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP]
Hierarchical structure of task built directly into architecture (e.g., Cooper & Shallice, 2000; Estes, 1972; Houghton, 1990; MacKay, 1987; Rumelhart & Norman, 1982)
Schemas as primitive elements
13
An alternative approach
[Diagram: patterns p_t, a_t, s_t linked across successive time steps t, t+1, t+2]
14
[Diagram, as before: p_t, a_t, s_t across time steps t, t+1, t+2]
p, s, a = patterns of activation over simple processing units
Weighted, excitatory/inhibitory connections
Weights adjusted through gradient-descent learning in target task domains
15
Recurrent neural networks Feedback as well as feedforward connections Allow preservation of information over time Demonstrated capacity to learn sequential behaviors (e.g., Cleermans, 1993; Elman, 1990)
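For concreteness, here is a minimal sketch of the kind of simple recurrent (Elman-style) network these slides describe, written in Python/NumPy. The layer sizes, weight scales, and variable names are illustrative assumptions, not the published model's parameters, and gradient-descent training is omitted.

```python
# Minimal Elman-style recurrent network step (illustrative sizes only).
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HID, N_OUT = 30, 50, 20              # perceptual input, hidden, action units

W_ih = rng.normal(0, 0.1, (N_HID, N_IN))     # input -> hidden weights
W_hh = rng.normal(0, 0.1, (N_HID, N_HID))    # recurrent hidden -> hidden weights
W_ho = rng.normal(0, 0.1, (N_OUT, N_HID))    # hidden -> output (action) weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def step(x_t, h_prev):
    """One time step: the hidden state depends on the current input and the prior hidden state."""
    h_t = sigmoid(W_ih @ x_t + W_hh @ h_prev)   # internal, context-carrying state
    a_t = sigmoid(W_ho @ h_t)                   # graded action activations
    return h_t, a_t

h = np.zeros(N_HID)
for x in np.eye(N_IN)[:5]:       # feed a short sequence of one-hot "percepts"
    h, a = step(x, h)
    print(a.argmax())            # most active action unit at each step
```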
16
The model
[Diagram: environment → perceptual input → internal representation → action → environment]
17
Fixate(Blue), Fixate(Green), Fixate(Top), PickUp, Fixate(Table), PutDown, Fixate(Green), PickUp
Ballard, Hayhoe, Pook & Rao (1996), BBS.
18
Model architecture
[Diagram: environment → perceptual input (viewed object, held object) → action (manipulative, perceptual) → environment]
19
Routine sequential action: Task domain Hierarchically structured Actions/subtasks may appear in multiple contexts Environmental cues alone sometimes insufficient to guide action selection Subtasks that may be executed in variable order Subtask disjunctions
20
[Task structure diagram: Start → End, with branches for steep tea → drink and grounds → cream → drink]
21
Representations
[Example representations: sugar-packet (object); manipulative actions; perceptual actions]
22–28
[Animated sequence of training examples: input patterns paired with target/output patterns]
29
Model behavior
30
[Diagrams of alternative subtask orderings, with observed frequencies of 10%–25%: coffee (grounds, cream, drink) and tea (steep tea, drink), each running Start → End]
31
Slips of action (after Reason) Occur at decision (or fork) points Sequence errors involve subtask omissions, repetitions, and lapses Lapses show effect of relative task frequency
32
[Model architecture diagram, as before: environment, perceptual input (viewed object, held object), manipulative and perceptual actions]
33
Sample of behavior:
pick-up coffee-pack
pull-open coffee-pack
pour coffee-pack into cup
put-down coffee-pack
pick-up spoon
stir cup
put-down spoon
pick-up sugar-pack
tear-open sugar-pack
pour sugar-pack into cup
put-down sugar-pack
pick-up spoon
stir cup
put-down spoon
pick-up cup*
sip cup
say-done
[Subtask annotations: grounds; sugar (pack); drink; cream omitted]
34
[Plot: percentage of trials error-free (0–100) at each step of the coffee sequence, grouped into subtasks 1–4]
35
[Plot: percentage of trials (0–80) vs. noise level (variance 0.02–0.3) for omissions/anticipations, repetitions/perseverations, and intrusions/lapses]
36
[Plot: odds of lapse into coffee-making (0–0.16) as a function of the tea:coffee training ratio (5:1, 1:1, 1:5); inset task diagram: steep tea → drink vs. grounds → cream → drink, Start → End]
37
Action disorganization syndrome (after Schwartz and colleagues) Fragmentation of sequential structure (independent actions) Specific error types Omission effect
38
[Model architecture diagram, as before: environment, perceptual input (viewed object, held object), manipulative and perceptual actions]
39
Sample of behavior:
pick-up coffee-pack
pull-open coffee-pack
put-down coffee-pack*
pick-up coffee-pack
pour coffee-pack into cup
put-down coffee-pack
pick-up spoon
stir cup
put-down spoon
pick-up sugar-pack
tear-open sugar-pack
pour sugar-pack into cup
put-down sugar-pack
pick-up cup*
put-down cup
pull-off sugarbowl lid*
put-down lid
pick-up spoon
scoop sugarbowl with spoon
put-down spoon*
pick-up cup*
sip cup
say-done
[Error annotations: sugar repeated; cream omitted; disrupted subtask; subtask fragment]
41
Empirical data: Schwartz, et al. Neuropsychology, 1991
[Plot: proportion of independents (0–0.7) as a function of noise (variance 0–0.5)]
42
From: Schwartz, et al. Neuropsychology, 1998.
[Plot: sequence errors and omission errors (per opportunity, 0–70) as a function of noise (variance 0.04–0.3)]
43
Internal representations
44–48
[Animated scatter plots of the model's internal representations]
49–50
[Internal-representation plots labeled by sequence step: grounds, cream, drink (coffee) and steep tea, drink (tea)]
51
Etiology of a slip
[Representation-space plot labeled: steep tea, drink]
52
Tea representation Coffee representation
53
tea rep’n coffee rep’n
55
[Representation plots: “Coffee more frequent” (coffee vs. tea) and “Tea more frequent” (tea vs. coffee)]
59
[Layered network architecture: Input → Peripheral (input) → Intermediate (input) → Apex → Intermediate (output) → Peripheral (output) → Output]
60
Store-Ignore-Recall (SIR) task
Inputs: 9, 8, 4, 7, R; responses: “nine”, “eight”, “four”, “seven”, “eight”
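To make the task structure concrete, here is a hypothetical trial generator for an SIR-style task: every digit is named when presented, and the recall cue requires reproducing the most recently stored digit. The trial statistics and function names are illustrative assumptions, not the simulation's actual design.

```python
# Hypothetical Store-Ignore-Recall (SIR) trial generator (illustrative format).
import random

def sir_trial(length=5, p_store=0.5, rng=random.Random(0)):
    """Generate one SIR trial as (inputs, targets)."""
    stored = None
    inputs, targets = [], []
    for i in range(length):
        digit = rng.randint(0, 9)
        # Guarantee at least one store so the recall cue has a target.
        cue = "store" if (i == 0 or rng.random() < p_store) else "ignore"
        if cue == "store":
            stored = digit
        inputs.append((cue, digit))
        targets.append(digit)          # every digit is named when presented
    inputs.append(("recall", None))
    targets.append(stored)             # recall the most recently stored digit
    return inputs, targets

ins, outs = sir_trial()
print(ins)
print(outs)
```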
61
[Layered network architecture diagram, as above]
63
[Layered network architecture diagram, as above]
64
Conclusions Architectural hierarchy is not necessary for hierarchically structured behavior (or to understand action errors). Recurrent connectivity combined with graded, distributed representation is sufficient. Nonetheless, if architectural hierarchy is present, it can lead to a graded division of labor, according to which units furthest from sensory and motor peripheries specialize in coding information pertaining to temporal context. This may give us a way of explaining why the prefrontal cortex seems to be involved in routine sequential behavior.
65
2. Hierarchical reinforcement learning Botvinick, Niv & Barto, Cognition, in press. Botvinick, TICS, 2008
66
Reinforcement Learning
1. States
2. Actions
3. Transition function
4. Reward function
Policy?
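Spelling these ingredients out in standard notation (this is the textbook MDP formalism, not anything specific to the talk), the "Policy?" on the slide is the object being asked for:

```latex
\begin{aligned}
&\text{states } s \in \mathcal{S}, \qquad \text{actions } a \in \mathcal{A},\\
&\text{transition function: } T(s' \mid s,a) = P(s_{t+1} = s' \mid s_t = s,\, a_t = a),\\
&\text{reward function: } R(s,a),\\
&\text{policy: } \pi(a \mid s) = P(a_t = a \mid s_t = s),\\
&\text{chosen to maximize } \mathbb{E}_\pi\Big[\textstyle\sum_{t \ge 0} \gamma^{t} r_{t+1}\Big], \quad 0 \le \gamma < 1.
\end{aligned}
```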
67
[Diagram: action strengths, state values, prediction error]
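The three labels correspond to the standard actor-critic arrangement: a critic maintaining state values, an actor maintaining action strengths, and a shared temporal-difference prediction error driving both. A tabular sketch, with all sizes and learning rates as illustrative assumptions:

```python
# Tabular actor-critic sketch matching the three boxes on the slide.
import numpy as np

n_states, n_actions = 5, 2
V = np.zeros(n_states)                   # state values (critic)
H = np.zeros((n_states, n_actions))      # action strengths (actor)
alpha_v, alpha_h, gamma = 0.1, 0.1, 0.95

def softmax(h):
    e = np.exp(h - h.max())
    return e / e.sum()

def choose_action(s, rng=np.random.default_rng(0)):
    """Sample an action in proportion to the softmax of its strength."""
    return rng.choice(n_actions, p=softmax(H[s]))

def update(s, a, r, s_next, done):
    """One learning step: both actor and critic are driven by the prediction error."""
    delta = r + (0.0 if done else gamma * V[s_next]) - V[s]   # prediction error
    V[s] += alpha_v * delta                                    # update state value
    H[s, a] += alpha_h * delta                                 # update action strength
    return delta
```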
69
Adapted from Sutton et al., AI, 1999
73
Hierarchical Reinforcement Learning (after Sutton, Precup & Singh, 1999): an option O = ⟨I, π, β⟩ (“policy abstraction”)
[Diagram adapted from Cohen et al., Psych. Rev., 1990: Stroop model with GREEN/RED inputs, “green”/“red” outputs, and color-naming vs. word-reading pathways]
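Read as a data structure, the option O = ⟨I, π, β⟩ above bundles an initiation set, an option-specific policy, and a termination condition. A minimal sketch; the field names and the example values are illustrative, not drawn from the talk:

```python
# An option in the Sutton, Precup & Singh (1999) sense, as a data structure.
from dataclasses import dataclass
from typing import Callable, Dict, Set

@dataclass
class Option:
    initiation_set: Set[int]                 # I: states where the option may start
    policy: Dict[int, int]                   # pi: state -> primitive action
    termination: Callable[[int], float]      # beta(s): probability of terminating in s

# Example: an "add sugar"-style option that can only begin in state 0 or 1
# and always terminates once state 3 is reached (hypothetical numbering).
add_sugar = Option(
    initiation_set={0, 1},
    policy={0: 2, 1: 0, 2: 1},
    termination=lambda s: 1.0 if s == 3 else 0.0,
)
```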
74
[Diagram with multiple options O]
76
From Humphreys & Forde, Cog. Neuropsych., 2001
84
cf. Luchins, Psychol. Monogr., 1942
87
The Option Discovery Problem: Genetic algorithms (Elfwing, 2003); Frequently visited states (Pickett & Barto, 2002; Thrun & Schwartz, 1996); Graph partitioning (Menache et al., 2002; Mannor et al., 2004; Simsek et al., 2005); Intrinsic motivation (Simsek & Barto, 2005); Other possibilities: impasses (Soar), social transmission
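As one concrete illustration of the flavor of these proposals, a toy version of the "frequently visited states" idea might rank states by how often they recur across successful trajectories and nominate the top-ranked ones as subgoals for new options. This is a caricature for intuition only, not an implementation of any of the cited algorithms:

```python
# Caricature of the "frequently visited states" heuristic for option discovery.
from collections import Counter

def propose_subgoals(successful_trajectories, top_k=3):
    counts = Counter()
    for traj in successful_trajectories:
        counts.update(set(traj))             # count each state once per trajectory
    # Exclude start and goal states, which trivially appear in every trajectory.
    starts = {t[0] for t in successful_trajectories}
    goals = {t[-1] for t in successful_trajectories}
    candidates = {s: c for s, c in counts.items() if s not in starts | goals}
    return sorted(candidates, key=candidates.get, reverse=True)[:top_k]

trajs = [[0, 1, 4, 5, 9], [0, 2, 4, 6, 9], [0, 3, 4, 7, 9]]
print(propose_subgoals(trajs))   # state 4, the shared "bottleneck", ranks first
```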
90
[Diagram with loci numbered 1–4, corresponding to the four extensions that follow]
91
Extension 1: Support for representing option identifiers
93
White & Wise, Exp Br Res, 1999 (See also: Asaad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999…)
94
Miller & Cohen, Ann. Rev. Neurosci, 2001
96
From Curtis & D’Esposito, TICS, 2003, after Funahashi et al., J. Neurophysiol., 1989.
97
Koechlin, Attn & Perf., 2008
98
Extension 2: Option-specific policies
101
O’Reilly & Frank, Neural Computation, 2006
102
Aldridge & Berridge, J Neurosci, 1998
103
Extension 3: Option-specific state values
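In the HRL framework the slides build on, option-specific values are typically defined with respect to the option's own pseudo-reward, delivered when the option's subgoal is reached. Roughly, in standard HRL notation (stated here as an assumption about the intended construction):

```latex
V_{o}(s) \;=\; \mathbb{E}\!\left[\,\sum_{t \ge 0} \gamma^{t}\, \tilde{r}_{t+1} \;\middle|\; s_0 = s,\ \text{actions follow } \pi_o \right],
\qquad \tilde{r} = \text{pseudo-reward tied to } o\text{'s subgoal.}
```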
106
Schoenbaum, et al. J Neurosci. 1999 See also: O’Doherty, Critchley, Deichmann, Dolan, 2003
107
Extension 4: Temporal scope of the prediction error
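The "temporal scope" point can be stated with the standard SMDP-style prediction error, which spans the whole duration of an option rather than a single time step (after Sutton, Precup & Singh, 1999), for an option o initiated in state s_t and terminating k steps later in s_{t+k}:

```latex
\delta \;=\; \underbrace{r_{t+1} + \gamma\, r_{t+2} + \cdots + \gamma^{\,k-1} r_{t+k}}_{\text{reward accrued while the option runs}}
\;+\; \gamma^{\,k} \max_{o'} Q(s_{t+k}, o') \;-\; Q(s_t, o)
```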
109
Schoenbaum, Roesch & Stalnaker, TICS, 2006
111
Roesch, Taylor & Schoenbaum, Neuron, 2006
113
Daw, NIPS, 2003
114
3. Goal-directed behavior Botvinick & An, submitted.
116
Niv, Joel & Dayan, TICS (2006)
[Diagram: transition function T and reward function R]
117–121
[Animated diagrams: evaluation via the transition function T and reward function R over a small state space with rewards 4, 0, 2, 3]
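A minimal sketch of what the T/R diagrams are getting at: in goal-directed (model-based) evaluation, values are computed at choice time from the transition function T and reward function R, so changing R, as in outcome devaluation, changes choice immediately and without new learning. The state names and numbers below are made up, loosely echoing the rewards shown on the slides:

```python
# Toy model-based evaluation: values are derived from T and R at decision time.
T = {                      # T[state][action] -> next state (deterministic here)
    "start": {"left": "A", "right": "B"},
}
R = {"A": 4, "B": 2}       # reward function over outcome states

def evaluate(state, R):
    """Return the best action and all action values by looking ahead through T."""
    values = {a: R[s_next] for a, s_next in T[state].items()}
    return max(values, key=values.get), values

print(evaluate("start", R))   # ('left', {'left': 4, 'right': 2})
R["A"] = 0                    # devalue outcome A
print(evaluate("start", R))   # preference flips to 'right' without relearning
```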
122
Blodgett, 1929 Latent learning
124
Tolman & Honzik, 1930 Detour behavior
127
Niv, Joel & Dayan, TICS (2006) Devaluation
129
White & Wise, Exp Br Res, 1999 (See also: Asaad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999; Miller & Cohen, 2001…)
130
Miller & Cohen, Ann. Rev. Neurosci, 2001
131
Padoa-Schioppa & Assad, Nature, 2006
133
Gopnik, et al., Psych Rev, 2004
134
[Diagram: reward function R and transition function T]
145
Redish data… Johnson & Redish, J. Neurosci., 2007
154
Botvinick & An, submitted
155
Cf. Tatman & Shachter, 1990
156
Cf. Verma & Rao, 2006
162
Policy query
164
Reward query
165
Policy query Reward query
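To unpack the two query types: in a planning-as-inference treatment (cf. Tatman & Shachter, 1990; Verma & Rao, 2006), a reward query predicts reward given a candidate action or policy, while a policy query conditions on obtaining reward and infers the actions. A toy enumeration-based sketch of the idea, not the Botvinick & An model itself, with made-up numbers:

```python
# Toy illustration of "reward query" vs. "policy query" as probabilistic inference.
P_outcome = {                     # P(outcome | action), hypothetical numbers
    "left":  {"win": 0.8, "lose": 0.2},
    "right": {"win": 0.3, "lose": 0.7},
}
P_action_prior = {"left": 0.5, "right": 0.5}

def reward_query(action):
    """Reward query: probability of the rewarding outcome given an action."""
    return P_outcome[action]["win"]

def policy_query():
    """Policy query: P(action | outcome = win), i.e. condition on success."""
    joint = {a: P_action_prior[a] * P_outcome[a]["win"] for a in P_action_prior}
    z = sum(joint.values())
    return {a: p / z for a, p in joint.items()}

print(reward_query("left"))   # 0.8
print(policy_query())         # posterior favors 'left'
```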
169–179
[Animated decision diagrams illustrating policy and reward queries, with example reward values such as 4, 0, 2, 3; −2; +1/0; +2/−3]
184
Collaborators James An Andy Barto Todd Braver Deanna Barch Jonathan Cohen Andrew Ledvina Joseph McGuire David Plaut Yael Niv