Published by Rosaline Carpenter. Modified over 8 years ago.
1
Neural correlates of risk sensitivity: An fMRI study of instrumental choice behavior
Yael Niv, Jeffrey A. Edlund, Peter Dayan, and John O’Doherty
Cohen lab meeting, Feb 2007
2
Background and aims
Question: Are learned values adjusted to reflect risk?
Do you prefer to get $10, or should we toss a coin for $20 or nothing?
Human decision making is sensitive not only to the expected reward, but also to the variance, or risk, of the outcome.
Personal preferences vary: risk avoidance vs. risk seeking
Models of choice between different outcomes:
- Expected utility models: risk affects subjective utility. For instance: “utility curve”
  [Figure: utility curve; x-axis: $ (10, 20, 30); y-axis: subjective utility]
- Reinforcement learning models: online learning of value = expected future reward; learned values indifferent to risk
  Neuroscientific support:
  1. Dopamine firing
  2. Imaging correlates of TD prediction error learning
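The $10-versus-coin-toss question can be worked through numerically. The sketch below assumes a square-root utility function, which is an illustrative choice and not taken from the slides; the function names are hypothetical as well. Both options have the same expected reward ($10), yet a concave utility curve assigns the gamble a lower expected utility.

```python
import math

def utility(amount):
    """Hypothetical concave utility curve (square root); an assumption for illustration."""
    return math.sqrt(amount)

def expected_utility(outcomes):
    """outcomes: list of (probability, amount) pairs."""
    return sum(p * utility(x) for p, x in outcomes)

sure_thing = [(1.0, 10)]           # $10 for certain
coin_toss = [(0.5, 20), (0.5, 0)]  # $20 or nothing

eu_sure = expected_utility(sure_thing)
eu_toss = expected_utility(coin_toss)

print(f"sure $10: {eu_sure:.3f}, coin toss: {eu_toss:.3f}")
```

Under this assumed curve the certain option wins, which corresponds to risk avoidance; a convex curve would reverse the preference.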
3
Experimental design
[Figure: trial timeline; labels: < 1 sec, 0.5 sec, 5 sec ISI, "You won 40 cents", 2-5 sec ITI]
5 stimuli:
- CS20 → 20 cents
- CS0/40 → 0 or 40 cents (p=0.5)
- CS40 → 40 cents
- CS0 → 0 cents
Randomly ordered trials; counterbalanced; 234 trials: 130 choice, 104 single stimulus
19 subjects, 3T scanner, TR = 2 sec
Choice trials: behavioral risk sensitivity
Single trials: neural values of stimuli
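The design hinges on CS20 and CS0/40 having the same mean payoff but different variance. A short sketch of that arithmetic follows, using only the four stimulus-outcome mappings listed on the slide; the dictionary layout and function name are hypothetical.

```python
# Outcome distributions (in cents) as listed on the slide: (probability, reward) pairs.
STIMULI = {
    "CS0": [(1.0, 0)],
    "CS20": [(1.0, 20)],
    "CS0/40": [(0.5, 0), (0.5, 40)],
    "CS40": [(1.0, 40)],
}

def mean_and_variance(outcomes):
    """Expected reward and variance (risk) of a discrete outcome distribution."""
    mean = sum(p * r for p, r in outcomes)
    var = sum(p * (r - mean) ** 2 for p, r in outcomes)
    return mean, var

summary = {name: mean_and_variance(o) for name, o in STIMULI.items()}
for name, (m, v) in summary.items():
    print(f"{name}: mean={m:.0f} cents, variance={v:.0f}")
```

Only CS0/40 has non-zero variance, so a choice between CS20 and CS0/40 isolates risk preference from expected reward.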
4
Behavioral results
1. Subjects learned the task
2. Subjects showed risk sensitivity
[Figure labels: reaction time, first half vs. second half; score (points) per subject #; proportion correct on different-mean choices, over blocks of trials; proportion choice of certain option in 20 vs 0/40 choice, over blocks of 10 choices]
5
Why are subjects risk sensitive?
At least two possible reasons:
1. Learning according to risk-neutral TD learning
   δ(t) = r(t) + V(t+1) – V(t)
   V(t) ← V(t) + ηδ(t)
   - Choices can be risk sensitive due to online learning
   - Without choice: no bias in means
   - Interaction between learning and choices in stochastic task
   [Figure labels: A, B; choice; reward; value A; value B]
2. Different utilities for risky (CS0/40) and non-risky (CS20) options, despite similar mean value
   - Can be implemented in “risk sensitive TD learning” of values
   δ(t) = r(t) + V(t+1) – V(t)
   V(t) ← V(t) + ηδ(t)(1±κ), i.e. a factor of (1–κ) when δ(t) > 0 and (1+κ) when δ(t) < 0
   - Positive κ: risk averse (learned mean < real mean)
   - Negative κ: risk seeking (learned mean > real mean)
   [Figure labels: "You won 40 points"; "You won 0 points"; 20; κ=0; κ>0; κ<0]
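The risk sensitive update can be simulated on single-stimulus trials, where sampling is unbiased. This is a minimal sketch, not the authors' code: it assumes each trial ends at the outcome (so V(t+1) = 0 and δ = r – V), reads (1±κ) as (1–κ) for positive errors and (1+κ) for negative errors, and uses hypothetical values for η, κ, and the trial count. Under these assumptions the fixed point for the 50/50 gamble works out to 20(1–κ).

```python
import random

def learn_value(rewards, eta=0.05, kappa=0.0):
    """Apply the risk sensitive TD update to one stimulus's reward sequence.

    Each trial ends at the outcome, so V(t+1) = 0 and delta = r - V.
    Returns the learned value averaged over the second half of the trials.
    """
    v = 0.0
    history = []
    for r in rewards:
        delta = r - v
        # Assumed reading of (1 ± kappa): positive errors are scaled by
        # (1 - kappa), negative errors by (1 + kappa).
        scale = (1.0 - kappa) if delta > 0 else (1.0 + kappa)
        v += eta * scale * delta
        history.append(v)
    tail = history[len(history) // 2:]
    return sum(tail) / len(tail)

rng = random.Random(0)  # fixed seed so the run is repeatable
n_trials = 20000        # hypothetical; far more than the experiment used
risky_rewards = [rng.choice([0, 40]) for _ in range(n_trials)]  # CS0/40
sure_rewards = [20] * n_trials                                   # CS20

learned = {
    kappa: (learn_value(sure_rewards, kappa=kappa),
            learn_value(risky_rewards, kappa=kappa))
    for kappa in (-0.3, 0.0, 0.3)
}
for kappa, (v_sure, v_risky) in learned.items():
    print(f"kappa={kappa:+.1f}: V(CS20)={v_sure:.1f}, V(CS0/40)={v_risky:.1f}")
```

With κ = 0 the two stimuli end up with about the same value; with κ ≠ 0 only the risky stimulus shifts, because the certain stimulus generates no prediction errors once learned.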
6
Comparing models: Behavioral fit
- Both models provide similarly good explanations of the behavioral choices
- Risk neutral TD can explain risk-sensitive behavior
- Risk adjustment of temporal difference learning can explain risk aversion (fitted κ related to actual preference)
- The value of κ predicts a difference between the learned values of CS20 and CS0/40
[Figure labels: prediction probability per choice trial, by subject (ordered by performance); proportion choice of certain option vs. κ (risk adjustment); CS20 – CS0/40 value vs. κ (risk adjustment); r² = 0.83]
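One way a "prediction probability per choice trial" could be computed is sketched below. The slides do not state the choice rule, so the softmax rule, the inverse temperature `beta`, the geometric-mean summary, and all values and choices in the example are assumptions made for illustration. A real fit would also update the values trial by trial instead of holding them fixed.

```python
import math

def choice_probability(v_chosen, v_other, beta):
    """Softmax (logistic) probability of picking the chosen option.

    The softmax rule and inverse temperature `beta` are assumptions.
    """
    return 1.0 / (1.0 + math.exp(-beta * (v_chosen - v_other)))

def mean_prediction_probability(choices, values, beta):
    """Geometric-mean probability the model assigns to the observed choices.

    choices: list of (chosen_stimulus, unchosen_stimulus) name pairs.
    """
    log_lik = sum(
        math.log(choice_probability(values[c], values[u], beta))
        for c, u in choices
    )
    return math.exp(log_lik / len(choices))

# Hypothetical learned values (cents) and a made-up choice sequence.
values = {"CS0": 0.0, "CS20": 20.0, "CS0/40": 17.0, "CS40": 40.0}
choices = [("CS40", "CS20"), ("CS20", "CS0/40"),
           ("CS20", "CS0"), ("CS0/40", "CS20")]

fit = mean_prediction_probability(choices, values, beta=0.1)
chance = mean_prediction_probability(choices, values, beta=0.0)
print(f"per-trial prediction probability: {fit:.3f} (chance: {chance:.3f})")
```

Setting `beta` to 0 makes every choice a coin flip, which gives the chance baseline that a fitted model should exceed.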
7
Neural correlates of stimulus value: NAC
- Bilateral nucleus accumbens (ROI) correlated with TD error regressor of both models
- Time courses extracted from peak voxel in 8 mm sphere around group peak, L: (-12,3,-15); R: (9,3,-15)
- Activations consistent with TD error signal
[Figure: slice at Y=+6 (R marked); thresholds p<0.001 and p<0.0001; time courses with x-axes "seconds from stimulus onset" and "seconds from CS0/40 onset"]
8
The critical question: CS20 vs CS0/40 value
Qualitatively different prediction of the models:
- Risk sensitive TD model: CS20 value ≠ CS0/40 value even when sampled without bias
- Risk neutral TD model: CS20 value = CS0/40 value when sampled without bias
-> Compare single CS averaged values of CS20 and CS0/40
Result: No evidence for correlation between value differences and risk preference (or for risk adjusted values)
[Figure labels: seconds from stimulus onset; risk averse (> 0.7) subjects; risk prone (< 0.7) subjects; CS20 – CS0/40 value vs. proportion choice of certain option]
9
Neural correlates: other TD error areas
Temporal difference error predictor also correlated with (at p < 0.001, uncorrected):
- L mPFC and extending to mOFC
- L caudate (especially on choice trials)
- Bilateral hippocampus
- R anterior cingulate
- L temporal cortex
TD error in single CS trials minus choice trials: strong bilateral activation in anterior insula and mPFC (also same insula region in comparison between risky and non-risky trials)
[Figure: slices at Y=+20 (R), Z=-6 (R), and X=-12 (L); thresholds p<0.001 and p<0.0001]
10
Neural correlates: Risk vs. No Risk
Comparison of trials which included CS0/40 (trials involving risk) to trials which only involved constant rewarding options: bilateral activation in anterior insula
[Figure: slice at Z=-4 (R); threshold p<0.005]
11
Discussion and future directions
- Simple instrumental task confirms TD value learning in the brain
- Reaction time data illustrate Pavlovian-instrumental interactions, and provide a window to difficulty of decision making
- No evidence (yet?) for risk adjustment in TD based value learning
  - Noise noise noise noise noise…
  - But: TD learning alone can explain much of data
- Other interesting comparisons:
  - Trials involving risk and those that don’t (anterior insula)
  - Choice versus single CS trials
  - TD error in choice versus single CS trials: another way to look at the dorsal and ventral divide in the striatum
- Relationship to neuroeconomic issues:
  - What is the basis of risk seeking?
  - Are there other non-TD mechanisms that represent value?
  - What is the role of the risk related signal in the insula?
  - What is the relationship to a conflict signal in the anterior cingulate cortex?
12
References
- Preuschoff, Bossaerts & Quartz (2006). Neural differentiation of expected reward and risk in human subcortical structures. Neuron.
- Morris, Nevet, et al. (2006). Midbrain dopamine neurons encode decisions for future action. Nature Neuroscience.
- Kuhnen & Knutson (2005). The neural basis of financial risk taking. Neuron.
- Niv, Duff & Dayan (2005). Dopamine, uncertainty and TD learning. Behavioral and Brain Functions.
- O’Doherty, Dayan et al. (2004). Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science.
- Seymour, O’Doherty, et al. (2004). Temporal difference models describe higher order learning in humans. Nature.
- Fiorillo, Tobler & Schultz (2003). Discrete coding of reward probability and uncertainty by dopamine neurons. Science.
- Mihatsch & Neuneier (2002). Risk sensitive reinforcement learning. Machine Learning.
- Niv, Joel et al. (2002). Evolution of reinforcement learning in uncertain environments. Adaptive Behavior.