Published by Samara Tomlin. Modified over 9 years ago.
1
Learning, Volatility and the ACC. Tim Behrens, FMRIB + Psychology, University of Oxford; FIL, UCL.
2
[Figure, panel B: reward history weight (β) against trials into the past (i-1 to i-8); series labelled CON.] Kennerley et al., Nature Neuroscience, 2006
3
[Figure, panel B: reward history weight (β) against trials into the past (i-1 to i-8); series labelled CON and ACCs.] Kennerley et al., Nature Neuroscience, 2006
4
Monkeys will sacrifice food opportunities to look at other monkeys. [Figure label: ACCg.] Rudebeck et al., Science, 2006
5
Interest in other individuals is reduced after ACC gyrus lesion. [Figure label: ACCg.] Rudebeck et al., Science, 2006
6
Anatomy - differences in connections between ACCs and ACCg. Connections unique to the sulcus are mainly with motor regions: primary motor cortex, premotor cortex, parietal motor areas, spinal cord. ACCs has information about our own actions.
7
Anatomy - differences in connections between ACCs and ACCg. Connections unique to the gyrus are mainly with regions that process emotional and biological stimuli: periaqueductal grey, hypothalamus, STS/STG. Insula/temporal pole connections are stronger to the gyrus. ACCg has access to information about other agents.
8
Anatomy - shared connections between ACCs and ACCg. Some shared connections: orbitofrontal cortex, amygdala, ventral striatum. ACCg and ACCs are strongly interconnected. Both regions have access to and influence over reward and value processing.
9
ACC Sulcus and learning about your actions.
10
[Figure, panel B: reward history weight (β) against trials into the past (i-1 to i-8); series labelled CON and ACCs.] Kennerley et al., Nature Neuroscience, 2006
11
What determines the integration length? [Figure: reward history weight (β) against trials into the past (i-1 to i-8); series labelled CON.] Kennerley et al., Nat Neurosci 2006; Sugrue et al., Science 2005
12
VOLATILE: reward probabilities change approximately every 25 trials. STABLE: reward probabilities change only after hundreds of trials. [Figure: reward history weight (β) against trials into the past (i-1 to i-8); series labelled CON.] Kennerley et al., Nat Neurosci 2006; Sugrue et al., Science 2005
13
Reinforcement learning. We need to continually re-appraise the value of an action based on each new experience. [Diagram: prediction ($V_t$), outcome x, new prediction ($V_{t+1}$).]
14
Updating beliefs on the basis of new information. $V_{t+1} = V_t + \alpha \times \delta$. The learning rate ($\alpha$) is the weight given to the current information. The prediction error ($\delta$) is the information available from this event.
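A minimal Python sketch of this update, for illustration only. It assumes the prediction error is the outcome minus the current prediction, and that outcomes are coded 1 (rewarded) or 0 (unrewarded). The function names, the starting value of 0.5 and the outcome sequence are hypothetical and do not come from the slides.

```python
def delta_rule_update(value, outcome, learning_rate):
    """One step of V_{t+1} = V_t + learning_rate * prediction_error."""
    # Assumption: prediction error = outcome - current prediction.
    prediction_error = outcome - value
    return value + learning_rate * prediction_error


def run_delta_rule(outcomes, learning_rate, initial_value=0.5):
    """Apply the update to each outcome in turn and return every prediction."""
    values = [initial_value]
    for outcome in outcomes:
        values.append(delta_rule_update(values[-1], outcome, learning_rate))
    return values


if __name__ == "__main__":
    # Hypothetical outcome sequence: 1 = rewarded, 0 = unrewarded.
    outcomes = [1, 1, 0, 1, 1]
    for alpha in (0.1, 0.4):
        print(alpha, [round(v, 3) for v in run_delta_rule(outcomes, alpha)])
```

With a larger learning rate, each prediction moves further towards the latest outcome, so the estimate reacts faster but is also noisier.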
15
The learning rate and the value of information. $V_{t+1} = V_t + \alpha \times \delta$. The learning rate should represent the value of the current information for guiding future beliefs.
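One way to see why the best learning rate depends on the environment is a small simulation. The sketch below is not the analysis used in the cited papers. It assumes reward probability flips between 0.2 and 0.8, uses block lengths of 25 trials (volatile) and 400 trials (stable), and scores each fixed learning rate by how closely a simple delta-rule estimate tracks the true probability. All names and parameter values are hypothetical.

```python
import random


def reward_probabilities(n_trials, block_length, p_low=0.2, p_high=0.8):
    """True reward probability per trial, flipping between two levels each block."""
    return [p_high if (t // block_length) % 2 == 0 else p_low
            for t in range(n_trials)]


def tracking_error(probabilities, learning_rate, rng, initial_value=0.5):
    """Mean squared gap between the delta-rule prediction and the true probability."""
    value = initial_value
    total = 0.0
    for p in probabilities:
        total += (value - p) ** 2
        outcome = 1.0 if rng.random() < p else 0.0
        value += learning_rate * (outcome - value)
    return total / len(probabilities)


def best_learning_rate(block_length, rates=(0.02, 0.05, 0.1, 0.2, 0.4),
                       n_trials=4000, seed=0):
    """Candidate learning rate with the lowest tracking error for this schedule."""
    probabilities = reward_probabilities(n_trials, block_length)
    # Re-seeding per rate gives every candidate the same outcome sequence.
    errors = {rate: tracking_error(probabilities, rate, random.Random(seed))
              for rate in rates}
    return min(errors, key=errors.get)


if __name__ == "__main__":
    print("volatile (25-trial blocks):", best_learning_rate(25))
    print("stable (400-trial blocks):", best_learning_rate(400))
```

Under these assumptions a higher fixed learning rate tracks the volatile schedule better, and a lower one tracks the stable schedule better, which is the sense in which new information is worth more when the environment changes quickly.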
16
Relationship with integration length. [Figure: curves for $\alpha = 0.01$, $\alpha = 0.1$ and $\alpha = 0.4$.]
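The link between learning rate and integration length can be made explicit by unrolling the update rule. If the prediction error is the outcome minus the prediction (an assumption here), the outcome j trials back carries weight learning_rate * (1 - learning_rate) ** (j - 1) in the current prediction. The sketch below computes these weights for the three values on the slide, taking the symbol lost in extraction to be the learning rate. The function names and the half-life summary are illustrative and not taken from the slides.

```python
import math


def history_weights(learning_rate, n_back=8):
    """Weights on the outcomes 1..n_back trials into the past (index 0 is i-1)."""
    return [learning_rate * (1.0 - learning_rate) ** (lag - 1)
            for lag in range(1, n_back + 1)]


def half_life(learning_rate):
    """Number of trials over which the weight on a past outcome halves."""
    return math.log(0.5) / math.log(1.0 - learning_rate)


if __name__ == "__main__":
    for alpha in (0.01, 0.1, 0.4):
        weights = [round(w, 3) for w in history_weights(alpha)]
        print(f"alpha={alpha}: half-life {half_life(alpha):.1f} trials, weights {weights}")
```

A small learning rate spreads weight almost evenly over many past trials (long integration), while a large one concentrates it on the most recent outcomes (short integration).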
17
[Figure: task schematic, stable condition; displayed numbers 37 and 63.] Behrens et al., Nature Neuroscience, 2007
18
$V_{t+1} = V_t + \alpha \times \delta$. Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007
19
Changes in reward estimates occur throughout the task, as do changes in volatility estimates. Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007
20
[Figure panels: Decide; Monitor × Volatility.] Behrens et al., Nature Neuroscience, 2007
21
ACC effect size predicts learning rate across subjects. Behrens, Woolrich, Walton & Rushworth, Nat Neurosci 2007
22
ACC Gyrus and learning about your social partners.
23
Interest in other individuals is reduced after ACC gyrus lesion. [Figure label: ACCg.] Rudebeck et al., Science, 2006
24
Rudebeck et al., Science, 2006
25
Learning about other agents. [Figure: task schematic; displayed numbers 37 and 63.] Behrens, Hunt, Woolrich, Rushworth, Nature, 2008
26
Sources of information: probability that confederate advice is good; probability that correct colour is blue. Value of action information; value of social information. Behrens, Hunt, Woolrich, Rushworth, Nature, 2008
27
Social information is integrated over time - behaviour
28
Reward prediction error: reward - expectation. $V_{t+1} = V_t + \alpha \times \delta$. [Figure: effect size over time, marked at outcome.] Behrens, Hunt, Woolrich, Rushworth, Nature, 2008
29
Prediction error on a social partner: lie event - lie prediction. $V_{t+1} = V_t + \alpha \times \delta$. [Figure: effect size over time, marked at outcome.] Behrens, Hunt, Woolrich, Rushworth, Nature, 2008
30
The value of information and the ACC. Value of reward information; value of social information. $V_{t+1} = V_t + \alpha \times \delta$
31
Combining information to drive behaviour. $V_{t+1} = V_t + \alpha \times \delta$
32
Conclusions. ACC codes a learning signal when information is observed. This signal predicts the speed of learning. Learning from our own actions and learning from others' actions are processed in parallel in ACCs and ACCg. The outputs of these parallel learning processes are combined in the reward system.
33
Acknowledgments: Matthew Rushworth, Mark Woolrich, Laurence Hunt, Mark Walton