Manipulating the teaching signal: effects of dopamine-related drugs on human learning systems Wellcome Trust Centre for NeuroImaging University College.

Presentation transcript:

Manipulating the teaching signal: effects of dopamine-related drugs on human learning systems
Mathias Pessiglione
Wellcome Trust Centre for NeuroImaging, University College London

How can dopamine improve our decisions?
What we may learn from monkeys (Schultz 1997): δ > 0, δ = 0, δ < 0
What we may learn from engineers (Barto 1995)
[Diagram labels: reward prediction, GPi / SNr, cortex, δ]

Testing and modelling behavioural choices
Trial sequence: fixation (+), cues, choice, outcome (‘gain’)
Learning rule: δ(t) = R(t) - Va(t); Va(t+1) = Va(t) + α*δ(t)
Hidden values: [Va(t) Vb(t)]
Behavioural choice: Pa(t) = 1/(1+exp((Va(t)-Vb(t))/β))
Rewarding cues: a = 0.8/0.2 * ‘gain’ £1/£0; b = 0.2/0.8 * ‘gain’ £1/£0
Neutral cues: a = 0.8/0.2 * ‘look’ £1/£0; b = 0.2/0.8 * ‘look’ £1/£0
Punishing cues: a = 0.8/0.2 * ‘loss’ £1/£0; b = 0.2/0.8 * ‘loss’ £1/£0
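
The learning rule and choice rule on this slide can be sketched in Python roughly as follows. This is a minimal illustration, not the presenter's code: the function names, the starting values of 0, and the settings α = 0.3 and β = 0.2 are all assumptions. The sketch also uses the standard softmax form, in which the probability of choosing a rises with Va - Vb. The exponent as transcribed on the slide has the opposite sign, which is assumed here to be a transcription slip.

```python
import math
import random


def choice_prob_a(va, vb, beta):
    """Softmax probability of choosing cue a.

    Assumed sign convention: the probability rises as va exceeds vb.
    """
    return 1.0 / (1.0 + math.exp((vb - va) / beta))


def update_value(v, reward, alpha):
    """Delta rule: return the updated value and the prediction error."""
    delta = reward - v
    return v + alpha * delta, delta


def simulate_pair(n_trials=30, alpha=0.3, beta=0.2,
                  p_a=0.8, p_b=0.2, outcome=1.0, seed=0):
    """Simulate one cue pair; all default settings are illustrative."""
    rng = random.Random(seed)
    va, vb = 0.0, 0.0  # assumed initial values
    history = []
    for _ in range(n_trials):
        chose_a = rng.random() < choice_prob_a(va, vb, beta)
        # Cue a yields the outcome with probability p_a, cue b with p_b.
        p_outcome = p_a if chose_a else p_b
        reward = outcome if rng.random() < p_outcome else 0.0
        # Only the chosen cue's value is updated.
        if chose_a:
            va, delta = update_value(va, reward, alpha)
        else:
            vb, delta = update_value(vb, reward, alpha)
        history.append((chose_a, reward, delta, va, vb))
    return history
```

Calling `simulate_pair(outcome=-1.0)` would give a rough analogue of the punishing pair, under the same assumptions.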

Using behavioural choices to optimise parameters
α = learning rate; β = temperature
δ(t) = R(t) - Va(t)
Va(t+1) = Va(t) + α*δ(t)
[Va(t) Vb(t)]
Pa(t) = 1/(1+exp((Va(t)-Vb(t))/β))
[Plot: Values (£) against trials; curve labels: 0.8 * -£1, 0.8 * £1, 0.2 * £1, 0.2 * -£]
[Plot: Choices (%) against trials; data and model]
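
The slide does not say which fitting procedure was used. One common approach, sketched here purely as an assumption, is to pick the α and β that maximise the likelihood of the observed choices under the model. The function names, the grid search, and the `'a'`/`'b'` choice coding are all hypothetical.

```python
import math


def negative_log_likelihood(choices, rewards, alpha, beta):
    """Negative log-likelihood of a choice sequence under the model.

    choices: sequence of 'a' or 'b'; rewards: the outcome on each trial.
    Uses the standard softmax sign (assumed): higher value, higher probability.
    """
    values = {"a": 0.0, "b": 0.0}  # assumed initial values
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        other = "b" if choice == "a" else "a"
        p_choice = 1.0 / (1.0 + math.exp((values[other] - values[choice]) / beta))
        nll -= math.log(max(p_choice, 1e-12))  # guard against log(0)
        # Delta-rule update of the chosen cue only.
        values[choice] += alpha * (reward - values[choice])
    return nll


def fit_grid(choices, rewards, alphas, betas):
    """Return (nll, alpha, beta) for the best-fitting grid point."""
    best = None
    for alpha in alphas:
        for beta in betas:
            nll = negative_log_likelihood(choices, rewards, alpha, beta)
            if best is None or nll < best[0]:
                best = (nll, alpha, beta)
    return best
```

A grid search is used only because it is short and transparent; a numerical optimiser would serve the same purpose.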

Looking for values hidden in the brain
Regions: ventral striatum, anterior putamen, posterior putamen
δ = reward prediction error
δ(t) = R(t) - Va(t)
Va(t+1) = Va(t) + α*δ(t)
[Va(t) Vb(t)]
Pa(t) = 1/(1+exp((Va(t)-Vb(t))/β))

Striatal responses reflecting prediction errors
[Figure: signal change (%) against time (sec) under LEVODOPA and HALOPERIDOL. GAIN panel: Gain / £1, Gain / £0. LOSS panel: Loss / £0, Loss / -£1.]

Explaining drug effects on behavioural choices
[Figure: Choice (%) against trials, data and model, for LEVODOPA, PLACEBO and HALOPERIDOL; curve labels: 0.8 * £1, * -£]

How can striatal regions influence behavioural choice?
Go - NoGo: posterior putamen (response facilitation)
Loss - Neutral, Gain - Neutral: ventral striatum (reward prediction)

A brainy model (Haber 2003)
[Diagram labels: posterior putamen, ventral striatum, SNc / VTA, REWARD, δ, dopamine, ACTOR (execution), CRITIC (prediction), ACTION, GPi / SNr]
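
The ACTOR / CRITIC labels on this slide suggest an actor-critic arrangement, in which one prediction error δ trains both a critic (reward prediction) and an actor (action selection). The sketch below is a generic textbook-style version under that reading, not the model shown on the slide: the function names, learning rates, and two-action task are assumptions.

```python
import math
import random


def softmax(prefs, beta=1.0):
    """Turn action preferences into choice probabilities."""
    m = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp((p - m) / beta) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(reward_probs, n_trials=200,
                     alpha_critic=0.1, alpha_actor=0.1, seed=0):
    """Single-state actor-critic; all parameter values are illustrative."""
    rng = random.Random(seed)
    value = 0.0                        # critic: reward prediction
    prefs = [0.0] * len(reward_probs)  # actor: action preferences
    for _ in range(n_trials):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        delta = reward - value                # shared teaching signal
        value += alpha_critic * delta         # critic update
        prefs[action] += alpha_actor * delta  # actor update
    return value, prefs
```

For example, `run_actor_critic([0.8, 0.2])` returns the critic's final prediction and the actor's two preferences for a task loosely resembling the rewarding cue pair.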

Thanks: Ben Seymour, Guillaume Flandin, Ray Dolan, Chris Frith
Wellcome Trust Centre for NeuroImaging, University College London