Reward, Motivation, and Reinforcement Learning

Reward, Motivation, and Reinforcement Learning. Peter Dayan, Bernard W. Balleine. Neuron, Volume 36, Issue 2, Pages 285-298 (October 2002). DOI: 10.1016/S0896-6273(02)00963-7

Figure 1. Example Task. Animals are presented with two stimuli (blue [b] and green [g] lights) defining two locations in a maze, at each of which they can choose one of two directions (U, D, R, and L). The actions have the consequences shown, leading to outcomes that are either neutral (OU) or potentially appetitive (OL and OR). In brackets opposite the actions are the probabilities with which each is selected by a particular policy in a case in which food reward worth a nominal two units is available at OL. This is a conceptual task; spatial mazes may engage different solution mechanisms.

Figure 2. Model of the Pavlovian Motivational System. An appetitive US representation is connected with the appetitive affective system via a motivational gate (M) that is sensitive to the specific biological impact of the US and that maintains a fixed connection with the appetitive affective system. CSs can form connections with the US or the appetitive system directly. Modeled after Dickinson and Balleine, 2002.

Figure 3. The New Model. (A) Development across trials of the advantages of D and U at state b in a case in which the animal has already been shaped to go L at state g and D costs a nominal −0.5 units. The inset graph shows the probability of choosing to go D at b. (B) The development of the value V(b) of state b (solid line), which assumes responsibility for the control of going D once the action has become a habit (i.e., once the associated advantage is 0). The dashed line shows the component of the value that comes from a predictive or forward model.
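The qualitative pattern in Figure 3 (the advantage of D shrinking toward 0 while V(b) takes over) can be illustrated with a minimal sketch. This is not the authors' model: it is a generic advantage-style actor-critic, and it omits the forward-model component mentioned in panel B. Several details are assumptions that the captions do not state: that D at b leads to g, that U leads to the neutral outcome, that L at g always yields the two-unit food reward, and the learning rate `alpha`, softmax gain `beta`, and trial count, which are all hypothetical.

```python
import math
import random


def softmax(prefs, beta=1.0):
    """Turn action preferences into choice probabilities."""
    m = max(prefs.values())  # subtract the max for numerical stability
    exps = {a: math.exp(beta * (p - m)) for a, p in prefs.items()}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}


def simulate(trials=500, alpha=0.1, beta=5.0, seed=0):
    """Advantage-style learning at state b (illustrative assumptions only)."""
    rng = random.Random(seed)
    V = {"b": 0.0, "g": 0.0}   # state values
    A = {"U": 0.0, "D": 0.0}   # advantages of the two actions at b
    history = []
    for _ in range(trials):
        probs = softmax(A, beta)
        action = "D" if rng.random() < probs["D"] else "U"
        if action == "D":
            # Assumed: D leads to g, where the shaped response L earns 2 units.
            V["g"] += alpha * (2.0 - V["g"])
            delta = -0.5 + V["g"] - V["b"]   # D costs a nominal 0.5 units
        else:
            # Assumed: U leads to the neutral outcome OU (worth 0).
            delta = 0.0 - V["b"]
        # The advantage tracks the prediction error of the chosen action.
        A[action] += alpha * (delta - A[action])
        # The state value is updated by the same prediction error.
        V["b"] += alpha * delta
        history.append((probs["D"], V["b"], A["D"], A["U"]))
    return V, A, history


if __name__ == "__main__":
    V, A, history = simulate()
    print("P(D) on last trial:", round(history[-1][0], 3))
    print("V(b):", round(V["b"], 3), "A(D):", round(A["D"], 3))
```

Under these assumptions, as D comes to dominate the policy, the prediction error that follows D shrinks, so the advantage of D drifts toward 0 while V(b) approaches the net return of the D-then-L path. The `history` list holds one tuple per trial and can be plotted to compare with the panels.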