Reward, Motivation, and Reinforcement Learning
Peter Dayan, Bernard W. Balleine. Neuron, Volume 36, Issue 2, Pages (October 2002) DOI: /S (02)
Figure 1 Example Task
Animals are presented with two stimuli (blue [b] and green [g] lights) defining two locations in a maze, at each of which they can choose one of two directions (U, D, R, and L). The actions have the consequences shown, leading to outcomes that are either neutral (OU) or potentially appetitive (OL and OR). In brackets opposite the actions are the probabilities with which each is selected by a particular policy in a case in which food reward worth a nominal two units is available at OL. This is a conceptual task; spatial mazes may engage different solution mechanisms.
Figure 2 Model of the Pavlovian Motivational System
An appetitive US representation is connected with the appetitive affective system via a motivational gate (M) that is sensitive to the specific biological impact of the US and that maintains a fixed connection with the appetitive affective system. CSs can form connections with the US or the appetitive system directly. Modeled after Dickinson and Balleine, 2002.
Figure 3 The New Model
(A) Development across trials of the advantages of D and U at state b in a case in which the animal has already been shaped to go L at state g and D costs a nominal −0.5 units. The inset graph shows the probability of choosing to go D at b. (B) The development of the value V(b) of state b (solid line), which assumes responsibility for the control of going D once the action has become a habit (i.e., once the associated advantage is 0). The dashed line shows the component of the value that comes from a predictive or forward model.
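The quantities named in Figure 3 (advantages of D and U at b, the probability of choosing D, and the value V(b)) can be illustrated with a minimal sketch of generic tabular advantage learning. This is not the authors' model. It assumes, since the figure itself is not reproduced here, that D at b leads to g at a cost of −0.5, that U at b leads to the neutral outcome OU, and that the already-shaped choice of L at g yields the 2-unit reward at OL. The function name `simulate`, the learning rate `alpha`, the softmax gain `beta`, and the trial count are all hypothetical choices made for illustration.

```python
import math
import random


def simulate(trials=2000, alpha=0.1, beta=3.0, seed=0):
    """Generic advantage learning on the assumed two-state task (not the paper's model)."""
    rng = random.Random(seed)
    V = {"b": 0.0, "g": 0.0}   # state values
    A = {"U": 0.0, "D": 0.0}   # advantages of the two actions at state b
    history = []               # (p_D, A_D, A_U, V_b) recorded after each trial

    for _ in range(trials):
        # Softmax over the two advantages gives the probability of going D at b.
        p_D = 1.0 / (1.0 + math.exp(-beta * (A["D"] - A["U"])))
        action = "D" if rng.random() < p_D else "U"

        if action == "D":
            # Assumed: D costs 0.5 units and leads to state g.
            reward, next_value = -0.5, V["g"]
        else:
            # Assumed: U leads to the neutral terminal outcome OU.
            reward, next_value = 0.0, 0.0

        # Temporal-difference error at b, used for both the advantage and the value.
        delta = reward + next_value - V["b"]
        A[action] += alpha * (delta - A[action])
        V["b"] += alpha * delta

        if action == "D":
            # At g the animal is assumed to always go L, earning 2 units at OL.
            V["g"] += alpha * (2.0 - V["g"])

        history.append((p_D, A["D"], A["U"], V["b"]))

    return V, A, history


if __name__ == "__main__":
    V, A, history = simulate()
    print("P(D at b) on the final trial:", round(history[-1][0], 3))
    print("Advantages at b:", {k: round(v, 3) for k, v in A.items()})
    print("Values:", {k: round(v, 3) for k, v in V.items()})
```

In this sketch the advantage of the chosen action shrinks toward zero as V(b) comes to predict its consequences, which loosely mirrors the caption's description of a habit as an action whose advantage is 0. The separate forward-model component of V(b) shown by the dashed line in panel B is not modeled here.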