FIGURE 4 Responses of dopamine neurons to unpredicted primary reward (top) and the transfer of this response to progressively earlier reward-predicting.


1

2

3

4

5

6

7 FIGURE 4 Responses of dopamine neurons to unpredicted primary reward (top) and the transfer of this response to progressively earlier reward-predicting conditioned stimuli with training (middle). The bottom record shows a control baseline task when the reward is predicted by an earlier stimulus and not the light. From Schultz et al. (1995) with permission.

8

9

10 Odor Selective Cells in the Amygdala fire preferentially with regard to outcome or reward value of an odor prior to demonstration that the animal has learned this outcome or value. Odor Selective Cells in the Amygdala fire preferentially with regard to outcome or reward value of an odor simultaneous to demonstration that the animal has learned this outcome or value.

11 Cells in Orbitofrontal Cortex (OFC) show less selectivity to outcome in rats without an amygdala. This demonstrates a role for the amygdala in conveying motivational/reward information to the OFC.

12

13

14 Dopamine, reward processing and optimal prediction. ONLY AS A REFERENCE FOR THOSE WHO ARE INTERESTED IN BEGINNING TO CROSS THE NEUROBEHAVIORAL-COMPUTATIONAL DIVIDE – Maybe after the Exam?

15 Human dopaminergic system

16 Cortical and striatal projections Schultz, 1998

17 Koob & Le Moal, 2001

18 Schultz, Dayan & Montague 1997

19 Expected Reward: $v = wu$, where v : expected reward, w : weight (association), u : stimulus (binary)

20 Rescorla-Wagner Rule. Association update rule: $w \leftarrow w + \alpha \delta u$, where w : weight (association), α : learning rate, u : stimulus. Prediction error: $\delta = r - v$, where r : actual reward, v : expected reward
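The update rule on slide 20 can be traced trial by trial. The following is a minimal sketch, not taken from the slides: the function name `rescorla_wagner`, the learning rate of 0.1, and the 50/50 acquisition-extinction schedule are all illustrative assumptions.

```python
def rescorla_wagner(stimuli, rewards, alpha=0.1, w=0.0):
    """Apply the Rescorla-Wagner rule over a sequence of trials.

    stimuli: u for each trial (0 or 1); rewards: r for each trial.
    Returns the weight w after every trial.
    """
    history = []
    for u, r in zip(stimuli, rewards):
        v = w * u                  # expected reward, v = w u
        delta = r - v              # prediction error
        w = w + alpha * delta * u  # association update
        history.append(w)
    return history


# Assumed schedule: 50 rewarded trials (acquisition),
# then 50 unrewarded trials (extinction).
trials = 50
weights = rescorla_wagner([1] * (2 * trials),
                          [1.0] * trials + [0.0] * trials)
print(round(weights[trials - 1], 3), round(weights[-1], 3))
```

Under these assumptions the weight climbs toward the reward magnitude during acquisition and decays back toward zero during extinction, because δ changes sign once the reward is withheld.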

21 Rescorla-Wagner provides an account for: some Pavlovian conditioning, extinction, partial reinforcement; and, with more than one stimulus: blocking, inhibitory conditioning, overshadowing; … but not: latent inhibition (CS preexposure effect), secondary conditioning
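Blocking, one of the multi-stimulus effects listed on slide 21, falls out of the rule once the prediction is summed over all stimuli present. The sketch below assumes that summed form, $v = \sum_i w_i u_i$, which the slides do not write out; the function name `rw_multi`, the learning rate, and the trial counts are hypothetical.

```python
def rw_multi(trials, n_stimuli, alpha=0.2):
    """Rescorla-Wagner with several stimuli sharing one prediction error.

    trials: list of (u, r) pairs, where u is a tuple of 0/1 stimulus values.
    Assumes the summed prediction v = sum_i w_i u_i.
    """
    w = [0.0] * n_stimuli
    for u, r in trials:
        v = sum(wi * ui for wi, ui in zip(w, u))
        delta = r - v  # one error signal shared by all stimuli
        w = [wi + alpha * delta * ui for wi, ui in zip(w, u)]
    return w


pretrain = [((1, 0), 1.0)] * 100   # stimulus A alone predicts reward
compound = [((1, 1), 1.0)] * 100   # A and B together predict reward

w_blocked = rw_multi(pretrain + compound, 2)  # B is blocked by A
w_control = rw_multi(compound, 2)             # no pretraining
print(w_blocked, w_control)
```

After pretraining, A already predicts the reward, so δ is close to zero during compound trials and B acquires almost no weight. In the control run, A and B split the association between them.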

22 A recent update: uncertainty ($\sigma_i^2$). Kakade, Montague & Dayan, 2001

23 Kalman weight update rule: $w_i \leftarrow w_i + \alpha_i \delta$. With associability: $\alpha_i = \dfrac{\sigma_i^2 u_i}{\sum_j \sigma_j^2 u_j + E}$

24 An example:

25 Inputs U1, U2, U3, U4, U5 forming the input U(t)

26 U(t), r(t)

27 U(t), r(t), w(t)

28 U(t), ŵ(t), v(t)

29 U(t), r(t), ŵ(t), v(t)

30 U(t), r(t), ŵ(t), v(t), δ(t)

31 Error rule: $\delta(t) = r(t) - v(t)$

32 U(t), ŵ(t), v(t); inset: $U_i$ : input, $\sigma_i$ : uncertainty, $w_i$ : weight

33 Uncertainty

34 Kalman learning & associability. Weight update rule: $\hat{w}_i(t+1) = \hat{w}_i(t) + \alpha_i(t)\,\delta(t)$. Associability: $\alpha_i(t) = \dfrac{\sigma_i(t)^2\, x_i(t)}{\sum_j \sigma_j(t)^2\, x_j(t) + E}$
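The associability on slide 34 can be read as each stimulus's share of the total uncertainty. The sketch below computes it and applies one weight update. It holds the uncertainties fixed, because the slides do not give a rule for updating them. The function names, the value E = 0.5, and the example uncertainties are illustrative assumptions.

```python
def associability(sigma2, x, E):
    """alpha_i = sigma_i^2 x_i / (sum_j sigma_j^2 x_j + E)."""
    denom = sum(s * xj for s, xj in zip(sigma2, x)) + E
    return [s * xi / denom for s, xi in zip(sigma2, x)]


def kalman_weight_step(w_hat, sigma2, x, r, E=0.5):
    """One weight update with the uncertainties sigma2 held fixed (assumption)."""
    v = sum(wi * xi for wi, xi in zip(w_hat, x))  # prediction
    delta = r - v                                 # error rule
    alpha = associability(sigma2, x, E)
    new_w = [wi + ai * delta for wi, ai in zip(w_hat, alpha)]
    return new_w, alpha


# Assumed example: stimulus 0 is far more uncertain than stimulus 1.
w_new, alpha = kalman_weight_step([0.0, 0.0], [0.9, 0.1], [1, 1], r=1.0)
print(alpha, w_new)

# A stimulus that is absent (x_i = 0) has zero associability.
_, alpha_absent = kalman_weight_step([0.0, 0.0], [0.9, 0.1], [1, 0], r=1.0)
```

The more uncertain stimulus takes the larger share of the update, which is the sense in which uncertainty sets the learning rate here.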

35

36 Stimulus uncertainties

37 Reward prediction

38 Predicting future reward. Single time steps: $v = wu$ (v : expected reward, w : weight (association), u : stimulus). Total predicted reward: $v(t) = \sum_{\tau=0}^{t} w(\tau)\, u(t - \tau)$ (t : time steps in a trial, τ : current time step)

39 Sum of discounted future rewards: With 0 ≤ γ ≤ 1 In recursive form: Schultz, Dayan & Montague, 1997
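The equations for slide 39 did not survive in this transcript. As a hedged reconstruction only, a standard way to write the discounted sum that agrees with the recursive expression on slide 41 would be the following, where the summation index k is an assumed symbol:

```latex
v(t) = \sum_{k=0}^{\infty} \gamma^{k}\, r(t+k)
     = r(t) + \gamma \sum_{k=0}^{\infty} \gamma^{k}\, r(t+1+k)
     = r(t) + \gamma\, v(t+1), \qquad 0 \le \gamma \le 1
```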

40 Exponential discounting, γ = 0.95

41 Temporal difference rule. Total estimated future reward: $v(t) = r(t) + \gamma v(t+1)$, so $r(t) = v(t) - \gamma v(t+1)$. Temporal difference rule: $\delta = r(t) + \gamma v(t+1) - v(t)$. (With single time steps: $\delta = r - v$, where r : actual reward, v : expected reward.)

42 Temporal difference rule. Total estimated future reward: $v(t) = r(t) + v(t+1)$, so $r(t) = v(t) - v(t+1)$. Temporal difference rule: $\delta = r(t) + v(t+1) - v(t)$. (With single time steps: $\delta = r - v$, where r : actual reward, v : expected reward.)
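The temporal difference rule can be simulated to show how the error signal moves from the time of reward to the time of the predictive stimulus, qualitatively like the transfer described in the Figure 4 caption. This is a minimal sketch under stated assumptions, not the model from the cited papers: one weight per time step since stimulus onset, no prediction before the stimulus, γ = 1, and hypothetical names and timings (`td_trial`, `t_cs`, `t_reward`).

```python
def td_trial(w, t_cs, t_reward, n_steps, alpha=0.3, gamma=1.0):
    """Run one trial of TD learning and return delta(t) for every time step.

    w[k] is the value estimate k steps after stimulus onset (assumed
    representation); v(t) is zero before the stimulus appears.
    """
    def v(t):
        k = t - t_cs
        return w[k] if 0 <= k < len(w) else 0.0

    deltas = []
    for t in range(n_steps):
        r = 1.0 if t == t_reward else 0.0
        delta = r + gamma * v(t + 1) - v(t)   # temporal difference rule
        k = t - t_cs
        if 0 <= k < len(w):
            w[k] += alpha * delta             # update the active weight
        deltas.append(delta)
    return deltas


n_steps, t_cs, t_reward = 20, 5, 15
w = [0.0] * (n_steps - t_cs)

first = td_trial(w, t_cs, t_reward, n_steps)
for _ in range(500):
    last = td_trial(w, t_cs, t_reward, n_steps)

print(first[t_reward], round(last[t_reward], 3), round(last[t_cs - 1], 3))
```

On the first trial the only nonzero error is at the reward. After training the error at the reward has vanished and a positive error appears at the stimulus instead. With this indexing δ(t) compares v(t+1) with v(t), so the stimulus-evoked error is recorded at step `t_cs - 1`.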

43 Schultz, Dayan & Montague, 1997

44 Schultz, 1996

45 Anatomical interpretation Schultz, Dayan & Montague, 1997

46 Temporal Difference Rule for Navigation between successive steps u and u’: $\delta = r_a(u) + \gamma v(u') - v(u)$
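The navigation form of the rule can be illustrated on a toy problem. The sketch below is an assumption-laden example, not the water-maze model of the cited work: a one-dimensional track, a fixed policy that always steps toward the goal, a reward of 1 on the step that reaches the goal, and hypothetical names throughout.

```python
def learn_track_values(n_locations=5, episodes=200, alpha=0.5, gamma=0.9):
    """Learn v(u) on a linear track with the navigation TD rule."""
    v = [0.0] * n_locations   # v(u); the goal is the last location
    goal = n_locations - 1
    for _ in range(episodes):
        u = 0
        while u != goal:
            u_next = u + 1    # fixed policy: always step toward the goal
            r = 1.0 if u_next == goal else 0.0
            delta = r + gamma * v[u_next] - v[u]
            v[u] += alpha * delta
            u = u_next
    return v


values = learn_track_values()
print([round(x, 3) for x in values])
```

The learned values rise with proximity to the goal, falling off by a factor of γ per step of distance, so a gradient in v(u) could in principle guide movement.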

47 Behavior evaluation; hippocampal place field. Foster, Morris & Dayan 2000

48 Spatial learning Foster, Morris & Dayan 2000

49 Conclusions: Behavioral study of (nonhuman) neural systems is interesting. Neural processes are amenable to contemporary learning theory; they may play distinct roles in a normative framework of learning, e.g. VTA, hippocampus, subiculum; also ACh in NBM/SI, NE in LC, 5-HT, ventral striatum, lateral connections, core/shell distinctions of the NAcc, patch-matrix anatomy in basal ganglia, the superior colliculus, psychoalphabetadiscobioaquadodoo

