Hyoung F. Kim, Ali Ghazizadeh, Okihide Hikosaka  Cell 

1 Dopamine Neurons Encoding Long-Term Memory of Object Value for Habitual Behavior 
Hyoung F. Kim, Ali Ghazizadeh, Okihide Hikosaka  Cell  Volume 163, Issue 5, Pages (November 2015) DOI: /j.cell Copyright © 2015 Elsevier Inc. Terms and Conditions

3 Figure 1 Object-Value Learning and Habitual Visual-Oculomotor Behavior
(A) Fractal objects, each consistently associated with a reward (high-valued) or no reward (low-valued). Monkeys PK and DW learned 440 and 840 fractals, respectively, of which 56 and 376 were long-term learned (>4 days). (B) Learning and testing procedures. (C) Object-value learning task. A fractal object was presented at the neuron’s preferred position, and the monkey made a saccade to it after the central fixation dot turned off. This was followed by a reward if the object was high-valued (top) or no reward if the object was low-valued (bottom). A set of eight objects (as in A) was used in each learning session. (D) Behavioral changes during learning. Mean target acquisition time (time from the disappearance of the fixation dot until the gaze reached the object) is plotted against the number of trials for each object during the first learning (n = 107). Data are shown separately for high-valued objects (red) and low-valued objects (blue). Green line indicates the difference in target acquisition time between the high- and low-valued objects (mean ± SE). (E) Free viewing procedure to test behavioral changes after learning. Four fractal objects from one set of eight objects were chosen pseudorandomly and presented simultaneously. Monkeys were free to look at the objects (or look elsewhere) for 2 s without reward feedback. (F) Increase in gaze bias during free viewing after repeated learning. The saccade-choice rate (left) and gazing duration rate (right) are plotted before learning (before, n = 42), after 1 day of learning (1d, n = 22), and after more than 4 days of learning (>4d, n = 316). See also Figure S1.

4 Figure 2 Neuronal Coding of Object Values during Learning and Post-Learning
Responses of two presumed DA neurons in SNc are shown during learning (object-value learning task) and post-learning (passive viewing procedure). (A) Object-value learning task (see Figure 1C). (B) Passive viewing task. The learned objects were presented sequentially in the neuron’s preferred location, while the monkey was fixating at the center. A reward was delivered non-contingently with the presented objects. (C–E) Responses of neuron #1 (spike shape shown in C, bottom) to eight objects (shown in C, top) during the first object learning task (D), followed by the passive viewing task (E). Average activity (shown by spike density functions [SDFs]) is aligned at the onset of object presentation. (F–H) Responses of neuron #2 in the same format. See also Figure S2.
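
The captions refer to spike density functions (SDFs) without defining them. As a rough illustration only, and not the authors' analysis code, an SDF is commonly obtained by smoothing each trial's spike times with a Gaussian kernel and averaging across trials. The function names, the 10 ms kernel width, and the 1 ms step below are assumptions chosen for the sketch; the source does not state the smoothing parameters.

```python
import math


def spike_density(spike_times_ms, t_start_ms, t_stop_ms, sigma_ms=10.0, step_ms=1.0):
    """Gaussian-smoothed firing rate (spikes/s) for one trial.

    spike_times_ms are spike times relative to object onset (0 ms).
    sigma_ms and step_ms are illustrative defaults, not values from the paper.
    """
    # Kernel normalised so that each spike contributes an area of 1 (per ms).
    norm = 1.0 / (sigma_ms * math.sqrt(2.0 * math.pi))
    n_steps = int(round((t_stop_ms - t_start_ms) / step_ms)) + 1
    times = [t_start_ms + i * step_ms for i in range(n_steps)]
    rates = []
    for t in times:
        per_ms = sum(
            norm * math.exp(-0.5 * ((t - s) / sigma_ms) ** 2)
            for s in spike_times_ms
        )
        rates.append(1000.0 * per_ms)  # convert spikes/ms to spikes/s
    return times, rates


def mean_sdf(trials, t_start_ms, t_stop_ms, sigma_ms=10.0, step_ms=1.0):
    """Average the single-trial SDFs over a list of trials (each a list of spike times)."""
    if not trials:
        raise ValueError("need at least one trial")
    total = None
    times = []
    for spikes in trials:
        times, rates = spike_density(spikes, t_start_ms, t_stop_ms, sigma_ms, step_ms)
        if total is None:
            total = [0.0] * len(rates)
        total = [a + b for a, b in zip(total, rates)]
    return times, [v / len(trials) for v in total]
```

Aligning every trial to object onset before calling `mean_sdf` gives a trace of the kind described in the caption; a wider kernel gives a smoother but less temporally precise curve.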

5 Figure 3 Distinct Patterns of Object Value Responses in Two Types of Presumed DA Neurons
(A–E) Average activity (shown by SDFs) of update-type DA neurons (top) and sustain-type DA neurons (bottom). Responses to novel objects during three steps: passive viewing (A), object value learning (B), and passive viewing (C). Responses to well-learned objects (>4 days) during relearning (D) and passive viewing after >1 day retention (E). Green line indicates the difference between the high- and low-valued object responses (mean ± SE). The number of neurons examined (n) is shown in each graph. (F) Increase in value discrimination by repeated learning. Neuronal discrimination between high- and low-valued objects in passive viewing task (measured as ROC area, mean ± SE) is plotted before learning (before), after 1 day of learning (1d), and after more than 4 days of learning (>4d). The number of neurons examined (n) is shown at each data point. ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < by Wilcoxon rank-sum test. (G) Retention of value discrimination after learning. Neuronal discrimination in passive viewing task is plotted before learning (before), immediately after learning (after), 1–4 days after learning (≤4d), and >4 days after learning (>4d). For sustain-type DA neurons, data are separated by the amount of learning: 1 day of learning (dashed line, 1d) and more than 4 days of learning (solid line, >4d). The number of neurons examined (n) is shown at each data point. ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < by Wilcoxon rank-sum test. See also Figures S2, S3, and S6.
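
Neuronal discrimination is reported throughout as an ROC area. As a minimal sketch of what that number means, and not the authors' implementation, the ROC area can be computed as the probability that a response to a high-valued object exceeds a response to a low-valued object, with ties counted as one half. The function name and the example firing rates are hypothetical.

```python
def roc_area(high_value_rates, low_value_rates):
    """Area under the ROC curve for two samples of responses (e.g. spikes/s).

    Equals the fraction of (high, low) pairs in which the high-valued response
    is larger, counting ties as 0.5. A value of 0.5 means no discrimination,
    1.0 means high-valued responses are always larger, and 0.0 means they are
    always smaller.
    """
    if not high_value_rates or not low_value_rates:
        raise ValueError("both samples must be non-empty")
    wins = 0.0
    for h in high_value_rates:
        for l in low_value_rates:
            if h > l:
                wins += 1.0
            elif h == l:
                wins += 0.5
    return wins / (len(high_value_rates) * len(low_value_rates))


# Hypothetical responses of one neuron to high- and low-valued objects.
example = roc_area([12.0, 15.0, 9.0, 14.0], [8.0, 10.0, 7.0, 9.0])
```

This pairwise definition is equivalent to the Mann-Whitney U statistic divided by the number of pairs, which is why a rank-sum test is a natural companion to it.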

6 Figure 4 Differences and Similarities of Update- and Sustain-type DA Neurons
(A) Response to the unpredicted reward outcome. Data were collected only for the first trial of each object in object value learning and are shown as average SDFs (left) and individual neuronal discrimination (right) for update-type DA neurons (n = 24) and sustain-type DA neurons (n = 29). The mean neuronal discrimination (indicated by triangle, calculated as ROC area) was significantly higher than 0.5 (i.e., no discrimination) for sustain-type DA neurons (ROC = 0.78), but not for update-type DA neurons (ROC = 0.53) (p < 0.001, Wilcoxon rank-sum test). (B) Spatial selectivity of visual response. This was tested for update-type DA neurons (top, n = 24) and sustain-type DA neurons (bottom, n = 39) by ipsilateral and contralateral object presentations. Data are shown by averaged SDFs (left) (dashed line: ipsilateral, solid line: contralateral). In scatterplots (right), each data point indicates the responses of one neuron to ipsilateral (ordinate) and contralateral (abscissa) objects. Red dots indicate neurons whose spatial selectivity is statistically significant (Wilcoxon rank-sum test, p < 0.05). NS, non-significant. (C) Stereotaxic locations of sustain- and update-type DA neurons in SN in coronal (top left) and sagittal (bottom left) views. D, dorsal; V, ventral; M, medial; L, lateral; R, rostral; C, caudal. Their distributions are projected onto each of the three axes (right). Number 0 indicates the midline (medial-lateral), the dorsal end of SN (dorsal-ventral), and the rostral end of SN (rostral-caudal). Their means (triangles) were statistically different in the medial-lateral and rostral-caudal dimensions (p < by Wilcoxon rank-sum test). The coordinates 0, 0, 0 (abscissa) are rostral, medial, and dorsal edges of SN. (D) Electrophysiological properties. Sustain- and update-type DA neurons had similar spike shapes, which are different from those of non-DA (presumed GABAergic) neurons (left). Relationship between spike duration and baseline firing rate for sustain- and update-type DA neurons and non-DA neurons (right). See also Figures S2, S3, and S4.

7 Figure 5 Efferent and Afferent Connections of Sustain-type DA Neurons
(A) Scheme showing electrical stimulation in caudate tail (CDt) and neuronal recording in caudal-lateral SNc (clSNc). SNr, substantia nigra pars reticulata. Coronal view. Scale bar, 1 mm. (B) An SNc neuron activated by electrical stimulation in CDt with a fixed latency (6.9 ms) (PK#1, Figure S5E). This activation was eliminated when CDt stimulation occurred <7.9 ms after a spontaneous spike (bottom), confirming its antidromic nature (collision test). (C) Value discrimination of antidromically activated (Anti(+)) neurons (n = 6) in passive viewing task (>4 days learning and >1 day retention), shown as average SDFs (left) and ROC distribution (right). (D and E) Orthodromic responses of sustain-type DA neurons (D), but not update-type DA neurons (E), to CDt stimulation. Average activity (shown by peristimulus time histogram [PSTH]) is aligned on CDt stimulation (dotted line). The lack of activity just after the stimulation was caused by stimulus artifact. (F) Responses of individual DA neurons to CDt stimulation shown by a scatterplot. Each data point indicates one neuron’s activity 10–40 ms after (ordinate) and 0–80 ms before (abscissa) CDt stimulation. Red dots indicate neurons whose response is statistically significant (t test, p < 0.05). NS, non-significant. See also Figures S4, S5, and S6.
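
The collision test in (B) can be sketched as a simple timing rule. This is an illustrative reading, not the authors' procedure: an antidromic spike is expected to reach the recording site only if the stimulus arrives later than roughly the antidromic latency plus the axon's refractory period after a spontaneous spike. The function name is hypothetical, and the 1.0 ms refractory period is an assumption inferred from the caption's numbers (7.9 ms minus 6.9 ms), not a value stated in the source.

```python
def antidromic_spike_expected(stim_delay_ms, antidromic_latency_ms, refractory_ms=1.0):
    """Return True if an antidromic spike should survive (no collision).

    stim_delay_ms: time from the spontaneous spike to the stimulus.
    antidromic_latency_ms: fixed latency of the evoked spike.
    refractory_ms: assumed axonal refractory period (hypothetical default).
    """
    if min(stim_delay_ms, antidromic_latency_ms, refractory_ms) < 0:
        raise ValueError("times must be non-negative")
    # Shorter delays mean the spontaneous (orthodromic) spike is still
    # travelling down the axon, or the axon is refractory, so the
    # antidromic spike is cancelled.
    collision_interval_ms = antidromic_latency_ms + refractory_ms
    return stim_delay_ms >= collision_interval_ms


# Latency from the caption (6.9 ms); the two delays are illustrative.
early_stim = antidromic_spike_expected(5.0, 6.9)   # stimulus soon after a spontaneous spike
late_stim = antidromic_spike_expected(10.0, 6.9)   # stimulus well after a spontaneous spike
```

Under these assumptions a stimulus delivered 5 ms after a spontaneous spike falls inside the collision interval and a stimulus delivered 10 ms after falls outside it, which matches the qualitative pattern the caption describes.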

8 Figure 6 Colocalization of Sustain-type DA Neurons and CDt-Projecting DA Neurons
(A) Location of an Anti(+) neuron (PK#3), indicated by a marking lesion (black arrow) in a Nissl-stained coronal section. Black line indicates the border of SN. (B) Combination of antidromic and retrograde tracer experiments in monkey PK. A retrograde tracer, cholera toxin subunit B (CTB), was injected in CDt. (C) An adjacent section (50 μm from the section in A) showing sensitivity to TH (green) and CTB (red) and the location of the marking lesion (white arrow). A red plexus in the dorsolateral SNr indicates the anterogradely labeled axon terminals of CDt neurons. Scale bar, 2 mm. (D) Enlarged view of the area around the marking lesion. Among many TH-positive DA neurons (green) are TH and CTB double-labeled neurons (yellow or orange color, indicated by white arrowheads). Scale bar, 100 μm. (E) Stereotaxic locations of CDt-projecting neurons (retrogradely CTB-labeled). Note that 98.5% of them were TH-positive. The locations of neurons are projected to the coronal and sagittal perspectives of SN based on MRI. Scale bar, 1 mm. (F) Recording sites of sustain-type and update-type SNc neurons in stereotaxic coordinates. Anti(+) neurons among the sustain-type are indicated by yellow dots. The locations of sustain-type neurons fell within clSNc, where DA neurons projected to CDt (E). The locations of neurons are projected to the coronal and sagittal perspectives of SN based on histological sections. Scale bar, 1 mm. See also Figure S6.

9 Figure S1 Gaze Bias in Free Viewing Retained after Object-Value Learning, Related to Figure 1
(A) Object exposure schedule (schematic example). After the object-value learning (>4 daily sessions), monkeys viewed the learned objects repeatedly without contingent reward outcomes in free and passive viewing tasks. Free viewing was used to test automatic gaze bias, while passive viewing was used to test automatic neuronal bias. For example, by the last free viewing (black), the monkey had viewed each of the learned objects 96 times during the preceding 6 free viewing sessions (blue, each object presented 16 times in one session) and 90 times during the preceding 6 passive viewing sessions (green, each object presented 15 times in one session). (B) Changes in gaze bias during free viewing with the increasing number of object exposures during the preceding free viewing. The saccade-choice rate (top) and gaze duration rate (bottom) are plotted against the number of object exposures. The numbers of exposures were counted for individual objects and divided into 5 groups, and the gaze bias is plotted for each group (mean ± SE). n: the number of free viewing sessions in each group. To test whether the gaze bias changed with repeated object exposures, statistical comparison (two-tailed t test) was done between two sets of data: 1) the first group versus the subsequent groups together, and 2) the second group versus the subsequent groups together. N.S., non-significant. (C) Changes in gaze bias during free viewing with the increasing number of object exposures during the preceding passive viewing. The same format as in (B). The gaze bias declined initially, possibly due to the contribution of flexible short-term object value memories, which are vulnerable to the lack of contingent reward outcomes. It was then retained, likely due to the contribution of stable long-term object value memories, which are resistant to the lack of contingent reward outcomes.

10 Figure S2 Responses of DA Neurons during Learning and Their Retention after Learning, Related to Figures 2, 3, and 4
(A–D) The responses of update-type DA neurons and sustain-type DA neurons during object-value learning (see Figure 1C), shown separately for two events: reward-associated objects (A) and (C) and reward itself (B) and (D). The averaged responses are plotted against the number of trials for each object, separately for high-valued objects and low-valued objects (A) and (C) and reward and no reward (B) and (D). Green line indicates the difference in neuronal responses between the two conditions (mean ± SE). Data were collected during the first learning, in which the monkey started seeing a set of eight new objects. The number of update-type neurons tested was 24; the number of sustain-type neurons tested was 29. (E–G) Response bias of sustain-type DA neurons retained after object-value learning. (E) Object exposure schedule (schematic example). After the object-value learning (>4 daily sessions), monkeys viewed the learned objects repeatedly without contingent reward outcomes in free and passive viewing tasks. Free viewing was used to test automatic gaze bias, while passive viewing was used to test automatic neuronal bias. By the last passive viewing (black), the monkey had viewed each of the learned objects 96 times during the preceding 6 free viewing sessions (blue, each object presented 16 times in one session) and 90 times during the preceding 6 passive viewing sessions (green, each object presented 15 times in one session). (F) Changes in the response bias of sustain-type DA neurons with the increasing number of object exposures during the preceding free viewing. The neuronal bias is plotted against the number of object exposures. The numbers of exposures were counted for individual objects and divided into 4 groups, and the neuronal bias is plotted for each group (mean ± SE). n: the number of passive viewing sessions in each group. To test whether the neuronal bias changed with repeated object exposures, statistical comparison (two-tailed t test) was done between two sets of data: 1) the first group versus the subsequent groups together, and 2) the second group versus the subsequent groups together. N.S., non-significant. (G) Changes in the response bias of sustain-type DA neurons with the increasing number of object exposures during the preceding passive viewing. The same format as in (F). The response bias of sustain-type DA neurons showed no significant decrease in either the early or the late stage, suggesting that they contribute specifically to long-term object value memories (see Figure S1).

11 Figure S3 Frequent Reversal of Object Values Differentially Affects Update- and Sustain-type DA Neurons, Related to Figures 3 and 4
(A) Reversal object value task. Same as the object-value learning task (Figure 1C), except that only two fractal objects were used and they reversed their values (reward or no reward) across blocks of trials. The objects were presented on the ipsilateral or contralateral side (relative to the neuronal location) (15 deg). (B) The change in the response of update-type DA neurons (n = 24) in each block of trials. The response magnitudes were averaged separately for 1) ipsilateral (left) and contralateral (right) positions and 2) reward-associated trials (red) and no-reward-associated trials (blue). For the contralateral response, the within-trial average activity is shown by SDFs on the right side. Green line in each graph indicates the difference between the two reward conditions (mean ± SE). (C) The change in the response of sustain-type DA neurons (n = 39). The same format as in (B). (D and E) Average responses of update-type DA neurons (D) and sustain-type DA neurons (E) to the outcome (red: reward, blue: no reward). Continuous lines indicate activity in the first trials after the reversal of object-reward contingency. Dotted lines indicate activity in the rest of the trials. (F and G) Sustained value responses are anti-correlated with reversed value responses and positively correlated with spatial selectivities. For individual DA neurons, sustained value responses in the passive viewing task (abscissa) are compared with reversed value responses (F) and spatial selectivities (G) (ordinate). The spatial selectivity was obtained in the reversal task (A). Magenta: sustain-type DA neurons. Green: update-type DA neurons. Black lines indicate regression lines. The r and p values were calculated using Pearson correlation.

12 Figure S4 Functional Heterogeneities of DA Neurons Are Related to Their Locations in SN, Related to Figures 4 and 5
For individual DA neurons, the magnitudes (measured as ROC area) of sustained value responses (A), spatial selectivity (B), and reversed value responses (C) are plotted against their locations in stereotaxic coordinates: rostral-caudal (left), medial-lateral (middle), and dorsal-ventral axes (right). Number 0 indicates the rostral end of SN (rostral-caudal), the midline (medial-lateral), and the dorsal end of SN (dorsal-ventral). Black lines indicate regression lines. The r and p values were calculated using Pearson correlation. N.S., non-significant.

13 Figure S5 DA Neurons Antidromically Activated by CDt Stimulation, Related to Figure 5
(A–D) Full or partial spikes of a DA neuron evoked by antidromic stimulation. This SNc neuron (PK#2) responded to CDt stimulation either with a full spike (A) or a partial spike (B) in one experimental session. CDt was stimulated 28.4 ms after a spontaneous spike (SS) of the SNc neuron. In both cases, the evoked spike started with a fixed latency (20.3 ms). The full spike is similar to the spontaneous spike (SS) of the SNc neuron that was used to trigger CDt stimulation. It is likely to reflect the activation of both the initial segment (IS) and the soma-dendrite (SD). The partial spike is likely to reflect the activation of only IS, which occurs commonly in DA neurons (Grace and Bunney, 1983). The full (IS+SD) and partial (IS) spikes occurred in an intermingled manner, but are shown separately in (A) and (B). On the other hand, neither a full nor a partial spike occurred when a spontaneous spike occurred shortly before (C) or after (D) CDt stimulation. This indicates that both the full and partial spikes originated from the same SNc neuron, and therefore collided with the spontaneous spike. (E–G) Properties of antidromically activated DA neurons. (E) Spike shapes and antidromic latencies of 7 neurons (6 in monkey PK, 1 in monkey DW). (F) Value discrimination of antidromically activated (Anti(+)) neurons in passive viewing task (>4d learning and >1d retention), shown as average SDFs. We were able to keep recording from 6 out of 7 Anti(+) neurons using passive viewing task. All of them showed statistically significant value discriminations (p < 0.05, Wilcoxon rank-sum test). (G) Responses of Anti(+) neurons to the reward outcome in passive viewing task, shown as average SDFs.

14 Figure S6 Parallel Basal Ganglia Circuits Guided Selectively by Sustain-type and Update-type DA Neurons, Related to Figures 3, 5, and 6
(A) Locations of CDt-projecting DA neurons. CDt-projecting DA neurons (CTB and TH double-labeled) are shown by small magenta circles in coronal sections from the rostral end to the caudal end of SN (monkey PK). Below each section are shown the percentage of CTB-positive neurons that were TH-positive (e.g., 87.5% in the first SN section), and their numbers (e.g., 7/8). Arrow: Location of a CDt-projecting neuron (antidromically activated from CDt) which showed stable value coding (PK#3, Figure S5E), indicated by a marking lesion (Figure 6A). Green area: pars compacta (SNc), which was determined by TH-labeled cell somas. Cyan area: axon terminals of CDt neurons (anterogradely CTB-labeled), which were located in the caudal-dorsal-lateral part of SNr (cdlSNr). Number indicates the anterior-posterior distance from the anterior commissure (mm). Section at is missing for a technical reason. Scale bar: 2 mm. Note that CDt-projecting DA neurons were located close to cdlSNr, which receives focal and dense inputs from CDt and contains GABAergic neurons projecting to SC (Figure 6C, also see Yasuda and Hikosaka, 2015). (B) Locations of electrical stimulation and retrograde tracer injection. Left: Nissl-stained coronal section showing the location of antidromic stimulation and CTB injection in monkey PK. Right: coronal MR image showing the location of antidromic stimulation in monkey DW. Number indicates the distance from the anterior commissure (mm). Scale bar: 5 mm. (C) Object responses of update-type DA neurons and sustain-type DA neurons in passive viewing task (see Figure 3E). (D) Object responses of caudate head (CDh) neurons and caudate tail (CDt) neurons in passive viewing task. They are divided into positive value coding type (top) and negative value coding type (bottom). Reproduced with permission from Kim and Hikosaka (2013). (E) Parallel basal ganglia circuits for goal-directed behavior and habit/skill are guided separately by two groups of DA neurons in SNc: update-type DA neurons in rmSNc and sustain-type DA neurons in clSNc. Both circuits process reward values of visual objects, but rely on memories with different time courses. The CDh circuit quickly acquires short-term memories by relying on update-type DA neurons, thus guiding gaze voluntarily toward recently high-valued objects (goal-directed). The CDt circuit slowly accumulates long-term memories by relying on sustain-type DA neurons, thus guiding gaze automatically toward consistently high-valued objects among many objects (skill/habit). The long-term memories might be supported by a loop circuit in which CDt induces a disinhibition of sustain-type DA neurons through cdlSNr. rvmSNr: rostral-ventral-medial SNr. cdlSNr: caudal-dorsal-lateral SNr.

