Volume 93, Issue 2, Pages 451-463 (January 2017)


Dynamic Interaction between Reinforcement Learning and Attention in Multidimensional Environments
Yuan Chang Leong, Angela Radulescu, Reka Daniel, Vivian DeWoskin, Yael Niv
Neuron, Volume 93, Issue 2, Pages 451-463 (January 2017)
DOI: 10.1016/j.neuron.2016.12.040
Copyright © 2017 Elsevier Inc.

Figure 1. The Dimensions Task
(A) Schematic illustration of the task. On each trial, the participant was presented with three stimuli, each defined along face, landmark, and tool dimensions. The participant chose one of the stimuli, received feedback, and continued to the next trial.
(B) The fraction of trials on which participants chose the stimulus containing the target feature (i.e., the most rewarding feature) increased throughout games. Dashed line: random choice. Shading: SEM.
(C) Participants (dots) chose the stimulus that included the target feature (the correct stimulus) significantly more often in the last five trials of a game than in the first five trials (t(24) = 15.42, p < 0.001). Diagonal: equality line.
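The comparison in panel (C) is a paired t test across participants; 24 degrees of freedom correspond to 25 paired observations. The following is a minimal sketch of that statistic, not the authors' analysis code. The function name `paired_t` and the example accuracies are hypothetical and chosen only for illustration.

```python
import math


def paired_t(first, last):
    """Return (t, df) for a paired t test on the differences last - first.

    `first` and `last` hold one value per participant, for example the
    fraction of correct choices in the first and last five trials.
    Note: the statistic is undefined if all differences are identical.
    """
    if len(first) != len(last) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(first, last)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the paired differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    se = math.sqrt(var / n)  # standard error of the mean difference
    return mean / se, n - 1


# Hypothetical accuracies for three participants (not data from the paper).
t_stat, df = paired_t([0.3, 0.4, 0.5], [0.6, 0.8, 0.7])
```

With 25 participants the same function would return df = 24, matching the degrees of freedom reported in the caption.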

Figure 2. Schematic of the Learning Models
At the time of choice, the value of a stimulus (V) is a linear combination of its feature values (v), weighted by attention weights of the corresponding dimension (ϕF, ϕL, ϕT = attention weights to faces, landmarks, and tools, respectively). In the UA and AL models, attention weights for choice were one-third for all dimensions. Following feedback, a prediction error (δ) is calculated. The prediction error is then used to update the values of the three features of the chosen stimulus (illustrated here only for the face feature), weighted by attention to that dimension and scaled by the learning rate (η). In the UA and AC models, all attention weights were one-third at learning.
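The value computation and update described in this caption can be sketched as follows. This is an illustrative reading of the schematic, not the authors' implementation. Two points are assumptions: the prediction error is taken to be δ = r − V, with V computed from the choice-time attention weights, and the function names `stimulus_value` and `learn` are hypothetical. The four models differ only in which weights are passed in: UA uses uniform weights for both, AC uses measured attention only at choice, AL only at learning, and ACL at both.

```python
UNIFORM = (1 / 3, 1 / 3, 1 / 3)  # equal attention to faces, landmarks, tools


def stimulus_value(feature_values, phi):
    """V = sum over dimensions of phi_d * v_d for one stimulus."""
    return sum(p * v for p, v in zip(phi, feature_values))


def learn(feature_values, reward, eta, phi_choice=UNIFORM, phi_learn=UNIFORM):
    """Update the three feature values of the chosen stimulus.

    feature_values: (v_face, v_landmark, v_tool) of the chosen stimulus.
    eta: learning rate.
    Returns (updated_values, delta).
    """
    value = stimulus_value(feature_values, phi_choice)
    delta = reward - value  # assumed form of the prediction error
    # Each feature moves toward the outcome in proportion to the
    # attention paid to its dimension at learning.
    updated = [v + eta * p * delta for p, v in zip(phi_learn, feature_values)]
    return updated, delta


# ACL-style update with all attention on faces: only the face value changes.
updated, delta = learn([0.2, 0.5, 0.1], reward=1.0, eta=0.5,
                       phi_choice=(1.0, 0.0, 0.0), phi_learn=(1.0, 0.0, 0.0))
```

Passing the defaults for both weight arguments reproduces the UA case, in which every feature of the chosen stimulus is updated by the same fraction of δ.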

Figure 3. The “Attention at Choice and Learning” Model Explains Participants’ Choice Data Best
(A) Average choice likelihood per trial for each model and each participant (ordered by likelihood of the model that best explained their data) shows that the Attention at Choice and Learning (ACL) model predicted the data significantly better than the other models (paired t tests, p < 0.001). This suggests that attention biased both choice and learning in our task. Solid lines: mean for each model across all participants.
(B) BIC scores aggregated over all participants also support the ACL model (lower scores indicate better fits to data).
(C) Average choice likelihood of the ACL model was significantly higher than that of the AL and UA models from the second trial of a game onward (paired t tests, p < 0.05) and was higher than that of the AC model from as early as trial 7 (paired t tests against AC model, ∗p < 0.05). By the end of the game, the ACL model could predict choice with ∼70% accuracy. Error bars: within-subject SE.
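The two fit measures named in this caption can be computed from the probability each model assigned to the stimulus the participant actually chose on every trial. The sketch below uses the standard BIC definition, k·ln(n) − 2·ln(L). The per-trial average is shown as a geometric mean, which is an assumption: the caption does not say how the average was taken. Function names and the example probabilities are hypothetical.

```python
import math


def log_likelihood(chosen_probs):
    """Sum of log probabilities the model gave to the observed choices."""
    return sum(math.log(p) for p in chosen_probs)


def bic(chosen_probs, n_params):
    """Bayesian information criterion; lower values indicate a better fit."""
    n_trials = len(chosen_probs)
    return n_params * math.log(n_trials) - 2.0 * log_likelihood(chosen_probs)


def mean_trial_likelihood(chosen_probs):
    """Geometric-mean likelihood per trial (assumed averaging scheme)."""
    return math.exp(log_likelihood(chosen_probs) / len(chosen_probs))


# Hypothetical model that assigns probability 0.5 to each observed choice.
probs = [0.5, 0.5, 0.5, 0.5]
score = bic(probs, n_params=2)
per_trial = mean_trial_likelihood(probs)
```

With three stimuli per trial, a model that chooses at random would have a per-trial likelihood of one-third, which is the natural baseline for panels (A) and (C).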

Figure 4. Neural Value and Reward Prediction Error Signals Are Biased Both by Attention for Choice and by Attention for Learning
(A) BOLD activity in the vmPFC was significantly correlated with the value estimates of the ACL model, controlling for the value estimates of the AC, AL, and UA models.
(B) BOLD activity in the striatum correlated with reward prediction errors generated by the ACL model, controlling for reward prediction error estimates from the AC, AL, and UA models. In sum, the ACL model’s predictions for value and prediction errors best corresponded to their respective neural correlates.

Figure 5. Attention Is Modulated by Ongoing Learning
(A) Proportion of trials in which the most attended dimension was the dimension that included the highest-valued feature. This proportion, in all cases significantly above chance, was highest in trials with strong attention bias as compared to those with moderate and weak attention bias.
(B) Trials with a higher SD of the highest values across dimensions (SDV) showed stronger attention bias.
(C) The probability of an attention switch was highest on low SDV trials. Overall, the greater the difference between feature values, the stronger the attention bias and the less likely participants were to switch attention. Black lines: means of corresponding null distributions generated from a bootstrap procedure in which attention weights for each game were replaced by weights from a randomly selected game from the same participant.
(D) Coefficients of a logistic regression predicting attention switches from absence of reward on the preceding trials. Outcomes of the past four trials predicted attention switches reliably.
(E and F) Comparison of models of attention fitted separately to the eye-tracking (E) and the MVPA (F) measures, according to the root-mean-square deviation (RMSD) of the model’s predictions from the empirical data (lower values indicate a better model). Plotted is the subject-wise average per-trial RMSD calculated from holdout games in leave-one-game-out cross-validation (Experimental Procedures). The Value model (dark grey) has the lowest RMSD.
Error bars: 1 SEM. ∗∗∗p < 0.001; ∗∗p < 0.01; ∗p < 0.05.
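Two quantities in this caption are simple enough to sketch: SDV, the SD of the highest feature value in each dimension, and the RMSD between predicted and measured attention weights on a trial. The code below is illustrative only. Using the population SD (rather than the sample SD) for SDV is an assumption, as are the function names and example numbers; the cross-validation and model fitting steps are not shown.

```python
import math
import statistics


def sdv(values_by_dimension):
    """SD across dimensions of each dimension's highest feature value.

    values_by_dimension: one sequence of learned feature values per
    dimension (faces, landmarks, tools).
    """
    highest = [max(values) for values in values_by_dimension]
    return statistics.pstdev(highest)  # population SD (assumed)


def rmsd(predicted, observed):
    """Root-mean-square deviation between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical feature values: one face feature clearly dominates.
spread = sdv([[0.1, 0.9, 0.2], [0.3, 0.3, 0.3], [0.0, 0.3, 0.1]])
# Hypothetical predicted vs. measured attention weights on one trial.
error = rmsd([0.5, 0.3, 0.2], [0.6, 0.2, 0.2])
```

A high SDV means one dimension holds a feature valued well above the best features of the other dimensions, which is the condition under which the caption reports stronger attention bias and fewer switches.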

Figure 6. Neural Correlates of Attention Switches
BOLD activity in a frontoparietal network was higher on switch trials than on stay trials.

Figure 7. The Frontoparietal Attention Network Is Selectively Activated on Switch Trials
Left: ROIs in the frontoparietal attention network defined using Neurosynth.
Right: regression coefficients for switch trials (t) and the four trials preceding a switch (t-1 to t-4). Mean ROI activity increased on switch trials (regression weights for trial t are significantly positive), but not on trials preceding a switch.
Error bars: SEM. ∗∗p < 0.01, ∗p < 0.05, †p < 0.1.

Figure 8. vmPFC and Frontal Attention Regions Were Anticorrelated on Stay Trials
A PPI analysis with vmPFC activity as seed regressor and two task regressors (and PPI regressors) for switch trials and for stay trials revealed that activity in the dlPFC, preSMA, vlPFC, and striatum (not shown) was anticorrelated with activity in the vmPFC specifically on stay trials, above and beyond the baseline connectivity between these areas.
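In a PPI analysis, the interaction regressor is the product of the seed time course and a task regressor, so its coefficient captures connectivity that is specific to that task condition beyond the main effects of seed and task. The sketch below shows only that product, in a deliberately simplified form. It assumes the seed signal and the task indicator are already sampled on the same time grid, it mean-centers the seed but not the task indicator, and it omits the deconvolution and hemodynamic convolution steps that fMRI packages typically apply. The function name and the numbers are hypothetical.

```python
def ppi_regressor(seed, task):
    """Element-wise product of the mean-centered seed and a task regressor.

    seed: seed-region activity per time point (e.g., vmPFC).
    task: task indicator per time point (1 on stay trials, 0 otherwise).
    """
    if len(seed) != len(task) or not seed:
        raise ValueError("need two non-empty sequences of equal length")
    mean = sum(seed) / len(seed)
    return [(s - mean) * t for s, t in zip(seed, task)]


# Hypothetical seed activity and stay-trial indicator.
seed_activity = [1.0, 2.0, 3.0, 4.0]
stay_trials = [1, 0, 1, 0]
ppi_stay = ppi_regressor(seed_activity, stay_trials)
```

A second regressor built the same way from the switch-trial indicator would give the pair of PPI terms the caption describes, entered into the model alongside the seed and the two task regressors.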