Human Reward / Stimulus / Response Signal Experiment: Data and Analysis

Draws on: Alan and Bill’s experiment; the Usher & McClelland model and experiments; Patrick Simen’s model; Sam and Phil’s analysis; Juan’s further analysis.

Human experiment examining the reward bias effect, with the response signal given at different times after target onset.
Target stimuli are rectangles shifted 1, 3, or 5 pixels L or R of fixation.
Reward cue occurs 750 msec before the stimulus: a small arrowhead pointing L or R, visible for 250 msec.
Only biased reward conditions (2 vs 1 and 1 vs 2) are used.
Response signal occurs at different times after target onset: 0, 75, 150, 225, 300, 450, 600, 900, 1200, or 2000 msec.
Participant receives reward only if the response is correct and occurs within 250 msec of the response signal.
Participants were run for 15-25 sessions to provide stable data.
Data shown are from later sessions in which effects were all stable.

A participant with very little reward bias
Top panel shows the probability of the response giving the larger reward as a function of actual response time, for combinations of stimulus shift (1, 3, or 5 pixels) and reward-stimulus compatibility.
Lower panel shows the data transformed to z scores, which correspond to the theoretical construct:
$$\frac{\mathrm{mean}(x_1(t)-x_2(t)) + \mathrm{bias}(t)}{\mathrm{sd}(x_1(t)-x_2(t))}$$
where x1 represents the state of the accumulator associated with the greater reward, x2 the same for the lesser reward, and the subject S is thought to choose the larger reward if x1(t) - x2(t) + bias(t) > 0.
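The z-score transform behind the lower panel can be sketched as follows. This is a minimal illustration, assuming the transform is the inverse standard-normal CDF applied to each choice probability. The function name `to_z`, the clipping constant `eps`, and the example probabilities are hypothetical; the clipping is added here only so that probabilities of exactly 0 or 1 stay finite.

```python
from statistics import NormalDist


def to_z(p, eps=1e-3):
    """Map a choice probability to a z score via the inverse normal CDF.

    eps is an assumed clipping bound, not a value taken from the slides.
    """
    p = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(p)


# Hypothetical probabilities of choosing the larger reward.
probs = [0.5, 0.75, 0.9, 0.99]
z_scores = [to_z(p) for p in probs]
```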

Participants Showing Reward Bias

Analysis Assumptions
[Figure: two distributions of the decision variable x with means -m and +m, and the criterion marked at -Xc.]
Decision variable x varies as a function of t. The choice is made at some time t = signal lag + rt.
At the time the choice is made:
For a single difficulty level, there are two distributions with means +m and -m and equal sd s, set to 1. Choose high reward if the decision variable x > -Xc.
For three difficulty levels, with fixed s = 1 and means mi (i = 1, 2, 3), assume the same Xc for all difficulty levels.
Xc can be regarded as a positive increment to the state of the decision variable; high reward is chosen if x > 0 in this case.
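Under the single-difficulty assumptions above (means +m and -m, s = 1, choose high reward if x > -Xc), the two choice probabilities are Phi(m + Xc) and Phi(-m + Xc), so m and Xc can be recovered from their z scores. The sketch below illustrates that algebra only; the function name `fit_single_level` and the example probabilities are hypothetical, and nothing here is claimed to be the fitting procedure actually used in the analysis.

```python
from statistics import NormalDist


def fit_single_level(p_high_given_plus, p_high_given_minus):
    """Return (m, Xc) for the equal-variance model with s = 1.

    P(high | mean +m) = Phi( m + Xc)
    P(high | mean -m) = Phi(-m + Xc)
    """
    inv = NormalDist().inv_cdf
    z_plus = inv(p_high_given_plus)
    z_minus = inv(p_high_given_minus)
    m = (z_plus - z_minus) / 2.0   # half the separation of the two z scores
    xc = (z_plus + z_minus) / 2.0  # their midpoint is the criterion offset
    return m, xc


# Hypothetical choice probabilities, for illustration only.
m_hat, xc_hat = fit_single_level(0.90, 0.30)
```

In this parameterization the conventional sensitivity is d' = 2m, since the two means are 2m apart in units of s.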

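Subject’s sensitivity, as defined in the theory of signal detectability, is computed in two cases: with only one difficulty level, and with three difficulty levels.
When the response signal delay varies: for each subject, sensitivity is fit with the function from UM’01 (Usher & McClelland, 2001).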

Subject Sensitivity

Observed “bias”, treated as a positive offset favoring the response associated with high reward.
Optimal “bias” Xc/s based on observed sensitivity data.
[Figure: distributions of the decision variable with the criterion marked at -Xc/s.]
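For intuition about the optimal “bias”, the single-difficulty case has a closed form. This is a sketch under stated assumptions: both stimulus directions are equally likely, s = 1, and rewards are 2 vs 1. Expected reward is then maximized where the reward-weighted densities cross, which gives Xc = ln(2) / (2m). The function names are hypothetical, and the three-difficulty case with a shared Xc is not covered here.

```python
import math
from statistics import NormalDist


def expected_reward(xc, m, r_high=2.0, r_low=1.0):
    """Expected reward per trial when high reward is chosen if x > -xc."""
    phi = NormalDist().cdf
    # Stimulus favors high reward (mean +m): correct with P(x > -xc) = Phi(m + xc).
    # Stimulus favors low reward (mean -m): correct with P(x < -xc) = Phi(m - xc).
    return 0.5 * r_high * phi(m + xc) + 0.5 * r_low * phi(m - xc)


def optimal_xc(m, r_high=2.0, r_low=1.0):
    """Criterion offset that maximizes expected_reward for one difficulty level."""
    return math.log(r_high / r_low) / (2.0 * m)


# Hypothetical sensitivities: the harder condition gets the larger offset.
offsets = {m: optimal_xc(m) for m in (0.5, 1.0, 2.0)}
```

The closed form makes the dependence on sensitivity explicit: as m shrinks, the optimal offset grows.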

Some possible models
OU process (lambda < 0, n0 = 0) following F&H, with the reward bias effect implemented as:
1. An alteration in initial condition, subject to decay
2. Optimal time-varying decision boundary outside of the OU process
3. An input ‘current’ starting at presentation of the reward signal
3.1. Noise from reward onset
3.2. Noise from stimulus onset
4. A constant offset or criterion shift unaffected by time

1. Reward as a change in initial condition, subject to decay
Note: Effect of the bias decays away for lambda < 0. There is a dip at [time expression missing]. At t=0, p=1.
Feng & Holmes notes
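The behavior noted for model 1 can be reproduced with the standard moments of an OU process. This is a minimal sketch, assuming the difference variable x = x1 - x2 follows dx = (lambda*x + a) dt + c dW with a fixed starting point x0 that carries the reward bias and no starting-point variability. The names `a`, `c`, `x0` and all parameter values are illustrative assumptions, not values from the Feng & Holmes notes.

```python
import math
from statistics import NormalDist


def ou_moments(t, lam, a, c, x0=0.0):
    """Mean and variance at time t > 0 of dx = (lam*x + a) dt + c dW, x(0) = x0."""
    growth = math.exp(lam * t)
    mean = x0 * growth + (a / lam) * (growth - 1.0)
    var = (c * c) / (2.0 * lam) * (growth * growth - 1.0)
    return mean, var


def z_high(t, lam, a, c, x0):
    """z score of choosing the high-reward response at time t > 0."""
    mean, var = ou_moments(t, lam, a, c, x0)
    return mean / math.sqrt(var)


def p_high(t, lam, a, c, x0):
    """Probability of choosing the high-reward response at time t > 0."""
    return NormalDist().cdf(z_high(t, lam, a, c, x0))


# Illustrative parameters; the stimulus favors the high-reward response (a > 0).
times = [0.01 * k for k in range(1, 301)]
curve = [p_high(t, lam=-2.0, a=1.0, c=1.0, x0=0.1) for t in times]
```

With these illustrative values the curve starts near 1, because the variance vanishes as t approaches 0, falls to a minimum, and then climbs back to its asymptote, while the contribution of x0 shrinks as exp(lambda*t).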

2. Time-varying optimal bias (outside of the OU process)
Note: Effect of the bias persists. There is a dip at [time expression missing]. At t=0, p=1.
The smaller the stimulus effect, the larger the bias.
The harder the stimulus condition, the later the dip.
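One way to illustrate model 2 is to apply the reward-maximizing criterion to the OU moments at each time. The sketch assumes a single difficulty level, equally likely stimulus directions, a 2:1 reward ratio, and an OU process dx = (lambda*x + a) dt + c dW starting at 0; under those assumptions the optimal offset at time t is ln(2) * var(t) / (2 * mean(t)). All names and parameter values are hypothetical and are not taken from the slides.

```python
import math
from statistics import NormalDist


def unbiased_moments(t, lam, a, c):
    """Mean and variance at time t > 0 of the OU process started at 0."""
    growth = math.exp(lam * t)
    mean = (a / lam) * (growth - 1.0)
    var = (c * c) / (2.0 * lam) * (growth * growth - 1.0)
    return mean, var


def optimal_bias(t, lam, a, c, reward_ratio=2.0):
    """Reward-maximizing offset, applied outside the OU process."""
    mean, var = unbiased_moments(t, lam, a, c)
    return math.log(reward_ratio) * var / (2.0 * mean)


def p_high_biased(t, lam, a, c, sign=1):
    """P(choose high reward); sign=+1 if the stimulus favors it, -1 otherwise."""
    mean, var = unbiased_moments(t, lam, a, c)
    bias = optimal_bias(t, lam, a, c)
    return NormalDist().cdf((sign * mean + bias) / math.sqrt(var))


# Illustrative comparison of an easy (a = 1.0) and a hard (a = 0.5) stimulus.
bias_easy = optimal_bias(1.0, lam=-2.0, a=1.0, c=1.0)
bias_hard = optimal_bias(1.0, lam=-2.0, a=0.5, c=1.0)
```

In this sketch the offset does not decay: for large t it settles at ln(2) * c^2 / (4a), and it is larger when the stimulus input a is smaller.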

3.1. Reward acts as an input “current” that stays on from the reward signal to the end of the trial; noise starts at reward onset
Reward signal comes tau seconds before the stimulus.
Note: Effect of the bias persists. There is no dip. At t=0, p<1.
They forgot the 2 here (in the 2*lambda term).
Theoretically, the dip should happen at 1/lambda * log((ac - bk)/(ack^2 - bk^2)), where k = exp(lambda*tau). The t calculated is negative.
Feng & Holmes notes

3.2. Same as 3.1, but variability is introduced only at stimulus onset
Note: Effect of the bias persists. There is a dip at [time expression missing]. At t=0, p=1, since all accumulators have no variance.

4. Reward as a constant offset
Note: Equivalent to 3.2 for large lambda*t. There is a dip at [time expression missing]. At t=0, p=1.

Some possible models
OU models (lambda < 0, n0 = 0) following F&H, with the reward bias effect implemented as:
1. An alteration in initial condition, subject to decay
2. Optimal time-varying decision boundary outside of the OU process
3. An input ‘current’ starting at presentation of the reward signal
3.1. Noise from reward onset
3.2. Noise from stimulus onset
4. A constant offset or criterion shift unaffected by time
While none fit perfectly, starting point variability (n0 > 0) would potentially improve 3.2 and 4.

Jay’s favorite mechanistic story (draws from Simen’s model)
Participant learns to inject waves of activation that prime the response accumulators; the waves peak just after stimulus onset and leave a residual.
The wave is higher for the high-reward response.
Stimulus activation accumulates as in the LCAM (leaky competing accumulator model).
Response signal initiates added drive to both accumulators equally.
First accumulator to reach a fixed threshold initiates the response.