Collective neural dynamics and drift-diffusion models for simple decision tasks. Philip Holmes, Princeton University. Eric Brown (NYU), Rafal Bogacz (Bristol, UK).


Collective neural dynamics and drift-diffusion models for simple decision tasks. Philip Holmes, Princeton University. Eric Brown (NYU), Rafal Bogacz (Bristol, UK), Jeff Moehlis (UCSB), Phil Eckhoff, Sophie Liu, Angela Yu and Jonathan Cohen (Princeton), Miriam Zacksenhouse (Technion), C. Law, P.M. Conolly and Josh Gold (Penn). Thanks to: NIMH, DoE, AFOSR and the Burroughs-Wellcome Foundation. ELE ISI seminar, Dec 6, 2007.

The multiscale brain: Ingredients: on the order of 10^11 neurons and 10^14 synapses. Structure: layers and folds. Communication: via action potentials, spikes, bursts. Sources: webvision.med.utah.edu/VisualCortex.html

Multiple scales in the brain and in math: today's talk.

Contents 1: Drift-diffusion models and an optimal speed-accuracy tradeoff: behavioral tests. 2: Threshold setting, uncertainty, and information-gap theory: more behavioral tests. 3: Drift-diffusion models, deadlined responses, psychometric functions, and learning. 4: Incorporating biases and priors: top-down vs bottom-up. Moral: you can learn a lot from a simple model.

A really simple decision task: "On each trial you will be shown one of two stimuli, drawn at random. You must identify the direction (L or R) in which the majority of dots are moving." The experimenter can vary the coherence of movement (% moving L or R) and the delay between response and next stimulus. Correct decisions are rewarded. "Your goal is to maximize rewards over many trials in a fixed period." You must be fast, and right! (Demo frames: 30% and 5% coherence. Courtesy: W. Newsome.) Behavioral measures: reaction time distributions, error rates. More complex decisions: buy or sell? Neural economics.

1. Making the most of a stochastic process. Underlying hypothesis: human and animal behaviors have evolved to be (near) optimal. (Bialek et al.: fly vision & steering.) Drift-diffusion among multiple alternatives: McMillen & H, J. Math. Psych., 2006.

An optimal decision procedure for noisy data: the Sequential Probability Ratio Test. Mathematical idealization: during the trial, we draw noisy samples from one of two distributions, p_L(x) or p_R(x) (left- or right-going dots). The SPRT works like this: set up two thresholds and keep a running product of likelihood ratios (equivalently, a running sum of log likelihood ratios). When the running tally first exceeds the upper threshold or falls below the lower one, declare victory for R or L respectively. Theorem (Wald, Barnard): among all fixed-sample or sequential tests, the SPRT minimizes the expected number of observations n for a given accuracy. [For fixed n, the SPRT maximizes accuracy (Neyman-Pearson lemma).] (Figure: the two distributions p_L(x), p_R(x).)
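As a concrete illustration, here is a minimal SPRT sketch for two Gaussian hypotheses; the means, variance, and threshold are arbitrary illustrative values, not fitted to any of the experiments discussed:

```python
import numpy as np

def sprt(samples, mu_R=0.5, mu_L=-0.5, sigma=1.0, z=3.0):
    """Sequential Probability Ratio Test for two Gaussian hypotheses.

    Accumulates the log likelihood ratio log[p_R(x)/p_L(x)] sample by
    sample and stops when it crosses +z (declare R) or -z (declare L).
    Returns (decision, number of samples used); decision is None if the
    sample stream runs out first.
    """
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # Gaussian log likelihood ratio for equal variances:
        # log p_R(x) - log p_L(x) = (mu_R - mu_L)*(x - (mu_R + mu_L)/2)/sigma^2
        llr += (mu_R - mu_L) * (x - 0.5 * (mu_R + mu_L)) / sigma**2
        if llr >= z:
            return "R", n
        if llr <= -z:
            return "L", n
    return None, len(samples)

rng = np.random.default_rng(0)
decision, n = sprt(rng.normal(0.5, 1.0, size=10_000))  # true stimulus: R
```

Raising the threshold z trades more samples for fewer errors, which is exactly the speed-accuracy tradeoff the talk develops.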

The continuum limit is a DD process. Take logarithms: multiplication of likelihood ratios becomes addition. Take a continuum limit: addition becomes integration. The SPRT becomes a drift-diffusion (DD) process (a cornerstone of 20th century physics): dx = A dt + c dW, with drift rate A and noise strength c. Here x(t) is the accumulated evidence (the log likelihood ratio). When x reaches either threshold, declare R or L the winner. But do humans (or monkeys, or rats) drift and diffuse? Evidence comes from three sources: behavior, neural recordings, and mathematical models.
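The DD process is easy to simulate directly; the sketch below uses a plain Euler-Maruyama discretization (parameter values are illustrative) and can be checked against the standard closed forms ER = 1/(1 + exp(2Az/c^2)) and DT = (z/A)·tanh(Az/c^2):

```python
import numpy as np

def simulate_dd(A=1.0, c=1.0, z=1.0, dt=1e-3, n_trials=2000, seed=1):
    """Euler-Maruyama simulation of dx = A dt + c dW with thresholds +/- z.

    Returns (error rate, mean decision time). For the pure DD process these
    approach the analytic values ER = 1/(1 + exp(2*A*z/c**2)) and
    DT = (z/A) * tanh(A*z/c**2) as dt -> 0 and n_trials -> infinity.
    """
    rng = np.random.default_rng(seed)
    rts, errors = [], 0
    for _ in range(n_trials):
        x, t = 0.0, 0.0
        while abs(x) < z:
            x += A * dt + c * np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts.append(t)
        if x <= -z:  # lower threshold is the wrong answer (drift is toward +z)
            errors += 1
    return errors / n_trials, float(np.mean(rts))

er, dt_mean = simulate_dd()
# Analytic values for A = c = z = 1: ER = 1/(1+e^2) ~ 0.119, DT = tanh(1) ~ 0.762
```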

Behavioral evidence: RT distributions. Human reaction-time data in free-response mode can be fitted by the first-passage (threshold-crossing) times of a DD process. A prior or bias toward one alternative can be implemented by shifting the starting point. Ratcliff et al., Psych. Rev. 1978, 1999, 2004, …; Simen et al., in review. (Figure: sample paths with drift A between thresholds +Z and -Z.)

Neural evidence 1: firing rates. Spike rates of neurons in oculomotor areas rise during stimulus presentation; monkeys signal their choice after a threshold is crossed. J. Schall, V. Stuphorn, J. Brown, Neuron: frontal eye field (FEF) recordings. J.I. Gold, M.N. Shadlen, Neuron: lateral intraparietal area (LIP) recordings. (Figure: firing rates rising to thresholds.)

Neural evidence 2: spiking neuron models. Working hypothesis: motion-sensitive MT cells in visual cortex pass noisy signals on to LIP, FEF, …, where integration occurs. Adapted from K.H. Britten, M.N. Shadlen, W.T. Newsome, J.D. Schall & A. Movshon, various papers. Adapted from X.-J. Wang, Neuron, 2002; K.-F. Wong & X.-J. Wang, J. Neurosci.: pools of spiking neurons with synaptic connections. Ongoing work extending to model NE modulation of LIP: P. Eckhoff & K.-F. Wong. Stochastic averaging over populations: A. Saxe. (Figure: MT-to-LIP circuit with thresholds.)

Model evidence: integration of noisy signals. We can model the decision process as the integration of evidence by leaky competing accumulators (LCAs) y1, y2 (Usher & McClelland, 1995, 2001). Subtracting the accumulated evidence yields a DD process for the difference x = y1 - y2. Brown et al., Int. J. Bifurcation & Chaos 15; Bogacz et al., Psych. Review 113, 2006. (Reduction via a stochastic center manifold; figure shows thresholds 1 and 2.)
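A quick numerical check of the LCA-to-DD reduction, in the balanced case where leak equals inhibition (all parameter values below are illustrative, not fitted):

```python
import numpy as np

def lca_difference(I1=1.2, I2=1.0, k=2.0, w=2.0, c=0.3, dt=1e-3, T=1.0, seed=2):
    """Leaky competing accumulators; returns the trajectory of y1 - y2.

    dy_i = (-k*y_i - w*y_j + I_i) dt + c dW_i.  Subtracting the two
    equations gives dx = (-(k - w)*x + (I1 - I2)) dt + sqrt(2)*c dW:
    an OU process in general, and a pure drift-diffusion process when
    the leak k equals the inhibition w (the balanced case).
    """
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    y1 = y2 = 0.0
    xs = np.empty(n)
    for i in range(n):
        dW1, dW2 = np.sqrt(dt) * rng.standard_normal(2)
        y1, y2 = (y1 + (-k * y1 - w * y2 + I1) * dt + c * dW1,
                  y2 + (-k * y2 - w * y1 + I2) * dt + c * dW2)
        xs[i] = y1 - y2
    return xs

x = lca_difference()
# With k == w the mean of x(t) grows like (I1 - I2)*t, as for a DD process.
```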

Reduction of a neural network to a DD/OU process. OK, maybe. But do humans (or monkeys, or rats) optimize? Model for the Eriksen task: control of attention leads to variable drift rates. Servan-Schreiber et al., 1998; Liu, H & Cohen, Neural Comp. (in press). (Figure: accuracy and response-time data fits; bottom-up attention vs top-down control.)

An optimal speed-accuracy tradeoff 1. The task: maximize rewards for a succession of free-response trials in a fixed period. Reward rate: RR = (fraction correct)/(average time between responses) = (1 - ER)/(RT + D), where RT is the mean reaction time, ER the error rate, and D the response-to-stimulus interval. A threshold set too low is fast but error-prone; one set too high is accurate but slow; an optimal threshold lies in between. (Cartoon: a sequence of trials with decisions D, rewards $, and errors X.)

An optimal speed-accuracy tradeoff 2. How fast to be? How careful? The DDM delivers an explicit solution to the speed-accuracy tradeoff in terms of just three parameters: the normalized threshold z/A, the signal-to-noise ratio (A/c)^2, and D. Setting the threshold optimally, we can express RT in terms of ER and calculate a unique, parameter-free Optimal Performance Curve: RT/D = F(ER).
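Given the standard DD closed forms for ER and mean decision time, the reward-rate tradeoff can be explored numerically; this sketch (with illustrative parameter values) grid-searches the threshold that maximizes RR:

```python
import numpy as np

def reward_rate(ztil, atil, D):
    """RR = (1 - ER)/(DT + D) for the pure DD process, using the standard
    closed forms ER = 1/(1 + exp(2*atil*ztil)) and DT = ztil*tanh(atil*ztil),
    where ztil = z/A is the normalized threshold and atil = (A/c)**2 the SNR.
    (Non-decision time is folded into D here for simplicity.)"""
    er = 1.0 / (1.0 + np.exp(2.0 * atil * ztil))
    dt = ztil * np.tanh(atil * ztil)
    return (1.0 - er) / (dt + D)

# Grid-search the threshold that maximizes reward rate:
atil, D = 2.0, 1.0
zgrid = np.linspace(1e-4, 3.0, 30_000)
z_opt = zgrid[np.argmax(reward_rate(zgrid, atil, D))]
```

The resulting optimum is interior: neither maximal speed (zero threshold) nor maximal accuracy (huge threshold) maximizes reward rate.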

Behavioral test 1: most are not optimal. Do people adopt the optimal strategy? Some do; some don't. Is this because they are optimizing a different function, e.g. weighting accuracy more? Or are they trying, but unable to adjust their thresholds? Bottom line: too much accuracy is bad for your bottom line. (Princeton undergrads don't like to make mistakes.) (Figure: data plotted against the OPC.)

Behavioral test 2: a premium on accuracy? A modified reward-rate function with a penalty for errors gives a family of OPCs with a free parameter: the weight placed on accuracy. It fits the data as a whole better, but what's explained? Short version: Holmes et al., IEICE Trans. E88A. Long version: Bogacz et al., Psych. Review 113, 2006. (Figure: OPCs for increasing accuracy weight, with data fit.)

2. Choosing thresholds. Q: Suboptimal behavior could be reckless (threshold too low) or cautious (threshold too high). Why do most people tend to be cautious? Could it be a rational choice? Which type of behavior leads to smaller losses? A: Examine the RR function. The slope on the high-threshold side is smaller than the slope on the low-threshold side, so for deviations of equal magnitude, conservative errors cost less: a threshold set too high incurs a small reward loss, one set too low a larger loss. (Simple answer; we can learn more from info-gap theory.)
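The asymmetry of the RR function around its maximum can be checked directly, using the standard DD closed forms for ER and decision time with illustrative parameters:

```python
import numpy as np

def reward_rate(ztil, atil=2.0, D=1.0):
    # Standard DD closed forms: ER = 1/(1+exp(2*atil*ztil)),
    # DT = ztil*tanh(atil*ztil); then RR = (1 - ER)/(DT + D).
    er = 1.0 / (1.0 + np.exp(2 * atil * ztil))
    return (1.0 - er) / (ztil * np.tanh(atil * ztil) + D)

z = np.linspace(1e-4, 3.0, 30_000)
rr = reward_rate(z)
z_opt = z[np.argmax(rr)]

delta = 0.5 * z_opt  # equal-magnitude deviations on either side of optimum
loss_high = reward_rate(z_opt) - reward_rate(z_opt + delta)  # cautious error
loss_low = reward_rate(z_opt) - reward_rate(z_opt - delta)   # reckless error
# loss_high < loss_low: overshooting the threshold costs less reward.
```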

Thresholds and gain changes. So, how might thresholds be adjusted 'on the fly' when task conditions change? Neurons act like amplifiers, transforming input spikes to output spike rates. Increased gain sharpens discrimination and reduces the effect of noise. Servan-Schreiber et al., Science. Neurotransmitter release can increase gain; specifically, norepinephrine can assist processing and speed responses in decision tasks. (Figure: sigmoidal input-output (spike rate) curves; gain and threshold.)
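A toy illustration of the gain idea, using a generic sigmoidal input-output curve; the functional form and the numbers are assumptions for illustration, not a fitted neural model:

```python
import numpy as np

def firing_rate(inp, gain=1.0, bias=0.0):
    """Sigmoidal f-I curve, a common abstraction of a neural amplifier.
    The gain parameter steepens the curve around the bias point."""
    return 1.0 / (1.0 + np.exp(-gain * (inp - bias)))

# Two nearby inputs straddling the bias point:
lo, hi = -0.2, 0.2
sep_low_gain = firing_rate(hi, gain=1.0) - firing_rate(lo, gain=1.0)
sep_high_gain = firing_rate(hi, gain=5.0) - firing_rate(lo, gain=5.0)
# Higher gain yields a larger output separation for the same input difference,
# i.e. better discrimination of the two inputs against fixed output noise.
```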

Locus coeruleus offers a mechanism for gain changes. The LC, a neuromodulatory nucleus in the brainstem, releases norepinephrine (NE) widely in the cortex, tuning performance. The LC has only ~30,000 neurons, but each makes ~250,000 synapses. Transient bursts of spikes triggered by salient stimuli cause gain changes, and thus a bigger response to the same stimulus. Devilbiss and Waterhouse, Synapse; Aston-Jones & Cohen, Ann. Rev. Neurosci.; Usher et al., Science, 1999; Brown et al., J. Comp. Neurosci.

A model for block-by-block threshold adjustment. An algorithm based on reward-rate estimates and a linear reward-rate rule can make rapid threshold updates iteratively. Simen, Cohen & H, Neural Networks 19. But: Can RR be estimated sufficiently accurately? (This requires good interval timing.) Can the rule be learned (RL)? Does noise cause overestimates? A complicated story, which leads us to consider the role of uncertainty.

The information-gap approach. Info-gap theory allows for uncertainties in parameter estimates, and can be applied to DD-process models of forced-choice tasks. E.g., suppose that the response-to-stimulus interval is only known to lie within bounds around an estimate. There are two approaches. Min-max: choose the threshold that maximizes the worst-case RR over the given uncertainty. Robust-satisficing: maximize the uncertainty for which a given (and necessarily suboptimal) RR can be guaranteed. Similar treatment for uncertainties in SNR. Y. Ben-Haim, Information-Gap Decision Theory: Decisions Under Severe Uncertainty, Academic Press, New York, 2006.
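The min-max recipe is easy to sketch numerically. Assuming the standard DD closed forms for ER and decision time, and an RSI known only to lie in an interval, one picks the threshold whose worst-case RR is largest (all parameter values are illustrative):

```python
import numpy as np

def reward_rate(ztil, atil, D):
    # Standard DD closed forms: ER = 1/(1+exp(2*atil*ztil)),
    # DT = ztil*tanh(atil*ztil); then RR = (1 - ER)/(DT + D).
    er = 1.0 / (1.0 + np.exp(2 * atil * ztil))
    return (1.0 - er) / (ztil * np.tanh(atil * ztil) + D)

def minmax_threshold(atil=2.0, D_hat=1.0, alpha=0.5):
    """Min-max over an uncertain RSI D in [D_hat - alpha, D_hat + alpha]:
    pick the threshold whose WORST-case reward rate is largest."""
    zgrid = np.linspace(1e-4, 3.0, 5_000)
    Dgrid = np.linspace(D_hat - alpha, D_hat + alpha, 201)
    worst = np.min(reward_rate(zgrid[:, None], atil, Dgrid[None, :]), axis=1)
    return zgrid[np.argmax(worst)]

z_robust = minmax_threshold()
z_nominal = minmax_threshold(alpha=0.0)
# Here the worst case is the longest RSI, so the robust threshold matches the
# optimum for D = D_hat + alpha and sits above the nominal one: uncertainty
# about timing pushes the decision maker toward caution.
```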

(Figure panels: uncertain RSIs via RSPCs; uncertain RSIs via min-max, with performance bands and higher thresholds for poorer timers; uncertain SNRs via min-max.) Uncertainties in the response-to-stimulus interval, when treated via the min-max strategy, appear to match the overall data best.

Splitting the data into three subgroups by overall winnings makes the picture much clearer. The top 30% perform near-optimally; data from the next 60% and the bottom 10% are much better fit by the min-max strategy with uncertainty in the RSI. Weighted accuracy is the runner-up. Note: one parameter to fit for all curves except the OPC (which is parameter-free). Conjecture: conservatives are poor interval timers. [A. Saxe is running behavioral experiments to test this.] M. Zacksenhouse, PH & R. Bogacz (in review, 2007).

3. Fixed viewing time (deadlined) tests can also be modeled by DD processes. One considers the probability density of sample paths of the SDE, which is governed by the forward Fokker-Planck (Kolmogorov) equation: ∂p/∂t = -A(t) ∂p/∂x + (c²/2) ∂²p/∂x². General solutions for time-varying drift (SNR) are available, so …

… we can predict psychometric functions (PMFs): accuracy for a fixed viewing time T. We use a DD model with variable drift, parameterized by a scale factor, a coherence exponent, an asymptote, and a decay rate, fitted to data from monkeys performing the moving-dots task during training.
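For the simplest case of constant drift proportional to coherence (a hypothetical linear model; the fitted form above is richer, with a time-varying drift), the interrogation-protocol PMF has a closed form, accuracy = Phi(A*sqrt(T)/c), the probability that the accumulated evidence has the correct sign at time T:

```python
from math import erf, sqrt

def pmf(coherence, T=1.0, k=2.5, c=1.0):
    """Psychometric function for a pure DD process read out ('interrogated')
    at fixed viewing time T.  Since x(T) ~ N(A*T, c**2 * T), the probability
    of the correct sign is Phi(A*sqrt(T)/c), with Phi the standard normal
    CDF.  The drift A = k*coherence is an assumed linear stimulus model,
    with illustrative values of k and c."""
    A = k * coherence
    return 0.5 * (1.0 + erf(A * sqrt(T) / (c * sqrt(2.0))))

cohs = [0.0, 0.05, 0.15, 0.3, 0.6]
acc = [pmf(co) for co in cohs]
# Accuracy rises monotonically from 0.5 (chance) toward 1 with coherence.
```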

DD parameter changes can characterize learning. Scale factor: steady increase in SNR; MT-to-LIP synapses strengthen. Coherence exponent and decay rate: irregular oscillations; rescaling of the sensors? P. Eckhoff, PH, P. Conolly, C. Law & J.I. Gold, New J. Phys., in press (2007).

4. DD can incorporate bias and expectations as well as sensory evidence (work in progress). A motion discrimination task with multiple reward conditions: the subject is informed of the reward bias before the stimulus appears, then must discriminate motion direction. Variable coherences span the psychophysical threshold, creating a range of difficulty and a conflict between sensory and reward information: bottom-up and top-down influences. Only correct choices are rewarded. Thanks to: A. Rorie & W.T. Newsome (in prep., 2007).

Relative reward magnitudes modulate behavior (n = 51): absolute magnitude has no effect on choice; relative magnitude biases choices. (Figure: choice functions for targets T1 and T2 under the different reward conditions.)

Neural data: representation of relative reward magnitude in LIP (n = 51; chose left, T2, vs chose right, T1). The DD model can integrate bottom-up and top-down inputs, and may help elucidate this more complex decision process. Rorie & Newsome (in prep., 2007).

Psychometric functions predicted by DD/OU models. Using the DD or OU process with the interrogation protocol and a model for reward-expectation bias, we can compute PMFs. (Figure: PMFs for stable OU, DD, and unstable OU variants.)

DD/OU models with reward bias priors, 1. Simple models for reward-expectation bias, applied during the reward-cue period and the motion period, modify PMFs; this produces the shifted PMF on the previous slide. We can also (… calculate, calculate, …) find optimal biases that maximize rewards as functions of the reward ratio and SNR, and compare with animal performance. S. Feng & H, work in progress, 2007.

DD/OU models with reward bias priors, 2. Optimal biases depend logarithmically on the reward ratio (bias on the initial condition shown here, for increasing coherence). We will fit such models to the Rorie-Newsome data.

Summary Neural activity in simple decisions resembles a DD process: the model predicts optimal speed-accuracy tradeoffs. Information gap theory allows for uncertainty in parameter estimates: robust suboptimal performance. Fast threshold adjustments can optimize rewards in free response mode: relies on interval timing ability. The DD process can model cued responses, predict psychometric functions for fixed viewing times: variable SNR and drift rates can track slow learning. DD processes extend to include top-down cognitive control. Good mathematical models are not just (reasonably) faithful; they’re also simple and (approximately) soluble. They focus, sharpen questions, and simplify. Thanks for your attention!

Additional material: DDM approximates Bayesian updating; incorporating biased stimuli via initial conditions; incorporating sequence effects via initial conditions; some details on information-gap theory.

Bayesian updating can also be reduced to DD processes. S. Liu, A. Yu & PH (in review, 2007). Eriksen task, 2AFC with conflict: respond to the center stimulus. Compatible: SSS, HHH; incompatible: HSH, SHS.

Biasing choices via initial conditions 1. Prior expectations can be incorporated into DD models by biasing the initial conditions. For example, for biased stimuli, S1 with probability Π and S2 with probability 1 - Π, the unbiased DD (starting point at zero) is replaced by a biased DD whose starting point is shifted toward the more probable alternative. Simen et al., in review. Biased rewards can be treated similarly, with dependence on D also. (For the free-response protocol.)
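One standard choice, assuming the accumulator tracks the log likelihood ratio, places the prior in the starting point; a sketch (the formula follows the DDM literature, e.g. Bogacz et al., Psych. Review 113, 2006; parameter values are illustrative):

```python
from math import log

def biased_start(prior_S1, A=1.0, c=1.0):
    """Starting point that builds a stimulus prior into a DD process.

    In a DD process with drift A and noise c, the accumulated evidence is
    the log likelihood ratio measured in units of c**2/(2*A), so a prior
    P(S1) = pi enters as the initial offset
        x0 = (c**2 / (2*A)) * log(pi / (1 - pi)):
    zero for unbiased stimuli, shifted toward the more frequent alternative
    otherwise.
    """
    return (c**2 / (2.0 * A)) * log(prior_S1 / (1.0 - prior_S1))

x0_unbiased = biased_start(0.5)   # starts at zero
x0_biased = biased_start(0.75)    # starts closer to the S1 threshold
```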

Biasing choices via initial conditions 2. More frequent stimuli (S1 > S2) or biased rewards also affect optimal thresholds, and initial conditions and thresholds can collide, predicting instant responding: always choose the more probable alternative when above a critical surface in (SNR, RSI) space. Simen et al., in review. (Figure: thresholds and initial conditions as functions of SNR and RSI, for biased stimuli and biased rewards.)

DD dynamics during successive blocks of trials with different stimulus probabilities and RSIs (figure).

Behavioral data: biased stimuli. Simen et al., in review. (Figure: RTs, ERs, and response probabilities for RSIs of 0.5 s, 1.0 s, and 2.0 s.)

Including sequence effects in 2AFC models. The LCA inherits initial conditions from the prior sequence: post-response relaxation during the RSI, a symmetric bias due to conflict (ACC), and an asymmetric bias due to expectation (PFC). Automatic facilitation and subjective expectancy? Or just different time constants of internal and top-down dynamics? Gao et al., in prep., 2007.

Model and behavioral data: RSI effects Gao et al., in prep., 2007.

Uncertainties in delays: the min-max approach. We work with the inverse reward rate IR. Assume the SNR is known. The min-max thresholds optimize the worst possible performance (i.e., minimize the maximum of IR over the uncertainty set). The internal maximization is achieved at the largest admissible delay, with a correspondingly higher threshold. The result is just a scaled version of the OPC!

Uncertainties in delays: the robust-satisficing approach. Robustness is the maximum uncertainty for which a given desired performance can be guaranteed. The resulting curve differs from the OPC, even in limiting cases.

Robustness and opportuneness (!) curves. Opportuneness describes 'windfall performance' that can occur under unusually favorable conditions (e.g., when the RSI is shorter than you think).

Details: parameter fits and likelihood ratios. From left to right, performance bands grow (greater uncertainty), desired performance degrades, and accuracy weights increase, with similar fit quality throughout. Runners-up: the RSPC for uncertain D, and the OPC for RA.

Conte Meeting, Nov 12, 2007. More DDancing about (OUch!), Project 6. DD/OU models can incorporate bias and expectations as well as sensory evidence: initial conditions and/or drift rates can reflect prior expectations regarding stimuli or rewards, or other top-down controls. And nonlinear terms may be essential to capture attractor dynamics (K.-F. Wong, …). NOT TODAY.