Presentation is loading. Please wait.

Presentation is loading. Please wait.

Psychology and Neurobiology of Decision-Making under Uncertainty Angela Yu March 11, 2010.

Similar presentations


Presentation on theme: "Psychology and Neurobiology of Decision-Making under Uncertainty Angela Yu March 11, 2010."— Presentation transcript:

1 Psychology and Neurobiology of Decision-Making under Uncertainty Angela Yu March 11, 2010

2 Introduction Speed Accuracyvs. Faster responses save time, but more errors What is optimal policy? What computations are involved? Are humans/animals optimal? How is it neurally implemented?

3 Outline Review of Bayesian decision theory: examples Quantifying temporal cost Optimal policy for 2-alternative forced choice – optimal policy (SPRT) – behavioral & neural data Decision under deadline

4 Bayesian Decision Theory: Estimation s x hidden variable observation ? loss function

5 Bayesian Decision Theory Decision policy Loss function: Expected loss: A policy  : x  d is a mapping from input x into decision d, where x is a sampled from the distribution p(x|s). The loss function l(d(x); s) depends on the decision d(x) and the hidden variable s. The expected loss associated with a policy  is averaged over x and s: The optimal policy minimizes this expected loss.

6 Examples Ex. 1: loss function is mean square error For each observation x, we need to minimize: Differentiating and setting to 0, we find loss is minimized if: Optimal policy is to report the mean of posterior p(s|x).

7 Examples Ex. 2: loss function is absolute error: Optimal policy is to report the median of posterior p(s|x). Ex. 3: loss function is 0-1 error: 0 if correct, 1 all other choices Optimal policy is to report the mode of posterior p(s|x).

8 Time Matters  Repeated Decisions ABABAB wait +: more info -: more time

9 Hypothesis Testing is a Stopping Problem Decision Problem: xx … x1x1 s x2x2 x3x3 Incorporate evidence iteratively (Bayes’ Rule):

10 Loss function penalizes both error and delay: Optimal Policy (Bellman equation): Loss Function and Optimal Policy Stopping cost: Expected loss: Continuation cost: After observing inputs x 1, …, x t, continue (observe x t+1 ) only if the continuation cost < stopping cost

11 An Intractable Solution in Practice Stopping Cost Too hard!

12 b a Useful Properties of Q s and Q c ? Wald & Wolfowitz (1948): Continuation region is an interval (a, b) on the unit interval: At time t, continue if a < q t < b, stop and report =1 if q t >b, report =0 if q t <a.

13 Intuition for Optimality of SPRT Cost qtqt 01 c ab Q s (0) = w q t Q s (1) = v (1 - q t ) Q c is concave decide 0decide 1continue

14 Sequential Probability Ratio Test Log posterior ratio (monotonic to q t ): b’ a’ 0 rtrt Additive updates (random walk):

15 Do People/Animals Behave Optimally? A simple decision-making (DM) task binary decision rate of information flow constant over time task difficulty easily manipulated

16 Random-Dot Coherent Motion Paradigm 30% Coherence5% Coherence

17 Do People/Animals Behave Optimally? Accuracy vs. Coherence vs. Coherence (Roitman & Shadlen, 2002) Model: SPRT

18 What is Happening in the Brain? Saccade generation system (Roitman & Shalden, 2002) Model: SPRT Neural Activities in LIP

19 Mid-way Summary Optimal binary DM = evidence accumulation to thresholds Perceptual DM behavior similar to that expected from SPRT Neural activities in LIP  Bayesian evidence integration SPRT a formal link between biology and behavior in DM

20 Outline Review of Bayesian decision theory: examples Quantifying temporal cost Optimal policy for 2-alternative forced choice – optimal policy (SPRT) – behavioral & neural data Decision under deadline

21 Caveat: Model Fit Imperfect Mean RTRT Distributions (Data from Roitman & Shadlen, 2002; analysis from Ditterich, 2007) SPRT predicts same RT distribution for correct & error trials Real error trials are slower

22 b a SPRT: Same RT for Correct and Errors

23 Fix 1: Variable Drift Rate (Data from Roitman & Shadlen, 2002; analysis from Ditterich, 2007) (idea from Ratcliff & Rouder, 1998) AccuracyMean RTRT Distributions

24 Fix 2: Increasing Drift Rate (Data from Roitman & Shadlen, 2002; analysis from Ditterich, 2007) AccuracyMean RTRT Distributions

25 A Principled Approach Trials aborted w/o response or reward if monkey breaks fixation (Data from Roitman & Shadlen, 2002) (Analysis from Ditterich, 2007) More “urgency” over time as risk of aborting trial increases Can we model this in a principled way?

26 Optimal Policy is a pair of of monotonically decaying thresholds Probability 1 0 Generalized SPRT Classic SPRT Time Intuition for decay With limited time, there is limited opportunity to observe enough evidence to bias the evidence in the opposite direction Optimal Decision Making Under a Deadline (Frazier & Yu, 2007) Loss function depends on error, delay, and whether deadline exceed

27 Threshold Decay  Fast RT & Slower Errors Probability 1 0 Generalized SPRT Classic SPRT Time (Frazier & Yu, 2007) Earlier deadline  greater urgency Time Probability

28 Timing Uncertainty (Frazier & Yu, 2007) Timing uncertainty  more “urgency” (lower threhsolds) Time Probability This leads to a set of testable predictions

29 Summary Bayesian decision theory as a formal framework for DM Inadequacies of SPRT to account for behavior Inherent time pressure  decaying thresholds Neural implementation of decaying thresholds?

30 Perceptual DM + Saccade Planning Task: find motion target sensory processing cost (time, accuracy) learning/memory attention Current/Future Project BehaviorTheoryNeurobiology Human Rats Bayesian modeling Stochastic control theory fMRI Recording

31

32 Generalize Loss Function We assumed loss is linear in error and delay: Wald also proved a dual statement: Amongst all decision policies satisfying the criterion SPRT (with some thresholds) minimizes the expected sample size. This implies that the SPRT is optimal for all loss functions that increase with inaccuracy and (linearly in) delay (proof by contradiction). (Bogacz et al, 2006) What if it’s non-linear, e.g. maximize reward rate:

33 Some Intuitions for the Proof (Frazier & Yu, 2007) Continuation Region Shrinking!

34 Generalizing the Cost Function Loss function increases faster than linear in time f(  ) is convex and increasing. Nonlinear trade-off between accuracy and time (e.g. maximizing reward rate)


Download ppt "Psychology and Neurobiology of Decision-Making under Uncertainty Angela Yu March 11, 2010."

Similar presentations


Ads by Google