Seeing Patterns in Randomness: Irrational Superstition or Adaptive Behavior? Angela J. Yu, University of California, San Diego. March 9, 2010.

“Irrational” Probabilistic Reasoning in Humans
Random stimulus sequence: 1 2 …
“Hot hand” (Gilovich, Vallone, & Tversky, 1985; Wilke & Barrett, 2009)
2AFC: sequential effects (rep/alt) (Soetens, Boer, & Hueting, 1985)

“Superstitious” Predictions
Subjects are “superstitious” when viewing randomized stimuli: O o o o o o O O o O o O O…
[Figure: responses across trials following runs of repetitions and alternations, ranging from slow to fast]
Subjects are slower and more error-prone when a local pattern is violated.
Patterns arise by chance and are not predictive of the next stimulus.
Such “superstitious” behavior is apparently sub-optimal.

“Graded” Superstition (Cho et al., 2002; Soetens et al., 1985)
RARR = [o o O O O] or [O O o o o]
[Figure: RT and ER as a function of the preceding sequence (trials t-3, t-2, t-1, t)]
Hypothesis: Sequential adjustments may be adaptive for changing environments.

Outline
“Ideal predictor” in a fixed vs. changing world
Exponential forgetting: normative and descriptive
Optimal Bayes or exponential filter?
Neural implementation of prediction/learning

I. Fixed Belief Model (FBM)
[Diagram: a hidden bias generates the observed stimuli, each A (0) or R (1); upcoming stimuli are unknown (?)]

II. Dynamic Belief Model (DBM)
[Diagram: a changing bias (e.g. .3, .8, .3) generates the observed stimuli, each A (0) or R (1); upcoming stimuli are unknown (?)]

FBM Subject’s Response to Random Inputs
What the FBM subject should believe about the bias of the coin, given a sequence of observations: R R A R R R
[Figure: belief over the bias γ, on an axis from A to R]

What the FBM subject should believe about the bias of the coin, given a long sequence of observations: R R A R A A R A A R A…
[Figure: belief over the bias γ, on an axis from A to R]
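
A minimal sketch of the fixed-belief computation described above, assuming a Beta prior over a single fixed bias. The uniform Beta(1, 1) default and the function names are illustrative assumptions, not taken from the slides. With R coded as 1 and A as 0, each observation simply increments one of the two Beta counts:

```python
from fractions import Fraction


def fbm_posterior(observations, a=1, b=1):
    """Track the Beta(a, b) posterior over a fixed bias P(repetition).

    observations: iterable of 1 (repetition, R) or 0 (alternation, A).
    The Beta(1, 1) starting point is an assumed uniform prior.
    Returns the (a, b) parameters after each observation.
    """
    history = []
    for x in observations:
        a += x          # count of repetitions
        b += 1 - x      # count of alternations
        history.append((a, b))
    return history


def fbm_predict(a, b):
    """Posterior-mean probability that the next trial is a repetition."""
    return Fraction(a, a + b)


seq = [1, 1, 0, 1, 1, 1]  # R R A R R R
history = fbm_posterior(seq)
a_final, b_final = history[-1]
print(a_final, b_final, fbm_predict(a_final, b_final))
```

For the sequence R R A R R R this yields a Beta(6, 2) posterior and a predicted repetition probability of 3/4. Because every past observation carries equal weight, the estimate is dominated by the long-run average as the sequence grows.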

DBM Subject’s Response to Random Inputs
What the DBM subject should believe about the bias of the coin, given a long sequence of observations: R R A R A A R A A R A…
[Figure: belief over the bias γ, on an axis from A to R]
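
One way to sketch the dynamic-belief computation is a grid approximation, assuming that on each trial the bias stays the same with probability `alpha` and is otherwise redrawn from a fixed prior. The uniform reset prior, the grid size, the function name, and the default `alpha = 0.77` (a value that appears later in the slides) are assumptions for illustration.

```python
import numpy as np


def dbm_predictions(observations, alpha=0.77, n_grid=201):
    """Predicted P(repetition) before each trial under a changing-bias model.

    observations: iterable of 1 (repetition) or 0 (alternation).
    alpha: assumed probability that the bias is unchanged between trials.
    """
    gamma = np.linspace(0.0, 1.0, n_grid)      # candidate bias values
    prior0 = np.full(n_grid, 1.0 / n_grid)     # assumed uniform reset prior
    posterior = prior0.copy()
    preds = []
    for x in observations:
        # Mix the previous posterior with the reset prior.
        pred_prior = alpha * posterior + (1.0 - alpha) * prior0
        preds.append(float(np.dot(gamma, pred_prior)))
        # Bayes' rule with a Bernoulli likelihood.
        likelihood = gamma if x == 1 else 1.0 - gamma
        posterior = likelihood * pred_prior
        posterior /= posterior.sum()
    return preds


print(dbm_predictions([1, 1, 0, 1, 0, 0, 1]))
```

Because the mixing step keeps pulling the belief back toward the reset prior, old observations lose influence and the prediction follows recent local patterns, even when the sequence is truly random.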

Randomized Stimuli: FBM > DBM
Given a sequence of truly random data (γ = .5) …
FBM: belief distribution over γ is driven by the long-term average.
DBM: belief distribution over γ is driven by transient patterns.
[Figures: probability vs. simulated trials for each model]

“Natural Environment”: DBM > FBM
In a changing world, where γ undergoes un-signaled changes …
FBM: posterior over γ adapts poorly to changes.
DBM: posterior over γ adapts rapidly to changes.
[Figures: probability vs. simulated trials for each model]

Persistence of Sequential Effects
Sequential effects persist in data.
DBM produces R/A asymmetry.
Subjects = DBM (changing world)
[Figure panels: FBM P(stimulus); DBM P(stimulus); human RT data (from Cho et al., 2002)]

Outline
“Ideal predictor” in a fixed vs. changing world
Exponential forgetting: normative and descriptive
Optimal Bayes or exponential filter?
Neural implementation of prediction/learning

Bayesian Computations in Neurons?
Generative Model: what subjects need to know
Optimal Prediction: what subjects need to compute
Too hard to represent, too hard to compute!

Simpler Alternative for Neural Computation?
Inspiration: exponential forgetting in tracking true changes (Sugrue, Corrado, & Newsome, 2004)

Exponential Forgetting in Behavior
Exponential discounting is a good descriptive model.
Linear regression: R/A
[Figure: human data (re-analysis of Cho et al.), regression coefficients vs. trials into the past]

Exponential Forgetting Approximates DBM
Exponential discounting is a good normative model.
Linear regression: R/A
[Figure: DBM prediction, regression coefficients vs. trials into the past]

Discount Rate vs. Assumed Rate of Change
[Figures: DBM probability vs. simulated trials for α = .95 and α = .77]

Reverse-engineering Subjects’ Assumptions
α = p(γ_t = γ_{t-1})
Human data: β = .57
DBM simulation: α = .77, so γ changes once every four trials
β ≈ (2/3) α
[Figures: regression coefficients vs. trials into the past, for human data and DBM simulation]

Analytical Approximation
Nonlinear Bayesian computations; 3-param model; 1-param linear model
[Figure: quality of approximation, β vs. α]

Outline
“Ideal predictor” in a fixed vs. changing world
Exponential forgetting: normative and descriptive
Optimal Bayes or exponential filter?
Neural implementation of prediction/learning

Subjects’ RT vs. Model Stimulus Probability
[Figure, built up over four slides: RT vs. model stimulus probability for repetition trials (e.g. R A R R R R …), then alternation trials, then repetition vs. alternation trials]

Multiple-Timescale Interactions
DBM (Yu, NIPS 2007)
Optimal discrimination between stimuli 1 and 2 (Wald, 1947): discrete time, SPRT; continuous time, DDM
(Frazier & Yu, NIPS 2008) (Gold & Shadlen, Neuron 2002)

SPRT/DDM & Linear Effect of Prior on RT
[Figures: RT histogram over timesteps; RT as a function of the bias P(s_1); plot of x tanh x]

SPRT/DDM & Linear Effect of Prior on RT
[Figures: empirical RT vs. stimulus probability; predicted RT vs. stimulus probability, as a function of the bias P(s_1)]
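
To illustrate how a prior bias enters the SPRT, here is a minimal sketch assuming Gaussian evidence with mean `+mu` under s1 and `-mu` under s2. The prior only shifts the starting log odds, so a stronger prior for the correct stimulus shortens the decision time. The Gaussian evidence model, the parameter values, and the function name are assumptions, not details from the slides.

```python
import math


def sprt(samples, prior_s1=0.5, mu=0.1, sigma=1.0, threshold=0.95):
    """Sequential probability ratio test between s1 and s2.

    samples: evidence values, assumed N(+mu, sigma^2) under s1
             and N(-mu, sigma^2) under s2.
    prior_s1: prior probability of s1 (sets the starting log odds).
    threshold: posterior probability required to commit to a choice.
    Returns (choice, number of samples used); choice is None if the
    samples run out before a bound is reached.
    """
    log_odds = math.log(prior_s1 / (1.0 - prior_s1))
    bound = math.log(threshold / (1.0 - threshold))
    steps = 0
    for x in samples:
        steps += 1
        # Log-likelihood ratio of one Gaussian sample.
        log_odds += 2.0 * mu * x / sigma ** 2
        if log_odds >= bound:
            return "s1", steps
        if log_odds <= -bound:
            return "s2", steps
    return None, steps


evidence = [1.0] * 100  # noise-free evidence, for a deterministic illustration
print(sprt(evidence, prior_s1=0.5))
print(sprt(evidence, prior_s1=0.8))
```

With this noise-free input the neutral prior needs 15 samples to reach the bound and the 0.8 prior needs 8, since the decision time scales with the distance between the starting log odds and the bound.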

Outline
“Ideal predictor” in a fixed vs. changing world
Exponential forgetting: normative and descriptive
Optimal Bayes or exponential filter?
Neural implementation of prediction/learning

Neural Implementation of Prediction
Leaky-integrating neuron: bias = 1/2 (1 − α), input = 1/3 α, recurrent = 2/3 α
Perceptual decision-making (Grice, 1972; Smith, 1995; Cook & Maunsell, 2002; Busemeyer & Townsend, 1993; McClelland, 1993; Bogacz et al., 2006; Yu, 2007; …)
Trial-to-trial interactions (Kim & Myung, 1995; Dayan & Yu, 2003; Simen, Cohen, & Holmes, 2006; Mozer, Kinoshita, & Shettel, 2007; …)
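
The leaky integrator can be sketched as a one-line recurrence, assuming the slide's three terms are a constant bias of (1 − α)/2, an input weight of α/3 on the latest observation, and a recurrent weight of 2α/3 on the previous prediction. That reading of the coefficients, the starting value of 0.5, the default `alpha`, and the function name are assumptions.

```python
def leaky_integrator(observations, alpha=0.77, p_init=0.5):
    """Linear exponential filter for P(repetition).

    observations: iterable of 1 (repetition) or 0 (alternation).
    alpha: assumed probability that the bias is unchanged between trials.
    Returns the prediction held before each observation arrives.
    """
    p = p_init
    preds = []
    for x in observations:
        preds.append(p)
        # bias term + input term + recurrent (leak) term
        p = 0.5 * (1.0 - alpha) + (alpha / 3.0) * x + (2.0 * alpha / 3.0) * p
    return preds


print(leaky_integrator([1, 1, 0, 1, 0, 0, 1]))
```

Under these assumed weights the recurrent term multiplies past inputs by 2α/3 on every trial, so their influence decays exponentially, and an input that averages 0.5 leaves the prediction at 0.5.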

Neuromodulation & Dynamic Filters
Leaky-integrating neuron: bias, input, recurrent
Norepinephrine (NE) (Hasselmo, Wyble, & Wallenstein, 1996; Kobayashi, 2000)
NE: Unexpected Uncertainty (Yu & Dayan, Neuron, 2005)
[Figure: plotted against trials]

Learning the Value of α
Humans (Behrens et al., 2007) and rats (Gallistel & Latham, 1999) may encode meta-changes in the rate of change, α.
Bayesian Learning: iteratively compute the joint posterior.
[Figures: marginal posterior over α; marginal posterior over γ]

Neural Parameter Learning?
Neurons don’t need to represent probabilities explicitly; they just need to estimate α.
Stochastic gradient descent (δ-rule), with update terms: learning rate, error, gradient

Learning Results
[Figures: stochastic gradient descent vs. trials; Bayesian learning vs. trials]

Summary
H: “Superstition” reflects adaptation to a changing world
Exponential “memory” is near-optimal & fits behavior; linear RT
Neurobiology: leaky integration, stochastic δ-rule, neuromodulation
Random sequence and changing biases are hard to distinguish
Questions: multiple outcomes? Explicit versus implicit prediction?

Unlearning Temporal Correlation is Slow (see Bialek, 2005)
[Figures: marginal posterior over α and marginal posterior over γ, probability vs. trials]

Insight from Brain’s “Mistakes” Ex: visual illusions (Adelson, 1995)

Insight from Brain’s “Mistakes”
Ex: visual illusions (lightness, depth, context)
Neural computation specialized for natural problems

Discount Rate vs. Assumed Rate of Change
Iterative form of linear exponential
Exact inference is non-linear
Linear approximation
Empirical distribution

Bayesian Inference
Generative Model (what subject “knows”); 1: repetition, 0: alternation
Optimal Prediction (Bayes’ Rule): posterior

Bayesian Inference
Generative Model (what subject “knows”)
Optimal Prediction (Bayes’ Rule)
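
As a hedged sketch, the generative model and the Bayes' rule update for a changing-bias model of this kind can be written as follows. The notation is assumed rather than taken from the slides: γ_t is the bias on trial t, α is the probability that the bias is unchanged, p_0 is the prior from which a new bias is drawn, and x_t is 1 for a repetition and 0 for an alternation.

```latex
\begin{align}
% Generative model (what the subject "knows")
p(\gamma_t \mid \gamma_{t-1}) &= \alpha\,\delta(\gamma_t - \gamma_{t-1}) + (1-\alpha)\,p_0(\gamma_t) \\
P(x_t \mid \gamma_t) &= \gamma_t^{x_t}\,(1-\gamma_t)^{1-x_t} \\
% Optimal prediction (Bayes' rule)
p(\gamma_t = \gamma \mid x_{1:t-1}) &= \alpha\,p(\gamma_{t-1} = \gamma \mid x_{1:t-1}) + (1-\alpha)\,p_0(\gamma) \\
p(\gamma_t \mid x_{1:t}) &\propto P(x_t \mid \gamma_t)\,p(\gamma_t \mid x_{1:t-1}) \\
P(x_t = 1 \mid x_{1:t-1}) &= \int_0^1 \gamma_t\,p(\gamma_t \mid x_{1:t-1})\,d\gamma_t
\end{align}
```

Setting α = 1 removes the reset term and recovers the fixed-belief case, in which the posterior accumulates all observations with equal weight.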

Power-Law Decay of Memory
Human memory
Natural (language) statistics (Anderson & Schooler, 1991)
Hierarchical Chinese Restaurant Process (Teh, 2006)
Stationary process!

Ties Across Time, Space, and Modality
Time: sequential effects (RT)
Modality: Stroop (GREEN)
Space: Eriksen (SSHSS)
(Yu, Dayan, & Cohen, JEP: HPP 2008) (Liu, Yu, & Holmes, Neur Comp 2008)

Sequential Effects and Perceptual Discrimination
DBM, PFC (Yu & Dayan, NIPS 2005) (Yu, NIPS 2007)
Optimal discrimination between R and A (Wald, 1947): discrete time, SPRT; continuous time, DDM
(Frazier & Yu, NIPS 2008) (Gold & Shadlen, Neuron 2002)

Exponential Discounting for Changing Rewards (Sugrue, Corrado, & Newsome, 2004)
[Figures: regression coefficients vs. trials into the past; Monkey F: β = .63; Monkey G: β = .72]

Human & Monkey Share Assumptions?
[Figures: regression coefficients vs. trials into the past; Monkey F: β = .63; Monkey G: β = .72]
Monkey ≈ Human! (values shown: .68, .80)

Simulation Results
Learning via stochastic δ-rule
[Figure: plotted against trials]

Monkeys’ Discount Rates in Choice Task (Sugrue, Corrado, & Newsome, 2004)
[Figures: regression coefficients vs. trials into the past for Monkey F and Monkey G]

Human & Monkey Share Assumptions?
Monkey ≈ Human!