Decoding: How well can we learn what the stimulus is by looking at the neural responses?

Presentation transcript:

Decoding. How well can we learn what the stimulus is by looking at the neural responses? Two approaches:
- devise explicit algorithms for extracting a stimulus estimate
- directly quantify the relationship between stimulus and response using information theory

Predicting the firing rate. Starting with a rate response r(t) and a stimulus s(t), the optimal linear estimator finds the best kernel K such that r_est(t) = r0 + ∫ dτ K(τ) s(t - τ) is close to r(t) in the least-squares sense. Solving for K(τ) gives ∫ dτ' Q_ss(τ - τ') K(τ') = Q_rs(τ), i.e. in the frequency domain K̃(ω) = Q̃_rs(ω)/Q̃_ss(ω), where Q_ss is the stimulus autocorrelation and Q_rs the stimulus-response cross-correlation.
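A minimal numerical sketch of this estimator, assuming a discretely sampled white-noise stimulus and using the frequency-domain (Wiener) form of the solution; the signal names, durations and noise level below are illustrative, not from the original slides:

```python
import numpy as np

# Sketch of the optimal linear estimator for a white-noise stimulus:
# recover the kernel with K~(w) = Q~_rs(w) / Q~_ss(w).
rng = np.random.default_rng(0)
dt = 0.005                                        # time step (s)
t = np.arange(0.0, 100.0, dt)                     # 100 s of data
N = t.size

s = rng.normal(size=N)                            # white-noise stimulus
true_K = np.exp(-np.arange(0.0, 0.2, dt) / 0.05)  # "unknown" kernel, 200 ms long
r = np.convolve(s, true_K, mode="full")[:N] * dt  # linear response
r += 0.01 * rng.normal(size=N)                    # small additive noise

# Cross-spectrum and stimulus power spectrum (one long window; in practice one
# would average over segments to reduce variance).
S = np.fft.rfft(s)
R = np.fft.rfft(r)
Q_rs = np.conj(S) * R                             # ∝ stimulus-response cross-spectrum
Q_ss = np.conj(S) * S + 1e-12                     # ∝ stimulus power spectrum (regularized)

K_est = np.fft.irfft(Q_rs / Q_ss, n=N) / dt
print("max |K_est - true_K| over the kernel support:",
      np.max(np.abs(K_est[: true_K.size] - true_K)))
```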

Stimulus reconstruction [figure slides: the estimated kernel K plotted against t, plus further reconstruction figures not reproduced in this transcript]

Reading minds: the LGN (Yang Dan, UC Berkeley)

Motion detection task: two-alternative forced choice. Britten et al. '92: behavioral monkey data + neural responses.

Discriminability: d’ = ( <r>+ - <r>- ) / σ_r. [Figures: behavioral performance; neural data at different motion coherences.]
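A toy numerical illustration of this definition of d’, using made-up Gaussian spike-count samples for the + and - conditions (all numbers are illustrative):

```python
import numpy as np

# Discriminability d' from two sets of responses, one recorded with the "+"
# stimulus and one with the "-" stimulus.
rng = np.random.default_rng(1)
r_plus = rng.normal(20.0, 5.0, size=1000)    # spike counts, + trials (illustrative)
r_minus = rng.normal(15.0, 5.0, size=1000)   # spike counts, - trials

sigma_r = np.sqrt(0.5 * (r_plus.var() + r_minus.var()))  # pooled std, assuming equal variances
d_prime = (r_plus.mean() - r_minus.mean()) / sigma_r
print(f"d' = {d_prime:.2f}")                 # ≈ (20 - 15)/5 = 1.0
```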

Signal detection theory: r is the "test"; decoding corresponds to comparing the test to a threshold z. [Figure: the two response distributions p(r|-) and p(r|+), with means <r>- and <r>+, and the threshold z.]
a(z) = P[r ≥ z | -] false alarm rate, "size"
b(z) = P[r ≥ z | +] hit rate, "power"
Find z by maximizing P[correct] = (b(z) + 1 - a(z))/2.
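A small sketch of threshold decoding under the illustrative assumption of Gaussian response distributions: sweep z, compute a(z) and b(z), and pick the z that maximizes P[correct]:

```python
import numpy as np
from scipy.stats import norm

# Threshold decoding: choose z to maximize P[correct] = (b(z) + 1 - a(z)) / 2.
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0   # illustrative response distributions

z = np.linspace(0.0, 35.0, 701)
a = norm.sf(z, loc=mu_minus, scale=sigma)    # a(z) = P[r >= z | -]
b = norm.sf(z, loc=mu_plus, scale=sigma)     # b(z) = P[r >= z | +]
p_correct = (b + 1.0 - a) / 2.0

z_best = z[np.argmax(p_correct)]
print(f"best threshold z = {z_best:.2f} (midpoint of the means is {0.5*(mu_minus+mu_plus):.1f})")
print(f"P[correct] at z_best = {p_correct.max():.3f}")
```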

ROC curves: summarize the performance of the test for different thresholds z. Want b → 1, a → 0.

The area under the ROC curve corresponds to P[correct] for a two-alternative forced choice task: the response on one presentation acts as the threshold for the other. Say + is presented first and - second; the procedure is correct if r1 > z with z = r2, which happens with probability P[r1 > z | +] = b(z). The total probability of being correct is this times the probability of obtaining a particular value r2 = z; the probability of r2 falling in the range z to z + dz is p[z|-] dz. Then P[correct] = ∫0∞ dz p[z|-] b(z).

P[correct] = ∫0∞ dz p[z|-] b(z). Since a(z) is the probability that r ≥ z for a - stimulus, da = -p[z|-] dz, and changing variables gives P[correct] = ∫ b da, the area under the ROC curve.
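A quick numerical check of this identity, again assuming Gaussian response distributions: the area under the ROC curve computed as ∫ b da matches the Monte Carlo estimate of P[r+ > r-] in the two-alternative forced choice:

```python
import numpy as np
from scipy.stats import norm

# Check that the area under the ROC curve equals the 2AFC probability of
# being correct, P[r+ > r-], for illustrative Gaussian distributions.
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0
rng = np.random.default_rng(2)

z = np.linspace(-40.0, 80.0, 20001)
a = norm.sf(z, loc=mu_minus, scale=sigma)               # false-alarm rate a(z)
b = norm.sf(z, loc=mu_plus, scale=sigma)                # hit rate b(z)
auc = np.sum(0.5 * (b[:-1] + b[1:]) * (-np.diff(a)))    # trapezoidal ∫ b da

r_plus = rng.normal(mu_plus, sigma, size=200_000)
r_minus = rng.normal(mu_minus, sigma, size=200_000)
p_2afc = np.mean(r_plus > r_minus)                      # direct Monte Carlo 2AFC

print(f"area under ROC curve       ≈ {auc:.4f}")
print(f"Monte Carlo 2AFC P[correct] ≈ {p_2afc:.4f}")    # both ≈ 0.76 for d' = 1
```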

Appearance of the logistic function. If p[r|+] and p[r|-] are both Gaussian, P[correct] = ½ erfc(-d’/2). To interpret the results as a two-alternative forced choice, one needs simultaneous responses from a + neuron and a - neuron; the "- neuron" responses are taken from the same neuron responding to the - stimulus. An ideal observer performs at the level given by the area under the ROC curve.
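As a sanity check of the Gaussian result, ½ erfc(-d’/2) equals Φ(d’/√2), the probability that a + response exceeds a - response for unit-variance Gaussians separated by d’:

```python
import numpy as np
from scipy.special import erfc
from scipy.stats import norm

# Verify P[correct] = 1/2 * erfc(-d'/2) against Phi(d'/sqrt(2)),
# the probability that a + draw exceeds a - draw for unit-variance Gaussians.
for d_prime in (0.5, 1.0, 2.0):
    p_erfc = 0.5 * erfc(-d_prime / 2.0)
    p_pair = norm.cdf(d_prime / np.sqrt(2.0))
    print(f"d' = {d_prime:3.1f}:  ½ erfc(-d'/2) = {p_erfc:.4f},  Φ(d'/√2) = {p_pair:.4f}")
```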

Close correspondence between neural and behavioural performance. Why so many neurons? Correlations limit performance.

What is the best test function to use (other than r itself)? The optimal test function is the likelihood ratio, l(r) = p[r|+] / p[r|-] (Neyman-Pearson lemma). Note that l(z) = (db/dz) / (da/dz) = db/da, i.e. the slope of the ROC curve.
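A numerical check of this relation for illustrative Gaussian distributions: the slope db/da of the ROC curve, estimated by finite differences, matches the likelihood ratio l(z):

```python
import numpy as np
from scipy.stats import norm

# Check that the likelihood ratio l(z) = p[z|+]/p[z|-] equals the slope db/da
# of the ROC curve (illustrative Gaussian response distributions).
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0
z = np.linspace(5.0, 30.0, 2501)
a = norm.sf(z, mu_minus, sigma)
b = norm.sf(z, mu_plus, sigma)

slope = np.gradient(b, z) / np.gradient(a, z)   # db/da along the curve
l = norm.pdf(z, mu_plus, sigma) / norm.pdf(z, mu_minus, sigma)

print("max |db/da - l(z)| =", np.max(np.abs(slope - l)))   # small numerical error
```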

Likelihood as loss minimisation. Penalty for an incorrect answer: L+ or L-. Observe r; the expected losses are Loss+ = L+ P[-|r] and Loss- = L- P[+|r]. Cut your losses: answer + when Loss+ < Loss-, i.e. when L+ P[-|r] < L- P[+|r]. Using Bayes' rule, P[+|r] = p[r|+] P[+]/p(r) and P[-|r] = p[r|-] P[-]/p(r), so answer + when l(r) = p[r|+]/p[r|-] > L+ P[-] / (L- P[+]).
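A sketch of the resulting decision rule, with illustrative distributions, losses and priors (none of these numbers come from the slides):

```python
import numpy as np
from scipy.stats import norm

# Loss-minimising decision rule: answer "+" when the likelihood ratio
# l(r) = p[r|+]/p[r|-] exceeds L+ P[-] / (L- P[+]).
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0
L_plus, L_minus = 1.0, 2.0          # penalty for a wrong "+" answer / a wrong "-" answer
P_plus, P_minus = 0.5, 0.5          # prior probabilities of the two stimuli

def decide(r):
    l = norm.pdf(r, mu_plus, sigma) / norm.pdf(r, mu_minus, sigma)   # likelihood ratio
    threshold = (L_plus * P_minus) / (L_minus * P_plus)
    return "+" if l > threshold else "-"

for r in (14.0, 17.5, 21.0):
    print(f"r = {r:5.1f}  ->  answer {decide(r)}")
```

With a larger penalty L- for wrong "-" answers, the rule answers "+" more liberally than a simple midpoint threshold would.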

Again, if p[r|-] and p[r|+] are Gaussian with equal variance σ² and P[+] and P[-] are equal, then P[+|r] = 1 / [1 + exp(-d’ (r - <r>)/σ)], where <r> is the midpoint of the two mean responses. So d’ sets the slope of the sigmoid fitted to P[+|r].
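A quick check that the posterior for two equal-variance Gaussians with equal priors is exactly this logistic function of r (parameters illustrative):

```python
import numpy as np
from scipy.stats import norm

# Posterior P[+|r] for two equal-variance Gaussians with equal priors,
# compared against the logistic form 1 / (1 + exp(-d'(r - <r>)/sigma)).
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0
d_prime = (mu_plus - mu_minus) / sigma
r_mid = 0.5 * (mu_plus + mu_minus)

r = np.linspace(0.0, 35.0, 8)
posterior = norm.pdf(r, mu_plus, sigma) / (norm.pdf(r, mu_plus, sigma)
                                           + norm.pdf(r, mu_minus, sigma))
logistic = 1.0 / (1.0 + np.exp(-d_prime * (r - r_mid) / sigma))

print("max |posterior - logistic| =", np.max(np.abs(posterior - logistic)))  # ~1e-16
```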

For small separations between stimulus values, s and s + ds, the "score" is compared to a threshold.