Decoding: How well can we learn what the stimulus is by looking at the neural responses?

Presentation transcript:

Decoding. How well can we learn what the stimulus is by looking at the neural responses? Two approaches:
- devise explicit algorithms for extracting a stimulus estimate
- directly quantify the relationship between stimulus and response using information theory

Predicting the firing rate. Starting with a rate response r(t) and a stimulus s(t), the optimal linear estimator finds the best kernel K such that r_est(t) = r0 + ∫ dτ K(τ) s(t - τ) is close to r(t) in the least-squares sense. Solving for K(τ) gives ∫ dτ' Q_ss(τ - τ') K(τ') = Q_rs(τ), i.e. in the frequency domain K̃(ω) = Q̃_rs(ω)/Q̃_ss(ω), where Q_ss is the stimulus autocorrelation and Q_rs the stimulus-response cross-correlation.
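A minimal numerical sketch of this estimator, assuming a discretely sampled white-noise stimulus and using the frequency-domain (Wiener) form of the solution; the signal names, durations and noise level below are illustrative, not from the original slides:

```python
import numpy as np

# Sketch of the optimal linear estimator for a white-noise stimulus:
# recover the kernel with K~(w) = Q~_rs(w) / Q~_ss(w).
rng = np.random.default_rng(0)
dt = 0.005                                        # time step (s)
t = np.arange(0.0, 100.0, dt)                     # 100 s of data
N = t.size

s = rng.normal(size=N)                            # white-noise stimulus
true_K = np.exp(-np.arange(0.0, 0.2, dt) / 0.05)  # "unknown" kernel, 200 ms long
r = np.convolve(s, true_K, mode="full")[:N] * dt  # linear response
r += 0.01 * rng.normal(size=N)                    # small additive noise

# Cross-spectrum and stimulus power spectrum (one long window; in practice one
# would average over segments to reduce variance).
S = np.fft.rfft(s)
R = np.fft.rfft(r)
Q_rs = np.conj(S) * R                             # ∝ stimulus-response cross-spectrum
Q_ss = np.conj(S) * S + 1e-12                     # ∝ stimulus power spectrum (regularized)

K_est = np.fft.irfft(Q_rs / Q_ss, n=N) / dt
print("max |K_est - true_K| over the kernel support:",
      np.max(np.abs(K_est[: true_K.size] - true_K)))
```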

Stimulus reconstruction [figure slides: the estimated kernel K plotted against t, plus further reconstruction figures not reproduced in this transcript]

Reading minds: the LGN (Yang Dan, UC Berkeley)

Motion detection task: two-alternative forced choice. Britten et al. '92: behavioral monkey data + neural responses.

Discriminability: d’ = ( <r>+ - <r>- ) / σ_r. [Figures: behavioral performance; neural data at different motion coherences.]
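A toy numerical illustration of this definition of d’, using made-up Gaussian spike-count samples for the + and - conditions (all numbers are illustrative):

```python
import numpy as np

# Discriminability d' from two sets of responses, one recorded with the "+"
# stimulus and one with the "-" stimulus.
rng = np.random.default_rng(1)
r_plus = rng.normal(20.0, 5.0, size=1000)    # spike counts, + trials (illustrative)
r_minus = rng.normal(15.0, 5.0, size=1000)   # spike counts, - trials

sigma_r = np.sqrt(0.5 * (r_plus.var() + r_minus.var()))  # pooled std, assuming equal variances
d_prime = (r_plus.mean() - r_minus.mean()) / sigma_r
print(f"d' = {d_prime:.2f}")                 # ≈ (20 - 15)/5 = 1.0
```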

Signal detection theory: r is the "test"; decoding corresponds to comparing the test to a threshold z. [Figure: the two response distributions p(r|-) and p(r|+), with means <r>- and <r>+, and the threshold z.]
a(z) = P[r ≥ z | -] false alarm rate, "size"
b(z) = P[r ≥ z | +] hit rate, "power"
Find z by maximizing P[correct] = (b(z) + 1 - a(z))/2.
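A small sketch of threshold decoding under the illustrative assumption of Gaussian response distributions: sweep z, compute a(z) and b(z), and pick the z that maximizes P[correct]:

```python
import numpy as np
from scipy.stats import norm

# Threshold decoding: choose z to maximize P[correct] = (b(z) + 1 - a(z)) / 2.
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0   # illustrative response distributions

z = np.linspace(0.0, 35.0, 701)
a = norm.sf(z, loc=mu_minus, scale=sigma)    # a(z) = P[r >= z | -]
b = norm.sf(z, loc=mu_plus, scale=sigma)     # b(z) = P[r >= z | +]
p_correct = (b + 1.0 - a) / 2.0

z_best = z[np.argmax(p_correct)]
print(f"best threshold z = {z_best:.2f} (midpoint of the means is {0.5*(mu_minus+mu_plus):.1f})")
print(f"P[correct] at z_best = {p_correct.max():.3f}")
```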

ROC curves: summarize the performance of the test for different thresholds z. Want b → 1, a → 0.

The area under the ROC curve corresponds to P[correct] for a two-alternative forced choice task: the response on one presentation acts as the threshold for the other. Say + is presented first and - second; the procedure is correct if r1 > z with z = r2, which happens with probability P[r1 > z | +] = b(z). The total probability of being correct is this times the probability of obtaining a particular value r2 = z; the probability of r2 falling in the range z to z + dz is p[z|-] dz. Then P[correct] = ∫0∞ dz p[z|-] b(z).

P[correct] = ∫0∞ dz p[z|-] b(z). Since a(z) is the probability that r ≥ z for a - stimulus, da = -p[z|-] dz, and changing variables gives P[correct] = ∫ b da, the area under the ROC curve.
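A quick numerical check of this identity, again assuming Gaussian response distributions: the area under the ROC curve computed as ∫ b da matches the Monte Carlo estimate of P[r+ > r-] in the two-alternative forced choice:

```python
import numpy as np
from scipy.stats import norm

# Check that the area under the ROC curve equals the 2AFC probability of
# being correct, P[r+ > r-], for illustrative Gaussian distributions.
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0
rng = np.random.default_rng(2)

z = np.linspace(-40.0, 80.0, 20001)
a = norm.sf(z, loc=mu_minus, scale=sigma)               # false-alarm rate a(z)
b = norm.sf(z, loc=mu_plus, scale=sigma)                # hit rate b(z)
auc = np.sum(0.5 * (b[:-1] + b[1:]) * (-np.diff(a)))    # trapezoidal ∫ b da

r_plus = rng.normal(mu_plus, sigma, size=200_000)
r_minus = rng.normal(mu_minus, sigma, size=200_000)
p_2afc = np.mean(r_plus > r_minus)                      # direct Monte Carlo 2AFC

print(f"area under ROC curve       ≈ {auc:.4f}")
print(f"Monte Carlo 2AFC P[correct] ≈ {p_2afc:.4f}")    # both ≈ 0.76 for d' = 1
```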

Appearance of the logistic function. If p[r|+] and p[r|-] are both Gaussian, P[correct] = ½ erfc(-d’/2). To interpret the results as a two-alternative forced choice, one needs simultaneous responses from a + neuron and a - neuron; the "- neuron" responses are taken from the same neuron responding to the - stimulus. An ideal observer performs at the level given by the area under the ROC curve.
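As a sanity check of the Gaussian result, ½ erfc(-d’/2) equals Φ(d’/√2), the probability that a + response exceeds a - response for unit-variance Gaussians separated by d’:

```python
import numpy as np
from scipy.special import erfc
from scipy.stats import norm

# Verify P[correct] = 1/2 * erfc(-d'/2) against Phi(d'/sqrt(2)),
# the probability that a + draw exceeds a - draw for unit-variance Gaussians.
for d_prime in (0.5, 1.0, 2.0):
    p_erfc = 0.5 * erfc(-d_prime / 2.0)
    p_pair = norm.cdf(d_prime / np.sqrt(2.0))
    print(f"d' = {d_prime:3.1f}:  ½ erfc(-d'/2) = {p_erfc:.4f},  Φ(d'/√2) = {p_pair:.4f}")
```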

Close correspondence between neural and behavioural performance. Why so many neurons? Correlations limit performance.

What is the best test function to use (other than r itself)? The optimal test function is the likelihood ratio, l(r) = p[r|+] / p[r|-] (Neyman-Pearson lemma). Note that l(z) = (db/dz) / (da/dz) = db/da, i.e. the slope of the ROC curve.
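A numerical check of this relation for illustrative Gaussian distributions: the slope db/da of the ROC curve, estimated by finite differences, matches the likelihood ratio l(z):

```python
import numpy as np
from scipy.stats import norm

# Check that the likelihood ratio l(z) = p[z|+]/p[z|-] equals the slope db/da
# of the ROC curve (illustrative Gaussian response distributions).
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0
z = np.linspace(5.0, 30.0, 2501)
a = norm.sf(z, mu_minus, sigma)
b = norm.sf(z, mu_plus, sigma)

slope = np.gradient(b, z) / np.gradient(a, z)   # db/da along the curve
l = norm.pdf(z, mu_plus, sigma) / norm.pdf(z, mu_minus, sigma)

print("max |db/da - l(z)| =", np.max(np.abs(slope - l)))   # small numerical error
```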

Likelihood as loss minimisation. Penalty for an incorrect answer: L+ or L-. Observe r; the expected losses are Loss+ = L+ P[-|r] and Loss- = L- P[+|r]. Cut your losses: answer + when Loss+ < Loss-, i.e. when L+ P[-|r] < L- P[+|r]. Using Bayes' rule, P[+|r] = p[r|+] P[+]/p(r) and P[-|r] = p[r|-] P[-]/p(r), so answer + when l(r) = p[r|+]/p[r|-] > L+ P[-] / (L- P[+]).
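A sketch of the resulting decision rule, with illustrative distributions, losses and priors (none of these numbers come from the slides):

```python
import numpy as np
from scipy.stats import norm

# Loss-minimising decision rule: answer "+" when the likelihood ratio
# l(r) = p[r|+]/p[r|-] exceeds L+ P[-] / (L- P[+]).
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0
L_plus, L_minus = 1.0, 2.0          # penalty for a wrong "+" answer / a wrong "-" answer
P_plus, P_minus = 0.5, 0.5          # prior probabilities of the two stimuli

def decide(r):
    l = norm.pdf(r, mu_plus, sigma) / norm.pdf(r, mu_minus, sigma)   # likelihood ratio
    threshold = (L_plus * P_minus) / (L_minus * P_plus)
    return "+" if l > threshold else "-"

for r in (14.0, 17.5, 21.0):
    print(f"r = {r:5.1f}  ->  answer {decide(r)}")
```

With a larger penalty L- for wrong "-" answers, the rule answers "+" more liberally than a simple midpoint threshold would.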

Again, if p[r|-] and p[r|+] are Gaussian with equal variance σ² and P[+] and P[-] are equal, then P[+|r] = 1 / [1 + exp(-d’ (r - <r>)/σ)], where <r> is the midpoint of the two mean responses. So d’ sets the slope of the sigmoid fitted to P[+|r].
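A quick check that the posterior for two equal-variance Gaussians with equal priors is exactly this logistic function of r (parameters illustrative):

```python
import numpy as np
from scipy.stats import norm

# Posterior P[+|r] for two equal-variance Gaussians with equal priors,
# compared against the logistic form 1 / (1 + exp(-d'(r - <r>)/sigma)).
mu_minus, mu_plus, sigma = 15.0, 20.0, 5.0
d_prime = (mu_plus - mu_minus) / sigma
r_mid = 0.5 * (mu_plus + mu_minus)

r = np.linspace(0.0, 35.0, 8)
posterior = norm.pdf(r, mu_plus, sigma) / (norm.pdf(r, mu_plus, sigma)
                                           + norm.pdf(r, mu_minus, sigma))
logistic = 1.0 / (1.0 + np.exp(-d_prime * (r - r_mid) / sigma))

print("max |posterior - logistic| =", np.max(np.abs(posterior - logistic)))  # ~1e-16
```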

For small separations between stimulus values, s and s + ds, the "score" is compared to a threshold.