Adaptive Methods
Research Methods, Fall 2008
Tamás Bőhm

Adaptive methods
– Classical (Fechnerian) methods: the stimulus is often far from the threshold → inefficient
– Adaptive methods: accelerated testing
  – Modifications of the method of constant stimuli and the method of limits

Adaptive methods
– Classical methods: stimulus values to be presented are fixed before the experiment
– Adaptive methods: stimulus values to be presented depend critically on the preceding responses

Adaptive methods
Constituents:
– Stepping rule: which stimulus level to use next?
– Stopping criterion: when to finish the session?
– What is the final threshold estimate?
Performance:
– Bias: systematic error
– Precision: related to random error
– Efficiency: number of trials needed for a specific precision; measured by the sweat factor

Notations
– X_n: stimulus level at trial n
– Z_n: response at trial n (Z_n = 1: detected / correct; Z_n = 0: not detected / incorrect)
– φ: target probability
  – absolute threshold: φ = 50%
  – difference threshold: φ = 75%
  – 2AFC: φ = 50% + 50% / 2 = 75%
  – 4AFC: φ = 25% + 75% / 2 = 62.5%
– x_φ: threshold

Adaptive methods
– Classical methods: stimulus values to be presented are fixed before the experiment
– Adaptive methods: stimulus values to be presented depend critically on the preceding responses:
  X_{n+1} = f(φ, n, Z_n, X_n, Z_{n-1}, X_{n-1}, …, Z_1, X_1)

Adaptive methods
– Nonparametric methods:
  – No assumptions about the shape of the psychometric function
  – Can measure the threshold only
– Parametric methods:
  – The general form of the psychometric function is known; only its parameters (threshold and slope) need to be measured
  – If the slope is also known: measure only the threshold

Nonparametric adaptive methods
– Staircase method (a.k.a. truncated method of limits, simple up-down)
– Transformed up-down method
– Nonparametric up-down method
– Weighted up-down method
– Modified binary search
– Stochastic approximation
– Accelerated stochastic approximation
– PEST and More Virulent PEST

Staircase method
– Stepping rule: X_{n+1} = X_n - δ(2Z_n - 1)
  – fixed step size δ
  – if the response changes, the direction of the steps is reversed
– Stopping criterion: after a predetermined number of reversals
– Threshold estimate: average of the reversal points (mid-run estimate)
– Converges to φ = 50% → cannot be used e.g. for 2AFC
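A minimal simulation of this rule; the logistic observer `p_detect` and all parameter values are illustrative assumptions, not from the slides:

```python
import math
import random

def staircase(x0, delta, n_reversals, p_detect):
    """Simple up-down staircase: X_{n+1} = X_n - delta * (2*Z_n - 1)."""
    x, last_dir, reversals = x0, 0, []
    while len(reversals) < n_reversals:
        z = 1 if random.random() < p_detect(x) else 0  # simulated response
        direction = -1 if z == 1 else 1                # down after 1, up after 0
        if last_dir and direction != last_dir:
            reversals.append(x)                        # response changed: reversal
        last_dir = direction
        x += delta * direction
    return sum(reversals) / len(reversals)             # mid-run estimate

# Hypothetical observer: logistic detection probability, true 50% point at 5.0
p_detect = lambda x: 1 / (1 + math.exp(-(x - 5.0)))
print(staircase(x0=0.0, delta=0.5, n_reversals=8, p_detect=p_detect))
```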

Transformed up-down method
– Improvement of the simple up-down (staircase) method
– X_{n+1} depends on 2 or more preceding responses
  – E.g. 1-up/2-down (or 2-step) rule:
    – Increase the stimulus level after each incorrect response
    – Decrease it only after 2 consecutive correct responses
    – φ = 70.7%
– Threshold: mid-run estimate of the reversal points
– 8 rules for 8 different φ values (e.g. 15.9%, 29.3%, 50%, 70.7%, 79.4%, 84.1%)
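A sketch of the 1-up/2-down rule under an assumed 2AFC observer (the observer model and all numbers are hypothetical):

```python
import math
import random

def one_up_two_down(x0, delta, n_trials, p_correct):
    """1-up/2-down: step up after every incorrect response, step down only
    after 2 consecutive correct ones; converges where p^2 = 0.5, i.e. 70.7%."""
    x, streak, levels = x0, 0, []
    for _ in range(n_trials):
        levels.append(x)
        if random.random() < p_correct(x):   # correct response
            streak += 1
            if streak == 2:                  # two in a row: decrease level
                x -= delta
                streak = 0
        else:                                # incorrect: increase level
            x += delta
            streak = 0
    return levels

# Hypothetical 2AFC observer (50% floor); its 70.7%-correct level is ~4.65
p_correct = lambda x: 0.5 + 0.5 / (1 + math.exp(-(x - 5.0)))
track = one_up_two_down(x0=8.0, delta=0.5, n_trials=200, p_correct=p_correct)
```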

Nonparametric up-down method
– Stepping rule: X_{n+1} = X_n - δ(2Z_n·S_φ - 1)
  – S_φ: random number with p(S_φ = 1) = 1/(2φ) and p(S_φ = 0) = 1 - 1/(2φ)
  – After a correct answer: the stimulus is decreased with p = 1/(2φ) and increased with p = 1 - 1/(2φ)
  – After an incorrect answer: the stimulus is increased
– Can converge to any φ ≥ 50%
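A sketch of the stepping rule with the random gate S_φ (observer model and parameters again hypothetical):

```python
import math
import random

def nonparametric_up_down(x0, delta, n_trials, phi, p_correct):
    """X_{n+1} = X_n - delta * (2*Z_n*S_phi - 1): step down only when the
    response is correct AND the random gate S_phi (p = 1/(2*phi)) fires."""
    x, levels = x0, []
    for _ in range(n_trials):
        levels.append(x)
        z = 1 if random.random() < p_correct(x) else 0
        s = 1 if random.random() < 1 / (2 * phi) else 0
        x -= delta * (2 * z * s - 1)
    return levels

# Hypothetical observer; the track hovers around its 75%-correct level
p_correct = lambda x: 1 / (1 + math.exp(-(x - 5.0)))
track = nonparametric_up_down(x0=0.0, delta=0.5, n_trials=300,
                              phi=0.75, p_correct=p_correct)
```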

Nonparametric up-down method

Weighted up-down method
– Different step sizes for upward (δ_up) and downward (δ_down) steps
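The slides do not give the step-size relation; assuming Kaernbach's (1991) weighted up-down, the steps are chosen so that the expected drift vanishes at the target level:

```python
# Zero expected drift at the target level phi requires
#   phi * delta_down = (1 - phi) * delta_up
# (relation per Kaernbach 1991; the numbers below are illustrative)
phi = 0.75
delta_down = 1.0
delta_up = phi / (1 - phi) * delta_down   # = 3.0: upward steps 3x larger
```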

Modified binary search
– Divide and conquer
– The stimulus interval containing the threshold is halved in every step (one endpoint is replaced by the midpoint)
– Stopping criterion: a lower limit on the step size
– Threshold estimate: the last tested level
– Heuristic, no theoretical foundation
(Figure from Sedgewick & Wayne)
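A simplified bisection sketch of this idea; the retesting heuristics of the full MOBS procedure are omitted and the observer is a hypothetical deterministic one:

```python
def modified_binary_search(lo, hi, min_step, detected):
    """Halve the interval containing the threshold until the step
    (half-interval) falls below min_step."""
    x = (lo + hi) / 2
    while (hi - lo) / 2 >= min_step:
        x = (lo + hi) / 2
        if detected(x):       # detected: threshold lies below x
            hi = x
        else:                 # missed: threshold lies above x
            lo = x
    return x                  # threshold estimate: last tested level

# Hypothetical deterministic observer with threshold 5.2
print(modified_binary_search(0.0, 10.0, 0.1, lambda x: x >= 5.2))
```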

Stochastic approximation
– A theoretically sound variant of the modified binary search
– Stepping rule: X_{n+1} = X_n - (c/n)(Z_n - φ)
  – c: initial step size
  – The stimulus value decreases after correct responses, increases after incorrect ones
  – If φ = 50%: upward and downward steps are equal; otherwise asymmetric
  – The step size (both upward and downward) decreases from trial to trial
– Can converge to any φ
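A sketch of this Robbins-Monro rule as reconstructed above (observer and constants are illustrative):

```python
import math
import random

def stochastic_approximation(x0, c, n_trials, phi, p_correct):
    """Robbins-Monro stepping rule: X_{n+1} = X_n - (c/n) * (Z_n - phi)."""
    x = x0
    for n in range(1, n_trials + 1):
        z = 1 if random.random() < p_correct(x) else 0
        x -= (c / n) * (z - phi)     # down after correct, up after incorrect
    return x

p_correct = lambda x: 1 / (1 + math.exp(-(x - 5.0)))   # hypothetical observer
print(stochastic_approximation(x0=0.0, c=4.0, n_trials=500,
                               phi=0.75, p_correct=p_correct))
```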

Stochastic approximation

Accelerated stochastic approximation
– Stepping rule:
  – First 2 trials: stochastic approximation, X_{n+1} = X_n - (c/n)(Z_n - φ)
  – n > 2: X_{n+1} = X_n - (c/(2 + m_rev))(Z_n - φ), i.e. the step size is changed only when the response changes (m_rev: number of reversals)
– Otherwise the same as stochastic approximation
– Fewer trials than stochastic approximation
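A sketch of the accelerated variant; the exact bookkeeping of reversals varies between descriptions, so this is one plausible reading:

```python
import math
import random

def accelerated_sa(x0, c, n_trials, phi, p_correct):
    """After the first 2 trials the step shrinks only on reversals:
    X_{n+1} = X_n - (c / (2 + m_rev)) * (Z_n - phi)."""
    x, m_rev, prev_sign = x0, 0, 0
    for n in range(1, n_trials + 1):
        z = 1 if random.random() < p_correct(x) else 0
        sign = 1 if z - phi > 0 else -1
        if n > 2 and prev_sign and sign != prev_sign:
            m_rev += 1                       # response changed: shrink step
        prev_sign = sign
        step = c / n if n <= 2 else c / (2 + m_rev)
        x -= step * (z - phi)
    return x

p_correct = lambda x: 1 / (1 + math.exp(-(x - 5.0)))   # hypothetical observer
print(accelerated_sa(x0=0.0, c=4.0, n_trials=100, phi=0.75, p_correct=p_correct))
```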

Accelerated stochastic approximation

Parameter Estimation by Sequential Testing (PEST)
– Sequential testing:
  – Run multiple trials at the same stimulus level x
  – If x is near the threshold, the expected number of correct responses m_c after n_x presentations will be around φ·n_x → the stimulus level is changed if m_c is not in φ·n_x ± w
  – w: deviation limit; w = 1 for φ = 75%
– If the stimulus level needs to be changed: the step size is determined by a set of heuristic rules
– Variants: MOUSE, RAT, More Virulent PEST
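The deviation-limit test can be stated compactly (a sketch; PEST's level-changing heuristics themselves are not reproduced here):

```python
def level_should_change(m_c, n_x, phi, w):
    """PEST's sequential test: stay at the current level while the number of
    correct responses m_c is within phi * n_x +/- w (w: deviation limit)."""
    return abs(m_c - phi * n_x) > w

# With phi = 0.75 and w = 1: after 8 trials ~6 correct are expected, so
# 3 correct falls outside 6 +/- 1 and triggers a level change.
print(level_should_change(m_c=3, n_x=8, phi=0.75, w=1))   # True
```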

Adaptive methods
– Nonparametric methods:
  – No assumptions about the shape of the psychometric function
  – Can measure the threshold only
– Parametric methods:
  – The general form of the psychometric function is known; only its parameters (threshold and slope) need to be measured
  – If the slope is also known: measure only the threshold

Parametric adaptive methods
– A template for the psychometric function is chosen:
  – Cumulative normal
  – Logistic
  – Weibull
  – Gumbel
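For concreteness, one common parameterization of these four templates (other parameterizations exist; these are assumptions, not the course's definitions):

```python
import math

# Two-parameter templates F(x; a, b): a positions the curve (threshold),
# b controls the slope.
def cumulative_normal(x, a, b):
    return 0.5 * (1 + math.erf((x - a) / (b * math.sqrt(2))))

def logistic(x, a, b):
    return 1 / (1 + math.exp(-(x - a) / b))

def weibull(x, a, b):
    return 1 - math.exp(-((x / a) ** b))            # defined for x > 0

def gumbel(x, a, b):
    return 1 - math.exp(-math.exp((x - a) / b))
```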

Parametric adaptive methods
– Only the parameters of the template need to be measured:
  – Threshold
  – Slope

Fitting the psychometric function
1. Linearization (inverse transformation) of the data points
   – Inverse cumulative normal (probit)
   – Inverse logistic (logit)

Fitting the psychometric function
2. Linear regression
3. Transformation of the regression line parameters: x-intercept & linear slope → threshold & logistic slope
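A worked sketch of steps 1-3 with hypothetical data, using the logit linearization:

```python
import math

# Illustrative data: stimulus levels and observed proportion correct
x = [1.0, 2.0, 3.0, 4.0, 5.0]
p = [0.12, 0.27, 0.55, 0.80, 0.91]

# 1. Linearize: logit(p) = (x - a) / b is linear in x
y = [math.log(q / (1 - q)) for q in p]

# 2. Ordinary least-squares line y = m*x + c
n = len(x)
mx, my = sum(x) / n, sum(y) / n
m = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
    sum((xi - mx) ** 2 for xi in x)
c = my - m * mx

# 3. Transform back: threshold a = x-intercept, logistic slope b = 1/m
a, b = -c / m, 1 / m
print(f"threshold = {a:.2f}, slope parameter = {b:.2f}")
```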

Contour integration experiment
(Fitted psychometric functions: D = 2, slope = -0.6; D = 65, slope = 0.3)

Contour integration experiment 5-day perceptual learning

Adaptive probit estimation
– Short blocks of the method of constant stimuli
– Between blocks: threshold and slope are estimated (a psychometric function is fitted to the data) and the stimulus levels are adjusted accordingly
  – Assumes a cumulative normal function → probit analysis
– Stopping criterion: after a fixed number of blocks
– Final estimate of threshold and slope: re-analysis of all the responses

Adaptive probit estimation
– Start with an educated guess of the threshold and slope
– In each block: 4 stimulus levels, presented 10 times each
– After each block: threshold and slope are estimated by probit analysis of the responses in the block
– Stimulus levels for the next block are adjusted accordingly
  – The estimated threshold and slope are applied only through correction factors → inertia

Measuring the threshold only
– The function shape (form & slope) is predetermined by the experimenter
– Only the position along the x-axis (threshold) needs to be measured
– Iteratively estimating the threshold and adapting the stimulus levels
– Two ways to estimate the threshold:
  – Maximum likelihood (ML)
  – Bayes estimation
– QUEST, BEST PEST, ML-TEST, Quadrature Method, IDEAL, YAAP, ZEST

Maximum likelihood estimation
– Construct the psychometric function with each possible threshold value
– Calculate the probability of the observed responses under each threshold value (the likelihood)
– Choose the threshold value for which the likelihood is maximal (i.e. the psychometric function that is the most likely to have produced such responses)
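A grid-search sketch of this procedure, with the slope assumed known and all data hypothetical:

```python
import math

def logistic(x, a, b=1.0):
    return 1 / (1 + math.exp(-(x - a) / b))

def ml_threshold(levels, responses, candidates, slope=1.0):
    """Pick the candidate threshold whose psychometric function makes
    the observed responses most likely (slope assumed known)."""
    def log_lik(a):
        ll = 0.0
        for x, z in zip(levels, responses):
            p = logistic(x, a, slope)
            ll += math.log(p if z == 1 else 1 - p)
        return ll
    return max(candidates, key=log_lik)

# Illustrative data
levels    = [2.0, 3.0, 4.0, 5.0, 6.0, 5.0, 4.5]
responses = [0,   0,   1,   1,   1,   1,   0  ]
grid = [i * 0.1 for i in range(0, 101)]        # candidate thresholds 0..10
print(ml_threshold(levels, responses, grid))
```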

Bayes estimation
– Prior information is also used:
  – the distribution of the threshold in the population (e.g. from a survey of the literature)
  – the experimenter's beliefs about the threshold
  → a priori distribution of the threshold
– Likelihood: the values of the psychometric functions at the tested stimulus levels

Bayes' theorem (1764)
– A simple rule to turn around a conditional probability:
  P(A|B) = P(B|A) · P(A) / P(B)
– Application for statistical inference:
  p(parameters | data) = p(data | parameters) · p(parameters) / p(data)

Bayesian inference
p(parameters | data) = p(data | parameters) · p(parameters) / p(data)
– a posteriori probability: estimates the unknown physical parameters based on the known observations
– likelihood (conditional probability): predicts unknown observations based on known parameters
– a priori probability: prior knowledge, top-down effects
– p(data): normalizing constant
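Putting the pieces together, a sketch of Bayesian threshold estimation over a grid of candidate thresholds (prior, data, and slope are illustrative assumptions); the posterior mean is returned as one possible point estimate:

```python
import math

def logistic(x, a, b=1.0):
    return 1 / (1 + math.exp(-(x - a) / b))

def posterior_threshold(levels, responses, grid, prior, slope=1.0):
    """Posterior over candidate thresholds: prior * likelihood, normalized."""
    post = []
    for a, pr in zip(grid, prior):
        lik = 1.0
        for x, z in zip(levels, responses):
            p = logistic(x, a, slope)
            lik *= p if z == 1 else 1 - p
        post.append(pr * lik)
    total = sum(post)                      # normalizing constant p(data)
    post = [w / total for w in post]
    return sum(a * w for a, w in zip(grid, post))   # posterior mean

grid  = [i * 0.1 for i in range(0, 101)]
prior = [math.exp(-0.5 * ((a - 5) / 2) ** 2) for a in grid]   # Gaussian prior
print(posterior_threshold([3.0, 4.0, 5.0, 6.0], [0, 1, 1, 1], grid, prior))
```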

Bayesian coding hypothesis
1. Neural information representation is probabilistic (not deterministic)
2. At each stage of local computation: instead of decisions, a representation of all possible values of the parameter & their corresponding probabilities (obtained by Bayesian inference)
3. Final decision: can be the mean, the mode, etc. of the probability distribution
Advantages:
– no need to commit too early to particular interpretations
– the uncertainty of the information is taken into account (e.g. in cue integration)

Bayesian coding hypothesis
(Figure from Knill & Pouget, TiNS 2004)

Bayesian coding hypothesis
– Bayesian theory predicts psychophysical data well (esp. perceptual biases and cue integration)
– Bayesian computational models: successfully applied in e.g. machine vision & speech recognition
– Neurophysiology: only sparse results on the coding of uncertainty in neuronal populations