
Goals
– Looking at data
  – Signal and noise
– Structure of signals
  – Stationarity (and not)
  – Gaussianity (and not)
  – Poisson (and not)
– Data analysis as variability decomposition
  – Frequency analysis as variance decomposition
  – Linear models as variability explanation
  – Information-theoretic methods for variability decomposition
  – …

Signal and noise
Data analysis is about
– extracting the signal that is in the noise
– demonstrating that it is 'real'
Everything else is details.

The details
What is the signal? What is the noise? How do we separate them?

Example: Looking at data

Signal and noise
The signal and the noise depend on the experimental question:
– For sensory experiments, the signal is the sensory-driven response and everything else is noise.
– For experiments about the magnitude of channel noise in auditory cortex neurons, the sensory responses to environmental sounds are the noise and the channel noise is the signal.

Signal and noise
Therefore, we have to know what our data are composed of.
The data will contain a number of sources of variability. The experiment is about some of these sources of variability, which are then the signal, while the others are the noise.

Sources of variability
Neurobiological:
– Channel noise
– Spontaneous EPSPs and IPSPs
– Other subthreshold voltage fluctuations
  – Intrinsic oscillations
  – Up and down states
– Sensory-driven currents and membrane potential changes
– Spikes

Sources of variability
Non-neural, but biological:
– Breathing, heart rate and other motion artifacts
– ERG, ECG, EMG, …
Non-biological:
– 50 Hz interference
– Tip potentials, junction potentials
– Noise in the electrical measurement equipment
In the experimental part of the course, you will learn how to minimize these.

Sources of variability
How do we separate sources of variability?
– By measuring them directly
– By their special properties
  – Rates of fluctuations (smoothing and filtering)
  – Shape (spike clipping)
– By their timing
  – Event-related analysis

Direct measurement of sources of variability (Fee, Neuron 2000)

– Respiration and heart rate (for active stabilization of the electrode)
– 50 Hz interference (for removing it using event-related analysis)
– Identifying the neuron you are recording from, or at least approximately knowing the layer
– Video recording of whisker movement during recording from the barrel field in awake rats

Properties of signals
Rates of fluctuations:
– Molecular conformation changes (channel opening and closing) (may be <1 ms)
– Membrane potential time constants (1-30 ms)
– Stimulation rates ( s)

Nelken et al. J. Neurophysiol. 2004

Rates of fluctuations: 1 second (1 Hz)

Rates of fluctuations: 100 ms (10 Hz)

Rates of fluctuations: 10 ms (100 Hz)

Rates of fluctuations: 1 ms (1 kHz)

Rates of fluctuations: 0.1 ms (10 kHz)

Rates of fluctuations
If you know what the relevant rates of fluctuation are, you can get rid of the other (faster and slower) fluctuations.
Extracting slow fluctuations while removing faster ones is called smoothing.
More generally, we can extract any range of fluctuations (within reasonable constraints…) by (linear) filtering.

Example: smoothing

Go to Matlab
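The course demo runs in Matlab; as a rough illustration of the same smoothing idea, here is a minimal Python sketch (the sampling rate, window length, and signal parameters are all invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated trace: a slow 2 Hz "signal" plus fast noise, sampled at 1 kHz.
# (All parameters here are invented for illustration.)
fs = 1000
t = np.arange(0, 1, 1 / fs)
trace = np.sin(2 * np.pi * 2 * t) + 0.5 * rng.normal(size=t.size)

# Smoothing = convolving with a short averaging window: slow fluctuations
# pass through, fast ones are attenuated (a crude low-pass filter).
win = np.ones(51) / 51
smoothed = np.convolve(trace, win, mode="same")

# The smoothed trace fluctuates far less from sample to sample.
print(np.var(np.diff(smoothed)) < np.var(np.diff(trace)))
```

The boxcar window used here is the simplest choice; any other low-pass filter kernel would serve the same purpose.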

Introduction to statistical tests
We compare two processes with the same means (0) but with obviously different variances.
We know that the variance ratio is about 4.
Is 4 large or small?

Introduction to statistical tests
4 is larger than 1 if the noise around 1 is such that a value of 4 is unlikely to occur by chance.

Introduction to statistical tests
In order to say whether 4 is large or small, we would really like to compare it with the values we would get if we repeated the experiment under the same conditions.
Formally, we think of the voltage traces we compare as the result of a sampling experiment from a large set of potential voltage traces.
When possible, we would like to have many examples of these potential voltage traces.

Introduction to statistical tests
– Take many samples from up states and many samples from down states
– Filter each and compute its variance
– Select many pairs from up states and compute ratios (should be close to 1)
– Select many pairs from down states and compute ratios (should be close to 1)
– Select many pairs from up and down states and compute ratios (should be close to 4)

Introduction to statistical tests
When we have a lot of data, this is an appropriate approach, although note that it pushes the problem one step further (how do we know that the ratios near 1 are indeed smaller than those near 4?).
But often we have only one trace, and we want to say something about it.
We need further assumptions!

Introduction to statistical tests
Since we have only one trace, we need to invent the set from which it came.
We tend to use relatively simple assumptions about this set, which are usually wrong but not too wrong.

Introduction to statistical tests
Here we make the following two assumptions:
– The two processes are stationary Gaussian
– The two processes have identical means
What does that mean?
– We will see the details later
– We can select an independent subset of samples from each process

Go to Matlab
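The Matlab demo selects an independent subset of samples; a Python sketch of the same idea (correlation length and subsampling step are invented) is to take every k-th sample, with k longer than the trace's correlation time:

```python
import numpy as np

rng = np.random.default_rng(1)

# A correlated trace: white noise smoothed with a 10-sample window,
# so neighbouring samples are strongly dependent.
noise = rng.normal(size=5000)
trace = np.convolve(noise, np.ones(10) / 10, mode="valid")

# Taking every 20th sample spaces the points beyond the 10-sample
# correlation length, so they are approximately independent.
subset = trace[::20]

# Neighbour correlation: high in the raw trace, near zero in the subset.
r_raw = np.corrcoef(trace[:-1], trace[1:])[0, 1]
r_sub = np.corrcoef(subset[:-1], subset[1:])[0, 1]
print(r_raw, r_sub)
```

The subsampled points behave like independent draws, which is what the F test below requires.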

Introduction to statistical tests
Why is choosing independent samples important?
Many years ago, it was shown that ratios of sums of squares of independent, zero-mean Gaussian variables with the same variance have a specific distribution, called the F distribution.
So we actually know what the expected distribution of the ratio is if the variances are the same.

Introduction to statistical tests
The distribution depends on how many samples are added together (obviously…). These numbers are called degrees of freedom, and there are two of them: one for the numerator and one for the denominator.
Here both numbers are 51.

Introduction to statistical tests
So our question has been transformed into the following one: we got a variance ratio of 5.6; how surprising is that if variance ratios follow an F(51,51) distribution?
To answer it, let's look at the F distribution…

Go to Matlab
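Instead of the Matlab demo, the tail of F(51,51) can be approximated in Python by Monte Carlo, directly from the definition of the F distribution as a ratio of mean squares of independent standard Gaussians:

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo stand-in for the F(51,51) distribution: ratios of the mean
# squares of 51 independent standard-normal samples in numerator and
# denominator, repeated many times.
num = rng.normal(size=(50000, 51))
den = rng.normal(size=(50000, 51))
ratios = (num**2).mean(axis=1) / (den**2).mean(axis=1)

# How often does a ratio as large as 5.6 occur when the variances are
# truly equal? Essentially never.
p = np.mean(ratios >= 5.6)
print(p < 0.001)
```

With SciPy available, the exact tail probability would be `scipy.stats.f.sf(5.6, 51, 51)`; the simulation above gives the same conclusion: 5.6 is far out in the tail.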

Introduction to statistical tests
To recapitulate:
– We got data: one trace from an up state, one trace from a down state
– We made assumptions about what many repeats should look like: Gaussian, stationary, with zero mean
– We generated a test statistic whose distribution we know under the assumptions: the F test (variance ratio)
– We go to the theoretical distribution and ask whether our result is extreme: yes!

Introduction to statistical tests
A test is as good as its assumptions…
Are our assumptions good? How bad are our departures from them?

Introduction to statistical tests
Stationarity means that:
– means do not depend on where they are measured
– variances and covariances do not depend on where they are measured
– …
Gaussian processes are processes such that:
– samples are Gaussian
– pairs of samples are jointly Gaussian (and when the process is stationary, their joint distribution depends only on the time interval between them)
– …

What if the data are non-stationary?

Event-related analysis of data
We select events that serve as 'renewal points'.
Renewal points are points in time where the statistical structure restarts, in the sense that everything afterwards depends only on the time since the last renewal point.

Event-related analysis of data
Examples of possible renewal points:
– Stimulation times
– Spike times (when you believe that everything depends only on the time since the last spike)
– Spike times in another neuron (when you believe that …)

Event-related analysis of data
Less obvious renewal points:
– Reverse correlation analysis
– The random process in which we look for renewal points is now the stimulus
– The renewal points are the spike times

Event-related analysis of data
Assume that we have renewal points in the membrane potential data.
This means that we believe that segments of membrane potential traces that start at the renewal points are samples from one and the same distribution.
We want to characterize that distribution.

Event-related analysis of data
We usually characterize the mean of the distribution. This is called event-triggered averaging.
When the event is a spike, the result is spike-triggered averaging:
– If the spike is from the same neuron, the result is a kind of autocorrelation
– If the spike is from a different neuron, the result is a cross-correlation
When the event is a stimulus, the result is the PSTH (or PSTA).

Go to Matlab
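As a stand-in for the Matlab demo, event-triggered averaging can be sketched in Python (event times, window size, and the "response" shape are all invented):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy trace: noise plus a stereotyped bump after each "event".
trace = rng.normal(0, 0.2, size=20000)
events = np.arange(1000, 19000, 500)        # hypothetical event times (samples)
bump = np.exp(-np.arange(50) / 10.0)        # the stereotyped response
for e in events:
    trace[e:e + 50] += bump

# Event-triggered average: cut a window around each event and average
# the snippets sample by sample.
win = np.arange(-20, 80)
snippets = np.stack([trace[e + win] for e in events])
eta = snippets.mean(axis=0)

# The average recovers the bump after the event, while before the event
# the noise averages out towards zero.
print(eta[win == 1][0], eta[win == -10][0])
```

The same cut-and-average operation gives the spike-triggered average or the PSTH, depending only on which events are used as triggers.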

Event-related analysis of data
We saw that using the mean does not necessarily capture the statistics of the ensemble well.
Nevertheless, the mean is always the first choice because in many respects it is the simplest.
Variances and correlations are also used for event-related analysis, but this goes beyond this elementary treatment.

What if the data are not Gaussian?

Morphological processing
Identifying signal events by their shape.
Usually based on case-specific methods; there is very little general theory behind it.
Closely linked to clustering.

Morphological processing
When the shapes are highly repetitive and very different from the noise, we can use matched filters.
A matched filter is a filter whose shape is precisely that of the shape to be detected.

Go to Matlab
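In the same spirit as the Matlab demo, matched filtering is just cross-correlation with the template; a Python sketch (template shape and event positions invented):

```python
import numpy as np

rng = np.random.default_rng(3)

# Template: an invented spike-like shape; trace: noise with two copies
# of the template embedded at known positions.
template = np.array([0.0, 1.0, 3.0, 1.0, -1.0, 0.0])
trace = rng.normal(0, 0.3, size=500)
true_pos = [100, 300]
for p in true_pos:
    trace[p:p + template.size] += template

# Matched filtering = cross-correlating the trace with the template;
# the output peaks where the trace matches the template.
out = np.correlate(trace, template, mode="valid")
detected = np.argsort(out)[-2:]             # positions of the two largest peaks
print(sorted(detected.tolist()))
```

The filter output is largest exactly at the embedded positions, because no other filter shape gives a higher output at the template locations relative to the noise.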

Morphological processing
When the shapes are not necessarily highly repetitive but are still very different from the noise, we can use a generalization of matched filters.
Principal components are basic shapes whose weighted combinations fit our shapes well but should fit the noise poorly.
But this takes us beyond this level…
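A minimal Python sketch of this idea, with an invented family of shapes (scaled copies of one basic waveform), shows how one principal component captures the whole family:

```python
import numpy as np

rng = np.random.default_rng(4)

# Invented family of shapes: randomly scaled copies of one basic
# waveform, plus a little noise.
basic = np.sin(np.linspace(0, np.pi, 30))
snippets = np.outer(rng.uniform(0.5, 2.0, size=200), basic)
snippets += rng.normal(0, 0.05, size=snippets.shape)

# Principal components via SVD of the mean-subtracted snippet matrix.
centered = snippets - snippets.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)

# One component dominates, because a single basic shape (with varying
# weight) generates the whole family; the noise spreads thinly over
# the remaining components.
print(explained[0])
```

Projecting a new snippet onto the leading components then gives a small set of numbers that describe its shape, which is the basis for the spike-sorting applications mentioned below.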

Morphological processing
Some spike sorting is done using principal components or matched filters.
Some EPSP identification is done using principal components.
Like all data processing techniques, this is a GIGO (garbage in, garbage out) process and should be checked very carefully.

Morphological processing
Eventually, much of morphological processing is about deciding between classes: you get a single number and you want to say whether it is large or small.
Some standard statistics can be used here, but mostly the treatment is data-analytic.