Stochastic Process Theory and Spectral Estimation
Bijan Pesaran, Center for Neural Science, New York University
Overview
- Stochastic process theory (see Appendix)
- Spectral estimation
Fourier Transform
- For real functions $x(t)$: $\tilde{x}(f) = \int x(t)\, e^{-2\pi i f t}\, dt$, with $\tilde{x}(-f) = \tilde{x}^*(f)$
- Parseval's theorem (total power is conserved): $\int |x(t)|^2\, dt = \int |\tilde{x}(f)|^2\, df$
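A minimal numerical check of Parseval's theorem (a sketch, not from the slides), using numpy's FFT with its default normalization; the 1/N factor accounts for that convention.

```python
import numpy as np

# Check Parseval's theorem for a discrete signal: with numpy's default FFT
# normalization, sum |x[t]|^2 equals (1/N) * sum |X[f]|^2.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)

time_power = np.sum(np.abs(x) ** 2)
freq_power = np.sum(np.abs(np.fft.fft(x)) ** 2) / len(x)

print(time_power, freq_power)   # the two numbers agree to machine precision
```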
Examples of Fourier Transforms (figure: example signals in the time domain and their transforms in the frequency domain)
Time translation invariance
- Leads directly to spectral analysis
- The Fourier basis is the eigenbasis of the time-translation operator
Implications for the second moment
- If the process is stationary, the second moment is time-translation invariant: $\langle x(t)\, x(t+\tau) \rangle = C(\tau)$ for all $t$
- Hence, in the frequency domain, $\langle \tilde{x}(f)\, \tilde{x}^*(f') \rangle = S(f)\, \delta(f - f')$, because a time shift by $t_0$ multiplies $\langle \tilde{x}(f)\, \tilde{x}^*(f') \rangle$ by $e^{-2\pi i (f - f') t_0}$, which can equal 1 for all $t_0$ only if $f = f'$
Stationarity
- Stationarity means neighboring frequencies are uncorrelated: $\langle \tilde{x}(f)\, \tilde{x}^*(f') \rangle = 0$ for $f \neq f'$
- This is not true for neighboring times, where $\langle x(t)\, x(t') \rangle = C(t - t')$ is generally nonzero
- Also due to stationarity, the correlation depends only on the time difference $t - t'$ (in general it could depend on $t$ and $t'$ separately)
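As an illustration (a sketch, not from the slides), one can simulate many realizations of a stationary AR(1) process and compare correlations across realizations: neighboring time points are strongly correlated, while Fourier coefficients at two distinct frequencies are nearly uncorrelated.

```python
import numpy as np

def corr(u, v):
    """Magnitude of the empirical correlation between two (possibly complex) samples."""
    u = u - u.mean()
    v = v - v.mean()
    return np.abs(np.sum(u * np.conj(v))) / np.sqrt(np.sum(np.abs(u) ** 2) * np.sum(np.abs(v) ** 2))

rng = np.random.default_rng(1)
n_trials, N, a = 2000, 256, 0.9

# Simulate a stationary AR(1) process (with a burn-in to forget the initial condition).
x = np.zeros((n_trials, N + 200))
for t in range(1, N + 200):
    x[:, t] = a * x[:, t - 1] + rng.standard_normal(n_trials)
x = x[:, 200:]

X = np.fft.fft(x, axis=1)

print("neighboring times:    ", corr(x[:, 100], x[:, 101]))   # large (~0.9)
print("distinct frequencies: ", corr(X[:, 20], X[:, 40]))     # small (~0)
```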
Cross-spectrum and coherence
- Cross-spectrum: $S_{xy}(f) = \langle \tilde{x}(f)\, \tilde{y}^*(f) \rangle$
- Coherency: $C_{xy}(f) = S_{xy}(f) / \sqrt{S_{x}(f)\, S_{y}(f)}$; the coherence is its magnitude $|C_{xy}(f)|$
Coherence Coherence measures the linear association between two time series. Cross-spectrum is the Fourier transform of the cross-correlation function
Coherence: the phase of the coherency, $\phi(f)$, gives a frequency-dependent time delay, $\tau(f) = \phi(f) / (2\pi f)$
Advantages of spectral estimation
- Neighboring frequency bins are uncorrelated
- Error bars are relatively easy to calculate
- Stable statistical estimators
- Separates signals that occupy different frequencies
- Normalized quantities allow averaging and comparisons
Estimating spectra
- Simple spectral estimates: the periodogram, its bias and variance
- Tapering as smoothing of the spectrum
- Multitaper estimates using Slepians: spectrum and coherence
Example LFP spectrum (figure: single-trial periodogram vs single-trial multitaper estimate, 2NT = 10)
Spectral estimation problem
- The Fourier transform requires an infinite sequence of data
- In reality we only have finite sequences of data, so we calculate a truncated DFT: $\tilde{x}_T(f) = \sum_{t=1}^{T} x_t\, e^{-2\pi i f t}$
What happens if we have a finite sequence of data?
- A finite sequence is the full sequence multiplied by a rectangular window, so the DFT we compute is the convolution of the true transform $\tilde{x}(f)$ with the Fourier transform of the window
The Fourier transform of a rectangular window is the Dirichlet kernel:
$D_T(f) = \sum_{t=0}^{T-1} e^{-2\pi i f t} = e^{-\pi i f (T-1)}\, \frac{\sin(\pi f T)}{\sin(\pi f)}$
Product in time = convolution in frequency
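A short numpy sketch (parameters chosen arbitrarily, not from the slides) of the resulting leakage: the periodogram of a pure sinusoid whose frequency does not fall on a DFT bin shows power far from the true frequency, coming from the sidelobes of the Dirichlet kernel.

```python
import numpy as np

# Spectral leakage from the implicit rectangular window: the periodogram of a
# pure sinusoid is the squared Dirichlet kernel centred on the true frequency,
# so the sidelobes spread power to distant frequencies.
N, f0 = 128, 0.1                       # samples, true frequency in cycles/sample
t = np.arange(N)
x = np.cos(2 * np.pi * f0 * t)

freqs = np.fft.rfftfreq(N)             # 0 ... 0.5 cycles/sample
periodogram = np.abs(np.fft.rfft(x)) ** 2 / N

# power well away from f0 is non-zero purely because of sidelobe (broadband) bias
print(periodogram[freqs > 0.3].max())
```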
Bias Bias is the difference between the expected value of an estimator and the true value. The Dirichlet kernel is not a delta function, therefore the sample estimate is biased and doesn’t equal the true value.
Normalized Dirichlet kernel (figure, with the 20% height level marked)
- Narrowband bias: local bias due to the central lobe
- Broadband bias: bias from distant frequencies due to the sidelobes
Data tapers
- We can do better than multiplying the data by a rectangular kernel
- Choose a function that tapers the data to zero towards the edge of the segment
- Many choices of data taper exist: the Hanning taper, Hamming taper, triangular taper and so on
Triangular taper
- Broadens the central lobe, reduces the sidelobes
- Figure: the Fejér kernel (triangular taper) compared with the Dirichlet kernel (rectangular taper)
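A rough comparison (a sketch; N, the zero-padding length and the 3/N measurement band are arbitrary choices) of the transfer functions of the rectangular, triangular and Hann windows: the tapered windows trade a wider central lobe for much lower sidelobes.

```python
import numpy as np

# Compare window transfer functions: tapering lowers the sidelobes of the
# rectangular window at the cost of a wider central lobe. The sidelobe level
# printed here is the peak magnitude beyond 3/N cycles/sample, in dB re. peak.
N, pad = 64, 4096
windows = {
    "rectangular": np.ones(N),
    "triangular":  np.bartlett(N),
    "hann":        np.hanning(N),
}
freqs = np.fft.rfftfreq(pad)
for name, w in windows.items():
    W_db = 20 * np.log10(np.abs(np.fft.rfft(w, n=pad)) + 1e-12)
    W_db -= W_db.max()
    print(f"{name:12s} sidelobe level beyond 3/N: {W_db[freqs > 3 / N].max():6.1f} dB")
```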
Spectral concentration problem
- Tapering the data reduces the sidelobes but broadens the central lobe
- Are there "optimal" tapers?
- Find strictly time-localized functions, $w_t$, whose Fourier transforms are maximally localized on the frequency interval [-W, W]
Optimal tapers
- The DFT, $\tilde{w}(f) = \sum_{t=1}^{T} w_t\, e^{-2\pi i f t}$, of a finite series, $w_t$
- Find the series that maximizes the fraction of energy in the [-W, W] frequency band: $\lambda = \int_{-W}^{W} |\tilde{w}(f)|^2\, df \Big/ \int_{-1/2}^{1/2} |\tilde{w}(f)|^2\, df$
Discrete Prolate Spheroidal Sequences
- Solved by Slepian, Landau and Pollak
- The solutions are an orthogonal family of sequences $w_t^{(k)}$ satisfying the eigenvalue equation $\sum_{t'=1}^{T} \frac{\sin 2\pi W (t - t')}{\pi (t - t')}\, w_{t'}^{(k)} = \lambda_k\, w_t^{(k)}$
Slepian functions
- Eigenvectors of the eigenvalue equation
- Orthonormal on [-1/2, 1/2], orthogonal on [-W, W]
- K = 2WT - 1 eigenvalues are close to 1, the rest are close to 0
- These correspond to 2WT - 1 functions concentrated within [-W, W]
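A sketch using scipy's DPSS implementation (scipy.signal.windows.dpss; N and NW chosen arbitrarily) to check these properties: for a given time-bandwidth product, the concentration ratios of roughly the first 2NW - 1 tapers are close to 1 and then drop quickly, and the tapers are orthonormal.

```python
import numpy as np
from scipy.signal.windows import dpss

# Slepian (DPSS) tapers: for time-bandwidth product NW, roughly the first
# 2*NW - 1 tapers have concentration ratios (eigenvalues) near 1; later ones
# fall off quickly towards 0.
N, NW = 1000, 3
tapers, ratios = dpss(N, NW, Kmax=2 * NW, return_ratios=True)

print(np.round(ratios, 4))              # sharp drop after the first 2*NW - 1 values
print(tapers.shape)                     # (2*NW, N): each row is one taper
print(np.round(tapers @ tapers.T, 3))   # close to the identity: tapers are orthonormal
```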
The power of the kth Slepian function within the bandwidth [-W, W] is its eigenvalue $\lambda_k$ (figure)
Comparing Slepian functions: a systematic trade-off between narrowband and broadband bias
Advantages of Slepian tapers (figure: tapers for 2WT = 6)
- Using multiple tapers recovers the edges of the time window
Multitaper spectral estimation
- Each data taper provides an uncorrelated estimate; average over them to get the spectral estimate: $\hat{S}(f) = \frac{1}{K} \sum_{k=1}^{K} \Big| \sum_t w_t^{(k)} x_t\, e^{-2\pi i f t} \Big|^2$
- Treat different trials as additional tapers and average over them as well
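A minimal sketch of such an estimator (the helper name multitaper_spectrum is mine; no adaptive weighting, and trial averaging is left out): taper the data with each of the first 2NW - 1 Slepians, take the squared magnitude of the DFT of each tapered copy, and average over tapers.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_spectrum(x, NW=3, fs=1.0):
    """Minimal multitaper estimate: average the periodograms of the data
    multiplied by each of the first K = 2*NW - 1 Slepian tapers."""
    N = len(x)
    K = 2 * NW - 1
    tapers = dpss(N, NW, Kmax=K)                   # (K, N), orthonormal rows
    spectra = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2 / fs
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)
    return freqs, spectra.mean(axis=0)             # average over tapers

# usage: the spectrum of white noise should come out roughly flat
rng = np.random.default_rng(2)
freqs, S = multitaper_spectrum(rng.standard_normal(1024), NW=4, fs=1000.0)
```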
Cross-spectrum and coherency
- Multitaper cross-spectrum: $\hat{S}_{xy}(f) = \frac{1}{K} \sum_k \tilde{x}_k(f)\, \tilde{y}_k^*(f)$
- Coherency: $\hat{C}_{xy}(f) = \hat{S}_{xy}(f) / \sqrt{\hat{S}_x(f)\, \hat{S}_y(f)}$
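A companion sketch (the helper name multitaper_coherence is mine) for two simultaneously recorded series: average the tapered cross- and auto-spectra over tapers before forming the ratio; without some averaging a single-taper coherence magnitude is trivially 1.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_coherence(x, y, NW=3):
    """Multitaper coherency sketch: average tapered cross- and auto-spectra over
    K = 2*NW - 1 tapers, then normalize. Returns complex coherency vs frequency."""
    N = len(x)
    tapers = dpss(N, NW, Kmax=2 * NW - 1)
    X = np.fft.rfft(tapers * x, axis=1)
    Y = np.fft.rfft(tapers * y, axis=1)
    Sxx = np.mean(np.abs(X) ** 2, axis=0)
    Syy = np.mean(np.abs(Y) ** 2, axis=0)
    Sxy = np.mean(X * np.conj(Y), axis=0)
    return np.fft.rfftfreq(N), Sxy / np.sqrt(Sxx * Syy)

# usage: y is a noisy copy of x, so |coherency| should be high across frequencies
rng = np.random.default_rng(3)
x = rng.standard_normal(512)
y = x + 0.5 * rng.standard_normal(512)
freqs, coh = multitaper_coherence(x, y)
print(np.abs(coh).mean())
```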
Advantages of multiple tapers
- Increasing the number of tapers reduces the variance of the spectral estimators
- Explicitly control the trade-off between narrowband bias, broadband bias and variance ("a better microscope")
- Local frequency basis for analyzing signals
Time-frequency resolution (figure: a tile of extent T in time and 2W in frequency)
- Control resolution in the time-frequency plane using the parameters T and W of the Slepians
Example LFP spectrograms (figure: multitaper estimates with T = 0.5 s, W = 10 Hz and T = 0.2 s, W = 25 Hz)
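A sliding-window sketch (the helper name multitaper_spectrogram is mine; T, W and the step size are analysis choices like those in the figure) that makes the T/W trade-off explicit: each column of the spectrogram is a multitaper estimate over a window of length T with half-bandwidth W.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_spectrogram(x, fs, T=0.5, W=10.0, step=0.05):
    """Sliding-window multitaper spectrogram: for each window of length T (s),
    apply K = 2*T*W - 1 Slepian tapers with half-bandwidth W (Hz) and average
    the tapered periodograms. Returns window centres (s), frequencies (Hz), power."""
    N = int(T * fs)
    K = max(1, int(round(2 * T * W)) - 1)
    tapers = dpss(N, T * W, Kmax=K)
    hop = max(1, int(step * fs))
    starts = range(0, len(x) - N + 1, hop)
    S = [np.mean(np.abs(np.fft.rfft(tapers * x[s:s + N], axis=1)) ** 2, axis=0)
         for s in starts]
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)
    times = np.array([s / fs + T / 2 for s in starts])
    return times, freqs, np.array(S)

# usage: shorter T with larger W gives finer time resolution but coarser frequency resolution
rng = np.random.default_rng(5)
lfp = rng.standard_normal(10_000)                    # stand-in for 10 s of data at 1 kHz
times, freqs, S = multitaper_spectrogram(lfp, fs=1000.0, T=0.2, W=25.0)
```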
Summary
- Time series present particular challenges for statistical analysis
- Spectral analysis is a valuable form of time series analysis
Appendix
Data is modeled as a stochastic process
- Examples: spikes, LFP
- Similar considerations apply to EEG, MEG, ECoG, intracellular membrane potentials, intrinsic and extrinsic optical images, 2-photon line scans and so on
Stochastic process theory
- Defining stochastic processes
- Time translation invariance; ergodicity
- Moments (correlation functions) and spectra
- Example: Gaussian processes
Stochastic processes
- Each time series is a realization of a stochastic process
- Given a sequence of observations $x_1, \ldots, x_N$ at times $t_1, \ldots, t_N$, a stochastic process is characterized by the probability distribution $P(x_1, \ldots, x_N)$
- Akin to rolling a die for each time series: a probability distribution over time series
- The alternative is a deterministic process, with no stochastic variability
Defining stochastic processes
- As high-dimensional random variables: rolling one die picks a point (a function) in an N-dimensional space
- As indexed families of random variables: roll many dice, one per time index
Challenge of data analysis: we can never know the full probability distribution of the data (the curse of dimensionality)
Parametric methods
- Infer the PDF by considering a parameterized subspace
- Employ relatively strong models of the underlying process
Non-parametric methods
- Use the observed data to infer statistical properties of the PDF
- Employ relatively weak models of the underlying process
Stationarity
- Stochastic processes don't exactly repeat themselves, but they have statistical regularities: stationarity
- A process is stationary if its probability distribution is invariant under time translation
Ergodicity
- Ensemble averages are equivalent to time averages; often assumed in experimental work
- More stringent than stationarity: a process that sits at a random constant value is stationary but not ergodic unless that constant can take only one value
- Is activity with a time-varying constant ergodic?
Gaussian processes
- Ornstein-Uhlenbeck process
- Wiener process
Ornstein-Uhlenbeck process
- Exponentially decaying correlation function: $C(\tau) = \sigma^2 e^{-|\tau|/\tau_c}$
- Obtained by passing white noise through a 'leaky' integrator
- The spectrum is a Lorentzian: $S(f) \propto \dfrac{1}{1 + (2\pi f \tau_c)^2}$
Ornstein-Uhlenbeck process (figure)
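A simulation sketch (Euler discretization; dt, tau, sigma and N are arbitrary choices): pass white noise through a leaky integrator and check that the empirical correlation time matches the integrator's time constant.

```python
import numpy as np

# Ornstein-Uhlenbeck process as white noise through a 'leaky' integrator
# (Euler-Maruyama discretization). tau is the correlation time; the
# autocorrelation should decay roughly as exp(-|lag|/tau).
rng = np.random.default_rng(4)
dt, tau, sigma, N = 0.001, 0.05, 1.0, 100_000

x = np.zeros(N)
noise = rng.standard_normal(N)
for t in range(1, N):
    x[t] = x[t - 1] - (dt / tau) * x[t - 1] + sigma * np.sqrt(dt) * noise[t]

# empirical correlation time: first lag where the autocorrelation falls below 1/e
xc = x - x.mean()
max_lag = 500
acf = np.array([np.dot(xc[:N - k], xc[k:]) for k in range(max_lag)])
acf /= acf[0]
print(dt * np.argmax(acf < 1 / np.e))   # should be close to tau = 0.05 s
```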
Markovian process
- "The future is independent of the past, given the present"
- This simplifies the joint probability density: $P(x_1, \ldots, x_N) = P(x_1) \prod_{t=2}^{N} P(x_t \mid x_{t-1})$