Change-Point Detection Techniques for Piecewise Locally Stationary Time Series Michael Last National Institute of Statistical Sciences Talk for Midyear.

Presentation transcript:

Change-Point Detection Techniques for Piecewise Locally Stationary Time Series Michael Last National Institute of Statistical Sciences Talk for Midyear Anomaly Detection Workshop 2/3/2006

Stationary Time Series
- We call a time series stationary if the distribution of (x_i, x_k) depends only on l = i - k
- Usually use weakly stationary, where we only look at the first two moments (equivalent in Gaussian case)
- Example: Sunspot numbers, Chandler Wobble, rainfall (over decades)
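
In symbols, the weakly stationary case can be written as below. The names \mu, \gamma and f, the upright \mathrm{i} for the imaginary unit (to keep it apart from the time index i), and the frequency scale in cycles per sample are notation assumed here for illustration; the slides do not fix a normalisation for the power spectrum.

```latex
\begin{align}
  \mathbb{E}[x_i] &= \mu \quad \text{for all } i, \\
  \operatorname{Cov}(x_i, x_k) &= \gamma(l), \quad l = i - k, \\
  f(\omega) &= \sum_{l=-\infty}^{\infty} \gamma(l)\, e^{-2\pi \mathrm{i} \omega l},
  \quad |\omega| \le \tfrac{1}{2}.
\end{align}
```

The first two lines are the first-two-moment conditions; the third defines the power spectrum that the later slides estimate over local windows.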

Detecting Changes: Piecewise Stationary Time Series
- Many series not stationary
  - Earthquakes
  - Speech
  - Finance
- How to model?
  - Try stationary between change-points

Problems With This Approach
- Adak (1998) proposed computing the distance between power spectra computed over small windows: if adjacent windows are close, then merge them into a larger window
- Finds too many change-points in earthquakes
  - E.g. secondary wave tapers off, but change-points will be detected

Time-Varying Power Spectrum
- Power spectrum computed over a window about a point
- Window width selection an open question
- Does this have features we can use?
  - Yes!
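
A minimal sketch of a power spectrum computed over a window about each point, assuming a raw periodogram on sliding windows. The function name `time_varying_spectrum`, the Hann taper, and the `step` parameter are illustrative choices, not details from the talk.

```python
import numpy as np

def time_varying_spectrum(x, n, step=1):
    """Periodograms over sliding windows of length n.

    Returns (centers, freqs, spec), where spec[i, j] is the periodogram at
    frequency freqs[j] (cycles per sample) for the window centred near
    sample centers[i].
    """
    x = np.asarray(x, dtype=float)
    starts = np.arange(0, len(x) - n + 1, step)
    taper = np.hanning(n)                 # taper choice is an assumption
    scale = np.sum(taper ** 2)            # normalise for the taper's energy
    spec = np.empty((len(starts), n // 2 + 1))
    for i, s in enumerate(starts):
        seg = x[s:s + n]
        seg = (seg - seg.mean()) * taper  # remove the local mean, then taper
        spec[i] = np.abs(np.fft.rfft(seg)) ** 2 / scale
    freqs = np.fft.rfftfreq(n)
    centers = starts + n // 2
    return centers, freqs, spec
```

Each row of `spec` is a local spectrum; an abrupt change shows up as rows on either side of a time point that look very different.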

Time-Varying Power Spectrum

Finding Abrupt Changes
- What do we mean by abrupt changes?
- Distance between spectra
- Spectrum as distribution, K-L Information Discrimination
- Requirement of local estimation
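
The talk's own distance function is not preserved in this transcript, so the sketch below is an assumed stand-in. It is based on the symmetrised K-L discrimination information between two spectra, rescaled so that identical spectra give 1. The name `symmetric_ratio_distance` and the `eps` guard are hypothetical.

```python
import numpy as np

def symmetric_ratio_distance(f_left, f_right, eps=1e-12):
    """Mean over frequencies of (f_left/f_right + f_right/f_left) / 2.

    Equals 1 when the two spectra coincide and grows as they diverge.
    Assumed form motivated by the symmetrised K-L discrimination
    information; it is not taken from the slides.
    """
    f_left = np.asarray(f_left, dtype=float) + eps    # guard against zeros
    f_right = np.asarray(f_right, dtype=float) + eps
    ratio = f_left / f_right
    return float(np.mean(0.5 * (ratio + 1.0 / ratio)))
```

Applied to spectra estimated just to the left and just to the right of a time point, a value near 1 suggests no change there, and a large value flags a candidate change-point.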

Our Distance Function

Theoretical Performance
- Maximum away from change-points converges to 1. Rate of convergence:
- Consistently estimated with smoothed periodograms
- Asymptotically normal
- Finite sample critical values independent of underlying signal
- n is length of window, T is length of series

Example Series

Simulation Results
- Simulations to determine effectiveness of change-point localization and identification
  - Separated tasks
- 8 types of series with different features
- Minimal amount of tuning
- Compared with other methods
- Results:
  - Good localization
  - 65+% correct identification

Data Performance

Primary Wave

Secondary Wave

Speech Segmentation
- Abrupt changes exist at transitions between phonemes
- Can we reliably recover these?
- Given segmented speech, can we meaningfully cluster it?
- Can we interpret clusters?
- Can we use the clusters to deduce speaker, accent, or language?

Time-Varying Power Spectra

Speech

Window Width Considerations
- Need a window with enough data to estimate several frequencies in the range where interesting events happen
  - Below 10 Hz for earthquakes
  - At least down to 20 Hz for audio
- At present, this remains one of the major tuning parameters. In effect, wide windows have low variance but risk higher bias
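
As a rough guide to "enough data": a window of n samples at sampling rate fs has Fourier frequencies j*fs/n Hz, so at least k nonzero frequencies fall at or below f_max when n >= k*fs/f_max. The helper name `min_window_length` and the sampling rates in the usage note are hypothetical, not values from the talk.

```python
import math

def min_window_length(fs, f_max, k):
    """Smallest n such that the Fourier frequencies j*fs/n, j = 1..k,
    all lie at or below f_max (fs and f_max in Hz)."""
    return math.ceil(k * fs / f_max)
```

For example, with an assumed 100 Hz seismometer and 10 frequencies wanted below 10 Hz, `min_window_length(100, 10, 10)` gives 100 samples (one second). For audio at an assumed 8000 Hz, resolving down to 20 Hz needs `min_window_length(8000, 20, 1)`, which is 400 samples.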

How to Assess a Significant Change
- Asymptotic distribution:
  - Test statistic is a sum of variables with an F distribution plus their inverses
  - Asymptotic normality
- Problem: events of interest are in the tail, and asymptotic results break down in the tails of distributions
- Test statistic is signal independent
  - Simulate on white noise, pick significance from there
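
A minimal sketch of "simulate on white noise, pick significance from there". It assumes a statistic built from ratios of smoothed periodograms in two adjacent windows (a sum of ratios plus their inverses, averaged over frequency). The exact statistic, the flat smoothing kernel, and all function names are assumptions for illustration only.

```python
import numpy as np

def periodogram(seg):
    seg = seg - seg.mean()
    return np.abs(np.fft.rfft(seg))[1:] ** 2 / len(seg)  # drop frequency 0

def smooth(p, m):
    # Flat moving average over m neighbouring frequencies. Edge values are
    # damped by zero padding, but equally in numerator and denominator.
    return np.convolve(p, np.ones(m) / m, mode="same")

def statistic(left, right, m):
    ratio = smooth(periodogram(left), m) / smooth(periodogram(right), m)
    return float(np.mean(0.5 * (ratio + 1.0 / ratio)))

def white_noise_critical_value(n, m, alpha=0.01, reps=2000, seed=0):
    """Empirical (1 - alpha) quantile of the statistic when both windows
    of length n contain Gaussian white noise (no change present)."""
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for r in range(reps):
        z = rng.standard_normal(2 * n)
        stats[r] = statistic(z[:n], z[n:], m)
    return float(np.quantile(stats, 1.0 - alpha))
```

This calibrates a single comparison of two windows. If the statistic is scanned along a whole series and its maximum is tested, the same idea applies with the maximum over a simulated white-noise series in place of the single value.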

End of Talk
- Slides which may address specific questions follow, but unless I've talked way too fast, there probably won't be time to show these. So let's break for coffee, and if anybody has a burning desire to learn more about what I've said, please come and ask me. I'm happy to answer any questions, and may just have a slide lying around to answer with

Finding the Change-Point(s)
- Assume correct number of change-points, and find

Issues
- How to assess a significant change?
- Uncertainty in location?
- Choosing parameters
  - Window width
  - Smoothing
  - Weights

Choosing Parameters: Window Width
- We need a window width much wider or much narrower than the scale interesting changes happen on
- Much wider and the series mixes within a window
- Much narrower and continuity of time-varying power spectrum kicks in
- Same scale and oscillations can be detected as big changes

Smoothing
- Makes estimate consistent
- Ruins independence in frequency
- Another tuning parameter
- Bandwidth matters more than shape
- Current heuristic is about square-root of number of frequencies, seems to work well
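
The square-root heuristic can be sketched as a periodogram smoothed by a flat kernel whose width is about the square root of the number of frequencies. The flat (Daniell-style) kernel, the reflection at the edges, the rounding up to an odd width, and the name `smoothed_periodogram` are assumptions here; the slides only say that bandwidth matters more than shape.

```python
import numpy as np

def smoothed_periodogram(seg, bandwidth=None):
    """Periodogram of seg averaged over neighbouring frequencies."""
    seg = np.asarray(seg, dtype=float)
    seg = seg - seg.mean()
    p = np.abs(np.fft.rfft(seg))[1:] ** 2 / len(seg)   # drop frequency 0
    if bandwidth is None:
        bandwidth = max(1, int(round(np.sqrt(len(p)))))  # sqrt heuristic
    if bandwidth % 2 == 0:
        bandwidth += 1                                   # keep the kernel centred
    kernel = np.ones(bandwidth) / bandwidth
    pad = bandwidth // 2
    padded = np.pad(p, pad, mode="reflect")              # avoid damping the ends
    return np.convolve(padded, kernel, mode="valid")
```

Averaging over `bandwidth` neighbours cuts the variance of each estimate roughly by that factor, at the price of correlation between nearby frequencies, which is the loss of independence noted above.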

Weights
- Method for incorporating prior knowledge
- High weights for frequencies where real changes likely, low for where real changes unlikely
- Akin to placing a prior on what frequencies changes will happen on
- Equivalent to linear filter of signal

Speech: Unresolved Issues
- Frequency domain representation of speech different across speakers, e.g. Jessica speaks at a higher pitch (frequency) than I do
- Can we find a transform to fix this?
- After solving this problem, what is the next problem?