Detecting Time Series Motifs Under Uniform Scaling. D. Yankov, E. Keogh, J. Medina, B. Chiu, V. Zordan. Dept. of Computer Science & Eng., University of California, Riverside.
Outline: problem definition; motivation; formalization and approach; experimental evaluation.
Problem definition. Given: a long time series or a data set of shorter sequences. Goal: detect similar patterns that occur at different scales. [Figure: time series plots with labeled patterns A, B, and C]
Motivation: object recognition with time series representation; animation.
Motivation (cont): time series sampled at different rates; physiological time series of different frequencies.
Formalization: similarity under uniform scaling; motifs under uniform scaling.
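A minimal sketch of one common way to define similarity under uniform scaling, offered as an illustration rather than the authors' exact formulation. It assumes the query is stretched by nearest-neighbor index mapping and compared to a prefix of the candidate with a length-normalized Euclidean distance. The names `rescale` and `uniform_scaling_distance`, the `max_factor` bound, and the normalization by the square root of the length are all assumptions introduced here.

```python
import numpy as np

def rescale(q, p):
    """Stretch (or shrink) sequence q to length p by nearest-neighbor mapping.

    Element j (1-based) of the result is q[ceil(j * m / p)], with m = len(q).
    """
    q = np.asarray(q, dtype=float)
    m = len(q)
    j = np.arange(1, p + 1)
    idx = (j * m + p - 1) // p - 1  # integer ceil(j*m/p), shifted to 0-based
    return q[idx]

def uniform_scaling_distance(q, c, max_factor=2.0):
    """Smallest length-normalized Euclidean distance between a rescaled q
    and the equally long prefix of c, over all target lengths p in
    [len(q), max_factor * len(q)]. Returns (distance, best_length).

    max_factor and the sqrt(p) normalization are illustrative assumptions.
    If c is shorter than q, nothing is compared and (inf, None) is returned.
    """
    q = np.asarray(q, dtype=float)
    c = np.asarray(c, dtype=float)
    m = len(q)
    p_max = min(len(c), int(max_factor * m))
    best_d, best_p = np.inf, None
    for p in range(m, p_max + 1):
        d = np.linalg.norm(rescale(q, p) - c[:p]) / np.sqrt(p)
        if d < best_d:
            best_d, best_p = d, p
    return best_d, best_p
```

Under this sketch, two subsequences would form a motif under uniform scaling when this distance falls below a chosen threshold; the threshold itself is not specified here.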
Approach. Observation: only a limited set of scaling factors needs to be checked. Algorithm: for every scaling factor, rescale all query subsequences, then represent all time series as equal-length words over the same alphabet (apply SAX).
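The rescale-then-discretize step can be sketched as follows. This is a simplified illustration of SAX (z-normalization, piecewise aggregate approximation, then Gaussian equiprobable breakpoints), not the authors' implementation. The helper names `rescale` and `sax_word`, the word length, and the alphabet size are assumptions chosen for the example.

```python
import numpy as np
from statistics import NormalDist

def rescale(q, p):
    """Nearest-neighbor rescaling of q to length p (illustrative helper)."""
    q = np.asarray(q, dtype=float)
    j = np.arange(1, p + 1)
    return q[(j * len(q) + p - 1) // p - 1]

def sax_word(series, word_len=4, alphabet_size=4):
    """Convert a series into a SAX word of word_len symbols."""
    x = np.asarray(series, dtype=float)
    std = x.std()
    # z-normalize; a constant series maps to all zeros
    x = (x - x.mean()) / std if std > 0 else np.zeros_like(x)
    # PAA: mean of each of word_len (nearly) equal-sized segments
    paa = np.array([seg.mean() for seg in np.array_split(x, word_len)])
    # Breakpoints that cut the standard normal into equiprobable regions
    breakpoints = [NormalDist().inv_cdf(i / alphabet_size)
                   for i in range(1, alphabet_size)]
    symbols = np.searchsorted(breakpoints, paa)
    return "".join(chr(ord("a") + int(s)) for s in symbols)

# A query and its stretched version map to the same word
query = np.arange(8.0)
print(sax_word(query), sax_word(rescale(query, 12)))
```

Because every sequence is reduced to a word of the same length regardless of its original length, sequences at different scales become directly comparable symbol by symbol.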
Approach (cont). Using PROJECTION (a locality-sensitive hashing approach), filter out all non-matching words, then compute the distance between the remaining (unfiltered) time series pairs.
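A rough sketch of the random projection idea behind this filtering step, assuming the usual scheme: repeatedly pick a random subset of symbol positions, hash each word by the symbols at those positions, and count how often two words land in the same bucket. The function name `projection_collisions` and the parameters `keep`, `iterations`, and `seed` are hypothetical, and the authors' actual procedure may differ in its details.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations

def projection_collisions(words, keep, iterations, seed=0):
    """Count, for every pair of word indices, how many random projections
    place both words in the same hash bucket.

    words:      equal-length strings (for example SAX words)
    keep:       number of symbol positions retained per projection
    iterations: number of random projections to try
    """
    rng = random.Random(seed)
    word_len = len(words[0])
    counts = Counter()
    for _ in range(iterations):
        positions = sorted(rng.sample(range(word_len), keep))
        buckets = defaultdict(list)
        for idx, word in enumerate(words):
            key = "".join(word[p] for p in positions)
            buckets[key].append(idx)
        # Every pair sharing a bucket scores one collision
        for members in buckets.values():
            for pair in combinations(members, 2):
                counts[pair] += 1
    return counts

words = ["abcd", "abcd", "dcba", "abcc"]
counts = projection_collisions(words, keep=2, iterations=10)
# Pairs that collide often are kept as candidates; the cutoff is an assumption
candidates = [pair for pair, n in counts.items() if n >= 5]
```

Only the surviving candidate pairs would then be passed to the exact distance computation on the original time series, which is what makes the filter worthwhile.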
Experimental evaluation: brain activity time series. Valuable in predicting epileptic seizure periods.
Experimental evaluation: effectiveness of the algorithm; efficiency.
Experimental evaluation (cont): projectile shapes. [Figure: Lampasas River Cornertang and Castroville Cornertang] The algorithm detects a rare cornertang segment, an object that has long intrigued anthropologists.
Experimental evaluation (cont): motion-capture motifs. On this sequence, the method detects the same blocking movement performed by the actor. The Euclidean distance fails to detect this motif.
Conclusion. Uniform scaling motifs appear in diverse areas such as animation, object recognition, and medical sequence mining. The presented probabilistic approach for mining such motifs is accurate and extremely effective. The method works in an entirely unsupervised way, requiring only a specified motif length. Possible extensions: multivariate time series and disk-resident modifications.
Poster #28. Thank you!