1 Dot Plots For Time Series Analysis Dragomir Yankov, Eamonn Keogh, Stefano Lonardi Dept. of Computer Science & Eng. University of California Riverside.

Presentation transcript:

1 Dot Plots For Time Series Analysis Dragomir Yankov, Eamonn Keogh, Stefano Lonardi Dept. of Computer Science & Eng. University of California Riverside Ada Waichee Fu Dept. of Computer Science & Eng. The Chinese University of Hong Kong

2 Sequence analysis with dot plots
[Figure: example dot plot; axis sequences "t a g t a" and "a t g t a g"]
- Introduced by Gibbs & McIntyre (1970)
- Observed patterns:
  - Matches (homologies)
  - Reverses
  - Gaps (differences or mutations)
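A minimal sketch of the classic symbolic dot plot, added for illustration and not taken from the slides. It assumes the two axis sequences in the figure are "tagta" and "atgtag"; the function name `dot_plot` is hypothetical. A cell (i, j) is marked when the i-th symbol of the first sequence equals the j-th symbol of the second, so a shared substring shows up as a diagonal run.

```python
def dot_plot(s1, s2):
    """Return a 0/1 matrix with 1 at (i, j) when s1[i] == s2[j]."""
    return [[1 if a == b else 0 for b in s2] for a in s1]


# Assumed example sequences (read off the slide's figure residue).
plot = dot_plot("tagta", "atgtag")

# Print the matrix: '*' marks a match, '.' a mismatch.
for symbol, row in zip("tagta", plot):
    print(symbol, "".join("*" if v else "." for v in row))
```

The substring "tag" occurs at offset 0 in the first sequence and offset 3 in the second, which appears as the diagonal (0, 3), (1, 4), (2, 5).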

3 Dot Plots For Time Series Analysis
Problem statement: How can we meaningfully adapt dot plot (DP) analysis to real-valued data?
The DP method would ideally be:
- Robust to noise
- Invariant to value and time shifts
- Invariant to a certain amount of time warping
- Efficiently computable

4 Related work
Recurrence plots (Eckmann et al. (1987)):
- Provide an intuitive 2D view of multidimensional dynamical systems
- The matrix is computed over the Heaviside function
Problem with recurrence plots: matches are local (point) based rather than subsequence based
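For comparison, a recurrence plot can be sketched as follows. This is an illustrative sketch for a scalar series, not the slides' code: the threshold name `eps`, the use of absolute difference as the distance, and the convention that the Heaviside step equals 1 at zero are all assumptions. Each cell compares two individual points, which is the point-based limitation noted above.

```python
def recurrence_plot(x, eps):
    """R[i][j] = Heaviside(eps - |x[i] - x[j]|), taking Heaviside(0) = 1."""
    n = len(x)
    return [
        [1 if abs(x[i] - x[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


# Points 0 and 2 are within eps of each other; point 1 is far from both.
r = recurrence_plot([0.0, 1.0, 0.1], eps=0.2)
for row in r:
    print(row)
```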

5 The proposed solution
- Reducing the dot plot procedure to the motif finding problem
- Applying the Random Projection algorithm for finding motifs in time series data (Chiu et al 2003)
- Presegmenting the series to achieve time warping invariance
It satisfies the initial requirements of robustness to outliers and invariance to time and value shifts

6 Dot plots and motif finding
Def: match, trivial match, motif
- If D(P, Q) <= R, we say that Q is a match of P
- If D(P, Q) <= R and D(P, Q1) <= R, we say that Q1 is a trivial match of P
- A non-trivial match is a motif
Def: Time series dot plot: a plot that contains a point at position (i, j) iff TS1(i) and TS2(j) represent the same motif

7 The Random Projection algorithm
Based on PROJECTION (Buhler & Tompa 2002)
Algorithm outline:
- Split the TS into subsequences and symbolize them
- Separate the symbolic sequences into equivalence classes using PROJECTION
- Mark as motifs sequences from the same equivalence class

8 Random Projection - symbolization
- Applies PAA (Piecewise Aggregate Approximation)
  [Figure: input TS and its PAA TS]
- Assigns letters to the PAA segments
- Utilizes the Symbolic Aggregate Approximation (SAX) scheme
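The symbolization step can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it z-normalizes with the population standard deviation, requires the series length to be divisible by the number of PAA segments `w`, and fixes a 4-letter alphabet with the Gaussian breakpoints -0.67, 0, 0.67. The function names are hypothetical.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Breakpoints that cut N(0, 1) into 4 equiprobable regions (assumed alphabet size 4).
BREAKPOINTS = [-0.67, 0.0, 0.67]
ALPHABET = "abcd"


def znorm(ts):
    """Shift to zero mean and scale to unit variance (flat series map to zeros)."""
    mu, sigma = mean(ts), pstdev(ts)
    if sigma == 0:
        return [0.0] * len(ts)
    return [(v - mu) / sigma for v in ts]


def paa(ts, w):
    """Piecewise Aggregate Approximation: mean of each of w equal-length segments."""
    if len(ts) % w != 0:
        raise ValueError("sketch assumes len(ts) is divisible by w")
    seg = len(ts) // w
    return [mean(ts[k * seg:(k + 1) * seg]) for k in range(w)]


def sax(ts, w):
    """Map each PAA segment of the normalized series to a letter."""
    return "".join(ALPHABET[bisect_right(BREAKPOINTS, v)] for v in paa(znorm(ts), w))


print(sax([1, 1, 2, 2, 3, 3, 4, 4], 4))
```

In a dot plot setting, `sax` would be applied to every sliding-window subsequence of each series, yielding one short word per window position.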

9 Random Projection - motif finding
- The symbolic representations of the plotted time series are stored in tables
- d random dimensions are masked and the strings are divided into separate bins

10 Random Projection - motif finding
- Updating the dot plot collision matrix
- The update is performed for m iterations
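The masking and collision-matrix update described on the last two slides might look roughly like the following sketch. It is an illustration under assumptions rather than the authors' code: the SAX words are assumed to have equal length, the collision matrix is stored sparsely as a dictionary keyed by (i, j), and `collision_matrix`, `words1`, `words2`, and `seed` are hypothetical names. Only `d` (masked dimensions) and `m` (iterations) come from the slides.

```python
import random
from collections import defaultdict


def collision_matrix(words1, words2, d, m, seed=0):
    """Count, over m iterations, how often word i of series 1 and word j of
    series 2 fall into the same bin after d random positions are masked."""
    rng = random.Random(seed)
    w = len(words1[0])
    counts = defaultdict(int)  # sparse matrix: (i, j) -> number of collisions
    for _ in range(m):
        masked = set(rng.sample(range(w), d))
        keep = [k for k in range(w) if k not in masked]
        # Bin the words of the second series by their unmasked symbols.
        bins = defaultdict(list)
        for j, word in enumerate(words2):
            bins[tuple(word[k] for k in keep)].append(j)
        # Every word of the first series collides with all words in its bin.
        for i, word in enumerate(words1):
            for j in bins.get(tuple(word[k] for k in keep), ()):
                counts[(i, j)] += 1
    return counts


counts = collision_matrix(["abcd", "aaaa"], ["abcd", "abca", "dddd"], d=1, m=10)
print(dict(counts))
```

Identical words collide in every iteration, words differing in one position collide only when that position is masked, and very different words never collide. Plotting the cells with large counts would then give the dot plot; the choice of threshold is not specified on the slides.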

11 Random Projection for streaming
Complexity: space O(|M|), time O(m|M|)
- For practical data sets M is "very sparse"
- For time series data small values of m (order of 10) generate highly descriptive plots
Random Projection as an online algorithm:
- Good time performance
- Updatability

12 Experimental evaluation: Dot plots for anomaly detection
Recurrent data with variable state length
- The anomaly is of the same type: A
- Small time warpings (shifts) are detected: B
- Larger time warpings are omitted: C

13 Experimental evaluation: Dot plots for anomaly detection
Recurrent data with fixed state length

14 Experimental evaluation: Dot plots for pattern detection
Stock market data

15 Experimental evaluation: Dot plots for pattern detection
Audio data

16 Experimental evaluation: Dot plots for pattern detection
Discrete data: for some tasks, obtaining a real-valued representation is beneficial
[Figure panels: MUMmer; Random Projection]

17 Dynamic sliding window
The fixed window does not perform well when:
- The size of the recurrent states varies
- We do not "guess" the size of the states correctly
Solution: use time series segmentation heuristics and a dynamic sliding window

18 Dynamic sliding window
Comparison of the dynamic and fixed sliding windows
- The dynamic sliding window preserves more information about the frequency variability
[Figure panels: Synthetic data set; Tide data set]

19 Conclusion
- This work studies the problem of building dot plots for real-valued time series data
- It demonstrates the problem's equivalence to the motif finding problem
- It introduces an efficient and robust approach for building the dot plots
- The performance of the tool is evaluated empirically on a number of data sets with different characteristics
- Finally, a dynamic sliding window technique is proposed, which improves the quality and descriptiveness of the plots