Dot Plots For Time Series Analysis. Dragomir Yankov, Eamonn Keogh, Stefano Lonardi, Dept. of Computer Science & Eng., University of California Riverside.

Presentation transcript:

1 Dot Plots For Time Series Analysis
Dragomir Yankov, Eamonn Keogh, Stefano Lonardi, Dept. of Computer Science & Eng., University of California Riverside
Ada Waichee Fu, Dept. of Computer Science & Eng., The Chinese University of Hong Kong

2 Sequence analysis with dot plots
Introduced by Gibbs & McIntyre (1970)
[Figure: example dot plot; axis labels "t a g t a" and "atgtag"]
Observed patterns:
- Matches (homologies)
- Reverses
- Gaps (differences or mutations)
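
The classic discrete dot plot can be sketched in a few lines. This is only an illustration of the idea, assuming exact symbol equality as the match criterion; the function names `dot_plot` and `render` are hypothetical, and the two strings are the axis labels from the slide.

```python
def dot_plot(seq_a, seq_b):
    """Return a matrix with 1 at (i, j) wherever seq_a[i] == seq_b[j]."""
    return [[1 if a == b else 0 for b in seq_b] for a in seq_a]


def render(matrix, seq_a, seq_b):
    """Format the matrix as text: seq_b across the top, seq_a down the side."""
    lines = ["  " + " ".join(seq_b)]
    for symbol, row in zip(seq_a, matrix):
        cells = " ".join("*" if cell else "." for cell in row)
        lines.append(symbol + " " + cells)
    return "\n".join(lines)


if __name__ == "__main__":
    plot = dot_plot("tagta", "atgtag")
    print(render(plot, "tagta", "atgtag"))
```

A run of dots along a diagonal corresponds to a match: here "tag" at the start of the first string lines up with "tag" at the end of the second.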

3 Dot Plots For Time Series Analysis
Problem statement: how can we meaningfully adapt dot plot (DP) analysis to real-valued data?
The DP method would ideally be:
- Robust to noise
- Invariant to value and time shifts
- Invariant to a certain amount of time warping
- Efficiently computable

4 Related work
Recurrence plots (Eckmann et al. (1987)):
- Provide an intuitive 2D view of multidimensional dynamical systems
- The matrix is computed with the Heaviside function
Problem with recurrence plots: matches are local (point based) rather than subsequence based
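
For comparison, a recurrence matrix can be sketched as follows. This is a minimal illustration that assumes a scalar series (the slide refers to multidimensional states), an absolute-difference distance, a user-chosen threshold `eps`, and the convention that the Heaviside function equals 1 at zero; none of these choices are stated on the slide.

```python
def heaviside(x):
    """Heaviside step function, using the convention heaviside(0) = 1."""
    return 1 if x >= 0 else 0


def recurrence_matrix(series, eps):
    """R[i][j] = heaviside(eps - |x_i - x_j|) for a scalar series."""
    return [[heaviside(eps - abs(xi - xj)) for xj in series] for xi in series]


if __name__ == "__main__":
    for row in recurrence_matrix([0.0, 0.1, 1.0, 0.05], eps=0.2):
        print(row)
```

Each entry compares two individual points, which is the point-based behaviour the slide identifies as the limitation.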

5 The proposed solution
- Reducing the dot plot procedure to the motif finding problem
- Applying the Random Projection algorithm for finding motifs in time series data (Chiu et al. 2003)
- Presegmenting the series to achieve time warping invariance
The approach satisfies the initial requirements: robustness to outliers and invariance to time and value shifts

6 Dot plots and motif finding
Def: match, trivial match, motif
- If D(P, Q) <= R, we say that Q is a match of P
- If D(P, Q) <= R and D(P, Q1) <= R, we say that Q1 is a trivial match of P
- A non-trivial match is a motif
Def: time series dot plot: a plot that contains a point at position (i, j) iff TS1(i) and TS2(j) represent the same motif
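
The match definition can be made concrete with a brute-force baseline. This sketch is not the Random Projection method: it assumes Euclidean distance for D, a fixed window length `w`, and it does not filter trivial matches. All function names are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between equal-length subsequences (assumed D)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def is_match(p, q, r):
    """Q is a match of P when D(P, Q) <= R."""
    return distance(p, q) <= r


def brute_force_dot_plot(ts1, ts2, w, r):
    """Set of (i, j) where the length-w windows starting at ts1[i] and ts2[j] match."""
    points = set()
    for i in range(len(ts1) - w + 1):
        for j in range(len(ts2) - w + 1):
            if is_match(ts1[i:i + w], ts2[j:j + w], r):
                points.add((i, j))
    return points


if __name__ == "__main__":
    print(brute_force_dot_plot([0, 1, 2, 0, 0], [5, 5, 0, 1, 2], w=3, r=0.5))
```

The nested loop compares every pair of windows, which is the quadratic cost that a hashing-based approach is meant to avoid.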

7 The Random Projection algorithm
Based on PROJECTION (Buhler & Tompa 2002)
Algorithm outline:
- Split the time series (TS) into subsequences and symbolize them
- Separate the symbolic sequences into equivalence classes using PROJECTION
- Mark sequences from the same equivalence class as motifs

8 Random Projection: symbolization
- Applies PAA (Piecewise Aggregate Approximation)
[Figure: input TS and its PAA representation]
- Assigns letters to the PAA segments, using the Symbolic Aggregate Approximation (SAX) scheme
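
The symbolization step can be sketched as below. This is a minimal version under stated assumptions: the subsequence is z-normalized first, its length divides evenly into the number of PAA segments, and the letter breakpoints are equiprobable quantiles of the standard normal distribution, as in the usual SAX formulation. The function names and parameters are illustrative, not taken from the slides.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Shift to zero mean and scale to unit variance (constant series map to zeros)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Mean of each of `segments` equal-length chunks of the series."""
    if len(series) % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    size = len(series) // segments
    return [mean(series[k * size:(k + 1) * size]) for k in range(segments)]


def sax_word(series, segments, alphabet_size):
    """Map a subsequence to a word over the first `alphabet_size` letters (max 26)."""
    # Breakpoints split the standard normal into equiprobable regions.
    cuts = [NormalDist().inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    letters = "abcdefghijklmnopqrstuvwxyz"
    reduced = paa(znormalize(series), segments)
    return "".join(letters[bisect_right(cuts, v)] for v in reduced)


if __name__ == "__main__":
    print(sax_word([0, 0, 0, 0, 10, 10, 10, 10], segments=4, alphabet_size=3))
```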

9 Random Projection: motif finding
- The symbolic representations of the plotted time series are stored in tables
- d random dimensions are masked and the strings are divided into separate bins

10 Random Projection: motif finding
- Updating the dot plot collision matrix
- The update is performed for m iterations
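
The masking and collision-matrix update can be sketched as follows. This is an illustration under assumptions: the inputs are equal-length symbolic words (one per subsequence of each series), the collision matrix is stored sparsely as a dictionary keyed by (i, j), and the function name and `seed` parameter are hypothetical.

```python
import random
from collections import defaultdict


def random_projection_collisions(words1, words2, d, m, seed=0):
    """Count, over m iterations, how often words1[i] and words2[j] share a bin."""
    rng = random.Random(seed)
    length = len(words1[0])
    collisions = defaultdict(int)  # sparse collision matrix
    for _ in range(m):
        # Mask d random positions; the remaining positions form the bin key.
        masked = set(rng.sample(range(length), d))
        keep = [k for k in range(length) if k not in masked]
        buckets = defaultdict(lambda: ([], []))
        for i, word in enumerate(words1):
            buckets[tuple(word[k] for k in keep)][0].append(i)
        for j, word in enumerate(words2):
            buckets[tuple(word[k] for k in keep)][1].append(j)
        # Every cross-series pair in the same bin gets one more collision.
        for rows, cols in buckets.values():
            for i in rows:
                for j in cols:
                    collisions[(i, j)] += 1
    return dict(collisions)


if __name__ == "__main__":
    counts = random_projection_collisions(
        ["abca", "ccba"], ["abca", "abba", "bbbb"], d=1, m=4)
    print(counts)
```

Cells with high counts are candidate motif pairs and would be drawn as points in the dot plot; identical words collide in every iteration, while words that differ in more than d positions never collide.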

11 Random Projection for streaming
Complexity: space O(|M|), time O(m|M|)
- For practical data sets M is "very sparse"
- For time series data, small values of m (on the order of 10) generate highly descriptive plots
Random Projection as an online algorithm:
- Good time performance
- Updatability

12 Experimental evaluation: Dot Plots for anomaly detection
Recurrent data with variable state length
- The anomaly is of the same type: A
- Small time warpings (shifts) are detected: B
- Larger time warpings are omitted: C

13 Experimental evaluation: Dot Plots for anomaly detection
Recurrent data with fixed state length

14 Experimental evaluation: Dot Plots for pattern detection
Stock market data

15 Experimental evaluation: Dot Plots for pattern detection
Audio data

16 Experimental evaluation: Dot Plots for pattern detection
Discrete data: for some tasks, obtaining a real-valued representation is beneficial
[Figure panels: MUMmer, Random Projection]

17 Dynamic sliding window
The fixed window does not perform well when:
- The size of the recurrent states varies
- We do not "guess" the size of the states correctly
Solution: use time series segmentation heuristics and a dynamic sliding window
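
A dynamic window can be sketched as below. The slides do not name the segmentation heuristic, so this example uses upward crossings of the series mean purely as an illustrative stand-in; the function names are hypothetical.

```python
from statistics import mean


def upward_mean_crossings(series):
    """Indices i where the series moves from below its mean to at or above it."""
    mu = mean(series)
    return [i for i in range(1, len(series)) if series[i - 1] < mu <= series[i]]


def dynamic_windows(series):
    """Variable-length windows, each starting at a crossing.

    Points before the first crossing are dropped; the last window runs
    to the end of the series.
    """
    cuts = upward_mean_crossings(series)
    return [series[a:b] for a, b in zip(cuts, cuts[1:] + [len(series)])]


if __name__ == "__main__":
    print(dynamic_windows([0, 1, 0, 0, 1, 1, 0, 1]))
```

Because the windows differ in length, each one would still need to be reduced to a fixed number of segments (for example with PAA) before symbolization.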

18 Dynamic sliding window
Comparison of the dynamic and fixed sliding windows
The dynamic sliding window preserves more information about the frequency variability
[Figure panels: synthetic data set, tide data set]

19 Conclusion
- This work studies the problem of building dot plots for real-valued time series data
- It demonstrates that the problem is equivalent to the motif finding problem
- It introduces an efficient and robust approach for building the dot plots
- The performance of the tool is evaluated empirically on a number of data sets with different characteristics
- Finally, a dynamic sliding window technique is proposed, which improves the quality and descriptiveness of the plots

