Enumeration of Time Series Motifs of All Lengths Abdullah Mueen Department of Computer Science University of New Mexico
Example: Repeating Pattern (Motif) [Figure: time series containing a repeating pattern] Chiu et al. KDD 2003
Motivation: Enumerating Motifs Find the most similar pairs of time series at every length. Brown A E X et al. PNAS 2013;110:791-796
Goals: Enumerating Motifs
Outline 1. Bounding correlation 2. Enumerating motifs of all lengths (Intuitive Example; Experimental Results; Case Study: Activity Recognition) 3. Conclusion
Pearson’s Correlation Coefficient
Correlation Advantages: 1. Scale and shift invariant 2. Computable in a linear scan Disadvantages: 1. Does not consider warping 2. Is not a metric
Relationship with Euclidean Distance
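The relationship named on this slide can be checked numerically. The sketch below assumes the standard identity for z-normalized series of length m, d^2 = 2m(1 - corr), with z-normalization using the population standard deviation; the function names and the sample values are illustrative and not taken from the slides.

```python
import numpy as np

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length series."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

def znorm(x):
    """Z-normalize with the population standard deviation (ddof=0)."""
    return (x - x.mean()) / x.std()

def znorm_dist(x, y):
    """Euclidean distance between the z-normalized series."""
    return float(np.linalg.norm(znorm(x) - znorm(y)))

# Hypothetical sample values, chosen only for illustration.
x = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
y = np.array([2.0, 2.5, 4.0, 6.0, 3.0])
m = len(x)

c = pearson(x, y)
d = znorm_dist(x, y)

# The two sides of the identity should agree up to rounding error.
print(d ** 2, 2 * m * (1 - c))

# Scale and shift invariance: a scaled, shifted copy has correlation 1
# and normalized distance 0 (up to rounding).
print(pearson(x, 3 * x + 7), znorm_dist(x, 3 * x + 7))
```

Because the two quantities are monotonically related, ranking pairs by normalized distance (ascending) gives the same order as ranking them by correlation (descending).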
Bounding Euclidean Distance [Figure: Values Changed; panels: Without Normalization, With Normalization]
Intuition [Figure: Normalized series; Append 10 and re-normalize; panels: Length 5, Length 4, Length 5]
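The "append 10 and re-normalize" step can be reproduced in a few lines. This is a minimal sketch: the length-4 starting series is hypothetical (the values plotted on the slide are not recoverable), and only the appended value 10 comes from the slide.

```python
import numpy as np

def znorm(x):
    """Z-normalize with the population standard deviation (ddof=0)."""
    return (x - x.mean()) / x.std()

short = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical length-4 series
longer = np.append(short, 10.0)         # append 10, giving length 5

z_short = znorm(short)
z_long = znorm(longer)

# Every one of the original four points moves after re-normalization,
# because both the mean and the standard deviation changed.
shift = np.abs(z_long[:4] - z_short)
print(z_short)
print(z_long)
print(shift)
```

This is why a normalized distance at length m cannot simply be extended by one term to get the distance at length m+1: all earlier terms change as well, which is what makes a bound useful.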
Bounding Euclidean Distance
Bounding Euclidean Distance [Figure: Normalized Distance vs. Pairs in ascending order of distances]
Intuition [Figure: time series plot]
Candidate pairs (first location, second location, distance), sorted by distance: 145, 5410, 1.26; 8345, 4211, 2.63; 1655, 9461, 2.96; 6531, 2501, 3.17; 851, 1440, 3.73; 2512, 3110, 3.98; 1685, 9260, 4.57
Same pairs with updated distances: 145, 5410, 1.79; 8345, 4211, 1.63; 1655, 9461, 3.61; 6531, 2501, 2.71; 851, 1440, 3.83; 2512, 3110, 4.18; 1685, 9260, 4.27
Re-sorted by distance: 8345, 4211, 1.63; 145, 5410, 1.79; 6531, 2501, 2.71; 1655, 9461, 3.61; 851, 1440, 3.83; 2512, 3110, 4.18; 1685, 9260, 4.27
Intuition [Figure: time series plot]
Candidate pairs (first location, second location, distance), sorted by distance: 8345, 4211, 1.63; 145, 5410, 1.79; 6531, 2501, 2.71; 1655, 9461, 3.61; 851, 1440, 3.83
Same pairs with updated distances: 8345, 4211, 1.23; 145, 5410, 1.98; 6531, 2501, 1.71; 1655, 9461, 3.68; 851, 1440, 3.61
Re-sorted by distance: 8345, 4211, 1.23; 6531, 2501, 1.71; 145, 5410, 1.98; 851, 1440, 3.61; 1655, 9461, 3.68
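The two intuition slides show a list of candidate pairs whose distances are recomputed at the next length and then re-sorted. A minimal sketch of that update step follows. It covers only the recompute-and-sort part; the rule that decides how many candidates to keep, and when the list must be rebuilt, depends on the distance bound and is not reproduced here. The helper names (`znorm_dist`, `rescore`) and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def znorm_dist(x, y):
    """Euclidean distance between z-normalized copies of x and y."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    return float(np.linalg.norm(x - y))

def rescore(series, pairs, length):
    """Recompute candidate-pair distances at `length`, sorted ascending."""
    scored = [(i, j, znorm_dist(series[i:i + length], series[j:j + length]))
              for i, j in pairs]
    return sorted(scored, key=lambda t: t[2])

# Hypothetical data: white noise with one pattern planted at two locations.
rng = np.random.default_rng(0)
series = rng.standard_normal(500)
series[300:340] = 2.0 * series[50:90] + 3.0  # scaled, shifted copy

# Hypothetical candidate list of (first location, second location).
candidates = [(10, 200), (50, 300), (120, 400)]

ranked_30 = rescore(series, candidates, 30)
# Carry the same candidates forward and re-rank them at the next length.
ranked_31 = rescore(series, [(i, j) for i, j, _ in ranked_30], 31)

for row in ranked_31:
    print(row)
```

Rescoring a short candidate list costs far less than recomputing all pairwise distances at every length, which is the saving the slides illustrate.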
Sanity Check [Figure: time series with White Noise regions and motifs marked (1) to (4); panels: Length: 87, Length: 105, Length: 299] http://www.cs.unm.edu/~mueen/Projects/MOEN/index.html
Experimental Results: Scalability [Figure: Execution Time in Seconds vs. Data Length (n) and vs. Range of Lengths (maxLen-minLen+1); legend: Smart Brute Force, EEG, EOG, Random Walk, Iterative MK]
Activity Recognition [Figure: sensor channels Hip, Hand, Arm, Leg (x, y, z) with steps A to F marked; panel annotations: 0/2, 2/4, 1/4, 1/2, 0/3, 0/4]
Step: Action
A: Side steps with no arm movement
B: Rock steps sideways without arm movement
C: Rock steps sideways with arm movement
D: Side steps with arm movement
E: Side steps with arms up in the air
F: Standing still with head bopping
H. Pohl et al. SMC 2010
Thank You
Backup Slides
Experimental Results [Figure: Execution Time in Seconds vs. K and vs. c; series: n=10k, n=20k, n=40k, n=80k, n=160k]
Sample Output [Figure: discovered motifs; panels: Length: 186, Length: 187, Length: 255, Length: 373] http://www.cs.unm.edu/~mueen/Projects/MOEN/index.html
Time Series Join [Figure: Best Match Lengths; Best Match Correlation and Length-adjusted Correlation vs. Lengths]
Motif Covering [Figure: Covering Motifs and Locations of the First Occurrences vs. Length]