Download presentation
Presentation is loading. Please wait.
Published byAdelia Nash Modified over 8 years ago
1
Abdullah Mueen Eamonn Keogh University of California, Riverside
2
Repeated pattern in a time series Utility: – Activity/Event discovery – Summarization – Classification Application – Data Center Chiller Management (Patnaik, et al. KDD 09) – Designing Smart Cane for Elders (Vahdatpour, et al. IJCAI 09) 0 2000400060008000 0 10 20 30
3
Streaming time series Sliding window of the recent history – What minute long trace repeated in the last hour?
4
Discovery 1002003004005006007008009001000 -8000 -7500 -7000... 1 2 3 4 5 6 7 8. 873 The most similar pair of non-overlapping subsequences Maintenance 1 2 3 4 5 6 7 8. 873 874 time:1001 1 2 3 4 5 6 7 8. 873 874 875 time:1002 1 2 3 4 5 6 7 8. 873 874 875 876 time:1003 1 2 3 4 5 6 7 8. 873 874 875 876 877 time:1004 1 2 3 4 5 6 7 8. 873 874 875 876 877 878 time:1005 Update motif pair after every time tick time
5
A subsequence is a high dimensional point – The dynamic closest pair of points problem Closest pair may change upon every update Naïve approach: Do quadratic comparisons. 4 5 2 7 3 6 8 9 4 5 2 7 3 6 8 1 4 5 2 7 3 6 8 1 Deletion Insertion
6
Goal: Algorithm with Linear update time Previous method for dynamic closest pair (Eppstein,00) – A matrix of all-pair distances is maintained O( w 2 ) space required – Quad-tree is used to update the matrix Maintain a set of neighbors and reverse neighbors for all points We do it in O( ) space
7
Methods Evaluation Extensions Case Studies Conclusion
8
Smallest nearest neighbor → Closest pair Upon insertion – Find the nearest neighbor; Needs O( w ) comparisons. Upon deletion – Find the next NN of all the reverse NN 4 5 2 7 3 6 8 1 4 5 2 7 3 6 8 1 Insertion 9 4 5 2 7 3 6 8 1 Deletion 9
9
1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 FIFO Sliding window N-lists Neighbors in order of distances 8 8 7 7 2 2 1 1 8 8 1 1 2 2 4 4 5 5 1 1 1 1 7 7 3 3 3 3 6 6 7 7 7 7 6 6 7 7 5 5 2 2 4 4 5 5 5 5 1 1 4 4 8 8 3 3 7 7 5 5 7 7 2 2 4 4 8 8 4 4 2 2 5 5 2 2 1 1 3 3 6 6 1 1 3 3 8 8 1 1 8 8 3 3 8 8 3 3 5 5 6 6 6 6 4 4 6 6 4 4 6 6 R-lists Points that should be updated Total number of nodes in R-lists is w 5 5 1 1 3 3 2 2 4 4 8 8 6 6 7 7 4 5 2 7 3 6 8 1 4 4 7 7 Still O( w 2 ) space (ptr, distance)
10
While inserting Updating NN of old points is not necessary A point can be removed from the neighbor list if it violates the temporal order Average length O( )
11
4 5 2 7 3 6 8 1 1 1 1 1 2 2 1 1 3 3 1 1 2 2 5 5 2 2 8 8 3 3 2 2 4 4 3 3 5 5 3 3 6 6 4 4 7 7 5 5 6 6 4 4 7 7 6 6 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 2 2 3 3 3 3 2 2 3 3 2 2 6 6 3 3 7 7 9 9 5 5 5 5 4 4 4 4 6 6 8 8 5 5 7 7 6 6 6 6 8 8 4 4 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 1 1 4 5 2 7 3 6 8 1 9 4 5 2 7 3 6 8 9 10 3 3 3 3 4 4 3 3 6 6 6 6 5 5 7 7 9 9 5 5 4 4 4 4 7 7 8 8 5 5 6 6 4 4 6 6 8 8 5 5 7 7 8 8 9 9 3 3 4 4 5 5 6 6 7 7 8 8 9 9 2 2 1 1
12
0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 Fast Pair Insect EEG EOG RW Average Update Time (Sec) 00.511.522.533.54 x 10 4 Window Size (w) 02004006008001000 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 Insect EEG EOG RW Fast Pair Average Update Time (Sec) Motif Length (m) 00.511.522.533.54 x 10 4 10 20 30 40 50 60 70 80 90 100 110 Insect EEG EOG RW Average Length of N-list Window Size (w) 02004006008001000 0 50 100 150 200 250 300 350 Insect EEG EOG RW Motif Length (m) Average Length of N-list
13
Methods Evaluation Extensions Case Studies Conclusion
14
Multidimensional Motif Maintain motif in Multiple Synchronous time series Points in one series can have neighbors in others Arbitrary Data Rate Load shedding: Skip points if can’t handle 0.40.50.60.70.80.91.0 0.4 0.5 0.6 0.7 0.8 0.9 1.0 Fraction of Data Used Fraction of Motifs Discovered EOG Random Walk EEG Insect FIFO for each streams
15
Replace all the occurrences of a motif by a pointer to that motif. For signals with regular repetitions – Higher compression rate with less error. Dictionary
16
Check if a robot closed a loop Convert the stream of video frames to stream of color histograms. Most similar color histograms are good candidates for loop detection. 040080012001600 0 4 8 12 16 0 40080012001600 0 4 8 12 16 Start Loop Closure Detected
17
First attempt to maintain time series motif online – Maintains minutes long repetition in hours long sliding window Linear update time with less space cost than quad-tree based method – O( ) Vs O(w 2 ) Faster than general dynamic closest pair solution
18
Code and Data: http://www.cs.ucr.edu/~mueen/OnlineMotif
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.