1
Quick-Motif: An Efficient and Scalable Framework for Exact Motif Discovery
Yuhong Li
Department of Computer and Information Science
University of Macau, Macau
2
Quick-Motif: What is a Motif?
A motif is the most similar pair of subsequences in a time series.
Applications:
Motif discovery is a core subroutine for activity discovery, e.g., elder care, surveillance, and sports training.
Clustering enumerated motifs is more meaningful than clustering all the subsequences of a long time series.
3
Quick-Motif: Formal Definition
[Figure: timeline of a time series s ending at position n−1, with a subsequence spanning positions i to i+ℓ−1.]
Exact Motif Discovery
Input: a time series s and a target motif length ℓ.
Output: the most similar subsequence pair in terms of normalized Euclidean distance.
Avoid trivial matches → the two subsequences must be non-overlapping, because adjacent subsequences are naturally expected to be similar to each other.
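The definition above can be illustrated with a minimal sketch. It assumes that "normalized" means z-normalization (zero mean, unit standard deviation per subsequence), which the slide does not spell out; the function names are hypothetical and chosen only for this example.

```python
import math

def znormalize(x):
    """Rescale x to zero mean and unit (population) standard deviation."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0.0:
        return [0.0] * n  # assumption: a constant window maps to all zeros
    return [(v - mean) / std for v in x]

def normalized_distance(s, i, j, length):
    """Euclidean distance between the z-normalized windows starting at i and j."""
    a = znormalize(s[i:i + length])
    b = znormalize(s[j:j + length])
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def is_trivial_match(i, j, length):
    """Two windows overlap when their start positions are closer than length."""
    return abs(i - j) < length
```

Under this assumption, two windows that differ only by scale and offset have distance close to zero, and a pair is a motif candidate only when `is_trivial_match` is false.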
4
Quick-Motif: Naïve Solution
Slide a window of size ℓ with step size 1 over the time series to enumerate all subsequences of length ℓ.
Normalize each subsequence, then test all subsequence pairs.
Motif → the most similar subsequence pair.
Time complexity is O(n² ℓ).
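A brute-force version of this procedure might look like the sketch below. It again assumes z-normalization and skips overlapping pairs by starting the inner loop at `i + length`; `naive_motif` is a hypothetical name, not the authors' code.

```python
import math

def znorm(x):
    """Zero mean, unit population standard deviation (constant windows map to zeros)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [0.0] * n if std == 0.0 else [(v - mean) / std for v in x]

def naive_motif(s, length):
    """Return (distance, i, j) for the closest non-overlapping window pair."""
    # Sliding window of size `length`, step size 1, normalized once per window.
    windows = [znorm(s[i:i + length]) for i in range(len(s) - length + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(windows)):
        # j >= i + length guarantees the two windows do not overlap.
        for j in range(i + length, len(windows)):
            d = math.sqrt(sum((p - q) ** 2
                              for p, q in zip(windows[i], windows[j])))
            if d < best[0]:
                best = (d, i, j)
    return best
```

The two nested loops visit on the order of n² pairs and each distance costs ℓ operations, which is where the O(n² ℓ) bound comes from.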
5
Quick-Motif: Existing Solutions
Reference-based Index (MK) [Mueen & Keogh, SDM 2009]
Good: prunes unpromising pairs in batches.
Bad: each distance computation takes O(ℓ) time.
Smart Brute Force (SBF) [Mueen, ICDM 2013]
Good: each distance computation takes O(1) time.
Bad: examines all subsequence pairs.
[Figure: MK with O(ℓ) distance computations and SBF with O(1) distance computations, with a question mark between them.]
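The batch pruning attributed to MK can be sketched with the triangle inequality: for a reference point r, |d(r, a) − d(r, b)| is a lower bound on d(a, b). The code below is a simplified illustration on generic points, not the published MK algorithm; it uses a single reference (the first point, an arbitrary choice) and ignores normalization and trivial matches.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def reference_pruned_closest_pair(points):
    """Closest pair (distance, i, j) using reference-distance ordering to prune."""
    ref = points[0]  # assumption: any point can serve as the reference
    order = sorted(range(len(points)), key=lambda i: euclid(ref, points[i]))
    ref_dist = [euclid(ref, points[i]) for i in order]
    best = (math.inf, -1, -1)
    for offset in range(1, len(order)):
        survived = False
        for k in range(len(order) - offset):
            # Lower bound from the triangle inequality on reference distances.
            if ref_dist[k + offset] - ref_dist[k] < best[0]:
                survived = True
                i, j = order[k], order[k + offset]
                d = euclid(points[i], points[j])  # full O(length) computation
                if d < best[0]:
                    best = (d, min(i, j), max(i, j))
        if not survived:
            break  # gaps only grow with the offset, so all remaining pairs are pruned
    return best
```

Because the reference distances are sorted, the gap for a fixed starting position never shrinks as the offset grows, so one offset with no surviving pair rules out every larger offset at once.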
6
Quick-Motif: Fast Distance Computation
Incremental distance computation.
[Figure: subsequences s_0, ..., s_4 paired with subsequences s_20, ..., s_24.]
9 subsequence pairs → O(ℓ) time each.
16 subsequence pairs → O(1) time each.
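One standard way to obtain O(1) distance updates is to slide the dot product along a diagonal of the pair grid and combine it with precomputed window means and standard deviations. The sketch below uses that identity under the z-normalization assumption; it is not necessarily the exact scheme used in Quick-Motif, and it assumes no window is constant (nonzero standard deviation).

```python
import math

def window_stats(s, length):
    """Mean and population std of every window, from prefix sums in O(n)."""
    ps, ps2 = [0.0], [0.0]
    for v in s:
        ps.append(ps[-1] + v)
        ps2.append(ps2[-1] + v * v)
    means, stds = [], []
    for i in range(len(s) - length + 1):
        m = (ps[i + length] - ps[i]) / length
        var = (ps2[i + length] - ps2[i]) / length - m * m
        means.append(m)
        stds.append(math.sqrt(max(var, 0.0)))
    return means, stds

def diagonal_distances(s, length, i, j, count):
    """Distances for the pairs (i, j), (i+1, j+1), ..., (i+count-1, j+count-1)."""
    means, stds = window_stats(s, length)
    # Only the first pair on the diagonal needs an O(length) dot product.
    dot = sum(s[i + k] * s[j + k] for k in range(length))
    out = []
    for step in range(count):
        a, b = i + step, j + step
        corr = (dot - length * means[a] * means[b]) / (length * stds[a] * stds[b])
        out.append(math.sqrt(max(2.0 * length * (1.0 - corr), 0.0)))
        if step + 1 < count:
            # O(1) update: drop the oldest product, add the newest one.
            dot += s[a + length] * s[b + length] - s[a] * s[b]
    return out
```

In a 5 × 5 block of pairs, the 9 pairs in the first row and first column start a diagonal and need the full dot product, while the remaining 16 follow from the O(1) update, which matches the counts on the slide.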
7
Quick-Motif: Pruning of Subsequence Pairs
Group every w consecutive subsequences as a PAA MBR.
[Figure: PAA feature space with w = 5, showing several MBRs and the minDist between two of them.]
The minimum distance between two PAA MBRs → distance lower bounds (LBs).
If the distance LB is smaller than bsf (the best-so-far distance) → further refinement.
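The lower bound can be sketched as follows, using the well-known PAA property that the Euclidean distance between PAA vectors, scaled by sqrt(ℓ/d) for d equal segments, never exceeds the true distance. The helper names are hypothetical, and the sketch assumes the subsequence length is divisible by the number of PAA segments.

```python
import math

def paa(x, dims):
    """Piecewise Aggregate Approximation: the mean of each of `dims` equal segments."""
    seg = len(x) // dims  # assumption: len(x) is divisible by dims
    return [sum(x[k * seg:(k + 1) * seg]) / seg for k in range(dims)]

def mbr(points):
    """Minimum bounding rectangle (per-dimension low, high) of PAA points."""
    lo = [min(c) for c in zip(*points)]
    hi = [max(c) for c in zip(*points)]
    return lo, hi

def min_dist(a, b):
    """Smallest Euclidean distance between any point of MBR a and any point of MBR b."""
    (alo, ahi), (blo, bhi) = a, b
    total = 0.0
    for k in range(len(alo)):
        gap = max(0.0, alo[k] - bhi[k], blo[k] - ahi[k])
        total += gap * gap
    return math.sqrt(total)

def distance_lower_bound(a, b, length, dims):
    """Lower bound on the distance of every subsequence pair drawn from a and b."""
    return math.sqrt(length / dims) * min_dist(a, b)
```

A pair of MBRs whose lower bound is at least bsf can be discarded without computing any of its subsequence distances; otherwise its pairs go on to refinement.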
8
Quick-Motif: Filter-and-Refinement
Naïve solution: check the distance LBs for all w-MBR pairs.
The time complexity is O((n/w)² d), where d is the PAA dimensionality.
How to efficiently find the surviving w-MBR pairs?
Enable batch pruning.
Discover the true motif as soon as possible to improve the pruning ability.
9
Quick-Motif: Filter-and-Refinement
Enable batch pruning → hierarchical structure.
It offers reasonable grouping quality, and thus good pruning ability.
It can be constructed very efficiently.
[Figure: w-MBRs 0 to 8 in the PAA feature space are ordered into a Hilbert curve sort list, grouped into Level 1 nodes, and placed under a Level 2 root node; minDist is marked between MBRs.]
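A bottom-up construction of such a hierarchy could be sketched as below. This is an illustration under several assumptions that are not in the slides: the PAA space is two-dimensional, coordinates are already scaled into [0, 1), the Hilbert index is computed on a 16 × 16 grid, and every `fanout` consecutive entries of the sorted list form one parent node.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve on an n x n grid (n a power of 2)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate/flip the quadrant so the sub-curve is oriented correctly
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def union(rects):
    """MBR enclosing a list of (low, high) rectangles."""
    dims = len(rects[0][0])
    lo = [min(r[0][k] for r in rects) for k in range(dims)]
    hi = [max(r[1][k] for r in rects) for k in range(dims)]
    return lo, hi

def build_levels(leaves, fanout=3, grid=16):
    """Sort leaf MBRs by the Hilbert index of their centers, then group bottom-up."""
    def key(r):
        cx = min(int((r[0][0] + r[1][0]) / 2 * grid), grid - 1)
        cy = min(int((r[0][1] + r[1][1]) / 2 * grid), grid - 1)
        return hilbert_index(grid, cx, cy)
    level = sorted(leaves, key=key)
    levels = [level]
    while len(level) > 1:
        level = [union(level[i:i + fanout]) for i in range(0, len(level), fanout)]
        levels.append(level)
    return levels
```

With nine leaf MBRs and a fanout of 3, `build_levels` returns three levels of sizes 9, 3, and 1. Construction costs one sort plus a linear pass per level, and neighbours on the Hilbert curve tend to be close in space, which is what keeps the parent MBRs reasonably tight.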
10
Quick-Motif: Filter-and-Refinement
Discover the true motif as soon as possible → locality-based search strategy.
[Figure: the hierarchy (Level 2 root, Level 1 nodes, leaf nodes along the Hilbert curve sort list), with examples of bad locality and good locality.]
Locality-based search vs. best-first search:
                   Locality-based     Best-first
Surviving pairs    0.1256M            0.1249M
Heap size          N/A                2.78M
# pushes           11.73M (queue)     6.75M (heap)
Resp. time         1.56 s             6.32 s
11
Quick-Motif: Experimental Evaluation
Programming language: C++
Machine: Ubuntu 12.04, 4GB RAM
Datasets:
RW: randomly generated.
EEG: reflects the activity of neurons, length
ECG: the Koski ECG, length
EPG: sequence that traces insect behaviour, length
TAO: sea surface temperatures, length
12
Quick-Motif: Performance Evaluation
[Figure panels:]
(a) Effect of ℓ on ECG
(b) Effect of ℓ on EEG
(c) Effect of ℓ on EPG
(d) Effect of ℓ on TAO
13
Thanks
Q & A