1
On the Effectiveness of Distance Measures for Similarity Search in Multi-Variate Sensory Data
June 8th, 2017. International Conference on Multimedia Retrieval (ICMR 2017), Bucharest, Romania.
Yash Garg, CIDSE, Arizona State University, Tempe, USA 85281
Silvestro Roberto Poccia, University of Turin, Corso Svizzera 185, Turin, Italy 10149
2
What is Sensory Data?
Multiple attributes are tracked simultaneously.
This generates multi-variate time series.
The variates are correlated.
The data contains background information (metadata).
Jun 8, 2017 EmitLAB, Arizona State University
3
Applications of Sensory Data
Gesture recognition, health monitoring, green energy buildings, aviation, automobiles, surveillance, wearables. Image Source (Left): CMU graphics lab motion capture database, 2015. Image Source (Right):
4
What is the problem?
Given information-rich sensory data, how do we find similar instances in the data corpus?
5
Existing measures
Non-elastic distances: Euclidean, Cosine
Elastic distance: Dynamic Time Warping (DTW) [1]
Other representations: Symbolic Aggregate Approximation (SAX) [2]
[1] Donald J. Berndt and James Clifford. Using dynamic time warping to find patterns in time series. In KDD Workshop, volume 10, pages 359–370. Seattle, WA, 1994.
[2] Jessica Lin, Eamonn Keogh, Stefano Lonardi, and Bill Chiu. A symbolic representation of time series, with implications for streaming algorithms. In Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pages 2–11. ACM, 2003.
6
Why might non-elastic distances not work?
The two time series must be of the same length.
Patterns must be synchronized.
Patterns must have the same speed.
In general, these limitations apply to any non-elastic distance.
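The limitations above can be illustrated with a minimal sketch. The toy series `early`, `delayed`, and `flat` are hypothetical and not taken from the presentation; the function is a plain Euclidean distance written for illustration.

```python
import numpy as np

def euclidean(x, y):
    """Point-by-point (non-elastic) distance; only defined for equal lengths."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("a non-elastic distance needs series of the same length")
    return float(np.sqrt(np.sum((x - y) ** 2)))

# Hypothetical toy data: the same bump, once early and once delayed by two steps.
early = [0, 1, 2, 1, 0, 0, 0]
delayed = [0, 0, 0, 1, 2, 1, 0]
flat = [0, 0, 0, 0, 0, 0, 0]

d_shift = euclidean(early, delayed)  # same pattern, not synchronized
d_flat = euclidean(early, flat)      # no pattern at all
```

In this toy case the delayed copy of the pattern (distance about 3.16) ends up farther away than a flat line (about 2.45), because the comparison is made strictly time step by time step.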
7
What might work? An elastic distance measure that:
Does not require time series of the same length.
Can account for non-synchronized patterns.
Can account for differences in the speed of patterns.
Solution: Dynamic Time Warping.
8
Dynamic Time Warping (DTW)
[Figure: DTW alignment of two time series. Labels: Aligned, Difference in speed, Delay.] Image Source:
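For reference, a minimal sketch of the textbook dynamic-programming formulation of DTW for univariate series. It is not necessarily the implementation used by the authors; the absolute difference as local cost and the toy series are assumptions made for illustration.

```python
import numpy as np

def dtw(x, y):
    """Classic O(len(x) * len(y)) DTW with absolute difference as local cost."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # cost[i, j] = cheapest alignment of the first i points of x
    # with the first j points of y.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # x advances, y repeats
                                 cost[i, j - 1],      # y advances, x repeats
                                 cost[i - 1, j - 1])  # both advance
    return float(cost[n, m])

# Hypothetical toy data: the same bump, delayed or played at a different speed.
early = [0, 1, 2, 1, 0, 0, 0]
delayed = [0, 0, 0, 1, 2, 1, 0]
slow = [0, 1, 1, 2, 2, 1, 1, 0]

d_delay = dtw(early, delayed)  # delay is absorbed by the warping path
d_speed = dtw(early, slow)     # different length and speed are allowed
```

Both distances are zero for this toy data: the warping path repeats points of one series until the bumps line up, which is what makes the measure elastic.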
9
Is naïve-DTW sufficient for today’s data?
No. Advances in technology have increased the dimensionality of time series (multi-variate time series).
Ex. 1: Consider this room. How many parameters can we track at a given time? The number of people, temperature, humidity, pressure, and more.
Ex. 2: A human gesture, which involves hand, leg, head, waist, and wrist movements, etc.
10
Extensions to DTW
Vectorized-DTW: $DTW(\mathbb{X}, \mathbb{Y}) = DTW(\vec{X}, \vec{Y})$
Independent-DTW: $DTW(\mathbb{X}, \mathbb{Y}) = \sum_{i=1}^{N} DTW(\mathbb{X}_i, \mathbb{Y}_i)$
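The two extensions can be sketched as follows, assuming a multi-variate series is stored as an array of shape (time steps, variates). The Euclidean norm as the local cost between time-step vectors and the names `vectorized_dtw` and `independent_dtw` are assumptions for illustration, not details taken from the slides.

```python
import numpy as np

def dtw(x, y):
    """DTW between sequences of shape (T, d); local cost is the Euclidean norm."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def vectorized_dtw(X, Y):
    """One warping path shared by all variates: each time step is a vector."""
    return dtw(X, Y)

def independent_dtw(X, Y):
    """Sum of per-variate DTW distances: each variate gets its own warping path."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return sum(dtw(X[:, i], Y[:, i]) for i in range(X.shape[1]))

# Hypothetical toy data: the two variates are shifted in opposite directions.
a = [0, 1, 2, 1, 0, 0, 0]
b = [0, 0, 0, 1, 2, 1, 0]
X = np.column_stack([a, b])
Y = np.column_stack([b, a])

d_vec = vectorized_dtw(X, Y)
d_ind = independent_dtw(X, Y)
```

With this toy data the independent variant returns zero because each variate is warped separately, while the vectorized variant returns a positive distance because a single warping path cannot undo two opposite shifts at once.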
11
Are these extensions sufficient?
Independent-DTW ignores the relationships between the variates. Vectorized-DTW assumes that all variates are synchronized.
12
How to account for relationships?
Weighted Dynamic Time Warping (W-DTW): $DTW(\mathbb{X}, \mathbb{Y}) = \sum_{i=1}^{N} w_i \cdot DTW(\mathbb{X}_i, \mathbb{Y}_i)$
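A minimal sketch of the weighted sum above, assuming series are stored as arrays of shape (time steps, variates). The weight values and toy data are hypothetical; how the weights are actually obtained is the subject of the following slides.

```python
import numpy as np

def dtw(x, y):
    """Univariate DTW with absolute difference as local cost."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def weighted_dtw(X, Y, w):
    """Sum over variates i of w[i] * DTW(X[:, i], Y[:, i])."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if len(w) != X.shape[1] or X.shape[1] != Y.shape[1]:
        raise ValueError("need one weight per variate and matching variate counts")
    return sum(w[i] * dtw(X[:, i], Y[:, i]) for i in range(X.shape[1]))

# Hypothetical toy data: variate 0 is a delayed bump, variate 1 differs by a constant.
a = [0, 1, 2, 1, 0, 0, 0]
b = [0, 0, 0, 1, 2, 1, 0]
X = np.column_stack([a, np.zeros(7)])
Y = np.column_stack([b, np.ones(7)])

d_half = weighted_dtw(X, Y, [1.0, 0.5])    # second variate counts half
d_ignore = weighted_dtw(X, Y, [1.0, 0.0])  # second variate is ignored
```

Setting every weight to 1 reduces this to the Independent-DTW sum.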
13
How to extract the weights?
Use the information associated with the time series: its metadata.
14
Data-driven weights contd.
Weights represent the important patterns contained in the time series.
However, these weights are susceptible to noise.
The weights might not represent the entire data corpus.
Solution: use metadata to extract the weights.
15
What is Metadata?
Metadata is additional information associated with a multi-variate time series that defines the relationships between the variates [1].
Mathematically, metadata is a matrix $R$ of size $N \times N$, where $N$ is the number of variates in the multi-variate time series.
Ex. A distance matrix between sensors, or a hop matrix representing the connectivity between sensors in a graph.
[1] Xiaolan Wang, K. Selcuk Candan, and Maria Luisa Sapino. Leveraging metadata for identifying local, robust multi-variate temporal (RMT) features. In Data Engineering (ICDE), 2014 IEEE 30th International Conference on, pages 388–399. IEEE, 2014.
16
Metadata-driven Weights contd.
[Figure: decomposition into the factors U * S * V.] Weights: $w = U * S$
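One possible reading of this slide, sketched under explicit assumptions: the factors U, S, V are taken to come from a singular value decomposition of the metadata matrix, which the slide does not state. The slide also does not say how the matrix U * S becomes one weight per variate, so the row-norm reduction and normalization below are purely illustrative choices, and the 3-sensor distance matrix is hypothetical.

```python
import numpy as np

# Hypothetical metadata: pairwise distances between three sensors (N = 3).
R = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 3.0],
              [4.0, 3.0, 0.0]])

# Assumption: U, S, V are the SVD factors, so R == U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(R)

# U * S: column k of U scaled by the k-th singular value
# (broadcasting makes this equal to U @ np.diag(s)).
US = U * s

# Assumption (not on the slide): collapse each row of U * S to a single
# number per variate with the Euclidean norm, then normalize to sum to 1.
w = np.linalg.norm(US, axis=1)
w = w / w.sum()
```

The resulting vector `w` has one entry per variate and could be passed as the weights of a weighted DTW; other reductions of U * S are equally plausible.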
17
Extracting Projected Variates
Use metadata to cluster related variates. Leverage the structural dependence between variates.
18
Extracting Projected Variates contd.
[Figure: Metadata = U * S * V]
19
Extracting Projected Variates contd.
[Figure: a multiplication (*) maps the multi-variate time series to the projected multi-variate time series.]
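A rough sketch of what such a projection could look like, assuming the series is multiplied by the leading columns of U from a singular value decomposition of the metadata. The slides do not specify the exact factor or how many columns are kept, so the matrix choice, the value of `k`, and the random toy series are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multi-variate series: 10 time steps, N = 3 variates.
X = rng.normal(size=(10, 3))

# Hypothetical metadata: pairwise distances between the three sensors.
R = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 3.0],
              [4.0, 3.0, 0.0]])

# Assumption: the projection uses the left singular vectors of the metadata.
U, s, Vt = np.linalg.svd(R)

k = 2                      # hypothetical number of projected variates to keep
X_proj = X @ U[:, :k]      # shape (10, k): each column mixes related variates
```

Each projected variate is a linear combination of the original variates, so a distance such as DTW can then be computed on `X_proj` instead of `X`.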
20
Why might the use of metadata work?
It identifies related clusters of variates.
It provides uniform weights for the entire data corpus.
It requires rich metadata.
21
Experimental Results
Criteria: classification accuracy; class tightness (ratio of inter-class distance to intra-class distance).
Dataset: Mocap[1], Kaggle[2] (where two values are given, they are in this order)
Dimensionality (variates): 62, 20
Temporal length: 7-35
Class labels: 8
Instances per class: 6-36, 31
Metadata: Sensor Distance, Sensor Connectivity
[1] CMU graphics lab motion capture database, 2015.
[2] Kaggle gesture dataset, 2015.
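The class tightness criterion can be sketched as follows, given a precomputed pairwise distance matrix and class labels. Averaging over pairs is an assumption; the slide only states that the criterion is a ratio of inter-class to intra-class distance. The function name and toy values are hypothetical.

```python
import numpy as np

def class_tightness(D, labels):
    """Mean inter-class distance divided by mean intra-class distance.

    D is a symmetric (n, n) matrix of pairwise distances between instances,
    labels holds one class label per instance.
    """
    D = np.asarray(D, dtype=float)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]      # pairs within one class
    off_diag = ~np.eye(len(labels), dtype=bool)    # drop self-distances
    intra = D[same & off_diag].mean()
    inter = D[~same].mean()
    return float(inter / intra)

# Hypothetical toy data: two classes with two instances each.
D = [[0, 1, 4, 4],
     [1, 0, 4, 4],
     [4, 4, 0, 1],
     [4, 4, 1, 0]]
labels = [0, 0, 1, 1]

tightness = class_tightness(D, labels)
```

Under this definition a larger value means that instances of the same class sit closer together relative to instances of other classes; for the toy matrix the ratio is 4.0.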
22
Results for Mocap
23
Results for Kaggle
24
Conclusion
Metadata has a positive impact on the distance measure.
The impact of metadata depends on the inherent synchrony in the time series.
Rich metadata is needed.
25
Future Work
Use specialized metadata for each individual time series.
Use feature-based measures.
Use motifs in multi-dimensional time series.
26
THANK YOU & Questions