ICDE, San Jose, CA, 2002. Discovering Similar Multidimensional Trajectories. Michail Vlachos (UC Riverside), George Kollios (Boston University), Dimitrios Gunopulos (UC Riverside).

Presentation transcript:

ICDE, San Jose, CA, 2002
Discovering Similar Multidimensional Trajectories
Michail Vlachos (UC Riverside), George Kollios (Boston University), Dimitrios Gunopulos (UC Riverside)

Outline
Introduction
Similarity Measures
Compute the Similarity
Indexing Trajectories
Experimental Evaluation
Related Work
Conclusion

Introduction
The trajectory of a moving object is typically modeled as a sequence of consecutive locations in a multidimensional Euclidean space.
An appropriate and efficient model for defining similarity between trajectories is important for the quality of data analysis tasks.

Examples of 2D trajectories

Hierarchical clustering of 2D series (displayed as 1D for clarity)

Similarity Measures
Let A and B be two trajectories of moving objects with sizes n and m respectively, where $A = ((a_{x,1}, a_{y,1}), \ldots, (a_{x,n}, a_{y,n}))$ and $B = ((b_{x,1}, b_{y,1}), \ldots, (b_{x,m}, b_{y,m}))$. For a trajectory A, let Head(A) be the sequence $Head(A) = ((a_{x,1}, a_{y,1}), \ldots, (a_{x,n-1}, a_{y,n-1}))$.

Definition 1. Given an integer δ and a real number 0 < ε < 1, we define $LCSS_{\delta,\varepsilon}(A, B)$ as follows:
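For reference, the recursion given in the published paper (Vlachos et al., ICDE 2002) can be written as below, using Head and the coordinate notation from the previous slide. This is a restatement in LaTeX, not a reproduction of the slide's layout.

```latex
\[
LCSS_{\delta,\varepsilon}(A,B) =
\begin{cases}
0 & \text{if } A \text{ or } B \text{ is empty},\\[4pt]
1 + LCSS_{\delta,\varepsilon}(\mathrm{Head}(A), \mathrm{Head}(B))
  & \text{if } |a_{x,n} - b_{x,m}| < \varepsilon,\
    |a_{y,n} - b_{y,m}| < \varepsilon,\
    |n - m| \le \delta,\\[4pt]
\max\bigl(LCSS_{\delta,\varepsilon}(\mathrm{Head}(A), B),\
          LCSS_{\delta,\varepsilon}(A, \mathrm{Head}(B))\bigr)
  & \text{otherwise.}
\end{cases}
\]
```

Here δ limits how far apart in time two matched points may be, and ε is the matching threshold applied to each coordinate.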

Definition 2. We define the similarity function S1 between two trajectories A and B, given δ and ε, as follows:
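In the published paper, S1 is the LCSS value normalized by the length of the shorter trajectory, S1(δ, ε, A, B) = LCSS_{δ,ε}(A, B) / min(n, m). The sketch below computes both quantities for 2D trajectories with a plain dynamic-programming table. The function names `lcss` and `s1` are illustrative, and the full O(nm) table is used for readability rather than speed.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def lcss(a: Sequence[Point], b: Sequence[Point], delta: int, eps: float) -> int:
    """LCSS_{delta,eps} of two 2D trajectories via a full (n+1) x (m+1) table."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCSS of the prefixes a[:i] and b[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (ax, ay), (bx, by) = a[i - 1], b[j - 1]
            close_in_space = abs(ax - bx) < eps and abs(ay - by) < eps
            close_in_time = abs(i - j) <= delta
            if close_in_space and close_in_time:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def s1(a: Sequence[Point], b: Sequence[Point], delta: int, eps: float) -> float:
    """S1 = LCSS / min(n, m); defined as 0.0 here when either input is empty."""
    if not a or not b:
        return 0.0
    return lcss(a, b, delta, eps) / min(len(a), len(b))


if __name__ == "__main__":
    A = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    B = [(0.1, 0.1), (1.1, 0.9), (5.0, 5.0), (3.05, 3.0)]
    print(lcss(A, B, delta=1, eps=0.25), s1(A, B, delta=1, eps=0.25))
```

In the example, three of the four point pairs fall within ε of each other, and the outlier (5.0, 5.0) is simply left unmatched, which is the robustness to noise that motivates LCSS.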

A region of δ & ε for a trajectory

Definition 3. Given δ, ε and the family F of translations, we define the similarity function S2 between two trajectories A and B, as follows:
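As stated in the published paper, a translation $f_{c,d}$ shifts every point of a trajectory by $(c, d)$, and S2 takes the best S1 value over all such shifts of B. A LaTeX restatement, using the notation of the earlier slides:

```latex
\[
f_{c,d}(B) = \bigl((b_{x,1} + c,\ b_{y,1} + d), \ldots, (b_{x,m} + c,\ b_{y,m} + d)\bigr),
\qquad
S2(\delta, \varepsilon, A, B) = \max_{f_{c,d} \in F} S1\bigl(\delta, \varepsilon, A, f_{c,d}(B)\bigr).
\]
```

Because the maximum ranges over all translations, S2 can recognize two trajectories as similar when they have the same shape but occupy different regions of space.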

Translation of trajectory B

Definition 4. Given δ, ε and two trajectories A and B, we define the following distance functions:
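In the published paper, the two distance functions are the complements of the two similarity functions. A LaTeX restatement:

```latex
\[
D1(\delta, \varepsilon, A, B) = 1 - S1(\delta, \varepsilon, A, B),
\qquad
D2(\delta, \varepsilon, A, B) = 1 - S2(\delta, \varepsilon, A, B).
\]
```

Since S1 and S2 lie in [0, 1], so do D1 and D2, with 0 meaning that every point of the shorter trajectory is matched.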

Compute the Similarity
Similarity function S1: given two trajectories A and B, with |A| = n and |B| = m, we can find $LCSS_{\delta,\varepsilon}(A, B)$ in $O(\delta(n + m))$ time.
Similarity function S2: given two trajectories A and B, with |A| = n and |B| = m, we can compute $S2(\delta, \varepsilon, A, B)$ in $O((n + m)^3 \delta^3)$ time.
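The O(δ(n + m)) bound for S1 follows because two points can only match when their indices differ by at most δ, so only a diagonal band of the dynamic-programming table needs to be filled. The sketch below is one way to do that, assuming δ ≥ 0 and 2D points given as tuples. The name `lcss_banded` and the dictionary-based table are illustrative choices, not taken from the slides.

```python
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def lcss_banded(a: Sequence[Point], b: Sequence[Point], delta: int, eps: float) -> int:
    """LCSS_{delta,eps} computed only on the cells with |i - j| <= delta."""
    n, m = len(a), len(b)
    # (i, j) -> LCSS of prefixes a[:i], b[:j]; stored only inside the band.
    table: Dict[Tuple[int, int], int] = {}

    def lookup(i: int, j: int) -> int:
        if i == 0 or j == 0:
            return 0
        # Outside the band no further match is possible, so the value equals
        # the value at the nearest band cell in the same row or column.
        if j > i + delta:
            j = i + delta
        elif i > j + delta:
            i = j + delta
        return table[(i, j)]

    for i in range(1, n + 1):
        for j in range(max(1, i - delta), min(m, i + delta) + 1):
            (ax, ay), (bx, by) = a[i - 1], b[j - 1]
            if abs(ax - bx) < eps and abs(ay - by) < eps:
                table[(i, j)] = lookup(i - 1, j - 1) + 1
            else:
                table[(i, j)] = max(lookup(i - 1, j), lookup(i, j - 1))
    return lookup(n, m)


if __name__ == "__main__":
    A = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    B = [(0.1, 0.1), (1.1, 0.9), (5.0, 5.0), (3.05, 3.0)]
    print(lcss_banded(A, B, delta=1, eps=0.25))
```

Each row touches at most 2δ + 1 cells, so both the running time and the number of stored entries grow with δ times the trajectory length rather than with n times m.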

Approximate Algorithm

Indexing Structure
For every node C of the tree we store the medoid ($M_C$) of each cluster. The medoid is the trajectory that has the minimum distance (or maximum LCSS) from every other trajectory in the cluster:
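One reading of this criterion, assumed here, is to pick the cluster member whose worst-case similarity to the other members is largest. The sketch below implements that reading for any pairwise similarity function. The name `medoid_index` and the `similarity` callable are hypothetical, and in practice the callable would be an LCSS-based similarity between trajectories.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def medoid_index(cluster: Sequence[T], similarity: Callable[[T, T], float]) -> int:
    """Index of the member that maximizes its minimum similarity to the others."""
    if not cluster:
        raise ValueError("cluster must not be empty")
    if len(cluster) == 1:
        return 0
    best_index, best_score = 0, float("-inf")
    for i, candidate in enumerate(cluster):
        # Worst-case similarity of this candidate to any other member.
        worst = min(
            similarity(candidate, other)
            for j, other in enumerate(cluster)
            if j != i
        )
        if worst > best_score:
            best_index, best_score = i, worst
    return best_index


if __name__ == "__main__":
    # Toy stand-in: numbers instead of trajectories, negated gap as similarity.
    values = [0.0, 1.0, 5.0, 10.0]
    print(medoid_index(values, lambda u, v: -abs(u - v)))
```

The scan evaluates every pair once per candidate, so it costs a quadratic number of similarity computations per cluster.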

Time and Accuracy Experiments Similarity values and running times from SEALS dataset

Experiment 1 - Video tracking data

Experiment 2 & 3 - Australian Sign Language Dataset (ASL)

Evaluating the indexing technique

Related Work
Use a p-norm distance to define the similarity measure [2, 37, 18, 14, 10, 32, 10, 20, 24, 23].
Based on the time warping technique [5, 25, 28, 33].
Find the longest common subsequence (LCSS) of two sequences [3, 7, 11].
Define time series similarity based on extracting certain features [13, 17, 29, 31].

Conclusion
Efficient techniques to accurately compute the similarity between trajectories.
Approximate algorithms with provable performance bounds.
An efficient index structure.

Comments
Good approach for similarity queries.
Use real GPS trajectory data? …

Dynamic Time Warping
Sequences are similar but accelerate differently along the time axis.
Enforcing a temporal constraint δ on the warping window size improves computation efficiency and accuracy.
Application: speech recognition (Berndt and Clifford, 1996)
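A minimal sketch of dynamic time warping with a window constraint, assuming 1D numeric sequences and absolute difference as the local cost. The name `dtw_distance` and the choice to widen the window to at least |n − m| (so that a warping path always exists) are assumptions made for this illustration.

```python
import math
from typing import Sequence


def dtw_distance(c: Sequence[float], q: Sequence[float], window: int) -> float:
    """DTW cost between two 1D sequences, restricted to |i - j| <= window."""
    n, m = len(c), len(q)
    # Widen the window if needed so the end cell (n, m) stays reachable.
    w = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            local = abs(c[i - 1] - q[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = local + min(
                cost[i - 1][j],      # c advances, q repeats
                cost[i][j - 1],      # q advances, c repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3, 4], [1, 1, 2, 3, 4], window=1))
```

In the example the second sequence lingers on its first value, and the warping absorbs that delay at zero cost. Only cells inside the window are visited, which is where the efficiency gain mentioned on the slide comes from.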

Longest Common Subsequence Similarity
Dissimilarity: Tolerance:
Match two sequences by allowing some elements to remain unmatched.
Example: C = {1,2,3,4,5,1,7} and Q = {2,5,4,5,3,1,8}; the longest common subsequence is {2,4,5,1}.
Application: Bioinformatics
Vlachos et al., 2002

Longest Common Subsequence Similarity
Input: sequences C[1..m] and Q[1..n].
Compute the LCS length between C[1..i] and Q[1..j] for all 1 ≤ i ≤ m and 1 ≤ j ≤ n, and store it in L[i,j]. L[m,n] is the length of the LCS.

    L[i,0] := 0 for i := 0..m
    L[0,j] := 0 for j := 0..n
    for i := 1..m
        for j := 1..n
            if C[i] = Q[j]
                L[i,j] := L[i-1,j-1] + 1
            else
                L[i,j] := max(L[i,j-1], L[i-1,j])
    return L[m,n]

Vlachos et al., 2002
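The recurrence on this slide can be sketched in Python as follows, using the example sequences C and Q from the previous slide. The function name `lcs` and the traceback step that recovers one longest common subsequence are additions for illustration.

```python
from typing import List, Sequence, Tuple


def lcs(c: Sequence[int], q: Sequence[int]) -> Tuple[int, List[int]]:
    """Return the LCS length and one longest common subsequence of c and q."""
    m, n = len(c), len(q)
    # table[i][j] is the LCS length of c[:i] and q[:j]; row and column 0 are 0.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if c[i - 1] == q[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i][j - 1], table[i - 1][j])

    # Walk back from the bottom-right corner to recover one subsequence.
    result: List[int] = []
    i, j = m, n
    while i > 0 and j > 0:
        if c[i - 1] == q[j - 1]:
            result.append(c[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    result.reverse()
    return table[m][n], result


if __name__ == "__main__":
    C = [1, 2, 3, 4, 5, 1, 7]
    Q = [2, 5, 4, 5, 3, 1, 8]
    print(lcs(C, Q))  # (4, [2, 4, 5, 1])
```

Filling the table takes O(mn) time and space, and the traceback adds at most m + n steps.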