Chapter Two Trajectory Indexing and Retrieval

Slides:



Advertisements
Similar presentations
Principles of Density Estimation
Advertisements

On the Effect of Trajectory Compression in Spatio-temporal Querying Elias Frentzos, and Yannis Theodoridis Data Management Group, University of Piraeus.
Finding the Sites with Best Accessibilities to Amenities Qianlu Lin, Chuan Xiao, Muhammad Aamir Cheema and Wei Wang University of New South Wales, Australia.
Ranking Outliers Using Symmetric Neighborhood Relationship Wen Jin, Anthony K.H. Tung, Jiawei Han, and Wei Wang Advances in Knowledge Discovery and Data.
        iDistance -- Indexing the Distance An Efficient Approach to KNN Indexing C. Yu, B. C. Ooi, K.-L. Tan, H.V. Jagadish. Indexing the distance:
On Map-Matching Vehicle Tracking Data
Continuous Intersection Joins Over Moving Objects Rui Zhang University of Melbourne Dan Lin Purdue University Kotagiri Ramamohanarao University of Melbourne.
Constructing Popular Routes from Uncertain Trajectories Authors of Paper: Ling-Yin Wei (National Chiao Tung University, Hsinchu) Yu Zheng (Microsoft Research.
Constructing Popular Routes from Uncertain Trajectories Ling-Yin Wei 1, Yu Zheng 2, Wen-Chih Peng 1 1 National Chiao Tung University, Taiwan 2 Microsoft.
Critical Analysis Presentation: T-Drive: Driving Directions based on Taxi Trajectories Authors of Paper: Jing Yuan, Yu Zheng, Chengyang Zhang, Weilei Xie,
--Presented By Sudheer Chelluboina. Professor: Dr.Maggie Dunham.
Spatio-Temporal Databases
Computer Science Spatio-Temporal Aggregation Using Sketches Yufei Tao, George Kollios, Jeffrey Considine, Feifei Li, Dimitris Papadias Department of Computer.
Spatio-Temporal Databases. Outline Spatial Databases Temporal Databases Spatio-temporal Databases Multimedia Databases …..
Continuous Data Stream Processing MAKE Lab Date: 2006/03/07 Post-Excellence Project Subproject 6.
Dieter Pfoser, LBS Workshop1 Issues in the Management of Moving Point Objects Dieter Pfoser Nykredit Center for Database Research Aalborg University, Denmark.
Based on Slides by D. Gunopulos (UCR)
Tracking Moving Objects in Anonymized Trajectories Nikolay Vyahhi 1, Spiridon Bakiras 2, Panos Kalnis 3, and Gabriel Ghinita 3 1 St. Petersburg State University.
Spatio-Temporal Databases. Introduction Spatiotemporal Databases: manage spatial data whose geometry changes over time Geometry: position and/or extent.
Indexing Spatio-Temporal Data Warehouses Dimitris Papadias, Yufei Tao, Panos Kalnis, Jun Zhang Department of Computer Science Hong Kong University of Science.
Spatio-Temporal Databases. Outline Spatial Databases Temporal Databases Spatio-temporal Databases Multimedia Databases …..
1 Indexing Large Trajectory Data Sets With SETI V.Prasad Chakka Adam C.Everspaugh Jignesh M.Patel University of Michigan Presented by Guangyue Jia.
Fast Subsequence Matching in Time-Series Databases Christos Faloutsos M. Ranganathan Yannis Manolopoulos Department of Computer Science and ISR University.
Pattern Matching with Acceleration Data Pramod Vemulapalli.
GPS Trajectories Analysis in MOPSI Project Minjie Chen SIPU group Univ. of Eastern Finland.
Surface Simplification Using Quadric Error Metrics Michael Garland Paul S. Heckbert.
AAU Novel Approaches to the Indexing of Moving Object Trajectories Presented by YuQing Zhang  Dieter Pfoser Christian S. Jensen Yannis Theodoridis.
Introduction to The NSP-Tree: A Space-Partitioning Based Indexing Method Gang Qian University of Central Oklahoma November 2006.
Data Mining – Intro. Course Overview Spatial Databases Temporal and Spatio-Temporal Databases Multimedia Databases Data Mining.
1 The MV3R-Tree: A Spatio- Temporal Access Method for Timestamp and Interval Queries Yufei Tao and Dimitris Papadias Hong Kong University of Science and.
Spatio-temporal Pattern Queries M. Hadjieleftheriou G. Kollios P. Bakalov V. J. Tsotras.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Predicting the Location and Time of Mobile Phone Users by Using Sequential Pattern Mining Techniques Mert Özer, Ilkcan Keles, Ismail Hakki Toroslu, Pinar.
Efficient OLAP Operations in Spatial Data Warehouses Dimitris Papadias, Panos Kalnis, Jun Zhang and Yufei Tao Department of Computer Science Hong Kong.
1 Complex Spatio-Temporal Pattern Queries Cahide Sen University of Minnesota.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Spatio-Temporal Databases. Term Project Groups of 2 students You can take a look on some project ideas from here:
1 Spatial Query Processing using the R-tree Donghui Zhang CCIS, Northeastern University Feb 8, 2005.
Spatial Data Management
Presented by: Omar Alqahtani Fall 2016
School of Computing Clemson University Fall, 2012
Data Mining – Intro.
Hierarchical Clustering: Time and Space requirements
Clustering CSC 600: Data Mining Class 21.
Spatio-Temporal Databases
Pagerank and Betweenness centrality on Big Taxi Trajectory Graph
Data Science Algorithms: The Basic Methods
Spatial Indexing I Point Access Methods.
Query in Streaming Environment
CS 685: Special Topics in Data Mining Jinze Liu
Jiawei Han Department of Computer Science
Spatio-temporal Pattern Queries
DISTRIBUTED CLUSTERING OF UBIQUITOUS DATA STREAMS
Chao Zhang1, Yu Zheng2, Xiuli Ma3, Jiawei Han1
Chapter 4: Probabilistic Query Answering (2)
Spatio-Temporal Databases
Finding Fastest Paths on A Road Network with Speed Patterns
I don’t need a title slide for a lecture
Showcasing work by Jing Yuan, Yu Zheng, Xing Xie, Guangzhou Sun
Scale-Space Representation for Matching of 3D Models
Efficient Cache-Supported Path Planning on Roads
Continuous Density Queries for Moving Objects
Spatial Databases: Spatio-Temporal Databases
Topological Signatures For Fast Mobility Analysis
CSE572: Data Mining by H. Liu
Chapter Two Trajectory Indexing and Retrieval
Efficient Processing of Top-k Spatial Preference Queries
Data Mining CSCI 307, Spring 2019 Lecture 21
CS 685: Special Topics in Data Mining Jinze Liu
Presentation transcript:

Chapter Two Trajectory Indexing and Retrieval Ke Deng, Kexin Xie, Kevin Zheng and Xiaofang Zhou 20/11/2018

Chapter Overview Trajectory Query Classification Trajectory Similarity Measure Trajectory Data Index Trajectory Query Processing This chapter covers four topics. Due to the time reason, we will briefly look through Trajectory query classification and Trajectory data index.

Trajectory Query Classification P-query R-query T-query In trajectory query, spatiotemporal relationships between spatial objects are evaluated against the specified predicate. In this chapter, we classify three types of trajectory queries according to the spatial data types involved. They are point related trajectory query, simply P-query, region related trajectory query, R-query and trajectory related trajectory query - T-query. I will later explain the reason why we class in this way. Before that, let us see what are these queries first.

Trajectory Query Classification P-query (point and trajectory) p2 timestamp a T timestamp b p2 timestamp c p1 p3 p1 Where - Lp-norm (Euclidean space) shortest network path distance (road network) T is a trajectory, a sequence of sampling points when the object is moving. Each small vertical line segment marks a sampling point. From many points on interests, such as gas stations, we want to ask for the ones close to this trajectory during a time interval. P1 is closer to this trajectory from time stamp a to c and p2 is closer to this trajectory from time stamp c to b. The closeness is measured by distance. The distance can be Lp-norm, such as Euclidean distance - L2-norm, or network distance in road network. In this query, we asks for some points from many according to the spatiotemporal relationship against the trajectory. Let’s look at another situation. Give a number of points, we are interested to find from many trajectories the one which best links these points. The best link can be measured by the aggregated value of distances from each points to the trajectory. In this situation, we are looking for trajectory according a distance measure against a number of specified points. p3 p4 [Tao2002] Tao Y., Papadias D. and Shen Q., Continuous nearest neighbour search, VLDB, 2002 [Chen2010] Chen Z., Shen HT., Zhou X., Zheng Y and Xie X., Searching trajectories by locations – an efficient study. SIGMOD 2010

Trajectory Query Classification R-query (region and trajectory) T1 a R T1 a b b a a b b T2 T2 T3 a T3 a Now, let look at the trajectory queries related to regions. These are a set of trajectories. We want to find the ones which pass through a specified region in a given time interval. In this situation, trajectories are returned against a specified region. In other situation, we may want to find the regions where trajectories frequently pass. b b Ask for trajectories passing a given region in a time interval Ask for regions overlapped or frequently passed by trajectories in a time interval [Jeung2008] Jeung, H., Yiu, M.L. Zhou, X., Jensen C.S. and Shen H.T., Discovery of convoys in trajectory databases, VLDB, 2008 [Pfoster 2000] Dieter Pfoster, Christian S. Jensen, Yannis T., Novel approaches to the indexing of moving object trajectories. VLDB, 2000

Trajectory Query Classification T-query (trajectory and trajectory) t1 t2 T1 t3 t4 t7 t5 t6 t8 T2 Closest pair distance t1 t2 t3 t4 The last trajectory query type is related to other trajectories. The similarity between trajectories are measured to decide whether they should be clustered for the purpose such as traffic flow pattern identification, predicate the future location of moving object. There are several similarity measure between trajectories have been proposed. The simplest measure is called closest pair distance. It is the distance between the closest points from two different trajectories. This measure of course is not accurate in many situations. The second measure is called Sum of Pair Distance. The sampling points from two trajectories are matched if they are at the same timestamp. The distances between matching sampling points are measured and their sum is used as the similarity between the two trajectories. This measure is better than the closest pair distance but if two moving objects exact follow the same path but in different moving speed or having different starting time. The Sum of Pair Distance between them can be very large. To overcome this situation, the distance measure called Dynamic Time Warp can be used. t1 t2 T1 t3 t4 t7 t5 t6 t8 T2 t1 Sum of pair distance t2 t3 t4 [Agrawl1993]Agrawal, R., Faloutsos, C., Swami, A.N.: Ecient similarity search in sequence databases. FODO pp. 69{84 (1993)

Trajectory Query Classification T-query (trajectory and trajectory) t1 t2 T1 t3 t4 t7 t5 t6 t8 T2 Dynamic Time Warp t1 t2 t3 t4 In the Dynamic Time Warp, the sampling points in two trajectories are not required to match at the same timestamps. For example, t3 can match t1 in order to find the minimum sum of distances as the measure of two trajectories. In many situations, outlier may exist in trajectory data due to the reason such as data collection errors. Even though Dynamic Time Warp is used, two very far sampling points may lead to very large distance measure even though these two trajectories are quite similar for all other matched sampling points. In this situation, longest common subsequence can be used. In LCSS, if the distance between two matching sampling points is over a threshold, it is not included in the similarity measure. t1 t2 T1 t3 t4 t7 t5 t6 t8 T2 t1 Longest Common Subsequence t2 t3 t4 [Yi1998] Yi, B.K., Jagadish, H., Faloutsos, C.: Ecient retrieval of similar time sequences under time warping. ICDE (1998) [Zheng2009]Zheng, Y., Zhang, L., Xie, X., Ma, W.Y.: Mining interesting locations and travel sequences from gps trajectories. WWW (2009)

Trajectory Query Classification Existing Trajectory Query Classification[Pfoster 2000, Tang 2010] Coordinate-based query Windows Query Select all objects within a given time slot and given time period/slice. Nearest Neighbour Query Approximate Query Trajectory-based query Topological query, The basic predicate is “pass by”, “leave”, “cross” Navigation query, Information not directly kept in database but can be derived, for example, “At what speed does this plane move? What is its top speed?“ If you are familiar with trajectory queries, you may ask why p-trajectory, r-trajectory and t-trajectory? In existing work, the trajectories are usually classified as coordinate-based query and trajectory-based query. In coordinate-based query, there are window query, …, nearest neighbour query, …, approximate query, … In trajectory-based query, there are topological query, navigation query, … These types have a good cover of the existing trajectory queries. However, we notice the query classification develops by itself. For example, the approximate query comes later than others. We also feel it is not easy to fit some new trajectory query into existing types, such as Reverse Nearest Neighbour query. We may need to add more new query types in. So, we decide to provide a more general and stable framework for trajectory query classification. This is motivation of the P-query, R-query and T-query. [Tang 2010] Yong Tang, Xiaoping Ye and Na Tang, Temporal information processing technology and its applications, Springer, New York, 2010 [Pfoster 2000] Dieter Pfoster, Christian S. Jensen, Yannis Theodoridis, Novel approaches to the indexing of moving object trajectories. VLDB, 2000 20/11/2018

Trajectory Data Index Classification Augmented R-tree Multi-version R-tree (partition temporal dimension) Grid Based Index (partition spatial space) 20/11/2018

Trajectory Data Index Augmented R-tree 3D R-tree Time y x The straightforward way to index trajectory data is to augment temporal dimension to the traditional spatial index Along with the time, when a piece of recently collected trajectory data is coming, that is, M line segment, it is inserted into 3D R-tree according to its position in the 3D space. As a result, in many cases when we need to retrieve a long trajectory, we have to visit many different paths in 3D R-tree. To avoid this situation, ST-R-tree and TB-tree have been proposed. y x 20/11/2018

Trajectory Data Index Augmented 3D R-tree In ST R-tree, when a piece of new trajectory data is coming, it is inserted to the leaf node containing the predecessor. If no space available to accommodate the new data, the leaf node split. If no place to accommodate the new node, this piece of new trajectory data is inserted to the ST R-tree according to its position in the 3D space only. This method partially preserve the trajectory. In TB-tree, when a piece of new trajectory data is coming, it is inserted to the leaf node containing the predecessor. If no space available to accommodate the new data, the leaf node is created. A linked is set up between leaf nodes containing the segments of the same trajectory. STR-tree (Spatial-Temporal R-tree) TB-tree (Trajectory Bundle tree) [Pfoster 2000] Dieter Pfoster, Christian S. Jensen, Yannis T., Novel approaches to the indexing of moving object trajectories. VLDB, 2000 20/11/2018

Trajectory Data Index Multi-version R-tree (HR-tree [Tao2001a], HR+-tree[Tao2001b], MR-tree[Xu2005]) For each timestamp, an R-tree is created. So, there are many R-trees. These R-trees are indexed. HR-tree [Tao2001] 3D R-tree may have significant overlap between non-leaf nodes. When processing query, this may lead to visit much more irrelevant nodes. To overcome this problem, Query for trajectories in a given region and in a given time interval: The R-tree at the timestamp is found first The trajectories in the specified region are retrieved from the R-tree. [Tao2001a] Tao, Y., Papadias, D.: Efficient historical r-trees. In: ssdbm, p. 0223. Published by the IEEE Computer Society (2001) [Xu2005]Xu, X., Han, J., Lu, W.: Rt-tree: An improved r-tree indexing structure for temporal spatial databases. In: Int. Symp. on Spatial Data Handling, 2005 [Tao2001b] Tao, Y., Papadias, D.: Mv3r-tree: A spatio-temporal access method for timestamp and interval queries. In: VLDB, pp. 431{440 (2001) 20/11/2018

The trajectory segments in each cell are indexed in temporal dimension Trajectory Data Index Grid Based Index The trajectory segments in each cell are indexed in temporal dimension To cope with the non-leaf node overlap problem in 3D R-tree, another way is to use grid based index. . Spatial Filtering – cells overlap with the query box are retrieved . Temporal Filtering – the temporal . Refinement Step . Duplicate Elimination [Prasad2003] V. Prasad Chakka Adam C. Everspaugh Jignesh M., Patel, Indexing Large Trajectory Data Sets With SETI, CIDR 2003 20/11/2018

Summary Trajectory query is essential for complex analysis in various applications, such as traffic flow pattern identification, path planning, behaviour mining This chapter introduce the fundamental aspects of trajectory queries. These techniques and knowledge will provide the background for further study of this book. 20/11/2018