1 Indexing Large Trajectory Data Sets With SETI. V. Prasad Chakka, Adam C. Everspaugh, Jignesh M. Patel, University of Michigan. Presented by Guangyue Jia
2 Overview –Motivation –Problem definition and query types –SETI –Experimental Evaluation –Strong and weak points –Relation and stimulation to our project –Conclusion
3 Motivation Location-based systems (LBS) are used everywhere. –Existing LBS: GPS, navigation systems, others. –Example query: how many cars were in the center of Aalborg from 10 to 11 o'clock? Efficient and inexpensive techniques are needed. Previous indices: B-tree, R-tree, others. New method: SETI, the Scalable and Efficient Trajectory Index.
5 Problem definition and query types Data model –A trajectory is represented as trj (tid, ). –A segment is represented as s_i(tid, sid, u_{i-1}, u_i). –A point u_i is a three-tuple u_i(x_i, y_i, t_i).
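The data model above can be sketched in a few lines of Python. This is only an illustration: the class names `Point` and `Segment` and the helper `to_segments` are hypothetical, and the sketch assumes that a trajectory's points arrive ordered by time and that segment numbers start at 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Point:
    """A location update u_i = (x_i, y_i, t_i)."""
    x: float
    y: float
    t: float


@dataclass(frozen=True)
class Segment:
    """A segment s_i = (tid, sid, u_{i-1}, u_i) of trajectory tid."""
    tid: int
    sid: int
    start: Point
    end: Point


def to_segments(tid: int, points: List[Point]) -> List[Segment]:
    """Join consecutive points of one trajectory into segments.

    Assumption: points are already sorted by time.
    """
    return [
        Segment(tid, i, points[i - 1], points[i])
        for i in range(1, len(points))
    ]


segments = to_segments(7, [Point(0, 0, 0), Point(1, 1, 10), Point(2, 1, 20)])
```

A trajectory with n points yields n - 1 segments, and each segment shares an end point with its successor.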
6 Problem definition and query types Query types –Queries about the future positions of moving objects. Example: where is car A after one hour? Answered by storing the current position, speed, and direction of the moving objects. –Queries about the historical positions of moving objects: time interval query, time slice query, nearest neighbor query. Example: where was car A at 5 pm yesterday?
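As a small illustration of how a future-position query can be answered from the stored position, speed, and direction, the sketch below extrapolates along a straight line. It assumes constant speed and heading over the prediction interval; the function name `predict_position` and its parameters are hypothetical and do not come from the paper.

```python
import math
from typing import Tuple


def predict_position(x: float, y: float, speed: float,
                     heading_rad: float, dt: float) -> Tuple[float, float]:
    """Extrapolate a position dt time units ahead.

    Assumption: the object keeps its current speed and heading
    (angle in radians, measured from the x axis).
    """
    return (x + speed * math.cos(heading_rad) * dt,
            y + speed * math.sin(heading_rad) * dt)


# Car A at the origin, moving along the x axis at 10 m/s, one hour ahead.
future = predict_position(0.0, 0.0, 10.0, 0.0, 3600.0)
```

Historical queries cannot be answered this way, because they need the stored past segments rather than the current motion state.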
8 SETI –Description –Insert –Example of the Insert Procedure –Search –Deletes and Updates
9 SETI- Description A logical indexing structure built on top of an existing spatial indexing technique. –R-tree. Partition + temporal indices. –Abandons 3D indexing technology. –Partitions the 2D spatial data into cells. –Indexes lines in the 1D (time) dimension. Data page –Each data page contains only segments that belong to the same spatial cell. –Each data page has a lifetime. Use of multiple sparse indices. –One entry for each data page.
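The page and sparse-index idea can be sketched as follows. This is a simplified illustration, not the paper's implementation: the class names `DataPage` and `CellIndex` are hypothetical, the page lifetime is assumed to be the minimum start time and maximum end time of the segments on the page, and a plain list scan stands in for the temporal index that a real system would build on an existing index structure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Interval = Tuple[float, float]  # (t_start, t_end)


@dataclass
class DataPage:
    """Holds time extents of segments that all lie in one spatial cell."""
    cell: Tuple[int, int]
    segments: List[Interval] = field(default_factory=list)

    def lifetime(self) -> Interval:
        # Assumed definition: the time span covered by the page's segments.
        return (min(s[0] for s in self.segments),
                max(s[1] for s in self.segments))


class CellIndex:
    """Pages of one spatial cell plus one sparse index entry per page."""

    def __init__(self, cell: Tuple[int, int], capacity: int = 4):
        self.cell = cell
        self.capacity = capacity  # segments per page (hypothetical value)
        self.pages: List[DataPage] = []

    def add(self, t_start: float, t_end: float) -> None:
        # Open a new page when there is none yet or the last one is full.
        if not self.pages or len(self.pages[-1].segments) >= self.capacity:
            self.pages.append(DataPage(self.cell))
        self.pages[-1].segments.append((t_start, t_end))

    def sparse_entries(self) -> List[Tuple[Interval, int]]:
        # One (lifetime, page number) entry per data page.
        return [(p.lifetime(), i) for i, p in enumerate(self.pages)]

    def pages_overlapping(self, t1: float, t2: float) -> List[int]:
        # Stand-in for a temporal index probe: linear scan of the entries.
        return [i for (lo, hi), i in self.sparse_entries()
                if lo <= t2 and hi >= t1]


idx = CellIndex((0, 0), capacity=2)
for extent in [(0, 1), (1, 2), (2, 3)]:
    idx.add(*extent)
```

Because the index holds one entry per page rather than one per segment, it stays small even when a cell contains many segments.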
10 SETI- Insert Front Line -Is a cache. -Maintains the last updated location of each object. -On an update, the last known location is pulled out and replaced with the new location. Partitioning Module -Determines the spatial cells for a segment. -Splits segments that span multiple spatial cells.
11 SETI- Example of the Insert Procedure Description: -A is the current location of O. -O moves from A to A´. -AA´ represents the movement of O between the two updates. Procedure: 1. A´ is sent to the insert module. 2. The Front Line receives A´, pulls out A, replaces it with A´, and sends AA´ to the Partitioning Module. 3. The Partitioning Module receives AA´, determines the spatial cells for AA´, and breaks AA´ if it spans multiple cells. 4. The temporal indices and the data file are updated.
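Step 2 of the procedure can be sketched as a small in-memory cache. This is a minimal sketch under assumptions: the class name `FrontLine` mirrors the slide, but the `update` method, the object id key, and the tuple layout `(x, y, t)` are hypothetical choices made for illustration.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[float, float, float]  # (x, y, t)


class FrontLine:
    """Keeps the last known location of every moving object."""

    def __init__(self) -> None:
        self._last: Dict[int, Point] = {}

    def update(self, oid: int, new_point: Point) -> Optional[Tuple[Point, Point]]:
        """Store the new location and return the segment (A, A') to insert.

        Returns None for the first update of an object, because there is
        no previous location to form a segment with.
        """
        previous = self._last.get(oid)
        self._last[oid] = new_point
        if previous is None:
            return None
        return (previous, new_point)


front_line = FrontLine()
first = front_line.update(1, (0.0, 0.0, 0.0))
second = front_line.update(1, (3.0, 4.0, 10.0))
```

The segment returned by `update` is what would be handed on to the partitioning step.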
12 SETI- Example of the Insert Procedure -AA´ spans two spatial cells. -AA´ is broken into two smaller segments: AX and XA´. -X is the intersection point of AA´ with the cell boundary. -X is a logical update location. -AX and XA´ are inserted into their spatial cells. -AX and XA´ together still represent the single segment AA´. -The time of point X must also be calculated.
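The split at X can be sketched as below. The sketch makes two assumptions that the slides do not state: the spatial cells form a uniform square grid of side `cell`, and the object moves at constant velocity between updates, so the time of X follows by linear interpolation. The function name `split_at_cell_boundaries` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, t)


def split_at_cell_boundaries(a: Point, b: Point,
                             cell: float) -> List[Tuple[Point, Point]]:
    """Break segment a-b wherever it crosses a grid line.

    Assumptions: uniform square grid, linear motion between a and b.
    """
    cuts = {0.0, 1.0}  # parameters along a-b; 0 is a, 1 is b
    for axis in (0, 1):  # 0 = x, 1 = y
        lo, hi = sorted((a[axis], b[axis]))
        k = math.floor(lo / cell) + 1  # first grid line strictly above lo
        while k * cell < hi:
            cuts.add((k * cell - a[axis]) / (b[axis] - a[axis]))
            k += 1
    params = sorted(cuts)

    def at(s: float) -> Point:
        # Interpolate x, y and t with the same parameter s.
        return tuple(a[i] + s * (b[i] - a[i]) for i in range(3))

    points = [at(s) for s in params]
    return list(zip(points, points[1:]))


# A = (0.5, 0.5) at t = 0, A' = (1.5, 0.5) at t = 10, unit cells.
pieces = split_at_cell_boundaries((0.5, 0.5, 0.0), (1.5, 0.5, 10.0), 1.0)
```

In this example X lies at x = 1.0 and is reached halfway through the move, so its interpolated time is 5.0.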
13 SETI- Search Spatial Filtering: produce the candidate cells. Temporal Filtering: probe the temporal indices of the candidate cells. Refinement Step: if a page lies completely inside the spatial predicate box and the temporal predicate range contains the page lifetime, select all segments on the page; otherwise apply the query to each segment on the page. Duplicate Elimination: use a bitmap.
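The four search steps can be sketched over a toy in-memory index. Everything about the layout is an assumption for illustration: a uniform grid of side `cell_size`, pages stored as dictionaries with a `lifetime` and a list of `segments`, segment pieces as `(tid, x1, y1, t1, x2, y2, t2)` tuples, and a result made of trajectory ids. The per-segment check uses bounding boxes, which is a simplification of an exact segment-against-box test, and a Python integer serves as the bitmap (so ids must be non-negative integers).

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]


def overlaps(a: Range, b: Range) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]


def contains(outer: Range, inner: Range) -> bool:
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def search(index: Dict[Tuple[int, int], List[dict]], cell_size: float,
           qx: Range, qy: Range, qt: Range) -> List[int]:
    seen = 0  # bitmap: bit tid is set once trajectory tid is reported
    result: List[int] = []
    # 1. Spatial filtering: enumerate the cells touched by the query box.
    for cx in range(int(qx[0] // cell_size), int(qx[1] // cell_size) + 1):
        for cy in range(int(qy[0] // cell_size), int(qy[1] // cell_size) + 1):
            cell_inside = (
                contains(qx, (cx * cell_size, (cx + 1) * cell_size))
                and contains(qy, (cy * cell_size, (cy + 1) * cell_size))
            )
            for page in index.get((cx, cy), []):
                # 2. Temporal filtering: skip pages whose lifetime misses qt.
                if not overlaps(page["lifetime"], qt):
                    continue
                # 3. Refinement: take the whole page or test each segment.
                take_all = cell_inside and contains(qt, page["lifetime"])
                for tid, x1, y1, t1, x2, y2, t2 in page["segments"]:
                    if not take_all:
                        hit = (overlaps((min(x1, x2), max(x1, x2)), qx)
                               and overlaps((min(y1, y2), max(y1, y2)), qy)
                               and overlaps((t1, t2), qt))
                        if not hit:
                            continue
                    # 4. Duplicate elimination with the bitmap.
                    if not (seen >> tid) & 1:
                        seen |= 1 << tid
                        result.append(tid)
    return result


toy_index = {
    (0, 0): [{"lifetime": (0, 5), "segments": [(1, 1, 1, 0, 9, 9, 5)]}],
    (1, 0): [{"lifetime": (5, 8),
              "segments": [(1, 10, 9, 5, 12, 9, 8), (2, 15, 1, 6, 18, 2, 7)]}],
    (0, 1): [{"lifetime": (100, 200), "segments": [(3, 1, 11, 100, 2, 12, 200)]}],
}
hits = search(toy_index, 10, (0, 13), (0, 10), (0, 10))
```

Trajectory 1 has pieces in two cells but is reported once, which is the job of the bitmap.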
14 SETI- Deletes and Updates Deletion types: –Delete a particular segment. –Delete a complete trajectory. Segment deletion –Use the segment's bounding box to locate it. Complete trajectory deletion –All the segments of the trajectory must be identified. –Use an auxiliary composite B+-tree index on the trajectory ID and the segment number of the trajectory. Updates –Deletion + insertion.
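The role of the composite key can be sketched with a sorted list standing in for the B+-tree. This is not a B+-tree implementation: the class name `TrajectoryDirectory`, its methods, and the single page id stored per key are hypothetical simplifications. The point it illustrates is that ordering entries by (trajectory ID, segment number) makes all segments of one trajectory a contiguous key range.

```python
import bisect
from typing import Dict, List, Tuple

Key = Tuple[int, int]  # (trajectory id, segment number)


class TrajectoryDirectory:
    """Sorted (tid, sid) keys mapped to the page that stores the segment."""

    def __init__(self) -> None:
        self._keys: List[Key] = []       # kept sorted, like B+-tree leaves
        self._pages: Dict[Key, str] = {}

    def add(self, tid: int, sid: int, page_id: str) -> None:
        key = (tid, sid)
        if key not in self._pages:
            bisect.insort(self._keys, key)
        self._pages[key] = page_id

    def keys(self) -> List[Key]:
        return list(self._keys)

    def delete_trajectory(self, tid: int) -> List[Tuple[Key, str]]:
        """Remove every segment of one trajectory with a single range scan."""
        # (tid,) sorts before every (tid, sid), so these bounds bracket
        # exactly the keys whose first component equals tid.
        lo = bisect.bisect_left(self._keys, (tid,))
        hi = bisect.bisect_left(self._keys, (tid + 1,))
        removed = [(k, self._pages.pop(k)) for k in self._keys[lo:hi]]
        del self._keys[lo:hi]
        return removed


directory = TrajectoryDirectory()
directory.add(1, 1, "p0")
directory.add(2, 1, "p0")
directory.add(1, 2, "p3")
directory.add(3, 1, "p1")
removed = directory.delete_trajectory(1)
```

The returned list tells the caller which pages to visit in order to drop the segments themselves.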
16 Experimental Evaluation Experimental platform and software –Intel Pentium III 600MHz, 384MB main memory, 60GB IBM Deskstar 7200 RPM Ultra ATA/100 disk, Debian Linux version –Software is a system called COMET. Data sets –GSTD –Network data Queries –Time interval query: 3D box with equal normalized widths. –Time slice query: time stamp value and 2D spatial range.
17 Experimental Evaluation –Figure: Effect of Number of Spatial Partitioning Cells, GSTD(1K, 4M), 0.1% Time-interval Query –Figure: Index Sizes, GSTD(1K, X)
18 Experimental Evaluation –Figure: Comparing Insert Performance, GSTD(1K, 4M), 10K Inserts –Figure: Scaling with Number of Segments, GSTD(1K, X), 0.01% Time-interval Query
20 Strong and weak points Strong points –The structure of the paper is clear. –The experiments are nearly complete. –Uses sparse indices. Weak points –No algorithm to contrast with. –Some important techniques are introduced too briefly: Section 3.1 on clustered indices and Section 3.5 on dynamic partitioning.
22 Relation and stimulation to our project Same problem –Very similar data model and query types. Same experimental procedure –We also plan to compare different indexing techniques. Different partitioning structure –We use a static partitioning strategy. –We insert a segment that spans multiple spatial cells into every cell it spans. We also create data pages and use sparse indices.
24 Conclusion SETI is a new indexing method built on an existing index (R-tree). SETI uses sparse temporal indices + spatial partitions. SETI is good at spatial range queries, but may be less suited to queries about a specific object.