Indexing and Range Queries in Spatio-Temporal Databases


Indexing and Range Queries in Spatio-Temporal Databases
Danzhou Liu, Wei Cui, Yun Fan
School of Computer Science, University of Central Florida

Outline
- Introduction
- The R*-tree
- The TPR-tree
- The TPR*-tree
- Experiments
- Conclusions

Introduction
Spatio-temporal databases record moving objects’ geographical locations (and sometimes their shapes) at various timestamps, and support queries that explore their historical and future (predictive) behavior.
Applications: flight control systems, weather forecasting, and mobile computing.
The database stores the motion functions of moving objects. For each object o, its motion function gives its location o(t) at any future time t.
A predictive window query specifies a query region qR and a future time interval qT, and retrieves the set of all objects that will fall in qR during qT.
Our goal: index moving objects so that a predictive window query can be answered with as few disk I/Os as possible.
Examples:
- Find all airplanes that will be over Florida in the next 10 minutes.
- Report all vessels that will enter the United States in the next hour.

Motion Function
We consider linear motion. For each object, the database stores:
- its minimum bounding rectangle (MBR) at the reference time 0;
- its current velocity bounding rectangle (VBR).
Examples: MBR(a) = {2,4,3,4}, VBR(a) = {1,1,1,1}; MBR(c) = {8,9,3,4}, VBR(c) = {-2,0,0,2}.
An update is necessary only when an object’s VBR changes.
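A minimal sketch of the linear motion function and of a predictive window test, in Python. It assumes that the four numbers of an MBR are ordered (x_lo, x_hi, y_lo, y_hi) and that the VBR gives the velocity of each corresponding edge; the slide does not spell out this order. The function names and the query rectangle are hypothetical, chosen only for illustration.

```python
def mbr_at(mbr, vbr, t):
    """MBR of an object at time t: each edge moves at its own velocity."""
    return tuple(p + v * t for p, v in zip(mbr, vbr))


def _solve_leq(a, b, limit, t0, t1):
    """Return the times t in [t0, t1] with a + b*t <= limit, or None."""
    if b == 0:
        return (t0, t1) if a <= limit else None
    bound = (limit - a) / b
    lo, hi = (t0, min(t1, bound)) if b > 0 else (max(t0, bound), t1)
    return (lo, hi) if lo <= hi else None


def intersection_interval(mbr, vbr, q_r, t0, t1):
    """Time span within [t0, t1] during which the moving MBR intersects
    the static query rectangle q_r, or None if it never does."""
    lo, hi = t0, t1
    for d in range(2):  # dimension 0 = x, dimension 1 = y
        p_lo, p_hi = mbr[2 * d], mbr[2 * d + 1]
        v_lo, v_hi = vbr[2 * d], vbr[2 * d + 1]
        q_lo, q_hi = q_r[2 * d], q_r[2 * d + 1]
        # Overlap on this axis needs: lower edge <= q_hi and upper edge >= q_lo.
        for a, b, limit in ((p_lo, v_lo, q_hi), (-p_hi, -v_hi, -q_lo)):
            span = _solve_leq(a, b, limit, lo, hi)
            if span is None:
                return None
            lo, hi = span
    return (lo, hi)


a_mbr, a_vbr = (2, 4, 3, 4), (1, 1, 1, 1)    # object a from the slide
c_mbr, c_vbr = (8, 9, 3, 4), (-2, 0, 0, 2)   # object c from the slide
q_r = (6, 8, 6, 8)                           # hypothetical query region

print(mbr_at(a_mbr, a_vbr, 1))
print(intersection_interval(a_mbr, a_vbr, q_r, 0, 5))
print(intersection_interval(a_mbr, a_vbr, q_r, 0, 1))
```

Because every edge moves linearly, each overlap condition is a linear inequality in t, so the qualifying times form a single interval and no sampling of timestamps is needed.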

R*-tree
The R*-tree aims at minimizing:
- the area of each MBR;
- the perimeter of each MBR;
- the overlap between two MBRs (e.g., N1, N2) in the same node;
- the distance between the centroid of an MBR and that of the node containing it.
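The four quantities listed above can be computed as in the following Python sketch. The rectangle layout (x_lo, x_hi, y_lo, y_hi), the function names, and the sample rectangles are assumptions made for illustration; the sketch shows only the measures themselves, not how the R*-tree weighs them during insertion or splitting.

```python
import math


def area(r):
    return (r[1] - r[0]) * (r[3] - r[2])


def perimeter(r):
    return 2 * ((r[1] - r[0]) + (r[3] - r[2]))


def overlap(r, s):
    """Area shared by two rectangles (0 if they are disjoint)."""
    w = min(r[1], s[1]) - max(r[0], s[0])
    h = min(r[3], s[3]) - max(r[2], s[2])
    return w * h if w > 0 and h > 0 else 0


def centroid_distance(r, s):
    """Distance between the centers of two rectangles."""
    cx_r, cy_r = (r[0] + r[1]) / 2, (r[2] + r[3]) / 2
    cx_s, cy_s = (s[0] + s[1]) / 2, (s[2] + s[3]) / 2
    return math.hypot(cx_r - cx_s, cy_r - cy_s)


entry = (2, 4, 3, 4)    # hypothetical entry MBR
other = (3, 9, 3, 6)    # hypothetical sibling MBR
node = (2, 9, 3, 4)     # hypothetical MBR of the containing node

print(area(entry), perimeter(entry), overlap(entry, other))
print(centroid_distance(entry, node))
```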

R*-tree Insertion

The Time Parameterized R-Tree (TPR-Tree)
Extends the R-tree by introducing the velocity bounding rectangle (VBR) in all entries.
Queries are compared with the conservative MBRs of non-leaf entries.
Example: the VBRs of nodes N1 and N2 are N1v = {-2,1,-2,1} and N2v = {-2,0,-1,2}.
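A rough Python sketch of how a non-leaf entry can bound its children and why the resulting MBR is only conservative. It assumes the layout (lo, hi, lo, hi) for both MBRs and VBRs, and it groups objects a and c from the motion function example into one node purely for illustration; the actual contents of N1 and N2 are not listed in this transcript. The function names are hypothetical.

```python
def bound(rects):
    """Smallest rectangle enclosing all given rectangles: minimum of the
    lower values (even positions), maximum of the upper values (odd positions)."""
    return tuple(
        (min if i % 2 == 0 else max)(r[i] for r in rects) for i in range(4)
    )


def project(rect, vbr, t):
    """Rectangle at time t when every edge moves at its VBR velocity."""
    return tuple(p + v * t for p, v in zip(rect, vbr))


# (MBR at reference time 0, VBR) for objects a and c from the slides.
a = ((2, 4, 3, 4), (1, 1, 1, 1))
c = ((8, 9, 3, 4), (-2, 0, 0, 2))

node_mbr = bound([a[0], c[0]])   # encloses the children at time 0
node_vbr = bound([a[1], c[1]])   # slowest lower edge, fastest upper edge

t = 1
conservative = project(node_mbr, node_vbr, t)
tight = bound([project(m, v, t) for m, v in (a, c)])

print(node_mbr, node_vbr)
print(conservative, tight)
```

At t = 1 the projected node rectangle still encloses both children but is larger than their tight bounding rectangle, which is the slack that tightening during updates tries to recover.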

TPR*-Tree
Our goal: index moving objects so that a predictive window query can be answered with as few disk I/Os as possible.
- A mathematical model that estimates the cost (number of node accesses) of answering a predictive window query using TPR-like structures.
- Application of the model to derive the optimal performance. The TPR-tree is much worse than the optimal structure.
- Examine the algorithms of the TPR-tree, identify their deficiencies, and propose new ones: the TPR*-tree.

TPR deficiency 1: Choosing sub-tree to insert
To insert an entry, the TPR-tree picks the sub-tree incurring the minimum penalty (smallest MBR/VBR enlargement).
This may result in inserting an entry into a bad sub-tree; the problem becomes increasingly serious as time evolves.

TPR* solution: Choose path
Aims at finding the best insertion path globally, namely, among all possible paths.
Observation: we can find this path by accessing only a few more nodes than the TPR-tree algorithm.
Maintain a heap whose entries have the form [path expanded so far, accumulated penalty so far]:
- Initially: [(g),0], [(h),0], [(i),20].
- Visit node g: [(h),0], [(a,g),3], [(i),20], [(b,g),32]. The paths (a,g) and (b,g) are already complete, although nodes a and b have not been visited.
- Visit node h: [(a,g),3], [(d,h),9], [(c,h),17], [(i),20], [(b,g),32]. The algorithm stops now, because the entry at the top of the heap is a complete path.
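The heap-based search can be sketched in Python as a best-first traversal. The penalties are the ones shown on the slides; the tree encoding (dictionaries of child entries, a set of leaf nodes), the function name, and the assumption that penalties are non-negative and accumulate additively are illustrative choices, not details taken from the deck. Paths are written from the top down, e.g. ('g', 'a') for the slide's (a,g).

```python
import heapq


def choose_path(root_entries, children, leaves):
    """Best-first search for the insertion path with the smallest
    accumulated penalty. Returns (path, penalty, visited_nodes)."""
    heap = [(penalty, (name,)) for name, penalty in root_entries]
    heapq.heapify(heap)
    visited = []
    while heap:
        penalty, path = heapq.heappop(heap)
        node = path[-1]
        if node in leaves:
            # A complete path is at the top of the heap. With non-negative
            # penalties, no unexpanded path can end up cheaper, so stop.
            return path, penalty, visited
        visited.append(node)  # one node access
        for child, extra in children[node]:
            heapq.heappush(heap, (penalty + extra, path + (child,)))
    raise ValueError("no complete path found")


# Penalties from the slides. Node i is never expanded, so its
# children are left unspecified.
root_entries = [("g", 0), ("h", 0), ("i", 20)]
children = {
    "g": [("a", 3), ("b", 32)],
    "h": [("c", 17), ("d", 9)],
}
leaves = {"a", "b", "c", "d"}

path, penalty, visited = choose_path(root_entries, children, leaves)
print(path, penalty, visited)
```

On this example the search visits only g and h, never reads i, a, or b, and returns the path through g to a with penalty 3, matching the final heap state on the slide.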

TPR deficiency 2: Which entries to re-insert
When a node overflows, some of its entries (the ones that diverge most from the node centroid) are re-inserted to defer the node split.
The entries chosen by the TPR-tree are very likely to be re-inserted into the same node, so that a node split is still necessary.

TPR* solution: Pick worst
Aims at selecting, for re-insertion, the entries that can most effectively “shrink” the MBR or VBR of the node.
The first step picks an appropriate dimension (either spatial or velocity) based purely on estimation using our cost model (see the paper for details).
The second step sorts the entries on this dimension and decides which entries to remove.
Example: if the axis chosen in the first step is the x-axis, then the sorted list is {b,d,a,c}. Either b or c is removed.

TPR deficiency 3: Tightening MBR in deletion
Entry deletion requires first finding the entry, which accesses many nodes of the tree.
The TPR-tree uses this fact to tighten the MBRs of non-leaf entries.
Assume nodes h and i are accessed before e is found; then the TPR-tree will tighten the MBR of i only (enclosing g and f).

TPR* solution: Active tightening
Tightening more entries for free.
Assume nodes h and i are accessed before e is found; then the TPR*-tree will tighten the MBRs of both h and i.

TPR* solution: Active tightening (Cont.)
Another example: assume the shaded nodes are accessed to find e. Active tightening can tighten the MBRs of n5, n6, n3, and n4, but not those of n1 and n2.

Challenge of Migration
- 3 operating systems: Microsoft Windows, Sun Solaris, Red Hat Fedora Core 1.
- 2 compilers: CL, GCC (2.9.5, 3.3.2).
- Differences in code conversion: how close are the compilers to the standard?
- Library compatibility.

Experiments: Settings (query and tree)
Dataset:
- The MBRs of 50,000 sampled objects are taken from a real spatial dataset, NJ [Tiger].
- Each object is associated with a VBR such that, on each dimension: the velocity extent is zero (i.e., the object does not change its spatial extent during its movement); the velocity value is randomly distributed in the range [0,8]; the velocity can be positive or negative with equal probability.
We compare TPR*-trees with TPR-trees.
- Disk page size = 1k bytes (node capacity = 27 for both trees).
- For each object update, perform a deletion followed by an insertion on each tree.
Each predictive query is a moving rectangle with these parameters:
- qRlen: the length of the query’s MBR;
- qVlen: the length of the query’s VBR;
- qTlen: the number of timestamps covered.
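A small Python sketch of how the velocity part of such a workload could be generated. It assumes a uniform distribution over [0,8] (the slide only says the values are random in that range), a two-dimensional space, and the VBR layout (vx_lo, vx_hi, vy_lo, vy_hi); the function name and the seed are hypothetical, and this is not the generator used by the authors.

```python
import random


def random_vbr(rng, max_speed=8.0):
    """VBR with zero velocity extent: on each dimension both edges share
    one velocity, drawn from [0, max_speed] with a random sign."""
    vbr = []
    for _ in range(2):  # x and y dimensions
        speed = rng.uniform(0.0, max_speed)
        velocity = speed if rng.random() < 0.5 else -speed
        vbr.extend((velocity, velocity))  # lower and upper edge move together
    return tuple(vbr)


rng = random.Random(42)  # fixed seed so that runs are repeatable
workload = [random_vbr(rng) for _ in range(5)]
for vbr in workload:
    print(vbr)
```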

TPR-tree

TPR*-tree

Conclusions
- The TPR-tree combines the idea of conservative MBRs directly with the tree construction algorithms of R*-trees.
- The TPR*-tree improves on it with algorithms that take into account the special features of moving objects.
- A cost model for performance analysis.
- The optimal performance of a “hypothetically best structure”.
- Reduced disk I/Os for predictive queries.

Q&A

Thanks!