Efficient Cost Models for Spatial Queries Using R-Trees


Efficient Cost Models for Spatial Queries Using R-Trees
Reference: Y. Theodoridis, E. Stefanakis, T. Sellis, "Efficient cost models for spatial queries using R-trees," IEEE Transactions on Knowledge and Data Engineering, 2000
Speaker: Kai-Yun Ho
2019/2/25, MCSE LAB

Outline
Introduction
Background
Analytical Cost Models for Spatial Queries
  Selection Queries
  Join Queries
Introducing a Path Buffer
Evaluation of the Cost Models
Conclusions

Introduction
Spatial queries issued by users of a spatial DBMS (SDBMS) usually involve selection (point or range) and join operations.
We present analytical models that estimate the cost of selection and join queries on R-tree-based structures.

Background (1/2)
The processing of any type of spatial query can be accelerated when a spatial index exists.
Selection queries: find all data rectangles that overlap the query window q.
Join queries: find all pairs of rectangles that overlap each other.
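As a point of reference for the two query types above, here is a minimal index-free sketch in Python. The rectangle layout `(xmin, ymin, xmax, ymax)` and the function names are assumptions made for illustration; an R-tree returns the same answers while visiting far fewer rectangles.

```python
from itertools import product
from typing import List, Tuple

# Assumed layout: (xmin, ymin, xmax, ymax), closed on all sides.
Rect = Tuple[float, float, float, float]

def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two closed rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def selection(data: List[Rect], q: Rect) -> List[Rect]:
    """Selection query: all data rectangles that overlap the window q."""
    return [r for r in data if overlaps(r, q)]

def join(r1: List[Rect], r2: List[Rect]) -> List[Tuple[Rect, Rect]]:
    """Join query: all pairs (a, b), a from r1 and b from r2, that overlap."""
    return [(a, b) for a, b in product(r1, r2) if overlaps(a, b)]
```

A point query is the special case where the window q has zero extent, for example `(x, y, x, y)`.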

Background (2/2)
For both operations, the total cost is measured by the total number of page accesses in the R-tree index.
By definition, the number of node accesses is always greater than or equal to the number of actual disk accesses.
Equality holds only when no buffering scheme exists.
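The gap between node accesses and disk accesses can be illustrated with a small counter. This is a sketch under assumptions: the buffer is a plain LRU page buffer (the slides do not name a policy here), nodes are identified by integers, and `count_accesses` is a hypothetical helper name.

```python
from collections import OrderedDict
from typing import Iterable, Tuple

def count_accesses(node_ids: Iterable[int], buffer_pages: int) -> Tuple[int, int]:
    """Return (node_accesses, disk_accesses) for a sequence of node visits.

    Assumes an LRU buffer holding `buffer_pages` nodes; 0 means no buffer.
    """
    buffer: "OrderedDict[int, None]" = OrderedDict()
    na = da = 0
    for node in node_ids:
        na += 1                          # every visit is a node access
        if node in buffer:
            buffer.move_to_end(node)     # hit: refresh recency, no disk read
            continue
        da += 1                          # miss: the page is read from disk
        if buffer_pages > 0:
            buffer[node] = None
            if len(buffer) > buffer_pages:
                buffer.popitem(last=False)   # evict the least recently used
    return na, da
```

With `buffer_pages=0` the two counts coincide, which is the equality case stated above; any positive buffer size can only lower the second count.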

Background: R-tree
[Figure: an example R-tree with root r and rectangles labeled A to G, drawn both as rectangles in the plane and as a tree.]

Analytical Cost Models for Spatial Queries
What is sought? A formula that estimates the average number NA of node accesses using only knowledge about data properties.
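The data properties in question are the cardinality N and the density D of the data set. A minimal sketch of computing both from raw rectangles follows, assuming a unit workspace and taking density as the total rectangle volume divided by the workspace volume (equivalently, the average number of rectangles covering a point). The box layout and the name `data_properties` are assumptions.

```python
from math import prod
from typing import Sequence, Tuple

# Assumed layout: an n-dimensional box as (lows, highs) in the unit workspace.
Box = Tuple[Sequence[float], Sequence[float]]

def data_properties(boxes: Sequence[Box]) -> Tuple[int, float]:
    """Return (N, D): the number of boxes and their density.

    Density is the sum of box volumes; the workspace volume is assumed
    to be 1, so no further normalisation is applied.
    """
    n_rects = len(boxes)
    density = sum(
        prod(hi - lo for lo, hi in zip(lows, highs)) for lows, highs in boxes
    )
    return n_rects, density
```

Neither value depends on how an R-tree over the data was built, which is what lets a cost model based on them run before any index exists.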

For Selection Queries (1/2)
The number of nodes at level l intersected by the query window q.
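The formula for this quantity is not present in the transcript. A common estimate, assuming node rectangles of average extent s placed uniformly in a unit workspace, is that a node intersects a window of extent q with probability equal to the product over dimensions of (s_k + q_k). The sketch below applies that estimate; the clamp to 1 per dimension and the function name are assumptions, not taken from the slides.

```python
from math import prod
from typing import Sequence

def expected_nodes_intersected(
    n_nodes: float,
    node_extent: Sequence[float],
    query_extent: Sequence[float],
) -> float:
    """Expected number of level-l nodes intersected by a query window.

    n_nodes      : number of nodes at the level
    node_extent  : average node rectangle extent per dimension (s_k)
    query_extent : query window extent per dimension (q_k)
    Assumes a unit workspace and uniformly placed node rectangles.
    """
    # Per dimension, the overlap probability is s_k + q_k, capped at 1.
    return n_nodes * prod(
        min(1.0, s + q) for s, q in zip(node_extent, query_extent)
    )
```

Summing this quantity over the levels of the tree, plus one access for the root, gives an estimate of the total node accesses for the query.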

For Selection Queries (2/2)
The estimate is a function of the data properties NR1 and DR1.
[Figure: levels labeled f0, f1, ..., fh-1, combined by summation.]

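For Join Queries
[Equations not recoverable from the transcript. Only fragments survive: bounds on the level index l in terms of the tree height h, and a separate case for the upper levels of the two R-trees.]
Note: for the two trees, the number of accessed nodes is the same, so NA(R1, R2, l1) = NA(R2, R1, l2).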

Introducing a Path Buffer (1/2)
The existence of such a buffer mainly affects the performance of the tree index that plays the role of the query set.

Disk Access (DA)
[Figure: disk accesses for the data set and the query set.]

Introducing a Path Buffer (2/2)
With "query" tree R2 and "data" tree R1, two cases arise:
Propagating R1 down to its leaf level adds no extra cost (disk accesses) to R2, which has already reached its leaf level.
Each propagation of R2 down to its lower levels adds an equal cost to R1.
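The effect of a path buffer on one tree can be sketched as follows, assuming the buffer keeps the most recently visited node at each level of that tree (its latest root-to-leaf path). The `(level, node_id)` visit format and the function name are hypothetical choices for illustration.

```python
from typing import Dict, Iterable, Tuple

def disk_accesses_with_path_buffer(visits: Iterable[Tuple[int, str]]) -> int:
    """Count disk accesses for one tree when a path buffer is present.

    visits : sequence of (level, node_id) node accesses in traversal order
    A visit costs a disk access only if the node is not the one currently
    buffered at its level.
    """
    path: Dict[int, str] = {}   # level -> node currently held in the buffer
    da = 0
    for level, node in visits:
        if path.get(level) != node:
            da += 1             # not on the buffered path: read from disk
            path[level] = node  # the new node replaces the old one
    return da
```

For example, three root-to-leaf traversals through `r`, then `A` or `B`, then a leaf, make 9 node accesses but only 6 disk accesses when the root and a shared internal node are still on the buffered path.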

Evaluation of the Cost Models
Synthetic and real data sets were used.
LBeach data set: 53,143 line segments (stored as rectangles) indicating roads of Long Beach, California.
MGcounty data set: 39,221 line segments (stored as rectangles) indicating roads of Montgomery County, Maryland.
[Figure panels: synthetic random, synthetic skewed, LBeach real data, MGcounty real data.]

Evaluation of the Cost Models
Selection queries on the synthetic random data set.
[Figure: density D = 0.1.]

Evaluation of the Cost Models
Join queries on the synthetic random data set.
[Figure: node and disk accesses; panels for density D = 2 and density D = 1.]

Evaluation of the Cost Models
Performance comparison for selection queries on (a) skewed and (b) real data.
[Figure: panels (a) and (b).]

Evaluation of the Cost Models
Join queries on (a) skewed and (b) real data.
[Figure: panels (a) and (b).]

Conclusions
For query optimization purposes, efficient cost models should also be available in order to make accurate cost estimations under various data distributions.
The proposed cost formulae are functions of data properties only and can therefore be used without any knowledge of the R-tree index properties.
Experimental results on both synthetic and real data sets showed that the proposed analytical model is very accurate.

Q & A