Approximate Query Processing in Spatial Databases Using Raster Signatures. Leonardo Guerreiro Azevedo, Geraldo Zimbrão, Jano Moreira de Souza. Federal University.

Presentation transcript:

Approximate Query Processing in Spatial Databases Using Raster Signatures
Leonardo Guerreiro Azevedo, Geraldo Zimbrão, Jano Moreira de Souza
Federal University of Rio de Janeiro {azevedo,

Presentation plan
– First considerations
– Goals and contributions
– Four-Color Raster Signature (4CRS)
– Proposals of algorithms
– Final considerations

Motivation
There are many cases where a query can take a long time to be processed, for example:
– When processing a huge volume of data that requires a large number of I/O operations (disk access time is still higher than memory access time)
– When processing highly complex queries
– When accessing remote data over a slow network link, or when the data is temporarily unavailable
An exact answer can demand a long time. A fast answer can be more important than an exact response.

Motivation
The challenge becomes bigger in spatial data environments.
[Figure labels: 399,0000 segments; 475,434 segments]

Motivation
The precision of the query can be lessened, and an approximate answer returned to the user:
– Approximate answers can be computed quickly
– The precision remains acceptable

Motivation
There are many approaches in the field of approximate query processing; however, most of them are not suitable for spatial data.
“Research new techniques for approximate query processing that support the uniqueness of spatial data is a major issue in the database field”. (Roddick et al., 2004)

Scenarios and Applications
Decision support systems
– Increasing business competitiveness
– More use of accumulated data
Data mining
– During a drill-down query sequence in ad hoc data mining, earlier queries in the sequence can be used to find out the interesting queries
Data warehouses
– Performance and scalability when accessing very large volumes of data during the analysis process

Scenarios and Applications
Mobile computing
– An approximate answer may be an alternative when the data is not available, or to save storage space

Traditional SDBMS query processing environment
[Diagram: queries are sent to the Spatial DBMS, which returns exact answers (slow); the Spatial DBMS also receives new data (inserts or updates) and deleted data]

SDBMS set-up for providing approximate query answers
[Diagram: queries go to an Approximate Query Processing Engine, which returns a fast answer (approximate answer + confidence interval); the Spatial DBMS returns the exact answer and receives new data (inserts or updates) and deleted data]

Goals
– Execute approximate query processing in spatial databases using raster signatures: the Four-Color Raster Signature (4CRS) (Zimbrao and Souza, 1998)
– Provide fast approximate answers for queries over spatial data

Contributions
Proposals of algorithms for many spatial operations that can be approximately processed using 4CRS:
– Spatial operators returning numbers: area, distance, diameter, perimeter…
– Spatial predicates: equal, different, disjoint, area disjoint, inside, meet, adjacent…
– Operators returning spatial data type values: intersection, plus (union), minus, common border…
– Spatial operators on sets of objects: sum, closest, decompose, overlay, fusion

Contributions
Proposals of algorithms:
– Approximate Area of Polygon
– Distance
– Diameter
– Perimeter and Contour
– Equal and Different
– Disjoint, Area Disjoint, Edge Disjoint
– Inside (Encloses), Edge Inside, Vertex Inside
– Intersects and Intersection
– Overlay
– Adjacent, Border in Common, Common border
– Plus and Sum
– Minus
– Fusion
– Closest
– Decompose

Four-Color Raster Signature (4CRS)
– 4CRS is a raster approximation
– It is an object representation upon a grid of cells
– Grid resolution can be changed: a trade-off between precision and storage requirements

Four-Color Raster Signature (4CRS)
Each cell stores relevant information using few bits. 4CRS has 4 types of cells:

Bit value  Cell type  Description
00         Empty      The cell is not intersected by the polygon
01         Weak       The cell contains an intersection of 50% or less with the polygon
10         Strong     The cell contains an intersection of more than 50% and less than 100% with the polygon
11         Full       The cell is fully occupied by the polygon
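The cell-type table can be read as a small classification rule. The following is a minimal sketch only, not the authors' implementation: the names `CellType`, `classify_cell` and `pack_cells` are hypothetical, and it assumes the fraction of each cell covered by the polygon has already been computed, treating a coverage of exactly 0 as "not intersected".

```python
from enum import IntEnum


class CellType(IntEnum):
    """Two-bit 4CRS cell codes, as listed in the slide's table."""
    EMPTY = 0b00
    WEAK = 0b01
    STRONG = 0b10
    FULL = 0b11


def classify_cell(coverage: float) -> CellType:
    """Map the covered fraction of a cell (0.0 to 1.0) to a cell type.

    Assumption: coverage == 0.0 stands for "cell not intersected".
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be in [0, 1]")
    if coverage == 0.0:
        return CellType.EMPTY
    if coverage <= 0.5:
        return CellType.WEAK      # 50% or less
    if coverage < 1.0:
        return CellType.STRONG    # more than 50%, less than 100%
    return CellType.FULL


def pack_cells(cells) -> int:
    """Pack a sequence of cell types into one integer, two bits per cell."""
    packed = 0
    for cell in cells:
        packed = (packed << 2) | int(cell)
    return packed
```

Packing two bits per cell is what keeps the signature small; a finer grid improves precision at the cost of more cells to store.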

Four-Color Raster Signature (4CRS) - Generation
[Figure: a polygon and its 4CRS]

Approximate Area of Polygon
– Approximate area of polygon and approximate area of polygon within window: based on the expected area of polygon within cell
– Approximate overlapping area of polygon join: based on the expected intersection area of two types of cells

Approximate area of polygon: approximate area of polygon within cell
Expected area (µ) of cell type:
– Empty (E): expected area = zero% → µ = 0
– Weak (W): expected area in (0, 0.50] → µ = 0.25
– Strong (S): expected area in (0.50, 1) → µ = 0.75
– Full (F): expected area = 100% → µ = 1
Grid and polygon are independent from each other.
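With these expected areas, the approximate area of a polygon can be computed from its signature alone, by summing µ over all cells and scaling by the cell area. The sketch below is illustrative: the function name, the representation of a signature as rows of the letters E, W, S, F, and the `cell_area` parameter are assumptions, not part of the slides.

```python
# Expected covered fraction (µ) per cell type, taken from the slide.
EXPECTED_AREA = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}


def approximate_area(signature, cell_area):
    """Estimate a polygon's area from its 4CRS signature.

    signature: iterable of strings, one per grid row, using E/W/S/F.
    cell_area: area of one grid cell, in the units of the result.
    """
    total_mu = sum(EXPECTED_AREA[cell] for row in signature for cell in row)
    return cell_area * total_mu


# Usage: a 3 x 4 signature on a grid whose cells have area 4.0.
example = ["EWWE", "WSFW", "EWWE"]
estimate = approximate_area(example, cell_area=4.0)
```

Restricting the sum to the cells that fall inside a query window gives the "area within window" variant in the same way.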

Approximate overlapping area of polygon join
Expected area of cells overlapping, for each pair of overlapping cell types:
– W × E → µ_W×E
– S × E → µ_S×E
– S × W → µ_S×W
– S × S → µ_S×S

Approximate overlapping area of polygon join
Table of expected area of cells overlapping:

Cell types  Empty  Weak  Strong  Full
Empty       0      0     0       0
Weak
Strong
Full

Approximate answer = cell area × Σ_{i,j} µ_{i×j}
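The sum over overlapping cell pairs can be sketched as follows. Only the Empty row of the expected-overlap table survives in this transcript, so the table is passed in as a parameter. The `independence_table` helper is a stand-in built on the assumption that the two coverages are independent, so that each entry is the product of the per-type µ values; it is not claimed to reproduce the slide's table. The sketch also assumes both signatures use the same grid and are already aligned cell by cell.

```python
# Expected covered fraction per cell type, from the earlier slide.
MU = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}


def independence_table():
    """Stand-in overlap table (assumption): product of per-type µ values."""
    return {(a, b): MU[a] * MU[b] for a in MU for b in MU}


def approximate_overlap_area(sig_a, sig_b, cell_area, table):
    """cell area x sum of expected overlap over aligned cell pairs."""
    total = 0.0
    for row_a, row_b in zip(sig_a, sig_b):
        for type_a, type_b in zip(row_a, row_b):
            total += table[(type_a, type_b)]
    return cell_area * total


# Usage with two aligned 2 x 2 signatures.
estimate = approximate_overlap_area(
    ["WS", "FE"], ["SS", "FW"], cell_area=2.0, table=independence_table()
)
```

Keeping the table as an argument means the published values can be substituted without touching the summation.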

Affinity degree
For some proposed algorithms, it is possible to return an approximate answer evaluating only cell types.
For other algorithms, when evaluating cell types it is also required to compute an approximate value in the interval [0,1] that indicates a true percentage of the response → affinity degree: it is based on the expected area of cells overlapping (Azevedo et al., 2005).
Table of affinity degree:

Cell types  Empty  Weak  Strong  Full
Empty       0      0     0       0
Weak
Strong
Full

Equal
Equal algorithm using 4CRS: if no trivial case occurs, the approximate answer is equal to the sum of affinity degrees divided by the number of comparisons of pairs of objects.
Sum of affinity degrees, over overlaps of equal cell types (E × E, W × W, S × S, F × F): µ E×E = 1 µ W×W = µ S×S = µ F×F = 1
Trivial case (not equal): overlap of different cell types (S × E, S × W, F × S) → result false
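A minimal sketch of the Equal rule follows. Two readings are assumed here: that "comparisons" counts the aligned cell pairs that were compared, and that the signatures are non-empty and share one grid. The affinity degrees for Weak × Weak and Strong × Strong are not legible in this transcript, so the affinity table is a parameter and the 0.5 values in the usage line are placeholders, not values from the source.

```python
def approximate_equal(sig_a, sig_b, affinity):
    """Approximate answer in [0, 1] for 'A equals B' from two signatures.

    affinity: dict mapping a cell type (E/W/S/F) to the affinity degree
    of an overlap of two cells of that same type.
    """
    total = 0.0
    comparisons = 0
    for row_a, row_b in zip(sig_a, sig_b):
        for type_a, type_b in zip(row_a, row_b):
            if type_a != type_b:
                return 0.0  # trivial case: different cell types, not equal
            total += affinity[type_a]
            comparisons += 1
    if comparisons == 0:
        raise ValueError("signatures must contain at least one cell")
    return total / comparisons


# Placeholder affinities: W and S values are hypothetical.
placeholder = {"E": 1.0, "W": 0.5, "S": 0.5, "F": 1.0}
answer = approximate_equal(["EW", "SF"], ["EW", "SF"], placeholder)
```

The early return mirrors the slide's trivial case: a single mismatching pair is enough to answer "false" exactly.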

Different
The Different algorithm is the opposite of the Equal algorithm: its affinity degree is equal to 1 minus the affinity degree used by Equal.
Sum of affinity degrees, over overlaps of equal cell types (E × E, W × W, S × S, F × F): µ E×E = 0 µ W×W = µ S×S = µ F×F = 0
Trivial case (different): overlap of different cell types (S × E, S × W, F × S) → result true
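The complement rule can be sketched in the same style. As before, this is an illustration under assumptions: the function name is hypothetical, signatures are rows of E/W/S/F letters on a shared grid, the result is normalised by the number of compared cell pairs, and the Weak and Strong affinities in the usage line are placeholders because the slide's values are not legible here.

```python
def approximate_different(sig_a, sig_b, equal_affinity):
    """Approximate answer in [0, 1] for 'A is different from B'.

    equal_affinity: affinity degrees used by the Equal algorithm for
    overlaps of two cells of the same type; each is complemented here.
    """
    total = 0.0
    comparisons = 0
    for row_a, row_b in zip(sig_a, sig_b):
        for type_a, type_b in zip(row_a, row_b):
            if type_a != type_b:
                return 1.0  # trivial case: different cell types, result true
            total += 1.0 - equal_affinity[type_a]
            comparisons += 1
    if comparisons == 0:
        raise ValueError("signatures must contain at least one cell")
    return total / comparisons


# Placeholder affinities: W and S values are hypothetical.
placeholder = {"E": 1.0, "W": 0.5, "S": 0.5, "F": 1.0}
answer = approximate_different(["EW", "SF"], ["EW", "SF"], placeholder)
```

When no trivial case occurs, this value is exactly 1 minus the Equal answer for the same pair of signatures.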

Disjoint
Disjoint: two objects are disjoint if they have no portion in common.
– Case I (trivial case): at least one overlap of cell types that implies an intersection → not disjoint (exact answer)
– Case II: only overlaps with an Empty cell (for example E × W, E × S) → disjoint (partial answer); affinity degree += 1
– Case III: weak × weak or weak × strong → disjoint (partial answer); affinity degree += 1 – expected area(type1, type2)
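The three cases can be sketched as a single pass over aligned cell pairs. Several points are assumptions, since the slide's figure did not survive: Case I is taken to be any overlap of a Full cell with a non-empty cell, or Strong × Strong (two regions each covering more than half a cell must intersect); the final answer is normalised by the number of compared pairs, mirroring the Equal algorithm; and the expected-overlap table is a stand-in computed as the product of per-type µ values.

```python
# Expected covered fraction per cell type, from the earlier slide.
MU = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}
# Stand-in expected-overlap table (assumption: independent coverages).
EXPECTED_OVERLAP = {(a, b): MU[a] * MU[b] for a in MU for b in MU}


def approximate_disjoint(sig_a, sig_b, expected_overlap=EXPECTED_OVERLAP):
    """Approximate answer in [0, 1] for 'A and B are disjoint'."""
    total = 0.0
    comparisons = 0
    for row_a, row_b in zip(sig_a, sig_b):
        for pair in zip(row_a, row_b):
            if "E" in pair:
                total += 1.0  # Case II: nothing can overlap in this cell
            elif "F" in pair or pair == ("S", "S"):
                return 0.0    # Case I (assumed pairs): certain intersection
            else:
                # Case III: weak x weak or weak x strong
                total += 1.0 - expected_overlap[pair]
            comparisons += 1
    if comparisons == 0:
        raise ValueError("signatures must contain at least one cell")
    return total / comparisons


answer = approximate_disjoint(["WE", "WS"], ["WW", "SE"])
```

Checking for an Empty cell first matters: Full × Empty contributes to Case II rather than triggering the trivial case.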

Distance
Distance can be estimated from 4CRS signatures by computing the distance among cells corresponding to the polygons’ borders (Weak and Strong cells).
Distance = average of the minimum and maximum distances
[Figure panels (a), (b), (c): minimum distance and maximum distance]
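One way to realise "average of the minimum and maximum distances" is sketched below. The slide's figure is missing, so how the two bounds are obtained is an assumption: for every pair of border cells, the nearest and farthest possible distances between points of the two square cells are computed; the smallest nearest distance is used as the minimum and the smallest farthest distance as the maximum. Cells are given as hypothetical (column, row) grid indices on a shared grid.

```python
import math


def cell_distance_bounds(cell_a, cell_b, cell_size):
    """Nearest and farthest distance between points of two square cells."""
    dc = abs(cell_a[0] - cell_b[0])
    dr = abs(cell_a[1] - cell_b[1])
    nearest = math.hypot(max(dc - 1, 0), max(dr - 1, 0)) * cell_size
    farthest = math.hypot(dc + 1, dr + 1) * cell_size
    return nearest, farthest


def approximate_distance(border_a, border_b, cell_size):
    """Average of the minimum and maximum distance bounds.

    border_a, border_b: non-empty lists of (column, row) indices of the
    Weak and Strong cells of each polygon.
    """
    bounds = [
        cell_distance_bounds(a, b, cell_size)
        for a in border_a
        for b in border_b
    ]
    minimum = min(lo for lo, _ in bounds)
    maximum = min(hi for _, hi in bounds)
    return (minimum + maximum) / 2.0


estimate = approximate_distance([(0, 0), (1, 0)], [(3, 0)], cell_size=1.0)
```

Because every border cell contains part of its polygon, the true distance cannot exceed the smallest farthest distance, which is why it serves as the upper bound in this sketch.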

Conclusions
Goal
– Provide an estimated result in orders of magnitude less time than the time to compute an exact answer, along with a confidence interval for the answer
Proposals
– Use raster approximations for approximate query processing in spatial databases
– Use the 4CRS signature to process queries over polygons, avoiding access to the real data
– Propose many algorithms for approximate processing
– Use the expected area of polygons (Azevedo et al., 2005) to estimate responses

Future work
– Implement and evaluate algorithms involving other kinds of datasets, for example points and polylines, and combinations of them: point × polyline, polyline × polygon and polygon × polyline
– The experimental evaluation is not addressed in this work; it is ongoing work developed on Secondo (Güting et al., 2005), which is an extensible DBMS platform for research prototyping and teaching
