Leonardo Guerreiro Azevedo Geraldo Zimbrão Jano Moreira de Souza Approximate Query Processing in Spatial Databases Using Raster Signatures Federal University.

1 Leonardo Guerreiro Azevedo Geraldo Zimbrão Jano Moreira de Souza Approximate Query Processing in Spatial Databases Using Raster Signatures Federal University of Rio de Janeiro {azevedo, zimbrao,jano}@cos.ufrj.br

2 Presentation plan
– First considerations
– Goals and contributions
– Four-Color Raster Signature (4CRS)
– Proposals of algorithms
– Final considerations

3 Motivation
There are many cases where a query can take a long time to be processed, for example:
– When processing a huge volume of data that requires a large number of I/O operations (disk access time is still higher than memory access time)
– When processing highly complex queries
– When accessing remote data over a slow network link, or when the data is temporarily unavailable
An exact answer can demand a long time.

4 Motivation
In such cases, a fast answer can be more important than an exact response.

5 Motivation
The challenge becomes bigger in spatial data environments.
[Figure labels: 399,0000 segments; 475,434 segments]

6 Motivation
The precision of the query can be lessened, and an approximate answer returned to the user:
– Approximate answers can be computed quickly
– With acceptable precision

7 Motivation
There are many approaches in the field of approximate query processing; however, most of them are not suitable for spatial data.
“Research new techniques for approximate query processing that support the uniqueness of spatial data is a major issue in the database field”. (Roddick et al., 2004)

8 Scenarios and Applications
Decision support systems
– Increasing business competitiveness
– More use of accumulated data
Data mining
– During a drill-down query sequence in ad hoc data mining, earlier queries in the sequence can be used to find out which queries are interesting
Data warehouses
– Performance and scalability when accessing very large volumes of data during the analysis process

9 Scenarios and Applications
Mobile computing
An approximate answer may be an alternative:
– When the data is not available
– To save storage space

10 Traditional SDBMS query processing environment
[Diagram labels: Queries; Spatial DBMS; New data (inserts or updates); Deleted data; Exact answers; Slow]

11 SDBMS set-up for providing approximate query answers
[Diagram labels: Queries; Spatial DBMS; New data (inserts or updates); Deleted data; Approximate Query Processing Engine; Approximate answer + confidence interval; Fast answer; Exact answer]

12 Goals
Execute approximate query processing in spatial databases using a raster signature:
– Four-Color Raster Signature (4CRS) (Zimbrao and Souza, 1998)
Provide fast approximate answers for queries over spatial data.

13 Contributions
Proposals of algorithms for many spatial operations that can be approximately processed using 4CRS:
– Spatial operators returning numbers: area, distance, diameter, perimeter…
– Spatial predicates: equal, different, disjoint, area disjoint, inside, meet, adjacent…
– Operators returning spatial data type values: intersection, plus (union), minus, common border…
– Spatial operators on sets of objects: sum, closest, decompose, overlay, fusion

14 Contributions
Proposals of algorithms:
– Approximate Area of Polygon
– Distance
– Diameter
– Perimeter and Contour
– Equal and Different
– Disjoint, Area Disjoint, Edge Disjoint
– Inside (Encloses), Edge Inside, Vertex Inside
– Intersects and Intersection
– Overlay
– Adjacent, Border in Common, Common border
– Plus and Sum
– Minus
– Fusion
– Closest
– Decompose

15 Four-Color Raster Signature (4CRS)
4CRS is a raster approximation: an object is represented on a grid of cells.
The grid resolution can be changed, trading precision against storage requirements.

16 Four-Color Raster Signature (4CRS)
Each cell stores relevant information using few bits. 4CRS uses 4 types of cells:
Bit value | Cell type | Description
00 | Empty | The cell is not intersected by the polygon
01 | Weak | The cell contains an intersection of 50% or less with the polygon
10 | Strong | The cell contains an intersection of more than 50% and less than 100% with the polygon
11 | Full | The cell is fully occupied by the polygon
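The cell-type table can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: it assumes the fraction of each cell covered by the polygon has already been computed by some other means, and the names `CellType`, `classify` and `pack` are hypothetical. The packing order of cells inside a byte is also an assumption.

```python
from enum import IntEnum


class CellType(IntEnum):
    """The four 4CRS cell types with the 2-bit codes from the table."""
    EMPTY = 0b00
    WEAK = 0b01
    STRONG = 0b10
    FULL = 0b11


def classify(coverage: float) -> CellType:
    """Map the fraction of a cell covered by the polygon to a cell type.

    `coverage` is assumed to be precomputed and to lie in [0, 1].
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be in [0, 1]")
    if coverage == 0.0:
        return CellType.EMPTY
    if coverage <= 0.5:
        return CellType.WEAK
    if coverage < 1.0:
        return CellType.STRONG
    return CellType.FULL


def pack(cells: list[CellType]) -> bytes:
    """Pack cell types at 2 bits per cell, 4 cells per byte.

    The lowest bits of each byte hold the first cell (an assumed layout).
    """
    out = bytearray((len(cells) + 3) // 4)
    for i, cell in enumerate(cells):
        out[i // 4] |= int(cell) << (2 * (i % 4))
    return bytes(out)
```

At 2 bits per cell, a grid of n cells needs about n/4 bytes, which is where the trade-off between grid resolution and storage comes from.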

17 Four-Color Raster Signature (4CRS) - Generation
[Figure: a polygon and its 4CRS]

18 Approximate Area of Polygon
– Approximate area of polygon and approximate area of polygon within window: based on the expected area of polygon within cell
– Approximate overlapping area of polygon join: based on the expected intersection area of two types of cells

19 Approximate area of polygon: expected area (µ) of polygon within cell, by cell type
Cell type | Expected area | µ
Empty (E) | zero | 0
Weak (W) | (0, 0.50] | 0.25
Strong (S) | (0.50, 1) | 0.75
Full (F) | 100% | 1
Grid and polygon are independent from each other.
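As a worked illustration of the expected-area values, the approximate area of a polygon can be sketched as the sum of µ over all cells of its signature, multiplied by the area of one cell. The representation of a signature as a list of strings over the letters E, W, S, F and the function name `approximate_area` are assumptions made for this example only.

```python
# Expected fraction of a cell covered by the polygon, per cell type
# (values taken from the slide: E = 0, W = 0.25, S = 0.75, F = 1).
EXPECTED_AREA = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}


def approximate_area(signature: list[str], cell_area: float) -> float:
    """Estimate the polygon area from its 4CRS signature.

    `signature` is a hypothetical encoding: one string per grid row,
    one letter (E, W, S or F) per cell.
    """
    total = sum(EXPECTED_AREA[cell] for row in signature for cell in row)
    return cell_area * total


signature = [
    "EWWE",
    "WFFS",
    "ESWE",
]
# 4 Weak + 2 Strong + 2 Full cells -> 4*0.25 + 2*0.75 + 2*1 = 4.5 cells
print(approximate_area(signature, cell_area=4.0))  # 18.0
```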

20 Approximate overlapping area of polygon join
Expected area of overlapping cells, one value per pair of cell types, for example:
W × E: µ W×E
S × E: µ S×E
S × W: µ S×W
S × S: µ S×S

21 Approximate overlapping area of polygon join
Table of expected area of cells overlapping
Cell types | Empty | Weak | Strong | Full
Empty | 0 | 0 | 0 | 0
Weak | 0 | 0.0625 | 0.1875 | 0.25
Strong | 0 | 0.1875 | 0.5625 | 0.75
Full | 0 | 0.25 | 0.75 | 1
$$\text{Approximate answer} = \text{cell area} \times \sum_{i,j} \mu_{i \times j}$$
where the sum runs over the pairs of overlapping cells and i, j are their cell types.
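The overlap estimate can be sketched as a table lookup per pair of overlapping cells. The sketch assumes that both signatures are aligned on the same grid and encoded as lists of strings over E, W, S, F; these choices and the name `approximate_overlap_area` are illustrative assumptions. Note that every entry of the table equals the product of the two per-type expected areas (for example 0.25 × 0.75 = 0.1875), so the table is built that way here.

```python
# Per-type expected areas (E = 0, W = 0.25, S = 0.75, F = 1).
MU = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}

# Expected overlapping area for each pair of cell types. The products
# reproduce the table values: W x W = 0.0625, W x S = 0.1875, etc.
OVERLAP = {(a, b): MU[a] * MU[b] for a in MU for b in MU}


def approximate_overlap_area(sig_a: list[str], sig_b: list[str],
                             cell_area: float) -> float:
    """Estimate the overlapping area of two polygons from their signatures.

    Assumes both signatures cover the same grid, cell for cell.
    """
    if len(sig_a) != len(sig_b) or any(
        len(ra) != len(rb) for ra, rb in zip(sig_a, sig_b)
    ):
        raise ValueError("signatures must share the same grid")
    total = sum(
        OVERLAP[a, b]
        for row_a, row_b in zip(sig_a, sig_b)
        for a, b in zip(row_a, row_b)
    )
    return cell_area * total


# W x S + S x S + F x F + E x W = 0.1875 + 0.5625 + 1 + 0 = 1.75 cells
print(approximate_overlap_area(["WS", "FE"], ["SS", "FW"], cell_area=2.0))  # 3.5
```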

22 Affinity degree
For some proposed algorithms, it is possible to return an approximate answer evaluating only cell types.
For other algorithms, when evaluating cell types it is also required to compute an approximate value in the interval [0,1] that indicates a true percentage of the response. This value is the affinity degree, which is based on the expected area of cells overlapping (Azevedo et al., 2005).
Table of affinity degree
Cell types | Empty | Weak | Strong | Full
Empty | 0 | 0 | 0 | 0
Weak | 0 | 0.0625 | 0.1875 | 0.25
Strong | 0 | 0.1875 | 0.5625 | 0.75
Full | 0 | 0.25 | 0.75 | 1

23 Equal
Equal algorithm using 4CRS: if no trivial case occurs, the approximate answer is equal to the sum of affinity degrees divided by the number of comparisons of pairs of objects.
Trivial case (not equal): an overlap of different cell types, for example S × E, S × W or F × S, gives the result false.
Sum of affinity degrees over overlaps of equal cell types:
E × E: µ E×E = 1
W × W: µ W×W = 0.0625
S × S: µ S×S = 0.5625
F × F: µ F×F = 1
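A minimal sketch of the Equal test, under stated assumptions: signatures are lists of strings over E, W, S, F on the same grid, and the "number of comparisons" is taken to be the number of compared cell pairs. The function name `approximate_equal` is hypothetical.

```python
# Affinity degree for overlaps of equal cell types, as listed on the slide.
EQUAL_AFFINITY = {"E": 1.0, "W": 0.0625, "S": 0.5625, "F": 1.0}


def approximate_equal(sig_a: list[str], sig_b: list[str]) -> float:
    """Return a value in [0, 1] estimating how likely two polygons are equal.

    Assumes both signatures cover the same grid, cell for cell.
    """
    pairs = [
        (a, b)
        for row_a, row_b in zip(sig_a, sig_b)
        for a, b in zip(row_a, row_b)
    ]
    if not pairs:
        raise ValueError("signatures must contain at least one cell")
    # Trivial case: any overlap of different cell types means "not equal".
    if any(a != b for a, b in pairs):
        return 0.0
    # Otherwise: sum of affinity degrees divided by the number of
    # comparisons (assumed here to be the number of cell pairs).
    return sum(EQUAL_AFFINITY[a] for a, _ in pairs) / len(pairs)


print(approximate_equal(["EW", "SF"], ["EW", "SF"]))  # 0.65625
print(approximate_equal(["EW", "SF"], ["EW", "SS"]))  # 0.0
```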

24 Different
The Different algorithm is the opposite of the Equal algorithm: each affinity degree is equal to 1 minus the corresponding Equal affinity degree.
Trivial case (different): an overlap of different cell types, for example S × E, S × W or F × S, gives the result true.
Sum of affinity degrees over overlaps of equal cell types:
E × E: µ E×E = 0
W × W: µ W×W = 1 - 0.0625
S × S: µ S×S = 1 - 0.5625
F × F: µ F×F = 0

25 Disjoint
Disjoint: two objects are disjoint if they have no portion in common.
Case I: at least one overlap of W × F, S × S, S × F or F × F. Trivial case: not disjoint (exact answer).
Case II: only overlaps involving an Empty cell (E × W, E × S, E × F). Disjoint (partial answer): affinity degree += 1.
Case III: weak × weak or weak × strong. Disjoint (partial answer): affinity degree += 1 – expected area(type1, type2).
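The three Disjoint cases can be sketched as below. Several points are assumptions rather than statements from the slides: the pairs treated as the trivial "not disjoint" case are W × F, S × F, F × F and S × S (the pairs not covered by Cases II and III), signatures are lists of strings over E, W, S, F on the same grid, and the accumulated affinity degree is normalized by the number of cell pairs, by analogy with the Equal algorithm. The name `approximate_disjoint` is hypothetical.

```python
MU = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}

# Assumed Case I pairs: the two polygons certainly share some area.
CERTAIN_OVERLAP = {
    ("W", "F"), ("F", "W"),
    ("S", "F"), ("F", "S"),
    ("F", "F"), ("S", "S"),
}


def approximate_disjoint(sig_a: list[str], sig_b: list[str]) -> float:
    """Return a value in [0, 1] estimating how likely two polygons are disjoint."""
    pairs = [
        (a, b)
        for row_a, row_b in zip(sig_a, sig_b)
        for a, b in zip(row_a, row_b)
    ]
    if not pairs:
        raise ValueError("signatures must contain at least one cell")
    affinity = 0.0
    for a, b in pairs:
        if (a, b) in CERTAIN_OVERLAP:
            return 0.0                       # Case I: not disjoint (exact)
        if "E" in (a, b):
            affinity += 1.0                  # Case II: nothing can overlap
        else:
            affinity += 1.0 - MU[a] * MU[b]  # Case III: W x W or W x S
    # Normalizing by the number of pairs is an assumption of this sketch.
    return affinity / len(pairs)


# E x W -> 1, W x W -> 0.9375, W x S -> 0.8125, S x E -> 1
print(approximate_disjoint(["EW", "WS"], ["WW", "SE"]))  # 0.9375
```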

26 Distance
Distance can be estimated from 4CRS signatures by computing the distance among the cells corresponding to the polygons’ borders (Weak and Strong cells).
Distance = average of the minimum and maximum distances...
[Figure panels (a), (b), (c): minimum distance; maximum distance]
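One way to read "average of the minimum and maximum distances" is sketched below. The details are assumptions, not taken from the slides: both signatures share one grid of square cells with side `cell_size`, the minimum distance of a cell pair is the gap between the two squares, the maximum is the distance between their farthest corners, and the smallest value of each over all pairs of border cells is used. The names `border_cells` and `approximate_distance` are hypothetical.

```python
import math
from itertools import product

BORDER_TYPES = {"W", "S"}  # cells crossed by the polygon border


def border_cells(signature: list[str]) -> list[tuple[int, int]]:
    """Return (column, row) indices of the Weak and Strong cells."""
    return [
        (col, row)
        for row, line in enumerate(signature)
        for col, cell in enumerate(line)
        if cell in BORDER_TYPES
    ]


def approximate_distance(sig_a: list[str], sig_b: list[str],
                         cell_size: float) -> float:
    """Estimate the distance between two polygons from their border cells."""
    cells_a, cells_b = border_cells(sig_a), border_cells(sig_b)
    if not cells_a or not cells_b:
        raise ValueError("each signature needs at least one border cell")
    lower = upper = math.inf
    for (ax, ay), (bx, by) in product(cells_a, cells_b):
        dx, dy = abs(ax - bx), abs(ay - by)
        # Smallest possible distance: gap between the two squares.
        near = math.hypot(max(dx - 1, 0), max(dy - 1, 0)) * cell_size
        # Largest possible distance: farthest pair of corners.
        far = math.hypot(dx + 1, dy + 1) * cell_size
        lower = min(lower, near)
        upper = min(upper, far)
    return (lower + upper) / 2


# Border cells three columns apart on the same row of a unit grid:
# minimum distance 2, maximum distance sqrt(4^2 + 1^2).
estimate = approximate_distance(["WEEE"], ["EEES"], cell_size=1.0)
```

Under these assumptions the true distance lies between `lower` and `upper`, so the midpoint is off by at most half the width of that interval.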

27 Conclusions
Goal
– Provide an estimated result in orders of magnitude less time than the time to compute an exact answer, along with a confidence interval for the answer.
Proposals
– Use raster approximations for approximate query processing in spatial databases
– Use the 4CRS signature to process queries over polygons, avoiding access to the real data
– Propose many algorithms for approximate processing
– Use the expected area of polygons (Azevedo et al., 2005) to estimate responses

28 Future work
– Implement and evaluate algorithms involving other kinds of datasets, for example, points and polylines, and combinations of them: point × polyline, polyline × polygon and polygon × polyline.
– The experimental evaluation is not addressed in this work; it is ongoing work developed on Secondo (Güting et al., 2005), which is an extensible DBMS platform for research prototyping and teaching.

