Approximate Spatial Query Processing Using Raster Signatures Leonardo Guerreiro Azevedo, Rodrigo Salvador Monteiro, Geraldo Zimbrão & Jano Moreira de Souza.

Presentation transcript:

1 Approximate Spatial Query Processing Using Raster Signatures
Leonardo Guerreiro Azevedo, Rodrigo Salvador Monteiro, Geraldo Zimbrão & Jano Moreira de Souza
Coppe – Graduate School of Engineering
Institute of Mathematics – Computer Science Department
Federal University of Rio de Janeiro

2 Common Spatial Queries
- Area of polygon
- Area of polygon within window
- Spatial joins
  - polygon × polygon, polygon × polyline & polyline × polyline
- Distance
- Buffer
- Perimeter
- Topological queries

3 Common Spatial Queries
- Approximate area of polygon
- Approximate area of polygon within window
- Approximate spatial joins
  - polygon × polygon, polygon × polyline & polyline × polyline
- Approximate distance
- Approximate buffer
- Approximate perimeter
- Approximate topological queries

4 Approximate Answers to Spatial Queries
- What is an approximate answer?
  - If the exact result is a number, the approximate result will be a number and a confidence interval
  - If not, the graphical display of approximate answers is something like a fuzzy map

5 Motivation
- The increase of storage capacity
- The decrease of hardware costs
- Disk access time is still high
- Complex queries
- Data stored in devices that are not online
- A query may take minutes or hours to be processed.

6 Motivation
- An approximate answer may be enough
- Exact answers are themselves approximations
- Approximate answers can be computed quickly
- Spatial query processing:
  - Scale
  - Quality
  - Round-off errors

7 Scenarios and Applications
- Decision Support System
  - Increasing business competitiveness
  - More use of accumulated data
- Data mining
  - During a drill-down query sequence in ad hoc data mining
    - Earlier queries in a sequence can be used to find out which queries are interesting.
- Data warehouse
  - Performance and scalability when accessing very large volumes of data during the analysis process

8 Scenarios and Applications
- Query optimization
  - To define the most efficient access plan for a given query
- Distributed data recording and warehousing environments
  - Data may be remote, and may even be unavailable
  - Old data can be discarded to make room for new data, which makes it impossible to answer queries on the deleted information.

9 Scenarios and Applications
- Mobile computing
  - An approximate answer may be an alternative:
    - When the data is not available
    - To save storage space

10 A framework for approximate query processing
[Diagram labels: Database; New data; Data environment set-up for providing approximate answers; Approx. Query Engine; Queries; Responses]

11 Four Color Raster Signature (4CRS)
- Raster approximation (VLDB98)
  - Object representation upon a grid of cells
  - Each cell stores relevant information using few bits
- Grid resolution can be changed
  - Precision × storage requirements
- 4 types of cells

Bit value | Cell type | Description
00 | Empty | The cell is not intersected by the polygon
01 | Weak | The cell contains an intersection of 50% or less with the polygon
10 | Strong | The cell contains an intersection of more than 50% and less than 100% with the polygon
11 | Full | The cell is fully occupied by the polygon
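
The cell-type table can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: the names `Cell`, `classify`, and `pack` are hypothetical, and the sketch assumes the fraction of each cell covered by the polygon has already been computed elsewhere.

```python
from enum import IntEnum
from typing import Iterable


class Cell(IntEnum):
    """2-bit cell codes taken from the 4CRS table."""
    EMPTY = 0b00
    WEAK = 0b01
    STRONG = 0b10
    FULL = 0b11


def classify(coverage: float) -> Cell:
    """Map the covered fraction of a cell (0.0 to 1.0) to its 4CRS cell type."""
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be within [0, 1]")
    if coverage == 0.0:
        return Cell.EMPTY
    if coverage <= 0.5:
        return Cell.WEAK
    if coverage < 1.0:
        return Cell.STRONG
    return Cell.FULL


def pack(cells: Iterable[Cell]) -> bytes:
    """Store four 2-bit codes per byte (hypothetical layout: first cell in the low bits)."""
    cells = list(cells)
    out = bytearray((len(cells) + 3) // 4)
    for i, cell in enumerate(cells):
        out[i // 4] |= int(cell) << (2 * (i % 4))
    return bytes(out)
```

With this layout a grid of n cells needs about n/4 bytes, which is the sense in which each cell "stores relevant information using few bits".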

12 Construction of Signatures
[Figure: a polygon and its 4CRS approximation]
24th VLDB Conference, New York, USA, 1998

13 Polygon approximate area
- The algorithm is based on the sum of the expected area of each grid cell
  - Empty cells: 0%
  - Full cells: 100%
  - Weak and Strong cells, assuming a uniform distribution
    - Weak cells: interval (0, 0.5], mean 0.25
    - Strong cells: interval (0.5, 1), mean 0.75
- Count the number of cells of each type in the polygon's 4CRS, and multiply each count by the expected area of a cell of that type.
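
The counting rule on this slide fits in a few lines. The sketch below is only an illustration: the function name, the dictionary keys, and the sample counts are made up, and `cell_area` is assumed to be the area of one grid cell in whatever unit the data set uses.

```python
# Expected covered fraction per cell type, as given on the slide.
EXPECTED_FRACTION = {"empty": 0.0, "weak": 0.25, "strong": 0.75, "full": 1.0}


def approximate_area(counts: dict, cell_area: float) -> float:
    """Sum the expected area of every cell in a 4CRS signature.

    counts maps a cell type name to the number of cells of that type.
    """
    expected_cells = sum(EXPECTED_FRACTION[kind] * n for kind, n in counts.items())
    return expected_cells * cell_area


# Illustrative (made-up) signature: 4 weak, 2 strong, and 10 full cells of area 2.0 each.
print(approximate_area({"weak": 4, "strong": 2, "full": 10}, cell_area=2.0))  # 25.0
```

Only the four counts are needed, so the estimate can be computed without reading the polygon's exact geometry.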

14 Confidence interval
- A measure of answer accuracy
- The polygon area inside a weak or strong cell is assumed to be uniformly distributed.
  - Weak cells: [formula not in transcript]
  - Strong cells: [formula not in transcript]
- Using the Central Limit Theorem: confidence interval
  - 95%: [formula not in transcript]
  - 99%: [formula not in transcript]
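
The interval formulas themselves are not part of this transcript. One reading that follows from the uniformity assumption is sketched below. It is an assumption, not the authors' stated formula. Here n_w and n_s are the numbers of weak and strong cells, z is the standard normal quantile, and areas are in units of one cell.

```latex
% Assumption: the covered fraction of a weak cell is uniform on (0, 0.5],
% and that of a strong cell is uniform on (0.5, 1).
\mu_w = 0.25, \qquad \mu_s = 0.75, \qquad
\sigma_w^2 = \sigma_s^2 = \frac{(0.5)^2}{12} = \frac{1}{48} \approx 0.020833

% Central Limit Theorem applied to the sum over all cells of one type:
\text{weak: } 0.25\, n_w \pm z \sqrt{\frac{n_w}{48}}, \qquad
\text{strong: } 0.75\, n_s \pm z \sqrt{\frac{n_s}{48}}

% z = 1.96 for a 95% interval, z = 2.576 for a 99% interval
```

Adding the weak and strong half-widths gives the half-width of the total. This choice reproduces the ±1.15% figure in the example on the next slide.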

15 Confidence interval (example)
- Query results
  - # weak cells: 100
  - # strong cells: 120
  - # full cells: 400
- Confidence interval: 95%
  - Weak cells: [value not in transcript]
  - Strong cells: [value not in transcript]
  - Full cells: 400 (full cells have the exact area!)
  - Total: [value not in transcript]
- Error between -1.15% and 1.15%
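
The per-type values for this example are missing from the transcript, but the final ±1.15% can be reproduced. The sketch below is a reconstruction under stated assumptions, not the authors' code: covered fractions are uniform within weak and strong cells (variance 1/48 each), the Central Limit Theorem is applied per cell type, and the two half-widths are added. The function name `area_interval` is hypothetical and areas are in units of one cell.

```python
import math

# Variance of a uniform distribution over an interval of width 0.5: 0.5**2 / 12 = 1/48.
UNIFORM_VARIANCE = 0.5 ** 2 / 12
# Standard normal quantiles for the two confidence levels named on the slides.
Z = {0.95: 1.96, 0.99: 2.576}


def area_interval(n_weak: int, n_strong: int, n_full: int, level: float = 0.95):
    """Return (estimate, half_width) of the approximate area, in cell units."""
    z = Z[level]
    estimate = 0.25 * n_weak + 0.75 * n_strong + n_full
    half_weak = z * math.sqrt(n_weak * UNIFORM_VARIANCE)
    half_strong = z * math.sqrt(n_strong * UNIFORM_VARIANCE)
    # Assumption: the half-widths are added; full cells contribute no uncertainty.
    return estimate, half_weak + half_strong


estimate, half_width = area_interval(100, 120, 400)
print(estimate)                                   # 515.0
print(round(100 * half_width / estimate, 2))      # 1.15 (percent)
```

Under these assumptions the weak cells contribute about 25 ± 2.83, the strong cells about 90 ± 3.10, and the total is about 515 ± 5.93. Pooling the two variances into a single interval would give a narrower bound than the one on the slide.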

16 Cell Area Distribution
[Histograms: Weak, Strong]
- Comparable to a uniform distribution
- Mean: 0.246453 (uniform: 0.25)
- Variance: 0.021369 (uniform: 0.020833)

17 Example
- # empty cells: 55
- # weak cells: 27
- # strong cells: 26
- # full cells: 79
- Approximate area: ( Σ weak * 0.25 + Σ strong * 0.75 + Σ full ) * cellArea
- Exact area: 106.40
- Appr. area: 105.25
- Error: 1.07%
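
Substituting the cell counts into the formula gives the approximate area. The slide does not state the cell area, so the working below assumes cellArea is 1 unit.

```latex
A \approx (27 \times 0.25 + 26 \times 0.75 + 79) \times \text{cellArea}
  = (6.75 + 19.5 + 79) \times \text{cellArea}
  = 105.25 \times \text{cellArea}
```

With cellArea = 1 this matches the reported approximate area of 105.25. The 55 empty cells contribute nothing.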

18 Approximate area of polygon × window intersection
- This algorithm is similar to the approximate polygon area algorithm
- There are two kinds of cell overlap:
  - The cell may be completely contained by the window
  - The cell may be partially contained by the window
    - Its contribution is proportional to its overlapping area
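
A possible sketch of the window variant is shown below. It is not the authors' code. It assumes the signature is a row-major grid of 2-bit codes with a known origin and square cells, and that the window is an axis-aligned rectangle. The name `window_area` and the grid layout are hypothetical.

```python
# Expected covered fraction for each 2-bit code: empty, weak, strong, full.
EXPECTED = {0b00: 0.0, 0b01: 0.25, 0b10: 0.75, 0b11: 1.0}


def window_area(grid, origin, cell_size, window):
    """Approximate the polygon area inside window = (xmin, ymin, xmax, ymax).

    grid[row][col] holds the cell code; the cell's lower-left corner is at
    (origin_x + col * cell_size, origin_y + row * cell_size).
    """
    x0, y0 = origin
    wx1, wy1, wx2, wy2 = window
    total = 0.0
    for row, codes in enumerate(grid):
        for col, code in enumerate(codes):
            cx = x0 + col * cell_size
            cy = y0 + row * cell_size
            # Overlap of the cell with the window: the whole cell when it is
            # completely contained, a smaller rectangle when it is partial.
            overlap_x = max(0.0, min(cx + cell_size, wx2) - max(cx, wx1))
            overlap_y = max(0.0, min(cy + cell_size, wy2) - max(cy, wy1))
            total += EXPECTED[code] * overlap_x * overlap_y
    return total


# Illustrative 2x2 signature: full and strong cells in row 0, weak and empty in row 1.
grid = [[0b11, 0b10], [0b01, 0b00]]
print(window_area(grid, (0.0, 0.0), 1.0, (0.0, 0.0, 1.5, 1.0)))  # 1.375
```

In the example the window contains the full cell entirely (1.0) and half of the strong cell (0.75 × 0.5 = 0.375). A single overlap computation covers both kinds of cell overlap, because a completely contained cell simply overlaps with its whole area.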

19 Experimental tests
- Computer: PC Pentium IV 1.8 GHz, 512 MB RAM
- Page size: 2,048 bytes
- Target: to evaluate the use of 4CRS for approximate query processing against exact query processing with respect to the following aspects:
  - Response time
  - Storage requirements
  - Accuracy
- The algorithms tested were:
  - Polygon approximate area
  - Approximate area of polygon × window intersection
    - 100 random windows for each data set (different sizes and positions)

20 Experimental tests
- Use of R*-trees in order to reduce the search space.
[Diagram, exact processing: Relation A, Relation B → SAMs (Step 1) → candidate pairs → exact geometry processor (Step 2) → response set]
[Diagram, approximate processing: Relation A, Relation B → SAMs (Step 1) → candidate pairs → approximate query processing over 4CRS (Step 2) → response set]
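
The two pipelines in the diagrams differ only in the second step, which a sketch can make explicit. Everything below is hypothetical: a linear scan over bounding rectangles stands in for the R*-tree search, `Entry` and `two_step_query` are illustrative names, and the `refine` callback is where an exact geometry processor or a 4CRS-based estimator would be plugged in.

```python
from typing import Callable, NamedTuple, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


class Entry(NamedTuple):
    object_id: int
    mbr: Rect              # minimum bounding rectangle held by the access method
    signature_area: float  # approximate area precomputed from the object's 4CRS


def mbr_intersects(a: Rect, b: Rect) -> bool:
    """True when two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def two_step_query(entries, window: Rect, refine: Callable[[Entry, Rect], float]):
    # Step 1: filter by bounding rectangle (linear scan in place of an R*-tree).
    candidates = [e for e in entries if mbr_intersects(e.mbr, window)]
    # Step 2: refine each candidate, exactly or approximately.
    return {e.object_id: refine(e, window) for e in candidates}


def approximate_refine(entry: Entry, window: Rect) -> float:
    """Approximate step 2: answer from the signature, never touching exact geometry."""
    return entry.signature_area


entries = [Entry(1, (0, 0, 2, 2), 3.5), Entry(2, (10, 10, 12, 12), 1.0)]
print(two_step_query(entries, (1, 1, 3, 3), approximate_refine))  # {1: 3.5}
```

Because step 1 is shared, swapping `refine` is the only change needed to move from exact to approximate processing in this sketch.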

21 Experimental tests
- The real polygon data sets used in the experiments consist of township boundaries, census block groups, topography, geologic map and hydrographic map from Iowa (USA), and Brazilian municipalities.

22 Approximate polygon area

23 Approximate polygon area

24 Approximate polygon × window area

25 Approximate polygon × window area

26 Conclusion
- The experimental results demonstrated the efficiency of using 4CRS for approximate query processing.
- Storage requirements
  - 4CRS has an average of 3.75% of the real data set size
- Accuracy
  - Approximate area: average error of 2.62%
  - Window query approximate area: average error of 1%
- Response time
  - Approximate area: average 28.41%
  - Window query approximate area: average 7.22%
- Disk access
  - Approximate area: average 1.90%
  - Window query approximate area: average 7.04%

27 Future work
- Algorithms for the other operations
  - An algorithm for the approximate area of polygon × polygon intersection is being evaluated
- Use of approximations for mobile computing

