1
ACM GIS 2007
An Interactive Framework for Raster Data Spatial Joins
Wan Bae (Computer Science, University of Denver)
Petr Vojtěchovský (Mathematics, University of Denver)
Shayma Alkobaisi (Computer Science, University of Denver)
Scott T. Leutenegger (Computer Science, University of Denver)
Seon Ho Kim (Computer Science, University of Denver)
2
Outline
Introduction
Issues and Problems
Probabilistic Joins
Sampling Joins
Interactive Framework
Experiments
Conclusion
3
Geographic Information Systems
Collect, store, and retrieve data
Integration of georeferenced data
Spatial queries
Complex spatial data analysis and modeling for decision support
[Figure: users reach the GIS through a Web application; the GIS draws on several data sources]
4
Raster Data Model
[Figure: (a) Satellite Image, (b) Raster Model]
Accounts for a large portion of georeferenced data
Simple data structure, but requires more storage space
Continuously changing data
5
Continuously Changing Data
6
Raster Data Spatial Joins
[Figure: panels (a) and (b)]
Example query: “Find the regions where rainfall rate is greater than 1.0 and wind speed is greater than 50”
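The example query can be read as a cell-by-cell join of two aligned rasters. The following is a minimal sketch under that reading, not the authors' implementation: the function name `raster_join`, the list-of-lists grid layout, and the parameter names are assumptions made for illustration.

```python
def raster_join(rain, wind, rain_min=1.0, wind_min=50.0):
    """Return a boolean grid marking cells that satisfy both predicates.

    Assumes `rain` and `wind` are equally sized 2-D lists covering the
    same region at the same resolution (an assumption of this sketch).
    """
    return [
        [r > rain_min and w > wind_min for r, w in zip(rain_row, wind_row)]
        for rain_row, wind_row in zip(rain, wind)
    ]


# Tiny usage example with made-up values.
rain = [[0.5, 1.2], [2.0, 0.9]]
wind = [[60, 55], [40, 70]]
result = raster_join(rain, wind)
```

An exact join of this kind touches every cell, which is the cost the approximations in the following slides try to avoid.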
7
Issues for User-driven Data Exploration
Fast query response time
– Exact answers are time consuming because the data sets are large
– GIS decision support queries are time intensive
– Optimization and approximation techniques for raster data joins are lacking
Interactive query processing
– Traditional GIS offers little interactivity
– Users have no control over query processing
Visualization increases the utility of the GIS
8
Our Approach
For faster and more effective decision support queries:
Fast approximation of query results
1. Probabilistic join
2. Sampling join
Visualize intermediate results
1. “Big picture” of the query result
2. Partial results: non-blocking joins
Allow users to control query processing
9
Our Approximations
1. What is the probability that R joins S?
R (8/16), S (9/16): they must join!
2. Can we use the result of a subset of data cell joins for the final answer?
1 join / 2 cells, ? / 16 cells
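The "they must join" case looks like a pigeonhole argument: if 8 of 16 cells qualify in R and 9 of 16 qualify in S, the two sets of cells cannot be disjoint. A small sketch of that check follows; the function names are hypothetical, and it assumes a join means at least one cell position qualifies in both data sets.

```python
def must_join(r_count, s_count, total_cells):
    """True when the qualifying cells of R and S are forced to overlap."""
    return r_count + s_count > total_cells


def min_overlap(r_count, s_count, total_cells):
    """Smallest possible number of cells that qualify in both data sets."""
    return max(0, r_count + s_count - total_cells)
```

With the slide's numbers, `must_join(8, 9, 16)` is true and at least one cell is shared; with 8 and 8 the overlap is no longer guaranteed, which is where a join probability is needed.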
10
Augmented Quad-trees
Both data sets are indexed using quad-trees
[Figure: quadrants labeled NW, NE, SW, SE]
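The slides do not say what the augmentation stores. As a sketch only, assume each quad-tree node keeps the number of qualifying cells in its region, which matches the fractions such as 8/16 used earlier. The class `QuadNode` and function `build` are illustrative names, and the grid is assumed to be square with a power-of-two side.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class QuadNode:
    size: int    # side length of the square region, in cells
    count: int   # qualifying cells in the region (assumed augmentation)
    children: List["QuadNode"] = field(default_factory=list)  # NW, NE, SW, SE


def build(grid, row=0, col=0, size=None):
    """Build a count-augmented quad-tree over a boolean grid."""
    if size is None:
        size = len(grid)
    if size == 1:
        return QuadNode(1, int(bool(grid[row][col])))
    half = size // 2
    kids = [
        build(grid, row, col, half),                # NW
        build(grid, row, col + half, half),         # NE
        build(grid, row + half, col, half),         # SW
        build(grid, row + half, col + half, half),  # SE
    ]
    total = sum(k.count for k in kids)
    # Collapse regions that are entirely empty or entirely full.
    if total == 0 or total == size * size:
        return QuadNode(size, total)
    return QuadNode(size, total, kids)
```

With counts at every level, a join can compare two nodes by their fill ratios `count / size**2` without descending to individual cells.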
11
Join Probability
Let X = [0, 1], and let m and n be randomly chosen intervals in X of lengths a and b.
What is the probability p that m and n intersect?
p(m ∩ n ≠ ∅) = ?
12
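1-d Join Probability
[Figure: intervals m = [a1, a2] of length a and n = [b1, b2] of length b on [0, 1], with the overlapped part marked; a second diagram carries the labels x, x+b, b, 1-b, a, 1-a, p, and q]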
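One way to make the 1-d question concrete is to assume that each interval's left endpoint is drawn uniformly so that the interval fits inside [0, 1]. Under that assumption (which the transcript does not confirm), the two intervals miss each other only when one lies entirely to the left of the other, giving p = 1 - (1 - a - b)^2 / ((1 - a)(1 - b)) for a + b < 1 and p = 1 otherwise. The sketch below computes this and checks it by simulation; it is not taken from the slides and is not claimed to reproduce the 2-d look-up table.

```python
import random


def join_probability_1d(a, b):
    """Closed form under the uniform-placement assumption stated above."""
    if a + b >= 1.0:
        return 1.0  # the intervals cannot fit side by side
    gap = 1.0 - a - b
    return 1.0 - gap * gap / ((1.0 - a) * (1.0 - b))


def simulate(a, b, trials=200_000, seed=1):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0.0, 1.0 - a)  # left end of m
        y = rng.uniform(0.0, 1.0 - b)  # left end of n
        if x <= y + b and y <= x + a:  # [x, x+a] meets [y, y+b]
            hits += 1
    return hits / trials
```

Comparing `join_probability_1d(0.2, 0.3)` with `simulate(0.2, 0.3)` is a quick sanity check of the derivation.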
13
2-d Join Probability
[Figure: regions m and n in the unit square, with labels a, a1, a2 and b, b1, b2]
14
Look-up table for 2-d Join Probability
P      0.1      0.2      0.3      0.4      0.5
0.1    0.4636   0.6228   0.7414   0.8317   0.8997
0.2    0.6228   0.7683   0.8640   0.9277   0.9681
0.3    0.7414   0.8640   0.9343   0.9738   0.9930
0.4    0.8317   0.9277   0.9738   0.9937   0.9995
0.5    0.8997   0.9681   0.9930   0.9995   1.0
15
Probabilistic Join (PJ)
p(, )
16
Probabilistic Join Result
[Figure: (a) data set Q (65536 × 65536), (b) data set S (65536 × 65536), (c) 2nd level joins, (d) 3rd level joins, (e) 4th level joins]
17
Incremental Stratified Sampling Join (ISSJ)
Uses stratified random sampling over the quad-trees of two data sets R and S
Data randomization: acceptance/rejection method
1. Sampling step: sample data from the outer data set R
2. Spatial joining step: join each sample with the corresponding data cell in the inner data set S
3. Refining step: update the running estimates and confidence intervals
4. Visualization: display partial results (actual join results)
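The four steps above can be sketched as a generator that reports after every sampling round. This is an illustration under several assumptions, not the authors' algorithm: the strata are simply the four top-level quadrants, cells are drawn by shuffling rather than by acceptance/rejection, the interval uses a normal approximation with a finite population correction, and the name `issj` and its parameters are made up.

```python
import math
import random


def issj(R, S, per_round=4, z=1.96, seed=0):
    """Yield (estimated join count, CI half-width, partial result) per round.

    R and S are equally sized square boolean grids with an even side length.
    """
    rng = random.Random(seed)
    half = len(R) // 2
    strata = []
    for r0 in (0, half):
        for c0 in (0, half):
            cells = [(r, c) for r in range(r0, r0 + half)
                     for c in range(c0, c0 + half)]
            rng.shuffle(cells)  # random sampling order within the stratum
            strata.append(cells)
    seen = [0] * len(strata)   # cells sampled so far, per stratum
    hits = [0] * len(strata)   # joins found so far, per stratum
    partial = []               # actual join results found so far
    while any(seen[h] < len(cells) for h, cells in enumerate(strata)):
        for h, cells in enumerate(strata):
            batch = cells[seen[h]:seen[h] + per_round]   # 1. sampling step
            for r, c in batch:
                if R[r][c] and S[r][c]:                  # 2. joining step
                    hits[h] += 1
                    partial.append((r, c))
            seen[h] += len(batch)
        estimate, variance = 0.0, 0.0
        for h, cells in enumerate(strata):               # 3. refining step
            n_total = len(cells)
            p = hits[h] / seen[h]
            estimate += n_total * p
            fpc = 1.0 - seen[h] / n_total  # shrinks to 0 once all cells are read
            variance += n_total ** 2 * fpc * p * (1.0 - p) / seen[h]
        yield estimate, z * math.sqrt(variance), list(partial)  # 4. display
```

A caller can stop iterating as soon as the half-width is small enough, which is how a user would trade accuracy for response time in an interactive setting.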
18
Stratified Random Sampling
[Figure: four strata, ST 1 to ST 4, shown twice; the values 0, 2, 2, 1 appear beside them]
19
Estimates and Confidence Interval
Population proportion: the fraction of the population, estimated from the sample, that has the property of interest
Estimated value: the statistic computed from the sample using the population proportion
Confidence interval: a range of values that contains the population parameter with a specified probability
Confidence level: that specified probability
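As a worked example of these definitions, the sketch below computes a normal-approximation (Wald) interval for a proportion. The slides do not state which interval formula was used, so this choice, the function name `proportion_interval`, and the default z = 1.96 for a 95% confidence level are assumptions.

```python
import math


def proportion_interval(hits, n, z=1.96):
    """Return (low, high) bounds for the proportion hits / n.

    Uses p ± z * sqrt(p * (1 - p) / n), clipped to [0, 1].
    """
    p = hits / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# 50 joins found in 100 sampled cells.
low, high = proportion_interval(50, 100)
```

Multiplying the bounds by the total number of cells turns the interval on the proportion into an interval on the estimated number of joins.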
20
Incremental Sampling Join Result
[Figure: (a) estimated result, (b) partial result; a table with columns state, airports, and confidence interval; states IA, NE, WI, CO, KS, MI; values 13, 22, 19, 15, 11, 8; the display also shows 0.05, 95, and “10% done”]
21
Interactive Join Framework
22
Experiments
PJ and ISSJ compared to the full quad-tree join
Confidence level set to 95% in ISSJ
Varied buffer size and data set size
Data sets:
– Synthetic: U E, E U, U U (65536 × 65536 and 262144 × 262144)
– Real: 6 data sets of mineral resources for the states of AZ, CO, OR, and WY from the U.S. Geological Survey (65536 × 65536)
23
Actual joins vs. 2-d PJ
sample size   actual joins   2-d (error)
5%            54             48 (0.1060)
10%           109            99 (0.0917)
20%           218            197 (0.0963)
50%           545            494 (0.0936)
24
Accuracy of Estimates of ISSJ
[Figure: estimates vs. exact value for real data sets; axis label: number of processed cells]
25
Time for Confidence Interval of ISSJ
[Figure: confidence interval and I/Os for real data sets; series: sampling join, full quad-tree join]
26
ISSJ vs. PJ vs. Actual joins
[Figure: (a) ISSJ with 10% CI, (b) ISSJ with 5% CI, (c) actual join, (d) PJ]
27
Time for Confidence Intervals
[Figure: I/Os of PJ, ISSJ, and the full quad-tree join for Colorado]
28
Conclusion
A novel spatial join for raster data, Probabilistic Join, that provides a “big picture” visualization of the query answer
An interactive raster spatial join algorithm, Incremental Refining Spatial Join, that provides estimated query answers bounded by confidence intervals
29
Thank you!