Supporting Web-based Visual Exploration of Large-Scale Raster Geospatial Data Using Binned Min-Max Quadtree Jianting Zhang 12, Simin You 2 City College.


1 Supporting Web-based Visual Exploration of Large-Scale Raster Geospatial Data Using Binned Min-Max Quadtree. Jianting Zhang (1,2), Simin You (2). (1) City College and (2) Graduate Center of The City University of New York

2 Outline: Motivation and Introduction; Background and Related Work; Binned Min-Max Quadtree; Index Construction; Query Processing; System Architecture; Experiments and Evaluation; Conclusion and Future Work

3 Motivation/Introduction If you load your own data in Google Earth, wouldn’t it be nicer if you could query your data and highlight the query results, in addition to simple display, zoom in/out, and pan?

4 Global 30s Precipitation Data from WorldClim (Interpolated 1950-2000). Color scheme: green = 0 mm, red = 100 mm, linear interpolation in between. Undergraduate Project: Generate Dynamic KML Files for Interactive Visualization in Google Earth (C. Dasrat/CCNY). Figure panels: January, July

5 Motivation/Introduction Task: Find/show regions where precipitation amount in January is between [p1,p2). Intuitive Solution –Loop through all the raster cells and return all the cell locations. –Problem: long evaluation time and difficulty in visualizing query results in Web browsers for practical reasons. Our Solution: –Backend: Index raster data, perform the query in main memory and return a set of quadrants (SSDBM’10) –Middleware: Dynamically generate tiled images on-demand based on user’s current view and cache the tiled images as necessary (Com.Geo’10) –Ongoing work: massively parallel indexing using GPGPU (20X speedup)

6 Background & Related Work Spectral, spatial and temporal resolutions of raster geospatial data are getting increasingly finer → larger data volumes –The next generation GOES-R satellite will provide global coverage at the 0.5-2 km resolution every 5 minutes (16 bands) –Numerous derived products from satellite images https://lpdaac.usgs.gov/lpdaac/products/modis_products_table –Large-scale model simulation results (e.g. WRF)

7 Background & Related Work Manually examining all the data through visual display is no longer possible –Human eyes can only effectively distinguish a limited number of colors at a time –Studies show that screen resolution beyond 4000 by 4000 pixels is not effective. Querying data and highlighting results (Regions of Interest) for further analysis becomes preferable

8 Background & Related Work Query Driven Visual Exploration of Scientific Data –Wu et al 2003, Stockinger et al 2005, Rubel et al 2008 –Glatter et al 2006, Kendall et al 2009, Fuchs et al 2009 Indexing and Query Processing in Spatial Databases –Overview: Gaede and Gunther 1998, Samet 2005 –Vector data: R-Tree, Quad-Tree –Raster data: very limited (except tiling/pyramid)

9 Background & Related Work Managing Multi-dimensional Array Data –Array query definition language: Baumann et al 1997, Marathe and Salem 1999, Baumann 2009 –Physical data layout: Sarawagi and Stonebraker 1994, Otoo and Rotem 2006, Kim and Jaja 2007, Otoo et al 2007 Information Visualization/Visual Exploration –Desktop Systems: Prefuse, GeoVista, GeoDa, IDV –Web-based: Wood et al 2007, Dork et al 2008 –Main-memory based, no database backend support –Scalability problem → integrating high-performance database engines with information visualization/visual exploration modules

10 Binned Min-Max Quadtree (BMMQ-Tree) Designed to support ROI-finding queries. Given: a set of rasters representing environmental variables {F_i | 0<i<n} over a spatial domain D. A ROI-finding query Q identifies regions in D whose cells C_j satisfy a compound condition built from per-variable range tests, where op can be either conjunctive or disjunctive, 0<k<n, and each variable i has a lower and an upper bound in query Q
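
To make the query semantics concrete, here is a minimal brute-force sketch of such a compound range query, i.e. the "loop through all cells" baseline that the index is meant to avoid. It is an illustration only: the name `roi_scan`, the list-of-lists raster layout, and the half-open bounds [lo, hi) (borrowed from the [p1,p2) notation on slide 5) are assumptions, not the authors' code.

```python
from typing import List, Sequence, Tuple

Raster = List[List[float]]  # assumed layout: raster[row][col]


def roi_scan(rasters: Sequence[Raster],
             bounds: Sequence[Tuple[float, float]],
             conjunctive: bool = True) -> List[Tuple[int, int]]:
    """Return (row, col) of every cell satisfying the compound condition.

    bounds[i] is the assumed half-open range [lo, hi) for rasters[i].
    conjunctive=True combines the per-variable tests with AND, else OR.
    """
    rows, cols = len(rasters[0]), len(rasters[0][0])
    combine = all if conjunctive else any
    hits = []
    for r in range(rows):
        for c in range(cols):
            tests = (lo <= f[r][c] < hi for f, (lo, hi) in zip(rasters, bounds))
            if combine(tests):
                hits.append((r, c))
    return hits


# Two tiny hypothetical rasters (values invented for illustration).
precip = [[0, 95], [120, 300]]
temp = [[5, 10], [20, 25]]
both = roi_scan([precip, temp], [(90, 300), (0, 15)], conjunctive=True)
either = roi_scan([precip, temp], [(90, 300), (0, 15)], conjunctive=False)
```

The scan touches every cell of every raster, which is what makes it impractical for a 43200*21600 grid and motivates the tree-based pruning on the following slides.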

11 Binned Min-Max Quadtree Why Tree-based indexing? –A ROI query is a global operation on rasters –Without indices, scanning whole rasters is required: disk I/Os are the most expensive operations along the storage hierarchy, so performance is limited by disk I/Os –With tree-based indexing: quickly prune irrelevant branches to reduce disk I/Os; access disk files only when necessary; answer a large portion of queries directly without incurring disk I/Os; indices with a small memory footprint can be main-memory resident

12 Binned Min-Max Quadtree Why Binned Min-Max Quadtree? –Associate min/max values with each quadtree node to help ROI-based queries; the technique is popular in 3D graphics for generating iso-surfaces and tracing rays –First law of geography: “Everything is related to everything else, but near things are more related than distant things” (Tobler 1970) –However, neighboring cell values often differ slightly –Binning improves quadrant uniformity and reduces quadtree complexity
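
The effect of binning on tree size can be sketched in a few lines. This is a simplified illustration, not the authors' construction algorithm: it assumes a square 2^k by 2^k grid of non-negative integers, equal-width bins (`value // bin_size`), and hypothetical names `Node`, `build`, and `count_nodes`. A quadrant whose cells all fall in one bin is collapsed into a single leaf.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    lo: int                                   # minimum bin index in the quadrant
    hi: int                                   # maximum bin index in the quadrant
    children: Optional[List["Node"]] = None   # None means a uniform leaf


def build(grid: List[List[int]], x: int, y: int, size: int, bin_size: int) -> Node:
    """Recursively build a min-max quadtree over binned values."""
    if size == 1:
        b = grid[y][x] // bin_size            # assumed equal-width binning
        return Node(b, b)
    half = size // 2
    kids = [build(grid, x + dx, y + dy, half, bin_size)
            for dy in (0, half) for dx in (0, half)]
    lo = min(k.lo for k in kids)
    hi = max(k.hi for k in kids)
    if lo == hi:                              # all four quadrants share one bin
        return Node(lo, hi)                   # collapse into a single leaf
    return Node(lo, hi, kids)


def count_nodes(node: Node) -> int:
    if node.children is None:
        return 1
    return 1 + sum(count_nodes(k) for k in node.children)


# Invented 4x4 raster: neighbors are similar but rarely identical.
grid = [[1, 2, 9, 9],
        [3, 4, 9, 9],
        [17, 17, 30, 31],
        [17, 17, 25, 26]]
unbinned = build(grid, 0, 0, 4, bin_size=1)   # every distinct value is its own bin
binned = build(grid, 0, 0, 4, bin_size=8)
print(count_nodes(unbinned), count_nodes(binned))  # prints: 13 5
```

On this toy raster, binning with width 8 makes every 2x2 quadrant uniform, so the tree shrinks from 13 nodes to 5 while each node still carries a usable min/max range.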

13 Index Construction

14 Query Processing – Arbitrary Spatial Window

15 Query Processing – Tile Based (parallelization possibility). Tile size N*N, k = log2(N). Figure label: value range [1,3] under tile (0,1,1)
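
The relation k = log2(N) can be spelled out with a small sketch. The helper names below are hypothetical, and the interpretation that a tile corresponds to a quadtree node k levels above the cell level is an assumption based on the geometry (a node k levels above single cells covers 2^k by 2^k cells), not a statement from the slides.

```python
from typing import Tuple


def tile_depth(tile_size: int) -> int:
    """Return k = log2(N) for a power-of-two tile size N."""
    if tile_size <= 0 or tile_size & (tile_size - 1):
        raise ValueError("tile size must be a positive power of two")
    return tile_size.bit_length() - 1


def tile_of(col: int, row: int, k: int) -> Tuple[int, int]:
    """Return the (tile_x, tile_y) index of the tile containing a cell."""
    return (col >> k, row >> k)


def tile_root_level(max_level: int, k: int) -> int:
    """Assumed level of the quadtree node that covers exactly one tile."""
    return max_level - k


k = tile_depth(256)              # 8, matching "Tile size: 256*256 (k=8)"
tile = tile_of(1000, 300, k)     # cell in column 1000, row 300
level = tile_root_level(16, k)   # with a 16-level quadtree
```

Because each tile maps to its own subtree under this reading, tiles can be queried independently, which is where the parallelization possibility comes from.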

16 Binned Min-Max Quadtree BMMQ-Tree integrates features of Binned Bitmap Indexing and Min-Max kd-trees and octrees. A BMMQ-Tree query result is a set of quadrants that can be expressed as (X,Y,L) tuples, which is suitable for data communication between clients and servers. A BMMQ-Tree query can terminate when the spatial extent that a quadtree node represents is less than a screen pixel (Less-than-Single-Pixel stopping policy). This may result in false positives, which are NOT necessarily bad for visual explorations –Identifying Regions of Interest is the primary goal –Details on demand for further examination
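
A minimal sketch of how such a query might traverse a min-max quadtree follows. It is a simplification under stated assumptions: nodes are plain `(lo_bin, hi_bin, children)` tuples, the query range is given directly as inclusive bin indices, `max_level` stands in for the Less-than-Single-Pixel cutoff, and the (X,Y,L) convention (quadrant column, quadrant row, level, with the root at level 0) is a guess at the slide's notation.

```python
def build(grid, x, y, size, bin_size):
    """Return (lo_bin, hi_bin, children); children is None for a uniform quadrant."""
    if size == 1:
        b = grid[y][x] // bin_size
        return (b, b, None)
    half = size // 2
    kids = [build(grid, x + dx, y + dy, half, bin_size)
            for dy in (0, half) for dx in (0, half)]
    lo = min(k[0] for k in kids)
    hi = max(k[1] for k in kids)
    return (lo, hi, None if lo == hi else kids)


def query(node, qlo, qhi, max_level, x=0, y=0, level=0, out=None):
    """Collect (X, Y, L) quadrants whose bins may fall inside [qlo, qhi]."""
    out = [] if out is None else out
    lo, hi, kids = node
    if hi < qlo or lo > qhi:
        return out                      # no overlap: prune the whole branch
    fully_inside = qlo <= lo and hi <= qhi
    if fully_inside or kids is None or level == max_level:
        # At max_level a partially overlapping quadrant is reported anyway,
        # which is where false positives can come from.
        out.append((x, y, level))
        return out
    for i, kid in enumerate(kids):      # child order: NW, NE, SW, SE
        query(kid, qlo, qhi, max_level, 2 * x + i % 2, 2 * y + i // 2,
              level + 1, out)
    return out


# Invented 4x4 raster; with bin width 8 the four 2x2 quadrants get bins 0, 1, 2, 3.
grid = [[1, 2, 9, 9],
        [3, 4, 9, 9],
        [17, 17, 30, 31],
        [17, 17, 25, 26]]
root = build(grid, 0, 0, 4, bin_size=8)
exact = query(root, 1, 2, max_level=2)    # [(1, 0, 1), (0, 1, 1)]
coarse = query(root, 1, 2, max_level=0)   # [(0, 0, 0)], a false positive
```

With the cutoff at level 0 the whole raster is returned even though only half of it matches, which shows the trade-off the slide describes: fewer quadrants to ship and draw, at the cost of over-reporting below the display resolution.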

17 Prototype System Original design –Rendering quadrants as vector objects using Flex RIA APIs at the client side –Powerful and flexible: control rendering at the pixel level in Web browsers –Performance is poor when the number of quadrants exceeds a few thousand –We consider the results as “lessons” rather than “achievements” Current design (COM.GEO’10) –Supports tile-based queries –Renders resulting quadrants as binary images in the middleware –The client is responsible for formulating tiles, submitting queries and visualizing query results –Significantly better performance

18 Prototype System Architecture

19 Online demo: http://134.74.112.202/comgeo/testoverlay.html

20 Experiments and Evaluation Data: WorldClim January Precipitation Data at 30s resolution (43200*21600) –Value range [0,1003] –Quadtree level=16. Query processing server: Dell T5400. Ad-hoc queries (arbitrary parameters) –Three bin sizes: 8, 16, 32 –Query value range [90,300) –Eight spatial query windows of sizes around 65 degrees (lon) by 55 degrees (lat). Tile-based queries (more systematic) –Bin size=32 –Tile size: 256*256 (k=8) –For query value range [0,1003]: 6848 tiles –For query value range [90,300): 1197 tiles

21 Results of Ad-hoc Queries (Less-Than-Single-Pixel stopping policy NOT applied, Max Level=16, results in milliseconds)
#     B=8   B=16  B=32
Q1    160   153   183
Q2    116   121   163
Q3    112   162   252
Q4    160   153   182
Q5     51    42    47
Q6     91    97   140
Q7     86   108   169
Q8     81    94   105

22 Results of End-to-End Performance using OLD Design Less-Than-Single-Pixel stopping rule Applied Max Level=12 for query window sizes 65*55 degrees Bin size=32

23 Results of End-to-End Performance using New Design. Estimating End-to-End time: –Assume available network bandwidth=300k Bps → TT=10ms –Assume client display area 1024*1024 → 16 tiles (parallelizable) –Assume no server/client side caching (cold start) –Assume rendering times for small images in Web browsers are negligible. Estimated time: (QT+GT+TT)*16 = (50+10+10)*16 = 1120 ms
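
The estimate can be reproduced with a short sketch. The function names are hypothetical, and reading QT, GT, and TT as per-tile query, image generation, and transfer times is an interpretation of the slide's abbreviations. The `workers` parameter is an added assumption that models perfectly parallel tile processing; the slide only notes that the tiles are parallelizable.

```python
import math


def tiles_for_display(display_px: int, tile_px: int) -> int:
    """Number of square tiles needed to cover a square display area."""
    per_side = math.ceil(display_px / tile_px)
    return per_side * per_side


def end_to_end_ms(qt_ms: int, gt_ms: int, tt_ms: int,
                  tiles: int, workers: int = 1) -> int:
    """Per-tile cost times the number of sequential rounds of tiles."""
    rounds = math.ceil(tiles / workers)   # assumes ideal, overhead-free parallelism
    return (qt_ms + gt_ms + tt_ms) * rounds


tiles = tiles_for_display(1024, 256)                     # 16 tiles of 256*256
sequential = end_to_end_ms(50, 10, 10, tiles)            # 1120 ms, as on the slide
four_way = end_to_end_ms(50, 10, 10, tiles, workers=4)   # 280 ms under the ideal assumption
```

The sequential figure matches the slide's 1120 ms; the four-worker figure is only a best-case bound, since real parallel requests would share the same network bandwidth.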

24 Conclusions The proposed BMMQ-Tree data structure can be used to efficiently process ROI-finding queries on large-scale raster geospatial data. Queries can be processed in fractions of a second for large query windows. Tile-based querying with dynamic tile image generation (middleware) and rendering (client) is more suitable for visualizing complex query results than client-side rendering. New experimental results have shown that we are able to achieve sub-second end-to-end performance for a 1024*1024 pixel display area using 16 tiles. The performance can be further improved by parallel tile-based processing.

25 Additional Information GPU-based indexing –Nvidia Quadro FX3700 GPU card with 112 cores and 512M device memory –Raster size is limited to 4096*4096 due to device memory constraints → 11*5 blocks –20X speedup (8.7s vs. 0.4s) –We expect to index the same global data on an SGI Octane III 2-node mini-cluster with 4 GPU cards in about 1-5 seconds after fine-tuning our current codebase → real-time indexing

26 Relationship with the Big Picture: Visual Explorations of Global Biodiversity Patterns. Diagram labels: Environment; Species; Taxonomic (Linnaean ranks): Kingdom, Phylum, Class, Order, Family, Genus, Species, SubSpecies; Area, Water-Energy, Latitude, Altitude, Productivity, Environmental Gradient; Community – Ecosystem – Biome – Biosphere. Publications: ACMGIS’08; GeoInfo’09, ACMGIS’09; Com.Geo’10, SSDBM’10. Scale-up online query processing through offline indexing. GPGPU-based Indexing: scale-up offline indexing through parallelization

