Flow Computation on Massive Grid Terrains

Presentation transcript:

Flow Computation on Massive Grid Terrains
Lars Arge, Laura Toma, Dept. of Computer Science, Duke University, USA
Helena Mitasova, Dept. of Marine, Earth & Atmospheric Sciences, NCSU, USA
http://www.cs.duke.edu/geo*/terraflow

Modeling Flow on Grids
Flow direction: the direction water flows at a cell.
Flow routing: compute flow directions for all cells in the terrain, including flat areas.
Flow accumulation value: the total amount of water that flows through a cell per unit width of contour. Flow is distributed according to the flow directions.
Flow accumulation: compute flow accumulation values for all cells in the terrain.

Modeling Flow Sierra-Nevada DEM Flow Direction Flow Accumulation

Applications
Automatic estimation of various terrain parameters: watershed basins, stream network, topographic indices, surface saturation, soil water content, erosion, deposition, forest structure, sediment transport, solar radiation.

Massive Data
Remote sensing data available:
  NASA-SRTM (whole Earth, 5TB at 30m resolution)
  USGS (entire US at 10m resolution)
  LIDAR (1m resolution)
Example: Appalachian Mountains dataset
  100m resolution (500MB)
  30m resolution (5.5GB)
  10m resolution (50GB)
  1m resolution (5TB)

Process Massive Data?
GRASS (r.watershed, ...): killed after running for 17 days on a 6700 x 4300 grid (approx. 50 MB dataset).
TARDEM (flood, d8, aread8): killed after running for 20 days on a 12000 x 10000 grid (approx. 240 MB dataset); CPU utilization 5%, 3GB swap file.
ArcInfo (flowdirection, flowaccumulation): can handle the 130MB dataset, but doesn't work for datasets bigger than 2GB.

TerraFlow
Terraflow is our suite of programs for flow routing and flow accumulation on massive grids [ATV'00, AC&al'02].
Flow routing and flow accumulation are modeled as graph problems and solved in optimal I/O bounds.
Efficient: 2-1000 times faster on very large grids than existing software.
Scalable: 1 billion elements! (>2GB data)
Flexible: allows for both D8 and D-inf flow modeling.

r.terraflow
Port of Terraflow into GRASS
Preliminary results on
Augment with additional features
  Output plateaus, depressions, tci, water outlet queries, watershed basins
Comparison with GRASS flow routines
  r.watershed, r.flow, r.topidx, ...
  Performance results

Outline
Scalability to large data
  Why standard programs are not in general scalable
  One approach to improve scalability: I/O-efficient algorithms
r.terraflow
  Algorithm outline
  Related work and programs
  Preliminary comparison and performance results
  Output illustration

Scalability to Massive Data: Why?
Most GIS programs assume the data fits in memory and minimize only CPU computation.
But massive data does not fit in main memory!
  The OS places data on disk and moves data in and out of memory.
  Data is moved in blocks.
  Accessing the disk is 1000 times slower than accessing main memory.
When processing massive data, disk I/O is the bottleneck, rather than CPU time!

Scalability to Massive Data: How?
Local data accesses vs. scattered data accesses.
Example: reading an array from disk
  Array size N = 10 elements
  Disk block size B = 2 elements
  Memory size = 4 elements (2 blocks)
Algorithm 1, scattered layout on disk (1 5 | 2 6 | 3 8 | 9 4 | 7 10): loads 10 blocks
Algorithm 2, local layout on disk (1 2 | 10 9 | 5 6 | 3 4 | 8 7): loads 5 blocks
N block loads >> N/B block loads
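The block counts in this example can be reproduced with a small simulation. The sketch below is illustrative only: it assumes the elements are visited in the order 1, 2, ..., 10 and that memory evicts the least recently used block, neither of which the slide states. The function name count_block_loads is hypothetical.

```python
from collections import OrderedDict

def count_block_loads(disk_layout, block_size, memory_blocks):
    """Count disk block loads when visiting the elements in increasing order."""
    # Map each element to the index of the disk block that stores it.
    block_of = {value: pos // block_size for pos, value in enumerate(disk_layout)}
    cache = OrderedDict()  # resident blocks, least recently used first
    loads = 0
    for value in sorted(disk_layout):
        block = block_of[value]
        if block in cache:
            cache.move_to_end(block)       # hit: mark block as most recently used
        else:
            loads += 1                     # miss: the block must be read from disk
            if len(cache) >= memory_blocks:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = None
    return loads

scattered = [1, 5, 2, 6, 3, 8, 9, 4, 7, 10]  # first layout on the slide
local = [1, 2, 10, 9, 5, 6, 3, 4, 8, 7]      # second layout on the slide

print(count_block_loads(scattered, block_size=2, memory_blocks=2))  # 10
print(count_block_loads(local, block_size=2, memory_blocks=2))      # 5
```

With the scattered layout every access misses, so the number of loads equals N; with the local layout each block is used completely before it is evicted, giving N/B loads.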

Example: r.watershed
  r.watershed -m el=elev_grid dir=dir_grid ac=accu_grid
Running on a 500MHz PIII, 1GB RAM, FreeBSD.

  Dataset       Grid dimensions   Grid size        r.watershed
  Kaweah        1100 x 1400       1.6M elements    12 min
  Puerto Rico   4400 x 1300       6M elements      5 days
  Hawaii        6800 x 4300       28M elements     26 days
  Capdem        12000 x 10000     122M elements    ?

On the Hawaii dataset we let it run for 17 days, in which it completed 65%; at that rate the full run would take about 26 days (17 / 0.65), consistent with the table.
However good the OS, it cannot change the data access pattern of the program!

TerraFlow Approach
Redesign the algorithm to be I/O-efficient.
Block size is large: at least 8KB (32KB, 64KB).
  Compute on the whole block while it is in memory.
  Avoid loading a block each time.
Improved locality: speedup = block size.
I/O-efficient algorithms: the measure of complexity is the number of blocks transferred between main memory and disk.

r.terraflow outline
Step 1: Flow routing
  Water flows downhill: SFD, MFD.
  Compute SFD/MFD flow directions by inspecting the 8 neighbor points.
  Identify flat areas: plateaus and sinks.
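As an illustration of the SFD (D8) case, the following sketch picks, for each cell, the neighbor with the steepest downward slope among its 8 neighbors. It is a plain in-memory version, not r.terraflow's implementation; the function name d8_directions, the (row, column) offset encoding, and the choice to ignore neighbors outside the grid are assumptions made for the example.

```python
import math

# Offsets (d_row, d_col) of the 8 neighbors of a cell.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def d8_directions(elev):
    """For each cell, return the offset of its steepest downslope neighbor, or None."""
    rows, cols = len(elev), len(elev[0])
    directions = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0  # only strictly downhill neighbors qualify
            for dr, dc in NEIGHBORS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbors are sqrt(2) cell widths away.
                    slope = (elev[r][c] - elev[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best_slope = slope
                        directions[r][c] = (dr, dc)
    return directions

elev = [[9, 8, 7],
        [8, 5, 4],
        [7, 4, 1]]
for row in d8_directions(elev):
    print(row)
```

Cells left as None have no lower neighbor; these are the cells that belong to flat areas (plateaus or sinks) and need separate treatment.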

Flow Routing on Flat Areas
Flat areas have no obvious flow direction.
Plateaus: assign flow directions such that each cell flows towards the nearest spill point of the plateau.
Sinks:
  Either catch the water inside the sink: assign flow directions towards the center of the sink.
  Or route the water outside the sink using uphill flow directions: simulate flooding the terrain (sinks become plateaus), then assign uphill flow directions on the original terrain by assigning downhill flow directions on the flooded terrain.
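To make the flooding idea concrete, here is a minimal sketch that raises every sink to its spill elevation, so that sinks become plateaus. It uses a simple in-memory priority-queue flood that grows inward from the grid boundary; this is a stand-in chosen for brevity, not the I/O-efficient flooding algorithm used by TerraFlow. The function name flood and the assumption that water can always leave through boundary cells are made for the example.

```python
import heapq

def flood(elev):
    """Return a copy of elev in which every sink is filled up to its spill level."""
    rows, cols = len(elev), len(elev[0])
    filled = [row[:] for row in elev]
    visited = [[False] * cols for _ in range(rows)]
    heap = []
    # Boundary cells can always drain off the grid, so they seed the flood.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (elev[r][c], r, c))
                visited[r][c] = True
    # Always expand from the lowest cell reached so far.
    while heap:
        level, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                    visited[nr][nc] = True
                    # A cell below the current water level is raised to that level.
                    filled[nr][nc] = max(elev[nr][nc], level)
                    heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled

elev = [[5, 5, 5, 5],
        [5, 1, 2, 5],
        [5, 5, 5, 3],
        [5, 5, 5, 5]]
for row in flood(elev):
    print(row)
```

In this example the two-cell sink (elevations 1 and 2) is raised to 3, the elevation of the lowest boundary cell it can spill through, and so turns into a plateau on the flooded terrain.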

r.terraflow outline
Step 2: Compute flow accumulation
  Water flows following the flow directions.
  Goal: compute the total amount of water through each grid cell.
  Initially there is one unit of water in each grid cell.
  Every cell distributes water to the neighbors pointed to by its flow direction(s).
All these steps can be solved I/O-efficiently:
  Flow routing: modeled as graph problems (breadth-first search, connected components, graph contraction).
  Flow accumulation: sweeping using an I/O-efficient priority queue.
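The sweeping idea can be sketched as follows for the SFD case. Cells are processed from highest to lowest, and each cell forwards its water to its downslope neighbor as a message in a priority queue keyed by the receiver's position in the sweep. This is a simplified in-memory sketch: Python's heapq stands in for the I/O-efficient priority queue, and the code assumes every flow direction points to a strictly lower cell (no flat areas). The function name flow_accumulation and the offset encoding of directions are hypothetical.

```python
import heapq

def flow_accumulation(elev, direction):
    """SFD flow accumulation by sweeping the cells in decreasing elevation order."""
    rows, cols = len(elev), len(elev[0])
    # Sweep order: highest cell first, so a cell is handled after all its contributors.
    order = sorted(((r, c) for r in range(rows) for c in range(cols)),
                   key=lambda cell: -elev[cell[0]][cell[1]])
    rank = {cell: i for i, cell in enumerate(order)}
    accumulation = [[0.0] * cols for _ in range(rows)]
    pending = []  # heap of (rank of receiving cell, amount of water sent to it)
    for i, (r, c) in enumerate(order):
        flow = 1.0  # initially one unit of water in each cell
        while pending and pending[0][0] == i:
            flow += heapq.heappop(pending)[1]  # collect water sent by higher cells
        accumulation[r][c] = flow
        if direction[r][c] is not None:
            dr, dc = direction[r][c]
            heapq.heappush(pending, (rank[(r + dr, c + dc)], flow))
    return accumulation

elev = [[9, 8, 7],
        [8, 5, 4],
        [7, 4, 1]]
# D8 directions for elev, as (d_row, d_col) offsets; None marks the outlet.
direction = [[(1, 1), (1, 0), (1, 0)],
             [(0, 1), (1, 1), (1, 0)],
             [(0, 1), (0, 1), None]]
for row in flow_accumulation(elev, direction):
    print(row)
```

The outlet in the lower-right corner ends up with 9 units, one for each cell of the grid. For MFD, a cell would send one message per downslope neighbor, each carrying a fraction of its flow.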

Related Work
TerraFlow's emphasis: computational aspects, not modeling.
Flow modeling:
  [O'Callaghan and Mark 1984] D8 method for flow accumulation
  [Jenson and Domingue 1988] General technique of flooding
Software: GRASS, ArcInfo, Tardem, Topaz, Tapes-G, RiverTools

GRASS Raster Flow Functions
r.watershed
  Most commonly used. Uses the A* algorithm to determine the flow of water. Ehlschlaeger, USACERL.
  Input: elevation, [..]
  Output: flow direction, flow accumulation, [watersheds, stream segments, slope length, slope steepness]
  Flow direction grid: equivalent to running r.drain for every cell on the grid.
  Watershed grid: equivalent to running r.water.outlet for multiple outlets.
r.drain
  Traces the least-cost (steepest-downslope) flow path from a given cell. Stops in pits.
  Input: elevation, point coordinates
  Output: least-cost path
r.water.outlet
  Generates a watershed basin from a flow direction map. Ehlschlaeger, USACERL.
  Input: flow direction (from r.watershed), basin coordinates
  Output: watershed basin map

GRASS Raster Flow Functions
r.basin.fill
  Generates a raster map of watershed subbasins. Larry Band.
  Input: stream network (from r.watershed), thinned ridge network (by hand!)
  Output: watershed subbasins
r.topmodel, r.topidx
  Simulates TOPMODEL, Keith Beven.
  Input: elevation, basin, TOPMODEL parameters file
  Output: flow direction, filled elevation, tci, watersheds, [..]
r.flow, r.flowmd
  Constructs flowlines, flowpath lengths and flowline densities. Flowlines stop in pits. Mitas, Mitasova, Hofierka, Zlocha.
  Input: elevation, [..]
  Output: flowline density, flowlines (vector), lengths
More complex models
  r.water.fea: finite element analysis program for hydrologic simulations
  r.hydro.CASC2D: fully integrated distributed cascaded 2D hydrologic modeling
  r.wrat: Water Resource Assessment Tool

r.terraflow features
Input: elevation grid
Output:
  flow direction grid: SFD (D8) single flow directions or MFD (Dinf) multiple flow directions
  flow accumulation grid, with an option to switch to SFD when the flow value exceeds a user-defined threshold
  topographic convergence index (tci) grid
  plateau and depressions grid

GRASS:>r.terraflow help

Description:
  Flow computation for massive grids.

Usage:
  r.terraflow [-sq] elev=name filled=name direction=name watershed=name accumulation=name tci=name [d8cut=value] [memory=value] [STREAM_DIR=name] [stats=name]

Flags:
  -s   SFD (D8) flow (default is MFD)
  -q   Quiet

Parameters:
  elev           Input elevation grid
  filled         Output (filled) elevation grid
  direction      Output direction grid
  watershed      Output watershed grid
  accumulation   Output accumulation grid
  tci            Output tci grid
  d8cut          If flow accumulation is larger than this value it is routed using SFD (D8) direction (meaningful only for MFD flow). default: infinity
  memory         Main memory size (in MB). default: 300
  STREAM_DIR     Location of intermediate STREAMs. default: /var/tmp
  stats          Stats file. default: stats.outv

Preliminary Experimental Results
PIII dual 1GHz processor, 1GB RAM

  Dataset             Grid dimensions   Grid size (million elements)   r.terraflow
  Kaweah              1163 x 1424       1.6                            1.85 min
  Puerto Rico         4452 x 1378       5.9                            4.65 min
  Sierra Nevada       3750 x 2672       9.5                            19.22 min
  Hawaii              6784 x 4369       28.2                           22.35 min
  Lower New England   9148 x 8509       77.8                           114 min
  Panama              11283 x 10862     122.5                          3.5 hr

r.watershed, in the order given on the slide: 9.2 min, 93 min, 18.2 hours, killed after 6 days, < 1% done

Panama DEM

Panama r.terraflow MFD

r.terraflow MFD zoom,3D

r.terraflow SFD zoom,3D

r.terraflow MFD zoom,2D

r.terraflow SFD zoom,2D

r.terraflow MFD TCI zoom,2D

r.terraflow SFD TCI zoom,2D

Flat DEM

r.terraflow MFD

r.terraflow SFD

r.watershed

Conclusions/Future Work
Work in progress
More features: water outlet queries, watershed delineation
Experimental analysis
Other features? Modeling?
Other (intensive computing, I/O-bound) applications?
http://www.cs.duke.edu/geo*/terraflow