
Digital Terrain Analysis for Massive Grids


1 Digital Terrain Analysis for Massive Grids
Lars Arge, Jeff Chase, Laura Toma, Jeff Vitter, Rajiv Wickremesinghe, in collaboration with Pat Halpin, Dean Urban

2 Modeling Flow
[Figure panels: Sierra-Nevada DEM; Flow Direction; Flow Accumulation]

3 Modeling Flow
Flow direction: the direction water flows at a cell.
Flow routing: compute flow direction for all cells in the grid. Flat areas. Flooding.
Flow accumulation value: total area which flows through a cell in the terrain per unit width of contour.
Flow accumulation: compute flow accumulation values for all cells in the terrain. Flow is distributed according to the flow directions.
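As a minimal sketch of flow routing, the example below assigns each cell a single flow direction toward its steepest downslope neighbour, in the spirit of the D8 method cited later in the deck. The function names, the tiny elevation grid, and the use of a unit cell spacing are illustrative assumptions, not TerraFlow's code. A cell with no lower neighbour (a sink or a flat area) gets no direction.

```python
import math

# Offsets (row, col) of the 8 neighbours of a grid cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]

def flow_direction(elev, r, c):
    """Return the (dr, dc) offset of the steepest downslope neighbour, or None."""
    rows, cols = len(elev), len(elev[0])
    best, best_slope = None, 0.0
    for dr, dc in NEIGHBOURS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            # Elevation drop per unit distance (assumed unit cell spacing);
            # diagonal neighbours are sqrt(2) away.
            slope = (elev[r][c] - elev[nr][nc]) / math.hypot(dr, dc)
            if slope > best_slope:
                best, best_slope = (dr, dc), slope
    return best

def flow_routing(elev):
    """Compute a flow direction for every cell in the grid."""
    return [[flow_direction(elev, r, c) for c in range(len(elev[0]))]
            for r in range(len(elev))]

# Hypothetical 3 x 3 terrain sloping toward the bottom-right corner.
elev = [[9, 8, 7],
        [8, 6, 4],
        [7, 4, 1]]
directions = flow_routing(elev)

# A perfectly flat grid has no downslope neighbour anywhere.
flat_directions = flow_routing([[5, 5], [5, 5]])
```

The flat grid illustrates why flat areas need separate treatment: a purely local rule finds no direction for any of its cells.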

4 Applications
Automatic estimation of terrain parameters: watersheds, drainage networks, topographic index, surface saturation, soil water content, erosion and deposition, forest structure, species diversity, sediment transport.

5 Massive Data
Remote sensing data available today: USGS (entire US at 10m resolution); NASA-SRTM (whole Earth, 5TB at 30m resolution).
Higher resolution data available.
Ex: Appalachian Mountains dataset: 100m resolution (500MB), 30m resolution (5.5GB), 10m resolution (50GB), 1m resolution (5TB).
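The Appalachian figures follow from the fact that the number of grid cells grows with the square of the refinement factor. The sketch below reproduces that arithmetic from the 100m figure; the helper name is hypothetical and the calculation assumes a fixed number of bytes per cell.

```python
def scaled_size_mb(size_mb, from_res_m, to_res_m):
    """Estimate the grid size when the cell edge changes from from_res_m to to_res_m."""
    # The cell count, and hence the size, scales with the square of the refinement factor.
    return size_mb * (from_res_m / to_res_m) ** 2

# Start from the slide's 100m dataset (500MB) and refine.
for res in (30, 10, 1):
    print(f"{res}m resolution: about {scaled_size_mb(500, 100, res):,.0f} MB")
```

The results are roughly 5.5GB, 50GB, and 5TB, in line with the sizes listed on the slide.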

6 Problems with Existing Software
GRASS (r.watershed): killed after 17 days on a 50MB dataset.
TARDEM (flood, d8, aread8): can handle the 50MB dataset; killed after running for 20 days on a 130MB dataset (CPU utilization: 5%, 3GB swap file).
ArcInfo (flowdirection, flowaccumulation): can handle the 130MB dataset; doesn’t work for files bigger than 2GB.

7 Our Results: TerraFlow
Collection of programs for flow routing and flow accumulation on massive grids.
Theoretical results: flow routing and flow accumulation modeled as graph problems and solved in optimal bounds.
Practical results:
Efficient: faster than existing software on massive grids.
Scalable: 1 billion elements!! (>2GB data)
Flexible: outputs similar to ArcInfo flowdirection and flowaccumulation.

8 Scalability: Why? How?
Massive data: data does not fit in memory. The OS places data on disk and moves data in and out of memory. Data is moved in blocks.
Accessing disk is 1000 times slower than accessing main memory, so disk I/O is the bottleneck!
Local data accesses vs. scattered data accesses.

9 Local Accesses vs. Scattered Accesses
Example: reading an array from disk. Array size N = 10 elements. Disk block size = 2 elements. Memory size = 4 elements (2 blocks).
[Figure: array elements 1 to 10 stored on disk in blocks of 2]

32 Local Accesses vs. Scattered Accesses
Example: reading an array from disk. Array size N = 10 elements. Disk block size B = 2 elements. Memory size = 4 elements (2 blocks).
[Figure: array elements 1 to 10, accessed locally and in scattered order]
Local accesses: loads 5 blocks (N/B). Scattered accesses: loads 10 blocks (N).
N/B blocks << N blocks
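The block counts in this example can be checked with a small simulation. The sketch below reads elements 1 to 10 in order from two on-disk placements and counts how many block loads are needed when memory holds 2 blocks. The function name, the least-recently-used eviction policy, and the particular scattered placement are assumptions made for illustration; the slide only gives N, the block size, and the memory size.

```python
from collections import OrderedDict

def count_block_loads(layout, accesses, block_size, memory_blocks):
    """Count disk block loads using an LRU cache that holds memory_blocks blocks.

    layout lists the elements in on-disk order; each consecutive group of
    block_size elements forms one disk block.
    """
    block_of = {elem: pos // block_size for pos, elem in enumerate(layout)}
    cache = OrderedDict()   # block id -> None, least recently used first
    loads = 0
    for elem in accesses:
        blk = block_of[elem]
        if blk in cache:
            cache.move_to_end(blk)          # hit: mark as most recently used
        else:
            loads += 1                      # miss: fetch the block from disk
            cache[blk] = None
            if len(cache) > memory_blocks:
                cache.popitem(last=False)   # evict the least recently used block
    return loads

accesses = list(range(1, 11))                        # read elements 1..10 in order
sequential_layout = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scattered_layout = [1, 5, 2, 6, 3, 8, 9, 4, 7, 10]   # assumed scattered placement

local_loads = count_block_loads(sequential_layout, accesses, 2, 2)
scattered_loads = count_block_loads(scattered_layout, accesses, 2, 2)
print(local_loads, scattered_loads)
```

With the in-order placement every loaded block serves two consecutive reads, so N/B = 5 loads suffice. With the assumed scattered placement each block is evicted before its second element is needed, so all N = 10 reads trigger a load.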

33 Scalability: Why? How?
Massive data: data does not fit in memory. The OS places data on disk and moves data in and out of memory. Data is moved in blocks.
Accessing disk is 1000 times slower than accessing main memory, so disk I/O is the bottleneck!
Local data accesses vs. scattered data accesses: N/B << N block transfers.
However good the OS, it cannot change the data access pattern of the program!

34 TerraFlow Approach
Improve locality by redesigning algorithms.
Block size at least 8KB (32KB, 64KB).
Compute on whole block while it is in memory. Avoid loading a block each time.
Speedup = block size!
I/O-efficient algorithms.
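To make "Speedup = block size" concrete, the sketch below counts block transfers for a single pass over a grid, once with each block fully used while it is in memory and once with one block load per cell. The helper name is hypothetical, and the 2 bytes per cell is an assumption (it matches the deck's dataset sizes, such as 9.5 million cells in 19MB).

```python
def block_transfers(n_cells, block_bytes, cell_bytes=2):
    """Return (cells per block, transfers with full block use, transfers with one load per cell)."""
    cells_per_block = block_bytes // cell_bytes
    local = -(-n_cells // cells_per_block)   # ceil(N / B) using integer arithmetic
    scattered = n_cells                      # worst case: every access loads a block
    return cells_per_block, local, scattered

# A 1-billion-cell grid with 8KB blocks.
b, local, scattered = block_transfers(10**9, 8 * 1024)
print(f"B = {b} cells per block")
print(f"local: {local:,} transfers, scattered: {scattered:,} transfers")
print(f"ratio: about {scattered / local:,.0f}x")
```

Under these assumptions the ratio between the two counts is close to the number of cells per block, which is the sense in which the speedup equals the block size.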

35 Related Work
TerraFlow’s emphasis: computational aspects, not modeling.
Flow modeling: [O’Callaghan and Mark 1984] D8 method for flow accumulation; [Jenson and Domingue 1988] general technique of flooding.
Existing software: ArcInfo, GRASS, Tardem, Topaz, Tapes-G, RiverTools.

36 Flow Routing on Flat Areas
…no obvious flow direction

37 TerraFlow Outline
Flow routing: flood the terrain to eliminate sinks; identify watersheds and construct watershed graph; collapse watershed graph and raise sinks.
Flow accumulation: sweep terrain top-down to distribute flow.
All these steps can be solved I/O-efficiently.
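The top-down sweep for flow accumulation can be sketched in memory as follows: visit cells in decreasing order of elevation and push each cell's accumulated flow to its downslope neighbour, so a cell's total is final by the time it is visited. This is a plain in-memory illustration with single flow directions, not TerraFlow's I/O-efficient implementation; the function names, the sample grid, and the choice of one unit of initial flow per cell are assumptions.

```python
import math

NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]

def steepest_neighbour(elev, r, c):
    """Return the coordinates of the steepest downslope neighbour, or None."""
    rows, cols = len(elev), len(elev[0])
    best, best_slope = None, 0.0
    for dr, dc in NEIGHBOURS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            slope = (elev[r][c] - elev[nr][nc]) / math.hypot(dr, dc)
            if slope > best_slope:
                best, best_slope = (nr, nc), slope
    return best

def flow_accumulation(elev):
    """Sweep cells from highest to lowest, passing flow downslope."""
    rows, cols = len(elev), len(elev[0])
    acc = [[1.0] * cols for _ in range(rows)]   # each cell contributes one unit
    cells = sorted(((r, c) for r in range(rows) for c in range(cols)),
                   key=lambda rc: elev[rc[0]][rc[1]], reverse=True)
    for r, c in cells:
        target = steepest_neighbour(elev, r, c)
        if target is not None:
            # Every higher cell has already been processed, so acc[r][c] is final.
            acc[target[0]][target[1]] += acc[r][c]
    return acc

# Hypothetical 3 x 3 terrain with its only sink at the bottom-right corner.
elev = [[9, 8, 7],
        [8, 6, 4],
        [7, 4, 1]]
acc = flow_accumulation(elev)
print(acc[2][2])
```

Because the sample terrain has a single sink, the flow of all nine cells ends up at the bottom-right corner.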

38 Datasets
Dataset            Grid dimensions    Grid size
Sierra Nevada      3750 x 2672        9.5 million cells (19MB)
Hawaii             6784 x 4369        28 million cells (54MB)
East-Coast USA     13500 x 18200      246 million cells (500MB)
Mid-West USA       11000 x 25500      280 million cells (560MB)
Washington State   33454 x 31866      1 billion cells (2GB)

39 TerraFlow vs. ArcInfo

40 TerraFlow – Performance
Significant speedup over ArcInfo for large grids.
East-Coast dataset: ArcInfo 78 hours, TerraFlow 8.7 hours.
Washington State dataset: TerraFlow 63 hours; ArcInfo cannot process files larger than 2GB!

41 TerraFlow Features
Flow directions: SFD (single flow directions), MFD (multiple flow directions).
(SFD,SFD), (MFD,MFD), (MFD,MFD)
Flow accumulation: use MFD and switch to SFD when the flow value exceeds a user-defined threshold.
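One way to read the MFD-to-SFD switch is sketched below: while a cell's accumulated flow is at or below the threshold, its flow is split among all lower neighbours, and once it exceeds the threshold the whole flow goes to the steepest one. The slope-proportional split, the function names, and the sample grid are assumptions for illustration; the slide does not specify how TerraFlow weights the multiple directions.

```python
import math

NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]

def lower_neighbours(elev, r, c):
    """Return [((nr, nc), slope), ...] for every strictly lower neighbour."""
    rows, cols = len(elev), len(elev[0])
    result = []
    for dr, dc in NEIGHBOURS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and elev[nr][nc] < elev[r][c]:
            slope = (elev[r][c] - elev[nr][nc]) / math.hypot(dr, dc)
            result.append(((nr, nc), slope))
    return result

def accumulate(elev, threshold):
    """Top-down sweep using MFD below the threshold and SFD above it."""
    rows, cols = len(elev), len(elev[0])
    acc = [[1.0] * cols for _ in range(rows)]
    cells = sorted(((r, c) for r in range(rows) for c in range(cols)),
                   key=lambda rc: elev[rc[0]][rc[1]], reverse=True)
    for r, c in cells:
        lower = lower_neighbours(elev, r, c)
        if not lower:
            continue                                  # sink or flat cell
        if acc[r][c] > threshold:
            # SFD: send everything to the steepest lower neighbour.
            (nr, nc), _ = max(lower, key=lambda item: item[1])
            acc[nr][nc] += acc[r][c]
        else:
            # MFD (assumed weighting): split in proportion to slope.
            total = sum(slope for _, slope in lower)
            for (nr, nc), slope in lower:
                acc[nr][nc] += acc[r][c] * slope / total
    return acc

elev = [[9, 8, 7],
        [8, 6, 4],
        [7, 4, 1]]
sfd_only = accumulate(elev, threshold=0)              # every cell exceeds 0
mfd_only = accumulate(elev, threshold=float("inf"))   # never switches
mixed = accumulate(elev, threshold=2.0)
```

Whatever the threshold, flow is conserved: on this single-sink terrain the outlet cell receives the contribution of all nine cells, and only the paths taken to reach it differ.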

42 TerraFlow: Result samples

43 TerraFlow Results Samples

44 Conclusions / Future Work
TerraFlow - Flow modeling; More features; Modeling; New applications

