Digital Terrain Analysis for Massive Grids

Presentation Transcript


  1. Digital Terrain Analysis for Massive Grids • Lars Arge, Jeff Chase, Laura Toma, Jeff Vitter, Rajiv Wickremesinghe • in collaboration with Pat Halpin, Dean Urban • http://www.cs.duke.edu/geo*/terraflow

  2. Modeling Flow • [Figure panels: Sierra-Nevada DEM, Flow Direction, Flow Accumulation]

  3. Modeling Flow • Flow direction: the direction water flows at a cell • Flow routing: compute flow direction for all cells in the grid (flat areas, flooding) • Flow accumulation value: total area that flows through a cell in the terrain, per unit width of contour • Flow accumulation: compute flow accumulation values for all cells in the terrain; flow is distributed according to the flow directions
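The two definitions above can be illustrated on a tiny in-memory grid. The sketch below is not TerraFlow's I/O-efficient implementation; it assumes the single-flow-direction (D8) rule, in which each cell drains to its steepest downslope neighbour, and it counts contributing cells rather than area per unit contour width. The function names and the sample grid are hypothetical.

```python
import math

# Offsets (row, col) of the 8 neighbours of a grid cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def flow_directions(elev):
    """Map each cell to its steepest downslope neighbour (None if there is none)."""
    rows, cols = len(elev), len(elev[0])
    direction = {}
    for r in range(rows):
        for c in range(cols):
            best, best_slope = None, 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Drop in elevation divided by distance (1 or sqrt(2)).
                    slope = (elev[r][c] - elev[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best, best_slope = (nr, nc), slope
            direction[(r, c)] = best
    return direction

def flow_accumulation(elev, direction):
    """Count the cells draining through each cell (each cell contributes 1)."""
    acc = {cell: 1 for cell in direction}
    # Visit cells from highest to lowest so every upslope contribution
    # has already been added before a cell passes its flow on.
    order = sorted(direction, key=lambda rc: elev[rc[0]][rc[1]], reverse=True)
    for cell in order:
        target = direction[cell]
        if target is not None:
            acc[target] += acc[cell]
    return acc

elev = [
    [9, 8, 7],
    [8, 5, 4],
    [7, 4, 1],
]
direction = flow_directions(elev)
acc = flow_accumulation(elev, direction)
```

On this sample grid every cell eventually drains to the bottom-right corner, so that cell accumulates the flow of all nine cells.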

  4. Applications • Automatic estimation of terrain parameters • Watersheds • Drainage networks • Topographic index • Surface saturation • Soil water content • Erosion, deposition • Forest structure • Species diversity • Sediment transport

  5. Massive Data • Remote sensing data available today • USGS (entire US at 10m resolution) • NASA-SRTM (whole Earth 5TB at 30m resolution) • Higher resolution data available • Ex: Appalachian Mountains dataset • 100m resolution (500MB) • 30m resolution (5.5GB) • 10m resolution (50GB) • 1m resolution (5TB)
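The Appalachian sizes listed above are consistent with simple scaling: refining the resolution by a factor k multiplies the number of grid cells, and hence the file size, by k². A small check, assuming the same terrain extent and the same number of bytes per cell at every resolution (the function name is hypothetical):

```python
def scaled_size_gb(base_size_gb, base_res_m, target_res_m):
    """Estimate grid size at target_res_m from a known size at base_res_m."""
    refinement = base_res_m / target_res_m
    return base_size_gb * refinement ** 2

# Start from the 100m grid, which the slide gives as 500MB (0.5GB).
for res in (30, 10, 1):
    print(f"{res}m: about {scaled_size_gb(0.5, 100, res):.1f} GB")
```

The 30m estimate comes out slightly above 5.5GB, and the 10m and 1m estimates match the 50GB and 5TB figures.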

  6. Problems with Existing Software • GRASS • r.watershed • Killed after 17 days on a 50MB dataset • TARDEM • flood, d8, aread8 • Can handle the 50MB dataset • Killed after running for 20 days on a 130MB dataset • CPU utilization: 5%, 3GB swap file • ArcInfo • flowdirection, flowaccumulation • Can handle the 130MB dataset • Doesn’t work for files bigger than 2GB

  7. Our Results: TerraFlow • Collection of programs for flow routing and flow accumulation on massive grids • Theoretical results • Flow routing and flow accumulation modeled as graph problems and solved in optimal bounds • Practical results • Efficient • 2-1000 times faster than existing software on massive grids • Scalable • 1 billion elements!! (>2GB data) • Flexible • Outputs similar to ArcInfo flowdirection and flowaccumulation http://www.cs.duke.edu/geo*/terraflow

  8. Scalability: Why? How? • Massive data • Data does not fit in memory • OS places data on disk and moves data in and out of memory • Data is moved in blocks • Accessing disk is 1000 times slower than accessing main memory → disk I/O is the bottleneck! • Local data accesses vs. scattered data accesses

  9. Local Accesses vs. Scattered Accesses • Example: reading an array from disk • Array size N = 10 elements • Disk block size = 2 elements • Memory size = 4 elements (2 blocks) • [Figure: array stored in order on disk in blocks (1 2) (3 4) (5 6) (7 8) (9 10)]

  20. Local Accesses vs. Scattered Accesses • Example: reading an array from disk • Array size N = 10 elements • Disk block size = 2 elements • Memory size = 4 elements (2 blocks) • [Figure: array stored in order on disk in blocks (1 2) (3 4) (5 6) (7 8) (9 10)] • Loads 5 blocks

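  21. Local Accesses vs. Scattered Accesses • Example: reading an array from disk • Array size N = 10 elements • Disk block size = 2 elements • Memory size = 4 elements (2 blocks) • [Figure: array stored in order on disk in blocks (1 2) (3 4) (5 6) (7 8) (9 10)] • Loads 5 blocks • [Figure: array scattered on disk in blocks (1 5) (2 6) (3 8) (9 4) (7 10)]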

  32. Local Accesses vs. Scattered Accesses • Example: reading an array from disk • Array size N = 10 elements • Disk block size = 2 elements • Memory size = 4 elements (2 blocks) • [Figure: array stored in order on disk in blocks (1 2) (3 4) (5 6) (7 8) (9 10)] • Loads 5 blocks • [Figure: array scattered on disk in blocks (1 5) (2 6) (3 8) (9 4) (7 10)] • Loads 10 blocks • N/B blocks << N blocks
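The block counts in this example can be reproduced with a short simulation. The sketch below reads elements 1 through 10 in order from two on-disk layouts and counts block loads with room for 2 blocks in memory. The least-recently-used eviction policy is an assumption (the slides do not name one), and the function name is hypothetical.

```python
from collections import OrderedDict

def count_block_loads(layout, block_size, memory_blocks, access_order):
    """Count disk block loads needed to read values in access_order.

    layout lists the values in the order they are stored on disk;
    memory holds memory_blocks blocks with LRU eviction (an assumption).
    """
    position = {value: i for i, value in enumerate(layout)}
    cache = OrderedDict()  # block id -> True, oldest entry first
    loads = 0
    for value in access_order:
        block = position[value] // block_size
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            loads += 1                     # block must be fetched from disk
            cache[block] = True
            if len(cache) > memory_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return loads

access = list(range(1, 11))
in_order = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scattered = [1, 5, 2, 6, 3, 8, 9, 4, 7, 10]

local_loads = count_block_loads(in_order, 2, 2, access)
scattered_loads = count_block_loads(scattered, 2, 2, access)
```

With the in-order layout each block is loaded once (N/B = 5 loads); with the scattered layout every access misses and the same scan costs N = 10 loads.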

  33. Scalability: Why? How? • Massive data • Data does not fit in memory • OS places data on disk and moves data in and out of memory • Data is moved in blocks • Accessing disk is 1000 times slower than accessing main memory → disk I/O is the bottleneck! • Local data accesses vs. scattered data accesses • N/B << N block transfers • However good the OS, it cannot change the data access pattern of the program!

  34. TerraFlow Approach Improve locality by redesigning algorithms • Block size at least 8KB (32KB, 64KB) • Compute on whole block while it is in memory • Avoid loading a block each time • Speedup = block size! • I/O-Efficient algorithms http://www.cs.duke.edu/geo*/terraflow
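As a minimal illustration of "compute on whole block while it is in memory", the sketch below scans a binary file of integers one block at a time and finishes all work on a block before reading the next. It is not TerraFlow code: the file format (native C ints, 4 bytes on common platforms), the 64KB default, and both function names are assumptions made for the example.

```python
import array
import os
import tempfile

def write_values(values):
    """Write values as native C ints to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        array.array("i", values).tofile(f)
    return path

def scan_min_max(path, block_bytes=64 * 1024):
    """Return (min, max, blocks_read), touching each block exactly once."""
    item = array.array("i").itemsize
    assert block_bytes % item == 0, "block must hold whole elements"
    lo = hi = None
    blocks = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_bytes)
            if not chunk:
                break
            blocks += 1
            values = array.array("i")
            values.frombytes(chunk)
            # All work on this block happens while it is in memory.
            block_lo, block_hi = min(values), max(values)
            lo = block_lo if lo is None else min(lo, block_lo)
            hi = block_hi if hi is None else max(hi, block_hi)
    return lo, hi, blocks
```

A scan written this way performs about N/B block reads, where B is the number of elements per block, instead of up to N reads for an access pattern that revisits blocks after they have been evicted.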

  35. Related Work • TerraFlow’s emphasis • Computational aspects, not modeling • Flow modeling • [O’Callaghan and Mark 1984] • D8 method for flow accumulation • [Jenson and Domingue 1988] • General technique of flooding • Existing software • ArcInfo, GRASS, TARDEM, Topaz, Tapes-G, RiverTools

  36. Flow Routing on Flat Areas …no obvious flow direction

  37. TerraFlow Outline • Flow routing • Flood the terrain to eliminate sinks • Identify watersheds and construct watershed graph • Collapse watershed graph and raise sinks • Flow accumulation • Sweep terrain top-down to distribute flow • All these steps can be solved I/O-Efficiently http://www.cs.duke.edu/geo*/terraflow
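The flooding step can be pictured with a small in-memory sketch. The code below uses the priority-queue flooding approach often called priority-flood; it is a swapped-in technique chosen for brevity, not TerraFlow's I/O-efficient watershed-graph method. It assumes water can leave the terrain at any boundary cell, and the function name is hypothetical.

```python
import heapq

def fill_sinks(elev):
    """Raise every cell to the lowest level from which water can reach the boundary."""
    rows, cols = len(elev), len(elev[0])
    filled = [row[:] for row in elev]
    visited = [[False] * cols for _ in range(rows)]
    heap = []
    # Seed the queue with all boundary cells (assumed to be outlets).
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (elev[r][c], r, c))
                visited[r][c] = True
    # Grow inwards from the lowest cell seen so far.
    while heap:
        height, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols \
                        and not visited[nr][nc]:
                    visited[nr][nc] = True
                    # A neighbour lower than the current water level is in a sink.
                    filled[nr][nc] = max(elev[nr][nc], height)
                    heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled

pit = [
    [5, 5, 5, 5],
    [5, 1, 2, 5],
    [5, 2, 3, 4],
    [5, 5, 5, 5],
]
flooded = fill_sinks(pit)
```

In the sample grid the interior depression is raised to height 4, the level of its lowest outlet on the boundary. The result is a flat area, which is why flow routing on flat areas needs separate treatment.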

  38. Datasets http://www.cs.duke.edu/geo*/terraflow

  39. TerraFlow vs. ArcInfo http://www.cs.duke.edu/geo*/terraflow

  40. TerraFlow – Performance • Significant speedup over ArcInfo for large grids • East-Coast dataset • ArcInfo: 78 hours • TerraFlow: 8.7 hours • Washington State dataset • TerraFlow: 63 hours • ArcInfo: Cannot process files larger than 2GB! http://www.cs.duke.edu/geo*/terraflow

  41. TerraFlow Features • Flow directions, Flow accumulation • SFD (single flow directions) • MFD (multiple flow directions) (SFD,SFD), (MFD,MFD), (MFD,MFD) • Flow accumulation • Use MFD and switch to SFD when flow value exceeds a user-defined threshold http://www.cs.duke.edu/geo*/terraflow
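The difference between SFD and MFD, and the threshold switch between them, can be sketched for a single cell. The weighting rule used here for MFD (flow split among all downslope neighbours in proportion to slope) is a common convention and an assumption; the slides do not state TerraFlow's exact rule. The function name and parameters are hypothetical.

```python
import math

def flow_weights(elev, r, c, flow_value, threshold):
    """Return {neighbour: fraction of flow} for cell (r, c).

    Uses MFD while flow_value is at most threshold, then switches to SFD.
    """
    rows, cols = len(elev), len(elev[0])
    slopes = {}
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            nr, nc = r + dr, c + dc
            if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                slope = (elev[r][c] - elev[nr][nc]) / math.hypot(dr, dc)
                if slope > 0:
                    slopes[(nr, nc)] = slope
    if not slopes:
        return {}  # no downslope neighbour: sink or flat area
    if flow_value > threshold:
        # SFD: all flow goes to the steepest downslope neighbour.
        steepest = max(slopes, key=slopes.get)
        return {steepest: 1.0}
    # MFD: split flow in proportion to slope (assumed weighting).
    total = sum(slopes.values())
    return {cell: s / total for cell, s in slopes.items()}

elev = [
    [9, 9, 9],
    [3, 4, 9],
    [9, 1, 9],
]
mfd = flow_weights(elev, 1, 1, flow_value=10, threshold=100)
sfd = flow_weights(elev, 1, 1, flow_value=500, threshold=100)
```

For the centre cell, MFD sends a quarter of the flow west and three quarters south, while SFD sends everything south once the flow value exceeds the threshold.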

  42. TerraFlow: Result samples http://www.cs.duke.edu/geo*/terraflow

  43. TerraFlow Results Samples http://www.cs.duke.edu/geo*/terraflow

  44. Conclusions / Future Work • TerraFlow - Flow modeling • More features • Modeling • New applications http://www.cs.duke.edu/geo*/terraflow
