Flow Computation on Massive Grid Terrains. Lars Arge and Laura Toma, Dept. of Computer Science, Duke University, USA. Helena Mitasova, Dept. of Marine, Earth & Atmospheric Sciences, NCSU, USA. http://www.cs.duke.edu/geo*/terraflow


Flow Computation on Massive Grid Terrains

Lars Arge

Laura Toma

Dept. of Computer Science

Duke University, USA


Helena Mitasova

Dept. of Marine, Earth & Atmospheric Sciences, NCSU, USA



Modeling Flow on Grids

Flow direction

The direction water flows at a cell

Flow routing

Compute flow directions for all cells in the terrain, including flat areas

Flow accumulation value

Total amount of water that flows through a cell per unit width of contour

Flow is distributed according to the flow directions

Flow accumulation

Compute flow accumulation values for all cells in the terrain
Modeling Flow

Sierra-Nevada DEM

Flow Direction

Flow Accumulation

Automatic estimation of various terrain parameters

watershed basins

stream network

topographic indices

Surface saturation

Soil water content



Forest structure

Sediment transport

Solar radiation

Massive Data
  • Remote sensing data available
    • NASA-SRTM (whole Earth 5TB at 30m resolution)
    • USGS (entire US at 10m resolution)
    • LIDAR (1m resolution)
  • Ex: Appalachian Mountains dataset
      • 100m resolution (500MB)
      • 30m resolution (5.5GB)
      • 10m resolution (50GB)
      • 1m resolution (5TB)
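
The sizes in the Appalachian example grow roughly with the inverse square of the cell size, because halving the cell width quadruples the number of cells. A minimal sketch of that arithmetic, taking the 500MB figure for the 100m grid as the baseline (the function name `scaled_size_mb` is illustrative and not from the presentation):

```python
def scaled_size_mb(base_size_mb, base_res_m, target_res_m):
    """Estimate the dataset size when the same area is sampled on a finer grid.

    Assumes a fixed number of bytes per cell, so size scales with cell count.
    """
    return base_size_mb * (base_res_m / target_res_m) ** 2


# Baseline from the slide: 100m resolution is about 500MB.
for res_m in (30, 10, 1):
    size_gb = scaled_size_mb(500, 100, res_m) / 1000
    print(f"{res_m}m resolution: about {size_gb:.1f} GB")
```

The estimates land close to the listed 5.5GB, 50GB and 5TB, which suggests the listed sizes simply follow the cell count.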
Process Massive Data?
  • GRASS
    • r.watershed, ...
    • Killed after running for 17 days on a 6700 x 4300 grid (approx. 50MB dataset)
  • TARDEM
    • flood, d8, aread8
    • Killed after running for 20 days on a 12000 x 10000 grid (approx. 240MB dataset)
      • CPU utilization 5%, 3GB swap file
  • ArcInfo
    • flowdirection, flowaccumulation
    • Can handle the 130MB dataset
    • Doesn’t work for datasets bigger than 2GB
  • Terraflow is our suite of programs for flow routing and flow accumulation on massive grids [ATV`00, AC&al`02]
  • Flow routing and flow accumulation modeled as graph problems and solved in optimal I/O bounds
  • Efficient
    • 2-1000 times faster on very large grids than existing software
  • Scalable
    • 1 billion elements!! (>2GB data)
  • Flexible
    • Allows for both D8 and D-inf flow modeling


r.terraflow
  • Port of Terraflow into GRASS
  • Preliminary results on
    • Augment with additional features
      • Output plateaus, depressions, tci, water outlet queries, watershed basins
    • Comparison with GRASS flow routines
      • r.watershed, r.flow, r.topidx, ...
    • Performance results
  • Scalability to large data
    • Why standard programs are not in general scalable
    • One approach to improve scalability
      • I/O-efficient algorithms
  • r.terraflow
    • Algorithm outline
    • Related work and programs
    • Preliminary comparison and performance results
    • Output illustration
Scalability to Massive Data

  • Most GIS programs assume data fits in memory and minimize only CPU computation
  • But massive data does not fit in main memory! The OS places data on disk and moves data in and out of memory
    • Data is moved in blocks
    • Accessing the disk is 1000 times slower than accessing main memory
    • When processing massive data, disk I/O is the bottleneck rather than CPU time!
Scalability to Massive Data

  • Local data accesses vs. scattered data accesses
  • Example: reading an array from disk
    • Array size N = 10 elements
    • Disk block size = 2 elements
    • Memory size = 4 elements (2 blocks)

1 5 2 6 3 8 9 4 7 10

Algorithm 1: Loads 10 blocks

1 2 10 9 5 6 3 4 8 7

Algorithm 2: Loads 5 blocks

N blocks >> N/B blocks
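
The block counts in this example can be reproduced with a small simulation. The sketch below reads the two element sequences as on-disk layouts, visits the elements in increasing order 1..N, and assumes memory evicts the least recently used block; that eviction policy and the name `count_block_loads` are assumptions, not something the slides specify.

```python
from collections import OrderedDict


def count_block_loads(layout, block_size, memory_blocks):
    """Count disk block loads when elements are visited in increasing order.

    `layout` lists the elements in the order they are stored on disk.
    Memory holds `memory_blocks` blocks with LRU eviction (an assumption).
    """
    block_of = {value: pos // block_size for pos, value in enumerate(layout)}
    cache = OrderedDict()  # block ids, ordered from least to most recently used
    loads = 0
    for value in sorted(layout):
        block = block_of[value]
        if block in cache:
            cache.move_to_end(block)  # hit: mark block as most recently used
        else:
            loads += 1  # miss: the block must be read from disk
            cache[block] = None
            if len(cache) > memory_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return loads


scattered = [1, 5, 2, 6, 3, 8, 9, 4, 7, 10]
local = [1, 2, 10, 9, 5, 6, 3, 4, 8, 7]
print(count_block_loads(scattered, block_size=2, memory_blocks=2))  # 10
print(count_block_loads(local, block_size=2, memory_blocks=2))      # 5
```

With the scattered layout every access misses, so the cost is N block loads; with the local layout both elements of each block are used while it is in memory, so the cost drops to N/B.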

  • r.watershed
    • r.watershed -m el=elev_grid dir=dir_grid ac=accu_grid
    • Running on a 500MHz PIII, 1GB RAM, FreeBSD
    • On the Hawaii dataset we let it run for 17 days, in which time it completed 65%

However good the OS, it cannot change the data access pattern of the program!!

TerraFlow Approach

Redesign the algorithm to be I/O-Efficient

  • Block size is large: at least 8KB (often 32KB or 64KB)
  • Compute on whole block while it is in memory
    • Avoid loading a block each time
    • Improved locality
    • Speedup = block size

I/O-efficient algorithms

  • Measure of complexity: number of blocks transferred between main memory and disk
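
To make the block-transfer measure concrete, the sketch below evaluates the two standard bounds of the external-memory model: scanning N elements and sorting N elements. The formulas are the textbook bounds for that model rather than anything stated on the slides, constant factors are ignored, and the names `scan_ios` and `sort_ios` are illustrative. N, M (memory size) and B (block size) are all counted in elements.

```python
import math


def scan_ios(n, b):
    """Block transfers needed to read n elements stored contiguously."""
    return math.ceil(n / b)


def sort_ios(n, m, b):
    """Sorting bound (n/b) * log_{m/b}(n/b), ignoring constant factors."""
    blocks = n / b
    # At least one pass over the data is always needed.
    return blocks * max(1.0, math.log(blocks, m / b))


# The array from the scalability example: N = 10 elements, B = 2 elements.
print(scan_ios(10, 2))  # 5
```

Touching each element through its own block load costs N transfers instead, which is the gap the scalability example illustrates.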


r.terraflow outline

Step 1: Flow routing

Water flows downhill: SFD, MFD

  • Compute SFD/MFD flow directions by inspecting 8 neighbor points
  • Identify flat areas: plateaus and sinks
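
A minimal in-memory sketch of the SFD (D8) rule follows: each cell points to the neighbor with the steepest downward slope, and cells with no lower neighbor are left unassigned as flat cells or sinks. It assumes a small list-of-lists grid with unit cell spacing and does nothing special at the grid boundary; the names are illustrative, and this is not the I/O-efficient implementation.

```python
import math

# Steps to the 8 neighbors as (row, col) offsets.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def sfd_directions(elev):
    """Return, per cell, the (dr, dc) step to the steepest downslope neighbor.

    A cell with no strictly lower neighbor gets None (flat cell or sink).
    """
    rows, cols = len(elev), len(elev[0])
    directions = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in NEIGHBORS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbors are sqrt(2) cell widths away.
                    slope = (elev[r][c] - elev[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best_slope = slope
                        directions[r][c] = (dr, dc)
    return directions


elev = [
    [9, 8, 7],
    [8, 5, 4],
    [7, 4, 1],
]
dirs = sfd_directions(elev)
print(dirs[0][0])  # (1, 1): the top-left cell drains diagonally
print(dirs[2][2])  # None: the lowest cell has no downslope neighbor
```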


Flow Routing on Flat Areas

…no obvious flow direction

  • Plateaus
    • Assign flow directions such that each cell flows towards the nearest spill point of the plateau
  • Sinks
    • Either catch the water inside the sink
      • Assign flow directions towards the center of the sink
    • Or route the water outside the sink using uphill flow directions
      • Simulate flooding the terrain: sinks → plateaus
      • Assign uphill flow directions on the original terrain by assigning downhill flow directions on the flooded terrain
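
The flooding step can be sketched in memory with a priority queue that grows inward from the terrain boundary, raising every sink to the elevation of its lowest spill point. This is a priority-flood style stand-in chosen for brevity, not TerraFlow's I/O-efficient graph formulation; the function name `flood` and the assumption that water can leave through any boundary cell are illustrative.

```python
import heapq


def flood(elev):
    """Return a copy of the grid with every sink raised to its spill elevation."""
    rows, cols = len(elev), len(elev[0])
    filled = [row[:] for row in elev]
    visited = [[False] * cols for _ in range(rows)]
    heap = []
    # Assumption: water can leave the terrain through any boundary cell.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (elev[r][c], r, c))
                visited[r][c] = True
    while heap:
        level, r, c = heapq.heappop(heap)  # lowest cell reached so far
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                    visited[nr][nc] = True
                    # A cell below the current water level is raised to it.
                    filled[nr][nc] = max(elev[nr][nc], level)
                    heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled


terrain = [
    [5, 5, 5, 5, 5],
    [5, 2, 2, 2, 5],
    [5, 2, 1, 2, 4],
    [5, 2, 2, 2, 5],
    [5, 5, 5, 5, 5],
]
flooded = flood(terrain)
print(flooded[2][2])  # 4: the sink is raised to the spill point on the right edge
```

After flooding, the former sink is a plateau at elevation 4, so the plateau rule (flow towards the nearest spill point) can assign its directions.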
r.terraflow outline

Step 2: Compute flow accumulation

  • Water flows following the flow directions
  • Goal: Compute the total amount of water through each grid cell
    • Initially one unit of water in each grid cell
    • Every cell distributes water to the neighbors pointed to by its flow direction(s)

All these steps can be solved I/O-efficiently

  • Flow routing: modeled as graph problems (breadth-first search, connected components, graph contraction)
  • Flow accumulation: sweeping using an I/O-efficient priority queue
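
The sweeping idea for flow accumulation can be sketched in memory as follows: visit cells from highest to lowest, and let each cell send its water forward through a priority queue keyed by the sweep position of the receiving cell. The sketch assumes SFD directions, a grid without flat areas, and an ordinary in-memory `heapq` in place of an I/O-efficient priority queue; all names are illustrative.

```python
import heapq
import math


def steepest_neighbor(elev, r, c):
    """Return (row, col) of the steepest strictly lower neighbor, or None."""
    best, best_slope = None, 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            nr, nc = r + dr, c + dc
            if (dr or dc) and 0 <= nr < len(elev) and 0 <= nc < len(elev[0]):
                slope = (elev[r][c] - elev[nr][nc]) / math.hypot(dr, dc)
                if slope > best_slope:
                    best, best_slope = (nr, nc), slope
    return best


def flow_accumulation(elev):
    """SFD flow accumulation by sweeping cells in decreasing elevation order."""
    rows, cols = len(elev), len(elev[0])
    accum = [[1.0] * cols for _ in range(rows)]  # one unit of water per cell
    order = sorted((-elev[r][c], r, c) for r in range(rows) for c in range(cols))
    pending = []  # heap of (sweep key of receiving cell, amount of flow)
    for key in order:
        _, r, c = key
        # Collect everything that higher cells sent to this cell.
        while pending and pending[0][0] == key:
            accum[r][c] += heapq.heappop(pending)[1]
        target = steepest_neighbor(elev, r, c)
        if target is not None:
            tr, tc = target
            heapq.heappush(pending, ((-elev[tr][tc], tr, tc), accum[r][c]))
    return accum


grid = [
    [9, 8, 7],
    [8, 5, 4],
    [7, 4, 1],
]
acc = flow_accumulation(grid)
print(acc[2][2])  # 9.0: all nine cells drain to the lowest corner
```

Because a cell only sends water to a strictly lower cell, every message is delivered later in the sweep, so the smallest key in the queue always belongs to the cell being visited or to one still ahead.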
Related Work
  • TerraFlow’s emphasis
    • Computational aspects, not modeling
  • Flow modeling
    • [O’Callaghan and Mark 1984]
      • D8 method for flow accumulation
    • [Jenson and Domingue 1988]
      • General technique of flooding
    • Software
      • GRASS, ArcInfo, Tardem, Topaz, Tapes-G, RiverTools
GRASS Raster Flow Functions
  • r.watershed
    • Most commonly used. Uses A* algorithm to determine flow of water. Ehlschlaeger, USACERL.
    • Input: elevation, [..]
    • Output: flow direction, flow accumulation, [watersheds, stream segments, slope length, slope steepness]
    • Flow direction grid equivalent to running r.drain for every cell on the grid
    • Watershed grid equivalent to running r.water.outlet for multiple outlets
    • r.drain
      • Traces the least-cost (steepest-downslope) flow path from a given cell. Stops in pits.
      • Input: elevation, point coordinates
      • Output: least-cost path
    • r.water.outlet
      • Generates a watershed basin from a flow direction map. Ehlschlaeger, USACERL.
      • Input: flow direction (from r.watershed), basin coordinates
      • Output: watershed basin map
GRASS Raster Flow Functions
  • r.basin.fill
    • Generates a raster map of watershed subbasins. Larry Band.
    • Input: stream network (from r.watershed), thinned ridge network (by hand!)
    • Output: watershed subbasins
  • r.topmodel, r.topidx
    • Simulates TOPMODEL, Keith Beven.
    • Input: elevation, basin, TOPMODEL parameters file
    • Output: flow direction, filled elevation, tci, watersheds, [..]
  • r.flow, r.flowmd
    • Constructs flowlines, flowpath lengths and flowline densities. Flowlines stop in pits. Mitas, Mitasova, Hofierka, Zlocha.
    • Input: elevation, [..]
    • Output: flowline density, flowlines (vector), lengths
  • More complex models
    • r.water.fea - Finite element analysis program for hydrologic simulations
    • r.hydro.CASC2D - Fully integrated distributed cascaded 2D hydrologic modeling.
    • r.wrat - Water Resource Assessment Tool
r.terraflow features
  • Input
    • elevation grid
  • Output
    • flow direction grid
      • SFD (D8) single flow directions
      • MFD (Dinf) multiple flow directions
    • flow accumulation grid
      • Option to switch to SFD when the flow value exceeds a user-defined threshold
    • topographic convergence index (tci) grid
    • plateau and depressions grid
GRASS:>r.terraflow help

Flow computation for massive grids.

r.terraflow [-sq] elev=name filled=name direction=name watershed=name accumulation=name tci=name [d8cut=value] [memory=value] [STREAM_DIR=name] [stats=name]

  -s            SFD (D8) flow (default is MFD)
  -q            Quiet

  elev          Input elevation grid
  filled        Output (filled) elevation grid
  direction     Output direction grid
  watershed     Output watershed grid
  accumulation  Output accumulation grid
  tci           Output tci grid
  d8cut         If flow accumulation is larger than this value it is routed using SFD (D8) direction (meaningful only for MFD flow).
                default: infinity
  memory        Main memory size (in MB)
                default: 300
  STREAM_DIR    Location of intermediate STREAMs
                default: /var/tmp
  stats         Stats file
                default: stats.outv

Preliminary Experimental Results

PIII dual 1GHz processor, 1GB RAM

Conclusions/Future Work
  • Work in progress
    • More features
      • Water outlet queries
      • Watershed delineation
    • Experimental analysis
  • Other features?
  • Modeling?
  • Other (intensive computing, I/O-bound) applications?