Scaling of the Community Atmospheric Model to ultrahigh resolution


  1. Scaling of the Community Atmospheric Model to ultrahigh resolution Michael F. Wehner, Lawrence Berkeley National Laboratory, mfwehner@lbl.gov, with Pat Worley (ORNL), Art Mirin (LLNL), Lenny Oliker (LBNL), and John Shalf (LBNL)

  2. Motivations • First meeting of the WCRP Modeling Panel (WMP) • Convened at the UK Met Office in October 2005 by Shukla • Discussion focused on the benefits and costs of climate and weather models approaching 1 km in horizontal resolution • Eventual white paper by Shukla and Shapiro for the WMO JSC • “Counting the Clouds”, a presentation by Dave Randall (CSU) to DOE SciDAC (June 2005) • Randall presents a compelling argument for global atmospheric models that resolve cloud systems rather than parameterize them • Presentation is on the web at www.scidac.org

  3. fvCAM • NCAR Community Atmospheric Model version 3.1 • Finite Volume hydrostatic dynamics (Lin-Rood) • Parameterized physics is the same as the spectral version • Our previous studies focus on the performance of fvCAM with a 0.5°×0.625°×28L mesh on a wide variety of platforms (see Pat Worley’s talk this afternoon) • In the present discussion, we consider the scaling behavior of this model over a range of existing mesh configurations and extrapolate to ultra-high horizontal resolution.

  4. Operations count • Exploit three existing horizontal resolutions to establish the scaling behavior of the number of operations per fixed simulation period • Existing resolutions (all 28 vertical levels): • “B” 2°×2.5° • “C” 1°×1.25° • “D” 0.5°×0.625° • Define: m = # of longitudes, n = # of latitudes

  5. Operations Count (Scaling) • Parameterized physics • Time step can remain constant • Ops ∝ m · n • Dynamics • Time step determined by the Courant condition • Ops ∝ m · n · n • Filtering • Allows violation of an overly restrictive Courant condition near the poles • Ops ∝ m · log(m) · n · n (see the sketch below)
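To make these growth rates concrete, here is a minimal sketch (in Python, with all per-point cost constants normalized away, so only relative growth is meaningful) that evaluates the three proportionalities above for the B, C, and D meshes and the strawman “I” mesh, taking m ≈ 360°/Δlon and n ≈ 180°/Δlat as an approximation to the actual FV grids:

    import math

    # mesh name -> (dlat, dlon) in degrees; "I" is the strawman ultrahigh mesh
    meshes = {"B": (2.0, 2.5), "C": (1.0, 1.25),
              "D": (0.5, 0.625), "I": (0.015, 0.02)}

    for name, (dlat, dlon) in meshes.items():
        n = round(180.0 / dlat)             # latitudes
        m = round(360.0 / dlon)             # longitudes
        physics = m * n                     # fixed physics time step
        dynamics = m * n * n                # Courant condition shrinks dt with dx
        filters = m * math.log(m) * n * n   # FFT-based polar filtering
        print(f"{name}: m={m} n={n} physics~{physics:.1e} "
              f"dynamics~{dynamics:.1e} filters~{filters:.1e}")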

  6. Operations Count (Physics)

  7. Operations Count (Dynamics)

  8. Operations Count (Filters)

  9. Sustained computation rate requirements • A reasonable metric in climate modeling is that the model must run 1000 times faster than real time • Millennium-scale control runs complete in a year • Century-scale transient runs complete in about a month

  10. Can this code scale to these speeds? • Domain decomposition strategies • Np = number of subdomains, Ng = number of grid points • The existing strategy is 1D in the horizontal (over latitude) • A better strategy is 2D in the horizontal • Note: fvCAM also uses a vertical decomposition and OpenMP parallelism to increase processor utilization

  11. Processor scaling • The performance data from fvCAM fits the first model well but tells us little about future technologies • A practical constraint is that the number of subdomains is limited to be less than or equal to the number of horizontal cells • At three cells across per subdomain, complete communication of the model’s data is required • This constraint provides an estimate of the maximum number of subdomains (~ processors) as well as the minimum processor performance required to achieve the 1000X real time metric (in the absence of communication costs); see the sketch below
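A back-of-envelope check of these limits for the “I” mesh (a sketch: the ~10 Pflop/s sustained total is taken from the strawman slide below, the 2D subdomain count from the chart on the next slide, and communication costs are ignored as stated above):

    # "I" mesh: m ~ 360/0.02 = 18000 longitudes, n ~ 180/0.015 = 12000 latitudes
    n = 12000

    max_1d = n // 3      # 1D: latitude bands at least 3 rows wide -> ~4,000
    max_2d = 2_123_366   # 2D: value from the chart on the next slide

    total = 10e15        # ~10 Pflop/s sustained (strawman slide)
    print(f"min speed per processor, 1D: {total / max_1d:.1e} flop/s")  # ~2.5e12
    print(f"min speed per processor, 2D: {total / max_2d:.1e} flop/s")  # ~4.7e9
    # Adding ~10 vertical domains on top of the 2D decomposition drops the
    # requirement to ~470 Mflop/s, consistent with the strawman slide.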

  12. Maximum number of horizontal subdomains [chart]: ~3,840 with the 1D decomposition vs. ~2,123,366 with the 2D decomposition for the “I” mesh

  13. Minimum processor speed to achieve 1000X real time [chart], assuming no vertical decomposition and no OpenMP

  14. Total memory requirements [chart]

  15. Memory scales more slowly than processor speed because the Courant condition adds a factor of n to the operations count but not to the storage.
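A quick illustration of this point, comparing the “D” mesh with the strawman “I” mesh (a sketch: memory is taken as proportional to the number of grid points per level, and the required speed as proportional to m·n² per the dynamics scaling above):

    # memory ~ m*n (grid points per level); required speed ~ m*n*n, because
    # the Courant condition adds a factor of n to the operations count
    mD, nD = 576, 360        # "D" mesh, 0.5 x 0.625 degrees
    mI, nI = 18000, 12000    # "I" mesh, 0.015 x 0.02 degrees

    mem_growth = (mI * nI) / (mD * nD)          # ~1,000x
    speed_growth = (mI * nI**2) / (mD * nD**2)  # ~35,000x
    print(f"memory grows ~{mem_growth:,.0f}x, required speed ~{speed_growth:,.0f}x")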

  16. Strawman 1km climate computer • “I” mesh at 1000X real time • 0.015°×0.02°×100L • ~10 Petaflops sustained • ~100 Terabytes total memory • ~2 million horizontal subdomains • ~10 vertical domains • ~20 million processors at 500 Mflop/s each sustained, including communication costs • 5 MB memory per processor • ~20,000 nearest-neighbor send-receive pairs of ~10 KB each per subdomain per simulated hour
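These figures can be cross-checked with a little arithmetic, using only numbers quoted on this slide: at 1000X real time, one simulated hour passes in 3.6 wall-clock seconds, which sets the per-subdomain interconnect load, and 100 TB spread over ~20 million processors gives the 5 MB per processor above:

    sim_hour_wall = 3600.0 / 1000.0   # 1 simulated hour = 3.6 s of wall clock
    msgs = 20_000                     # send-receive pairs per subdomain per sim hour
    msg_bytes = 10e3                  # ~10 KB each

    bw = msgs * msg_bytes / sim_hour_wall   # ~56 MB/s per subdomain
    rate = msgs / sim_hour_wall             # ~5,600 messages/s per subdomain
    mem = 100e12 / 20e6                     # 100 TB / 20M processors = 5 MB
    print(f"~{bw / 1e6:.0f} MB/s and ~{rate:,.0f} msgs/s per subdomain; "
          f"{mem / 1e6:.0f} MB memory per processor")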

  17. Conclusions • fvCAM could probably be scaled up to a 1.5 km mesh • Dynamics would have to be changed to fully non-hydrostatic • The scaling of the operations count is superlinear in horizontal resolution because of the Courant condition • Surprisingly, filtering does not dominate the calculation; the physics cost is negligible • A one-dimensional horizontal domain decomposition strategy will likely not work: the limits on processor number and performance are too severe • A two-dimensional horizontal domain decomposition strategy would be favorable but requires a code rewrite • It’s not as crazy as it sounds
