
From Athena to Minerva: A Brief Overview


Presentation Transcript


  1. From Athena to Minerva: A Brief Overview Ben Cash, Minerva Project Team, Minerva Workshop, GMU/COLA, September 16, 2013

  2. Athena Background • World Modeling Summit (WMS; May 2008) • Summit calls for a revolution in climate modeling to more rapidly improve climate model resolution, accuracy, and reliability • Recommends petascale supercomputers dedicated to climate modeling • Athena supercomputer • The U.S. National Science Foundation responds, offering to dedicate the retiring Athena supercomputer over a six-month period in 2009-2010 • An international collaboration is formed among groups in the U.S., Japan, and the U.K. to use Athena to take up the challenge

  3. Project Athena • Dedicated supercomputer • Athena was a Cray XT-4 with 18,048 computational cores • Replaced by new Cray XT-5, Kraken, with 99,072 cores (since increased) • #21 on June 2009 Top 500 list • 6 months, 24/7, 99.3% utilization • Over 1 PB data generated • Large international collaboration • Over 30 people • 6 groups • 3 continents • State-of-the-art global AGCMs • NICAM (JAMSTEC/U. Tokyo): Nonhydrostatic Icosahedral Atmospheric Model • IFS (ECMWF): Integrated Forecast System • Highest possible spatial resolution

  4. Athena Science Goals • Hypothesis: Increasing climate model resolution to accurately resolve mesoscale phenomena in the atmosphere (and ocean and land surface) can dramatically improve the fidelity of the models in simulating climate: mean, variances, covariances, and extreme events. • Hypothesis: Simulating the effect of increasing greenhouse gases on regional aspects of climate, especially extremes, may, for some regions, depend critically on the spatial resolution of the climate model. • Hypothesis: Explicitly resolving important processes, such as clouds in the atmosphere (and eddies in the ocean and landscape features on the continental surface), without parameterization, can improve the fidelity of the models, especially in describing the regional structure of weather and climate.

  5. Qualitative Analysis: 2009 NICAM Precipitation and Cloudiness, May 21-August 31

  6. Athena Catalog: http://www.wxmaps.org/athena/home/

  7. Athena Lessons Learned • Dedicated usage of a relatively big supercomputer greatly enhances productivity • Dealing with only a few users and their requirements allows for more efficient utilization of resources • Challenge: Dedicated simulation projects like Project Athena can generate enormous amounts of data to be archived, analyzed and managed. NICS (and TeraGrid) do not currently have enough storage capacity. Data management is a big challenge. • Preparation time: at least 2 to 3 weeks were needed before the beginning of dedicated runs to test and optimize the codes and to plan strategies for optimal use of the system. • Communication throughout the project was essential (weekly telecons, email lists, personal calls, …)

  8. Athena Limitations • Athena was a tremendous success, generating an enormous amount of data and a large number of papers for a six-month project. • BUT… • Limited number of realizations • Athena runs generally consisted of a single realization • No way to assess robustness of results (illustrated in the toy sketch after this slide) • Uncoupled models • Multiple, dissimilar models • Resources were split between IFS and NICAM • Differences in performance meant very different experiments were performed – difficult to directly compare results • Storage limitations and post-processing demands limited what could be saved for each model
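To illustrate why a single realization cannot establish robustness, here is a minimal Python sketch. It is not part of the Athena or Minerva workflow, and all numbers are synthetic placeholders: it simply contrasts two hypothetical model configurations using an ensemble of realizations, so that a difference in the mean can be weighed against internal variability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for a regional seasonal-mean diagnostic (e.g. precipitation)
# from two hypothetical model configurations; values are illustrative only.
n_members = 10
low_res = rng.normal(loc=5.0, scale=0.6, size=n_members)   # mm/day, toy numbers
high_res = rng.normal(loc=5.5, scale=0.6, size=n_members)  # mm/day, toy numbers

def mean_and_stderr(x):
    """Ensemble mean and standard error of the mean."""
    return x.mean(), x.std(ddof=1) / np.sqrt(len(x))

for name, ens in (("low-res", low_res), ("high-res", high_res)):
    m, se = mean_and_stderr(ens)
    print(f"{name}: {m:.2f} ± {se:.2f} mm/day ({len(ens)} members)")

# With a single realization per configuration (as in most Athena runs), the
# spread term is unavailable, so a difference between configurations cannot
# be distinguished from internal variability.
print(f"ensemble-mean difference: {high_res.mean() - low_res.mean():.2f} mm/day")
```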

  9. Minerva Background • NCAR Yellowstone • In 2012, the NCAR-Wyoming Supercomputing Center (NWSC) debuted Yellowstone, the successor to Bluefire, their previous production platform • IBM iDataPlex, 72,280 cores, 1.5 petaflops peak performance • #17 on June 2013 Top 500 list • 10.7 PB disk capacity – a vast increase over the capacity available during Athena • High-capacity HPSS data archive • Dedicated high-memory analysis clusters (Geyser and Caldera) • Accelerated Scientific Discovery (ASD) program • Recognizing that many groups would not be ready to take advantage of the new architecture, NCAR accepted a small number of proposals for early access to Yellowstone • 3 months of near-dedicated access before being opened to the general user community • Opportunity to continue the successful Athena collaboration between COLA and ECMWF, and to address limitations in the Athena experiments

  10. Minerva Timeline • March 2012 – Proposal finalized and submitted • 31 million core hours requested • April 2012 – Proposal accepted • 21 million core hours approved • Anticipated date of production start: July 21 • Code testing and benchmarking on Janus begins • October 5, 2012 • First login to Yellowstone – bcash reportedly user 1 • October – November 23, 2012 • Jobs are plagued by massive system instabilities and a conflict between the code and the Intel compiler

  11. Minerva Timeline continued • November 24 – December 1, 2012 • Code conflict resolved, low core count jobs avoid worst of system instability • Minerva jobs occupy 61,000 cores (!) • Peter Towers estimates Minerva easily sets record for “Most IFS FLOPs in a 24 hour period” • Jobs rapidly overrun initial 250 TB disk allocation, triggering request for additional resources • This becomes a Minerva project theme • Due to system instability, user accounts are not charged for jobs at this time • Roughly 7 million free core hours as a result: 28 million total • 800+ TB generated

  12. Minerva Catalog: Base Experiments • Minerva Catalog: Extended Experiments ** to be completed

  13. Qualitative Analysis: 2010 T1279 Precipitation, May – November

  14. Minerva Lessons Learned • Dedicated usage of a relatively big supercomputer greatly enhances productivity • Experience with the early-usage period demonstrates that tremendous progress can be made with dedicated access • Dealing with only a few users allows for more efficient utilization • Noticeable decrease in efficiency once scheduling of multiple jobs of multiple sizes was turned over to the system scheduler • NCAR resources were initially overwhelmed by the challenges of the new machine and the individual problems that arose • Focus on a single model allows for in-depth exploration • Data saved at much higher frequency • Multiple ensemble members, increased vertical levels, etc.

  15. Dedicated simulation projects like Athena and Minerva generate enormous amounts of data to be archived, analyzed and managed. Data management is a big challenge. • Other than machine instability, data management and post-processing were solely responsible for halts in production. • Even on a system designed with lessons from Athena in mind, production capabilities overwhelm storage and processing • Post-processing and storage must be incorporated into the production stream (a sketch of this idea follows this slide) • ‘Rapid burn’ projects such as Athena and Minerva are particularly prone to overwhelming storage resources
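As one way to picture “incorporating post-processing and storage into the production stream”, here is a minimal Python sketch (hypothetical paths and placeholder commands, not the actual Minerva scripts): each simulation segment is compressed and archived before the next segment starts, so raw output on the limited scratch file system stays bounded to roughly one segment at a time.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical chunked production loop (illustrative only): simulate a segment,
# then post-process and archive it before launching the next segment.
BASE = Path(tempfile.mkdtemp())          # stand-in for scratch and archive roots
SCRATCH = BASE / "scratch"
ARCHIVE = BASE / "archive"

def run_segment(segment: int) -> Path:
    """Stand-in for running one model segment; returns its output directory."""
    outdir = SCRATCH / f"segment_{segment:03d}"
    outdir.mkdir(parents=True, exist_ok=True)
    # A real workflow would launch the model executable here.
    (outdir / "output.nc").write_bytes(b"placeholder model output")
    return outdir

def postprocess_and_archive(outdir: Path) -> None:
    """Compress a segment's output, move it to the archive, then free scratch."""
    ARCHIVE.mkdir(parents=True, exist_ok=True)
    tarball = ARCHIVE / f"{outdir.name}.tar.gz"
    subprocess.run(
        ["tar", "-czf", str(tarball), "-C", str(outdir.parent), outdir.name],
        check=True,
    )
    for f in outdir.iterdir():           # delete raw output once archived
        f.unlink()
    outdir.rmdir()

for segment in range(1, 4):              # a few segments for illustration
    outdir = run_segment(segment)
    postprocess_and_archive(outdir)      # completes before the next segment starts
```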

  16. Beyond Minerva: A New Pantheon • Despite advances beyond Athena, more work to be done • Focus of Tuesday discussion • Fill in matrix of experiments • Further increases in ocean, atmospheric resolution • Sensitivity tests (aerosols, greenhouse gases) • ??
