
Performance of CMAQ on a Mac OS X System

Tracey Holloway, John Bachan, Scott Spak. Center for Sustainability and the Global Environment, University of Wisconsin-Madison. A presentation to the 3rd annual CMAS Models-3 conference, October 19, 2004.


Presentation Transcript


  1. Performance of CMAQ on a Mac OS X System. Tracey Holloway, John Bachan, Scott Spak. Center for Sustainability and the Global Environment, University of Wisconsin-Madison. A presentation to the 3rd annual CMAS Models-3 conference, October 19, 2004.

  2. Thinking different. • Motivation • Methods • Performance • Hardware • Release • Ongoing Improvements

  3. Motivations. • Simplified operation • Easier development • Easy clustering • Improved performance

  4. Motivation: Operation. • Single platform for all research and academic computing • User-friendly interface • UNIX OS • Open source software, hardware support • Today’s cluster node = tomorrow’s desktop

  5. Motivation: Development. • Better Developer Tools • Xcode • (Interface Builder) • CHUD performance & debugging suite • Distribution Tools • standardized profiles • PackageMaker • FAT binaries • automated installation

  6. Operation & Development.

  7. Motivation: Performance. • Unique Hardware Advantages • powerful PPC 970 vector chip • auto-vectorizing compilers • 2000 NASA Langley report • Populist Parallelization • mix dedicated cluster nodes with free cycles on personal & lab machines • off-the-shelf solutions • simple GUI and command-line tools

  8. Methods. • IBM XL Fortran v8.1 compiler • auto-vectorization • equivalent to AIX • Modifications • flag conversion • build settings • array passing • > 400 man-hours

  9. Performance. • 2 Test Machines • dual 2 GHz G5, 5 GB RAM, 1 GHz bus • stock dual 1 GHz G4, 1.5 GB RAM, 133 MHz bus • Mac OS X 10.3.5 • 1 Test Run • First day of CMAQ 4.3 tutorial • 1 day, 32 km x 32 km, 38 x 38, 6 layers • default EBI CB4 chemistry

  10. Benchmarks. Tutorial Runtime by Hardware and Compiler (seconds). IFC = Intel Fortran Compiler 7.1; PGF = Portland Group Compiler 4.0-2. Intel machines running CMAQ 4.2.2 on 2 processors with mpich parallelization. Source: Gail Tonnesen, “Benchmarks for CPUs and Compilers for the CMAQ 4.2.2 release.” Macs running CMAQ 4.3 on 1 processor (XLF) or 2 processors (XLF SMP) with OpenMP parallelization.

  11. Chemistry. Source: ACONC.nc output from Day 1 of CMAQ 4.3 tutorial Dual 2 GHz G5 running CMAQ 4.3 on 1 processor

  12. Good Chemistry. • Small difference from reference set • greater than difference among Intel machines and compilers • Noise, floating point calculations, initialization • greatest at surface level, early in run • ambient concentrations only • random distribution • no bias • does not propagate in time or space • not correlated to high or low concentrations • Consistent • G4/G5 • chemistry modules • compiler flags
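The checks listed on this slide (size of the difference, bias, correlation with concentration) could be reproduced along the following lines. This is a minimal sketch, not the authors' procedure: it assumes the test and reference concentrations have already been read from the ACONC output into NumPy arrays of identical shape, and the function name `compare_to_reference`, the choice of statistics, and the synthetic demo data are all hypothetical.

```python
import numpy as np


def compare_to_reference(test, ref):
    """Summarize how a test concentration field differs from a reference field."""
    test = np.asarray(test, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if test.shape != ref.shape:
        raise ValueError("test and reference arrays must have the same shape")

    diff = test - ref

    # Relative difference, skipping cells where the reference is exactly zero.
    nonzero = ref != 0
    rel = np.abs(diff[nonzero]) / np.abs(ref[nonzero])

    return {
        # Mean signed difference; a value near zero suggests no systematic bias.
        "mean_bias": float(diff.mean()),
        "max_abs_diff": float(np.abs(diff).max()),
        "max_rel_diff": float(rel.max()) if rel.size else 0.0,
        # Correlation between the difference and the reference concentration;
        # a value near zero suggests the difference is unrelated to high or low values.
        "corr_with_conc": float(np.corrcoef(diff.ravel(), ref.ravel())[0, 1]),
    }


if __name__ == "__main__":
    # Synthetic placeholder data only, NOT model output. The shape
    # (time, layer, row, column) = (24, 6, 38, 38) assumes hourly output
    # for the 1-day, 38 x 38, 6-layer tutorial domain.
    rng = np.random.default_rng(0)
    reference = rng.uniform(10.0, 100.0, size=(24, 6, 38, 38))
    candidate = reference * (1.0 + rng.normal(0.0, 1e-6, size=reference.shape))
    for name, value in compare_to_reference(candidate, reference).items():
        print(f"{name}: {value:.3e}")
```

Slicing the arrays by layer or time step before calling the function would give the per-level and per-hour view needed to see whether differences are greatest at the surface and early in the run.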

  13. Better Chemistry. Tutorial Runtime by Chemistry Module (seconds) Dual 2 GHz G5 running CMAQ 4.3 on 1 processor

  14. Models-3 on Mac, 10/04. Core Platform • MM5 (Fovell) • MCIP v2.2 • SMOKE v2.1 • CMAQ v4.3 • Libraries & Add-Ons • netCDF v3.5.1 • mpich v1.2.2-6 • I/O API v2.2 • MCPL Currently no PAVE, but Vis5d, VisAD, GrADS, NCL, and

  15. Hardware.

  16. Hardware. • Dedicated Cluster (18 G5 processors) • Xserve G5 Dual 2 GHz, 2 GB RAM • Xserve RAID 3.5 TB • 8 Power Mac G5 Dual 2 GHz, 5 GB RAM • Distributed Capacity (42 G4 processors) • student lab eMacs • personal G4 desktops • 60-processor vector cluster • 0 full-time sysadmins

  17. Cost Competitive. Apple • Xserve Dual G5 2GHz < $3500 • RAID storage at $3 per GB • G5 Desktop $2000 - 4000 Compare to • Dell PowerVault RAID at $5 per GB • Dell Precision dual Xeon 2.8 GHz, $1200 - 4200 • Sysadmin costs
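To make the per-GB storage prices concrete, the short sketch below applies them to an array the size of the 3.5 TB Xserve RAID listed on the hardware slide. It is illustrative arithmetic only: it assumes 1 TB = 1000 GB, assumes the same capacity for the Dell comparison, and uses a hypothetical helper name `raid_cost`.

```python
def raid_cost(capacity_gb, dollars_per_gb):
    """Return the storage cost in dollars for a given capacity and unit price."""
    return capacity_gb * dollars_per_gb


# 3.5 TB array, assuming 1 TB = 1000 GB (an assumption, not stated on the slide).
CAPACITY_GB = 3.5 * 1000

apple_cost = raid_cost(CAPACITY_GB, 3)  # Apple RAID storage at $3 per GB
dell_cost = raid_cost(CAPACITY_GB, 5)   # Dell PowerVault RAID at $5 per GB

print(f"Apple: ${apple_cost:,.0f}")
print(f"Dell:  ${dell_cost:,.0f}")
print(f"Difference: ${dell_cost - apple_cost:,.0f}")
```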

  18. JOHN SCOTT

  19. Release. • Following input from the CMAS Center • alpha code to CMAS by November, 2004 • CMAS testing • potential support • Following CMAS Testing, preliminary code, scripts, binaries, instructions • available for download at www.sage.wisc.edu/cmaq • Scott Spak will answer questions for early users: snspak@wisc.edu

  20. Ongoing improvements. • Our planned activities • g95 - GNU compilation • parallel implementations • Condor • Xgrid • Pooch/Appleseed • further optimization • Dual 2.5 GHz benchmarks • CMAQ MADRID • A community effort? • CMAQ Unified • MIMS • PAVE

  21. Acknowledgements. • Mary Sternitzky, UW • Seth Price, UW • Hans Vahlenkamp and NOAA GFDL • Zac Adelman and the CMAS Help Desk • Dr. Gail Tonnesen and Glen Kaukola, UCR • Models-3 Listserv All funding provided by the University of Wisconsin-Madison.
