Performance Analysis of OVERFLOW on Sandia Compute Clusters

DOD HPCMO Users Group Conference, June 2006, Denver, CO. Daniel W. Barnette, Sandia National Labs, Albuquerque, New Mexico.

Presentation Transcript


  1. DOD HPCMO Users Group Conference, June 2006, Denver, CO. Performance Analysis of OVERFLOW on Sandia Compute Clusters. Daniel W. Barnette, Sandia National Labs, Albuquerque, New Mexico. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

  2. Introduction
  • DOD asked DOE to participate … thanks!
  • Sandia has assembled a performance modeling team
  • I'm a newcomer to performance modeling, but not to CFD
  • Have been analyzing OVERFLOW for a few months now
  • Much more needs to be done (as always!)
  • Came to this conference to learn from you!
  • Certainly can use constructive guidance…
  • So here goes…

  3. OVERFLOW CFD Code
  • Mature compressible flow code
  • Mostly written and maintained by Pieter Buning, NASA LaRC
  • Uses overset grid technology, a powerful method for complex aerodynamic geometries
  • Used heavily in DOD, DOE, NASA, and Boeing
  • Lots of flexibility in how the code can be run
  • Used here to benchmark a 5-sphere test case…

  4. CFD Test Case: 5 Spheres, Mach 1.5 Flow

  5. Plane Of Symmetry

  6. [Figure: computed parallel efficiencies of PE = 72.9%, 91.1%, 83.6%, 85.2%, and 80.2%]

  7. PerfMod Database
  • Working with Sue Goudy and Ryan D. Scott (BYU summer intern) to establish the format
  • Will store metadata + run characteristics + data for post-processing
  • Will interface with the Python GUI (TBD) in an as-yet-undetermined form
  • Will provide search-and-retrieve functionality
  • Sorely needed to help analyze large datasets from multiple runs on multiple platforms
  • The database will be used to impact:
    • System cost/performance analyses
    • Application performance analyses
    • Determination of app-to-architecture mapping strategies
    • Etc.
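The kind of search-and-retrieve run database described above might look like the following minimal SQLite sketch. The table name, columns, and sample values are hypothetical illustrations, not the actual PerfMod schema:

```python
import sqlite3

# Hypothetical schema: per-run metadata plus timing data for post-processing.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE runs (
    run_id     INTEGER PRIMARY KEY,
    code       TEXT,     -- application name, e.g. 'OVERFLOW'
    platform   TEXT,     -- cluster the run was made on
    nprocs     INTEGER,  -- processor count
    wall_time  REAL      -- wall-clock seconds
);
""")
conn.execute("INSERT INTO runs VALUES (1, 'OVERFLOW', 'Redstorm', 64, 1234.5)")
conn.execute("INSERT INTO runs VALUES (2, 'OVERFLOW', 'Redstorm', 128, 700.2)")

# Search & retrieve: all runs on a given platform, ordered by processor count,
# ready to feed a scaling analysis.
rows = conn.execute(
    "SELECT nprocs, wall_time FROM runs WHERE platform = ? ORDER BY nprocs",
    ("Redstorm",),
).fetchall()
print(rows)  # [(64, 1234.5), (128, 700.2)]
```

A real schema would add run characteristics (grid sizes, compiler flags, MPI settings) as further columns or side tables, which is what makes cross-platform comparisons queryable.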

  8. Python GUI
  • Helps with submitting large numbers of jobs for performance modeling analysis
  • Helps new users get started on Sandia's clusters with point-and-click methods to compile, run, and post-process
  • Helps document runs for archival purposes by keeping track of input/output data
  • Will interface with R. Scott's database

  9. PerfMod GUI (work in progress)

  10. Future Work
  • Timing studies for a fixed-size problem over a range of processors
  • Run on the newest Redstorm (dual-core, faster NICs)
  • Performance comparison of OVERFLOW with PREMO, a Sandia CFD code, and other codes
  • MPI vs. MPI/SMP: does it make a difference?
  • Investigate how the cache is being utilized
  • Run with a minimized ratio t_comm/t_comp and compare with a non-optimal ratio (a software approach to greater performance?)

  12. Current OVERFLOW Strategy for Parallelized Overset Grids
  • Grids are broken up and then grouped to place a large number of points on each processor (load-balanced)
  • Grid-to-grid communication packet sizes are not considered (speed-up is not optimized)
  • A question needs to be asked: on any one processor, does minimizing overset grid surfaces (i.e., minimizing grid-to-grid communication) while maximizing overset grid volumes (i.e., maximizing compute time) have a significant effect on run time?
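The surface-versus-volume question posed above can be illustrated with a small sketch. This is a hypothetical model, not OVERFLOW's actual timing machinery: for a box-shaped sub-grid, communication cost scales roughly with boundary points (surface) while compute scales with interior points (volume), so t_comm/t_comp behaves like the surface-to-volume ratio:

```python
# Hypothetical proxy for t_comm/t_comp on a box-shaped sub-grid:
# communication ~ boundary points exchanged (surface area),
# compute ~ interior points updated (volume).
def surface_to_volume(ni, nj, nk):
    volume = ni * nj * nk
    surface = 2 * (ni * nj + nj * nk + ni * nk)
    return surface / volume

# A compact, cube-like sub-grid keeps the ratio lower than an elongated
# slab with the same total number of points (both hold 64,000 points):
compact = surface_to_volume(40, 40, 40)
slab = surface_to_volume(160, 40, 10)
print(compact, slab)  # the compact block has the smaller ratio
```

Under this toy model, two decompositions with identical load balance can still differ in communication cost, which is exactly why the question above matters.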

  13. BREAKUP, a Pre-processor to OVERFLOW
  • Code written 10 years ago
  • Prepares overset grids for parallel computers
  • Computes the average number of points per processor
  • Generates a near-uniform distribution of grid points per processor for 'sub-grids'
  • Constructs the appropriate connectivity tables
  • Sequences all grid orientation possibilities, choosing the orientation that minimizes the ratio t_comm/t_comp
  Examples …
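The orientation sweep described above could be sketched as follows. This is a hedged illustration, not BREAKUP's actual algorithm: the function name, the single-axis split, and the surface/volume proxy for t_comm/t_comp are all assumptions made for this example:

```python
from itertools import permutations

# Sketch of an orientation sweep: try each axis ordering of a (ni, nj, nk)
# grid, split near-uniformly along the first axis across nprocs processors,
# and keep the orientation whose sub-grids minimize a surface/volume proxy
# for the communication-to-compute ratio t_comm/t_comp.
def best_orientation(dims, nprocs):
    best = None
    for ni, nj, nk in permutations(dims):
        sub_i = max(1, ni // nprocs)              # points per processor along the split axis
        surface = 2 * (sub_i * nj + nj * nk + sub_i * nk)
        volume = sub_i * nj * nk
        ratio = surface / volume
        if best is None or ratio < best[0]:
            best = (ratio, (ni, nj, nk))
    return best

ratio, orient = best_orientation((200, 50, 20), 8)
print(orient, ratio)  # splitting along the longest axis wins here
```

Under these assumptions the sweep favors cutting the grid along its longest dimension, since that yields the most compact (lowest surface-to-volume) sub-grids.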

  14. The End … and thanks, DOD, for the invite! dwbarne@sandia.gov
