Performance Analysis of OVERFLOW on Sandia Compute Clusters

DOD HPCMO Users Group Conference, June 2006, Denver, CO. Daniel W. Barnette, Sandia National Labs, Albuquerque, New Mexico.

Presentation Transcript


  1. DOD HPCMO Users Group Conference, June 2006, Denver, CO. Performance Analysis of OVERFLOW on Sandia Compute Clusters. Daniel W. Barnette, Sandia National Labs, Albuquerque, New Mexico. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

  2. Introduction
  • DOD asked DOE to participate … thanks!
  • Sandia has assembled a performance modeling team
  • I'm a newcomer to performance modeling, but not to CFD
  • Have been analyzing OVERFLOW for a few months now
  • Much more needs to be done (as always!)
  • Came to this conference to learn from you!
  • Certainly can use constructive guidance…
  • So here goes…

  3. OVERFLOW CFD Code
  • Mature compressible flow code
  • Mostly written and maintained by Pieter Buning, NASA LaRC
  • Uses overset grid technology, a powerful method for complex aerodynamic geometries
  • Used heavily in DOD, DOE, NASA, and Boeing
  • Lots of flexibility in how the code can be run
  • Used here to benchmark a 5-sphere test case…

  4. CFD Test Case: 5 Spheres, Mach 1.5 Flow

  5. Plane Of Symmetry

  6. [Figure: computed parallel efficiencies of PE = 72.9%, 91.1%, 83.6%, 85.2%, and 80.2%]

  7. PerfMod Database
  • Working with Sue Goudy and Ryan D. Scott (BYU summer intern) to establish the format
  • Will store metadata + run characteristics + data for post-processing
  • Will interface with the Python GUI (TBD) in an as-yet-undetermined form
  • Will provide search-and-retrieve functionality
  • Sorely needed to help analyze large datasets from multiple runs on multiple platforms
  • The database will be used to impact:
    • System cost/performance analyses
    • Application performance analyses
    • Determination of app-to-architecture mapping strategies
    • Etc.
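The kind of search-and-retrieve run database described above might look like the following minimal SQLite sketch. The table name, columns, and sample values are hypothetical illustrations, not the actual PerfMod schema:

```python
import sqlite3

# Hypothetical schema: per-run metadata plus timing data for post-processing.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE runs (
    run_id     INTEGER PRIMARY KEY,
    code       TEXT,     -- application name, e.g. 'OVERFLOW'
    platform   TEXT,     -- cluster the run was made on
    nprocs     INTEGER,  -- processor count
    wall_time  REAL      -- wall-clock seconds
);
""")
conn.execute("INSERT INTO runs VALUES (1, 'OVERFLOW', 'Redstorm', 64, 1234.5)")
conn.execute("INSERT INTO runs VALUES (2, 'OVERFLOW', 'Redstorm', 128, 700.2)")

# Search & retrieve: all runs on a given platform, ordered by processor count,
# ready to feed a scaling analysis.
rows = conn.execute(
    "SELECT nprocs, wall_time FROM runs WHERE platform = ? ORDER BY nprocs",
    ("Redstorm",),
).fetchall()
print(rows)  # [(64, 1234.5), (128, 700.2)]
```

A real schema would add run characteristics (grid sizes, compiler flags, MPI settings) as further columns or side tables, which is what makes cross-platform comparisons queryable.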

  8. Python GUI
  • Helps with submitting large numbers of jobs for performance modeling analysis
  • Helps new users get started on Sandia's clusters with point-and-click methods to compile, run, and post-process
  • Helps document runs for archival purposes by keeping track of input/output data
  • Will interface with R. Scott's database

  9. PerfMod GUI (work in progress)

  10. Future Work
  • Timing studies for a fixed-size problem over a range of processors
  • Run on the newest Redstorm (dual-core, faster NICs)
  • Performance comparison of OVERFLOW with PREMO, a Sandia CFD code, and other codes
  • MPI vs. MPI/SMP: does it make a difference?
  • Investigate how the cache is being utilized
  • Run with a minimized ratio t_comm/t_comp and compare with a non-optimal ratio (a software approach to greater performance?)

  12. Current OVERFLOW Strategy for Parallelized Overset Grids
  • Grids are broken up and then grouped to place a large number of points on each processor (load-balanced)
  • Grid-to-grid communication packet sizes are not considered (speed-up is not optimized)
  • A question needs to be asked: on any one processor, does minimizing overset grid surfaces (i.e., minimizing grid-to-grid communication) while maximizing overset grid volumes (i.e., maximizing compute time) have a significant effect on run time?
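The surface-versus-volume question posed above can be illustrated with a small sketch. This is a hypothetical model, not OVERFLOW's actual timing machinery: for a box-shaped sub-grid, communication cost scales roughly with boundary points (surface) while compute scales with interior points (volume), so t_comm/t_comp behaves like the surface-to-volume ratio:

```python
# Hypothetical proxy for t_comm/t_comp on a box-shaped sub-grid:
# communication ~ boundary points exchanged (surface area),
# compute ~ interior points updated (volume).
def surface_to_volume(ni, nj, nk):
    volume = ni * nj * nk
    surface = 2 * (ni * nj + nj * nk + ni * nk)
    return surface / volume

# A compact, cube-like sub-grid keeps the ratio lower than an elongated
# slab with the same total number of points (both hold 64,000 points):
compact = surface_to_volume(40, 40, 40)
slab = surface_to_volume(160, 40, 10)
print(compact, slab)  # the compact block has the smaller ratio
```

Under this toy model, two decompositions with identical load balance can still differ in communication cost, which is exactly why the question above matters.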

  13. BREAKUP, a Pre-processor to OVERFLOW
  • Code written 10 years ago
  • Prepares overset grids for parallel computers
  • Computes the average number of points per processor
  • Generates a near-uniform distribution of grid points per processor for 'sub-grids'
  • Constructs the appropriate connectivity tables
  • Sequences all grid orientation possibilities, choosing the orientation that minimizes the ratio t_comm/t_comp
  Examples …
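The orientation sweep described above could be sketched as follows. This is a hedged illustration, not BREAKUP's actual algorithm: the function name, the single-axis split, and the surface/volume proxy for t_comm/t_comp are all assumptions made for this example:

```python
from itertools import permutations

# Sketch of an orientation sweep: try each axis ordering of a (ni, nj, nk)
# grid, split near-uniformly along the first axis across nprocs processors,
# and keep the orientation whose sub-grids minimize a surface/volume proxy
# for the communication-to-compute ratio t_comm/t_comp.
def best_orientation(dims, nprocs):
    best = None
    for ni, nj, nk in permutations(dims):
        sub_i = max(1, ni // nprocs)              # points per processor along the split axis
        surface = 2 * (sub_i * nj + nj * nk + sub_i * nk)
        volume = sub_i * nj * nk
        ratio = surface / volume
        if best is None or ratio < best[0]:
            best = (ratio, (ni, nj, nk))
    return best

ratio, orient = best_orientation((200, 50, 20), 8)
print(orient, ratio)  # splitting along the longest axis wins here
```

Under these assumptions the sweep favors cutting the grid along its longest dimension, since that yields the most compact (lowest surface-to-volume) sub-grids.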

  14. The End … and thanks, DOD, for the invite! dwbarne@sandia.gov
