
Compiler and Tools: User Requirements from ARSC


Presentation Transcript


  1. Compiler and Tools: User Requirements from ARSC Ed Kornkven Arctic Region Supercomputing Center DSRC kornkven@arsc.edu HPC User Forum September 10, 2009

  2. Outline • ARSC and our user community • User issues and eight needs they have in the HPC environment • Conclusions

  3. About ARSC • HPCMP DoD Supercomputing Resource Center, est. 1993 • University of Alaska Fairbanks owned and operated • Provides HPC resources & support • Cray XT5, 3456 cores • Sun cluster, 2312 cores • Supports and conducts research

  4. ARSC User Community • An HPCMP DoD Supercomputing Resource Center • Support of DoD computational research priorities • Open research (publishable in open research journals) • Non-DoD academic research • ARSC supports high performance computational research in science and engineering with an emphasis on high latitudes and the Arctic • In-house research • Oceanography, space physics • Heterogeneous computing technologies, multicore systems • ARSC supports about 300 users, HPCMP about 4000

  5. HPCMP Application Areas • HPCMP projects are defined by ten Computational Technology Areas (CTAs) • Computational Structural Mechanics; Computational Fluid Dynamics; Computational Biology, Chemistry and Materials Science; Computational Electromagnetics and Acoustics; Climate/Weather/Ocean Modeling and Simulation; Signal/Image Processing; Forces Modeling and Simulation; Environmental Quality Modeling and Simulation; Electronics, Networking, and Systems/C4I; Integrated Modeling and Test Environments • These CTAs encompass many application codes • Mostly parallel, with varying degrees of scalability • Commercial, community-developed and home-grown • Unclassified and classified

  6. HPCMP Application Suite • The suite is used for various benchmarking purposes, including system health monitoring, procurement evaluation and acceptance testing • Contains applications and test cases • Composition of the suite fluctuates according to current and projected use • Past apps include WRF • Significance: Believed to represent the Program’s workload

  7. HPCMP Application Suite (slide content not captured in this transcript)

  8. HPCMP Application Suite (slide content not captured in this transcript)

  9. ARSC Academic Codes • Non-DoD users’ codes have similar profiles • Many are community codes • E.g., WRF, ROMS, CCSM, Espresso, NAMD • Also some commercial (e.g., Fluent) and home-grown • Predominantly MPI + Fortran/C; some OpenMP/hybrid

  10. Need #1 • Protect our code investment by supporting our legacy code base • MPI-based codes will be around for a while • Some are scaling well, even to 10^4 cores (our largest machines) • Many are not – lots of users still use 10^2 cores or fewer • Some single-node codes might be able to take advantage of many cores

  11. Parallel Programming is Too Unwieldy • Memory hierarchy stages have different “APIs” • CPU / registers – mostly invisible (handled by compiler) • Caches – code restructuring for reuse; possibly explicit cache management calls; may have to handle levels differently • Socket memory – maintain memory affinity of processes/threads • Node memory – explicit language features (e.g. Fortran refs/defs) • Off-node memory – different explicit language features (MPI calls) • Persistent storage – more language features (I/O, MPI-IO calls) • Other things to worry about • TLB misses • Cache bank conflicts • New memory layers (e.g. SSD), effect of multicore on memory performance, …
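
  To make the point concrete, here is a minimal sketch, in the Fortran/OpenMP/MPI style most of our codes use, of how a single stencil sweep touches several of these layers at once. The routine and array names are illustrative only, and the off-node halo exchange is assumed to happen elsewhere via MPI.

      ! Minimal sketch (illustrative names): one stencil sweep touching several memory layers.
      ! The halo rows of u are assumed to be filled beforehand by MPI calls (off-node memory).
      subroutine smooth(u, unew, nx, ny)
        implicit none
        integer, intent(in)  :: nx, ny
        real(8), intent(in)  :: u(0:nx+1, 0:ny+1)
        real(8), intent(out) :: unew(nx, ny)
        integer, parameter   :: bs = 64        ! caches: block size chosen for reuse
        integer :: i, j, jb

        !$omp parallel do private(i, j) schedule(static)  ! socket/node memory: threads + first-touch affinity
        do jb = 1, ny, bs                      ! caches: blocked outer loop
           do j = jb, min(jb + bs - 1, ny)
              do i = 1, nx                     ! registers: inner loop left to the compiler to vectorize
                 unew(i, j) = 0.25d0 * (u(i-1, j) + u(i+1, j) + u(i, j-1) + u(i, j+1))
              end do
           end do
        end do
        !$omp end parallel do
      end subroutine smooth

  Each level of the hierarchy shows up as a different mechanism: nothing at all (or a compiler flag) for registers, a hand-chosen block size for cache, an OpenMP directive for the node, and MPI calls (not shown) for everything off-node.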

  12. Need #2 • Help with the complexity of parallel programming, esp. managing memory • State-of-the-art is to be an expert in • Architectural features (which constantly change) • Multiple languages (Fortran, MPI, OpenMP) • Performance analysis tools • Coding tricks (which depend on architecture)

  13. Q: Why do so few of our users use performance tools? Does the average user have no incentive, or have they given up because it seems too difficult?

  14. Need #3 • Users need to understand what the “performance game” is and they need tools to help them win. • Remember the days of “98% vectorized?” • What expectations (metrics) should users have for their code on today’s machines? (It must not be utilization.) • What will the rules be in a many-core world?
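
  One low-barrier starting point, sketched below purely as an illustration (the phase names are hypothetical), is to have users bracket their major phases with MPI_Wtime, so they at least have per-phase numbers to set expectations against, whatever metric ultimately replaces "percent vectorized."

      ! Illustrative sketch: coarse per-phase timing with MPI_Wtime as a first metric.
      program phase_timing
        use mpi
        implicit none
        integer :: ierr, rank
        double precision :: t0, t_compute, t_io

        call MPI_Init(ierr)
        call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

        t0 = MPI_Wtime()
        call do_compute()                       ! hypothetical compute phase
        t_compute = MPI_Wtime() - t0

        t0 = MPI_Wtime()
        call write_output()                     ! hypothetical I/O phase
        t_io = MPI_Wtime() - t0

        if (rank == 0) print '(a, f10.3, a, f10.3)', &
             'compute (s):', t_compute, '   I/O (s):', t_io

        call MPI_Finalize(ierr)
      contains
        subroutine do_compute()
        end subroutine do_compute
        subroutine write_output()
        end subroutine write_output
      end program phase_timing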

  15. Beyond Fortran & MPI • We do have some codes based on other parallel models or languages, e.g. • Charm++ – NAMD, ChaNGa • Linda – Gaussian (as an optional feature) • PETSc – e.g., PISM (Parallel Ice Sheet Model) • These illustrate some willingness (or need) in our community to break out of the Fortran/MPI box. However, the pool of expertise outside the box is even smaller than for MPI.

  16. HPCMP Investments in Software • HPCMP is investing in new software and software development methodologies • E.g., the PET and CREATE programs • User education • Modern software engineering methods • Transferable techniques and/or code • Highly scalable codes capable of speeding up decision-making

  17. “New” Programming Models • At ARSC, we are interested in PGAS languages for improving productivity in new development • Have a history with PGAS languages • Collaboration with GWU team (UPC) • Experience with Tsunami model: • Parallelization using CAF took days vs. weeks with MPI
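
  The slides do not show the tsunami code itself, but the flavor of the productivity difference can be sketched with a hypothetical 1-D halo exchange: in coarray Fortran the communication is a remote assignment plus a synchronization, where MPI would need matched send/receive calls, explicit ranks, tags and datatypes. All names below are illustrative.

      ! Hypothetical sketch: periodic 1-D halo exchange in coarray Fortran (CAF).
      program caf_halo
        implicit none
        integer, parameter :: nx = 100
        real(8) :: u(0:nx+1)[*]              ! interior cells plus two halo cells, one copy per image
        integer :: me, np, left, right

        me = this_image()
        np = num_images()
        left  = merge(np, me - 1, me == 1)   ! periodic neighbors
        right = merge(1,  me + 1, me == np)

        u(1:nx) = real(me, 8)                ! image-specific interior data
        sync all                             ! everyone's interior is defined before any puts

        u(0)[right]   = u(nx)                ! put my last interior cell into right neighbor's halo
        u(nx+1)[left] = u(1)                 ! put my first interior cell into left neighbor's halo
        sync all                             ! puts are visible before anyone reads its halo

        if (me == 1) print *, 'left halo on image 1 came from image', np, ':', u(0)
      end program caf_halo

  The MPI version of the same step would be an MPI_Sendrecv (or nonblocking pairs) per direction, which is the kind of bookkeeping the coarray form avoids.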

  18. Need #4 • High-performing implementations of new programming models • For starters, timely implementations of co-array features of Fortran 2008 • Users need some confidence that their investments in these languages will be safe since their codes will outlive several hardware generations and perhaps the languages themselves.

  19. Beyond Fortran & MPI • Heterogeneous processors • Had a Cray XD1 with FPGAs • Very little use • Cell processors and GPUs • PlayStation cluster • IBM QS22

  20. Need #5 • Easier code development for heterogeneous environments. • Cell processors, GPUs and FPGAs have tempting performance, but • For most users the effort required to use these accelerators is too high. • Work underway in these areas is encouraging.

  21. Multicore Research In collaboration with GWU, we are seeking to better understand multicore behavior on our machines. Codes based on Charm++ (NAMD and ChaNGa) performed better on our 16-core nodes than the MPI-based codes we tested.

  22. Need #6 • We need models and methods to effectively use many cores. • Who doesn’t? • Could the potential of many-core processors go untapped? • Vector processors weren’t universally accepted because not all apps were a good fit. • If users don’t find a fit with many cores, they will still need to compute. • It’s up to CS, not users, to make multicore work.
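
  One incremental path that fits the existing Fortran/MPI code base, shown here only as a sketch and not as the multicore answer this slide is asking for, is the hybrid MPI+OpenMP model: fewer MPI ranks per node, with OpenMP threads across the node's cores.

      ! Hypothetical hybrid MPI+OpenMP skeleton: one rank per node or socket,
      ! OpenMP threads on the node's cores, MPI calls funneled through the master thread.
      program hybrid_skeleton
        use mpi
        use omp_lib
        implicit none
        integer :: ierr, provided, rank, nthreads

        call MPI_Init_thread(MPI_THREAD_FUNNELED, provided, ierr)
        call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

        !$omp parallel
        !$omp master
        nthreads = omp_get_num_threads()
        print '(a, i6, a, i4, a)', 'rank ', rank, ' is running ', nthreads, ' threads'
        !$omp end master
        !$omp end parallel

        ! ... threaded compute phases go here; only the master thread makes MPI calls ...

        call MPI_Finalize(ierr)
      end program hybrid_skeleton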

  23. Need #7 • Corollary to other requirements: Provide new avenues to productive development, but allow them to be adopted incrementally. • Probably implies good language interoperability • Tools for analyzing code and giving advice, not just statistics • Automatically fix code, or show where the new language will help most
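
  "Good language interoperability" can be made concrete with machinery that already exists: as one hypothetical example, Fortran 2003's ISO_C_BINDING lets a legacy Fortran driver call a kernel rewritten in any language that can export a C interface, so a new model can be adopted one routine at a time. The kernel name below is made up, and its implementation is assumed to exist elsewhere.

      ! Illustrative sketch: a legacy Fortran code calling one rewritten, C-interoperable kernel.
      module new_kernel_iface
        use iso_c_binding
        implicit none
        interface
           ! Hypothetical kernel implemented in another language behind a C interface.
           subroutine smooth_kernel(u, n) bind(c, name='smooth_kernel')
             import :: c_double, c_int
             real(c_double), intent(inout) :: u(*)
             integer(c_int), value         :: n
           end subroutine smooth_kernel
        end interface
      end module new_kernel_iface

      program call_new_kernel
        use new_kernel_iface
        implicit none
        real(c_double) :: u(1000)
        u = 1.0d0
        call smooth_kernel(u, size(u, kind=c_int))   ! the rest of the code is unchanged
      end program call_new_kernel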

  24. Users Run Codes • Our users want to do science. • For many users, code development is a negligible part of their HPC use. • For all of them, it isn’t the main part. • Most will spend more time running programs than writing them.

  25. Need #8 • Users need help with the process of executing their codes. • Setting up runs • Launching jobs • Monitoring job progress • Checkpoint/restart • Storing output (TBs on ARSC machines) • Cataloguing results • Data analysis and visualization
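
  Most of this list is workflow and site tooling, but checkpoint/restart is one piece users still frequently hand-roll. A minimal per-rank restart-file sketch, with hypothetical names and in the Fortran/MPI style of our codes, is below.

      ! Hypothetical sketch: each MPI rank writes its local state to its own
      ! unformatted restart file, named by step and rank, for a later restart.
      subroutine write_checkpoint(step, u, n)
        use mpi
        implicit none
        integer, intent(in) :: step, n
        real(8), intent(in) :: u(n)
        integer, parameter  :: iunit = 17
        integer :: rank, ierr
        character(len=64) :: fname

        call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
        write (fname, '(a, i6.6, a, i6.6, a)') 'ckpt_step', step, '_rank', rank, '.dat'

        open (unit=iunit, file=trim(fname), form='unformatted', status='replace', action='write')
        write (iunit) step, n
        write (iunit) u
        close (iunit)
      end subroutine write_checkpoint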

  26. Conclusion • We are ready for new parallel programming paradigms. • Much science is being done with today’s machines, so “First do no harm” applies. • There are still plenty of opportunities for innovators to make a difference in making HPC more productive for users.
