Towards Scalable Cross-Platform Application Performance Analysis -- Tool Goals and Progress

Shirley Moore

[email protected]

LACSI Symposium, Santa Fe, NM


Scalability Issues

  • Code instrumentation

    • Hand instrumentation too tedious for large codes

  • Runtime control of data collection

  • Batch queueing systems

    • Cause problems for interactive tools

  • Tracefile size and complexity

  • Data analysis

Cross-platform Issues

  • Goal: similar user interfaces across different platforms

  • Tools necessarily rely on platform-dependent substrates – e.g., for accessing hardware counters.

  • Standardization of interfaces and data formats promotes interoperability and allows design of portable tools.

Where is Standardization Needed?

  • Performance data

    • Trace records vs. summary statistics

    • Data format

    • Data semantics

  • Library interfaces

    • Access to hardware counters

    • Statistical profiling

    • Dynamic instrumentation
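
To make the "trace records vs. summary statistics" distinction concrete, here is a minimal sketch that reduces a handful of per-event trace records to per-region summary statistics. The record layout, region names, and the `summarize` helper are hypothetical and are not taken from any particular trace format.

```python
from collections import defaultdict

# Hypothetical trace records: (timestamp_s, process_rank, region, duration_s).
trace = [
    (0.00, 0, "solve", 0.40),
    (0.00, 1, "solve", 0.50),
    (0.50, 0, "exchange", 0.10),
    (0.50, 1, "exchange", 0.20),
    (0.70, 0, "solve", 0.60),
]


def summarize(records):
    """Collapse individual trace records into call count, total and max time per region."""
    totals = defaultdict(lambda: {"calls": 0, "total_s": 0.0, "max_s": 0.0})
    for _timestamp, _rank, region, duration in records:
        entry = totals[region]
        entry["calls"] += 1
        entry["total_s"] += duration
        entry["max_s"] = max(entry["max_s"], duration)
    return dict(totals)


summary = summarize(trace)
for region, stats in summary.items():
    print(region, stats)
```

The summary stays the same size no matter how long the run is, while the trace grows with every event; the price is that ordering and per-process detail are lost.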

Standardization? (cont.)

  • User interfaces

    • Common set of commands

    • Common functionality

  • Timing routines

  • Memory utilization information

Parallel Tools Consortium

  • http://www.ptools.org/

  • Interaction between vendors, researchers, and users

  • Venue for standardization

  • Current projects

    • PAPI

    • DPCL

Hardware Counters

  • Small set of registers that count events, which are occurrences of specific signals related to the processor’s function

  • Monitoring these events facilitates correlation between the structure of the source/object code and the efficiency of the mapping of that code to the underlying architecture.
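
As a rough illustration of how raw event counts become tuning signals, the sketch below derives instructions per cycle and an L1 data cache miss ratio from a set of counter readings. The event names follow PAPI preset naming, but the numeric readings and the `derived_metrics` helper are invented for illustration and do not come from any measured run.

```python
def derived_metrics(counts):
    """Turn raw event counts into ratios that are easier to compare across code regions."""
    return {
        "ipc": counts["PAPI_TOT_INS"] / counts["PAPI_TOT_CYC"],
        "l1_miss_ratio": counts["PAPI_L1_DCM"] / counts["PAPI_LST_INS"],
    }


# Hypothetical readings for two loops in the same application.
loop_a = {"PAPI_TOT_CYC": 2_000_000, "PAPI_TOT_INS": 3_000_000,
          "PAPI_L1_DCM": 10_000, "PAPI_LST_INS": 1_000_000}
loop_b = {"PAPI_TOT_CYC": 4_000_000, "PAPI_TOT_INS": 2_000_000,
          "PAPI_L1_DCM": 250_000, "PAPI_LST_INS": 1_000_000}

metrics_a = derived_metrics(loop_a)
metrics_b = derived_metrics(loop_b)
print("loop_a", metrics_a)
print("loop_b", metrics_b)
```

In this made-up data, the second loop retires fewer instructions per cycle and misses the cache far more often, which is the kind of correlation between code structure and hardware behavior the counters are meant to expose.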

Goals of PAPI

  • Solid foundation for cross-platform performance analysis tools

  • Free tool developers from re-implementing counter access

  • Standardization between vendors, academics and users

  • Encourage vendors to provide hardware and OS support for counter access

  • Reference implementations for a number of HPC architectures

  • Well documented and easy to use

PAPI Implementation

  • Tools

  • Portable Layer

    • PAPI High Level

    • PAPI Low Level

  • Machine-Specific Layer

    • PAPI Machine-Dependent Substrate

    • Kernel Extension

    • Operating System

    • Hardware Performance Counter
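
The layering on this slide can be sketched as a portable interface that delegates to a pluggable machine-specific substrate. This is only a conceptual model: the class names, the `read_counter` method, and the event names are hypothetical and are not PAPI's actual API.

```python
from abc import ABC, abstractmethod


class Substrate(ABC):
    """Machine-specific layer: one implementation per platform."""

    @abstractmethod
    def read_counter(self, event: str) -> int:
        """Return the current count for one event on this platform."""


class FakeSubstrate(Substrate):
    """Hypothetical stand-in that returns canned values instead of touching hardware."""

    def __init__(self, values):
        self._values = dict(values)

    def read_counter(self, event):
        return self._values[event]


class PortableLayer:
    """Portable layer: tools call this and never see the platform details."""

    def __init__(self, substrate: Substrate):
        self._substrate = substrate

    def read(self, events):
        return [self._substrate.read_counter(event) for event in events]


counters = PortableLayer(FakeSubstrate({"cycles": 1200, "instructions": 900}))
values = counters.read(["cycles", "instructions"])
print(values)
```

Porting to a new platform in this model means writing one new `Substrate` subclass; code written against `PortableLayer` does not change.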

PAPI Preset Events

  • Proposed standard set of events deemed most relevant for application performance tuning

  • Defined in papiStdEventDefs.h

  • Mapped to native events on a given platform

    • Run tests/avail to see list of PAPI preset events available on a platform
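
The idea of mapping presets to native events, with availability varying by platform, can be sketched as a lookup table. The preset names below are PAPI presets, but the platform names, the native event names, and the `available_presets` helper are placeholders made up for illustration; real native names differ per processor.

```python
# Presets of interest, in reporting order.
PRESETS = ["PAPI_TOT_CYC", "PAPI_FP_INS", "PAPI_L1_DCM"]

# Hypothetical per-platform mapping from preset name to native event name.
NATIVE_MAP = {
    "platform_x": {
        "PAPI_TOT_CYC": "X_CYCLES",
        "PAPI_FP_INS": "X_FP_OPS",
        "PAPI_L1_DCM": "X_L1D_MISS",
    },
    "platform_y": {
        "PAPI_TOT_CYC": "Y_CLK",
        "PAPI_L1_DCM": "Y_DCACHE_MISS",
    },
}


def available_presets(platform):
    """List (preset, native event) pairs for the presets this platform can count."""
    mapping = NATIVE_MAP[platform]
    return [(name, mapping[name]) for name in PRESETS if name in mapping]


for platform in NATIVE_MAP:
    print(platform, available_presets(platform))
```

A tool that asks for `PAPI_FP_INS` on the second platform in this sketch would have to report that the preset is unavailable, which is the information the availability listing is meant to surface.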

Statistical Profiling

  • PAPI provides support for execution profiling based on any counter event.

  • PAPI_profil() creates a histogram by text address of overflow counts for a specified region of the application code.

  • Used in vprof tool from Sandia Lab
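
The histogram described above can be modeled in a few lines: each counter overflow yields a program address, and addresses inside the profiled region are binned into fixed-size buckets. This is a conceptual sketch, not the `PAPI_profil()` interface; the `profile_histogram` helper, the addresses, and the bucket size are assumptions chosen for illustration.

```python
def profile_histogram(overflow_addresses, region_start, region_end, bucket_bytes):
    """Count overflow samples per address bucket within [region_start, region_end)."""
    n_buckets = (region_end - region_start + bucket_bytes - 1) // bucket_bytes
    histogram = [0] * n_buckets
    for address in overflow_addresses:
        if region_start <= address < region_end:
            histogram[(address - region_start) // bucket_bytes] += 1
    return histogram


# Hypothetical text addresses recorded at successive counter overflows.
samples = [0x1000, 0x1004, 0x1010, 0x1014, 0x1018, 0x2000]
hist = profile_histogram(samples, region_start=0x1000, region_end=0x1020, bucket_bytes=0x10)
print(hist)
```

Buckets with high counts point at the code addresses where the chosen event (cache misses, floating-point instructions, and so on) occurs most often; samples outside the region are ignored.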

PAPI Reference Implementations

  • Linux/x86, Windows 2000

    • Requires patch to Linux kernel, driver for Windows

  • Linux/IA-64

  • Sun Solaris 2.8/Ultra I/II

  • IBM AIX 4.3+/Power

    • Contact IBM for pmtoolkit

  • SGI IRIX/MIPS

  • Compaq Tru64/Alpha EV6 & EV67

    • Requires OS device driver patch from Compaq

    • Per-thread and per-process counts not possible

    • Extremely limited number of events

  • Cray T3E/Unicos

PAPI Future Work

  • Improve accuracy of hardware counter and statistical profiling data

    • Microbenchmarks to measure accuracy (Pat Teller, UTEP)

    • Use hardware support for overflow interrupts

    • Use Event Address Registers (EARs) where available

  • Data structure based performance counters (collaboration with UMd)

    • Qualify event counting by address range

    • Page level counters in cache coherence hardware
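
Qualifying events by address range amounts to attributing each event's data address to the data structure that owns it. The sketch below shows that attribution step only; the array layout, the miss addresses, and the helper names are hypothetical, and real hardware would do the range check in the counter logic rather than in software.

```python
import bisect

# Hypothetical memory layout, sorted by start address: (start, size_bytes, name).
ARRAYS = [(0x1000, 0x400, "pressure"), (0x2000, 0x800, "velocity")]
STARTS = [start for start, _size, _name in ARRAYS]


def owner_of(address):
    """Return the name of the array containing this address, or None."""
    index = bisect.bisect_right(STARTS, address) - 1
    if index >= 0:
        start, size, name = ARRAYS[index]
        if address < start + size:
            return name
    return None


def misses_by_structure(miss_addresses):
    """Count cache-miss addresses per owning data structure."""
    counts = {}
    for address in miss_addresses:
        name = owner_of(address)
        if name is not None:
            counts[name] = counts.get(name, 0) + 1
    return counts


counts = misses_by_structure([0x1000, 0x13FF, 0x1400, 0x2004, 0x27FF, 0x0FFF])
print(counts)
```

Addresses that fall outside every registered range are simply dropped, mirroring the idea of counting only events that touch the structure of interest.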

PAPI Future (cont.)

  • Memory utilization extensions (following list suggested by Jack Horner, LANL)

    • Memory available on a node

    • Total memory available/used

    • High-water-mark memory used by process/thread

    • Disk swapping by process

    • Process-memory locality

    • Location of memory used by an object

  • Dynamic instrumentation – e.g., PAPI probe modules
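
One item on this list, the high-water-mark memory used by a process, is already exposed on Unix-like systems through `getrusage`, and it shows why a standard interface helps: the same field is reported in different units on different platforms. The sketch below uses Python's standard `resource` module (Unix only); the `peak_rss_bytes` helper is an illustrative name, not part of any proposed extension.

```python
import resource
import sys


def peak_rss_bytes():
    """High-water-mark resident set size of the current process, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


before = peak_rss_bytes()

# Allocate 32 MiB and touch one byte per 4 KiB so the pages become resident.
buffer = bytearray(32 * 1024 * 1024)
for offset in range(0, len(buffer), 4096):
    buffer[offset] = 1

after = peak_rss_bytes()
print(f"peak RSS before: {before} bytes, after: {after} bytes")
```

The peak never decreases over the life of the process, so a single reading at exit is enough to capture the high-water mark.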

For More Information

  • http://icl.cs.utk.edu/projects/papi/

    • Software and documentation

    • Reference materials

    • Papers and presentations

    • Third-party tools

    • Mailing lists

DPCL

  • Dynamic Probe Class Library

  • Built on top of IBM version of University of Maryland’s dyninst

  • Current platforms

    • IBM AIX

    • Linux/x86 (limited functionality)

  • Dyninst ported to more platforms but by itself lacks functionality for easily instrumenting parallel applications.
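
Dynamic instrumentation means attaching and removing probes while the program runs, without editing or recompiling its source. DPCL and dyninst do this by patching the executable image; the sketch below is only a loose Python analogy that swaps a method at runtime. The `Solver` class and `install_probe` helper are hypothetical and unrelated to the DPCL API.

```python
import time


class Solver:
    """Stand-in for application code that knows nothing about instrumentation."""

    def step(self, n):
        return sum(range(n))


def install_probe(target, method_name, log):
    """Wrap target.method_name with a timing probe; return a callable that removes it."""
    original = getattr(target, method_name)

    def probed(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            log.append((method_name, time.perf_counter() - start))

    setattr(target, method_name, probed)

    def remove():
        setattr(target, method_name, original)

    return remove


log = []
solver = Solver()
remove_probe = install_probe(Solver, "step", log)
result = solver.step(1000)   # probed call: one entry is logged
remove_probe()
solver.step(1000)            # probe removed: nothing is logged
print(result, len(log))
```

The application class is never edited, and the probe can be removed again while the program keeps running, which is the property that makes this style of instrumentation attractive for long batch jobs.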

Infrastructure Components?

  • Parsers for common languages

  • Access to hardware counter data

  • Communication behavior instrumentation and analysis

  • Dynamic instrumentation capability

  • Runtime control of data collection and analysis

  • Performance data management

Case Studies

  • Test tools on large-scale applications in production environment

  • Reveal limitations of tools and point out areas where improvements are needed

  • Develop performance tuning methodologies for large-scale codes

PERC: Performance Evaluation Research Center

  • Developing a science for understanding performance of scientific applications on high-end computer systems.

  • Developing engineering strategies for improving performance on these systems.

  • DOE Labs: ANL, LBNL, LLNL, ORNL

  • Universities: UCSD, UIUC, UMD, UTK

  • Funded by SciDAC: Scientific Discovery through Advanced Computing

PERC: Real-World Applications

  • High Energy and Nuclear Physics

    • Shedding New Light on Exploding Stars: Terascale Simulations of Neutrino-Driven Supernovae and Their Nucleosynthesis

    • Advanced Computing for 21st Century Accelerator Science and Technology

  • Biology and Environmental Research

    • Collaborative Design and Development of the Community Climate System Model for Terascale Computers

  • Fusion Energy Sciences

    • Numerical Computation of Wave-Plasma Interactions in Multi-dimensional Systems

  • Advanced Scientific Computing

    • Terascale Optimal PDE Solvers (TOPS)

    • Applied Partial Differential Equations Center (APDEC)

    • Scientific Data Management (SDM)

  • Chemical Sciences

    • Accurate Properties for Open-Shell States of Large Molecules

  • …and more…

Parallel Climate Transition Model

  • Components for Ocean, Atmosphere, Sea Ice, Land Surface and River Transport

  • Developed by Warren Washington’s group at NCAR

    • POP: Parallel Ocean Program from LANL

    • CCM3: Community Climate Model 3.2 from NCAR including LSM: Land Surface Model

    • ICE: CICE from LANL and CCSM from NCAR

    • RTM: River Transport Module from UT Austin

  • Fortran 90 with MPI

PCTM: Parallel Climate Transition Model

[Diagram: Flux Coupler shown together with the River Model, Ocean Model, Atmosphere Model, Land Surface Model, and Sea Ice Model. Caption: Sequential Execution of Parallelized Modules]

PCTM Instrumentation

  • Vampir tracefile in tens of gigabytes range even for toy problem

  • Hand instrumentation with PAPI tedious

  • UIUC working on SvPablo instrumentation

  • Must work in batch queueing environment

  • Plan to try other tools

    • MPE logging and Jumpshot

    • TAU

    • VGV?
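
A back-of-the-envelope estimate shows how a full event trace reaches the gigabyte range quickly. Every number below (process count, event rate, record size, run length) is an assumption picked for illustration, not a measurement from PCTM or Vampir.

```python
def trace_size_bytes(processes, events_per_second, bytes_per_event, seconds):
    """Estimate the size of a full event trace with fixed-size records."""
    return processes * events_per_second * bytes_per_event * seconds


# Assumed values: 64 processes, 10,000 events/s each, 32-byte records, 10-minute run.
size = trace_size_bytes(processes=64, events_per_second=10_000,
                        bytes_per_event=32, seconds=600)
size_gb = size / 1e9
print(f"{size_gb:.3f} GB")
```

Because the estimate is a simple product, doubling the process count or the run length doubles the trace, which is why full tracing scales poorly to production-sized runs.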

In Progress

  • Standardization and reference implementations for memory utilization information (funded by DoD HPCMP PET, Ptools-sponsored project)

  • Repositories of application performance evaluation case studies (e.g., SciDAC PERC)

  • Portable dynamic instrumentation for parallel applications (DOE MICS project – UTK, UMd, UWisc)

  • Increased functionality and accuracy of hardware counter data collection (DoD HPCMP, DOE MICS)

Next Steps

  • Additional areas for standardization?

    • Scalable trace file format

    • Metadata standards for performance data

    • New hardware counter metrics (e.g., SMP and DMP events, data-centric counters)

    • Others?

Next Steps (cont.)

  • Sharing of tools and data

    • Open source software

    • Machine and software profiles

    • Runtime performance data

    • Benchmark results

    • Application examples and case studies

  • Long-term goal: common performance tool infrastructure across HPC systems
