
Briefing on Tool Evaluations


Presentation Transcript


  1. PAT Briefing on Tool Evaluations Professor Alan D. George, Principal Investigator Mr. Hung-Hsun Su, Sr. Research Assistant Mr. Adam Leko, Sr. Research Assistant Mr. Bryan Golden, Research Assistant Mr. Hans Sherburne, Research Assistant HCS Research Laboratory University of Florida

  2. Purpose & Methodology

  3. Purpose of Evaluations • Investigate performance analysis methods used in existing tools • Determine what features are necessary for tools to be effective • Examine usability of tools • Find out what performance factors existing tools focus on • Create standardized evaluation strategy and apply to popular existing tools • Gather information about tool extensibility • Generate list of reusable components from existing tools • Identify any tools that may serve as basis for our SHMEM and UPC performance tool • Take best candidates for extension and gain experience modifying to support new features

  4. Evaluation Methodology • Generate list of desirable characteristics for performance tools • Categorize each based on its influence on a tool’s: usability/productivity, portability, scalability, miscellaneous (list of characteristics and actual scores presented in a later slide) • Assign an importance rating to each: Minor (not really important), Average (nice to have), Important (should include), Critical (absolutely needed) • Formulate a scoring strategy for each: numerical scores 1–5 (5 best), 0 = not applicable • Create objective scoring criteria where possible; use relative scores for subjective categories

  5. Performance Tool Test Suite • Method used to ensure subjective scores are consistent across tools • Also used to determine effectiveness of each performance tool • Includes • Suite of C MPI microbenchmarks that have specific performance problems: PPerfMark [1,2], based on GrindStone [3] • Large-scale program: NAS NPB LU benchmark [4] • “Control” program with good parallel efficiency to test for false positives: CAMEL cryptanalysis C MPI implementation (HCS lab) • For each program in test suite, assign • FAIL: Tool was unable to provide information to identify bottleneck • TOSS-UP: Tool indicated a bottleneck was occurring, but user must be clever to find and fix it • PASS: Tool clearly showed where bottleneck was occurring and gave enough information so a competent user could fix it

  6. Performance Tool Test Suite (2) • What should performance tool tell us? • CAMEL • No communication bottlenecks, CPU-bound code • Performance could be improved by using non-blocking MPI calls • LU • Large number of small messages • Dependence on network bandwidth and latency • Identify which routines take the most time
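
As a hedged illustration of the non-blocking change mentioned for CAMEL above: the sketch below is not CAMEL's actual code, and the function names (exchange_blocking, exchange_nonblocking, compute_next_block) are hypothetical. It only shows the general pattern a tool should lead a user toward, overlapping computation with communication via MPI_Isend/MPI_Wait.

```c
/* Illustrative sketch only -- not taken from CAMEL. Shows the kind of
 * change the evaluation expects a tool to suggest: replacing a blocking
 * MPI_Send with MPI_Isend so computation can overlap communication. */
#include <mpi.h>

void exchange_blocking(double *buf, int n, int peer) {
    /* Blocking: no useful work happens until the send completes. */
    MPI_Send(buf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
    /* compute_next_block();  runs only after the send returns */
}

void exchange_nonblocking(double *buf, int n, int peer) {
    MPI_Request req;
    /* Non-blocking: start the send, then do useful work while it proceeds. */
    MPI_Isend(buf, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &req);
    /* compute_next_block();  overlap computation with communication */
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}
```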

  7. Performance Tool Test Suite (3) • Big message: several large messages sent; dependence on network bandwidth • Intensive server: first node overloaded with work • Ping-pong: many small messages; overall execution time dependent on network latency • Random barrier: one node holds up barrier; one procedure responsible for slow node behavior • Small messages: one node is bombarded with lots of messages • Wrong way: point-to-point messages sent in wrong order • System time: most time spent in system calls • Diffuse procedure: similar to random barrier (one node holds up barrier), but time for slow procedure “diffused” across several nodes in round-robin fashion
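
To make the benchmark descriptions concrete, here is a minimal sketch of the ping-pong pattern listed above. This is not the actual PPerfMark/GrindStone source; it only illustrates why such a code is latency-bound: two ranks bounce a one-byte message back and forth many times.

```c
/* Minimal ping-pong sketch (not the PPerfMark/GrindStone code): two ranks
 * exchange a tiny message repeatedly, so total runtime is dominated by
 * per-message network latency rather than bandwidth. Run with >= 2 ranks. */
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, i;
    char byte = 0;
    const int iters = 100000;   /* many small messages */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}
```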

  8. Overview of Tools Evaluated

  9. List of Tools Evaluated • Profiling tools: TAU (Univ. of Oregon), mpiP (ORNL, LLNL), HPCToolkit (Rice Univ.), SvPablo (Univ. of Illinois, Urbana-Champaign), DynaProf (Univ. of Tennessee, Knoxville) • Tracing tools: Intel Cluster Tools (Intel), MPE/Jumpshot (ANL), Dimemas & Paraver (European Ctr. for Parallelism of Barcelona), MPICL/ParaGraph (Univ. of Illinois, Univ. of Tennessee, ORNL)

  10. List of Tools Evaluated (2) • Other tools: KOJAK (Forschungszentrum Jülich, ICL @ UTK), Paradyn (Univ. of Wisconsin, Madison) • Also quickly reviewed: CrayPat/Apprentice2 (Cray), DynTG (LLNL), AIMS (NASA), Eclipse Parallel Tools Platform (LANL), Open/Speedshop (SGI)

  11. Profiling Tools

  12. Tuning and Analysis Utilities (TAU) • Developer: University of Oregon • Current versions: • TAU 2.14.4 • Program database toolkit 3.3.1 • Website: • http://www.cs.uoregon.edu/research/paracomp/tau/tautools/ • Contact: • Sameer Shende: sameer@cs.uoregon.edu

  13. TAU Overview • Measurement mechanisms • Source (manual) • Source (automatic via PDToolkit) • Binary (DynInst) • Key features • Supports both profiling and tracing • No built-in trace viewer • Generic export utility for trace files (.vtf, .slog2, .alog) • Many supported architectures • Many supported languages: C, C++, Fortran, Python, Java, SHMEM (TurboSHMEM and Cray SHMEM), OpenMP, MPI, Charm • Hardware counter support via PAPI
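
For the manual source-instrumentation mechanism listed above, a hedged C sketch using TAU's documented instrumentation macros might look like the following. The macro names follow TAU's published manual-instrumentation API but should be verified against the installed TAU version; the solve() routine is a hypothetical example.

```c
/* Hedged sketch of TAU manual source instrumentation in C; verify macro
 * names against the TAU release in use. */
#include <TAU.h>

void solve(void) {
    TAU_PROFILE_TIMER(t, "solve", "", TAU_USER);  /* declare a named timer */
    TAU_PROFILE_START(t);
    /* ... work to be measured ... */
    TAU_PROFILE_STOP(t);
}

int main(int argc, char **argv) {
    TAU_PROFILE_INIT(argc, argv);
    TAU_PROFILE_SET_NODE(0);   /* single-process example; use the MPI rank otherwise */
    solve();
    return 0;
}
```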

  14. TAU Visualizations

  15. mpiP • Developer: ORNL, LLNL • Current version: • mpiP v2.8 • Website: • http://www.llnl.gov/CASC/mpip/ • Contacts: • Jeffrey Vetter: vetterjs@ornl.gov • Chris Chambreau: chcham@llnl.gov

  16. mpiP Overview • Measurement mechanism • Profiling via MPI profiling interface • Key features • Simple, lightweight profiling • Source code correlation (facilitated by mpipview) • Gives profile information for MPI callsites • Uses PMPI interface with extra libraries (libelf, libdwarf, libunwind) to do source correlation
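
Several tools in this briefing, mpiP included, rely on the MPI profiling (PMPI) interface. The sketch below is not mpiP's own source; it only illustrates the interposition mechanism: a tool supplies its own MPI_Send, records whatever it needs, and forwards to the real implementation through the name-shifted PMPI_Send entry point.

```c
/* Generic PMPI interposition sketch (not mpiP code): time an MPI_Send
 * callsite by wrapping it and delegating to PMPI_Send. */
#include <mpi.h>
#include <stdio.h>

int MPI_Send(void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm) {
    double start = MPI_Wtime();
    int rc = PMPI_Send(buf, count, type, dest, tag, comm);  /* real send */
    double elapsed = MPI_Wtime() - start;
    /* A real tool would attribute 'elapsed' to this callsite instead. */
    fprintf(stderr, "MPI_Send to rank %d took %f s\n", dest, elapsed);
    return rc;
}
```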

  17. mpiP Source Code Browser

  18. HPCToolkit • Developer: Rice University • Current version: • HPCToolkit v1.1 • Website: • http://www.hipersoft.rice.edu/hpctoolkit/ • Contact: • John Mellor-Crummey: johnmc@cs.rice.edu • Rob Fowler: rjf@cs.rice.edu

  19. HPCToolkit Overview • Measurement Mechanism • Hardware counters (requires PAPI on Linux) • Key Features • Create hardware counter profiles for any executable via sampling • No instrumentation necessary • Relies on PAPI overflow events and program counter values to relate PAPI metrics to source code • Source code correlation of performance data, even for optimized code • Navigation pane in viewer assists in locating resource-consuming functions
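
A hedged sketch of the PAPI overflow-sampling mechanism described above (this is not HPCToolkit's code): an overflow handler receives the instruction address each time a hardware counter crosses a threshold, which is how samples can be related back to source lines. Call signatures should be checked against the installed PAPI release.

```c
/* PAPI overflow-sampling sketch: interrupt every N total cycles and
 * record the program counter at the overflow. */
#include <papi.h>
#include <stdio.h>

static void handler(int event_set, void *address,
                    long_long overflow_vector, void *context) {
    /* 'address' is the PC at overflow; a profiler would bucket it here. */
    printf("sample at %p\n", address);
}

int main(void) {
    int es = PAPI_NULL;
    PAPI_library_init(PAPI_VER_CURRENT);
    PAPI_create_eventset(&es);
    PAPI_add_event(es, PAPI_TOT_CYC);
    PAPI_overflow(es, PAPI_TOT_CYC, 1000000, 0, handler);  /* 1M-cycle threshold */
    PAPI_start(es);
    /* ... application work to be sampled ... */
    return 0;
}
```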

  20. HPCToolkit Source Browser

  21. SvPablo • Developer: University of Illinois • Current versions: • SvPablo 6.0 • SDDF component 5.5 • Trace Library component 5.1.4 • Website: • http://www.renci.unc.edu/Software/Pablo/pablo.htm • Contact: • ?

  22. SvPablo Overview • Measurement mechanism • Profiling via source code instrumentation • Key features • Single GUI integrates instrumentation and performance data display • Assisted source code instrumentation • Management of multiple instances of instrumented source code and corresponding performance data • Simplified scalability analysis of performance data from multiple runs

  23. SvPablo Visualization

  24. Dynaprof • Developer: Philip Mucci (UTK) • Current versions: • Dynaprof CVS as of 2/21/2005 • DynInst API v4.1.1 (dependency) • PAPI v3.0.7 (dependency) • Website: • http://www.cs.utk.edu/~mucci/dynaprof/ • Contact: • Philip Mucci: mucci@cs.utk.edu

  25. Dynaprof Overview • Measurement mechanism • Profiling via PAPI and DynInst • Key features • Simple, gdb-like command line interface • No instrumentation step needed – binary instrumentation at runtime • Produces simple text-based profile output similar to gprof for • PAPI metrics • Wallclock time • CPU time (getrusage)

  26. Tracing Tools

  27. Intel Trace Collector/Analyzer • Developer: Intel • Current versions: • Intel Trace Collector 5.0.1.0 • Intel Trace Analyzer 4.0.3.1 • Website: • http://www.intel.com/software/products/cluster • Contact: • http://premier.intel.com

  28. Intel Trace Collector/Analyzer Overview • Measurement Mechanism • MPI profiling interface for MPI programs • Static binary instrumentation (proprietary method) • Key Features • Simple, straightforward operation • Comprehensive set of visualizations • Source code correlation pop-up dialogs • Views are linked, allowing analysis of specific portions/phases of execution trace

  29. Intel Trace Analyzer Visualizations

  30. MPE/Jumpshot • Developer: Argonne National Laboratory • Current versions: • MPE 1.26 • Jumpshot-4 • Website: • http://www-unix.mcs.anl.gov/perfvis/ • Contacts: • Anthony Chan: chan@mcs.anl.gov • David Ashton: ashton@mcs.anl.gov • Rusty Lusk: lusk@mcs.anl.gov • William Gropp: gropp@mcs.anl.gov

  31. MPE/Jumpshot Overview • Measurement Mechanism • MPI profiling interface for MPI programs • Key Features • Distributed with MPICH • Easy to generate traces of MPI programs • Compile with mpicc -mpilog • Scalable logfile format for efficient visualization • Java-based timeline viewer with extensive scrolling and zooming support
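
Linking with -mpilog already traces every MPI call; MPE also has a manual logging API for adding user-defined states to the same trace. The sketch below follows the classic MPE 1.x logging calls and should be verified against the installed release; the state name and logfile name are hypothetical.

```c
/* Hedged sketch of MPE manual logging: define a "solve" state so it shows
 * up as a colored interval in Jumpshot alongside the automatic MPI trace. */
#include <mpi.h>
#include <mpe.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    MPE_Init_log();
    int ev_start = MPE_Log_get_event_number();
    int ev_stop  = MPE_Log_get_event_number();
    MPE_Describe_state(ev_start, ev_stop, "solve", "red");

    MPE_Log_event(ev_start, 0, NULL);
    /* ... computation to be shown as the "solve" state ... */
    MPE_Log_event(ev_stop, 0, NULL);

    MPE_Finish_log("camel_trace");   /* hypothetical logfile name */
    MPI_Finalize();
    return 0;
}
```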

  32. Jumpshot Visualization

  33. CEPBA Tools (Dimemas, Paraver) • Developer: European Center for Parallelism of Barcelona • Current versions: • MPITrace 1.1 • Paraver 3.3 • Dimemas 2.3 • Website: • http://www.cepba.upc.es/tools_i.htm • Contact: • Judit Gimenez: judit@cepba.upc.edu

  34. Dimemas/Paraver Overview • Measurement Mechanism • MPI profiling interface • Key Features • Paraver • Sophisticated trace file viewer, uses “tape” metaphor • Supports displaying hardware counter metrics alongside the trace visualization • Uses modular software architecture, very customizable • Dimemas • Trace-driven simulator • Uses simple models for real hardware • Generates “predictive traces” that can be viewed by Paraver

  35. Paraver Visualizations

  36. Paraver Visualizations (2)

  37. MPICL/ParaGraph • Developer: • ParaGraph: University of Illinois, University of Tennessee • MPICL: ORNL • Current versions: • Paragraph (no version number, but last available update 1999) • MPICL 2.0 • Website: • http://www.csar.uiuc.edu/software/paragraph/ • http://www.csm.ornl.gov/picl/ • Contacts: • ParaGraph • Michael Heath: heath@cs.uiuc.edu • Jennifer Finger • MPICL • Patrick Worley: worleyph@ornl.gov

  38. MPICL/Paragraph Overview • Measurement Mechanism • MPI profiling interface • Other wrapper libraries for obsolete vendor-specific message-passing libraries • Key Features • Large number of different visualizations (about 27) • Several types • Utilization visualizations • Communication visualizations • “Task” visualizations • Other visualizations

  39. Paragraph Visualizations: Utilization

  40. Paragraph Visualizations: Communication

  41. Other Tools

  42. KOJAK • Developer: Forschungszentrum Jülich, ICL @ UTK • Current versions: • Stable: KOJAK-v2.0 • Development: KOJAK v2.1b1 • Website: • http://icl.cs.utk.edu/kojak/ • http://www.fz-juelich.de/zam/kojak/ • Contacts: • Felix Wolf: fwolf@cs.utk.edu • Bernd Mohr: b.mohr@fz-juelich.de • Generic email: kojak@cs.utk.edu

  43. KOJAK Overview • Measurement Mechanism • MPI profiling interface • Binary instrumentation on a few platforms • Key Features • Generates and analyzes trace files • Automatic classification of bottlenecks • Simple, scalable profile viewer with source correlation • Exports traces to Vampir format

  44. KOJAK Visualization

  45. Paradyn • Developer: University of Wisconsin, Madison • Current versions: • Paradyn: 4.1.1 • DynInst: 4.1.1 • KernInst: 2.0.1 • Website: • http://www.paradyn.org/index.html • Contact: • Matthew Legendre: legendre@cs.wisc.edu

  46. Paradyn Overview • Measurement Mechanism • Dynamic binary instrumentation • Key Features • Dynamic instrumentation at runtime • No instrumentation phase • Visualizes user-selectable metrics while program is running • Automatic performance bottleneck detection via Performance Consultant • Users can define their own metrics using a TCL-like language • All analysis happens while program is running

  47. Paradyn Visualizations

  48. Paradyn Performance Consultant

  49. Evaluation Ratings

  50. Scoring System • Scores given for each category: usability/productivity, portability, scalability, miscellaneous • Scoring formula shown below, used to generate scores for each category: weighted sum based on each characteristic’s importance • Importance multipliers used: Critical 1.0, Important 0.75, Average 0.5, Minor 0.25 • Overall score is sum of all category scores
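
The transcript does not reproduce the formula from the original slide. A minimal sketch of the weighted sum as described, assuming each characteristic carries a 0–5 score and the importance multiplier given above, is shown below; the characteristic names are hypothetical.

```c
/* Sketch of the described scoring formula: a category's score is the sum
 * of (importance multiplier x 0-5 score) over its characteristics, and the
 * overall score is the sum of all category scores. */
#include <stdio.h>

typedef struct {
    const char *name;
    double score;    /* 1-5 (5 best), or 0 if not applicable */
    double weight;   /* Critical 1.0, Important 0.75, Average 0.5, Minor 0.25 */
} characteristic;

static double category_score(const characteristic *c, int n) {
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += c[i].weight * c[i].score;
    return total;
}

int main(void) {
    characteristic usability[] = {
        { "learning curve",        4.0, 1.00 },  /* critical  */
        { "documentation quality", 3.0, 0.75 },  /* important */
        { "installation ease",     5.0, 0.50 },  /* average   */
    };
    printf("usability/productivity score: %.2f\n", category_score(usability, 3));
    return 0;
}
```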
