
Scalable Performance Visualization with Jumpshot

Omer Zaki, Rusty Lusk, Bill Gropp, Debbie Swider. Mathematics and Computer Science Division, Argonne National Laboratory.


Presentation Transcript


  1. Scalable Performance Visualization with Jumpshot Omer Zaki, Rusty Lusk, Bill Gropp, Debbie Swider Mathematics and Computer Science Division Argonne National Laboratory

  2. Not Included • Getting performance with MPI • applications • implementation research topics • MPI-2 • contents • implementation availability • MPICH • The MPI-2 approach to parallel I/O

  3. Outline • What is Jumpshot and where did it come from? • Related efforts • Our requirements for a logfile-based performance visualization system. • Producing logfiles: CLOG • Visualizing logfiles: Jumpshot • Java issues • Future work

  4. What is Jumpshot? • Tool for understanding the behavior of parallel programs • Post-mortem • Logfile-based • Includes logging package (CLOG) • Primarily for MPI programs • Written in Java [Diagram: processes → CLOG logfile → Jumpshot → display]

  5. Typical Jumpshot Screen

  6. Instrumented version of PETSc

  7. Where Did Jumpshot Come From? • The history of logfile-based performance-analysis tools at Argonne is also the history of the search for a programming environment in which to implement simple graphics plus a GUI. • Gist (BBN) -- raw X • black and white, not portable (BBN Butterfly only) • Upshot -- raw X plus Athena widgets, used with ALOG logfile format • painful, especially to get performance (1990) • Upshot redone in Tcl/Tk • easy to write, but graphics too slow

  8. History (continued) • Nupshot -- Upshot redone in Tcl/Tk/C for speed • good performance • Tcl/C interface unstable • CLOG -- new log format for many reasons • Java + Upshot = Jumpshot • uses CLOG • has new features • explores Java technology • next up: JPython?

  9. Related Efforts • Gist survives in Dolphin’s TotalView as TimeScan • PICL/ParaGraph - Pat Worley, Mike Heath, Jennifer Etheridge, Al Geist • VAMPIR - Pallas • Traceview - Al Malony • Pablo - Dan Reed, Ruth Aydt • XPVM - Jim Kohl, Al Geist • XMPI - Raja Daoud, now Notre Dame • Paradyn - Bart Miller, Myron Livny • others

  10. Why Do It Again? Requirements for a new system (not all of them yet satisfied by Jumpshot): • stable environment for long lifetime to accommodate future research • portable, even unto Microsoft • support for upshot-type views that we have found most useful • process timelines with scrolling and zooming • histograms of state durations, message properties, mountain ranges • animation not so useful

  11. Requirements (continued) • flexible, extensible logfile format to accommodate new types of events, states, concepts • end-user-defined states • scalable performance • control of logging at source • aggregation • at least tens of thousands of events • nested and overlapping states • nested more important than overlapping • connection of displayed events back to source code.
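The "nested and overlapping states" requirement can be made concrete with a small sketch. It assumes one process's events arrive in time order as start/end pairs and uses a stack to recover nesting depth. The `Event` and `State` record shapes and the `pair_states` function are hypothetical, chosen for illustration; they are not the CLOG record layout. Overlapping (non-nested) states are rejected here, which reflects the slide's remark that nesting matters more than overlap.

```python
from collections import namedtuple

# Hypothetical record shapes for illustration; not the CLOG format.
Event = namedtuple("Event", "time kind state")    # kind is "start" or "end"
State = namedtuple("State", "name start end depth")

def pair_states(events):
    """Turn one process's time-ordered start/end events into states with nesting depth."""
    stack = []     # currently open states, innermost last
    states = []
    for ev in events:
        if ev.kind == "start":
            stack.append(ev)
        else:
            opened = stack.pop()
            if opened.state != ev.state:
                raise ValueError("states overlap rather than nest")
            # depth = number of states still open around this one
            states.append(State(ev.state, opened.time, ev.time, len(stack)))
    return states

events = [
    Event(0.0, "start", "solve"),
    Event(0.2, "start", "MPI_Send"),
    Event(0.5, "end", "MPI_Send"),
    Event(1.0, "end", "solve"),
]
states = pair_states(events)
```

A timeline display could then draw deeper states as thinner rectangles inside their parents, using the `depth` field.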

  12. Requirements (continued) • MPI awareness (communicators, semi-transparency of collective operations) • ability to query details of specific events, messages, and states. • ability to locate “interesting” parts of display (research topic) • a new one every week....

  13. The CLOG Logging Library: Background Characteristics of old ALOG: • fixed-format records (6 ints and short string) • good for parsing, storing, access • bad for extendibility • timestamps an integer number of microseconds • OK in 1990 • not accurate enough now • ASCII format in file • good for portability, easy to read • can’t store binary data conveniently

  14. CLOG: Requirements • Efficient enough not to interfere with the behavior of the program; I/O only when the program finishes • A single logfile at the end is convenient • Timestamps not assumed synchronized • Flexibility in record type, but not completely self-describing • Portable: logfiles can be read on a different machine from the one where they were written.

  15. CLOG: How It Works • relies on MPI, for portability • calls MPI_Wtime to get timestamps • reasonably, but not ultimately, efficient on any given architecture/OS • multiple record formats with types, plus “raw” type • user can define own types, states, colors • log records accumulate in big buffers in memory until malloc fails, then stop
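The buffering policy on this slide (accumulate records in memory, do no I/O during the run, and simply stop logging when memory runs out) can be sketched as follows. This is an illustration only, not CLOG's code: the `BufferedLogger` class is hypothetical, a fixed `capacity` stands in for "until malloc fails", and an injectable `clock` stands in for `MPI_Wtime`.

```python
import time

class BufferedLogger:
    """Keep records in memory and stop logging once the buffer is full."""

    def __init__(self, capacity, clock=time.perf_counter):
        self.capacity = capacity    # stand-in for "until malloc fails"
        self.clock = clock          # stand-in for MPI_Wtime
        self.records = []
        self.active = True

    def log(self, event_type, data=""):
        """Record one event; return False once logging has stopped."""
        if not self.active:
            return False
        if len(self.records) >= self.capacity:
            self.active = False     # no I/O during the run; just stop recording
            return False
        self.records.append((self.clock(), event_type, data))
        return True

# Deterministic fake clock so the example is reproducible.
ticks = iter(range(100))
logger = BufferedLogger(capacity=3, clock=lambda: float(next(ticks)))
results = [logger.log("send", f"msg {i}") for i in range(5)]
```

Here the first three calls are recorded and the last two are dropped, so the program's timing is never disturbed by a flush to disk.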

  16. How CLOG Works (cont.) • At end (CLOG_Finalize is collective) • process-local data is added to buffers • timestamps are adjusted, using simple algorithm and communication with other processes • processes form a binary tree and do local 3-way merge in parallel up to process 0, which writes file • file is in Java (MPI-2’s “external-32”) format
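The finalize step can be sketched in a single process as a rough model. The assumptions are: each rank holds a time-sorted list of `(time, rank, event)` tuples, clock correction is a constant per-rank offset (the slide says only "simple algorithm"), and rank `r` has children `2r+1` and `2r+2`. The names `adjust` and `tree_merge` are hypothetical. In the real library the merges run in parallel over MPI; here recursion simulates the tree.

```python
import heapq

def adjust(records, offset):
    """Shift one process's timestamps by its estimated clock offset."""
    return [(t - offset, rank, what) for t, rank, what in records]

def tree_merge(buffers, rank=0):
    """Merge rank's own records with the merged streams of its two children."""
    streams = [buffers[rank]]
    for child in (2 * rank + 1, 2 * rank + 2):
        if child < len(buffers):
            streams.append(tree_merge(buffers, child))
    # Up to three sorted inputs: own buffer plus one stream per child.
    return list(heapq.merge(*streams))

# (time, rank, event) tuples, sorted per process; the offsets are made-up values.
raw = [
    [(0.0, 0, "a"), (3.0, 0, "b")],
    [(11.0, 1, "c"), (12.5, 1, "d")],
    [(0.5, 2, "e")],
    [(22.0, 3, "f")],
]
offsets = [0.0, 10.0, 0.0, 20.0]
buffers = [adjust(r, o) for r, o in zip(raw, offsets)]
merged = tree_merge(buffers)    # what "process 0" would write to the file
```

Because every input stream is already sorted, each node does only a linear-time merge, and the tree keeps the number of sequential merge stages logarithmic in the number of processes.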

  17. “Normal” Jumpshot Features • Scrolling and zooming in timeline view • Arrows to represent messages • Click on arrows and states for details • Histograms of state durations, message bandwidth • Mountain range view to aggregate state info • Can select/deselect states, messages
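One way to read the mountain range view is as a per-time-bin count of how many state instances of each kind are active across all processes. The sketch below follows that reading, which is an assumption rather than a description of Jumpshot's implementation. The `mountain_range` function and the `(name, start, end)` tuple shape are hypothetical.

```python
def mountain_range(states, t_end, bins):
    """Count, per time bin, how many state instances of each kind are active."""
    width = t_end / bins
    counts = [{} for _ in range(bins)]
    for name, start, end in states:
        for b in range(bins):
            lo, hi = b * width, (b + 1) * width
            if start < hi and end > lo:    # instance overlaps this bin
                counts[b][name] = counts[b].get(name, 0) + 1
    return counts

# State instances gathered from all processes: (name, start, end).
states = [
    ("compute", 0.0, 2.0),
    ("compute", 0.0, 1.0),
    ("MPI_Recv", 1.0, 2.0),
]
profile = mountain_range(states, t_end=2.0, bins=2)
```

Stacking the per-bin counts as colored bands gives the "mountain" profile, which stays readable even when individual timelines are too dense to inspect.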

  18. Timelines and Mountain Range

  19. “Unusual” Jumpshot Features • Scrolling and zooming in histogram view • Can focus on extreme durations/bandwidths • calculate top/bottom 1%, 5%, ... based on assumed normal distribution • blink corresponding state instances, arrows; can help locate “outlier” events in large confusing display. • Can scroll timelines individually to fine-tune clock synchronization • Inherited from Java: • portability • can be run as applet • fancy GUI features • multiple look-and-feel • tear-off subwindows
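The "top/bottom 1%, 5%" selection based on an assumed normal distribution can be sketched as follows: fit a normal to the observed durations, compute the cutoff for the chosen tail, and flag instances beyond it. This illustrates the idea only. The `extreme_instances` function and the sample data are hypothetical, and the sketch uses Python's standard `statistics.NormalDist`.

```python
from statistics import NormalDist, mean, stdev

def extreme_instances(durations, fraction=0.05):
    """Return indices of durations in the top `fraction` under a fitted normal."""
    dist = NormalDist(mean(durations), stdev(durations))
    cutoff = dist.inv_cdf(1.0 - fraction)
    return [i for i, d in enumerate(durations) if d > cutoff]

# Made-up durations (seconds) of one state; the last instance is an outlier.
durations = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 5.0]
slow = extreme_instances(durations, fraction=0.05)
```

The returned indices identify the state instances a display could blink. The bottom tail works the same way, using `inv_cdf(fraction)` and a `<` comparison.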

  20. Java Issues - Good • Portable (Sun, SGI, RS6000, Windows, NT) • Can be run either as normal X application or with a browser as an applet • Graphics are fast enough • Widget set is more than adequate for GUI construction - Swing.

  21. Java Issues - Awkward • Java still rapidly evolving in this area (1.0 - 1.1.2 - 1.1.6 - 1.2beta) • made many big changes during two-month period to deal with bugs, re-implemented features • add-ons also evolving (e.g. Swing) • applet behavior not completely consistent with application behavior • Inconveniences arising from the fact that a Java program is not self-contained (e.g., CLASSPATH environment variable) • partially resolved with use of JRE • We are still committed.

  22. Jumpshot Distribution • Jumpshot currently comes with • CLOG library for creating logfiles, uses any MPI • mpe logging library with ALOG/CLOG switch • MPI profiling library for automatic instrumentation of MPI programs • Is distributed • as part of MPICH distribution (version 1.1.1) • separately as part of mpe library, for use with any MPI implementation • separate jumpshot only

  23. Future Work • Meet more of the requirements • connection to code (by logging __FILE__, __LINE__) • select by MPI communicator (not so easy, because of communicator id issue) • vertical scrolling • Research on scalability issues; useful agglomerations of data • Research in automated detection of performance anomalies

  24. The End
