
Pablo Project


vinson

Presentation Transcript


  1. Pablo Project • http://www-pablo.cs.uiuc.edu/Projects/Pablo/ • Goal: portable performance data environment for parallel systems • Pablo Version 5.0 components • SDDF Library • TraceLibrary • I/O Analysis programs • Analysis GUI • SvPablo

  2. Self Defining Data Format -SDDF • Performance data description language that specifies both data record structures and data record instances • Supports definition of records containing scalars and arrays of the base types found in most programming languages • Developed to link Pablo instrumentation software to Pablo analysis environment

  3. SDDF (cont.) • Goals - compactness, portability, generality, extensibility • ASCII and binary formats (binary contains flag indicating byte ordering) • SDDF interface library -- library of C++ classes for writing and interpreting files in SDDF format • FileStats utility -- shows types of records and range of values appearing in SDDF file
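The slide says the binary format carries a flag indicating byte ordering but does not show the layout. The following is a minimal sketch of how a reader could use such a flag to decide whether to swap bytes. The flag values and all function names here are hypothetical and are not taken from the SDDF library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values; the real SDDF binary header is not shown in the slides. */
enum byte_order { ORDER_LITTLE = 0, ORDER_BIG = 1 };

/* Report the byte order of the machine running this code. */
static enum byte_order host_byte_order(void)
{
    const uint16_t probe = 1;
    return (*(const unsigned char *)&probe == 1) ? ORDER_LITTLE : ORDER_BIG;
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Convert a 32-bit field read from a file whose header carried file_order. */
static uint32_t field_to_host(uint32_t raw, enum byte_order file_order)
{
    assert(file_order == ORDER_LITTLE || file_order == ORDER_BIG);
    return (file_order == host_byte_order()) ? raw : swap32(raw);
}
```

A reader would call field_to_host() on every integer field after reading the flag once, so a file written on one architecture can be analyzed on another.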

  4. SDDF Example
      // "description" "IO Seek"
      "Seek" {
        // "Time" "Timestamp"
        int "Timestamp"[];
        // "Seconds" "Floating Point Timestamp"
        double "Seconds";
        // "Event ID" "Corresponding event"
        // "700013" "lseek"
        // "700015" "fseek"
        int "Event Identifier";
        // "Node" "Processor number"
        int "Processor Number";
        // "Duration" "Event duration in seconds"
        double "Duration";
        // "File ID" "Unique file identifier"
        int "File ID";
        // "Number Bytes" "Number of bytes traversed"
        int "Number Bytes";
        // "Offset" "Byte offset from position indicated by Whence"
        int "Offset";
        // "Whence" "Indicates file position that Offset is measured from"
        // "0" "SEEK_SET"
        // "1" "SEEK_CUR"
        // "2" "SEEK_END"
        int "Whence";
      };;

  5. SDDF Example (cont.)
      "Seek" { [2] { 201803857, 0 }, 20.1803857, 700013, 0, 0.0031946, 3, 0, 0, 0 };;
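To make the mapping between the record descriptor and the record instance concrete, here is a minimal sketch that formats a C struct in the same ASCII layout as the instance above. The struct and the function name are hypothetical illustrations. The actual SDDF interface library is a set of C++ classes and is not used here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical C view of the "Seek" record; field order follows the descriptor. */
struct seek_record {
    int    timestamp[2];
    double seconds;
    int    event_id;
    int    processor;
    double duration;
    int    file_id;
    int    number_bytes;
    int    offset;
    int    whence;
};

/* Write one record in the ASCII instance layout shown on the slide.
 * Returns the number of characters written, or -1 if buf is too small. */
static int format_seek_record(const struct seek_record *r, char *buf, size_t len)
{
    assert(r != NULL && buf != NULL);
    int n = snprintf(buf, len,
        "\"Seek\" { [2] { %d, %d }, %.7f, %d, %d, %.7f, %d, %d, %d, %d };;",
        r->timestamp[0], r->timestamp[1], r->seconds, r->event_id,
        r->processor, r->duration, r->file_id, r->number_bytes,
        r->offset, r->whence);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Each value appears in the same position as its field in the descriptor, which is what lets a generic reader interpret the instance without knowing the record type in advance.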

  6. Pablo TraceLibrary • Basic trace library with extensions for procedure tracing, loop tracing, NX message passing tracing, I/O tracing, MPI tracing • Basic trace library • functions traceEvent, countEvent, startTimeEvent, endTimeEvent • event ID specifies type of event that is being traced

  7. Pablo TraceLibrary (cont.) • Extensions provide wrapper functions that manage event IDs for the various event types • Procedure and loop tracing is done manually by inserting calls to TraceLibrary routines into the application source code • By default the trace buffer contents are dumped to a trace file, but trace data output can instead be sent to a socket for real-time analysis

  8. TraceLibrary Scalability • Documentation states that TraceLibrary monitors and dynamically alters volume, frequency, and types of event data by • associating a user-specified maximum trace level with each event and • substituting less invasive data recording (e.g., event counts rather than complete event traces) if the maximum user-specified rate is exceeded • Unclear whether these measures are taken automatically by the high-level trace library or must be explicitly invoked by the user at a low level
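The rate-based fallback described above can be sketched as follows. This is only an illustration of the idea of replacing full trace records with counts once a user-specified rate is exceeded. The one-second window, the struct, and the function names are assumptions and are not part of the TraceLibrary API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-event state; not part of the Pablo TraceLibrary API. */
struct event_throttle {
    double window_start;   /* start time of the current one-second window */
    int    seen_in_window; /* events observed in the current window */
    int    max_per_second; /* user-specified maximum rate */
    long   full_traces;    /* events recorded as complete trace records */
    long   counted_only;   /* events recorded only as a count */
};

static void throttle_init(struct event_throttle *t, int max_per_second)
{
    assert(t != NULL && max_per_second > 0);
    t->window_start = 0.0;
    t->seen_in_window = 0;
    t->max_per_second = max_per_second;
    t->full_traces = 0;
    t->counted_only = 0;
}

/* Record one event at time now (in seconds). Returns 1 if a full trace
 * record should be written, 0 if the event was folded into a count. */
static int throttle_record(struct event_throttle *t, double now)
{
    if (now - t->window_start >= 1.0) {   /* start a new window */
        t->window_start = now;
        t->seen_in_window = 0;
    }
    t->seen_in_window++;
    if (t->seen_in_window > t->max_per_second) {
        t->counted_only++;
        return 0;
    }
    t->full_traces++;
    return 1;
}
```

With a limit of two events per second, a third event in the same window is only counted, and full tracing resumes once a new window starts.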

  9. I/O Extension to TraceLibrary • I/O instrumentation requires changes to application source code • I/O trace initialization and termination routines must be called before and after calling any other I/O trace routines • I/O trace bracketing routines provided for I/O requests that are not implemented as library calls (e.g., getc macro in C and Fortran I/O statements that are part of the language)

  10. I/O Extension (cont.) • I/O instrumentation options for C programs • Manually replace standard I/O calls with tracing counterparts • Define IOTRACE so that pre-processor replaces standard I/O calls with tracing counterparts • I/O instrumentation of Fortran programs • Manually bracket each I/O call with I/O trace library bracketing routines

  11. I/O Extension (cont.) • Programs containing calls to I/O extension interface routines must be linked with • Pablo Trace Extension Library libPabloTraceExt.a • Pablo Base Trace Library libPabloTrace.a

  12. Sample C program - No Instrumentation
      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
          FILE *fp;
          char buffer[1024];
          size_t cnt;

          fp = fopen("/etc/motd", "r");
          if (fp != NULL) {
              cnt = fread(buffer, sizeof(char), 1024, fp);
              fclose(fp);
          }
          return 0;
      }

  13. Sample C program - Manual Instrumentation
      #include "IOTrace.h"
      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
          FILE *fp;
          char buffer[1024];
          size_t cnt;

          initIOTrace();            /* Initialize I/O Extension */
          fp = traceFOPEN("/etc/motd", "r");
          if (fp != NULL) {
              cnt = traceFREAD(buffer, sizeof(char), 1024, fp);
              traceFCLOSE(fp);
          }
          /* Trace termination routines */
          endIOTrace();
          endTracing();
          return 0;
      }

  14. Sample C program - Preprocessor Replacement
      #define IOTRACE
      #include "IOTrace.h"
      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
          FILE *fp;
          char buffer[1024];
          size_t cnt;

          initIOTrace();            /* Initialize I/O Extension */
          fp = fopen("/etc/motd", "r");   /* replaced by the preprocessor */
          if (fp != NULL) {
              cnt = fread(buffer, sizeof(char), 1024, fp);
              fclose(fp);
          }
          /* Trace termination routines */
          endIOTrace();
          endTracing();
          return 0;
      }
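The preprocessor replacement approach can be illustrated without the Pablo headers. The sketch below assumes that a header like IOTrace.h maps the standard names onto tracing wrappers when IOTRACE is defined. The wrapper names, the counter, and read_first_byte() are hypothetical stand-ins, and the real header may be organized differently. Strictly speaking, ISO C does not guarantee the behavior of a program that redefines standard library names as macros, although the technique works with common compilers.

```c
#include <assert.h>
#include <stdio.h>

static int trace_calls = 0;   /* stand-in for writing a trace record */

/* Hypothetical wrappers. They are defined before the macros below so that
 * their own calls to fopen and fclose still reach the C library. */
static FILE *sketch_traceFOPEN(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
    trace_calls++;
    return fopen(path, mode);
}

static int sketch_traceFCLOSE(FILE *fp)
{
    trace_calls++;
    return fclose(fp);
}

#define IOTRACE
#ifdef IOTRACE
#define fopen  sketch_traceFOPEN
#define fclose sketch_traceFCLOSE
#endif

/* Unmodified application code: from here on, the preprocessor rewrites
 * fopen and fclose into the tracing wrappers. */
static int read_first_byte(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    int c = fgetc(fp);
    fclose(fp);
    return c;
}
```

The application source stays unchanged apart from the define and the include, which is the advantage of this option over manual replacement.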

  15. Sample Fortran program - No Instrumentation
      integer i
      open(unit=2,file='/tmp/f',form='formatted',status='new')
      i = 0
      write(2, 100) i
      close(2)
  100 format('Node ', i3)
      end

  16. Sample Fortran program - Manual Instrumentation
      #include "fIOTrace.h"
      integer i
      call initIOTrace()
      call traceOpenBegin('/tmp/f', i)
      open(unit=2,file='/tmp/f',form='formatted',status='new')
      call traceOpenEnd(2)
      i = 0
      call traceWriteBegin(2,1,0)
      write(2, 100) i
      call traceWriteEnd(9)
      call traceCloseBegin(2)
      close(2)
      call traceCloseEnd()
  100 format('Node ',i3)
      call endIOTrace()
      call endTracing()
      end

  17. MPI TraceLibrary Extension • MPI profiling library that can be linked in without making source code changes • Each MPI process outputs a trace file labeled with the process number • Insert a call to SetTraceFileName() immediately after MPI_Init() to control the location of the trace file

  18. MPI Extension (cont.) • Disable tracing by calling MPI_Pcontrol(0) • Re-enable tracing by calling MPI_Pcontrol(1) • Link with Pablo Trace Extension Library (libPabloTraceExt.a) and Pablo Base Trace Library (libPabloTrace.a) • Merge the per-process trace files using the SDDF utility MergePabloTraces
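The slides do not describe how MergePabloTraces works internally. Assuming that each per-process trace is already ordered by time, a merge into one time-ordered stream can be sketched as below. The record and stream types and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced trace record: a timestamp and the originating node. */
struct trace_event { double seconds; int node; };

/* One per-process trace, already sorted by time, with a read cursor. */
struct trace_stream {
    const struct trace_event *events;
    size_t count;
    size_t next;
};

/* Merge nstreams time-ordered streams into out. Returns the number of
 * events written, which is at most out_cap. */
static size_t merge_traces(struct trace_stream *streams, size_t nstreams,
                           struct trace_event *out, size_t out_cap)
{
    assert(streams != NULL && out != NULL);
    size_t written = 0;
    while (written < out_cap) {
        size_t best = nstreams;   /* sentinel: no stream has events left */
        for (size_t i = 0; i < nstreams; i++) {
            if (streams[i].next == streams[i].count)
                continue;
            if (best == nstreams ||
                streams[i].events[streams[i].next].seconds <
                streams[best].events[streams[best].next].seconds)
                best = i;
        }
        if (best == nstreams)
            break;
        out[written++] = streams[best].events[streams[best].next++];
    }
    return written;
}
```

The linear scan over streams keeps the sketch short. With many processes, a priority queue keyed on the next timestamp would reduce the cost per event.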

  19. Pablo Trace File Analysis • Command-line FileStats program scans SDDF file and reports record types, min and max values for each field, and count of each record type. • SDDFStatistics GUI for generating and browsing statistics from an SDDF file • Pablo I/O analysis command-line routines • Pablo Analysis GUI
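The per-field summary that FileStats reports (count, minimum, and maximum) amounts to a running reduction over the values of each field. A minimal sketch, with hypothetical type and function names that are not taken from the Pablo sources:

```c
#include <assert.h>
#include <stddef.h>

/* Running summary for one numeric field of one record type. */
struct field_stats {
    size_t count;
    double min;
    double max;
};

static void stats_init(struct field_stats *s)
{
    assert(s != NULL);
    s->count = 0;
    s->min = 0.0;
    s->max = 0.0;
}

/* Fold one field value into the summary. */
static void stats_add(struct field_stats *s, double v)
{
    assert(s != NULL);
    if (s->count == 0 || v < s->min)
        s->min = v;
    if (s->count == 0 || v > s->max)
        s->max = v;
    s->count++;
}
```

A scanner would keep one such summary per field of each record type and call stats_add() for every record instance it reads.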

  20. SDDFStatistics • Statistics for entire file are displayed along top of display • Record types are displayed in panel at lower left • Clicking on a record type brings up statistics for each field of that record type • Clicking on a field displays a histogram summarizing values for that field • Clicking on an array field type brings up statistics for each dimension of that field

  21. SDDFStatistics display

  22. SDDFStatistics Usage • SDDFStatistics [-toolkitoption …] [-loadSummary filename] [-openSDDF filename] • Or use runSDDFStatistics script which invokes the SDDFStatistics program after setting environment variables so that required resources can be located

  23. I/O Analysis Programs • IOstats generates a report of application I/O activity summarized by I/O request type. • IOstatsTable produces a table summarizing information about I/O operations. • IOtotalsByPE produces a report showing the total count, duration, and bytes involved for various operations by processor.

  24. I/O Analysis Programs (cont.) • LifetimeIOstats produces a report summarizing I/O activity by processor and file, prints a histogram of the file lifetimes, and prints total time spent in I/O calls for each procedure. • FileRegionIOstats generates a report of application I/O activity summarized by file region. Each file is divided spatially into regions whose size is set by calling enableFileRegionSummaries().

  25. I/O Analysis Programs (cont.) • TimeWindowIOstats produces a report from Time Window Summary trace records. The execution time of the program is divided into time windows whose size is set by calling enableTimeWindowSummaries(). • SyncIOfileIDs processes a trace file containing I/O trace events in which many different file IDs may be associated with a given file, and writes a new file in which every I/O trace event associated with a particular file (as determined by the file name) has the same file ID.
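File region summaries and time window summaries both rest on assigning each event to a fixed-size bucket. The sketch below shows that bucketing for byte offsets. Splitting a request that crosses a region boundary is an assumption made for illustration, and the slides do not say how Pablo attributes such requests. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the fixed-size region (or time window) that contains position. */
static long bucket_index(long position, long bucket_size)
{
    assert(position >= 0 && bucket_size > 0);
    return position / bucket_size;
}

/* Add the bytes of one request to per-region totals. A request that spans
 * a region boundary is split between the regions it touches (assumption).
 * Bytes that fall beyond the last tracked region are ignored. */
static void add_request(long *region_bytes, long nregions, long region_size,
                        long offset, long nbytes)
{
    assert(region_bytes != NULL && nbytes >= 0);
    long pos = offset;
    long end = offset + nbytes;
    while (pos < end) {
        long r = bucket_index(pos, region_size);
        long region_end = (r + 1) * region_size;
        long chunk_end = (end < region_end) ? end : region_end;
        if (r < nregions)
            region_bytes[r] += chunk_end - pos;
        pos = chunk_end;
    }
}
```

The same bucket_index() call applies to time windows if position is a timestamp in integer ticks and bucket_size is the window length.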

  26. I/O Characterization Research using Pablo • Detailed characterization of I/O behavior of scalable applications and existing parallel file systems • Goals • Enable application developers to achieve higher fraction of peak I/O performance on existing parallel file systems • Help system software developers design better parallel file systems

  27. I/O Research (cont.) • Target Platforms • Intel Paragon • IBM SP • Convex Exemplar • SGI Origin 2000

  28. I/O Research (cont.) • The Scalable I/O (SIO) Initiative has targeted a number of application codes for study, including: • PRISM incompressible Navier-Stokes calculations • SAR Synthetic Aperture Radar application • HF Hartree-Fock calculations • ESCAT SMC electron scattering • RENDER ray-identification rendering

  29. Pablo and Virtual Reality • Problem • Very large volume of captured performance data for parallel systems • Human-computer interface is bandwidth-limited • Proposed solution • Immerse users in virtual world so that users can explore, viscerally experience, and modify the dynamic behavior of application and system software on a massively parallel system

  30. Avatar • Pablo virtual reality system • Operates with workstation monitor, head-mounted display, and the CAVE • Presentation metaphors • Scattercube Matrix • generalization of 2-d scatterplot matrix • shows 3-d projections of sparsely populated, N-dimensional space • Time Tunnel • event level display of processor and inter-processor behavior

  31. Pablo Analysis GUI • Toolkit of data transformation modules capable of processing SDDF records • Supports graphical connection of performance data transformation modules in style of AVS • By graphically connecting modules and interactively selecting trace data records, user specifies desired data transformation and presentations • Expert users can develop and add new data analysis modules

  32. Analysis GUI (cont.) • Module types • Data analysis • Mathematical transforms (counts, sums, ratios, max, min, average, trig functions, etc.) • Synthesis of vectors and arrays from scalar input data • Data presentation - bar graphs, bubble charts, strip charts, contour plots, interval plots, kiviat diagrams, 2-d and 3-d scatter plots, matrix displays, pie charts, polar plots

  33. Pablo Analysis GUI Main Window

  34. Module Creation Window

  35. Module Connection

  36. Configuring a Module (BarGraph)

  37. Graph Execution

  38. Graph with Synthesize Vector Module
