Incremental Call-Path Profiling

Andrew Bernat

bernat@cs.wisc.edu

Computer Sciences Department

University of Wisconsin-Madison

Madison, WI 53706

USA

Dynamic Call-Path Profiling

[Figure: call graph with nodes main and do_work]

[Figure: call graph with nodes main, lookup, malloc, do_work, hash_lookup, MPI_Recv, and strcpy]

Point Profiler (Length 1)

[Figure: call graph with main and do_work marked; do_work is labeled 96% CPU]

Edge Profiler (Length 1)

[Figure: call graph with main and do_work marked; edge labels 40% and 53%; do_work is labeled 96% CPU]

Path Profiler (Length 3)

[Figure: call graph with main and do_work marked; path labels 50%, 36%, 40%, and 53%; do_work is labeled 96% CPU]

Full Call-Path Profiling

[Figure: call graph with main and do_work marked; path labels 53%, 43%, 50%, 36%, 40%, and 53%; do_work is labeled 96% CPU]
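
The point, edge, path, and full call-path views differ only in how much calling context is kept for each measurement. The sketch below illustrates that idea by summing cost per call-path suffix. The `aggregate` helper, the `frames` parameter, and the sample costs are all hypothetical and are not taken from the slides.

```python
from collections import defaultdict

def aggregate(records, frames=None):
    """Sum cost per call-path suffix; frames=None keeps the full path."""
    totals = defaultdict(float)
    for path, cost in records:
        key = path if frames is None else path[-frames:]
        totals[tuple(key)] += cost
    return dict(totals)

# Hypothetical measurements: (call path ending at the active function, CPU seconds)
records = [
    (("main", "lookup", "do_work"), 4.0),
    (("main", "hash_lookup", "do_work"), 5.0),
    (("main", "do_work"), 0.5),
    (("main", "strcpy"), 0.5),
]

point = aggregate(records, frames=1)  # function only
edge = aggregate(records, frames=2)   # caller and callee
full = aggregate(records)             # complete call path

for path, cost in sorted(full.items()):
    print(" -> ".join(path), cost)
```

With one frame, all of the cost of `do_work` collapses into a single number. Keeping more frames shows which callers are responsible for it.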

Call-Path Profiling Disassembled
  • Profiling functions is easy
  • Determining the call-path is hard
    • Efficiency – cost per function invocation
    • Safety – must not affect program’s behavior
    • Correctness
Call-Path Profilers
  • Provide path-profile data for every function in the program.
  • Two categories:
    • Sample-based (gprof, CPPROF)
    • Instrumenting profilers (PP, TAU, others)
Sampling Call-Path Profilers
  • Periodically pause the program
    • Note active function
    • Record call-path (current stack)
    • Some profilers sample CPU usage
  • Advantages:
    • Complete call-path information
  • Disadvantages:
    • Imprecise (sampling-based)
    • Limited metrics available
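
A minimal sketch of the sampling approach, assuming CPython: a background thread periodically reads the target thread's stack through `sys._current_frames()` and counts each call path it observes. The `StackSampler` class and the `do_work` busy loop are hypothetical names for illustration; the profilers named above sample native programs rather than Python threads.

```python
import sys
import threading
import time
from collections import Counter

class StackSampler:
    """Periodically record the call path of the thread that created the sampler."""

    def __init__(self, interval=0.005):
        self.interval = interval
        self.target = threading.get_ident()   # thread to observe
        self.counts = Counter()               # call path -> number of samples
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # wait() returns False on timeout, so this loops once per interval
        while not self._stop.wait(self.interval):
            frame = sys._current_frames().get(self.target)
            path = []
            while frame is not None:          # walk from active frame outward
                path.append(frame.f_code.co_name)
                frame = frame.f_back
            if path:
                self.counts[tuple(reversed(path))] += 1

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()

def do_work(seconds):
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass

with StackSampler() as sampler:
    do_work(0.3)

for path, hits in sampler.counts.most_common(3):
    print(hits, " -> ".join(path))
```

The sample counts are only a statistical estimate of where time goes, which is the imprecision noted in the list above.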
Instrumenting Profilers
  • Track the current call-path
    • Stack of active functions
    • Maintain a pointer to the current call-path
  • Record metrics for all functions
    • Counters, CPU usage, wall time
  • Disadvantages
    • Incomplete (can miss recursion, dynamic calls)
    • Expensive (instrumentation at entries, exits, call sites)
    • Only supports limited, inexpensive metrics
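
The bookkeeping described above can be sketched with a shadow stack that is updated at every function entry and exit. This is an illustration only: the `instrument` decorator and the toy functions are hypothetical, and real instrumenting profilers rewrite source or binaries rather than wrapping Python functions.

```python
import functools
from collections import Counter

call_stack = []          # names of the currently active instrumented functions
path_counts = Counter()  # call path -> number of invocations

def instrument(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_stack.append(func.__name__)      # entry instrumentation
        path_counts[tuple(call_stack)] += 1
        try:
            return func(*args, **kwargs)
        finally:
            call_stack.pop()                  # exit instrumentation
    return wrapper

@instrument
def hash_lookup(key):
    return hash(key) % 7

@instrument
def lookup(key):
    return hash_lookup(key)

@instrument
def main():
    lookup("a")
    lookup("b")
    hash_lookup("c")

main()
for path, calls in sorted(path_counts.items()):
    print(" -> ".join(path), calls)
```

Every call pays the entry and exit cost, whether or not the user cares about that function, and any function left without the decorator is missing from the recorded paths.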
Incremental, Dynamic Call-Path Profiling
  • Incremental: Only profile functions of interest to the user
    • “Paradyn approach”
  • Dynamic: Allow “on-the-fly” profiling
    • Global analysis unnecessary
  • Cost Effective: Reduce overall cost
  • Complete: User still gets complete call-path information
Incremental, Dynamic Call-Path Profiling
  • Capture the call-path with a stack walk from within the process.
    • Includes dynamic calls and recursion
    • Makes tracing function calls unnecessary
  • Walk the stack at function entries and exits.
  • Cost only incurred when profiled functions are executed.
    • Allows use of more expensive metrics
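
A rough sketch of the incremental idea, assuming CPython frame objects stand in for the native stack: only the function of interest is wrapped, and its call path is recovered by walking the stack when it is entered. The `profile_paths` decorator and `walk_stack` helper are hypothetical names, and for brevity the walk happens at entry only.

```python
import functools
import sys
from collections import Counter

def walk_stack(frame):
    """Return function names from the outermost caller down to `frame`."""
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return tuple(reversed(names))

def profile_paths(func):
    paths = Counter()  # call path -> number of invocations of func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Frame 1 is the caller of the profiled function.
        caller_path = walk_stack(sys._getframe(1))
        paths[caller_path + (func.__name__,)] += 1
        return func(*args, **kwargs)

    wrapper.paths = paths
    return wrapper

@profile_paths
def hash_lookup(key):
    return hash(key) % 7

def lookup(key):          # not instrumented
    return hash_lookup(key)

def do_work():            # not instrumented
    lookup("a")
    lookup("b")
    hash_lookup("c")

do_work()
for path, calls in hash_lookup.paths.items():
    print(" -> ".join(path), calls)
```

The callers `lookup` and `do_work` carry no instrumentation, yet they still appear in the recorded paths because the stack walk finds them.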
iPath, a Prototype Incremental Call-Path Profiler
  • Allows use of arbitrary performance metrics.
    • PMAPI (AIX), PAPI (Linux)
    • Counters, timers, and arbitrary combinations
  • Profiles user-selected functions
  • Uses Dyninst
    • Traces unmodified binaries
iPath Implementation
  • Instrumentation is contained in a run-time library.
    • User defines wanted metrics
  • Maintain a table for each function profiled
    • Stack walk and associated performance data for each detected call-path
  • Update the table at function entry and exit
  • Results available on the fly
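
The table-per-function design can be sketched as follows. This is not iPath's code: `PathProfiler`, its `metric` callback, and the `ops` counter (a stand-in for a hardware counter or timer) are hypothetical names chosen for illustration.

```python
import functools
import sys
from collections import defaultdict

class PathProfiler:
    def __init__(self, metric):
        self.metric = metric  # user-supplied callable returning a number
        # function name -> {call path -> [invocations, accumulated metric]}
        self.tables = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    @staticmethod
    def _walk(frame):
        names = []
        while frame is not None:
            names.append(frame.f_code.co_name)
            frame = frame.f_back
        return tuple(reversed(names))

    def profile(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            path = self._walk(sys._getframe(1)) + (func.__name__,)
            start = self.metric()                 # read metric at entry
            try:
                return func(*args, **kwargs)
            finally:
                entry = self.tables[func.__name__][path]
                entry[0] += 1
                entry[1] += self.metric() - start  # read metric at exit
        return wrapper

ops = {"count": 0}  # hypothetical counter standing in for a real metric
profiler = PathProfiler(metric=lambda: ops["count"])

@profiler.profile
def lookup(n):
    ops["count"] += n  # pretend the work costs n operations

def fast_path():
    lookup(1)

def slow_path():
    lookup(100)

def main():
    fast_path()
    fast_path()
    slow_path()

main()
for path, (calls, total) in profiler.tables["lookup"].items():
    print(" -> ".join(path[-3:]), calls, total)
```

Because the tables are ordinary in-memory structures, they can be read at any point during the run, and the per-path totals show when a function is cheap on most paths but expensive on one.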
iPath in Action
  • We applied iPath to two applications: the Paradyn daemon and the MILC QCD simulation framework.
  • Paradyn daemon: identified and fixed a serious bottleneck in address -> function mapping.
  • MILC: identified and fixed a communication bottleneck.
Paradyn Daemon
  • Top level: Performance Consultant was slow
  • Identified a bottleneck in address -> function mapping.
    • Parsing: target of a call-site
    • Runtime: identifying functions on the stack
  • Call-path analysis showed the lookup function performed horribly along only one path.
  • We optimized the function for that path.
  • Result: 98% decrease in instrumentation time!
MILC
  • Parallel computation framework for quantum chromodynamics simulations.
  • We analyzed MPI performance using iPath and focused on frequently executed paths.
  • We identified two bottlenecks, one of which we fixed.
  • We reduced the number of MPI calls and replaced some calls to cut synchronization time.
  • Result: 45% decrease in execution time
Summary
  • Call-path profiling is a useful technique, but current methods are incomplete.
  • Increase flexibility and reduce cost by profiling particular functions instead of the whole program.
  • Come see the demo!