1 / 12

Improving the support for ARM in IgProf

Improving the support for ARM in IgProf. Filip Nybäck CERN/GSoC video meeting 28 August, 2014. IgProf. profiler program measuring mainly performance and memory usage of other programs at run-time find performance bottlenecks parts of the code worth optimising. What has been done?.

lopezernest
Download Presentation

Improving the support for ARM in IgProf

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Improving the support for ARM in IgProf • Filip Nybäck • CERN/GSoC video meeting 28 August, 2014

  2. IgProf • profiler • program measuring mainly performance and memory usage of other programs at run-time • find performance bottlenecks • parts of the code worth optimising

  3. What has been done? • IgProf ported to 64-bit ARM • fast stack trace in libunwind ported to • 64-bit ARM • 32-bit ARM • bonus: simple energy profiling module

  4. Function instrumentation • the profiler gets between the function call and the function itself • collect function parameters and the return value • collect the number of calls application IgProf C library ... malloc(...) ... malloc hook for malloc

  5. Function instrumentation generate jumps Text • identify and patch • PC-relative instructions • load instructions • address calculation instructions • relative jump instructions

  6. Stack tracing • An example: • function a calls function b,function b calls function c • the stack trace: [c, b, a] • Stack tracing in libunwind: • uses unwind information • calculates the state of the registers in the previous frame • for stack tracing only the PC register is interesting The stack: a b c the stack grows downward

  7. Fast stack tracing in libunwind • only a subset of the state in each frame is needed to calculate the PC • cache the information about how to calculate the subset of the state • port to ARM • differences in the architectures and calling conventions • where the return address is stored (on the stack, in a register, both) • which register is used to find the previous frame (stack pointer, frame pointer, some other register)

  8. Results • stack tracing only • time(standard stack trace) = 20…30 * time(fast stack trace) • memory profiling with IgProf • time(IgProf + standard stack trace) = 7…8 * time(IgProf + fast stack trace)

  9. Energy profiling module • gets energy measurements from the RAPL (Running Average Power Limit) interface on recent Intel processors • through the PAPI library • based on sampling • highly experimental

  10. Principle of operation • The amount of energy consumed since the previous sampling event is attributed to the current location of execution amount of energy t n n–1 n–2 sampling interval

  11. Example energy profile

  12. Thank you! • questions • comments

More Related