PinPlay: A Framework for Deterministic Replay and Reproducible Analysis of Parallel Programs

Harish Patil, Cristiano Pereira, Mack Stallcup, Gregory Lueck, James Cownie. Intel Corporation. CGO 2010, Toronto, Canada.

Presentation Transcript


  1. PinPlay: A Framework for Deterministic Replay and Reproducible Analysis of Parallel Programs Harish Patil, Cristiano Pereira, Mack Stallcup, Gregory Lueck, James Cownie Intel Corporation CGO 2010, Toronto, Canada

  2. Non-Determinism • Program execution is not repeatable across runs • Interactions with environment (single-threaded) • Shared-memory interleaving (multi-threaded) • Source of many problems • Hard to predict and test behaviors -> leads to bugs • Very hard and unpleasant to debug • Breaks program analyses that rely on repeatability • Obstacle for adoption of parallel programming

  3. Dealing with Non-Determinism • Eliminate it • Deterministic program execution enforced by runtime (e.g. constrained execution [ISCA’09]) • Deterministic Replay • Let it be, but capture and reproduce execution if needed • Every instruction gets same input as in original run • This paper: User-level Deterministic Replay • Implementation, challenges and usage examples
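The idea that "every instruction gets same input as in original run" can be illustrated with a minimal sketch. It is a toy model in Python and not PinPlay's implementation: the `Recorder`, `Replayer`, and `program` names are hypothetical, and the only non-determinism modelled is two explicit environment calls.

```python
import random
import time


class Recorder:
    """Capture mode: consult the environment and log every value obtained."""

    def __init__(self):
        self.log = []

    def nondet(self, produce):
        value = produce()        # real, non-repeatable input
        self.log.append(value)   # saved so a later run can be fed the same value
        return value


class Replayer:
    """Replay mode: hand back logged values instead of asking the environment."""

    def __init__(self, log):
        self._values = iter(log)

    def nondet(self, produce):
        return next(self._values)  # 'produce' is never called during replay


def program(env):
    # Toy program (hypothetical) whose result depends on two non-deterministic inputs.
    stamp = env.nondet(time.perf_counter_ns)
    pick = env.nondet(lambda: random.randrange(1000))
    return (stamp % 7) * 1000 + pick


recorder = Recorder()
original = program(recorder)
replayed = program(Replayer(recorder.log))
```

Because the replayed run never touches the clock or the random generator, `replayed` equals `original` no matter when or where the replay happens.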

  4. Requirements • No OS or hardware changes • No changes in user environment • Manageable log sizes for long runs • Reasonable run-time overhead • Multi-threaded and multi-process applications • Integration with other existing analysis tools (e.g. dynamic analyzers, debuggers, profilers) • No assumptions about synchronization APIs

  5. Rest of the Talk • Motivation & Requirements • PinPlay Overview • Usage Examples • Results • Summary

  6. PinPlay: User-level deterministic replay and analysis • Run in application’s native environment • Replays user code • OS independent: cross-OS replay! • Easily integrates w/ other tools and debuggers [Figure: capture: Binary + Input run under PinPlay on the OS (Linux® or Windows®), producing normal program output plus logs (pinballs); replay: logs (pinballs) + PinPlay, with analysis tools and debuggers attached, on the OS (Linux® or Windows®)]

  7. Replay Models • Parallel-capture and parallel-replay [Figure: threads T0, T1, T2 captured by PinPlay into logs (pinballs), then replayed together by PinPlay] • Parallel-capture and isolated-replay [Figure: threads T0, T1, T2 captured by PinPlay into logs (pinballs), then each thread replayed on its own by a separate PinPlay instance]
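Parallel replay has to reproduce the order in which threads touched shared memory. The sketch below is a simplified illustration, assuming the full total order of accesses is logged (the talk says PinPlay logs only a subset of the interleaving). The `run` function and its parameters are hypothetical names.

```python
import threading


def run(num_threads, steps, schedule=None):
    """Run threads that each append their id to a shared trace `steps` times.

    schedule=None: capture mode, the order is whatever the scheduler produces.
    schedule=list: replay mode, each access waits until the log says it is
    that thread's turn.
    """
    trace = []
    cond = threading.Condition()

    def worker(tid):
        for _ in range(steps):
            with cond:
                if schedule is not None:
                    # Block until the logged order names this thread next.
                    cond.wait_for(lambda: schedule[len(trace)] == tid)
                trace.append(tid)   # stands in for a shared-memory access
                cond.notify_all()

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return trace


captured = run(num_threads=3, steps=4)                      # parallel capture
replayed = run(num_threads=3, steps=4, schedule=captured)   # parallel replay
```

The replay run uses real threads but produces the captured order every time, because each access is gated on the log rather than on the scheduler.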

  8. Information Captured For Replay • Subset of memory values • Shadow memory to capture first reads without prior writes and OS side-effects automatically [Sigmetrics’06] • Values changed by remote threads • Initial registers and OS register side-effects: • Signals/Exceptions/APCs/system calls • Code executed (user and libraries) • Position of code and stack • Output of some instructions (e.g. RDTSC) • Subset of shared-memory access interleaving (transitive opt. - FDR [ISCA’03]) [Figure: all memory values, divided into reads without prior writes, OS side-effects used by the app, values from remote threads, and all other values (not captured)]
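The shadow-memory idea on this slide can be sketched as follows. This is a single-thread toy model under stated assumptions, not PinPlay's code: the `ShadowLogger` class and its callbacks are hypothetical, and memory is modelled as a Python dictionary. A read is logged only when the value seen differs from what the program's own earlier accesses predict, which covers both first reads without prior writes and values changed behind the program's back (for example by a system call).

```python
_MISSING = object()   # marks an address the shadow copy has never seen


class ShadowLogger:
    def __init__(self):
        self.shadow = {}   # address -> value the execution itself would expect
        self.log = []      # (read_index, address, value) entries to inject on replay
        self.reads = 0

    def on_write(self, addr, value):
        # The program's own write is reproduced by replay, so nothing is logged.
        self.shadow[addr] = value

    def on_read(self, addr, value):
        if self.shadow.get(addr, _MISSING) != value:
            # First read without a prior write, or an external side-effect.
            self.log.append((self.reads, addr, value))
            self.shadow[addr] = value
        self.reads += 1


memory = {0x10: 5}                     # initial contents, unknown to the logger
logger = ShadowLogger()

logger.on_read(0x10, memory[0x10])     # first read, no prior write: logged
memory[0x10] = 6
logger.on_write(0x10, 6)               # program write: shadow updated, not logged
logger.on_read(0x10, memory[0x10])     # matches shadow: not logged
memory[0x10] = 9                       # simulated OS side-effect, unseen by the logger
logger.on_read(0x10, memory[0x10])     # differs from shadow: logged
logger.on_read(0x10, memory[0x10])     # matches shadow again: not logged
```

After these four reads the log holds two entries, `(0, 0x10, 5)` and `(2, 0x10, 9)`; all other values can be regenerated by re-executing the program.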

  9. PinPlay Architecture • Capable of logging, replaying and relogging execution (recapture from a replaying run) [Figure: in user land, the application code and data run under your Pin-based tool, which uses the PinPlay library and a pinball; the library contains a Logger (instrumentation and analysis to capture logs) and a Replayer (instrumentation and analysis to inject side-effects); underneath are Intel’s Pin (JIT compiler and instrumentor)* and the OS (Linux® or Windows®)] * http://www.pintool.org/

  10. Cross-OS Replay and Challenges • Log on one OS and replay on another • System call translations • Most OS activity does not happen on replay (only side-effects restored) • Semantics are translated across OSes (e.g. create thread) • Memory mapping • Problem: address space different across OSes • Solution: use Pin’s Fetch API to redirect code and memory operand rewriting to redirect data [Figure: code and data regions in the address space on Windows® remapped to different locations in the address space on Linux®]
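The remapping step amounts to translating every logged address into wherever the same region landed at replay time. The sketch below shows only that lookup, under the assumption that regions do not overlap. The `AddressRemapper` class and the example addresses are hypothetical, and it does not use or model Pin's Fetch API.

```python
import bisect


class AddressRemapper:
    """Translate addresses recorded in the log into replay-time addresses."""

    def __init__(self):
        self._starts = []    # sorted start addresses of logged regions
        self._regions = []   # parallel list of (logged_start, size, replay_start)

    def add_region(self, logged_start, size, replay_start):
        i = bisect.bisect_left(self._starts, logged_start)
        self._starts.insert(i, logged_start)
        self._regions.insert(i, (logged_start, size, replay_start))

    def translate(self, addr):
        # Find the last region starting at or before addr, then bounds-check it.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            start, size, replay_start = self._regions[i]
            if addr < start + size:
                return replay_start + (addr - start)
        raise KeyError(f"address {addr:#x} is not in any logged region")


remap = AddressRemapper()
remap.add_region(0x00400000, 0x1000, 0x7F0000400000)   # code region (made-up addresses)
remap.add_region(0x00600000, 0x2000, 0x7F0000A00000)   # data region (made-up addresses)
code_target = remap.translate(0x00400010)
data_target = remap.translate(0x00601FFF)
```

Offsets within a region are preserved, so `code_target` is `0x7F0000400010` and `data_target` is `0x7F0000A01FFF`; an address outside every region raises `KeyError`.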

  11. Usage Example: Program Analysis • Sampling and checkpointing for simulation • One run for profiling and finding representative regions, another for checkpointing • Requirement: both runs must be identical • Pinballs are used to share workloads for Pin-based analyses among architects [Figure: a multi-process MPI program is captured into per-process pinballs; PinPlay + Profiler replays the pinballs to find representative regions; PinPlay + Checkpointer replays the same pinballs to produce checkpoints for simulation]

  12. Usage Example: Replay for Debugging • Capture a buggy run and replay under debugger • Guaranteed to reproduce the bug and helps with root-causing • Works w/ off-the-shelf unmodified debuggers (e.g. GDB) • PinPlay-based tool extends GDB commands w/ your own • Limitation: debugger can’t change control-flow • Used to debug various multi-threaded applications • Also using it for in-house debugging of concurrency issues with a major database vendor [Figure: GDB (unmodified) connects over the binary remote protocol to a PinPlay-enabled debugger tool, which runs on Intel’s Pin and replays logs (pinballs)]

  13. Results [Figure: isolated replay]

  14. Sources of Slowdown • Instrumentation of every memory operation to identify system call side-effects and log data • Could be done by OS at the cost of OS modification or OS-specific analysis (doesn’t work on Windows®) • Locks for shadow-memory accesses • Could be eliminated by using a shadow-copy per thread at the cost of significant increase in log sizes • Other optimizations possible (please look at the paper)
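The log-size side of the shadow-memory trade-off can be illustrated with a toy count. This is one plausible reading of the slide and not a model of PinPlay's actual logger: the `log_entries` function is hypothetical, it counts only reads that a shadow copy cannot predict, and it does not model lock cost at all.

```python
_UNSEEN = object()   # marks an address a shadow copy has never seen


def log_entries(reads, per_thread_shadow):
    """Count logged reads for a list of (thread_id, address, value) events."""
    shadows = {}
    logged = 0
    for tid, addr, value in reads:
        # One shadow copy per thread (no lock needed), or a single shared copy.
        key = tid if per_thread_shadow else "shared"
        shadow = shadows.setdefault(key, {})
        if shadow.get(addr, _UNSEEN) != value:
            shadow[addr] = value
            logged += 1
    return logged


# Four threads each read the same 100 read-only locations.
reads = [(tid, addr, 0) for tid in range(4) for addr in range(100)]
shared_count = log_entries(reads, per_thread_shadow=False)
private_count = log_entries(reads, per_thread_shadow=True)
```

In this toy workload the shared shadow logs each location once (100 entries), while per-thread shadows log it once per thread (400 entries), which is the kind of growth the slide warns about.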

  15. Summary • User-level deterministic capture and replay • No OS changes, special hardware, or virtualization • Integrates w/ other Pin-tools for repeatable analysis and debugging • Replay occurs on any machine and works across OSes (Windows to Linux) • Pinballs are OS-independent and self-contained • Ideal for sharing workloads among researchers, for Pin-based analyses • We will release PinPlay libraries in the future

  16. Q&A
