
DMTCP: A New Linux Checkpointing Mechanism For Vanilla Universe Jobs





Why DMTCP?

  • Why checkpoint at all?

  • Problems with Condor’s Standard Universe

    • Single process.

    • No pthreads.

    • No mmap() support.

    • Forced re-link to form a static executable.

  • DMTCP removes these restrictions!



What is DMTCP?

  • Distributed Multi-Threaded CheckPointing.

  • Works with Linux Kernel 2.6.9 and later.

  • Supports sequential and multi-threaded computations across single/multiple hosts.

  • Entirely in user space (no kernel modules or root privilege).

  • Transparent (no recompiling, no re-linking).

  • Written at Northeastern U. and MIT and under active development for 4+ years.

  • LGPL’d and freely available.

  • No Remote I/O.



Process Structure

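[Figure: a Coordinator and Processes 1 through N. Each process contains a DMTCP checkpoint thread (CT) and user threads (T1, T2). Connections in the figure are labeled "Network Socket" and "Signal (USR2)".]

CT = DMTCP checkpoint thread

T = User Thread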



How Does It Work?

  • ./dmtcp_checkpoint a.out # starts coordinator too

  • ./dmtcp_command -c # talks to coordinator

  • ./dmtcp_restart ckpt_a.out-*.dmtcp

  • Coordinator is a stateless synchronization server for the distributed checkpointing algorithm.

  • Checkpoint/Restart performance related to size of memory, disk write speed, and synchronization.
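
The coordinator's role as a synchronization server can be pictured as a barrier between checkpoint stages. The following is a minimal Python sketch, not DMTCP code: threads stand in for the checkpointed processes, `threading.Barrier` stands in for the coordinator, and the stage names are shorthand assumed here for illustration.

```python
import threading

# Hypothetical stage names, chosen for illustration only.
STAGES = ["suspend", "drain", "write", "refill", "resume"]

def run_checkpoint(num_procs):
    """Run every stage on num_procs workers, with a barrier after each stage."""
    barrier = threading.Barrier(num_procs)  # stand-in for the coordinator
    log = []                                # (stage, worker rank) in completion order
    lock = threading.Lock()

    def worker(rank):
        for stage in STAGES:
            with lock:
                log.append((stage, rank))   # "do" the stage
            barrier.wait()                  # nobody advances until all have arrived

    threads = [threading.Thread(target=worker, args=(rank,))
               for rank in range(num_procs)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log
```

Because the barrier keeps no per-job history between stages, it only needs to count arrivals, which is one way to read the "stateless" description above.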



How Does It Work?

  • LD_PRELOAD: Transparently preloads the checkpoint libraries, which install libc wrappers and checkpointing code.

  • SIGUSR2: Sent internally by the checkpoint thread to the user threads.

  • Wrappers: Only on less heavily used calls to libc

    • fork, exec, system, pipe, bind, listen, setsockopt, connect, accept, clone, close, ptsname, openlog, closelog, signal, sigaction, sigvec, sigblock, sigsetmask, sigprocmask, rt_sigprocmask, pthread_sigmask

    • Overhead is negligible.
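
The wrapper idea can be illustrated with a small Python analogy. This is not how LD_PRELOAD works internally (that is done by the dynamic linker on C symbols); it only shows the pattern of interposing on a call, recording what it did, and delegating to the real function. The names `interpose` and `tracked_pipes` are hypothetical.

```python
import functools
import os

tracked_pipes = []  # hypothetical bookkeeping a checkpointer might keep

def interpose(real_func, record):
    """Return a wrapper that calls real_func and records its result."""
    @functools.wraps(real_func)
    def wrapper(*args, **kwargs):
        result = real_func(*args, **kwargs)
        record.append(result)  # remember the resource for checkpoint time
        return result
    return wrapper

real_pipe = os.pipe
os.pipe = interpose(real_pipe, tracked_pipes)  # install the wrapper
try:
    read_fd, write_fd = os.pipe()              # the caller is unaware of the wrapper
finally:
    os.pipe = real_pipe                        # uninstall the wrapper
os.close(read_fd)
os.close(write_fd)
```

The wrapper adds one list append per call, which is consistent with the point above that wrapping rarely used calls costs little.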



How Does It Work?

  • Additional wrappers when process id & thread id virtualization is enabled

    • getpid, getppid, gettid, tcgetpgrp, tcsetpgrp, getpgrp, setpgrp, getsid, setsid, kill, tkill, tgkill, wait, waitpid, waitid, wait3, wait4
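
Process id virtualization can be sketched as a translation table that the wrapped calls consult. The sketch below is an illustration under assumptions, not DMTCP's data structure: the class name `PidTable`, its methods, and the choice to use the original real pid as the virtual pid are all hypothetical.

```python
class PidTable:
    """Maps stable virtual pids (seen by the application) to current real pids."""

    def __init__(self):
        self._virt_to_real = {}

    def register(self, real_pid):
        # Assumption: at first launch the virtual pid equals the real pid.
        self._virt_to_real[real_pid] = real_pid
        return real_pid

    def rebind(self, virtual_pid, new_real_pid):
        # After a restart the kernel hands out a new real pid;
        # the virtual pid the application remembers stays the same.
        if virtual_pid not in self._virt_to_real:
            raise KeyError(virtual_pid)
        self._virt_to_real[virtual_pid] = new_real_pid

    def to_real(self, virtual_pid):
        # Used by wrappers that pass a pid to the kernel, e.g. kill.
        return self._virt_to_real[virtual_pid]

    def to_virtual(self, real_pid):
        # Used by wrappers that return a pid to the application, e.g. getpid.
        for virtual_pid, real in self._virt_to_real.items():
            if real == real_pid:
                return virtual_pid
        raise KeyError(real_pid)
```

A wrapped `kill` would translate its argument with `to_real` before calling the real function, and a wrapped `getpid` would translate the kernel's answer with `to_virtual`.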



How Does It Work?

  • Checkpoint image compression on-the-fly (default).

  • Currently supports only dynamic linking against libc.so. Support for static libc.a is feasible, but not implemented.

  • Stays close to POSIX API standards.
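
On-the-fly compression means the image is compressed as it is written rather than in a second pass. A minimal Python sketch of that idea follows. The use of gzip, the length-prefixed framing, and the function names are assumptions made for illustration; the talk does not specify the image format.

```python
import gzip

def write_image(path, regions):
    """Stream memory regions into a compressed file, one region at a time."""
    with gzip.open(path, "wb") as out:
        for region in regions:
            # Hypothetical framing: 8-byte little-endian length, then the bytes.
            out.write(len(region).to_bytes(8, "little"))
            out.write(region)

def read_image(path):
    """Read the regions back in the order they were written."""
    regions = []
    with gzip.open(path, "rb") as src:
        while True:
            header = src.read(8)
            if not header:
                break
            size = int.from_bytes(header, "little")
            regions.append(src.read(size))
    return regions
```

Streaming through the compressor avoids holding a second, uncompressed copy of the image on disk.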



A Checkpoint Under DMTCP

  • dmtcphijack.so & mtcp.so present in executable’s memory.

  • Ask coordinator process for checkpoint via dmtcp_command.

  • Now what happens?



A Checkpoint Under DMTCP

  • Suspend user threads with SIGUSR2.

  • Elect shared file descriptor leaders.

  • Drain kernel buffers and do network handshake with peers.

  • Write checkpoint to disk.

  • Refill kernel buffers.

  • Resume user threads.
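
The drain and refill steps can be sketched with a local socket pair. This is a simplified illustration, not DMTCP's protocol: the token value, the function names, and the choice to have the original sender re-send the saved bytes are assumptions.

```python
import socket

TOKEN = b"<<drain-token>>"  # hypothetical end-of-in-flight-data marker

def recv_exact(sock, size):
    """Read exactly size bytes from sock."""
    data = b""
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            raise ConnectionError("peer closed early")
        data += chunk
    return data

def drain(receiver, sender):
    """Move in-flight bytes out of the kernel buffer into user memory."""
    sender.sendall(TOKEN)  # handshake: marks the end of the in-flight data
    buf = b""
    while not buf.endswith(TOKEN):
        chunk = receiver.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf[:-len(TOKEN)]

def refill(sender, saved):
    """Put the drained bytes back in flight once the checkpoint is written."""
    sender.sendall(saved)

a, b = socket.socketpair()
a.sendall(b"in-flight data")          # sent, but not yet read by b
saved = drain(b, a)                   # buffer is now empty; saved goes in the image
refill(a, saved)                      # after the write-to-disk step
restored = recv_exact(b, len(saved))  # the application reads as if nothing happened
a.close()
b.close()
```

Draining matters because bytes sitting in kernel buffers are not part of any process's memory and would otherwise be lost from the checkpoint.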



Where Is the Checkpoint?

  • In the cwd of the application.

    • A set of ckpt_<exec>_<id>.dmtcp files.

  • In the cwd of the coordinator.

    • A dmtcp_restart_script.sh file.

    • The dmtcp_restart_script.sh may need tweaking depending on the circumstances.
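
Locating the checkpoint images can be scripted. The sketch below assumes the images match `ckpt_*.dmtcp` in the application's working directory, as described above; the helper names and the default `./dmtcp_restart` path are hypothetical.

```python
import glob
import os

def find_checkpoints(app_cwd):
    """Return the checkpoint image paths in app_cwd, sorted by name."""
    return sorted(glob.glob(os.path.join(app_cwd, "ckpt_*.dmtcp")))

def restart_command(images, restart_bin="./dmtcp_restart"):
    """Build the argv list for restarting from the given images."""
    if not images:
        raise FileNotFoundError("no ckpt_*.dmtcp files found")
    return [restart_bin, *images]
```

The returned list can be passed to `subprocess.run` on a machine where the DMTCP programs are installed.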



A Restart Under DMTCP

  • Restart Process loads in memory.

  • Reopen files and recreate ptys.

  • Recreate and reconnect sockets.

  • Fork into user processes.

  • Rearrange file descriptors to initial layout.

  • Restore memory and threads.

  • Refill kernel buffers.

  • Resume user threads.
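
The "rearrange file descriptors" step can be illustrated with `os.dup2`, which places an open file at a chosen descriptor number. This is a minimal sketch under assumptions: the function name `restore_fd` and the idea of saving a path and offset per descriptor are hypothetical, not DMTCP's actual bookkeeping.

```python
import os

def restore_fd(path, offset, target_fd, flags=os.O_RDWR):
    """Reopen path and place it at target_fd with the saved file offset."""
    fd = os.open(path, flags)     # the kernel picks whatever number is free
    if fd != target_fd:
        os.dup2(fd, target_fd)    # target_fd now refers to the same open file
        os.close(fd)              # drop the temporary descriptor
    os.lseek(target_fd, offset, os.SEEK_SET)
    return target_fd
```

The application keeps using the descriptor numbers it held before the checkpoint, so the restart has to reproduce those numbers rather than accept whatever the kernel assigns.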



Supported OS Features

  • Threads, mutexes/semaphores, fork, exec

    • Shared memory (via mmap), TCP/IP sockets, UNIX domain sockets, pipes, ptys, terminal modes, ownership of controlling terminals, signal handlers, open and/or shared fds, I/O (including the readline library), parent-child process relationships, process id & thread id virtualization, session and process group ids, and more…

  • Trying to keep the implementation small!



Supported Applications

  • MPICH-2, OpenMPI, SciPy/IPython, Python

    • cmsRun, Perl, Ruby, PHP, GHCi (Glasgow Haskell Compiler), OCaml, Octave, Macaulay2, gnuplot, slsh (S-Lang scripts), MzScheme, GST (GNU Smalltalk virtual machine), tcsh, dash, csh, tclsh (Tcl-based interpreter), SQLite.

    • And many others!



Planned Application Support

  • Bash, gcl (GNU Common Lisp), maxima (based on gcl), and the Sun JVM.

  • These programs use sbrk() for their own memory management and induce a bug in DMTCP.

  • A fix is planned and will go in soon.



Planned Application Support

  • Matlab

    • Calling the binary directly without graphics works, but MATLAB uses bash, which needs the sbrk() fix.



Condor/DMTCP Integration

  • Experimental at this time.

    • Determining scalability, stability, and extent of “weird edge cases” of DMTCP mixed with Condor.

  • Completely outside of Condor source code.

    • A vanilla job called “shim_dmtcp” that wraps the user’s job and stdfiles with DMTCP.

    • A submit description file which transfers needed dmtcp files over to the remote side and saves intermediate checkpoints.

    • No remote I/O!



Shim Script Execution

[Figure: process diagram showing condor_starter, shim_dmtcp, the Job, and the Coordinator.]



Submit File Example

universe = vanilla

executable = shim_dmtcp

arguments = logfile stdinf stdoutf stderrf a.out arg0 arg1…

should_transfer_files = YES

when_to_transfer_output = ON_EVICT_OR_EXIT

transfer_input_files = <dmtcp libraries and programs>, \
    a.out, stdinf, stdoutf, stderrf

environment = DMTCP_TMPDIR=./;JALIB_STDERR_PATH=/dev/null

kill_sig = 2

output = shim.$(Cluster).$(Process).out

error = shim.$(Cluster).$(Process).err

log = shim.log

queue
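
For many jobs, a submit description of this shape can be generated rather than written by hand. The following Python sketch reproduces only the keys shown in the example above. The function name `render_submit` and its parameters are hypothetical, and the list of DMTCP libraries and programs is left to the caller because the talk does not enumerate it.

```python
def render_submit(job, job_args, dmtcp_files,
                  logfile="logfile",
                  stdfiles=("stdinf", "stdoutf", "stderrf")):
    """Render a submit description for a vanilla job wrapped by shim_dmtcp."""
    # Argument order follows the example: log file, std files, job, job args.
    arguments = " ".join([logfile, *stdfiles, job, *job_args])
    # dmtcp_files is supplied by the caller; the source elides its contents.
    inputs = ", ".join([*dmtcp_files, job, *stdfiles])
    lines = [
        "universe = vanilla",
        "executable = shim_dmtcp",
        f"arguments = {arguments}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EVICT_OR_EXIT",
        f"transfer_input_files = {inputs}",
        "environment = DMTCP_TMPDIR=./;JALIB_STDERR_PATH=/dev/null",
        "kill_sig = 2",
        "output = shim.$(Cluster).$(Process).out",
        "error = shim.$(Cluster).$(Process).err",
        "log = shim.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and passing it to `condor_submit` would be the intended use, assuming shim_dmtcp is available in the submit directory.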



Condor/DMTCP Integration

  • Early Results

    • It works with our test case and thousands of jobs.

    • Problems

      • Checkpointing between Physical Address Extension (PAE) kernels and normal kernels is a challenge.

      • DMTCP’s API needs some improvement.

      • Coordinator failure means job failure.

      • Shim script is clunky, e.g. no streaming I/O.

  • Next: Integration into our stduniv test suite for full regression testing.



Future Condor Integration

  • Add WantCheckpoint = True and CheckpointMethod = DMTCP for a vanilla universe job.

  • Condor handles wrapping the job with DMTCP and transferring the needed DMTCP files, with no shim script voodoo.

  • Condor should honor CheckpointPlatform for Vanilla universe jobs in case of pool segmentation.

  • Parallel universe support with single coordinator.

  • Doug Thain’s Parrot for remote I/O.



Challenges

  • C/C++ runtime library compatibility issues.

    • Recompile DMTCP on slot before job execution?

  • Dynamic library incompatibilities.

  • No Checkpoint Server.

    • Condor file transfer protocol enhancement?

  • Debugging methods and practices?



Further Reading

  • “DMTCP: Transparent Checkpointing for Cluster Computation and the Desktop”

    • http://arxiv.org/abs/cs/0701037

  • Source Code

    • http://dmtcp.sourceforge.net



Questions?

  • DMTCP

    • http://dmtcp.sourceforge.net

    • Gene Cooperman

  • Condor/DMTCP Integration

    • Pete Keller

    • Ask me if you want to try the Alpha Version out!



Thank you

