DMTCP: A New Linux Checkpointing Mechanism For Vanilla Universe Jobs

Why DMTCP?
  • Why checkpoint at all?
  • Problems with Condor’s Standard Universe
    • Single process.
    • No pthreads.
    • No mmap() support.
    • Forced re-link to form a static executable.
  • DMTCP removes these restrictions!
What is DMTCP?
  • Distributed Multi-Threaded CheckPointing.
  • Works with Linux Kernel 2.6.9 and later.
  • Supports sequential and multi-threaded computations across single/multiple hosts.
  • Entirely in user space (no kernel modules or root privilege).
  • Transparent (no recompiling, no re-linking).
  • Written at Northeastern U. and MIT and under active development for 4+ years.
  • LGPL’d and freely available.
  • No Remote I/O.
Process Structure

[Figure: a Coordinator and Processes 1 through N, each process containing a DMTCP checkpoint thread (CT) and user threads (T1, T2). Labeled links: Signal (USR2), Network Socket.]

  • CT = DMTCP checkpoint thread
  • T = User thread

How Does It Work?
  • ./dmtcp_checkpoint a.out # starts coordinator too
  • ./dmtcp_command -c # talks to coordinator
  • ./dmtcp_restart ckpt_a.out-*.dmtcp
  • Coordinator is a stateless synchronization server for the distributed checkpointing algorithm.
  • Checkpoint/restart performance depends on the size of memory, disk write speed, and synchronization.
How Does It Work?
  • LD_PRELOAD: Transparently preloads the checkpoint libraries, which install libc wrappers and checkpointing code.
  • SIGUSR2: Used internally from checkpoint thread to user threads.
  • Wrappers: Only on less heavily used calls to libc
    • fork, exec, system, pipe, bind, listen, setsockopt, connect, accept, clone, close, ptsname, openlog, closelog, signal, sigaction, sigvec, sigblock, sigsetmask, sigprocmask, rt_sigprocmask, pthread_sigmask
    • Overhead is negligible.
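
As a rough illustration of the LD_PRELOAD step, the sketch below builds the environment a launcher could hand to the target program so that the dynamic loader maps a checkpoint library ahead of libc. The helper name `preload_env` and the library path are hypothetical. Only the `LD_PRELOAD` variable and the library name dmtcphijack.so come from these slides, and this is not DMTCP's actual launcher code.

```python
import os


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path first in LD_PRELOAD.

    Hypothetical helper for illustration; not part of DMTCP.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # The dynamic loader accepts a colon- or space-separated list and
    # resolves symbols from earlier entries first.
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


if __name__ == "__main__":
    # Assumed install path. A launcher would then start the target with this
    # environment, e.g. subprocess.run(["./a.out"], env=env).
    env = preload_env("/opt/dmtcp/lib/dmtcphijack.so")
    print(env["LD_PRELOAD"])
```

Prepending rather than overwriting keeps any library the user already preloads, while still letting the checkpoint library's wrappers take precedence.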
How Does It Work?
  • Additional wrappers when process id & thread id virtualization is enabled
    • getpid, getppid, gettid, tcgetpgrp, tcsetpgrp, getpgrp, setpgrp, getsid, setsid, kill, tkill, tgkill, wait, waitpid, waitid, wait3, wait4
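
The idea behind id virtualization can be sketched with a small translation table: the application keeps seeing the pid it had before the checkpoint, while the wrappers translate to whatever pid the kernel assigned after restart. The class `VirtualPidTable` and its methods are hypothetical names, and the choice to reuse the original real pid as the virtual pid is an assumption made for this sketch, not a statement about DMTCP's internals.

```python
class VirtualPidTable:
    """Maps stable virtual pids (seen by the application) to current real pids."""

    def __init__(self):
        self._virt_to_real = {}

    def register(self, real_pid):
        # Assumption: at first launch the virtual pid equals the real pid.
        self._virt_to_real[real_pid] = real_pid
        return real_pid

    def remap(self, virtual_pid, new_real_pid):
        # After restart the kernel hands out a new pid; the virtual pid stays.
        if virtual_pid not in self._virt_to_real:
            raise KeyError(f"unknown virtual pid {virtual_pid}")
        self._virt_to_real[virtual_pid] = new_real_pid

    def to_real(self, virtual_pid):
        # What a wrapper around kill() or waitpid() would pass to the kernel.
        return self._virt_to_real[virtual_pid]

    def to_virtual(self, real_pid):
        # What a wrapper around getpid() or wait() would return to the caller.
        for virt, real in self._virt_to_real.items():
            if real == real_pid:
                return virt
        raise KeyError(f"unknown real pid {real_pid}")


if __name__ == "__main__":
    table = VirtualPidTable()
    vpid = table.register(4021)   # pid before the checkpoint
    table.remap(vpid, 9133)       # pid assigned after restart
    print(vpid, table.to_real(vpid), table.to_virtual(9133))
```

Every call in the list above either takes or returns a process, thread, group, or session id, which is why each one needs a wrapper once the real ids can change across a restart.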
How Does It Work?
  • Checkpoint image compression on-the-fly (default).
  • Currently only supports dynamically linking to libc.so. Support for static libc.a is feasible, but not implemented.
  • Stays close to POSIX API standards.
A Checkpoint Under DMTCP
  • dmtcphijack.so & mtcp.so present in executable’s memory.
  • Ask coordinator process for checkpoint via dmtcp_command.
  • Now what happens?
A Checkpoint Under DMTCP
  • Suspend user threads with SIGUSR2.
  • Elect shared file descriptor leaders.
  • Drain kernel buffers and do network handshake with peers.
  • Write checkpoint to disk.
  • Refill kernel buffers.
  • Resume user threads.
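
The drain and refill steps can be illustrated with a local socket pair. This is a simplified sketch under stated assumptions: the function names `drain` and `refill` are hypothetical, the peers live in one process, and "buffer is empty" is detected with a non-blocking read, whereas a distributed checkpointer needs a handshake between peers to know that all in-flight data has arrived.

```python
import socket


def drain(sock):
    """Read everything currently buffered in the kernel for sock."""
    sock.setblocking(False)
    chunks = []
    try:
        while True:
            chunk = sock.recv(4096)
            if not chunk:          # peer closed the connection
                break
            chunks.append(chunk)
    except BlockingIOError:
        pass                       # nothing left in the kernel buffer
    finally:
        sock.setblocking(True)
    return b"".join(chunks)


def refill(peer, data):
    """Resend drained bytes from the peer so they sit in the kernel buffer again."""
    peer.sendall(data)


def demo():
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(b"in-flight bytes")   # data the application has not read yet
        saved = drain(receiver)              # now held in user-space memory
        # ... the checkpoint image, including `saved`, would be written here ...
        refill(sender, saved)
        restored = receiver.recv(4096)       # the application resumes and reads it
        return saved, restored
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(demo())
```

Draining matters because bytes sitting in kernel socket buffers are not part of the process's memory image; moving them into user space before the write is what lets them survive the checkpoint.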
Where Is the Checkpoint?
  • In the cwd of the application.
    • A set of ckpt_<exec>_<id>.dmtcp files.
  • In the cwd of the coordinator.
    • A dmtcp_restart_script.sh file.
    • The dmtcp_restart_script.sh may need tweaking depending upon circumstance.
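
A wrapper that decides between a fresh start and a restart needs to look in both directories. The sketch below does that lookup; the function name `find_checkpoint_files` is hypothetical, and the glob pattern `ckpt_*.dmtcp` is an assumption derived from the file naming shown on these slides.

```python
from pathlib import Path


def find_checkpoint_files(app_cwd, coordinator_cwd):
    """Return (sorted checkpoint images, restart script path or None)."""
    # Checkpoint images are written to the application's working directory.
    images = sorted(Path(app_cwd).glob("ckpt_*.dmtcp"))
    # The restart script is written to the coordinator's working directory.
    script = Path(coordinator_cwd) / "dmtcp_restart_script.sh"
    return images, (script if script.is_file() else None)


if __name__ == "__main__":
    images, script = find_checkpoint_files(".", ".")
    if images:
        print("restart candidates:", [p.name for p in images])
    else:
        print("no checkpoint images found; start the job from scratch")
    print("restart script:", script)
```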
A Restart Under DMTCP
  • Restart Process loads in memory.
  • Reopen files and recreate ptys.
  • Recreate and reconnect sockets.
  • Fork into user processes.
  • Rearrange file descriptors to initial layout.
  • Restore memory and threads.
  • Refill kernel buffers.
  • Resume user threads.
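
The "rearrange file descriptors" step relies on the fact that a reopened file or socket usually lands on a different descriptor number than the one the application was using. A minimal sketch of moving it back with `dup2` follows; the helper name `restore_fd` is hypothetical, and the sketch assumes the target number is free or safe to overwrite.

```python
import os


def restore_fd(current_fd, original_fd):
    """Move an open descriptor to the number it had at checkpoint time."""
    if current_fd == original_fd:
        return original_fd
    # dup2 makes original_fd refer to the same open file as current_fd,
    # silently closing whatever original_fd pointed at before.
    os.dup2(current_fd, original_fd)
    os.close(current_fd)
    return original_fd


if __name__ == "__main__":
    read_end, write_end = os.pipe()
    # Pick a descriptor number known to be free for the demonstration.
    target = os.dup(write_end)
    os.close(target)
    restore_fd(write_end, target)
    os.write(target, b"ok")
    print(os.read(read_end, 2))
    os.close(target)
    os.close(read_end)
```

Because the application stored the old numbers in its own memory, restoring the exact layout is what keeps its later `read` and `write` calls valid after the memory image is reloaded.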
Supported OS Features
  • Threads, mutexes/semaphores, fork, exec
    • Shared memory (via mmap), TCP/IP sockets, UNIX domain sockets, pipes, ptys, terminal modes, ownership of controlling terminals, signal handlers, open and/or shared fds, I/O (including the readline library), parent-child process relationships, process id & thread id virtualization, session and process group ids, and more…
  • Trying to keep the implementation small!
Supported Applications
  • MPICH-2, OpenMPI, SciPy/iPython, Python
    • cmsRun, Perl, Ruby, PHP, GHCi (Glasgow Haskell Compiler), OCaml, Octave, Macaulay2, gnuplot, slsh (S-Lang scripts), MzScheme, GST (GNU Smalltalk virtual machine), tcsh, dash, csh, tclsh (Tcl-based interpreter), SQLite.
    • And many others!
Planned Application Support
  • Bash, gcl (GNU Common Lisp), maxima (based on gcl), and the Sun JVM.
  • These programs use sbrk() for their own memory management and induce a bug in DMTCP.
  • A fix is planned and will go in soon.
Planned Application Support
  • Matlab
    • Directly calling the binary without graphics works, but Matlab uses bash, which needs the sbrk() fix.
Condor/DMTCP Integration
  • Experimental at this time.
    • Determining scalability, stability, and extent of “weird edge cases” of DMTCP mixed with Condor.
  • Completely outside of Condor source code.
    • A vanilla job called “shim_dmtcp” that wraps the user’s job and stdfiles with DMTCP.
    • A submit description file which transfers needed dmtcp files over to the remote side and saves intermediate checkpoints.
    • No remote I/O!
Shim Script Execution

[Figure: process diagram with the components condor_starter, shim_dmtcp, Job, and Coordinator.]

Submit File Example

universe = vanilla
executable = shim_dmtcp
arguments = logfile stdinf stdoutf stderrf a.out arg0 arg1…
should_transfer_files = YES
when_to_transfer_output = ON_EVICT_OR_EXIT
transfer_input_files = <dmtcp libraries and programs>,\
    a.out, stdinf, stdoutf, stderrf
environment = DMTCP_TMPDIR=./;JALIB_STDERR_PATH=/dev/null
kill_sig = 2
output = shim.$(Cluster).$(Process).out
error = shim.$(Cluster).$(Process).err
log = shim.log
queue

Condor/DMTCP Integration
  • Early Results
    • It works with our test case and thousands of jobs.
    • Problems
      • Checkpointing between Physical Address Extension (PAE) kernels and normal kernels is a challenge.
      • DMTCP’s API needs some improvement.
      • Coordinator failure means job failure.
      • Shim script is clunky, e.g. no streaming I/O.
  • Next: Integration into our stduniv test suite for full regression testing.
Future Condor Integration
  • Add WantCheckpoint = True and CheckpointMethod = DMTCP for a vanilla universe job.
  • Condor takes care of wrapping the job with DMTCP and transferring the needed DMTCP files--no shim script voodoo.
  • Condor should honor CheckpointPlatform for Vanilla universe jobs in case of pool segmentation.
  • Parallel universe support with single coordinator.
  • Doug Thain’s Parrot for remote I/O.
Challenges
  • C/C++ runtime library compatibility issues.
    • Recompile DMTCP on slot before job execution?
  • Dynamic library incompatibilities.
  • No Checkpoint Server.
    • Condor file transfer protocol enhancement?
  • Debugging methods and practices?
Further Reading
  • “DMTCP: Transparent Checkpointing for Cluster Computation and the Desktop”
    • http://arxiv.org/abs/cs/0701037
  • Source Code
    • http://dmtcp.sourceforge.net
Questions?
  • DMTCP
    • http://dmtcp.sourceforge.net
    • Gene Cooperman: gene@ccs.neu.edu
  • Condor/DMTCP Integration
    • Pete Keller: psilord@cs.wisc.edu
    • Ask me if you want to try the Alpha Version out!