
Process Manager Specification

Rusty Lusk, lusk@mcs.anl.gov, 1/15/04



1. Process Manager Specification
Rusty Lusk, lusk@mcs.anl.gov, 1/15/04

2. Outline
• Process Manager Functionality
• Expected Consumers
• Commands
• Semantics
• Examples
• Schema

3. Process Manager Functionality
• Process Execution
  • Start process groups
  • Provide status information during execution
  • Provide command output and error messages
  • Return exit status information
• Process Group Control
  • Kill process groups
  • Signal process groups
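The execution side of this functionality can be sketched locally with the standard library. This is only an illustration of the lifecycle a process group goes through (start, collect output, return exit status), not the Process Manager's implementation, which starts groups across nodes via MPD:

```python
# Sketch of the process-group lifecycle the Process Manager exposes:
# start a group of processes, capture their output, and collect exit
# statuses.  Local subprocesses stand in for remotely started processes.
import subprocess

def start_process_group(cmds):
    """Start one local process per command line; return the 'group'."""
    return [subprocess.Popen(cmd, stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE, text=True)
            for cmd in cmds]

def wait_process_group(procs):
    """Return (exit_statuses, merged_stdout) once all members exit."""
    codes, outs = [], []
    for p in procs:
        out, _err = p.communicate()
        codes.append(p.returncode)
        outs.append(out)
    return codes, "".join(outs)

group = start_process_group([["echo", "hello"], ["true"]])
codes, output = wait_process_group(group)
```

Killing or signaling the group would correspond to calling `Popen.send_signal` on each member, mirroring the Process Group Control bullets above.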

4. Expected Consumers
• Components which execute programs
• Components which need to locate running processes
• Components which need to control running processes

5. Schematic of Process Management Component in Scalable Systems Software Context
[Diagram: on the SSS side, the Process Manager (PM) exchanges SSS XML with the other SSS components (NSM, SD, Sched, EM, QM); on the prototype MPD-based implementation side, the MPD's start the application processes. Jobs enter via mpdrun, mpiexec (MPI Standard args, interactive), simple scripts or hairy GUIs using SSS XML, or an XML file in the QM's job submission language.]

6. Commands
• <create-process-group> creates a new process group.
• <get-process-group-info> gets status information, including current process ids, exit status information, and stdout/stderr output.
• <signal-process-group> sends a Unix signal to all processes in a process group.
• <kill-process-group> kills all processes in a process group.
• <del-process-group-info> allows the process manager to discard process-group information after the process group has exited.
• All commands use the restriction syntax.
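The restriction syntax can be seen in the examples that follow: an attribute value of '*' matches anything, while any other value must match exactly. A minimal sketch of that matching rule, with records and queries as plain dicts rather than XML attributes:

```python
# Sketch of restriction-syntax matching as suggested by the examples:
# '*' is a wildcard, any other value requires an exact match.  The real
# protocol carries queries and records as XML attributes.
def matches(query, record):
    return all(v == "*" or record.get(k) == v for k, v in query.items())

records = [
    {"user": "desai", "pgid": "12"},
    {"user": "lusk",  "pgid": "7"},
]

# Like <process-group user='desai' pgid='*'/>: selects desai's groups.
hits = [r for r in records if matches({"user": "desai", "pgid": "*"}, r)]
```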

7. Examples
<create-process-group submitter='desai' totalprocs='4' pgid='*' output='merge'>
  <process-spec exec='/bin/cpi' cwd='/' path='/bin:/usr/bin'>
    <host-spec>node1 node2</host-spec>
  </process-spec>
</create-process-group>
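A request like this can be assembled with the standard library's ElementTree; the element and attribute names follow the example above, and how the XML reaches the Process Manager is out of scope here:

```python
# Sketch: building the <create-process-group> request from the slide
# with xml.etree.ElementTree.  Transport to the PM is not shown.
import xml.etree.ElementTree as ET

cpg = ET.Element("create-process-group",
                 submitter="desai", totalprocs="4",
                 pgid="*", output="merge")
# 'exec' is a Python keyword, so pass the attributes as a dict.
spec = ET.SubElement(cpg, "process-spec",
                     {"exec": "/bin/cpi", "cwd": "/",
                      "path": "/bin:/usr/bin"})
ET.SubElement(spec, "host-spec").text = "node1 node2"

request = ET.tostring(cpg, encoding="unicode")
```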

8. Examples (continued)
<get-process-group-info>
  <process-group user='desai' pgid='*'/>
  <process-group user='lusk' pgid='*'/>
  <process-group pgid='*'>
    <process host='node4' pid='*'/>
  </process-group>
</get-process-group-info>

9. Examples (continued)
Response:
<process-group-info>
  <process-groups>
    <process-group user='desai' pgid='12'/>
    <process-group user='desai' pgid='16'/>
    <process-group pgid='24'>
      <process host='node4' pid='15423'/>
      <process host='node4' pid='2523'/>
    </process-group>
  </process-groups>
</process-group-info>
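A consumer would pull the per-process details out of such a response; a minimal sketch with ElementTree, using a response shaped like the one above:

```python
# Sketch: extracting (host, pid) pairs from a <process-group-info>
# response such as the example above, using xml.etree.ElementTree.
import xml.etree.ElementTree as ET

response = """
<process-group-info>
  <process-groups>
    <process-group user='desai' pgid='12'/>
    <process-group pgid='24'>
      <process host='node4' pid='15423'/>
      <process host='node4' pid='2523'/>
    </process-group>
  </process-groups>
</process-group-info>
"""

root = ET.fromstring(response)
# iter() walks the whole tree, so nesting depth does not matter.
pids = [(p.get("host"), p.get("pid")) for p in root.iter("process")]
```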

  10. Using the SSS Software Architecture on Chiba City

11. Chiba City
• Medium-sized cluster at Argonne National Laboratory
• 256 dual-processor 500 MHz PIII's
• Myrinet
• Linux (and sometimes others)
• No shared file system, for scalability (but now a test platform for PVFS2)
• Dedicated to computer science scalability research, not applications
  • Many groups use it as a research platform, both academic and commercial
  • Also used by friendly, hungry applications
• New requirement: support research requiring specialized kernels and alternate operating systems, for OS scalability research

12. New Challenges
• Want to schedule jobs that require node rebuilds (for new OS's, kernel module tests, etc.) as part of “normal” job scheduling
• Want to build larger virtual clusters (using VMware or User Mode Linux) temporarily, as part of “normal” job scheduling
• Requires a major upgrade of Chiba City systems software

13. Chiba Commits to SSS
• Fork in the road (occurred August 2003): either
  • Major overhaul of the old Chiba systems software (OpenPBS + Maui scheduler + homegrown stuff), OR
  • Take a great leap forward and bet on the all-new software architecture of SSS
• Problems with the leaping approach:
  • SSS interfaces not finalized
  • Some components don't yet use the library (they implement their own protocols in open code, not encapsulated in the library)
  • Some components not fully functional yet
• Solutions to these problems:
  • Collect the components that are adequately functional and integrated (PM, SD, EM, BCM)
  • Write “stubs” for other critical components (Sched, QM)
  • Do without some components (CKPT, monitors, accounting) for the time being

14. Features of Adopted Solution
• Stubs adequate, at least for the time being
  • Scheduler does FIFO + reservations + backfill, and is improving
  • QM implements a “PBS compatibility mode” (accepts user PBS scripts) as well as asking the Process Manager to start parallel jobs directly
• Process Manager wraps MPD-2
  • A single ring of MPD's runs as root, managing all jobs for all users
  • MPD's are started by the Build-and-Config Manager at boot time
  • An MPI program called MPISH (MPI Shell) wraps user jobs to handle file staging and multiple job steps
• Python implementation of most components
• Demonstrated feasibility of the SSS component approach to systems software
  • Running the normal Chiba job mix for over five months now
  • Moving forward on meeting the new requirements for research support

15. Next Steps
• Integrate other components into this structure
• Integrate other instantiations of components into this structure
• Replace stubs as possible; easiest if they use the same XML APIs
• Put “unusual” capabilities into production, such as rebuilding nodes on the fly
