
Job Delegation and Planning in Condor-G (ISGC 2005, Taipei, Taiwan)

Presentation Transcript


  1. Job Delegation and Planning in Condor-G (ISGC 2005, Taipei, Taiwan)

  2. The Condor Project (Established ‘85) Distributed High Throughput Computing research performed by a team of ~35 faculty, full time staff and students.

  3. The Condor Project (Established ‘85) Distributed High Throughput Computing research performed by a team of ~35 faculty, full time staff and students who: • face software engineering challenges in a distributed UNIX/Linux/NT environment • are involved in national and international grid collaborations, • actively interact with academic and commercial users, • maintain and support large distributed production environments, • and educate and train students. Funding – US Govt. (DoD, DoE, NASA, NSF, NIH), AT&T, IBM, INTEL, Microsoft, UW-Madison, …

  4. A Multifaceted Project • Harnessing the power of clusters – dedicated and/or opportunistic (Condor) • Job management services for Grid applications (Condor-G, Stork) • Fabric management services for Grid resources (Condor, GlideIns, NeST) • Distributed I/O technology (Parrot, Kangaroo, NeST) • Job-flow management (DAGMan, Condor, Hawk) • Distributed monitoring and management (HawkEye) • Technology for Distributed Systems (ClassAD, MW) • Packaging and Integration (NMI, VDT)

  5. Some software produced by the Condor Project • Condor System • ClassAd Library • DAGMan • Fault Tolerant Shell (FTSH) • Hawkeye • GCB • MW • NeST • Stork • Parrot • VDT • and others… all as open source

  6. Who uses Condor? • Commercial • Oracle, Micron, Hartford Life Insurance, CORE, Xerox, ExxonMobil, Shell, Alterra, Texas Instruments, … • Research Community • Universities, Govt Labs • Bundles: NMI, VDT • Grid Communities: EGEE/LCG/gLite, Particle Physics Data Grid (PPDG), USCMS, LIGO, iVDGL, NSF Middleware Initiative GRIDS Center, …

  7. Condor Pool [diagram: a central MatchMaker matching Schedds, each holding queued jobs, with the Startds in the pool]

  8. Condor Pool [same diagram after matching: the Schedds' jobs are now running on the Startds]

  9. Condor-G [diagram: a Condor-G Schedd with queued jobs delegating them to a variety of back-ends: Condor-C, LSF, PBS, Globus 2, Globus 4, Unicore (NorduGrid), or directly to a Condor Startd]

  10. Condor-G [layered diagram: User/Application/Portal on top; Condor-G and a Condor Pool in the middle; grid middleware (Globus 2, Globus 4, Unicore, …) below; the Grid Fabric (processing, storage, communication) at the bottom]

  11. Atomic/Durable Job Delegation • Transfer of responsibility to schedule and execute a job • Stage in executable and data files • Transfer policy “instructions” • Securely transfer (and refresh?) credentials, obtain local identities • Monitor and present job progress (transparency!) • Return results • Multiple delegations can be combined in interesting ways

  12. Simple Job Delegation in Condor-G [diagram: Condor-G submits through Globus GRAM to the batch system front-end, which runs the job on an execute machine]

  13. Expanding the Model • What can we do with new forms of job delegation? • Some ideas • Mirroring • Load-balancing • Glide-in schedd, startd • Multi-hop grid scheduling

  14. Mirroring • What it does • Jobs mirrored on two Condor-Gs • If primary Condor-G crashes, secondary one starts running jobs • On recovery, primary Condor-G gets job status from secondary one • Removes Condor-G submit point as single point of failure

  15. Mirroring Example [diagram: jobs mirrored on Condor-G 1 and Condor-G 2; Condor-G 1 has failed (marked X), so Condor-G 2 runs the jobs on the execute machine]

  16. Mirroring Example [diagram: after recovery, Condor-G 1 resynchronizes job status from Condor-G 2, which is still managing the jobs on the execute machine]

  17. Load-Balancing • What it does • Front-end Condor-G distributes all jobs among several back-end Condor-Gs • Front-end Condor-G keeps updated job status • Improves scalability • Maintains single submit point for users

  18. Load-Balancing Example [diagram: a Condor-G front-end distributing jobs across Condor-G back-ends 1, 2, and 3]

  19. Glide-In • Schedd and Startd are separate services that do not require any special privileges • Thus we can submit them as jobs (see the sketch below)! • Glide-In Schedd • What it does • Drop a Condor-G onto the front-end machine of a remote cluster • Delegate jobs to the cluster through the glide-in schedd • Can apply cluster-specific policies to jobs • Not fork-and-forget… • Send a manager to the site, instead of managing across the Internet
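  Roughly what submitting a glide-in as a job might look like, as a minimal sketch only: it uses standard grid-universe submit attributes, while the wrapper script, tarball, and configuration file named below are hypothetical placeholders. A real glide-in also needs configuration telling the glided-in daemon which pool to report back to.

      # Hypothetical glide-in sketch: submit a wrapper that unpacks Condor
      # and starts a condor_startd on the remote cluster.
      # glidein_startup.sh, condor_binaries.tar.gz and glidein_condor_config
      # are placeholder names, not files shipped by Condor.
      universe             = grid
      grid_type            = gt2
      globusscheduler      = cluster1.cs.wisc.edu/jobmanager-lsf
      executable           = glidein_startup.sh
      transfer_input_files = condor_binaries.tar.gz, glidein_condor_config
      output               = glidein.out
      error                = glidein.err
      log                  = glidein.log
      queue

  Once the glided-in startd (or schedd) reports back to the submitter's pool, jobs can be delegated to it like any other resource.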

  20. Glide-In Schedd Example [diagram: Condor-G delegates jobs through the middleware to a glide-in Schedd running on the cluster front-end, which hands them to the local batch system]

  21. Glide-In Startd Example [diagram: Condor-G (Schedd) submits through the middleware to the cluster front-end's batch system, which starts a glide-in Startd that then runs the job under the Schedd's control]

  22. Glide-In Startd • Why? • Restores all the benefits that may have been washed away by the middleware • End-to-end management solution • Preserves job semantic guarantees • Preserves policy • Enables lazy planning

  23. Sample Job Submit file
      universe = grid
      grid_type = gt2
      globusscheduler = cluster1.cs.wisc.edu/jobmanager-lsf
      executable = find_particle
      arguments = ….
      output = ….
      log = …
  But we want metascheduling…

  24. Represent grid clusters as ClassAds • ClassAds • are a set of uniquely named expressions; each expression is called an attribute and is an attribute name/value pair • combine query and data • extensible • semi-structured: no fixed schema (flexibility in an environment consisting of distributed administrative domains) • Designed for “MatchMaking”

  25. Example of a ClassAd that could represent a compute cluster in a grid:
      Type = "GridSite";
      Name = "FermiComputeCluster";
      Arch = "Intel-Linux";
      Gatekeeper_url = "globus.fnal.gov/lsf";
      Load = [ QueuedJobs = 42; RunningJobs = 200; ];
      Requirements = ( other.Type == "Job" && Load.QueuedJobs < 100 );
      GoodPeople = { "howard", "harry" };
      Rank = member(other.Owner, GoodPeople) * 500;

  26. Another Sample - Job Submit
      universe = grid
      grid_type = gt2
      owner = howard
      executable = find_particle.$$(Arch)
      requirements = other.Arch == "Intel-Linux" || other.Arch == "Sparc-Solaris"
      rank = 0 - other.Load.QueuedJobs
      globusscheduler = $$(gatekeeper_url)
      …
  Note: We introduced augmentation of the job ClassAd based upon information discovered in its matching resource ClassAd.

  27. Multi-Hop Grid Scheduling • Match a job to a Virtual Organization (VO), then to a resource within that VO • Easier to schedule jobs across multiple VOs and grids

  28. Multi-Hop Grid Scheduling Example [diagram: a job from an experiment (HEP, CMS) goes to an experiment Condor-G / resource broker, which matches it to a VO Condor-G / resource broker, which in turn submits through Globus GRAM to a batch scheduler]

  29. Endless Possibilities • These new models can be combined with each other or with other new models • Resulting system can be arbitrarily sophisticated

  30. Job Delegation Challenges • New complexity introduces new issues and exacerbates existing ones • A few… • Transparency • Representation • Scheduling Control • Active Job Control • Revocation • Error Handling and Debugging

  31. Transparency • Full information about job should be available to user • Information from full delegation path • No manual tracing across multiple machines • Users need to know what’s happening with their jobs

  32. Representation • Job state is a vector • How best to show this to the user? • Summary • Current delegation endpoint • Job state at endpoint • Full information available if desired • Series of nested ClassAds? (see the sketch below)
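  One way the “series of nested ClassAds” idea could look, as a purely illustrative sketch: the attribute names below are invented for this example and are not real Condor attributes. The summary lives at the top level, and each delegation hop nests the ad describing the next hop.

      [
        // illustrative only: hypothetical attribute names
        Type        = "DelegationRecord";
        Host        = "condorg.example.edu";
        JobState    = "Delegated";
        DelegatedTo =
          [
            Type     = "DelegationRecord";
            Host     = "cluster1.cs.wisc.edu";
            JobState = "Running";
          ];
      ]

  A user-facing tool could show only the top-level summary by default and expand the nested ads on request.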

  33. Scheduling Control • Avoid loops in delegation path • Give user control of scheduling • Allow limiting of delegation path length? • Allow user to specify part or all of delegation path

  34. Active Job Control • User may request certain actions • hold, suspend, vacate, checkpoint • Actions cannot be completed synchronously for user • Must forward along delegation path • User checks completion later

  35. Active Job Control (cont) • Endpoint systems may not support actions • If possible, execute them at furthest point that does support them • Allow user to apply action in middle of delegation path

  36. Revocation • Leases • Lease must be renewed periodically for delegation to remain valid • Allows revocation during long-term failures • What are good values for lease lifetime and update interval?
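  A hedged sketch of what lease information attached to a delegation might look like; the attribute names and the example values (a six-hour lifetime refreshed hourly) are assumptions chosen for illustration, not real Condor attributes or recommended settings.

      [
        // illustrative only: hypothetical lease attributes
        Type = "DelegationLease";
        // the delegation stays valid this many seconds unless renewed
        LeaseLifetime = 6 * 3600;
        // the upstream party renews well before expiration
        LeaseUpdateInterval = 3600;
      ]

  If renewals stop arriving, for example because the upstream Condor-G is unreachable for longer than the lease lifetime, the downstream system can safely revoke the delegation.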

  37. Error Handling and Debugging • Many more places for things to go horribly wrong • Need clear, simple error semantics • Logs, logs, logs • Have them everywhere

  38. From earlier • Transfer of responsibility to schedule and execute a job • Transfer policy “instructions” • Stage in executable and data files • Securely transfer (and refresh?) credentials, obtain local identities • Monitor and present job progress (transparency!) • Return results

  39. Job Failure Policy Expressions • Condor/Condor-G augmented so users can supply job failure policy expressions in the submit file. • Can be used to describe a successful run, or what to do in the face of failure.
      on_exit_remove = <expression>
      on_exit_hold = <expression>
      periodic_remove = <expression>
      periodic_hold = <expression>

  40. Job Failure Policy Examples
      • Do not remove from the queue (i.e. reschedule) if the job exits with a signal:
        on_exit_remove = ExitBySignal == False
      • Place on hold if the job exits with a nonzero status or ran for less than an hour:
        on_exit_hold = ((ExitBySignal == False) && (ExitCode != 0)) || ((ServerStartTime - JobStartDate) < 3600)
      • Place on hold if the job has spent more than 50% of its time suspended:
        periodic_hold = CumulativeSuspensionTime > (RemoteWallClockTime / 2.0)

  41. Data Placement (DaP) must be an integral part of the end-to-end solution: space management and data transfer.

  42. Stork • A scheduler for data placement activities in the Grid • What Condor does for computational jobs, Stork does for data placement • Stork comes with a new concept: “Make data placement a first class citizen in the Grid.”

  43. Data Placement Jobs vs. Computational Jobs [diagram: the computational view “stage-in, execute the job, stage-out” is expanded into explicit data placement jobs: allocate space for input & output data, stage-in, execute the job, stage-out, release input space, release output space]

  44. DAG with DaP [diagram: DAGMan reads a DAG specification mixing data placement and computational jobs (e.g. “DaP A A.submit”, “DaP B B.submit”, “Job C C.submit”, …, “Parent A child B”, “Parent B child C”, “Parent C child D, E”, …) and dispatches DaP jobs to the Stork job queue and computational jobs to the Condor job queue]

  45. Why Stork? • Stork understands the characteristics and semantics of data placement jobs. • Can make smart scheduling decisions, for reliable and efficient data placement.

  46. Failure Recovery and Efficient Resource Utilization • Fault tolerance • Just submit a bunch of data placement jobs, and then go away… • Control the number of concurrent transfers from/to any storage system • Prevents overloading • Space allocation and de-allocation • Make sure space is available

  47. Support for Heterogeneity Protocol translation using Stork memory buffer.

  48. Support for Heterogeneity Protocol translation using Stork Disk Cache.

  49. Flexible Job Representation and Multilevel Policy Support
      [
        Type = "Transfer";
        Src_Url = "srb://ghidorac.sdsc.edu/kosart.condor/x.dat";
        Dest_Url = "nest://turkey.cs.wisc.edu/kosart/x.dat";
        ……
        Max_Retry = 10;
        Restart_in = "2 hours";
      ]

  50. Run-time Adaptation • Dynamic protocol selection
      [
        dap_type = "transfer";
        src_url = "drouter://slic04.sdsc.edu/tmp/test.dat";
        dest_url = "drouter://quest2.ncsa.uiuc.edu/tmp/test.dat";
        alt_protocols = "nest-nest, gsiftp-gsiftp";
      ]
      [
        dap_type = "transfer";
        src_url = "any://slic04.sdsc.edu/tmp/test.dat";
        dest_url = "any://quest2.ncsa.uiuc.edu/tmp/test.dat";
      ]
