
Job Scheduling

Job Scheduling. P. (Saday) Sadayappan, Ohio State University. Problem Statement. Given a stream of parallel jobs and a set of computing resources, determine when and where to execute each job. In the form in which the job scheduling problem is addressed at most supercomputer centers:


Presentation Transcript


  1. Job Scheduling P. (Saday) Sadayappan Ohio State University

  2. Problem Statement
     • Given a stream of parallel jobs and a set of computing resources, determine when and where to execute each job
     • In the form in which the job scheduling problem is addressed at most supercomputer centers:
       • Homogeneous set of processors
       • Each job asks for a specific, fixed number of processors

  3. Job Scheduling Today
     • Earliest job schedulers (Intel iPSC) used a simple FCFS strategy; low utilization (50%)
     • Back-filling was implemented at Argonne
       • Give an earliest-possible reservation to the job at the head of the queue, but allow a later-arriving job to bypass it if the reservation is not violated
       • Utilization improves to ~90%
       • Used at most production facilities today
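
The back-filling rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the Argonne implementation: the `Job` record, the `schedule` function, and the assumption that each job's run time is known exactly in advance are all hypothetical simplifications. The head of the queue gets a reservation at the earliest time enough processors free up, and a later job may start now only if it finishes before that reservation or uses only processors the head job will not need.

```python
from dataclasses import dataclass

@dataclass
class Job:              # hypothetical record, not from the slides
    name: str
    submit: float       # arrival time
    procs: int          # fixed number of processors requested
    run: float          # run time, assumed known exactly (> 0)

def schedule(jobs, total_procs, backfill=True):
    """Return {job name: start time}; assumes every job fits on the machine."""
    pending = sorted(jobs, key=lambda j: j.submit)   # FCFS arrival order
    queue, running, starts, now = [], [], {}, 0.0    # running: (end, procs)
    while pending or queue:
        while pending and pending[0].submit <= now:  # admit arrived jobs
            queue.append(pending.pop(0))
        running = [(e, p) for e, p in running if e > now]
        free = total_procs - sum(p for _, p in running)

        def start(job):
            nonlocal free
            starts[job.name] = now
            running.append((now + job.run, job.procs))
            free -= job.procs

        while queue and queue[0].procs <= free:      # plain FCFS starts
            start(queue.pop(0))
        if queue and backfill:
            head, avail = queue[0], free
            for end, p in sorted(running):           # find head's reservation
                avail += p
                if avail >= head.procs:
                    shadow, extra = end, avail - head.procs
                    break
            for job in list(queue[1:]):              # later arrivals
                ends_in_time = now + job.run <= shadow
                if job.procs <= free and (ends_in_time or job.procs <= extra):
                    queue.remove(job)
                    start(job)
                    if not ends_in_time:             # occupies spare procs
                        extra -= job.procs
        if not pending and not queue:
            break
        events = [e for e, _ in running]             # next completion/arrival
        if pending:
            events.append(pending[0].submit)
        now = min(events)
    return starts
```

As a usage example on a hypothetical 4-processor machine, take jobs A (2 procs, 10 h), B (4 procs, 5 h), C (2 procs, 8 h) and D (2 procs, 12 h), all submitted at time 0. With `backfill=False`, C waits behind B until hour 15; with `backfill=True`, C starts at hour 0 alongside A, while B still starts at hour 10, so its reservation is not violated. D is not back-filled because it would run past the reservation.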

  4. Can Performance be Improved?
     • Metrics:
       • System Metric: Utilization
       • User Metrics: Response time (wait time + run time), Slowdown (response time / run time)
     • Over a hundred papers published:
       • Focus mainly on improving user metrics, which have much greater potential for improvement than utilization
     • Question: How important is it to squeeze an additional 5-10% utilization out of a system that is already achieving over 85% utilization?
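
The metrics above can be written out directly. The sketch below is illustrative only; the `FinishedJob` record and the choice to measure utilization from the first submission to the last completion are assumptions, not definitions taken from the slides.

```python
from dataclasses import dataclass

@dataclass
class FinishedJob:      # hypothetical record of a completed job
    submit: float       # submission time (hours)
    start: float        # time the job began running (hours)
    run: float          # run time (hours)
    procs: int          # processors used

def response_time(job):
    # Response time = wait time + run time
    return (job.start - job.submit) + job.run

def slowdown(job):
    # Slowdown = response time / run time (1.0 means no waiting)
    return response_time(job) / job.run

def utilization(jobs, total_procs):
    # Fraction of processor-hours actually used over the observed span
    first = min(j.submit for j in jobs)
    last = max(j.start + j.run for j in jobs)
    busy = sum(j.procs * j.run for j in jobs)
    return busy / (total_procs * (last - first))
```

For instance, a 24-hour job that waits 2 hours has a response time of 26 hours and a slowdown of 26/24, while a 1-hour job with the same 2-hour wait has a slowdown of 3. This is why slowdown is sensitive to how short jobs are treated.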

  5. Improving Response Time
     • Question: How important is it to evaluate alternatives to standard back-fill scheduling, with the goal of improving user response time?
     • Many simulation studies have reported significant improvements in slowdown or response time with new schemes, but most production schedulers simply use aggressive back-fill. Why?

  6. Possible Reasons for Non-Adoption
     • Academic studies do not model specific policy issues of a center, e.g. “good citizen rules,” multiple queues, etc.
     • Most results are based on job log traces at Feitelson’s archive, with many logs from academic centers exhibiting low system utilization (< 70%).
     • Most studies report overall averages over the entire trace, which is insufficient to assess the impact of a change:
       • E.g., using a Shortest-Job-First queue policy instead of the usual FCFS policy significantly improves overall average slowdown, by a factor of 4, but increases response time for 24-hour jobs to 50 hours instead of 26 hours.
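
One way to avoid the overall-average problem is to report metrics per job class. The sketch below groups jobs by run time before averaging; the bucket edges (1, 8 and 24 hours) and the function names are hypothetical choices for illustration, not categories used in the talk.

```python
from collections import defaultdict
from statistics import mean

def bucket(run_hours, edges=(1, 8, 24)):
    # Label a job by the first run-time edge it fits under (assumed edges)
    for edge in edges:
        if run_hours <= edge:
            return f"<= {edge}h"
    return f"> {edges[-1]}h"

def summarize(jobs):
    """jobs: iterable of (wait_hours, run_hours) pairs.

    Returns {bucket label: (mean response time, mean slowdown)}.
    """
    groups = defaultdict(list)
    for wait, run in jobs:
        response = wait + run
        groups[bucket(run)].append((response, response / run))
    return {
        label: (mean(r for r, _ in rows), mean(s for _, s in rows))
        for label, rows in groups.items()
    }
```

Comparing two policies bucket by bucket, instead of with a single trace-wide mean, makes a regression for long jobs visible even when the many short jobs dominate the overall average.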

  7. QoS for Job Scheduling
     • Job schedulers do not provide QoS:
       • No response time guarantees
       • No equitable way of offering different service for urgent versus non-urgent jobs
     • Technical and accounting issues:
       • Develop job schedulers that can do deadline-based scheduling
       • Develop accounting models to charge based on urgency of job:
         • Charge = f1(resource-usage) + f2(wait-time-limit)
     • Question: How desirable is it to develop job schedulers with QoS functionality?
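
The charging formula leaves f1 and f2 unspecified. As a purely illustrative sketch, the code below assumes f1 is linear in processor-hours and f2 is inversely proportional to the requested wait-time limit, so tighter limits cost more. Both function shapes and the `rate` and `urgency_fee` parameters are hypothetical; the slides do not define them.

```python
def f1(proc_hours, rate=1.0):
    # Assumed form: flat price per processor-hour consumed
    return rate * proc_hours

def f2(wait_limit_hours, urgency_fee=100.0):
    # Assumed form: premium grows as the requested wait limit shrinks
    if wait_limit_hours <= 0:
        raise ValueError("wait-time limit must be positive")
    return urgency_fee / wait_limit_hours

def charge(procs, run_hours, wait_limit_hours):
    # Charge = f1(resource-usage) + f2(wait-time-limit)
    return f1(procs * run_hours) + f2(wait_limit_hours)
```

Under these assumed forms, a 16-processor, 2-hour job that must start within 1 hour is charged 132 units, while the same job with a 24-hour limit is charged a little over 36, which gives users a reason to declare urgency honestly.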

  8. Questions?
