Opportune Job Shredding: An Efficient Approach for Scheduling Parameter Sweep Applications

Rohan Kurian, Pavan Balaji, P. Sadayappan

The Ohio State University


Parameter Sweep Applications

  • An important class of applications

    • Set of independent tasks

    • MCell Application

      • 3D simulations for sub-cellular architecture/physiology

    • GTOMO (Parallel Tomography) Application

      • Multiple view-point simulation

  • Systems exist for scheduling on the Grid

  • Cluster-based Scheduling?


Application Level Schedulers

  • Manage the scheduling of applications

    • Break the application into appropriately sized chunks

    • APST (AppLeS Parameter Sweep Template)

    • NIMROD

  • Greedy approach to schedule PSA chunks


Presentation Roadmap

  • Job Scheduling in Clusters

  • Multi-Site Job Scheduling

  • PSA Scheduling Strategies

  • Multi-Site Scheduling of PSAs

  • Performance Evaluation

  • Conclusions


Job Scheduling in Clusters

  • Mapping arriving jobs to available resources

  • Multiple Schemes for Scheduling

    • First Come First Serve (FCFS)

    • Conservative Scheduling

    • Aggressive or EASY Scheduling

  • Fair-Share Constraints

    • A user cannot have more than ‘N’ queued jobs

  • Submitting the multiple chunks of a PSA job

    • Violation of Fair-Share constraints

    • Combine chunks to form a single parallel job
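
The aggressive (EASY) scheme listed above can be illustrated with a minimal sketch. It assumes the usual textbook formulation: only the job at the head of the queue holds a reservation, and a later job may jump ahead if it fits in the idle processors and does not delay that reservation. The `Job` class, the `easy_backfill` function, and its arguments are hypothetical names chosen for illustration, and user runtime estimates are treated as exact.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int      # processors requested
    runtime: int    # user-estimated runtime (assumed exact here)


def easy_backfill(now, free, running, queue):
    """Return the names of queued jobs that may start at time `now`.

    running: list of (end_time, procs) for jobs currently executing.
    queue:   waiting jobs in arrival order.
    """
    started = []
    queue = list(queue)
    # Plain FCFS: start jobs from the head of the queue while they fit.
    while queue and queue[0].procs <= free:
        job = queue.pop(0)
        free -= job.procs
        running = running + [(now + job.runtime, job.procs)]
        started.append(job.name)
    if not queue:
        return started

    # The head job is blocked. Find its "shadow time" (when enough
    # processors will be free) and the processors left over at that time.
    head = queue[0]
    avail, shadow, extra = free, None, 0
    for end, procs in sorted(running):
        avail += procs
        if avail >= head.procs:
            shadow, extra = end, avail - head.procs
            break
    if shadow is None:
        return started  # head job is larger than the machine in this model

    # Backfill: a later job may start now if it fits in the idle processors
    # and either ends before the shadow time or uses only leftover processors.
    for job in queue[1:]:
        if job.procs > free:
            continue
        ends_in_time = now + job.runtime <= shadow
        if ends_in_time or job.procs <= extra:
            free -= job.procs
            if not ends_in_time:
                extra -= job.procs
            started.append(job.name)
    return started
```

For example, with 4 idle processors, one running job holding 6 processors until time 10, and a queue of `A` (8 processors), `B` (3 processors, runtime 5), `C` (2 processors, runtime 20), and `D` (1 processor, runtime 20), the sketch starts `B` and `D` while `A` keeps its reservation at time 10.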


Formation of PSAs in Clusters

[Figure: Small Independent Tasks combined into a Parallel Parameter Sweep Application.]
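
As a rough illustration of combining independent tasks into one parallel job, the sketch below computes the processor request and runtime of the combined job. It assumes every task is sequential and takes the same time, which the presentation does not state; `bundle_tasks` and its parameters are hypothetical names.

```python
import math


def bundle_tasks(num_tasks, task_runtime, procs):
    """Describe a single parallel job that runs `num_tasks` independent,
    equal-length sequential tasks on `procs` processors (assumed model)."""
    # Tasks run in rounds of at most `procs` tasks each.
    rounds = math.ceil(num_tasks / procs)
    return {"procs": procs, "runtime": rounds * task_runtime}
```

Under these assumptions, 10 tasks of 5 time units each on 4 processors become one 4-processor job with a runtime of 15 time units.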



Multi-Site Job Scheduling

  • Multiple Simultaneous Requests

    • Job submitted to multiple sites

    • Started on the earliest cluster

    • Existing schemes have limitations

      • Heterogeneous Clusters

      • Different Scheduling Schemes
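
A small sketch can show why starting on the earliest cluster is limiting when clusters are heterogeneous. It compares choosing the site with the earliest start against choosing the site with the earliest completion. The function names, the `start`/`speed` dictionaries, and the idea of scaling runtime as `work / speed` are assumptions made for illustration only.

```python
def earliest_start_site(start):
    """Pick the site whose local scheduler offers the earliest start time."""
    return min(start, key=start.get)


def earliest_finish_site(start, speed, work):
    """Pick the site with the earliest estimated completion, assuming the
    job's runtime on a site is work / speed[site] (hypothetical model)."""
    return min(start, key=lambda site: start[site] + work / speed[site])


start = {"site1": 10, "site2": 20}     # offered start times
speed = {"site1": 1.0, "site2": 3.0}   # relative processing speeds
print(earliest_start_site(start))              # site1 starts first
print(earliest_finish_site(start, speed, 60))  # site2 finishes first
```

Here site1 starts the job earlier but would finish at time 70, while the faster site2 would finish at time 40, so the two selection rules disagree.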


Multiple Simultaneous Requests

[Figure: Multiple-simultaneous-requests. Site 1, Site 2, and Site 3, each shown with incoming Jobs, a Meta Scheduler, and a Local Scheduler.]


PSA Scheduling Strategies

  • Flooding based Job Shredding

    • Submit all chunks in the PSA at once

    • Greedy approach

    • Improves User and System metrics

    • Doesn’t ensure fairness to Non-PSA jobs

  • Opportune Job Shredding

    • Uses an additional Application-Level Scheduler

      • Monitors the current schedule of the system

    • If no normal backfill is possible

      • Allow PSA jobs to shred and backfill
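
The opportune rule above can be sketched as a single decision made by the application-level scheduler each time it inspects the schedule. This is only an illustration under simplifying assumptions: each PSA task is sequential and takes `task_time`, the idle "hole" is described by a processor count and a time window, and the backfill test ignores finer points of the local scheduler. The function name and all parameters are hypothetical.

```python
def opportune_shred(free, window, waiting, pending_tasks, task_time):
    """Return how many PSA tasks to release into the current hole.

    free:          idle processors right now
    window:        time until a reservation needs those processors
    waiting:       list of (procs, runtime) for queued non-PSA jobs
    pending_tasks: PSA tasks not yet submitted
    task_time:     runtime of one sequential PSA task (assumed uniform)
    """
    # If any normal job could backfill into the hole, leave it alone so
    # that non-PSA jobs keep their backfilling opportunities.
    for procs, runtime in waiting:
        if procs <= free and runtime <= window:
            return 0
    # Shred only if a task finishes before the hole closes.
    if task_time > window:
        return 0
    return min(free, pending_tasks)
```

With 4 idle processors and a window of 10 time units, a queue holding only an 8-processor job and a 30-unit job offers no normal backfill, so the sketch releases 4 PSA tasks; if a 2-processor, 5-unit job were waiting, it would release none.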



Multi-Site Scheduling for PSAs

  • Two-level Application Level Schedulers

  • No constraints on sites

    • Allowed to have different speeds

    • Allowed to have different scheduling policies

  • Similar to “Multiple Simultaneous Requests”

    • Simultaneous requests only for PSAs


Multi-Site Scheduling for PSAs

[Figure: A Meta Application-Level Scheduler and three sites (Site 1, Site 2, Site 3), each with its own App-Level Scheduler, Job Queue, and Local Scheduler.]



Performance Metrics

  • Response Time

    • Completion Time – Submit Time

  • Slowdown

    • Response Time / Runtime

  • Loss of Capacity (LOC)

    • Lost processors = min {processors requested by waiting jobs, idle processors}

    • T = Time for which this state lasts

    • LOC = Lost processors x T


Evaluation Scheme

  • Simulation based Approach

  • CTC trace from Feitelson’s archive

  • EASY backfilling used

  • For multi-site evaluation

    • CTC traces from 3 different months

    • Processing speeds in the ratio 2:1:3


Flooding Based Job Shredding

  • Up to 60% improvement for PSA Jobs

  • Up to 90% worse performance for Non-PSA Jobs


Flooding: Job Category-wise Breakup

  • Narrow Short Non-PSA jobs suffer most

  • Loss of back-filling opportunities is the main reason


Flooding: Loss of Capacity

  • Up to 75% improvement in the Loss of Capacity


Opportune Job Shredding

  • Up to 70% improvement for PSA Jobs

  • Less than 2% worsening in performance for Non-PSA Jobs


Opportune: Job Category-wise Breakup

  • No category of Non-PSA jobs suffers more than 7%


Opportune: Loss of Capacity

  • Up to 12% improvement in the Loss of Capacity


Opportune (Multi-Site)

  • Up to 95% improvement for PSA Jobs

  • No significant loss of performance for Non-PSA jobs


Opportune (Multi-Site): Response Time

  • Up to 75% improvement for PSA Jobs

  • No significant loss of performance for Non-PSA jobs


Opportune (Multi-Site): Slowdown

  • Up to 95% improvement for PSA Jobs

  • No significant loss of performance for Non-PSA jobs


Opportune (Multi-Site): Loss of Capacity

  • Up to 45% improvement in the Loss of Capacity


Concluding Remarks

  • Opportune Job Shredding

    • Efficient Scheduling of PSAs

    • Single Site and Multi-Site versions

    • Significant improvement for PSA jobs

    • Ensures that Non-PSA jobs are not affected

  • Plan to integrate this with production schedulers


Thank You!

