Exploring issues with workflow scheduling on the grid
Exploring Issues with Workflow Scheduling on the Grid

Rizos Sakellariou

University of Manchester, UK

with thanks to:

Henan Zhao and Ewa Deelman for providing slides!

also: Viktor Yarmolenko, Wei Zheng, …

and Anastasios Gounaris for presenting it!


Workflow applications are widely considered a common use case of Grids

LIGO (Pegasus team, ISI): large-scale

myGrid, Manchester: small size


Modelling the problem…

  • A workflow is a Directed Acyclic Graph (DAG)

  • Scheduling DAGs onto resources is well studied in the context of homogeneous systems – less so, in the context of heterogeneous systems (mostly without taking into account any uncertainty).

  • Needless to say, this is an NP-complete problem.

  • Are workflows really any type of DAGs or a special type of DAGs? We don’t really know… (some workflows are clearly not DAGs – only DAGs considered here…)


DAG scheduling

  • An order by which tasks will be executed needs to be established (e.g., red, yellow, or blue first?)

  • Resources need to be chosen for each task (some resources are fast, some are not so fast!)

  • The cost of moving data between resources should not outweigh the benefits of parallelism.


Does the order matter?

[Figure: example DAG with ten tasks, numbered 0 to 9]

  • If task 6 takes comparatively longer to run, we’d like to execute task 2 just after task 0 finishes (perhaps before tasks 1, 3, 4, 5).

Follow the critical path! This is not really new! 


Our methodology…

  • Revisit the DAG scheduling problem for heterogeneous systems…

  • Start with simple static scenarios…

    • Even this problem is not well understood, despite the fact that there have been more than 30 heuristics published… (check the proceedings of the Heterogeneous Computing Workshop for a start…)

  • Try to build on existing knowledge, as we obtain a good understanding of each step!


Outline of Part I

  • Static DAG scheduling onto heterogeneous systems (i.e., we know computation & communication a priori)

  • Introduce uncertainty in computation times.

  • Handle multiple DAGs at the same time.

[1] Rizos Sakellariou, Henan Zhao. A Hybrid Heuristic for DAG Scheduling on Heterogeneous Systems. Proceedings of the 13th IEEE Heterogeneous Computing Workshop (HCW’04) (in conjunction with IPDPS 2004), Santa Fe, April 2004, IEEE Computer Society Press, 2004.

[2] Rizos Sakellariou, Henan Zhao. A low-cost rescheduling policy for efficient mapping of workflows on grid systems. Scientific Programming, 12(4), December 2004, pp. 253-262.

[3] Henan Zhao, Rizos Sakellariou. Scheduling Multiple DAGs onto Heterogeneous Systems. Proceedings of the 15th Heterogeneous Computing Workshop (HCW'06) (in conjunction with IPDPS 2006), Rhodes, Apr. 2006, IEEE Computer Society Press.


The starting point for a model… A DAG, 10 tasks, 3 machines (assume we know execution times, communication costs)

[Figure: DAG with tasks 0 to 9; the edge labels give communication costs (values shown: 18, 12, 9, 11, 14, 1000, 15, 19, 16, 27, 23, 23, 11, 17, 13)]

Execution time of each task on each machine:

Task   M1   M2   M3
0      37   39   27
1      30   20   24
2      21   21   28
3      35   38   31
4      27   24   29
5      29   37   20
6      22   24   30
7      37   26   37
8      35   31   26
9      33   37   21


A simple idea…

Assign nodes to the fastest machine!

[Figure: the resulting schedule of tasks 0 to 9]

Communication between nodes 4 and 8 takes way too long!!!

Heuristics that take into account the whole structure of the DAG are needed…

Makespan is > 1000!
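A minimal sketch of the "fastest machine" idea: for every task, pick the machine with the smallest execution time, using the execution times given for the 10-task, 3-machine model. The names `EXEC_TIME`, `MACHINES` and `fastest_machine` are illustrative and do not come from the slides.

```python
# Execution times of tasks 0-9 on machines M1, M2, M3, as given in the model.
EXEC_TIME = {
    0: (37, 39, 27), 1: (30, 20, 24), 2: (21, 21, 28), 3: (35, 38, 31),
    4: (27, 24, 29), 5: (29, 37, 20), 6: (22, 24, 30), 7: (37, 26, 37),
    8: (35, 31, 26), 9: (33, 37, 21),
}
MACHINES = ("M1", "M2", "M3")


def fastest_machine(task):
    """Return the machine with the smallest execution time (ties go to the first)."""
    times = EXEC_TIME[task]
    return MACHINES[times.index(min(times))]


# Greedy assignment that looks only at execution times, never at the DAG edges.
assignment = {task: fastest_machine(task) for task in EXEC_TIME}
print(assignment)
```

Under this assignment tasks 4 and 8 end up on different machines (M2 and M3), so the expensive transfer between them cannot be avoided. The greedy choice never looks at communication costs.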


Still, if we consider the whole DAG…

HEFT – a minor change leads to different schedules (~15%):

[Figure: two Gantt charts of tasks 0 to 9 on a time axis from 0 to 170; one schedule has makespan 143, the other makespan 164]

H. Zhao, R. Sakellariou. An experimental study of the rank function of HEFT. Proceedings of EuroPar’03.
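To make the sensitivity concrete, here is a rough sketch of the upward rank used by HEFT-style list scheduling: the rank of a task is its weight plus the largest (communication cost + rank) over its successors. The small DAG `SUCC` and its edge costs are hypothetical, since the edges of the slide's DAG are not reproduced here. The execution times are rows 0 to 3 of the model. Swapping the `weight` function (average versus best case) only illustrates the kind of minor change meant above and is not claimed to match the variants of the cited study.

```python
from statistics import mean

# Hypothetical DAG: SUCC[i][j] is the communication cost of edge i -> j.
SUCC = {0: {1: 18, 2: 12}, 1: {3: 10}, 2: {3: 10}, 3: {}}
# Execution times on M1, M2, M3 (rows 0-3 of the model).
EXEC_TIME = {0: (37, 39, 27), 1: (30, 20, 24), 2: (21, 21, 28), 3: (35, 38, 31)}


def upward_ranks(weight):
    """rank(i) = weight(i) + max over successors j of (cost(i, j) + rank(j))."""
    ranks = {}

    def rank(task):
        if task not in ranks:
            tail = max((cost + rank(s) for s, cost in SUCC[task].items()), default=0)
            ranks[task] = weight(EXEC_TIME[task]) + tail
        return ranks[task]

    for task in SUCC:
        rank(task)
    return ranks


def priority_order(weight):
    """Tasks sorted by decreasing upward rank."""
    ranks = upward_ranks(weight)
    return sorted(ranks, key=ranks.get, reverse=True)


print(priority_order(mean))  # task weight = average over machines
print(priority_order(min))   # task weight = best-case execution time
```

With these made-up inputs the two weightings order tasks 1 and 2 differently, which is already enough to change the schedule that a list scheduler builds from the ranking.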


Hmm…

  • This was a rather well defined problem…

  • This was just a small change in the algorithm…

  • Yet, with big variations in the outcome.

  • What about different heuristics?

  • What about more generic problems?


DAG scheduling: A Hybrid Heuristic

  • Trying to find out why there were such differences in the outcome of HEFT… we observed problems with the order… To address those problems we came up with a Hybrid Heuristic… It worked quite well!

  • Phases:

    • Rank (list scheduling)

    • Create groups of independent tasks

    • Schedule independent tasks

      • Can be carried out using any scheduling algorithm for independent tasks, e.g. MinMin, MaxMin, …

      • A novel heuristic (Balanced Minimum Completion Time)

R. Sakellariou, H. Zhao. A Hybrid Heuristic for DAG Scheduling on Heterogeneous Systems. Proceedings of the IEEE Heterogeneous Computing Workshop (HCW’04), 2004.
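The three phases can be sketched roughly as follows, under several assumptions. The ranked list is taken as given. Groups are formed by walking that list and starting a new group whenever the next task depends on a task in the current group. Each group is then scheduled with Min-Min, one of the independent-task algorithms named above. The Balanced Minimum Completion Time heuristic is not reproduced, communication costs are ignored for brevity, and the 4-task DAG `PRED` is hypothetical.

```python
# Hypothetical 4-task DAG: PRED[t] is the set of direct predecessors of task t.
PRED = {0: set(), 1: {0}, 2: {0}, 3: {1, 2}}
# Execution times on three machines (rows 0-3 of the 10-task model).
EXEC_TIME = {0: (37, 39, 27), 1: (30, 20, 24), 2: (21, 21, 28), 3: (35, 38, 31)}


def group_independent(ranked):
    """Split a rank-ordered task list into consecutive groups of independent tasks."""
    groups = []
    for task in ranked:
        # In a rank-ordered (topological) list it is enough to check direct predecessors.
        if groups and not (PRED[task] & set(groups[-1])):
            groups[-1].append(task)
        else:
            groups.append([task])
    return groups


def min_min(group, ready, finish):
    """Min-Min: repeatedly place the task with the smallest achievable finish time."""
    pending = set(group)
    while pending:
        best = {}
        for task in pending:
            # A task cannot start before all of its predecessors have finished.
            data_ready = max((finish[p] for p in PRED[task]), default=0)
            best[task] = min(
                (max(ready[m], data_ready) + EXEC_TIME[task][m], m)
                for m in range(len(ready))
            )
        chosen = min(pending, key=best.get)
        finish[chosen], machine = best[chosen]
        ready[machine] = finish[chosen]  # the machine is busy until then
        pending.remove(chosen)


ready, finish = [0, 0, 0], {}
groups = group_independent([0, 1, 2, 3])  # assumed rank order
for group in groups:
    min_min(group, ready, finish)
print(groups, finish)
```

Because every group contains only independent tasks, any heuristic for independent tasks can be dropped in at the place where `min_min` is called.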


Hmm…

  • Yes, but, so far, you have used static task execution times… in practice such times are difficult to specify exactly…

  • There is an answer for run-time deviations: adjust at run-time…

  • But:

    don’t we need to understand the static case first?


Characterise the Schedule

  • Spare time indicates the maximum time that a node, i, may delay without affecting the start time of an immediate successor, j.

  • Slack indicates the maximum time that a node, i, may delay without affecting the overall makespan.

  • The idea: keep track of the values of the slack and/or the spare time and reschedule only when the delay exceeds slack… (selective rescheduling)

R. Sakellariou, H. Zhao. A low-cost rescheduling policy for efficient mapping of workflows on grid systems. Scientific Programming, 12(4), December 2004, pp. 253-262.


Example

FT(4)=32.5, DAT(4,7)=40.5, ST(7)=45.5 → Spare_Time(4) = ST(7) − DAT(4,7) = 5

Slack(8)=0;

Slack(7)=Slack(8)+Spare_Time(7)=0;

Slack(5)=Slack(8)+Spare_Time(5)=6
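The example can be reproduced with a few lines of code. This is only a sketch: it reads FT, DAT and ST as finish time, data arrival time and start time, stores spare time per edge (i, j), and takes the slack of a node as the minimum over its successors of Slack(j) + Spare_Time(i, j). The edge values for (7, 8) and (5, 8) are the ones implied by the example. The function names are illustrative.

```python
def spare_time(start_of_successor, data_arrival):
    """Time a node may slip without delaying the start of this successor."""
    return start_of_successor - data_arrival


# Spare time per edge (i, j). Only the edges used in the example are listed;
# (7, 8) and (5, 8) are implied by Slack(7) = 0 and Slack(5) = 6 with Slack(8) = 0.
SPARE = {(4, 7): spare_time(45.5, 40.5), (7, 8): 0.0, (5, 8): 6.0}
SLACK = {8: 0.0}  # given in the example


def slack(task):
    """Slack(i) = min over successors j of Slack(j) + Spare_Time(i, j)."""
    if task not in SLACK:
        SLACK[task] = min(slack(j) + s for (i, j), s in SPARE.items() if i == task)
    return SLACK[task]


def needs_rescheduling(task, delay):
    """Selective rescheduling: react only when the delay exceeds the slack."""
    return delay > slack(task)


print(SPARE[(4, 7)], slack(7), slack(5))
print(needs_rescheduling(5, 4.0), needs_rescheduling(7, 1.0))
```

A delay of 4 time units on node 5 is absorbed by its slack of 6, whereas any delay on node 7 (slack 0) would trigger rescheduling.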


Lessons Learned… (simulation and deviations of up to 100%)

  • Heuristics that perform better statically also perform better under uncertainty.

  • By using the metrics on spare time, one can track the amount of deviation of the makespan from the static estimate. Then, we can minimise the number of times we reschedule, still achieving good results.


Moving on… to multiple DAGs

  • It is rather idealistic to assume that we have exclusive usage of resources…

  • In practice, we may have multiple DAGs competing for resources at the same time…

Henan Zhao, Rizos Sakellariou. Scheduling Multiple DAGs onto Heterogeneous Systems. Proceedings of the 15th Heterogeneous Computing Workshop (HCW'06) (in conjunction with IPDPS 2006), Rhodes, Apr. 2006, IEEE Computer Society Press.


Scheduling Multiple DAGs: Approaches

  • Approach 1: Schedule one DAG after the other with existing DAG scheduling algorithms

    • Low resource utilization & long overall makespan

  • Approach 2: Still one after the other, but do some backfilling and fill the gaps

    • Which DAG to schedule first? The one with longest makespan or the one with shortest makespan?

  • Approach 3: Alternate between DAGs (either round-robin or using some other form of priority).

    • Much better than Approaches 1 & 2.
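As a rough illustration of Approach 3 with round-robin alternation, the sketch below interleaves the priority lists of several DAGs, taking one task from each DAG in turn. The DAG names and task labels are hypothetical, and each per-DAG list is assumed to be in a valid execution order already.

```python
from itertools import zip_longest


def round_robin(task_lists):
    """Interleave per-DAG priority lists: one task from each DAG in turn."""
    merged = []
    for batch in zip_longest(*task_lists.values()):
        # Shorter DAGs run out first; zip_longest pads them with None.
        merged.extend(task for task in batch if task is not None)
    return merged


# Hypothetical DAGs A, B and C with their tasks in priority order.
order = round_robin({"A": ["A0", "A1", "A2"], "B": ["B0"], "C": ["C0", "C1"]})
print(order)
```

The relative order of tasks within each DAG is preserved, so precedence constraints inside a DAG are not violated by the interleaving. A priority-based variant would replace the fixed rotation with a rule for choosing the next DAG.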


But, is makespan optimisation a good objective when scheduling multiple DAGs?


Mission: Fairness

In multiple DAGs:

  • User’s perspective: “I want my DAG to complete execution as soon as possible”.

  • System perspective: “I would like to keep as many users as possible happy; I would like to increase resource utilisation (and income)”.

    Let’s be fair to users!

    (The system may want to take into account different levels of quality of service agreed with each user)
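The slides do not fix a definition of fairness. One possible way to quantify it, shown here purely as an illustrative assumption, is to compare the makespan of each DAG when it has the resources to itself with its makespan when it shares them, and to measure how unevenly that slowdown is spread over the DAGs. The function names and the numbers are hypothetical.

```python
def slowdown(makespan_alone, makespan_shared):
    """1.0 means no slowdown; smaller values mean the DAG was delayed more."""
    return makespan_alone / makespan_shared


def unfairness(alone, shared):
    """Sum of absolute deviations of each DAG's slowdown from the average slowdown."""
    values = [slowdown(alone[dag], shared[dag]) for dag in alone]
    average = sum(values) / len(values)
    return sum(abs(v - average) for v in values)


alone = {"A": 100, "B": 50}  # hypothetical makespans with exclusive resources
print(unfairness(alone, {"A": 200, "B": 100}))  # both DAGs slowed down equally
print(unfairness(alone, {"A": 100, "B": 200}))  # B bears all of the delay
```

A value of zero means every DAG is slowed down by the same factor. Different levels of quality of service could be modelled by weighting the slowdown of each user.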


Lessons Learned… Open questions…

  • It is possible to achieve reasonably good fairness without affecting makespan.

  • An algorithm with good behaviour in the static case appears to make things easier in terms of achieving fairness…

  • What is fairness?

  • What should be the behaviour when run-time changes occur?

  • What about different notions of Quality of Service (e.g., based on SLAs…)?


Questions still unanswered…

  • What are the representative DAGs (workflows) in the context of Grid computing?

  • Extensive evaluation / analysis (theoretical too) is needed. It is not clear what the best makespan we can get is (it is not easy to find the critical path…)

  • What are the uncertainties involved? How good are the estimates that we can obtain for the execution time / communication cost? Performance prediction is hard…

  • How ‘heterogeneous’ are our Grid resources really?


Workflows are not generic DAGs

  • Bioinformatics workflows are really small (10s of nodes)

  • There are scientific workflows with thousands of nodes (Montage, LIGO, SCEC), but they have a rather regular structure.

  • Experience from joint work with the Pegasus team indicates that there may not be much to gain from sophisticated heuristics (paper to be published based on the earlier studies below)

  • James Blythe, S. Jain, Ewa Deelman, Yolanda Gil, Karan Vahi, Anirban Mandal, Ken Kennedy: Task scheduling strategies for workflow-based applications in grids. CCGRID 2005: 759-767

  • Rizos Sakellariou, Henan Zhao. A Hybrid Heuristic for DAG Scheduling on Heterogeneous Systems. Proceedings of the 13th IEEE Heterogeneous Computing Workshop (HCW’04) (in conjunction with IPDPS 2004), Santa Fe, April 2004, IEEE Computer Society Press, 2004.


Part II: But, there is more (than just shortening the makespan) when scheduling DAGs (workflows)!


Efficient data handling

  • Workflow input data is staged dynamically; new data products are generated during execution

  • For large workflows: 10,000+ input files (and a similar order of intermediate/output files)

  • If not enough disk space: failures occur

  • Solution:

    • Determine which data are no longer needed and when

    • Add nodes to the workflow to clean up data along the way

    • Take into account the disk space available on resources

  • Benefits: simulations show up to 57% space improvements for LIGO-like workflows

“Scheduling Data-Intensive Workflows onto Storage-Constrained Distributed Resources”, A. Ramakrishnan, G. Singh, H. Zhao, E. Deelman, R. Sakellariou, K. Vahi, K. Blackburn, D. Meyers, and M. Samidi, CCGrid 2007
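The first two steps of the solution can be sketched as follows: find, for every file, the tasks that produce or read it, and add one cleanup node per file that may run only after all of those tasks have finished. The workflow, the file names and the `cleanup_` naming are hypothetical. This is not claimed to be the algorithm used in the cited work.

```python
# Hypothetical workflow: task -> (input files, output files).
TASKS = {
    "t1": ({"in.dat"}, {"a.tmp"}),
    "t2": ({"a.tmp"}, {"b.tmp"}),
    "t3": ({"a.tmp", "b.tmp"}, {"out.dat"}),
}
FINAL_OUTPUTS = {"out.dat"}  # files that must survive the run


def cleanup_nodes(tasks, keep):
    """Map each cleanup node to the tasks that must finish before it may run."""
    users = {}
    for task, (inputs, outputs) in tasks.items():
        for name in inputs | outputs:
            users.setdefault(name, set()).add(task)
    # One cleanup node per file that is not a final output.
    return {f"cleanup_{name}": sorted(ts) for name, ts in users.items() if name not in keep}


for node, parents in sorted(cleanup_nodes(TASKS, FINAL_OUTPUTS).items()):
    print(node, "runs after", parents)
```

A scheduler that is aware of storage limits could then use these extra nodes, together with file sizes, to estimate how much disk space is in use at any point of the execution.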


    44% improvement in footprint for Montage workflow (when adding cleanup nodes)


    LIGO Inspiral Analysis Workflow

    Small Workflow: 164 nodes

    Full Scale analysis: 185,000 nodes and 466,000 edges

    10 TB of input data and 1 TB of output data

    LIGO workflow running on OSG

    “Optimizing Workflow Data Footprint”, G. Singh, K. Vahi, A. Ramakrishnan, G. Mehta, E. Deelman, H. Zhao, R. Sakellariou, K. Blackburn, D. Brown, S. Fairhurst, D. Meyers, G. B. Berriman, J. Good, D. S. Katz, Scientific Programming.


    LIGO Workflows

    26% improvement in disk space usage

    50% slower runtime


    LIGO Workflows

    56% improvement in space usage

    3 times slower in runtime

    “Optimizing Workflow Data Footprint”, G. Singh, K. Vahi, A. Ramakrishnan, G. Mehta, E. Deelman, H. Zhao, R. Sakellariou, K. Blackburn, D. Brown, S. Fairhurst, D. Meyers, G. B. Berriman, J. Good, D. S. Katz, Scientific Programming.


    Lesson Learned…

    When scheduling workflows, one may want to trade off performance against storage requirements to make it feasible to complete the execution of a workflow!


    Part III: But, there are other issues related to performance that have to do with the workflow execution environment and the queuing mechanisms of traditional systems!


    Scheduling

    Slide Courtesy: Ewa Deelman, [email protected]/~deelman pegasus.isi.edu


    Execution Environment

    Slide Courtesy: Ewa Deelman, [email protected]/~deelman pegasus.isi.edu


    Queues are evil… 

    Is Advance Reservation a solution?


    Might be… For sure, there are several challenges with respect to workflows: e.g., given a user-specified deadline, how can we make reservations for individual tasks?

    Henan Zhao, Rizos Sakellariou. Advance Reservation Policies for Workflows. Proceedings of the 12th Workshop on Job Scheduling Strategies for Parallel Processing, 2006.


    Advance Reservation still provides a limited level of service!

    • Can we think of a model where:

      • users specify their constraints,

      • they make an agreement (a legally binding contract) with the resource owner (Service Level Agreement: SLA),

      • it’s up to the system to do the scheduling (based on the SLAs) to honour the agreement.

    http://www.gridscheduling.org

    Viktor Yarmolenko, Rizos Sakellariou. Towards Increased Expressiveness in Service Level Agreements. Concurrency and Computation: Practice and Experience, 2007.

    Viktor Yarmolenko, Rizos Sakellariou. An Evaluation of Heuristics for SLA-based parallel job scheduling. High Performance Grid Computing Workshop, IPDPS, 2006.


    SLA-based job scheduling

    • SLA-based job scheduling can offer the levels of service currently missing:

      • It happens all the time in the real-world!

    • But, there are several key challenges to address:

      • Build appropriate protocols (legally binding), behaviour models, etc. for negotiation and re-negotiation

      • Pricing Policies (income, penalties, etc…)

      • Manage complexity

      • Regulation, monitoring, dispute resolution…

      • Convince the users to change attitudes!

    • Scheduling the SLAs doesn’t appear to be the biggest challenge… But:

      • How to schedule workflows using SLAs (how to deal with co-allocation problems, for instance) is a big challenge!

      • Needs extensive evaluation! 


    To summarize…

    • Understanding the basic static scenarios and having robust solutions for those scenarios helps the extension to more complex cases…

    • Pretty much everything here is addressed by heuristics. Their evaluation requires extensive experimentation. Still:

      • No agreement about what DAGs (workflows) look like.

      • No agreement about how heterogeneous resources really are.

    • There are indications that sophisticated DAG scheduling may not be very relevant for workflows. But, there are optimization problems that relate to:

      • Data handling, Licences?, Budget?, (or multiple criteria)…

        and, above all…


    What is the way to ease the constraints imposed by the traditional queue-based models for job scheduling?


    I’d be happy to hear from anyone with interests in these problems. You are also welcome to come and visit us in Manchester!

