Scheduling Bag Of Tasks Under Budget Constraints

Ana-Maria Oprescu, Thilo Kielmann (Vrije University). Presented by Gal Cohen, Cloud Computing Seminar, CS Technion, Spring 2012.

Presentation Transcript


  1. Scheduling Bag Of Tasks Under Budget Constraints Ana-Maria Oprescu, Thilo Kielmann (Vrije University) Presented By Gal Cohen Cloud Computing Seminar CS Technion, Spring 2012

  2. Bag Of Tasks • High throughput computing jobs • No interactive deadline • Tasks are independent of each other • All tasks are ready for execution • Unknown runtimes • Execution Model: • Allocate resources (e.g. machines) • Run each task (once) from the bag on some machine

  3. Assumptions: Bag Of Tasks • The runtime distribution is unknown • However, some distribution exists • The total number of tasks is known • Tasks can be aborted

  4. Using cloud computing to run bag of tasks: Abstractions • There are many cloud providers (EC2, Azure, Rackspace, 3Tera) • Each provider offers many machine types, each at a different price, differing in: • CPU count and speed • Memory size • There is an upper limit (self-imposed) on the number of machines assignable from a provider • A machine is charged per ATU (hour)
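
As a rough illustration of per-ATU charging, the sketch below bills a machine for every ATU it has started. The function name, the 60-minute ATU, and the price used in the example are assumptions for illustration, not values from the talk.

```python
import math


def machine_cost(uptime_minutes, price_per_atu, atu_minutes=60):
    """Cost of one machine, assuming every started ATU is billed in full."""
    if uptime_minutes <= 0:
        return 0
    return math.ceil(uptime_minutes / atu_minutes) * price_per_atu


# A machine that runs for 61 minutes has started two one-hour ATUs.
print(machine_cost(61, price_per_atu=3))
```

Under this model a machine that finishes its work one minute into a new ATU costs as much as one that uses the whole ATU, which is why the scheduler cares about when each machine's ATU ends.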

  5. Problem description • The Goal • Run all the tasks from a given bag on cloud computers, meeting a limited budget • Minimize the makespan of the whole bag (without exceeding the budget constraint) • Assumption • Running each task on a machine separately (FIFO)

  6. Model Description • The scheduler (BaTS) runs outside of the cloud (for free) • The scheduler gets the Bag Of Tasks • It allocates machines from each cloud • It dispatches tasks to the allocated machines • It receives feedback on task completion

  7. BaTS: Budget constrained task scheduler (Illustration)

  8. BaTS: Budget constrained task scheduler (Outline) • (1) Pick a sampling set of tasks of size n • (2) Pick initial workers from each machine type • (3) Run a test set on each type of machine (in parallel) • (4) Estimate the average task execution time for each type • (5) Construct a configuration based on the estimates • (6) Acquire machines and run tasks • (7) At regular monitoring intervals, go back to step 5

  9. Picking the sampling set size: confidence interval • Error level, typical values: 0.10, 0.15, 0.20, 0.25

  10. Picking the sampling set size • [Figure: required sample size (n) as a function of bag-of-tasks size (N)]
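
The sample-size formula from these two slides did not survive in the transcript. As a stand-in, the sketch below uses the standard finite-population (Cochran) sample-size formula, which may differ from the one BaTS actually uses. The 95% confidence level (z = 1.96), the worst-case proportion p = 0.5, and the function name are all assumptions.

```python
import math


def sample_size(bag_size, error, z=1.96, p=0.5):
    """Finite-population sample size (Cochran's formula, an assumed stand-in).

    bag_size: total number of tasks N
    error:    tolerated error level, e.g. 0.10
    """
    n0 = (z ** 2) * p * (1 - p) / (error ** 2)   # infinite-population size
    n = n0 / (1 + (n0 - 1) / bag_size)           # finite-population correction
    return min(bag_size, math.ceil(n))


for error in (0.10, 0.15, 0.20, 0.25):
    print(error, sample_size(1000, error))
```

With this formula the required sample grows slowly once the bag is large, which matches the general shape one would expect from a plot of n against N.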

  12. Estimating avg Task Execution Time for each machine type • Estimate the runtime of a still-running task as the average runtime of finished tasks whose execution time is larger than the running task's elapsed time • Update a moving average of Task Execution Time (in minutes) for each machine type during the computation
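
A minimal sketch of this estimate, under stated assumptions: the function names are hypothetical, the fallback for a task that has already outlasted every finished task is a guess, and folding the estimates into one plain mean is a simplification of the moving average mentioned on the slide.

```python
from statistics import mean


def estimate_running_task(elapsed, finished_runtimes):
    """Estimate a running task's total runtime from finished tasks that ran longer."""
    longer = [r for r in finished_runtimes if r > elapsed]
    # Assumed fallback: no finished task ran longer, so use the elapsed time itself.
    return mean(longer) if longer else elapsed


def average_task_time(finished_runtimes, running_elapsed):
    """Average execution time over finished tasks plus estimates for running ones."""
    estimates = [estimate_running_task(e, finished_runtimes) for e in running_elapsed]
    return mean(list(finished_runtimes) + estimates)


# Finished tasks took 2, 4, 6 and 10 minutes; one task has been running for 5.
print(estimate_running_task(5, [2, 4, 6, 10]))
print(average_task_time([2, 4, 6, 10], [5]))
```

Ignoring running tasks would bias the average toward short tasks early in the run, since long tasks are the last to report back.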

  14. Construct a configuration based on estimates • We need to decide the number of machines to acquire from each type • We want to minimize the makespan • While not exceeding the budget, given the ATU cost of a machine of each type i

  15. Construct a configuration based on estimates (cont.) • Maximize • Subject to • Using BKP (Bounded Knapsack Problem)
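
The objective and constraint on these slides were lost in the transcript, so the following is only a sketch of how a bounded knapsack could pick a configuration. It assumes the objective is total throughput (machines divided by their average task time), that the knapsack capacity is a per-ATU spending cap, and that the scheduler tries every cap and keeps the fastest configuration whose total cost fits the budget. All names (`MachineType`, `best_throughput`, `choose_configuration`) and the integer cost units are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineType:
    name: str
    avg_task_minutes: float  # estimated average task execution time
    cost_per_atu: int        # price of one ATU, in integer cost units
    max_machines: int        # upper limit on machines of this type


def best_throughput(types, cap):
    """Bounded knapsack: maximize sum(a_i / T_i) subject to sum(a_i * c_i) <= cap."""
    # best[c] = (tasks per minute, machine counts) when spending at most c per ATU
    best = [(0.0, tuple(0 for _ in types))] * (cap + 1)
    for idx, t in enumerate(types):
        new = list(best)
        for c in range(cap + 1):
            for k in range(1, t.max_machines + 1):
                spend = k * t.cost_per_atu
                if spend > c:
                    break
                rate, counts = best[c - spend]  # table without machines of type idx
                cand = rate + k / t.avg_task_minutes
                if cand > new[c][0]:
                    updated = list(counts)
                    updated[idx] = k
                    new[c] = (cand, tuple(updated))
        best = new
    return best[cap]


def choose_configuration(types, n_tasks, budget, atu_minutes=60):
    """Return (makespan_minutes, total_cost, counts) or None if nothing fits."""
    max_cap = sum(t.cost_per_atu * t.max_machines for t in types)
    best = None
    for cap in range(1, min(max_cap, budget) + 1):
        rate, counts = best_throughput(types, cap)
        if rate == 0:
            continue
        makespan = n_tasks / rate
        atus = math.ceil(makespan / atu_minutes)
        cost = atus * sum(a * t.cost_per_atu for a, t in zip(counts, types))
        if cost <= budget and (best is None or makespan < best[0]):
            best = (makespan, cost, counts)
    return best


types = [MachineType("fast", 10.0, 3, 2), MachineType("slow", 20.0, 1, 4)]
print(choose_configuration(types, n_tasks=100, budget=50))
```

The estimate deliberately ignores startup time and the fact that machines do not all start their ATUs together, both of which the talk lists as reasons the initial configuration needs refining.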

  17. Refining the initial configuration • Continuous monitoring is needed: • The configuration was decided based on estimates of average speeds that might not be accurate • The estimated speed of a machine type converges during the run • The estimated budget and makespan neglect startup time • Machines start their ATUs at different times, so we can't simply monitor just before an ATU ends

  18. Refining the initial configuration (cont.) • Thus, BaTS continuously tries to avoid budget violations • Theoretically, it's easy: as the execution continues, the bag is smaller and the budget is smaller • The trouble is estimating the size of the bag at a given moment (some machines will finish their current job before their ATU ends)

  19. Refining the initial configuration (cont.) • For every type i, we maintain a list of all machines that participated at some point in the computation • For every machine we remember • The number of executed tasks • The total uptime

  20. Refining the initial configuration (cont.) • Total uptime after executing, • Machine speed • The remaining unused time of the ATU is • The expected future #tasks executed by , • #Tasks to be paid for
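
The formulas behind these quantities were lost in the transcript. The sketch below shows one plausible way to derive them from the two per-machine counters (executed tasks and total uptime). The definitions of speed as tasks per minute, of the remaining time as what is left of the last started ATU, and all names are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class MachineStats:
    tasks_done: int        # number of executed tasks
    uptime_minutes: float  # total uptime so far


def speed(m):
    """Observed speed in tasks per minute (assumed definition)."""
    return m.tasks_done / m.uptime_minutes if m.uptime_minutes > 0 else 0.0


def remaining_in_atu(m, atu_minutes=60.0):
    """Already-paid time left in the machine's current ATU."""
    paid_atus = max(1, math.ceil(m.uptime_minutes / atu_minutes))
    return paid_atus * atu_minutes - m.uptime_minutes


def expected_future_tasks(m, atu_minutes=60.0):
    """Whole tasks the machine is expected to finish before its ATU ends."""
    return math.floor(remaining_in_atu(m, atu_minutes) * speed(m))


m = MachineStats(tasks_done=9, uptime_minutes=36.0)
print(speed(m), remaining_in_atu(m), expected_future_tasks(m))
```

Summing the expected future tasks over all machines gives an estimate of how much of the bag will be consumed by time that is already paid for, which is the quantity the budget check needs.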

  21. Refining the initial configuration (cont.) • The potential number of executed tasks • = is the remaining time from the previous ATU that was not large enough for a whole task.

  22. Refining the initial configuration (cont.) • A budget violation is prevented by checking • If the condition does not hold, BaTS uses the remaining budget and the remaining tasks to compute a new, slower and cheaper configuration.

  24. BaTS Algorithm
      • Compute n = sample size
      • Construct initial config C, acquire machines
      • While bag has tasks do
          • Wait for any machine M to ask for work
          • If M returned result of task T
              • Update stats for machine M
              • Update the average task execution time estimate for M's type
              • If sample set tasks for M's type finished
                  • Update cluster stats for M's type
              • If (monitoring interval || first cluster stats ready)
                  • Compute estimates
                  • If (constraint violation || first cluster stats ready)
                      • Call BKP to compute a new config, acquire/release machines
          • Send M a random task T', remove T' from the bag
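
The dispatch part of this loop can be sketched as a small simulation. This is not the BaTS implementation: it only models machines pulling random tasks from the bag and the per-machine bookkeeping, and it omits sampling, monitoring, and reconfiguration. The function name, the relative-speed model, and the event-queue layout are assumptions.

```python
import heapq
import random


def run_bag(task_minutes, machine_speeds, seed=0):
    """Simulate idle machines pulling random tasks from the bag.

    task_minutes:   task lengths on a speed-1.0 machine (hidden from the scheduler)
    machine_speeds: dict of machine name -> relative speed (assumed model)
    Returns (makespan_minutes, stats) with stats[name] = [tasks_done, busy_minutes].
    """
    rng = random.Random(seed)
    bag = list(task_minutes)
    stats = {name: [0, 0.0] for name in machine_speeds}
    # Each event: (time the machine asks for work, machine name, runtime just finished).
    events = [(0.0, name, None) for name in machine_speeds]
    heapq.heapify(events)
    makespan = 0.0
    while events:
        now, name, finished = heapq.heappop(events)
        if finished is not None:
            # The machine returned a result: update its statistics.
            stats[name][0] += 1
            stats[name][1] += finished
            makespan = max(makespan, now)
            # A full scheduler would update per-type estimates here and, at
            # monitoring intervals, recompute the configuration (omitted).
        if bag:
            # Send the machine a random task and remove it from the bag.
            task = bag.pop(rng.randrange(len(bag)))
            runtime = task / machine_speeds[name]
            heapq.heappush(events, (now + runtime, name, runtime))
    return makespan, stats


makespan, stats = run_bag([10] * 6, {"a": 1.0, "b": 2.0})
print(makespan, stats)
```

Because machines ask for work rather than being assigned a fixed share up front, faster machines naturally end up executing more tasks.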

  25. Evaluation • Emulating 2 clouds with 32 identical machines each • tasks, sample size • Normal distribution of task lengths

  26. Evaluation • “Machine speed” in each “cloud” was simulated according to 5 scenarios:

  27. Evaluation • In each scenario, RR is compared to BaTS • RR always uses 32+32 machines • The BaTS initial configuration is 30+30 machines, evaluated with two budgets: • Budget B = the cost of running RR for that scenario • Budget B = the cost of running only on the most "profitable" machine type (computed offline)

  28. Evaluation

  29. Evaluation

  30. Evaluation

  31. Conclusions • BaTS helps choose the cloud resources suitable for an application • BaTS helps schedule within budget while still performing reasonably well

  32. Conclusions • Limitations • The provided tests “cheat” because the number of machines is very small • The “Tail phase” is not handled well (The “faster” machines will be released before the “slow” ones) • Guessing a proper budget • Actual Bags on actual clouds • What about data transfer costs? • Storage constraints? • Other metric – maximize the profitability (or minimize the budget) while not exceeding a given makespan
