
Scheduling Parameter Sweep Applications



  1. Scheduling Parameter Sweep Applications Sathish Vadhiyar Sources/Credits: Papers on survey of heuristics and APST papers. Figures taken from the papers

  2. Background • Tasks of a job do not have dependencies • A machine executes a single task at a time – space-shared machines • The collection of tasks and machines is known a priori • Matching of tasks to machines is done offline • Estimates of the execution time of each task on each machine are known

  3. Scheduling Problem • ETC – Expected Time to Compute matrix • ETC(i,j) – estimated execution time of task i on machine j • Notations: • mat(j) – machine availability time for machine j, i.e., the earliest time at which j has completed all tasks previously assigned to it • Completion time, ct(i,j) = mat(j) + ETC(i,j) • Makespan – max over all tasks i of ct(i,j) • Objective – find the mapping (and hence the heuristic) that minimizes the makespan
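A minimal sketch (Python; the ETC values and the function name are illustrative) of how completion times and the makespan follow from these definitions:

    # Illustrative ETC matrix: ETC[i][j] = estimated time of task i on machine j
    ETC = [[3.0, 5.0],
           [2.0, 4.0],
           [6.0, 1.0]]

    def makespan(ETC, mapping, num_machines):
        """Makespan of a mapping, where mapping[i] = machine assigned to task i."""
        mat = [0.0] * num_machines        # mat(j): machine availability time
        for i, j in enumerate(mapping):
            mat[j] += ETC[i][j]           # task finishes at ct(i,j) = mat(j) + ETC(i,j)
        return max(mat)                   # makespan = latest machine finish time

    print(makespan(ETC, [0, 0, 1], 2))    # -> 5.0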

  4. Scheduling Heuristics • Opportunistic Load Balancing (OLB) • Assign next task (arbitrary order) to the next available machine • Regardless of task’s ETC on that machine • User Directed Allocation (UDA) • Assign next task (arbitrary order) to the machine with lowest ETC • Regardless of machine availability • Fast greedy • Assign each task (arbitrary order) to the machine with minimum ct
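A sketch of the fast greedy rule under the same conventions (function name illustrative): each task, taken in arbitrary order, goes to the machine with the smallest completion time ct(i,j) = mat(j) + ETC(i,j):

    def fast_greedy(ETC):
        num_machines = len(ETC[0])
        mat = [0.0] * num_machines                       # machine availability times
        mapping = []
        for task_etc in ETC:                             # arbitrary task order
            j = min(range(num_machines),
                    key=lambda m: mat[m] + task_etc[m])  # machine with minimum ct
            mapping.append(j)
            mat[j] += task_etc[j]
        return mapping, max(mat)                         # mapping and its makespan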

  5. Scheduling Heuristics 4. Min-Min • Start with a list of unmapped tasks, U • Determine the set of minimum completion times for the tasks in U • Choose the task with the minimum of these minimum completion times and assign it to the machine that provides that completion time • Remove the newly mapped task from U and repeat • Theme – map as many tasks as possible to their first choice of machine • Since short jobs are mapped first, the percentage of tasks allocated to their first choice is high (a sketch covering both Min-Min and Max-Min follows the next slide)

  6. Scheduling Heuristics 5. Max-Min • Start with a list of unmapped tasks, U • Determine the set of minimum completion times for the tasks in U • Choose the task with the maximum of these minimum completion times and assign it to the machine that provides its minimum completion time • Remove the newly mapped task from U and repeat • Avoids starvation of long tasks • Long tasks execute concurrently with short tasks • Better machine utilization 6. Greedy • Combination of Max-Min and Min-Min • Evaluates both and takes the better solution
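A combined sketch of Min-Min and Max-Min (referenced from the previous slide). The two heuristics are identical except for which task is mapped next: the one with the smallest or the largest minimum completion time:

    def min_min(ETC, use_max=False):
        """Min-Min by default; Max-Min when use_max=True."""
        num_tasks, num_machines = len(ETC), len(ETC[0])
        mat = [0.0] * num_machines
        mapping = [None] * num_tasks
        U = set(range(num_tasks))                        # unmapped tasks
        while U:
            # For each unmapped task: its minimum completion time and best machine
            best = {}
            for i in U:
                j = min(range(num_machines), key=lambda m: mat[m] + ETC[i][m])
                best[i] = (mat[j] + ETC[i][j], j)
            pick = (max if use_max else min)(U, key=lambda i: best[i][0])
            ct, j = best[pick]
            mapping[pick] = j
            mat[j] = ct                                  # machine j busy until ct
            U.remove(pick)
        return mapping, max(mat)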

  7. Scheduling Heuristics • Genetic Algorithm

  8. GA • Operates on a population of 200 chromosomes. A chromosome represents a mapping of tasks to machines – a vector of size t • Initial population – 200 randomly generated chromosomes, seeded with 1 Min-Min solution • Evaluation – initial population evaluated based on fitness value (makespan) • Selection: • Roulette wheel – probabilistically generate a new population, biased toward better mappings, from the previous population • Elitism – guarantee that the best (fittest) solution is carried forward

  9. GA - Roulette wheel scheme

     Chromosome                  1      2      3      4
     Score                       4     10     14      2
     Probability of selection    0.13   0.33   0.47   0.07

  Select a random number, r, between 0 and 1. Progressively add the selection probabilities until the running sum exceeds r; the chromosome whose probability pushes the sum past r is the one selected.
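A short sketch of this selection scheme; the probability list is the one from the table above:

    import random

    def roulette_select(probabilities):
        """Return the index chosen by roulette-wheel selection."""
        r = random.random()                   # random number in [0, 1)
        running_sum = 0.0
        for index, p in enumerate(probabilities):
            running_sum += p                  # progressively add probabilities
            if running_sum > r:
                return index
        return len(probabilities) - 1         # guard against rounding error

    print(roulette_select([0.13, 0.33, 0.47, 0.07]))  # index 2 (chromosome 3) most often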

  10. GA • Crossover • Choose pairs of chromosomes. • For every pair • Choose a random point • exchange machine assignments from that point till the end of the chromosome • Mutation. For every chromosome: • Randomly select a task • Randomly reassign it to new machine • Evaluation • Stopping criterion: • Either 1000 iterations or • No change in elite chromosome for 150 iterations
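A sketch of these two operators on chromosomes represented as lists of machine indices (the vector-of-size-t representation from slide 8):

    import random

    def crossover(a, b):
        """Single-point crossover: exchange machine assignments from a random
        point to the end of both chromosomes (in place)."""
        point = random.randrange(len(a))
        a[point:], b[point:] = b[point:], a[point:]

    def mutate(chromosome, num_machines):
        """Randomly select a task and randomly reassign it to a new machine."""
        task = random.randrange(len(chromosome))
        chromosome[task] = random.randrange(num_machines)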

  11. Simulated Annealing • Poorer solutions accepted with a probability that depends on the temperature value • Initial mapping – generated randomly • Initial temperature – initial makespan • Each iteration: • Generate a new mapping by mutating the previous mapping; obtain the new makespan • If the new makespan is better, accept • If the new makespan is worse, accept if a random number z in [0,1) satisfies z > y, where y = 1 / (1 + e^((old makespan - new makespan) / temperature)) • Reduce temperature by 10%
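A sketch of this acceptance rule. The original slide showed the formula for y as a figure; the logistic form above and below matches the accept-if-z-greater-than-y description but should be read as an assumed reconstruction:

    import math, random

    def accept(old_makespan, new_makespan, temperature):
        """SA acceptance test: better solutions are always accepted; worse
        ones only if a random z in [0, 1) exceeds y."""
        if new_makespan <= old_makespan:
            return True
        y = 1.0 / (1.0 + math.exp((old_makespan - new_makespan) / temperature))
        return random.random() > y   # rarely true when much worse or when cold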

  12. Genetic Simulated Annealing (GSA) • Almost the same as GA • During selection, SA is employed to form the new population • Initial system temperature – average makespan of the population • Each iteration of GA: • Post-mutation or post-crossover, each new chromosome is compared with the previous chromosome • If (new makespan) < (old makespan + temperature), the new chromosome becomes part of the population • Temperature decreased by 10%

  13. Tabu search • Keeps track of regions of the solution space that have already been searched • Starts with a random mapping • Generate all possible pairs of tasks (i,j), i in [0, t-2] and j in [i+1, t-1] • i and j's machine assignments are exchanged (short hop) and the makespan evaluated • If the makespan is better (successful short hop), the search restarts from i=0; else the search continues from the previous (i,j) • Continue until 1200 successful short hops or all pairs have been evaluated • Add the final mapping to the tabu list; the list keeps track of the solution space already searched • A new random mapping is generated that differs from the searched solution space by at least half the machine assignments (long hop) • Search continues until a fixed total number of short and long hops
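A sketch of a single short hop; the small makespan helper repeats the earlier definition so the snippet stands alone:

    def makespan(ETC, mapping):
        mat = [0.0] * len(ETC[0])
        for i, j in enumerate(mapping):
            mat[j] += ETC[i][j]
        return max(mat)

    def short_hop(ETC, mapping, i, j):
        """Exchange the machine assignments of tasks i and j; keep the swap
        only if it improves the makespan (a successful short hop)."""
        before = makespan(ETC, mapping)
        mapping[i], mapping[j] = mapping[j], mapping[i]
        if makespan(ETC, mapping) < before:
            return True
        mapping[i], mapping[j] = mapping[j], mapping[i]   # undo unsuccessful hop
        return False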

  14. A Comparison Study of Static Mapping Heuristics for a Class of Meta-Tasks on Heterogeneous Computing Systems – Tracy D. Braun et al.

  15. Simulation Results • ETC matrix randomly generated using a uniform distribution • ETC matrices may be: • Consistent – machine i faster than machine j for all tasks; each row is sorted across all columns • Inconsistent – not sorted • Semi-consistent – sorted across even columns • t=512, m=16 • Running times: OLB, UDA, Fast Greedy and Greedy – a few seconds • SA, tabu – 30 seconds • GA, GSA – 60 seconds • A* – 20 minutes

  16. Task execution results • Consistent cases • UDA performs worst in consistent cases – why? The same machine has the lowest ETC for every task, so all tasks are assigned to that one machine • Inconsistent cases • UDA improves because the "best" machines are distributed, avoiding load imbalance • Fast Greedy and Min-Min improve upon UDA since they consider MCTs, and MCTs are also evenly distributed • Tabu search gave poor performance for inconsistent cases since there are more successful short hops than long hops • In both cases, Min-Min performed better than Max-Min due to the nature of the task mix • In both cases, the GAs performed the best

  17. AppLeS Parameter Sweep Template (APST)

  18. APST • For efficient deployment of parameter sweep applications on the Grid • Distinct experiments share large input files and produce large output files • Shared data files must be co-located with experiments • PSA – set of independent tasks • Input – a set of files; a single file can be input to more than one task • Output – each task produces exactly one output file • The number of computational tasks in a PSA is orders of magnitude greater than the number of processors

  19. Scheduling Heuristics • Self-scheduled workqueue • Adaptive scheduling algorithm – assigns tasks to nodes as soon as they become available, in a greedy fashion • Suitable when: • There are no large input files • Clusters are large and interconnected by high-speed networks • Ratios of computation time to data-movement time are high
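A minimal sketch of a self-scheduled workqueue as a discrete-event simulation: the earliest-available node simply pulls the next task, with no use of performance estimates (task_times and all names are illustrative):

    import heapq

    def workqueue(task_times, num_nodes):
        free_at = [(0.0, n) for n in range(num_nodes)]   # (time node is free, node)
        heapq.heapify(free_at)
        schedule = []                                    # (task, node, start time)
        for task, duration in enumerate(task_times):
            t, node = heapq.heappop(free_at)             # earliest-available node
            schedule.append((task, node, t))
            heapq.heappush(free_at, (t + duration, node))
        return schedule

    print(workqueue([4.0, 2.0, 3.0, 1.0], 2))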

  20. Algorithm

  21. Gantt Chart

  22. Step 4

  23. Heuristics • Min-min • f – minimum of CT(i,j) • Best – minimum • Max-min • f – minimum of CT(i,j) • Best – maximum • Sufferage • f – difference between the second minimum and the minimum of CT(i,j) • Best – maximum • A host should be given to the task that would "suffer" the most if not given that host • XSufferage • Site-level – cluster-level MCTs and cluster-level sufferage • Avoids a deficiency of sufferage when a task's input file is in a cluster and two hosts in the cluster have identical performance
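A sketch of the sufferage selection step; mat holds host availability times as in the earlier sketches, and at least two hosts are assumed:

    def sufferage_pick(ETC, mat, unmapped):
        """Return (task, host): the task whose gap between best and second-best
        completion time is largest, paired with its best host."""
        best_task, best_gap, best_host = None, -1.0, None
        for i in unmapped:
            cts = sorted(mat[m] + ETC[i][m] for m in range(len(mat)))
            gap = cts[1] - cts[0]            # second minimum minus minimum
            if gap > best_gap:
                host = min(range(len(mat)), key=lambda m: mat[m] + ETC[i][m])
                best_task, best_gap, best_host = i, gap, host
        return best_task, best_host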

  24. Sufferage and XSufferage (figure: sufferage computation for host k in cluster j)

  25. Sufferage and XSufferage

  26. Sample APST Setup

  27. Impact of Quality of Information on Scheduling • Uniform random noise in [-p, +p], with p ranging from 0% to 100%, added to the performance estimates
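A one-function sketch of this noise model; the multiplicative form is an assumption about how the perturbation is applied:

    import random

    def perturb(estimate, p):
        """Perturb a performance estimate by uniform noise in [-p, +p]."""
        return estimate * (1.0 + random.uniform(-p, p))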

  28. Results

  29. Results

  30. Robust Static Allocation of Resources for Independent Tasks… – Sugavanam et al., JPDC 2007 • ETC numbers are just estimates and are inaccurate • Need to map tasks so as to maximize the robustness of the makespan against estimation errors • Robustness? • Degradation in makespan stays within acceptable limits when the estimates are perturbed • The goal is to maximize the collective allowable estimation error without the makespan exceeding the constraint

  31. Problem Formulation • Cest – vector of estimated task execution times • C – vector of actual times • The performance feature Φ that determines the robustness of the makespan is the set of machine finish times: Φ = {F_j, 1 ≤ j ≤ M} • Robustness radius for machine j under mapping μ: the minimum Euclidean distance between C and Cest at which the finish time of machine j reaches the limit of tolerable variation

  32. Robustness Radius • Within this robustness radius, the finish time of machine j will be at most the makespan constraint, tau • The radius can be interpreted as the perpendicular distance from Cest to the hyperplane tau - F_j(C) = 0 • Rewritten as: r_μ(F_j, C) = (tau - F_j(Cest)) / sqrt(number of tasks mapped to machine j)

  33. Robustness Metric • Robustness metric for the mapping: ρ_μ = min over all machines j of r_μ(F_j, C) • If the Euclidean distance between the actual and estimated execution times is no larger than this metric, the makespan will be at most the constraint, tau • The larger the robustness metric, the better the mapping • Thus the problem is to maximize the metric while keeping the makespan within the time constraint
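A sketch computing the robustness radii and the metric, assuming the closed form (tau - F_j(Cest)) / sqrt(n_j) from the previous slide:

    import math

    def robustness_metric(ETC, mapping, num_machines, tau):
        finish = [0.0] * num_machines        # F_j, from the estimated times
        count = [0] * num_machines           # n_j: number of tasks on machine j
        for i, j in enumerate(mapping):
            finish[j] += ETC[i][j]
            count[j] += 1
        radii = [(tau - finish[j]) / math.sqrt(count[j])
                 for j in range(num_machines) if count[j] > 0]  # skip idle machines
        return min(radii)                    # metric = smallest robustness radius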

  34. Heuristics – Max-Max

  35. Greedy Iterative Maximization (GIM)

  36. Greedy Iterative Maximization (GIM)

  37. Sum Iterative Maximization (SIM) “Robustness improvement” – change in the sum of the robustness radii of the machines after task reassignment or swapping

  38. Sum Iterative Maximization (SIM)

  39. Genitor

  40. Genitor

  41. Memetic Algorithm • Combines global search using a genetic algorithm with local search using hill climbing

  42. Memetic

  43. HereBoy Evolutionary Algorithm • Combines GA and SA

  44. HereBoy Evolutionary Algorithm • The mutation rate is reduced as the current robustness approaches the upper bound • Mutation rate is capped by a user-defined maximum mutation rate • The probability of accepting a poorer solution is capped by a user-defined maximum probability

  45. UB calculation for HereBoy • Assumes a homogeneous MET system – the execution time of a task on every machine equals its minimum execution time over the original set of machines • The tasks are arranged in ascending order of these execution times • N = T/M; the first N tasks are stored in a set S

  46. GIM and SIM had low makespan and high robustness

  47. References • The AppLeS Parameter Sweep Template: User-Level Middleware for the Grid. Henri Casanova, Graziano Obertelli, Francine Berman and Rich Wolski. Proceedings of the Supercomputing Conference (SC'2000). • Heuristics for Scheduling Parameter Sweep Applications in Grid Environments. Henri Casanova, Arnaud Legrand, Dmitrii Zagorodnov and Francine Berman. Proceedings of the 9th Heterogeneous Computing Workshop (HCW'2000), pp. 349-363. • A Comparison Study of Static Mapping Heuristics for a Class of Meta-Tasks on Heterogeneous Computing Systems. Tracy D. Braun, Howard Jay Siegel, Noah Beck, Ladislau L. Bölöni, Albert I. Reuther, Mitchell D. Theys, Bin Yao, Richard F. Freund, Muthucumaru Maheswaran, James P. Robertson, Debra Hensgen. 8th Heterogeneous Computing Workshop (HCW'99), April 12, 1999, San Juan, Puerto Rico. • Robust Static Allocation of Resources for Independent Tasks under Makespan and Dollar Cost Constraints. Prasanna Sugavanam, H.J. Siegel, Anthony A. Maciejewski, Mohana Oltikar, Ashish Mehta, Ron Pichel, Aaron Horiuchi, Vladimir Shestak, Mohammad Al-Otaibi, Yogish Krishnamurthy, et al. Journal of Parallel and Distributed Computing, Volume 67, Issue 4, April 2007, pp. 400-416.

  48. Simulation Background • ETC(i,j) = B(i) * x_r(i,j) • B(i) = x_b(i); x_b(i) ∈ [1, Φ_b] • x_r(i,j) ∈ [1, Φ_r] • Φ_r and Φ_b control machine and task heterogeneity, respectively • ETC matrix randomly generated using a uniform distribution
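A sketch of this generation model; phi_b and phi_r are the task- and machine-heterogeneity parameters:

    import random

    def make_etc(num_tasks, num_machines, phi_b, phi_r):
        """ETC(i,j) = B(i) * x_r(i,j), B(i) ~ U[1, phi_b], x_r ~ U[1, phi_r]."""
        etc = []
        for _ in range(num_tasks):
            b = random.uniform(1.0, phi_b)                  # task heterogeneity
            etc.append([b * random.uniform(1.0, phi_r)      # machine heterogeneity
                        for _ in range(num_machines)])
        return etc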

  49. Models used in APST

  50. Scheduling • Goal: • Minimize the application's makespan – the time between when the 1st input file is submitted and when the last output file is obtained • sched(): • Takes resource performance estimates into account to generate a plan for assigning file transfers to links and tasks to hosts • sched() is called repeatedly at various points in time (called scheduling events) to make the schedule adaptive • At each scheduling event, sched() knows: • The Grid topology • The number and locations of copies of the input files • The list of computations and file transfers currently underway or already completed
