
Multi-Resource Packing for Cluster Schedulers

Srikanth Kandula, Ganesh Ananthanarayanan, Sriram Rao, Robert Grandl, Aditya Akella. Diverse resource requirements: tasks need varying amounts of each resource, e.g., memory from 100 MB to 17 GB and CPU from 2% of a core to 6 cores.


Presentation Transcript


  1. Multi-Resource Packing for Cluster Schedulers
     Srikanth Kandula, Ganesh Ananthanarayanan, Sriram Rao, Robert Grandl, Aditya Akella

  2. Diverse Resource Requirements
     Tasks need varying amounts of each resource
     • E.g., memory [100 MB to 17 GB], CPU [2% of a core to 6 cores]
     Demands for resources are not correlated
     • Correlation coefficient across resource demands ∈ [–0.11, 0.33]
     Need to match tasks with machines based on resources

  3. Current Schedulers do not Pack
     Resources are allocated in terms of “slots”
     • Tasks: T1: 2 GB, T2: 2 GB, T3: 4 GB
     • Machines: A and B, 4 GB memory each
     [Figure: placement of T1, T2 and T3 on Machines A and B under current schedulers, which suffer resource fragmentation, versus packer schedulers]
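The fragmentation on this slide can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: `place`, `spread` and `best_fit` are hypothetical names, and `spread` (always pick the machine with the most free memory) is an assumed stand-in for slot-style placement, not the behaviour of any particular scheduler. Only the task sizes and machine capacities come from the slide.

```python
# Task memory demands and machine capacities taken from the slide (in GB).
TASKS = [("T1", 2), ("T2", 2), ("T3", 4)]
CAPACITY_GB = {"A": 4, "B": 4}

def place(tasks, capacity, choose):
    """Place tasks in order; `choose` picks one machine among those that fit."""
    free = dict(capacity)
    placement = {}
    for name, demand in tasks:
        fits = [m for m in free if free[m] >= demand]
        if not fits:
            placement[name] = None  # no machine has room: the task waits
            continue
        machine = choose(fits, free)
        free[machine] -= demand
        placement[name] = machine
    return placement

def spread(fits, free):
    # Assumed stand-in for slot-style placement: most free memory first.
    return max(fits, key=lambda m: free[m])

def best_fit(fits, free):
    # Packing heuristic: the tightest machine that still fits the task.
    return min(fits, key=lambda m: free[m])

fragmented = place(TASKS, CAPACITY_GB, spread)
packed = place(TASKS, CAPACITY_GB, best_fit)
print(fragmented)  # T3 is left unplaced although 4 GB is free in total
print(packed)      # T1 and T2 share machine A, T3 takes machine B
```

Spreading leaves 2 GB free on each machine, so the 4 GB task cannot start even though 4 GB is free in aggregate; best-fit keeps one machine whole for it.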

  4. Current Schedulers do not Pack
     Resources are allocated in terms of “slots”
     • Tasks: T1: 2 GB memory, 20 MB/s In Network; T2: 2 GB memory, 20 MB/s In Network; T3: 2 GB memory
     • Machine A: 4 GB memory, 20 MB/s In Network
     [Figure: tasks placed on Machine A under current schedulers, which over-allocate the incoming network, versus packer schedulers]
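The over-allocation case can be sketched the same way. The function `admit` and the dictionary keys are hypothetical, and the slide gives no network demand for T3, so the value 0 below is an assumption. The sketch admits tasks onto Machine A while checking either memory alone or every resource.

```python
# Capacities and demands from the slide; T3's network demand is an assumption.
MACHINE_A = {"mem_gb": 4, "net_in_mb_s": 20}
TASKS = {
    "T1": {"mem_gb": 2, "net_in_mb_s": 20},
    "T2": {"mem_gb": 2, "net_in_mb_s": 20},
    "T3": {"mem_gb": 2, "net_in_mb_s": 0},  # assumed: slide lists memory only
}

def admit(tasks, capacity, checked):
    """Admit tasks in order, checking only the resources named in `checked`."""
    used = {r: 0 for r in capacity}
    admitted = []
    for name, demand in tasks.items():
        if all(used[r] + demand[r] <= capacity[r] for r in checked):
            for r in capacity:
                used[r] += demand[r]  # usage accrues on every resource
            admitted.append(name)
    return admitted, used

mem_only, mem_only_used = admit(TASKS, MACHINE_A, ["mem_gb"])
all_dims, all_dims_used = admit(TASKS, MACHINE_A, list(MACHINE_A))
print(mem_only, mem_only_used)  # network demand exceeds the 20 MB/s capacity
print(all_dims, all_dims_used)  # both resources stay within capacity
```

Checking memory alone admits T1 and T2 and asks the link for 40 MB/s; checking every dimension admits T1 and T3 instead.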

  5. Current Schedulers do not Pack
     Slots are allocated purely on fairness considerations
     • DRF (current schedulers), DRF share = 1/3: in each of three intervals of length t, job A runs 6 tasks, job B runs 2 tasks and job C runs 2 tasks. Resources used per interval: 18 cores, 16 GB. Durations: A: 3t, B: 3t, C: 3t
     • Packer (packer schedulers): the first interval runs 18 tasks of A (18 cores, 36 GB), the second runs 6 tasks of B (18 cores, 6 GB), the third runs 6 tasks of C (18 cores, 6 GB). Durations: A: t, B: 2t, C: 3t
     33% improvement
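The numbers on this slide can be checked with a short calculation. The sketch below assumes that the "33% improvement" refers to average job duration, which the slide does not state explicitly; the function names are hypothetical and the per-interval task counts are read off the slide.

```python
# Tasks run by each job in the three intervals of length t, as on the slide.
DRF = {"A": [6, 6, 6], "B": [2, 2, 2], "C": [2, 2, 2]}
PACKER = {"A": [18, 0, 0], "B": [0, 6, 0], "C": [0, 0, 6]}

def job_durations(schedule):
    """Duration of each job in units of t: the last interval in which it runs."""
    return {
        job: max(i + 1 for i, n in enumerate(counts) if n > 0)
        for job, counts in schedule.items()
    }

def average(values):
    values = list(values)
    return sum(values) / len(values)

drf_avg = average(job_durations(DRF).values())
packer_avg = average(job_durations(PACKER).values())
improvement = (drf_avg - packer_avg) / drf_avg
print(job_durations(DRF), job_durations(PACKER))
print(f"average duration: {drf_avg}t vs {packer_avg}t, improvement {improvement:.0%}")
```

Both schedules run the same 30 tasks and finish at 3t, so the gain comes entirely from jobs A and B finishing earlier under the packer.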

  6. It is all about packing?
     Multi-dimensional bin packing is NP-hard for number of dimensions ≥ 2
     • Several heuristics proposed
     • But they do not apply here: size of the ball, contiguity of allocation, resource demands are elastic in time
     Will perfect packing suffice?
     Competing objectives: cluster utilization vs. job completion times vs. fairness

  7. Intuition behind the solution
     Something reasonably simple that can be applied
     [Figure: the objectives being balanced: job completion time, cluster efficiency, fairness]

  8. Tetris
     Pack tasks along multiple resources (A)
     • Cosine similarity between task demand vector and machine resource vector
     Multi-resource version of SRTF (T)
     • Favor jobs with small remaining duration and small resource consumption
     Incorporate fairness (F)
     • Fairness knob F ∈ (0, 1]: F → 0 is close to perfect fairness; F = 1 is the most efficient scheduling
     A (simplified) scheduling procedure:
         while (resources R are free)
             among FJ jobs furthest from fair share
                 score(j) = max over tasks t in j with demand(t) ≤ R of A(t, R) + T(j)
             pick j*, t* = argmax score(j)
             R = R – demand(t*)
         end while
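The simplified procedure on this slide can be sketched in Python for a single machine. Several pieces are assumptions made for illustration and are not given on the slide: the `Job` fields, treating J as the number of jobs with pending tasks, keeping each job's distance from fair share (`deficit`) fixed during the loop, and the form of T(j) as `1 / (1 + remaining_work)`.

```python
import math
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    tasks: list            # pending task demand vectors, e.g. (cores, mem_gb)
    deficit: float         # assumed given: distance below fair share
    remaining_work: float  # assumed proxy for remaining duration and resources

def cosine(u, v):
    """A(t, R): cosine similarity between demand and free-resource vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def fits(demand, free):
    return all(d <= f for d, f in zip(demand, free))

def schedule(jobs, free, fairness_knob):
    """Greedy loop for one machine; returns [(job name, demand), ...]."""
    free = list(free)
    placed = []
    while True:
        active = [j for j in jobs if j.tasks]
        if not active:
            break
        # Restrict to the F*J jobs furthest from fair share (at least one).
        k = max(1, math.ceil(fairness_knob * len(active)))
        eligible = sorted(active, key=lambda j: j.deficit, reverse=True)[:k]
        best = None  # (score, job, demand)
        for job in eligible:
            bonus = 1.0 / (1.0 + job.remaining_work)  # hypothetical T(j)
            for demand in job.tasks:
                if fits(demand, free):
                    score = cosine(demand, free) + bonus
                    if best is None or score > best[0]:
                        best = (score, job, demand)
        if best is None:
            break  # no eligible task fits in the free resources
        _, job, demand = best
        job.tasks.remove(demand)
        free = [f - d for f, d in zip(free, demand)]  # R = R - demand(t*)
        placed.append((job.name, demand))
    return placed

def example_jobs():
    return [Job("a", [(2, 4)], deficit=0.1, remaining_work=1.0),
            Job("b", [(4, 1)], deficit=0.9, remaining_work=10.0)]

efficient = schedule(example_jobs(), free=(4, 8), fairness_knob=1.0)
fair = schedule(example_jobs(), free=(4, 8), fairness_knob=0.5)
print(efficient, fair)
```

With the knob at 1.0 every job competes, so the task of job `a`, whose demand is perfectly aligned with the free resources, wins. At 0.5 only the job furthest from its fair share is eligible, so job `b` is served first even though it packs less well.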

  9. Task Requirements and resource usages
     Learning task requirements
     • From tasks that have finished in the same phase
     • Coefficient of variation ∈ [0.022, 0.41]
     • Collecting statistics from recurring jobs
     • Peak usage demand estimates for tasks
     Resource Tracker
     • Measure actual usage of resources
     • Enforce allocations
     • Aware of activities on the cluster other than task assignment: ingest and evacuation
     [Figure: Machine - In Network. Y-axis: MBytes / sec (0 to 1024); x-axis: Time (sec). Series: In Network Used, Task In Network Estimates, In Network Free]
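Learning a task's requirements from finished tasks in the same phase might look like the sketch below. The function name, the sample numbers and the choice of the mean observed peak as the estimate are all assumptions; the slide only says that estimates come from finished tasks in the same phase and reports the range of the coefficient of variation.

```python
import statistics

def estimate_peak_demand(finished_peaks):
    """Summarise peak usage observed for finished tasks of one phase.

    Returns the mean peak (used here as the estimate, an assumption) and the
    coefficient of variation, which indicates how reliable the estimate is.
    """
    mean = statistics.fmean(finished_peaks)
    stdev = statistics.pstdev(finished_peaks)
    cov = stdev / mean if mean else 0.0
    return {"estimate": mean, "cov": cov}

# Hypothetical peak incoming-network usage (MB/s) of three finished tasks.
summary = estimate_peak_demand([100.0, 110.0, 90.0])
print(summary)
```

A low coefficient of variation within a phase suggests that finished tasks are a reasonable predictor for the tasks still pending in that phase.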

  10. Evaluation
     Prototype atop Hadoop 2.3
     • Tetris as a pluggable scheduler to RM
     • Implement RT as a NM service
     • Modified AM/RM resource allocation protocol
     Large scale evaluation
     • Cluster capacity: 250 nodes
     • 4 hour synthetic workload
     • 60 jobs with complementary task demands
     [Figure: CDF of reduction (%) in job duration relative to DRF]
     Reduces average job duration by up to 40%
     Reduces makespan by 39%

  11. Evaluation
     Trace-driven simulation
     • Facebook production traces analysis
     [Figure: slowdown (%) against the fairness knob, in two panels: Fair and DRF]
     Fairness knob: fewer than 6% of jobs slow down, by not more than 8% on average
     Knob value of 0.75 offers nearly the best possible efficiency with little unfairness

  12. Conclusion
     Identify the importance of scheduling all relevant resources in a cluster
     • Resource fragmentation
     • Over-allocation and interference
     New scheduler that packs tasks along multiple resources
     • Reduce makespan
     • Job completion time
     Enable a trade-off between packing efficiency and fairness
     • Fairness knob
     Come and see our poster!
