Cluster Resource Management: A Scalable Approach

Presentation Transcript


  1. Cluster Resource Management: A Scalable Approach • Ning Li and Jordan Parker • CS 736 Class Project

  2. Outline • Introduction • A Scalable Approach: Hierarchy • Results • Conclusions • Questions

  3. Why Study Resource Management? • Clusters have become increasingly popular for large-scale parallel computing. • Web Servers • Clusters are growing to the order of thousands of nodes. • Clusters increasingly provide multiple services. • Hard to evaluate • Bad management is easy to identify • Good management is much harder to identify

  4. Resource Management Example • 4th node services only B • Poor management: Nodes 1-3 each A 50% / B 50%, Node 4 B 100%; overall A 37.5%, B 62.5% • Ideal: Nodes 1-3 each A 66% / B 33%, Node 4 B 100%; overall A 50%, B 50%
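The overall figures in this example follow from averaging the per-node shares. The sketch below reproduces that arithmetic for the poorly managed case (nodes 1-3 at A 50% / B 50%, node 4 at B 100%) and the ideal case (nodes 1-3 at roughly A 66% / B 33%). It assumes all four nodes have equal capacity, which the slide does not state; `overall_shares` is a hypothetical helper name.

```python
def overall_shares(per_node):
    """Average per-node service shares into cluster-wide shares.

    per_node: list of dicts mapping service name -> fraction of that node.
    Assumption of this sketch: every node has the same capacity.
    """
    services = sorted({s for node in per_node for s in node})
    n = len(per_node)
    return {s: sum(node.get(s, 0.0) for node in per_node) / n for s in services}


# Poor management: nodes 1-3 keep a 50/50 split, node 4 serves only B.
poor = [{"A": 0.5, "B": 0.5}] * 3 + [{"B": 1.0}]
# Ideal: nodes 1-3 shift toward A to offset node 4.
ideal = [{"A": 2 / 3, "B": 1 / 3}] * 3 + [{"B": 1.0}]

print(overall_shares(poor))   # A 0.375, B 0.625
print(overall_shares(ideal))  # approximately A 0.5, B 0.5
```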

  5. Clustering Goals • Scalability • Reliability • High Performance • Affordability

  6. Related Work • Proportional-Share • Cluster Reserves

  7. Related Work: Approach Differences • Our goal: to provide a scalable solution for resource management. • Other work focused primarily on the quality of management • This often meant 1 manager for all the nodes • A single manager can become a scalability bottleneck • Effectiveness: other solutions are probably better for smaller clusters; we hope to do better on large (>1000 nodes) clusters.

  8. Outline • Introduction • A Scalable Approach: Hierarchy • Results • Conclusions • Questions

  9. Hierarchy: A Scalable Approach • Hierarchical Management • Nodes service jobs • Managers facilitate resource management • [Figure: hierarchy of nodes 1-12]
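One way to see why a hierarchy scales is to count how many reports any single manager must process. The sketch below is illustrative only: it assumes a uniform fan-out and managers that sit above the worker nodes, neither of which the slides specify, and `manager_levels` and `max_reports_per_manager` are hypothetical names.

```python
import math


def manager_levels(num_nodes, fanout):
    """Number of manager levels needed so that each manager has at most
    `fanout` children (assumed uniform fan-out)."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = 0
    groups = num_nodes
    while groups > 1:
        groups = math.ceil(groups / fanout)  # one manager per group of children
        levels += 1
    return levels


def max_reports_per_manager(num_nodes, fanout, hierarchical=True):
    """Reports one manager handles per reporting interval."""
    return min(fanout, num_nodes) if hierarchical else num_nodes


print(manager_levels(1000, 10))                  # 3
print(max_reports_per_manager(1000, 10))         # 10
print(max_reports_per_manager(1000, 10, False))  # 1000
```

Under these assumptions a single flat manager handles every node's report, while each manager in the hierarchy handles only its own children, at the cost of a few extra levels of reporting latency.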

  10. Banking Algorithm • Goal • Determine best allocation given previous usage • Primitives • Tickets • Bank accounts • Deposit / withdraw tickets • 6 Steps

  11. Banking Algorithm • Step 1: For each service class on each node • Deposit unused tickets • Step 2: For each service class on each node • Reallocate service class • Full utilization: allocation = usage + k • Under-utilization: allocation = usage - k
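Steps 1 and 2 can be sketched for a single service class on a single node as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: "full utilization" is taken to mean usage at or above the current allocation, the result is clamped at zero, and `ClassState` and `deposit_and_reallocate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ClassState:
    allocation: int  # tickets allocated to this service class on this node
    usage: int       # tickets actually used during the last interval


def deposit_and_reallocate(state, bank, k):
    """Apply Steps 1-2 to one service class on one node.

    Returns (new_state, new_bank_balance).
    """
    # Step 1: deposit whatever part of the allocation went unused.
    unused = max(state.allocation - state.usage, 0)
    bank += unused
    # Step 2: reallocate around observed usage.
    if state.usage >= state.allocation:      # assumed meaning of "full utilization"
        new_alloc = state.usage + k
    else:                                    # under-utilization
        new_alloc = max(state.usage - k, 0)  # clamp at zero (assumption)
    return ClassState(new_alloc, state.usage), bank


busy, bank = deposit_and_reallocate(ClassState(10, 10), 0, 2)
idle, bank = deposit_and_reallocate(ClassState(10, 4), bank, 2)
print(busy.allocation, idle.allocation, bank)  # 12 2 6
```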

  12. Banking Algorithm Cont. • Step 3: For each service class • Compare total allocation to desired • Subtract from over-allocated • Add to needy & under-allocated • Step 4: For each service class • Deposit / Withdraw • If still over-allocated, withdraw • If still under-allocated, deposit
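The slides do not say how Step 3 distributes the adjustment, so the following is only one possible reading for a single service class: if the class's total allocation exceeds its desired total, every node is scaled down proportionally; if it falls short, the shortfall is split evenly among the "needy" nodes (those that were fully utilized). Both rules, and the name `reconcile_class`, are assumptions of this sketch. Step 4's bank deposit and withdrawal are not modeled.

```python
def reconcile_class(allocs, desired_total, needy):
    """One reading of Step 3 for a single service class.

    allocs: per-node allocations for the class.
    desired_total: cluster-wide allocation the class should receive.
    needy: indices of nodes that were fully utilized.
    """
    total = sum(allocs)
    new = list(allocs)
    if total > desired_total:
        # Over-allocated: shrink every node proportionally (assumed rule).
        new = [a * desired_total / total for a in allocs]
    elif total < desired_total and needy:
        # Under-allocated: share the shortfall among needy nodes (assumed rule).
        share = (desired_total - total) / len(needy)
        for i in needy:
            new[i] += share
    return new


print(reconcile_class([30, 30, 40], 50, set()))  # [15.0, 15.0, 20.0]
print(reconcile_class([10, 10, 10], 50, {2}))    # [10, 10, 30.0]
```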

  13. Banking Algorithm Cont. • Step 5: • Withdraw and allocate • Reward the needy nodes • Step 6: • Done, clear the bank accounts

  14. Reliability • Bottom-up Manager Replacement • [Figure: hierarchies of nodes 1-12 illustrating manager replacement]

  15. Outline • Introduction • A Scalable Approach: Hierarchy • Results • Conclusions • Questions

  16. Results

  17. Implementation Details • Simulations via ns (the Network Simulator) • Low-bandwidth 10 Mb/s communication network • UDP for lower server overhead • Assumptions • Node-level resource management works ideally

  18. Test 1: Overview • 4 nodes – 3 services – 60/30/10 Allocation • 4th node receives all of 3rd class’s requests • Steady Workload • Nodes 1-3: 1st 66%, 2nd 33% • Node 4: 1st 40%, 2nd 20%, 3rd 40% • Overall: 1st 60%, 2nd 30%, 3rd 10%
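The per-node percentages in this test setup can be derived from the 60/30/10 target. The sketch below computes one allocation that meets the cluster-wide target when the last class is served only on a subset of "hot" nodes: those nodes give the skewed class enough share to cover the whole cluster, and every node splits what remains among the other classes in proportion to their targets. It assumes equal-capacity nodes and is not necessarily what the banking algorithm converges to; `skewed_allocation` is a hypothetical name.

```python
def skewed_allocation(targets, num_nodes, hot_nodes):
    """Per-node shares meeting cluster-wide `targets` when the last class
    is served only on `hot_nodes` of the `num_nodes` nodes.

    Returns (shares_on_hot_nodes, shares_on_other_nodes).
    """
    *rest, skewed = targets
    # Hot nodes must carry the skewed class for the whole cluster.
    hot_skewed = skewed * num_nodes / hot_nodes
    if hot_skewed > 1:
        raise ValueError("hot nodes cannot absorb the skewed class")
    rest_total = sum(rest)
    # Remaining capacity is split in proportion to the other targets.
    hot = [(1 - hot_skewed) * r / rest_total for r in rest] + [hot_skewed]
    other = [r / rest_total for r in rest] + [0.0]
    return hot, other


hot, other = skewed_allocation([0.6, 0.3, 0.1], num_nodes=4, hot_nodes=1)
print([round(x, 2) for x in hot])    # [0.4, 0.2, 0.4]
print([round(x, 2) for x in other])  # [0.67, 0.33, 0.0]
```

With `num_nodes=100, hot_nodes=30`, the same function gives roughly 44/22/33 on the hot nodes and 67/33/0 elsewhere.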

  19. Test 1: Data

  20. Test 2: Overview • 100 nodes – 3 services – 60/30/10 Allocation • Nodes 1-30 receive all of 3rd class’s requests • Steady Workload

  21. Test 2: Data

  22. Test 3: Overview • 100 nodes – 3 services – 60/30/10 Allocation • Nodes 1-30 receive all of 3rd class’s requests • Dynamic Workload

  23. Test 3: Data

  24. Test 4: Overview • 100 nodes – 3 services – 60/30/10 Allocation • Nodes 1-30 receive all of 3rd class’s requests • Steady Workload • Reporting 1/5 • Nodes every 0.3 seconds • Managers every 1.5 seconds

  25. Test 4: Data

  26. Test 5: Overview • 900 nodes – 3 services – 60/30/10 Allocation • Nodes 1-300 receive all of 3rd class’s requests • Steady Workload

  27. Test 5: Data

  28. Outline • Introduction • A Scalable Approach: Hierarchy • Results • Conclusions • Questions

  29. Conclusions • Benefits of a hierarchy • Scalable • Reliable • Geographic Applications • Implemented a new management scheme: Banking • Comparable Results • Improved Scalability

  30. Conclusions • Clusters are sensitive to small policy changes • Clusters are built for specific workloads • Their performance is important, and small changes have a significant impact • No scheme is universally applicable • Future Work • Real system implementation • Real workloads • Real node-level resource management • Steadier performance

  31. Outline • Introduction • A Scalable Approach: Hierarchy • Results • Conclusions • Questions

  32. Questions

  33. Related Work: Proportional-Share • Stride Scheduling • Ticket-based and similar to lottery scheduling • Scale • Randomly query k nodes to find best allocation • Different Application • Condor-like resource allocation/applications
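For readers unfamiliar with stride scheduling, here is a minimal single-node sketch of the basic mechanism: each client holds tickets, its stride is inversely proportional to its ticket count, and the client with the smallest pass value runs next. It omits the randomized k-node querying mentioned on the slide; the constant `STRIDE1`, the function name `stride_schedule`, and the client names are choices made for this illustration.

```python
import heapq

STRIDE1 = 1 << 20  # large constant from which integer strides are derived


def stride_schedule(tickets, quanta):
    """Return the order in which clients run over `quanta` time slices.

    tickets: dict mapping client name -> positive ticket count.
    """
    heap = []
    for name, count in sorted(tickets.items()):
        stride = STRIDE1 // count
        # Heap entries are (pass, name, stride); pass starts at one stride.
        heapq.heappush(heap, (stride, name, stride))
    order = []
    for _ in range(quanta):
        pass_value, name, stride = heapq.heappop(heap)  # smallest pass runs
        order.append(name)
        heapq.heappush(heap, (pass_value + stride, name, stride))
    return order


order = stride_schedule({"first": 6, "second": 3, "third": 1}, 10)
print({name: order.count(name) for name in ("first", "second", "third")})
# {'first': 6, 'second': 3, 'third': 1}
```

Over 10 quanta the 6/3/1 ticket split yields a 60/30/10 share of time slices, the same proportions used as the target allocation in the tests.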

  34. Related Work: Cluster Reserves • Resource Container Schedulers • Constrained Optimization Algorithm • Scale • Centralized single manager

  35. Hierarchical Cluster Reserves – Version 1 • Modify Cluster Reserves optimization algorithm • Use it when a manager manages nodes • AND when a level n+1 manager manages level n managers.

  36. Hierarchical Cluster Reserves – Version 2 • Cluster Reserves optimization algorithm • Use it when a manager manages nodes • Don’t use it for upper-level managers • Modify the manager-to-manager reporting • Lie to the algorithm
