Energy Efficient Scheduling for High-Performance Clusters. Ziliang Zong, Adam Manzanares, and Xiao Qin, Department of Computer Science and Software Engineering, Auburn University.


Presentation Transcript


  1. Energy Efficient Scheduling for High-Performance Clusters. Ziliang Zong, Adam Manzanares, and Xiao Qin, Department of Computer Science and Software Engineering, Auburn University

  2. Where is Auburn University? Ph.D. '04, U. of Nebraska-Lincoln; 04-07, New Mexico Tech; 07-09, Auburn University

  3. Storage Systems Research Group at New Mexico Tech (2004-2007) 2014/9/19 3

  4. Storage Systems Research Group at Auburn (2008)

  5. Storage Systems Research Group at Auburn (2009)

  6. Investigators: Ziliang Zong, Ph.D., Assistant Professor, South Dakota School of Mines and Technology; Adam Manzanares, Ph.D. Candidate, Auburn University; Xiao Qin, Ph.D., Assistant Professor, Auburn University

  7. Introduction - Applications

  8. Introduction – Data Centers

  9. Motivation – Electricity Usage (source: EPA Report to Congress on Server and Data Center Energy Efficiency, 2007)

  10. Motivation – Energy Projections (source: EPA Report to Congress on Server and Data Center Energy Efficiency, 2007)

  11. Motivation – Design Issues

  12. Outline

  13. Architecture – Multiple Layers

  14. Energy Efficient Devices

  15. Multiple Design Goals

  16. Outline

  17. Energy-Aware Scheduling for Clusters

  18. Parallel Applications

  19. Motivational Example: An Example of Duplication. Linear Schedule, Time: 39s. No Duplication Schedule (NDS), Time: 32s. Task Duplication Schedule (TDS), Time: 29s. [Figure: task graph of T1 to T4 with Gantt charts of the three schedules]

  20. Motivational Example (cont.): An Example of Duplication. CPU_Energy = 6W, Network_Energy = 1W. Task labels: (8,48), (15,90), (10,60), (6,36). Edge labels: (6,6), (5,5), (4,4), (2,2). Linear Schedule, Time: 39s, Energy: 234J. No Duplication Schedule (MCP), Time: 32s, Energy: 242J. Task Duplication Schedule (TDS), Time: 29s, Energy: 284J. [Figure: task graph of T1 to T4 with Gantt charts of the three schedules]
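
The energy totals on this slide can be checked with a little arithmetic. The sketch below is an illustration, not the authors' code. It assumes each label pair reads as (seconds, joules), that energy is busy time multiplied by the slide's 6W CPU and 1W network figures, and that the network busy times of 8s (no duplication) and 2s (with duplication) are back-derived from the stated totals rather than read off the figure. All function and variable names are hypothetical.

```python
# Power figures as labelled on the slide ("CPU_Energy=6W", "Network_Energy=1W").
CPU_POWER_W = 6
NETWORK_POWER_W = 1


def schedule_energy(cpu_busy_s, network_busy_s):
    """Energy in joules: busy seconds times watts, summed over CPU and network."""
    return cpu_busy_s * CPU_POWER_W + network_busy_s * NETWORK_POWER_W


# Execution times taken from the task labels (8,48), (15,90), (10,60), (6,36).
TASK_SECONDS = [8, 15, 10, 6]
TOTAL_CPU_S = sum(TASK_SECONDS)  # 39 seconds of computation in total

# Linear schedule: one processor, so no messages cross the network.
LINEAR_J = schedule_energy(TOTAL_CPU_S, 0)

# No-duplication schedule: assumed 8s of network traffic (inferred from 242J).
NO_DUPLICATION_J = schedule_energy(TOTAL_CPU_S, 8)

# Task-duplication schedule: the 8s task T1 runs twice, and the assumed
# network traffic drops to 2s (inferred from 284J).
DUPLICATION_J = schedule_energy(TOTAL_CPU_S + 8, 2)

print(LINEAR_J, NO_DUPLICATION_J, DUPLICATION_J)
```

Under these assumptions the three values agree with the 234J, 242J, and 284J shown for the linear, no-duplication, and task-duplication schedules.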

  21. Motivational Example (cont.). The energy cost of duplicating T1: CPU side: 48J; Network side: -6J; Total: 42J. The performance benefit of duplicating T1: 6s. Energy-performance tradeoff: 42/6 = 7. EAD, Time: 32s, Energy: 242J. PEBD, Time: 29s, Energy: 284J. If Threshold = 10, duplicate T1? EAD: No. PEBD: Yes. [Figure: task graph of T1 to T4 with Gantt charts of the EAD and PEBD schedules]
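
The two duplication tests on this slide can be sketched as small predicates. This is a minimal illustration under assumptions, not the authors' implementation: EAD is taken to compare the raw energy increase against the threshold, and PEBD the ratio of energy increase to time decrease, as the slide's numbers suggest. The function names and the guard for a non-positive time decrease are hypothetical additions.

```python
def ead_should_duplicate(energy_increase_j, threshold):
    """EAD-style test (assumed form): duplicate only if the extra energy is small."""
    return energy_increase_j <= threshold


def pebd_should_duplicate(energy_increase_j, time_decrease_s, threshold):
    """PEBD-style test (assumed form): duplicate if energy per second saved is small."""
    if time_decrease_s <= 0:
        # Hypothetical guard: no time is saved, so duplication cannot pay off.
        return False
    return energy_increase_j / time_decrease_s <= threshold


# Numbers from the slide: +48J on the CPU side, -6J on the network side,
# and a stated performance benefit of 6s for duplicating T1.
ENERGY_INCREASE_J = 48 - 6
TIME_DECREASE_S = 6
THRESHOLD = 10

EAD_DECISION = ead_should_duplicate(ENERGY_INCREASE_J, THRESHOLD)
PEBD_DECISION = pebd_should_duplicate(ENERGY_INCREASE_J, TIME_DECREASE_S, THRESHOLD)
print(EAD_DECISION, PEBD_DECISION)
```

With a threshold of 10, the first test rejects duplication (42 exceeds 10) and the second accepts it (42/6 = 7), which matches the slide's "EAD: No, PEBD: Yes".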

  22. Basic Steps of Energy-Aware Scheduling Algorithm Implementation: Step 1: DAG Generation. Task description: task set {T1, T2, …, T9, T10}. T1 is the entry task and T10 is the exit task. T2, T3, and T4 cannot start until T1 has finished; T5 and T6 cannot start until T2 has finished; T7 cannot start until both T3 and T4 have finished; T8 cannot start until both T5 and T6 have finished; T9 cannot start until both T6 and T7 have finished; T10 cannot start until both T8 and T9 have finished.
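
The precedence constraints in this step map directly onto a predecessor table. The sketch below is one possible encoding, not the authors' data structure: it stores the constraints as a dictionary and checks that they form a valid DAG by producing a topological order with Kahn's algorithm. The names `PREDECESSORS`, `successors_of`, and `topological_order` are hypothetical.

```python
from collections import deque

# Task number -> tasks that must finish first, as listed on the slide.
PREDECESSORS = {
    1: [], 2: [1], 3: [1], 4: [1], 5: [2], 6: [2],
    7: [3, 4], 8: [5, 6], 9: [6, 7], 10: [8, 9],
}


def successors_of(predecessors):
    """Invert the predecessor table into a successor table."""
    successors = {task: [] for task in predecessors}
    for task, preds in predecessors.items():
        for pred in preds:
            successors[pred].append(task)
    return successors


def topological_order(predecessors):
    """Kahn's algorithm: repeatedly emit a task whose predecessors are all done."""
    successors = successors_of(predecessors)
    waiting_on = {task: len(preds) for task, preds in predecessors.items()}
    ready = deque(sorted(task for task, n in waiting_on.items() if n == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            waiting_on[nxt] -= 1
            if waiting_on[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(predecessors):
        raise ValueError("precedence constraints contain a cycle")
    return order


ORDER = topological_order(PREDECESSORS)
print(ORDER)
```

The order starts at the entry task T1 and ends at the exit task T10, and every task appears after all of its predecessors.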

  23. Basic Steps of Energy-Aware Scheduling Algorithm Implementation: Step 2: Parameters Calculation. Parameters: total execution time from the current task to the exit task; Earliest Start Time; Earliest Completion Time; Latest Allowable Start Time; Latest Allowable Completion Time; Favorite Predecessor.
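
As an illustration of the first parameter, the sketch below computes a "level" for each task in the ten-task DAG from the DAG Generation step. It rests on two assumptions the transcript does not confirm: that the level is the largest sum of execution times along any path from the task to the exit task, and that every task takes one time unit (the real execution times are not in the transcript). The names are hypothetical, and the resulting queue need not match the task list shown on the slides.

```python
# Successor table for the ten-task DAG described in the DAG Generation step.
SUCCESSORS = {
    1: [2, 3, 4], 2: [5, 6], 3: [7], 4: [7], 5: [8],
    6: [8, 9], 7: [9], 8: [10], 9: [10], 10: [],
}

# Hypothetical execution times: one unit per task (not from the slides).
EXEC_TIME = {task: 1 for task in SUCCESSORS}


def compute_levels(successors, exec_time):
    """Level of a task = its own time plus the largest level among its successors."""
    memo = {}

    def level(task):
        if task not in memo:
            longest_tail = max((level(s) for s in successors[task]), default=0)
            memo[task] = exec_time[task] + longest_tail
        return memo[task]

    for task in successors:
        level(task)
    return memo


LEVELS = compute_levels(SUCCESSORS, EXEC_TIME)

# Scheduling queue in ascending level order; ties are broken by task number
# here, which is an arbitrary choice for this sketch.
QUEUE = sorted(LEVELS, key=lambda task: (LEVELS[task], task))
print(LEVELS)
print(QUEUE)
```

With unit times the exit task has the lowest level and the entry task the highest, so the queue begins at T10 and ends at T1.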

  24. Basic Steps of Energy-Aware Scheduling Algorithm Implementation: Step 3: Scheduling. Original Task List: {10, 9, 8, 5, 6, 2, 7, 4, 3, 1}

  25. Basic Steps of Energy-Aware Scheduling Algorithm Implementation: Step 4: Duplication Decision. Original Task List: {10, 9, 8, 5, 6, 2, 7, 4, 3, 1}. Decision 1: Duplicate T1? Decision 2: Duplicate T2? Duplicate T1? Decision 3: Duplicate T1?

  26. The EAD and PEBD Algorithms (flowchart). Shared steps: generate the DAG of the given task set; find all the critical paths in the DAG; generate the scheduling queue based on level (ascending); select the unscheduled task with the lowest level as the starting task; for each task on the same critical path as the starting task, check whether it is already scheduled. Decision boxes (Yes/No): already scheduled? save time if this task is duplicated? entry task reached? EAD branch: calculate the energy increase; if more_energy <= Threshold, duplicate this task and select the next task on the same critical path. PEBD branch: calculate the energy increase and the time decrease; Ratio = energy increase / time decrease; if Ratio <= Threshold, duplicate this task and select the next task on the same critical path. Remaining box: allocate the task to the same processor as the tasks on the same critical path.

  27. Energy Dissipation in Processors (source: http://www.xbitlabs.com)

  28. Parallel Scientific Applications: Fast Fourier Transform; Gaussian Elimination

  29. Large-Scale Parallel Applications: Robot Control; Sparse Matrix Solver (source: http://www.kasahara.elec.waseda.ac.jp/schedule/)

  30. Impact of CPU Power Dissipation. Impact of CPU types: 19.4%, 3.7%. Panels: Energy consumption for different processors (Gaussian, CCR=0.4); Energy consumption for different processors (FFT, CCR=0.4)

  31. Impact of Interconnect Power Dissipation. Impact of interconnection types: 5%, 3.1%, 16.7%, 13.3%. Panels: Energy consumption (Robot Control, Myrinet); Energy consumption (Robot Control, Infiniband)

  32. Parallelism Degrees. Impact of application parallelism: 6.9%, 5.4%, 17%, 15.8%. Panels: Energy consumption of Sparse Matrix (Myrinet); Energy consumption of Robot Control (Myrinet)

  33. Communication-Computation Ratio. Impact of CCR: Energy consumption under different CCRs. CCR: Communication-Computation Ratio

  34. Performance. Impact on schedule length. Panels: Schedule length of Gaussian Elimination; Schedule length of Sparse Matrix Solver

  35. Heterogeneous Clusters - Motivational Example

  36. Motivational Example (cont.): Energy calculation for tentative schedule (C1, C2, C3, C4)

  37. Experimental Settings: Simulation Environments

  38. Communication-Computation Ratio: CCR sensitivity for Gaussian Elimination

  39. Heterogeneity: Computational node heterogeneity experiments

  40. Conclusions • Architecture for high-performance computing platforms • Energy-Efficient Scheduling for Clusters • Energy-Efficient Scheduling for Heterogeneous Systems • How to measure energy consumption? Kill-A-Watt

  41. http://www.auburn.edu/~xzq0001

  42. Questions: http://www.eng.auburn.edu/~xqin
