
Cooling-Aware and Thermal-Aware Workload Placement for Green HPC Data Centers


Presentation Transcript


  1. Cooling-Aware and Thermal-Aware Workload Placement for Green HPC Data Centers
  Sandeep K. S. Gupta (co-authors: Ayan Banerjee, Tridib Mukherjee, George Varsamopoulos)
  School of Computing, Informatics and Decision Systems Engineering, Arizona State University

  2. Sandeep Gupta, IEEE Senior Member, School of Computing & Informatics
  • Use-inspired, human-centric research in distributed cyber-physical systems
  • Heads research in: Thermal Management for Data Centers, Intelligent Container, Pervasive Health Monitoring, Criticality-Aware Systems, Mobile Ad-hoc Networks, ID Assurance
  • BEST PAPER AWARD: "Security Solutions for Pervasive HealthCare," ICISIP 2006
  • BOOK: Fundamentals of Mobile and Pervasive Computing, McGraw-Hill, Dec. 2004
  • Area Editor; TPC Co-Chair, GreenCom'07 (http://impact.asu.edu/greencom); TPC Chair; also for IEEE TPDS and WINET; http://www.bodynets.org
  • Email: Sandeep.Gupta@asu.edu; IMPACT Lab URL: http://impact.asu.edu

  3. IMPACT: Current Research Thrusts
  • Challenges: traffic congestion, energy scarcity, climate change, medical cost, ...
  • Smart Infrastructure: distributed CPS (cyber-physical embedded system (of systems))
  • Criticality (context) awareness to enhance the dependability (security, safety, reliability) of CPS
  • Unifying Framework to enhance our understanding of developing (energy-) efficient, sustainable, assured CPS
  • Model-based Design and Development to harness complexity (simultaneously ensuring safety, security, efficiency, etc.) as well as cost
  • Enhanced Usability and Interoperability to reduce manageability overhead and enhance ...

  4. IMPACT Lab Members and Collaborators
  • IMPACT Lab (http://impact.asu.edu)
    • Faculty: Sandeep K. S. Gupta (Professor)
    • Postdocs: Georgios Varsamopoulos, Tridib Mukherjee
    • Students: Zahra Abbasi (CSE PhD), Ayan Banerjee (CSE PhD), Michael Jonas (CSE PhD), Sailesh Kandula (CSE MS), Su Kim (CSE PhD)
  • Collaborators from: Microsoft Embedded Innovation Center, Aachen; FDA; University of Florence; Intel Corp.; Texas Instruments; U. Penn

  5. Introduction and Motivation
  • The magnitude of data center energy consumption:
    • Internet users worldwide grew 400% from 2000 to 2009 [http://www.internetworldstats.com/stats.htm]
    • Data center energy consumption grew 20-30% annually in 2006 and 2007 [Uptime Institute research]
  • Addressing energy savings for Internet/HPC data centers
  • Thermal and cooling awareness to improve energy consumption
  [Figures: projected electricity use of data centers, 2007 to 2011 — historical energy use and future projection at the current efficiency trend (Source: EPA); typical data center energy end use (Source: Department of Energy)]

  6. The BlueTool project http://impact.asu.edu/BlueTool/

  7. Overview of problem and results
  • Problem:
    • Can we save energy by coordinating job scheduling and cooling? How much?
  • Results and contributions:
    • SP-EIR: an energy inefficiency metric for spatial scheduling
      • Higher SP-EIR → worse energy performance of a schedule
      • Lower heat recirculation → lower SP-EIR
      • Higher thermostat setting → lower SP-EIR (because of the CoP)
    • HTS: a spatial scheduling algorithm that heuristically maximizes the thermostat setting
    • Evaluation of HTS combined with FCFS or EDF: EDF-HTS saves 15% over EDF-LRH

  8. Outline of talk
  • Background: thermal awareness and cooling awareness
  • System model: physical assumptions, job model, cooling model
    • Heat recirculation and thermostat
    • Dependency between jobs and cooling
  • Problem definition: how to schedule jobs so as to minimize the need for low cooling temperatures
  • SP-EIR and HTS
  • Simulation-based comparison of various combinations of FCFS and EDF with LRH and HTS
  • On-going work: energy-proportional computing and its savings

  9. Job scheduling and energy awareness
  • Most energy-aware approaches are power-aware (e.g., DVFS schemes)
  • Thermal awareness: knowing the heat recirculation
  • Cooling awareness: knowing the cooling performance
  • Why cooling awareness?
    • Cooling, along with PDUs, is responsible for PUE > 1
    • Optimizing for cooling can save additional energy — about 15% for the simulated data center
  • Taxonomy of job scheduling schemes: energy-oblivious (performance-oriented) vs. energy-aware; energy-aware schemes split into power-aware (thermally oblivious) and thermal-aware (aware of heat effects); thermal-aware schemes are further cooling-oblivious or cooling-aware (aware of the cooling model)

  10. System model (1)
  [Figures: CRAC supply air temperature (Tsup) vs. sensed input air temperature (Tsen) with low and high modes and their set-point thresholds; system power (P) vs. CPU utilization (U), rising linearly from Pidle = b to Ppeak]
  • Cold-aisle, hot-aisle configuration; Tred: red-line (maximum allowed) inlet temperature
  • Each job comes with a deadline
  • Performance heterogeneity: fast and slow machines
  • CRAC (cooling equipment):
    • Two cooling power modes: low (preferred for energy efficiency) and high
    • Two (programmable) set points: low→high and high→low
    • Mode-switching delay tsw
    • The coefficient of performance (CoP) depends on the current mode
    • Tsup = Tsen − Tdiff_mode
  • Epoch: the interval between two consecutive set-point triggers
  • Computing equipment: linear power consumption model, P = aU + b (a toy sketch of the power and cooling model follows below)
  • Challenge: set the low→high set point as high as possible.
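A minimal sketch of the model above, assuming made-up values for the power coefficients a and b, the per-mode CoP, and the per-mode supply/sensed temperature offset (none of these numbers come from the slides):

```python
# Toy server power and CRAC cooling model: P = a*U + b, mode-dependent CoP.
def server_power(u, a=150.0, b=100.0):
    """Linear power model: P = a*U + b, with U in [0, 1] (watts)."""
    return a * u + b

# Hypothetical CRAC parameters: CoP and Tsup = Tsen - Tdiff per mode.
CRAC_MODES = {
    "low":  {"cop": 4.0, "t_diff": 10.0},   # preferred, energy-efficient mode
    "high": {"cop": 2.5, "t_diff": 15.0},   # engaged when the low->high set point is crossed
}

def cooling_power(total_it_power_w, mode):
    """Cooling power needed to remove the IT heat load at the mode's CoP."""
    return total_it_power_w / CRAC_MODES[mode]["cop"]

def supply_temperature(t_sensed, mode):
    """Tsup = Tsen - Tdiff(mode)."""
    return t_sensed - CRAC_MODES[mode]["t_diff"]

# Example: 50 servers at 60% utilization, CRAC in low mode
total_power = 50 * server_power(0.6)
print(total_power, cooling_power(total_power, "low"), supply_temperature(25.0, "low"))
```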

  11. System model (2)
  • Models assumed:
    • Cooling distribution matrix F (diagonal): fii is the portion of cool air going to equipment i
    • Heat recirculation matrix D: dij is the portion of heat going from equipment i to equipment j
  [Figure: three pieces of equipment and a CRAC, with cool-air fractions f1, f2, f3 and heat-recirculation fractions dij between the equipment]
  • Inlet-temperature constraint: Tin(t) = F Tsup(t) + D P(t) ≤ Tred
  • Equivalently: Tsup(t) ≤ F⁻¹ [Tred − D(aU(t) + ω)]
  • Tsup(t) has to be dynamically adjusted in accordance with U(t) to satisfy the Tred constraint (a numerical sketch follows below)
  • The highest thermostat setting, maxTsen, can be derived as: maxTsen = Tsup + Tdiff_mode − [temperature increase due to the mode-switching delay]
  • Selecting/scheduling a different set of servers (i.e., changing a and ω) changes the requirements on Tsup.
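A numerical sketch of the constraint above, following the slide's Tin = F·Tsup + D·P form; the matrices F and D, the power vector P, and the red-line temperature are toy values, not data from the ASU data center:

```python
# Given F (diagonal cool-air fractions), D (heat recirculation), per-server
# power P, and red-line Tred, find the highest feasible supply temperature.
import numpy as np

def max_supply_temperature(F, D, P, T_red):
    """Largest scalar Tsup such that F @ (Tsup * 1) + D @ P <= T_red element-wise."""
    f = np.diag(F)                      # cool-air fraction reaching each server
    headroom = (T_red - D @ P) / f      # per-server bound on Tsup
    return headroom.min()               # the tightest server limits the CRAC setting

# Toy example with 3 servers (made-up numbers)
F = np.diag([0.9, 0.8, 0.85])
D = np.array([[0.00, 0.02, 0.01],
              [0.03, 0.00, 0.02],
              [0.01, 0.02, 0.00]]) * 0.05   # inlet-temperature rise per watt (illustrative)
P = np.array([300.0, 250.0, 280.0])          # current server power draw (W)
T_red = np.full(3, 30.0)                     # red-line inlet temperature (deg C)
print(max_supply_temperature(F, D, P, T_red))
```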

  12. Problem definition and HTS
  • Problem definition:
    • Given a data center and its running jobs, for a given new job, find:
      • a spatial schedule (i.e., a server assignment) for that job, and
      • thermostat settings for the CRAC,
    • that minimize the energy consumption while meeting the deadlines.
  • Algorithm HTS (Highest Thermostat Setting):
    • Spatial-scheduling-only algorithm (i.e., server assignment)
    • Finds a spatial schedule that maximizes the low→high thermostat setting
    • Assigns a ranking grade to each server: Rank(server j) = Tred − [temperature rise at j caused by all servers running at full blast]
    • Assigns the job to the available servers with the highest rank values (a sketch follows below)
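An assumption-laden sketch of the HTS ranking and placement step described above (following the slides' Tin = F·Tsup + D·P convention; the matrix, powers, and job size are invented for illustration):

```python
# Rank each server by the red-line headroom it keeps when every server runs at
# full power, then place the new job on the highest-ranked free servers.
import numpy as np

def hts_rank(D, P_peak, T_red):
    """Rank(server j) = Tred - temperature rise at j caused by all servers at full power."""
    return T_red - D @ P_peak

def hts_place(D, P_peak, T_red, free_servers, servers_needed):
    """Pick the highest-ranked available servers for the new job."""
    ranks = hts_rank(D, P_peak, T_red)
    ordered = sorted(free_servers, key=lambda j: ranks[j], reverse=True)
    return ordered[:servers_needed]

# Toy example: 4 servers, job needs 2, servers 1-3 are free
D = np.array([[0.00, 0.01, 0.02, 0.01],
              [0.01, 0.00, 0.01, 0.02],
              [0.02, 0.01, 0.00, 0.01],
              [0.01, 0.02, 0.01, 0.00]]) * 0.05
P_peak = np.full(4, 400.0)
print(hts_place(D, P_peak, np.full(4, 30.0), free_servers=[1, 2, 3], servers_needed=2))
```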

  13. SP-EIR: an energy inefficiency metric
  • SP-EIR(alg, job set J) = Ealg(J) / Eopt(J): the energy consumed under the algorithm's spatial schedule over the minimum (optimal) energy for the same job set
  • Challenging to compute the maximum SP-EIR over all possible job sets; akin to the competitive ratio in the performance domain
  • One (naive) upper bound on SP-EIR: Ealg(100% utilization) / Eopt(idle) (a toy computation follows below)
    • Note the naive upper bound is independent of the algorithm (at 100% utilization there is only one possible schedule)
    • It depends solely on the data center's thermal and cooling behavior
    • For the simulated data center, the upper bound is 1.69
  • Here we "measure" SP-EIR using simulation; theoretical analysis is left as a challenge for theoreticians
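An illustrative toy computation of the ratio and its naive upper bound, using the simple linear-power/CoP model sketched earlier; the coefficients, server count, and example energies are made up and do not reproduce the 1.69 figure from the simulated data center:

```python
# SP-EIR as a ratio of schedule energies, plus the naive E(100%)/E(idle) bound.
def total_energy(utilizations, hours=1.0, a=150.0, b=100.0, cop=4.0):
    """Computing plus cooling energy (Wh) under the linear model P = a*U + b."""
    it_power = sum(a * u + b for u in utilizations)
    return (it_power + it_power / cop) * hours

def sp_eir(energy_alg, energy_opt):
    """Energy inefficiency of a spatial schedule relative to the optimal one."""
    return energy_alg / energy_opt

n = 50
naive_upper_bound = total_energy([1.0] * n) / total_energy([0.0] * n)
print(naive_upper_bound)          # algorithm-independent bound for this toy model
print(sp_eir(9.6e3, 8.7e3))       # hypothetical measured vs. optimal energies (Wh)
```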

  14. Simulation setup
  • F and D derived from a CFD model of the ASU HPCI data center
    • 9.6 m × 8.4 m × 3.6 m
    • 30 Dell 1955 chassis, 20 Dell 1855 chassis
  • P derived from power measurements of the computing equipment
  • Variations of the spatial scheduling algorithms:
    • Cooling-oblivious (e.g., LRH): statically use the thermostat setting for 100% data center utilization
    • "m" variants (e.g., LRHm): statically use the maximum thermostat setting for the given job trace
    • "d" variants (e.g., LRHd): dynamically adjust the thermostat setting to match Tred
  [Result panels shown for 5%, 40%, and 80% overall data center utilization]

  15. Measuring thermal efficiency: LRH
  • Thermal efficiency: least contribution to the heat recirculation
  • LRH: a metric of the thermal efficiency of a server [Tang et al., T-PDS '08]
  • Based on a two-layer rank calculation:
    • Rank the servers as recipients of heat recirculation
    • Rank the servers as contributors of heat recirculation
  • LRH weight of S = Σ over recipients of (recipient value × amount of heat from S to that recipient), as sketched below
  [Figure: example LRH ranking of servers A and B, showing the direction and amount of heat recirculation; the LRH rank of server B is worse than A's]
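A minimal sketch of one reading of the two-layer LRH weight, not the authors' implementation; the recirculation matrix and the choice of column sums as the "recipient value" are illustrative assumptions:

```python
# Score servers as recipients of recirculated heat, then weight each
# contributor by how much heat it sends to highly affected recipients.
import numpy as np

def lrh_weights(D):
    """D[i, j]: portion of heat recirculating from server i to server j."""
    recipient_value = D.sum(axis=0)   # how much recirculated heat each server receives
    # LRH weight of S = sum over recipients of (recipient value * heat from S to recipient)
    return D @ recipient_value        # lower weight = thermally more efficient server

# Toy example (made-up recirculation matrix): server 0 dumps the most heat
D = np.array([[0.00, 0.06, 0.05],
              [0.01, 0.00, 0.02],
              [0.02, 0.01, 0.00]])
print(lrh_weights(D))   # schedule jobs on servers with the lowest LRH weight first
```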

  16. SP-EIR as measured
  • Reference algorithm used as the optimal: minimize the product D·P (a brute-force sketch follows below)
  • Observations:
    • HTS always has the lowest SP-EIR in the simulations
    • Enhancing any algorithm with cooling awareness reduces the SP-EIR
    • MTDP has lower SP-EIR than LRH although it is thermally oblivious
      • Power-aware workload consolidation (MTDP) has a stronger saving effect than thermal-aware job scheduling (LRH)
    • Enhancing LRH with cooling awareness can bring the SP-EIR lower than MTDP's
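An illustrative sketch of one way to read the reference baseline, i.e., exhaustively picking the active-server set that minimizes the recirculated heat D·P; the cost aggregation (sum), the matrix, and the power values are my assumptions:

```python
# Brute-force reference schedule: choose active servers minimizing sum(D @ P).
import itertools
import numpy as np

def reference_schedule(D, p_active, p_idle, servers_needed):
    """Pick the set of active servers minimizing total recirculation-induced heating."""
    n = D.shape[0]
    best_cost, best_set = float("inf"), None
    for active in itertools.combinations(range(n), servers_needed):
        P = np.full(n, p_idle)
        P[list(active)] = p_active
        cost = (D @ P).sum()          # total inlet-temperature rise from recirculation
        if cost < best_cost:
            best_cost, best_set = cost, active
    return best_set, best_cost

D = np.array([[0.00, 0.03, 0.01],
              [0.02, 0.00, 0.02],
              [0.01, 0.01, 0.00]]) * 0.05
print(reference_schedule(D, p_active=300.0, p_idle=100.0, servers_needed=2))
```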

  17. EDF-HTS: results on energy savings with respect to other schemes

  18. Conclusions
  • Cooling awareness
    • Advantages:
      • Additional energy savings compared with thermal-aware (but cooling-oblivious) schemes
      • Savings of up to 23% in the simulations
    • Disadvantages:
      • Requires good knowledge of the heat recirculation pattern and of the performance of the cooling units
    • Holistic management approaches that can configure the cooling unit over the network can be cooling-aware
  • SP-EIR
    • SP-EIR depends on the given algorithm, job set, and data center
    • The upper bound for any algorithm depends on the thermal and power characteristics of the data center

  19. Implications of thermal awareness
  • First direction: introduce thermal awareness beyond just scheduling, into data center management
    • Thermal-aware power management
    • Thermal-aware cooling management
    • Cooling awareness enables the above
    • "Model-driven Co-ordinated Management of Data Centers," ComNet, special issue on Managing Emerging Computing Environments, under review
  • Second direction: investigate the effect of technological trends on the savings from management
    • E.g., "Trends and Effects of Energy Proportionality on Server Provisioning in Internet Data Centers," HiPC 2010

  20. Energy proportionality metrics
  • Energy-proportional computing: consume power in proportion to utilization (the ideal linear curve)
  • Metrics (a sketch of both follows below):
    • IPR: idle-to-peak power ratio, Pidle / Ppeak
    • LDR: linear deviation ratio, max over u of (P(u) − Linear(u)) / Linear(u)
      (the ratio of the maximum offset of the power curve from the straight line between Pidle and Ppeak, over the value of that line at the point of maximum offset)
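A minimal sketch of the two metrics for a measured power curve sampled at a few utilization levels; the curve values below are invented, SPECpower-style samples, and the signed-maximum treatment of LDR is an assumption consistent with the negative/positive LDR discussion on the next slides:

```python
# IPR and LDR for a power curve P(u) sampled at utilization levels u.
import numpy as np

def ipr(p_idle, p_peak):
    """Idle-to-peak power ratio: Pidle / Ppeak (0 = perfectly proportional at idle)."""
    return p_idle / p_peak

def ldr(utilizations, powers):
    """Linear deviation ratio: signed deviation of largest magnitude from the
    straight line between (0, Pidle) and (1, Ppeak), relative to that line."""
    u = np.asarray(utilizations, dtype=float)
    p = np.asarray(powers, dtype=float)
    linear = p[0] + (p[-1] - p[0]) * u            # line between idle and peak power
    deviation = (p - linear) / linear
    return deviation[np.abs(deviation).argmax()]  # >0 bows above the line, <0 below

# Toy power curve sampled at 0%, 25%, ..., 100% load (watts)
u = [0.0, 0.25, 0.5, 0.75, 1.0]
p = [120.0, 200.0, 250.0, 280.0, 300.0]
print(ipr(p[0], p[-1]), ldr(u, p))
```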

  21. Historical trends of energy proportionality
  • Source data from SPECpower_ssj2008 published results: http://www.spec.org/power_ssj2008/results/

  22. Discussion on diverging LDR
  • Negative LDR: ideal for stand-alone systems that are under-utilized
  • Positive LDR: ideal for use in consolidation
  • Near-zero LDR: energy efficiency is independent of the utilization level
  [Figures: three power-vs-utilization curves annotated "optimal performance-to-power ratio," "minimal energy increase for considerable performance increase," and "performance-to-power ratio almost independent of workload"]

  23. Conclusions
  • Energy proportionality will have different effects on energy savings, depending on the shape of the power curve
    • IPR → 0: the energy savings of power management (server provisioning) are expected to be minimal
    • LDR >> 0: maximum energy efficiency may not be at 100% utilization; systems can be optimally efficient at lower utilizations
