This invited talk by Dr. Krishna Kant from Intel/GMU and M. Murugan from U/Minn delves into the coordination of supply and demand in energy-adaptive computing. The presentation covers the motivation behind energy adaptive computing, hierarchical adaptation strategies, and ongoing research results, with a focus on addressing energy constraints and sustainability concerns in IT infrastructure. The talk explores the need for smarter control mechanisms to cope with temporary energy deficiencies, high-temperature operation, frugal designs, and the coordination of power and thermal management across computation, network, and storage systems. Additionally, the session details adaptation methods, supply and demand side considerations, and a proposed algorithm to address potential challenges in energy-adaptive computing. Experimental results and insights into thermal impacts, as well as forthcoming challenges related to network and storage energy management, are also discussed.
Supply and Demand Coordination in Energy Adaptive Computing (invited talk) • Dr. Krishna Kant, Intel/GMU • M. Murugan, U/Minn
Outline • Motivate energy adaptive computing • Operation under Energy Constraints • Hierarchical adaptation to energy constraints. • Results and ongoing work
Motivation • ICT Energy Issues • Soaring energy & cooling costs in Data Centers • Power/thermal issues hindering Moore’s Law • Sustainability concerns leading to use of renewable energy, chiller-less cooling, smaller capacities, etc. • Consequences • Variable energy supply & smaller safety margins • Requires smarter control to cope with temporary energy deficiencies.
IT systems fed by Renewable Energy • Limit or eliminate energy draw from grid • Less infrastructure & losses, but variable supply • Need to consider impact on both computing & communications • Similar issues w.r.t. unreliable grid supply → Need better power adaptability
High Temperature Operation • Chiller-less data centers • Less energy/materials, but space inefficient • High temperature operation of comm/computing equipment • Smaller T_outlet − T_inlet • Deal w/ occasionally hitting temp. limits → Need smarter thermal adaptability
Frugal Designs • Overdesign is the norm today • Huge power supplies, fans, heat sinks, server cases, high rack capacity, UPS capacity, … • Engineered for worst case → rarely encountered • Huge power wastage, waste of materials, energy, … • What if we right-size everything? • Highly energy efficient but need smarter control → Better energy adaptability to deal w/ frugal design
Energy Adaptive Computing • EAC strives for dynamic end-to-end adjustment through • Workload adaptation for graceful QoS degradation under energy limitations • Infrastructure adaptation to cope with temporary energy deficiencies • Requires coordinated power/thermal mgmt of computation, network & storage • Enhances sustainability of IT infrastructure
Adaptation Methods • Workload Adaptation • Coarse grain: Shut down low priority tasks • Fine grain: Graceful QoS degradation, e.g., • Batched service, poorer resolution, … • Infrastructure Adaptation • Operation at lower speeds (DVFS) • Effective use of low power modes & “width” control. • Workload adaptation always done first
Infrastructure Adaptation • Need a multilevel scheme: individual “assets” up to entire data center • Need both supply & demand side adaptations
Supply Side Adaptation • Supply side limits • Hard caps at higher levels (true limit) vs. “soft” (artificial) caps at lower levels • Limits may be a result of thermal/cooling issues • Load consolidation • An essential part of energy efficient operation • Load consolidation vs. soft capping • Need to address workload adaptation changes as a result of supply increase & decrease
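The hard-cap vs. soft-cap distinction can be illustrated with a small sketch. The slides do not say how soft caps are derived, so the demand-proportional split below is purely an assumed policy, and `split_hard_cap` and the node names are hypothetical.

```python
from typing import Dict


def split_hard_cap(hard_cap_w: float, demand_w: Dict[str, float]) -> Dict[str, float]:
    """Derive per-child soft caps from a parent's hard cap.

    Assumed policy (not from the talk): each child gets its full demand when
    the parent has enough power; otherwise every child is scaled down by the
    same factor so that the soft caps sum to the hard cap.
    """
    total = sum(demand_w.values())
    if total <= 0:
        return {child: 0.0 for child in demand_w}
    scale = min(1.0, hard_cap_w / total)
    return {child: need * scale for child, need in demand_w.items()}


# Example: a 900 W hard cap shared by two racks that together ask for 1000 W.
caps = split_hard_cap(900.0, {"rackA": 600.0, "rackB": 400.0})
```

Because the soft caps are artificial, they can be recomputed whenever the supply or the demand mix changes, while the hard cap at the top stays a true limit.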
Demand Side Adaptation • Adaptation to fluctuating demand • Transactional workload: Migrate queries or app VMs? • Issues w/ combined supply & demand side adaptations • Imbalance: One node squeezed while other has surplus power • Ping-pong Control: Oscillatory migration of workload • Error accumulation down the hierarchy.
A Proposed Algorithm • Unidirectional control • Load migration moves up the hierarchy, from local to global. • Local migrations are temporary & do not trigger changes to “soft” caps on supply. • Target Node selection • Based on bin packing (best-fit decreasing) • Allows for more imbalance, which can be exploited for workload consolidation • Properties • Avoids ping-pong, attempts to minimize imbalance
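Target node selection by best-fit decreasing can be sketched as follows. This is a minimal illustration of the generic bin-packing heuristic named on the slide, not the authors' implementation; the function name, the use of power surplus in watts as bin capacity, and the "escalate if nothing fits" handling are assumptions.

```python
from typing import Dict, List, Tuple


def best_fit_decreasing(
    demand_w: Dict[str, float], surplus_w: Dict[str, float]
) -> Tuple[Dict[str, str], List[str]]:
    """Place migrating workloads on nodes with spare power.

    Workloads are taken in decreasing order of demand; each goes to the node
    whose remaining surplus is the tightest fit. Workloads that fit nowhere
    are returned separately (assumed to be escalated up the hierarchy).
    """
    remaining = dict(surplus_w)
    placement: Dict[str, str] = {}
    unplaced: List[str] = []
    for name, need in sorted(demand_w.items(), key=lambda kv: kv[1], reverse=True):
        # Leftover surplus on every node that can still host this workload.
        fits = [(cap - need, node) for node, cap in remaining.items() if cap >= need]
        if not fits:
            unplaced.append(name)
            continue
        _, target = min(fits)  # smallest leftover = best fit
        placement[name] = target
        remaining[target] -= need
    return placement, unplaced


placement, unplaced = best_fit_decreasing(
    {"a": 50.0, "b": 30.0, "c": 20.0},
    {"n1": 60.0, "n2": 55.0, "n3": 25.0},
)
```

Best fit packs workloads tightly onto a few nodes and leaves the others with large surpluses, which is the kind of imbalance the slide says can be exploited for consolidation.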
Experimental Results • Scenario • 3 levels, 18 identical servers (4+4 + 5+5) • 3 applications, total of 25 app instances • Any app can run on any server • Poisson demand (active power ∝ utilization)
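The demand and power model in this scenario can be sketched roughly as below, reading the slide's power model as active power proportional to utilization. The arrival rate, service time, interval length and peak power are illustrative assumptions; the slides give none of these values, and the helper names are hypothetical.

```python
import math
import random


def poisson_sample(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def utilization(arrivals: int, service_time_s: float, interval_s: float) -> float:
    """Fraction of the interval spent serving requests, capped at 1."""
    return min(1.0, arrivals * service_time_s / interval_s)


def active_power(util: float, peak_active_w: float) -> float:
    """Active power taken as proportional to utilization (clamped to [0, 1])."""
    return peak_active_w * min(max(util, 0.0), 1.0)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
arrivals = poisson_sample(rng, lam=4.0)          # assumed mean of 4 requests/interval
power_w = active_power(utilization(arrivals, 0.1, 1.0), peak_active_w=200.0)
```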
Migration Frequency • Migration drivers: consolidation vs. energy deficiency • Low util → consolidation, high util → energy deficiency • Other characteristics • Migration frequency low in all cases • No ping-pong observed
Thermal Impacts • Additional Issues • Energy consumption limited by thermal/cooling issues, not energy availability • Migrations required to limit temperature • Temperature & power have nonlinear relationship • Need to account for both power & thermal effects
Results w/ Thermal Effects • Imbalanced cooling • Servers 1-14: Ta = 25 °C, servers 15-18: Ta = 40 °C • Temperature limit: 65 °C • Power demand is adjusted by the algorithm to account for higher temperature
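To see why the hotter servers end up with less usable power, consider a deliberately simplified steady-state model, T = Ta + R_th · P. This linear approximation is an assumption made only for illustration (the talk notes that the real temperature/power relationship is nonlinear), and the thermal resistance of 0.25 °C/W is a made-up value.

```python
def thermal_power_limit(ambient_c: float, limit_c: float, r_th_c_per_w: float) -> float:
    """Largest power (W) that keeps T = ambient + r_th * P at or below the limit.

    Linear steady-state approximation; assumed, not the model used in the talk.
    """
    if r_th_c_per_w <= 0:
        raise ValueError("thermal resistance must be positive")
    return max(0.0, (limit_c - ambient_c) / r_th_c_per_w)


def effective_cap(supply_cap_w: float, ambient_c: float,
                  limit_c: float = 65.0, r_th_c_per_w: float = 0.25) -> float:
    """A server's usable power is the tighter of its supply cap and thermal limit."""
    return min(supply_cap_w, thermal_power_limit(ambient_c, limit_c, r_th_c_per_w))


cool = effective_cap(120.0, ambient_c=25.0)  # supply-limited under this model
hot = effective_cap(120.0, ambient_c=40.0)   # thermally limited under this model
```

With the same supply cap, the server at 40 °C ambient is thermally limited while the one at 25 °C is not, so load would have to migrate away from the hot servers even when energy is available.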
Challenges • EAC is about end-to-end control • Network & storage energy also need to be addressed • Network adaptation • More than power mgmt of ports; need consolidation of traffic across ports • Need to deal w/ congestion created by adaptation • Storage adaptation • More than just storage device control; need to consider storage network as well • Putting it all together is hard! • Need effective means of multi-level admission control • Ultimate vision: Integrate client side as well
Conclusions • Need to go beyond energy efficiency • Design devices/systems to minimize life-cycle energy footprint • Creatively adapt to available energy to operate “at the edge” • Ongoing/future work • Coordinated server, network & storage mgmt. • Generalized workload adaptation (rule based?) • Explore tradeoffs between QoS, power savings and admission control performance