Challenges in Distributed Energy Adaptive Computing

K. Kant

NSF and GMU

K. Kant, Modeling Challenges in Distributed Energy Adaptive Computing


Information & Communication Technology (ICT) has a problem

Performance centric → Energy & sustainability centric

How do we get there?

ICT Power Growth until 2020

  • Increase in spite of power-efficient designs

    • Clients: 8X in number, 3X in power

    • Data Centers: > 2X increase

    • Network: 3X increase

[Figure: ICT power by segment. Labels: Network; Clients; Data Center; Transmission, conversion & distribution]

Current State: Unsustainable Computing

Data Center Infrastructure

  • Resource intensive: Water, cabling, metal, …

  • ~50% power wasted before getting to racks

Distribution Infrastructure

~10% distribution loss + High carbon impact

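[Figure: power distribution chain feeding the IT load, backed by a 2.5 MW generator burning ~180 gallons/hour. Voltage levels shown: 115 kV, 13.2 kV, 480 V (UPS), 208 V. Per-stage annotations: 6% loss (94% efficient); 1.0% loss (99.0% efficient); 0.3% loss (99.7% efficient); 0.5% loss (99.5% efficient); ~1% loss in switchgear and conductors.]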
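The per-stage efficiencies on this slide compound multiplicatively when the stages sit in series. The Python sketch below assumes that the four annotated efficiencies (94%, 99.0%, 99.7%, 99.5%) and the ~1% switchgear loss all lie on one series path; the original figure does not survive in this transcript, so that arrangement is an assumption, and the function names are illustrative.

```python
from math import prod

# Per-stage efficiencies as annotated on the slide. Which physical stage
# each number belongs to is not recoverable here (assumption: all in series).
STAGE_EFFICIENCY = [0.94, 0.99, 0.997, 0.995]
SWITCHGEAR_EFFICIENCY = 0.99  # "~1% loss in switch gear and conductors"


def end_to_end_efficiency(stages):
    """Efficiencies of stages connected in series multiply."""
    return prod(stages)


def delivered_power_kw(input_kw, stages):
    """Power that reaches the load for a given input power."""
    return input_kw * end_to_end_efficiency(stages)


chain = STAGE_EFFICIENCY + [SWITCHGEAR_EFFICIENCY]
loss_fraction = 1.0 - end_to_end_efficiency(chain)
print(f"cumulative distribution loss: {loss_fraction:.1%}")
```

Under these assumptions the cumulative loss comes to about 8.6%, the same order as the ~10% quoted on the slide.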

~50% Rack Power Wasted

Sustainable Computing

Renewable Energy Push

  • Limit energy draw from grid

    • Less infrastructure

    • Fewer losses

    • but variable supply

Need better power adaptability

High-Temperature DCs

  • Chiller-less operation

    • Less energy/materials, but space inefficient

  • High temperature operation

    • Smaller T_outlet − T_inlet

    • More throttling

    • More failure prone (?)

Need smarter thermal adaptability

Overdesign

  • Overdesign is the norm today

    • Huge power supplies, fans, heat sinks, server cases, high rack capacity, UPS capacity, …

    • Engineered for worst case → Rarely encountered

    • Huge power wastage, waste of materials, energy, …

  • What if we right-size everything?

    • Highly energy efficient but need smarter control

Better energy adaptability to deal w/ frugal design


Energy Adaptive Computing

  • EAC strives for dynamic end-to-end adjustment through

    • Workload adaptation for graceful QoS degradation under energy limitations

    • Infrastructure adaptation to cope with temporary energy deficiencies.

  • Requires coordinated power/thermal mgmt of computation, network & storage.

  • Enhances sustainability of IT infrastructure


EAC Instances

Client-server EAC

  • Transparently adapt to client energy states

    • State = {on-AC, normal, low-battery, …}

    • Service contract Ci = {setup QoS, operational QoS}

  • Adaptation Challenges

    • Communicating & enforcing contracts.

    • Group adaptation of clients forced by network/servers ?
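One way to picture the state-to-contract mapping is as a lookup table keyed by client energy state. The Python sketch below is illustrative only: the three states come from the slide, but the QoS values, the `ServiceContract` fields as strings, and the `contract_for` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class EnergyState(Enum):
    ON_AC = "on-AC"
    NORMAL = "normal"
    LOW_BATTERY = "low-battery"


@dataclass(frozen=True)
class ServiceContract:
    setup_qos: str        # QoS promised while the session is set up
    operational_qos: str  # QoS promised during normal operation


# Hypothetical contract table; real contracts would be negotiated.
CONTRACTS = {
    EnergyState.ON_AC: ServiceContract("full handshake", "1080p video"),
    EnergyState.NORMAL: ServiceContract("full handshake", "720p video"),
    EnergyState.LOW_BATTERY: ServiceContract("abbreviated handshake", "audio only"),
}


def contract_for(state: EnergyState) -> ServiceContract:
    """Return the contract the server should enforce for a client state."""
    return CONTRACTS[state]


print(contract_for(EnergyState.LOW_BATTERY))
```

Keeping the table on the server side is one possible answer to the "communicating & enforcing" challenge: the client only has to report its state, not the contract itself.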

Cluster EAC

  • Adaptation to intra & inter-DC limits

    • Multi-level: Server, rack & DC levels

  • Adaptation Challenges

    • Estimate & collect power deficits/surplus at multiple levels

    • Coordination across large range of devices

      • Location based services

      • Coordination across levels

    • Simultaneously handle client-server loop
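Collecting power deficits and surpluses at multiple levels can be sketched as a tree walk in which each rack or data center reports the sum over its children. The `Node` class, its field names, and the wattages below are hypothetical; the talk does not specify a data structure.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    budget_w: float = 0.0   # power available to a leaf (server)
    demand_w: float = 0.0   # power the leaf currently wants
    children: list["Node"] = field(default_factory=list)

    def surplus_w(self) -> float:
        """Positive means surplus, negative means deficit, for the subtree."""
        if not self.children:
            return self.budget_w - self.demand_w
        return sum(child.surplus_w() for child in self.children)


def survey(node, depth=0):
    """Yield (depth, name, surplus) for every node, top down."""
    yield depth, node.name, node.surplus_w()
    for child in node.children:
        yield from survey(child, depth + 1)


dc = Node("dc", children=[
    Node("rack1", children=[Node("s1", 300, 350), Node("s2", 300, 200)]),
    Node("rack2", children=[Node("s3", 300, 340), Node("s4", 300, 320)]),
])

for depth, name, surplus in survey(dc):
    print("  " * depth + f"{name}: {surplus:+.0f} W")
```

In this example rack1 can absorb its own deficit locally, while rack2 cannot and would need help from a higher level.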

P2P EAC

  • Adaptation based on “available energy”

    • Content: video resolution, audio coding, …

    • Network: modulate wireless radio usage (?)

    • Energy proportional use of peer resources

    • Energy driven content replication & reorganization

  • Adaptation Challenges

    • Satisfying QoS ?

    • Balancing src/dest usage vs. relay node energy usage ?
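"Energy proportional use of peer resources" can be read as giving each peer a share of the work in proportion to its available energy. The Python sketch below follows that reading; the function name, the unit of work (chunks), and the largest-remainder rounding are assumptions, not something the talk prescribes.

```python
def relay_shares(available_energy_j, total_chunks):
    """Split total_chunks among peers in proportion to available energy.

    Uses largest-remainder rounding so the integer shares add up exactly.
    """
    total = sum(available_energy_j.values())
    if total <= 0:
        raise ValueError("no peer has energy available")
    raw = {p: total_chunks * e / total for p, e in available_energy_j.items()}
    shares = {p: int(r) for p, r in raw.items()}
    leftover = total_chunks - sum(shares.values())
    # Hand the remaining chunks to the peers with the largest fractional part.
    by_remainder = sorted(raw, key=lambda p: raw[p] - shares[p], reverse=True)
    for peer in by_remainder[:leftover]:
        shares[peer] += 1
    return shares


print(relay_shares({"a": 50.0, "b": 30.0, "c": 20.0}, 10))
```

A peer that is low on energy still participates, but relays less, which is one way to trade source/destination usage against relay-node usage.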

    Challenges: Some Specific Issues

    Power Estimation Challenges

    • Notion of effective power?

      • Additive relationship: Workload → power

      • Why is this hard? Interference

    • Available power

      • Determined by power, thermal & perhaps other issues (noise).

      • Required at multiple levels: facility, enclosure, machine, …
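A small Python sketch of the two notions on this slide, under stated assumptions: the additive model estimates machine power as idle power plus a per-workload contribution, the gap between that estimate and a measurement is attributed to interference, and available power is taken as the tightest of the applicable limits. All names and numbers are hypothetical.

```python
def additive_estimate_w(idle_w, per_workload_w):
    """Additive model: idle power plus each workload's standalone power."""
    return idle_w + sum(per_workload_w.values())


def interference_w(measured_w, idle_w, per_workload_w):
    """Power not explained by the additive model (e.g. cache or bus contention)."""
    return measured_w - additive_estimate_w(idle_w, per_workload_w)


def available_power_w(*limits_w):
    """Available power is bounded by the tightest limit (supply, thermal, ...)."""
    return min(limits_w)


workloads = {"web": 40.0, "db": 60.0}  # hypothetical standalone measurements
print(additive_estimate_w(100.0, workloads))
print(interference_w(215.0, 100.0, workloads))
print(available_power_w(400.0, 350.0))  # supply limit vs. thermal limit
```

The same `available_power_w` calculation would be repeated at each level (facility, enclosure, machine) with that level's own limits.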

    Network Role in EAC

    • Energy Adaptation

      • Aggressive control of switch/router ports

        • Speed, state & width controls

      • Traffic consolidation across paths

    • Adaptation induced congestion

      • Propagation (e.g., ECN, EBCN) & response

        • Computation – communication tradeoff ?

        • Redirection ?

    • Network protocol support for adaptation?

    Other Issues

    • EAC Security

      • Attacks on power sources

      • Energy Attacks on IT, e.g.,

        • Demanding too much, cyclic demands, …

    • Storage adaptation

      • Storage devices, controllers & network.

    • Coordinated end to end control is hard!

    • Formal models to understand impact of energy adaptation.

    Energy Adaptation in Data Centers

    Adaptation Methods

    • Workload Adaptation

      • Coarse grain: Shut down low priority tasks

      • Fine grain: Graceful QoS degradation, e.g.,

        • Batched service, poorer resolution, …

    • Infrastructure Adaptation

      • Operation at lower speeds (DVFS)

      • Effective use of low power modes & “width” control.

    • Workload adaptation always done first
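The "workload first, infrastructure second" ordering can be sketched as a three-step procedure. The Python below is a simplified illustration: the `Task` fields, the choice to try fine-grain degradation before coarse-grain shutdown, the priority threshold, and the assumption that power scales linearly with DVFS speed are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int            # larger = more important (assumption)
    power_w: float           # power at full QoS
    degraded_power_w: float  # power at reduced QoS


def adapt(tasks, budget_w, min_priority_to_keep=2):
    """Return (actions, resulting_power_w) for a given power budget."""
    power = {t.name: t.power_w for t in tasks}
    actions = []
    by_priority = sorted(tasks, key=lambda t: t.priority)

    # Workload adaptation, fine grain: degrade QoS, least important first.
    for t in by_priority:
        if sum(power.values()) <= budget_w:
            break
        power[t.name] = t.degraded_power_w
        actions.append(("degrade", t.name))

    # Workload adaptation, coarse grain: shut down low-priority tasks.
    for t in by_priority:
        if sum(power.values()) <= budget_w:
            break
        if t.priority < min_priority_to_keep:
            power[t.name] = 0.0
            actions.append(("shutdown", t.name))

    # Infrastructure adaptation last: slow everything down via DVFS.
    total = sum(power.values())
    if total > budget_w:
        scale = budget_w / total  # crude: assumes power is linear in speed
        actions.append(("dvfs", round(scale, 3)))
        total = budget_w
    return actions, total


tasks = [
    Task("batch", 1, 100.0, 60.0),
    Task("search", 2, 150.0, 120.0),
    Task("checkout", 3, 200.0, 180.0),
]
print(adapt(tasks, 380.0))
print(adapt(tasks, 250.0))
```

With a 380 W budget, degrading the two least important tasks is enough; with 250 W, the sketch also shuts down the batch task and then falls back to DVFS.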


    Infrastructure Adaptation

    • Need a multilevel scheme:

      • Individual “assets” up to entire data center

    • Need both supply & demand side adaptations


    Supply Side Adaptation

    • Supply side Limits

      • Hard caps at higher levels (true limit) vs. “soft” (artificial) caps at lower levels.

      • Limits may be a result of thermal/cooling issues.

    • Load consolidation

      • An essential part of energy efficient operation

      • Load consolidation vs. soft capping

    • Need to address workload adaptation changes as a result of supply increase & decrease.
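The relationship between a hard cap at a higher level and soft caps at lower levels can be illustrated by splitting the parent's cap among its children. The Python sketch below uses demand-proportional splitting, which is only one possible policy and is not stated in the talk; the names and wattages are hypothetical.

```python
def split_cap(parent_cap_w, child_demand_w):
    """Derive soft caps for children from the parent's cap.

    If total demand fits under the parent cap, each child gets its demand.
    Otherwise every child is scaled back by the same factor.
    """
    total = sum(child_demand_w.values())
    if total <= parent_cap_w:
        return dict(child_demand_w)
    scale = parent_cap_w / total
    return {name: demand * scale for name, demand in child_demand_w.items()}


# Hard cap at the data-center level, soft caps at the rack level.
rack_caps = split_cap(1000.0, {"rackA": 600.0, "rackB": 400.0, "rackC": 250.0})
print(rack_caps)

# The same rule applied one level down, inside rackA.
server_caps = split_cap(rack_caps["rackA"], {"s1": 300.0, "s2": 300.0})
print(server_caps)
```

Because the lower-level caps are derived rather than physical, they can be recomputed whenever supply rises or falls, which is what makes them "soft".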


    Demand Side Adaptation

    • Adaptation to fluctuating demand

      • Transactional workload: Migrate queries or app VMs?

    • Issues w/ combined supply & demand side adaptations

      • Imbalance: One node squeezed while other has surplus power

      • Ping-pong Control: Oscillatory migration of workload

      • Error accumulation down the hierarchy.


    A Proposed Algorithm

    • Unidirectional control

      • Load migration moves up the hierarchy, from local to global.

      • Local migrations are temporary & do not trigger changes to “soft” caps on supply.

    • Target Node selection

      • Based on bin packing (best-fit decreasing)

      • Allows for more imbalance, which can be exploited for workload consolidation

    • Properties

      • Avoids ping-pong, attempts to minimize imbalance
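Target node selection by best-fit decreasing can be sketched as follows: sort the loads to be migrated from largest to smallest, and place each on the node whose remaining surplus is the smallest that still fits. This Python version is a generic best-fit-decreasing illustration, not the authors' implementation; loads that fit nowhere are returned so that a caller could escalate them to the next level up, in line with the unidirectional control described above.

```python
def best_fit_decreasing(loads_w, surplus_w):
    """Place migrating loads on nodes with spare power.

    loads_w:   {app: power the app needs}
    surplus_w: {node: spare power on that node}
    Returns (placement, unplaced, remaining_surplus).
    """
    remaining = dict(surplus_w)
    placement, unplaced = {}, []
    largest_first = sorted(loads_w.items(), key=lambda kv: kv[1], reverse=True)
    for app, load in largest_first:
        candidates = [n for n, spare in remaining.items() if spare >= load]
        if not candidates:
            unplaced.append(app)  # no local fit: escalate to a higher level
            continue
        # Best fit: the node left with the least spare power after placement.
        target = min(candidates, key=lambda n: remaining[n] - load)
        remaining[target] -= load
        placement[app] = target
    return placement, unplaced, remaining


placement, unplaced, remaining = best_fit_decreasing(
    {"app1": 50.0, "app2": 30.0, "app3": 20.0},
    {"s1": 60.0, "s2": 55.0, "s3": 40.0},
)
print(placement, unplaced, remaining)
```

Best fit packs loads tightly onto a few nodes and leaves others with large spare capacity, which is the imbalance the slide says can be exploited for consolidation.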


    Experimental Results

    • Scenario

      • 3 levels, 18 identical servers (4+4 + 5+5)

      • 3 applications, total of 25 app instances

      • Any app can run on any server

      • Demand: Poisson (active power ∝ utilization)
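A demand model of this kind can be sketched with the standard library alone: draw a Poisson number of requests per interval, convert it to a utilization, and take active power as proportional to utilization. The capacity, peak power, and arrival rate below are hypothetical, and the sampler is Knuth's classic multiplication method rather than anything specified in the talk.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def active_power_w(requests, capacity, peak_active_w):
    """Active power proportional to utilization, saturating at full load."""
    utilization = min(1.0, requests / capacity)
    return peak_active_w * utilization


rng = random.Random(42)  # fixed seed so runs are repeatable
samples = [poisson(rng, 20.0) for _ in range(2000)]
mean_requests = sum(samples) / len(samples)
print(f"mean requests per interval: {mean_requests:.2f}")
print(active_power_w(50, 100, 200.0))
```

Knuth's method is fine for small rates like this one; for large rates a library sampler would be a better choice.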


    Migration Frequency

    • Migration drivers: consolidation vs. energy deficiency

      • Low util Consolidation, High util Energy deficiency

    • Other characteristics

      • Migration frequency low in all cases

      • No ping-pong observed


    Thermal Impacts

    • Additional Issues

      • Energy consumption limited by thermal/cooling issues, not energy availability

      • Migrations required to limit temperature

    • Temperature & power have nonlinear relationship

    • Need to account for both power & thermal effects
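To see how a temperature limit turns into a power limit, the Python sketch below assumes a simple first-order thermal model, dT/dt = c1·P − c2·(T − Ta). The talk does not state which model it uses, so the model, the constants `c1` and `c2`, and the function name are all assumptions. Solving the model for the constant power that brings the device exactly to the limit after a time window gives the formula in the code; the dependence on time and current temperature is exponential rather than linear.

```python
import math


def power_limit_w(t_now, t_ambient, t_limit, c1, c2, window_s):
    """Largest constant power keeping T <= t_limit at the end of the window.

    Assumed model: dT/dt = c1 * P - c2 * (T - t_ambient)
      c1 in degC per (W * s), c2 in 1/s (hypothetical constants).
    """
    decay = math.exp(-c2 * window_s)
    headroom = t_limit - t_ambient - (t_now - t_ambient) * decay
    return (c2 / c1) * headroom / (1.0 - decay)


C1, C2 = 0.01, 0.05  # hypothetical; steady-state rise is C1 / C2 = 0.2 degC/W

cool = power_limit_w(45.0, 25.0, 65.0, C1, C2, window_s=600.0)
hot = power_limit_w(45.0, 40.0, 65.0, C1, C2, window_s=600.0)
print(f"limit at 25 degC ambient: {cool:.1f} W")
print(f"limit at 40 degC ambient: {hot:.1f} W")
```

With a long window the limit approaches the steady-state value (c2/c1)·(T_limit − Ta), so a server in a hotter zone gets a visibly smaller power budget for the same temperature limit.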


    Results w/ Thermal Effects

    • Imbalanced cooling

      • Servers 1-14: Ta = 25 °C; servers 15-18: Ta = 40 °C

      • Temperature limit: 65 °C

    • Power demand is adjusted by the algorithm to account for higher temperature


    Conclusions

    • Need to go beyond energy efficiency

      • Design devices/systems to minimize life-cycle energy footprint

      • Creatively adapt to available energy to operate “at the edge”

    • Ongoing/future work

      • Coordinated server, network & storage mgmt.

      • Explore tradeoffs between QoS, power savings and admission control performance


    Thank you!

    Power Inefficiencies

    [Figure: server power delivery path. Labels: rack supply, 280V, server PSU, ±12 V and ±5 V rails, voltage regulators, and loads (CPU, DRAM & memory controller, fans, storage, adapters). Efficiency annotations: 90-95%, 70-90%, 95%. Callouts: wasted leakage & clock power; idle wasted power.]

    Operating Regimes

    So, What’s the Problem?

    [Figure labels: Client (×2), Network, Core Network, DC1 (storage, Server1), DC2 (storage, Server2)]

    • Local constraints & controls → end-to-end impacts

      • DC to DC load shift

        • Service disruption & post-shift impact

      • Client request to alter content

        • Less or more work for server

    • Potential conflicting controls