
Power-aware Resource Allocation for CPU- and Memory-Intense Internet Services


Presentation Transcript


  1. Power-aware Resource Allocation for CPU- and Memory-Intense Internet Services Vlasia Anagnostopoulou (vlasia@cs.ucsb.edu), Susmit Biswas, Heba Saadeldeen, Ricardo Bianchini, Tao Yang, Diana Franklin, Frederic T. Chong University of California, Santa Barbara First E2DC Workshop 08/05/2012

  2. CPU- and Memory-Intense Internet Services • Latency-bound • Intense computation (i.e., high CPU utilization) • Petascale data (e.g., MapReduce, Hadoop, …)

  3. Datacenter clusters

  4. Datacenter cluster operation

  5. Challenges • Standard middleware algorithms are inefficient for CPU- and memory-intense internet services • Resource allocation operates at a fine granularity, but is oblivious of the SLA • Power management is SLA-aware, but is driven only by the CPU and is coarse-grained • Request distribution does not operate at a resource granularity

  6. Overview of solution • Standard middleware: Resource Allocation, Power Management, Request Distribution • Optimized middleware: Power-aware Resource Allocation for CPU and memory, Adjusted Request Distribution • SLA-aware and fine-grained • Two steps: 1) configure the states of the servers (basic power-aware resource allocation); 2) allocate resources (CPU and memory) to the servers

  7. Contents • Introduction • Power-aware Resource Allocation • Basic • With Support for Multiple Applications • Adjusted Request Distribution • Methodology • Experiments • Conclusion

  8. Basic Power-aware Resource Allocation • Configure server states: active, off, or low-power • Problem: memory on off/low-power servers becomes inaccessible, yet internet services have high memory demand (for caching) • Solution: use a memory-active, low-power state (barely-alive) • Memory stays on • The server is not operational, but its memory can be accessed remotely • The memory contributes to the global cache

  9. Details of Barely-alive state

  10. Basic Power-aware Resource Allocation • Calculations: • Active servers to service the load: N_cpu_act = Load_demand / Cpu_capacity • Memory-active servers (active or barely-alive) to satisfy the memory demand: N_mem_act = Memory_demand / Mem_capacity • Configure to maximize energy savings, or to maximize memory allocation
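
A minimal sketch of these two sizing formulas, assuming ceiling rounding and illustrative names (load_demand, cpu_capacity, etc.) that are not from the talk:

    import math

    def n_cpu_active(load_demand, cpu_capacity):
        # Active servers needed to serve the offered load (N_cpu_act).
        return math.ceil(load_demand / cpu_capacity)

    def n_mem_active(memory_demand, mem_capacity):
        # Memory-active servers -- active or barely-alive -- needed to
        # cover the target cache size (N_mem_act).
        return math.ceil(memory_demand / mem_capacity)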

  11. Example • N = 5 servers • CPU capacity = 1,000 conn./server • Memory capacity = 1GB/server • Load = 3,000 conn. • Target memory allocation = 4GB • Maximize energy savings, or • Maximize memory allocation (memory usage: 0.8GB/server) • How to control the memory allocation?
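
Plugging the example's numbers into the sketch above (the exact server-state assignments are an assumption; the slide leaves them to its figures):

    n_cpu_active(3000, 1000)   # => 3 active servers
    n_mem_active(4, 1)         # => 4 memory-active servers

An energy-maximizing configuration would therefore plausibly keep 3 servers active, 1 barely-alive, and 1 off (4 x 1GB just meets the 4GB target), while a memory-maximizing configuration would keep all 5 servers' memory on, so the 4GB target spreads uniformly to 0.8GB per server.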

  12. Memory Allocation for SLA • Two objectives: 1) allocate memory to meet the SLA; 2) share memory among services with SLA guarantees • Must be fair; should accept priorities • Guarantee a minimum performance • Characteristics: • Uniform allocation per server (to avoid imbalance) • SLA-aware memory-performance monitoring capability

  13. Memory allocation for SLA • Utilize the stack algorithm [Mattson] • Measures the contribution of memory size to the hit rate • The hit rate is used as a proxy for performance • Server level: calculate the allocation for the target hit rate; attach the SLA mapping • Cluster level: calculate the average size for the target hit rate • How to allocate memory when capacity-constrained?
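
A hedged sketch of how the referenced stack algorithm [Mattson] can produce the hit-rate-versus-memory-size curve and, from it, the allocation for a target hit rate; block-granularity LRU and the trace format are assumptions for illustration, not details from the talk:

    def hit_rate_curve(trace):
        # One pass over the reference trace yields the hit rate for every
        # cache size at once (LRU inclusion property).
        stack = []          # most-recently-used block at the end
        dist_hist = {}      # stack distance -> reference count
        for block in trace:
            if block in stack:
                # Distance = depth of the block in the LRU stack (1 = MRU).
                d = len(stack) - stack.index(block)
                dist_hist[d] = dist_hist.get(d, 0) + 1
                stack.remove(block)
            stack.append(block)

        total = len(trace)
        hits = 0
        curve = {}          # cache size (in blocks) -> predicted hit rate
        for size in range(1, len(stack) + 1):
            hits += dist_hist.get(size, 0)
            curve[size] = hits / total
        return curve

    def alloc_for_target(curve, target_hit_rate):
        # Smallest cache size whose predicted hit rate meets the SLA target.
        for size in sorted(curve):
            if curve[size] >= target_hit_rate:
                return size
        return None         # target unreachable with this trace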

  14. SLA/Memory Sharing • Aggregate metric of performance: the sum of allocations that yield performance closest to the SLA • Linear optimization problem to maximize aggregate performance: • At each step, allocate memory so as to minimize the aggregate distance to the SLA allocation • Subject to the memory-capacity constraint • Guarantee the minimum SLA for each app • Example: two apps {app1, app2} with target SLA {#2, #2}; successive allocation steps drive each app's dist_to_SLA_alloc from ∞ to 1 to 0
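
A rough sketch of this sharing step, assuming per-app target allocations derived from the hit-rate curves above; the chunk size, the distance metric, and the handling of the minimum guarantee are illustrative assumptions rather than the talk's exact formulation:

    def share_memory(sla_alloc, min_alloc, capacity, chunk=1):
        # sla_alloc / min_alloc: per-app target and minimum allocations (GB).
        alloc = {}
        # 1) Guarantee each app its minimum allocation first.
        for app, m in min_alloc.items():
            alloc[app] = min(m, capacity)
            capacity -= alloc[app]

        # 2) Greedily shrink the aggregate distance to the SLA allocations.
        while capacity > 0:
            # App with the largest remaining gap to its SLA target.
            app = max(sla_alloc, key=lambda a: sla_alloc[a] - alloc.get(a, 0))
            gap = sla_alloc[app] - alloc.get(app, 0)
            if gap <= 0:
                break                  # every app has reached its target
            give = min(chunk, gap, capacity)
            alloc[app] = alloc.get(app, 0) + give
            capacity -= give
        return alloc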

  15. Request Distribution Processing…

  16. Adjusted Request Distribution Processing…

  17. Contents • Introduction • Power-aware Resource Allocation • Basic • With Support for Multiple Applications • Adjusted Request Distribution • Methodology • Simulator • Traces • Experiments • Conclusion

  18. Methodology • Datacenter-cluster simulator: • 1 rack • Trace-based functional simulator • Simulates all standard and proposed middleware algorithms • Traces: • Internet-search “snippet” generator

  19. Contents • Introduction • Power-aware Resource Allocation • Basic • With Support for Multiple Applications • Adjusted Request Distribution • Methodology • Simulator • Traces • Experiments • Basic Algorithm • Shared Cluster • Conclusion

  20. Experiments – Basic Algorithm • Evaluate various configuration objectives: • Barely-alive: maximize memory allocation; Mixed: maximize energy savings • Fix the SLA and evaluate energy savings only; also evaluate the residual memory • SLA #1, #2, #3: response-time degradation of 1-2%, 2-3%, 3-4% • Aggressiveness of consolidation: 50, 70, 85%

  21. Results – basic algorithm • The Mixed system has the highest energy savings: up to 42% (24% over On/Off) • Barely-alive (BA): up to 34% (20% over On/Off)

  22. Results – basic algorithm • The Mixed system is the most stable • In the barely-alive system, savings depend on the SLA level; the consolidation-aggressiveness parameter can be pushed for more savings • On/Off system savings are influenced by both parameters and degrade significantly at high SLA levels

  23. Results – Basic algorithm • BA: up to 7.5GB of extra memory, which can be allocated to another application, transitioned to a low-power state, etc.

  24. Results – Cluster Sharing

  25. Results – Cluster sharing

  26. Contents • Introduction • Power-aware Resource Allocation • Basic • With Support for Multiple Applications • Adjusted Request Distribution • Methodology • Simulator • Traces • Experiments • Basic Algorithm • Shared Cluster • Conclusion

  27. Conclusion • Combining power management and resource allocation => power-aware resource allocation • SLA-driven, fine-grained management of datacenter clusters • Performance guarantees + energy savings • Flexibility for different datacenter optimization scenarios • Achieves deep energy savings, or extracts more memory utility from the cluster • Holistic design of middleware software

  28. Thank you for your attention!!! Questions? Contact: vlasia@cs.ucsb.edu URL: www.cs.ucsb.edu/~vlasia
