
MemScale: Active Low-Power Modes for Main Memory

Qingyuan Deng, David Meisner*, Luiz Ramos, Thomas F. Wenisch*, and Ricardo Bianchini. Rutgers University; *University of Michigan.


Presentation Transcript


  1. MemScale: Active Low-Power Modes for Main Memory. Qingyuan Deng, David Meisner*, Luiz Ramos, Thomas F. Wenisch*, and Ricardo Bianchini. Rutgers University; *University of Michigan.

  2. Server memory power challenges. Figure: power consumption of a Google server [Barroso & Hoelzle’07]; axes: power (% of peak) vs. compute load (%). • DRAM power varies little with load • Memory power represents 30-40% of total power for typical loads • The true fraction is larger, since memory controller power is not included

  3. Improving memory energy efficiency. Observation: memory bandwidth is rarely fully utilized [Meisner’11], so energy can be saved during periods of light and moderate load. Previous approaches: • Leveraging DRAM idle low-power states [Lebeck’00][Delaluz’01][Li’04][Diniz’07]… • Rank sub-setting and DRAM reorganization [Ahn’09][Udipi’10][Zheng’10]… • Memory controller power is typically not considered. Active low-power modes are needed to save energy when memory is underutilized. Frequency has a greater impact on bandwidth than on latency.

  4. MemScale: Active low-power modes for memory. Goal: dynamically scale memory frequency to conserve energy. Hardware mechanism: • Frequency scaling (DFS) of the channels, DIMMs, and DRAM devices • Voltage & frequency scaling (DVFS) of the memory controller. Key challenge: conserving significant energy while meeting performance constraints. Approach: • Online profiling to estimate performance and bandwidth demand • Epoch-based modeling and control to meet performance constraints. Main result: system energy savings of 18% with an average performance loss of 4%.

  5. Outline: Motivation and overview; Background on memory systems; MemScale: DVFS for the memory system; Results; Conclusions

  6. Impact of frequency scaling on memory latency. Figure: timeline of one memory request (Req, MC, ACT, CL, Burst, PRE, Reply) at 800 MHz and at 400 MHz. • For DDR3 DRAM, scaling frequency from 800 MHz to 400 MHz cuts bandwidth by 50% but raises latency by only 10%
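The asymmetry on this slide can be illustrated with a small back-of-the-envelope model. This is only a sketch under stated assumptions, not the authors' model: it assumes a 64-bit (8-byte) DDR channel, a burst of 8 transfers (4 bus clocks), and a hypothetical 45 ns of latency that does not depend on bus frequency (ACT, CL, and other fixed-time components). The 45 ns figure is chosen for illustration so that the numbers line up with the slide.

```python
def peak_bandwidth_gbps(freq_mhz, bus_bytes=8):
    """Peak DDR channel bandwidth in GB/s (two transfers per bus clock)."""
    return 2 * freq_mhz * 1e6 * bus_bytes / 1e9


def access_latency_ns(freq_mhz, fixed_ns=45.0, burst_cycles=4):
    """Latency = frequency-independent part + data burst of `burst_cycles` bus clocks.

    fixed_ns is a hypothetical value chosen for illustration only.
    """
    cycle_ns = 1e3 / freq_mhz
    return fixed_ns + burst_cycles * cycle_ns


bw_hi, bw_lo = peak_bandwidth_gbps(800), peak_bandwidth_gbps(400)
lat_hi, lat_lo = access_latency_ns(800), access_latency_ns(400)

print(f"bandwidth: {bw_hi:.1f} -> {bw_lo:.1f} GB/s ({bw_lo / bw_hi - 1:+.0%})")
print(f"latency:   {lat_hi:.1f} -> {lat_lo:.1f} ns ({lat_lo / lat_hi - 1:+.0%})")
```

Only the burst term stretches when the clock slows, which is why halving the frequency halves peak bandwidth while adding a much smaller fraction to the latency of an individual access.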

  7. Opportunity for MemScale. Figure legend: Background (clock tree, I/O driver, register, PLL, DLL, refresh, others); Dynamic (read, write, termination); MC (memory controller). • Effects of lower frequency on power: • Lowers background power linearly (~f) • Lowers MC power by a cubic factor (~f^3)
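The two scaling rules on this slide can be turned into a quick estimate. The sketch below assumes that background power scales as f, that memory controller power scales as f^3 (the usual DVFS argument: power goes as V^2·f and voltage is lowered roughly in proportion to f), and that dynamic power is left unchanged because it follows the access rate rather than the clock. The wattages are hypothetical placeholders, not values from the talk.

```python
def memory_power_w(f_ratio, background_w=20.0, mc_w=10.0, dynamic_w=10.0):
    """Memory subsystem power at f_ratio = f_new / f_max.

    background_w, mc_w and dynamic_w are hypothetical full-frequency values.
    """
    background = background_w * f_ratio   # ~f (frequency scaling only)
    mc = mc_w * f_ratio ** 3              # ~f^3 (voltage and frequency scaling)
    return background + mc + dynamic_w    # dynamic part assumed unchanged


full = memory_power_w(1.0)
half = memory_power_w(0.5)
print(f"full frequency: {full:.2f} W, half frequency: {half:.2f} W")
```

With these placeholder numbers, halving the frequency cuts the memory controller's share by a factor of eight, while the background share is only halved.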

  8. Outline: Motivation and overview; Background on memory systems; MemScale: DVFS for the memory system; Results; Conclusions

  9. MemScale design. Goal: minimize energy under a user-specified slowdown bound. Approach: OS-managed, epoch-based memory frequency tuning. Each epoch (e.g., an OS quantum): • Profile performance and bandwidth demand: new performance counters track memory latency and queue occupancies • Estimate performance and energy at each frequency: models estimate queuing delays and system energy • Re-lock to the best frequency; continue tracking performance • Slack: the delta between estimated and observed performance; carry slack forward to the performance target for the next epoch
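The per-epoch decision described on this slide can be sketched as a small selection routine. This is an illustration, not the authors' implementation: the `Estimate` record, the `choose_frequency` function, and the sample numbers are all hypothetical, and slack is expressed here as a fraction of the slowdown budget.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    freq_mhz: int
    cpi: float       # model-predicted CPI at this memory frequency
    energy_j: float  # model-predicted full-system energy for the next epoch


def choose_frequency(estimates, cpi_at_max, max_slowdown, slack=0.0):
    """Pick the lowest-energy frequency whose predicted slowdown fits the bound.

    Positive slack loosens the bound for this epoch; negative slack tightens it.
    """
    allowed = max_slowdown + slack
    feasible = [e for e in estimates if e.cpi / cpi_at_max - 1.0 <= allowed]
    if not feasible:
        # Nothing meets the bound: fall back to the highest frequency.
        return max(estimates, key=lambda e: e.freq_mhz)
    return min(feasible, key=lambda e: e.energy_j)


# Hypothetical model outputs for one epoch.
estimates = [
    Estimate(800, cpi=1.00, energy_j=10.0),
    Estimate(600, cpi=1.03, energy_j=9.0),
    Estimate(400, cpi=1.08, energy_j=8.2),
    Estimate(200, cpi=1.25, energy_j=8.0),
]

print(choose_frequency(estimates, cpi_at_max=1.00, max_slowdown=0.10).freq_mhz)
print(choose_frequency(estimates, cpi_at_max=1.00, max_slowdown=0.10, slack=-0.05).freq_mhz)
```

In this example the 10% bound admits 400 MHz, while an epoch that starts 5 percentage points behind target is limited to 600 MHz.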

  10. Frequency and slack management. Figure: timeline over Epochs 1-4 showing actual vs. target CPU performance (positive and negative slack) and the high or low frequency chosen for the MC, bus, and DRAM. Each epoch starts with profiling, then calculates slack vs. target and estimates performance/energy via the models.
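One plausible reading of the slack bookkeeping in this figure is a running balance of time gained or lost against the target. The sketch below is an assumption about how this could be tracked, not the paper's exact formulation; `update_slack` and the epoch timings are hypothetical.

```python
def update_slack(slack_s, time_at_max_s, actual_time_s, max_slowdown):
    """Carry slack forward: time allowed by the bound minus time actually taken.

    time_at_max_s is the estimated epoch time had memory run at full frequency.
    """
    target_s = time_at_max_s * (1.0 + max_slowdown)
    return slack_s + (target_s - actual_time_s)


# Four hypothetical epochs: each would take 1.0 s at full frequency, 10% bound.
actual_times = [1.05, 1.15, 1.08, 1.10]
slack = 0.0
history = []
for t in actual_times:
    slack = update_slack(slack, time_at_max_s=1.0, actual_time_s=t, max_slowdown=0.10)
    history.append(slack)

print([round(s, 3) for s in history])
```

An epoch that finishes ahead of target banks positive slack that a later epoch can spend at a lower frequency; an epoch that overshoots the target drives the balance down, so the following epoch has to run faster.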

  11. Modeling of performance and energy • New performance counters enable estimates of: • Level of contention (bank and bus) • Energy consumption • CPI of each application • Average memory latency • Performance slack • Estimate full-system energy
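To show how an average memory latency could feed a per-application CPI estimate, here is a deliberately simple first-order model. It is an assumption for illustration, not the model from the paper: it ignores queuing and memory-level parallelism, and every parameter value is hypothetical.

```python
def estimate_cpi(core_cpi, misses_per_instr, mem_latency_ns, cpu_ghz):
    """First-order CPI: compute part plus stall cycles spent on memory accesses."""
    stall_cycles_per_miss = mem_latency_ns * cpu_ghz  # ns x cycles/ns
    return core_cpi + misses_per_instr * stall_cycles_per_miss


# Hypothetical application: 5 misses per 1000 instructions on a 2 GHz core.
cpi_fast = estimate_cpi(core_cpi=0.8, misses_per_instr=0.005, mem_latency_ns=50.0, cpu_ghz=2.0)
cpi_slow = estimate_cpi(core_cpi=0.8, misses_per_instr=0.005, mem_latency_ns=55.0, cpu_ghz=2.0)

print(f"CPI {cpi_fast:.2f} -> {cpi_slow:.2f} ({cpi_slow / cpi_fast - 1:+.1%})")
```

Under these assumptions a 10% increase in memory latency translates into a smaller CPI increase, because only the memory-stall portion of execution time grows.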

  12. MemScale adjusts frequency dynamically. Figure: timeline of workload mix MID3.

  13. Outline: Motivation and overview; Background on memory systems; MemScale: DVFS for the memory system; Results; Conclusions

  14. Methodology. Detailed simulation: 16 cores, 16MB LLC, 4 DDR3 channels, 8 DIMMs; multi-programmed workloads from SPEC suites. Power modes: 10 frequencies between 200 and 800 MHz. Power consumption: Micron’s DRAM power model; memory system power = 40% of total server power.

  15. Results – energy savings and performance. Figures: average energy savings; performance overhead. Memory energy savings of 44%; system energy savings of 18%; always within the performance bound.
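The two headline numbers are roughly consistent with the methodology's assumption that the memory system accounts for 40% of server power. The check below is a rough sanity calculation only: it treats the rest of the system as unaffected, which ignores the extra energy the other components spend during the slowdown.

```python
memory_share = 0.40    # memory system share of total server power (methodology slide)
memory_savings = 0.44  # reported memory energy savings

# If only the memory share shrinks, system savings is the product of the two.
system_savings = memory_share * memory_savings
print(f"approximate system energy savings: {system_savings:.1%}")
```

This gives about 17.6%, close to the reported 18% system-level figure.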

  16. Alternative approaches • Fast power-down: transition ranks into fast power-down mode when idle • Decoupled-DIMM [Zheng’09]: low-frequency DRAM + high-frequency DIMMs & channels • Static: pre-selected active low-power mode w/o dynamic scaling; unrealistic, since it needs a priori knowledge of workload behavior

  17. Results – comparison to alternative approaches. Figures: full system energy savings (MID), y-axis Energy Savings (%); performance overhead (MID), y-axis CPI increase (%). Configurations compared in both: Static, Fast-PD, Decoupled-DIMM, MemScale, MemScale+Fast-PD.

  18. Conclusions. MemScale contributions: • Active low-power modes for the memory subsystem • New perf. counters to capture energy and contention • OS policy to choose the best power mode dynamically • Avg 18% system energy savings, avg 4% performance loss. In the paper: • Performance and energy models • Sensitivity analyses (including lower performance bounds) • Energy break-down comparison

  19. THANKS! SPONSORS:
