Software-Hardware Cooperative Power Management Technique for Main Memory. Hai Huang, Kang G. Shin (University of Michigan); Charles Lefurgy, Karthick Rajamani, Tom Keller, Eric Van Hensbergen, Freeman Rawson (IBM Austin Research Lab).
Motivation • High power dissipation causes many problems for computing systems, especially large servers • High electricity and cooling costs • Unreliable electronic components • Low rack density • Intelligent management of system power is important to ensure these systems can continue to function
DRAM: A Power Hog • Main memory (DRAM) consumes a significant portion of total system power, which makes it a good candidate for power optimization • E.g., in an IBM mid-range eServer system, around 40% of the total power is consumed by the main memory
Outline • Motivation • Background • Previous Work • A Cooperative Approach • Results • Conclusion
Background • DRAM dissipates power continuously • Self-refresh, row/column decoders, amplifiers, data queue, etc. • DRAM’s power management capabilities • Multiple power states • Memory controller is used to implement a simple interface to transition between these states • Transitions have non-negligible delays • Trade-offs between power and performance
Example: DDR • Figure: power-state diagram for a registered 512MB DDR module w/ 8 devices per rank • States: Read/Write (779.1 mW), Standby (275.0 mW), Power-down (150 mW), Self-refresh (20.87 mW) • Transition labels in the figure: auto, 5 ns, 5 ns, 5 ns, 1000 ns
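To make the power/performance trade-off concrete, here is a minimal sketch that compares the energy spent holding an idle rank in each state, using only the per-state power figures from the slide. The function names are made up for illustration, and transition delays and transition energy are deliberately ignored, since the slide does not say which delay belongs to which transition.

```python
# Per-state power in mW, taken from the slide's example DDR module.
STATE_POWER_MW = {
    "read/write": 779.1,
    "standby": 275.0,
    "power-down": 150.0,
    "self-refresh": 20.87,
}


def idle_energy_uj(state: str, idle_ms: float) -> float:
    """Energy in microjoules spent holding `state` for `idle_ms` milliseconds.

    mW * ms = microjoules. Transition costs are ignored (assumption).
    """
    return STATE_POWER_MW[state] * idle_ms


def saving_vs_standby(state: str) -> float:
    """Fraction of standby power saved by resting in `state` instead."""
    standby = STATE_POWER_MW["standby"]
    return (standby - STATE_POWER_MW[state]) / standby


if __name__ == "__main__":
    for name in ("standby", "power-down", "self-refresh"):
        print(name, idle_energy_uj(name, 1.0), round(saving_vs_standby(name), 3))
```

Under these assumptions, self-refresh draws roughly 92% less power than standby, which is why the time a rank spends in the deeper states dominates the savings.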
Outline • Motivation • Background • Previous Work • Software Techniques • Hardware Techniques • A Cooperative Approach • Results • Conclusion
Software Technique • OS can track each process’ virtual-to-physical memory mappings • Example: Process i uses ranks 0 and 2; Process j uses rank 3 • Figure: timeline of ranks 0–3; when Process i is context-switched in, ranks 0 and 2 are in Standby and the others in Self-refresh; when Process j is context-switched in, rank 3 is in Standby and the others in Self-refresh
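The software technique can be sketched as a function the OS would run at each context switch. This is an illustrative sketch only: the function name, the state strings, and the idea of returning a list of per-rank states are assumptions, and only the rank usage of Process i and Process j comes from the slide.

```python
STANDBY = "standby"
SELF_REFRESH = "self-refresh"

# Rank usage from the slide's example (derived by the OS from each
# process' virtual-to-physical mappings).
PROCESS_RANKS = {"i": [0, 2], "j": [3]}


def ranks_on_context_switch(ranks_used, num_ranks):
    """Return the power state of every rank once a process is switched in.

    Ranks the incoming process uses are kept in standby; all other
    ranks are put into self-refresh.
    """
    used = set(ranks_used)
    return [STANDBY if rank in used else SELF_REFRESH for rank in range(num_ranks)]


if __name__ == "__main__":
    for pid, ranks in PROCESS_RANKS.items():
        print(pid, ranks_on_context_switch(ranks, num_ranks=4))
```

The control granularity here is a whole scheduling quantum: a rank the process maps but rarely touches still stays in standby for the entire time slice.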
Hardware Technique • Allows for much finer-grained control of power • Monitors each memory access • Predicts when to transition to lower power modes • Figure: power over time, moving between read/write, Standby, and Self-refresh; a device drops to a lower power state when its idle time exceeds a threshold and stays in its current state while idle time is below the threshold
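A threshold policy of this kind can be sketched in a few lines. The class below is a simplified assumption, not the controller from the talk: it uses a single fixed threshold and only two states (standby and self-refresh), does not model wake-up delay, and all names are hypothetical.

```python
class IdleThresholdPolicy:
    """Drop a rank to self-refresh once it has been idle longer than a threshold."""

    def __init__(self, threshold_ns):
        self.threshold_ns = threshold_ns
        self.state = "standby"
        self.last_access_ns = 0

    def on_access(self, now_ns):
        # Any access brings the rank back to standby (wake-up delay not modeled).
        self.state = "standby"
        self.last_access_ns = now_ns

    def on_tick(self, now_ns):
        # Idle time > threshold: move to the low-power state.
        # Idle time <= threshold: stay where we are.
        if now_ns - self.last_access_ns > self.threshold_ns:
            self.state = "self-refresh"
        return self.state


if __name__ == "__main__":
    policy = IdleThresholdPolicy(threshold_ns=100)
    policy.on_access(0)
    print(policy.on_tick(50))   # still within the threshold
    print(policy.on_tick(150))  # idle longer than the threshold
```

A real predictor would also adjust `threshold_ns` from observed access gaps; that adaptive state is what gets disturbed when processes with different access behavior alternate.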
Hardware Technique: Problems • Hardware techniques are easily confused by constant context switching • Different processes have different memory access behavior, and the memory controller needs time to adapt, readapt, readapt… • Imagine hundreds of parallel processes instead of 2! • Context-switching interval ~ 1 msec • Figure: timeline of memory accesses, with Process i and Process j alternating
Cooperative Approach • Improve the hardware technique so we don’t have to readapt, readapt, readapt… • Needs cooperation from system software • Make the hardware understand the notion of processes • At each context switch, the OS sends a signal to the memory controller • Upon receiving this signal, the memory controller saves and restores its internal registers, which hold the history of past memory access patterns • Essentially, power for the current process is now managed based solely on that process’ own past memory accesses
Context-Aware Memory Controller • Figure: CPU and memory controller (MC), each with its own registers; the MC’s registers feed a threshold predictor • CPU signals the context switch to the MC • Saves current process’ CPU context and MC context • Restores scheduled process’ CPU context and MC context
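The save/restore step described above can be sketched as follows. This is a toy model under stated assumptions: the controller's "registers" are reduced to a single adaptive threshold, the `adapt` rule (move halfway toward the observed idle gap) is invented for illustration, and the saved contexts are kept in a dictionary inside the controller object, whereas the slide does not say where they are stored.

```python
class ContextAwareController:
    """Memory controller that keeps separate predictor state per process."""

    def __init__(self, default_threshold_ns):
        self.default_threshold_ns = default_threshold_ns
        self.threshold_ns = default_threshold_ns  # live predictor "register"
        self.current_pid = None
        self.saved = {}  # pid -> saved predictor state (hypothetical storage)

    def on_context_switch(self, next_pid):
        # Signal from the OS: save the outgoing process' MC context...
        if self.current_pid is not None:
            self.saved[self.current_pid] = self.threshold_ns
        # ...and restore the incoming one's (or start from the default).
        self.threshold_ns = self.saved.get(next_pid, self.default_threshold_ns)
        self.current_pid = next_pid

    def adapt(self, observed_idle_ns):
        # Hypothetical adaptation rule: move halfway toward the observed gap.
        self.threshold_ns = (self.threshold_ns + observed_idle_ns) / 2


if __name__ == "__main__":
    mc = ContextAwareController(default_threshold_ns=100)
    mc.on_context_switch("i")
    mc.adapt(300)              # process i has long idle gaps
    mc.on_context_switch("j")
    mc.adapt(20)               # process j has short idle gaps
    mc.on_context_switch("i")
    print(mc.threshold_ns)     # process i's state, untouched by process j
```

Because each process resumes with the state it left behind, the predictor does not have to relearn after every switch.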
Cooperative Technique: Per-Process • Figure: timeline of memory accesses, with Process i and Process j alternating
Experimental Setup • Mambo: • A full-machine simulator to run various workloads and collect memory traces • Memsim: • Trace-driven simulator that produces performance and power results for the main memory • Workloads: • SPECjbb + bzip2 + crafty (low memory-intensive) • SPECjbb + art + mcf (high memory-intensive)
Results • Figure panels: low memory-intensive workload; high memory-intensive workload
Conclusion • Cooperative technique • Uses 72–75% less power than when no power management is applied, with an 11–14% slow-down in average response time • Uses 14–17% less power than the hardware technique • Uses 16–26% less power than the software technique • Has performance comparable to the HW and SW techniques • Future Work • Communicate hints directly from user processes to the hardware