
What computer architects need to know about memory throttling WEED 2010 June 20, 2010



Presentation Transcript


  1. IBM Research – Austin. Heather Hanson, Karthick Rajamani. What computer architects need to know about memory throttling. WEED 2010, June 20, 2010.

  2. Outline
     • Memory throttling overview
     • Experimental platform
       • System configuration
       • Memory throttling implementation
     • Memory throttling characterization
       • Bandwidth
       • Power
       • Performance
     • Summary

  3. Memory throttling in a nutshell
     • Memory throttling is a power-performance knob that:
       • impacts memory reference rates of both instruction and data streams
       • controls power
       • can be used for safety or optimization
         • regulate DIMM temperatures
         • enforce memory power budgets
     • Memory throttling restricts read & write traffic
       • directly controls memory power
       • indirectly affects processors and other components
     • Several implementation styles in commercial systems
       • insert periodic idle cycles
       • allow an arbitrary number of transactions up to an (estimated) power threshold
       • run + hold windows
       • enforce read & write quotas [this paper]
         • first N transactions proceed in each time window
         • any further requests wait until the next time period

  4. Quota-style memory throttling
     • Reads & writes proceed as requested, up to N requests per period
     • Example: N = 6. Up to 6 transactions serviced per period, regardless of request timing
     • [Figure annotation] Nth request in each period; additional requests would be queued for later service
     • Comparison to clock throttling
       • Run-hold clock throttling: regular frequency during run portion; clock halted during hold portion
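The quota rule on this slide can be sketched as a small simulation. This is an illustrative model only, not the POWER6 memory controller logic: the function name `quota_throttle`, the cycle-based time units, and the assumption that a request is serviced instantly once admitted are all hypothetical.

```python
def quota_throttle(arrivals, quota, period):
    """Return the service time of each request under quota-style throttling.

    arrivals: sorted request arrival times, in memory cycles (assumed units)
    quota:    N, the maximum number of requests serviced per period
    period:   length of each throttle window, in memory cycles
    Assumption: an admitted request is serviced immediately (no DRAM latency).
    """
    service_times = []
    window = 0  # index of the window currently being filled
    used = 0    # requests already serviced in that window
    for t in arrivals:
        # A request cannot be serviced before the window it arrives in.
        arrival_window = t // period
        if arrival_window > window:
            window, used = arrival_window, 0
        if used == quota:
            # Quota exhausted: the request waits for the next period.
            window, used = window + 1, 0
        # Serviced on arrival, or at the start of the window it waited for.
        service_times.append(max(t, window * period))
        used += 1
    return service_times


# Slide example: N = 6. Eight requests arrive in the first 10-cycle period.
print(quota_throttle([0, 1, 2, 3, 4, 5, 6, 7], quota=6, period=10))
# -> [0, 1, 2, 3, 4, 5, 10, 10]
```

The first six requests proceed at their arrival times regardless of spacing; the seventh and eighth are held until the next period begins, and they count against that period's quota.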

  5. POWER6 Memory Throttling
     • IBM JS12 blade system
     • Processor
       • POWER6
       • 1 socket x 2 cores per processor socket
       • 3.8 GHz frequency (fixed in these experiments)
       • SLES10 Linux
     • Memory
       • 16 GB capacity
       • 8 DIMMs x 2 GB each
       • DDR2
       • 667 MHz bus
     • Quota-style memory throttling
       • N transactions per M memory cycles
       • 100% throttle level == unthrottled
       • Throttle time period is shorter than thermal and power supply timescales

  6. Memory throttle characterization methodology
     • Sweep throttle settings
       • Set throttle
       • Run steady-behavior benchmark
         • DAXPY (double-precision A * X plus Y)
         • FPMAC (floating-point multiply accumulate)
         • RandomMemory (generate random addresses)
         • SPECpower_ssj2008 calibration phase (peak throughput for warehouse transactions)
       • Record sensor data, 256 ms per sample
         • Memory power
         • Memory reads & writes
         • Instruction throughput
         • And other sensors not shown here
       • Decrement throttle
       • Repeat for full range of throttle settings
     • Repeat throttle sweep for multiple benchmarks and memory footprints
       • Microbenchmarks: L1-cache-contained and main-memory footprints
       • SPECpower_ssj2008: behaves as nearly contained in on-chip caches
     • Calculate median sensor data for each permutation {benchmark, footprint, throttle}
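The sweep procedure above might be scripted roughly as follows. This is a sketch under stated assumptions: `set_throttle` and `run_and_sample` are hypothetical callables standing in for the platform's firmware interface and benchmark harness, which the slides do not describe, and the sensor names are made up for illustration.

```python
from statistics import median


def sweep_throttle(levels, set_throttle, run_and_sample):
    """Run one throttle sweep and return median sensor values per level.

    levels:         throttle settings to visit, e.g. from 100 (%) downward
    set_throttle:   hypothetical callable that applies one throttle level
    run_and_sample: hypothetical callable that runs the steady-behavior
                    benchmark and returns a list of samples, each a dict
                    mapping sensor name -> reading
    """
    results = {}
    for level in levels:
        set_throttle(level)
        samples = run_and_sample()
        # Median of each sensor across all samples taken at this level.
        results[level] = {
            sensor: median(s[sensor] for s in samples)
            for sensor in samples[0]
        }
    return results


# Usage with stand-in functions instead of real hardware.
applied = []
fake_samples = [
    {"memory_power_w": 10.0, "reads": 100},
    {"memory_power_w": 12.0, "reads": 300},
    {"memory_power_w": 11.0, "reads": 200},
]
table = sweep_throttle([100, 90], applied.append, lambda: fake_samples)
print(table[100])
```

Repeating this sweep for each benchmark and footprint would give the {benchmark, footprint, throttle} table of medians that the last bullet describes.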

  7. Memory throttle effect on bandwidth [Figure; regions labeled: linear, saturated, and the transition between linear & saturated regions]

  8. A closer look at RandomMemory-DIMM
     • Uses less bandwidth than other benchmarks at same throttle levels
       • also less bandwidth than its own saturation level
     • Simply measuring bandwidth at a single/current throttle level is not enough to identify a region of operation
       • less than max could be saturated or transition region
       • …a controller will not be able to accurately predict the effect on bandwidth of a throttle level change
       • …or predict the effect on power or performance
     • Subtle but very important point about transition region
       • Actual bandwidth < max bandwidth
       • bandwidth restrictions → pipeline starvation → reduced request rate
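One way to see why a single measurement is not enough: a controller needs bandwidth readings at two or more throttle levels before it can tell the regions apart. The sketch below is not from the presentation. It assumes that bandwidth in the linear region scales in proportion to the throttle level, and the function name, tolerance, and sample numbers are hypothetical.

```python
def classify_region(level_a, bw_a, level_b, bw_b, tol=0.05):
    """Guess the operating region from bandwidth at two throttle levels.

    level_a < level_b are throttle levels (percent); bw_a and bw_b are the
    bandwidths measured at those levels (any consistent unit).
    Assumption: in the linear region, bandwidth is proportional to level.
    """
    if not 0 < level_a < level_b:
        raise ValueError("expected 0 < level_a < level_b")
    level_ratio = level_b / level_a
    bw_ratio = bw_b / bw_a
    if abs(bw_ratio - 1.0) <= tol:
        # Relaxing the throttle did not raise bandwidth: demand-limited.
        return "saturated"
    if abs(bw_ratio - level_ratio) <= tol * level_ratio:
        # Bandwidth tracked the throttle level: throttle-limited.
        return "linear"
    return "transition"


# Illustrative numbers only, not measurements from the presentation.
print(classify_region(20, 2.0, 40, 4.0))  # linear
print(classify_region(60, 5.0, 80, 5.0))  # saturated
print(classify_region(40, 4.0, 60, 5.0))  # transition
```

Even this two-point probe only labels the region between the sampled levels; it does not predict how far the transition region extends.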

  9. Memory power is basically linear with bandwidth, so this chart looks familiar…

  10. Throttling effects relative to each benchmark [Figure panels: performance, power]
      • Generally more performance reduction than power reduction (in %)
        • Throttling alone doesn’t affect static portion of memory power
        • Leveraging idle low-power modes of memory can improve the power-performance curve for memory request rate throttling
        • Possible to waste energy from longer execution time
      • Larger bandwidth demands → larger effect from throttling
        • Conversely, power reduction only when performance is impacted
      • L1-contained DAXPY: throttling has no effect
      • DIMM-sized DAXPY: drastic effect
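The "waste energy from longer execution time" point follows from energy = power × time when static power is not reduced. The sketch below works one example under simplifying assumptions that are not from the presentation: the workload is fully memory-bound, so runtime scales inversely with allowed bandwidth, and dynamic memory power scales linearly with bandwidth. All wattages and runtimes are invented for illustration.

```python
def memory_energy_j(static_w, dynamic_w_unthrottled, runtime_s_unthrottled,
                    bandwidth_fraction):
    """Memory energy (joules) for a fixed amount of work under throttling.

    bandwidth_fraction: allowed bandwidth relative to unthrottled (0 < f <= 1)
    Assumptions: dynamic power is proportional to bandwidth, static power is
    unaffected by throttling, and runtime is inversely proportional to
    bandwidth (a fully memory-bound workload).
    """
    dynamic_w = dynamic_w_unthrottled * bandwidth_fraction
    runtime_s = runtime_s_unthrottled / bandwidth_fraction
    return (static_w + dynamic_w) * runtime_s


# Hypothetical numbers: 10 W static, 20 W dynamic, 100 s unthrottled runtime.
unthrottled = memory_energy_j(10.0, 20.0, 100.0, 1.0)  # (10 + 20) W * 100 s
half_rate = memory_energy_j(10.0, 20.0, 100.0, 0.5)    # (10 + 10) W * 200 s
print(unthrottled, half_rate)  # 3000.0 4000.0
```

In this example, power drops by a third (30 W to 20 W) while performance halves, and total memory energy rises because the static 10 W is paid for twice as long.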

  11. Summary
      • Memory throttling is a power-performance knob available in commercial systems
        • Memory controller restricts read & write bandwidth
        • caps memory power
        • controls DIMM temperature
      • Mileage may vary
        • power and performance management depend on bandwidth demand
        • throttling a low-bandwidth workload doesn’t reduce much power
        • potential to use more energy due to increased execution time
        • use highly throttled settings with caution
      • Effective tool for power capping
        • power constrained configurations
        • thermal safety
        • power shifting

  12. Acknowledgements
      • IBM Research – Austin
      • IBM Systems & Technology Group
        • Memory characterization: Joab Henderson, Kenneth Wright
        • EnergyScale firmware: Guillermo Silva, Andrew Geissler
