
Keeping Hot Chips Cool

Circuits R-US. Keeping Hot Chips Cool. Ruchir Puri, Leon Stok, Subhrajit Bhattacharya, IBM T.J. Watson Research Center, Yorktown Heights, NY. So, what's going on? At the 65nm node, static power is equal to active power. Clock distribution accounts for half of active power.


Presentation Transcript


  1. Circuits R-US. Keeping Hot Chips Cool. Ruchir Puri, Leon Stok, Subhrajit Bhattacharya, IBM T.J. Watson Research Center, Yorktown Heights, NY

  2. So, What’s Going On? • At the 65nm node, static power is equal to active power • Clock distribution accounts for half of active power

  3. Why Can’t We Keep Scaling Vt? [Figure: plot with axis ticks at 1, 10, 100, and 1000; curves and axis labels not recoverable]

  4. Low Power Opportunities: Exploiting positive slacks [Figure: Power4 timing histogram, axis ticks 5% to 20%] • Most power reduction techniques exploit this positive slack.

  5. Low Power Levers • Structural Techniques: voltage islands; multi-threshold devices; multi-oxide devices; minimizing capacitance by custom design; power-efficient circuits; parallelism in micro-architecture • Dynamic Techniques: clock gating; power gating; variable frequency; variable voltage supply; variable device threshold

  6. Outline • Clock & Latch Optimization (Clock Power) • Voltage Islands (Active Power) • Power Gating (Leakage Power)

  7. Outline • Clock & Latch Optimization (Clock Power) • Voltage Islands (Active Power) • Power Gating (Leakage Power)

  8. Minimizing Active Power: Coarse-Grained Voltage Islands • Trade off power for delay by running functional blocks at different voltages • Can use a mix of low and high Vt to balance performance and leakage • Switch off inactive blocks to reduce leakage power • E.g.: a telecom ASIC with 1.0/1.2 V islands saved 16% active power and 50% standby power [Figure label: High VT]
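The voltage-island trade-off on this slide rests on switching power falling roughly with the square of the supply voltage. The following is a minimal Python sketch of that arithmetic, assuming the standard first-order model P = a·C·V²·f with activity, capacitance, and frequency held fixed. The function names and the 50% island fraction are illustrative assumptions, not figures from the slides, and level-shifter overhead is ignored.

```python
def dynamic_power_ratio(v_low: float, v_high: float) -> float:
    """Switching power at v_low relative to v_high.

    Assumes P = a * C * V^2 * f with a, C, and f unchanged
    (first-order model, not taken from the slides).
    """
    return (v_low / v_high) ** 2


def active_power_saving(fraction_on_low_island: float,
                        v_low: float, v_high: float) -> float:
    """Fractional active-power saving when a given fraction of the
    switched capacitance is moved to the low-voltage island."""
    return fraction_on_low_island * (1.0 - dynamic_power_ratio(v_low, v_high))


# Island voltages from the slide: 1.0 V and 1.2 V.
ratio = dynamic_power_ratio(1.0, 1.2)
# Hypothetical: half of the switched capacitance sits on the 1.0 V island.
saving = active_power_saving(0.5, 1.0, 1.2)

print(f"power ratio at 1.0 V vs 1.2 V: {ratio:.3f}")
print(f"saving with 50% of logic on the low island: {saving:.1%}")
```

Under this model, logic moved from 1.2 V to 1.0 V switches at about 69% of its original power; how much of a chip can tolerate the extra delay determines the overall saving.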

  9. Fine-Grained Voltage Islands: PowerPC 405 [Figure labels: Vddh = 1.5 V, Vddl = 1.2 V, secondary power drop] • No timing degradation and no area increase for the core!

  10. Outline • Clock & Latch Optimization (Clock Power) • Voltage Islands (Active Power) • Power Gating (Leakage Power)

  11. Minimizing Clock Power: Local Clock Buffer - Latch Clustering • Clocks consume a large amount of power in high-performance designs • A large portion of that power goes to the last stage of the clock tree • Minimize the capacitive loading on local clock buffers by clustering latches around them • There is a tradeoff between latch placement flexibility and clock power savings • The reduction in clock skew between capturing and launching latches compensates for the loss in latch placement flexibility.
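The clustering idea on this slide can be illustrated with a toy placement model. The sketch below is a hedged illustration, not the authors' algorithm: it assumes Manhattan wire length from each latch to its nearest local clock buffer (LCB) is a proxy for the capacitive load on that buffer, and it models placement flexibility as a hypothetical `max_move` budget by which each latch may be pulled toward its buffer. All coordinates and names are made up for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def manhattan(a: Point, b: Point) -> float:
    """Manhattan distance, used here as a stand-in for wire capacitance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def nearest_lcb(latch: Point, lcbs: List[Point]) -> int:
    """Index of the local clock buffer closest to the latch."""
    return min(range(len(lcbs)), key=lambda i: manhattan(latch, lcbs[i]))


def pull_toward(latch: Point, lcb: Point, max_move: float) -> Point:
    """Move the latch toward the LCB by at most max_move (x first, then y)."""
    x, y = latch
    step_x = max(-max_move, min(max_move, lcb[0] - x))
    remaining = max_move - abs(step_x)
    step_y = max(-remaining, min(remaining, lcb[1] - y))
    return (x + step_x, y + step_y)


def clock_wirelength(latches: List[Point], lcbs: List[Point]) -> float:
    """Total latch-to-nearest-LCB wire length."""
    return sum(manhattan(l, lcbs[nearest_lcb(l, lcbs)]) for l in latches)


# Hypothetical placement: two LCBs and four latches.
lcbs = [(0.0, 0.0), (10.0, 10.0)]
latches = [(3.0, 1.0), (2.0, 4.0), (8.0, 7.0), (12.0, 9.0)]

before = clock_wirelength(latches, lcbs)
clustered = [pull_toward(l, lcbs[nearest_lcb(l, lcbs)], max_move=2.0)
             for l in latches]
after = clock_wirelength(clustered, lcbs)

print(f"wire length before clustering: {before}")
print(f"wire length after clustering:  {after}")
```

A larger `max_move` shortens the local clock wiring further but displaces latches more from their timing-driven positions, which is the flexibility-versus-power tradeoff the slide describes.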

  12. Clock Power Savings • Reduces total capacitance on the local clock buffer by 25% • Direct savings in clock power in the Random Control Logic

  13. Outline • Clock & Latch Optimization (Clock Power) • Voltage Islands (Active Power) • Power Gating (Leakage Power)

  14. Minimizing Leakage Power: Power Supply Gating • Leakage power is now more than switching power • Limits the performance of microprocessors • Power gating is one of the most effective ways of minimizing leakage power • Cut off power to inactive units/components • Dynamic/workload-based power gating • Reduces both gate and sub-threshold leakage • 20x to 2000x reduction in leakage with little or no cycle time penalty.

  15. Power Gating Concept: Performance on Demand [Figure: two chip configurations with processors P1 to P4 and L2 caches, dedicated units switched off or on] • Dedicated units off: more power available to scalar units, higher SPEC performance • Dedicated units on: dedicated units available for higher application performance

  16. Normal Operation Mode [Figure: CORE between VDDL and virtual ground (VGND), footer to GNDL; footer I-V curves for VGS = VDD and VGS = 0 V, marking IDS,MAX, IACTIVE, and VDS,LINEAR] • To reduce the performance degradation, the voltage drop across the SLEEP transistor should be minimized to reduce active leakage current • Requires sizing up of the footer device

  17. Sleep Mode [Figure: CORE between VDDL and virtual ground (VGND), footer to GNDL; footer I-V curves for VGS = VDD and VGS = 0 V, marking IDS,MAX and VDS] • During sleep mode, all of the internal capacitive nodes and the VGND node are charged up to near VDD • Requires sizing down of the footer device to reduce standby leakage

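  18. Wake-Up Mode [Figure: CORE between VDDL and virtual ground (VGND), footer to GNDL; footer I-V curves for VGS = VDD and VGS = 0 V, marking IDS,MAX, ITURN_ON, Rs, and VDS] • When the SLEEP transistor is turned on, the maximum instantaneous current can flow • Requires sizing up of the footer device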
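The three modes above pull footer sizing in opposite directions: normal operation and wake-up favor a wide footer, sleep favors a narrow one. A minimal Python sketch of that tension follows, assuming the footer behaves as a linear-region resistor whose on-resistance is inversely proportional to width and whose off-state leakage is proportional to width. The per-micron constants and the active current are hypothetical placeholders, not values from the slides.

```python
def virtual_ground_drop(i_active_a: float, width_um: float,
                        r_on_ohm_um: float) -> float:
    """Voltage drop across the footer in normal operation (volts).

    Assumes a linear-region footer with R_on = r_on_ohm_um / width_um.
    """
    return i_active_a * r_on_ohm_um / width_um


def standby_leakage(width_um: float, i_off_a_per_um: float) -> float:
    """Footer off-state leakage in sleep mode (amps), proportional to width."""
    return width_um * i_off_a_per_um


def smallest_footer_width(i_active_a: float, max_drop_v: float,
                          r_on_ohm_um: float) -> float:
    """Narrowest footer that keeps the virtual-ground drop within budget."""
    return i_active_a * r_on_ohm_um / max_drop_v


# Hypothetical device and load numbers, for illustration only.
I_ACTIVE_A = 0.1          # active current drawn by the gated core
R_ON_OHM_UM = 500.0       # footer on-resistance times width
I_OFF_A_PER_UM = 1e-9     # footer off-state leakage per micron
MAX_DROP_V = 0.010        # 10 mV virtual-ground budget

width = smallest_footer_width(I_ACTIVE_A, MAX_DROP_V, R_ON_OHM_UM)
drop = virtual_ground_drop(I_ACTIVE_A, width, R_ON_OHM_UM)
leak = standby_leakage(width, I_OFF_A_PER_UM)

print(f"footer width: {width:.0f} um")
print(f"virtual-ground drop: {drop * 1e3:.1f} mV")
print(f"standby leakage: {leak * 1e6:.1f} uA")
```

Tightening the voltage-drop budget widens the footer and raises standby leakage in proportion, so in this model the sizing choice is the narrowest footer that still meets the performance target.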

  19. Sleep / Wake / Run State Control [State diagram; recoverable labels: states off, sleep, run, run (idle), off & run • Enter sleep state: deassert wake/run, enable fence, charge cycles • Exit sleep state: assert wake, discharge cycle (wake), assert run, disable fence]
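The labels on this slide suggest a small state machine that sequences the fence and the wake and run signals. The sketch below is one possible reading in Python, not the authors' controller: the ordering of the steps, the three-state simplification, and every class and method name are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    RUN = "run"
    SLEEP = "sleep"
    WAKE = "wake"


class PowerGateController:
    """Hypothetical sleep/wake/run sequencer for one power-gated block."""

    def __init__(self) -> None:
        self.state = State.RUN
        self.run_asserted = True      # assumed: run signal on while running
        self.wake_asserted = True     # assumed: wake signal on while running
        self.fence_enabled = False    # assumed: outputs not isolated in run

    def enter_sleep(self) -> None:
        """Enter sleep: enable the fence, then deassert wake and run."""
        if self.state is not State.RUN:
            raise RuntimeError("can only enter sleep from run")
        self.fence_enabled = True
        self.wake_asserted = False
        self.run_asserted = False
        self.state = State.SLEEP      # virtual ground charges up

    def start_wake(self) -> None:
        """Begin exit from sleep: assert wake to discharge virtual ground."""
        if self.state is not State.SLEEP:
            raise RuntimeError("can only wake from sleep")
        self.wake_asserted = True
        self.state = State.WAKE

    def finish_wake(self) -> None:
        """Complete exit from sleep: assert run, then disable the fence."""
        if self.state is not State.WAKE:
            raise RuntimeError("wake sequence has not started")
        self.run_asserted = True
        self.fence_enabled = False
        self.state = State.RUN


ctrl = PowerGateController()
ctrl.enter_sleep()
sleep_snapshot = (ctrl.state, ctrl.fence_enabled, ctrl.run_asserted)
ctrl.start_wake()
ctrl.finish_wake()
print(ctrl.state, ctrl.fence_enabled)
```

Splitting the exit into `start_wake` and `finish_wake` mirrors the separate "assert wake" and "assert run" labels, leaving room for the discharge cycles in between.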

  20. Footer Selection and Sizing [Figure; data labels: 15.5x, 20x, 25x, 33x, 50x, 100x; annotations: 10x-20x leakage reduction, < 1% frequency loss]

  21. Power vs Performance Tradeoff (130nm Hardware) • ~8% performance degradation due to sleep transistor with 1% area overhead • Target specification: 250MHz at 0.9V ~ 500MHz at 1.4V • 1% footer size is used for a 2-stage pipelined 40-bit ALU

  22. Sleep Transistor Sizing and Performance (130nm Hardware) [Figure annotations: less than 2% performance degradation; more than 8% performance degradation]

  23. Leakage Power Reduction (130nm Hardware) • Leakage suppression using VDD scaling: ~8.4x • Leakage suppression using power gating structure with 1% area overhead: ~2000x

  24. Physical Design: External Footer Switch

  25. Physical Design: Internal Footer Switch • Internal fine-grained power gating is more efficient in addressing electromigration and current delivery.

  26. Ground Redistribution • The ‘real’ chip-level ground distribution is M4 and above; it is unchanged by power gating • This part of the redistribution is electrically similar to an unmodified distribution [Figure: layer stack with logic device and footer cell, Contact, M1, V1, M2, V2, M3, global ground, and virtual ground]

  27. Physical Design: Footer Insertion [Figure: placement without footers and with footers, showing footer rows]

  28. Power Gating in High-Performance • Gated and non-gated logic have identical width • 5% total area overhead for power gating • 20X leakage reduction • <1% performance degradation [Figure labels: non-gated logic, gated logic]

  29. Power Gating: Footer area overhead [Figure; data labels: 10.4%, 5.7%; annotation: 10mV virtual ground]

  30. Conclusions • Power is the limiting factor in traditional CMOS scaling and must be dealt with aggressively • Controlling leakage is crucial for future scaling • Power gating and voltage islands are effective techniques to minimize leakage and active power • Special consideration must be given to clock distribution in high-performance designs to minimize clock power • To keep hot chips cool, a holistic power minimization approach across the whole design stack is required, which must include: • Device level techniques • Circuit level techniques • System level power management
