Power-Optimal Pipelining in Deep Submicron Technology

ISLPED 2004, 8/10/2004. Power-Optimal Pipelining in Deep Submicron Technology. Seongmoo Heo and Krste Asanović, Computer Architecture Group, MIT CSAIL.

    Presentation Transcript
    1. Power-Optimal Pipelining in Deep Submicron Technology • Seongmoo Heo and Krste Asanović, Computer Architecture Group, MIT CSAIL • ISLPED 2004, 8/10/2004

    2. Traditional Pipelining • Goal: Maximum performance • [Figure: timing of pipeline stages at supply voltage Vdd; labels between successive Clk edges: Clk-Q, Propagation Delay, Setup]

    3. Pipelining as a Low-Power Tool • Goal: Low power, fixed throughput • [Figure: timing of pipeline stages at supply voltage Vdd; labels between successive Clk edges: Clk-Q, Propagation Delay, Setup, Time Slack]

    4. Pipelining as a Low-Power Tool • Goal: Low power, fixed throughput • [Figure: timing of pipeline stages at supply voltage Vdd; labels between successive Clk edges: Clk-Q, Propagation Delay, Setup, Time Slack] • Time slack is traded for power (supply voltage scaling)
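The trade on slides 3 and 4 can be sketched numerically. The snippet below is a minimal illustration, not the authors' method: it assumes the alpha-power-law delay model (delay proportional to Vdd / (Vdd - Vth)^alpha), and the values of `vth`, `alpha`, and the nominal 1.0 V supply are hypothetical placeholders rather than numbers from the talk. It also ignores flip-flop delay.

```python
def gate_delay(vdd, vth=0.2, alpha=1.3):
    """Relative gate delay under the alpha-power law (an assumed model)."""
    return vdd / (vdd - vth) ** alpha


def min_vdd_for_slack(slack_factor, vdd_nom=1.0, vth=0.2, alpha=1.3, step=0.001):
    """Lowest supply voltage whose gate delay still fits the delay budget.

    slack_factor is the allowed delay relative to the nominal delay, e.g. 2.0
    when pipelining halves the logic per stage at a fixed clock frequency.
    """
    budget = slack_factor * gate_delay(vdd_nom, vth, alpha)
    vdd = vdd_nom
    # Delay rises monotonically as Vdd falls (for alpha >= 1), so step down
    # until the next step would exceed the budget or reach Vth.
    while vdd - step > vth and gate_delay(vdd - step, vth, alpha) <= budget:
        vdd -= step
    return vdd


def relative_switching_power(vdd, vdd_nom=1.0):
    """Switching power at fixed frequency and capacitance scales as Vdd^2."""
    return (vdd / vdd_nom) ** 2
```

Calling `relative_switching_power(min_vdd_for_slack(2.0))` gives the fraction of logic switching power left once a 2x delay slack has been converted into a lower supply voltage, before any flip-flop overhead is added back.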

    5. Pipelining as a Low-Power Tool • [Figure: power vs. delay, clock frequency fixed; labels: Pipelining, Time slack, Flip-flop Power Overhead]

    6. Pipelining as a Low-Power Tool • [Figure: power vs. delay, clock frequency fixed; labels: Supply voltage scaling, Power Saving]

    7. Power-Optimal Pipelining • Power reduction from pipelining is limited by the power overhead of the increased number of flip-flops → Power-Optimal Pipelining

    8. Power-Optimal Pipelining • Power reduction from pipelining is limited by the power overhead of the increased number of flip-flops → Power-Optimal Pipelining • [Figure: power vs. delay; label: Too shallow pipelining]

    9. Power-Optimal Pipelining • Power reduction from pipelining is limited by the power overhead of the increased number of flip-flops → Power-Optimal Pipelining • [Figure: power vs. delay; labels: Too deep pipelining, Too shallow pipelining]

    10. Power-Optimal Pipelining • Power reduction from pipelining is limited by the power overhead of the increased number of flip-flops → Power-Optimal Pipelining • [Figure: power vs. delay; labels: Too deep pipelining, Too shallow pipelining, Optimal pipelining, Optimal Power Saving]

    11. Contribution • Pipelining is an old idea. • Research focus has been on performance impact of pipelining. • Idea of using pipelining [Chandrakasan ’92] to lower power has not been fully explored in deep submicron technology. • Analysis and circuit-level simulation of Power-Optimal Pipelining for different regimes of Vth, activity factor, clock gating

    12. Bottom-to-Top Approach • Impact of pipelining on each power component • Impact of pipelining on total power (with/without clock-gating) • [Figure: total power (clock-gated) vs. time over active, active, and inactive periods; components: Idle Power, Leakage Power, Switching Power]

    13. Bottom-to-Top Approach • Impact of pipelining on each power component • Impact of pipelining on total power (with/without clock-gating) • [Figure: total power (not clock-gated) vs. time over active, active, and inactive periods; components: Idle Power, Leakage Power, Switching Power] • *Idle power = power consumed when the circuit is idle and not clock-gated

    14. Methodology • Target digital system: fixed throughput, highly parallel computation, logic-dominant • Test bench: BPTM (Berkeley Predictive Technology Model) 70nm process with LVT(0.17/-0.2), MVT(0.19/-0.22), HVT(0.21/-0.24); Hspice simulation at 100°C, Clock = 2 GHz • [Figure: baseline, one pipeline stage: TG flip-flops, N FO4 inverters (N = 2 ~ 24), TG flip-flops]

    15. Pipelining and Switching Power: Analytical Trend • [Figure: switching power vs. number of FO4 per stage, N, marking the optimal FO4 and optimal saving] • Logic switching power: quadratic reduction, ∝ Vdd² ∝ N², O(N²) • Flip-flop overhead: O(1/N)
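The two asymptotic labels on slide 15 imply an interior optimum. As a rough sketch that takes those labels at face value, the snippet below models relative switching power as a*N² + b/N and locates the minimum. The coefficients `logic_coeff` and `ff_coeff` are hypothetical and are not fitted to the talk's Hspice data, so the resulting depth is illustrative only.

```python
def switching_power(n, logic_coeff=1.0, ff_coeff=100.0):
    """Relative switching power at n FO4 per stage: logic term plus flip-flop term."""
    return logic_coeff * n ** 2 + ff_coeff / n


def optimal_fo4(logic_coeff=1.0, ff_coeff=100.0):
    """Stationary point of a*n^2 + b/n, obtained from 2*a*n - b/n^2 = 0."""
    return (ff_coeff / (2.0 * logic_coeff)) ** (1.0 / 3.0)


def sweep(depths=range(2, 25), **coeffs):
    """Brute-force minimum over integer depths; 2..24 mirrors the talk's range of N."""
    return min(depths, key=lambda n: switching_power(n, **coeffs))
```

Because the model is convex in N, the integer depth returned by `sweep()` lands next to the closed-form value from `optimal_fo4()`. Raising `ff_coeff` relative to `logic_coeff` pushes the optimum toward shallower pipelines (larger N).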

    16. Pipelining and Leakage Power: Analytical Trend • [Figure: leakage power vs. number of FO4 per stage, N, marking the optimal FO4 and optimal saving] • Logic leakage power: superlinear reduction (DIBL effect), ∝ Vdd · e^(Vdd) ∝ N^γ, O(N^γ) with 1 < γ < 2 • Flip-flop overhead: O(1/N)
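To see why the slide calls the leakage reduction superlinear, the following sketch evaluates a leakage model of the form Vdd * exp(k * Vdd). The coefficient `dibl_coeff` (k) is a hypothetical stand-in for the strength of the DIBL effect and does not come from the talk.

```python
import math


def leakage_power(vdd, dibl_coeff=2.0):
    """Relative leakage power, Vdd * exp(k * Vdd); k is a hypothetical DIBL coefficient."""
    return vdd * math.exp(dibl_coeff * vdd)


def leakage_ratio(vdd_new, vdd_old, dibl_coeff=2.0):
    """Leakage at vdd_new relative to leakage at vdd_old."""
    return leakage_power(vdd_new, dibl_coeff) / leakage_power(vdd_old, dibl_coeff)
```

With any positive `dibl_coeff`, halving the supply voltage cuts leakage by more than half, because the exponential term shrinks along with the linear Vdd factor. With `dibl_coeff=0.0` the model reduces to purely linear scaling.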

    17. Pipelining and Idle Power: Analytical Trend • Clock-gating is not always possible: increased control complexity, insufficient setup time of the clock enable signal • Idle power = leakage power + flip-flop switching power • Its scaling falls between leakage power scaling and flip-flop switching power scaling, depending on the leakage level

    18. Pipelining and Idle Power: Analytical Trend • [Figure: two panels of relative idle power vs. number of FO4 per stage, N, each marking the optimal FO4 and optimal saving] • Leakage power scale: O(N^γ) with 1 < γ < 2, and O(1/N) • Flip-flop switching power scale: linear reduction of flip-flop switching power, ∝ 1/N · Vdd² ∝ N, O(N), and O(1/N)
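The arithmetic behind the linear flip-flop term, and the way idle power sits between the two scaling regimes, can be sketched as follows. This is an illustrative model only: `leak_fraction` and `leak_exponent` are hypothetical parameters, Vdd² proportional to N² is taken from the switching-power slide, and the O(1/N) curves on the slide are left out.

```python
def ff_switching_scale(n, n_base):
    """Flip-flop count scales as 1/N and Vdd^2 as N^2, so the product scales as N."""
    ff_count = n_base / n            # relative number of flip-flops
    vdd_squared = (n / n_base) ** 2  # relative Vdd^2, assuming Vdd^2 proportional to N^2
    return ff_count * vdd_squared    # algebraically equal to n / n_base


def idle_power_scale(n, n_base, leak_fraction, leak_exponent=1.5):
    """Blend of leakage scaling (N^exponent) and flip-flop switching scaling (N).

    leak_fraction is the assumed share of idle power due to leakage at n_base.
    """
    leak = (n / n_base) ** leak_exponent
    return leak_fraction * leak + (1.0 - leak_fraction) * ff_switching_scale(n, n_base)
```

For example, going from 24 to 6 FO4 per stage gives `ff_switching_scale(6, 24)` of 0.25, and `idle_power_scale(6, 24, leak_fraction)` moves from that value toward the leakage-only scaling as `leak_fraction` goes from 0 to 1.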

    19. Simulation Results: Power Components • Fixed throughput @ 2 GHz • N = number of FO4 inverters per stage (not including flip-flop delay) • N* = optimal N • Saving* = optimal power saving by pipelining

    20. Optimal Power Saving • [Figure: two panels, No Clock Gating and Clock Gating, of relative power vs. activity factor; annotations: Optimal FO4 = 6, Optimal FO4 = 6~8] • *2 GHz • *Flip-flop delay not included in optimal FO4

    21. Optimal Power Saving • [Figure: two panels, No Clock Gating and Clock Gating, of relative power vs. activity factor; annotations: Optimal FO4 = 6, Optimal FO4 = 6~8; regions labeled Idle Power, Leakage Power, Switching Power]

    22. Optimal Power Saving • [Figure: two panels, No Clock Gating and Clock Gating, of relative power vs. activity factor; annotations: Optimal FO4 = 6, Optimal FO4 = 6~8; LVT highlighted]

    23. Discussion • LVT can be fast and power-efficient • enables lower Vdd • Flip-flop delay more important than flip-flop power for power-optimal pipelining

    24. Limitation of This Work

    25. Conclusion • Pipelining is an effective low-power tool when used to support voltage scaling in digital systems implementing highly parallel computation. • Optimal logic depth: 6-8 FO4 (~8-10 FO4 including flip-flop delay) • Optimal power saving: 55-80%, depending on Vth, activity factor (AF), and clock gating • Insights: pipelining is more effective with high AF (it is most effective at saving switching power); with lower Vth (except when leakage power is dominant); and with clock gating (reduced flip-flop overhead).

    26. Acknowledgments • Thanks to SCALE group members and anonymous reviewers • Funded by NSF CAREER award CCR-0093354, NSF ITR award CCR-0219545, and a donation from Intel Corporation.

    27. BACKUP SLIDES