
CSV881: Low-Power Design Gate-Level Power Optimization



Presentation Transcript


  1. CSV881: Low-Power Design Gate-Level Power Optimization Vishwani D. Agrawal James J. Danaher Professor Dept. of Electrical and Computer Engineering Auburn University, Auburn, AL 36849 vagrawal@eng.auburn.edu http://www.eng.auburn.edu/~vagrawal Lectures 10, 11, 12: Gate-level optimization

  2. Components of Power
     • Dynamic
       • Signal transitions
         • Logic activity
         • Glitches
       • Short-circuit (often neglected)
     • Static
       • Leakage

  3. Power of a Transition
     • Dynamic power = $C_L V_{DD}^2/2 + P_{sc}$
     • [Figure: switch model of a CMOS gate between VDD and ground, with input Vi, output Vo, resistances R, load capacitance CL, and short-circuit current isc.]

  4. Dynamic Power
     • Each transition of a gate consumes $CV^2/2$.
     • Methods of power saving:
       • Minimize load capacitances
         • Transistor sizing
         • Library-based gate selection
       • Reduce transitions
         • Logic design
         • Glitch reduction

  5. Glitch Power Reduction
     • Design a digital circuit for minimum transient energy consumption by eliminating hazards.

  6. Theorem 1
     • For correct operation with minimum energy consumption, a Boolean gate must produce no more than one event per transition.
       • Output logic state changes: one transition is necessary.
       • Output logic state unchanged: no transition is necessary.

  7. Event Propagation
     • Single lumped inertial delay modeled for each gate.
     • PI transitions assumed to occur without time skew.
     • [Figure: example circuit annotated with gate delays and event times along paths P1, P2, and P3.]

  8. Inertial Delay of an Inverter
     • $d = \dfrac{d_{HL} + d_{LH}}{2}$
     • [Figure: Vin and Vout waveforms versus time, marking dHL and dLH.]

  9. Multi-Input Gate
     • Gate with inputs A and B, output C, and delay d < DPD.
     • DPD: differential path delay.
     • [Figure: waveforms of A, B, and C; the transitions on A and B are separated by DPD, producing a hazard (glitch) at C.]

  10. Balanced Path Delays
      • Gate with inputs A and B, output C, and delay d < DPD.
      • A delay buffer on one input path balances the path delays.
      • [Figure: waveforms of A, B, and C; no glitch at C.]

  11. Glitch Filtering by Inertia
      • Gate with inputs A and B, output C, and delay d > DPD.
      • [Figure: waveforms of A, B, and C; with d > DPD the glitch at C is filtered.]

  12. Theorem
      • Given that events occur at the input of a gate, whose inertial delay is d, at times $t_1 \le \dots \le t_n$, the number of events at the gate output cannot exceed
        $\min\left(n,\ 1 + \left\lfloor \dfrac{t_n - t_1}{d} \right\rfloor\right)$
      • [Figure: time axis marking t1, t2, t3, ..., tn and the interval tn − t1.]
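
The bound in the theorem is easy to evaluate numerically. The sketch below is only an illustration: the function name `max_output_events` and the sample arrival times are assumptions, not part of the lecture, and the quotient is rounded down because an event count must be an integer.

```python
import math

def max_output_events(arrival_times, d):
    """Upper bound on output events: min(n, 1 + floor((t_n - t_1) / d))."""
    if d <= 0:
        raise ValueError("inertial delay d must be positive")
    if not arrival_times:
        return 0
    n = len(arrival_times)
    spread = max(arrival_times) - min(arrival_times)  # t_n - t_1
    return min(n, 1 + math.floor(spread / d))

# Four input events spread over 5 time units, gate delay 2: at most 3 output events.
print(max_output_events([0, 1, 2, 5], d=2))
# All events arrive within less than one gate delay: at most 1 output event.
print(max_output_events([0, 0.5, 1], d=2))
```

When the spread of arrival times is smaller than d, the bound collapses to one event, which is the minimum transient energy condition used in the following slides.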

  13. Minimum Transient Design
      • Minimum transient energy condition for a Boolean gate: $|t_i - t_j| < d$, where $t_i$ and $t_j$ are arrival times of input events and d is the inertial delay of the gate.

  14. Balanced Delay Method
      • All input events arrive simultaneously
      • Overall circuit delay not increased
      • Delay buffers may have to be inserted
      • [Figure: example circuit annotated with gate and buffer delays; no increase in critical path delay.]

  15. Hazard Filter Method
      • Gate delay is made greater than maximum input path delay difference
      • No delay buffers needed (least transient energy)
      • Overall circuit delay may increase
      • [Figure: example circuit annotated with gate delays.]

  16. Designing a Glitch-Free Circuit
      • Maintain specified critical path delay.
      • Glitch suppressed at all gates by
        • Path delay balancing
        • Glitch filtering by increasing inertial delay of gates or by inserting delay buffers when necessary.
      • A linear program optimally combines all objectives.
      • [Figure: gate of delay D with input path delays d1 and d2, satisfying |d1 − d2| < D.]

  17. Problem Complexity
      • Number of paths in a circuit can be exponential in circuit size.
      • Considering all paths through enumeration is infeasible for large circuits.
      • Example: c880 has 6.96M path constraints.

  18. Define Arrival Time Variables
      • $d_i$: gate delay.
      • Define two timing window variables per gate output:
        • $t_i$: earliest time of signal transition at gate i.
        • $T_i$: latest time of signal transition at gate i.
      • Glitch suppression constraint: $T_i - t_i < d_i$
      • [Figure: gate i of delay di with input windows (t1, T1), ..., (tn, Tn) and output window (ti, Ti).]
      • Reference: T. Raja, Master’s Thesis, Rutgers Univ., 2002.
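
A minimal sketch of how the timing windows can be propagated and the glitch suppression constraint checked, using only the definitions on this slide. The netlist format, the function names `timing_windows` and `glitch_free`, and the two-gate example circuit are hypothetical; primary inputs are assumed to switch at time 0 without skew.

```python
def timing_windows(gates):
    """Compute (t, T) per node. `gates` maps name -> (delay, fanin names) and
    must be listed in topological order; nodes without fanins are primary inputs."""
    t, T = {}, {}
    for name, (delay, fanins) in gates.items():
        if not fanins:                      # primary input: transition at time 0
            t[name], T[name] = 0, 0
        else:
            t[name] = min(t[f] for f in fanins) + delay   # earliest transition
            T[name] = max(T[f] for f in fanins) + delay   # latest transition
    return t, T

def glitch_free(gates, t, T):
    """For each multi-input gate, report whether T_i - t_i < d_i holds."""
    return {name: (T[name] - t[name]) < delay
            for name, (delay, fanins) in gates.items() if len(fanins) > 1}

# Hypothetical circuit: an inverter on input a feeds a NAND that also reads input b.
fast = {"a": (0, []), "b": (0, []), "inv": (1, ["a"]), "nand": (1, ["inv", "b"])}
slow = {"a": (0, []), "b": (0, []), "inv": (1, ["a"]), "nand": (2, ["inv", "b"])}

print(glitch_free(fast, *timing_windows(fast)))  # window 1 is not < delay 1
print(glitch_free(slow, *timing_windows(slow)))  # window 1 is < delay 2
```

Raising the NAND delay from 1 to 2 in this toy case is the hazard-filter remedy; inserting a unit delay buffer on input b would be the balanced-delay remedy.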

  19. Linear Program
      • Variables: gate and buffer delays, arrival time variables.
      • Objective: minimize number of buffers.
      • Subject to: overall circuit delay constraint for all input-output paths.
      • Subject to: minimum transient energy condition for all multi-input gates.

  20. An Example: Full Adder
      • [Figure: full-adder circuit add1b with unit gate delays; critical path delay = 6.]

  21. Linear Program
      • Gate variables: $d_4, \dots, d_{12}$
      • Buffer delay variables: $d_{15}, \dots, d_{29}$
      • Window variables: $t_4, \dots, t_{29}$ and $T_4, \dots, T_{29}$

  22. Multiple-Input Gate Constraints
      • For gate 7:
        • $T_7 \ge T_5 + d_7$ and $t_7 \le t_5 + d_7$
        • $T_7 \ge T_6 + d_7$ and $t_7 \le t_6 + d_7$
        • Glitch suppression: $d_7 > T_7 - t_7$

  23. Single-Input Gate Constraints
      • Buffer 19: $T_{16} + d_{19} = T_{19}$ and $t_{16} + d_{19} = t_{19}$

  24. Critical Path Delay Constraints
      • $T_{11} \le \mathit{maxdelay}$
      • $T_{12} \le \mathit{maxdelay}$
      • maxdelay is specified.

  25. Objective Function
      • Need to minimize the number of buffers.
      • Because that leads to a nonlinear objective function, we use an approximate criterion: minimize $\sum_{\text{all buffers}} (\text{buffer delay})$, i.e., minimize $d_{15} + d_{16} + \dots + d_{29}$.
      • This gives a near-optimum result.
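
Collecting the per-gate examples from slides 22 to 25 into one statement, the linear program can be summarized as below. This is a restatement under assumed notation, not a formula taken from the slides: $F(i)$ for the set of fanins of gate $i$, $B$ for the set of buffers, and $PO$ for the set of primary outputs are symbols introduced here for compactness.

```latex
\begin{aligned}
\text{minimize}\quad & \sum_{b \in B} d_b \\
\text{subject to}\quad
& T_i \ge T_j + d_i, \quad t_i \le t_j + d_i
  && \text{for every multi-input gate } i,\ j \in F(i) \\
& d_i > T_i - t_i
  && \text{for every multi-input gate } i \\
& T_i = T_j + d_i, \quad t_i = t_j + d_i
  && \text{for every single-input gate or buffer } i,\ F(i) = \{j\} \\
& T_k \le \mathit{maxdelay}
  && \text{for every } k \in PO
\end{aligned}
```

Each gate contributes a number of constraints proportional to its fanin count, so the constraint set grows with the size of the circuit rather than with the number of paths.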

  26. AMPL Solution: maxdelay = 6
      • [Figure: full-adder circuit annotated with the optimized gate and buffer delays; critical path delay = 6.]

  27. AMPL Solution: maxdelay = 7
      • [Figure: full-adder circuit annotated with the optimized gate and buffer delays; critical path delay = 7.]

  28. AMPL Solution: maxdelay ≥ 11
      • [Figure: full-adder circuit annotated with the optimized gate and buffer delays; critical path delay = 11.]

  29. ALU4: Four-Bit ALU 74181
      • Maximum power savings (zero-buffer design): peak = 33%, average = 21%

  30. ALU4: Original and Low-Power

  31. Benchmark Circuits

      Circuit   Max-delay   No. of    Normalized power
                (gates)     buffers   Average   Peak
      ALU4         7           5      0.80      0.68
      ALU4        15           0      0.79      0.67
      C880        24          62      0.68      0.54
      C880        48          34      0.68      0.52
      C6288       47         294      0.40      0.36
      C6288       94         120      0.36      0.34
      c7552       43         366      0.44      0.34
      c7552       86         111      0.42      0.32

  32. C7552 Circuit: Spice Simulation
      • Power saving: average 58%, peak 68%

  33. References
      • R. Fourer, D. M. Gay and B. W. Kernighan, AMPL: A Modeling Language for Mathematical Programming, South San Francisco: The Scientific Press, 1993.
      • M. Berkelaar and E. Jacobs, “Using Gate Sizing to Reduce Glitch Power,” Proc. ProRISC Workshop, Mierlo, The Netherlands, Nov. 1996, pp. 183-188.
      • V. D. Agrawal, “Low Power Design by Hazard Filtering,” Proc. 10th Int’l Conf. VLSI Design, Jan. 1997, pp. 193-197.
      • V. D. Agrawal, M. L. Bushnell, G. Parthasarathy and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and Linear Programming Method,” Proc. 12th Int’l Conf. VLSI Design, Jan. 1999, pp. 434-439.
      • T. Raja, V. D. Agrawal and M. L. Bushnell, “Minimum Dynamic Power CMOS Circuit Design by a Reduced Constraint Set Linear Program,” Proc. 16th Int’l Conf. VLSI Design, Jan. 2003, pp. 527-532.
      • T. Raja, V. D. Agrawal and M. L. Bushnell, “Transistor Sizing of Logic Gates to Maximize Input Delay Variability,” J. Low Power Electron., vol. 2, no. 1, pp. 121-128, Apr. 2006.
      • T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay CMOS Logic for Low Power Design,” IEEE Trans. VLSI Design, vol. 17, no. 10, pp. 1534-1545, Oct. 2009.

  34. Exercise: Dynamic Power
      • An average gate
        • Supply voltage, VDD = V = 1 volt
        • Output capacitance, C = 1 pF
        • Activity factor, α = 10%
        • Clock frequency, f = 1 GHz
      • What is the dynamic power consumption of a 1 million gate VLSI chip?

  35. Answer
      • Dynamic energy per transition = $0.5\,CV^2$
      • Dynamic power per gate = energy per second = $0.5\,CV^2 \alpha f = 0.5 \times 10^{-12} \times 1^2 \times 0.1 \times 10^{9} = 0.5 \times 10^{-4}$ W = 50 μW
      • Power for 1 million gate chip = 50 W
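
The arithmetic of the answer can be reproduced in a few lines. This is a sketch only; the function name `dynamic_power_per_gate` is an assumption, and the values are those given in the exercise.

```python
def dynamic_power_per_gate(c_farads, vdd_volts, activity, freq_hz):
    """Average dynamic power of one gate: 0.5 * C * V^2 * alpha * f (watts)."""
    return 0.5 * c_farads * vdd_volts ** 2 * activity * freq_hz

# Exercise values: C = 1 pF, V = 1 V, alpha = 10%, f = 1 GHz, 1 million gates.
p_gate = dynamic_power_per_gate(1e-12, 1.0, 0.10, 1e9)
p_chip = p_gate * 1_000_000

print(f"{p_gate * 1e6:.1f} uW per gate, {p_chip:.1f} W per chip")
```

Because power scales with $V^2$, rerunning the same function with a lower supply voltage shows the quadratic saving directly.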

  36. Components of Power
      • Dynamic
        • Signal transitions
          • Logic activity
          • Glitches
        • Short-circuit
      • Static
        • Leakage

  37. Subthreshold Conduction
      • $I_{ds} = I_0 \exp\left(\dfrac{V_{gs} - V_{th}}{nV_T}\right) \times \left(1 - \exp\dfrac{-V_{ds}}{V_T}\right)$
      • [Figure: transistor symbol (d, g, s) and a log-scale plot of Ids (10 pA to 1 mA) versus Vgs (0 to 1.8 V), marking Vth, the subthreshold slope, the subthreshold region, and the saturation region.]

  38. Thermal Voltage, $V_T$
      • $V_T = kT/q$ = 26 mV, at room temperature.
      • When $V_{ds}$ is several times greater than $V_T$: $I_{ds} = I_0 \exp\left(\dfrac{V_{gs} - V_{th}}{nV_T}\right)$

  39. Leakage Current
      • Leakage current equals $I_{ds}$ when $V_{gs} = 0$
      • Leakage current, $I_{ds} = I_0 \exp(-V_{th}/nV_T)$
      • At cutoff, $V_{gs} = V_{th}$, and $I_{ds} = I_0$
      • Lowering leakage to $10^{-b} \times I_0$ requires $V_{th} = b\,nV_T \ln 10 = 1.5b \times 26 \ln 10 = 90b$ mV (taking n = 1.5 and $V_T$ = 26 mV)
      • Example: to lower leakage to $I_0/1{,}000$, $V_{th}$ = 270 mV
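
The relation between threshold voltage and leakage reduction can be checked numerically. This is a minimal sketch with assumed function names; n = 1.5 and a thermal voltage of 26 mV are the values used in the slide's arithmetic.

```python
import math

def vth_for_leakage_reduction(decades, n=1.5, vt_mv=26.0):
    """Threshold voltage (mV) that lowers leakage to 10**(-decades) * I0."""
    return decades * n * vt_mv * math.log(10)

def leakage_fraction(vth_mv, n=1.5, vt_mv=26.0):
    """Leakage at Vgs = 0 as a fraction of I0: exp(-Vth / (n * VT))."""
    return math.exp(-vth_mv / (n * vt_mv))

per_decade = vth_for_leakage_reduction(1)    # about 89.8 mV, rounded to 90 mV on the slide
three_decades = vth_for_leakage_reduction(3)  # about 269.4 mV, quoted as 270 mV
print(f"{per_decade:.1f} mV per decade, {three_decades:.1f} mV for I0/1000")
print(f"leakage fraction at 270 mV: {leakage_fraction(270):.2e}")
```

The small gap between 269.4 mV and the slide's 270 mV comes only from rounding 89.8 mV per decade up to 90 mV.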

  40. Threshold Voltage
      • $V_{th} = V_{t0} + \gamma\left[(\Phi_s + V_{sb})^{1/2} - \Phi_s^{1/2}\right]$
      • $V_{t0}$ is threshold voltage when source is at body potential (0.4 V for 180 nm process)
      • $\Phi_s = 2V_T \ln(N_A/n_i)$ is surface potential
      • $\gamma = (2q\varepsilon_{si}N_A)^{1/2}\, t_{ox}/\varepsilon_{ox}$ is body effect coefficient (0.4 to 1.0)
      • $N_A$ is doping level = $8 \times 10^{17}$ cm$^{-3}$
      • $n_i = 1.45 \times 10^{10}$ cm$^{-3}$

  41. Threshold Voltage, $V_{sb}$ = 1.1 V
      • Thermal voltage, $V_T = kT/q$ = 26 mV
      • $\Phi_s$ = 0.93 V
      • $\varepsilon_{ox} = 3.9 \times 8.85 \times 10^{-14}$ F/cm
      • $\varepsilon_{si} = 11.7 \times 8.85 \times 10^{-14}$ F/cm
      • $t_{ox}$ = 40 Å
      • $\gamma$ = 0.6 V$^{1/2}$
      • $V_{th} = V_{t0} + \gamma\left[(\Phi_s + V_{sb})^{1/2} - \Phi_s^{1/2}\right]$ = 0.68 V
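
The numbers on slides 40 and 41 can be reproduced from the formulas. The sketch below assumes an electron charge of 1.602e-19 C (not stated on the slides) and uses hypothetical function names; all other values are taken from the slides, with lengths in cm.

```python
import math

Q = 1.602e-19            # electron charge in coulombs (assumed constant)
EPS0 = 8.85e-14          # permittivity of free space in F/cm
EPS_SI = 11.7 * EPS0     # silicon
EPS_OX = 3.9 * EPS0      # silicon dioxide

def surface_potential(n_a, n_i, v_t=0.026):
    """Phi_s = 2 * VT * ln(NA / ni), in volts."""
    return 2 * v_t * math.log(n_a / n_i)

def body_effect_coefficient(n_a, t_ox_cm):
    """gamma = sqrt(2 * q * eps_si * NA) * tox / eps_ox, in V^0.5."""
    return math.sqrt(2 * Q * EPS_SI * n_a) * t_ox_cm / EPS_OX

def threshold_voltage(v_t0, gamma, phi_s, v_sb):
    """Vth = Vt0 + gamma * (sqrt(Phi_s + Vsb) - sqrt(Phi_s)), in volts."""
    return v_t0 + gamma * (math.sqrt(phi_s + v_sb) - math.sqrt(phi_s))

phi_s = surface_potential(8e17, 1.45e10)       # slide value: 0.93 V
gamma = body_effect_coefficient(8e17, 40e-8)   # 40 angstrom = 40e-8 cm; slide value: 0.6
v_th = threshold_voltage(0.4, gamma, phi_s, 1.1)  # slide value: 0.68 V
print(f"Phi_s = {phi_s:.2f} V, gamma = {gamma:.2f} V^0.5, Vth = {v_th:.2f} V")
```

Setting `v_sb` to 0 returns $V_{t0}$, which makes the body-bias contribution to the threshold easy to isolate.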

  42. A Sample Calculation
      • VDD = 1.2 V, 100 nm CMOS process
      • Transistor width, W = 0.5 μm
      • OFF device ($V_{gs} = V_{th}$) leakage
        • $I_0$ = 20 nA/μm, for low threshold transistor
        • $I_0$ = 3 nA/μm, for high threshold transistor
      • 100M transistor chip
        • Power = $(100 \times 10^6/2)(0.5 \times 20 \times 10^{-9}\,\text{A})(1.2\,\text{V})$ = 600 mW for all low-threshold transistors
        • Power = $(100 \times 10^6/2)(0.5 \times 3 \times 10^{-9}\,\text{A})(1.2\,\text{V})$ = 90 mW for all high-threshold transistors

  43. Dual-Threshold Chip
      • Low threshold used only for the 20% of transistors on critical paths.
      • Leakage power = 600 × 0.2 + 90 × 0.8 = 120 + 72 = 192 mW
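
The leakage figures of slides 42 and 43 follow from one formula, sketched below. The function name `leakage_power_w` is an assumption; the factor of one half mirrors the slide's expression, which divides the transistor count by 2.

```python
def leakage_power_w(n_transistors, width_um, i0_a_per_um, vdd):
    """Chip leakage power in watts, counting half of the transistors
    as in the slide's expression (n/2) * (W * I0) * VDD."""
    return (n_transistors / 2) * (width_um * i0_a_per_um) * vdd

low = leakage_power_w(100e6, 0.5, 20e-9, 1.2)   # all low-threshold: 0.6 W
high = leakage_power_w(100e6, 0.5, 3e-9, 1.2)   # all high-threshold: 0.09 W
dual = 0.2 * low + 0.8 * high                   # 20% low-threshold, 80% high-threshold

print(f"low: {low * 1e3:.0f} mW, high: {high * 1e3:.0f} mW, dual: {dual * 1e3:.0f} mW")
```

Varying the 20% share shows how quickly leakage grows as more transistors are kept at low threshold.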

  44. Dual-Threshold CMOS Circuit

  45. Dual-Threshold Design
      • To maintain performance, all gates on critical paths are assigned low Vth.
      • Most other gates are assigned high Vth.
      • But some gates on non-critical paths may also be assigned low Vth to prevent those paths from becoming critical.

  46. Integer Linear Programming (ILP) to Minimize Leakage Power
      • Use dual-threshold CMOS process
      • First, assign all gates low Vth
      • Use an ILP model to find the delay (Tc) of the critical path
      • Use another ILP model to find the optimal Vth assignment as well as the reduced leakage power for all gates without increasing Tc
      • Further reduction of leakage power possible by letting Tc increase

  47. ILP - Variables
      For each gate i define two variables.
      • $T_i$: the longest time at which the output of gate i can produce an event after the occurrence of an input event at a primary input of the circuit.
      • $X_i$: a variable specifying low or high Vth for gate i; $X_i$ is an integer in {0, 1}, where 1 means gate i is assigned low Vth and 0 means gate i is assigned high Vth.

  48. ILP - Objective Function
      • Leakage power: minimize the sum of all gate leakage currents, given by $\sum_i \left[X_i I_{Li} + (1 - X_i) I_{Hi}\right]$
      • $I_{Li}$ is the leakage current of gate i with low Vth
      • $I_{Hi}$ is the leakage current of gate i with high Vth
      • Using SPICE simulation results, construct a leakage current lookup table, which is indexed by the gate type and the input vector.

  49. ILP - Constraints
      • For each gate i, where the output of gate j is a fanin of gate i: constraints (1) and (2)
      • Max delay constraints for primary outputs (PO): constraint (3)
      • Tmax is the maximum delay of the critical path
      • [Figure: gate j with output time Tj driving gate i with output time Ti.]
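
To make the dual-threshold assignment problem concrete, the sketch below solves a tiny instance by exhaustive search rather than with an ILP solver. Everything in it is an assumption for illustration: the four-gate circuit, the per-gate delays and leakage values, the function names, and the delay model, in which the arrival time of a gate is the latest fanin arrival plus the gate's own delay for its chosen threshold.

```python
from itertools import product

def critical_delay(fanins, delay):
    """Longest arrival time over all gates; `fanins` must be in topological order."""
    T = {}
    for g in fanins:
        T[g] = max((T[j] for j in fanins[g]), default=0) + delay[g]
    return max(T.values())

def best_assignment(fanins, d_low, d_high, i_low, i_high, t_max):
    """Try every X in {0,1}^n (1 = low Vth) and keep the least-leakage feasible one."""
    gates = list(fanins)
    best = None
    for x in product((0, 1), repeat=len(gates)):
        delay = {g: d_low[g] if xi else d_high[g] for g, xi in zip(gates, x)}
        if critical_delay(fanins, delay) > t_max:
            continue  # violates the maximum delay constraint
        leak = sum(i_low[g] if xi else i_high[g] for g, xi in zip(gates, x))
        if best is None or leak < best[0]:
            best = (leak, dict(zip(gates, x)))
    return best

# Hypothetical circuit: g1 -> g2 -> g4 is the long path, g3 -> g4 is the short one.
fanins = {"g1": [], "g2": ["g1"], "g3": [], "g4": ["g2", "g3"]}
d_low = {g: 1 for g in fanins}    # low Vth: fast
d_high = {g: 2 for g in fanins}   # high Vth: slow
i_low = {g: 10 for g in fanins}   # low Vth: leaky (arbitrary units)
i_high = {g: 1 for g in fanins}   # high Vth: low leakage

tc = critical_delay(fanins, d_low)  # step 1: critical delay with all gates at low Vth
leak, x = best_assignment(fanins, d_low, d_high, i_low, i_high, t_max=tc)
print(tc, leak, x)
```

Only g3, which lies off the critical path, can move to high Vth without increasing Tc. Exhaustive search is exponential in the number of gates, so it serves only to illustrate what the ILP computes.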

  50. ILP Constraint Example
      • Assume all primary input (PI) signals on the left arrive at the same time.
      • For gate 2, constraints are
