
ECE260B – CSE241A Winter 2005 Power Distribution


Motivation



  • Power supply noise is a serious issue in DSM design

    • Noise is getting worse as technology scales

    • Noise margin decreases as supply voltage scales

    • Power supply noise may slow down circuit performance

    • Power supply noise may cause logic failures







Power = …

  • Routing resources

    • 20-40% of all metal tracks used by Vcc, Vss

    • Increased power → denser power grid

  • Pins

    • Vcc or Vss pin carries 0.5-1W of power

    • Pentium4 uses 423 pins; 223 Vcc or Vss

    • More pins → package more expensive (+ package development, motherboard redesign, …)

  • Battery cost

    • 1kg NiCad battery powers a Pentium 4 alone for less than 1 hour

  • Performance

    • High chip temperatures degrade circuit performance

    • Large across-chip temperature variations induce clock skew

    • High chip power limits use of high-performance circuits

    • Power transients determine minimum power supply voltage

Power = Package

[Figure: processor package; labels: heat sink, integrated heat spreader, processor pins, decoupling capacitors, package pins]

Pentium4 die is about 1.5g and less than 1cm^3

Pentium-4 in package with interposer, heat sink, and fan can be 500g and 150cm^3

Modern processor packaging is complex and adds significantly to product cost.

Courtesy M. McDermott UT-Austin

Planning for Power

  • Early simulation of major power dissipation components

  • Early quantification of chip power

    • Total chip power

    • Maximum power density

    • Total chip power fluctuations

      • inherent & added fluctuations due to clock gating

  • Early power distribution analysis (dc, ac, & multi-cycle)

    • I.e., average, maximum, multi-cycle fluctuations

  • Early allocation & coordination of chip resources

    • Wiring tracks for power grid

    • Low Vt devices

    • Dynamic circuits

    • Clock gating

    • Placement and quantity of added decoupling capacitors

    Power and Ground Routing

    • Floorplanning includes planning how the power, ground and clock should route

    • Power supply distribution

      • Tree: trunk must supply current to all branches

      • Resistance must be very small since when a gate switches, its current flows through the supply lines

        • If the resistance of supply lines is too large, voltage supplied to gates will drop, which can cause the gate to malfunction

        • Usually, want at most 5-10% IR drop due to supply resistance

      •  Usually on the top layers of metal, then distributed to lower wiring layers
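The 5-10% IR drop budget above can be checked with a few lines of arithmetic. The sketch below models a single supply rail fed from one end, with gates tapping current along it. The sheet resistance, wire dimensions, tap currents, and the 1.2 V supply are illustrative assumptions, not values from the slides.

```python
def rail_ir_drop(r_sheet, width_um, seg_len_um, tap_currents_a):
    """Return the voltage drop (V) seen at each tap of a rail fed from one end.

    Each segment between taps has resistance r_sheet * seg_len_um / width_um
    and carries the current of every tap downstream of it.
    """
    r_seg = r_sheet * seg_len_um / width_um
    drops = []
    drop = 0.0
    remaining = sum(tap_currents_a)
    for i_tap in tap_currents_a:
        drop += remaining * r_seg   # this segment carries all remaining current
        drops.append(drop)
        remaining -= i_tap          # current peels off at the tap
    return drops


VDD = 1.2                           # assumed supply voltage (V)
tap_drops = rail_ir_drop(r_sheet=0.05, width_um=2.0, seg_len_um=100.0,
                         tap_currents_a=[0.002] * 5)
worst_fraction = max(tap_drops) / VDD
print(f"worst-case drop: {max(tap_drops) * 1e3:.1f} mV "
      f"({worst_fraction:.1%} of VDD)")
```

The farthest tap sees the largest drop, so widening the rail (or feeding it from both ends) is what brings a violating design back under budget.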

    Planar Power Distribution

    [Figure: planar VDD/VSS routing examples; labels: cut line, no cut line, no connection]

    • Topology of VDD/VSS networks.

      • Inter-digitated

      • Design each macrocell such that all VDD and VSS terminals are on opposite sides.

      • If floorplan places all macrocells with VDD on same side, then no crossing between VDD and VSS.














    Courtesy K. Yang, UCLA

    Gridded Power Distribution

    • With more metal layers, power is striped

      • Connection between the stripes allows a power grid

        • Minimizes series resistance

      • Connection of lower layer layout/cells to the grid is through vias

        • Note that planar supply routing is often still needed for a strong lower layer connection.

        • There may not be sufficient area to make a strong connection in the middle of a design (connect better at periphery of die)

    Courtesy K. Yang, UCLA

    Power Supply Drop/Noise

    • Supply noise = variations in power supply voltage that act as noise source for logic gates

      • Power supply wiring resistance → voltage variations with current surges

      • Current surges depend on dynamic behavior of circuit

    • Solution approach

      • Measure maximum current required by each block

      • Redesign power/ground network to reduce resistance

      • Worst case: move activity to another clock cycle to reduce peak current → scheduling problem

    • Example: Drive 32-bit bus, total bus wire load = 2pF, with delay 0.5ns

      • R for each transistor needs to be < 0.25 kΩ to meet RC = 0.5 ns

      • Effective R of all 32 bits switching together is 250/32 ≈ 7.8 Ω

      • For < 10% drop, power distribution R must be < 1 Ω
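The bus example can be reproduced numerically. This is a minimal sketch that takes the slide's numbers at face value (2 pF load, 0.5 ns target, 32 drivers) and treats the supply wiring and the 32 parallel drivers as a simple resistive divider; the divider model is an assumption used only to illustrate the 10% budget.

```python
C_LOAD = 2e-12        # F, load used in the RC = 0.5 ns estimate
T_DELAY = 0.5e-9      # s, target delay
N_BITS = 32           # drivers switching together
MAX_DROP = 0.10       # allowed fraction of the supply lost in the wiring

r_driver = T_DELAY / C_LOAD             # per-driver resistance limit (ohm)
r_parallel = r_driver / N_BITS          # all drivers on at once (ohm)

# Divider: r_supply / (r_supply + r_parallel) < MAX_DROP
r_supply_max = r_parallel * MAX_DROP / (1.0 - MAX_DROP)

print(f"driver R     < {r_driver:.0f} ohm")
print(f"32 drivers   = {r_parallel:.2f} ohm")
print(f"supply R     < {r_supply_max:.2f} ohm")
```

Under these assumptions 250/32 evaluates to about 7.8 Ω, and the supply resistance limit lands a little under 1 Ω.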

    Courtesy K. Yang, UCLA


    Electromigration

    • Physical migration of metal atoms due to “electron wind” can eventually create a break in a wire

      • MTTF (mean time to failure) ∝ 1/J², where J = current density

      • Current density must not exceed specification: for every wire, Ii/wi < Jspec

      • Specified as mA per μm of wire width (e.g., 1 mA/μm) or mA per via cut

    • EM occurs both in signal (AC=bidirectional) and power wires (DC = unidirectional)

      • Much worse for DC than AC; DC occurs inside cells and in power buses

    • May need more contacts on transistor sources and drains to meet EM limits

    • Width of power buses must support both iR and EM requirements

    • Issues in IR and EM constraint generation

      • Topology is most likely not a tree

      • How do we determine current patterns?

      • Effects of R, L
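The two electromigration relations above (a current-density limit per wire and MTTF scaling as 1/J²) translate directly into code. In this sketch the 1 mA/μm limit is the example figure quoted on the slide; the branch names and currents are hypothetical.

```python
def min_width_for_em(i_ma, j_spec_ma_per_um):
    """Smallest wire width (um) that keeps I/w at or below the EM limit."""
    return i_ma / j_spec_ma_per_um


def mttf_scale(j_ratio, exponent=2.0):
    """Factor by which MTTF changes when J is multiplied by j_ratio.

    Uses MTTF proportional to 1 / J**exponent (exponent 2 as on the slide).
    """
    return j_ratio ** (-exponent)


J_SPEC = 1.0  # mA per um of width (example limit)
branch_currents_ma = {"trunk": 6.0, "stripe": 2.5, "rail": 0.4}  # hypothetical

em_widths = {name: min_width_for_em(i, J_SPEC)
             for name, i in branch_currents_ma.items()}
print(em_widths)
print("MTTF factor when J doubles:", mttf_scale(2.0))
```

In practice the final width of a power bus is the larger of this EM-driven width and the width needed to meet the IR drop budget.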

    What Happens?

    • Example of an AlCu line seen under microscope.

      • Accelerated by higher temperature and high currents

      • Voids form on grain boundaries

      • Metal atoms move with current away from voids and collect at boundaries

    Catastrophic failure

    Courtesy K. Yang, UCLA

    Taken from Sverre Sjøthun, “Electromigration In-Depth”

    Courtesy S. Sapatnekar, UMinn

    Power Supply Rules of Thumb

    • Rules depend on technology

      • Tech file has rules for resistance and electromigration

    • Examples:

      • Must have a contact for each 16λ of transistor width (more is better)

      • Wire must carry less than 1 mA per μm of width

      • Power/Gnd width = Length of wire × Sum(all transistors connected to wire) / (3×10⁶ λ) (very approximate)

    • For small designs, power supply design is non-issue

    Courtesy K. Yang, UCLA

    Basic Methodology Concepts

    • Reliability (slotting, splitting)

    • Alignment of hierarchical rings, stripes

    • Isolation of analog power

    • Styles of power distribution

      • Rings and trunks

      • Uniform grid

      • Bottom-up grid generation

      • Depends on:

        • Package: flip-chip vs. wire-bond; I/O count (fewer pads → denser grid)

        • Power budget

        • IR drop limits

        • Floorplan constraints (hard macros, etc.)

    Metal Slotting vs. Splitting

    Easy connections through standard via arrays

    • Required by metal layout rules for uniform CMP (planarization)

    • Split power wires

      • Less data than traditional slotting

      • More accurate R/C analysis of power mesh

      • Not supported by all tools








    Difficult to connect - where should vias go?

    Courtesy Cadence Design Systems, Inc.

    Trunks and Rings Methodology

    • Each Block has its own ring

      • Rings may be inside the blocks or part of the top level

    • Each Block has trunks connecting top level to block





    [Figure: floorplan with blocks 1–5; labels: rings may be shared with abutted blocks; individual trunks connecting blocks to top level]









    Courtesy Cadence Design Systems, Inc.

    Trunks and Rings

    • Advantages

      • Power tailored to the demands of each block (flexible)

      • More area efficient since the demands of each block are uniquely met

      • Simple implementation supported by many tools

      • Rings can be shared between abutted blocks

    • Disadvantages

      • Limited redundancy; power grid is built to match needs

      • Assumptions in design may change or be invalid

      • Non-regular structure requires more detailed IR drop/EM analysis

      • Missing vias/connections are fatal

      • Rings will require slotting/splitting due to wide widths

      • Increase in data volume

    Courtesy Cadence Design Systems, Inc.

    Uniform Chip Grid Methodology

    • Robust and redundant power network

      • mainly in microprocessors and high end large ASICs

    • Implementation

      • Primary distribution through upper metal layers

        • Lower layers in blocks to connect to top through via stacks

      • Typically pushed into blocks

      • Blocks typically abut

        • Requires block grids to align

      • Rows/Followpins should align with block pins

        • Global buffer insertion

    [Figure: floorplan of abutted blocks; labels: global grid on higher layers; fine or custom grid, or no grid, on lower layers]









    Courtesy Cadence Design Systems, Inc.

    Uniform Chip Grid

    • Advantages

      • Easily implemented

      • Lends itself to straightforward hand calculations

      • Path redundancy gives less sensitivity to changes in current pattern

      • Mesh of power/ground provides shielding (for capacitance) and current returns (for inductance)

      • Top-down propagation easy to use on this style

    • Disadvantages

      • Takes up significant routing resources (20%-40% of all routing tracks if not already reserved for power/ground)

      • Fine grids may slow down P&R tools

      • Imposes grid structure into each block, which may be unnecessary

      • Top and blocks coupled closely if top-level routing is pushed into blocks

      • Changes to block or top level must be reflected in the other

    Courtesy Cadence Design Systems, Inc.

    Bottom-Up Grid Generation Methodology

    • Design and optimize power grid for block, merge at top

    • Advantages

      • Able to tailor grid for routing resource efficiency in each block

      • Flexibility to choose the best grid for the block (i.e. ring and stripe, power plane, grid)

    • Disadvantages

      • Designing grid in context of the “big picture” is more difficult

      • Block grid may present challenging connections to top level

      • Assumptions for block grid’s connection to top level must be analyzed and validated

    Courtesy Cadence Design Systems, Inc.

    Power Routing in Area-Based P&R

    • Power routing approaches

      • (1) Pre-route parts of power grid during floorplanning

      • (2) Build grid (except connections to standard cells) before P&R

      • (3) Build entire grid before P&R

      • N.B.: Area-based P&R tools respect pre-routes absolutely

    • Cadence tools: power routing done inside SE, all other tasks (clock, place, route, scan, …) done by point tools

      • Lab 5 tomorrow has a tiny bit of power routing (rings, stripes)

    • Miscellany

      • ECOs: What happens to rings and trunks if blocks change size?

      • Layer choices: What is cost of skipping layers (to get from thick top-layer metal down to finer layers)?

      • How wide should power wires be?

      • Post-processing strategies

    Courtesy Cadence Design Systems, Inc.

    Power Routing Wire Width Considerations

    • Slotting rules: Choose maximum width below slotting width

    • Halation (width-dependent spacing) rules: Do as much as possible of power routing below wide wire width to save routing space

    • Choose power routing widths carefully to avoid blocking extra tracks (and, use the space if blocking the track!)

      What is better power width here?

    Blocked tracks

    Courtesy Cadence Design Systems, Inc.

    Power Routing Tool Usage

    • 4 layer power grid example (HVHV)

      • Turn on via stacking

      • Route metal2 vertically

      • Route metal4 vertically (use same coordinates)

      • Route metal3 horizontally (make coincident with every N metal1 routes)

      • Turn off via stacking

      • Route metal1 horizontally



    metal1 inside cells

    metal3 every n micron

    Courtesy Cadence Design Systems, Inc.

    Post-Processing Flows (DEF or Layout Editing)

    During PnR

    After post processing

    Courtesy Cadence Design Systems, Inc.

    (Tree) Supply Network Design

    • Tree topology assumption not very useful in practice, but illustrates some basic ideas

    • Assume R dominates, L and C negligible

      • marginally permissible assumption

    • Current drawn at various points in the tree (time-varying waveform)

    • Current causes a V=IR drop

      • “Ground” is not at 0V

      • “Vdd” is not at intended level


    = sinks

    Courtesy S. Sapatnekar, UMinn

    IR Drop Constraints

    • Chowdhury and Breuer, TCAD 7/88

    • Can write V drop to each sink as

      • $\sum_i R_i I_i < V_{spec}$ (sum over the wires on the path to the sink), for all sink current patterns made available

      • Tree structure: can compute $I_i$ easily

      • $R_i = \rho\, l_i / w_i$

    • Change $w_i$ to reduce IR drop

    • Objective: minimize $\sum_i a_i w_i$

    • Current density must never exceed a specification

      • For each wire, $I_i / w_i < J_{spec}$
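For a tree, both constraints can be evaluated by a simple traversal: the current in each wire is the sum of the sink currents below it, and the drop to a sink is the sum of R·I along its path from the pad. The sketch below is only an illustration of that bookkeeping; the three-wire tree, the sheet resistance used for ρ, and the Vspec and Jspec values are all hypothetical.

```python
RHO = 0.05   # ohm per square, assumed sheet resistance standing in for rho

# node -> (parent, length_um, width_um, sink_current_a); "pad" is the root
TREE = {
    "a": ("pad", 200.0, 4.0, 0.0),
    "b": ("a", 100.0, 2.0, 0.0008),
    "c": ("a", 100.0, 2.0, 0.0012),
}


def branch_currents(tree):
    """Current in the wire feeding each node = sum of sink currents below it."""
    current = {node: 0.0 for node in tree}
    for node, (_, _, _, i_sink) in tree.items():
        n = node
        while n in tree:            # walk up to the pad, adding this sink
            current[n] += i_sink
            n = tree[n][0]
    return current


def sink_drops(tree, rho=RHO):
    """IR drop from the pad to every node: sum of R_i * I_i along the path."""
    i_branch = branch_currents(tree)

    def drop(node):
        if node not in tree:        # reached the pad
            return 0.0
        parent, length, width, _ = tree[node]
        return drop(parent) + rho * length / width * i_branch[node]

    return {node: drop(node) for node in tree}


V_SPEC = 0.06    # V, assumed IR budget
J_SPEC = 0.001   # A per um of width, assumed EM limit

wire_currents = branch_currents(TREE)
node_drops = sink_drops(TREE)
ir_ok = all(d < V_SPEC for d in node_drops.values())
em_ok = all(wire_currents[n] / TREE[n][2] < J_SPEC for n in TREE)
print(node_drops, ir_ok, em_ok)
```

A wire-sizing loop would widen whichever wire violates a constraint while keeping the weighted sum of widths as small as possible.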


    Courtesy S. Sapatnekar, UMinn

    P/G Mesh Optimization (R only)

    • Dutta and Marek-Sadowska, DAC 89

    • Cost function: $\sum_i a_i l_i w_i = \sum_i a_i \rho\, c_i l_i^2$ = total wire area (since $c_i$ = conductance = $w_i / (\rho\, l_i)$)

    • Constraints

      • EM: $I_i \le e\, w_i$ (current density $I/w$ less than upper bound $e$)

        • Substitute $I_i = v_i\, w_i / (\rho\, l_i)$ (I = V/R, with $v_i = v_p - v_q$ the voltage across wire $i$), giving $|v_p - v_q| \le e\, \rho\, l_i$ (divide by $w_i$, multiply by $\rho\, l_i$)

      • Wire width constraints: $W_{min} \le w_i \le W_{max}$ (translate to $c_i$)

      • Voltage drop constraints: $v_a - v_b \le V_{spec1}$ and/or $v_i \ge V_{spec2}$ (VDD net)

      • Circuit equations that determine the $v$’s

    • Variables: $c_i$’s ($v_i$’s depend on $c_i$’s)

    Courtesy S. Sapatnekar, UMinn

    Solution Technique

    • Method of feasible directions

      • Find an initial feasible solution (satisfies all constraints)

      • Choose a direction that maintains feasibility

      • Make a move in that direction to reduce cost function

    • Given a set of ci’s, must find corresponding vi’s

      • Feasible direction method: move from point c* to c+

      • c* and c+ must be close to each other (i.e., if you have the solution at c*, the solution at c+ corresponds to a minor change in conductances)

      • Solving for vi’s : solving a system of linear equations

        • Solution at c* is a good guess for the solution at c+

        • Converges in a few iterations
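The benefit of reusing the solution at c* as the starting guess at c+ can be seen on a toy network. The sketch below uses Gauss-Seidel iteration on a three-node resistive chain fed from one pad; the solver choice, the chain topology, and all numeric values are assumptions for illustration, not the method used in the cited work.

```python
def solve_chain(cond, loads, vdd, v0, tol=1e-9, max_iter=10_000):
    """Gauss-Seidel solve of node voltages on a chain fed from one pad.

    cond[k] is the conductance (S) of the wire feeding node k from node k-1
    (node -1 is the pad at vdd); loads[k] is the current (A) drawn at node k.
    Returns (voltages, number of sweeps used).
    """
    v = list(v0)
    n = len(v)
    for sweep in range(1, max_iter + 1):
        biggest = 0.0
        for k in range(n):
            up = vdd if k == 0 else v[k - 1]
            g_down = cond[k + 1] if k + 1 < n else 0.0
            down = v[k + 1] if k + 1 < n else 0.0
            new = (cond[k] * up + g_down * down - loads[k]) / (cond[k] + g_down)
            biggest = max(biggest, abs(new - v[k]))
            v[k] = new
        if biggest < tol:
            return v, sweep
    raise RuntimeError("Gauss-Seidel did not converge")


VDD = 1.2
LOADS = [0.01, 0.01, 0.01]          # A drawn at each node (hypothetical)
c_star = [1.0, 1.0, 1.0]            # conductances before the move
c_plus = [1.0, 1.1, 1.0]            # one wire widened by 10%

v_star, _ = solve_chain(c_star, LOADS, VDD, [VDD] * 3)
v_cold, sweeps_cold = solve_chain(c_plus, LOADS, VDD, [VDD] * 3)
v_warm, sweeps_warm = solve_chain(c_plus, LOADS, VDD, v_star)
print("cold start sweeps:", sweeps_cold, "warm start sweeps:", sweeps_warm)
```

Both runs reach the same voltages; the warm start simply begins much closer to them, which is why small moves in the feasible-direction method keep the per-step solve cheap.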

    Courtesy S. Sapatnekar, UMinn

    Modeling Gate Currents

    • Currents in supply grid caused by charging/discharging of capacitances by logic gates

    • All analyses require generation of a “worst-case switching” scenario

    • Enumeration is infeasible → Two basic approaches

      • Simulation based methods: designer supplies “hot” vectors, or we try to generate these hot vectors automatically

      • “Pattern-independent” methods: try to estimate the worst-case (can be expensive, very inaccurate)

    • Once current patterns are available, apply them to supply network to find out if constraints are satisfied

    Courtesy S. Sapatnekar, UMinn

    Complexity of Hot Vector Generation

    • Devadas et al., TCAD 3/92:

      • Assume zero gate delays for simplicity

      • Find the maximum current drawn by a block of gates

      • Using a current model for each gate

        • Find a set of input patterns so that the total current is maximized

        • Boolean assignment problem: equivalent to Weighted Max-Satisfiability

          • Satisfiability: given a Boolean formula in conjunctive normal form (product of sums), is there an assignment of truth values to the variables such that the formula evaluates to True?

          • Weighted Max-Satisfiability: each clause carries a weight; find the assignment that maximizes the total weight of satisfied clauses

        • Checking for Satisfiability (for k-SAT, k > 2) is NP-complete

      • ⇒ Difficult even under zero gate delay assumption
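To make the search problem concrete, the sketch below brute-forces the hottest input transition for a three-gate circuit under a zero-delay model. The circuit, the per-gate peak currents, and the rule that a gate draws its peak current whenever its output toggles are all illustrative assumptions. The search visits every pair of input vectors, 4^n pairs for n inputs, which is exactly why enumeration does not scale.

```python
from itertools import product

# (name, function of the input tuple and earlier gate values, peak current mA)
GATES = [
    ("g1", lambda x, g: x[0] and x[1], 1.0),       # AND of inputs 0 and 1
    ("g2", lambda x, g: not x[2], 0.5),            # inverter on input 2
    ("g3", lambda x, g: g["g1"] or g["g2"], 2.0),  # OR of g1 and g2
]


def evaluate(x):
    """Zero-delay evaluation of every gate output for input vector x."""
    values = {}
    for name, fn, _ in GATES:
        values[name] = bool(fn(x, values))
    return values


def toggle_current(x_before, x_after):
    """Total peak current of the gates whose outputs change."""
    before, after = evaluate(x_before), evaluate(x_after)
    return sum(i for name, _, i in GATES if before[name] != after[name])


def hottest_pair(n_inputs):
    """Exhaustively search all vector pairs; returns (current, before, after)."""
    vectors = list(product([False, True], repeat=n_inputs))
    return max(((toggle_current(a, b), a, b) for a in vectors for b in vectors),
               key=lambda item: item[0])


best_current, best_before, best_after = hottest_pair(3)
print(best_current, best_before, best_after)
```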

    Courtesy S. Sapatnekar, UMinn

    Pattern-Independent Methods

    • iMAX approach: Kriplani et al., TCAD 8/95

      • Current model for a single gate

      • Gates switch at different times

      • Total current drawn from Vdd (ignoring supply network C) is the sum of these time-shifted waveforms

    • Objective: find the worst-case waveform
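The summation of time-shifted gate waveforms can be sketched with a simple pulse model. Here each gate draws a triangular current pulse starting at its switching time; the triangular shape, the pulse widths, the peaks, and the switching times are all assumptions chosen for illustration.

```python
def triangle(t, start, width, peak):
    """Triangular current pulse: 0 -> peak -> 0 over [start, start + width]."""
    x = (t - start) / width
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return peak * (2 * x if x < 0.5 else 2 * (1 - x))


def total_current(times, gates):
    """Sum the pulses of all gates; gates holds (switch_time, width, peak)."""
    return [sum(triangle(t, s, w, p) for s, w, p in gates) for t in times]


times = [k * 0.25 for k in range(13)]            # 0.0 .. 3.0 in gate delays
gates = [(0.0, 1.0, 1.0), (0.5, 1.0, 1.0), (2.0, 1.0, 1.0)]

waveform = total_current(times, gates)
peak_of_sum = max(waveform)
sum_of_peaks = sum(p for _, _, p in gates)
print("peak of summed waveform:", peak_of_sum)
print("sum of individual peaks:", sum_of_peaks)
```

Because the pulses only partly overlap in time, the peak of the summed waveform is well below the sum of the individual peaks, which is why the temporal alignment of switching events matters for the worst case.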


     Delay

    Courtesy S. Sapatnekar, UMinn


    (Not to scale!)


    • Maximum current not just a sum of individual maximum currents

    • Temporal dependencies

    • [Using deliberate clock skews can reduce the peak current, as we saw in the Useful-Skew discussion]

    Courtesy S. Sapatnekar, UMinn

    Maximum Envelope Current (MEC)

    • Find the time interval during which a gate may switch

      • Manufacturing process variations can cause changes

      • Actual switching event can cause changes

      • Switching at second gate can occur at t=1 or at t=2

      • In general, a large number of paths can go through a gate; assume (conservatively) that switching occurs in t ∈ [1,2]

      • Assume that all gate inputs can switch independently – provides an upper bound on the switching current
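The envelope idea can be sketched as follows: for a gate that may switch anywhere in a window, take at each time instant the largest current any allowed switching time could produce, then sum these envelopes over the gates. The triangular pulse model, the windows, and the pulse parameters below are assumptions for illustration only.

```python
def triangle(t, start, width, peak):
    """Triangular current pulse: 0 -> peak -> 0 over [start, start + width]."""
    x = (t - start) / width
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return peak * (2 * x if x < 0.5 else 2 * (1 - x))


def envelope(t, t_early, t_late, width, peak):
    """Largest pulse value at time t over all switch times in the window."""
    # The pulse peaks half a width after it starts, so the best start time
    # is t - width/2, clamped into the allowed window.
    s_best = min(max(t - width / 2, t_early), t_late)
    return triangle(t, s_best, width, peak)


def mec(times, gates):
    """Sum of per-gate envelopes; gates holds (t_early, t_late, width, peak)."""
    return [sum(envelope(t, e, l, w, p) for e, l, w, p in gates) for t in times]


times = [k * 0.25 for k in range(17)]                  # 0.0 .. 4.0
gates = [(0.0, 0.0, 1.0, 1.0),                         # switches exactly at t=0
         (1.0, 2.0, 1.0, 1.0)]                         # switches in [1, 2]
bound = mec(times, gates)
print(bound)
```

Any actual switching time inside the window produces a waveform that lies under this bound, which is what makes the estimate safe but potentially pessimistic.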

    (unit gate delays)

    Courtesy S. Sapatnekar, UMinn





    (Large) Errors in MEC Approach

    • Correlation Problem

      • Switching at G0, G1, G2 and G3 not independent

      • G0 = 0 implies that G1, G2, G3 switch; G0 = 1 means that other inputs will determine gate activity

      • If the other inputs cannot make the gate switch in the same time window, then iMAX estimates are pessimistic

    • Reconvergent Fanout Problem

      • Signals that diverge at G0 reconverge at Gk → inputs to Gk are not independent

      • Assumption of independent switching is not valid

    • Many heuristic refinements proposed, but guardbanding (error) of power estimation still a huge issue







    Courtesy S. Sapatnekar, UMinn


    • Motivation

    • Power Supply Noise Estimation

    • Decoupling Capacitance (decap) Budget

    • Allocation of Decoupling Capacitance

    • Experiment Results

    • Conclusion

    Why Decoupling Capacitance

    • Frequency point of view

      • Decaps form low-pass filters

      • They cancel anti- effects

    • Physical point of view

      • Decaps serve as charge reservoirs

      • They shortcut supply current paths and reduce voltage drop

    • No effect on DC supply currents

    Power Supply Network—RLC Mesh






    [Figure: RLC mesh model of the power supply network; legend marks the VDD pins]




    Slide courtesy of S Zhao, K Roy & C.-K. Kok

    Current Distribution in Power Supply Mesh Illustration



    [Figure: power supply mesh feeding Module A; legend: current flow, connection points, VDD pins]

    Slide courtesy of S Zhao, K Roy & C.-K. Kok

    Current Distribution in Power Supply Network

    • Distribute switching current for each module in the power supply mesh

    • Observation: Currents tend to flow along the least-impedance paths

    • Approximation: Consider only those paths with minimal impedance --shortest, second shortest, …
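One simple way to realize this approximation is to treat the few lowest-impedance paths from the VDD pins to a module as parallel branches and split the module current by current division. The sketch below does that with purely resistive path impedances; the path names, impedance values, and the module current are hypothetical, and treating the paths as independent parallel branches is itself a simplifying assumption.

```python
def split_current(i_total, path_impedances, keep=2):
    """Distribute i_total over the `keep` lowest-impedance paths.

    Each kept path carries a share proportional to its admittance 1/Z.
    """
    kept = sorted(path_impedances.items(), key=lambda kv: kv[1])[:keep]
    total_admittance = sum(1.0 / z for _, z in kept)
    return {name: i_total * (1.0 / z) / total_admittance for name, z in kept}


paths = {"p1": 1.0, "p2": 3.0, "p3": 10.0}      # ohm, hypothetical
shares = split_current(0.04, paths, keep=2)     # 40 mA module current
noise = {name: shares[name] * paths[name] for name in shares}
print(shares)
print(noise)
```

With current division every kept path shows the same voltage drop, which is the quantity that feeds the noise calculation at the module.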

    Slide courtesy of S Zhao, K Roy & C.-K. Kok










    Current Flowing Paths and Power Supply Noise Calculation

    • Power supply noise at a target module is the voltage difference between the VDD pin and the module

    • Apply KVL:







    Slide courtesy of S Zhao, K Roy & C.-K. Kok






    Why Decoupling Capacitance?

    • P/G network wiresizing won’t change voltage drop frequency spectrum

    • Reducing Vdrop by a factor of k requires sizing up the wires along the supply current path by a factor of k











    • Decoupling caps act as a low-pass filter

    • Efficient to remove high frequency elements of Vdrop
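The low-pass behaviour can be illustrated with a first-order model: an ideal supply, a series wiring resistance R, and a decap C at the module node. A sinusoidal load current then sees R in parallel with C, so the drop is unchanged at DC and falls off above the corner frequency 1/(2πRC). The model and the component values below are assumptions for illustration.

```python
import math


def drop_magnitude(i_amp, freq_hz, r_ohm, c_farad):
    """|V_drop| for a sinusoidal load current through R in parallel with C."""
    omega = 2 * math.pi * freq_hz
    return i_amp * r_ohm / math.sqrt(1 + (omega * r_ohm * c_farad) ** 2)


R_SUPPLY = 1.0      # ohm, assumed wiring resistance
C_DECAP = 1e-9      # F, assumed decoupling capacitance
I_LOAD = 0.05       # A, assumed load current amplitude

f_corner = 1 / (2 * math.pi * R_SUPPLY * C_DECAP)
for f in (0.0, f_corner, 10 * f_corner):
    print(f"{f / 1e6:8.1f} MHz -> {drop_magnitude(I_LOAD, f, R_SUPPLY, C_DECAP) * 1e3:.2f} mV")
```

Widening the wires lowers R and scales the drop at every frequency equally, whereas adding decap leaves the DC drop alone and suppresses only the high-frequency part.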

    Decoupling Capacitance Budget

    • Decap budget for each module can be determined based on its noise level

    • Initial budget can be estimated as follows:

    • Iterations are performed if necessary until noise at each module in the floorplan is kept under a certain limit

    Slide courtesy of S Zhao, K Roy & C.-K. Kok

    Allocation of Decoupling Capacitance

    • Decap needs to be placed in the vicinity of each target module

    • Decap requires white space (WS) in the floorplan to be manufactured on

      • Use MOS capacitors

    • Decap allocation is reduced to WS allocation

    • Two-phase approach:

      • Allocate the existing WS in the floorplan

      • Insert additional WS into the floorplan if required

    Slide courtesy of S Zhao, K Roy & C.-K. Kok

    Allocation of Existing White Space










    Slide courtesy of S Zhao, K Roy & C.-K. Kok

    Allocation of Existing WS: Linear Programming (LP) Approach

    • Objective: maximize the utilization of available WS

    • Existing WS can be allocated to neighboring modules using LP

    • LP approach:

    Slide courtesy of S Zhao, K Roy & C.-K. Kok

    Insert Additional WS into Floorplan If Necessary

    • Update decap budget for each module after existing WS has been allocated

    • If additional WS is required, insert WS into floorplan by extending it horizontally and vertically

    • Two-phase procedure:

      • insert WS band between rows based on the decap budgets of the modules in the row

      • insert WS band between columns based on the decap budgets of the modules in the column

    Slide courtesy of S Zhao, K Roy & C.-K. Kok

    Moving Modules to Insert WS

    Slide courtesy of S Zhao, K Roy & C.-K. Kok

    Experimental Results: Comparison of Decap Budgets (Ours vs. “Greedy Solution”)

    Conclusion

    • A methodology for decoupling capacitance allocation at floorplan level is proposed

    • Linear programming technique is used to allocate existing WS to maximize its utilization

    • A heuristic is proposed for additional WS insertion

    • Compared with “Greedy” solution, our method produces significantly smaller decap budgets