
ECE260B – CSE241A Winter 2005 Power Distribution

Website: http://vlsicad.ucsd.edu/courses/ece260b-w05


Motivation

  • Power supply noise is a serious issue in DSM (deep-submicron) design

    • Noise is getting worse as technology scales

    • Noise margin decreases as supply voltage scales

    • Power supply noise may slow down circuit performance

    • Power supply noise may cause logic failures


Power = …

  • Routing resources

    • 20-40% of all metal tracks used by Vcc, Vss

    • Increased power ⇒ denser power grid

  • Pins

    • Vcc or Vss pin carries 0.5-1W of power

    • Pentium4 uses 423 pins; 223 Vcc or Vss

    • More pins ⇒ more expensive package (+ package development, motherboard redesign, …)

  • Battery cost

    • 1kg NiCad battery powers a Pentium 4 alone for less than 1 hour

  • Performance

    • High chip temperatures degrade circuit performance

    • Large across-chip temperature variations induce clock skew

    • High chip power limits use of high-performance circuits

    • Power transients determine minimum power supply voltage


Power = Package

[Figure: processor package, with labels: fan, heat sink, integrated heat spreader, processor, decoupling capacitors, interposer, processor pins, OLGA pins, package pins]

Pentium4 die is about 1.5g and less than 1cm^3

Pentium-4 in package with interposer, heat sink, and fan can be 500g and 150cm^3

Modern processor packaging is complex and adds significantly to product cost.

http://www.intel.com/support/processors/procid/ptype.htm

Courtesy M. McDermott UT-Austin


Planning for Power

  • Early simulation of major power dissipation components

  • Early quantification of chip power

    • Total chip power

    • Maximum power density

    • Total chip power fluctuations

      • inherent & added fluctuations due to clock gating

  • Early power distribution analysis (dc, ac, & multi-cycle)

    • I.e., average, maximum, multi-cycle fluctuations

  • Early allocation & coordination of chip resources

    • Wiring tracks for power grid

    • Low Vt devices

    • Dynamic circuits

    • Clock gating

    • Placement and quantity of added decoupling capacitors


    Power and Ground Routing

    • Floorplanning includes planning how power, ground, and clock should be routed

    • Power supply distribution

      • Tree: trunk must supply current to all branches

      • Resistance must be very small since when a gate switches, its current flows through the supply lines

        • If the resistance of supply lines is too large, voltage supplied to gates will drop, which can cause the gate to malfunction

        • Usually, want at most 5-10% IR drop due to supply resistance

      • Supply is usually routed on the top layers of metal, then distributed to lower wiring layers


    Planar Power Distribution

    [Figure: planar VDD/VSS routing to a cell, with labels "cut line", "no cut line", and "no connection"]

    • Topology of VDD/VSS networks.

      • Inter-digitated

      • Design each macrocell such that all VDD and VSS terminals are on opposite sides.

      • If floorplan places all macrocells with VDD on same side, then no crossing between VDD and VSS.

    [Figure: inter-digitated VDD/VSS routing around macrocells A, B, and C]

    Courtesy K. Yang, UCLA


    Gridded Power Distribution

    • With more metal layers, power is striped

      • Connection between the stripes allows a power grid

        • Minimizes series resistance

      • Connection of lower layer layout/cells to the grid is through vias

        • Note that planar supply routing is often still needed for a strong lower layer connection.

        • There may not be sufficient area to make a strong connection in the middle of a design (connect better at periphery of die)

    Courtesy K. Yang, UCLA


    Power Supply Drop/Noise

    • Supply noise = variations in power supply voltage that act as noise source for logic gates

      • Power supply wiring resistance ⇒ voltage variations with current surges

      • Current surges depend on dynamic behavior of circuit

    • Solution approach

      • Measure maximum current required by each block

      • Redesign power/ground network to reduce resistance

      • Worst case: move activity to another clock cycle to reduce peak current ⇒ scheduling problem

    • Example: Drive 32-bit bus, total bus wire load = 2pF, with delay 0.5ns

      • R for each transistor needs to be < 0.25 kΩ to meet RC = 0.5 ns (0.25 kΩ × 2 pF = 0.5 ns)

      • Effective R of the 32 bits switching together is 250/32 ≈ 7.8 Ω

      • For < 10% drop, power distribution R must be < 1 Ω (10% of 7.8 Ω ≈ 0.8 Ω)

    Courtesy K. Yang, UCLA
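The arithmetic of the bus example can be checked with a short script. This is a minimal sketch, not part of the original slides. It assumes the 2 pF load is seen by each driver (which is what makes 0.25 kΩ × 2 pF = 0.5 ns), that all 32 drivers switch together, and that the drop is set by a simple resistive divider between the supply wiring and the parallel drivers. The function name `bus_supply_budget` is hypothetical.

```python
def bus_supply_budget(c_load_f, delay_s, n_bits, max_drop_frac):
    """Return (per-driver R, parallel R of all drivers, max supply R) in ohms."""
    r_driver = delay_s / c_load_f        # R such that R * C equals the delay target
    r_parallel = r_driver / n_bits       # n_bits identical drivers in parallel
    # Divider: drop fraction f = Rs / (Rs + Rp)  =>  Rs = f * Rp / (1 - f)
    r_supply_max = max_drop_frac * r_parallel / (1.0 - max_drop_frac)
    return r_driver, r_parallel, r_supply_max


r_drv, r_par, r_sup = bus_supply_budget(2e-12, 0.5e-9, 32, 0.10)
print(f"driver R   < {r_drv:.1f} ohm")
print(f"parallel R = {r_par:.2f} ohm")
print(f"supply R   < {r_sup:.2f} ohm")
```

Under these assumptions the supply resistance budget comes out slightly below the rounded 1 Ω figure quoted in the example.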


    Electromigration

    • Physical migration of metal atoms due to “electron wind” can eventually create a break in a wire

      • MTTF (mean time to failure) $\propto 1/J^2$, where $J$ = current density

      • Current density must not exceed specification: for each wire, $I_i / w_i < J_{spec}$

      • Specified as mA per µm of wire width (e.g., 1 mA/µm) or mA per via cut

    • EM occurs both in signal (AC=bidirectional) and power wires (DC = unidirectional)

      • Much worse for DC than AC; DC occurs inside cells and in power buses

    • May need more contacts on transistor sources and drains to meet EM limits

    • Width of power buses must support both IR-drop and EM requirements

    • Issues in IR and EM constraint generation

      • Topology is most likely not a tree

      • How do we determine current patterns?

      • Effects of R, L
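The two width requirements above (EM and IR drop) can be combined into one sizing check. The following is a minimal sketch under stated assumptions: a single wire carrying a known DC current, a sheet resistance in Ω/square, and the 1/J² lifetime dependence quoted above. The function names and the example numbers are hypothetical, not values from the slides.

```python
def min_power_wire_width_um(i_ma, length_um, j_spec_ma_per_um,
                            r_sheet_ohm_sq, v_drop_max_v):
    """Return (required width, EM-limited width, IR-limited width) in um."""
    # EM: I / w <= J_spec  =>  w >= I / J_spec
    w_em = i_ma / j_spec_ma_per_um
    # IR: I * (r_sheet * L / w) <= V_max  =>  w >= I * r_sheet * L / V_max
    w_ir = (i_ma * 1e-3) * r_sheet_ohm_sq * length_um / v_drop_max_v
    return max(w_em, w_ir), w_em, w_ir


def mttf_ratio(j_old, j_new):
    """Relative lifetime change when current density goes from j_old to j_new,
    assuming MTTF is proportional to 1/J^2 and everything else is unchanged."""
    return (j_old / j_new) ** 2


w, w_em, w_ir = min_power_wire_width_um(
    i_ma=10.0, length_um=1000.0, j_spec_ma_per_um=1.0,
    r_sheet_ohm_sq=0.05, v_drop_max_v=0.1)
print(w, w_em, w_ir)            # the larger of the two limits sets the width
print(mttf_ratio(1.0, 2.0))     # doubling J quarters the expected lifetime
```

In this made-up case the EM limit, not the IR limit, sets the width, which is why both checks are needed.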


    What Happens?

    • Example of an AlCu line seen under microscope.

      • Accelerated by higher temperature and high currents

      • Voids form on grain boundaries

      • Metal atoms move with current away from voids and collect at boundaries

    Catastrophic failure

    Courtesy K. Yang, UCLA


    Taken from http://www.nd.edu/~micro/fig20.html

    Taken from Sverre Sjøthun, “Electromigration In-Depth,” from www.dpwg.com

    Courtesy S. Sapatnekar, UMinn


    Power Supply Rules of Thumb

    • Rules depend on technology

      • Tech file has rules for resistance and electromigration

    • Examples:

      • Must have a contact for each 16λ of transistor width (more is better)

      • Wire must carry less than 1 mA per µm of width

      • Power/Gnd width = Length of wire × Sum (all transistors connected to wire) / (3×10⁶ λ) (very approximate)

    • For small designs, power supply design is a non-issue

    Courtesy K. Yang, UCLA
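The rules of thumb above translate directly into two small helper functions. This sketch assumes that "Sum (all transistors connected to wire)" means the sum of transistor widths and that every quantity is expressed in λ; both are interpretations, and the function names are hypothetical.

```python
import math


def contacts_needed(transistor_width_lambda):
    """At least one contact per 16 lambda of transistor width."""
    return max(1, math.ceil(transistor_width_lambda / 16))


def power_wire_width_lambda(wire_length_lambda, transistor_widths_lambda):
    """Very approximate supply wire width in lambda:
    length * sum(transistor widths) / 3e6."""
    return wire_length_lambda * sum(transistor_widths_lambda) / 3e6


print(contacts_needed(40))                            # 40 lambda wide device
print(power_wire_width_lambda(10_000, [2000, 4000]))  # 10,000 lambda long wire
```

Because the rule is technology dependent and very approximate, the result is only a starting point for the IR-drop and EM checks.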


    Basic Methodology Concepts

    • Reliability (slotting, splitting)

    • Alignment of hierarchical rings, stripes

    • Isolation of analog power

    • Styles of power distribution

      • Rings and trunks

      • Uniform grid

      • Bottom-up grid generation

      • Depends on:

        • Package: flip-chip vs. wire-bond; I/O count (fewer pads ⇒ denser grid)

        • Power budget

        • IR drop limits

        • Floorplan constraints (hard macros, etc.)


    Metal Slotting vs. Splitting

    • Slotting or splitting of wide wires is required by metal layout rules for uniform CMP (planarization)

    • Split power wires

      • Less data than traditional slotting

      • More accurate R/C analysis of power mesh

      • Not supported by all tools

    [Figure: slotted GND wire on M1 (difficult to connect: where should vias go?) vs. split GND wires on M1 (easy connections through standard via arrays)]

    Courtesy Cadence Design Systems, Inc.


    Trunks and Rings Methodology

    • Each Block has its own ring

      • Rings may be inside the blocks or part of the top level

    • Each Block has trunks connecting top level to block

    [Figure: blocks 1 to 5, each with its own V/G ring; rings may be shared with abutted blocks; individual trunks connect blocks to the top level]

    Courtesy Cadence Design Systems, Inc.


    Trunks and Rings

    • Advantages

      • Power tailored to the demands of each block (flexible)

      • More area-efficient, since the demands of each block are uniquely met

      • Simple implementation supported by many tools

      • Rings can be shared between abutted blocks

    • Disadvantages

      • Limited redundancy; power grid is built to match needs

      • Assumptions in design may change or be invalid

      • Non-regular structure requires more detailed IR drop/EM analysis

      • Missing vias/connections are fatal

      • Rings will require slotting/splitting due to wide widths

      • Increase in data volume

    Courtesy Cadence Design Systems, Inc.


    Uniform Chip Grid Methodology

    • Robust and redundant power network

      • mainly in microprocessors and high end large ASICs

    • Implementation

      • Primary distribution through upper metal layers

        • Lower layers in blocks to connect to top through via stacks

      • Typically pushed into blocks

      • Blocks typically abut

        • Requires block grids to align

      • Rows/Followpins should align with block pins

        • Global buffer insertion

    [Figure: global grid on higher layers over the blocks; fine or custom grid, or no grid, on lower layers]

    Courtesy Cadence Design Systems, Inc.


    Uniform Chip Grid

    • Advantages

      • Easily implemented

      • Lends itself to straightforward hand calculations

      • Path redundancy makes the grid less sensitive to changes in current pattern

      • Mesh of power/ground provides shielding (for capacitance) and current returns (for inductance)

      • Top-down propagation easy to use on this style

    • Disadvantages

      • Takes up significant routing resources (20%-40% of all routing tracks if not already reserved for power/ground)

      • Fine grids may slow down P&R tools

      • Imposes grid structure into each block which may be unnecessary

      • Top and blocks coupled closely if top level routing pushed into blocks

      • Changes to block/top must be reflected in the other

    Courtesy Cadence Design Systems, Inc.


    Bottom-Up Grid Generation Methodology

    • Design and optimize power grid for block, merge at top

    • Advantages

      • Able to tailor grid for routing resource efficiency in each block

      • Flexibility to choose the best grid for the block (e.g., ring and stripe, power plane, grid)

    • Disadvantages

      • Designing grid in context of the “big picture” is more difficult

      • Block grid may present challenging connections to top level

      • Assumptions for block grid’s connection to top level must be analyzed and validated

    Courtesy Cadence Design Systems, Inc.


    Power Routing in Area-Based P&R

    • Power routing approaches

      • (1) Pre-route parts of power grid during floorplanning

      • (2) Build grid (except connections to standard cells) before P&R

      • (3) Build entire grid before P&R

      • N.B.: Area-based P&R tools respect pre-routes absolutely

    • Cadence tools: power routing done inside SE, all other tasks (clock, place, route, scan, …) done by point tools

      • Lab 5 tomorrow has a tiny bit of power routing (rings, stripes)

    • Miscellany

      • ECOs: What happens to rings and trunks if blocks change size?

      • Layer choices: What is cost of skipping layers (to get from thick top-layer metal down to finer layers)?

      • How wide should power wires be?

      • Post-processing strategies

    Courtesy Cadence Design Systems, Inc.


    Power Routing Wire Width Considerations

    • Slotting rules: Choose maximum width below slotting width

    • Halation (width-dependent spacing) rules: Do as much as possible of power routing below wide wire width to save routing space

    • Choose power routing widths carefully to avoid blocking extra tracks (and, use the space if blocking the track!)

    [Figure: power wires of different widths and the routing tracks they block. Which power width is better here?]

    Courtesy Cadence Design Systems, Inc.
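The blocked-track consideration can be made concrete with a simple geometric model. The sketch below assumes routing tracks at integer multiples of the pitch, a power wire centered at a given coordinate, and a single width-independent spacing rule; real tools also apply width-dependent spacing, which is ignored here. All dimensions are integers in nm, and the function name and numbers are hypothetical.

```python
def blocked_tracks(center, width, pitch, sig_width, spacing):
    """Return the indices of routing tracks blocked by a power wire.

    Track k sits at k * pitch. A signal wire of width sig_width on track k
    is blocked when its edge would come closer than `spacing` to the power
    wire, i.e. |k*pitch - center| < width/2 + spacing + sig_width/2.
    The comparison is done on doubled values to stay in integers.
    """
    limit = width + 2 * spacing + sig_width   # twice the keep-out half-width
    k_lo = (center - limit) // pitch
    k_hi = (center + limit) // pitch + 1
    return [k for k in range(k_lo, k_hi + 1)
            if 2 * abs(k * pitch - center) < limit]


for w in (200, 300, 1000, 1001):
    n = len(blocked_tracks(center=0, width=w, pitch=400,
                           sig_width=200, spacing=200))
    print(f"power width {w} nm blocks {n} track(s)")
```

In this model a 300 nm wire and a 1000 nm wire block the same tracks, so the wider one uses the already-lost space, while one more nanometer of width costs two additional tracks.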


    Power Routing Tool Usage

    • 4 layer power grid example (HVHV)

      • Turn on via stacking

      • Route metal2 vertically

      • Route metal4 vertically (use same coordinates)

      • Route metal3 horizontally (make coincident with every N metal1 routes)

      • Turn off via stacking

      • Route metal1 horizontally

    [Figure: metal2/metal4 stripes coincident; metal1 inside cells; metal3 every n microns]

    Courtesy Cadence Design Systems, Inc.


    Post-Processing Flows (DEF or Layout Editing)

    [Figure: power routing during P&R vs. after post-processing]

    Courtesy Cadence Design Systems, Inc.


    (Tree) Supply Network Design

    • Tree topology assumption not very useful in practice, but illustrates some basic ideas

    • Assume R dominates, L and C negligible

      • marginally permissible assumption

    • Current drawn at various points in the tree (time-varying waveform)

    • Current causes a V=IR drop

      • “Ground” is not at 0V

      • “Vdd” is not at intended level

    [Figure: tree-shaped supply network, with the supply node and the sinks marked]

    Courtesy S. Sapatnekar, UMinn


    IR Drop Constraints

    • Chowdhury and Breuer, TCAD 7/88

    • Can write V drop to each sink as

      • $\sum_i R_i I_i < V_{spec}$ for all sink current patterns made available (sum over the wires on the path from the supply to the sink)

      • Tree structure: can compute $I_i$ easily

      • $R_i = \rho\, l_i / w_i$

    • Change $w_i$ to reduce IR drop

    • Objective: minimize $\sum_i a_i w_i$

    • Current density must never exceed a specification

      • For each wire, $I_i / w_i < J_{spec}$

    Courtesy S. Sapatnekar, UMinn
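For a tree, the wire currents and the drop at every node follow from two passes over the nodes. The sketch below is an illustration of the constraints above, not the optimization from the cited paper: it only evaluates a given set of widths. The node names, the sheet resistance value, and the function name `analyze_tree` are assumptions.

```python
def analyze_tree(parent, length, width, sink_current, rho=0.05):
    """Evaluate IR drop and current density in a supply tree.

    parent[n] is the parent of node n (None for the supply node); the dict
    must list parents before children. Wire n connects parent[n] to n, with
    length[n] and width[n] in um; rho is in ohm/square; currents are in A.
    Returns (drop in V per node, current density in A/um per wire).
    """
    order = list(parent)
    current = {n: sink_current.get(n, 0.0) for n in order}
    for n in reversed(order):            # bottom-up: wire current = subtree sum
        if parent[n] is not None:
            current[parent[n]] += current[n]
    drop, density = {}, {}
    for n in order:                      # top-down: accumulate R_i * I_i
        p = parent[n]
        if p is None:
            drop[n] = 0.0
            continue
        r = rho * length[n] / width[n]
        drop[n] = drop[p] + r * current[n]
        density[n] = current[n] / width[n]
    return drop, density


parent = {"pad": None, "a": "pad", "b": "a", "c": "a"}
length = {"a": 100.0, "b": 200.0, "c": 100.0}
width = {"a": 10.0, "b": 5.0, "c": 5.0}
sinks = {"b": 0.002, "c": 0.004}

drop, density = analyze_tree(parent, length, width, sinks)
v_spec, j_spec = 0.010, 0.001            # 10 mV and 1 mA/um, example limits
print(all(d < v_spec for d in drop.values()),
      all(j < j_spec for j in density.values()))
```

A sizing loop would widen the wires whose constraints fail and narrow the others while keeping the weighted width sum small.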


    P/G Mesh Optimization (R only)

    • Dutta and Marek-Sadowska, DAC 89

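    • Cost function: $\sum_i a_i l_i w_i = \sum_i a_i \rho\, c_i l_i^2$, i.e., total wire area, since $c_i$ = conductance $= w_i / (\rho\, l_i)$

    • Constraints

      • EM: $I_i \le e\, w_i$, i.e., current density $I/w$ below the upper bound $e$

        • Substitute $I_i = v_i\, w_i / (\rho\, l_i)$ (from $I = V/R$, with $v_i = v_p - v_q$ the voltage across wire $i$), then divide by $w_i$ and multiply by $\rho\, l_i$: $v_p - v_q \le e\, \rho\, l_i$

      • Wire width constraints: $W_{min} \le w_i \le W_{max}$ (translate to $c_i$)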

      • Voltage drop constraints: $v_a - v_b \le V_{spec1}$ and/or $v_i \ge V_{spec2}$

      • Circuit equations that determine the v’s

    • Variables: $c_i$’s ($v_i$’s depend on $c_i$’s)

    Courtesy S. Sapatnekar, UMinn


    Solution Technique

    • Method of feasible directions

      • Find an initial feasible solution (satisfies all constraints)

      • Choose a direction that maintains feasibility

      • Make a move in that direction to reduce cost function

    • Given a set of ci’s, must find corresponding vi’s

      • Feasible direction method: move from point c* to c+

      • c* and c+ must be close to each other (i.e., if you have the solution at c*, the solution at c+ corresponds to a minor change in conductances)

      • Solving for vi’s : solving a system of linear equations

        • Solution at c* is a good guess for the solution at c+

        • Converges in a few iterations

    Courtesy S. Sapatnekar, UMinn
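The benefit of reusing the solution at c* as the starting guess at c+ can be seen on a tiny resistive mesh. The sketch below uses Gauss-Seidel iteration on the nodal equations, which is one possible iterative solver and an assumption here; the slides do not name the solver. The mesh, conductances, and sink currents are made up.

```python
def solve_mesh(cond, sinks, vdd_nodes, vdd=1.0, v0=None, tol=1e-9,
               max_iter=10_000):
    """Gauss-Seidel solve of a resistive supply mesh.

    cond maps (node_a, node_b) to a conductance in S, sinks maps a node to
    the current it draws in A, vdd_nodes are held at vdd. v0 is an optional
    starting guess. Returns (node voltages, number of sweeps).
    """
    nbrs = {}
    for (a, b), c in cond.items():
        nbrs.setdefault(a, []).append((b, c))
        nbrs.setdefault(b, []).append((a, c))
    v = {n: vdd for n in nbrs} if v0 is None else dict(v0)
    for sweep in range(1, max_iter + 1):
        worst = 0.0
        for n in nbrs:
            if n in vdd_nodes:
                continue
            total_c = sum(c for _, c in nbrs[n])
            # KCL at n: sum_j c_nj * (v_n - v_j) = -I_n
            new = (sum(c * v[m] for m, c in nbrs[n]) - sinks.get(n, 0.0)) / total_c
            worst = max(worst, abs(new - v[n]))
            v[n] = new
        if worst < tol:
            return v, sweep
    raise RuntimeError("Gauss-Seidel did not converge")


cond = {("pad", "a"): 10.0, ("pad", "b"): 10.0, ("a", "c"): 10.0, ("b", "c"): 10.0}
sinks = {"a": 0.005, "c": 0.010}

v_cold, it_cold = solve_mesh(cond, sinks, {"pad"})            # start from Vdd
cond_plus = dict(cond)
cond_plus[("a", "c")] = 11.0                                  # small move c* -> c+
v_warm, it_warm = solve_mesh(cond_plus, sinks, {"pad"}, v0=v_cold)
print(it_cold, it_warm)
```

Because the conductances change only slightly between c* and c+, the warm-started solve begins close to the answer and needs fewer sweeps than the cold start.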


    Modeling Gate Currents

    • Currents in supply grid caused by charging/discharging of capacitances by logic gates

    • All analyses require generation of a “worst-case switching” scenario

    • Enumeration is infeasible ⇒ two basic approaches

      • Simulation based methods: designer supplies “hot” vectors, or we try to generate these hot vectors automatically

      • “Pattern-independent” methods: try to estimate the worst-case (can be expensive, very inaccurate)

    • Once current patterns are available, apply them to supply network to find out if constraints are satisfied

    Courtesy S. Sapatnekar, UMinn


    Complexity of Hot Vector Generation

    • Devadas et al., TCAD 3/92:

      • Assume zero gate delays for simplicity

      • Find the maximum current drawn by a block of gates

      • Using a current model for each gate

        • Find a set of input patterns so that the total current is maximized

        • Boolean assignment problem: equivalent to Weighted Max-Satisfiability

          • Satisfiability: given a Boolean formula in conjunctive normal form (product of sums), is there an assignment of truth values to the variables such that the formula evaluates to True?

        • Checking for Satisfiability (for k-SAT, k > 2) is NP-complete

      • ⇒ Difficult even under the zero gate delay assumption

    Courtesy S. Sapatnekar, UMinn
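Why enumeration does not scale can be seen by brute-forcing a toy block. The sketch below is plain exhaustive search under the zero-delay assumption, not the method of the cited paper: a gate is taken to draw its peak current whenever its output differs between two consecutive input vectors. The three-gate netlist and the current values are hypothetical.

```python
from itertools import product

# (output net, function, input nets, peak current in mA) in topological order
GATES = [
    ("n1", lambda a, b: a & b, ("a", "b"), 1.0),
    ("n2", lambda a, b: a | b, ("b", "c"), 1.5),
    ("out", lambda a, b: a ^ b, ("n1", "n2"), 2.0),
]
INPUTS = ("a", "b", "c")


def evaluate(vector):
    """Zero-delay evaluation of every net for one input vector."""
    value = dict(zip(INPUTS, vector))
    for out, fn, ins, _ in GATES:
        value[out] = fn(*(value[i] for i in ins))
    return value


def hottest_pair():
    """Try all (previous, next) vector pairs; return (current, prev, next)."""
    best = (-1.0, None, None)
    vectors = list(product((0, 1), repeat=len(INPUTS)))
    for v1, v2 in product(vectors, repeat=2):
        before, after = evaluate(v1), evaluate(v2)
        current = sum(i for out, _, _, i in GATES if before[out] != after[out])
        if current > best[0]:
            best = (current, v1, v2)
    return best


best_current, v_prev, v_next = hottest_pair()
print(best_current, v_prev, v_next)
print("pairs searched:", 4 ** len(INPUTS))
```

The search visits 4^n vector pairs for n inputs, and even in this toy block the worst case is below the sum of the three gate currents because the XOR output cannot toggle when both of its inputs toggle.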


    Pattern-Independent Methods

    • iMAX approach: Kriplani et al., TCAD 8/95

      • Current model for a single gate

      • Gates switch at different times

      • Total current drawn from Vdd (ignoring supply network C) is the sum of these time-shifted waveforms

    • Objective: find the worst-case waveform

    [Figure: current pulse model for a single gate, labeled with Ipeak and the gate delay]

    Courtesy S. Sapatnekar, UMinn


    Example

    [Figure: example circuit and gate current waveforms (not to scale)]

    • Maximum current not just a sum of individual maximum currents

    • Temporal dependencies

    • [Using deliberate clock skews can reduce the peak current, as we saw in the Useful-Skew discussion]

    Courtesy S. Sapatnekar, UMinn
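The point that the maximum total current is not the sum of the individual maxima can be reproduced by adding time-shifted pulses. The sketch below assumes a triangular current pulse per gate, a common simplification that is not taken from the slides; start times, widths, and peaks are made up.

```python
def triangle(t, start, width, peak):
    """Triangular current pulse that starts at `start`, peaks mid-way at
    `peak`, and returns to zero at start + width."""
    x = (t - start) / width
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return peak * (1.0 - abs(2.0 * x - 1.0))


def total_current(t, gates):
    """Sum of the time-shifted gate pulses at time t."""
    return sum(triangle(t, start, width, peak) for start, width, peak in gates)


# (start time, pulse width, peak current); arbitrary units
gates = [(0.0, 1.0, 2.0), (0.5, 1.0, 2.0), (2.0, 1.0, 3.0)]
times = [k / 20 for k in range(81)]          # 0.0 to 4.0 in steps of 0.05

peak_total = max(total_current(t, gates) for t in times)
sum_of_peaks = sum(peak for _, _, peak in gates)
print(peak_total, sum_of_peaks)
```

Shifting the third gate so that it overlaps the first two would raise the total peak, which is the temporal dependency the example refers to.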


    Maximum Envelope Current (MEC)

    • Find the time interval during which a gate may switch

      • Manufacturing process variations can cause changes

      • Actual switching event can cause changes

      • Switching at second gate can occur at t=1 or at t=2

      • In general, a large number of paths can go through a gate; assume (conservatively) that switching occurs in t ∈ [1,2]

      • Assume that all gate inputs can switch independently – provides an upper bound on the switching current

    (unit gate delays)

    Courtesy S. Sapatnekar, UMinn
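The envelope idea can be sketched with unit gate delays: propagate the earliest and latest switching times through the netlist, then add the peak current of every gate whose window covers a given time step. This is a simplified discrete-time illustration of the MEC bound, not the iMAX algorithm itself; the netlist and peak currents are hypothetical.

```python
def switching_windows(netlist, primary_inputs):
    """netlist maps gate -> list of fan-ins, in topological order.
    Returns {signal: (earliest, latest)} switching times, unit gate delay."""
    window = {pi: (0, 0) for pi in primary_inputs}
    for gate, fanins in netlist.items():
        earliest = min(window[f][0] for f in fanins) + 1
        latest = max(window[f][1] for f in fanins) + 1
        window[gate] = (earliest, latest)
    return window


def mec_envelope(netlist, primary_inputs, peak, horizon):
    """Upper bound on current per time step, assuming every gate may switch
    at any time inside its window and all inputs switch independently."""
    window = switching_windows(netlist, primary_inputs)
    return [sum(peak[g] for g in netlist
                if window[g][0] <= t <= window[g][1])
            for t in range(horizon + 1)]


netlist = {"g1": ["a", "b"], "g2": ["g1", "c"], "g3": ["g2", "g1"]}
peak = {"g1": 1.0, "g2": 1.0, "g3": 2.0}

windows = switching_windows(netlist, ["a", "b", "c"])
envelope = mec_envelope(netlist, ["a", "b", "c"], peak, horizon=4)
print(windows["g2"], envelope)
```

Gate g2 gets the window [1, 2], matching the second-gate case described above, and the envelope counts its current in both time steps, which is where the pessimism of the bound comes from.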


    (Large) Errors in MEC Approach

    • Correlation Problem

      • Switching at G0, G1, G2 and G3 not independent

      • G0 = 0 implies that G1, G2, G3 switch; G0 = 1 means that other inputs will determine gate activity

      • If the other inputs cannot make the gate switch in the same time window, then iMAX estimates are pessimistic

    • Reconvergent Fanout Problem

      • Signals that diverge at G0 reconverge at Gk ⇒ inputs to Gk are not independent

      • Assumption of independent switching is not valid

    • Many heuristic refinements proposed, but guardbanding (error) of power estimation still a huge issue

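    [Figure: G0 fanning out to G1, G2, and G3 (correlation); signals diverging at G0 and reconverging at Gk (reconvergent fanout)]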

    Courtesy S. Sapatnekar, UMinn


    Outline

    • Motivation

    • Power Supply Noise Estimation

    • Decoupling Capacitance (decap) Budget

    • Allocation of Decoupling Capacitance

    • Experimental Results

    • Conclusion


    Why Decoupling Capacitance

    • Frequency point of view

      • Decaps form low-pass filters

      • They cancel anti- effects

    • Physical point of view

      • Decaps serve as charge reservoirs

      • They shortcut supply current paths and reduce voltage drop

    • No effect on DC supply currents


    Power Supply Network—RLC Mesh

    [Figure: power supply network modeled as an RLC mesh; legend: current sources and VDD pins, with Rp and Lp labeled]

    Slide courtesy of S Zhao, K Roy & C.-K. Kok


    Current Distribution in Power Supply Mesh Illustration

    [Figure: numbered current flowing paths and current contributions from the VDD pins, through the mesh connection points, to modules A, B, and C]

    Slide courtesy of S Zhao, K Roy & C.-K. Kok


    Current Distribution in Power Supply Network

    • Distribute switching current for each module in the power supply mesh

    • Observation: Currents tend to flow along the least-impedance paths

    • Approximation: Consider only those paths with minimal impedance: shortest, second shortest, …

    Slide courtesy of S Zhao, K Roy & C.-K. Kok
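The least-impedance approximation amounts to a current divider over the few best paths. The sketch below treats each path as a plain resistance, a simplification of the RLC impedance used in the slides, and keeps only the `keep` lowest-resistance paths; the function name and values are hypothetical.

```python
def split_current(i_total, path_resistances, keep=None):
    """Divide a module's switching current among parallel supply paths in
    proportion to path conductance. If `keep` is given, only the `keep`
    lowest-resistance paths are considered (the others are ignored)."""
    paths = sorted(path_resistances)
    if keep is not None:
        paths = paths[:keep]
    conductances = [1.0 / r for r in paths]
    total = sum(conductances)
    return [i_total * g / total for g in conductances]


all_paths = split_current(1.0, [4.0, 1.0, 2.0])
two_best = split_current(1.0, [4.0, 1.0, 2.0], keep=2)
print(all_paths)
print(two_best)
```

Dropping the highest-resistance path shifts its small share onto the remaining paths, which is the error accepted by the approximation.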


    Current Flowing Paths and Power Supply Noise Calculation

    • Power supply noise at a target module is the voltage difference between the VDD pin and the module

    • Apply KVL:

    [Figure: current paths i1(t), i2(t), i3(t) from VDD to module k through R1, L1, R2, L2, with capacitances C1 and C2]

    Slide courtesy of S Zhao, K Roy & C.-K. Kok
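A generic first-order form of the KVL sum along one supply path is the resistive drop plus the inductive drop of every segment, R·i + L·di/dt. The sketch below evaluates that expression on sampled current waveforms; it is an illustration under that assumption and is not necessarily the exact equation on the original slide. Segment values and the waveform are made up.

```python
def path_noise(segments, dt):
    """Voltage difference between the VDD pin and the module along one path.

    segments is a list of (R in ohm, L in H, current samples in A); all
    sample lists have the same length and spacing dt in seconds. di/dt is
    a backward difference, taken as zero at the first sample.
    """
    n = len(segments[0][2])
    noise = [0.0] * n
    for r, l, current in segments:
        for k in range(n):
            didt = (current[k] - current[k - 1]) / dt if k > 0 else 0.0
            noise[k] += r * current[k] + l * didt
    return noise


ramp = [0.0, 0.01, 0.02]                     # 10 mA per 100 ps
noise = path_noise([(1.0, 1e-9, ramp)], dt=1e-10)
print(noise)
```

With these numbers the inductive term dominates the resistive one, which is the high-frequency component that decoupling capacitance targets.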


    Why Decoupling Capacitance?

    • P/G network wire sizing won’t change the frequency spectrum of the voltage drop

    • Reducing Vdrop by a factor of k requires sizing up the wires by a factor of k along the supply current path

    • Decoupling caps act as a low-pass filter

    • Effective at removing the high-frequency components of Vdrop

    [Figure: current paths i1(t), i2(t), i3(t) from VDD to module k through R1, L1, R2, L2, with capacitances C1 and C2]


    Decoupling Capacitance Budget

    • Decap budget for each module can be determined based on its noise level

    • Initial budget can be estimated as follows:

    • Iterations are performed if necessary until the noise at each module in the floorplan is kept under a certain limit

    Slide courtesy of S Zhao, K Roy & C.-K. Kok


    Allocation of Decoupling Capacitance

    • Decap needs to be placed in the vicinity of each target module

    • Decap requires white space (WS) in which to be fabricated

      • Use MOS capacitors

    • Decap allocation is reduced to WS allocation

    • Two-phase approach:

      • Allocate the existing WS in the floorplan

      • Insert additional WS into the floorplan if required

    Slide courtesy of S Zhao, K Roy & C.-K. Kok


    Allocation of Existing White Space

    [Figure: floorplan with modules A to E and white-space (WS) regions w1, w2, and w3]

    Slide courtesy of S Zhao, K Roy & C.-K. Kok


    Allocation of Existing WS: Linear Programming (LP) Approach

    • Objective: Maximize the utilization of available WS

    • Existing WS can be allocated to neighboring modules using LP

    • Notation:

    • LP Approach:

    Slide courtesy of S Zhao, K Roy & C.-K. Kok


    Insert Additional WS into Floorplan If Necessary

    • Update decap budget for each module after existing WS has been allocated

    • If additional WS is required, insert WS into the floorplan by extending it horizontally and vertically

    • Two-phase procedure:

      • insert WS band between rows based on the decap budgets of the modules in the row

      • insert WS band between columns based on the decap budgets of the modules in the column

    Slide courtesy of S Zhao, K Roy & C.-K. Kok


    Moving Modules to Insert WS

    Slide courtesy of S Zhao, K Roy & C.-K. Kok


    Experimental Results: Comparison of Decap Budgets (Ours vs. “Greedy Solution”)


    Experimental Results for MCNC Benchmark Circuits


    Floorplan of playout Before/After WS Insertion


    Conclusion

    • A methodology for decoupling capacitance allocation at floorplan level is proposed

    • Linear programming technique is used to allocate existing WS to maximize its utilization

    • A heuristic is proposed for additional WS insertion

    • Compared with “Greedy” solution, our method produces significantly smaller decap budgets


    Thank you
