
ECE260B – CSE241A Winter 2005 Design Styles Multi-Vdd/Vth Designs




The Design Problem

Source: sematech97

A growing gap between design complexity and design productivity


Design Methodology

  • Design process traverses iteratively between three abstractions: behavior, structure, and geometry

  • More and more automation for each of these steps


Behavioral Description of Accumulator

Design described as set of input-output relations, regardless of chosen implementation

Data described at higher abstraction level (“integer”)


Structural Description of Accumulator

Design defined as composition of register and full-adder cells (“netlist”)

Data represented as {0,1,Z}

Time discretized and progresses with unit steps

Description language: VHDL

Other options: schematics, Verilog
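The contrast between the behavioral and structural views can be illustrated with a minimal Python sketch. It is not the VHDL from the slides: the function names, the 4-bit width, and the LSB-first bit lists are assumptions made for illustration, and the Z (high-impedance) state is not modeled.

```python
from typing import List, Tuple


def accumulate_behavioral(acc: int, x: int, width: int = 4) -> int:
    """Behavioral view: a single input-output relation on integers."""
    return (acc + x) % (1 << width)


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """One full-adder cell on single bits; returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def accumulate_structural(acc_bits: List[int], x_bits: List[int]) -> List[int]:
    """Structural view: a ripple chain of full-adder cells.

    Bit lists are least-significant-bit first. The returned bits are what
    the register cells would capture at the next unit time step.
    """
    carry = 0
    next_bits = []
    for a, b in zip(acc_bits, x_bits):
        s, carry = full_adder(a, b, carry)
        next_bits.append(s)
    return next_bits  # final carry is dropped, matching the modulo above


def to_bits(value: int, width: int) -> List[int]:
    """Convert an integer to an LSB-first list of bits."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: List[int]) -> int:
    """Convert an LSB-first list of bits back to an integer."""
    return sum(b << i for i, b in enumerate(bits))
```

Both views compute the same next accumulator value; the structural one simply exposes the cells and the bit-level data that the behavioral one hides.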


Implementation Methodologies


Full Custom

  • Hand drawn geometry

  • All layers customized

  • Digital and analog

  • Simulation at transistor level

  • High density

  • High performance

  • Long design time

Magic Layout Editor (UC Berkeley)


Symbolic Layout

  • Dimensionless layout entities

  • Only topology is important

  • Final layout generated by “compaction” program

Stick diagram of inverter


Standard Cells

  • Organized in rows

  • Cells made as full custom by vendor (not user)

  • All layers customized

  • Digital with possible special analog cells

  • Simulation at gate level (digital)

  • Medium-high density

  • Medium-high performance

  • Reasonable design time

Routing channel requirements are reduced by presence of more interconnect layers



Standard Cell — Example



Standard Cell - Example

3-input NAND cell (from Mississippi State Library) characterized for fanout of 4 and for three different technologies


Automatic Cell Generation

Random-logic layout generated by CLEO cell compiler (Digital)


Module Generators — Compiled Datapath


Macrocell-Based Design

  • Predefined macro blocks (uP, RAM, etc.)

  • Macro blocks made as full custom by vendor (IP blocks)

  • All layers customized

  • Digital and some analog

  • Simulation at behavior or gate level

  • High density

  • High performance

  • Short design time

  • Use standard on-chip busses

  • “System on a chip” (SOC)


Interconnect Bus

Routing Channel

Data paths

Routing Channel

Standard cells

Video-encoder chip

Macrocell Design Methodology

Defines overall topology of design, relative placement of modules, and global routes of busses, supplies, and clocks


Gate Array

  • Predefined transistors connected via metal

  • Two types: channel based, sea of gates

  • Only metal layers customized

  • Fixed array sizes

  • Digital cells in library

  • Simulation at gate level (digital)

  • Medium density

  • Medium performance

  • Reasonable design time


Gate Array — Primitive Cells


Cell (4-input NOR)



Sea of gates


Random Logic



LSI Logic LEA300K (0.6 µm CMOS)


Prewired Arrays

  • Programmable logic blocks

  • Programmable connections between logic blocks

  • No layers customized (standard devices)

  • Digital only

  • Low-medium performance

  • Low-medium density

  • Programmable: SRAM, EPROM, Flash, Anti-fuse, etc.

  • Easy and quick design changes

  • Cheap design tools

  • Low development cost

  • High device cost

  • NOT a real ASIC

Courtesy Altera Corp.


Programmable Logic Devices





Field-Programmable Gate Arrays - Fuse-based

Standard-cell like




Programming interconnect using anti-fuses


Field-Programmable Gate Arrays - RAM-based


RAM-based FPGA - Basic Cell (CLB)

Courtesy of Xilinx


RAM-based FPGA

Xilinx XC4025


High Performance Devices

  • Mixture of full custom, standard cells and macros

  • Full custom for special blocks: Adder (data path), etc.

  • Macros for standard blocks: RAM, ROM, etc.

  • Standard cells for non critical digital blocks


Global Signaling and Layout

  • Global signaling and layout optimization

  • Multi-Vdd

  • Static power analysis

  • Multi-Vth + Vdd + sizing

D. Sylvester, DAC-2001


Global Signaling

  • Current global signaling paradigm: insert large static CMOS repeaters to reduce wire RC delay

  • Impending problems:

    • Too many repeaters

      • 180nm processors: 22K repeaters (Itanium), 70K (Power4)

      • Project 1-1.5M repeaters at 45-65nm technologies

    • Too much power

      • Many large repeaters = significant static and dynamic power

    • Too much noise

      • Repeater clustering complicates power distribution

      • Inductive coupling across wide bus structures


Cell Layout Optimization

Figure labels: GDSII Import, Compact fixed width

  • Advanced layout techniques must allow

    • Continuous individual device sizing

    • Variable p/n ratios

    • Tapered FET stacking sizes

    • Arbitrary Vth assignments within gates

  • First cut (Cadabra): 15-22% power reduction using the first two approaches under a fixed-footprint constraint

Optimize specific instances of standard gates

Ref: Hurat, Cadabra


Multi-Vdd


  • Global signaling and layout optimization

  • Multi-Vdd

  • Static power analysis

  • Multi-Vth + Vdd + sizing


Multi-Vdd Status

  • Idea: Incorporate two Vdd’s to reduce dynamic power

  • Limited to a few recent Japanese multimedia processors

    • Example – 0.3 µm, 75 MHz, 3.3 V media processor (Toshiba)

      • Total power savings of 47% in logic, 69% in clock

    • Dynamic voltage scaling of mobile processors

      • Transmeta Crusoe, Intel Speedstep, etc.

      • Not considered in this talk

  • Very powerful technique currently applied only in low-performance designs

    • Mentality: today’s high performance parts aren’t “limited” by power


Lower Power Via Rich Replacement

Figure axes: % of total paths vs. path delay (normalized to clock period)

  • Media processors and other low speed designs have many non-critical paths

    • 60-70% of paths have delay below half the clock period

    • After replacement, most paths become near critical

  • What about high-speed microprocessors?


Similar Story For High-Performance

  • IBM 480 MHz PowerPC shows over 50% of paths have delay less than half the clock period

    • Implies that high-performance designs can benefit from multi-Vdd

Ref: Akrout, JSSC98


Resizing Is Not The Right Answer

Figure panels: before post-synthesis resizing, after post-synthesis resizing

  • Post-synthesis optimizations resize gates to recover power on non-critical paths

    • Looks similar to pre- and post-replacement figures in media processor…

This is the wrong approach for nanometer design!

Ref: Sirichotiyakul, DAC99


Multi-Vdd Instead of Sizing

  • Power ~ C Vdd² f, where f is fixed

  • Key: Reducing gate width impacts power sub-linearly

    • Interconnect capacitance is not affected

  • Reducing supply voltage cuts power quadratically

    • All capacitive loads have lower voltage swing

  • How can we minimize delay penalty at low Vdd?
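The sub-linear versus quadratic argument can be checked with a small numeric sketch. It assumes, to first order, that gate capacitance scales with gate width and that interconnect capacitance is untouched; the 50% wire fraction and the 30% scaling steps are hypothetical numbers chosen only for illustration.

```python
def power_after_downsizing(width_scale: float, wire_fraction: float) -> float:
    """Relative dynamic power after scaling gate widths by width_scale.

    wire_fraction is the share of switched capacitance that is interconnect
    and therefore does not shrink with the gates (Vdd and f unchanged).
    """
    gate_fraction = 1.0 - wire_fraction
    return gate_fraction * width_scale + wire_fraction


def power_after_vdd_scaling(vdd_scale: float) -> float:
    """Relative dynamic power after scaling Vdd, with C and f unchanged."""
    return vdd_scale ** 2


# Hypothetical case: half of the switched capacitance is interconnect.
downsized = power_after_downsizing(width_scale=0.7, wire_fraction=0.5)
low_vdd = power_after_vdd_scaling(vdd_scale=0.7)
print(f"30% narrower gates: {downsized:.2f}x power")
print(f"30% lower Vdd:      {low_vdd:.2f}x power")
```

Under these assumed numbers, downsizing leaves about 85% of the dynamic power, while the same relative reduction in Vdd leaves about 49%.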


Challenges For Multi-Vdd

  • Area overhead

    • Toshiba reported 7% rise in area due to placement restrictions, level converters, additional power grid routing

  • EDA tool support for the above issues (placement, dual power routing)

  • Noise analysis

    • Additional shielding required between Vdd,low and Vdd,high signals?

    • Including clock network


Static Power

  • Global signaling and layout optimization

  • Multi-Vdd

  • Static power

  • Multi-Vth + Vdd + sizing


Static Power

  • Why do we care about static power in non-portable devices?

    • Standby power is “wasted” -- leaves fewer Watts for computation

    • Worsens reliability by raising die temperatures

  • Leakage current is a function of Vth and subthreshold swing (Ss); it is about 10x higher at operating temperature than at room temperature

  • Ss expected to remain at 80-85 mV/dec (room temp)

    • Device technology may cut this by ~20%

  • Vth reductions are mandated by scaling Vdd

    • Vth has been around Vdd/5
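The dependence of leakage on Vth and Ss can be made concrete with the usual first-order subthreshold model, in which leakage changes by one decade for every Ss millivolts of Vth shift. That model, the function names, and the 1.5 V to 1.2 V supply step are assumptions for illustration and are not taken from the slides; only the 85 mV/dec swing and the Vdd/5 rule of thumb come from the text above.

```python
def leakage_ratio(delta_vth_mv: float, ss_mv_per_decade: float) -> float:
    """Relative change in subthreshold leakage for a Vth shift of delta_vth_mv.

    First-order model: leakage is proportional to 10 ** (-Vth / Ss).
    A negative delta (lower Vth) gives a ratio above 1.
    """
    return 10.0 ** (-delta_vth_mv / ss_mv_per_decade)


def vth_from_vdd(vdd_v: float) -> float:
    """Rule of thumb from the slide: Vth is roughly Vdd / 5."""
    return vdd_v / 5.0


ss = 85.0  # mV/decade, upper end of the range quoted on the slide
# Hypothetical supply scaling step from 1.5 V to 1.2 V.
delta_mv = (vth_from_vdd(1.2) - vth_from_vdd(1.5)) * 1000.0
print(f"Vth shift: {delta_mv:.0f} mV")
print(f"Leakage grows by about {leakage_ratio(delta_mv, ss):.1f}x")
```

In this sketch a 60 mV drop in Vth raises leakage roughly fivefold, which is why scaling Vth along with Vdd makes static power a concern even in non-portable parts.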


Leakage Suppression Approaches

Figure labels: Pull Up, Pull Down, High Vth Device

  • Dual-Vth (most common)

    • Low-Vth on critical paths, high-Vth off critical paths

    • Only cost is additional masks


  • MTCMOS

    • Series-inserted high-Vth device cuts leakage current when off (sleep mode)

    • Delay and area penalties, control device sizing is critical

  • Other techniques

    • Substrate biasing to control Vth

    • Dual-Vth domino

      • Use low-Vth devices only in evaluate paths


Can Gate-length biasing help leakage reduction?

  • Reduce leakage?

Variation of leakage and delay (each normalized to 1) for an NMOS device in an industrial 130nm technology

  • Reduce leakage variability?



Gate-length Biasing

  • First proposed by Sirisantana et al.

    • Comparative study of effect of doping, tox and gate-length

    • Large bias used, significant slow down

  • Small bias

    • Little reduction in leakage beyond 10% bias while delay degrades linearly

    • Preserves pin compatibility, so the technique is applicable as a post-RET step

  • Salient features

    • Does not interfere with the design cycle

    • Zero cost (no additional masks)



  • Technology-level

    • All devices in all cells have one biased gate-length

  • Cell-level

    • All devices in a cell have one biased gate-length

  • Device-level

    • All devices have independent biased gate-length

    • Simplification: In each cell, NMOS devices have one gate-length and PMOS devices have another


Device-Level Leakage Reduction


Circuit level

  • Bias gate-length for non-critical cells

  • Library extended with each cell having a biased version

  • Benefits analyzed in conjunction with Multi-VT assignment and in isolation

    • SVT-SGL

    • DVT-SGL

    • SVT-DGL

    • DVT-DGL


Results: Leakage Reduction

With less than 2.5% delay penalty

  • Design Compiler used for VT assignment and gate-length biasing

  • Better results expected with Duet (academic sizer from Michigan)


Multi-Vth + Vdd + Sizing

  • Global signaling and layout optimization

  • Multi-Vdd

  • Static power analysis

  • Multi-Vth + Vdd + sizing


Multi-Everything


  • Need an approach that selects between speed, static power, and dynamic power

  • Should be scalable to nanometer design

    • Rules out dual-Vth domino or other dynamic logic families (low supplies kill performance advantages)

  • Techniques mentioned so far

    • Flexible, optimized cell layouts

    • Multi-Vdd

    • Dual-Vth

  • Put them all together


Multi-Vdd Can Leverage Vth’s

  • Existing designs using multi-Vdd do not alter Vth in low-Vdd cells

    • Highly sub-optimal, delay is fully penalized

    • Limits cell replacement, which in turn limits power savings

  • Much better solution: reduce Vth in low-Vdd cells to carefully balance delay, static power, and dynamic power

    • Enforce technology scaling within a chip – whenever we reduce Vdd, we also reduce Vth to maintain speed


Multi-Vdd + Vth Negates Delay Penalty

Delay ~ C Vdd / Ion

  • Scenarios

    • Constant Vth (current paradigm)

    • Scale Vth to maintain constant static power

    • Scale Vth to reduce static power linearly with Vdd

  • Delay penalty is substantially offset

  • Ion is very sensitive to Vth at Vdd < 1V

  • Pstatic reduces with Vdd due to linear term and smaller Ioff (Ion and DIBL)
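To see how lowering Vth offsets the delay penalty, the relation Delay ~ C Vdd / Ion can be evaluated numerically. The slides do not give an Ion model, so this sketch assumes the alpha-power law, Ion proportional to (Vdd - Vth)^alpha, with alpha = 1.3; the voltages and the function name are likewise hypothetical.

```python
def relative_delay(vdd: float, vth: float, alpha: float = 1.3) -> float:
    """Delay ~ C * Vdd / Ion with C held constant.

    Ion is modeled with the alpha-power law (an assumption, not from the
    slides): Ion is proportional to (Vdd - Vth) ** alpha.
    """
    if vdd <= vth:
        raise ValueError("vdd must exceed vth for the device to turn on")
    ion = (vdd - vth) ** alpha
    return vdd / ion


# Hypothetical operating points.
nominal = relative_delay(vdd=1.2, vth=0.30)
const_vth = relative_delay(vdd=0.8, vth=0.30) / nominal  # Vth left unchanged
lower_vth = relative_delay(vdd=0.8, vth=0.20) / nominal  # Vth reduced with Vdd
print(f"Low Vdd, constant Vth: {const_vth:.2f}x delay")
print(f"Low Vdd, reduced Vth:  {lower_vth:.2f}x delay")
```

With these assumed values the delay penalty at low Vdd shrinks from roughly 43% to roughly 13% once Vth is reduced by 100 mV, at the cost of higher leakage in the low-Vdd cells.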


Now Add Sizing

  • Multi-Vdd + multi-Vth + sizing/cell layout optimization attacks power from many angles (multi-dimensional)

  • Depending on criticality and switching activities, non-critical gates can be:

    • Assigned Vdd,low

    • Assigned Vdd,low + lower Vth

    • Assigned Vth,high

    • Downsized (at the individual transistor level if advantageous)

    • Assigned Vdd,low and upsized

      • For gates that cannot tolerate Vdd,low delay, this can be power efficient

    • And others
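The options listed above can be pictured as a per-gate choice driven by available slack. The sketch below is a deliberately simple greedy pass, not the optimization flow from the talk: the `Gate` record, the option labels, and the fractional delay penalties are all hypothetical, each gate is treated independently, and real flows would also re-run timing and account for level converters and placement constraints.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    name: str
    delay: float  # nominal gate delay (normalized)
    slack: float  # timing slack available to this gate (normalized)


# Hypothetical fractional delay penalties, ordered from the option with the
# largest assumed power saving to the smallest.
OPTIONS = [
    ("vdd_low", 0.40),          # biggest dynamic-power saving, biggest slowdown
    ("vdd_low+low_vth", 0.15),  # lower Vth recovers part of the speed
    ("vth_high", 0.10),         # static-power saving only
]


def assign(gate: Gate) -> str:
    """Pick the first option whose added delay fits within the gate's slack."""
    for label, penalty in OPTIONS:
        if gate.delay * penalty <= gate.slack:
            return label
    return "unchanged"  # critical gate: keep nominal Vdd, Vth, and size


gates = [
    Gate("a", 1.0, 0.50),
    Gate("b", 1.0, 0.20),
    Gate("c", 1.0, 0.12),
    Gate("d", 1.0, 0.00),
]
for g in gates:
    print(g.name, assign(g))
```

Ordering the options by expected saving reflects the idea of trying Vdd reduction before other knobs; a production tool would instead weigh switching activity and leakage for each gate.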




  • Power density must saturate to maintain affordable packaging options

    • 50 W/cm² means 200-250 W for future large MPUs

    • Dynamic thermal management saves 25% on packaging power budget

  • Multi-Vdd will leverage multiple Vth’s to offset delay penalty at low Vdd

    • More widespread re-assignment to Vdd,low

    • Use Vdd first instead of re-sizing to take advantage of large path slacks

    • Anticipated power savings of 50-80%

  • Static power also addressed through multi-Vth + Vdd + sizing

    • Vth difficult to control in ultra-short channels

    • Intra-cell Vth assignment + MTCMOS/variants + sleep modes


Next Week: Project Meetings
