A Method for Fast Delay/Area Estimation

EE219b Semester Project

Mike Sheets

May 16, 2000


Overview

  • Problem statement

  • Proposed solution

    • Constant delay paradigm

    • Zero-slack algorithm

  • Implementation

    • Incorporation into SIS

    • Library characterization

    • Results

  • Conclusions

  • Future Work

Problem Statement

  • Given a Boolean network, estimate its area when implemented under particular required-time constraints

    • Estimation should be fast and reasonably accurate

  • Examine how technology-independent logic optimization affects the estimation

Area/Delay Models

  • Constant area (traditional) model

    • Composed of discretely sized gates with constant area

    • Mapping involves calculating delay as a function of load

  • Constant delay model

    • Composed of mathematical functions relating area to size

    • Mapping involves calculating size (area) as a function of load

Constant Area Model

  • Area = constant from library

  • Size = constant from library

  • Delay = dint + k*CL

Constant Delay Model

  • Area = Aint + Aslope*size

  • Size = k*CL / (Delay - dint)

  • Delay = constant
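The constant delay relations can be sketched in a few lines of Python. This is only an illustration: the function names and the numeric values are hypothetical, while the formulas are the ones on the slide (Size = k*CL / (Delay - dint), Area = Aint + Aslope*size).

```python
def size_for_delay(delay, c_load, d_int, k):
    """Gate size needed to meet a fixed delay when driving load c_load."""
    if delay <= d_int:
        # The target delay must exceed the intrinsic delay, or no size works.
        raise ValueError("delay must be greater than the intrinsic delay")
    return k * c_load / (delay - d_int)


def area_for_size(size, a_int, a_slope):
    """Area from the linear area model of a gate class."""
    return a_int + a_slope * size


# Made-up parameters, for illustration only.
size = size_for_delay(delay=3.0, c_load=4.0, d_int=1.0, k=2.0)
area = area_for_size(size, a_int=10.0, a_slope=2.5)
```

Note how the size grows without bound as the assigned delay approaches dint, which is consistent with the steep area increase reported near failure.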

Zero Slack Algorithm

Given input arrival times {ai} and output required times {rk}, assign gate delays as follows:

  1. Initialize all internal required/arrival times to “unknown”

  2. Select the path(s) with the minimum value of (rk-ai)/lp, where lp is the length of the path in number of gates

    • For each node from primary inputs to primary outputs

      • Calculate all the (ai, li) pairs from all fanin edges

      • Discard dominated pairs, save the union of the undominated pairs

    • When all primary outputs are reached, calculate minimum (rk-ai)/lp

  3. Assign delay of each gate in the selected path(s) to this minimum

  4. Update arrival and required times for all fanin and fanout edges of gates with newly assigned delays

  5. Repeat steps 2-4 until all gates are assigned delays

Pair domination defined:

Pair (ai, li) dominates (aj, lj) if

ai ≥ aj and li ≥ lj

If either (a1, l1) or (a2, l2) dominates the other, the four possible paths through node n can be reduced to two, since the dominated path is “faster” than necessary.
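A minimal sketch of the pruning step follows. It assumes domination means a later-or-equal arrival time together with a longer-or-equal path length, so that the dominated pair always leaves more delay per gate. The function names are hypothetical and not taken from the project code.

```python
def prune_dominated(pairs):
    """Keep only the (arrival, length) pairs that no other pair dominates."""
    unique = set(pairs)
    return sorted(
        p for p in unique
        if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in unique)
    )


def tightest_delay(pairs, required):
    """Minimum per-gate delay budget (required - arrival) / length over the pairs."""
    return min((required - a) / l for a, l in pairs)


pairs = prune_dominated([(0.0, 3), (1.0, 3), (2.0, 1), (0.5, 1)])
budget = tightest_delay(pairs, required=7.0)
```

Here (1.0, 3) dominates both (0.0, 3) and (0.5, 1), so only two pairs need to be carried forward to the next node.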












Faster Approximation

Select an allowable slack threshold sthresh (if it is zero, the algorithm yields the same result as the zero-slack algorithm above)

  1. Compute the forward level lj and arrival time aj of all nodes in the network using a forward trace

  2. Compute the reverse level kj and required time rj of all nodes in the network using a backward trace

  3. Update the delay of every node as dj = dj + (rj-aj)/(lj+kj)

  4. While the slack of any node exceeds sthresh, repeat steps 1-3
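The iteration above can be sketched as follows. Several details are assumptions, not taken from the slides: the network is a dict listed in topological order, lj counts gates on the longest path up to and including node j, kj counts gates on the longest path after node j, all delays start at zero, and every gate without fanout has an entry in `required_out`. The name `distribute_slack` and the `max_iters` guard are hypothetical.

```python
def distribute_slack(fanins, arrival_in, required_out, s_thresh=1e-6, max_iters=1000):
    """fanins maps each gate to its fanin names (gates or primary inputs)."""
    order = list(fanins)  # assumed to be in topological order
    fanouts = {g: [] for g in order}
    for g in order:
        for f in fanins[g]:
            if f in fanouts:
                fanouts[f].append(g)
    delay = {g: 0.0 for g in order}
    for _ in range(max_iters):
        # Step 1: forward trace for arrival times and forward levels.
        arr, lvl = {}, {}
        for g in order:
            arr[g] = delay[g] + max(
                arr[f] if f in arr else arrival_in[f] for f in fanins[g])
            lvl[g] = 1 + max(lvl.get(f, 0) for f in fanins[g])
        # Step 2: backward trace for required times and reverse levels.
        req, rlvl = {}, {}
        for g in reversed(order):
            candidates = [req[o] - delay[o] for o in fanouts[g]]
            if g in required_out:
                candidates.append(required_out[g])
            req[g] = min(candidates)
            rlvl[g] = max((rlvl[o] + 1 for o in fanouts[g]), default=0)
        slack = {g: req[g] - arr[g] for g in order}
        if min(slack.values()) < -1e-9:
            raise ValueError("negative slack: required time cannot be met")
        # Step 4: stop once no node's slack exceeds the threshold.
        if max(slack.values()) <= s_thresh:
            break
        # Step 3: spread each node's slack over its longest path.
        for g in order:
            delay[g] += slack[g] / (lvl[g] + rlvl[g])
    return delay


chain = {"g1": ["a"], "g2": ["g1"], "g3": ["g2"], "g4": ["g3"]}
delays = distribute_slack(chain, {"a": 0.0}, {"g4": 8.0})
```

On this four-gate chain with a required time of 8, one update assigns a delay of 2 to every gate and the second pass finds zero slack everywhere.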

Incorporation into SIS




[Figure: flow diagram of the estimator within SIS. Labels: technology-independent optimization (script.algebraic, script.boolean, etc.); technology-dependent optimization; fast delay/area estimation; area/delay tradeoff curve.]


Library Characterization

  • A commercial standard cell library may contain multiple gates that implement the same equation

  • Each gate in the library has characteristics:

    • Size

    • Delays from all input pins to the output pin for all transitions and several loads

    • Capacitance for all input pins

    • Maximum load

    • Area

  • We need estimation parameters for each class of gates (i.e., gates with the same equation):

    • Intrinsic gate delay (dint)

    • Drive factor (k)

    • Area line y-intercept (Aint)

    • Area line slope (Aslope)

    • Input capacitance line y-intercept (cint)

    • Input capacitance line slope (cslope)

Inverter Characterization (1)

  • Inverter delay scales linearly with load/size

    • Slope is k

    • Y-intercept is dint

Inverter Characterization (2)

  • Inverter area scales linearly with size

    • Slope is Aslope

    • Y-intercept is Aint
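Both characterizations come down to fitting a straight line through the library cells of one gate class. The sketch below uses an ordinary least-squares fit and also reports the coefficient of determination. The helper name `fit_line` and all data points are made up for illustration; they are not values from the characterized library.

```python
def fit_line(xs, ys):
    """Least-squares fit y = intercept + slope*x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n < 2 or len(set(xs)) < 2:
        # A line needs at least two cells with distinct x values.
        raise ValueError("need at least two gates with distinct x values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical inverter data: delay versus load/size, and area versus size.
k, d_int, r2_delay = fit_line([1.0, 2.0, 4.0], [0.3, 0.5, 0.9])
a_slope, a_int, r2_area = fit_line([1.0, 2.0, 4.0], [12.0, 14.0, 18.0])
```

The same helper could be applied to input capacitance versus size to obtain cslope and cint.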

Characterization Issues

  • Requires at least two gates per class in the library

  • Additionally, some gate classes fit poorly (their trend lines have low coefficients of determination)

  • Further investigation shows the cause is the CMOS implementation (below)

  • Future work might replace the linear model with a piecewise linear model for more accuracy

[Figure: NAND-gate CMOS schematic for smaller sizes]

[Figure: NAND-gate CMOS schematic for larger sizes]

Estimation Library

  • These issues are evident in the table

    • OAI31 and OAI32 have Aslope of 0.0, meaning that the two cells in the library had the same area

    • NOR3, NOR4 had poor coefficients of determination

    • Many gates in the library had only one size

Estimation Modes

  • Sweep mode

    • User specifies a range of required times to sweep (possibly only one) and a step size

    • Estimation starts with the largest required time and steps down until the network fails the zero slack algorithm (i.e., negative slack is encountered)

  • Binary search mode

    • Used to find the minimum possible required time (period) given infinite area

    • Starts at a user-specified maximum and performs a binary search until a pass limit is reached
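The two modes can be sketched around a single estimator callback. Here `estimate_area` is a hypothetical function that returns the estimated area for a required time, or None when negative slack is encountered. The lower search bound of zero in the binary search is also an assumption.

```python
def sweep(estimate_area, t_max, t_min, step):
    """Step the required time down from t_max; stop at the first failure."""
    results = []
    i = 0
    while t_max - i * step >= t_min:
        t = t_max - i * step
        area = estimate_area(t)
        if area is None:  # network failed the zero slack algorithm
            break
        results.append((t, area))
        i += 1
    return results


def min_required_time(estimate_area, t_max, passes):
    """Binary search for the smallest feasible required time below t_max."""
    if estimate_area(t_max) is None:
        return None  # even the user-specified maximum fails
    lo, hi = 0.0, t_max  # invariant: hi is feasible, lo is not known to be
    for _ in range(passes):
        mid = (lo + hi) / 2
        if estimate_area(mid) is None:
            lo = mid
        else:
            hi = mid
    return hi
```

Each extra pass halves the uncertainty in the minimum period, so the pass limit sets the resolution of the result.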


  • Various sized combinational logic benchmarks

    • MCNC c17, c880, c1908, c3540

  • Various sized sequential logic benchmarks

    • Interpretation of required time is clock period (assuming all flip-flops are clocked synchronously)

    • MCNC s713, s838, s953, s1196, s1238, s1423

  • Tested four scripts

    • script.none (no optimization), script.algebraic, script.boolean, script.rugged

Tradeoff Curves

  • Sweep mode allows multiple required times (clock periods) to be easily tabulated

Sensitivity to Optimization Script

  • When delay is non-critical (i.e., as required time approaches infinity)

    • Area within 20% of no optimization

    • Variation between optimization scripts mostly under 10%


Conclusions

  • Sometimes more optimization yields worse results

  • As required times become smaller, more paths become critical, requiring larger sizes (area)

    • Area increases quickly before failure

  • From the benchmarks shown, estimation is relatively insensitive to technology-independent optimization with infinite required times

Possible Future Work

  • Accuracy

    • Relate estimated areas to actual areas from a good mapping using the full technology library

    • Use more complex delay equations to handle different rise/fall times

    • Modify the algorithm to handle the case where a primary input cannot drive the required load

  • Characterization

    • Revise characterization to support piecewise linear functional forms

    • Automate process so only the actual technology library is required as an input

  • Mapping

    • Examine how various mapping options affect estimation

    • Use buffered fanout trees (Touati) after sizing gates

  • Speed

    • Compare speed of total estimation procedure to traditional flow

  • Power estimation
