A Method for Fast Delay/Area Estimation


A Method for Fast Delay/Area Estimation. EE219b Semester Project Mike Sheets May 16, 2000. Overview. Problem statement Proposed solution Constant delay paradigm Zero-slack algorithm Implementation Incorporation into SIS Library characterization Results Conclusions Future Work.


A Method for Fast Delay/Area Estimation

EE219b Semester Project

Mike Sheets

May 16, 2000

Overview
  • Problem statement
  • Proposed solution
    • Constant delay paradigm
    • Zero-slack algorithm
  • Implementation
    • Incorporation into SIS
    • Library characterization
    • Results
  • Conclusions
  • Future Work
Problem Statement
  • Given a Boolean network, estimate the area of an implementation that meets particular required-time constraints
    • Estimation should be fast and reasonably accurate
  • Examine how technology independent logic optimization affects the estimation
Area/Delay Models
  • Constant area (traditional) model
    • Composed of discretely sized gates with constant area
    • Mapping involves calculating delay as a function of load
  • Constant delay model
    • Composed of mathematical functions relating area to size
    • Mapping involves calculating size (area) as a function of load

Constant Area Model
  • Area = constant from library
  • Size = constant from library
  • Delay = dint + k*CL

Constant Delay Model
  • Area = Aint + Aslope*size
  • Size = k*CL/(Delay - dint)
  • Delay = constant
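As a rough illustration of the constant delay model, the sketch below sizes one gate for a target delay and then reads off its area. Only the two formulas come from the slide; the function names and the numeric parameters are hypothetical.

```python
def gate_size(k, c_load, delay, d_int):
    """Size = k*CL / (Delay - dint). The target delay must exceed dint."""
    if delay <= d_int:
        raise ValueError("target delay must exceed the intrinsic delay")
    return k * c_load / (delay - d_int)


def gate_area(a_int, a_slope, size):
    """Area = Aint + Aslope*size."""
    return a_int + a_slope * size


# Hypothetical parameters for one gate class (not from any real library).
size = gate_size(k=2.0, c_load=4.0, delay=1.5, d_int=0.5)
area = gate_area(a_int=10.0, a_slope=3.0, size=size)
print(size, area)  # 8.0 34.0
```

Note that the size grows without bound as the target delay approaches dint, which is consistent with area rising sharply for tight required times.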

Zero Slack Algorithm

Given input arrival times {ai} and output required time {rk}, assign gate delays as follows:

  1. Initialize all internal required/arrival times to “unknown”
  2. Select the path(s) with the minimum value of (rk-ai)/lp, where lp is the length of the path in number of gates
    • For each node from primary inputs to primary outputs
      • Calculate all the (ai, li) pairs from all fanin edges
      • Discard dominated pairs, save the union of the undominated pairs
    • When all primary outputs are reached, calculate the minimum (rk-ai)/lp
  3. Assign the delay of each gate in the selected path(s) to this minimum
  4. Update arrival and required times for all fanin and fanout edges of the gates with newly assigned delays
  5. Repeat steps 2-4 until all gates are assigned delays

Pair domination defined:

Pair (ai, li) dominates (aj, lj) if ai ≥ aj and li ≥ lj

If either (a1, l1) or (a2, l2) dominates the other, the four possible paths through n can be reduced to two, since the dominated path is “faster” than necessary.
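The pruning step can be sketched as follows. This is a minimal illustration, assuming the comparison in the domination rule is "greater than or equal" (a later arrival on a longer path is the more critical one); the function name and the sample pairs are hypothetical.

```python
def prune_dominated(pairs):
    """Return the (arrival, length) pairs that no other pair dominates.

    Assumed rule: (ai, li) dominates (aj, lj) if ai >= aj and li >= lj.
    """
    unique = set(pairs)  # identical pairs collapse to one
    return sorted(
        p for p in unique
        if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in unique)
    )


# Hypothetical pairs collected at one node from its fanin edges.
pairs = [(5.0, 2), (3.0, 1), (4.0, 3), (5.0, 2)]
print(prune_dominated(pairs))  # [(4.0, 3), (5.0, 2)]
```

Here (3.0, 1) is dropped because (5.0, 2) arrives later over a longer path, while (4.0, 3) and (5.0, 2) both survive because neither dominates the other.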












Faster Approximation

Select an allowable slack threshold sthresh (if it is zero, the algorithm yields the same result as the zero-slack algorithm)

  1. Compute the forward level lj and arrival time aj of all nodes in the network using a forward trace
  2. Compute the reverse level kj and required time rj of all nodes in the network using a backward trace
  3. Update the delay of every node as dj = dj + (rj-aj)/(lj+kj)
  4. Repeat steps 1-3 while the slack of any node exceeds sthresh
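A minimal sketch of this iteration is given below. It rests on several assumptions that the slides do not state: times are measured at gate outputs, all primary inputs arrive at time 0, all sink gates share one required time, delays start at zero, lj counts the gates on the longest path up to and including node j, and kj counts the gates after it. The function name `distribute_slack` and the dictionary-based network are hypothetical.

```python
def distribute_slack(fanins, required, s_thresh=1e-6, max_iter=1000):
    """fanins maps each gate to its fanin gates, listed in topological order."""
    order = list(fanins)
    fanouts = {g: [] for g in order}
    for g in order:
        for f in fanins[g]:
            fanouts[f].append(g)
    delay = {g: 0.0 for g in order}
    for _ in range(max_iter):
        # Step 1: forward trace for arrival time and forward level.
        arr, lvl = {}, {}
        for g in order:
            arr[g] = max((arr[f] for f in fanins[g]), default=0.0) + delay[g]
            lvl[g] = max((lvl[f] for f in fanins[g]), default=0) + 1
        # Step 2: backward trace for required time and reverse level.
        req, rlvl = {}, {}
        for g in reversed(order):
            req[g] = min((req[o] - delay[o] for o in fanouts[g]), default=required)
            rlvl[g] = max((rlvl[o] + 1 for o in fanouts[g]), default=0)
        slack = {g: req[g] - arr[g] for g in order}
        # Step 4: stop once no node's slack exceeds the threshold.
        if max(slack.values()) <= s_thresh:
            break
        # Step 3: give each node an equal share of its slack.
        for g in order:
            delay[g] += slack[g] / (lvl[g] + rlvl[g])
    return delay


# Three-gate chain a -> b -> c with a required time of 6 at the output.
print(distribute_slack({"a": [], "b": ["a"], "c": ["b"]}, required=6.0))
```

For the chain, a single update assigns a delay of 2.0 to each gate and the second pass finds zero slack everywhere. Dividing by the longest path through the node keeps the update conservative, so no path is over-allocated under these assumptions.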
Incorporation into SIS

[Figure: flow diagram with blocks labeled "Tech. independent optimization: script.algebraic, script.boolean, etc.", "Tech. dependent optimization", "Fast delay/area estimation", and "Area/delay tradeoff curve"]

Library Characterization
  • A commercial standard-cell library may contain multiple gates that implement the same equation
  • Each gate in the library has characteristics:
    • Size
    • Delays from all input pins to the output pin for all transitions and several loads
    • Capacitance for all input pins
    • Maximum load
    • Area
  • We need estimation parameters for each class of gates (i.e., gates with the same equation):
    • Intrinsic gate delay (dint)
    • Drive factor (k)
    • Area line y-intercept (Aint)
    • Area line slope (Aslope)
    • Input capacitance line y-intercept (cint)
    • Input capacitance line slope (cslope)
Inverter Characterization (1)
  • Inverter delay scales linearly with load/size
    • Slope is k
    • Y-intercept is dint
Inverter Characterization (2)
  • Inverter area scales linearly with size
    • Slope is Aslope
    • Y-intercept is Aint
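Both characterization steps amount to fitting a straight line, so a small least-squares helper is enough to illustrate them. The sketch below is an assumption about how such a fit could be done, not the procedure used in the project; `fit_line` and the sample data are hypothetical. The same helper would give dint and k when x is load/size and y is delay.

```python
def fit_line(xs, ys):
    """Least-squares fit y = intercept + slope*x.

    Returns (intercept, slope, r2), where r2 is the coefficient of
    determination of the fit.
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points (two gates per class)")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    r2 = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return intercept, slope, r2


# Hypothetical inverter sizes and areas (not from a real library).
sizes = [1.0, 2.0, 4.0]
areas = [13.0, 16.0, 22.0]
a_int, a_slope, r2 = fit_line(sizes, areas)
print(a_int, a_slope, r2)  # approximately 10.0, 3.0, 1.0
```

A low r2 flags a gate class for which the linear model is a poor description.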
Characterization Issues
  • Requires at least two gates per class in the library
  • Additionally, some gate classes fit poorly (their trend lines have low coefficients of determination)
  • Further investigation shows the reason is the CMOS implementation, which differs between smaller and larger sizes (below)
  • Future work might replace linear model with piece-wise linear model for more accuracy

[Figure: NAND-gate CMOS schematic for smaller sizes]

[Figure: NAND-gate CMOS schematic for larger sizes]

Estimation Library
  • These issues are evident in the table
    • OAI31 and OAI32 have Aslope of 0.0, meaning that the two cells in the library had the same area
    • NOR3, NOR4 had poor coefficients of determination
    • Many gates in the library had only one size
Estimation Modes
  • Sweep mode
    • User specifies a range of required times to sweep (possibly only one) and a step size
    • Estimation starts with the largest required time and steps down until the network fails the zero slack algorithm (i.e., negative slack is encountered)
  • Binary search mode
    • Used to find the minimum possible required time (period) given infinite area
    • Starts at a user-specified maximum and performs a binary search until a pass limit is reached
  • Combinational logic benchmarks of various sizes
    • MCNC c17, c880, c1908, c3540
  • Sequential logic benchmarks of various sizes
    • Interpretation of required time is clock period (assuming all flip-flops are clocked synchronously)
    • MCNC s713, s838, s953, s1196, s1238, s1423
  • Tested four scripts
    • script.none (no optimization), script.algebraic, script.boolean, script.rugged
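The two estimation modes described above can be sketched as simple search loops. In this sketch `feasible` is a hypothetical stand-in for one run of the estimator at a given required time, and the binary search assumes feasibility is monotone (any required time above a passing one also passes); neither detail is stated in the slides.

```python
def sweep(feasible, t_max, t_min, step):
    """Step the required time down from t_max; stop at the first failure."""
    passed = []
    t = t_max
    while t >= t_min:
        if not feasible(t):
            break
        passed.append(t)
        t -= step
    return passed


def min_required_time(feasible, t_max, passes):
    """Binary search in [0, t_max] for the smallest passing required time."""
    if not feasible(t_max):
        return None  # even the user-specified maximum fails
    lo, hi = 0.0, t_max
    for _ in range(passes):  # pass limit bounds the number of estimator runs
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi


def toy_feasible(t):
    # Toy stand-in: pretend the network passes whenever t >= 3.2.
    return t >= 3.2


print(sweep(toy_feasible, 10, 1, 1))  # [10, 9, 8, 7, 6, 5, 4]
print(min_required_time(toy_feasible, 16.0, 20))
```

The value returned by the binary search always passes, and after n passes it lies within t_max/2^n of the true minimum.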
Tradeoff Curves
  • Sweep mode allows multiple required times (clock periods) to be easily tabulated
Sensitivity to Optimization Script
  • When delay is non-critical (i.e., as required time approaches infinity)
    • Area within 20% of no optimization
    • Variation between optimization scripts mostly under 10%
  • Sometimes more optimization yields worse results
  • As required times become smaller, more paths become critical, requiring larger gate sizes (more area)
    • Area increases quickly before failure
  • From the benchmarks shown, estimation is relatively insensitive to technology independent optimization with infinite required times
Possible Future Work
  • Accuracy
    • Relate estimated areas to actual areas from a good mapping using the full technology library
    • Use more complex delay equations to handle different rise/fall times
    • Modify the algorithm to handle the case where a primary input cannot drive the required load
  • Characterization
    • Revise characterization to support piece-wise linear functional forms
    • Automate process so only the actual technology library is required as an input
  • Mapping
    • Examine how various mapping options affect estimation
    • Use buffered fanout trees (Touati) after sizing gates
  • Speed
    • Compare speed of total estimation procedure to traditional flow
  • Power estimation