A Method for Fast Delay/Area Estimation

A Method for Fast Delay/Area Estimation. EE219b Semester Project Mike Sheets May 16, 2000. Overview. Problem statement Proposed solution Constant delay paradigm Zero-slack algorithm Implementation Incorporation into SIS Library characterization Results Conclusions Future Work.


EE219b Semester Project

Mike Sheets

May 16, 2000

Overview
  • Problem statement
  • Proposed solution
    • Constant delay paradigm
    • Zero-slack algorithm
  • Implementation
    • Incorporation into SIS
    • Library characterization
    • Results
  • Conclusions
  • Future Work
Problem Statement
  • Given a Boolean network, estimate its area when implemented under particular required-time constraints
    • Estimation should be fast and reasonably accurate
  • Examine how technology independent logic optimization affects the estimation
Area/Delay Models
  • Constant area (traditional) model
    • Composed of discretely sized gates with constant area
    • Mapping involves calculating delay as a function of load
  • Constant delay model
    • Composed of mathematical functions relating area to size
    • Mapping involves calculating size (area) as a function of load

Constant Area Model (e.g., ND2X1 driving load CL)
  • Area = constant from library
  • Size = constant from library
  • Delay = dint + k*CL

Constant Delay Model (e.g., ND2 driving load CL)
  • Area = Aint + Aslope*size
  • Size = k*CL / (Delay - dint)
  • Delay = constant
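
As a rough illustration of the constant delay model, the sketch below computes a gate's size from its load and delay target, then its area from the size, using the two formulas from the slide. The function names and all numeric parameter values are hypothetical and are not taken from any real library.

```python
def gate_size(load_cap, delay, d_int, k):
    """Size needed to drive load_cap at a fixed delay: k*CL / (Delay - dint)."""
    if delay <= d_int:
        # A delay target at or below the intrinsic delay cannot be met by sizing.
        raise ValueError("delay target must exceed the intrinsic delay")
    return k * load_cap / (delay - d_int)


def gate_area(size, a_int, a_slope):
    """Area as a linear function of size: Aint + Aslope*size."""
    return a_int + a_slope * size


# Hypothetical parameters for one gate class (illustrative values only).
size = gate_size(load_cap=4.0, delay=3.0, d_int=1.0, k=0.5)
area = gate_area(size, a_int=2.0, a_slope=3.0)
print(size, area)
```

Note that tightening the delay target toward dint makes the required size, and therefore the area, grow without bound.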

Zero Slack Algorithm

Given input arrival times {ai} and output required time {rk}, assign gate delays as follows:

  1. Initialize all internal required/arrival times to “unknown”
  2. Select the path(s) with the minimum value of (rk-ai)/lp, where lp is the length of the path in number of gates
    • For each node from primary inputs to primary outputs
      • Calculate all the (ai, li) pairs from all fanin edges
      • Discard dominated pairs, save the union of the undominated pairs
    • When all primary outputs are reached, calculate the minimum (rk-ai)/lp
  3. Assign the delay of each gate in the selected path(s) to this minimum
  4. Update arrival and required times for all fanin and fanout edges of the newly assigned delays
  5. Repeat steps 2-4 until all gates are assigned delays
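
The path selection metric can be illustrated on a toy case where the paths are simply listed by hand. This is only a sketch of the (rk-ai)/lp computation, not of the full algorithm, which avoids explicit path enumeration by propagating (ai, li) pairs. The path names and times below are made up.

```python
def per_gate_budget(arrival, required, length):
    """Delay available to each gate on a path: (rk - ai) / lp."""
    return (required - arrival) / length


# Hypothetical paths: name -> (input arrival ai, output required rk, gates lp).
paths = {
    "a->g1->g2->g4": (0.0, 6.0, 3),
    "b->g3->g4": (1.0, 6.0, 2),
}

budgets = {name: per_gate_budget(*p) for name, p in paths.items()}
critical = min(budgets, key=budgets.get)  # path with the smallest budget
print(critical, budgets[critical])
```

Every gate on the selected path would then be assigned the minimum budget as its delay.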

Pair domination defined:

Pair (ai, li) dominates (aj, lj) if ai ≥ aj and li ≥ lj

If either (a1, l1) or (a2, l2) dominates the other, the four possible paths through n can be reduced to two, since the dominated path is “faster” than necessary.

[Figure: node n with fanin nodes n1 and n2 (arrival times a1, a2; path lengths l1, l2) and fanout nodes n3 and n4 (required times r3, r4; path lengths l3, l4)]
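
A minimal sketch of the pruning step follows, assuming domination means "arrives no earlier over a path that is no shorter". The helper names are hypothetical, and the quadratic comparison is chosen for clarity rather than speed.

```python
def dominates(p, q):
    """True if pair p = (arrival, length) dominates pair q."""
    return p[0] >= q[0] and p[1] >= q[1]


def prune(pairs):
    """Return the undominated (arrival, length) pairs, without duplicates."""
    unique = set(pairs)
    return sorted(
        p for p in unique
        if not any(q != p and dominates(q, p) for q in unique)
    )


# Hypothetical pairs collected from the fanin edges of one node.
kept = prune([(5.0, 2), (3.0, 1), (4.0, 3), (5.0, 2)])
print(kept)
```

Here (3.0, 1) is discarded because (5.0, 2) arrives later over a longer path, while (5.0, 2) and (4.0, 3) are both kept because neither dominates the other.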

Faster Approximation

Select an allowable slack threshold sthresh (if zero, the algorithm yields the same result as the zero-slack algorithm)

  1. Compute the forward level lj and arrival time aj of all nodes in the network using a forward trace
  2. Compute the reverse level kj and required time rj of all nodes in the network using a backward trace
  3. Update the delay of every node as dj = dj + (rj-aj)/(lj+kj)
  4. While the slack of any node exceeds sthresh, repeat steps 1-3
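
The iteration above might be sketched as follows. Several details are assumptions not stated on the slide: gates are supplied in topological order, all delays start at zero, lj counts gates up to and including node j, kj counts gates after node j, every gate without fanout has a required time, and an iteration cap guards against slow convergence. The function name and the example network are hypothetical.

```python
def assign_delays(fanins, arrival, required, s_thresh=0.01, max_iters=1000):
    """Distribute slack over gate delays.

    fanins:   gate -> list of fanin names, with gates in topological order
    arrival:  primary input -> arrival time
    required: primary-output gate -> required time
    """
    gates = list(fanins)
    fanouts = {g: [] for g in gates}
    for g in gates:
        for f in fanins[g]:
            if f in fanouts:
                fanouts[f].append(g)

    delay = {g: 0.0 for g in gates}
    for _ in range(max_iters):
        # Forward trace: arrival time and forward level of each gate output.
        a = dict(arrival)
        lvl = {pi: 0 for pi in arrival}
        for g in gates:
            a[g] = max(a[f] for f in fanins[g]) + delay[g]
            lvl[g] = max(lvl[f] for f in fanins[g]) + 1
        # Backward trace: required time and reverse level of each gate output.
        r, k = {}, {}
        for g in reversed(gates):
            cands = [r[o] - delay[o] for o in fanouts[g]]
            if g in required:
                cands.append(required[g])
            r[g] = min(cands)
            k[g] = max((k[o] + 1 for o in fanouts[g]), default=0)
        slack = {g: r[g] - a[g] for g in gates}
        if max(slack.values()) <= s_thresh:
            break
        for g in gates:
            delay[g] += slack[g] / (lvl[g] + k[g])
    return delay


# Hypothetical network: a->g1->g2->g4 and b->g3->g4, required time 6 at g4.
net = {"g1": ["a"], "g2": ["g1"], "g3": ["b"], "g4": ["g2", "g3"]}
delays = assign_delays(net, arrival={"a": 0.0, "b": 0.0}, required={"g4": 6.0})
print(delays)
```

In this example the three gates on the longer path each receive a delay of 2, and the delay of g3 approaches 4 as its remaining slack is halved on every pass.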
Incorporation into SIS

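[Figure: flow diagram]
  • BLIF netlist → read_blif → technology-independent optimization (script.algebraic, script.boolean, etc.)
  • Traditional flow: technology library → read_library → technology-dependent optimization (map) → area
  • Estimation flow: technology library → manual analysis → estimation library → read_estim → fast delay/area estimation (estimate) → area/delay tradeoff curve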

Library Characterization
  • A commercial standard-cell library may contain multiple gates that implement the same equation
  • Each gate in the library has characteristics:
    • Size
    • Delays from all input pins to the output pin for all transitions and several loads
    • Capacitance for all input pins
    • Maximum load
    • Area
  • We need estimation parameters for each class of gates (i.e., gates with the same equation):
    • Intrinsic gate delay (dint)
    • Drive factor (k)
    • Area line y-intercept (Aint)
    • Area line slope (Aslope)
    • Input capacitance line y-intercept (cint)
    • Input capacitance line slope (cslope)
Inverter Characterization (1)
  • Inverter delay scales linearly with load/size
    • Slope is k
    • Y-intercept is dint
Inverter Characterization (2)
  • Inverter area scales linearly with size
    • Slope is Aslope
    • Y-intercept is Aint
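
One way to extract the slope and y-intercept for either characterization is an ordinary least-squares line fit, sketched below. The slides do not say which fitting method was used, so this is an assumption, and the data points are invented. The same helper would apply to delay versus load/size (giving k and dint) and to area versus size (giving Aslope and Aint).

```python
def fit_line(xs, ys):
    """Least-squares fit y = intercept + slope*x; returns (slope, intercept, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if n < 2 or sxx == 0:
        # A class with a single gate size does not determine a line.
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r2


# Hypothetical inverter data: delay measured at several load/size ratios.
load_over_size = [1.0, 2.0, 4.0]
delay = [1.5, 2.0, 3.0]
k, d_int, r2 = fit_line(load_over_size, delay)
print(k, d_int, r2)
```

The returned r2 is the coefficient of determination of the fit; values well below 1 would indicate that a straight line describes the gate class poorly.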
Characterization Issues
  • Requires at least two gates per class in the library
  • Additionally, some gates have poor accuracy (trend lines have poor coefficients of determination)
  • Further research shows the reason is CMOS implementation (below)
  • Future work might replace linear model with piece-wise linear model for more accuracy

[Figure: NAND-gate CMOS schematics, one for smaller sizes and one for larger sizes]

Estimation Library
  • These issues are evident in the table
    • OAI31 and OAI32 have Aslope of 0.0, meaning that the two cells in the library had the same area
    • NOR3, NOR4 had poor coefficients of determination
    • Many gates in the library had only one size
Estimation Modes
  • Sweep mode
    • User specifies a range of required times to sweep (possibly only one) and a step size
    • Estimation starts with the largest required time and steps down until the network fails the zero-slack algorithm (i.e., negative slack is encountered)
  • Binary search mode
    • Used to find the minimum possible required time (period) given infinite area
    • Starts at a user-specified maximum and performs a binary search until a pass limit is reached
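
The two modes could be driven by loops like the ones sketched below. The callback `estimate`, which returns an area or `None` when the network fails, is a hypothetical stand-in for the real estimation routine, and the toy model used to exercise it is invented. The lower bound of zero in the binary search is also an assumption.

```python
def sweep(estimate, t_max, t_min, step):
    """Step the required time down from t_max; stop at the first failure."""
    results = []
    i = 0
    while t_max - i * step >= t_min:
        t = t_max - i * step
        area = estimate(t)
        if area is None:  # the network failed at this required time
            break
        results.append((t, area))
        i += 1
    return results


def min_required_time(estimate, t_max, passes):
    """Binary search for the smallest required time that still passes."""
    if estimate(t_max) is None:
        return None
    lo, hi = 0.0, t_max  # lo always fails (assumed), hi always passes
    for _ in range(passes):
        mid = (lo + hi) / 2
        if estimate(mid) is None:
            lo = mid
        else:
            hi = mid
    return hi


def toy_estimate(t):
    """Invented model: area blows up as t approaches 2, fails at or below 2."""
    return 10.0 / (t - 2.0) if t > 2.0 else None


curve = sweep(toy_estimate, t_max=5.0, t_min=1.0, step=1.0)
t_min_found = min_required_time(toy_estimate, t_max=8.0, passes=20)
print(curve, t_min_found)
```

The pass limit bounds the search cost: each pass halves the interval between the last failing and the last passing required time.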
Experimentation
  • Various sized combinational logic benchmarks
    • MCNC c17, c880, c1908, c3540
  • Various sized sequential logic benchmarks
    • Interpretation of required time is clock period (assuming all flip-flops are clocked synchronously)
    • MCNC s713, s838, s953, s1196, s1238, s1423
  • Tested four scripts
    • script.none (no optimization), script.algebraic, script.boolean, script.rugged
Tradeoff Curves
  • Sweep mode allows multiple required times (clock periods) to be easily tabulated
Sensitivity to Optimization Script
  • When delay is non-critical (i.e., as required time approaches infinity)
    • Estimated area is within 20% of the result with no optimization
    • Variation between optimization scripts is mostly under 10%
Conclusions
  • Sometimes more optimization yields worse results
  • As required times become smaller, more paths become critical, requiring larger sizes (area)
    • Area increases quickly before failure
  • From the benchmarks shown, estimation is relatively insensitive to technology independent optimization with infinite required times
Possible Future Work
  • Accuracy
    • Relate estimated areas to actual areas from a good mapping using the full technology library
    • Use more complex delay equations to handle different rise/fall times
    • Modify the algorithm to handle the case where a primary input cannot drive the required load
  • Characterization
    • Revise characterization to support piece-wise linear functional forms
    • Automate process so only the actual technology library is required as an input
  • Mapping
    • Examine how various mapping options affect estimation
    • Use buffered fanout trees (Touati) after sizing gates
  • Speed
    • Compare speed of total estimation procedure to traditional flow
  • Power estimation