
Optimization Strategies for Physical Synthesis and Timing Closure

Charles J. Alpert

IBM Corp.

Sachin Sapatnekar

University of Minnesota ECE Dept.

Salil Raje

Hierarchical Design, Inc.


Optimization Strategies for Physical Synthesis and Timing Closure

Part One

Charles J. Alpert

Austin Research Laboratory

IBM Research Division, Austin, TX 78758

alpert@austin.ibm.com


Talk Outline

  • Introduction

  • Buffer insertion

    • Van Ginneken dynamic programming

    • Extensions

  • Steiner tree construction

  • Blockage avoidance

  • Wire sizing

  • Interconnect planning


Simple Buffer Insertion Problem

Given: Source and sink locations, sink capacitances and RATs (required arrival times), a buffer type, source delay rules, unit wire resistance and capacitance

[Figure: source s0, a buffer, and four sinks labeled RAT1 to RAT4]

Simple Buffer Insertion Problem

Find: Buffer locations and a routing tree such that slack at the source is maximized

[Figure: source s0 and sinks labeled RAT1 to RAT4]

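Slack Example

Slack at the source is the minimum over all sinks of RAT - delay.

Case 1: sinks with (RAT = 500, delay = 400) and (RAT = 400, delay = 600); slack = -200

Case 2: sinks with (RAT = 500, delay = 350) and (RAT = 400, delay = 300); slack = +100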
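The slack numbers on this slide can be reproduced with a few lines of Python. This is only a sketch: the function name `source_slack` and the grouping of the four (RAT, delay) pairs into two cases are assumptions made for illustration.

```python
def source_slack(sinks):
    """Slack at the source: the smallest (RAT - delay) over all sinks."""
    return min(rat - delay for rat, delay in sinks)


# (RAT, delay) pairs taken from the slide, grouped into two cases (assumed grouping).
case_1 = [(500, 400), (400, 600)]
case_2 = [(500, 350), (400, 300)]

print(source_slack(case_1))  # -200
print(source_slack(case_2))  # 100
```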


Elmore Delay

[Figure: RC network with nodes A, B, C, resistors R1, R2, and capacitors C1, C2]
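As a rough illustration of the Elmore delay, the sketch below assumes the figure is a simple RC chain: A connects to B through R1, B connects to C through R2, C1 hangs at B and C2 at C. That topology is an assumption, and `elmore_delays` is a hypothetical helper name. Under this assumption the delay to C is R1(C1 + C2) + R2·C2.

```python
def elmore_delays(resistances, capacitances):
    """Elmore delay from the driver to each node of an RC chain.

    resistances[k] is the resistor feeding node k, and capacitances[k] is
    the grounded capacitor at node k (assumed chain topology).
    """
    delays = []
    downstream = sum(capacitances)  # capacitance seen through the first resistor
    total = 0.0
    for r, c in zip(resistances, capacitances):
        total += r * downstream     # each resistor charges everything downstream of it
        delays.append(total)
        downstream -= c             # this node's capacitor is no longer downstream
    return delays


# Illustrative values only: R1 = 1, R2 = 2, C1 = 3, C2 = 4.
print(elmore_delays([1, 2], [3, 4]))  # [7.0, 15.0]
```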


Common Approaches

  • Iteratively insert buffers

  • Closed-form solutions (2-pin nets)

  • Dynamic programming

  • Simultaneous constructions



Van Ginneken’s Classic Algorithm

  • Optimal for multi-sink nets

  • Quadratic runtime

  • Bottom-up from sinks to source

  • Generate list of candidates at each node

  • At source, pick the best candidate in list


Key Assumptions

  • Given routing tree

  • Given potential insertion points


Generating Candidates

[Figure: candidate solutions at steps (1), (2), (3)]

Pruning Candidates

[Figure: step (3) with candidates (a) and (b), then step (4)]

Both (a) and (b) “look” the same to the source. Throw out the one with the worst slack.

Candidate Example Continued

[Figure: steps (4) and (5)]

Candidate Example Continued

[Figure: step (5), after pruning]

At driver, compute which candidate maximizes slack. Result is optimal.

Merging Branches

[Figure: left candidates and right candidates]

Pruning Merged Branches

[Figure: merged candidates with the critical branch marked, with pruning]
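A minimal sketch of the branch merge, assuming each candidate is a (capacitance, slack) pair and that both lists are already pruned, so capacitance and slack both increase along each list. The function name and the sample numbers are hypothetical. A merged candidate adds the two capacitances and takes the smaller slack, and only the critical (smaller-slack) side is advanced, which is why the merged list grows additively rather than multiplicatively.

```python
def merge_branches(left, right):
    """Merge two pruned candidate lists at a branch point.

    Each list holds (cap, slack) pairs sorted by increasing cap and slack.
    The result has at most len(left) + len(right) - 1 candidates.
    """
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        (cap_l, q_l), (cap_r, q_r) = left[i], right[j]
        merged.append((cap_l + cap_r, min(q_l, q_r)))
        # Only a better candidate on the critical side can raise the slack.
        if q_l <= q_r:
            i += 1
        if q_r <= q_l:
            j += 1
    return merged


# Hypothetical candidate lists for illustration.
left = [(5, 70), (20, 100)]
right = [(10, 80), (30, 150)]
print(merge_branches(left, right))  # [(15, 70), (30, 80), (50, 100)]
```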


Van Ginneken Example

Candidates are (capacitance, slack) pairs. The sink candidate is (20,400).

Step 1: Wire (C=10, d=150) turns (20,400) into (30,250). Buffer (C=5, d=30) on (30,250) adds (5,220).

Step 2: Wire (C=15) turns (30,250) into (45,50) with d=200, and (5,220) into (20,100) with d=120. Buffer (C=5) on (45,50) adds (5,0) with d=50, and on (20,100) adds (5,70) with d=30.

Van Ginneken Example Cont’d

(5,0) is inferior to (5,70). (45,50) is inferior to (20,100)

Step 3: Wire (C=10) turns (20,100) into (30,10) and (5,70) into (15,-10).

Pick solution with largest slack, follow arrows to get solution
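The pruning step in this example can be sketched as follows. The helper `prune` is a hypothetical name. A candidate is dropped when another candidate has no more capacitance and at least as much slack. The input pairs are the ones from the slides.

```python
def prune(candidates):
    """Keep only non-dominated (cap, slack) candidates.

    After sorting by increasing cap (ties: larger slack first), a candidate
    survives only if its slack beats every candidate with smaller or equal cap.
    """
    kept = []
    for cap, slack in sorted(candidates, key=lambda c: (c[0], -c[1])):
        if not kept or slack > kept[-1][1]:
            kept.append((cap, slack))
    return kept


# The four candidates produced at the second step of the example.
survivors = prune([(45, 50), (5, 0), (20, 100), (5, 70)])
print(survivors)  # [(5, 70), (20, 100)]

# After the final wire, pick the candidate with the largest slack.
best = max([(30, 10), (15, -10)], key=lambda c: c[1])
print(best)  # (30, 10)
```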


Van Ginneken Recap

  • Generate candidates from sinks to source

  • Quadratic runtime

    • Adding a buffer adds only one new candidate

    • Merging branches additive, not multiplicative

  • Optimal for Elmore delay model
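The recap can be made concrete with a small sketch for a 2-pin net. Everything below is an assumption for illustration rather than the presenters' code: wire segments are given as (resistance, capacitance) pairs ordered from sink to source, a buffer may be placed at the upstream end of each segment, wire delay uses the Elmore form r(c/2 + load), and buffer delay is modeled as intrinsic delay plus output resistance times load.

```python
def prune(cands):
    """Drop candidates dominated in (cap, slack); each candidate is (cap, slack, buffers)."""
    kept = []
    for cand in sorted(cands, key=lambda k: (k[0], -k[1])):
        if not kept or cand[1] > kept[-1][1]:
            kept.append(cand)
    return kept


def van_ginneken_two_pin(segments, sink_cap, sink_rat, buf, driver_res):
    """Return (best slack at the source, tuple of buffered positions).

    segments: (r, c) wire pieces ordered from sink to source.
    buf: (input_cap, intrinsic_delay, output_res) of the single buffer type.
    """
    buf_cap, buf_delay, buf_res = buf
    cands = [(sink_cap, sink_rat, ())]
    for pos, (r, c) in enumerate(segments):
        # Propagate every candidate across the wire segment (Elmore delay).
        cands = [(cap + c, q - r * (c / 2 + cap), bufs) for cap, q, bufs in cands]
        # Inserting a buffer here adds exactly one new candidate: the best one.
        cap, q, bufs = max(cands, key=lambda k: k[1] - buf_delay - buf_res * k[0])
        cands.append((buf_cap, q - buf_delay - buf_res * cap, bufs + (pos,)))
        cands = prune(cands)
    # At the driver, charge the driver delay and keep the best candidate.
    cap, q, bufs = max(cands, key=lambda k: k[1] - driver_res * k[0])
    return q - driver_res * cap, bufs


# Illustrative numbers only.
slack, buffers = van_ginneken_two_pin(
    segments=[(2.0, 1.0)] * 6, sink_cap=1.0, sink_rat=100.0,
    buf=(0.5, 1.0, 1.0), driver_res=3.0)
print(slack, buffers)
```

Because each insertion point contributes only one new candidate and pruning keeps the list sorted, the list length stays linear in the number of insertion points, which is where the quadratic bound comes from.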



Optimal Extensions

  • Multiple buffer types

  • Inverters

  • Polarity constraints

  • Controlling buffer resources

  • Capacitance constraints

  • Blockage recognition

  • Wire sizing


Multiple Buffer Types

[Figure: candidates at steps (1) and (2)]

Time complexity increases from $O(n^2)$ to $O(n^2 B^2)$, where B is the number of different buffer types


Inverters

[Figure: candidates at steps (1) and (2)]

  • Maintain a “+” and a “-” list of candidates

  • Only merge branches with same polarity

  • Throw out negative candidates at source


Polarity Constraints

  • Some sinks are positive, some negative

  • Put negative sinks into “-” list

[Figure: branches labeled with “-” and “+” candidate lists]


Controlling Buffering Resources

Before, maintain list of capacitance slack pairs

(C1, q1), (C2, q2), (C3, q3), (C4, q4), (C5, q5), (C6, q6), (C7, q7), (C8, q8), (C9, q9)

Now, store an array of lists, indexed by # of buffers

3: (C1, q1, 3), (C2, q2, 3), (C3, q3, 3)

2: (C4, q4, 2), (C5, q5, 2)

1: (C6, q6, 1), (C7, q7, 1), (C8, q8, 1)

0: (C9, q9, 0)

Prune candidates with inferior cap, slack, and #buffers
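One way to read this three-way pruning rule is sketched below. The names `dominates` and `prune_with_buffer_count` are hypothetical, and the quadratic pairwise check is chosen for clarity, not speed. A candidate (cap, slack, #buffers) is dropped when some other candidate is at least as good in all three fields. Survivors are then grouped into lists indexed by buffer count.

```python
def dominates(a, b):
    """True if a is at least as good as b in cap, slack, and buffer count."""
    return a != b and a[0] <= b[0] and a[1] >= b[1] and a[2] <= b[2]


def prune_with_buffer_count(candidates):
    """Return {buffer_count: [(cap, slack, buffer_count), ...]} of survivors."""
    unique = set(candidates)
    kept = [c for c in unique if not any(dominates(o, c) for o in unique)]
    by_count = {}
    for cap, slack, count in sorted(kept):
        by_count.setdefault(count, []).append((cap, slack, count))
    return by_count


# Hypothetical candidates: (12, 45, 1) loses to (10, 50, 1) and is removed.
lists = prune_with_buffer_count(
    [(10, 50, 1), (10, 40, 0), (12, 45, 1), (8, 30, 2), (9, 60, 2)])
print(lists)
```

Note that (10, 40, 0) survives even though (10, 50, 1) has better slack, because it uses fewer buffers.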


Buffering Resource Trade-off


Capacitance Constraints

  • Each gate g drives at most C(g) capacitance

  • When inserting buffer g, check downstream capacitance.

  • If bigger than C(g), throw out candidate

Total cap = 500 fF


Blockage Recognition

Delete insertion points that run over blockages


Other Extensions

  • Simultaneous driver sizing

  • Modeling effective capacitance

  • Higher-order interconnect delay

  • Slew constraints

  • Noise constraints


Driver Sizing


Driver Sizing

  • Driver behaves like buffer

  • Pick driver with the best slack

  • Implications upstream in timing graph

  • Delay penalty for large input capacitance


π-Models

  • Van Ginneken candidate: (Cap, slack)

  • Replace Cap with π-model (Cn, R, Cf)

  • Total capacitance preserved: Cn + Cf = C

  • R represents degree of resistive shielding

[Figure: lumped capacitance C replaced by a π-model with Cn, R, and Cf]


Computing Gate Delay

  • When inserting buffer, compute effective capacitance (Ceff) from π-model

  • Use effective instead of lumped capacitance in gate delay equation

  • Optimality no longer guaranteed


Higher-order Interconnect Delay

  • Moment matching with first 3 moments

  • Previously: candidate (π-model, slack)

  • Now: candidate (π-model, m1, m2, m3)

  • Given moments, compute slack on the fly

  • Bottom-up, efficient moment computation

  • Problem: guess slew rate


Slew Constraints

  • When inserting buffer, compute slews to gates driven by buffer

  • If slew exceeds target, prune candidate

  • Difficulty: unknown gate input slew

[Figure: slews of 300 ps and 350 ps, with an unknown (“?”) slew]


Noise Constraints

  • Each gate has acceptable noise threshold

  • Compute cumulative noise for each wire via Devgan noise metric

  • Throw out candidates that violate noise

  • Not in production code


Extensions Recap

  • Multiple buffer types, including inverters

  • Polarity constraints

  • Controlling buffer resources

  • Slew, capacitance, and noise constraints

  • Blockage recognition

  • Driver sizing

  • Higher-order delay modeling

  • Wire sizing



Tour of Italy Problem

  • Van Ginneken uses fixed Steiner route

  • Need timing-driven Steiner trees


Timing-Driven Steiner Approaches

  • BRBC

  • Prim-Dijkstra

  • P-Tree

  • A-Tree (RSA)

  • SERT

  • MVERT


Rectilinear Steiner Arborescence

  • Assume all sinks in first quadrant

  • Iteratively

    • Find sink pair p and q maximizing min(xp, xq) + min(yp, yq)

    • Remove p and q from consideration

    • Replace with r = (min(xp, xq), min(yp, yq))

    • Connect p and q to r
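The iteration above can be sketched in Python as follows. This is a plain O(n³) illustration, not a tuned implementation. It assumes the source sits at the origin, and `rsa_heuristic` is a hypothetical name. Each returned edge is a (child, parent) pair, and every child is reached from its parent by moving only right and up, so all source-to-sink paths are shortest paths.

```python
def rsa_heuristic(sinks):
    """Greedy rectilinear Steiner arborescence for first-quadrant sinks.

    Returns a list of (child, parent) edges; the source is assumed at (0, 0).
    """
    active = sorted(set(sinks))
    edges = []
    while len(active) > 1:
        best = None
        for a in range(len(active)):
            for b in range(a + 1, len(active)):
                p, q = active[a], active[b]
                score = min(p[0], q[0]) + min(p[1], q[1])
                if best is None or score > best[0]:
                    best = (score, p, q)
        _, p, q = best
        r = (min(p[0], q[0]), min(p[1], q[1]))  # merge point of p and q
        active.remove(p)
        active.remove(q)
        for node in (p, q):
            if node != r:                        # r may coincide with p or q
                edges.append((node, r))
        if r not in active:
            active.append(r)
    if active and active[0] != (0, 0):
        edges.append((active[0], (0, 0)))        # connect the last node to the source
    return edges


# Hypothetical sink locations.
tree = rsa_heuristic([(1, 3), (3, 1), (3, 3)])
print(tree)
```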


RSA Example

[Figure: sinks labeled 1 to 6]

RSA Diagonal Line Sweep

[Figure: sinks labeled 1 to 6 along the diagonal sweep]


Prim-Dijkstra Algorithm

[Figure panels: Prim’s MST, Dijkstra’s SPT, Trade-off]


Prim’s and Dijkstra’s Algorithms

  • d(i,j): length of the edge (i, j)

  • p(i): length of the path from source to i

  • Prim: d(i,j) Dijkstra: p(i) + d(i,j)


The Prim-Dijkstra Trade-off

  • Prim: add edge minimizing d(i,j)

  • Dijkstra: add edge minimizing p(i) + d(i,j)

  • Trade-off: c · p(i) + d(i,j) for 0 <= c <= 1

  • When c=0, trade-off = Prim

  • When c=1, trade-off = Dijkstra
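A compact sketch of the trade-off, assuming Manhattan distances between pins and taking `points[0]` as the source. The function name and the sample pins are hypothetical. At each step it adds the edge (i, j), with i in the tree and j outside, that minimizes c · p(i) + d(i,j).

```python
def prim_dijkstra(points, c):
    """Build a spanning tree rooted at points[0] using the Prim-Dijkstra trade-off.

    Returns (parent, path) where parent[j] is j's tree parent and
    path[j] is the source-to-j path length in the tree.
    """
    def dist(i, j):
        return abs(points[i][0] - points[j][0]) + abs(points[i][1] - points[j][1])

    n = len(points)
    parent = [None] * n
    path = {0: 0}                      # p(i) for nodes already in the tree
    while len(path) < n:
        _, i, j = min((c * path[i] + dist(i, j), i, j)
                      for i in path for j in range(n) if j not in path)
        parent[j] = i
        path[j] = path[i] + dist(i, j)
    return parent, path


pins = [(0, 0), (2, 2), (4, 1)]        # hypothetical source and two sinks
print(prim_dijkstra(pins, 0))          # Prim: pin 2 hangs off pin 1, path length 7
print(prim_dijkstra(pins, 1))          # Dijkstra: pin 2 connects to the source, path length 5
```

With c = 0 the tree has less wire (7 versus 9) but a longer path to the far pin, which is the trade-off the slide describes.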


Skinny on RSA/Prim-Dijkstra

  • Fast, easy to implement

  • Converting spanning to Steiner tree easy

  • Ignores sink criticality

  • No natural decoupling opportunities

  • Polarity constraints problem


Polarity Problem

[Figure: sinks marked with “+” and “-” polarities]

A Better Solution?

[Figure: the same “+” and “-” sinks with an alternative tree]


Buffer Aware Trees

[Figure: trees (1), (2), (3)]


C-Tree Algorithm

  • Cluster sinks by

    • Polarity

    • Manhattan distance

    • Criticality

  • Two-level tree

    • Form tree for each cluster

    • Form top-level tree


C-Tree Example


Clustering Distance Metric

  • pDist(i,j) = | polarity(i) – polarity(j)|

  • sDist(i,j) = (|xi – xj| + |yi – yj|)/diam

  • tDist(i,j) scaled between 0 and 1, 0 for equal criticalities, 1 for opposite criticalities

  • Final distance metric d(i,j) = pDist(i,j) + b · sDist(i,j) + (1-b) · tDist(i,j)
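The metric can be sketched as below. Several details are assumptions, since the slide does not pin them down: polarity is encoded as 0 or 1, criticality is a number already scaled to [0, 1], and tDist is taken as the absolute difference of criticalities. The `Sink` class and `cluster_distance` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Sink:
    x: float
    y: float
    polarity: int        # 0 or 1 (assumed encoding)
    criticality: float   # assumed to be pre-scaled to [0, 1]


def cluster_distance(s, t, diam, b):
    """d = pDist + b * sDist + (1 - b) * tDist for sinks s and t."""
    p_dist = abs(s.polarity - t.polarity)
    s_dist = (abs(s.x - t.x) + abs(s.y - t.y)) / diam
    t_dist = abs(s.criticality - t.criticality)   # assumed form of tDist
    return p_dist + b * s_dist + (1 - b) * t_dist


# Hypothetical sinks: opposite polarity, opposite criticality, 7 units apart.
u = Sink(0, 0, 0, 1.0)
v = Sink(3, 4, 1, 0.0)
print(cluster_distance(u, v, diam=14, b=0.5))  # 1.75
```

Because pDist is not weighted, a polarity mismatch always costs a full unit, which pushes opposite-polarity sinks into different clusters.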


Clustering – Finding Centers

[Figure: point R and points labeled 1 to 4]

Clustering – Group to Centers

[Figure: point R and points labeled 1 to 4]


Net n8702


Flat

C-Tree


C-Tree 4 clusters


C-Tree 2 clusters



Don’t Avoid All Blockages!


Optimal Buffered Maze Routing


Optimal Buffered Maze Routing

  • Optimal for 2-pin nets

  • Quadratic runtime in number of grid nodes

  • Generates candidate list at each node

  • Can prune candidates at the same node just like in Van Ginneken’s algorithm


Buffer Bays


Re-Routing Into Buffer Bays


BBB Algorithm

  • Start with good routing tree

  • Iteratively:

    • Delete a high cost 2-path (each internal node of the path has degree 2)

    • Connect two sub-trees via maze routing

    • Cost function: (blocked wire) + a · (unblocked wire)


Blockage Avoidance Example

[Figure sequence (three slides): routing tree with 2-path1, 2-path2, and 2-path3]



Continuous Wire Sizing

Optimal shape: $f(x) = a e^{-bx}$


Two Types of Wire Sizing

Wire Tapering (TWS)

Uniform Wire Sizing (UWS)


Basic Wire Sizing Algorithm

  • Tree edges fixed, can be sized to any width

  • Monotone property: ancestor edges cannot be narrower than downstream edges


Optimal Wire Sizing (OWS)

  • Maximum width solution

    • Apply greedy iterative improvement

    • Upper bound

  • Minimum width solution

    • Apply greedy iterative improvement

    • Lower bound

  • Enumerate possibilities between lower and upper (potentially exponential)


TWS versus UWS

TWS

UWS


Why Uniform Wire Sizing?

  • Empirically, UWS almost as good as TWS

  • Tapering info hard to give to router

  • Better congestion and space management

  • Extraction, detailed routing, verification?

  • Estimated Steiner anyway

  • Can do it simultaneously with buffering


Wire Codes

  • 4-tuple (H-layer, H-width, V-layer, V-width)

  • Example (M3, 0.9, M4, 2.7)

  • Wire sizing problem: pick best wire code

[Figure: horizontal segments labeled (M3, 0.9), vertical segments labeled (M4, 2.7)]


Including Buffers

  • K buffers induce K+1 different nets

  • Each net can have its own wire code

[Figure: buffered tree split into net1, net2, net3, and net4]


Wire Code Selection

  • Given W wire codes

  • Generate W copies of each candidate, each labeled with a wire code

  • Merge candidates with matching wire codes

  • When inserting buffer, generate W candidates, instead of 1

  • Complexity increases linearly with W



What is the Problem?

  • DSM timing closure

    • Squeeze buffers into tight spaces

    • Alleviate hot spots, local wire congestion

    • Getting worse

  • Handle wire congestion, buffering resources early

  • Acknowledge these constraints when floorplanning


Which Floorplan Is Better?

  • Timing analysis worthless

  • Interconnect synthesis, electrical correction, routing, extraction

  • Days to find answer


Buffer Explosion

[Figure: past versus present]

  • Number of buffers triples each generation

  • 800K buffers in 0.05 micron technology


Buffer Block Planning

  • Create blocks between macros just for holding buffers

  • Adjust floorplan accordingly

  • Computing size/#/location of blocks

    • Analyze 2-pin nets

    • Find feasible regions

    • Assign buffers with smallest region

    • Combine buffers into blocks


Feasible Regions

feasible region


Buffer Block Planning Trade-offs

  • Goods

    • Buffer locations flexible

    • Global view, buffers most difficult ones first

  • Bads

    • Wire congestion around blocks

    • Don’t have timing information

    • Some nets still cannot be buffered/routed


A Net Which Cannot Be Buffered


“Buffer Site”

  • Dummy cell that holds a buffer

  • Not connected to any net

  • Becomes buffer when assigned to a net

  • Extra sites → decoupling caps, ECO


Early Buffering Observations

  • Exact buffer location unimportant

    • Freely sprinkle buffer sites

    • Allocate percentage within macros

    • Enough altogether

  • Timing constraints unavailable

    • Macro designs incomplete

    • No interconnect synthesis

    • Length-based constraint


Buffer Sites in a Tile Graph

Model buffer sites directly by constraining the number of buffers that can be inserted into a tile

[Figure: tile graph with each tile labeled by its number of buffer sites, ranging from 0 to 5]


Length Based Constraint

L = 3 tile units
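A small sketch of how a length-based rule might be checked on a 2-pin route. It assumes the rule means that no unbuffered stretch of wire may exceed L tile units, and it ignores how many buffer sites each tile actually offers. Both function names are hypothetical.

```python
def satisfies_length_constraint(path_length, buffer_positions, limit):
    """True if every unbuffered stretch between source, buffers, and sink is <= limit.

    path_length: source-to-sink distance in tile units.
    buffer_positions: distances from the source at which buffers sit.
    """
    stops = [0] + sorted(buffer_positions) + [path_length]
    return all(b - a <= limit for a, b in zip(stops, stops[1:]))


def min_buffers_needed(path_length, limit):
    """Fewest buffers for a 2-pin route, assuming a site is free wherever needed."""
    return max(0, -(-path_length // limit) - 1)   # ceil(path_length / limit) - 1


print(satisfies_length_constraint(10, [3, 6, 9], limit=3))  # True
print(satisfies_length_constraint(10, [3, 7], limit=3))     # False: a stretch of 4 tiles
print(min_buffers_needed(10, limit=3))                      # 3
```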


Problem Formulation

  • Satisfy constraints of

    • Length

    • Number of buffer sites

    • Wire tracks

  • Minimize

    • Inserted buffers

    • Wire congestion

  • Return buffer/wire locations


RABID Approach

RABID: Resource Allocation for Buffer and Interconnect Distribution

  • Initial Steiner tree construction

  • Wire congestion reduction

  • Buffer assignment

  • Final post-processing


1. Steiner Tree Heuristic


2. Wire Congestion Reduction


3. Buffer Assignment


4. Final Post-Processing


Wrap-Up

  • Interconnect synthesis increasingly critical

  • DP most powerful/flexible approach

  • Must also have

    • Right Steiner tree

    • Blockage awareness and avoidance

    • Planning

  • Wire size carefully and conservatively


DP Buffer Insertion References

  • Buffer placement in distributed RC-tree networks for minimal Elmore delay van Ginneken, L.P.P.P. Circuits and Systems, 1990., IEEE International Symposium on , 1990 Page(s): 865 -868 vol.2

  • Optimal wire sizing and buffer insertion for low power and a generalized delay model Lillis, J.; Chung-Kuan Cheng; Lin, T.-T.Y. Solid-State Circuits, IEEE Journal of , Volume: 31 Issue: 3 , March 1996 Page(s): 437 –447

  • Buffer insertion for noise and delay optimization Alpert, C.J.; Devgan, A.; Quay, S.T. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 18 Issue: 11 , Nov. 1999 Page(s): 1633 -1645

  • Buffer insertion with accurate gate and interconnect delay computation Alpert, C.J.; Devgan, A.; Quay, S.T. Design Automation Conference, 1999. Proceedings. 36th , 1999 Page(s): 479 –484

  • Wire Segmenting For Improved Buffer Insertion Alpert, C.; Devgan, A. Design Automation Conference, 1997. Proceedings of the 34th Page(s): 588 –593

  • Simultaneous routing and buffer insertion for high performance interconnect Lillis, J.; Chung-Kuan Cheng; Ting-Ting Y. Lin VLSI, 1996. Proceedings., Sixth Great Lakes Symposium on , 1996 Page(s): 148 -153


Steiner Tree References

  • Prim-Dijkstra tradeoffs for improved performance-driven routing tree design Alpert, C.J.; Hu, T.C.; Huang, J.H.; Kahng, A.B.; Karger, D. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 14 Issue: 7 , July 1995 Page(s): 890 -896

  • Buffered Steiner trees for difficult instances Alpert, C.J.; Hrkic, M.; Hu, J.; Kahng, A.B.; Lillis, J.; Liu, B.; Quay, S.T.; Sapatnekar, S.S.; Sullivan, A.J.; Villarrubia, P.; International Symposium on Physical Design, April 2001 Page(s): 4-9

  • Simultaneous routing and buffer insertion for high performance interconnect Lillis, J.; Chung-Kuan Cheng; Ting-Ting Y. Lin VLSI, 1996. Proceedings., Sixth Great Lakes Symposium on , 1996 Page(s): 148 –153

  • Efficient algorithms for the minimum shortest path Steiner arborescence problem with applications to VLSI physical design Cong, J.; Kahng, A.B.; Kwok-Shing Leung Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 17 Issue: 1 , Jan. 1998 Page(s): 24 -39

  • On optimal interconnections for VLSI Kahng, A.B.; Robins, G.; Kluwer Academic Publishers, Norwell, MA. 1995

  • An interconnect topology optimization by a tree transformation Tsujii, N.; Baba, K.; Tsukiyama, S. Design Automation Conference, 2000. Proceedings of the ASP-DAC 2000. Asia and South Pacific , 2000 Page(s): 93 -98

  • Non-Hanan routing Hou, H.; Hu, J.; Sapatnekar, S.S. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 18 Issue: 4 , April 1999 Page(s): 436-444.


Blockage Avoidance References

  • Steiner tree optimization for buffers, blockages, and bays Alpert, C.J.; Gandham, G.; Jiang Hu; Neves, J.I.; Quay, S.T.; Sapatnekar, S.S. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 20 Issue: 4 , April 2001 Page(s): 556 –562.

  • A fast algorithm for context-aware buffer insertion Jagannathan, A.; Sung-Woo Hur; Lillis, J. Design Automation Conference, 2000. Proceedings 2000 Page(s): 368 –373.

  • Simultaneous routing and buffer insertion with restrictions on buffer locations Hai Zhou; Wong, D.F.; I-Min Liu; Aziz, A. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 19 Issue: 7 , July 2000 Page(s): 819 -824

  • Maze routing with buffer insertion and wiresizing Minghorng Lai; Wong, D.F. Design Automation Conference, 2000. Proceedings 2000 Page(s): 374 -378

  • Routing tree construction under fixed buffer locations Cong, J.; Xin Yuan Design Automation Conference, 2000. Proceedings 2000 Page(s): 379 -384


Wire Sizing References

  • Optimal Wire-sizing Function With Fringing Capacitance Consideration Chung-Ping Chen; Wong, D.F. Design Automation Conference, 1997. Proceedings of the 34th Page(s): 604 -607.

  • Optimal non-uniform wire-sizing under the Elmore delay model Chung-Ping Chen; Hai Zhou; Wong, D.F. Computer-Aided Design, 1996. ICCAD-96. Digest of Technical Papers., 1996 IEEE/ACM International Conference on, 1996 Page(s): 38 –43.

  • Shaping a VLSI wire to minimize Elmore delay Fishburn, J.P. European Design and Test Conference, 1997. ED&TC 97. Proceedings , 1997, Page(s): 244 –251.

  • Interconnect synthesis without wire tapering Alpert, C.J.; Devgan, A.; Fishburn, J.P.; Quay, S.T. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 20 Issue: 1 , Jan. 2001 Page(s): 90 -104.

  • Interconnect estimation and planning for deep submicron designs Cong, J.; Pan, D.Z. Design Automation Conference, 1999. Proceedings. 36th , 1999 Page(s): 507 -510.

  • Optimal wiresizing under the distributed Elmore delay model Cong, J.; Leung, K.-S. Computer-Aided Design, 1993. ICCAD-93. Digest of Technical Papers., 1993 IEEE/ACM International Conference on , 1993 Page(s): 634 -639.

  • Interconnect sizing and spacing with consideration of coupling capacitance Cong, J.; Lei He; Cheng-Kok Koh; Zhigang Pan Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on , Volume: 20 Issue: 9 , Sep 2001 Page(s): 1164 -1169


Interconnect Planning References

  • A practical methodology for early buffer and wire resource allocation Alpert, C.J.; Jiang Hu; Sapatnekar, S.S.; Villarrubia, P.G. Design Automation Conference, 2001. Proceedings , 2001 Page(s): 189 –194

  • An interconnect-centric design flow for nanometer technologies Cong, J. Proceedings of the IEEE , Volume: 89 Issue: 4 , April 2001 Page(s): 505 -528

  • Buffer block planning for interconnect-driven floorplanning Cong, J.; Tianming Kong; Pan, D.Z. Computer-Aided Design, 1999. Digest of Technical Papers. 1999 IEEE/ACM International Conference on , 1999 Page(s): 358 –363

  • Provably good global buffering using an available buffer block plan Dragan, F.F.; Kahng, A.B.; Mandoiu, I.; Muddu, S.; Zelikovsky, A. Computer Aided Design, 2000. ICCAD-2000. IEEE/ACM International Conference on , 2000 Page(s): 104 -109

  • Provably good global buffering by multiterminal multicommodity flow approximation Dragan, F.F.; Kahng, A.B.; Mandoiu, I.; Muddu, S.; Zelikovsky, A. Design Automation Conference, 2001. Proceedings of the ASP-DAC 2001. Asia and South Pacific , 2001 Page(s): 120 –125

  • Planning buffer locations by network flows Tang, X.; Wong, D.F.; International Symposium on Physical Design, April 2001 Page(s): 180-185

  • Routability-Driven Repeater Block Planning for Interconnect-Centric Floorplanning Sarkar, P.; Sundararaman, V.; Koh, C.-K.; International Symposium on Physical Design, April 2001 Page(s): 186-191

