Development and Application of Tree Synthesis Algorithms

John Lillis, University of Illinois Chicago

Overview • Part I: Buffer tree synthesis • Formulations • S/P/SP-tree • Part II: Fanin tree embedding/replication • Optimization across gate boundaries • Interaction with placement

Part I: Buffer Tree Synthesis

Premises of Work • Conservation of Resources Crucial • Estimate: 700-800K Buffers/Chip in Near Future • Cost-Performance Tradeoffs • General Cost Model • Topology / Embedding / Buffering Spaces Should be Explored Simultaneously • 2-Phase Approach Not Robust / Predictable • Particularly Troublesome in Presence of Blockages • MAIN PREMISE: Powerful Buffer Tree Synthesis is a Core for Modern Design

Max Slack Weakness • Overoptimized subtrees [Figure: plot with axes Slack and Cost]

Problem Formulation • Given: • Location of Driver and Sinks • Technology Parameters • Timing Requirements • Buffer Library • Target Routing Graph (Blockages) • Find: • Topology in corresponding space • its Embedding • and Buffer Assignment • Minimizing Cost • s.t. Timing Constraints

Philosophy of Constraint Imposition [Figure: full space vs. constrained space] • Goals: • Predictable Behavior • Absence of ad-hoc Heuristics • Main Idea: • Optimally Solve Constrained Variant of the Problem • Well-Designed Constraints Produce • Large Flexible Solution Space • Tractability • Constraints: Topology Space

Topology Embedding Flexibility [Figure: several trees over source s and sinks a, b, c]

Target Routing Graph Construction [Figure: routing graph with a routing blockage and a buffer blockage; source s and sinks a, b, c]

Algorithmic Description • Timing-Driven Maze Routing • Topology Embedding • S-Tree • P-Tree • SP-Tree

Core Subroutine: Timing-Driven Maze Routing • Generalization of [Hur et al.; TCAD Feb 2000] • Single Target, Multiple Sources • Finds non-dominated paths • Simultaneous Buffer Insertion • Handling of Blockages in Topology Synthesis [Figure: one target and multiple sources]
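The "single target, multiple sources, non-dominated paths" idea can be illustrated with a small label-setting search. This is only a sketch under simplifying assumptions that are not from the slides: each edge carries an additive (cost, delay) pair, buffers and blockages are ignored, and the names `pareto_paths` and `dominated` are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Label = Tuple[float, float]  # (cost, delay); both are minimized
Graph = Dict[str, List[Tuple[str, float, float]]]  # u -> [(v, edge_cost, edge_delay)]

def dominated(label: Label, front: List[Label]) -> bool:
    """True if some label in front is no worse in both cost and delay."""
    return any(c <= label[0] and d <= label[1] for c, d in front)

def pareto_paths(graph: Graph, sources: List[str], target: str) -> List[Label]:
    """Return the non-dominated (cost, delay) labels of paths from any source to target.

    Assumes non-negative edge costs and delays (hypothetical simplified model).
    """
    fronts: Dict[str, List[Label]] = {}
    heap = [(0.0, 0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        cost, delay, u = heapq.heappop(heap)
        front = fronts.setdefault(u, [])
        if dominated((cost, delay), front):
            continue  # a cheaper-or-equal and faster-or-equal label already reached u
        front.append((cost, delay))
        for v, edge_cost, edge_delay in graph.get(u, []):
            heapq.heappush(heap, (cost + edge_cost, delay + edge_delay, v))
    return sorted(fronts.get(target, []))

example = {
    "s1": [("a", 1, 4)],
    "s2": [("a", 2, 1)],
    "a": [("t", 1, 1), ("t", 3, 0)],
}
front_at_target = pareto_paths(example, ["s1", "s2"], "t")
```

Each vertex keeps a list of labels rather than a single best distance, which is what lets a cost/delay tradeoff curve survive to the target.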

Topology Embedding • Goal: Obtain timing feasible embedding / buffering of given topology, minimizing cost • Solution: Dynamic Programming (bottom-up)

Solution sets A(u,v) • A(u,v) represents a set of solutions that correspond to • Vertex u in Topology • Vertex v in Target Graph • A1b = Join(A1.left, A1.right) • A1 = GenDijkstra(A1b)
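A minimal sketch of what a Join step could look like follows. The slides do not give the solution signature for buffer trees, so the (cost, capacitance, required time) triple used here is an assumption borrowed from common buffer-insertion formulations, and `Sol`, `prune`, and `join` are hypothetical names.

```python
from itertools import product
from typing import List, NamedTuple

class Sol(NamedTuple):
    cost: float  # resources used below this vertex (assumed signature)
    cap: float   # downstream capacitance seen at this vertex
    req: float   # required arrival time at this vertex (larger is better)

def dominates(a: Sol, b: Sol) -> bool:
    """a is at least as good as b in all three coordinates."""
    return a.cost <= b.cost and a.cap <= b.cap and a.req >= b.req

def prune(sols: List[Sol]) -> List[Sol]:
    """Drop duplicates and dominated solutions, keeping input order."""
    uniq = list(dict.fromkeys(sols))
    return [s for s in uniq if not any(dominates(o, s) for o in uniq if o != s)]

def join(left: List[Sol], right: List[Sol]) -> List[Sol]:
    """Combine two child solution sets that meet at the same graph vertex."""
    merged = [
        Sol(l.cost + r.cost, l.cap + r.cap, min(l.req, r.req))
        for l, r in product(left, right)
    ]
    return prune(merged)

left = [Sol(1, 2, 10), Sol(2, 1, 9)]
right = [Sol(1, 1, 9), Sol(3, 1, 12)]
joined = join(left, right)
```

Here the pair yielding (5, 2, 9) is discarded because (3, 2, 9) is cheaper with the same load and required time; the other three combinations are mutually non-dominated.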

S-Tree • Notion of localities: • Spatial • Temporal • Polarity • Partition sinks into 2 sets based on: • estimated timing criticality • signal polarity requirements • some other criteria... • Subtrees can break topology and “stitch” at a different place

S-Tree Topology Space [Figure: topologies over source s and sinks a, b, c, d] • Sink partition: {a,c,d} {b}

S-Tree Recurrence • A1b = Join(A1.left, A1.right) • A1 = GenDijkstra(A1b) • A2b = Join(A2.left, A2.right) • A2 = GenDijkstra(A2b) • A12b = Join(A12.left, A12.right) + Join(A1, A2) • A12 = GenDijkstra(A12b)

S-Tree Topology Space [Figure: several topologies over source s and sinks a, b, c, d, e, f, including the initial topology]

Incorporating polarity • 4 sets: • critical & positive signal polarity • critical & negative • non-critical & positive • non-critical & negative • Other partitioning schemes...
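The four-way partition can be sketched as a simple grouping step. How criticality is estimated is not specified on the slide; the required-time threshold below, and the names `Sink` and `partition_sinks`, are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple

class Sink(NamedTuple):
    name: str
    required_time: float  # smaller means more timing-critical (assumed measure)
    inverted: bool        # True if the sink needs negative signal polarity

def partition_sinks(
    sinks: List[Sink], critical_threshold: float
) -> Dict[Tuple[bool, bool], List[str]]:
    """Group sink names by (is_critical, needs_negative_polarity)."""
    groups: Dict[Tuple[bool, bool], List[str]] = defaultdict(list)
    for s in sinks:
        critical = s.required_time <= critical_threshold
        groups[(critical, s.inverted)].append(s.name)
    return dict(groups)

sinks = [
    Sink("a", 1.0, False),
    Sink("b", 5.0, False),
    Sink("c", 1.5, True),
    Sink("d", 6.0, True),
    Sink("e", 0.5, False),
]
groups = partition_sinks(sinks, critical_threshold=2.0)
```

Any other partitioning scheme only changes the key computed inside the loop.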

P-Tree Topology Space • All Permutation-Constrained Topologies [Figure: example topologies over source s and sinks a, b, c, d, e]
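To get a feel for the size of this space, the sketch below enumerates binary topologies whose leaves, read left to right, follow a given sink order. Reading "permutation-constrained" this way is an assumption, and brute-force enumeration is used only for illustration; it is not presented here as the P-Tree algorithm itself. The name `topologies` is hypothetical.

```python
from typing import List, Tuple, Union

Topo = Union[str, Tuple["Topo", "Topo"]]  # a leaf name or a (left, right) pair

def topologies(order: Tuple[str, ...]) -> List[Topo]:
    """All binary trees whose leaves, left to right, equal the given order."""
    if len(order) == 1:
        return [order[0]]
    result: List[Topo] = []
    for split in range(1, len(order)):
        # Every topology splits the order into a contiguous prefix and suffix.
        for left in topologies(order[:split]):
            for right in topologies(order[split:]):
                result.append((left, right))
    return result

three = topologies(("a", "b", "c"))
five = topologies(("a", "b", "c", "d", "e"))
```

For three sinks this yields `('a', ('b', 'c'))` and `(('a', 'b'), 'c')`; the count grows as the Catalan numbers, which is why contiguous sub-ranges of the order are natural subproblems for dynamic programming.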

Limitations of P-Tree Space • Isolation of Critical / Non-Critical Subtrees: “Temporal-Locality” • Min WL May Not Produce Min Cost [Figure: two drawings of a driver with critical and non-critical sinks]

SP-Tree • Combine everything said so far... • From P-Tree • Spatial locality • Robustness • From S-Tree • Temporal locality • Polarity locality • Ability to fix “topology problems” by “stitching”

Solution Space [Figure: relationship among the entire space, SP-Tree, S-Tree, P-Tree, and fixed-topology spaces]

Experiments • Randomly generated nets • Non-uniform required arrival time • Non-uniform sink input capacitance • Buffer-biased cost • Interested in: • Min cost feasible solution • Max slack solution for verification • Runtime • More details in the paper...

Algorithms for Experiments • S-Tree • P-Tree • SP-Tree • RMP [Cong, Yuan; DAC 2000] • RMP-Quick [Cong, Yuan; DAC 2000]

Results: Net2-06 [Table columns: min cost feasible, max slack, # buffers]

SP-Tree vs. P-Tree

Conclusions • Key Concepts: • General Cost Models • Routing Congestion • Buffer Congestion • Orthogonal Separation of Spatial and Temporal Locality • Polarity Requirements • Routing and Buffer Blockages • Targets: • Small-to-Medium Sized Signal Nets • Results Summary • Highly Cost-Efficient, High Performance Solutions • Substantially Outperforms Prior Approaches in Solution Quality and Runtime

Part II: Fanin Tree Embedding/Replication

Replication Overview • Hrkic, Lillis, Beraudo (DAC04, IWLS04) • Concept: Netlist structure limits potential of timing-driven placement • Difficult for top-down synthesis to fix • Main issue: inherently non-monotone paths • Approach (Hrkic, Lillis; DAC04) touches on placement, synthesis (netlist perturbation) and routing.

Logic Replication • Duplicate logic cell • Preserve functionality • Improve timing • Place / Move cells • Adjust connections [Figure: cells A to E before and after replication, with replica CR of C]

Early Work • Use replication to straighten I/O paths • Local monotonicity [Beraudo, Lillis, DAC 2003] • Sequence of 3 cells on the path • Incremental framework [Figure: cells A to E before and after replication, with replica CR of C]

Limitations of Local Monotonicity • Local Monotonicity satisfied • Still many non-monotone paths [Figure: placement of cells A to F]

Replication Tree Approach [Hrkic et al., DAC04] • Identify critical sink • Extract critical fan-in tree (Replication Tree) • Optimize fan-in tree (Fan-in Tree Embedding) • Legalize placement

Slowest Paths Tree • Focus on slowest paths • Find slowest paths tree from critical sink • Include paths within epsilon of current critical delay • Focus on most critical portions of fan-in cone
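One plausible way to collect the paths "within epsilon of current critical delay" is a backward walk from the critical sink that spends a slack budget. The slide does not describe the procedure, so everything below is an assumed reading: arrival times are taken as already computed, and `epsilon_critical_edges` is a hypothetical name.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (driver, receiver)

def epsilon_critical_edges(
    fanin: Dict[str, List[str]],
    delay: Dict[Edge, float],
    arrival: Dict[str, float],
    sink: str,
    eps: float,
) -> Set[Edge]:
    """Edges lying on some path to sink whose delay is within eps of the critical delay."""
    best_budget = {sink: eps}  # largest remaining budget with which each node was reached
    edges: Set[Edge] = set()
    stack = [sink]
    while stack:
        v = stack.pop()
        for u in fanin.get(v, []):
            # How much later than this edge's contribution the signal at v actually arrives.
            loss = arrival[v] - (arrival[u] + delay[(u, v)])
            budget = best_budget[v] - loss
            if budget < 0:
                continue  # every path through this edge is too fast to matter
            edges.add((u, v))
            if budget > best_budget.get(u, -1.0):
                best_budget[u] = budget
                stack.append(u)  # revisit u with the larger budget
    return edges

fanin = {"s": ["g1", "g2"], "g1": ["i1", "i2"], "g2": ["i3"]}
delay = {("g1", "s"): 4, ("g2", "s"): 6, ("i1", "g1"): 5, ("i2", "g1"): 3, ("i3", "g2"): 2}
arrival = {"i1": 0, "i2": 0, "i3": 0, "g1": 5, "g2": 2, "s": 9}
near_critical = epsilon_critical_edges(fanin, delay, arrival, "s", eps=1)
```

With `eps=0` only the single critical path through g1 and i1 remains; with `eps=1` the path through g2 and i3 is added as well.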

Replication Tree [Figure: cells A to F with replicas AR, BR, CR, DR, FR] • Most circuits do not contain large fan-in trees due to reconvergence • Given a critical tree, temporarily replicate the entire tree • Assign connections: • if (u,v) is a tree edge, connect uR to vR • else connect u to vR
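The two connection rules can be written down directly. This sketch only generates the input connections of the replicas; which cells are replicated is passed in explicitly, and the "R" suffix naming and the function name `replica_connections` are assumptions for illustration.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # (driver u, receiver v)

def replica_connections(
    netlist_edges: Iterable[Edge],
    tree_edges: Set[Edge],
    replicated: Set[str],
) -> List[Edge]:
    """Input connections of the temporary replicas, following the slide's two rules."""
    new_edges: List[Edge] = []
    for u, v in netlist_edges:
        if v not in replicated:
            continue  # only replicas receive new input connections here
        if (u, v) in tree_edges and u in replicated:
            new_edges.append((u + "R", v + "R"))  # tree edge: replica drives replica
        else:
            new_edges.append((u, v + "R"))        # non-tree edge: original drives replica
    return new_edges

netlist = [("A", "C"), ("B", "C"), ("D", "C"), ("C", "S")]
tree = {("A", "C"), ("B", "C")}
connections = replica_connections(netlist, tree, replicated={"A", "B", "C"})
```

In this example D is outside the tree, so the original D feeds the replica CR, while AR and BR feed CR along the tree edges.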

Placement cost [Figure: cells A to F with replicas AR, BR, CR, DR, FR] • Replication is temporary • Placement cost is crucial • Cost discount for placing cell over its logical equivalent • low cost for placing DR over D • actual replication will never occur • multiple low-cost locations possible

Fan-in Tree Embedding • Given: • Fan-in tree • Placement of sink and inputs • Arrival times at inputs • Placement and routing graph • Find: • Placement of internal tree nodes (Gates) • Minimizing Cost • s.t. Timing Constraints • cost / delay tradeoff

Fan-in Tree Embedding Example [Figure: two embeddings of gates A, B, C feeding a sink, one with higher delay and lower cost, the other with lower delay and higher cost]

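Fan-out and Fan-in Tree • Fan-out tree (rooted at source): Bottom-up • Fan-in tree (rooted at sink): Top-down [Figure: gates A, B, C in a fan-out tree and in a fan-in tree]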

Fan-in Tree Embedding • Adaptation of S-Tree algorithm [Hrkic, Lillis, DAC 2002] • Keep: • Graph Model for Embedding Target • Modified Timing-Driven Maze Routing • multiple sources, multiple targets • at each vertex keep a list of non-dominated solutions • [Hur, Lillis; IEEE TCAD 2000] • Modify: • Top-down vs. Bottom-up • Solution signature (c,t): • c - cost • t - signal arrival time • Gate placement cost p(x,y)

Fan-in Tree Embedding • Non-binary tree: multiple gate inputs • Top-Down Dynamic Programming • Maze Routing to populate solutions • deferred backtracking • Join Solutions • c = p(x,y) + c1 + ... + cn • t = MAX(t1, ..., tn) • Bottom-Up solution extraction • backtrack to extract maze route • extract gate placement [Figure: modified maze routing and join steps]
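The join formulas for a gate with n inputs can be sketched as follows. The sketch applies c = p(x,y) + c1 + ... + cn and t = MAX(t1, ..., tn) to every combination of input solutions at one candidate location and then keeps the non-dominated (c, t) pairs; exhaustive combination and the name `join_gate` are assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple

Sig = Tuple[float, float]  # (c, t): cost and signal arrival time, both minimized

def join_gate(place_cost: float, input_fronts: Sequence[List[Sig]]) -> List[Sig]:
    """Join one solution per gate input at a location with placement cost p(x,y)."""
    combos = []
    for choice in product(*input_fronts):
        c = place_cost + sum(ci for ci, _ in choice)  # c = p(x,y) + c1 + ... + cn
        t = max(ti for _, ti in choice)               # t = MAX(t1, ..., tn)
        combos.append((c, t))
    kept: List[Sig] = []
    best_t = float("inf")
    for c, t in sorted(set(combos)):
        # Sorted by cost, a signature survives only if it arrives strictly earlier.
        if t < best_t:
            kept.append((c, t))
            best_t = t
    return kept

front_in1 = [(1, 5), (3, 2)]
front_in2 = [(2, 4), (4, 1)]
gate_front = join_gate(2, [front_in1, front_in2])
```

Of the four combinations, (7, 5) is dropped because (7, 4) has the same cost and an earlier arrival time.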

Aside: Legalization • Use Modified Gain-Graph approach [Hur, Lillis; ICCAD00] • Modified to incorporate timing information

Optimization Flow • Identify critical sink (static timing analysis) • Extract Fan-in Tree • Replication Tree • epsilon-Slowest Paths Tree • Embed Fan-in Tree • Decide which cells to Replicate / Unify • Legalize placement • Repeat while there is improvement
