
Development and Application of Tree Synthesis Algorithms

Development and Application of Tree Synthesis Algorithms. John Lillis University of Illinois Chicago. Overview. Part I: Buffer tree synthesis Formulations S/P/SP-tree Part II: Fanin tree embedding/replication Optimization across gate boundaries Interaction with placement.





Presentation Transcript


  1. Development and Application of Tree Synthesis Algorithms • John Lillis, University of Illinois Chicago

  2. Overview • Part I: Buffer tree synthesis • Formulations • S/P/SP-tree • Part II: Fanin tree embedding/replication • Optimization across gate boundaries • Interaction with placement

  3. Part I: Buffer Tree Synthesis

  4. Premises of Work • Conservation of Resources Crucial • Estimate: 700-800K Buffers/Chip in Near Future • Cost-Performance Tradeoffs • General Cost Model • Topology / Embedding / Buffering Spaces Should be Explored Simultaneously • 2-Phase Approach Not Robust / Predictable • Particularly Troublesome in Presence of Blockages • MAIN PREMISE: Powerful Buffer Tree Synthesis is a Core for Modern Design

  5. Max Slack Weakness • Overoptimized subtrees [Figure labels: Slack, Cost]

  6. Problem Formulation • Given: • Location of Driver and Sinks • Technology Parameters • Timing Requirements • Buffer Library • Target Routing Graph (Blockages) • Find: • Topology in corresponding space • its Embedding • and Buffer Assignment • Minimizing Cost • s.t. Timing Constraints

  7. Philosophy of Constraint Imposition [Figure labels: Full space, Constrained space] • Goals: • Predictable Behavior • Absence of ad-hoc Heuristics • Main Idea: • Optimally Solve Constrained Variant of the Problem • Well-Designed Constraints Produce • Large Flexible Solution Space • Tractability • Constraints: Topology Space

  8. Topology Embedding Flexibility [Figure: three diagrams over nodes s, a, b, c]

  9. Target Routing Graph Construction [Figure: routing graph over nodes s, a, b, c with a routing blockage and a buffer blockage]

  10. Algorithmic Description • Timing-Driven Maze Routing • Topology Embedding • S-Tree • P-Tree • SP-Tree

  11. Algorithmic Description • Timing-Driven Maze Routing • Topology Embedding • S-Tree • P-Tree • SP-Tree

  12. Core Subroutine: Timing-Driven Maze Routing [Figure labels: Target, Sources] • Generalization of [Hur et al.; TCAD Feb 2000] • Single Target, Multiple Sources • Finds non-dominated paths • Simultaneous Buffer Insertion • Handling of Blockages in Topology Synthesis
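
"Non-dominated" can be made concrete with a small pruning routine. The following is a minimal sketch, assuming each candidate path is summarized by a hypothetical (cost, delay) pair where lower is better in both; the actual solution signature used by the maze router is not given on the slide and likely carries more fields.

```python
def prune_dominated(solutions):
    """Keep only non-dominated (cost, delay) pairs; lower is better in both.

    Assumption: two-field signatures. A pair is dropped when another pair
    is no worse in both fields.
    """
    kept = []
    best_delay = float("inf")
    # Sorting by cost (then delay) means a pair survives only if it is
    # strictly faster than every cheaper-or-equal pair seen so far.
    for cost, delay in sorted(set(solutions)):
        if delay < best_delay:
            kept.append((cost, delay))
            best_delay = delay
    return kept
```

Each graph vertex would keep such a pruned list, so the number of partial paths carried forward stays bounded by the number of distinct useful tradeoffs.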

  13. Algorithmic Description • Timing-Driven Maze Routing • Topology Embedding • S-Tree • P-Tree • SP-Tree

  14. Topology Embedding • Goal: Obtain a timing-feasible embedding / buffering of a given topology, minimizing cost • Solution: Dynamic Programming (bottom-up)

  15. Solution sets A(u,v) • A(u,v) represents a set of solutions that correspond to • Vertex u in Topology • Vertex v in Target Graph • A1b = Join(A1.left, A1.right) • A1 = GenDijkstra(A1b)
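
The Join and generalized-Dijkstra steps can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each solution is a hypothetical (cost, required_time) pair, ignores load capacitance and buffers, and assumes non-negative wire costs and delays. The names `dominates`, `insert_pruned`, `join`, and `gen_dijkstra` are placeholders.

```python
import heapq

def dominates(a, b):
    """True if a = (cost, required_time) is at least as good as b in both fields."""
    return a[0] <= b[0] and a[1] >= b[1]

def insert_pruned(sols, new):
    """Add new to sols unless it is dominated; drop anything new dominates."""
    if any(dominates(s, new) for s in sols):
        return False
    sols[:] = [s for s in sols if not dominates(new, s)]
    sols.append(new)
    return True

def join(left, right):
    """Merge two child solution maps at every target vertex they share."""
    out = {}
    for v in left.keys() & right.keys():
        merged = []
        for cl, ql in left[v]:
            for cr, qr in right[v]:
                # Costs add; the tighter required time governs.
                insert_pruned(merged, (cl + cr, min(ql, qr)))
        out[v] = merged
    return out

def gen_dijkstra(graph, seeds):
    """Propagate seeds over graph = {v: [(w, wire_cost, wire_delay), ...]}."""
    result = {}
    heap = [(c, -q, v) for v, sols in seeds.items() for c, q in sols]
    heapq.heapify(heap)
    while heap:
        c, neg_q, v = heapq.heappop(heap)
        if not insert_pruned(result.setdefault(v, []), (c, -neg_q)):
            continue  # dominated at v, so do not expand it further
        for w, wc, wd in graph.get(v, []):
            heapq.heappush(heap, (c + wc, neg_q + wd, w))
    return result
```

For a topology node with two children, running `gen_dijkstra` on each child's seeds and then `join` on the two results yields the candidate set at every target-graph vertex where the children can meet.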

  16. Algorithmic Description • Timing-Driven Maze Routing • Topology Embedding • S-Tree • P-Tree • SP-Tree

  17. S-Tree • Notion of localities: • Spatial • Temporal • Polarity • Partition sinks into 2 sets based on: • estimated timing criticality • signal polarity requirements • some other criteria... • Subtrees can break the topology and “stitch” at a different place

  18. S-Tree Topology Space • Sink partition: {a,c,d} {b} [Figure: three topologies over nodes s, a, b, c, d]

  19. S-Tree Recurrence • A1b = Join(A1.left, A1.right) • A1 = GenDijkstra(A1b) • A2b = Join(A2.left, A2.right) • A2 = GenDijkstra(A2b) • A12b = Join(A12.left, A12.right) + Join(A1, A2) • A12 = GenDijkstra(A12b)

  20. S-Tree Topology Space [Figure: several topologies over nodes s, a, b, c, d, e, f, one labeled "Initial topology"]

  21. Incorporating polarity • 4 sets: • critical & positive signal polarity • critical & negative • non-critical & positive • non-critical & negative • Other partitioning schemes...
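
The four-way split can be sketched as below. The `Sink` record, the `critical_threshold` parameter, and the rule "required time at or below the threshold means critical" are all assumptions for illustration; the slides only say criticality is estimated.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sink:
    name: str
    required_time: float  # hypothetical criticality proxy
    inverted: bool        # True if the sink needs negative polarity

def partition_sinks(sinks, critical_threshold):
    """Group sink names by (is_critical, polarity) into four sets."""
    groups = {(crit, pol): [] for crit in (True, False) for pol in ("+", "-")}
    for s in sinks:
        crit = s.required_time <= critical_threshold
        pol = "-" if s.inverted else "+"
        groups[(crit, pol)].append(s.name)
    return groups
```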

  22. Algorithmic Description • Timing-Driven Maze Routing • Topology Embedding • S-Tree • P-Tree • SP-Tree

  23. P-Tree Topology Space • All Permutation-Constrained Topologies [Figure: example topologies over source s and sinks a, b, c, d, e]
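
To give a feel for the size of this space, the sketch below enumerates topologies for a fixed sink order. It assumes binary topologies and reads "permutation-constrained" as "the leaves, taken left to right, follow the given order"; both are assumptions, and `enumerate_topologies` is a hypothetical name. Explicit enumeration is shown only for intuition, since a dynamic program over sub-intervals of the order avoids listing the trees.

```python
from functools import lru_cache

def enumerate_topologies(order):
    """List all binary trees (nested tuples) whose leaves follow `order`."""
    order = tuple(order)

    @lru_cache(maxsize=None)
    def build(i, j):
        # All trees over the contiguous sinks order[i:j].
        if j - i == 1:
            return (order[i],)
        trees = []
        for k in range(i + 1, j):  # split point between left and right subtree
            for left in build(i, k):
                for right in build(k, j):
                    trees.append((left, right))
        return tuple(trees)

    return list(build(0, len(order)))
```

The count grows as the Catalan numbers, so five sinks already give 14 topologies for a single order.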

  24. Limitations of P-Tree Space • Isolation of Critical / Non-Critical Subtrees: “Temporal-Locality” • Min Wirelength (WL) May Not Produce Min Cost [Figure: two diagrams, each labeled Driver, Critical, Non-critical]

  25. Algorithmic Description • Timing-Driven Maze Routing • Topology Embedding • S-Tree • P-Tree • SP-Tree

  26. SP-Tree • Combine everything said so far... • From P-Tree • Spatial locality • Robustness • From S-Tree • Temporal locality • Polarity locality • Ability to fix “topology problems” by “stitching”

  27. Solution Space [Figure labels: Entire space, SP-Tree, S-Tree, P-Tree, Fixed topo.]

  28. Experiments • Randomly generated nets • Non-uniform required arrival time • Non-uniform sink input capacitance • Buffer-biased cost • Interested in: • Min cost feasible solution • Max slack solution for verification • Runtime • More details in the paper...

  29. Algorithms for Experiments • S-Tree • P-Tree • SP-Tree • RMP [Cong, Yuan; DAC 2000] • RMP-Quick [Cong, Yuan; DAC 2000]

  30. Results: Net2-06 [Data not recovered; labels: Min cost feasible, Max slack, # buffers]

  31. Results: Net2-08 [Data not recovered; labels: Min cost feasible, Max slack, # buffers]

  32. Results: Net2-12 [Data not recovered; labels: Min cost feasible, Max slack, # buffers]

  33. SP-Tree vs. P-Tree

  34. Conclusions • Key Concepts: • General Cost Models • Routing Congestion • Buffer Congestion • Orthogonal Separation of Spatial and Temporal Locality • Polarity Requirements • Routing and Buffer Blockages • Targets: • Small-to-Medium Sized Signal Nets • Results Summary • Highly Cost-Efficient, High Performance Solutions • Substantially Outperforms Prior Approaches in Solution Quality and Runtime

  35. Part II: Fanin Tree Embedding/Replication

  36. Replication Overview • Hrkic, Lillis, Beraudo (DAC04, IWLS04) • Concept: Netlist structure limits potential of timing-driven placement • Difficult for top-down synthesis to fix • Main issue: inherently non-monotone paths • Approach (Hrkic, Lillis; DAC04) touches on placement, synthesis (netlist perturbation) and routing.

  37. Logic Replication • Duplicate logic cell • Preserve functionality • Improve timing • Place / Move cells • Adjust connections [Figure: netlist over cells A, B, C, D, E before and after adding replica CR]

  38. Early Work • Use replication to straighten I/O paths • Local monotonicity [Beraudo, Lillis, DAC 2003] • Sequence of 3 cells on the path • Incremental framework [Figure: netlist over cells A, B, C, D, E before and after adding replica CR]

  39. Limitations of Local Monotonicity • Local Monotonicity satisfied • Still many non-monotone paths [Figure: placement of cells A, B, C, D, E, F]

  40. Replication Tree Approach [Hrkic et al., DAC04] • Identify critical sink • Extract critical fan-in tree (Replication Tree) • Optimize fan-in tree (Fan-in Tree Embedding) • Legalize placement

  41. Slowest Paths Tree • Focus on slowest paths • Find slowest paths tree from critical sink • Include paths within epsilon of current critical delay • Focus on most critical portions of fan-in cone
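
One way to read the epsilon rule is sketched below. It assumes arrival times are already known at every cell output, that each fan-in edge carries a fixed delay, and that a cell reached along several near-critical paths is attached only to the first parent that reaches it. These choices, and the name `slowest_paths_tree`, are illustrative assumptions rather than the published procedure.

```python
def slowest_paths_tree(fanin, arrival, sink, eps):
    """Return {child: parent} for cells on paths within eps of the critical delay.

    fanin:   {v: [(u, delay_uv), ...]} edges feeding each cell v
    arrival: {cell: arrival time at that cell's output}
    """
    critical = arrival[sink]
    parent = {}
    to_sink = {sink: 0.0}  # delay from a cell's output to the sink along the tree
    stack = [sink]
    while stack:
        v = stack.pop()
        for u, d in fanin.get(v, []):
            # Delay of the longest path that ends with u -> v -> ... -> sink.
            through = arrival[u] + d + to_sink[v]
            if through >= critical - eps and u not in to_sink:
                parent[u] = v
                to_sink[u] = d + to_sink[v]
                stack.append(u)
    return parent
```

With `eps = 0` only the single slowest path (and exact ties) is extracted; a larger `eps` widens the tree toward the rest of the fan-in cone.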

  42. Replication Tree • Most circuits do not contain large fan-in trees due to reconvergence • Given a critical tree, temporarily replicate the entire tree • Assign connections: • if (u,v) is a tree edge, connect uR to vR • else connect u to vR [Figure: netlist over cells A, B, C, D, E, F with replicas AR, BR, CR, DR, FR]
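
The connection rule can be written out directly. In this sketch the netlist is a hypothetical list of (driver, load) pairs, replicas are named by appending "R" as in the slide's figure labels, and only edges whose load is a tree cell are rewired.

```python
def replicate_tree(edges, tree_edges):
    """Return the edges feeding the replicated cells.

    edges:      all (u, v) connections in the netlist
    tree_edges: the subset of edges that form the critical fan-in tree
    """
    tree_edges = set(tree_edges)
    tree_cells = {cell for edge in tree_edges for cell in edge}
    replica_edges = []
    for u, v in edges:
        if v not in tree_cells:
            continue  # v is not replicated, so nothing to connect
        if (u, v) in tree_edges:
            replica_edges.append((u + "R", v + "R"))  # tree edge: uR -> vR
        else:
            replica_edges.append((u, v + "R"))        # non-tree edge: u -> vR
    return replica_edges
```

Because non-tree inputs are taken from the original cells, the replicated structure is a true tree even when the original fan-in cone reconverges.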

  43. Placement cost • Replication is temporary • Placement cost is crucial • Cost discount for placing a cell over its logical equivalent • low cost for placing DR over D • in that case actual replication never occurs • multiple low-cost locations possible [Figure: placement of cells A, B, C, D, E, F and replicas AR, BR, CR, DR, FR]
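
A placement cost with this kind of discount might look like the sketch below. The numeric values, the slot/occupant model, and the "R" suffix test for logical equivalence are hypothetical; the slides state only that placing a replica over its original is cheap.

```python
def placement_cost(cell, slot, occupants,
                   free_cost=1.0, overlap_penalty=10.0, equivalent_cost=0.1):
    """Cost of placing `cell` at `slot`; all constants are illustrative.

    occupants: {slot: cell currently placed there}
    """
    occupant = occupants.get(slot)
    if occupant is None:
        return free_cost  # empty slot: a real replica would be created here
    if cell.endswith("R") and cell[:-1] == occupant:
        return equivalent_cost  # replica lands on its original: no real replication
    return free_cost + overlap_penalty  # overlap to be resolved by legalization
```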

  44. Fan-in Tree Embedding • Given: • Fan-in tree • Placement of sink and inputs • Arrival times at inputs • Placement and routing graph • Find: • Placement of internal tree nodes (Gates) • Minimizing Cost • s.t. Timing Constraints • cost / delay tradeoff

  45. Fan-in Tree Embedding Example [Figure: two embeddings of a tree with inputs A, B, gate C, and a sink; one labeled "Higher delay, lower cost", the other "Lower delay, higher cost"]

  46. Fan-out and Fan-in Tree [Figure: a fan-out tree from a source and a fan-in tree into a sink, each over nodes A, B, C; labels: Bottom-up, Top-down]

  47. Fan-in Tree Embedding • Adaptation of S-Tree algorithm [Hrkic, Lillis, DAC 2002] • Keep: • Graph Model for Embedding Target • Modified Timing-Driven Maze Routing • multiple sources, multiple targets • at each vertex keep a list of non-dominated solutions • S. Hur, J. Lillis, IEEE TCAD 2000 • Modify: • Top-down vs. Bottom-up • Solution signature (c,t): • c - cost • t - signal arrival time • Gate placement cost p(x,y)

  48. Fan-in Tree Embedding • Non-binary tree: multiple gate inputs • Top-Down Dynamic Programming • Maze Routing to populate solutions • deferred backtracking • Join Solutions • c = p(x,y) + c1 + ... + cn • t = MAX(t1, ..., tn) • Bottom-Up solution extraction • backtrack to extract maze route • extract gate placement [Figure labels: Modified maze routing, Join]
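
The join rule for a gate with n inputs can be sketched as follows. Each input contributes a list of (c, t) solutions available at the gate's candidate location; the sketch combines one solution per input, applies c = p(x,y) + c1 + ... + cn and t = MAX(t1, ..., tn), and prunes dominated results. Gate delay is left out because the slide's formula does not include it, and `join_gate` is a hypothetical name.

```python
from itertools import product

def join_gate(input_solutions, placement_cost):
    """Combine per-input (cost, arrival) lists into non-dominated gate solutions."""
    candidates = set()
    for combo in product(*input_solutions):  # one solution chosen per input
        c = placement_cost + sum(ci for ci, _ in combo)
        t = max(ti for _, ti in combo)
        candidates.add((c, t))
    # Keep a candidate only if it arrives strictly earlier than every cheaper one.
    kept, best_t = [], float("inf")
    for c, t in sorted(candidates):
        if t < best_t:
            kept.append((c, t))
            best_t = t
    return kept
```

The number of combinations grows with the product of the input list lengths, which is one reason to keep every list pruned to its non-dominated entries.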

  49. Aside: Legalization • Use Modified Gain-Graph approach [Hur, Lillis; ICCAD00] • Modified to incorporate timing information

  50. Optimization Flow • Identify critical sink (static timing analysis) • Extract Fan-in Tree • Replication Tree • epsilon-Slowest Paths Tree • Embed Fan-in Tree • Decide which cells to Replicate / Unify • Legalize placement • Repeat while there is improvement
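
The outer loop of this flow can be sketched as a simple accept-if-better iteration. Here `critical_delay` stands in for static timing analysis and `improve_once` for one pass of extract / embed / replicate-or-unify / legalize; both are hypothetical callables supplied by the caller, and `max_iters` is an added safety bound.

```python
def optimize(design, critical_delay, improve_once, max_iters=100):
    """Repeat one optimization pass while the critical delay keeps improving."""
    best = critical_delay(design)
    for _ in range(max_iters):
        candidate = improve_once(design)
        delay = critical_delay(candidate)
        if delay >= best:
            break  # no improvement: keep the previous design and stop
        design, best = candidate, delay
    return design, best
```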
