An Analytic Placer for Mixed-Size Placement and Timing-Driven Placement

An Analytic Placer for Mixed-Size Placement and Timing-Driven Placement. Andrew B. Kahng and Qinke Wang UCSD CSE Department {abk, qiwang}@cs.ucsd.edu Work partially supported by the MARCO Gigascale Systems Research Center, NSF MIP-9987678 and the Semiconductor Research Corporation.

Presentation Transcript


  1. An Analytic Placer for Mixed-Size Placement and Timing-Driven Placement Andrew B. Kahng and Qinke Wang UCSD CSE Department {abk, qiwang}@cs.ucsd.edu Work partially supported by the MARCO Gigascale Systems Research Center, NSF MIP-9987678 and the Semiconductor Research Corporation.

  2. Motivation • Mixed-size placement • design productivity increasingly requires IP reuse • processing / interface cores, embedded memories, etc. • “boulders and dust” challenge: sizes of placeable objects can vary by factors of 10,000 or more • placement is particularly complex in fixed-die context • Timing-driven placement • more critical with device and interconnect scaling

  3. Our Work • APlace [Kahng/Wang ISPD04]: an analytic placer for wirelength-driven standard-cell placement • [Naylor et al., US Patent 6301693, 2001] • superior wirelength quality compared to Cadence QPlace, Dragon and Capo • strong extensibility: congestion-directed placement, I/O-core co-placement, constraint handling for mixed-signal, etc. • poor scalability: on average 13.2× slower than Capo • This work: extend APlace to address mixed-size placement and timing-driven placement

  4. Outline • APlace Background • Extension to Mixed-Size Placement • Extension to Timing-Driven Placement • Conclusions and Ongoing Work

  5. Outline • APlace Background • Formulations • wirelength minimization • cell spreading = density control • Implementation • Extension to Mixed-Size Placement • Extension to Timing-Driven Placement • Conclusion and Ongoing Work

  6. Wirelength Formulation • Placement objective: HPWL • Smooth approximation [Naylor et al., US Patent 6301693, 2001] • log-sum-exp formula: pick the most dominant terms among pin coordinates • α : smoothing parameter • closer to HPWL when α → 0 • precise • strictly convex • continuously differentiable
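
The formula itself did not survive in this transcript. As a minimal sketch, the commonly used log-sum-exp smoothing of HPWL replaces each max and min over pin coordinates with a smooth surrogate, α·ln Σ exp(x/α) for the max and α·ln Σ exp(−x/α) for the negated min. The function names below are illustrative, and the exact form used in APlace is assumed rather than quoted from the slide.

```python
import math

def hpwl(xs, ys):
    """Exact half-perimeter wirelength of one net from its pin coordinates."""
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def _lse(values, alpha):
    """alpha * ln(sum(exp(v / alpha))), shifted by the max for numerical stability."""
    m = max(values)
    return m + alpha * math.log(sum(math.exp((v - m) / alpha) for v in values))

def smooth_hpwl(xs, ys, alpha):
    """Log-sum-exp surrogate: smooth max plus smooth (negated) min per axis."""
    neg_xs = [-x for x in xs]
    neg_ys = [-y for y in ys]
    return (_lse(xs, alpha) + _lse(neg_xs, alpha)
            + _lse(ys, alpha) + _lse(neg_ys, alpha))
```

The surrogate never falls below the exact HPWL and tightens toward it as α shrinks, which matches the "closer to HPWL when α → 0" bullet.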

  7. Density Control • Common strategy • divide the placement area into grids • equalize the total cell area in each grid • Penalty of an uneven cell distribution • not smooth or differentiable • difficult to optimize
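
To illustrate why the grid-based penalty is hard to optimize, here is a small sketch that bins cell area on a uniform grid and scores unevenness as the sum of squared deviations from the average bin area. The squared-deviation form and the names `bin_areas` and `unevenness_penalty` are assumptions made for illustration; the slide does not state the exact penalty.

```python
def bin_areas(cells, width, height, nx, ny):
    """Total cell area overlapping each bin of an nx-by-ny grid.

    cells: list of (x_lo, y_lo, w, h) rectangles inside [0, width] x [0, height].
    Returns a list of ny rows, each with nx bin areas.
    """
    bw, bh = width / nx, height / ny
    areas = [[0.0] * nx for _ in range(ny)]
    for x, y, w, h in cells:
        for j in range(ny):
            # Vertical overlap between the cell and bin row j.
            oy = min(y + h, (j + 1) * bh) - max(y, j * bh)
            if oy <= 0:
                continue
            for i in range(nx):
                # Horizontal overlap between the cell and bin column i.
                ox = min(x + w, (i + 1) * bw) - max(x, i * bw)
                if ox > 0:
                    areas[j][i] += ox * oy
    return areas

def unevenness_penalty(areas):
    """Sum of squared deviations of bin area from the average bin area."""
    flat = [a for row in areas for a in row]
    avg = sum(flat) / len(flat)
    return sum((a - avg) ** 2 for a in flat)
```

The `min`/`max` overlap terms introduce kinks whenever a cell edge crosses a bin boundary, so the gradient of this penalty is not continuous.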

  8. Cell Potential Function • Bell-shaped cell potential function [Naylor et al., US Patent 6301693, 2001] • Cell c has potential(c, g) with respect to grid g • Cell c at (x, y) has area A • Grid point g = (x', y') • p(d) : bell-shaped function, with $p(d) = 1 - 2d^2/r^2$ for $0 \le d \le r/2$ and $p(d) = 2(r-d)^2/r^2$ for $r/2 \le d \le r$ • r : the radius of cells' potential • C : a proportionality factor, s.t.
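
A minimal sketch of the bell-shaped function p(d) using the two quadratic pieces from the slide's figure. The function name `bell` and the choice to return 0 beyond the radius r are illustrative assumptions.

```python
def bell(d, r):
    """Bell-shaped potential p(d) with radius r (two quadratic pieces)."""
    d = abs(d)
    if d <= r / 2:
        # Inner piece: 1 - 2 d^2 / r^2
        return 1.0 - 2.0 * d * d / (r * r)
    if d <= r:
        # Outer piece: 2 (r - d)^2 / r^2
        return 2.0 * (r - d) ** 2 / (r * r)
    return 0.0  # assumed: no influence beyond the radius
```

Both pieces give 1/2 at d = r/2 with slope −2/r, so the function and its first derivative are continuous there, which is what makes the density penalty amenable to gradient-based optimization.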

  9. Implementation • Cells are spread by minimizing the smooth density penalty function • APlace combines the above two objectives and optimizes the following function using a Conjugate Gradient optimizer: • Density term drives cell spreading • Wirelength term draws connected components back toward each other

  10. Wirelength vs. Density Objectives • Objective: weighted wirelength term plus weighted density penalty term • Density weight: fixed • a larger weight spreads cells out hastily without good wirelength • Wirelength weight: variable • a larger weight contracts cells together and prevents them from spreading out • initially set to be large • repeat until all cells are spread out evenly: • execute conjugate-gradient solver until convergence • reduce the weight by half
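
The weight schedule above can be sketched as a short outer loop. This is an illustration under stated assumptions, not APlace's implementation: the inner solver is plain gradient descent standing in for the conjugate-gradient optimizer, and `wl_grad`, `density_grad` and `is_spread` are hypothetical callbacks supplied by the caller.

```python
def minimize(grad, x, step=0.02, tol=1e-9, max_iter=10000):
    """Plain gradient descent; a stand-in for the conjugate-gradient solver."""
    for _ in range(max_iter):
        g = grad(x)
        if max(abs(gi) for gi in g) < tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def place(x, wl_grad, density_grad, is_spread,
          wl_weight=16.0, density_weight=1.0, max_rounds=50):
    """Solve to convergence, then halve the wirelength weight until spread."""
    for _ in range(max_rounds):
        def grad(p, w=wl_weight):
            # Gradient of: w * wirelength + density_weight * density penalty
            return [w * a + density_weight * b
                    for a, b in zip(wl_grad(p), density_grad(p))]
        x = minimize(grad, x)
        if is_spread(x):
            break
        wl_weight /= 2.0  # the density weight stays fixed
    return x, wl_weight
```

On a one-variable toy problem (wirelength s², density penalty max(0, 1 − s)², spread when s ≥ 0.95), the loop starts with the cells pulled together and stops once the wirelength weight has been halved enough for the density term to dominate.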

  11. Outline • APlace Background • Extension to Mixed-Size Placement • Density control for macros • Legalization • Experimental results • Extension to Timing-Driven Placement • Conclusion and Ongoing Work

  12. Previous Works • Capo flow: a three-stage placement-floorplanning-placement flow that uses Capo [Adya et al., ISPD02, ICCAD03] • mPG-MS: a simulated-annealing-based multi-level placer [Chang et al., ASPDAC03] • Feng Shui: a recursive-bisection-based placement tool using fractional cuts [Khatkhate et al., ISPD04]

  13. Potential Function for Macros (I) • Each module has a potential or influence with respect to nearby grids • APlace seeks to equalize the total module potential at each grid • rm is the radius of module’s potential • Standard-cell placement: rm is a constant r • Mixed-size placement: rm changes according to the module's dimension • A larger block will have potential with respect to more nearby grids

  14. Potential Function for Macros (II) • p(d) : potential function • d : distance from module to grid • Radius rm = w/2 + r for a block with width w • Convex curve: d < w/2 + r/2, where $p(d) = 1 - a d^2$ • Concave curve: w/2 + r/2 < d < w/2 + r, where $p(d) = b (r-d)^2$ as labeled in the figure • p(d) is smooth at d = w/2 + r/2
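
The slide does not give values for a and b. The sketch below derives them under two assumptions: the outer piece vanishes at the radius rm = w/2 + r (reading the figure's label as b(rm − d)²), and p and its slope match at the switch point d = w/2 + r/2. Those conditions give a = 1 / ((w/2 + r/2) · rm) and b = 2 / (r · rm). The function name and these coefficients are derived here for illustration, not taken from the slides.

```python
def macro_potential(d, w, r):
    """Potential of a block of width w at distance d, with base radius r > 0."""
    d = abs(d)
    switch = w / 2 + r / 2      # where the two quadratic pieces meet
    radius = w / 2 + r          # r_m, the radius of the block's potential
    a = 1.0 / (switch * radius)  # from value and slope matching at `switch`
    b = 2.0 / (r * radius)
    if d <= switch:
        return 1.0 - a * d * d
    if d <= radius:
        return b * (radius - d) ** 2
    return 0.0  # assumed: no influence beyond the radius
```

With w = 0 both coefficients reduce to 2/r², the standard-cell bell function, and a wider block reaches more grid points because its radius grows with w.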

  15. Legalization • Simplified Tetris algorithm [Hill, US Patent 6370673, 2002] • sort modules based on a linear combination of vertical coordinate and width • search the current nearest available position for each module • Pros and cons • fast • larger blocks are fixed at a position ahead of nearby small cells • best applied when modules are distributed evenly • may fail if the global placement has many overlaps among macros

  16. APlace-MS Results • Ten ISPD02 Mixed-Size Benchmarks (10K-70K cells) • Average wirelength increase after legalization: 6.5% • Detailed placement by Feng Shui: 3.5% avg. WL improvement

      circuit | APlace-MS                      | detailed placement
              | WL     WL_l   inc. (%)   CPU   | WL_dp   impr. (%)   CPU
      ibm01   | 0.20   0.24   18.5       15    | 0.23    5.7         1
      ibm02   | 0.51   0.52   0.7        45    | 0.50    2.5         3
      ibm03   | 0.70   0.74   6.2        56    | 0.72    3.5         3
      ibm04   | 0.81   0.85   4.8        48    | 0.83    2.8         4
      ibm05   | 1.01   1.00   -0.5       15    | 0.98    2.0         5
      ibm06   | 0.65   0.71   9.6        76    | 0.68    4.4         5
      ibm07   | 1.03   1.09   5.8        98    | 1.05    3.7         8
      ibm08   | 1.49   1.50   0.6        128   | 1.46    2.7         8
      ibm09   | 1.25   1.45   15.7       113   | 1.38    5.2         9
      ibm10   | 2.97   3.07   3.3        206   | 3.00    2.2         11
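
As a quick arithmetic check, the two averages quoted on this slide can be recomputed from the per-circuit columns. The variable names are illustrative; the numbers are copied from the table.

```python
# Per-circuit values for ibm01..ibm10, copied from the results table.
inc = [18.5, 0.7, 6.2, 4.8, -0.5, 9.6, 5.8, 0.6, 15.7, 3.3]   # WL increase after legalization (%)
impr = [5.7, 2.5, 3.5, 2.8, 2.0, 4.4, 3.7, 2.7, 5.2, 2.2]     # WL improvement from detailed placement (%)

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

avg_inc = mean(inc)
avg_impr = mean(impr)
print(f"avg increase: {avg_inc:.2f}%  avg improvement: {avg_impr:.2f}%")
```

The means come out at about 6.47% and 3.47%, consistent with the 6.5% and 3.5% reported on the slide after rounding.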

  17. HPWL Comparison • Average HPWL improvement of APlace over each flow (range in parentheses) • Capo flow [ICCAD03]: 26.0% (11.5% ~ 34.0%) • mPG-MS [ASPDAC03]: 24.7% (9.9% ~ 40.1%) • Feng Shui [ISPD04]: 4.0% (-7.3% ~ 20.0%) • Runtime • Xeon server (2.4GHz CPU, double-threaded) • much slower than Feng Shui

  18. Placements Before and After Legalization

  19. Outline • APlace Background • Extension to Mixed-Size Placement • Extension to Timing-Driven Placement • Slack-derived edge weights • Timing-driven placement flow • Experimental results • Conclusion and Ongoing Work

  20. Timing-Driven Approaches • Path based methods • consider all or a subset of paths directly • maintain an accurate timing view during optimization • complexity is relatively high • Net based methods • transform timing constraints or requirements into either net weight or net length (or delay) constraints

  21. Net Based Methods • Delay budgeting • distribute slacks from the end-points to constituent nets along the path • may severely over-constrain the problem without consideration of physical feasibility • Net weighting • assign weights to nets based on timing criticality • low complexity, strong flexibility and easy implementation • more attractive as circuit sizes increase and timing constraints become more complex

  22. Slack-Derived Edge Weights • Net weighting in TD-APlace • β : timing criticality exponent • slack(π) : the slack of path π • T : longest path delay • Heavy net weights are assigned to: • timing-critical nets: exponential function [Marquardt et al. 2000] • nets included in many critical paths [Kong ICCAD02]
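
The weighting formula itself is missing from this transcript. One form consistent with the symbols and the two bullets above sums a criticality term (1 − slack(π)/T)^β over every analyzed path π that contains the net. This is an assumption made for illustration, not a formula quoted from the slide; the function name, the input layout, and the clamp at zero are likewise hypothetical.

```python
def net_weights(paths, T, beta):
    """Slack-derived net weights.

    paths: list of (slack, nets_on_path) pairs from timing analysis.
    T: longest path delay; beta: timing criticality exponent.
    Assumed form: weight(net) = sum over paths containing the net of
    (1 - slack / T) ** beta.
    """
    weights = {}
    for slack, nets in paths:
        # Clamp at 0 so a slack larger than T cannot produce a negative base
        # (a guard added for this sketch).
        criticality = max(0.0, 1.0 - slack / T) ** beta
        for net in nets:
            weights[net] = weights.get(net, 0.0) + criticality
    return weights
```

For example, with T = 10 and β = 2, a zero-slack path through nets n1 and n2 and a slack-5 path through n2 and n3 give n2 the largest weight, since it lies on both paths. Raising β concentrates weight on the most critical paths.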

  23. Timing-Driven Placement Flow • Final placement stage • TrialRoute (SoC Encounter v3.2): fast global and detailed routing • Extract RC • Pearl (SE v5.4): static timing analysis (STA) • Import critical path delays to decide net weights • Minimize weighted WL objective

  24. Timing Results: Indust1 Testcase • Indust1: ~ 7k cells • Xeon 2.4GHz CPU, double-threaded • Minimum cycle time • measures quality of TD placements • initially decreases with criticality exponent • gradually deteriorates as criticality exponent continues to increase Results with varying criticality exponents (β)

  25. Comparison vs. Industry Placers (I) • Two industry placers • QPlace (SE v5.4) • amoebaPlace (SoC Encounter v3.2) • Six industry circuits • 7k ~ 40k cells • two from the ISPD 2001 Circuit Benchmarks • Experimental flow • TD or non-TD placements • WarpRoute (SoC Encounter v3.2) : timing-driven routing • Extract RC • Pearl (SE v5.4): static timing analysis (STA)

  26. Comparison vs. Industry Placers (II) • Comparison to TD-QPlace and TD-amoebaPlace • Final HPWL • TD-QPlace: 7.2% (-1.2% ~ 7.1%) • TD-amoebaPlace: 6.5% (-11.1% ~ 23.2%) • Min Cycle • TD-QPlace: 9.6% (-1.2% ~ 14.8%) • TD-amoebaPlace: 8.5% (-0.8% ~ 28.5%) • APlace: 2% (0.1% ~ 3.8%)

  27. Conclusions • APlace analytic placement framework extended to address mixed-size and timing-driven placement • Mixed-size placement • HPWL outperforms mPG-MS, Feng Shui and the Capo flow respectively by 24.7%, 4.0% and 26.0% on average • Timing-driven placement • Minimum cycle time outperforms that of TD-QPlace and TD-amoebaPlace respectively by 9.6% and 8.5% • Routed WL outperforms that of TD-QPlace and TD-amoebaPlace respectively by 7.2% and 6.5%

  28. Ongoing Work • Scalability issue • APlace currently does not scale to large instances • control scheme for larger circuits • Augmented Lagrangian method for constrained nonlinear optimization • multigrid algorithm • Extension to low power or IR drop directed placement • Extension to 3D or thermal-aware placement

  29. Acknowledgments • We thank Brent Gregory, Will Naylor and Synopsys, Inc. for a research and educational license pertaining to U.S. Patents 6282693, 6662348, 6301693, 6671859 and 6665851.

  30. Thank You !

  31. HPWL Results Comparison • Comparison (HPWL) • the Capo flow [ICCAD03]: 26.0% (11.5% ~ 34.0%) • mPG-MS [ASPDAC03]: 24.7% (9.9% ~ 40.1%) • Feng Shui [ISPD04]: 4.0% (-7.3% ~ 20.0%) • Comparison (Running Time) • Xeon server (2.4GHz CPU, double-threaded) • much slower than Feng Shui • Comparison of our results with the Capo flow, mPG-MS and Feng Shui
