
Hierarchical Physical Design Methodology for Multi-Million Gate Chips Session 11


Presentation Transcript


  1. Hierarchical Physical Design Methodology for Multi-Million Gate Chips, Session 11. Wei-Jin Dai

  2. Overview • Introduction • Challenges of hierarchical design • Hierarchical methodology – Full chip physical prototyping • Performance data • Summary

  3. Introduction • As chip size and complexity grow, a hierarchical design approach becomes necessary • Over the last 12 months, the number of chips designed with a hierarchical approach has increased sharply • The main advantage of the hierarchical approach is divide-and-conquer

  4. The Challenges • How to get a view of full-chip (10 million gates+) physical reality early on, to identify potential problems? • How to set up a convergent process that reaches design closure from beginning to end? • How to achieve die utilization similar to the “flat” approach? • How to achieve clock speed and skew similar to the “flat” approach? • How to automatically generate optimal pin assignments for each module? • How to automatically come up with realistic timing budgets for each module? • How to achieve top-level timing and signal integrity closure?

  5. Creating the Physical Prototype: Flat Full-Chip Delivers an Accurate Physical Prototype • A full-chip flat prototype delivers complete physical, timing, clock, and power data • Eliminates the guesswork of traditional block-based approaches • Drives the partitioning into manageable blocks

  6. Prototyping Starts Early in the Flow • Most accurate view possible at all design stages • Physical timing budgeting drives synthesis [Figure: prototyping against design completion (RTL/black box, 75% netlist/black box, complete netlist); phases labeled optimization, estimation, refinement; initial timing budgets, then refined timing budgets]

  7. Hierarchical Design Flow • Inputs: LEF/GDSII, RTL/black box, process data, chip-level timing constraints • Flat full-chip physical prototype: quick synthesis, floor planning, placement, CTS, trial route • Physically feasible? Checks: die size, timing, clock skew, power, SI; a NO loops back • Physical partitioning: pin assignment, timing budget, clock spec, power grid; produces partition data for each block • Block implementation: place, CTS, optimize • Top-level implementation: CTS, optimization, power • Data exchanged: DEF placement, optimized top-level netlist
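
The "Physically feasible?" decision in this flow can be pictured as a gate over the metrics reported by the flat prototype. The following is a minimal sketch under stated assumptions: the `PrototypeReport` and `Targets` types, their field names, and all numeric values are hypothetical and come neither from the presentation nor from any tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrototypeReport:
    """Metrics read back from a flat full-chip prototype run (hypothetical fields)."""
    die_area_mm2: float
    worst_slack_ns: float
    clock_skew_ps: float
    power_w: float
    si_violations: int


@dataclass
class Targets:
    """Chip-level targets the prototype is checked against (hypothetical fields)."""
    max_die_area_mm2: float
    min_slack_ns: float
    max_clock_skew_ps: float
    max_power_w: float


def failed_checks(report: PrototypeReport, targets: Targets) -> List[str]:
    """Return the names of the flow's checks that the prototype fails."""
    failures = []
    if report.die_area_mm2 > targets.max_die_area_mm2:
        failures.append("die size")
    if report.worst_slack_ns < targets.min_slack_ns:
        failures.append("timing")
    if report.clock_skew_ps > targets.max_clock_skew_ps:
        failures.append("clock skew")
    if report.power_w > targets.max_power_w:
        failures.append("power")
    if report.si_violations > 0:
        failures.append("SI")
    return failures


def is_physically_feasible(report: PrototypeReport, targets: Targets) -> bool:
    """Partitioning proceeds only when every check passes."""
    return not failed_checks(report, targets)


# Illustrative values only.
targets = Targets(max_die_area_mm2=100.0, min_slack_ns=0.0,
                  max_clock_skew_ps=150.0, max_power_w=5.0)
first_pass = PrototypeReport(die_area_mm2=112.0, worst_slack_ns=-0.4,
                             clock_skew_ps=140.0, power_w=4.2, si_violations=0)
print(failed_checks(first_pass, targets))
```

Returning the list of failed checks, rather than a bare yes/no, tells the designer what to revisit before the next prototype iteration.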

  8. Hierarchical Partitioning • Pin assignment • Timing budgeting • Clock tree generation • Power grid planning [Figure: partitioning, independent block-level implementation, SoC assembly]

  9. Accurate Pin Assignment • Full-chip prototype results in optimal pin placement • Results in narrower channels and reduced die size • Reduces routing congestion • Improves chip timing [Figure panels: accurate physical prototype (flat full-chip); top-level partition view]

  10. Timing Budgeting • Each block requires: clock definition, set_input_delay, set_output_delay, set_drive, set_load, path exceptions (false paths, multicycle paths) • Accurate timing budgets result in predictable timing convergence [Figure: Block 1, Block 2, Block 3]
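
To make the per-block constraint list concrete, the sketch below writes a block's budget out as SDC-style constraint lines. It is an illustration, not the presenter's budgeting method: the function name `block_sdc`, the port names, and every delay, drive, and load value are hypothetical, and path exceptions are left out for brevity. Only the 10 ns period is tied to the source, which quotes a 100 MHz design.

```python
from typing import Dict, Optional


def block_sdc(clock_port: str,
              period_ns: float,
              input_delays_ns: Dict[str, float],
              output_delays_ns: Dict[str, float],
              drives: Optional[Dict[str, float]] = None,
              loads: Optional[Dict[str, float]] = None) -> str:
    """Format one block's timing budget as SDC-style constraint lines.

    input_delays_ns:  time already consumed outside the block before a
                      signal reaches the named input port.
    output_delays_ns: time still needed outside the block after a signal
                      leaves the named output port.
    drives / loads:   drive resistance and load capacitance per port, in
                      whatever units the downstream tool is set up for.
    """
    drives = drives or {}
    loads = loads or {}
    lines = [f"create_clock -name {clock_port} -period {period_ns:g} "
             f"[get_ports {clock_port}]"]
    for port in sorted(input_delays_ns):
        lines.append(f"set_input_delay {input_delays_ns[port]:g} "
                     f"-clock {clock_port} [get_ports {port}]")
        if port in drives:
            lines.append(f"set_drive {drives[port]:g} [get_ports {port}]")
    for port in sorted(output_delays_ns):
        lines.append(f"set_output_delay {output_delays_ns[port]:g} "
                     f"-clock {clock_port} [get_ports {port}]")
        if port in loads:
            lines.append(f"set_load {loads[port]:g} [get_ports {port}]")
    return "\n".join(lines)


# Hypothetical budget for one block at a 10 ns clock period.
sdc_text = block_sdc(
    clock_port="clk",
    period_ns=10.0,
    input_delays_ns={"data_in": 4.0},
    output_delays_ns={"data_out": 3.5},
    drives={"data_in": 0.5},
    loads={"data_out": 0.05},
)
print(sdc_text)
```

With these numbers the block is left 6 ns for paths starting at `data_in` and 6.5 ns for paths ending at `data_out`; the rest of each period is reserved for logic and wiring outside the block.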

  11. Hierarchical Clock Tree Synthesis • Accurate physical timing data enables the creation of an optimal clock tree • Block-level clock trees are built first, followed by the top-level clock tree • Final clock tree routing generates near-zero skew • Balanced tree at the top level • Worst block skew + zero top-level skew = 150 ps total clock skew [Figure: balanced clock tree; skew labels 100 ps, 130 ps, 150 ps, 50 ps, 50 ps, 120 ps]
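
The skew arithmetic on this slide (worst block skew plus top-level skew) can be written out in a few lines. This sketch assumes that the 100 ps, 130 ps, and 150 ps labels are block-level skews; the transcript does not say which figure labels belong to which tree, and the function name is hypothetical.

```python
from typing import Sequence


def total_clock_skew_ps(block_skews_ps: Sequence[float],
                        top_level_skew_ps: float = 0) -> float:
    """Total skew = worst block-level skew + top-level skew.

    A balanced top-level tree contributes (near) zero skew, so the worst
    block dominates the total.
    """
    if not block_skews_ps:
        raise ValueError("at least one block skew is required")
    return max(block_skews_ps) + top_level_skew_ps


# Assumed block-level skews taken from the slide's figure labels.
print(total_clock_skew_ps([100, 130, 150]))
```

Any nonzero top-level skew adds directly to the total, which is why the slide stresses a balanced tree at the top level.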

  12. Full Chip Power Analysis

  13. Hierarchical Power Grid Design • The power/ground (P/G) network is planned at the full-chip level • The P/G network is automatically pushed down into the blocks during partitioning [Figure panels: block; full chip]

  14. Performance Data

  15. High Performance Environment • Design: 580K cells, 0.25um process, 5LM, 100MHz • Data collected on a 500MHz processor workstation • Columns: First Encounter, Traditional • Rows: Design Import, Detail Place, Detail Route (*), RC Extract, Delay Calculation, Timing Analysis, Design Iteration, IPO • (*) SPC Trial Route [Table values in transcript order: 5 hr 25 min, 7 hr 30 min, 9 hr, 35 hr 40 min, 5 hr 45 min, 3 hr 50 min, 2 hr 50 min, 1 hr 50 min, 3 hr 20 min, 6x, 2 hr 15 min, 4 hr 20 min, 1x, 5x, 4 min, 8 min, 6 min, 7 min, 7x, 60x, 56x, 57x, 33x]

  16. High Accuracy of the Prototype • Design: 5LM, 0.25um, 580K cells, 620K nets, 572 I/Os, 4 blocks • The prototype closely correlates with the post-route layout • Compared against the ‘tape-out’ back-end flow • More than 90% of the interconnect and I/O path delays are within 2% of that flow's results
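
A correlation figure such as "more than 90% of path delays within 2%" can be computed as the fraction of paths whose prototype delay falls within a relative tolerance of the post-route delay. The sketch below shows one way to do that; the function name and the four sample delays are made up for illustration and are not data from the presentation.

```python
from typing import Sequence


def fraction_within(prototype_ns: Sequence[float],
                    post_route_ns: Sequence[float],
                    tolerance: float = 0.02) -> float:
    """Fraction of paths whose prototype delay is within `tolerance`
    (relative) of the post-route delay for the same path."""
    if len(prototype_ns) != len(post_route_ns) or not post_route_ns:
        raise ValueError("need two equally sized, non-empty delay lists")
    hits = sum(
        1
        for proto, final in zip(prototype_ns, post_route_ns)
        if abs(proto - final) <= tolerance * abs(final)
    )
    return hits / len(post_route_ns)


# Made-up path delays in ns; the second path is off by 5% and misses the band.
post_route = [1.00, 2.00, 0.50, 4.00]
prototype = [1.01, 2.10, 0.50, 3.95]
print(fraction_within(prototype, post_route))
```

The tolerance is taken relative to the post-route value, on the assumption that the final layout is the reference the prototype is judged against.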

  17. Summary: SoC Hierarchical Methodology • Build a full-chip physical prototype early on • Start at RTL • Identify problems early • Achieve design closure before partitioning • Close full-chip timing • Optimize die size • Meet power requirements • Resolve signal integrity issues • Maintain design closure throughout the design process
