
Inductor Design for Global Resonant Clock Distribution in a 28-nm CMOS Processor

Visvesh Sathe³, Padelis Papadopoulos², Alvin Loke³, Tarek Khan¹, Anand Raman², Gerry Vandevalk³, Nikolas Provatas², Vincent Ross¹
¹ Advanced Micro Devices, Inc. ² Helic, Inc. ³ Formerly at Advanced Micro Devices, Inc.


Presentation Transcript


  1. Inductor Design for Global Resonant Clock Distribution in a 28-nm CMOS Processor
     Visvesh Sathe³, Padelis Papadopoulos², Alvin Loke³, Tarek Khan¹, Anand Raman², Gerry Vandevalk³, Nikolas Provatas², Vincent Ross¹
     ¹ Advanced Micro Devices, Inc. ² Helic, Inc. ³ Formerly at Advanced Micro Devices, Inc.

  2. Outline
     • Resonant Clock Distribution
     • Inductor Design and Analysis Challenges
     • Helic VeloceRaptor/X
     • Inductor Extraction using VeloceRaptor/X
     • Silicon Correlation
     • Conclusion

  3. Processor Global Clock Distribution (AMD “Piledriver”)
     • Typical core power consumption breakdown: macros 18%, flops 18%, gaters 16%, standard cells 19%, bus 5%, clocking 24%
     • Significant global clock loading
     • 7-ps clock skew target across a > 20-mm² core area
     • Constrained clock latency from grid to timing elements

  4. Basic Resonant Clocking Operation
     • Relies on efficient resonance between L_tank and C_clk near ω₀
     • Efficient operation around ω₀
     • Driving the clock at much lower frequencies → reduced efficiency, warped clock waveform
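
The resonance condition on slide 4 can be made concrete with the standard ideal LC-tank relation ω₀ = 1/√(L_tank · C_clk). The sketch below is only an illustration of that textbook formula: the function names and the component values are assumptions chosen for the example and do not come from the presentation.

```python
import math


def resonant_frequency_hz(l_tank_h: float, c_clk_f: float) -> float:
    """Natural frequency f0 = 1 / (2*pi*sqrt(L*C)) of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_tank_h * c_clk_f))


def required_inductance_h(f0_hz: float, c_clk_f: float) -> float:
    """Inductance that resonates with c_clk_f at f0_hz (inverse of the above)."""
    return 1.0 / ((2.0 * math.pi * f0_hz) ** 2 * c_clk_f)


# Hypothetical values for illustration only (not taken from the slides).
L_TANK_H = 1e-9    # 1 nH effective tank inductance
C_CLK_F = 100e-12  # 100 pF clock load capacitance

f0 = resonant_frequency_hz(L_TANK_H, C_CLK_F)
print(f"f0 = {f0 / 1e9:.3f} GHz")
```

Running the inverse function at a target clock frequency gives the effective inductance the tank would need for a given clock load, which is the quantity the inductor design has to hit.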

  5. AMD Resonant Clocking
     • 90 inductors distributed over custom power grid, signal wires, and core circuitry

  6. Inductor Design
     • Clock macro and bump pitch constrain inductor size
     • Metal sharing with existing power → cut-aways
     • Centered power straps, HCK tree for mutual inductance

  7. Inductor and Grid Problem Summary
     • 87 × 65 μm spiral over 113 × 126 μm custom grid
     • 12 metal layers (2 thick)
     • Width: 0.13 to 5.7 μm
     • Thickness: 0.1 to 1.2 μm
     • > 5 μm/μm² interconnect length to be extracted!
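
To give a feel for the size of the extraction problem on slide 7, the quoted density can be turned into a lower bound on wire length. This sketch assumes the "> 5 μm/μm²" figure applies over the full 113 × 126 μm grid footprint of a single inductor site; the slide does not say so explicitly, and the variable names are illustrative.

```python
# Figures quoted on slide 7.
GRID_W_UM = 113.0
GRID_H_UM = 126.0
MIN_DENSITY_UM_PER_UM2 = 5.0  # slide says "> 5", so the result is a lower bound

# Assumption: the density is measured over the whole grid footprint.
grid_area_um2 = GRID_W_UM * GRID_H_UM
min_length_um = MIN_DENSITY_UM_PER_UM2 * grid_area_um2

print(f"grid area: {grid_area_um2:.0f} um^2")
print(f"interconnect to extract: > {min_length_um / 1000:.1f} mm per inductor site")
```

Under that assumption, each inductor site carries tens of millimetres of wire, which is why extraction capacity and runtime matter so much in the slides that follow.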

  8. Inductor Design Methodology
     • Goal: achieve the desired L with maximum Q on a highly customized inductor
     • Available design variables
       • Winding width, outer spacing, inner spacing (NESW)
       • Winding height, winding width
     • Multiple extractions within a reasonable time are vital
     • Per-metal extraction customization is crucial
       • Top metal layers dominate magnetic interaction; lower-level metals have minimal interaction
       • Per-metal extraction/merging mode selection (R/C/RC/RLC/RLCk)
       • Process-aware, temperature-sensitive extraction
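
The goal stated on slide 8 (hit the desired L, then maximize Q) amounts to a filter-then-rank step over the results of repeated extractions. The sketch below shows one minimal way to express that selection. The `ExtractionResult` type, the `pick_best` helper, the 5% tolerance, and all numbers are hypothetical; the extraction itself is assumed to have been run separately, and nothing here represents the authors' actual flow or any tool API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ExtractionResult:
    label: str            # free-form description of the candidate geometry
    inductance_nh: float  # extracted L
    q: float              # extracted quality factor


def pick_best(results: Iterable[ExtractionResult],
              target_l_nh: float,
              rel_tol: float = 0.05) -> Optional[ExtractionResult]:
    """Return the highest-Q candidate whose L is within rel_tol of the target."""
    in_window = [r for r in results
                 if abs(r.inductance_nh - target_l_nh) <= rel_tol * target_l_nh]
    return max(in_window, key=lambda r: r.q, default=None)


# Made-up candidates for illustration only.
candidates = [
    ExtractionResult("candidate A", 1.02, 6.1),
    ExtractionResult("candidate B", 0.98, 7.4),
    ExtractionResult("candidate C", 0.80, 8.9),  # best Q, but L is off target
]
best = pick_best(candidates, target_l_nh=1.0)
print(best.label if best else "no candidate meets the L target")
```

Because every candidate in such a loop costs one full extraction, the number of geometries that can be explored is set directly by extraction turnaround time.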

  9. What is VeloceRaptor/X?
     Rapid, high-capacity multi-GHz EM extraction
     • Maxwell equations-based RLCk model per metal segment
     • Inductance calculations based on magnetic vector potential
     • Skin and proximity effects, substrate losses, capacitive and magnetic coupling
     • Silicon-proven accuracy
     • Use model:
       • In situ selection of nets and pin definition
       • Netlist and symbol creation for the marked nets
       • Model annotation and simulation
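
Slide 9 lists skin effect among the modeled phenomena. A quick estimate with the standard skin-depth formula δ = √(ρ / (π f μ₀ μᵣ)) shows why it matters at multi-GHz frequencies. This is a generic textbook calculation, not a description of how VeloceRaptor/X works internally, and it assumes bulk copper resistivity; on-chip metal resistivity is typically higher, which would increase δ.

```python
import math

MU_0 = 4e-7 * math.pi        # vacuum permeability, H/m
RHO_CU_BULK_OHM_M = 1.68e-8  # bulk copper near room temperature (assumption)


def skin_depth_m(resistivity_ohm_m: float, freq_hz: float, mu_r: float = 1.0) -> float:
    """Skin depth of a good conductor: sqrt(rho / (pi * f * mu_0 * mu_r))."""
    return math.sqrt(resistivity_ohm_m / (math.pi * freq_hz * MU_0 * mu_r))


for f_ghz in (1.0, 2.0, 4.0):
    delta_um = skin_depth_m(RHO_CU_BULK_OHM_M, f_ghz * 1e9) * 1e6
    print(f"{f_ghz:.0f} GHz: skin depth ~ {delta_um:.2f} um")
```

At a few GHz the estimate is on the order of 1 μm, comparable to the thickest metals in the stack described earlier (up to 1.2 μm), so current crowding in the thick top layers cannot be ignored.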

  10. VeloceRaptor/X Offers…
      • High capacity and speed
      • Multithreading support
      • S-parameters and RLCk netlist output
      • Temperature-aware model
      • Mixed-mode R/C/RC/RLC/RLCk per any net layer
      • Layout-dependent effects captured
      • Direct GDS extraction
      • Batch-mode support
      • Numerical network reduction

  11. Inductor-over-Grid Model Validation
      • Best tradeoff between model accuracy and runtime/memory requirements
      • Increasing interconnect density, runtime, memory requirement
      • No improvement in model accuracy when adding more RLCk layers
      • Mixed-mode extraction per net layer:
        • M11 to Mx: RLCk
        • M(x-1) to M3: RC
      • RLCk extraction below M07 has negligible impact
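
The mixed-mode split on slide 11 is essentially a per-layer lookup keyed on a crossover layer x. The sketch below builds that lookup as a plain dictionary. The function name, its parameters, and the output format are hypothetical and do not correspond to any real tool configuration syntax. The slide does not state the value of x; x = 7 is used only as an example, loosely motivated by the remark about M07, and layers the slide does not mention are left unassigned.

```python
from typing import Dict


def layer_modes(x: int, top: int = 11, lowest_rc: int = 3) -> Dict[str, str]:
    """Assign RLCk to M<top>..M<x> and RC to M<x-1>..M<lowest_rc>."""
    if not lowest_rc < x <= top:
        raise ValueError("crossover layer x must satisfy lowest_rc < x <= top")
    modes: Dict[str, str] = {}
    for n in range(top, lowest_rc - 1, -1):
        modes[f"M{n}"] = "RLCk" if n >= x else "RC"
    return modes


# Example crossover only; the slide leaves x unspecified.
modes = layer_modes(x=7)
for layer, mode in modes.items():
    print(f"{layer}: {mode}")
```

Moving x down adds inductive detail at the cost of runtime and memory, which is the tradeoff the slide reports as flattening out once the upper layers are covered.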

  12. Turnaround Time vs. Metal Density

  13. Test Chip Silicon Validation
      • Very good agreement between measured and extracted L and Q
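
For readers comparing measured and extracted data as on slide 13, one common convention derives L and Q from the one-port input impedance Z at angular frequency ω: L = Im(Z)/ω and Q = Im(Z)/Re(Z). The sketch below applies that convention to a simple series R-L model. The presentation does not say which definitions the authors used, so both the convention and the example values are assumptions.

```python
import math
from typing import Tuple


def l_and_q(z: complex, freq_hz: float) -> Tuple[float, float]:
    """Effective inductance (H) and quality factor from a one-port impedance."""
    omega = 2.0 * math.pi * freq_hz
    return z.imag / omega, z.imag / z.real


# Hypothetical series R-L inductor model, for illustration only.
FREQ_HZ = 4e9
R_OHM = 0.5
L_H = 1e-9
z_in = complex(R_OHM, 2.0 * math.pi * FREQ_HZ * L_H)

l_eff, q = l_and_q(z_in, FREQ_HZ)
print(f"L = {l_eff * 1e9:.3f} nH, Q = {q:.1f}")
```

Applying the same definitions to both the measured and the extracted impedance keeps the comparison consistent.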

  14. Conclusions
      • Resonant clocking reduces global clock distribution power
      • Use of multiple distributed on-chip inductors poses a significant challenge to inductor extraction
        • Metal-rich extraction environment
        • Significant mutual inductance with underlying and adjacent circuits and power grids
      • Exploiting design structure and VeloceRaptor/X capabilities enabled efficient inductor optimization
        • Batch mode and per-metal, per-net extraction yield a model with sufficient detail to accurately capture silicon behavior
