DiMES: Multilevel Fast Direct Solver based on Multipole Expansions for Parasitic Extraction of Massively Coupled 3D Microelectronic Structures


**DiMES: Multilevel Fast Direct Solver based on Multipole Expansions for Parasitic Extraction of Massively Coupled 3D Microelectronic Structures**

- Indranil Chowdhury, Vikram Jandhyala: ACE Research, Department of Electrical Engineering, University of Washington
- Dipanjan Gope*: Design and Technology Solutions, Intel Corporation

Supported by: NSF, SRC and DARPA

**Class of Problems**

DiMES: Fast Direct Solver Algorithm

- Magnetostatic Problems
- Electrostatic Problems
- Electric Field Integral Equations
- Magnetic Field Integral Equations
- PMCHW: Multi-Region Dielectric Problems

**Outline**

- Focus Application: Accurate Charge Distribution
  - Circuit Parasitic Estimation
  - MEMS Charge Distribution
- Motivation behind Fast Direct Solution
  - Large Number of RHS Vectors
  - Re-simulation Advantages
- DiMES (Direct Multipole Expansion Solver): Fast Direct FMM based Solver
  - Sparsification of MoM Using FMM
  - Sparse LU Solution
- Numerical Results

**Deep Sub-Micron and Nano Fabrication Technology**

- Gate delay reduces
- Overall chip size does not decrease
  - More functionalities added to the same chip

Switching Speed: Function of Interconnect Parasitics

- Spacing between traces reduced
- Aspect Ratio (H/W) increases

ITRS Data:

| Size | 250 nm | 70 nm |
|---|---|---|
| Spacing | 340 nm | 100 nm |
| H/W | 1.8:1 | 2.7:1 |

[Figure: interconnect traces annotated with capacitances of 0.0153 pF, 0.0103 pF and 0.0069 pF. Courtesy: VLSI Systems WPI web-course]

Result: Increasing Interconnect Parasitics

**MEMS: Electrical Force Computations**

MEMS electrical force computation requires accurate simulation of the charge distribution.

- Approximate Solutions: Inaccurate Charge Distribution
- Inaccurate Charge Distribution: Inaccurate Force Computation

**Solution Scheme**

- Analytic: Inexpensive but Inaccurate
- Numerical: Accurate for 3D Arbitrary Shaped Objects

Accurate Prediction of Charge Distribution

- Method of Moments (MoM)
- Well-Conditioned System
- Smaller Sized Matrix
- Dense Matrix

**Method of Moments**

- Surface is Discretized into Patches (Pulse Basis Functions)
- Basis Functions Interact through the Green's Function
- Generates a Dense Method of Moments Matrix

**Fast Solvers: Significance**

Practical problems: N ~ 1 million (figure courtesy: Ansoft Corporation)

N = Number of basis functions (50,000); p = Number of iterations per RHS; r = Number of RHS

- Fast Iterative Methods: Mature Field
  - Fast Multipole Method (FastCap) [Nabors and White 1992]
  - Pre-Corrected FFT Method [Phillips and White 1997]
  - QR Based Method (IES3) [Kapur and Long 1997]
  - QR Based Method (PILOT) [Gope and Jandhyala 2003]
- O(N) to O(N log N) Matrix Vector Products
- Why Look Any Further?

**Motivation 1: Large Number of RHS Vectors**

Fast iterative solvers fall short for:

- Ill-Conditioned Problems (More Prominent for EFIE)
- Large Number of Excitations / Large Number of RHS Vectors

[Figure: setup and solve cost of Direct / LU, Fast Direct and Fast Iterative solvers for N = 10,000, p = 90. Labels: "Fast Setup and Solve" (α=2; β=1), "Fast LU Setup" (α=2; β=2), "Direct / LU" (α=3; β=2), "Ill-Conditioned Problem" (p1 = 2×p)]

**Motivation 2: Fast Updates in Re-simulation**

Critical Transition: Analysis to Solution

1. Schur Complement
2. SMW-Updates

$$
Ax + By = z_1, \qquad Cx + Dy = z_2 \quad\Longrightarrow\quad (A - BD^{-1}C)\,x = z_1 - BD^{-1}z_2
$$

[Figure: block matrices A, B, C, D and update factors U, V, M]

Repeated Simulation: Update vs. Re-Solve

**Existing Literature**

Advances in fast direct solvers are NOT comparable to advances in fast iterative solvers.

- Michielssen, Boag and Chew (1996)
  - Reduced Source Field Representation
- Canning and Rogovin (1999)
  - SMW Method
  - LUSIFER
- Hackbusch (2000)
  - H-Matrices
- Gope and Jandhyala (2001)
  - Compressed LU Method
- Yan, Sarin and Shi (2004)
  - Inexact Factorization

Limitations:

- Forced Matrix Structure Unsuitable for Arbitrary 3D Shapes
- Fill-ins: Chief Cost Factor / Neglected

**Fast Multipole Basics**

For well-separated groups: Number of degrees of freedom << Number of basis functions

[Figure: 1D geometry and the corresponding MoM matrix; translation between groups uses same-size translation operators]

**Multilevel Multipole Operators**

Q → Q2M → M2M → M2L → L2L → L2P → P

- Up Tree: Q2M (finest level), M2M
- Across Tree: M2L (at the finest level and at finest − 1 level)
- Down Tree: L2L, L2P (finest level)

**Problems in Single Matrix Formation**

- Fast Matrix Vector Products: the matrix is reconstructed with multipoles, as Q2P plus the chain Q2M, M2Ms, M2L, L2Ls, L2P
- Fast Multipole Iterative Method Does Not Inherently Lend Itself to Fast Direct Solution

**Modified LHS**

Original system: $Zq = V$

- Step 1: Increase LHS Size
- Step 2: Use Multipole Expansions

Step 1 on its own will NOT expedite the solution; Step 1 is ONLY required to achieve Step 2.

**Are We Simply Increasing the Size of the Matrix to Make it Sparse?**

- Size of the Matrix Increases
- Non-Zero Entries = $O(N_o)$
- Non-Zero Entries NOT $N_o^2$

[Figure: extended unknown vector of size $N_n$ containing q, the multipole expansions $M_L$, $M_{L-1}$ and the local expansions $L_L$, $L_{L-1}$]

**Modified Set of Equations**

- 1st Set of Equations: Formation of V
  - Contribution from q via Q2P (Finest Level)
  - Contribution from L via L2P (Finest Level)
- 2nd Set of Equations: Formation of M
  - Contribution from q via Q2M (Finest Level)
  - Contribution from M (From Level Below) via M2M
- 3rd Set of Equations: Formation of L
  - Contribution from M via M2L (Same Level)
  - Contribution from L (From Level Above) via L2L

**4 Level Sparse Matrix**

[Figure: sparsity pattern of the extended matrix with blocks for Set 1, Set 2 and Set 3]

- Total Number of Non-zero Entries is O(N)

**Optimization: Number of Levels**

- Increase Levels: More Sparsity
- Increase Levels: Larger Size of the Matrix

Dry Run: Pre-Estimation of Number of Levels

- Re-Order the Unknowns Based on Geometry
- Dry-Run Cost is a Function of Fill-in Factor (w)

**Validation Example: Hughes Test Chip ic_hrl_tc1**

- Multipole Order (p): 2
- 1.5 GB RAM and 1.6 GHz Processor Speed
- Capacitance Matrix Norm Difference < 1e-3

**Time and Memory: Hughes Test Chip ic_hrl_tc1**

[Figure: scaling plots for Memory, Time: LU Setup and Time: LU Solve; curve labels α=3, α=1.8, β=2, β=1.2]

**Substrate Coupling Problem**

2500 Metal Contacts; 6500 Charge Basis Functions

**Comparison with FastCap**

- Cutoff Point: 360 RHS Vectors
- Below Cutoff: Fast Iterative Solver
- Above Cutoff: Fast Direct Solver

**Conclusions and Future Work**

- Conclusions:
  - First of Its Kind Multilevel Multipole-based Direct Solver
  - Matrix Structure is Not Forced:
    - Valid for Arbitrary 3D Structures
  - Fill-ins are Not Neglected
    - Guaranteed High Accuracy
- Future Work:
  - Reduce Setup Time
    - Increasing N will Increase Cut-off Point More than Linearly
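
The three sets of equations (formation of V, M and L) can be checked on a toy problem. The sketch below is not the authors' code: it uses a single level, four hypothetical groups with one expansion coefficient each, and made-up operator values named `S` (near field, in the role of Q2P), `V_op` (Q2M), `K` (M2L) and `U` (L2P). A plain Gaussian elimination stands in for the sparse LU factorization a real solver would use. It only illustrates that solving the larger system in the unknowns [q, M, L] returns the same q as the dense system, while storing fewer non-zero entries.

```python
# Toy single-level illustration with hypothetical sizes and values.
# Dense system: Z q = v, with Z = S + U K V_op (near field + far field).
# Extended system in unknowns [q, M, L]:
#   Set 1:  S q + U L   = v     (formation of V)
#   Set 2:  V_op q - M  = 0     (formation of M)
#   Set 3:  K M - L     = 0     (formation of L)

def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def nnz(A):
    return sum(1 for row in A for x in row if x != 0.0)

G, m = 4, 4          # 4 groups of 4 charges each
n, k = G * m, G      # n charges, one expansion coefficient per group

# Near field: charges interact directly only inside their own group.
S = [[4.0 if i == j else (1.0 if i // m == j // m else 0.0) for j in range(n)]
     for i in range(n)]
# Q2M-like: each group's charges feed that group's multipole coefficient.
V_op = [[0.2 if i // m == g else 0.0 for i in range(n)] for g in range(k)]
# M2L-like: multipoles of other groups feed a group's local coefficient.
K = [[1.0 / abs(g - h) if g != h else 0.0 for h in range(k)] for g in range(k)]
# L2P-like: a group's local coefficient is evaluated at its own charges.
U = [[0.1 * (1 + i % m) if i // m == g else 0.0 for g in range(k)] for i in range(n)]
v = [float(i + 1) for i in range(n)]

# Dense reference solution.
UKV = matmul(matmul(U, K), V_op)
Z = [[S[i][j] + UKV[i][j] for j in range(n)] for i in range(n)]
q_dense = gauss_solve(Z, v)

# Assemble the extended matrix block by block.
N = n + 2 * k
E = [[0.0] * N for _ in range(N)]
for i in range(n):
    E[i][:n] = S[i]
    for g in range(k):
        E[i][n + k + g] = U[i][g]
for g in range(k):
    E[n + g][:n] = V_op[g]
    E[n + g][n + g] = -1.0
    for h in range(k):
        E[n + k + g][n + h] = K[g][h]
    E[n + k + g][n + k + g] = -1.0
q_ext = gauss_solve(E, v + [0.0] * (2 * k))[:n]

max_diff = max(abs(a - b) for a, b in zip(q_dense, q_ext))
nnz_dense, nnz_ext = nnz(Z), nnz(E)
print(max_diff, nnz_dense, nnz_ext)
```

The extended matrix is larger (N = n + 2k rows) but has fewer non-zero entries than the dense Z, which is the trade the slides describe. With more groups and more levels the gap widens, and the number of levels then becomes the tuning parameter discussed under "Optimization: Number of Levels".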