A Parallel Hierarchical Solver for the Poisson Equation

  1. A Parallel Hierarchical Solver for the Poisson Equation http://wavelets.mit.edu/~darshan/18.337 Seung Lee, Department of Mechanical Engineering, selee@mit.edu R. Sudarshan, Department of Civil and Environmental Engineering, darshan@mit.edu 13th May 2003

  2. Outline • Introduction • Recap of hierarchical basis methods • Used for preconditioning and adaptivity • Implementation • Methodology • Mesh subdivision • Data layout • Assembly and solution • Results and examples • Comparison with the single-level finite element discretization • Speed-up with number of processors • Conclusions and further work

  3. Introduction • Problem being considered: the Poisson equation, $-\nabla^2 u = f$ on $\Omega$ with Dirichlet boundary conditions • Discretizing the operator using finite difference or finite element methods leads to stiffness matrices with large condition numbers • Need a good preconditioner (ILU, IC, circulant, domain decomp.) • Choosing such a preconditioner is "black magic" • Why create an ill-conditioned matrix and then precondition it, when you can create a well-conditioned matrix to begin with! • Instead of the single-level expansion $u_h = \sum_i u_i \phi_i$, use the multilevel expansion $u_h = \sum_i u^0_i \phi^0_i + \sum_{j,k} d^j_k \psi^j_k$ (coarse-scale functions plus detail functions on successive levels)

  4. Single Level vs. Multilevel Approaches [Figure comparing the two bases: the multilevel (hierarchical) stiffness matrix is better conditioned than the single-level one]
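To make the conditioning claim concrete, here is a minimal numerical check in a 1D analogue of the setting above (the slides use 2D bilinear elements; 1D linear hats are used here only because the change of basis is short to write down, and all names are ours, not the authors' code). The script assembles the single-level stiffness matrix for -u'' = f, transforms it into the hierarchical basis, and compares condition numbers after the same diagonal scaling that the PCG solver on the next slide applies:

```python
import numpy as np

def hat(x, c, w):
    """Piecewise-linear hat centered at c with half-width w."""
    return np.maximum(0.0, 1.0 - np.abs(x - c) / w)

J = 5                              # number of refinement levels
n = 2**J - 1                       # interior nodes on (0, 1)
h = 1.0 / (n + 1)
x = np.arange(1, n + 1) * h

# Single-level stiffness matrix for -u'' = f with linear elements
K = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h

# Change of basis: column k samples the k-th hierarchical hat at the nodes
cols = []
for j in range(1, J + 1):                  # level-j hats have half-width 2^-j
    for m in range(1, 2**(j - 1) + 1):     # centered at odd multiples of 2^-j
        cols.append(hat(x, (2 * m - 1) * 2.0**(-j), 2.0**(-j)))
S = np.column_stack(cols)

K_hier = S.T @ K @ S                       # stiffness in the hierarchical basis

def cond_after_diag_scaling(A):
    d = 1.0 / np.sqrt(np.diag(A))
    return np.linalg.cond(d[:, None] * A * d[None, :])

print("single level :", cond_after_diag_scaling(K))       # grows like h^-2
print("hierarchical :", cond_after_diag_scaling(K_hier))  # stays O(1) in 1D
```

In 1D the hierarchical stiffness matrix is exactly diagonal, so the scaled condition number is 1; in 2D, the case in these slides, it is known to grow only like (log 1/h)^2 (Yserentant) instead of the single-level h^-2. The matrix S here is also precisely the inverse wavelet transform that slide 9 implements as a parallel matrix-vector multiply.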

  5. Formulation and Solution • Discretize using piecewise bilinear hierarchical bases • Solve the coarsest-level problem with PCG (and diagonal preconditioning), then determine the first-level details, the second-level details, etc. • Big questions: How well does this method parallelize? (the multiscale stiffness matrix is not as sparse as the single-level one) Are the savings over the single-level method really significant?

  6. Implementation – I: The Big Picture • Read the input, subdivide the mesh, and distribute node and edge info to the processors (done serially by the "Oracle") • For I = 1 : N_levels (N_levels is known a priori), in parallel: distribute elements, assemble K, and solve the system of equations, using the solution from the previous mesh as the initial guess • Perform the inverse wavelet transform and consolidate the solution • Once I > N_levels, we (hopefully) have the converged solution!
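A runnable stand-in for this level loop, again in the 1D analogue and with SciPy's CG standing in for Aztec (stiffness and the variable names are our own, not the project's code): each pass refines the mesh, interpolates the previous solution onto the new nodes as the initial guess, and reports the CG iteration count.

```python
import numpy as np
from scipy.sparse.linalg import cg

def stiffness(n):
    """1D linear-element stiffness matrix for -u'' = f on (0, 1)."""
    h = 1.0 / (n + 1)
    return (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h

u_prev = None
for level in range(1, 7):
    n = 2**level - 1
    h = 1.0 / (n + 1)
    K, f = stiffness(n), np.full(n, h)       # load vector for f(x) = 1
    x0 = None
    if u_prev is not None:                   # warm start: interpolate the
        xc = np.linspace(0, 1, u_prev.size + 2)[1:-1]   # coarse solution
        xf = np.linspace(0, 1, n + 2)[1:-1]             # onto the fine nodes
        x0 = np.interp(xf, xc, u_prev)
    iters = []                               # callback counts CG iterations
    u_prev, _ = cg(K, f, x0=x0, callback=iters.append)
    print(f"level {level}: {n:>3} DoFs, {len(iters)} CG iterations")
```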

  7. Implementation – II: Mesh Subdivision • Each parent element is subdivided into four child elements • The number of elements and DoFs grows geometrically, and the solution converges after only a few subdivision levels [Figure: Level 0 (1 element), Level 1 (4 elements), Level 2 (16 elements)]
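The growth is easy to tabulate: subdividing one square element L times gives 4^L elements and, for bilinear quads, (2^L + 1)^2 nodes, which matches the 9 and 4225 DoFs reported for levels 1 and 6 on slide 14. A two-line check:

```python
# Element and node growth under repeated four-way quad subdivision of a
# single square element (boundary-constrained DoFs not subtracted).
for level in range(7):
    print(f"level {level}: {4**level:>5} elements, {(2**level + 1)**2:>5} nodes")
```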

  8. Implementation – III: Data Layout [Figure: square domain partitioned into quadrants I–IV] • Degrees of freedom in the mesh are distributed linearly, using a naïve partitioning algorithm • Each processor gets roughly NDoF/NP DoFs (its "Update set") • Each processor assembles the rows of the stiffness matrix corresponding to elements in its update set • Each processor has info about all faces connected to vertices in its update set, and about all vertices connected to such faces • Equivalent to "basis partitioning"
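A sketch of that naïve linear partitioning (update_sets is a hypothetical helper of ours; in the real code Aztec manages the distributed rows):

```python
import numpy as np

def update_sets(n_dof, n_proc):
    """Assign each processor a contiguous block of ~n_dof/n_proc rows."""
    bounds = np.linspace(0, n_dof, n_proc + 1).astype(int)
    return [(bounds[p], bounds[p + 1]) for p in range(n_proc)]

# Level-6 square mesh (4225 DoFs) split across 4 processors
for p, (lo, hi) in enumerate(update_sets(4225, 4)):
    print(f"proc {p}: rows {lo}..{hi - 1} ({hi - lo} DoFs)")
```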

  9. Implementation – IV: Assembly and Solution • Stiffness matrices stored in the modified sparse row format • Requires less storage than CSR or CSC formats • Equations solved using Aztec • Solves linear systems in parallel • Comes with PCG with diagonal preconditioning • Inverse wavelet transform (synthesis of the final solution) • Implemented using Aztec as a parallel matrix-vector multiply
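For readers unfamiliar with the modified sparse row (MSR) layout Aztec uses: the diagonal sits up front in val[0..n-1] (slot n is an unused placeholder), bindx[0..n] stores per-row pointers, and positions past n hold the off-diagonal values paired with their column indices. A small illustrative converter, not the project's code:

```python
import numpy as np

def dense_to_msr(A):
    """Convert a dense matrix to MSR (modified sparse row) arrays.

    val[0..n-1]    diagonal entries (val[n] is an unused placeholder)
    bindx[0..n]    pointers: row i's off-diagonals occupy positions
                   bindx[i] .. bindx[i+1]-1 of both arrays
    val[k], bindx[k] for k > n    off-diagonal value and its column index
    """
    n = A.shape[0]
    val = list(np.diag(A)) + [0.0]          # diagonal stored separately
    bindx = [0] * (n + 1)
    cols, offd = [], []
    for i in range(n):
        bindx[i] = n + 1 + len(cols)
        for j in range(n):
            if i != j and A[i, j] != 0.0:
                cols.append(j)
                offd.append(A[i, j])
    bindx[n] = n + 1 + len(cols)
    return np.array(val + offd), np.array(bindx + cols)

A = np.array([[ 4., -1.,  0.],
              [-1.,  4., -1.],
              [ 0., -1.,  4.]])
val, bindx = dense_to_msr(A)
print("val   =", val)      # [ 4.  4.  4.  0. -1. -1. -1. -1.]
print("bindx =", bindx)    # [4 5 7 8 1 0 2 1]
```

Since the diagonal needs no column indices, MSR stores 2(nnz + 1) numbers versus CSR's 2·nnz + n + 1, a saving of about n entries whenever the diagonal is full, as it always is for these stiffness matrices.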

  10. Input File Format [Figure: sample mesh with vertices 0–3, edges 0–3, and a single element]

7                Number of subdivisions
4                Number of vertices
4                Number of edges
1                Number of elements
 1.0  1.0  1     Coordinates (x, y) of each vertex, plus boundary info
-1.0  1.0  1     (1 = constrained, 0 = free)
-1.0 -1.0  1
 1.0 -1.0  1
3 0 1            Edge definition (vertex1, vertex2), plus boundary info
1 0 1            (1 = constrained, 0 = free)
2 1 1
2 3 1
0 1 2 3          Element definition (edge1, edge2, edge3, edge4)
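A minimal reader for this format, assuming each record sits on its own line with the annotations above allowed as trailing comments (read_mesh and the returned tuple layout are our own choices, not the authors' code):

```python
def read_mesh(path):
    """Parse the mesh input format sketched on slide 10."""
    lines = [ln.split() for ln in open(path) if ln.strip()]
    it = iter(lines)
    nsub  = int(next(it)[0])          # number of subdivisions
    nvert = int(next(it)[0])          # number of vertices
    nedge = int(next(it)[0])          # number of edges
    nelem = int(next(it)[0])          # number of elements
    verts = [(float(t[0]), float(t[1]), int(t[2]))          # x, y, constrained?
             for t in (next(it) for _ in range(nvert))]
    edges = [(int(t[0]), int(t[1]), int(t[2]))              # v1, v2, constrained?
             for t in (next(it) for _ in range(nedge))]
    elems = [tuple(map(int, next(it)[:4]))                  # edge1..edge4
             for _ in range(nelem)]
    return nsub, verts, edges, elems
```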

  11. Results – I: Square Domain [Figures: mesh and computed solution at Level 1 and Level 6]

  12. Single Scale vs. Hierarchical Basis [Plot: number of iterations vs. degrees of freedom] • Same order of convergence, but fewer iterations for larger problems

  13. Solution Time vs. Number of Processors [Plot]

  14. Square Domain Solution • Coarsest mesh (level 1) – 9 DoFs, 1 iteration to solve, took 0.004 seconds on 4 procs • Finest mesh (level 6) – 4225 DoFs, 43 iterations to solve, took 0.123 seconds on 4 procs

  15. Results – II: "L" Domain • Coarsest mesh (level 1) – 21 DoFs, 3 iterations to solve, took 0.0055 seconds on 4 procs • Finest mesh (level 6) – 12545 DoFs, 94 iterations to solve, took 0.280 seconds on 4 procs

  16. Single Level vs. Hierarchical Basis [Plot: number of iterations vs. degrees of freedom]

  17. Solution Time vs. Number of Processors [Plot]

  18. Results – III: "MIT" Domain • Coarsest mesh (level 1) – 132 DoFs, 15 iterations to solve, took 0.012 seconds on 4 procs • Finest mesh (level 6) – 91520 DoFs, 219 iterations to solve, took 4.77 seconds on 8 procs

  19. Single Level vs. Hierarchical Basis [Plot: number of iterations vs. degrees of freedom; the single-level method did not converge after 500 iterations]

  20. Solution Time vs. Number of Processors [Plot]

  21. Conclusions and Further Work • The hierarchical basis method parallelizes well • Provides a cheap and effective parallel preconditioner • Scales well with the number of processors • With the right libraries, parallelism is easy! • With Aztec, much of the work involved writing (and debugging!) the mesh subdivision, mesh partitioning, and matrix assembly routines • Further work • Parallelize parts of the oracle (e.g. mesh subdivision) • Adaptive subdivision based on error estimation • More efficient geometry partitioning • More general element types (right now, we are restricted to rectangular four-node elements)

  22. The End

  23. (Galerkin) Weak Form • Formally: find $u \in H^1_0(\Omega)$ such that $\int_\Omega \nabla u \cdot \nabla v \, d\Omega = \int_\Omega f \, v \, d\Omega$ for all $v \in H^1_0(\Omega)$ • With a hierarchical basis this leads to a multilevel system of equations with the block structure $\begin{bmatrix} K_{cc} & K_{cf} \\ K_{fc} & K_{ff} \end{bmatrix}$, where $K_{cc}$ holds the coarse ↔ coarse interactions, $K_{cf} = K_{fc}^T$ the coarse ↔ fine interactions, and $K_{ff}$ the fine ↔ fine interactions
