
Reducing Complexity in Algebraic Multigrid



1. Reducing Complexity in Algebraic Multigrid
Hans De Sterck, Department of Applied Mathematics, University of Colorado at Boulder
Ulrike Meier Yang, Center for Applied Scientific Computing, Lawrence Livermore National Laboratory

2. Outline
• introduction: AMG
• complexity growth when using classical coarsenings
• Parallel Modified Independent Set (PMIS) coarsening
• scaling results
• conclusions and future work

3. Introduction
• solve Ax = b
• from 3D PDEs – sparse!
• large problems (10^9 dof) – parallel
• unstructured grid problems
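For concreteness, a small sketch of the kind of sparse system meant here: a 7-point finite-difference Laplacian on an m x m x m grid assembled with scipy (an illustrative example, not taken from the slides):

```python
# Assemble the 7-point finite-difference Laplacian on an m x m x m grid:
# a large, sparse system A x = b with at most 7 nonzeros per row.
import scipy.sparse as sp

def laplacian_3d(m):
    I = sp.identity(m, format="csr")
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m), format="csr")
    # Kronecker sums build the 3D operator from the 1D three-point stencil
    return (sp.kron(sp.kron(T, I), I)
            + sp.kron(sp.kron(I, T), I)
            + sp.kron(sp.kron(I, I), T)).tocsr()

A = laplacian_3d(20)   # 8,000 unknowns here; target problems reach ~10^9
```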

4. Algebraic Multigrid (AMG)
• multi-level
• iterative
• algebraic: suitable for unstructured grids!

5. AMG building blocks
Setup phase:
• select coarse “grids”
• define interpolation
• define restriction and coarse-grid operators
Solve phase
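To make the two phases concrete, here is a minimal two-level sketch of the solve phase in Python/scipy (illustrative only, not the hypre/BoomerAMG implementation; it assumes the setup phase has already produced an interpolation operator P, with restriction R = P^T and Galerkin coarse-grid operator A_c = R A P):

```python
# Two-level AMG solve-phase sketch: relax, restrict the residual,
# solve the coarse problem, interpolate the correction, relax again.
import scipy.sparse.linalg as spla

def jacobi(A, x, b, sweeps=1, omega=2.0/3.0):
    """A few weighted-Jacobi relaxation sweeps (smoother)."""
    d_inv = 1.0 / A.diagonal()
    for _ in range(sweeps):
        x = x + omega * d_inv * (b - A @ x)
    return x

def two_level_cycle(A, P, b, x):
    R = P.T                          # restriction (transpose of interpolation)
    A_c = (R @ A @ P).tocsr()        # Galerkin coarse-grid operator
    x = jacobi(A, x, b)              # pre-relaxation
    r = b - A @ x                    # fine-grid residual
    e_c = spla.spsolve(A_c, R @ r)   # coarse-grid correction (direct solve here)
    x = x + P @ e_c                  # interpolate the correction back
    return jacobi(A, x, b)           # post-relaxation
```

A full V-cycle replaces the direct coarse solve with a recursive call on A_c, and the cycle is iterated until the residual is small.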

6. AMG complexity - scalability
• operator complexity C_op = (sum over all levels k of the nonzeros of A_k) / (nonzeros of the fine-grid operator A_0)
  e.g., 3D: C_op = 1 + 1/8 + 1/64 + … < 8/7
  a measure of memory use and of work in the solve phase
• scalable algorithm: O(n) operations per V-cycle (C_op bounded) AND number of V-cycles independent of n (convergence factor ρ_AMG independent of n)
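In code, the operator complexity is just a ratio of nonzero counts over the multigrid hierarchy; a minimal sketch, assuming the hierarchy is available as a list of scipy sparse matrices with the finest operator first (the function name is illustrative, not a hypre routine):

```python
# Operator complexity: total nonzeros over all levels divided by the
# nonzeros of the finest-grid operator.
def operator_complexity(levels):
    """levels: [A_0 (finest), A_1, ..., A_L] as scipy sparse matrices."""
    return sum(A.nnz for A in levels) / levels[0].nnz
```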

7. AMG Interpolation
• after relaxation: Ae ≈ 0 (relative to e)
• heuristic: error after interpolation should also satisfy this relation approximately
• derive interpolation from the i-th row of this relation: a_ii e_i + Σ_{j≠i} a_ij e_j ≈ 0
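Solving that row relation for e_i and keeping only the coarse interpolatory set C_i yields interpolation weights. The display below is a sketch of the simple “direct interpolation” variant; the classical Ruge-Stueben formula used in practice distributes strong F-F couplings more carefully, so this is not necessarily the exact formula from the talk:

```latex
% Sketch: direct interpolation derived from the i-th row of Ae \approx 0.
% C_i is the set of coarse points strongly connected to the F-point i.
\[
  a_{ii} e_i + \sum_{j \neq i} a_{ij} e_j \approx 0
  \quad\Longrightarrow\quad
  e_i \approx \sum_{j \in C_i} w_{ij}\, e_j ,
  \qquad
  w_{ij} = -\,\frac{a_{ij}}{a_{ii}}\,
            \frac{\sum_{k \neq i} a_{ik}}{\sum_{k \in C_i} a_{ik}} .
\]
```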

8. AMG interpolation
• “large” a_ij should be taken into account accurately
• “strong connections”: i strongly depends on j (and j strongly influences i) if -a_ij ≥ θ · max_{k≠i} (-a_ik), with strong threshold θ
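A minimal sketch of that strength test on one row of a CSR matrix, assuming the usual M-matrix sign convention (negative off-diagonal couplings); the function and parameter names are illustrative, not from hypre:

```python
# Strong connections of row i: j is a strong connection of i if
#   -a_ij >= theta * max_{k != i} (-a_ik)
import numpy as np

def strong_connections(A, i, theta=0.25):
    """Return the column indices that point i strongly depends on (A in CSR format)."""
    row = A.getrow(i)
    cols, vals = row.indices, row.data
    off = cols != i                      # drop the diagonal entry
    cols, vals = cols[off], vals[off]
    if len(vals) == 0:
        return cols                      # no neighbors at all
    max_neg = np.max(-vals)              # strongest (most negative) coupling
    return cols[-vals >= theta * max_neg]
```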

9. AMG coarsening
• (C1) Maximal Independent Set:
  Independent: no two C-points are connected
  Maximal: if one more C-point is added, the independence is lost
• (C2) all F-F connections require connections to a common C-point (for good interpolation)
• F-points have to be changed into C-points to ensure (C2); (C1) is violated → more C-points, higher complexity

10. Classical Coarsenings
• Ruge-Stueben (RS)
  • two passes: 2nd pass ensures that F-F pairs have a common C-point
  • disadvantage: highly sequential
• CLJP
  • based on parallel independent set algorithms developed by Luby and later by Jones & Plassmann
  • also ensures that F-F pairs have a common C-point
• hybrid RS-CLJP (“Falgout”)
  • RS in processor interiors, CLJP at interprocessor boundaries

11. Classical coarsenings: complexity growth
• example: hybrid RS-CLJP (Falgout), 7-point finite difference Laplacian in 3D, θ = 0.25
• increased memory use, long solution times, long setup times → loss of scalability

12. Our approach to reduce complexity
• do not add C-points for strong F-F connections that do not have a common C-point
• fewer C-points, reduced complexity, but worse convergence factors expected
• can something be gained?

13. PMIS coarsening (De Sterck, Yang)
• Parallel Modified Independent Set (PMIS)
• do not enforce condition (C2)
• weighted independent set algorithm: points i that influence many equations (λ_i large) are good candidates for C-points
• add a random number between 0 and 1 to λ_i to break ties

14. PMIS coarsening
• pick C-points with maximal measures (like in CLJP), then make all their neighbors fine (like in RS)
• proceed until all points are either coarse or fine
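A serial sketch of this selection loop (the actual PMIS algorithm runs in parallel inside hypre/BoomerAMG; the inputs here, a symmetrized strength graph nbrs and influence counts lam, are assumptions made for illustration):

```python
# PMIS-style coarsening sketch: repeatedly pick undecided points whose
# measure (lambda_i + random tie-breaker) is a local maximum as C-points,
# make their undecided strong neighbors F-points, and drop those edges.
import random

def pmis_coarsen(nbrs, lam, seed=0):
    """nbrs[i]: points strongly connected to i; lam[i]: # of points i influences."""
    n = len(nbrs)
    rng = random.Random(seed)
    measure = [lam[i] + rng.random() for i in range(n)]
    undecided = {i for i in range(n) if nbrs[i]}
    C = set()
    F = set(range(n)) - undecided            # isolated points become F-points
    while undecided:
        # local maxima of the measure among still-undecided neighbors
        new_C = {i for i in undecided
                 if all(measure[i] > measure[j]
                        for j in nbrs[i] if j in undecided)}
        C |= new_C
        # undecided strong neighbors of the new C-points become F-points
        new_F = ({j for i in new_C for j in nbrs[i]} & undecided) - new_C
        F |= new_F
        undecided -= new_C | new_F           # "remove neighbor edges"
    return C, F
```

Because ties are broken by the random numbers, every sweep selects at least one local maximum, so the loop terminates with every point either coarse or fine.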

  15. 3.7 5.3 5.0 5.9 5.4 5.3 3.4 5.2 8.0 8.5 8.2 8.6 8.9 5.1 5.9 8.1 8.8 8.9 8.4 8.2 5.9 5.7 8.6 8.3 8.8 8.3 8.1 5.0 5.3 8.7 8.3 8.4 8.3 8.8 5.9 5.0 8.8 8.5 8.6 8.7 8.9 5.3 3.2 5.6 5.8 5.6 5.9 5.9 3.0 PMIS select 1 • select C-pts with maximal measure locally • make neighbor F-pts • remove neighbor edges

16. PMIS: remove and update 1
[figure: remaining grid of undecided points and their measures]
• select C-points with maximal measure locally
• make their neighbors F-points
• remove neighbor edges

17. PMIS: select 2
[figure: remaining grid of undecided points and their measures]
• select C-points with maximal measure locally
• make their neighbors F-points
• remove neighbor edges

18. PMIS: remove and update 2
[figure: remaining grid of undecided points and their measures]
• select C-points with maximal measure locally
• make their neighbors F-points
• remove neighbor edges

19. PMIS: final grid
• select C-points with maximal measure locally
• make their neighbors F-points
• remove neighbor edges

20. Preliminary results: 7-point 3D Laplacian on an unstructured grid (n = 76,527), serial, θ = 0.5, GS relaxation
• implementation in CASC/LLNL’s Hypre/BoomerAMG library (Falgout, Yang, Henson, Jones, …)

21. PMIS results: 27-point finite element Laplacian in 3D, 40^3 dof per proc (IBM Blue); Falgout (θ = 0.5) and PMIS-GMRES(10) (θ = 0.25)

22. PMIS results: 7-point finite difference Laplacian in 3D, 40^3 dof per proc (IBM Blue); Falgout (θ = 0.5) and PMIS-GMRES(10) (θ = 0.25)

23. Conclusions
• PMIS leads to reduced, scalable complexities for large problems on parallel computers
• using PMIS-GMRES, large problems can be solved efficiently, with good scalability and modest memory requirements (Blue Gene/L)

24. Future work
• parallel aggressive coarsening and multi-pass interpolation to further reduce complexity (using one-pass RS, PMIS, or a hybrid)
• improved interpolation formulas for more aggressively coarsened grids (Jacobi improvement, …), to reduce the need for GMRES
• parallel First-Order System Least-Squares AMG code for large-scale PDE problems
• Blue Gene/L applications
