Introduction to Scientific Computing II

Preconditioned CG Continued
Dr. Miriam Mehl

Conjugate Gradients – Basic Idea
• solution of a system of linear equations (SLE)
• reformulated as a minimization problem
• iterative
• one-dimensional minima along the search directions
• no repeating search directions
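The bullets above can be sketched in a few lines of NumPy. This is a minimal textbook-style CG, not necessarily the formulation used in the lecture; the function name `conjugate_gradient`, the tolerance and the 2 x 2 test system are assumptions made for illustration.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for a symmetric positive definite A (textbook CG sketch)."""
    n = b.shape[0]
    max_iter = n if max_iter is None else max_iter
    x = np.zeros(n)
    r = b - A @ x            # residual = negative gradient of f(x) = 1/2 x^T A x - b^T x
    d = r.copy()             # first search direction: steepest descent
    rs_old = r @ r
    for k in range(max_iter):
        if np.sqrt(rs_old) < tol:
            return x, k
        Ad = A @ d
        alpha = rs_old / (d @ Ad)   # exact one-dimensional minimum along d
        x = x + alpha * d
        r = r - alpha * Ad
        rs_new = r @ r
        d = r + (rs_new / rs_old) * d   # new direction, A-conjugate to the previous ones
        rs_old = rs_new
    return x, max_iter

# Hypothetical 2 x 2 s.p.d. example system
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, iterations = conjugate_gradient(A, b)
print(x, iterations)
```

Because the search directions never repeat, an n x n system needs at most n steps in exact arithmetic, which is why `max_iter` defaults to `n` here.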

Conjugate Gradients – Principle

CG – Convergence
• Poisson with 5-point stencil
• like SOR
• no parameter adjustment
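To make the Poisson case concrete, the following sketch builds the 5-point-stencil matrix on the unit square and prints its condition number for three mesh widths. The helper name `poisson_2d` and the grid sizes are assumptions for illustration. The condition number grows like 1/h², and since the usual CG iteration estimate scales with its square root, the iteration count grows like 1/h, comparable to SOR with an optimally tuned relaxation parameter.

```python
import numpy as np

def poisson_2d(m):
    """5-point-stencil matrix on an m x m interior grid (h = 1/(m+1)), scaled by h^2."""
    T = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    I = np.eye(m)
    return np.kron(I, T) + np.kron(T, I)

for m in (7, 15, 31):
    h = 1.0 / (m + 1)
    kappa = np.linalg.cond(poisson_2d(m))
    # sqrt(kappa) * h staying roughly constant indicates kappa ~ 1/h^2
    print(f"h = 1/{m + 1}: kappa = {kappa:8.1f}, sqrt(kappa) * h = {np.sqrt(kappa) * h:.3f}")
```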

PCG – Idea
• convergence rate of CG depends on the condition number κ of the system matrix
• solve the system M⁻¹Ax = M⁻¹b
• better condition number κ
• M⁻¹ easy to apply
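The effect on κ can be checked numerically on a small example. The standard CG estimate reduces the error by roughly a factor (√κ − 1)/(√κ + 1) per iteration, so a smaller κ means faster convergence. The sketch below uses a deliberately badly scaled s.p.d. matrix (a hypothetical test case, not taken from the lecture) and the Jacobi choice M = diag(A), applied symmetrically as M^(-1/2) A M^(-1/2), which has the same eigenvalues as M⁻¹A.

```python
import numpy as np

n = 20
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # well-behaved s.p.d. core
S = np.diag(np.logspace(0, 3, n))                         # hypothetical bad row/column scaling
A = S @ T @ S                                             # still s.p.d., but poorly conditioned

# Jacobi preconditioner M = diag(A), applied symmetrically so the result stays s.p.d.
d_inv_sqrt = 1.0 / np.sqrt(np.diag(A))
A_prec = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

kappa_A = np.linalg.cond(A)
kappa_prec = np.linalg.cond(A_prec)
print(f"kappa(A) = {kappa_A:.2e}, kappa(preconditioned) = {kappa_prec:.2e}")
```

In this constructed case the diagonal scaling removes the bad scaling completely. For matrices such as the Poisson matrix, whose diagonal is constant, Jacobi preconditioning does not change κ at all.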

PCG – Algorithm
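The algorithm shown on this slide is not preserved in the text, so the following is a sketch of the standard textbook preconditioned CG rather than the lecture's exact formulation. The function name `pcg`, the callable `apply_M_inv`, the stopping criterion and the test matrix are all assumptions for illustration.

```python
import numpy as np

def pcg(A, b, apply_M_inv, tol=1e-10, max_iter=1000):
    """Preconditioned CG for s.p.d. A; apply_M_inv(r) returns M^{-1} r."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_M_inv(r)           # preconditioned residual
    d = z.copy()
    rz_old = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        Ad = A @ d
        alpha = rz_old / (d @ Ad)
        x = x + alpha * d
        r = r - alpha * Ad
        z = apply_M_inv(r)       # the only place where the preconditioner enters
        rz_new = r @ z
        d = z + (rz_new / rz_old) * d
        rz_old = rz_new
    return x, max_iter

# Hypothetical badly scaled s.p.d. test matrix
n = 20
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
S = np.diag(np.logspace(0, 2, n))
A = S @ T @ S
b = np.ones(n)

diag_A = np.diag(A).copy()
x_plain, it_plain = pcg(A, b, lambda r: r)            # M = I: plain CG
x_jac, it_jac = pcg(A, b, lambda r: r / diag_A)       # M = diag(A): Jacobi
print("iterations without / with Jacobi preconditioner:", it_plain, it_jac)
```

Passing the preconditioner as a function reflects the requirement that M⁻¹ only has to be easy to apply; it never needs to be formed as a matrix.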

PCG – Preconditioner
• (Jacobi)
• SSOR
• incomplete Cholesky decomposition
• incomplete LU decomposition
• (algebraic) multigrid

Parallel MG – A Scalable Alternative to Parallel CG?

Parallel Solvers – Story 1
• Can we parallelize a solver iteration in a scalable way?

Parallel Speedup CG
• parallel CG with “optimal” speedup
• #processors ~ #unknowns

Parallel Speedup MG
• parallel smoother with optimal speedup
• #processors ~ #unknowns (on the finest level!)

Parallel Solvers – Story 2
• Can we parallelize the solver in a scalable way?

Parallel Speedup CG
• parallel CG with “optimal” speedup per iteration
• #processors ~ #unknowns

Parallel Speedup CG
• mesh widths h, h/2, h/4, h/8
• number of iterations: O(1/h)
• total time to solve? O(ln(1/h)/h), i.e. O(1/h) iterations at O(ln(1/h)) time each

Parallel Speedup MG
• parallel smoother with optimal speedup
• #processors ~ #unknowns (on the finest level!)
• #processors = const on all levels
• computation > communication: time labels T, T/4
• computation < communication: time labels T, T, T/2
• #dof < #proc: time labels T, T, T

Parallel Speedup MG
• level 0 to level L1: #dof < #proc, time ln(p) (ln(p)/2 in the final build of the slide)
• level L1 to level L2: Tcomp < Tcomm, time c
• level L2 to level Lmax: Tcomp > Tcomm, time d
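One possible reading of the ln(p)/2 term: on every level with fewer unknowns than processors the time per level stays constant, so the cost of that range is proportional to the number of such levels. The sketch below counts them under the assumptions that level 0 holds a single unknown and that each finer level has four times as many unknowns (standard coarsening in 2D); neither assumption is stated on the slide, and the function name is hypothetical.

```python
import math

def levels_below_processor_count(p, coarsening=4):
    """Count grid levels whose number of unknowns is smaller than p.

    Assumes level l holds coarsening**l unknowns (level 0 = a single unknown).
    """
    dof, count = 1, 0
    while dof < p:
        count += 1
        dof *= coarsening
    return count

for p in (16, 256, 4096):
    # compare the level count with log2(p) / 2 = log4(p)
    print(p, levels_below_processor_count(p), math.log2(p) / 2)
```

Under these assumptions the count equals log₄(p), which is half the base-2 logarithm of p and therefore grows logarithmically in the number of processors.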

Parallel Speedup MG
• total time per iteration? ln(p) + c + d
• accumulated over the three level ranges: ln(p), then ln(p) + c, then ln(p) + c + d

Parallel Speedup MG
• mesh widths h, h/2, h/4, h/8
• number of iterations: O(1)
• total time to solve? O(ln(1/h))
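The two asymptotic estimates can be compared with a small cost model. This is only an illustration of the O-terms: the unit time, the fixed number of multigrid cycles and the function names are assumptions, and all constants are set arbitrarily.

```python
import math

def cg_total_time(h, t_unit=1.0):
    """Model: O(1/h) iterations, each costing O(ln(1/h)) with #processors ~ #unknowns."""
    iterations = 1.0 / h
    time_per_iteration = t_unit * math.log(1.0 / h)
    return iterations * time_per_iteration

def mg_total_time(h, cycles=10, t_unit=1.0):
    """Model: O(1) cycles (hypothetical constant), each costing O(ln(1/h))."""
    return cycles * t_unit * math.log(1.0 / h)

for k in range(3, 8):
    h = 2.0 ** -k
    print(f"h = 1/{2 ** k:4d}: CG ~ {cg_total_time(h):8.1f}, MG ~ {mg_total_time(h):6.1f}")
```

In this model the ratio of the two times is (1/h)/cycles, so the gap widens by a factor of two with every halving of h.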

Iterative Solvers – Overview
• X: system matrix s.p.d. or diagonally dominant, and others
• X: system matrix must be s.p.d.!
