
Administrivia : October 7, 2009


Presentation Transcript


1. Administrivia: October 7, 2009
• Homework 2 due next Wednesday (see web site).
• Reading in Davis:
  • Sections 7.6 and 7.7 (nested dissection)
  • Sections 4.4 – 4.8 and 4.11 (Cholesky)

2. Nonsymmetric Ax = b: Gaussian elimination (without pivoting)
• Factor A = LU
• Solve Ly = b for y
• Solve Ux = y for x
• Variations:
  • Pivoting for numerical stability: PA = LU
  • Cholesky for symmetric positive definite A: A = LLT
  • Permuting A to make the factors sparser
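As a concrete illustration of the factor-then-solve workflow, here is a minimal MATLAB sketch. Note that MATLAB's built-in lu always pivots, so this actually computes the PA = LU variation listed above rather than the unpivoted factorization:

% Factor once, then solve two triangular systems:
[L, U, P] = lu(A);
y = L \ (P * b);     % forward substitution: solve Ly = Pb
x = U \ y;           % back substitution:    solve Ux = y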

3. Left-looking Column LU Factorization
• Column j of A becomes column j of L and U

for column j = 1 to n do
    solve   ( L  0 ) ( uj )
            ( L  I ) ( lj )  =  aj    for uj, lj
    scale:  lj = lj / ujj

[Figure: the matrices A, L, and U with column j highlighted]
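A minimal dense MATLAB sketch of this left-looking loop (my own illustration, not the slides' code; the name leftlu is hypothetical, and it assumes every pivot u(j,j) is nonzero since there is no pivoting):

function [L, U] = leftlu(A)
% Dense left-looking column LU without pivoting (a sketch).
n = size(A, 1);
L = eye(n); U = zeros(n);
for j = 1:n
    x = A(:, j);                       % column j of A enters as a_j
    for i = 1:j-1                      % apply earlier columns of L
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    end
    U(1:j, j) = x(1:j);                % u_j: top part of the solved column
    L(j+1:n, j) = x(j+1:n) / x(j);     % l_j, scaled by the pivot u_jj
end
end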

4. Left-looking sparse LU without pivoting (simple)

L = speye(n);
for column j = 1 : n
    dfs in G(LT) to predict nonzeros of x;
    x(1:n) = A(1:n, j);                        // x is a SPA
    for i = nonzero indices of x in topological order
        x(i) = x(i) / L(i, i);
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    U(1:j, j) = x(1:j);
    L(j+1:n, j) = x(j+1:n);
    cdiv: L(j+1:n, j) = L(j+1:n, j) / U(j, j);
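The "dfs to predict nonzeros" step deserves a concrete form. Below is a hedged MATLAB sketch of the Gilbert-Peierls idea (the function name reach is hypothetical): starting from the nonzero rows of the right-hand-side column, depth-first search follows an edge r -> i whenever L(i, r) is nonzero, and the reversed postorder gives the topological order the numeric loop needs.

function order = reach(L, bnz)
% Sketch: predict the nonzeros of x in Lx = b from the structure of L.
% bnz holds the nonzero row indices of b.
n = size(L, 1);
visited = false(n, 1);
post = [];
for s = bnz(:)'
    if ~visited(s), visit(s); end
end
order = fliplr(post);              % reverse postorder = topological order
    function visit(r)
        visited(r) = true;
        for i = find(L(:, r))'     % successors of r in the graph of L
            if i > r && ~visited(i)
                visit(i);
            end
        end
        post(end+1) = r;           % record vertex in postorder
    end
end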

5. Nonsymmetric Ax = b: Gaussian elimination with partial pivoting
At step j, swap rows so that the pivot position holds the largest-magnitude remaining entry in column j.
• Factor PA = LU
• Solve Ly = Pb for y
• Solve Ux = y for x

6. Left-looking Column LU Factorization (with pivoting)
• Column j of A becomes column j of L and U

for column j = 1 to n do
    solve   ( L  0 ) ( uj )
            ( L  I ) ( lj )  =  aj    for uj, lj
    pivot:  swap ujj and an element of lj
    scale:  lj = lj / ujj

[Figure: the matrices A, L, and U with column j highlighted]
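Extending the earlier dense sketch with this pivot step (again a hypothetical illustration, not the slides' code):

function [L, U, p] = leftlupp(A)
% Dense left-looking LU with partial pivoting (a sketch).
% On return, A(p, :) = L * U.
n = size(A, 1);
L = eye(n); U = zeros(n); p = 1:n;
for j = 1:n
    x = A(p, j);                        % column j of the row-permuted A
    for i = 1:j-1
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    end
    [~, m] = max(abs(x(j:n)));          % pivot: largest magnitude at or below j
    m = m + j - 1;
    x([j m]) = x([m j]);                % swap u_jj with an element of l_j
    p([j m]) = p([m j]);                % record the row interchange
    L([j m], 1:j-1) = L([m j], 1:j-1);  % keep earlier columns of L consistent
    U(1:j, j) = x(1:j);
    L(j+1:n, j) = x(j+1:n) / x(j);      % scale: l_j = l_j / u_jj
end
end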

7. GPLU Algorithm [Gilbert, Peierls 1988]
• Left-looking column-by-column factorization
• Depth-first search to predict structure of each column
• Partial pivoting
+: Symbolic analysis cost proportional to flops
-: Big constant factor; symbolic cost still dominates
=> Prune the symbolic representation

8. Symmetric Pruning [Eisenstat, Liu]
Idea: Depth-first search in a sparser graph with the same path structure
• Use (just-finished) column j of L to prune earlier columns
• No column is pruned more than once
• The pruned graph is the elimination tree if A is symmetric

Symmetric pruning: set L(s, r) = 0 in the symbolic structure, for s > j, whenever L(j, r) and U(r, j) are both nonzero
Justification: the fill at (s, k) will still appear, because the path from s to k can go through column j instead

[Figure: columns r, j, and k and row s of the factor, marking fill, pruned, and nonzero entries]
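In code, the pruning update after finishing column j might look like the following hedged sketch, where S is assumed to be a cell array with S{r} holding the surviving nonzero row indices of column r (a data structure of my own choosing, not the slides'):

% After column j is final: for each r < j with U(r,j) and L(j,r) both
% nonzero, entries of column r below row j can be dropped from the
% symbolic graph, since the DFS can reach them through column j instead.
for r = find(U(1:j-1, j))'
    if L(j, r) ~= 0
        S{r} = S{r}(S{r} <= j);    % prune: keep only rows up to j
    end
end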

9. Left-looking sparse LU without pivoting (pruned)

L = speye(n); S = empty n-vertex graph;
for column j = 1 : n
    dfs in S to predict nonzeros of x;
    x(1:n) = A(1:n, j);                        // x is a SPA
    for i = nonzero indices of x in topological order
        x(i) = x(i) / L(i, i);
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    U(1:j, j) = x(1:j);
    L(j+1:n, j) = x(j+1:n);
    cdiv: L(j+1:n, j) = L(j+1:n, j) / U(j, j);
    update S: add edges (j, i) for nonzero L(i, j);
    prune S using L(j+1:n, j);

10. Left-looking sparse LU with partial pivoting (pruned)

L = speye(n); S = empty n-vertex graph;
for column j = 1 : n
    dfs in S to predict nonzeros of x;
    x(1:n) = A(1:n, j);                        // x is a SPA
    for i = nonzero indices of x in topological order
        x(i) = x(i) / L(i, i);
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    U(1:j, j) = x(1:j);
    L(j+1:n, j) = x(j+1:n);
    pivot: swap U(j, j) and an element of L(:, j);
    cdiv: L(j+1:n, j) = L(j+1:n, j) / U(j, j);
    update S: add edges (j, i) for nonzero L(i, j);
    prune S using L(j+1:n, j);

11. GPMOD Algorithm [Eisenstat, Liu 1993]
• Left-looking column-by-column factorization
• Depth-first search to predict structure of each column
• Symmetric pruning to reduce symbolic cost
• Partial pivoting
+: Much cheaper symbolic factorization than GPLU (~4x)
-: Indirect addressing for each flop (sparse vector kernel)
-: Poor reuse of data in cache (BLAS-1 kernel)
=> Supernodes (next week)

12. The Landscape of Sparse Ax = b Solvers
[Figure: a map of solver choices along two axes. One axis runs from symmetric positive definite to nonsymmetric (more general); the other runs from direct methods (A = LU; more robust) to iterative methods (y' = Ay; less storage).]

13. Sparse Cholesky factorization to solve Ax = b
• Preorder: replace A by PAPT and b by Pb
  • Independent of numerics
• Symbolic factorization: build static data structure
  • Elimination tree
  • Nonzero counts
  • Supernodes
  • Nonzero structure of L
• Numeric factorization: A = LLT
  • Static data structure
  • Supernodes use BLAS3 to reduce memory traffic
• Triangular solves: solve Ly = b, then LTx = y
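The whole pipeline is available through MATLAB built-ins; a minimal sketch for a sparse SPD matrix A (chol returns the upper triangular factor R with A(p,p) = R'*R, so the L of the slides is R'):

p = amd(A);              % preorder: fill-reducing permutation
R = chol(A(p, p));       % symbolic + numeric factorization: A(p,p) = R'*R
y = R' \ b(p);           % forward solve:  L y = Pb, with L = R'
z = R \ y;               % back solve:     L' z = y
x = zeros(size(b));
x(p) = z;                % undo the permutation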

14. Column Cholesky Factorization
• Column j of A becomes column j of L

for j = 1 : n
    for k = 1 : j-1
        % cmod(j,k)
        for i = j : n
            A(i, j) = A(i, j) - A(i, k) * A(j, k);
        end;
    end;
    % cdiv(j)
    A(j, j) = sqrt(A(j, j));
    for i = j+1 : n
        A(i, j) = A(i, j) / A(j, j);
    end;
end;

[Figure: A = L * LT, with column j of L highlighted]

15. Sparse Column Cholesky Factorization
• Column j of A becomes column j of L

for j = 1 : n
    L(j:n, j) = A(j:n, j);
    for k < j with L(j, k) nonzero
        % sparse cmod(j,k)
        L(j:n, j) = L(j:n, j) - L(j, k) * L(j:n, k);
    end;
    % sparse cdiv(j)
    L(j, j) = sqrt(L(j, j));
    L(j+1:n, j) = L(j+1:n, j) / L(j, j);
end;

[Figure: A = L * LT, with column j of L highlighted]
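For comparison, here is a runnable (if slow) MATLAB rendering of this loop; a sketch only, since a real code would preallocate L from the symbolic structure instead of growing a sparse matrix incrementally, and the name sparsechol is hypothetical:

function L = sparsechol(A)
% Sparse column Cholesky sketch. Assumes A is sparse symmetric
% positive definite; returns the lower triangular factor L.
n = size(A, 1);
L = sparse(n, n);
for j = 1:n
    c = A(j:n, j);                     % column j of A
    for k = find(L(j, 1:j-1))          % k < j with L(j,k) nonzero
        c = c - L(j, k) * L(j:n, k);   % sparse cmod(j,k)
    end
    c(1) = sqrt(c(1));                 % sparse cdiv(j)
    c(2:end) = c(2:end) / c(1);
    L(j:n, j) = c;
end
end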

16. Permutations for sparsity

"I observed that most of the coefficients in our matrices were zero; i.e., the nonzeros were 'sparse' in the matrix, and that typically the triangular matrices associated with the forward and back solution provided by Gaussian elimination would remain sparse if pivot elements were chosen with care."

- Harry Markowitz, describing the 1950s work on portfolio theory that won the 1990 Nobel Prize for Economics

17. Heuristic fill-reducing matrix permutations
• Nested dissection:
  • Find a separator, number it last, proceed recursively
  • Theory: approximately optimal separators => approximately optimal fill and flop count
  • Practice: often wins for very large problems
• Minimum degree:
  • Eliminate the row/column with fewest nonzeros, add fill, repeat
  • Hard to implement efficiently; the current champion is "Approximate Minimum Degree" [Amestoy, Davis, Duff]
  • Theory: can be suboptimal even on the 2D model problem
  • Practice: often wins for medium-sized problems
• Banded orderings (Reverse Cuthill-McKee, Sloan, . . .):
  • Try to keep all nonzeros close to the diagonal
  • Theory, practice: often wins for "long, thin" problems
• The best modern general-purpose orderings are ND/MD hybrids.

18. Fill-reducing permutations in Matlab
• Symmetric approximate minimum degree:
  • p = amd(A);
  • symmetric permutation: chol(A(p,p)) often sparser than chol(A)
• Symmetric nested dissection:
  • not built into Matlab
  • several versions in the meshpart toolbox (course web page references)
• Nonsymmetric approximate minimum degree:
  • p = colamd(A);
  • column permutation: lu(A(:,p)) often sparser than lu(A)
  • also for QR factorization
• Reverse Cuthill-McKee:
  • p = symrcm(A);
  • A(p,p) often has smaller bandwidth than A
  • similar to Sparspak RCM
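A quick experiment along these lines, using MATLAB's built-in model problem (delsq and numgrid; the exact counts depend on the mesh):

% Compare fill in the Cholesky factor with and without AMD preordering:
A = delsq(numgrid('S', 50));    % SPD 5-point Laplacian on a square grid
p = amd(A);
nnz(chol(A))                    % fill with the natural ordering
nnz(chol(A(p, p)))              % typically much sparser with AMD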
