
Administrivia : October 7, 2009


Presentation Transcript


1. Administrivia: October 7, 2009
• Homework 2 due next Wednesday (see web site).
• Reading in Davis:
  • Sections 7.6 and 7.7 (nested dissection)
  • Sections 4.4 – 4.8 and 4.11 (Cholesky)

2. Nonsymmetric Ax = b: Gaussian elimination (without pivoting)
• Factor A = LU
• Solve Ly = b for y
• Solve Ux = y for x
• Variations:
  • Pivoting for numerical stability: PA = LU
  • Cholesky for symmetric positive definite A: A = LLT
  • Permuting A to make the factors sparser
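As a concrete illustration of the factor-then-solve workflow, here is a minimal MATLAB sketch. Note that MATLAB's built-in lu always pivots, so this actually computes the PA = LU variation listed above rather than the unpivoted factorization:

% Factor once, then solve two triangular systems:
[L, U, P] = lu(A);
y = L \ (P * b);     % forward substitution: solve Ly = Pb
x = U \ y;           % back substitution:    solve Ux = y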

3. Left-looking Column LU Factorization
• Column j of A becomes column j of L and U

for column j = 1 to n do
    solve   ( L  0 ) ( uj )
            ( L  I ) ( lj )  =  aj    for uj, lj
    scale:  lj = lj / ujj

[Figure: the matrices A, L, and U with column j highlighted]
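A minimal dense MATLAB sketch of this left-looking loop (my own illustration, not the slides' code; the name leftlu is hypothetical, and it assumes every pivot u(j,j) is nonzero since there is no pivoting):

function [L, U] = leftlu(A)
% Dense left-looking column LU without pivoting (a sketch).
n = size(A, 1);
L = eye(n); U = zeros(n);
for j = 1:n
    x = A(:, j);                       % column j of A enters as a_j
    for i = 1:j-1                      % apply earlier columns of L
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    end
    U(1:j, j) = x(1:j);                % u_j: top part of the solved column
    L(j+1:n, j) = x(j+1:n) / x(j);     % l_j, scaled by the pivot u_jj
end
end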

4. Left-looking sparse LU without pivoting (simple)

L = speye(n);
for column j = 1 : n
    dfs in G(LT) to predict nonzeros of x;
    x(1:n) = A(1:n, j);                        // x is a SPA
    for i = nonzero indices of x in topological order
        x(i) = x(i) / L(i, i);
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    U(1:j, j) = x(1:j);
    L(j+1:n, j) = x(j+1:n);
    cdiv: L(j+1:n, j) = L(j+1:n, j) / U(j, j);
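The "dfs to predict nonzeros" step deserves a concrete form. Below is a hedged MATLAB sketch of the Gilbert-Peierls idea (the function name reach is hypothetical): starting from the nonzero rows of the right-hand-side column, depth-first search follows an edge r -> i whenever L(i, r) is nonzero, and the reversed postorder gives the topological order the numeric loop needs.

function order = reach(L, bnz)
% Sketch: predict the nonzeros of x in Lx = b from the structure of L.
% bnz holds the nonzero row indices of b.
n = size(L, 1);
visited = false(n, 1);
post = [];
for s = bnz(:)'
    if ~visited(s), visit(s); end
end
order = fliplr(post);              % reverse postorder = topological order
    function visit(r)
        visited(r) = true;
        for i = find(L(:, r))'     % successors of r in the graph of L
            if i > r && ~visited(i)
                visit(i);
            end
        end
        post(end+1) = r;           % record vertex in postorder
    end
end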

5. Nonsymmetric Ax = b: Gaussian elimination with partial pivoting
At step j, swap rows so that the pivot position holds the largest-magnitude remaining entry in column j.
• Factor PA = LU
• Solve Ly = Pb for y
• Solve Ux = y for x

6. Left-looking Column LU Factorization (with pivoting)
• Column j of A becomes column j of L and U

for column j = 1 to n do
    solve   ( L  0 ) ( uj )
            ( L  I ) ( lj )  =  aj    for uj, lj
    pivot:  swap ujj and an element of lj
    scale:  lj = lj / ujj

[Figure: the matrices A, L, and U with column j highlighted]
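Extending the earlier dense sketch with this pivot step (again a hypothetical illustration, not the slides' code):

function [L, U, p] = leftlupp(A)
% Dense left-looking LU with partial pivoting (a sketch).
% On return, A(p, :) = L * U.
n = size(A, 1);
L = eye(n); U = zeros(n); p = 1:n;
for j = 1:n
    x = A(p, j);                        % column j of the row-permuted A
    for i = 1:j-1
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    end
    [~, m] = max(abs(x(j:n)));          % pivot: largest magnitude at or below j
    m = m + j - 1;
    x([j m]) = x([m j]);                % swap u_jj with an element of l_j
    p([j m]) = p([m j]);                % record the row interchange
    L([j m], 1:j-1) = L([m j], 1:j-1);  % keep earlier columns of L consistent
    U(1:j, j) = x(1:j);
    L(j+1:n, j) = x(j+1:n) / x(j);      % scale: l_j = l_j / u_jj
end
end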

7. GPLU Algorithm [Gilbert, Peierls 1988]
• Left-looking column-by-column factorization
• Depth-first search to predict structure of each column
• Partial pivoting
+: Symbolic analysis cost proportional to flops
-: Big constant factor; symbolic cost still dominates
=> Prune the symbolic representation

8. Symmetric Pruning [Eisenstat, Liu]
Idea: Depth-first search in a sparser graph with the same path structure
• Use (just-finished) column j of L to prune earlier columns
• No column is pruned more than once
• The pruned graph is the elimination tree if A is symmetric

Symmetric pruning: set L(s, r) = 0 in the symbolic structure, for s > j, whenever L(j, r) and U(r, j) are both nonzero
Justification: the fill at (s, k) will still appear, because the path from s to k can go through column j instead

[Figure: columns r, j, and k and row s of the factor, marking fill, pruned, and nonzero entries]
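In code, the pruning update after finishing column j might look like the following hedged sketch, where S is assumed to be a cell array with S{r} holding the surviving nonzero row indices of column r (a data structure of my own choosing, not the slides'):

% After column j is final: for each r < j with U(r,j) and L(j,r) both
% nonzero, entries of column r below row j can be dropped from the
% symbolic graph, since the DFS can reach them through column j instead.
for r = find(U(1:j-1, j))'
    if L(j, r) ~= 0
        S{r} = S{r}(S{r} <= j);    % prune: keep only rows up to j
    end
end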

9. Left-looking sparse LU without pivoting (pruned)

L = speye(n); S = empty n-vertex graph;
for column j = 1 : n
    dfs in S to predict nonzeros of x;
    x(1:n) = A(1:n, j);                        // x is a SPA
    for i = nonzero indices of x in topological order
        x(i) = x(i) / L(i, i);
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    U(1:j, j) = x(1:j);
    L(j+1:n, j) = x(j+1:n);
    cdiv: L(j+1:n, j) = L(j+1:n, j) / U(j, j);
    update S: add edges (j, i) for nonzero L(i, j);
    prune S using L(j+1:n, j);

10. Left-looking sparse LU with partial pivoting (pruned)

L = speye(n); S = empty n-vertex graph;
for column j = 1 : n
    dfs in S to predict nonzeros of x;
    x(1:n) = A(1:n, j);                        // x is a SPA
    for i = nonzero indices of x in topological order
        x(i) = x(i) / L(i, i);
        x(i+1:n) = x(i+1:n) - L(i+1:n, i) * x(i);
    U(1:j, j) = x(1:j);
    L(j+1:n, j) = x(j+1:n);
    pivot: swap U(j, j) and an element of L(:, j);
    cdiv: L(j+1:n, j) = L(j+1:n, j) / U(j, j);
    update S: add edges (j, i) for nonzero L(i, j);
    prune S using L(j+1:n, j);

11. GPMOD Algorithm [Eisenstat, Liu 1993]
• Left-looking column-by-column factorization
• Depth-first search to predict structure of each column
• Symmetric pruning to reduce symbolic cost
• Partial pivoting
+: Much cheaper symbolic factorization than GPLU (~4x)
-: Indirect addressing for each flop (sparse vector kernel)
-: Poor reuse of data in cache (BLAS-1 kernel)
=> Supernodes (next week)

12. The Landscape of Sparse Ax = b Solvers
[Figure: a map of solver choices along two axes. One axis runs from symmetric positive definite to nonsymmetric (more general); the other runs from direct methods (A = LU; more robust) to iterative methods (y' = Ay; less storage).]

13. Sparse Cholesky factorization to solve Ax = b
• Preorder: replace A by PAPT and b by Pb
  • Independent of numerics
• Symbolic factorization: build static data structure
  • Elimination tree
  • Nonzero counts
  • Supernodes
  • Nonzero structure of L
• Numeric factorization: A = LLT
  • Static data structure
  • Supernodes use BLAS3 to reduce memory traffic
• Triangular solves: solve Ly = b, then LTx = y
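The whole pipeline is available through MATLAB built-ins; a minimal sketch for a sparse SPD matrix A (chol returns the upper triangular factor R with A(p,p) = R'*R, so the L of the slides is R'):

p = amd(A);              % preorder: fill-reducing permutation
R = chol(A(p, p));       % symbolic + numeric factorization: A(p,p) = R'*R
y = R' \ b(p);           % forward solve:  L y = Pb, with L = R'
z = R \ y;               % back solve:     L' z = y
x = zeros(size(b));
x(p) = z;                % undo the permutation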

14. Column Cholesky Factorization
• Column j of A becomes column j of L

for j = 1 : n
    for k = 1 : j-1
        % cmod(j,k)
        for i = j : n
            A(i, j) = A(i, j) - A(i, k) * A(j, k);
        end;
    end;
    % cdiv(j)
    A(j, j) = sqrt(A(j, j));
    for i = j+1 : n
        A(i, j) = A(i, j) / A(j, j);
    end;
end;

[Figure: A = L * LT, with column j of L highlighted]

15. Sparse Column Cholesky Factorization
• Column j of A becomes column j of L

for j = 1 : n
    L(j:n, j) = A(j:n, j);
    for k < j with L(j, k) nonzero
        % sparse cmod(j,k)
        L(j:n, j) = L(j:n, j) - L(j, k) * L(j:n, k);
    end;
    % sparse cdiv(j)
    L(j, j) = sqrt(L(j, j));
    L(j+1:n, j) = L(j+1:n, j) / L(j, j);
end;

[Figure: A = L * LT, with column j of L highlighted]
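For comparison, here is a runnable (if slow) MATLAB rendering of this loop; a sketch only, since a real code would preallocate L from the symbolic structure instead of growing a sparse matrix incrementally, and the name sparsechol is hypothetical:

function L = sparsechol(A)
% Sparse column Cholesky sketch. Assumes A is sparse symmetric
% positive definite; returns the lower triangular factor L.
n = size(A, 1);
L = sparse(n, n);
for j = 1:n
    c = A(j:n, j);                     % column j of A
    for k = find(L(j, 1:j-1))          % k < j with L(j,k) nonzero
        c = c - L(j, k) * L(j:n, k);   % sparse cmod(j,k)
    end
    c(1) = sqrt(c(1));                 % sparse cdiv(j)
    c(2:end) = c(2:end) / c(1);
    L(j:n, j) = c;
end
end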

16. Permutations for sparsity

"I observed that most of the coefficients in our matrices were zero; i.e., the nonzeros were 'sparse' in the matrix, and that typically the triangular matrices associated with the forward and back solution provided by Gaussian elimination would remain sparse if pivot elements were chosen with care."

- Harry Markowitz, describing the 1950s work on portfolio theory that won the 1990 Nobel Prize for Economics

17. Heuristic fill-reducing matrix permutations
• Nested dissection:
  • Find a separator, number it last, proceed recursively
  • Theory: approximately optimal separators => approximately optimal fill and flop count
  • Practice: often wins for very large problems
• Minimum degree:
  • Eliminate the row/column with fewest nonzeros, add fill, repeat
  • Hard to implement efficiently; the current champion is "Approximate Minimum Degree" [Amestoy, Davis, Duff]
  • Theory: can be suboptimal even on the 2D model problem
  • Practice: often wins for medium-sized problems
• Banded orderings (Reverse Cuthill-McKee, Sloan, . . .):
  • Try to keep all nonzeros close to the diagonal
  • Theory, practice: often wins for "long, thin" problems
• The best modern general-purpose orderings are ND/MD hybrids.

18. Fill-reducing permutations in Matlab
• Symmetric approximate minimum degree:
  • p = amd(A);
  • symmetric permutation: chol(A(p,p)) often sparser than chol(A)
• Symmetric nested dissection:
  • not built into Matlab
  • several versions in the meshpart toolbox (course web page references)
• Nonsymmetric approximate minimum degree:
  • p = colamd(A);
  • column permutation: lu(A(:,p)) often sparser than lu(A)
  • also for QR factorization
• Reverse Cuthill-McKee:
  • p = symrcm(A);
  • A(p,p) often has smaller bandwidth than A
  • similar to Sparspak RCM
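A quick experiment along these lines, using MATLAB's built-in model problem (delsq and numgrid; the exact counts depend on the mesh):

% Compare fill in the Cholesky factor with and without AMD preordering:
A = delsq(numgrid('S', 50));    % SPD 5-point Laplacian on a square grid
p = amd(A);
nnz(chol(A))                    % fill with the natural ordering
nnz(chol(A(p, p)))              % typically much sparser with AMD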
