column cholesky factorization a r t r
Download
Skip this Video
Download Presentation
Column Cholesky Factorization: A=R T R

Loading in 2 Seconds...

play fullscreen
1 / 16

Column Cholesky Factorization: A=R T R - PowerPoint PPT Presentation


  • 81 Views
  • Uploaded on

for j = 1 : n for k = 1 : j-1 % cmod(j,k) for i = j : n A(i,j) = A(i,j) – A(i,k)*A(j,k); end ; end ; % cdiv(j) A(j,j) = sqrt(A(j,j)); for i = j+1 : n A(i,j) = A(i,j) / A(j,j); end ; end ;. j. R. R T. A. R T. Column Cholesky Factorization: A=R T R.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Column Cholesky Factorization: A=R T R' - leonard-hunt


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
column cholesky factorization a r t r
for j = 1 : n

for k = 1 : j-1

% cmod(j,k)

for i = j : n

A(i,j) = A(i,j) – A(i,k)*A(j,k);

end;

end;

% cdiv(j)

A(j,j) = sqrt(A(j,j));

for i = j+1 : n

A(i,j) = A(i,j) / A(j,j);

end;

end;

j

R

RT

A

RT

Column Cholesky Factorization: A=RTR
  • Column j of A becomes column j of RT
data structures
Data structures
  • Full:
    • 2-dimensional array of real or complex numbers
    • (nrows*ncols) memory
  • Sparse:
    • compressed column storage
    • about (1.5*nzs + .5*ncols) memory

D

sparse column cholesky factorization a r t r
for j = 1 : n

for k < j with A(j,k) nonzero

% sparse cmod(j,k)

A(j:n, j) = A(j:n, j) – A(j:n, k)*A(j, k);

end;

% sparse cdiv(j)

A(j,j) = sqrt(A(j,j));

A(j+1:n, j) = A(j+1:n, j) / A(j,j);

end;

j

R

RT

A

RT

Sparse Column Cholesky Factorization: A=RTR
  • Column j of A becomes column j of RT
graphs and sparse matrices cholesky factorization

3

7

1

3

7

1

6

8

6

8

4

10

4

10

9

2

9

2

5

5

Graphs and Sparse Matrices: Cholesky factorization

Fill:new nonzeros in factor

Symmetric Gaussian elimination:

for j = 1 to n add edges between j’s higher-numbered neighbors

G+(A)[chordal]

G(A)

sparse cholesky factorization a r t r
Sparse Cholesky factorization: A=RTR
  • Preorder
    • Independent of numerics
  • Symbolic Factorization
    • Elimination tree
    • Nonzero counts
    • Supernodes
    • Nonzero structure of R
  • Numeric Factorization
    • Static data structure
    • Supernodes use BLAS3 to reduce memory traffic
  • Triangular Solves
incomplete cholesky factorization ic ilu

x

A

RT

R

Incomplete Cholesky factorization (IC, ILU)
  • Compute factors of A by Gaussian elimination, but ignore fill
  • Preconditioner B = RTR  A, not formed explicitly
  • Compute B-1z by triangular solves (in time nnz(A))
  • Total storage is O(nnz(A)), static data structure
  • Either symmetric (IC) or nonsymmetric (ILU)
incomplete cholesky and ilu variants

1

4

1

4

3

2

3

2

Incomplete Cholesky and ILU: Variants
  • Allow one or more “levels of fill”
    • unpredictable storage requirements
  • Allow fill whose magnitude exceeds a “drop tolerance”
    • may get better approximate factors than levels of fill
    • unpredictable storage requirements
    • choice of tolerance is ad hoc
  • Partial pivoting (for nonsymmetric A)
  • “Modified ILU” (MIC): Add dropped fill to diagonal of U or R
    • A and RTR have same row sums
    • good in some PDE contexts
incomplete cholesky and ilu issues
Incomplete Cholesky and ILU: Issues
  • Choice of parameters
    • good: smooth transition from iterative to direct methods
    • bad: very ad hoc, problem-dependent
    • tradeoff: time per iteration (more fill => more time)vs # of iterations (more fill => fewer iters)
  • Effectiveness
    • condition number usually improves (only) by constant factor (except MIC for some problems from PDEs)
    • still, often good when tuned for a particular class of problems
  • Parallelism
    • Triangular solves are not very parallel
    • Reordering for parallel triangular solve by graph coloring
matrix permutations symmetric direct methods
Matrix permutations: Symmetric direct methods
  • Goal is to reduce fill, making Cholesky factorization cheaper.
  • Minimum degree:
    • Eliminate row/col with fewest nzs, add fill, repeat
    • Theory: can be suboptimal even on 2D model problem
    • Practice: often wins for medium-sized problems
  • Nested dissection:
    • Find a separator, number it last, proceed recursively
    • Theory: approx optimal separators => approx optimal fill and flop count
    • Practice: often wins for very large problems
  • Banded orderings (Reverse Cuthill-McKee, Sloan, . . .):
    • Try to keep all nonzeros close to the diagonal
    • Theory, practice: often wins for “long, thin” problems
  • Best orderings for direct methods are ND/MD hybrids.
fill reducing permutations in matlab
Fill-reducing permutations in Matlab
  • Nonsymmetric approximate minimum degree:
    • p = colamd(A);
    • column permutation: lu(A(:,p)) often sparser than lu(A)
    • also for QR factorization
  • Symmetric approximate minimum degree:
    • p = symamd(A);
    • symmetric permutation: chol(A(p,p)) often sparser than chol(A)
  • Reverse Cuthill-McKee
    • p = symrcm(A);
    • A(p,p) often has smaller bandwidth than A
    • similar to Sparspak RCM

D

matrix permutations for iterative methods
Matrix permutations for iterative methods
  • Symmetric matrix permutations don’t change the convergence of unpreconditioned CG
  • Symmetric matrix permutations affect the quality of an incomplete factorization – poorly understood, controversial
  • Often banded (RCM) is best for IC(0) / ILU(0)
  • Often min degree & nested dissection are bad for no-fill incomplete factorization but good if some fill is allowed
  • Some experiments with orderings that use matrix values
    • e.g. “minimum discarded fill” order
    • sometimes effective but expensive to compute
nonsymmetric matrix permutations for iterative methods
Nonsymmetric matrix permutations for iterative methods
  • Dynamic permutation: ILU with row or column pivoting
    • E.g. ILUTP (Saad), Matlab “luinc”
    • More robust but more expensive than ILUT
  • Static permutation: Try to increase diagonal dominance
    • E.g. bipartite weighted matching (Duff & Koster)
    • Often effective; no theory for quality of preconditioner
  • Field is not well understood, to say the least
row permutation for heavy diagonal duff koster

1

2

3

4

5

1

4

1

5

2

2

3

3

3

1

4

2

4

5

PA

5

Row permutation for heavy diagonal [Duff, Koster]

1

2

3

4

5

  • Represent A as a weighted, undirected bipartite graph (one node for each row and one node for each column)
  • Find matching (set of independent edges) with maximum product of weights
  • Permute rows to place matching on diagonal
  • Matching algorithm also gives a row and column scaling to make all diag elts =1 and all off-diag elts <=1

1

2

3

4

5

A

sparse approximate inverses

A

B-1

Sparse approximate inverses
  • Compute B-1 A explicitly
  • Minimize || A B-1 – I ||F (in parallel, by columns)
  • Variants: factored form of B-1, more fill, . .
  • Good: very parallel, seldom breaks down
  • Bad: effectiveness varies widely
complexity of linear solvers

n1/2

n1/3

Complexity of linear solvers

Time to solve model problem (Poisson’s equation) on regular mesh

complexity of direct methods

n1/2

n1/3

Complexity of direct methods

Time and space to solve any problem on any well-shaped finite element mesh

ad