This research focuses on a reduction algorithm for sparse LU factorization, introducing new strategies to optimize runtime and space complexity. Motivations, methods, and experimental results are discussed in detail along with various ordering strategies to preserve sparsity in matrix calculations.
Fill Reduction Algorithm Using Diagonal Markowitz Scheme with Local Symmetrization Patrick Amestoy ENSEEIHT-IRIT, France Xiaoye S. Li Esmond Ng Lawrence Berkeley National Laboratory
Contents • Motivation • Graph models for Gaussian elimination • Minimum priority metrics • Experimental results • Summary • Add: Runtime and space complexity
Motivation -- New Sparse LU Factorization Algorithms • Inexpensive pre/post-processing • Equilibration (or scaling) • Pre-permute rows or columns of A to maximize its diagonal • Find a matching with maximum weight for bipartite graph of A • Example: MC64 [Duff/Koster ‘99] • Iterative refinement • GESP (static pivoting) [Li/Demmel ‘98, SuperLU_DIST] • Pivots are chosen from the diagonal • Allow half-precision perturbation of small diagonals • Unsymmetrized multifrontal [Amestoy/Puglisi ‘00, MA41_NEW] • Prefer diagonal pivoting, but threshold pivoting is possible • Allow unsymmetric fronts, but dependency graph is still a tree • Diagonal is (almost) good • Struct(L’) Struct(U)
Existing Ordering Strategies to Preserve Sparsity • Symmetric ordering algorithms on A’+A • Greedy algorithms • e.g., minimum degree, minimum deficiency, etc. • Graph partitioning • Hybrid • Problem: unsymmetric structure is not respected!
Structural Gaussian Elimination -- Symmetric Case • Undirected graph • After a vertex is eliminated, all its neighbors become a clique • The edges of the clique are the potential fills (upper bound!) • [Figure: matrix pattern and undirected graph on vertices 1, i, j, k, before and after eliminating vertex 1]
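The clique rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `eliminate_symmetric`, the dict-of-sets graph layout, and the small 4-vertex example are assumptions made here.

```python
from itertools import combinations


def eliminate_symmetric(adj, v):
    """Eliminate vertex v from an undirected graph stored as {vertex: set(neighbors)}.

    The neighbors of v are pairwise connected (they become a clique);
    the edges that had to be added are returned as the potential fill.
    """
    nbrs = adj.pop(v)
    for u in nbrs:
        adj[u].discard(v)
    fill = set()
    for a, b in combinations(sorted(nbrs), 2):
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            fill.add((a, b))
    return fill


# Hypothetical example: vertex 1 is adjacent to 2, 3, 4; edge 3-4 already exists.
adj = {1: {2, 3, 4}, 2: {1}, 3: {1, 4}, 4: {1, 3}}
fill = eliminate_symmetric(adj, 1)
print(sorted(fill))
```

The fill is an upper bound because the model ignores numerical cancellation: an edge is added whenever the structure allows a nonzero.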
Structural Gaussian Elimination -- Unsymmetric Case • Bipartite graph • After a vertex is eliminated, all the row & column vertices adjacent to it become fully connected – “bi-clique” (assuming diagonal pivot) • The edges of the bi-clique are the potential fills (upper bound!) • [Figure: matrix pattern and bipartite graph with row vertices 1, r1, r2 and column vertices 1, c1, c2, c3, before and after eliminating vertex 1]
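A minimal sketch of the bi-clique rule, assuming a diagonal pivot and a sparsity pattern stored row by row as sets of column indices. The name `eliminate_diagonal` and the 3-by-3 example are hypothetical and chosen only for illustration.

```python
def eliminate_diagonal(rows, p):
    """Eliminate diagonal pivot (p, p) from a pattern stored as {row: set(columns)}.

    Every remaining row with a nonzero in column p receives the off-diagonal
    pattern of row p, which forms the bi-clique. New (row, column) positions
    are returned as the potential fill.
    """
    pivot_cols = rows.pop(p) - {p}
    fill = set()
    for i, cols in rows.items():
        if p in cols:
            cols.discard(p)
            new = pivot_cols - cols
            fill.update((i, c) for c in new)
            cols |= new
    return fill


# Hypothetical 3x3 pattern: row 1 is full, row 2 has columns {1, 2}, row 3 only its diagonal.
rows = {1: {1, 2, 3}, 2: {1, 2}, 3: {3}}
fill = eliminate_diagonal(rows, 1)
print(fill)
```

Unlike the symmetric case, row 3 is untouched here even though column 3 appears in the pivot row, because row 3 has no entry in the pivot column.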
Ordering Algorithms Revisited • Markowitz [1957] for unsymmetric matrices • At step k, pick pivot in the trailing submatrix so that: • It has minimum Markowitz count (product of row and column counts), and • It is bounded by a numerical threshold • Bound the size of the rank-1 update matrix • Expensive to implement because it is mixed with numerical consideration • Examples: MA48 (HSL), etc. • “Restricted” Markowitz -- only look ahead a few candidate columns (rows) with the lowest degrees [Zlatev ‘80] • Minimum degree [Tinney/Walker ‘67] • Special case of Markowitz for SPD systems • Efficient implementation, because: • Diagonal is stable as numerical pivot • Use quotient graph as a compact representation without regard to numerical values
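As a rough illustration of a purely structural Markowitz ordering restricted to diagonal pivots, the sketch below picks, at each step, the diagonal entry with the smallest product of off-diagonal row and column counts. It deliberately omits the numerical threshold test mentioned above, works on an explicit pattern rather than a quotient graph, and breaks ties by smallest index; the function name and the arrow-matrix example are assumptions.

```python
def markowitz_diagonal_order(pattern):
    """Greedy diagonal-pivot ordering by Markowitz count (r_p - 1) * (c_p - 1).

    pattern: {row: set(columns)} for a square matrix; a nonzero diagonal is assumed.
    Returns the pivot order and the total number of structural fill entries.
    """
    rows = {i: set(cols) | {i} for i, cols in pattern.items()}
    order, total_fill = [], 0
    while rows:
        # Column counts in the trailing submatrix.
        col_count = {j: 0 for j in rows}
        for cols in rows.values():
            for j in cols:
                col_count[j] += 1
        # Minimum Markowitz count; ties broken by smallest index.
        p = min(rows, key=lambda i: ((len(rows[i]) - 1) * (col_count[i] - 1), i))
        pivot_cols = rows.pop(p) - {p}
        for cols in rows.values():
            if p in cols:
                cols.discard(p)
                new = pivot_cols - cols
                total_fill += len(new)
                cols |= new
        order.append(p)
    return order, total_fill


# Hypothetical arrow matrix: row 1 and column 1 are dense, the rest is diagonal.
arrow = {1: {1, 2, 3, 4}, 2: {1, 2}, 3: {1, 3}, 4: {1, 4}}
order, total_fill = markowitz_diagonal_order(arrow)
print(order, total_fill)
```

On the arrow matrix the dense row and column are postponed, so no fill is created; eliminating vertex 1 first would fill the whole trailing submatrix.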
Simulation Result • Order(A) vs. Order(A’+A) (Markowitz vs. min degree) • Diagonal pivoting • 88 unsymmetric matrices • Mean fill ratio 0.90 • Mean flops ratio 0.79 • 54 very unsymmetric (symmetry <= 0.5) • Mean fill ratio 0.85 • Mean flops ratio 0.56
Quotient Graph – Symmetric Case • Elements -- representative nodes of the connected components in the eliminated subgraph • Variables -- uneliminated nodes • Current pivot p has an element list {e1, e2} and a variable list • If variable v is adjacent to e1, it will be adjacent to p • e1 can be absorbed by p • p is the representative of the connected component {e1, e2, p} • [Figure: current pivot p with elements e1, e2 and variable v]
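The element/variable bookkeeping can be sketched as below. This follows the standard symmetric quotient-graph update (the pivot becomes a new element and absorbs the elements it was adjacent to); the class and attribute names are illustrative assumptions, and supervariables, mass elimination, and approximate degrees are left out.

```python
class QuotientGraph:
    """Symmetric quotient graph: variables are uneliminated nodes, elements are
    representatives of connected components of eliminated nodes."""

    def __init__(self, adj):
        self.var_adj = {v: set(n) for v, n in adj.items()}  # variable-variable edges
        self.var_elems = {v: set() for v in adj}             # element list of each variable
        self.elem_vars = {}                                  # variable list of each element

    def reach(self, v):
        """Neighbors of v in the elimination graph: paths of length at most 2."""
        out = set(self.var_adj[v])
        for e in self.var_elems[v]:
            out |= self.elem_vars[e]
        out.discard(v)
        return out

    def eliminate(self, p):
        """Turn pivot p into an element and absorb the elements adjacent to it."""
        lp = self.reach(p)
        absorbed = self.var_elems.pop(p)
        for e in absorbed:
            del self.elem_vars[e]
        del self.var_adj[p]
        self.elem_vars[p] = lp
        for v in lp:
            # Edges inside the new element are implied by it, so drop them.
            self.var_adj[v] -= lp | {p}
            self.var_elems[v] = (self.var_elems[v] - absorbed) | {p}
        return lp


# Hypothetical graph with edges 1-2, 1-3, 2-4.
qg = QuotientGraph({1: {2, 3}, 2: {1, 4}, 3: {1}, 4: {2}})
first = qg.eliminate(1)      # element 1 now covers variables {2, 3}
mid = qg.reach(3)            # 3 reaches 2 through element 1
second = qg.eliminate(2)     # element 1 is absorbed by new element 2
print(first, mid, second, qg.elem_vars)
```

Because absorbed elements are deleted, the storage never grows beyond that of the original graph, which is the reason the quotient graph is called a compact representation.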
Quotient Graph -- Unsymmetric Case • Difficulty: path length may be greater than 2! • [Figure: current pivot p with elements e1, e2 and variable v]
Quotient Graph -- “Local Symmetrization” • Advantage: path length bounded by 2! • Disadvantages: lose some asymmetry; more fill • [Figure: current pivot p with elements e1, e2, variable v, and entries marked s]
Cost of Implementation • Elimination models can be implemented using standard graphs or quotient graphs, with different cost in time & space.
Minimum Priority Metrics • Metrics are based on “approximate degree” in the sense of AMD and can be implemented efficiently • Almost the same cost using various metrics: • Based on row & column counts: • PRODUCT (a.k.a. Markowitz), SUM, MIN, MAX, etc. • Minimum fill: areas associated with the existing cliques are deducted • …...
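To make the count-based metrics concrete, here is a small sketch in which each candidate pivot carries a (row count, column count) pair and the pivot with the smallest score is chosen. The counts are assumed to be off-diagonal counts, the tie-breaking by index is arbitrary, and the minimum-fill metric is not shown because it needs the clique areas; `METRICS` and `pick_pivot` are hypothetical names.

```python
# Score functions over (row_count, col_count).
METRICS = {
    "PRODUCT": lambda r, c: r * c,  # Markowitz
    "SUM": lambda r, c: r + c,
    "MIN": min,
    "MAX": max,
}


def pick_pivot(counts, metric):
    """Return the candidate with the smallest score; ties go to the smallest index."""
    score = METRICS[metric]
    return min(counts, key=lambda v: (score(*counts[v]), v))


# Hypothetical counts for three candidate pivots.
counts = {1: (1, 6), 2: (3, 3), 3: (2, 5)}
choices = {name: pick_pivot(counts, name) for name in METRICS}
print(choices)
```

Even on three candidates the metrics disagree: PRODUCT and MIN favor the pivot with one very short row, while SUM and MAX favor the balanced one.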
Preliminary Results with Local Symmetrization • Matrices: 98 unsymmetric in structure • Metrics : based on row/column counts or fill • Solvers: • MA41_NEW : unsymmetrized multifrontal • Local symmetrization ordering is ideal for this solver • SuperLU_DIST : GESP
Compare Different Metrics • Solver: MA41_NEW • Average fill ratio using various metrics with respect to Markowitz (product of row & col counts)
Compare with AMD(A’+A) using Min Fill -- All Unsymmetric • MA41_NEW • SuperLU_DIST
Compare with AMD(A’+A) using Min Fill -- Very Unsymmetric • MA41_NEW • SuperLU_DIST
Summary • First implementation based on BQG model • Features: supervariable, element absorption, mass elimination • Using approximate degree (degree upper bound) • Tried various metrics on large collection of matrices • PRODUCT, SUM, MIN-FILL, etc. • Not a single one is universally best, MIN-FILL is often better • Local symmetrization • Cheaper to implement, harder to understand behavior • Especially suitable for unsymmetrized multifrontal, also benefit GESP • Respectable gain for very unsymmetric matrices
Summary (cont’d) • Results for very unsymmetric matrices • Future work • Work underway for a fully unsymmetric version • Extend to graph partitioning strategy
Example • [Figure: sparse matrix A with rows and columns numbered 1 to 7, and its graph G(A) with column vertices 1 to 7 and row vertices 1 to 7]