
Fine-Grain Sparse Matrix Partitioning with Constraints




  1. Fine-Grain Sparse Matrix Partitioning with Constraints Erik Boman Sandia National Labs, Albuquerque, NM Umit Catalyurek Ohio State University

  2. Outline • Load-balancing & partitioning • Sparse matrix-vector multiplication • Models: bipartite graph, hypergraph • 1d and 2d distributions • Constrained matrix partitioning • Hypergraph with fixed vertices • Vertex cover model • Numerical results

  3. Load Balancing, Graph Partitioning • Load balancing • Assign work to processors to distribute work evenly and minimize communication • Graph or hypergraph partitioning • Vertices (weighted) = computation • Edges (weighted) = data dependences

  4. Parallel Sparse Matrix-Vector Multiply • Compute y=Ax, where A is large and sparse • A is distributed among processors • Kernel in scientific computing • Iterative methods (Krylov) • Eigenvalue computations • PageRank • Nice model problem • Other applications have similar computation and communication pattern

  5. Parallel Sparse Matrix-Vector Algorithm • Step 1: Expand (fan-out) • Send xj to processors with a nonzero in column j • Step 2: Local multiply-add • yi = yi + Aij xj • Step 3: Fold (fan-in) • Send partial results of y to the processors that own the corresponding entries • Step 4: Sum partial results Two communication phases (expand and fold).
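The four steps above can be sketched as a simulated run in Python (an illustrative sketch, not the authors' code; the nonzero-to-processor map `owner` and the vector-ownership map `y_owner` are hypothetical inputs):

```python
from collections import defaultdict

def parallel_spmv(nonzeros, owner, y_owner, x):
    """Simulate the four-phase parallel SpMV y = Ax.

    nonzeros: dict (i, j) -> A_ij
    owner:    dict (i, j) -> processor that stores that nonzero
    y_owner:  dict i -> processor that owns y_i
    x:        dict j -> x_j (each x_j is sent out by its owner)
    """
    # Step 1: expand (fan-out) -- every processor holding a nonzero
    # in column j receives x_j.
    x_local = defaultdict(dict)                        # proc -> {j: x_j}
    for (i, j), proc in owner.items():
        x_local[proc][j] = x[j]

    # Step 2: local multiply-add on each processor.
    partial = defaultdict(lambda: defaultdict(float))  # proc -> {i: sum}
    for (i, j), a_ij in nonzeros.items():
        proc = owner[(i, j)]
        partial[proc][i] += a_ij * x_local[proc][j]

    # Step 3: fold (fan-in) -- partial sums go to the owner of y_i.
    inbox = defaultdict(list)                          # proc -> [(i, val)]
    for proc, sums in partial.items():
        for i, val in sums.items():
            inbox[y_owner[i]].append((i, val))

    # Step 4: each owner sums the partial results it received.
    y = defaultdict(float)
    for msgs in inbox.values():
        for i, val in msgs:
            y[i] += val
    return dict(y)
```

The expand phase sends x-entries, the fold phase sends partial y-sums; the total message volume of these two phases is exactly what the partitioning models below try to minimize.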

  6. Communication (p=2) [Figure: the same y = Ax computed with a row distribution and with a column distribution of A over p = 2 processors, highlighting which x-entries (expand) and partial y-sums (fold) must be communicated in each case.]

  7. 1d Models • 1d distribution is most common • Assign either rows or columns • Models to represent a sparse matrix • Graph (only symmetric) • Bipartite graph • Hypergraph

  8. Bipartite Graph Model • G=(R,C,E), where • R are row vertices • C are column vertices • Partition both R and C • But only use R for a row distribution • Works in the nonsymmetric case • Edge cut approximates communication volume • Is NOT exact

  9. 1d Hypergraph Model • Hypergraph • A hyperedge is a set of vertices (1 or more) • Rows = vertices • Columns = hyperedges • Partition • Minimize hyperedges cut (connectivity − 1 metric) • Cut is exactly the communication volume • Aykanat & Catalyurek (’96)
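A minimal sketch (not from the talk) of how the column-net hypergraph model prices a 1d row partition: each column j is a hyperedge over the rows with a nonzero in column j, and a hyperedge touching k parts costs k − 1.

```python
from collections import defaultdict

def comm_volume_1d(nonzeros, part_of_row):
    """Communication volume of a 1d row distribution under the
    column-net hypergraph model (connectivity - 1 metric)."""
    parts = defaultdict(set)          # column j -> parts touching it
    for i, j in nonzeros:
        parts[j].add(part_of_row[i])
    return sum(len(s) - 1 for s in parts.values())
```

A column owned entirely by one part costs nothing; a column split over k parts forces x_j to be sent to k − 1 extra processors, which is why this count is exact rather than an approximation.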

  10. 2D Partitioning Methods • Reduces communication volume further • Many variations • 2D Cartesian (checkerboard) • First partition rows, then partition columns via multiconstraint hypergraph partitioning • Catalyurek & Aykanat • 2D Mondriaan • Recursive hypergraph bisection • Bisseling & Vastenhouw • (Sparse matrix partitioning figure courtesy of Rob Bisseling)

  11. Fine-Grain Matrix Partitioning • Fine-grain partitioning • Assign each nonzero in matrix separately • Ultimate flexibility • Fine-grain hypergraph model • Catalyurek & Aykanat (2001) • Each nonzero is a vertex • Each row and column is a hyperedge • Exact model for comm. volume
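In the same spirit, the exact communication volume of a fine-grain (per-nonzero) assignment can be computed by treating every row and every column as a net. This is an illustrative sketch, with a hypothetical map `part_of_nz` from nonzero to part:

```python
from collections import defaultdict

def fine_grain_volume(nonzeros, part_of_nz):
    """Exact comm volume of a per-nonzero assignment: the row-net
    for i (fold phase) and the column-net for j (expand phase)
    each cost (number of parts touching them) - 1."""
    row_parts, col_parts = defaultdict(set), defaultdict(set)
    for (i, j) in nonzeros:
        p = part_of_nz[(i, j)]
        row_parts[i].add(p)
        col_parts[j].add(p)
    return (sum(len(s) - 1 for s in row_parts.values()) +
            sum(len(s) - 1 for s in col_parts.values()))
```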

  12. Bipartite Fine-Grain Model • Bipartite graph model • Partition both • vertices (rows, columns) • edges (nonzeros) • Minimize #vertices with incident edge in other partition

  13. Constrained Matrix Partitioning • Given A, x, y • where x and y already have a prescribed parallel distribution • Find optimal parallel distribution of A • Application: Iterative solver for Ax=b • Preconditioner code requires specific layout of the vectors • We are free to choose distribution for A, which is only used for matrix-vector multiply

  14. Bisection • First consider bisection (p=2) • Suppose w.l.o.g. A has the 2×2 block structure to the right • Can always permute to get this form • We only need to solve for the two off-diagonal blocks • Independent subproblems
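The decoupling can be made concrete in a short sketch (illustrative; `part_x` and `part_y` stand for the prescribed bisections of the vectors): nonzeros in the diagonal blocks can go to their own processor at no communication cost, and only the two off-diagonal blocks remain as independent subproblems.

```python
def split_bisection(nonzeros, part_x, part_y):
    """Split the nonzeros of A by the fixed bisections of the
    vectors: block (p, q) holds the A_ij with y_i on part p and
    x_j on part q. Diagonal blocks are assigned to their own
    processor for free; the two off-diagonal blocks are
    independent partitioning subproblems."""
    blocks = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
    for (i, j) in nonzeros:
        blocks[(part_y[i], part_x[j])].append((i, j))
    diagonal = (blocks[(0, 0)], blocks[(1, 1)])      # free to place
    subproblems = (blocks[(0, 1)], blocks[(1, 0)])   # solve independently
    return diagonal, subproblems
```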

  15. Hypergraph with Fixed Vertices Add constraints to the fine-grain hypergraph: • Each row-net contains a special blue vertex • Each column-net contains a special red vertex • These special vertices are fixed; they cannot change partition • Partitioning with fixed vertices is available in the PaToH and Zoltan packages

  16. Cover Nonzeros by Rows/Columns [Figure: three covers of the same nonzero pattern — columns only (4 needed), rows only (4 needed), and a mix of both (3 needed).]

  17. Graph Model: Vertex Cover [Figure: nonzero pattern of a small matrix; each nonzero is an edge in the bipartite row–column graph, so covering all nonzeros by rows and columns is a vertex cover problem.]

  18. Bipartite Matching König’s Theorem: Let G be a bipartite graph, let C be a minimum vertex cover for G, and let M be a maximum-cardinality matching. Then |C| = |M|. Polynomial time algorithm for bipartite graphs.
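A self-contained sketch of this matching-based approach (Kuhn's augmenting-path algorithm plus König's cover construction; illustrative only, not the Dobrian–Halappanavar–Pothen matching code used in the talk):

```python
def max_matching(adj, n_left, n_right):
    """Maximum bipartite matching via augmenting paths (Kuhn's
    algorithm). adj[u] lists the right vertices adjacent to left u."""
    match_r = [-1] * n_right           # right vertex -> matched left vertex

    def try_augment(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if match_r[v] == -1 or try_augment(match_r[v], seen):
                    match_r[v] = u
                    return True
        return False

    size = sum(try_augment(u, set()) for u in range(n_left))
    return size, match_r

def min_vertex_cover(adj, n_left, n_right):
    """Minimum vertex cover from a maximum matching (Koenig)."""
    size, match_r = max_matching(adj, n_left, n_right)
    match_l = [-1] * n_left
    for v, u in enumerate(match_r):
        if u != -1:
            match_l[u] = v
    # Alternating traversal from unmatched left vertices:
    # non-matching edges left->right, matching edges right->left.
    visited_l, visited_r = set(), set()
    stack = [u for u in range(n_left) if match_l[u] == -1]
    visited_l.update(stack)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in visited_r:
                visited_r.add(v)
                w = match_r[v]
                if w != -1 and w not in visited_l:
                    visited_l.add(w)
                    stack.append(w)
    # Cover = (unvisited left) + (visited right); Koenig: |C| = |M|.
    cover = ([('row', u) for u in range(n_left) if u not in visited_l] +
             [('col', v) for v in sorted(visited_r)])
    assert len(cover) == size
    return cover
```

Interpreting rows as left vertices, columns as right vertices, and nonzeros as edges, the returned cover is a minimum set of rows/columns covering every nonzero, which is the object slide 16 counts.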

  19. K-way Generalization • Partition into k sets • Decouples into k(k-1) independent subproblems

  20. Numerical Results • Bisection (k=2), split using the natural order • Matching code provided by Dobrian, Halappanavar, and Pothen

  21. Conclusions • Two new algorithms for fine-grain matrix partitioning with constraints • Empirical results show similar quality • VC/matching usually faster than partitioning • More experiments needed • Hypergraph partitioning: • Exact model, heuristic solver • Vertex cover via matching • Exact algorithm, but no balance constraint • K-way partitioning decouples, fully parallel
