
Approaching optimality [FOCS10]: An O(m log^2 n) algorithm for solving SDD systems

Yiannis Koutis, University of Puerto Rico, Rio Piedras; Gary Miller and Richard Peng, CMU



Presentation Transcript


  1. Approaching optimality [FOCS10]: An O(m log^2 n) algorithm for solving SDD systems
  Yiannis Koutis, University of Puerto Rico, Rio Piedras; Gary Miller and Richard Peng, CMU

  2. Optimality achieved… probably: An O(m log n) algorithm for solving SDD systems
  Yiannis Koutis, University of Puerto Rico, Rio Piedras; Gary Miller and Richard Peng, CMU

  3. Our motivation: very large sparse linear systems
  Ax = b: dimension n, m non-zero entries
  • Lower bound: O(m)
  • Matrix inversion: O(n^3)
  • Symmetric positive definite matrices (conjugate gradient): O(mn)
  • Planar positive definite matrices: O(n^1.5) [LRT79]
  • Planar non-singular matrices: O(n^1.5) [AY10]
  Many open problems!

  4. Laplacian and SDD matrices
  • Laplacian: symmetric, negative off-diagonals, zero row-sums.
  • SDD matrix: add a positive diagonal to a Laplacian, and possibly flip some off-diagonal signs, keeping symmetry.
  [Figure: an example weighted graph with edge weights 30, 20, 15, 2, 1, 1.]
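
  A minimal sketch in Python of how a weighted graph becomes a Laplacian, with a check of the SDD property; the assignment of the figure's weights to particular edges is a guess:

```python
import numpy as np

def laplacian(n, weighted_edges):
    """Graph Laplacian: weighted degrees on the diagonal,
    negated edge weights off the diagonal."""
    L = np.zeros((n, n))
    for u, v, w in weighted_edges:
        L[u, u] += w
        L[v, v] += w
        L[u, v] -= w
        L[v, u] -= w
    return L

# The figure's weights 30, 20, 15, 2, 1, 1; the edge assignment is illustrative.
edges = [(0, 1, 30), (1, 2, 20), (2, 3, 15), (0, 2, 2), (0, 3, 1), (1, 3, 1)]
L = laplacian(4, edges)
assert np.allclose(L.sum(axis=1), 0)                    # zero row sums
# Diagonal dominance (SDD): L_ii >= sum of |off-diagonal entries| in row i.
assert all(2 * L[i, i] >= np.abs(L[i]).sum() for i in range(4))
```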

  5. Our motivation: very large sparse SDD linear systems
  • Spielman and Teng 04: general SDD linear systems can be solved in time O(m log^25 n); planar systems can be solved in time O(m log^2 n).
  • KM07: planar SDD systems can be solved in time O(m).
  • KMP10-11: SDD linear systems in time O(m log n) (up to lower-order terms, error, and probability of failure).
  A powerful algorithmic primitive.

  6. The Laplacian paradigm (or, why care about SDD systems?)
  • Classical scientific computing has dealt with SDD systems since the 70s; certain systems were known to be solvable in linear time.
  • Vaidya submitted a paper to FOCS or STOC in 1990-91. It was rejected.
  • Vaidya then had a different idea.
  • CASI is now the main solver provider to … (market cap $3.45B).

  7. The Laplacian paradigm (or, why care about SDD systems?)
  The solver is the main routine in the fastest known algorithms for:
  • Approximate maximum flow and minimum cut (Christiano, Kelner, Madry, Spielman, Teng 2010)
  • Computation of a few fundamental eigenvectors: spectral methods for image segmentation (Shi and Malik 00; Miller and Tolliver 08), analysis of protein structures (Liu, Eyal, Bahar 08)
  • Solving elliptic finite element systems (Boman, Hendrickson, Vavasis 04; Avron et al. 2009)
  • Generating a random spanning tree in a graph (Madry and Kelner 09)
  • Max flow and generalized lossy flow problems (Daitch and Spielman 08)

  8. Things to take home #1
  • A probably optimal and potentially practical linear system solver
  • A central component in several algorithms
  • Solving is almost as easy as sorting
  • Think about re-formulating your problem to include an SDD system

  9. Preconditioning
  The solver is based on preconditioning: to solve Ax = b, iterate using a second matrix B. Two contradictory goals:
  • Systems in B must be 'simpler' (cheaper to solve) than systems in A
  • The condition number κ(A, B) must be small
  The rate of convergence depends on κ(A, B), the ratio of the extreme generalized eigenvalues of the pair (A, B).
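
  As a concrete, if simplistic, illustration of why the two goals fight each other, here is a sketch of preconditioned Richardson iteration; `solve_B` is a hypothetical black box for solving systems in B, and real solvers use preconditioned Chebyshev or CG, whose iteration counts scale with √κ(A, B):

```python
import numpy as np

def preconditioned_richardson(A, b, solve_B, alpha=1.0, iters=100):
    """x <- x + alpha * B^{-1}(b - A x).  Each step costs one multiply by A
    plus one solve in B, so B should be cheap to solve; the number of steps
    needed grows with kappa(A, B), so B should also approximate A well."""
    x = np.zeros_like(b)
    for _ in range(iters):
        x = x + alpha * solve_B(b - A @ x)
    return x

# Toy example: B = diagonal of A (very cheap, but a weak approximation).
A = np.array([[4.0, -1.0], [-1.0, 3.0]])
b = np.array([1.0, 2.0])
x = preconditioned_richardson(A, b, solve_B=lambda r: r / np.diag(A), alpha=0.9)
assert np.allclose(A @ x, b, atol=1e-6)
```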

  10. Recursive preconditioning
  • Start with matrix A1
  • Compute a preconditioner B1
  • Greedily factorize B1 (partial Cholesky)
  • The iteration needs to solve systems in B1, which the factorization via L reduces to systems in a smaller matrix A2
  • Recurse on A2: solve it with a preconditioned method of its own
  Preconditioning chain: A1, B1, A2, B2, A3, …
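
  A structural sketch of how such a chain is used (the names and the `reduce`/`lift` interface are illustrative stand-ins, not the paper's actual routines): each level runs a few preconditioned iterations, and every application of B_i^{-1} drops to the next level.

```python
import numpy as np

ITERS_PER_LEVEL = 10   # illustrative; the paper tunes this via kappa(A_i, B_i)

def solve_chain(level, b, chain):
    """chain[i] = (A_i, reduce_i, lift_i).  reduce_i and lift_i stand in for
    the greedy partial Cholesky factorization B_i = L [[I, 0], [0, A_{i+1}]] L^T:
    they map a residual for B_i down to a right-hand side for A_{i+1} and
    lift the answer back up."""
    A, reduce_i, lift_i = chain[level]
    if level == len(chain) - 1:
        return np.linalg.solve(A, b)          # bottom of the chain: solve directly
    x = np.zeros_like(b)
    for _ in range(ITERS_PER_LEVEL):          # preconditioned iteration on A_i
        r = b - A @ x
        x = x + lift_i(solve_chain(level + 1, reduce_i(r), chain))
    return x
```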

  11. Things to take home #2
  • A good 'two-level' preconditioning algorithm implies a fast solver
  • The preconditioner B must satisfy (up to small constants): κ(A, B) ≤ k, while B has only about n + m/k non-zero entries
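
  A back-of-the-envelope recurrence, reconstructed here rather than taken from the slide, for why this suffices:

```latex
% Assume \kappa(A, B) \le k, and that partial Cholesky reduces solves in B
% to solves in a matrix with m' \approx n + m/k nonzeros.  Preconditioned
% Chebyshev needs O(\sqrt{k}) iterations, each doing O(m) work plus one
% solve in B, giving the work recurrence
W(m) = O(\sqrt{k}) \cdot \bigl( m + W(m') \bigr), \qquad m' \approx n + \frac{m}{k}.
% The problem size falls by a factor of k per level while only O(\sqrt{k})
% extra work is incurred, so the levels sum as a geometric series and
% W(m) = O(m \sqrt{k}).
```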

  12. FORGET SDD MATRICES. FOCUS ON LAPLACIANS AND GRAPHS.
  (SDD systems can be transformed into Laplacian systems.)
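
  One standard such transformation is Gremban's double-cover construction; the sketch below is my illustration of it, not a routine from the talk:

```python
import numpy as np

def gremban_reduction(A, b):
    """Turn the SDD system A x = b into an equivalent system whose matrix is
    symmetric with non-positive off-diagonals (a Laplacian plus a non-negative
    diagonal).  Solve Ahat [y1; y2] = [b; -b]; then x = (y1 - y2) / 2."""
    D = np.diag(np.diag(A))
    off = A - D
    P = np.maximum(off, 0)    # positive off-diagonal entries
    N = np.minimum(off, 0)    # negative off-diagonal entries
    Ahat = np.block([[D + N, -P], [-P, D + N]])
    bhat = np.concatenate([b, -b])
    return Ahat, bhat

# Usage: y = solve(Ahat, bhat); n = len(b); x = (y[:n] - y[n:]) / 2
```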

  13. Combinatorial preconditioning
  • 'Traditional' preconditioners were taken to be sub-matrices of the system.
  • Pravin Vaidya made the paradigm shift: the preconditioner of a graph must itself be a graph.
  (Support graph theory enables a deeper understanding of condition numbers.)

  14. Graphs as electrical networks
  • The edge weight corresponds to the conductance (capacity) of the wire.
  • The effective resistance between two nodes u and w is the voltage difference between u and w necessary to drive one unit of electric flow between them.
  • Rayleigh's Monotonicity Law: dropping edges can only increase the effective resistance between any u and w.
  [Figure: the example network with edge weights 30, 20, 15, 2, 1, 1.]
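
  A small sketch of the textbook formula R_eff(u, w) = (e_u − e_w)^T L^+ (e_u − e_w); the dense pseudoinverse is fine only for toy sizes, never at scale:

```python
import numpy as np

def effective_resistance(L, u, w):
    """Voltage gap needed to push one unit of current from u to w."""
    chi = np.zeros(L.shape[0])
    chi[u], chi[w] = 1.0, -1.0
    return chi @ np.linalg.pinv(L) @ chi    # L is singular: use the pseudoinverse

# Sanity check on a 3-node path with unit weights: R_eff(0, 2) = 1 + 1 = 2.
L = np.array([[1., -1., 0.], [-1., 2., -1.], [0., -1., 1.]])
assert np.isclose(effective_resistance(L, 0, 2), 2.0)
```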

  15. Graph sparsification (the heart of preconditioning)
  The sparsification conjecture (Spielman and Teng): for every graph A there is a sparse graph B such that κ(A, B) is bounded by a small constant.

  16. Graph sparsification (the heart of preconditioning)
  • Spielman and Teng's key theorem: for any Laplacian A, there is a Laplacian B with O(n log^a n) edges such that κ(A, B) < 2. Furthermore, B can be computed in nearly-linear O(n log^b n) time.
  • Spielman and Srivastava proved a stronger theorem: for any Laplacian A, there is a Laplacian B with O(n log n) edges such that κ(A, B) < 2. The graph B can be computed by solving O(log n) systems.

  17. Spectral sparsifiers with O(n log n) edges
  The algorithm: a simple sampling procedure.
  • Let p_e be a probability distribution over the edges
  • p_e is proportional to w_e R_e, where R_e is the effective resistance of e
  • Let t = Σ_e w_e R_e (it can be shown that t = n − 1)
  • Draw q = O(t log t) samples from p
  • Each time an edge e is picked, add e to B with weight w_e / (q p_e)
  The proof is based on a theorem of Rudelson and Vershynin, plus linear algebra.
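
  A sketch of this sampling loop (constants are illustrative; `resistances` is assumed given, which is exactly the circularity the next slides address):

```python
import numpy as np

def sample_sparsifier(edges, resistances, c=4, rng=np.random.default_rng(0)):
    """edges[i] = (u, v, w); resistances[i] = R_e.  Sample q = O(t log t)
    edges with p_e proportional to w_e * R_e, reweighting each picked copy
    by w_e / (q * p_e) so that B equals A in expectation."""
    scores = np.array([w * R for (_, _, w), R in zip(edges, resistances)])
    t = scores.sum()                          # = n - 1 for exact resistances
    p = scores / t
    q = int(np.ceil(c * t * np.log(max(t, 2.0))))
    B = {}
    for i in rng.choice(len(edges), size=q, p=p):
        u, v, w = edges[i]
        B[(u, v)] = B.get((u, v), 0.0) + w / (q * p[i])
    return B                                  # edge -> weight in the sparsifier
```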

  18. Things to take home #3
  • Focus on Laplacians/graphs
  • Precondition graphs by graphs
  • The key is graph sparsification
  • A very elegant sampling algorithm works
  • The sampling probabilities depend on the effective resistances, which themselves require solving a linear system

  19. Sampling using upper bounds
  (Input: graph A; output: graph B with κ(A, B) small.)
  • Let p_e be a probability distribution over the edges
  • p_e is proportional to w_e R'_e, where R'_e ≥ R_e is an upper bound on the effective resistance of e
  • Let t = Σ_e w_e R'_e (t must be as small as possible)
  • Draw q = O(t log t) samples from p
  • Each time an edge e is picked, add e to B with weight w_e / (q p_e)
  An equality in SS08 becomes an inequality, using R_e ≤ R'_e.

  20. Effective resistances hard to compute? Compute good bounds, fast.
  • Fix a spanning tree T
  • Let R_T(e) be the effective resistance between the endpoints of e if we "throw out" all edges not in T
  • Rayleigh's Monotonicity Law: R_T(e) ≥ R_e
  • For a fixed tree, all the R_T(e)'s can be computed in linear time
  • The product w_e · R_T(e) is the stretch of e by T
  • We want a tree that minimizes the total stretch
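
  A sketch of computing these tree bounds and stretches; it walks the tree per query for clarity, whereas the linear-time claim above relies on an offline LCA algorithm such as Tarjan's:

```python
def stretches(n, tree_edges, off_edges, root=0):
    """Stretch of each off-tree edge (u, v, w): w * R_T(u, v), where
    R_T(u, v) is the sum of resistances 1/w' along the tree path u -> v."""
    adj = {i: [] for i in range(n)}
    for u, v, w in tree_edges:
        adj[u].append((v, 1.0 / w))
        adj[v].append((u, 1.0 / w))
    parent = {root: None}                     # parent[v] = (node, edge resistance)
    stack = [root]
    while stack:                              # root the tree
        u = stack.pop()
        for v, r in adj[u]:
            if v not in parent:
                parent[v] = (u, r)
                stack.append(v)

    def path_resistance(u, v):
        dist_u = {}                           # cumulative resistance u -> ancestor
        x, d = u, 0.0
        while x is not None:
            dist_u[x] = d
            x, d = (parent[x][0], d + parent[x][1]) if parent[x] else (None, d)
        x, d = v, 0.0
        while x not in dist_u:                # climb from v to the lowest common ancestor
            x, d = parent[x][0], d + parent[x][1]
        return d + dist_u[x]

    return [w * path_resistance(u, v) for u, v, w in off_edges]
```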

  21. Low-stretch tree: the best tree preconditioner
  • Vaidya proposed the maximum spanning tree (MST), which gave the first non-trivial bound on the condition number
  • Boman and Hendrickson pinned down the right catalytic idea: a low-stretch tree
  For an edge e of weight w_e in A, its stretch over T is stretch_T(e) = w_e · R_T(e), where R_T(e) is the effective resistance of the unique path in T between e's endpoints; the condition number of T and A satisfies κ(A, T) ≤ Σ_e stretch_T(e).

  22. Spectral sparsifier with n + m/k edges?
  • It looks like the number of samples is O(m log^2 n), despite the availability of a low-stretch tree
  • However, we can 'make' a related graph with a tree of even lower total stretch. How?
  • Scale up the edges of the tree by a factor of k > 1
  • This decreases the total stretch, and hence the number of off-tree samples, by a factor of k, but incurs a condition number of k

  23. Scaling and sampling proportionally to stretch
  (Input: graph A; output: graph B with κ(A, B) = O(k).)
  • Let A' = A + k copies of the low-stretch tree T
  • p_e is proportional to w_e R_kT(e), where R_kT(e) is the effective resistance between the endpoints of e in kT
  • Let t = Σ_e w_e R_kT(e) (t is the total stretch ≈ n + m log n / k)
  • Draw q = O(t log t) samples from p
  • Each time an edge e is picked, add e to B with weight w_e / (q p_e)
  The algorithm over-samples edges in T and samples about m log^2 n / k off-tree edges.
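
  Putting the last few slides together, a sketch of the scale-and-sample step (constants illustrative; `off_stretch` over the unscaled tree is assumed precomputed as on slide 20):

```python
import numpy as np

def incremental_sparsify(tree_edges, off_edges, off_stretch, k,
                         c=4, rng=np.random.default_rng(0)):
    """Keep the low-stretch tree T with its weights scaled up by k, and
    sample off-tree edges with probability proportional to their stretch
    over kT -- i.e., their original stretch divided by k."""
    scores = np.asarray(off_stretch) / k      # stretch over the scaled tree kT
    t = scores.sum()                          # ~ (m log n) / k for a low-stretch tree
    p = scores / t
    q = int(np.ceil(c * t * np.log(max(t, 2.0))))
    B = {(u, v): k * w for u, v, w in tree_edges}   # scaled tree kept wholesale
    for i in rng.choice(len(off_edges), size=q, p=p):
        u, v, w = off_edges[i]
        B[(u, v)] = B.get((u, v), 0.0) + w / (q * p[i])
    return B
```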

  24. Theorem: for each graph A with n nodes and m edges, there is an incremental sparsifier B that O(k)-approximates A and has n + m log^2 n / k edges. B can be computed in time O(m log n).
  The solver follows by taking k = log^4 n.

  25. Overview
  • The incremental sparsifier is computed by: computing a low-stretch tree, scaling up the low-stretch tree, and sampling with probabilities proportional to stretch
  • Incremental sparsification is applied iteratively, interleaved with greedy contractions, to produce a preconditioning chain
  • The preconditioning chain is used with a standard recursive iterative method to solve the system

  26. O(m log n) solver?
  • The O(m log^2 n) solver computes a low-stretch tree for every graph in the preconditioning chain.
  • The O(m log n) solver computes a low-stretch tree only for the top (input) graph; the same tree can be kept for the whole chain.
  • There are some complications: in the usual chain, the number of edges goes down by a factor of at least O(log^2 n) between consecutive graphs, and stagnation can occur in the new construction. But it all works out.

  27. Open questions
  • Parallelization? O(m log^c n) work in O(n^1/3) time [SPAA11]; ideally O(m log n) work in O(log n) time
  • Practical implementations: a practical low-stretch tree, a practical sparsifier
  • Is it possible to compute a sparsifier with O(n log n) edges more efficiently than by solving systems?

  28. Thank you!
