Approaching optimality [FOCS10]: An O(m log^2 n) algorithm for solving SDD systems. Yiannis Koutis, U of Puerto Rico, Rio Piedras; Gary Miller, Richard Peng, CMU
Optimality achieved… probably: An O(m log n) algorithm for solving SDD systems
Our motivation: very large sparse linear systems Ax = b of dimension n with m non-zero entries. Lower bound: Ω(m). Matrix inversion: O(n^ω). Symmetric positive definite matrices: O(mn). Planar positive definite matrices: O(n^1.5) [LRT79]. Planar non-singular matrices: O(n^1.5) [AY10]. Many open problems!
Laplacian and SDD matrices. Laplacian: symmetric, non-positive off-diagonals, zero row-sums. SDD matrix: add a positive diagonal to a Laplacian and flip off-diagonal signs, keeping symmetry. [Figure: example weighted graph with edge weights 30, 2, 1, 1, 20, 15]
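The two definitions above can be checked on a toy example. The following is a minimal sketch: the 4-node graph, its topology, and the helper names `laplacian` and `is_sdd` are illustrative assumptions, not taken from the slides.

```python
def laplacian(n, edges):
    """Return the n x n graph Laplacian as a list of rows.

    edges: iterable of (u, v, w) with 0 <= u, v < n, u != v, w > 0.
    """
    L = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L


def is_sdd(M):
    """Check symmetry and diagonal dominance: M[i][i] >= sum over j != i of |M[i][j]|."""
    n = len(M)
    for i in range(n):
        off = sum(abs(M[i][j]) for j in range(n) if j != i)
        if M[i][i] < off:
            return False
        if any(M[i][j] != M[j][i] for j in range(n)):
            return False
    return True


# Hypothetical 4-node graph; the topology and weights are illustrative only.
edges = [(0, 1, 30.0), (1, 2, 2.0), (2, 3, 1.0), (0, 3, 20.0), (0, 2, 15.0)]
L = laplacian(4, edges)

# Build an SDD matrix from the Laplacian: add a positive diagonal entry
# and flip the sign of one symmetric off-diagonal pair.
M = [row[:] for row in L]
M[0][0] += 5.0
M[1][2], M[2][1] = -M[1][2], -M[2][1]
```

Here `L` has zero row sums, while `M` no longer does but remains symmetric and diagonally dominant.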
Our motivation: very large sparse SDD linear systems. Spielman and Teng 04: general SDD linear systems can be solved in time O(m log^25 n); planar systems can be solved in time O(m log^2 n). KM07: planar SDD systems can be solved in time O(m). KMP10-11: SDD linear systems can be solved in time O(m log n) (up to lower-order terms, error, and probability of failure). A powerful algorithmic primitive.
The Laplacian paradigm (or, why care for SDD systems?). Classical scientific computing has dealt with SDD systems since the 70s; certain systems were known to be solvable in linear time. Vaidya sent a paper to FOCS or STOC in 90-91: REJECT. Vaidya then had a different idea: CASI is now the main solver provider to (market cap 3.45B).
The Laplacian paradigm (or, why care for SDD systems?). The solver is the main routine in the fastest known algorithms for: approximate maximum flow and minimum cut (Christiano, Kelner, Madry, Spielman, Teng 2010); computation of a few fundamental eigenvectors, as in spectral methods for image segmentation (Shi and Malik 00, Miller and Tolliver 08) and analysis of protein structures (Liu, Eyal, Bahar 08); solving elliptic finite element systems (Boman, Hendrickson, Vavasis 04; Avron et al. 2009); generating a random spanning tree in a graph (Madry and Kelner 09); max flow and generalized lossy flow problems (Daitch and Spielman 08).
Things to take home #1: A probably optimal and potentially practical linear system solver. A central component in several algorithms. Solving is almost as easy as sorting. Think about re-formulating your problem to include an SDD system.
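Preconditioning. The solver is based on preconditioning: the system in A is solved by an iteration that repeatedly solves systems in a preconditioner B. Two contradictory goals: systems with B must be ‘simpler’ than systems with A, and the condition number κ(A,B) must be small. The rate of convergence depends on the condition number κ(A,B).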
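As a concrete instance of a preconditioned iteration, here is a minimal sketch of preconditioned conjugate gradient in plain Python. The choice of conjugate gradient, the diagonal (Jacobi) preconditioner, and the test matrix are assumptions made for illustration; the slides do not say which iterative method or preconditioner is used at this point.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pcg(A, b, solve_B, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for symmetric positive definite A.

    solve_B(r) must return B^{-1} r for the preconditioner B.
    Returns (x, number_of_iterations).
    """
    x = [0.0] * len(b)
    r = b[:]                      # residual b - A x for x = 0
    z = solve_B(r)                # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = solve_B(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Illustrative SDD system: weighted path Laplacian plus the identity.
n = 20
A = [[0.0] * n for _ in range(n)]
for i in range(n - 1):
    w = float(i + 1)
    A[i][i] += w
    A[i + 1][i + 1] += w
    A[i][i + 1] -= w
    A[i + 1][i] -= w
for i in range(n):
    A[i][i] += 1.0
b = [1.0] * n

# Jacobi preconditioner: B is the diagonal of A, so B^{-1} r is a division.
diag = [A[i][i] for i in range(n)]
x, iters = pcg(A, b, lambda r: [ri / d for ri, d in zip(r, diag)])
```

Because the Laplacian part annihilates the all-ones vector, the exact solution of this particular system is the all-ones vector, which makes the result easy to check. A diagonal B is trivially ‘simple’; the point of the talk is to find a B that keeps the condition number small as well.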
Recursive Preconditioning. Start with matrix A1. Compute preconditioner B1. Greedy-factorize B1: the iteration needs to solve a system in B1, which can be transformed via L to A2. Use recursion on A2: solve it with a preconditioned method. Preconditioning chain: A1, B1, A2, B2, A3, ...
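The step from B1 to A2 can be illustrated on graphs. The sketch below assumes that greedy factorization means repeatedly eliminating vertices of degree 1 and degree 2, which for a Laplacian amounts to removing a pendant vertex, or replacing two edges in series (weights w1, w2) by one edge of weight w1*w2/(w1+w2). That reading, the function name `greedy_eliminate`, and the example graph are assumptions; the sketch also omits the back-substitution needed to recover the eliminated unknowns.

```python
def greedy_eliminate(adj):
    """Eliminate degree-1 and degree-2 vertices until none remain.

    adj: dict vertex -> dict neighbor -> weight (symmetric).
    Returns a reduced copy; the input is not modified.
    """
    g = {u: dict(nbrs) for u, nbrs in adj.items()}
    queue = [u for u in g if len(g[u]) <= 2]
    while queue:
        v = queue.pop()
        if v not in g or len(g[v]) not in (1, 2):
            continue
        nbrs = list(g[v].items())
        del g[v]
        for a, _ in nbrs:
            del g[a][v]
        if len(nbrs) == 2:
            # Schur complement of a degree-2 vertex: series edge weight.
            (a, wa), (b, wb) = nbrs
            w_new = wa * wb / (wa + wb)
            g[a][b] = g[a].get(b, 0.0) + w_new
            g[b][a] = g[b].get(a, 0.0) + w_new
        # Neighbors may have dropped to degree <= 2; re-examine them.
        queue.extend(a for a, _ in nbrs)
    return g


def build_adj(edges):
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


# Hypothetical example: K4 on {0,1,2,3} with edge (0,1) subdivided by
# vertex 4, plus a pendant path 3-5-6.
edges = [(0, 2, 1.0), (0, 3, 1.0), (1, 2, 1.0), (1, 3, 1.0), (2, 3, 1.0),
         (0, 4, 2.0), (4, 1, 2.0), (3, 5, 1.0), (5, 6, 1.0)]
reduced = greedy_eliminate(build_adj(edges))
```

In this example the pendant path disappears and the subdivided edge collapses to a single edge of weight 1, leaving a unit-weight K4 in which every vertex has degree 3.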
Things to take home #2 A good ‘two-level’ preconditioning algorithm implies a fast solver The preconditioner B must satisfy: (up to small constants)
FORGET SDD MATRICES. FOCUS ON LAPLACIANS AND GRAPHS (SDD systems can be transformed to Laplacian systems)
Combinatorial Preconditioning. ‘Traditional’ preconditioners were taken to be sub-matrices of the system. Pravin Vaidya made the paradigm shift: the preconditioner of a graph must be a graph (support graph theory enables a deeper understanding of condition numbers).
Graphs as electrical networks. The edge weight corresponds to the conductance of the wire (the inverse of its resistance). The effective resistance between two nodes u and w is equal to the voltage difference between u and w that is necessary to drive one unit of electric flow between them. Rayleigh’s Monotonicity Law: dropping edges can only increase the effective resistance between any u and w. [Figure: example weighted graph with edge weights 30, 2, 1, 1, 20, 15]
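The definition of effective resistance translates directly into a Laplacian solve. The following is a minimal sketch, assuming a connected graph and using dense Gaussian elimination purely for illustration (helper names are hypothetical); it also checks Rayleigh's law on a unit-weight triangle.

```python
def solve_dense(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = A[i][n] - sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / A[i][i]
    return x


def effective_resistance(n, edges, u, v):
    """Voltage difference between u and v when one unit of current
    enters at u and leaves at v. Assumes the graph is connected."""
    L = [[0.0] * n for _ in range(n)]
    for a, b, w in edges:
        L[a][a] += w
        L[b][b] += w
        L[a][b] -= w
        L[b][a] -= w
    rhs = [0.0] * n
    rhs[u] += 1.0
    rhs[v] -= 1.0
    # Ground node n-1 (voltage 0) so the reduced system is nonsingular.
    reduced = [row[:n - 1] for row in L[:n - 1]]
    volts = solve_dense(reduced, rhs[:n - 1]) + [0.0]
    return volts[u] - volts[v]


triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
r_full = effective_resistance(3, triangle, 0, 1)

# Drop the direct edge (0, 1): only the two-edge path 0-2-1 remains.
r_dropped = effective_resistance(3, triangle[1:], 0, 1)
```

In the triangle, the direct unit resistor is in parallel with two unit resistors in series, giving 2/3; after dropping the direct edge the resistance rises to 2.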
Graph Sparsification: (the heart of preconditioning) The Sparsification Conjecture (Spielman and Teng) For every graph A there is a sparse graph B, such that:
Graph Sparsification (the heart of preconditioning). Spielman and Teng’s key theorem: for any Laplacian A, there is a Laplacian B with O(n log^a n) edges such that κ(A,B) < 2. Furthermore, B can be computed in nearly-linear O(m log^b n) time. Spielman and Srivastava proved a stronger theorem: for any Laplacian A, there is a Laplacian B with O(n log n) edges such that κ(A,B) < 2. The graph B can be computed by solving O(log n) systems.
Spectral sparsifiers with O(n log n) edges. The algorithm is a simple sampling procedure. Let p_e be a probability distribution over the edges, with p_e proportional to w_e R_e, where R_e is the effective resistance of e. Let t = Σ_e w_e R_e (*it can be shown that t = n-1*). Draw q = O(t log t) samples from p. Each time an edge e is picked, add e to B with weight w_e/(q p_e). Proof based on a theorem of Rudelson and Vershynin + linear algebra.
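The sampling procedure above can be sketched in a few lines. This is an illustration only: effective resistances are computed by dense elimination (which defeats the purpose at scale), the constant `c` standing in for the O() is a hypothetical choice, and the function names are not from the slides.

```python
import math
import random


def solve_dense(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    A = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = A[i][n] - sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / A[i][i]
    return x


def effective_resistance(n, edges, u, v):
    """Effective resistance between u and v in a connected graph."""
    L = [[0.0] * n for _ in range(n)]
    for a, b, w in edges:
        L[a][a] += w
        L[b][b] += w
        L[a][b] -= w
        L[b][a] -= w
    rhs = [0.0] * n
    rhs[u] += 1.0
    rhs[v] -= 1.0
    reduced = [row[:n - 1] for row in L[:n - 1]]   # ground node n-1
    volts = solve_dense(reduced, rhs[:n - 1]) + [0.0]
    return volts[u] - volts[v]


def sample_sparsifier(n, edges, c=4.0, seed=0):
    """Sample q edges with probability proportional to w_e * R_e.

    c is a hypothetical oversampling constant replacing the O().
    Returns (B, t, q), where B maps (u, v) to the sampled weight.
    """
    rng = random.Random(seed)
    scores = [w * effective_resistance(n, edges, u, v) for u, v, w in edges]
    t = sum(scores)
    probs = [s / t for s in scores]
    q = math.ceil(c * t * math.log(t))
    B = {}
    for i in rng.choices(range(len(edges)), weights=probs, k=q):
        u, v, w = edges[i]
        B[(u, v)] = B.get((u, v), 0.0) + w / (q * probs[i])
    return B, t, q


# Complete graph on 6 nodes with unit weights.
n = 6
edges = [(u, v, 1.0) for u in range(n) for v in range(u + 1, n)]
B, t, q = sample_sparsifier(n, edges)
```

On this example t evaluates to n-1 = 5 up to rounding, as the slide states. A graph this small is of course not made sparser by sampling; the sketch only shows the mechanics of the weights w_e/(q p_e).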
Things to take home #3: Focus on Laplacians/graphs. Precondition graphs by graphs. The key is graph sparsification. A very elegant sampling algorithm works. The sampling probabilities depend on the effective resistances, which require a linear system solution.
Sampling using upper bounds (Input: graph A; output: graph B with ). Let p_e be a probability distribution over the edges. Instead of p_e proportional to w_e R_e, where R_e is the effective resistance of e, take p_e proportional to w_e R'_e, where R'_e > R_e. Instead of t = Σ_e w_e R_e (*it can be shown that t = n-1*), let t = Σ_e w_e R'_e (*t must be as small as possible*). Draw q = O(t log t) samples from p. Each time an edge e is picked, add e to B with weight w_e/(q p_e). An equality in SS08 becomes an inequality, using R_e < R'_e.
Effective resistances hard to compute? Compute good and fast bounds. Fix a spanning tree T. Let R_T(e) be the effective resistance of the edge e if we “throw out” all edges not in T. Rayleigh’s Monotonicity Law: R_T(e) ≥ R_e. For a fixed tree, the R_T(e)’s can be computed in linear time. The product R_T(e) * w_e is the stretch of e by T. We want to minimize the total stretch.
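Tree resistances and stretch are easy to compute by hand on small examples. The sketch below walks the tree path for each edge, which is simpler than the linear-time computation mentioned above but costs O(n) per edge; the function name and the example graph are assumptions.

```python
def tree_resistances(n, tree_edges, graph_edges):
    """Return R_T(e) for every e in graph_edges.

    R_T(e) is the sum of 1/w over the unique tree path joining the
    endpoints of e. Each query is a depth-first walk of the tree.
    """
    adj = {i: [] for i in range(n)}
    for u, v, w in tree_edges:
        adj[u].append((v, 1.0 / w))
        adj[v].append((u, 1.0 / w))

    def path_resistance(src, dst):
        stack = [(src, -1, 0.0)]
        while stack:
            node, parent, r = stack.pop()
            if node == dst:
                return r
            for nxt, res in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, r + res))
        raise ValueError("tree does not connect the endpoints")

    return [path_resistance(u, v) for u, v, _ in graph_edges]


# Unit-weight 4-cycle; the spanning tree is the path 0-1-2-3.
graph = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
tree = graph[:3]

r_tree = tree_resistances(4, tree, graph)
stretch = [w * r for (_, _, w), r in zip(graph, r_tree)]
total_stretch = sum(stretch)
```

Tree edges have stretch 1, and the single off-tree edge (3, 0) is routed over three tree edges, so it has stretch 3. Its true effective resistance in the cycle is 3/4, consistent with R_T(e) ≥ R_e.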
Low-stretch tree: the best tree preconditioner. Vaidya proposed a maximum spanning tree (MST), which gave the first non-trivial bound on the condition number. Boman and Hendrickson pinned down the right catalytic idea: a low-stretch tree. [Formula labels: the stretch of e over T; the condition of T and A; an edge and its weight in A; the unique path in T and its effective resistance]
Spectral sparsifier with n + m/k edges? It looks like the number of samples is O(m log^2 n) despite the availability of a low-stretch tree. However, we can ‘make’ a related graph with a tree of even lower total stretch. How? Scale up the edges of the tree by a factor of k > 1. This decreases the total stretch and the number of off-tree samples by a factor of k, but incurs a condition number of k.
Scaling and sampling proportionally to stretch (Input: graph A; output: graph B with ). Let A' = A + k copies of the low-stretch tree T. p_e is proportional to w_e R_kT(e), where R_kT(e) is the effective resistance between the endpoints of e in kT. Let t = Σ_e w_e R_kT(e) (*t is the total stretch = n + m log n / k*). Draw q = O(t log t) samples from p. Each time an edge e is picked, add e to B with weight w_e/(q p_e). The algorithm over-samples edges in T and samples m log^2 n / k off-tree edges.
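The effect of scaling the tree can be seen on a toy graph. The sketch below makes several simplifying assumptions: tree edges are scaled by k (rather than adding k copies), tree and off-tree edges are sampled together from one distribution, the constant `c` replacing the O() is hypothetical, and the tree is simply given rather than computed as a low-stretch tree.

```python
import math
import random


def tree_path_resistance(n, tree_edges, src, dst):
    """Sum of 1/w along the unique tree path from src to dst."""
    adj = {i: [] for i in range(n)}
    for u, v, w in tree_edges:
        adj[u].append((v, 1.0 / w))
        adj[v].append((u, 1.0 / w))
    stack = [(src, -1, 0.0)]
    while stack:
        node, parent, r = stack.pop()
        if node == dst:
            return r
        for nxt, res in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, r + res))
    raise ValueError("tree does not connect the endpoints")


def incremental_sample(n, tree_edges, off_tree_edges, k, c=4.0, seed=0):
    """Scale the tree by k, then sample proportionally to stretch over kT.

    Returns (B, t, off_tree_stretch). c is a hypothetical constant.
    """
    rng = random.Random(seed)
    scaled_tree = [(u, v, k * w) for u, v, w in tree_edges]
    edges = scaled_tree + list(off_tree_edges)          # the graph A'
    stretch = [w * tree_path_resistance(n, scaled_tree, u, v)
               for u, v, w in edges]
    t = sum(stretch)
    probs = [s / t for s in stretch]
    q = math.ceil(c * t * math.log(t))
    B = {}
    for i in rng.choices(range(len(edges)), weights=probs, k=q):
        u, v, w = edges[i]
        B[(u, v)] = B.get((u, v), 0.0) + w / (q * probs[i])
    off_tree_stretch = sum(stretch[len(scaled_tree):])
    return B, t, off_tree_stretch


tree = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
off_tree = [(3, 0, 1.0), (0, 2, 1.0)]

_, t1, off1 = incremental_sample(4, tree, off_tree, k=1)
B4, t4, off4 = incremental_sample(4, tree, off_tree, k=4)
```

Each tree edge keeps stretch 1 after scaling, so the tree contributes n-1 to t, while the off-tree stretch drops from 5 to 5/4 when k = 4. This mirrors the n + (off-tree stretch)/k form of the total on the slide.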
Theorem: For each graph A with n nodes and m edges, there is an incremental sparsifier B that O(k)-approximates A and has n + m log^2 n / k edges. B can be computed in time O(m log n). The solver follows from k = log^4 n.
Overview. The incremental sparsifier is computed by: (1) computing a low-stretch tree; (2) scaling up the low-stretch tree; (3) sampling with probabilities proportional to stretch. Incremental sparsification is applied iteratively, interleaved with greedy contractions, to produce a preconditioning chain. The preconditioning chain is used with a standard recursive iterative method to solve the system.
O(m log n) solver? The O(m log^2 n) solver computes a low-stretch tree for every graph in the preconditioning chain. The O(m log n) solver computes a low-stretch tree only for the top (input) graph; the same tree can be kept for the whole chain. There are some complications: in the usual chain the number of edges goes down by a factor of at least O(log^2 n) between every two graphs in the chain, whereas stagnation can occur in the new construction. But it all works out.
Open Questions. Parallelization? O(m log^c n) work in O(n^(1/3)) time [spaa11]; ideally O(m log n) work in O(log n) time. Practical implementations: a practical low-stretch tree; a practical sparsifier. Is it possible to compute a sparsifier with O(n log n) edges more efficiently than by solving systems?
Thank you!