Richard Peng Georgia Tech Michael Cohen Jon Kelner John Peebles

Presentation transcript:

Almost-Linear-Time Algorithms for Markov Chains and New Spectral Primitives for Directed Graphs
Richard Peng (Georgia Tech), Michael Cohen, Jon Kelner, John Peebles, Aaron Sidford, Adrian Vladu, Anup B. Rao

Outline: undirected vs. directed graphs; Eulerian graphs; sparsification for iterative methods; full algorithm for $Lx = b$

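The Laplacian Paradigm
Graph Laplacians occur in: spectral graph theory, optimization, Markov chains
Example: vertices a, b, c with unit-weight edges a-b and a-c; with rows and columns ordered a, b, c:
$L = \begin{pmatrix} 2 & -1 & -1 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}$
Relation to solving $Lx = b$:
Directly related: elliptic systems
Few iterations: eigenvectors, heat kernels
Many iterations / modify algorithm: graph problems, image processing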

Directed vs. Undirected
Is it really the undirected Laplacian paradigm?
Problem | Undirected | Directed
Linear systems | $m \log^{1/2} n$ | $n^{\omega}$ (before this)
Approx. max flow | $m \log^{41} n$ | $m^{10/7}$
Approx. shortest path in parallel ($n^{o(1)}$ depth) | $m \log^{3} n$ | $m n^{1/2}$
Spectral clustering | Cheeger's inequality | Less clear, [Chung `05]

Pieces of the Laplacian Paradigm
Graph sparsifiers
Embeddings into trees
Iterative methods
Before: only 'counterexamples' for general directed graphs
Works on matrices

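Directed Laplacian
$L = D - A^T$, where $A$ is the adjacency matrix and $D = \mathrm{diag}(\text{out-degrees})$
Diagonal: out-degree
Off-diagonal: the $ij$ entry is $-w_{ji}$
Example: edges a→b (weight 1), a→c (weight 1), b→a (weight 1), c→b (weight 2); with rows and columns ordered a, b, c:
$L = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 1 & -2 \\ -1 & 0 & 2 \end{pmatrix}$
Alternate definition: zero column sums ($L^T \mathbf{1} = 0$) and non-positive off-diagonals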
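The definition $L = D - A^T$ can be checked numerically. The following is a minimal sketch, assuming a dense NumPy representation and a hypothetical helper `directed_laplacian` (not from the slides), with vertices a, b, c mapped to indices 0, 1, 2 and the edge list read off the example matrix.

```python
import numpy as np

def directed_laplacian(n, edges):
    """Return L = D - A^T for a weighted directed graph.

    edges is a list of (u, v, w) triples meaning an edge u -> v of weight w.
    The helper name and the edge-list format are illustrative assumptions.
    """
    A = np.zeros((n, n))
    for u, v, w in edges:
        A[u, v] += w                 # A[u, v] = weight of edge u -> v
    D = np.diag(A.sum(axis=1))       # out-degrees on the diagonal
    return D - A.T

# Slide example: a=0, b=1, c=2.
EDGES = [(0, 1, 1.0), (0, 2, 1.0), (1, 0, 1.0), (2, 1, 2.0)]
L = directed_laplacian(3, EDGES)
print(L)
print("column sums:", L.sum(axis=0))
```

The column sums come out as zero, which is the alternate definition on the slide; the row sums are generally nonzero unless the graph is Eulerian.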

Our Results
Input: $n \times n$ directed Laplacian $L$ with $m$ non-zeros and $\mathrm{poly}(n)$ condition number, and a vector $b$
Output: vector $x$ such that $\|x - L^{+} b\|_2 < \epsilon$
Runtime: $O(m \, 2^{O(\sqrt{\log n})} \log(1/\epsilon))$
Implications: similar running times for stationary distributions, hitting times, escape times, and commute time oracles for irreversible Markov chains
Key ideas: sparsification of directed graphs; preconditioned iterative methods for solving $Lx = b$

Outline: undirected vs. directed graphs; Eulerian graphs; sparsification for iterative methods; full algorithm for $Lx = b$

Graph Sparsification
Undirected:
Cut: $G$ and $H$ have similar cuts
Spectral: for all vectors $x$, $x^T L_G x \approx x^T L_H x$
'Counterexample' for cut sparsification of directed graphs: the complete bipartite graph, where each edge can be isolated by a cut

Algebraic Issue
If $L_G \approx L_H$, they must have the same null space
Undirected case: the all-1s vector (example: $x_a = x_b = x_c = 1$)
Null space of directed $L$: characterized by the Perron-Frobenius theorem as $D^{-1} \times$ stationary distribution; it can differ between graphs (example: $x_a = 2$, $x_b = 4$, $x_c = 1$)

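Stationary Distribution
Null space of $L$: $Lx = 0$
Since $L = D - A^T$, this is $Dx = A^T x$
$s = Dx$ is the stationary distribution of the random walk on $A$: $A^T D^{-1} s = s$
Example (rows and columns ordered a, b, c): for $L = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 1 & -2 \\ -1 & 0 & 2 \end{pmatrix}$, $s_a = 0.4$, $s_b = 0.4$, $s_c = 0.2$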
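The relation between the null space of $L$ and the stationary distribution can be reproduced on the three-vertex example. This is a small sketch, assuming a dense matrix and a one-dimensional null space (strongly connected graph); the function name `stationary_distribution` is hypothetical, and the null vector is taken from an SVD rather than from any method described in the talk.

```python
import numpy as np

# Example directed Laplacian (rows/columns ordered a, b, c).
L = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  1.0, -2.0],
              [-1.0,  0.0,  2.0]])

def stationary_distribution(L):
    """Return s = D x normalized to sum to 1, where L x = 0.

    Assumes the null space of L is one-dimensional.
    """
    D = np.diag(np.diag(L))          # out-degrees
    _, _, Vh = np.linalg.svd(L)
    x = Vh[-1]                       # right singular vector of the smallest singular value
    s = D @ x
    return s / s.sum()               # normalization also fixes the sign

s = stationary_distribution(L)
D = np.diag(np.diag(L))
A_T = D - L                          # since L = D - A^T
print("s =", s)
print("A^T D^{-1} s =", A_T @ np.linalg.inv(D) @ s)
```

A dense SVD costs cubic time, so this only illustrates the definitions; it is not the almost-linear-time procedure the talk is about.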

Eulerian Laplacians
In-degree = out-degree: $A^T \mathbf{1} = d_{in} = d_{out} = A \mathbf{1}$
Null space: the all-1s vector
$U_L = \frac{1}{2}(L + L^T)$ is an undirected Laplacian
Cuts: same amount in each direction
Goal: reduce strongly connected Laplacians to Eulerian Laplacians, and only work with Eulerian graphs

Eulerian Rescaling
$s$: stationary distribution, $A^T D^{-1} s = s$
Set $x = D^{-1} s$, and rescale $L$ to $L\,\mathrm{diag}(x)$
Example: $x_a = 0.2$, $x_b = 0.4$, $x_c = 0.1$, giving rescaled edge weights 0.2, 0.4, 0.2, 0.2
$L\,\mathrm{diag}(x)\,\mathbf{1} = Lx = 0$: Eulerian!
[CKPPSV`16]: rescale by solving polylog many Eulerian Laplacian systems; will only work with Eulerian Laplacians from this point on
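The rescaling step can be verified on the running example. The sketch below assumes the stationary distribution is already known (here it is hard-coded from the example values) and uses a hypothetical helper `eulerian_rescaling`; it only checks the algebra and says nothing about how [CKPPSV`16] computes $s$.

```python
import numpy as np

# Example directed Laplacian (rows/columns ordered a, b, c).
L = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  1.0, -2.0],
              [-1.0,  0.0,  2.0]])
s = np.array([0.4, 0.4, 0.2])        # stationary distribution, taken as given

def eulerian_rescaling(L, s):
    """Return (L diag(x), x) with x = D^{-1} s."""
    x = s / np.diag(L)               # divide by out-degrees
    return L @ np.diag(x), x

L_eul, x = eulerian_rescaling(L, s)
row_sums = L_eul.sum(axis=1)         # L diag(x) 1 = L x
col_sums = L_eul.sum(axis=0)         # 1^T L diag(x)
print("x =", x)
print("row sums:", row_sums, "column sums:", col_sums)
```

Both the row sums and the column sums vanish, so weighted in-degree equals weighted out-degree at every vertex of the rescaled graph.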

Previous Works on Eulerian Graphs
[Chung `05]: spectral gap of $(L + L^T)$ related to its expansion
[EMPS`16]: cut sparsifiers and approximate maximum flow on balanced graphs
Can sparsify Eulerian case
Eulerian case = general case
Mincut; $Lx = b$
Solve more than Eulerian graphs? Incorporate into spectral algorithms?

Outline: undirected vs. directed graphs; Eulerian graphs; sparsification for iterative methods; full algorithm for $Lx = b$

Preconditioned Iterative Methods
Need: an algebraic definition of $\approx$ that interacts well with iterative methods
Solve linear systems $L_G x = b$ by solving a sequence of problems in $L_H \approx L_G$:
$x' \leftarrow x + L_H^{-1}(b - L_G x)$
Can check: if $G = H$, done in 1 step; the fixed point is $L_G^{-1} b$
'Driver' for nearly-linear time algorithms for: solving linear systems, approximate maximum flow
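The update rule above can be tried on a toy instance. This is a minimal sketch under several assumptions: the graphs are small undirected triangles (so the pseudoinverse stands in for $L_H^{-1}$), $H$ is $G$ with edge weights perturbed by 10%, and the helper names `laplacian` and `preconditioned_iteration` are made up for illustration.

```python
import numpy as np

def laplacian(n, edges):
    """Undirected weighted Laplacian from (u, v, w) triples."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w
        L[v, v] += w
        L[u, v] -= w
        L[v, u] -= w
    return L

def preconditioned_iteration(L_G, L_H_pinv, b, steps):
    """Repeat x <- x + L_H^+ (b - L_G x), starting from x = 0."""
    x = np.zeros_like(b)
    for _ in range(steps):
        x = x + L_H_pinv @ (b - L_G @ x)
    return x

L_G = laplacian(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)])
L_H = laplacian(3, [(0, 1, 1.1), (1, 2, 0.9), (0, 2, 1.0)])   # H approximates G
b = np.array([1.0, -1.0, 0.0])                                # orthogonal to the all-1s vector

x = preconditioned_iteration(L_G, np.linalg.pinv(L_H), b, steps=30)
x_one_step = preconditioned_iteration(L_G, np.linalg.pinv(L_G), b, steps=1)
print("residual:", np.linalg.norm(L_G @ x - b))
```

With $H = G$ a single step already lands on $L_G^{+} b$; with the perturbed $H$ the error shrinks by a constant factor per step.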

Candidate: 2-norm of difference
Candidate: $\|(L - L')x\|_2 \le \epsilon \|Lx\|_2$
Behaves like $\approx$: symmetric, triangle inequality, invertible
Undirected $L$: degenerates into $L^2 \approx L'^2$, unfriendly to perturbations
[Figure: example labeled 1 1 1 1 1 1 1 $\approx$ 1 2 1 2 1 2 1]

Fix: Divide by Undirected Laplacian
$L$ $\epsilon$-approximates $L'$ if $\|U_L^{+1/2}(L - L')U_L^{+1/2}\|_2 \le \epsilon$
$U_L = \frac{1}{2}(L + L^T)$: undirected Laplacian
Symmetric
Composable / triangle inequality
For undirected $L$, same as spectral approximation
Generalizes the directed expander from [Chung `05]
If $L$ $O(1)$-approximates $L'$, then can solve systems in $L$ to error $\epsilon$ using $O(\log(1/\epsilon))$ solves in $L'$
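The quantity in this definition can be evaluated directly for small matrices. The sketch below assumes dense matrices, reads $U_L^{+1/2}$ as the square root of the pseudoinverse of $U_L$, and uses hypothetical helper names; the example pair (a directed 3-cycle and the same cycle with weights scaled by 1.1) is chosen only for illustration.

```python
import numpy as np

def pinv_sqrt(U, tol=1e-9):
    """Square root of the pseudoinverse of a symmetric PSD matrix U."""
    vals, vecs = np.linalg.eigh(U)
    inv_sqrt = np.array([1.0 / np.sqrt(v) if v > tol else 0.0 for v in vals])
    return (vecs * inv_sqrt) @ vecs.T     # scale eigenvector columns, then recombine

def approximation_error(L, L_prime):
    """Spectral norm of U_L^{+1/2} (L - L') U_L^{+1/2}, with U_L = (L + L^T)/2."""
    U = 0.5 * (L + L.T)
    R = pinv_sqrt(U)
    return np.linalg.norm(R @ (L - L_prime) @ R, 2)

# Directed 3-cycle 0 -> 1 -> 2 -> 0 with unit weights (Eulerian), as L = D - A^T.
L = np.array([[ 1.0,  0.0, -1.0],
              [-1.0,  1.0,  0.0],
              [ 0.0, -1.0,  1.0]])
L_prime = 1.1 * L                         # same cycle, weights scaled by 1.1

err = approximation_error(L, L_prime)
print("approximation error:", err)
```

For an undirected $L$ and $L' = (1 + \delta) L$ the same function returns $\delta$, matching the remark that the definition reduces to spectral approximation in the undirected case.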

Generating Directed Approximations
Hard case: the forward cycle and the undirected cycle are off by a factor of $n^2$
[Tropp `12]: expanders in $U$ are still 'easy': a random graph works

Expander Partitioning
All but $O(n \log n)$ edges of $U_L$ are contained in some expander
$\|U_L^{+1/2}(L - L')U_L^{+1/2}\|_2 \le \epsilon$ is decomposable onto pieces
Algorithm (similar to [ST`04]):
1. Partition undirected $U_L$
2. For each expander: sample to error $\epsilon / O(\log n)$, then fix degrees
Result: $O(n \log^{O(1)} n \, \epsilon^{-2})$-sized $\epsilon$-approximations in $O(m \log^{O(1)} n)$ time, and parallelizable

Outline: undirected vs. directed graphs; Eulerian graphs; sparsification for iterative methods; full algorithm for $Lx = b$

Solving Linear Systems
Simplification: $L$ Eulerian, $L = I - A$ for a random walk matrix $A$ with $\|A\|_2 < 1$
Iterative methods in 1 line: $L^{-1} = (I - A)^{-1} = I + A + A^2 + A^3 + \cdots$
If $\|A\|_2 \le \rho$, then $(1 - \rho)^{-1}$ terms approximate $(I - A)^{-1}$ well
Graph interpretation: the terms $b, Ab, A^2 b, \ldots, A^{\mathrm{diameter}} b$ each correspond to 1 more step of a walk
Need $\Omega(\mathrm{diameter})$ steps
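The one-line iterative method is the truncated power series, which can be sketched as follows. The matrix `A` here is an arbitrary small example with $\|A\|_2 = 0.5$, not a random walk matrix from the talk, and `neumann_solve` is a hypothetical name.

```python
import numpy as np

def neumann_solve(A, b, terms):
    """Approximate (I - A)^{-1} b by b + A b + ... + A^(terms-1) b."""
    x = np.zeros_like(b)
    term = b.copy()                  # current term A^k b, starting with k = 0
    for _ in range(terms):
        x += term
        term = A @ term              # one more step of the walk
    return x

A = np.array([[0.0, 0.5],
              [0.4, 0.0]])
b = np.array([1.0, 2.0])

x = neumann_solve(A, b, terms=60)
x_exact = np.linalg.solve(np.eye(2) - A, b)
print("error:", np.linalg.norm(x - x_exact))
```

Each term costs one matrix-vector product, so the number of terms needed (roughly $(1 - \rho)^{-1}$ up to logarithmic factors in the accuracy) drives the running time.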

Fewer Terms via Squaring
$(I - A)^{-1} = (I - A^2)^{-1}(I + A)$
$(I - A)^{-1} = (I + A)(I + A^2) \cdots (I + A^{2^i}) \cdots$
$A$: one-step transition of the random walk
$A^{2^i}$: $2^i$-step transition of the random walk
$A^{2^i}$ mixes for $i = \Omega(\log \kappa)$, where $\kappa$ is the condition number, which can be made $\mathrm{poly}(n)$
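The product form can be checked against the plain power series: the first $k$ factors multiply out to $I + A + \cdots + A^{2^k - 1}$. The sketch below uses dense matrices and exact squaring, so it shows only the identity; the sparsification that keeps the squared matrices small is not modeled, and `squaring_inverse` is a hypothetical name.

```python
import numpy as np

def squaring_inverse(A, levels):
    """Approximate (I - A)^{-1} by (I + A)(I + A^2)...(I + A^(2^(levels-1)))."""
    n = A.shape[0]
    result = np.eye(n)
    power = A.copy()                 # holds A^(2^i) at level i
    for _ in range(levels):
        result = result @ (np.eye(n) + power)
        power = power @ power        # square: A^(2^i) -> A^(2^(i+1))
    return result

A = np.array([[0.0, 0.5],
              [0.4, 0.0]])

approx = squaring_inverse(A, levels=6)            # equals the sum of A^j for j < 64
exact = np.linalg.inv(np.eye(2) - A)
series = sum(np.linalg.matrix_power(A, j) for j in range(64))
print("error:", np.linalg.norm(approx - exact))
```

Six levels reproduce 64 terms of the series, which is the exponential saving the slide refers to; the catch is that $A^{2^i}$ becomes dense, which motivates sparsifying after each squaring.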

Sparsified Squaring
$I - A_1 \approx_\epsilon I - A_0^2$
$I - A_2 \approx_\epsilon I - A_1^2$
...
$I - A_i \approx_\epsilon I - A_{i-1}^2$
The chain runs from $I - A_0$ to $I - A_d \approx I$, with $d = \log n$
$(I - A_i)^{-1} = (I - A_i^2)^{-1}(I + A_i) \approx (I - A_{i+1})^{-1}(I + A_i)$
$\approx$: approximations via sparsifiers
Algorithms involving repeated squaring:
NC algorithm for shortest path
Logspace connectivity [Reingold `05][Rozenman-Vadhan `05]
Multiscale methods
Solving $Lx = b$ [P-Spielman `14]

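Solver
$(I - A_i)^{-1} \approx (I - A_{i+1})^{-1}(I + A_i)$, along the chain from $I - A_0$ to $I - A_d \approx I$
Can turn $Z_j$, a preconditioner for $L_j$, into a preconditioner $Z_i$ for $L_i$ (where $L_i = I - A_i$) via:
$Z_i \leftarrow Z_j (I + A_{j-1})(I + A_{j-2}) \cdots (I + A_i)$
Error: if $Z_j$ is an $\epsilon$-approximate pseudoinverse of $L_j$, then $Z_i$ is an $\exp(O(j - i))\,\epsilon$-approximate pseudoinverse of $L_i$
Directly converting $Z_d$ into $Z_0$ needs $\epsilon < 1/\mathrm{poly}(n)$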

Fix: Recursive Iterative Method
Only work up $\Delta$ steps at a time
Reduce error to $2^{-O(\Delta)}$ via iterative refinement: $O(\Delta)$ branching factor
Need $\epsilon = 2^{-O(\Delta)}$ in sparsifier
$O(\Delta)$ branching factor, every $\Delta$ layers
Overhead: $2^{O(\Delta)} \cdot \Delta^{d/\Delta} \le 2^{O(\Delta + d \log d / \Delta)}$
Optimized at $\Delta = d^{1/2} = \log^{1/2} n$
Total: $O(m \, 2^{O(\sqrt{\log n})} \log(1/\epsilon))$

Ongoing / Future Work
Nearly-linear time?
Sparse LU decomposition?
Applications of directed Laplacian solvers?
Formalize 'sparsify w.r.t. a problem'?
Extensions / generalizations / applications of sparsification of directed graphs?