
1 Nearly-Linear Time Algorithms for Markov Chains and New Spectral Primitives for Directed Graphs
Richard Peng Georgia Tech

2 In collaboration with
Michael B. Cohen, Jon Kelner, Rasmus Kyng, John Peebles, Anup B. Rao, Aaron Sidford, Adrian Vladu

3 Outline
Graphs and Lx = b
G ≈ H and algorithms
Sparsifying directed graphs

4 Graphs and Matrices
Graph Laplacians: for the path graph a - b - c,

L = [  1  -1   0
      -1   2  -1
       0  -1   1 ]

[ST `04] Nearly-linear time Laplacian solvers
Input: n × n undirected Laplacian L with m non-zeros, vector b
Output: vector x that ε-approximates L^+ b (L^+: pseudo-inverse of L)
Runtime: O(m log^{O(1)} n log(1/ε))
Laplacians occur in: spectral graph theory, optimization, Markov chains
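A minimal numerical sketch of the objects above (not the nearly-linear time solver; the example graph and names are mine): build the path graph's Laplacian and apply its pseudo-inverse with dense linear algebra.

```python
# Toy sketch: Laplacian of the path a - b - c, solved via the dense
# pseudo-inverse L^+ (the [ST `04] solver does this in nearly-linear time).
import numpy as np

n = 3
edges = [(0, 1, 1.0), (1, 2, 1.0)]         # path a - b - c, unit weights
L = np.zeros((n, n))
for u, v, w in edges:                      # L = D - A
    L[u, u] += w; L[v, v] += w
    L[u, v] -= w; L[v, u] -= w

b = np.array([1.0, 0.0, -1.0])             # orthogonal to the all-ones vector
x = np.linalg.pinv(L) @ b                  # x = L^+ b
print(np.allclose(L @ x, b))               # True: b is in the range of L
```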

5 The Laplacian Paradigm
At the core: solving Lx = b
Directly related: elliptic systems
Few iterations: eigenvectors, heat kernels
Many iterations / modified algorithms: flows / matchings, image processing
More recent: Laplacians.jl

6 The Undirected Laplacian Paradigm?!?

Problem                 | Undirected              | Directed
Stationary distribution | ~ degree                | as hard as Lx = b
Linear systems          | m log^{1/2} n           | n^ω (previously)
Maximum flow            | approx: m log^{O(1)} n  | m^{7/4} (unit capacities)
Transshipment           | m^{1+o(1)}              | m n^{1/2}
Oblivious routing       | O(log n)-approx         | only single source
Rand. spanning trees    | min(m^{4/3}, n^{7/3})   | n^ω
Dynamic matching        | log^{O(1)} n            | m^{1/2}

7 What makes Directed Graphs hard?
Complete bipartite graph: the reachability matrix encodes n^2 bits
Distinction more murky with numerical algorithms:
GMRES (generalized minimal residual) converges
PageRank runs on directed graphs
Nearly-linear time algorithm wish-list: low-stretch trees, sparsifiers for all cuts, graph partitions

8 Our Results
Input: n × n directed Laplacian L with m non-zeros, vector b
Output: vector x that ε-approximates L^+ b
Runtime: O(m log^{O(1)} n log(1/ε))
Directed Laplacian:
Diagonal: out-degree
Row i, column j: -w(j → i)
Alternate definition: 0 column sums (L^T 1 = 0), non-positive off-diagonals
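A hedged sketch of this definition (the 3-cycle example is mine): putting out-degrees on the diagonal and -w(j → i) in row i, column j makes every column of L sum to zero.

```python
# Toy directed Laplacian for a -> b -> c -> a; checks L^T 1 = 0.
import numpy as np

n = 3
arcs = [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)]   # (tail, head, weight)
L = np.zeros((n, n))
for u, v, w in arcs:
    L[u, u] += w      # diagonal: out-degree of the tail
    L[v, u] -= w      # row "head", column "tail": -w(u -> v)

print(np.allclose(np.ones(n) @ L, 0))            # True: column sums are zero
```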

9 Directed Lx = b
Key new ideas:
Approximations of directed Laplacians
Decomposing directed graphs
(Theoretical) applications:
Computing stationary distributions
PageRank clustering
Hitting / mixing / covering times

10 What makes graphs hard?
`Easy' graph classes:
Highly connected: power method, gradients / CG
Long diameter: data structures, path/tree width, min-degree
`Worst case' graphs are a combination of both
Most Ω(nm) runtimes: Ω(n) steps × Ω(m) per step

11 Why graphs and matrices?
My view of the Laplacian paradigm: take apart graphs numerically
Ideal: `globalness'-sensitive cost per step, roughly ∑_{i=1}^{n} (m / i) ≈ m log n in total
Highly connected: need global steps
Long diameter: need many steps
Examples: geometric flows / matchings, Borůvka's MST algorithm, directed tree packing
Issue: hard to decompose graphs to isolate `eventful' parts

12 Outline
Graphs and Lx = b
G ≈ H and algorithms
Sparsifying directed graphs

13 Iterative Methods for solving Lx = b
Simplification: random walks, L = I - A
Key identity / approximation: L^+ = (I - A)^+ = I + A + A^2 + A^3 + …
If ║A║_2 ≤ ρ, then (1 - ρ)^{-1} terms approximate (I - A)^+ b well
Graph interpretation: each term Ab, A^2 b, …, A^{diameter} b is one more step of the walk
Need Ω(diameter) steps
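As a sketch (the matrix and constants are illustrative), the series can be summed by the iteration x ← b + Ax, which adds one more walk step per pass:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 4))
A /= 2 * A.sum(axis=1, keepdims=True)        # scale so ||A|| <= 1/2 < 1
b = rng.random(4)

x = np.zeros_like(b)
for _ in range(100):                         # ~ (1 - rho)^{-1} terms suffice
    x = b + A @ x                            # partial sum of I + A + A^2 + ...
print(np.allclose((np.eye(4) - A) @ x, b))   # True
```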

14 (Preconditioned) Iterative Methods
Solve LGx = b by instead solving LH-1LGx = LH-1b LH: preconditioner of LG LH = LG: x = LG+b, 1 iteration LH = I: same as no preconditioner First requirement: LG and LH operate on the same space
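A dense toy version of this loop, under my own small example (a path graph preconditioned by a cycle; here every eigenvalue of L_H^+ L_G on the relevant subspace lies in [1/4, 1], so the residual shrinks geometrically):

```python
import numpy as np

def laplacian(edges, n):
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w; L[u, v] -= w; L[v, u] -= w
    return L

n = 4
LG = laplacian([(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)], n)                # path
LH = laplacian([(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)], n)   # cycle
LHp = np.linalg.pinv(LH)                 # a real solver applies this implicitly

b = np.array([1.0, -1.0, 1.0, -1.0])     # orthogonal to the all-ones vector
x = np.zeros(n)
for _ in range(80):
    x = x - LHp @ (LG @ x - b)           # one preconditioned iteration
print(np.linalg.norm(LG @ x - b))        # tiny residual
```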

15 Known null space: Eulerian case
Eulerian Laplacians: in-degree = out-degree; null space: the all-1s vector
[CKPPSV `16]: suffices to only consider Eulerian Lx = b
Previous works on Eulerian graphs:
[Chung `05]: directed Cheeger inequality
[EMPS `16]: maxflow on balanced graphs

16 [CKPPSV `16]: Reduction
Simplified case: random walk, L = I - A
s: stationary, A^T s = s, Ls = 0
Rescale L to L diag(s): L diag(s) 1 = Ls = 0, Eulerian!
(Recall the definition of directed L: 0 column sums, non-positive off-diagonals)
To solve Lx = b: solve L diag(s) y = b, return x = diag(s) y
[CKPPSV `16], algorithmic version:
Gradually remove extra diagonal entries
Sequence of log(║L^+║_2) linear systems
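A numerical check of the rescaling, under one concrete convention (my choice: A column-stochastic, so that L = I - A has zero column sums and As = s):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((4, 4))
A /= A.sum(axis=0, keepdims=True)        # columns sum to 1

w, V = np.linalg.eig(A)                  # stationary s: the eigenvector with As = s
s = np.real(V[:, np.argmin(np.abs(w - 1))])
s /= s.sum()

L = np.eye(4) - A
E = L @ np.diag(s)                       # the rescaled Laplacian L diag(s)
print(np.allclose(E @ np.ones(4), 0),    # row sums:    L diag(s) 1 = Ls = 0
      np.allclose(np.ones(4) @ E, 0))    # column sums: 1^T L = 0 already
```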

17 Convergence of Iterative Methods
Solving L_H^+ L_G x = L_H^+ b iteratively:
Iteration: x ← x - L_H^+ (L_G x - b)
Effect on the residual r: r' ← (I - L_H^+ L_G) r
H ≈ G if all singular values of L_H^+ L_G are close to 1
Primary motivation for our notion of graph approximations

18 Graph Sparsification
[ST `04][SS `08][BSS `09]: any undirected graph can be approximated by one with O(n) edges
Approximation in undirected graphs: L_H^+ L_G has all eigenvalues close to 1; implies all cuts are similar
[CKPPRSV `17]: can always find H with n log^{O(1)} n edges so L_H^+ L_G has all singular values in the range [0.9, 1.1]

19 [CKPPRSV `17]: Sparsified Squaring
L^+ = (I - A)^+ = (I + A) (I + A^2) (I + A^4) …
A: one step of the random walk; A^2: 2-step random walk
Can efficiently sparsify I - A^2 without generating it
[CKKPPSV `17]: some further control of errors via recursion, total runtime O(m^{1+α}) for any α > 0
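A quick numerical check of this identity (dense, on an illustrative random contraction; the sparsified algorithm never forms these products explicitly):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((4, 4))
A /= 2 * A.sum(axis=1, keepdims=True)    # ||A|| <= 1/2, so the product converges

prod, Ak = np.eye(4), A.copy()
for _ in range(30):
    prod = prod @ (np.eye(4) + Ak)       # multiply in the (I + A^{2^k}) factor
    Ak = Ak @ Ak                         # squaring: the 2x-longer random walk
print(np.allclose(prod, np.linalg.inv(np.eye(4) - A)))   # True
```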

20 More Dense Intermediate Objects
Matrix powers, matrix inverses, LU factorizations:
Cost-prohibitive to store / find
Instead, directly access a sparsified version

21 Sparse Gaussian Elimination
[KS `16]: per-entry pivoting, akin to ichol in MATLAB (a toy sketch of the elimination step follows below)
[CKKPPRS `17]: some partial progress towards this for directed graphs, runtime about m log^{10} n
Issue: still needs to globally sparsify intermediate graphs
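Since the slide's pseudocode was an image, here is a hedged toy of the underlying elimination step (my example, not the [KS `16] code): eliminating a vertex creates a clique on its neighbors, which [KS `16] keeps sparse by sampling rather than forming.

```python
import numpy as np

def laplacian(edges, n):
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w; L[u, v] -= w; L[v, u] -= w
    return L

n = 4
L = laplacian([(0, 1, 1.0), (0, 2, 2.0), (0, 3, 3.0)], n)  # star centered at 0

# One step of Gaussian elimination: Schur complement onto {1, 2, 3}.
S = L[1:, 1:] - np.outer(L[1:, 0], L[0, 1:]) / L[0, 0]
print(S)   # Laplacian of a clique with weights w_i * w_j / W; [KS `16]
           # samples ~deg of these edges by weight instead of all deg^2
```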

22 Outline
Graphs and Lx = b
G ≈ H and algorithms
Sparsifying directed graphs

23 Wish-List For Directed Approximations
Behaves like ≈: symmetric, triangle inequality, invertible
Candidate: ║L_H x║_2 ≈ ║L_G x║_2 ∀ x, i.e. L_G^T L_G ≈ L_H^T L_H
Unfriendly even to perturbations (figure: a unit-weight graph vs. one with some weights changed to 2)
Need: `divide away' one copy of G

24 Norm from symmetrization
b c 1 b a 2 L 1 b a c 1 c U = ½ (L + LT) is symmetric matrix, also norm UU norm: ║M║ UU = maxx ║Mx║U / ║x║U a b c 1.5 b a 0.5 U a b 0.5 c c
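A numeric sketch of this norm (my own computation via U^{1/2}, valid on the range of U; the 3-cycle example is illustrative):

```python
import numpy as np

# directed Laplacian of the 3-cycle a -> b -> c -> a, and its symmetrization
L = np.array([[ 1.,  0., -1.],
              [-1.,  1.,  0.],
              [ 0., -1.,  1.]])
U = (L + L.T) / 2                        # undirected 3-cycle with weights 1/2

w, Q = np.linalg.eigh(U)                 # U is PSD; form U^{1/2} and its +
half = Q @ np.diag(np.sqrt(np.maximum(w, 0))) @ Q.T
print(np.linalg.norm(half @ L @ np.linalg.pinv(half), 2))   # ||L||_{U -> U}
```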

25 Symmetrizing Approximations
L_G ε-approximates L_H if ║L_G - L_H║_{U→U} ≤ ε
Implies ║I - L_H^+ L_G║_{U→U} ≤ ε
Equivalent to L_G^T U_G^+ L_G ≈ L_H^T U_H^+ L_H
Properties:
Decomposable!
Symmetric, satisfies the triangle inequality
For undirected L, same as spectral approximation
Generalizes directed expanders from [Chung `05]
Preserves commute times

26 Easy Case
G is an expander with expansion Φ ≥ log^{-O(1)} n
Cheeger's inequality: the eigenvalues of U_G are well-bounded, so the U_G→U_G norm is close to the 2-norm
Hence ║L_G - L_H║_2 ≤ ε' is sufficient, after setting ε' ← ε log^{-O(1)} n
Matrix concentration (e.g. [Tropp `12]): sampling edges by weight gives ║L_G - L_H║_2 ≤ ε'
Problem: sampling can modify degrees / null space
Fix: patching degrees arbitrarily is OK!
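A toy version of the sampling step (undirected and with illustrative constants; the real algorithm samples arcs and uses [Tropp `12]-style bounds):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 100, 0.5
edges = [(i, j, 1.0) for i in range(n) for j in range(i + 1, n)
         if rng.random() < 0.3]             # a dense random graph ~ expander

def laplacian(edges, n):
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w; L[u, v] -= w; L[v, u] -= w
    return L

kept = [(u, v, w / p) for u, v, w in edges if rng.random() < p]  # reweight by 1/p
LG, LH = laplacian(edges, n), laplacian(kept, n)
print(np.linalg.norm(LG - LH, 2) / np.linalg.norm(LG, 2))  # shrinks as degrees grow
```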

27 Hard case of Directed Approximations
Hard case: the directed and undirected cycles are off by a factor of n^2
Return time of a walk: O(1) in the undirected cycle, n in the directed cycle

28 Fix: only work on expanders
More general issue: the U_G-norm can be much less than the 2-norm in some parts of the graph
Fix: only work on expanders

29 Sparsification Algorithm
Partition undirected U_G so most edges are contained in expanders: [ST `04] sparsification algorithm; all except O(n log n) edges of U are contained in some expander
For each expander (Φ = 1/log^2 n):
Sample to error ε / O(log^3 n)
Fix degrees
Existence of such a partition:
While there is a sparse cut, take it, recurse on both sides
Charge edges to the smaller side
Charge per edge: O(Φ log n)
Result: ε-approximations of size O(n log^{O(1)} n ε^{-2}) in O(m log^{O(1)} n) time, and parallelizable
(A high-level sketch in code follows below.)
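A high-level, hedged outline in code: expander_partition is a hypothetical stub standing in for the [ST `04]-style partition, and the degree-fixing step is only noted in a comment.

```python
import numpy as np

rng = np.random.default_rng(5)

def expander_partition(arcs):
    # hypothetical stand-in: pretend the whole graph is one expander piece;
    # the real routine recurses on sparse cuts as described above
    return [arcs]

def sparsify(arcs, keep_prob):
    out = []
    for piece in expander_partition(arcs):
        for (u, v, w) in piece:
            if rng.random() < keep_prob:           # sample inside the expander
                out.append((u, v, w / keep_prob))  # reweight to stay unbiased
    # a real version would now fix in-/out-degrees to restore Eulerian-ness
    return out

arcs = [(i, (i + 1) % 8, 1.0) for i in range(8)]   # toy directed cycle
print(sparsify(arcs, 0.5))
```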

30 Ongoing / Future Work
Faster? Sparse Gaussian elimination based algorithms?
Directed sparsification without partitioning / matrix-concentration-based sparsification?
Extensions to other problems on directed graphs?

31 Accumulation of Errors
Z_i ≈ Z_j (I + A_{j-1}) (I + A_{j-2}) … (I + A_i)
(Hiding here: lazy random walks)
Can show, for U_i = ½ (L_i + L_i^T):
U_i is a 2-approximation of U_{i+1}
L_j^+ (I + A_{j-1}) is an ε-approximation of L_{j-1}^+ w.r.t. U_{j-1}
Error accumulation: if Z_j is an ε-approximate pseudo-inverse of L_j, then Z_i is an exp(O(j - i)) ε-approximate pseudo-inverse of L_i
Directly using Z_d = I to produce Z_0: need ε < 1/poly(nR)

32 Fix: Iterative Refinement
Only go up δ levels at a time
Reduce error to exp(-O(δ)) via iterative refinement: O(δ) branching factor every δ layers
Need ε = exp(-O(δ)) in the sparsifier
Overhead: 2^{O(δ)} · O(δ)^{d/δ} ≤ 2^{O(δ + (d log d)/δ)}
Optimized at δ = d^{1/2} = log^{1/2}(nR)
Total: O(m · 2^{√(log(nR))} · log(1/ε))
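A back-of-the-envelope check of the choice δ = d^{1/2} (my derivation, dropping the log d factor):

```latex
% minimize the exponent \delta + d/\delta over \delta:
\frac{\partial}{\partial\delta}\left(\delta + \frac{d}{\delta}\right)
  = 1 - \frac{d}{\delta^{2}} = 0
\;\Longrightarrow\; \delta = \sqrt{d} = \sqrt{\log (nR)},
\qquad 2^{O(\delta + d/\delta)} = 2^{O(\sqrt{\log (nR)})}.
```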

33 (Preconditioned) Iterative Methods
Solve L_G x = b by iterating x ← x - L_H^+ (b - L_G x)
Fixed point: b = L_G x
L_H: preconditioner of L_G
L_H = L_G: x = L_G^+ b, 1 iteration
L_H = I: same as no preconditioner
First requirement: L_G and L_H have the same null space

