Density Independent Algorithms for Sparsifying k-Step Random Walks

Presentation transcript:

Density Independent Algorithms for Sparsifying $k$-Step Random Walks
Gorav Jindal, Pavel Kolev (MPI-INF)
Richard Peng, Saurabh Sawlani (Georgia Tech)
August 18, 2017

Talk Outline
- Definitions: random walk graphs
- Our result
- Sparsification by resistances
- Walk sampling algorithm

Spectral Sparsification
Sparsification: removing (many) edges from a graph while approximating some property, here the spectral properties of the graph Laplacian $L = D - A$.
E.g., for the 3-vertex path $G$ with edge weights 2 and 1:
$L_G = \begin{pmatrix} 2 & -2 & 0 \\ -2 & 3 & -1 \\ 0 & -1 & 1 \end{pmatrix}$
Formally, find a sparse $H$ such that
$(1 - \epsilon)\, x^T L_G x \le x^T L_H x \le (1 + \epsilon)\, x^T L_G x \quad \forall x.$
This preserves eigenvalues and cuts.
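Below is a minimal numerical sketch (my own illustration, not code from the paper) of the two objects on this slide: the Laplacian $L = D - A$ of the example graph, and an empirical spot-check of the quadratic-form sandwich that defines an $\epsilon$-sparsifier. The function names are invented for the sketch; a full certificate would compare eigenvalues rather than test random vectors.

```python
import numpy as np

def laplacian(A):
    """Graph Laplacian L = D - A for a symmetric weighted adjacency matrix A."""
    return np.diag(A.sum(axis=1)) - A

# Example graph from the slide: a 3-vertex path with edge weights 2 and 1.
A_G = np.array([[0., 2., 0.],
                [2., 0., 1.],
                [0., 1., 0.]])
L_G = laplacian(A_G)   # [[2,-2,0],[-2,3,-1],[0,-1,1]]

def looks_like_sparsifier(L_G, L_H, eps, trials=1000, seed=0):
    """Spot-check (1-eps) x^T L_G x <= x^T L_H x <= (1+eps) x^T L_G x on random x."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x = rng.standard_normal(L_G.shape[0])
        qG, qH = x @ L_G @ x, x @ L_H @ x
        if not ((1 - eps) * qG <= qH <= (1 + eps) * qG):
            return False
    return True

print(L_G)
print(looks_like_sparsifier(L_G, L_G, eps=0.1))  # a graph trivially approximates itself
```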

Applications of Sparsification
- Process huge graphs faster.
- Handle dense graphs that arise inside algorithms: partial states of Gaussian elimination, $k$-step random walks.

Random Walk Graphs
Walk along the edges of $G$: when at a vertex, choose the next edge $e$ with probability proportional to $w(e)$. For a vertex $u$ with neighbors $a$, $b$, $c$:
$\Pr[u \to a] = \frac{w(ua)}{w(ua) + w(ub) + w(uc)} = \frac{w(ua)}{D(u)}$
Each step corresponds to $D^{-1}A$, so the $k$-step transition matrix is $(D^{-1}A)^k$.

Random Walk Graphs
Special case: $D = I$ (for this talk), so the weights become probabilities. The transition matrix of the $k$-step walk is $A^k$, with Laplacian $I - A^k$.
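A short sketch of these definitions in code, assuming the $D = I$ normalization of this slide (the weights at each vertex already sum to 1); the helper names are mine, not the paper's:

```python
import numpy as np

def walk_transition(A):
    """One-step transition matrix P = D^{-1} A of the random walk on weighted graph A."""
    D = A.sum(axis=1)
    return A / D[:, None]

def k_step_walk_laplacian(A, k):
    """Laplacian I - P^k of the k-step walk graph; with D = I this is I - A^k."""
    P = walk_transition(A)
    return np.eye(A.shape[0]) - np.linalg.matrix_power(P, k)

# Tiny example with D = I (every row of A sums to 1).
A = np.array([[0., .5, .5],
              [.5, 0., .5],
              [.5, .5, 0.]])
print(k_step_walk_laplacian(A, 2))
```

Note that forming $A^k$ explicitly is exactly the dense computation the paper avoids; the sketch only pins down the object being sparsified.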

Random Walk Graphs
Example: [figure: a small weighted graph $G$ with step probabilities 0.2 and 0.8, and its 2-step walk graph $G^2$, whose edge weights are 0.32 and 0.68]

Our Result (assume $\epsilon$ is constant; $\tilde{O}$ hides $\log \log$ factors)
- Spielman-Srivastava '08: $\tilde{O}(m \log^{2.5} n)$; only $k = 1$
- Kapralov-Panigrahi '11: $\tilde{O}(m \log^3 n)$; $k = 1$, combinatorial
- Koutis-Levin-Peng '12: $\tilde{O}(m \log^2 n)$, also $\tilde{O}(m + n \log^{10} n)$
- Peng-Spielman '14, Koutis '14: $\tilde{O}(m \log^4 n)$; $k \le 2$, combinatorial
- Cheng, Cheng, Liu, Peng, Teng '15: $\tilde{O}(k^2 m \log^{O(1)} n)$; $k \ge 1$
- Jindal-Kolev '15: $\tilde{O}(m \log^2 n + n \log^4 n \log^5 k)$; only $k = 2^i$
- Our result: $\tilde{O}(m + k^2 n \log^4 n)$ for $k \ge 1$; $\tilde{O}(m + k^2 n \log^6 n)$ combinatorial

Density Independence
Our running time: $\tilde{O}(m + k^2 n \log^4 n)$. We only sparsify when $m \gg$ the size of the sparsifier.
SS '08 + KLP '12: an $O(n \log n)$-edge sparsifier in $O(m \log^2 n)$ time, so the actual cost is at least $O(n \log^3 n)$.
Density independent: $O(m) + n \cdot \text{overhead}$, which gives a clearer picture of the runtime.

Algorithm
- Sample an edge $(u, v)$ in $G$.
- Pick an integer $i$ u.a.r. between $0$ and $k-1$.
- Walk $i$ steps from $u$ and $k-1-i$ steps from $v$.
- Add the corresponding edge of $G^k$ to the sparsifier (with rescaling).
Walk sampling has analogs in personalized PageRank algorithms and in triangle counting / sampling.

Effective Resistances
View $G$ as an electrical circuit, where edge $e$ has resistance $R_e = 1/w_e$. The effective resistance ($ER$) between two vertices is the voltage difference required between them to push 1 unit of current from one to the other. The leverage score of an edge is $w_{uv} \cdot ER(u, v)$: a measure of its importance and an intuitive way of looking at a graph. Sparsification by $ER$ is extremely useful (next slide).
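For small graphs these quantities can be computed directly from the Laplacian pseudoinverse, $ER(u, v) = (e_u - e_v)^T L^{+} (e_u - e_v)$. The sketch below is my own illustration of the definitions, using exact dense linear algebra rather than the fast approximate routines the results above rely on:

```python
import numpy as np

def effective_resistance(A, u, v):
    """ER(u, v) = (e_u - e_v)^T L^+ (e_u - e_v) for the weighted graph A."""
    L = np.diag(A.sum(axis=1)) - A
    chi = np.zeros(A.shape[0]); chi[u], chi[v] = 1.0, -1.0
    return chi @ np.linalg.pinv(L) @ chi

def leverage_score(A, u, v):
    """Leverage score of edge (u, v): weight times effective resistance."""
    return A[u, v] * effective_resistance(A, u, v)

# 3-vertex path with edge weights 2 and 1: resistances 1/2 and 1 in series.
A = np.array([[0., 2., 0.],
              [2., 0., 1.],
              [0., 1., 0.]])
print(effective_resistance(A, 0, 2))  # 1/2 + 1 = 1.5
print(leverage_score(A, 0, 1))        # 1.0: a tree edge has leverage score 1
```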

Sparsification using ER
Suppose we have upper bounds $\tau'_e \ge w_e \cdot ER(e)$ on the leverage scores of the edges.
Algorithm: repeat $N = O(\epsilon^{-2} \cdot \sum_e \tau'_e \cdot \log n)$ times:
- pick an edge $e$ with probability $\tau'_e / \sum_{e'} \tau'_{e'}$;
- add it to $H$ with appropriate re-weighting.
By matrix concentration [Tropp '12], the output $H$ is an $\epsilon$-sparsifier of $G$.
What we need: leverage score bounds for the edges of $G^k$.
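A minimal sketch of this sampling loop, assuming the upper bounds $\tau'_e$ are already given; the interface and the constant inside $N$ are illustrative, not the paper's:

```python
import numpy as np

def sparsify_by_leverage(edges, weights, tau, n, eps, seed=0):
    """Sample N = ceil(eps^-2 * sum(tau) * log n) edges w.p. proportional to tau, then reweight."""
    rng = np.random.default_rng(seed)
    tau = np.asarray(tau, dtype=float)
    p = tau / tau.sum()
    N = int(np.ceil(eps ** -2 * tau.sum() * np.log(n)))
    H = {}                                  # sparsifier as {edge: new weight}
    for idx in rng.choice(len(edges), size=N, p=p):
        e = edges[idx]
        H[e] = H.get(e, 0.0) + weights[idx] / (N * p[idx])  # so that E[H] = G edge-wise
    return H

edges = [(0, 1), (1, 2)]
weights = [2.0, 1.0]
tau = [1.0, 1.0]   # exact leverage scores for this path graph (both are tree edges)
print(sparsify_by_leverage(edges, weights, tau, n=3, eps=0.5))
```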

Tools for Bounding Leverage Scores
Odd-even lemma [CCLPT '15]:
- For odd $k$: $ER_{G^k}(u, v) \le 2 \cdot ER_G(u, v)$
- For even $k$: $ER_{G^k}(u, v) \le ER_{G^2}(u, v)$
Triangle inequality of $ER$ (along a path $u_0 \dots u_k$):
$ER_G(u_0, u_k) \le \sum_{i=0}^{k-1} ER_G(u_i, u_{i+1})$
We will use these to implicitly select edges by leverage score.

Analysis
Simplifications: assume $k$ is odd (even $k$ uses one more idea), and assume access to exact effective resistances of $G$ (available from previous works).
Goal: sample an edge $(u_0, u_k)$ of $G^k$ with probability proportional to $w_{G^k}(u_0, u_k) \cdot ER_{G^k}(u_0, u_k)$.
Claim: walk sampling achieves this!

Analysis
Claim: it suffices to sample a path $u_0 \dots u_k$ with probability proportional to
$w(u_0, u_1, \dots, u_k) \cdot \sum_{i=0}^{k-1} ER_G(u_i, u_{i+1})$
$\ge w(u_0, u_1, \dots, u_k) \cdot ER_G(u_0, u_k)$ (triangle inequality)
$\ge \tfrac{1}{2}\, w(u_0, u_1, \dots, u_k) \cdot ER_{G^k}(u_0, u_k)$ (odd-even lemma; the constant factor only means we oversample slightly)
Summing over all $k$-step paths from $u_0$ to $u_k$, this dominates $\tfrac{1}{2}\, w_{G^k}(u_0, u_k) \cdot ER_{G^k}(u_0, u_k)$, i.e. the leverage score of the edge $(u_0, u_k)$ in $G^k$ up to a constant.

Walk Sampling Algorithm
Algorithm (to pick an edge in $G^k$):
- Choose an edge $(u, v)$ in $G$ with probability proportional to $w_{uv} \cdot ER(u, v)$.
- Pick u.a.r. an index $i$ in the range $[0, k-1]$.
- Walk $i$ steps from $u$ and $k-1-i$ steps from $v$.
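A sketch of this procedure for odd $k$. It assumes an adjacency list `adj[u] = [(v, w(u, v)), ...]`, an edge list of `(u, v, w)` triples, and a callable `er(u, v)` returning (estimates of) effective resistances in $G$; all three interfaces are assumptions made for the sketch, not the paper's code:

```python
import random

def sample_Gk_edge(adj, edge_list, er, k, rng=random.Random(0)):
    """Return the endpoints of one sampled edge of G^k (odd k)."""
    # 1. Choose an edge (u, v) of G with probability proportional to w(u, v) * ER(u, v).
    scores = [w * er(u, v) for (u, v, w) in edge_list]
    u, v, _ = rng.choices(edge_list, weights=scores, k=1)[0]
    # 2. Pick an index i uniformly at random in {0, ..., k-1}.
    i = rng.randrange(k)
    # 3. Walk i steps from u and k-1-i steps from v (D = I, so step probabilities are the weights).
    def walk(x, steps):
        for _ in range(steps):
            nbrs, ws = zip(*adj[x])
            x = rng.choices(nbrs, weights=ws, k=1)[0]
        return x
    # The two walk endpoints form the sampled edge (u_0, u_k) of G^k.
    return walk(u, i), walk(v, k - 1 - i)
```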

Analysis of Walk Sampling
The probability of sampling the walk $(u_0, u_1, \dots, u_k)$ is proportional to
$\sum_{i=0}^{k-1} \Pr[\text{select edge } (u_i, u_{i+1})] \cdot \Pr[\text{index} = i] \cdot \Pr[\text{walk from } u_i \text{ to } u_0] \cdot \Pr[\text{walk from } u_{i+1} \text{ to } u_k]$
$= \sum_{i=0}^{k-1} \frac{1}{k} \cdot w(u_i, u_{i+1})\, ER_G(u_i, u_{i+1}) \cdot \prod_{j=0}^{i-1} w(u_j, u_{j+1}) \cdot \prod_{j=i+1}^{k-1} w(u_j, u_{j+1})$
$= \sum_{i=0}^{k-1} \frac{1}{k} \cdot ER_G(u_i, u_{i+1}) \cdot \prod_{j=0}^{k-1} w(u_j, u_{j+1})$
$= \frac{1}{k} \cdot w(u_0, u_1, \dots, u_k) \cdot \sum_{i=0}^{k-1} ER_G(u_i, u_{i+1})$

$k$ even: $G^2$
$G^2$ is still dense and cannot be computed explicitly. Instead, compute the product of $G$ with a length-2 path $P_2$ and read effective resistances from that:
$ER_{G \times P_2}(a_1, b_1) = ER_{G^2}(a, b)$
[figure: $G$, the product $G \times P_2$ with vertex copies $a_1, a_2, \dots, d_1, d_2$, and $G^2$]
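One way to read this slide in code: build the product of $G$ with a two-vertex path (two copies $v_1, v_2$ of every vertex, and an edge $u_1$-$v_2$ for every edge $u$-$v$ of $G$), which has only $2m$ edges, and query effective resistances between same-side copies instead of forming $G^2$. This is my reading of the construction, sketched with exact dense linear algebra:

```python
import numpy as np

def product_with_P2(A):
    """Adjacency of G x P2: vertex copies {v1} and {v2}, one edge u1-v2 per edge u-v of G."""
    n = A.shape[0]
    B = np.zeros((2 * n, 2 * n))
    B[:n, n:] = A
    B[n:, :n] = A
    return B

def eff_res(A, s, t):
    L = np.diag(A.sum(axis=1)) - A
    chi = np.zeros(A.shape[0]); chi[s], chi[t] = 1.0, -1.0
    return chi @ np.linalg.pinv(L) @ chi

A = np.array([[0., .5, .5],
              [.5, 0., .5],
              [.5, .5, 0.]])
B = product_with_P2(A)
print(eff_res(B, 0, 1))  # ER between copies a1 and b1; the slide equates this with ER_{G^2}(a, b)
```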

Future Work
- This result: $\tilde{O}(m + k^2 n \log^4 n)$ time.
- Log dependency on $k$ (as in JK '15)?
- A better runtime of $\tilde{O}(m + n \log^2 n)$? (combinatorial algorithm)

ER Estimates for $G$ (or $G \times P_2$)
Iterative improvement similar to KLP '12:
- Create a sequence of graphs $G_1, \dots, G_t$, each more tree-like than the previous.
- $O(1)$-sparsify the last graph to get $H_t$.
- Use the sparsifier $H_{i+1}$ to construct an $O(1)$-sparsifier $H_i$ of $G_i$.