Sampling in Graphs: node sparsifiers

Presentation transcript:

Sampling in Graphs: node sparsifiers Alexandr Andoni (Microsoft Research)

Graph compression (large graph ≈ small graph)
Why smaller graphs? They use less storage space, allow faster algorithms, and are easier to visualize.

Sparsification of edges
Preserve some structure, e.g., cuts. Also: distances, effective resistances, etc.

Sparsification of nodes?
Generally not well-defined: it is natural to define properties on nodes.
Preserve a property with respect to a small set $K$ of "important nodes", using a small graph, ideally of size $\mathrm{poly}(|K|)$, independent of $n$.

Node sparsifiers
Cut (node) sparsifier [HKNR98, Moi09]: a graph $H$ such that for each partition $K = S \cup T$ we have $\mathrm{mincut}_G(S,T) = \mathrm{mincut}_H(S,T)$.
Flow (node) sparsifier [LM10]: a graph $H$ such that for any multi-commodity flow demand $d$ on $K$, the max concurrent flow in $G$ equals the max concurrent flow in $H$.

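Results on cut sparsifiers
| Graph size | Approximation | Reference | Comments |
|---|---|---|---|
| $k$ | $O(\log k / \log\log k)$ | [Moi09, LM10, CLLM10, EGKRTCT10, MM10] | |
| $k$ | $\tilde{\Omega}(\sqrt{\log k})$ | [LM10, CLLM10, MM10] | |
| $\mathrm{poly}(C)$ | $1$ | [Chu12, KW12] | $C$ = capacity of $K$ (may depend on $n$) |
| $2^{2^k}$ | $1$ | [HKNR98, KRTV12] | |
| $2^{\Omega(k)}$ | $1$ | [KRTV12, KR13] | bipartite* graphs |
| $\mathrm{poly}(k/\epsilon)$ | $1+\epsilon$ | [AGK'14] | bipartite* graphs |
Similar results hold for flow (node) sparsifiers.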

Small cut (node) sparsifiers [A-Gupta-Krauthgamer'14]
Theorem: for bipartite graphs, one can construct a $(1+\epsilon)$-approximate cut (node) sparsifier of size $\mathrm{poly}(k/\epsilon)$.
Bipartite here means that the non-terminals form an independent set.

Main idea?
Sampling edges doesn't work here. Need to sample entire sub-structures of the graph.

Sampling in bipartite graphs
Sample non-terminals together with their edges, and reweight the edges accordingly.
Uniform sampling doesn't work.
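One way to see why uniform sampling can fail is an instance in which a single non-terminal carries the entire cut. The sketch below is a hypothetical instance, not taken from the slides: it assumes two terminals s and t, n non-terminals, and unit capacities, and the function name `uniform_estimate_distribution` is illustrative.

```python
def uniform_estimate_distribution(n, p):
    """Distribution of the sampled s-t cut value under uniform sampling.

    Hypothetical instance: non-terminal 0 has unit-capacity edges to both
    s and t; non-terminals 1..n-1 have an edge to s only. Every non-terminal
    is kept independently with the same probability p and, if kept, its
    edges are reweighted by 1/p.
    """
    # Contribution of each non-terminal v to the cut: min{c(v,s), c(v,t)}.
    contributions = [min(1.0, 1.0)] + [min(1.0, 0.0)] * (n - 1)
    true_cut = sum(contributions)
    # Only node 0 contributes, so the estimate is 1/p if it is kept, else 0.
    distribution = {contributions[0] / p: p, 0.0: 1.0 - p}
    return true_cut, distribution


true_cut, dist = uniform_estimate_distribution(n=1000, p=0.25)
print(true_cut, dist)
```

In this instance the estimate is either 0 or 1/p, so it matches the true cut only when p is close to 1, which means keeping almost all n non-terminals.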

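Non-uniform sampling
Non-terminal $v$ has sampling probability $p_v$. If $v$ is sampled, weight its edges by $1/p_v$.
The expectation is right: consider a partition $K = S \cup T$, and let $C(v,S)$ denote the total capacity of the edges between $v$ and $S$. Then
$\mathrm{mincut}_G(S,T) = \sum_v \min\{C(v,S), C(v,T)\}$
$\mathrm{mincut}_H(S,T) = \sum_v \frac{I_v}{p_v} \cdot \min\{C(v,S), C(v,T)\}$, where $I_v = 1$ with probability $p_v$ (and $0$ otherwise).
Figure example: a non-terminal $v$ with $C(v,S) = 1$ and $C(v,T) = 2$.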
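The claim that the expectation is right can be checked exactly on a toy instance. The sketch below assumes a bipartite graph with no terminal-terminal edges, stored as a hypothetical dictionary `caps` mapping each non-terminal to its terminal capacities; the graph, the probabilities `p`, and all function names are made up for illustration.

```python
from itertools import product


def side_capacity(caps_v, side):
    """C(v, side): total capacity from one non-terminal to a set of terminals."""
    return sum(c for terminal, c in caps_v.items() if terminal in side)


def mincut(caps, weights, S, T):
    """Weighted sum over kept non-terminals of min{C(v,S), C(v,T)}."""
    return sum(
        w * min(side_capacity(caps[v], S), side_capacity(caps[v], T))
        for v, w in weights.items()
    )


def expected_sampled_mincut(caps, p, S, T):
    """Exact E[mincut_H(S,T)], enumerating every subset of non-terminals."""
    nodes = list(caps)
    expectation = 0.0
    for kept in product([False, True], repeat=len(nodes)):
        prob = 1.0
        weights = {}
        for v, is_kept in zip(nodes, kept):
            prob *= p[v] if is_kept else 1.0 - p[v]
            if is_kept:
                weights[v] = 1.0 / p[v]  # reweight edges of a sampled node
        expectation += prob * mincut(caps, weights, S, T)
    return expectation


# Toy instance (assumed): terminals a, b, c and three non-terminals.
caps = {
    "v1": {"a": 1.0, "b": 2.0},
    "v2": {"a": 3.0, "c": 1.0},
    "v3": {"b": 2.0, "c": 2.0},
}
p = {"v1": 0.5, "v2": 0.25, "v3": 1.0}
S, T = {"a"}, {"b", "c"}
exact = mincut(caps, {v: 1.0 for v in caps}, S, T)
expected = expected_sampled_mincut(caps, p, S, T)
print(exact, expected)
```

The enumeration is exponential in the number of non-terminals, so it is only meant as a sanity check on tiny graphs; it says nothing about concentration, which is the harder part.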

How to choose $p_v$?
Want:
1) $\mathrm{mincut}_H(S,T) = \sum_v \frac{I_v}{p_v} \cdot \min\{C(v,S), C(v,T)\}$ concentrates
2) $\sum_v p_v$ small, $\mathrm{poly}(k/\epsilon)$
Issue: the contribution can come from just a few terms.

Tool: Importance sampling
$\mathrm{mc}_H(S,T) = \sum_v \frac{I_v}{p_v} \cdot \min\{C(v,S), C(v,T)\}$
Idea: choose $p_v$ proportional to the contribution, $\min\{C(v,S), C(v,T)\}$.
Suppose $p_v = \frac{1}{\lambda} \min\{C(v,S), C(v,T)\}$. Then $\mathrm{mc}_H(S,T) = \lambda \sum_v I_v$, which concentrates well if $\gg 1/\epsilon^2$ nodes $v$ are sampled.
It is easy to normalize $p_v$: make sure $\sum_v p_v \gg 1/\epsilon^2$, i.e., $\lambda \approx \epsilon^2 \cdot \sum_v \min\{C(v,S), C(v,T)\}$.
Issue: $p_v$ cannot depend on the partition $S \cup T$!
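The requirement of $\gg 1/\epsilon^2$ sampled nodes can be motivated by a standard second-moment calculation. The sketch below assumes that the indicators $I_v$ are independent and that no $p_v$ is thresholded at 1; both assumptions are made here for illustration and are not stated on the slide.

```latex
\begin{align*}
\mathbb{E}[\mathrm{mc}_H(S,T)]
  &= \lambda \sum_v p_v
   = \sum_v \min\{C(v,S), C(v,T)\}
   = \mathrm{mc}_G(S,T), \\
\mathrm{Var}[\mathrm{mc}_H(S,T)]
  &= \lambda^2 \sum_v p_v (1 - p_v)
   \le \lambda^2 \sum_v p_v
   = \lambda \cdot \mathrm{mc}_G(S,T), \\
\frac{\sqrt{\mathrm{Var}[\mathrm{mc}_H(S,T)]}}{\mathbb{E}[\mathrm{mc}_H(S,T)]}
  &\le \sqrt{\frac{\lambda}{\mathrm{mc}_G(S,T)}}
   = \frac{1}{\sqrt{\sum_v p_v}}.
\end{align*}
```

Under these assumptions the relative standard deviation is at most $\epsilon$ once $\sum_v p_v \ge 1/\epsilon^2$, which is the same as $\lambda \le \epsilon^2 \cdot \mathrm{mc}_G(S,T)$.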

Importance sampling
Idea 2: for any partition $K = S \cup T$, a large fraction of the cut is supported on some pair of terminals $s \in S$, $t \in T$!
$\mathrm{mincut}(S,T) \approx \sum_v \min\{c(v,s), c(v,t)\}$ (up to a factor $k^2$)
So it is enough to "take care" of all pairs $s,t$.
Set $p_v$ proportional to the contribution of $v$ to the cut between $s,t$, for the "worst" possible $s,t$. Then $p_v$ is a $\approx k^2$ factor approximation to the "ideal" $p_v$, which is enough!

Actual sampling
$p_v = F \cdot \max_{s,t} \frac{\min\{c(v,s), c(v,t)\}}{\sum_u \min\{c(u,s), c(u,t)\}}$ (thresholded at 1)
The ratio answers the question: if there were only two terminals $s,t$, how important would $v$ be? $F = \mathrm{poly}(k/\epsilon)$ is the oversampling factor.
1) $p_v$ is a good approximation to the contribution, which gives concentration by importance sampling.
2) $\sum_v p_v \le F k^2$.
Apply a union bound over all choices of cuts $S \cup T$.
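The probability formula can be computed directly from the capacities. The following is a minimal sketch, assuming the same hypothetical `caps` representation (non-terminal to terminal capacities, no terminal-terminal edges) and taking the maximum over unordered terminal pairs; the function name, the toy graph, and the choice F = 1 are illustrative only.

```python
from itertools import combinations


def sampling_probabilities(caps, terminals, F):
    """p_v = min(1, F * max_{s,t} min{c(v,s),c(v,t)} / sum_u min{c(u,s),c(u,t)})."""
    best = {v: 0.0 for v in caps}
    for s, t in combinations(terminals, 2):
        # Contribution of every non-terminal to the cut separating s from t.
        contrib = {
            v: min(c.get(s, 0.0), c.get(t, 0.0)) for v, c in caps.items()
        }
        total = sum(contrib.values())
        if total == 0:
            continue  # no non-terminal connects s and t; nothing to preserve
        for v in caps:
            best[v] = max(best[v], contrib[v] / total)
    # Oversample by F and threshold at 1.
    return {v: min(1.0, F * best[v]) for v in caps}


# Toy instance (assumed): terminals a, b, c and four non-terminals.
caps = {
    "v1": {"a": 1.0, "b": 2.0},
    "v2": {"a": 3.0, "b": 1.0, "c": 1.0},
    "v3": {"b": 2.0, "c": 2.0},
    "v4": {"a": 1.0, "c": 4.0},
}
terminals = ["a", "b", "c"]
probs = sampling_probabilities(caps, terminals, F=1.0)
print(probs, sum(probs.values()))
```

For each pair the ratios sum to at most 1 over all non-terminals, so the total of the probabilities is at most F times the number of pairs, in line with the bound in item 2.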

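Checking importance sampling
$p_v = F \cdot \max_{s,t} \frac{\min\{c(v,s), c(v,t)\}}{\sum_u \min\{c(u,s), c(u,t)\}}$ (thresholded at 1)
1) For a suitable pair $s \in S$, $t \in T$: $p_v^* = \frac{\min\{C(v,S), C(v,T)\}}{\sum_u \min\{C(u,S), C(u,T)\}} \le k^2 \cdot \frac{\min\{c(v,s), c(v,t)\}}{\sum_u \min\{c(u,s), c(u,t)\}} \le p_v$ if $F \ge k^2$
2) $\sum_v p_v \le F \sum_v \sum_{s,t} \frac{\min\{c(v,s), c(v,t)\}}{\sum_u \min\{c(u,s), c(u,t)\}} \le F \sum_{s,t} 1 = F k^2$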
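Both bounds can be sanity-checked by brute force on a small graph. The sketch below assumes a hypothetical `caps` dictionary (non-terminal to terminal capacities, no terminal-terminal edges), enumerates every partition of the terminals, and reports the largest observed ratio between the ideal probability $p_v^*$ and the best pairwise ratio; all names and the toy instance are illustrative.

```python
from itertools import combinations


def pair_ratio(caps, v, s, t):
    """Importance of v if s and t were the only two terminals."""
    total = sum(min(c.get(s, 0.0), c.get(t, 0.0)) for c in caps.values())
    if total == 0:
        return 0.0
    return min(caps[v].get(s, 0.0), caps[v].get(t, 0.0)) / total


def ideal_probability(caps, v, S, T):
    """p*_v: share of v in mincut(S,T) = sum_u min{C(u,S), C(u,T)}."""
    def contribution(u):
        to_S = sum(c for x, c in caps[u].items() if x in S)
        to_T = sum(c for x, c in caps[u].items() if x in T)
        return min(to_S, to_T)

    total = sum(contribution(u) for u in caps)
    return contribution(v) / total if total > 0 else 0.0


def check_bounds(caps, terminals, F):
    """Return (largest p*_v / best pair ratio over all cuts, sum of p_v)."""
    pairs = list(combinations(terminals, 2))
    best = {v: max(pair_ratio(caps, v, s, t) for s, t in pairs) for v in caps}
    p = {v: min(1.0, F * best[v]) for v in caps}
    worst = 0.0
    for size in range(1, len(terminals)):
        for chosen in combinations(terminals, size):
            S = set(chosen)
            T = set(terminals) - S
            for v in caps:
                ideal = ideal_probability(caps, v, S, T)
                if ideal > 0:
                    worst = max(worst, ideal / best[v])
    return worst, sum(p.values())


# Toy instance (assumed): terminals a, b, c and four non-terminals.
caps = {
    "v1": {"a": 1.0, "b": 2.0},
    "v2": {"a": 3.0, "b": 1.0, "c": 1.0},
    "v3": {"b": 2.0, "c": 2.0},
    "v4": {"a": 1.0, "c": 4.0},
}
terminals = ["a", "b", "c"]
worst_ratio, total_p = check_bounds(caps, terminals, F=1.0)
print(worst_ratio, total_p)
```

Item 1 predicts that the first number never exceeds $k^2$, and item 2 predicts that the second never exceeds $F k^2$; a check on one small instance is of course only an illustration, not a proof.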

Flow (node) sparsifiers
The same $p_v$'s also work for flow sparsifiers. Here concentration means concentration of LP values, and one needs to show concentration for both the primal and the dual LP.
Also works when the non-terminals form small independent graphs.

Remarks
Node sparsifiers via structure sampling: sample graph sub-structures, and assign probabilities using importance sampling.
Works for bipartite graphs: beats the $2^{\Omega(k)}$ lower bounds for exact sparsifiers!
OPEN: $\mathrm{poly}(k/\epsilon)$ size for general graphs?

Graph compression via sampling (≈)
Seen:
I) Cut sparsifiers via sampling edges
II) Smaller sparsifiers by relaxing constraints
III) Small cut (node) sparsifiers for bipartite graphs, via structure sampling
Meta-open: structure sampling for node sparsifiers in general graphs? How to define "≈" without a fixed terminal set $K$?