Approximating the MST Weight in Sublinear Time Bernard Chazelle (Princeton) Ronitt Rubinfeld (NEC) Luca Trevisan (U.C. Berkeley)


Sublinear Time Algorithms
– Make sense for problems on very large data sets
– Go contrary to the common intuition that “an algorithm must be given at least enough time to read all the input”
– Must be probabilistic
– Must be approximate

Approximation
– For decision problems: the output is the correct answer either for the given input, or at least for some other input “close” to it. (Property Testing)
– For optimization problems: the output is a number that is close to the cost of the optimal solution for the given input. (There is not enough time to construct a solution.)

Previous Examples
– The cost of the max cut in a graph with $n$ nodes and $cn^2$ edges can be approximated to within relative error $\varepsilon$ in time $2^{\mathrm{poly}(1/(\varepsilon c))}$. (Goldreich, Goldwasser, Ron)
– Other results for “dense” instances of optimization problems, for low-rank approximation of matrices, ...
– No results (that we know of) for problems on bounded-degree graphs.

Our Result
– Given a connected weighted graph $G$ with maximum degree $d$ and weights in the range $\{1, \ldots, w\}$, we can compute the weight of the minimum spanning tree of $G$ to within relative error $\varepsilon$ in time $O(d w \varepsilon^{-2} \log(w/\varepsilon))$.
– We also prove that it is necessary to look at $\Omega(d w \varepsilon^{-2})$ entries in the representation of $G$.
– (We assume that $G$ is represented using adjacency lists.)

Main Intuition
– Suppose all weights are 1 or 2.
– Then the MST weight is equal to $n - 2$ + (number of connected components induced by weight-1 edges).
[Figure: a graph with weight-1 and weight-2 edges, the connected components induced by weight-1 edges, and the resulting MST]
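The formula can be checked with a short counting argument. The following is a sketch, writing $c_1$ for the number of components induced by weight-1 edges and assuming $G$ is connected: an MST first takes a spanning forest of the weight-1 subgraph and then joins its $c_1$ trees with weight-2 edges.

```latex
\begin{align*}
\#\{\text{weight-1 edges in the MST}\} &= n - c_1, \\
\#\{\text{weight-2 edges in the MST}\} &= c_1 - 1, \\
\text{MST weight} &= (n - c_1) + 2\,(c_1 - 1) = n - 2 + c_1 .
\end{align*}
```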

Algorithm

Algorithm for weights in {1,2}
– To approximate the MST weight to within a multiplicative factor $(1+\varepsilon)$, it is enough to approximate $c_1$ to within an additive error $\varepsilon n$ ($c_1$ := number of connected components induced by weight-1 edges).
– To approximate $c_1$ we use ideas from Goldreich-Ron (property testing of connectivity).
– The algorithm runs in time $O(d \varepsilon^{-2} \log \varepsilon^{-1})$.

Approximating # of connected components
– Given a graph $G$ of max degree $d$ with $n$ nodes, we want to compute $c$, the number of connected components of $G$, up to an additive error $\varepsilon n$.
– For every vertex $u$, define $n_u := 1 / (\text{size of the component of } u)$.
– Then $c = \sum_u n_u$.
– And if we call $a_u := \max\{n_u, \varepsilon\}$, then $c = \sum_u a_u \pm \varepsilon n$.
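A small exact computation may help make the two identities concrete. This is only an illustrative sketch: the helper names (`component_sizes`, `exact_and_truncated_sums`) and the example graph are assumptions, not part of the slides, and the code reads the whole graph rather than running in sublinear time.

```python
from collections import deque

def component_sizes(adj):
    """Return size[u] = number of vertices in the component of u (exact BFS)."""
    size = [0] * len(adj)
    for start in range(len(adj)):
        if size[start]:
            continue  # component already labelled
        comp, seen, queue = [start], {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.append(v)
                    queue.append(v)
        for u in comp:
            size[u] = len(comp)
    return size

def exact_and_truncated_sums(adj, eps):
    """Return (sum of n_u, sum of a_u) with n_u = 1/size and a_u = max(n_u, eps)."""
    n_u = [1.0 / s for s in component_sizes(adj)]
    a_u = [max(x, eps) for x in n_u]
    return sum(n_u), sum(a_u)

# Hypothetical example: path 0-1-2-3, edge 4-5, isolated vertex 6 (3 components).
adj = [[1], [0, 2], [1, 3], [2], [5], [4], []]
c, c_trunc = exact_and_truncated_sums(adj, eps=0.3)
```

Here the sum of $n_u$ equals the number of components, and replacing each $n_u$ by $a_u$ raises the sum by at most $\varepsilon$ per vertex, hence by at most $\varepsilon n$ overall.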

Wrapping up the analysis
– We can estimate $\sum_u a_u$ using sampling.
– Once we pick a vertex $u$ at random, the value $a_u$ can be computed in time $O(d/\varepsilon)$.
– We need to pick $O(1/\varepsilon^2)$ vertices, so we get running time $O(d/\varepsilon^3)$.

Algorithm CC-APPROX($\varepsilon$)
Repeat $O(1/\varepsilon^2)$ times
    pick a random vertex v
    do a BFS from v, stopping after $2/\varepsilon$ steps
    b := 1 / number of visited vertices
return (average of the values b) * n
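The pseudocode above can be sketched in Python as follows. Several details are assumptions made for illustration: the graph is a list of adjacency lists, "stopping after $2/\varepsilon$ steps" is read as "stop once $\lceil 2/\varepsilon \rceil$ vertices have been visited", and the constant 4 in the number of repetitions is an arbitrary choice for the $O(1/\varepsilon^2)$ bound.

```python
import math
import random
from collections import deque

def truncated_bfs_size(adj, v, limit):
    """Count vertices reached by BFS from v, stopping once `limit` are visited."""
    seen = {v}
    queue = deque([v])
    while queue and len(seen) < limit:
        u = queue.popleft()
        for x in adj[u]:
            if x not in seen and len(seen) < limit:
                seen.add(x)
                queue.append(x)
    return len(seen)

def cc_approx(adj, eps, rng, reps=None):
    """Estimate the number of connected components by sampling start vertices."""
    n = len(adj)
    limit = math.ceil(2 / eps)            # assumed reading of "2/eps steps"
    if reps is None:
        reps = math.ceil(4 / eps ** 2)    # assumed constant in O(1/eps^2)
    total = 0.0
    for _ in range(reps):
        v = rng.randrange(n)              # pick a random vertex
        total += 1.0 / truncated_bfs_size(adj, v, limit)
    return n * total / reps               # (average of the values b) * n

# Hypothetical example: 100 disjoint edges on 200 vertices, so 100 components.
adj = [[i ^ 1] for i in range(200)]
estimate = cc_approx(adj, eps=0.2, rng=random.Random(0))
```

In this example every vertex lies in a component of size 2, so each sample contributes exactly 1/2 and the estimate does not depend on the random choices.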

Improved Algorithm CC-APPROX($\varepsilon$, $W$)
Repeat $O(1/\varepsilon^2)$ times
    pick a random vertex v
    do first step of a BFS from v
    b := 0; t := 1
    (*) flip a coin
    if heads, and visited < W nodes so far
        t := 2*t
        continue BFS until it ends or t nodes are visited
        if BFS ends, b := 2^(number of coin flips) / (number of nodes visited)
        else go to (*)
return (average of the values b) * n
Inner procedure takes average $O(d \log W)$ time
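One possible reading of the improved procedure, as a hedged Python sketch. The names `sample_b` and `cc_approx_improved`, the adjacency-list representation, and the explicit `reps` parameter are assumptions; the slides fix only the control flow. A run returns b = 0 as soon as a coin comes up tails or W nodes have already been visited.

```python
import random
from collections import deque

def sample_b(adj, v, W, rng):
    """One run of the inner procedure from start vertex v; returns the value b."""
    seen = {v}                 # first BFS step: visit v
    queue = deque([v])
    t, flips = 1, 0
    while True:
        flips += 1             # (*) flip a coin
        heads = rng.random() < 0.5
        if not heads or len(seen) >= W:
            return 0.0         # b stays 0
        t *= 2
        # Resume the BFS until it ends or t nodes have been visited.
        while queue and len(seen) < t:
            u = queue[0]
            for x in adj[u]:
                if x not in seen:
                    seen.add(x)
                    queue.append(x)
                    if len(seen) >= t:
                        break
            else:
                queue.popleft()    # u is fully expanded
        if not queue:              # BFS ended: the whole component was explored
            return 2.0 ** flips / len(seen)

def cc_approx_improved(adj, eps, W, rng, reps):
    """Average b over `reps` random start vertices and scale by n.

    `eps` is kept only to mirror CC-APPROX(eps, W); the caller passes `reps`."""
    n = len(adj)
    total = sum(sample_b(adj, rng.randrange(n), W, rng) for _ in range(reps))
    return n * total / reps

# Hypothetical example: 100 disjoint edges on 200 vertices, so 100 components.
adj = [[i ^ 1] for i in range(200)]
estimate = cc_approx_improved(adj, eps=0.2, W=10, rng=random.Random(0), reps=20000)
```

In this sketch, a start vertex whose component has size below W finishes its BFS in a fixed round k, which is reached with probability $2^{-k}$ and yields $b = 2^k / (\text{component size})$, so the average of b is 1/(component size).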

Analysis
– Main idea: if $v$ is in a component of size $c < W$, then $b$ is zero with probability $\sim(1 - 1/c)$ and $\sim 1$ with probability $\sim 1/c$. The average of $b$ is $1/c$.
– Setting $W := 2/\varepsilon$ we get:
  – each time, the average of $b$ is within $\varepsilon/2$ of the average over $v$ of $n_v$ (that is, (# conn. comp.)/$n$)
  – repeating $O(1/\varepsilon^2)$ times, the probability of deviating by another $\varepsilon/2$ is bounded by a constant
  – the average running time is $O(d \varepsilon^{-2} \log W)$, that is, $O(d \varepsilon^{-2} \log \varepsilon^{-1})$.

General Weights
– Generalize the argument for weights 1 and 2.
– Let $c_i$ = number of connected components induced by edges of weight at most $i$.
– Then the MST weight is $n - w + \sum_{i=1}^{w-1} c_i$.
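The identity can be sanity-checked against Kruskal's algorithm on a small graph. The sketch below uses a hand-rolled union-find; the function names and the example edge list are assumptions made for illustration, and the graph is assumed connected with integer weights in {1, ..., w}.

```python
def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def count_components(n, edges, max_weight):
    """c_i: components of the subgraph keeping only edges of weight <= max_weight."""
    parent = list(range(n))
    comps = n
    for u, v, wt in edges:
        if wt <= max_weight:
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                parent[ru] = rv
                comps -= 1
    return comps

def mst_weight_kruskal(n, edges):
    """Exact MST weight via Kruskal's algorithm."""
    parent = list(range(n))
    total = 0
    for u, v, wt in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            total += wt
    return total

def mst_weight_from_components(n, edges, w):
    """MST weight as n - w + sum of c_i for i = 1, ..., w-1."""
    return n - w + sum(count_components(n, edges, i) for i in range(1, w))

# Hypothetical example: 5 vertices, weights in {1, 2, 3}.
edges = [(0, 1, 1), (1, 2, 3), (2, 3, 1), (3, 4, 2), (4, 0, 3), (1, 3, 2)]
via_kruskal = mst_weight_kruskal(5, edges)
via_formula = mst_weight_from_components(5, edges, w=3)
```

For this graph $c_1 = 3$ and $c_2 = 1$, so the formula gives $5 - 3 + 3 + 1 = 6$, matching Kruskal.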

Final Algorithm
For $i = 1, \ldots, w-1$:
    call CC-APPROX($\varepsilon$, $2w/\varepsilon$) on the subgraph of $G$ obtained by removing edges of cost $> i$
    get $a_i$, an approximation of $c_i$
Return $n - w + \sum_{i=1}^{w-1} a_i$
– The average answer is within $\varepsilon n/2$ of the cost of the MST, and the variance is bounded.
– Total running time: $O(d w \varepsilon^{-2} \log(w/\varepsilon))$.
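A compact sketch of how the pieces could fit together. To stay short it swaps in a simpler estimator than the coin-flipping CC-APPROX: a BFS capped at $W = \lceil 2w/\varepsilon \rceil$ visited vertices, with b = 1 / (vertices visited). The function names, the `(neighbor, weight)` adjacency format, and the repetition constant are all assumptions.

```python
import math
import random
from collections import deque

def bfs_size(adj, v, max_weight, limit):
    """Vertices reached from v using edges of weight <= max_weight, capped at limit."""
    seen = {v}
    queue = deque([v])
    while queue and len(seen) < limit:
        u = queue.popleft()
        for x, wt in adj[u]:
            if wt <= max_weight and x not in seen and len(seen) < limit:
                seen.add(x)
                queue.append(x)
    return len(seen)

def approx_mst_weight(adj, w, eps, rng, reps=None):
    """Return n - w + sum of estimated c_i for i = 1, ..., w-1."""
    n = len(adj)
    limit = math.ceil(2 * w / eps)        # W = 2w/eps, as on the slide
    if reps is None:
        reps = math.ceil(4 / eps ** 2)    # assumed constant in O(1/eps^2)
    total = float(n - w)
    for i in range(1, w):
        b_sum = 0.0
        for _ in range(reps):
            v = rng.randrange(n)
            b_sum += 1.0 / bfs_size(adj, v, i, limit)
        total += n * b_sum / reps         # a_i, the estimate of c_i
    return total

# Hypothetical example: an 8-cycle with edge weights 1, 2, 1, 3, 1, 2, 1, 3.
weights = [1, 2, 1, 3, 1, 2, 1, 3]
adj = [[] for _ in range(8)]
for i, wt in enumerate(weights):
    j = (i + 1) % 8
    adj[i].append((j, wt))
    adj[j].append((i, wt))
estimate = approx_mst_weight(adj, w=3, eps=0.5, rng=random.Random(0))
```

In this example all components at each threshold have the same size (2 for $i = 1$, 4 for $i = 2$), so the estimate equals the true MST weight $8 - 3 + 4 + 2 = 11$ regardless of which vertices are sampled.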

Extensions
– Low average degree
– Non-integer weights

Lower Bound

Abstract sampling problem
– Fix $p$, $\varepsilon$.
– Define two binary distributions $A$, $B$:
  $\Pr[A=1] = p$, $\Pr[A=0] = 1-p$
  $\Pr[B=1] = p + \varepsilon p$, $\Pr[B=0] = 1 - p - \varepsilon p$
– Distinguishing $A$ from $B$ with constant probability requires $\Omega(1/(p \varepsilon^2))$ samples.

Reduction
– Fix $p = 1/w$.
– We consider two distributions of weights over a cycle of length $n$.
– In distribution $G$, for each edge we sample from $A$; if $A=0$ the edge gets weight 1, otherwise it gets weight $w$.
– In distribution $H$, same with $B$.
– $H$ and $G$ are likely to have MST costs that differ by about $\varepsilon n$.
– To distinguish them we need to look at $\Omega(w/\varepsilon^2)$ edge weights.
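The claimed gap of about $\varepsilon n$ can be seen from a rough expectation calculation. The sketch below writes $K$ for the number of weight-$w$ edges on the cycle (notation introduced here, not on the slides) and ignores the unlikely event $K = 0$; the MST of a cycle is the cycle minus one heaviest edge.

```latex
\begin{align*}
\mathbb{E}[K_G] &= pn = \frac{n}{w}, \qquad
\mathbb{E}[K_H] = (p + \varepsilon p)\,n = \frac{(1+\varepsilon)\,n}{w}, \\
\text{MST cost} &= \underbrace{n + (w-1)\,K}_{\text{total cycle weight}}
                 \;-\; \underbrace{w}_{\text{dropped heaviest edge}} \qquad (K \ge 1), \\
\mathbb{E}[\text{cost}_H] - \mathbb{E}[\text{cost}_G]
  &\approx (w-1)\cdot\frac{\varepsilon n}{w}
   = \varepsilon n \left(1 - \frac{1}{w}\right) \approx \varepsilon n .
\end{align*}
```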

Higher Degree
– Sample from $G$ or $H$ as before, and
  – also add $d-1$ forward edges of weight $w+1$ from each vertex
  – randomly permute names of vertices
– Now, on average, reading $t$ edge weights gives us $t/d$ samples from $A$ or $B$, so $t = \Omega(dw/\varepsilon^2)$.

Conclusions
– A plausibility result showing that approximation for a standard graph problem in bounded-degree (and sparse) graphs can be achieved in time independent of the number of vertices
– Use of approximate cost without solution?
– More problems?
  – The trivial Max SAT approximation algorithms can be implemented in constant time, and give (an implicit representation of) a solution
  – Non-trivial Max SAT approximation? (say, 3/4)
  – Something really useful?