Building MST with Õ(n) bits of communication in a Distributed Network
Valerie King, University of Victoria
Joint work with Mikkel Thorup and Shay Kutten

CONGEST model: a network with n nodes and m edges. Each node has local info: its incident edges and their weights, and its neighbors' IDs. Nodes have distinct IDs in [1, …, n^c].

Communication: each node may send (different) messages of size O(log n) to all its neighbors in a single step. Synchronous vs. asynchronous.

A network maintains a subgraph if its edges are marked by their endpoints.

ST (or MST) Problem: build a spanning tree or a minimum spanning tree.

Classic result (1983): MST with O(m + n log n) messages, Gallager, Humblet and Spira (JACM). Asynchronous; works if starting with any number of awake nodes; time O(n^2), or O(n log n) if all nodes wake at the start.

"Folk Theorem", Awerbuch, Goldreich, Peleg, Vainish (JACM 1990): "If each processor knows only its ID and the IDs of its neighbors, then flooding is the best that can be done."

Awerbuch, Goldreich, Peleg, Vainish (JACM 1990) show: constructing a spanning tree requires Ω(m) messages, even if the network is synchronous, all nodes are awake at the start, the algorithm is Monte Carlo, nodes know the IDs of their immediate neighbours, and nodes know n, the size of the network.

Provided… the Ω(m)-message lower bound for constructing a spanning tree holds under all of the conditions above, provided IDs can only be compared.

Without that proviso, they show: constructing a spanning tree requires Ω(m) messages even if the network is synchronous, all nodes are awake at the start, the algorithm is deterministic, nodes know the IDs of their immediate neighbours, nodes know the size of the network, and the ID space is superpolynomial.

Another lower bound: Kutten, Pandurangan, Peleg, Robinson, Trehan (PODC 2013). Constructing a tree requires Ω(m) messages even if the network is synchronous, all nodes are initialized at the start, the algorithm is Monte Carlo, nodes do not know the IDs of their immediate neighbours, and nodes know the size of the network.

This work: an MST can be constructed in Õ(n) bits of communication and time (and O(log n) bits of space per node) if the network is synchronous, all nodes are initialized at the start, the algorithm is Monte Carlo, nodes know the IDs of their immediate neighbours, and nodes know n. (Using operations which combine IDs.)

The algorithm: a Borůvka-style parallel algorithm with lg n phases. In each phase, w.h.p. each component finds the lightest edge incident to it and merges. Communication and time per phase: Õ(n). Start a new phase only after the time to finish processing the largest cluster has passed. End when there is no progress after log n tries.
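
The phase structure is classic Borůvka. Below is a minimal sequential sketch of that merging pattern: a centralized stand-in rather than the distributed Õ(n)-bit implementation. All names are illustrative and edge weights are assumed distinct.

```python
# Minimal sequential sketch of the Boruvka-style phases (centralized stand-in).
def boruvka_mst(n, edges):
    """edges: list of (weight, u, v) with distinct weights; returns MST edge set."""
    parent = list(range(n))

    def find(x):                          # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = set()
    while True:
        # Each component finds the lightest edge leaving it (one phase).
        lightest = {}
        for w, u, v in edges:
            cu, cv = find(u), find(v)
            if cu == cv:
                continue                  # internal edge: ignore
            for c in (cu, cv):
                if c not in lightest or w < lightest[c][0]:
                    lightest[c] = (w, u, v)
        if not lightest:                  # no component has an outgoing edge: done
            break
        for w, u, v in lightest.values():  # merge along the chosen edges
            cu, cv = find(u), find(v)
            if cu != cv:
                parent[cu] = cv
                mst.add((w, u, v))
    return mst
```

With distinct weights the chosen lightest edges form a forest, so the number of components at least halves each phase, giving the lg n phase bound.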

Basic problem: find an edge leaving a marked tree in a marked forest.

Also solves the MST updating problem. The previous method (Awerbuch, Cidon, Kutten: 1990, 2008) uses O(n) amortized messages per update but requires lots of memory: local memory = O(n × degree of node × log n). (Each node stores a belief about each neighbor's version of the forest!) Here we use only O(log n) local memory.

Ideas came from the dynamic connectivity data structure (Kapron, King, Mountjoy 2013) and graph sketching (Ahn, Guha, McGregor 2012), while sitting in the Simons Institute.

Ideas from that work:
1. Σ_{v in C} deg(v) mod 2 = |(C, V\C)| mod 2, because edges with both endpoints in C contribute 2, i.e., 0 mod 2. If edges are randomly sampled, Pr(Σ deg(v) mod 2 = 1) = 1/2 iff (C, V\C) is nonempty; else it is always 0.
2. The XOR of edge names = the name of the edge leaving, if there is exactly one. (Note: the name of edge {u,v} is the pair (u,v) with u < v.)
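
A toy check of these two observations, as a centralized sketch; the edge encoding and function names are illustrative, not the paper's.

```python
# Edges are pairs (u, v) of node IDs in [0, n); C is the node set of one component.
def encode(u, v, n):
    u, v = min(u, v), max(u, v)       # name of edge {u,v} is (u,v) with u < v
    return u * n + v                  # packed into an O(log n)-bit word

def cut_parity(C, sampled_edges):
    # Sum of degrees over C within the sample, mod 2: internal edges contribute
    # 2 and cut edges contribute 1, so this equals |sampled cut edges| mod 2.
    return sum((u in C) + (v in C) for u, v in sampled_edges) % 2

def xor_of_incident_names(C, edges, n):
    # Each node in C XORs the names of its incident edges; internal edges appear
    # twice and cancel, so if exactly one edge leaves C the result is its name.
    acc = 0
    for u, v in edges:
        for x in (u, v):
            if x in C:
                acc ^= encode(u, v, n)
    return acc

# Example: C = {0, 1}; edge {0,1} is internal, {1,2} is the unique cut edge.
n = 4
edges = [(0, 1), (1, 2)]
assert divmod(xor_of_incident_names({0, 1}, edges, n), n) == (1, 2)
```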

We use a basic communication step: broadcast-and-echo over known tree edges. The leader node sends to the whole tree; the response is composed from the leaves back up to the leader.

Finding any edge leaving a component uses O(1) expected broadcast-and-echoes. Finding the min-weight edge leaving a component uses O(log n / log log n) expected broadcast-and-echoes.

Find Any: parallel search packed in one word. U = {edges}. The leader randomly chooses h, a 2-wise independent hash h: U → U. Each node v computes b(v) = b_1 b_2 … b_{log n}, where b_i(v) = |{(u,v) s.t. h(u,v) ≤ 2^i}| mod 2, i.e., the parity of the hashed values of v's incident edges in each range.

Find Any (cont'd):
1. XOR the b(v) over nodes v in the tree.
2. min ← the first i with bit ≠ 0.
3. XOR the names of edges incident to all v in the tree with h(edge) ≤ 2^min.
4. With constant probability there is exactly one such edge which is leaving, so the XOR = the name of the edge leaving.
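
A centralized simulation of these steps (illustrative names; the distributed version computes the same XORs with one broadcast-and-echo over the tree). The hash range p is an assumption.

```python
import random

def find_any_out_edge(C, edges, n, p=(1 << 31) - 1):
    a, b = random.randrange(1, p), random.randrange(p)   # 2-wise independent h
    h = lambda x: (a * x + b) % p

    def name(u, v):
        u, v = min(u, v), max(u, v)
        return u * n + v

    levels = p.bit_length() + 1          # thresholds 2^0 .. 2^(levels-1) cover all hashes
    # bits[i] = XOR over v in C of |{incident edges with h(name) <= 2^i}| mod 2;
    # internal edges cancel, so bits[i] is the parity of qualifying cut edges.
    bits = [0] * levels
    for u, v in edges:
        hv = h(name(u, v))
        for x in (u, v):
            if x in C:
                for i in range(levels):
                    if hv <= (1 << i):
                        bits[i] ^= 1

    nonzero = [i for i in range(levels) if bits[i]]
    if not nonzero:
        return None                      # no outgoing edge detected this round
    i_min = nonzero[0]

    # XOR the names of incident edges whose hash is <= 2^i_min; with constant
    # probability exactly one cut edge qualifies, so the XOR is its name.
    acc = 0
    for u, v in edges:
        if h(name(u, v)) <= (1 << i_min):
            for x in (u, v):
                if x in C:
                    acc ^= name(u, v)
    return divmod(acc, n)                # candidate edge; must still be verified
```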

Find the min-weight edge leaving. Two tests to see whether there is an edge leaving a component:
1. Constant-probability TestOut: 1 broadcast, 1 one-bit echo.
2. High-probability HP-TestOut: 1 broadcast, 1 log n-bit echo.

Find the min-weight edge leaving:
1. In parallel, with one broadcast-and-echo, do a (log n)-wise search on weights with TestOuts to find the smallest weight interval.
2. Verify its minimality with HP-TestOut.
3. Repeat if wrong, or recurse on that interval.
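
A rough centralized sketch of how this interval search might be organized, assuming test_out has one-sided constant-probability error (never reports an edge where there is none) and hp_test_out is correct w.h.p. Both helpers and the retry policy are illustrative, not the paper's exact procedure.

```python
import math

def find_min_out_weight(lo, hi, n, test_out, hp_test_out):
    """Search weight range [lo, hi) for the lightest outgoing weight."""
    b = max(2, int(math.log2(n)))                 # branching factor ~ log n
    while hi - lo > 1:
        step = max(1, -(-(hi - lo) // b))         # ceil((hi - lo) / b)
        picked = None
        # In the distributed version, all b TestOuts share one broadcast-and-echo.
        for left in range(lo, hi, step):
            if test_out(left, min(left + step, hi)):
                picked = left
                break
        if picked is None:
            return None                           # looked empty; caller retries
        if picked > lo and hp_test_out(lo, picked):
            continue                              # a lighter edge was missed: repeat
        lo, hi = picked, min(picked + step, hi)   # recurse on that interval
    return lo if hp_test_out(lo, hi) else None
```

Each iteration shrinks the interval by a factor of about log n, which matches the O(log n / log log n) bound on the number of broadcast-and-echoes.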

TestOut: randomly choose a function F: edges → {0,1} such that, for a nonempty set S, with constant probability > 0, an odd number of S's elements map to 1.

Simple F (Thorup): F has two parts, a 2-wise independent hash function h: U → U and t, a random element of U. F(s) = 1 iff h(s) < t. F can be described in O(log |U|) bits.
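
A sketch of this simple F; the modulus p is an assumption standing in for |U|, and F is described by the three values (a, b, t), i.e., O(log |U|) bits.

```python
import random

def make_F(p=(1 << 31) - 1):
    a, b = random.randrange(1, p), random.randrange(p)   # 2-wise independent h
    t = random.randrange(p)                              # random threshold
    return lambda s: 1 if (a * s + b) % p < t else 0

def test_out(S, F):
    # XOR of F over the set S: always 0 if S is empty,
    # 1 with constant probability if S is nonempty.
    acc = 0
    for s in S:
        acc ^= F(s)
    return acc
```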

Why it works: h hashes U → U and S is a subset of U. Imagine 2|S| equal-sized intervals. With constant probability, the interval I containing t holds exactly one hashed element x of S, with x in the middle third of I and t in the top third or bottom third of I.

HP-TestOut: repeat TestOut in parallel O(log n) times to get error probability 1/n^c? That would require sending c log n hash functions. Or use deterministic amplification. Or…

Test 2: a set-equality method to get HP-TestOut. A tree T is maximal iff the set Out(T) of edges leaving all nodes in T equals the set In(T) of edges entering all nodes in T.

Test equality of two sets using polynomial identity testing, with O(log n) bits of communication. Let f(x) = Π_{a in Out} (x - a) and g(x) = Π_{b in In} (x - b). Does f(x) = g(x)? (Schwartz-Zippel): set x = a random α in Z_p, with p a prime, p > n^{c+2}, and compute over Z_p. If In ≠ Out, then Pr(f(α) - g(α) = 0) = Pr[α is a root] = (#roots)/p ≤ n^2/p < 1/n^c.
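
A sketch of this identity test, assuming edge names are integers less than p and Out/In are the two edge-name sets.

```python
import random

def sets_probably_equal(Out, In, p):
    alpha = random.randrange(p)
    f = 1
    for a in Out:
        f = f * (alpha - a) % p
    g = 1
    for b in In:
        g = g * (alpha - b) % p
    # If Out != In, f(alpha) == g(alpha) only when alpha hits a root of f - g:
    # probability at most (#roots)/p <= n^2/p < 1/n^c for p > n^(c+2).
    return f == g
```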

Implementation (one broadcast-and-echo): The leader broadcasts α. Each node v computes f_v(α) and g_v(α) for its incident edges. Starting at the leaves, each node passes to its parent f_v(α) · Π_c f_c(α), where c ranges over the children of v; g is computed similarly. The leader (root) computes f - g: if 0, TestOut is true, else false.
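
A sketch of the echo (convergecast) step, with illustrative names: children maps each node to its child list in the broadcast tree, and local_f/local_g hold f_v(α) and g_v(α) computed from each node's incident edges.

```python
def echo_products(v, children, local_f, local_g, p):
    """Return (f, g): products of the local evaluations over v's subtree, mod p."""
    f, g = local_f[v] % p, local_g[v] % p
    for c in children[v]:
        fc, gc = echo_products(c, children, local_f, local_g, p)
        f = f * fc % p
        g = g * gc % p
    return f, g

# The leader (root) then checks whether f - g == 0 (mod p); equality means
# Out(T) and In(T) are (probably) equal, i.e., no edge leaves the tree.
```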

In short: we build an MST in O(n log^2 n / log log n) messages and time. Can we build a spanning tree in O(n log n) messages and time? We can repair a spanning tree in O(n) expected messages, and repair an MST in O(n log n / log log n) messages w.h.p., all using O(log n) local memory.

Open problems and discussion: Is Ω(m) communication required for building an MST in the asynchronous model? A lower bound for find-min? Time vs. communication tradeoffs? Applications to MapReduce, etc. Thank you.

"Impromptu" = only local information is maintained by a node between updates.