MST in Log-Star Rounds of Congested Clique
Mohsen Ghaffari and Merav Parter
Ben Gurion Seminar, Nov. 2017
The Congested Clique Model
$n$ machines communicating over a complete graph.
Bandwidth restriction: in every round, each link carries only $B = O(\log n)$ bits.
Classical multi-party communication, number-in-hand model.
The Congested Clique Model
Two graphs: the problem graph and the communication graph (a clique).
In every round, every pair of vertices can exchange $O(\log n)$ bits.
Main complexity parameter: communication rounds.
Congested Clique: Practical Motivation
Overlay networks.
Large-scale graph processing (every node holds partial information about the graph).
Congested Clique: Theoretical Motivation
[Figure: diagram with the labels Congestion, Congest, Local Model, $\Omega(\mathrm{Diameter})$, and Locality.]
MST in the Congested Clique Model
Boruvka (1926): $O(\log n)$ rounds of component merging.
Each component selects its minimum-weight outgoing edge.
Components connected by a selected edge are merged.
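The merging rule can be sketched as a centralized simulation. This is a minimal sketch, not the distributed protocol: it assumes vertices labelled 0..n-1 and distinct edge weights, and the function name `boruvka_mst` is illustrative.

```python
def boruvka_mst(n, edges):
    """Centralized simulation of Boruvka-style merging.

    n: number of vertices, labelled 0..n-1 (assumption).
    edges: list of (weight, u, v) with distinct weights (assumption).
    Returns (mst_edges, phases).
    """
    leader = list(range(n))  # leader[v] points towards the component ID of v

    def find(v):
        while leader[v] != v:
            leader[v] = leader[leader[v]]  # path halving
            v = leader[v]
        return v

    mst, phases = [], 0
    while True:
        # Each component selects its minimum-weight outgoing edge.
        best = {}
        for w, u, v in edges:
            cu, cv = find(u), find(v)
            if cu == cv:
                continue  # internal edge, ignore
            for c in (cu, cv):
                if c not in best or w < best[c][0]:
                    best[c] = (w, u, v)
        if not best:
            break  # no outgoing edges remain
        phases += 1
        # Merge components connected by a selected edge.
        for w, u, v in best.values():
            cu, cv = find(u), find(v)
            if cu != cv:  # the same edge may be selected by both endpoints
                leader[cu] = cv
                mst.append((w, u, v))
    return mst, phases
```

Every component that still has an outgoing edge merges with at least one other component in each phase, so the number of such components at least halves per phase; this is where the $O(\log n)$ bound comes from.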
MST in O(log n) Rounds (Boruvka 1926)
At the beginning of every round, each node knows its component ID (leader ID) and the components of its neighbors.
[Figure: example graph on nodes a through p with edge weights 1, 2, 3.]
MST in O(log n) Rounds (Boruvka 1926)
R1: each node sends its lightest outgoing edge to its local leader.
R2: each local leader picks the lightest edge and sends it to the global leader.
R3: the global leader merges components and sends each node its new component ID.
R4: each node broadcasts its component ID.
MST in O(log n) Rounds (Boruvka 1926)
[Figure: Phases 1 to 4 of the merging on the example graph with nodes a through p.]
MST in the Congested Clique Model
Lotker, Pavlov, Patt-Shamir and Peleg [SICOMP’05]: $O(\log \log n)$ rounds of component merging.
Quadratic growth rate: in $O(1)$ rounds, components of size $x$ grow to components of size $x^2$.
To achieve this, find the $x$ minimum outgoing edges of each component.
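A short calculation shows why quadratic growth gives $O(\log \log n)$ phases. It assumes, for illustration only, that the minimum component size $s_k$ after phase $k$ starts at $s_0 = 2$ and at least squares in every phase.

```latex
\[
  s_0 = 2, \qquad s_{k+1} \ge s_k^2
  \;\Longrightarrow\; s_k \ge 2^{2^k},
  \qquad
  2^{2^k} \ge n \iff k \ge \log_2 \log_2 n .
\]
```

Once the minimum component size reaches $n$, only one component remains, so about $\log_2 \log_2 n$ phases suffice under this assumption.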
Faster MST in the Congested Clique
Hegeman, Pandurangan, Pemmaraju, Sardeshmukh and Scquizzato [PODC’15]: $O(\log \log \log n)$ rounds of component merging.
Run Lotker et al. for $O(\log \log \log n)$ rounds.
Number of components afterwards: $O(n / \mathrm{polylog}\, n)$.
Finish in $O(1)$ rounds using linear sketches.
[Figure: components sending to a leader, labeled $O(\log^4 n)$.]
This Talk: MST in $O(\log^* n)$ Rounds
Road map:
$O(\log^* n)$ graph connectivity algorithm (growing a maximal forest).
Parallel computation of $O(\sqrt{n})$ connectivity instances.
MST.
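For reference, $\log^* n$ is the iterated logarithm: the number of times the logarithm must be applied before the value drops to at most 1. The helper below is an illustrative sketch (base 2 assumed; the name `log_star` is not from the talk).

```python
import math

def log_star(n):
    """Number of times log2 must be applied to n before the result is <= 1."""
    count = 0
    x = n
    while x > 1:
        x = math.log2(x)  # math.log2 also accepts arbitrarily large ints
        count += 1
    return count
```

Even for $n = 2^{65536}$ the value is only 5, which is why $O(\log^* n)$ rounds is nearly constant in practice.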
Basic Connectivity (BC) Algorithm
Each node selects an incident edge.
Contract the selected edges.
Repeat until no edges remain.
Lemma: BC computes a maximal forest within $O(\log n)$ rounds.
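The BC loop can be simulated centrally as follows. This is a sketch under assumptions: vertices are labelled 0..n-1, contraction is modelled with a union-find array, and each contracted node simply keeps the first incident edge it sees (the slide does not say which edge is selected).

```python
def basic_connectivity(n, edges):
    """Simulate BC on an unweighted graph. Returns (forest_edges, rounds)."""
    comp = list(range(n))  # union-find parent array modelling contraction

    def find(v):
        while comp[v] != v:
            comp[v] = comp[comp[v]]
            v = comp[v]
        return v

    forest, rounds = [], 0
    while True:
        # Every contracted node selects one edge that leaves it.
        chosen = {}
        for u, v in edges:
            cu, cv = find(u), find(v)
            if cu != cv:
                chosen.setdefault(cu, (u, v))
                chosen.setdefault(cv, (u, v))
        if not chosen:
            break  # no edges remain in the contracted graph
        rounds += 1
        # Contract the selected edges, skipping any that would close a cycle.
        for u, v in chosen.values():
            cu, cv = find(u), find(v)
            if cu != cv:
                comp[cu] = cv
                forest.append((u, v))
    return forest, rounds
```

The returned forest has one edge fewer than the number of vertices in each connected component, so it is a maximal (spanning) forest of the input graph.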
$O(\log^* n)$ Graph Connectivity Algorithm
Forest growth: from $O(n/\log^2 x)$ to $n/x$ components in $O(1)$ rounds, w.h.p.
The goal: locally simulate $\log x$ merging steps of components in a "Boruvka-style" manner.
$O(\log^* n)$ Graph Connectivity Algorithm
Forest growth: from $O(n/\log^2 x)$ to $n/x$ components.
The challenge: how can a vertex know which of its edges is an outgoing edge?
Our approach: distinguish between the dense case and the sparse case.
Intuition: Dense Scenario
Forest growth: from $O(n/\log^2 x)$ to $n/x$ components.
The dense scenario: the outgoing degree of each component is $\ge x^5$.
Each vertex selects a random edge and sends it to the leader.
Intuition: Dense Scenario
Forest growth: from $O(n/\log^2 x)$ to $n/x$ components.
The dense scenario: outgoing degree $\ge x^5$.
Why does a random edge work? If the current component size is $\le x$, then with good probability a random edge is an outgoing edge.
One step of random merging is "equivalent" to $O(\log x)$ steps of Boruvka-style merging.
Intuition: Sparse Scenario
Forest growth: from $O(n/\log^2 x)$ to $n/x$ components.
The sparse scenario: outgoing degree $\le x^5$.
Design a sparsity-sensitive sketch of size $O(\log^2 x \cdot \log n)$ bits.
The leader simulates $O(\log x)$ Boruvka-style merging steps.
The sketch projects the incident edges into a lower-dimensional space (a linear $\ell_0$ sampler).
Sparsity Sensitive Sketching
Define the sketch and its properties.
Show how to locally simulate $O(\log x)$ rounds of BC.
Show that the sketches can be delivered to the leader in $O(1)$ rounds.
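Sketch of vertex v
Each row $i$ is the XOR of the IDs of a subset of edges $E_i$; an edge is sampled into $E_i$ with probability $2^{-i}$.
The sketch $\mathrm{Sketch}_i(v)$ has $\Theta(\log x)$ rows of $\Theta(\log n)$ bits each.
Row $\log \ell$ contains the ID of exactly one edge with constant probability, where $ID_1, \ldots, ID_\ell$ are the edges of $v$.
[Figure: the rows of $\mathrm{Sketch}_i(v)$, ranging from XORs of many IDs such as $ID_1 \oplus ID_4 \oplus ID_9 \oplus \ldots \oplus ID_\ell$ to rows such as $ID_6 \oplus ID_{12}$ and $ID_{11}$.]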
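A vertex sketch of this shape can be built as below. This is a minimal sketch under assumptions: edge IDs are integers, rows are indexed from 0, and the sampling coin is derived from a shared SHA-256 hash (a stand-in for shared randomness, so that both endpoints of an edge make the same choice). The names `sampled` and `vertex_sketch` are illustrative.

```python
import hashlib

def sampled(edge_id, row, seed):
    """Shared coin: True with probability about 2**-row (for row <= 64)."""
    digest = hashlib.sha256(f"{seed}:{row}:{edge_id}".encode()).digest()
    value = int.from_bytes(digest[:8], "big")  # 64 pseudo-random bits
    return value < 2 ** (64 - row)

def vertex_sketch(incident_edge_ids, num_rows, seed=0):
    """Row i holds the XOR of the incident edge IDs sampled into E_i."""
    rows = [0] * num_rows
    for e in incident_edge_ids:
        for i in range(num_rows):
            if sampled(e, i, seed):
                rows[i] ^= e
    return rows
```

Row 0 always contains the XOR of all incident edges, and each later row keeps roughly half as many edges as the previous one, so some row is expected to hold about one edge.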
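Sketch of a Component
Sketch of component $C$: $\mathrm{Sketch}(C) = \bigoplus_{u \in C} \mathrm{Sketch}(u)$.
[Figure: vertices $v$ and $u$ joined by edge $e_1 = 1000$; $v$ also has edges $e_2 = 0101$ and $e_4 = 1001$, and $u$ also has edge $e_3 = 0111$. Rows of $\mathrm{Sketch}(v)$: $e_1 \oplus e_2 \oplus e_4 = 0100$, $e_1 \oplus e_2 = 1101$, $e_4 = 1001$. Rows of $\mathrm{Sketch}(u)$: $e_1 \oplus e_3 = 1111$, $e_1 = 1000$, $e_3 = 0111$.]
$\mathrm{Sketch}(\{u, v\}) = \mathrm{Sketch}(u) \oplus \mathrm{Sketch}(v)$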
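The figure's numbers can be checked directly. The edge IDs below are the ones shown on the slide; the alignment of rows between the two sketches is an assumption made for this illustration.

```python
# Edge IDs as they appear in the slide's figure.
e1, e2, e3, e4 = 0b1000, 0b0101, 0b0111, 0b1001

# Assumed row alignment: e1 is the shared edge (u, v) and is sampled
# into the same rows at both endpoints.
sketch_v = [e1 ^ e2 ^ e4, e1 ^ e2, e4]  # 0100, 1101, 1001
sketch_u = [e1 ^ e3, e1, e3]            # 1111, 1000, 0111

def xor_sketches(a, b):
    """Row-wise XOR of two sketches of equal length."""
    return [x ^ y for x, y in zip(a, b)]

sketch_uv = xor_sketches(sketch_u, sketch_v)
```

Under this alignment the internal edge $e_1$ cancels in every row: the rows of `sketch_uv` are $e_2 \oplus e_3 \oplus e_4 = 1011$, $e_2 = 0101$, and $e_3 \oplus e_4 = 1110$, so the second row exposes the single outgoing edge $e_2$.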
Properties of the Sketch
Sketches allow a leader to simulate the BC algorithm locally.
Linearity: the sketch of a component is the XOR of the sketches of its vertices.
Internal edges cancel out. For this to hold, the sampling of an edge $(u, v)$ must be consistent between its two endpoints.
The Sketch of a Component is the Sketch of the Contracted Node
[Figure: the two-vertex example with vertices $u$, $v$ and edges $e_1, \ldots, e_4$.]
$\mathrm{Sketch}(\{u, v\}) = \mathrm{Sketch}(u) \oplus \mathrm{Sketch}(v)$
Properties of the Sketch
Sketches allow a leader to simulate the BC algorithm locally.
Linearity (internal edges cancel).
$L_0$-sampler:
Completeness: with constant probability, one row contains a single edge.
Soundness: with high probability, rows containing the XOR of more than one edge are detected.
Example: Sketch of Components
[Figure: the two-vertex example with vertices $u$, $v$ and edges $e_1, \ldots, e_4$.]
$\mathrm{Sketch}(\{u, v\}) = \mathrm{Sketch}(u) \oplus \mathrm{Sketch}(v)$
Properties of the Sketches: $L_0$-Sampler
Component $C$ with $t \le x^5$ outgoing edges.
Row $\log t$ of the sketch: each outgoing edge is sampled with probability $1/t$.
With constant probability, exactly one outgoing edge is sampled.
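The constant can be made explicit with a short calculation, assuming the $t \ge 2$ outgoing edges are sampled independently, each with probability exactly $1/t$.

```latex
\[
  \Pr[\text{exactly one outgoing edge is sampled}]
  = t \cdot \frac{1}{t} \left(1 - \frac{1}{t}\right)^{t-1}
  = \left(1 - \frac{1}{t}\right)^{t-1}
  \ge \frac{1}{e} \approx 0.37 .
\]
```

When $t$ is not a power of two, the nearest row samples with a probability within a factor of 2 of $1/t$, which changes the constant but not the conclusion.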
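Simulating $O(\log x)$ rounds of BC
In iteration $i$, use the $i$-th sketch: sample and merge.
When components $C_1$ and $C_2$ merge into $C_3$: $\mathrm{Sketch}_{i+1}(C_3) = \mathrm{Sketch}_{i+1}(C_1) \oplus \mathrm{Sketch}_{i+1}(C_2)$.
[Figure: $\mathrm{Sketch}_i(C_1)$ and $\mathrm{Sketch}_i(C_2)$ each yield an edge ID, and $C_1$, $C_2$ merge into $C_3$.]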
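The sample-and-merge loop can be sketched end to end as follows. Several details are assumptions made for this illustration and are not taken from the talk: edge IDs encode the endpoint pair, shared randomness comes from a SHA-256 hash, and a row is accepted as a single edge when a 64-bit fingerprint matches (one standard way to obtain soundness).

```python
import hashlib

def h(*parts):
    """64-bit hash known to all nodes (stand-in for shared randomness)."""
    data = ":".join(map(str, parts)).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def vertex_sketches(v, n, edges, iters, rows):
    """sk[i][r] = [XOR of sampled edge IDs, XOR of their fingerprints]."""
    sk = [[[0, 0] for _ in range(rows)] for _ in range(iters)]
    for a, b in edges:
        if v not in (a, b):
            continue
        eid = min(a, b) * n + max(a, b) + 1  # nonzero ID encoding the pair
        for i in range(iters):
            for r in range(rows):
                if h("sample", i, r, eid) < 2 ** (64 - r):  # prob. 2^-r
                    sk[i][r][0] ^= eid
                    sk[i][r][1] ^= h("fp", eid)
    return sk

def xor_into(dst, src):
    """Linearity: add (XOR) the sketches of src into dst."""
    for i in range(len(dst)):
        for r in range(len(dst[i])):
            dst[i][r][0] ^= src[i][r][0]
            dst[i][r][1] ^= src[i][r][1]

def leader_simulation(n, edges, iters, rows):
    comp = list(range(n))

    def find(v):
        while comp[v] != v:
            comp[v] = comp[comp[v]]
            v = comp[v]
        return v

    sketch = {v: vertex_sketches(v, n, edges, iters, rows) for v in range(n)}
    merged = []
    for i in range(iters):
        # Sample: each component decodes one edge from its i-th sketch.
        picks = []
        for sk in sketch.values():
            for eid, fp in sk[i]:
                if eid and h("fp", eid) == fp:  # row holds a single edge
                    picks.append(eid)
                    break
        # Merge: contract the decoded edges and XOR the component sketches.
        for eid in picks:
            a, b = divmod(eid - 1, n)
            ca, cb = find(a), find(b)
            if ca != cb:
                comp[ca] = cb
                xor_into(sketch[cb], sketch.pop(ca))
                merged.append((a, b))
    return merged, len(sketch)
```

Each iteration reads only its own sketch, so the randomness used to decode an edge is independent of the merges performed in earlier iterations; the later sketches of a merged component are obtained purely by XOR.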
Delivering Sketches to Leader in O(1) rounds
Number of components: $O(n/\log^2 x)$.
Sketch size: $O(\log^2 x \cdot \log n)$ bits.
[Figure: XOR-aggregated sketches sent to the leader, labeled $O(\log^2 x)$.]
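The two bounds multiply out to a volume that matches the leader's bandwidth. The following is a back-of-the-envelope check only; the routing scheme that actually achieves $O(1)$ rounds is not spelled out on the slide.

```latex
\[
  \underbrace{O\!\left(\frac{n}{\log^2 x}\right)}_{\text{components}}
  \cdot
  \underbrace{O\!\left(\log^2 x \cdot \log n\right)}_{\text{bits per sketch}}
  = O(n \log n) \text{ bits},
  \qquad
  \underbrace{(n-1)}_{\text{links into the leader}} \cdot \, O(\log n)
  = O(n \log n) \text{ bits per round}.
\]
```

The total sketch volume is therefore within a constant factor of what the leader can receive in a single round.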
Putting Things Together: The Graph Connectivity Algorithm
Input: forest $F$ with $O(n/\log^2 x)$ components.
Output: forest $F'$ with $n/x$ components.
Types of components in $F$:
1. Large: more than $8x$ vertices (easy).
2. Non-growable: small components with no outgoing edge.
3. Growable components:
Low-degree (sparse): fewer than $x^5$ outgoing edges.
High-degree (dense): more than $x^5$ outgoing edges.
The Graph Connectivity Algorithm
Input: forest $F$ with $O(n/\log^2 x)$ growable components.
Output: forest $F'$ with $n/x$ growable components.
Step S1: Handling low-degree components ($\le x^5$ outgoing edges)
Every vertex $v$ computes $L = O(\log x)$ sketches: $\mathrm{Sketch}_1(v), \ldots, \mathrm{Sketch}_L(v)$.
Compute the sketch of each component $C$: $\mathrm{Sketch}_i(C) = \bigoplus_{u \in C} \mathrm{Sketch}_i(u)$, $1 \le i \le L$.
Route $\mathrm{Sketch}_i(C)$ of each growable component to the leader.
The leader locally simulates $O(\log x)$ BC merging steps.
Step S2: Handling high-degree components ($\ge x^5$ outgoing edges)
Every vertex $v$ sends a random edge to the leader.
The leader merges components into the forest $F'$.
Step S3: Cleanup
The leader identifies and deactivates components in $F'$ that are smaller than $x$ and non-growable.
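Analysis of (S1): The Sparse Case
Lemma [Low-Deg]: After Step (S1), w.h.p., there are $\le n/(4x)$ growable low-degree components.
The leader simulates $O(\log x)$ phases of the BC algorithm.
In the $i$-th phase there are $y_i$ low-degree components.
$\mathbb{E}[\#\text{edges decoded successfully using the } i\text{-th sketch}] = y_i/10$.
W.h.p., we get $y_i/20$ outgoing edges.
W.h.p., $y_{i+1} \le (39/40)\, y_i$, since $k$ selected outgoing edges reduce the number of components by at least $k/2$.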
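Iterating the last inequality shows why $O(\log x)$ phases are enough. The calculation below assumes the per-phase bound holds in every phase (the union bound over phases is omitted) and writes the starting count as $y_0 \le c\, n/\log^2 x$ for a hypothetical constant $c$.

```latex
\[
  y_k \le \left(\frac{39}{40}\right)^{k} y_0 \le \frac{n}{4x}
  \quad\text{as soon as}\quad
  k \ge \log_{40/39}\!\left(\frac{4x\, y_0}{n}\right),
  \qquad
  \log_{40/39}\!\left(\frac{4x\, y_0}{n}\right)
  \le \log_{40/39}\!\left(\frac{4c\, x}{\log^2 x}\right) = O(\log x).
\]
```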
Analysis of (S2): The Dense Case
Lemma [High-Deg]: After Step (S2), there are $\le n/(4x)$ growable high-degree components.
Assume the current size of the component is less than $8x$.
The probability that a random edge is internal is at most $(8x)^2 / x^4 = 64/x^2 = O(1/x^2)$.
Hence, the probability that the final component has size less than $8x$ is at most $1/(cx)$ for a constant $c$.
Overall, in expectation, $O(n/x)$ components have size $\le 8x$.
[Figure: a component with $\le 8x$ vertices and $x^5$ outgoing edges.]
Analysis of (S2): The Dense Case
Lemma [High-Deg]: After Step (S2), there are $\le n/(4x)$ growable high-degree components.
$\kappa$ := bound on the number of small components that do not contain low-degree components.
[Figure: illustration over iterations $i = 1, 2, 3$, labeled with $\kappa = 1$ and $\kappa = 2$.]
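$O(\log^* n)$ MST Algorithm
Karger-Klein-Tarjan (KKT) sampling: reduce the problem to solving two MST problems, each on a graph with $O(n^{3/2})$ edges.
1. Sample edges $H \subseteq G$, each with probability $p = 1/\sqrt{n}$.
2. Compute an MSF $F$ of the subgraph $H$.
Def: an edge $e = (x, y)$ in $G$ is $F$-light if it is not heavier than every edge on the $x$-$y$ path in $F$ (edges whose endpoints are disconnected in $F$ are also $F$-light).
3. W.h.p., there are $O(n/p) = O(n^{3/2})$ $F$-light edges $L$.
4. Compute the MST of $H \cup L$.
[Figure: example with vertices $x$, $y$ and edge weights 3, 4, 5.]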
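The four steps can be sketched sequentially as follows. This is an illustrative sketch, not the distributed algorithm: it assumes vertices 0..n-1 and distinct weights, uses plain Kruskal for both MST computations, and finds the heaviest edge on a forest path by a simple search. All function names are hypothetical.

```python
import random

def kruskal(n, edges):
    """Minimum spanning forest of (weight, u, v) edges."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((w, u, v))
    return forest

def heaviest_on_path(adj, src, dst):
    """Heaviest weight on the src-dst path of a forest, None if disconnected."""
    stack, seen = [(src, 0)], {src}
    while stack:
        node, heaviest = stack.pop()
        if node == dst:
            return heaviest
        for nxt, w in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, max(heaviest, w)))
    return None

def kkt_mst(n, edges, p, rng):
    H = [e for e in edges if rng.random() < p]   # step 1: sample
    F = kruskal(n, H)                            # step 2: MSF of H
    adj = [[] for _ in range(n)]
    for w, u, v in F:
        adj[u].append((v, w))
        adj[v].append((u, w))
    L = []                                       # step 3: F-light edges
    for w, u, v in edges:
        heaviest = heaviest_on_path(adj, u, v)
        if heaviest is None or w <= heaviest:
            L.append((w, u, v))
    return kruskal(n, set(H) | set(L)), len(L)   # step 4: MST of H ∪ L
```

An edge that is not $F$-light is the heaviest edge on a cycle of $G$, so discarding it cannot change the MST; this is why the result of step 4 equals the MST of the full graph.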
$O(\log^* n)$ MST Algorithm
Karger-Klein-Tarjan (KKT) sampling: reduce the problem to solving two MST problems, each on a graph with $O(n^{3/2})$ edges.
Sorting edges into $\sqrt{n}$ buckets: by Lenzen’12, within $O(1)$ rounds:
Every vertex knows the bucket of each of its edges.
For every bucket $i$, there is a node $u_i$ that knows $E_i$.
Sorted order: $e_1, e_2, \ldots, e_n, \ldots, e_{in}, \ldots, e_{(i+1)n}, \ldots, e_{n^{3/2}}$, split into buckets $E_1, \ldots, E_i, \ldots, E_{\sqrt{n}}$.
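From MST to $\sqrt{n}$ Connectivity Problems
Key observation: an edge $e_{in} = (x, y)$ is in the MST if $x$ and $y$ are not connected in $E_1 \cup E_2 \cup \ldots \cup E_{i-1}$.
Sorted order: $e_1, e_2, \ldots, e_n, \ldots, e_{in}, \ldots, e_{(i+1)n}, \ldots, e_{n^{3/2}}$, with bucket $E_i$ starting at $e_{in}$.
Leader $u_i$ should know the connected components of $H_i = E_1 \cup E_2 \cup \ldots \cup E_{i-1}$.
Nodes compute $\sqrt{n}$ connectivity problems for $H_1, \ldots, H_{\sqrt{n}}$ in parallel.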
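The reduction can be illustrated sequentially. In this sketch (names and the `bucket_size` parameter are illustrative), every bucket is processed independently: the components of the lighter buckets are recomputed from scratch to mimic a separate connectivity instance $H_i$, and then the bucket's own edges are scanned in weight order, as the node $u_i$ could do locally.

```python
class DSU:
    """Union-find over vertices 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False
        self.parent[ru] = rv
        return True

def mst_via_buckets(n, edges, bucket_size):
    """edges: (weight, u, v) with distinct weights. Returns the MST edges."""
    ordered = sorted(edges)
    buckets = [ordered[k:k + bucket_size]
               for k in range(0, len(ordered), bucket_size)]
    mst = []
    for i, bucket in enumerate(buckets):  # each bucket is independent
        # Connectivity instance H_i: components of all lighter buckets.
        dsu = DSU(n)
        for prev in buckets[:i]:
            for _, u, v in prev:
                dsu.union(u, v)
        # Local work of u_i: scan its own bucket in weight order.
        for w, u, v in bucket:
            if dsu.union(u, v):
                mst.append((w, u, v))
    return mst
```

Because the loop body for bucket $i$ depends only on the components of $H_i$ and on $E_i$, all buckets could be handled in parallel once the connectivity instances are solved.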
Key Result: Summary
A minimum spanning tree (forest) can be computed within $O(\log^* n)$ congested clique rounds.
Corollaries: the following can be done within $O(\log^* n)$ rounds: testing bipartiteness, cut verification, s-t connectivity, and cycle containment.