Download presentation
Presentation is loading. Please wait.
1
CIS 700: βalgorithms for Big Dataβ
Lecture 6: Graph Sketching Slides at Grigory Yaroslavtsev
2
Sketching Graphs? We know how to sketch vectors: π£βππ£
How about sketching graphs? πΊ π,πΈ β‘ π΄ πΊ (adjacency matrix): π΄ πΊ βπ π΄ πΊ Sketch columns of π΄ πΊ π= π , π=|πΈ| π(ππππ¦( log π )) sketch per vertex / π (π) total Check connectivity Check bipartiteness As always, space rather than dimension. Why?
3
Graph Streams Semi-streaming model: [Muthukrishnan β05; Feigenbaum, Kannan, McGregor, Suri, Zhangβ05] Graph defined by the stream of edges π 1 ,β¦, π π Space π π , edges processed in order Connectivity is easy on π (π) space for insertion-only Dynamic graphs: Stream of insertion/deletion updates + π π 1 , β π π 2 , β¦, β π π π‘ (assume sequence is correct) Resulting graph has edge π π if it wasnβt deleted after the last insertion Linear sketching dynamic graphs: π π΄ πΊβπ =π π΄ πΊ β MA e
4
Distributed Computing
Linear sketches for distributed processing π servers with o(m) memory: Send π/π edges ( πΈ 1 ,β¦, πΈ π ) to each server Compute sketches π πΈ 1 ,β¦, π πΈ π locally Send sketches to a central server Compute π π΄ πΊ = π π π πΈ π π has to have a small representation (same issue as in streaming)
5
Connectivity Thm. Connectivity is sketchable in π (π) space Framework:
Take existing connectivity algorithm (Boruvka) Sketch π΄ πΊ βπ π΄ πΊ Run Boruvka on π π΄ πΊ Important that the sketch is homomorphic w.r.t the algorithm
6
Part 1: Parallel Connectivity (Boruvka)
Repeat until no edges left: For each vertex, select any incident edge Contract selected edges Lemma: process converges in π( log π ) steps
7
Part 2: Graph Representation
For a vertex π let π π be a vector in β π 2 Non-zero entries for edges π,π π π π,π =1 if π>π π π π,π =β1 if j <π Example: π 1 = 1, 1, 1, 1, 0, β¦,0 π 2 = β1, 0, 0, 0, 0, 0, 1, 0, 1, β¦, 0 Lem: For any πβπ supp πβπ π π =πΈ(π,πβπ) 7 3 2 6 1 4 5 1,2 , 1,3 , 1,4 , 1,5 , 1,6 , 1,7 , 2,3 , ,5 , β¦
8
Part 3: πΏ 0 -Sampling There is a distribution over πβ β πΓπ with π=π( log 2 π ) such w.p. 9/10 that βπβ β π : πΆπβπβπ π’ππ(π) [Cormode, Muthukrishnan, Rozenbaumβ05; Jowhari, Saglam, Tardos β11] Constant probability suffices β still π log π Boruvka iterations
9
βπβπ π’ππ πβπ π π =πΈ(π,πβπ)
Final Algorithm Construct log π β 0 -samplers for each π π Run Boruvka on sketches: Use πΆ 1 π π to get an edge incident on a node π For π=2 to π‘: To get incident edge on a component πβπ use: πβπ πΆ π π π = πΆ π πβπ π π β βπβπ π’ππ πβπ π π =πΈ(π,πβπ)
10
K-Connectivity Graph is π-connected is every cut has size β₯π
Thm: There is a π(ππ log 3 π )-size linear sketch for k-connectivity Generalization: There is an π(π log 5 π / π 2 )-size linear sketch which allows to approximate all cuts in a graph up to error (1Β±π)
11
K-connectivity Algorithm
Algorithm for π-connectivity: Let πΉ 1 be a spanning forest of πΊ(π,πΈ) For π=2, β¦, π Let πΉ π be a spanning forest of πΊ(π, πΈβ πΉ 1 ββ¦β πΉ πβ1 ) Lem: πΊ(π, πΉ 1 +β¦+ πΉ π ) is k-connected iff G(V,E) is. β Trivial β Consider a cut in πΊ(π, π=1 π πΉ π ) of size <π ββ π β : this cut didnβt grow in step π β β there is a cut in πΊ(π,πΈ) of size <π β contradiction
12
K-connectivity Algorithm
Construct π independent linear sketches { π 1 π΄ πΊ , π 2 π΄ πΊ β¦, π π π΄ πΊ } for connectivity Run π-connectivity algorithm on sketches: Use π 1 π΄ πΊ to get a spanning forest πΉ 1 of πΊ Use π 2 π΄ πΊ β π 2 π΄ πΉ 1 = π 2 ( π΄ πΊβ πΉ 1 ) to find πΉ 2 Use π 3 π΄ πΊ β π 3 π΄ πΉ 1 β π 3 π΄ πΉ 2 = π 3 ( π΄ πΊβ πΉ 1 β πΉ 2 ) to find πΉ 3 β¦
13
Bipartiteness Reduction: Given πΊ define πΊβ² where vertices π£β( π£ 1 , π£ 2 ); edges π’,π£ β( π’ 1 , π£ 2 ) & ( π’ 2 , π£ 1 ) Lem: # connected components doubles iff the graph is bipartite. Thm: π(π log 3 π )-size linear sketch for k-connectivity (sketch πΊβ² (implicitly).) β β
14
π€ πππ β€πβ 1+π π + π=0β¦πβ1 π π π π β€ 1+π π€ πππ
Minimum Spanning Tree If π π =# connected components in a subgraph induced by edges of weight β€ 1+π π : π€ πππ β€πβ 1+π π + π=0β¦πβ1 π π π π β€ 1+π π€ πππ where π π = ( 1+π π+1 β 1+π π cc(G) = #connected components of πΊ Round weights up to the nearest power of 1+π πΊ π β‘ subgraph with edges of weight β€ 1+π π Edges taken by the Kruskalβs algorithm: n β cc( πΊ 0 ) edges of weight 1 ππ πΊ 0 βππ( πΊ 1 ) edges of weight (1+π) β¦ cc πΊ πβ1 β cc πΊ π edges of weight 1+π π
15
Minimum Spanning Tree Let π= log 1+π π where π= max edge weight
Overall weight: π βππ πΊ π 1+π π (ππ πΊ πβ1 βππ( πΊ π )) =πβ 1+π π + 0 πβ1 ( 1+π π+1 β 1+π π ) ππ( πΊ π ) Thm: 1+π -approx. MST weight can be computed with π π linear sketch for π=ππππ¦(π)
16
MST: Single Linkage Clustering
[Zahnβ71] Clustering via MST (Single-linkage): k clusters: remove πβπ longest edges from MST Maximizes minimum intercluster distance ββ [Kleinberg, Tardos]
17
Cut Sparsification Two problems: General cut sparsification framework:
Approximating Min-Cut in the graph (up to 1Β±π) Preserving all cuts in the graph (up to 1Β±π) General cut sparsification framework: Sample each edge π with probability π π Assign sampled edges weights 1/ π π Expected weight of each cut is preserved, but too many cuts β canβt take union bound
18
Cut Sparsification For an edge π let π π = weight of the minimum cut that contains π π= size of the Min-Cut in G Thm [Fung et al.]: If πΊ is an undirected weighted graph then if π π β₯ min πΆ log 2 π π π π 2 ,1 then the cut sparsification alg. preserves weights of all cuts up to (1Β±π) Thm [Karger]: π π β₯ min πΆ log π π π 2 ,1 preserves Min-Cut up to (1Β±π)
19
Minimum Cut Algorithm: For π = 0,1,β¦, 2 log π :
Let πΊ π be the subgraph of πΊ where each edge is sampled with probability 1/ 2 π Let π» π = F 1 ,β¦, πΉ π where π=π 1 π 2 β
log π and πΉ π are forests constructed by the k-connectivity alg. Return 2 π π( π» π ) where π=minβ‘{π :π π» π <π} Space: π π log 4 π π 2 , works for dynamic graph streams
20
Minimum Cut: Analysis Key property: If πΊ π has β€π edges across a cut then π» π contains all such edges π β = log max 1, π π 2 6 log π πβ€ π β β π π β₯ min 6 log π π π 2 ,1 β min cut in πΊ π is approximating min-cut in πΊ up to (1Β±π) π= π β : By Chernoff bound # edges in πΊ π β that crosses min-cut in πΊ is π 1 π 2 log π β€π w.h.p.
21
Cut Sparsification Algorithm: For π = 0,1,β¦, 2 log π :
Let πΊ π be the subgraph of πΊ where each edge is sampled with probability 1/ 2 π Let π» π = F 1 ,β¦, πΉ π where π=π 1 π 2 β
log 2 π and πΉ π are forests constructed by the k-connectivity alg. For each edge π let π π =min i: π π π» π <π . If πβ π» π π then add e to the sparsifier with weight 2 π π Space: π π log 5 π π 2 , works for dynamic graph streams Analysis similar to the Min-Cut using [Fung et al.]
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.