Near Optimal Streaming Algorithms for Graph Spanners
Surender Baswana, IIT Kanpur
Graph spanner : a sparse subgraph that still approximately preserves all-pairs distances.
t-spanner
G = (V,E) : an undirected graph, |V| = n, |E| = m, t > 1
δ(u,v) : distance between u and v in G.
A subgraph G_S = (V,E_S), where E_S is a subset of E, such that for all u,v ∈ V,
    δ(u,v) ≤ δ_S(u,v) ≤ t δ(u,v)
t : stretch of the spanner.
Sparseness versus stretch
Consider a graph modeling some network. Edges correspond to possible links. Each edge has a certain cost.
Aim : to select as few edges as possible without increasing the pairwise distances too much.
t-spanner
Computing a t-spanner of smallest possible size is NP-complete.
For a graph on n vertices, how large can a t-spanner be ?
A 2-spanner may require Ω(n^2) edges.
[Figure: example graph with two labeled vertices u and v]
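One standard family that witnesses the Ω(n^2) bound is the complete bipartite graph. This may or may not be the graph drawn on the slide; the worked step below assumes n is even and n ≥ 4, and the side names A and B are introduced here only for illustration.

```latex
% G = K_{n/2,\,n/2} with sides A and B, |A| = |B| = n/2; take u \in A, v \in B.
% Once the edge (u,v) is deleted, the shortest u-v path is u - b' - a' - v:
\delta_{G \setminus (u,v)}(u,v) = 3 \;>\; 2 = 2\,\delta_G(u,v)
% Hence no edge can be dropped, and every 2-spanner satisfies
|E_S| = |E| = \frac{n^2}{4} = \Omega(n^2)
```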
t-spanner
[Erdös 1963, Bollobas, Bondy & Simonovits] “There are graphs on n vertices for which every 2k-spanner or (2k-1)-spanner has Ω(n^{1+1/k}) edges.”
G = (V,E) → ALGORITHM → G_S = (V,E_S), where |E_S| = O(n^{1+1/k}) and G_S is a (2k-1)-spanner
Algorithms for t-spanner (RAM model)
Algorithm         Stretch   Size             Running time      Type
Das et al.        2k-1      O(n^{1+1/k})     O(mn^{1+1/k})     Deterministic
Roditty et al.    2k-1      O(n^{1+1/k})     O(n^{2+1/k})      Deterministic
B. & Sen          2k-1      O(kn^{1+1/k})    O(km)             Randomized
Roditty et al.    2k-1      O(kn^{1+1/k})    O(km)             Deterministic
Avoids distance computation altogether.
Near optimal algorithms in parallel, external-memory, and distributed environments.
Computing a t-spanner in streaming environment
Input : n, m, k, and a stream of edges of an unweighted graph
Aim : to compute a (2k-1)-spanner
Efficiency measures :
1. number of passes
2. space (memory) required
3. time to process the entire stream
Algo 1 : Streaming model (unweighted graph, (2k-1)-spanner)
1. number of passes : 1
2. space (memory) required : O(kn^{1+1/k})
3. time to process the entire stream : O(m)
[Feigenbaum et al., SODA 2005] (unweighted graph)
1. number of passes : 1
2. space (memory) required : O(kn^{1+1/k}) for a (2k+1)-spanner
3. time to process the entire stream : O(mn^{1/k})
Computing a t-spanner in streaming environment
Input : n, m, k, and a stream of edges of a weighted graph
Aim : to compute a (2k-1)-spanner
Algo 2 : StreamSort model
1. number of passes : O(k)
2. working memory required : O(log n) bits
3. time spent in one stream pass : O(m)
Relation to previous results
[Diagram: B. & Sen, 2003 and Feigenbaum et al., 2005 related to Algo 1 and Algo 2; arrow labels: "slightly different hierarchy", "simple buffering technique"]
Algorithm 1
Intuition
[Figure: a vertex u with a spanner edge marked]
Cluster
C(x) : center of cluster containing x
Radius : maximum distance from center to a vertex in the cluster
Clustering : a set of disjoint clusters
[Figure: a cluster with labeled vertices u, v, o]
Preprocessing : Clustering for the initial (empty) graph
Sampling probability = n^{-1/k}
Level :                0    1            2            ...   k-1        k
Number of clusters :   n    n^{1-1/k}    n^{1-2/k}    ...   n^{1/k}    0
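The cluster counts per level follow from repeated sampling. A short derivation, assuming (as the slide suggests) that level 0 consists of n singleton clusters and that each cluster at one level is kept at the next level independently with probability n^{-1/k}; the counts are expectations:

```latex
\Pr[\text{a level-0 cluster survives to level } i]
    = \left(n^{-1/k}\right)^{i} = n^{-i/k}
\mathbb{E}[\#\text{clusters at level } i]
    = n \cdot n^{-i/k} = n^{1 - i/k}, \qquad 0 \le i \le k-1
% i = k-1 gives n^{1/k}; level k is defined to contain no clusters.
```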
Processing the stream of edges
Each vertex u at level i < k-1 wishes to move to higher levels.
Condition for upward movement : “an edge (u,v) such that C_i(v) is a sampled cluster”
[Figure: levels 0, 1, 2, ..., k-1, k with vertices u, v, x, y changing levels as edges arrive]
From perspective of a vertex u …
[Figure: vertex u at level i, neighbours x and y, and level i+1]
Processing an edge (u,v), with u at level i
If C_i(v) is a sampled cluster :
    C_{i+1}(u) ← C_{i+1}(v);
    add (u,v) to spanner;
    u moves to level i+1 (or even higher)
Else if C_i(v) was not adjacent to u earlier :
    add edge (u,v) to spanner;
Else
    Discard (u,v)
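The edge-processing rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: vertices are integers 0..n-1, each edge is handled from its lower-level endpoint, "or even higher" is read as "up to the highest level at which the joined centre is sampled", and the names `sample_top_levels`, `stream_spanner`, and `max_edge_stretch` are hypothetical. The adjacency test uses a hash set, so it does not reflect the per-edge cost discussed on the slides.

```python
import random
from collections import deque


def sample_top_levels(n, k, rng):
    """top[c] = highest level in 0..k-1 at which the cluster centred at c is sampled."""
    p = n ** (-1.0 / k)
    top = []
    for _ in range(n):
        level = 0
        while level < k - 1 and rng.random() < p:
            level += 1
        top.append(level)
    return top


def stream_spanner(n, k, edges, seed=0):
    """One pass over `edges` (pairs of ints in 0..n-1); returns the spanner edge set."""
    top = sample_top_levels(n, k, random.Random(seed))
    # center[x][j] is the centre of the level-j cluster containing x, j = 0..level(x).
    center = [[x] * (top[x] + 1) for x in range(n)]
    # adjacent[x]: centres of clusters already joined to x by a kept edge
    # at x's present level.
    adjacent = [set() for _ in range(n)]
    spanner = set()
    for u, v in edges:
        if u == v:
            continue
        if len(center[u]) > len(center[v]):
            u, v = v, u  # assumption: process from the lower-level endpoint
        i = len(center[u]) - 1  # present level of u
        c = center[v][i]        # centre of C_i(v)
        if top[c] > i:
            # C_i(v) is sampled: u joins that cluster at levels i+1..top[c].
            center[u].extend([c] * (top[c] - i))
            adjacent[u].clear()
            spanner.add((min(u, v), max(u, v)))
        elif c not in adjacent[u]:
            adjacent[u].add(c)
            spanner.add((min(u, v), max(u, v)))
        # otherwise the edge is discarded
    return spanner


def max_edge_stretch(n, edges, spanner):
    """Largest spanner distance between the endpoints of any input edge (BFS)."""
    nbrs = [[] for _ in range(n)]
    for a, b in spanner:
        nbrs[a].append(b)
        nbrs[b].append(a)
    worst = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in nbrs[x]:
                if dist[y] < 0:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        for a, b in edges:
            if a == s and a != b:
                worst = max(worst, dist[b] if dist[b] >= 0 else float("inf"))
    return worst
```

For an unweighted graph it is enough to check the stretch on input edges, since any shortest path can be replaced edge by edge; `max_edge_stretch(n, edges, stream_spanner(n, k, edges))` should therefore never exceed 2k-1 in this sketch.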
Size and stretch of spanner
Expected number of spanner edges contributed by a vertex = O(k n^{1/k}).
Radius of a cluster at level i is at most i.
For each edge discarded by a vertex u at level i, there is a path in the spanner of length at most (2i+1).
Result : a single pass streaming algorithm computing a (2k-1)-spanner of expected size O(kn^{1+1/k})
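The two bounds can be spelled out as below. This is a sketch of the usual counting argument, assuming each cluster is sampled independently with probability p = n^{-1/k} and that the sampling does not depend on the order of the stream; the symbol w (the endpoint of the edge kept earlier between u and the same cluster) is introduced here for illustration.

```latex
% Stretch: edge (u,v) is discarded at level i only if an earlier kept edge (u,w)
% leads into the same level-i cluster, whose radius is at most i:
\delta_S(u,v) \le \underbrace{1}_{(u,w)} + \underbrace{i}_{w \to \text{center}}
              + \underbrace{i}_{\text{center} \to v} = 2i+1 \le 2(k-1)+1 = 2k-1
% Size: at a level i < k-1, u keeps one edge per distinct neighbouring cluster
% until the first sampled one appears; at level k-1 there are n^{1/k} clusters
% in expectation:
\mathbb{E}[\#\text{edges kept by } u \text{ at level } i] \le \frac{1}{p} = n^{1/k}
\mathbb{E}[|E_S|] \le \sum_{u \in V} \sum_{i=0}^{k-1} n^{1/k} = k\,n^{1+1/k}
```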
Running time of the algorithm
While processing an edge (u,v) with u at level i, the test “Else if C_i(v) was not adjacent to u earlier” takes Θ(n^{1/k}) time.
Slight modification
Each vertex u keeps two buffers for storing edges incident from clusters at its present level.
1. Temp(u)
2. E_S(u)
Whenever u moves to a higher level, move all the edges of Temp(u) and E_S(u) to the spanner.
Modified algorithm, with u at level i
If C_i(v) is a sampled cluster :
    C_{i+1}(u) ← C_{i+1}(v);
    add (u,v) to spanner;
    u moves to level i+1 (or even higher)
Else
    add (u,v) to Temp(u), and execute Prune(u) if |Temp(u)| ≥ |E_S(u)|
[Figure: adding edges to Temp(u), followed by Prune(u)]
Time complexity analysis
Prune(u) can be executed in O(|Temp(u)| + |E_S(u)|) time using an auxiliary O(n) space.
When is Prune(u) executed ? Only when |Temp(u)| ≥ |E_S(u)|.
So we can charge O(1) cost to each edge in Temp(u).
An edge is processed in Temp(u) at most once.
Total time spent in processing the stream = O(m)
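The slides do not spell out Prune(u). One way it could meet the stated O(|Temp(u)| + |E_S(u)|) bound is sketched below, assuming Prune keeps one edge per distinct neighbouring cluster and uses a shared boolean array of length n as the auxiliary space. The names `prune`, `buffer_edge`, `cluster_of`, and `mark` are hypothetical; `cluster_of[v]` stands for the centre of v's cluster at u's present level.

```python
def prune(temp, kept, cluster_of, mark):
    """Merge Temp(u) into E_S(u), keeping one edge per neighbouring cluster.

    temp, kept : lists of edges (u, v) buffered at vertex u
    cluster_of : cluster_of[v] is the centre of v's cluster at u's level
    mark       : shared list of n booleans, all False on entry and on exit
    """
    for _, v in kept:
        mark[cluster_of[v]] = True
    for edge in temp:
        c = cluster_of[edge[1]]
        if not mark[c]:          # first edge seen into this cluster
            mark[c] = True
            kept.append(edge)
    for _, v in kept:            # reset only the entries that were touched
        mark[cluster_of[v]] = False
    temp.clear()


def buffer_edge(edge, temp, kept, cluster_of, mark):
    """Append an edge to Temp(u) and prune once |Temp(u)| >= |E_S(u)|."""
    temp.append(edge)
    if len(temp) >= len(kept):
        prune(temp, kept, cluster_of, mark)
```

Each call to `prune` touches every buffered edge a constant number of times, and it only runs when Temp(u) is at least as large as E_S(u), which is what lets the cost be charged to the edges in Temp(u).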
Size of (2k-1)-spanner
Expected size of E_S(u) = O(n^{1/k})
|Temp(u)| never exceeds |E_S(u)| + 1.
Expected size of the (2k-1)-spanner is O(k n^{1+1/k})
Conclusion
THEOREM 1 : Given any k ∈ N, a (2k-1)-spanner of expected size O(kn^{1+1/k}) for any unweighted graph can be computed in one Stream pass with O(m) time to process the entire stream of edges.
THEOREM 2 : Given any k ∈ N, a (2k-1)-spanner of expected size O(kn^{1+1/k}) for any weighted graph can be computed in O(k) StreamSort passes with O(log n) bits of working memory.