A survey on approximating graph spanners Guy Kortsarz, Rutgers Camden

Spanners: definition for unweighted undirected graphs
Peleg and Upfal. Input: an undirected graph G(V,E) and an integer k. Required: a subgraph G’ so that for every u and v ∈ V: Dist_{G’}(u,v)/Dist_G(u,v) ≤ k.

It is enough to take care of edges
Assume that all lengths and weights are 1 (a.k.a. basic spanners). Say that you have a k-spanner G(V,E’) of a graph G(V,E). Let E-E’ be the omitted edges. What happens when an omitted edge is added to the k-spanner G(V,E’)? Then a cycle of size at most k+1 must be closed.

Why dealing with edges is enough
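[Figure: a path of three edges; in the spanner each edge is replaced by a path of length at most k.] Distance 3 becomes at most 3k.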

An alternative definition
Find a subgraph G’(V,E’) so that for every edge e in E-E’, adding e closes a cycle of size at most k+1. In more general variants the above is not true: we can have weights and lengths, and the weights can equal the lengths or be unrelated to them. Basically nothing is known for general lengths.

A look at an omitted edge e
For example, in a 4-spanner the omitted edge closes a cycle of size at most 4+1=5. It is enough that all edges are “spanned” to get that the distance between any two vertices is at most k times the original one.
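The edge-based definition can be checked directly. The following is a minimal sketch for the unit-length case, assuming the graph is given as a list of vertex pairs; the function names hops_within and is_k_spanner are hypothetical, chosen only for this illustration.

```python
from collections import deque

def hops_within(adj, s, t, limit):
    """Return True if t can be reached from s using at most `limit` edges (BFS)."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        if dist[u] == limit:
            continue  # do not explore beyond the hop budget
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return False

def is_k_spanner(edges, kept, k):
    """Check that every edge of the graph is spanned by a path of at most k kept edges."""
    adj = {}
    for u, v in kept:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return all(hops_within(adj, u, v, k) for u, v in edges)

# A cycle on 5 vertices with one edge omitted: the omitted edge closes a cycle of size 5.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
kept = cycle[:-1]
print(is_k_spanner(cycle, kept, 4), is_k_spanner(cycle, kept, 3))
```

Only the edges are examined, never all pairs of vertices, which is exactly the point of the alternative definition.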

An example of a 2-spanner
The original graph:

A 2-spanner: a few of the edges that were part of triangles were removed.

Applications
In geometry. Small routing tables: spanners have fewer edges, thus smaller tables, but not much larger distances. Synchronizers: make unsynchronized distributed computation synchronized. Parallel, distributed and streaming algorithms. Distance oracles: handle queries about the distance between two vertices quickly, by preprocessing. Property testing. Minimum time broadcast.

Directed spanners, applications:
Transitive closure spanners: see Grigorescu, Jung, Raskhodnikova, Woodruff, SODA’09. Property testing: see a survey by Raskhodnikova. Access control hierarchies: see a paper by De Santis et al.

TC spanners: transitive closure spanners
Given G, find H that has the same transitive closure as G and has diameter at most k. See a paper by Bhattacharyya et al. Studied in access control a very long time ago (but were not given a name). In the last 20 years many known properties of transitive closure were rediscovered.

TC spanners: remarkably many applications
Circuit construction: see Chandra et al., STOC 1983. Data structures: see Alon et al., 1998. Property testing: Raskhodnikova, FOCS 2012. Lower bounds for stretch (a.k.a. Lipschitzness): see JR. And much more.

2-spanners: there is a difficulty. Unlike k≥3, there are not necessarily 2-spanners with few edges. The only 2-spanner of a complete bipartite graph is the graph itself. This is also true for TC spanners if all the directed edges go from left to right. As in 2-SAT, 2-Coloring and other problems, the case of 2-spanners is different from the rest.

For k at least 3 there are spanners with few edges
As we shall see, 3-spanners with O(n*sqrt{n}) edges always exist, and the same goes for 4-spanners. And this is tight. The larger k is, the smaller the upper bound on the number of edges in the best spanner. Remarkable fact: the maximum number of edges in a graph with girth g is not known. For maybe 40 years the upper and lower bounds have been quite far apart!

General lengths: the definition is the same, except that the length is the weighted length. Also weights=lengths. Definition: the weighted distance of every pair of vertices in G’ must be no larger than k times their weighted distance in G. It is not enough that an omitted edge closes a cycle of length at most k+1. However, it works if the omitted edge is the largest in the closed cycle.

An edge that can be removed
For example, a 4-spanner. [Figure: a cycle of five edges with weights 9, 4, 7, 8, 6.]

Removing the heaviest edge in a cycle of length at most k+1 is OK.
For example, a 4-spanner. [Figure: the cycle after its heaviest edge is removed; the remaining weights are 4, 7, 8, 6.]

A generalization of the Kruskal algorithm:
Sort the edges of the graph by increasing weight: c(e1) ≤ c(e2) ≤ c(e3) ≤ … ≤ c(em). Go over all edges from small cost to large. For the next edge ei, if ei does not close a cycle of length at most k+1 with previously added edges, add ei to G’; otherwise skip it. In both cases continue with i=i+1. This algorithm is due to I. Althofer, G. Das, D. Dobkin, D. Joseph, and J. Soares.
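The greedy rule above can be sketched in a few lines. This is an illustrative sketch only, assuming edges are given as (weight, u, v) tuples and that "cycle of length at most k+1" is counted in number of edges, as on the slide; the names greedy_spanner and path_within_hops are hypothetical.

```python
from collections import deque

def path_within_hops(adj, s, t, limit):
    """Return True if the kept graph has an s-t path with at most `limit` edges."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        if dist[u] == limit:
            continue
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return False

def greedy_spanner(edges, k):
    """Kruskal-like greedy: scan edges by non-decreasing weight and keep an edge
    unless it would close a cycle of at most k+1 edges with the kept edges."""
    adj, kept = {}, []
    for w, u, v in sorted(edges):
        # e=(u,v) closes a cycle of at most k+1 edges iff a u-v path of at most k edges exists
        if not path_within_hops(adj, u, v, k):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
            kept.append((w, u, v))
    return kept

triangle = [(3, 0, 2), (1, 0, 1), (2, 1, 2)]
print(greedy_spanner(triangle, 2))
```

On the triangle, the heaviest edge is rejected for k=2 because the two lighter edges already form a path of two edges between its endpoints. With k=1 nothing can be rejected and all three edges are kept.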

The resulting graph is a k-spanner
If an edge e is missing then, by construction, e is the heaviest edge in a cycle of length at most k+1. This is because we go over the edges in non-decreasing cost: when e was rejected, it closed a cycle of length at most k+1 with edges that were added earlier, so each of them costs at most c(e). Edges that were added are never removed later, so this cycle remains. This implies that e is the largest edge in a cycle of length at most k+1, and it is safe to remove it.

Girth k+2: we observe that the resulting graph has girth at least k+2
The girth is the size of a minimum simple cycle. Observe that when we reach the largest edge e of a cycle with at most k+1 edges, it is not added. Therefore, there are no cycles of size k+1 or smaller. Graphs with large girth have “few” edges.

Example: graphs with girth 5 and 6
We show that graphs with girth 5 and 6 have O(n*sqrt{n}) edges. First, repeatedly remove every vertex of degree strictly smaller than m/n. Here m is the number of edges and n is the number of vertices. Since we remove at most n vertices and each removed vertex removes less than m/n edges, fewer than m edges are removed in total, so the resulting graph is not empty and all its degrees are at least m/n.

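Two layers BFS graph: all the vertices seen below are distinct, as otherwise there is a cycle of length at most 4. [Figure: a BFS tree with two layers; the root has at least m/n children and every child has at least (m/n)-1 further children.]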

Number of edges
This implies that (m/n)((m/n)-1) ≤ n, or m^2/n^2 - m/n ≤ n. As m/n < n we get that m^2/n^2 < 2n, or m^2 < 2n^3. Thus m = O(n*sqrt{n}).
A matching lower bound: a graph of girth 6 that has Ω(n*sqrt{n}) edges. A projective plane, for our needs, is a bipartite graph with n vertices on each side and degree Θ(sqrt{n}), and thus contains Θ(n*sqrt{n}) edges. The main property: every pair of vertices on the same side share exactly 1 neighbor.

Girth 6
There cannot be a cycle of size 4: a cycle of length 4 implies that two vertices on the same side share two neighbors. Contradiction. The graph is bipartite, so it has no odd cycles either. Therefore girth 6.
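For a concrete instance of the lower bound, the sketch below builds the point-line incidence graph of the projective plane over GF(q) and checks the "exactly one common neighbor" property. It assumes q is prime, so that arithmetic modulo q is a field; the names projective_plane and has_four_cycle are hypothetical.

```python
from itertools import combinations

def projective_plane(q):
    """Incidence structure of the projective plane over GF(q), q prime (assumed).
    Entry i of the result is the set of indices of lines through point i."""
    # One normalized representative per projective point: first nonzero coordinate is 1.
    reps = [(1, a, b) for a in range(q) for b in range(q)]
    reps += [(0, 1, a) for a in range(q)]
    reps.append((0, 0, 1))

    def incident(p, l):
        # Point p lies on line l when their dot product vanishes modulo q.
        return (p[0] * l[0] + p[1] * l[1] + p[2] * l[2]) % q == 0

    # Lines use the same set of representatives as points.
    return [{j for j, l in enumerate(reps) if incident(p, l)} for p in reps]

def has_four_cycle(lines_through):
    """A 4-cycle in the bipartite graph means two points sharing two lines."""
    return any(len(a & b) >= 2 for a, b in combinations(lines_through, 2))

plane = projective_plane(3)
n = len(plane)                                 # vertices on each side
edges = sum(len(lines) for lines in plane)     # edges of the bipartite graph
print(n, edges, has_four_cycle(plane))
```

For q=3 there are q^2+q+1 = 13 vertices on each side and each has degree q+1 = 4, so the degree is roughly sqrt{n}, matching the Θ(n*sqrt{n}) edge count.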

General bounds on the minimum number of edges for a given girth
It is known that there is always a (2k-1)-spanner with O(n^{1+1/k}) edges. By the above we know that there are always 3-spanners and 4-spanners with O(n*sqrt{n}) edges. But an approximation better than sqrt{n} is known (later). If we want a 3-spanner, put k=2; this indeed gives O(n*sqrt{n}) edges. The minimum number of edges for k=3 and k=4 is the same, as the bad example is a bipartite graph.

There are spanners with few edges but what about the cost?
An interesting result by Chandra et al.: the same algorithm described before returns a graph with small weight. It is roughly of cost n^{1/t} times the cost of the minimum spanning tree (on top of the fact that there are just few edges). For t = log n it gives about O(1) times the MST weight, and O(n) edges.

The story of the minimum cost 2-spanner problem.
The edges have length 1. We show an approximation for costs 1, even though it works for arbitrary costs. In 1992, K. and Peleg: the 2-spanner problem admits an O(log n) ratio approximation in polynomial time. In 1998, Kortsarz: unless P=NP there exists a constant c so that a c*(log n)/k approximation is not possible for min cost k-spanners. This gives Ω(log n) for k=2. TIGHT. There is one k for which the approximation and the lower bound are equal. An unnatural k. See a paper by Dinitz, K. and Raz from ICALP 2012.

The idea for the 2-spanner O(logn) ratio approximation algorithm
A covering problem. A universe U of covering items. A non-decreasing function f: 2^U → R+, given indirectly via an oracle. The cost is sub-additive, namely c(A ∪ B) ≤ c(A)+c(B). The cost of a set A is given by an oracle. We look for a minimum cost S so that f(S)=f(U).

The density theorem
S is initialized to ∅. At iteration i we add a set S_i to S. Let OPT be the optimum solution and opt its cost. The density condition: (f(S ∪ S_i)-f(S))/c(S_i) ≥ (f(OPT ∪ S)-f(S))/opt. Intuition: f(OPT ∪ S)-f(S) = f(OPT)-f(S) = f(U)-f(S). Therefore the density of adding S_i is no worse than the density of adding OPT.

The density theorem
Say that S is initialized to ∅. Now say that at every iteration we add a set S_i that meets the density condition: (f(S ∪ S_i)-f(S))/c(S_i) ≥ (f(U)-f(S))/opt. Then the final set S has cost bounded by (ln f(U)+1)*opt.
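The simplest instance of this framework is weighted set cover, where f(S) is the number of elements covered and the best-density set can be found by direct search. The sketch below is only that special case, with additive costs, and not the 2-spanner algorithm itself; the name greedy_density_cover and the toy instance are hypothetical.

```python
def greedy_density_cover(universe, sets, cost):
    """Greedy by density for weighted set cover.
    f(S) = number of covered elements; each round adds the set that maximizes
    (newly covered elements) / cost. Costs are assumed to be positive."""
    covered, chosen = set(), []
    while covered != universe:
        # Gain of each set relative to the current partial solution.
        gains = {name: len((items & universe) - covered) for name, items in sets.items()}
        best = max(gains, key=lambda name: gains[name] / cost[name])
        if gains[best] == 0:
            raise ValueError("the sets do not cover the universe")
        chosen.append(best)
        covered |= sets[best] & universe
    return chosen

universe = {1, 2, 3, 4, 5}
sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5}, "D": {1, 2, 3, 4, 5}}
cost = {"A": 1, "B": 1, "C": 1, "D": 10}
print(greedy_density_cover(universe, sets, cost))
```

In this toy instance the expensive set D covers everything at once but has low density, so the greedy picks A and then C for a total cost of 2.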

One of many ways to get a good density solution for subadditive costs
Decompose OPT = ∪_i OPT_i, where each OPT_i ⊆ OPT. The frequency property: each covering item participates in at most f different OPT_i (here f denotes the frequency bound, not the function f(·)). By sub-additivity and the frequency property, c(∪_i OPT_i) ≤ Σ_i c(OPT_i) ≤ f*opt. We need that f(OPT) = f(∪_i OPT_i) ≤ Σ_i f(OPT_i). Density: Σ_i f(OPT_i) / Σ_i c(OPT_i) ≥ f(OPT)/(f*opt).

f(OPT_i)/c(OPT_i) ≥ f(OPT)/(f*opt)
By averaging, there exists an i so that f(OPT_i)/c(OPT_i) ≥ f(OPT)/(f*opt). The question is whether we can find, or approximate, the type of structure that OPT_i is. The decomposition in the 2-spanner problem: a collection of stars. For each vertex, the vertex and all its optimal edges form some OPT_i.

Applying the decomposition scheme
We get frequency f=2, as every edge participates in 2 stars. For a vertex v that is connected in the optimum to Q ⊆ N(v), let f(v,Q) be the number of edges in the graph induced by Q. We clearly get that f(OPT) = f(∪_v (v,Q_v)) ≤ Σ_v f(v,Q_v). Can we find the best star? Namely, the one that maximizes f(v,Q_v)/|Q_v|?

The set Q_v ⊆ N(v), so the algorithm checks every vertex v
For a vertex v, look at the graph induced by N(v). Find a densest subgraph S(v) in N(v). Over all v, take the densest such set, add the edges from v to S(v) to the solution S, and iterate.

The problem we need to solve is the densest subgraph
Let e(S), for S ⊆ N(v), be the number of edges in the graph induced by S. This problem requires finding a subset of the vertices with maximum density e(S)/|S|, and can be solved exactly via flow. This implies (by the density claim) an O(log d) ratio, for d the average degree. A faster algorithm approximates the best density within 2 but runs in O(n) time and not flow time. Adds 2 to the ratio (so negligible). Was done by K. and Peleg, and also by Charikar, 1998. Very extensively cited for social networks. The result is almost always attributed to Charikar.

A quick approximation for densest subgraph
Let  be e(OPT)/|OPT| We show that all vertices in the optimum have degree at least . Otherwise, removing a vertex with degree less than  increases the optimum. Therefore we can iteratively remove vertices of degree strictly less than . We never remove a vertex of OPT and so the remaining graph is not empty.

The 2 approximation continued
The degree of all vertices in the final graph is at least ρ = e(OPT)/|OPT|. Say that there are i vertices. The density is the sum of the degrees, over 2, divided by i. In other words the density is at least i*ρ/(2i) = ρ/2. Thus ratio 2. With a bit of data structure, O(n) running time.
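The argument above removes vertices of degree below the optimal density, which is not known in advance. A common way to avoid guessing it is the peeling variant: repeatedly delete a minimum-degree vertex and remember the densest intermediate graph. The sketch below implements that variant (not necessarily the exact procedure on the slide) in a simple quadratic-time form without the data structures needed for linear time; the name peel_densest is hypothetical.

```python
def peel_densest(adj):
    """2-approximate densest subgraph by peeling.
    adj maps each vertex to the set of its neighbors (undirected, no self loops).
    Returns (vertex set, density) of the densest graph seen while peeling."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    edges = sum(len(nb) for nb in adj.values()) // 2
    best_set, best_density = set(), -1.0
    while adj:
        density = edges / len(adj)
        if density > best_density:
            best_set, best_density = set(adj), density
        # Delete a vertex of minimum degree together with its edges.
        v = min(adj, key=lambda x: len(adj[x]))
        edges -= len(adj[v])
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
    return best_set, best_density

# A clique on {0,1,2,3} with a pendant path 0-4-5 attached.
graph = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0, 5}, 5: {4}}
print(peel_densest(graph))
```

Here the peeling first strips the pendant path and then reports the clique, whose density is 6/4.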

A reduction from Set-Cover to the minimum size 2-spanner
[Figure: the graph built by the reduction, with parts labeled h, C, SC, A and S.]

The edges between A and S are spanned via the edges of h.

Few edges in the set cover instance, so add them. Will be negligible in size.

If there is an edge in the optimum between a vertex of A and a vertex of E, replace it by a path of length 2 via S.

This means that every vertex in A is joined to a set cover, because the edges of A to E cannot span themselves.

If the size of the set cover is opt, the size of the 2-spanner is about n*opt, and the inapproximability is implied.

How hard is it to approximate spanners for k≥3?
Strong hardness is exp(log^{1-ε} n). Weak hardness is (log n)/k. K. 98: first hardness, weak hardness for w(e)=l(e)=1. Later, similar methods were employed for hardness for Buy at Bulk. Elkin and Peleg: strong hardness for: 1) l(e)=w(e). 2) w=1, arbitrary lengths. 3) Unit lengths, arbitrary weights. 4) Basic but directed spanners.

A question posed in 1992: does the case w=l=1, k=3 admit strong hardness? In 2000 Elkin and Peleg showed conditional hardness: if Min-Rep (see K. 98) with large supergirth is strongly hard, then approximating basic k-spanners for k≥3 is strongly hard. Dinitz, Kortsarz and Raz, ICALP: Min-Rep with large supergirth is indeed strongly hard, thus basic spanners as well. Solved after 20 years.

A technique employed for directed Steiner Forest
Feldman, K. and Nutov. The following picture: [Figure: a layered graph between s and t.] LP flow at least ¼ between every pair s,t. At most n^{2/5} vertices in every layer.

An edge with large x_e: between every two layers there are at most n^{4/5} edges. Let x_e be the largest capacity. Thus via every edge at most x_e flow units pass from s to t. The total flow between s and t is at least ¼. Therefore n^{4/5}*x_e ≥ ¼, so there is an edge of value about 1/(4n^{4/5}). Iterative rounding gives ratio n^{4/5}.

Approximating directed spanners
Krauthgamer and Dinitz employed (part of) our techniques to get an n^{2/3} approximation for unit length directed k-spanners. The ratio does not depend on k. As happens many times, the techniques were invented independently. Improvement: non-iterative but randomized rounding gets about n^{1/2} ratio. Very clever trick! Due to Berman, Bhattacharyya, Makarychev, Raskhodnikova, Yaroslavtsev.

Berman et al.: for all pairs that have at least k vertices close to s and t, we can get O(k) (up to log factors), using the probabilistic method and trees from all the s to their chosen center and from the center to all the t. Each tree has O(opt) edges. Otherwise, the paper by Berman et al. gives a randomized rounding with ratio O(n/k). This leads to a sqrt{n} ratio. It uses the property opt ≥ n-1, which does not hold in other problems.

Other results: for k=3 they get ratio n^{1/3} for the directed case. Note that even for undirected graphs n^{1/2} is trivial but n^{1/3} is not. They also improve the result for Directed Steiner Forest about 3 years after it was established. The new best ratio is n^{2/3}. There are two more possibilities: deal with small k, or improve over ADDJ.

Dinitz and Zhang: ratio n^{1/3} for k=4
The ADDJ upper bound and the integrality gap of the natural LP are not that far apart. Interesting proof: it builds its own type of Min-Rep and uses the fact that Min-Rep is hard for large supergirth several times. I would guess that the ratio of ADDJ will not be easily improved, if at all.

The dense k-subgraph problem
In 1993, K. and Peleg: an approximation of Δ^{1/4}, for Δ the maximum degree, for the minimum maximum degree 2-spanner. A related remark: in 1996, Feige, K. and Peleg: an n^{1/3-1/60} ratio for the dense k-subgraph problem. Improved by Bhaskara, Chlamtac, Charikar, Feige and Vijayaraghavan almost 20 years later to about n^{1/4} ratio.

Improvements: the minimization version of the dense k-subgraph problem
Given a bound B on the edges, find a minimum size collection V’ of vertices so that e(V’)≥B. Its ratio is never worse than the dense k-subgraph ratio. Improvement 1: Chlamtac, Dinitz and Krauthgamer, an improved ratio for minimum maximum degree 2-spanner. Improvement 2: n^{0.172} ratio for the minimization version of dense k-subgraph. In my opinion a great result.

Remarks on this algorithm
The improvement for Dense k-subgraph can be thought of as: instead of counting walks, count caterpillars. Many tried to improve; the above did. The new results of CDK use the Sherali-Adams hierarchy, and also a notion of faithful rounding, which is new and very interesting. In 1993 K. and Peleg said that bounded degree 2-spanner is related to Dense k-Subgraph. Shown to be indeed the case by CDK.

Preservers: the input contains a collection of pairs {x,y}, and you want a subgraph G’ with a minimum number of edges so that the distance between every x,y is the same as in G. A paper by E. Chlamtac, M. Dinitz, G. Kortsarz and B. Laekhanukit: ratio O(n^{3/5}) approximation for preservers. There is a big problem: the inequality opt ≥ n-1 does not hold.

How to overcome this: the latter paper introduced junction trees at the last stage. Junction trees are trees that connect many s,t pairs so that, for every pair, the path from s to t goes via the same vertex r. Invented in relation to Buy at Bulk, namely when the relative cost of items goes down if you buy many.

Why do the junction trees help
First, it is only for w=l=1, at least so far. Instead of bounding the cost by n-1, you bound the cost by the number of terminal pairs connected times the maximum length. It has some small tricks, like applying a different algorithm if the number of pairs is Ω(n^{4/5}); in that case the previous method applies. Otherwise junction trees work.

Approximating pairs spanners
Given a collection of pairs {s,t}, with each pair having a distance bound D(s,t), find a minimum cost solution so that the distance between every pair of vertices s,t is at most D(s,t). You may require that Dist_{G’}(s,t)/Dist_G(s,t) ≤ k only for the pairs {s,t}. Thus pairs spanners.

Steiner forest with distance constraints
When the lengths and weights are 1, this is a version of Directed Steiner Forest with added length constraints. This is a hugely complex algorithm. In preservers we could deal only with shortest paths. But here D(s,t) can be decomposed into dist(s,r)=x and dist(r,t)=y, so that x+y ≤ D(s,t).

The solution: layering and a very complex algorithm, by far the most complex algorithm/analysis in the paper. We still get ratio O(n^{3/5}). The same paper deals with optimizing additive spanners. A +k additive spanner satisfies: Dist_{G’}(u,v) ≤ Dist_G(u,v)+k.
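The additive definition can be checked by brute force on small unit-length graphs. The sketch below compares all-pairs BFS distances; unlike the multiplicative case, it looks at all pairs and not only at the edges, since an additive guarantee on edges alone does not add up along a path. The names bfs_dist, build_adj and is_additive_spanner are hypothetical.

```python
from collections import deque

def bfs_dist(adj, s):
    """Unit-length distances from s to every reachable vertex."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def build_adj(vertices, edges):
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def is_additive_spanner(vertices, edges, kept, k):
    """Check Dist_G'(u,v) <= Dist_G(u,v) + k for every pair connected in G."""
    g, h = build_adj(vertices, edges), build_adj(vertices, kept)
    for s in vertices:
        dg, dh = bfs_dist(g, s), bfs_dist(h, s)
        for t, d in dg.items():
            if dh.get(t, float("inf")) > d + k:
                return False
    return True

vertices = range(5)
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
kept = cycle[:-1]
print(is_additive_spanner(vertices, cycle, kept, 3), is_additive_spanner(vertices, cycle, kept, 2))
```

On the 5-cycle with one edge omitted, the endpoints of the omitted edge go from distance 1 to distance 4, so the kept path is a +3 spanner but not a +2 spanner.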

Very interesting history
Aingworth, Chekuri, Indyk, Motwani '96: for any graph there is a +2 spanner with O(n*sqrt{n}) edges. Chechik: a +4 spanner always exists with O(n^{7/5}) edges. Baswana, Kavitha, Mehlhorn, Pettie show: a +6 spanner always exists with O(n^{4/3}) edges. Can we continue with this hobby for k=8, k=10 and so on?

Surprise (at least for me)
Amir Abboud and Greg Bodwin: the O(n^{4/3}) cannot be improved. There are large µ so that +µ additive spanners require Ω(n^{4/3}) edges. The last result, for k=6, is best possible even for much higher k. How do additive spanners compare to spanners for approximation? Turns out: also harder.

The case of k=1: we gave the first lower bound
If we have edges of cost 0 this is easy. We cannot show that it is hard to span edges, because of the O(log n) for k=2. Dividing edges brings new edges that need to be spanned. Feels like catch 22. Overcome by making the newly added paths use the same edges. Complex proof. Labelcover hard. CDLK, 2016.

For k=O(polylog(n)): again Labelcover hard. Harder proof.
Additive spanners are harder to approximate than spanners. Any +1 spanner is a 2-spanner, but +1 spanner is much harder. Also, O(log n)-spanner has a constant ratio, but additive polylog(n) spanner is Labelcover hard. The +1 spanners result surprised me.

Getting back to Directed Steiner Forest
The first sub-linear ratio is by Feldman, Kortsarz, Nutov and is O(n^{4/5}). Berman et al. improved their paper to O(n^{2/3}) using their clever randomized rounding method. Using our additional junction tree and threshold trick we improve Berman et al. to O(n^{3/5}) for the unweighted case. CDLK 2016. What is your guess for the ratio in the general case?

The message of this last paper
Introducing junction trees can help in approximating spanner problems. This is the first time junction trees were ever used in spanners. A second message is that it seems that additive spanners are harder to approximate than usual spanners. Basically my adviser Peleg invented spanners on general graphs. I am sure he did not imagine they would spread so wide.

Open problems (in approximation as otherwise too many)
The case k=2 for unit length and arbitrary cost, even on directed graphs, is tight. For almost all other problems there is a large gap between the ratio and the inapproximability. New variants: augmenting spanners, client-server spanners, all-server client-server spanners. All these problems were invented by Elkin and Peleg. Also transitive closure spanners, tree spanners and additive spanners. In addition, a slightly new concept is fault tolerant spanners. Finally, what about k additive spanners for large k?
