Geometric Embeddings, Graph Partitioning, and Expander Flows: A survey of recent results
Sanjeev Arora, Princeton
(Touches upon: S. A., Satish Rao, Umesh Vazirani, STOC’04; S. A., Elad Hazan, and Satyen Kale, FOCS’04; S. A., James Lee, and Assaf Naor, unpublished; plus papers that are not mine)
Outline:
Graph partitioning problems: intro and history
New approximation algorithm + analysis (“Structure Theorem”) [A., Rao, Vazirani]
Applications of “S. T.” to other NP-hard problems
Outline of proof of “S. T.”
Uses of “S. T.” in geometric embeddings
Introduction to expander flows
Using expander flows to design $O(n^2)$ algorithms for graph partitioning [A., Hazan, Kale]
Open problems
Sparsest Cut and c-balanced separator (both NP-hard)
G = (V, E)
Sparsest cut: $\alpha(G) = \min_{S \subseteq V,\ |S| < |V|/2} \frac{|E(S, S^c)|}{|S|}$
c-balanced separator: $\alpha_c(G) = \min_{S \subseteq V,\ c|V| < |S| < |V|/2} \frac{|E(S, S^c)|}{|S|}$
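For very small graphs the two quantities above can be checked by exhaustive search. The following is a minimal sketch, not an algorithm from the talk: the function names are hypothetical, vertices are assumed to be labelled 0..n-1, and the size bound is taken as |S| <= n/2 (rather than strict) so that even n allows an exact half. It runs in exponential time and is meant only to make the definitions concrete.

```python
from itertools import combinations

def cut_size(edges, S):
    """Number of edges with exactly one endpoint in S."""
    return sum((u in S) != (v in S) for u, v in edges)

def sparsest_cut(n, edges, c=0.0):
    """Brute-force min of |E(S, S^c)| / |S| over sets with c*n < |S| <= n/2.

    c = 0 gives the sparsest cut; c > 0 gives the c-balanced separator.
    Assumption: the upper size bound is non-strict (|S| <= n/2).
    """
    best_ratio, best_set = None, None
    for k in range(1, n // 2 + 1):
        if k <= c * n:
            continue  # set too small to be c-balanced
        for S in combinations(range(n), k):
            ratio = cut_size(edges, set(S)) / k
            if best_ratio is None or ratio < best_ratio:
                best_ratio, best_set = ratio, set(S)
    return best_ratio, best_set
```

On the 8-cycle, for example, every arc of k vertices is cut by exactly 2 edges, so the minimum ratio 2/4 is attained by a half-cycle.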
Why these problems are important
Arise in analysis of random walks, PRAM simulation, packet routing, clustering, VLSI layout etc.
Underlie many divide-and-conquer graph algorithms (surveyed by Shmoys’95)
Discrete analogs of isoperimetric constant; useful in study of Riemannian manifolds and 2nd eigenvalue of Laplacian (Cheeger’70)
Graph-theoretic parameters of inherent interest (cf. Lipton-Tarjan planar separator theorem)
Previous approximation algorithms
1) Eigenvalue approaches (Cheeger’70, Alon’85, Alon-Milman’85)
$2c(G) \ge \lambda_L(G) \ge c(G)^2/2$, where $c(G) = \min_{S \subseteq V} E(S, S^c)/E(S)$
2) O(log n)-approximation via LP (multicommodity flows) (Leighton-Rao’88)
Approximate max-flow min-cut theorems; region-growing argument
3) Embeddings of finite metric spaces into $\ell_1$ (Linial, London, Rabinovich’94, AR’94)
Geometric approach; more general result (but still O(log n) approximation)
New results of [ARV’04]
1. $O(\sqrt{\log n})$-approximation to sparsest cut and conductance
2. $O(\sqrt{\log n})$-pseudoapproximation to c-balanced separator (algorithm outputs a c’-balanced separator, c’ < c)
3. Existence of expander flows in every graph (approximate certificates of expansion)
Disparate approaches from previous slide get “unified”
Semidefinite relaxations for c-balanced separator (and how to round the solutions)
LP relaxations for c-balanced separator
Motivation: every cut $(S, S^c)$ defines a (semi)metric $X_{ij} \in \{0, 1\}$
Min $\sum_{(i,j) \in E} X_{ij}$ subject to
$\sum_{i<j} X_{ij} \ge c(1-c)n^2$
$X_{ij} + X_{jk} \ge X_{ik}$
$0 \le X_{ij} \le 1$
Semidefinite: there exist unit vectors $v_1, v_2, \ldots, v_n \in \mathbb{R}^n$ such that $X_{ij} = |v_i - v_j|^2/4$
Semidefinite relaxation (contd)
Min $\sum_{(i,j) \in E} |v_i - v_j|^2/4$ subject to
$|v_i|^2 = 1$
$|v_i - v_j|^2 + |v_j - v_k|^2 \ge |v_i - v_k|^2 \quad \forall i, j, k$
$\sum_{i<j} |v_i - v_j|^2 \ge 4c(1-c)n^2$
The feasible vectors form a unit $\ell_2^2$ space.
Unit vectors $v_1, v_2, \ldots, v_n \in \mathbb{R}^d$ with $|v_i - v_j|^2 + |v_j - v_k|^2 \ge |v_i - v_k|^2 \quad \forall i, j, k$
[Figure: triangle on $v_i$, $v_j$, $v_k$]
Angles are non-obtuse
Taking r steps of length s only takes you squared distance $r s^2$ (i.e. distance $\sqrt{r}\, s$)
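The claim about r steps follows by applying the triangle inequality on squared distances repeatedly. A short derivation, where the walk $x_0, x_1, \ldots, x_r$ is notation introduced here for illustration and every step has length $|x_t - x_{t-1}| = s$:

```latex
\begin{align*}
|x_r - x_0|^2
  &\le |x_r - x_{r-1}|^2 + |x_{r-1} - x_0|^2 \\
  &\le |x_r - x_{r-1}|^2 + |x_{r-1} - x_{r-2}|^2 + |x_{r-2} - x_0|^2 \\
  &\le \cdots \le \sum_{t=1}^{r} |x_t - x_{t-1}|^2 = r s^2 ,
\end{align*}
\text{so } |x_r - x_0| \le \sqrt{r}\, s .
```

By contrast, the ordinary triangle inequality alone would only give the weaker bound $|x_r - x_0| \le r s$.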
Example of $\ell_2^2$ space: hypercube $\{-1, 1\}^k$
$|u - v|^2 = \sum_i |u_i - v_i|^2 = 2 \sum_i |u_i - v_i| = 2\,|u - v|_1$
In fact, every $\ell_1$ space is also $\ell_2^2$
Conjecture (Goemans, Linial): Every $\ell_2^2$ space is $\ell_1$ up to distortion O(1)
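The hypercube identity and the triangle inequality on squared distances can be verified exhaustively for small k. This is an illustrative check only; the helper names are hypothetical.

```python
from itertools import product

def sq_dist(u, v):
    """Squared Euclidean distance |u - v|^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def l1_dist(u, v):
    """l_1 distance |u - v|_1."""
    return sum(abs(a - b) for a, b in zip(u, v))

def check_hypercube(k):
    """Check |u-v|^2 == 2|u-v|_1 and the squared triangle inequality on {-1,1}^k."""
    points = list(product((-1, 1), repeat=k))
    for u in points:
        for v in points:
            if sq_dist(u, v) != 2 * l1_dist(u, v):
                return False
            for w in points:
                # l_2^2 condition: squared distances obey the triangle inequality
                if sq_dist(u, v) + sq_dist(v, w) < sq_dist(u, w):
                    return False
    return True
```

The identity holds coordinate-wise because each $|u_i - v_i|$ is either 0 or 2, and $x^2 = 2x$ for both values.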
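Structure Theorem for $\ell_2^2$ spaces
Two subsets S and T are $\Delta$-separated if for every $v_i \in S$, $v_j \in T$: $|v_i - v_j|^2 \ge \Delta$
Thm: If $\sum_{i<j} |v_i - v_j|^2 = \Omega(n^2)$ then there exist two sets S, T of size $\Omega(n)$ that are $\Delta$-separated for $\Delta = \Omega(1/\sqrt{\log n})$
[Figure: point set in $\mathbb{R}^d$ with the two separated sets S and T]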
Main thm $\Rightarrow$ $O(\sqrt{\log n})$-approximation
$v_1, v_2, \ldots, v_n \in \mathbb{R}^d$ is optimum SDP soln; $SDP_{opt} = \sum_{(i,j) \in E} |v_i - v_j|^2$
S, T: $\Delta$-separated sets of size $\Omega(n)$
Do BFS from S until you hit T. Take the level of the BFS tree with the fewest edges and output the cut $(R, R^c)$ defined by this level
$\sum_{(i,j) \in E} |v_i - v_j|^2 \ge |E(R, R^c)| \times \Delta$
$\Rightarrow |E(R, R^c)| \le SDP_{opt}/\Delta \le O(\sqrt{\log n}\cdot SDP_{opt})$
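The rounding step can be sketched as a layered BFS. This is a minimal illustration under stated assumptions, not the authors' implementation: levels are measured in graph hops, S and T are assumed disjoint vertex sets on labels 0..n-1, and the function name is hypothetical. The analysis on the slide works with the $\ell_2^2$ distances of the SDP vectors, which this sketch does not use.

```python
from collections import deque

def bfs_level_cut(n, edges, S, T):
    """Multi-source BFS from S; return the level cut (R, R^c) with fewest edges.

    Only levels strictly before the first level touching T are considered,
    so S stays inside R and T stays outside. Assumes S and T are disjoint.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = [None] * n
    queue = deque()
    for s in S:
        dist[s] = 0
        queue.append(s)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] is None:
                dist[w] = dist[u] + 1
                queue.append(w)
    reached = [dist[t] for t in T if dist[t] is not None]
    if not reached:
        # T is unreachable: the reachable set is already a cut with no edges
        return {v for v in range(n) if dist[v] is not None}, 0
    depth = min(reached)  # first BFS level that contains a vertex of T
    best_R, best_size = None, None
    for level in range(depth):
        R = {v for v in range(n) if dist[v] is not None and dist[v] <= level}
        size = sum((u in R) != (v in R) for u, v in edges)
        if best_size is None or size < best_size:
            best_R, best_size = R, size
    return best_R, best_size
```

For two triangles joined by a single bridge edge, with S in one triangle and T in the other, the sketch returns the bridge as the cut.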
Other new $O(\sqrt{\log n})$-approximation algorithms
MIN-2-CNF deletion and several graph deletion problems [Agarwal, Charikar, Makarychev, Makarychev’04]
MIN-LINEAR ARRANGEMENT [Charikar, Karloff, Rao’04]
General SPARSEST CUT [A., Lee, Naor ’04]
Min-ratio VERTEX SEPARATORS and Balanced VERTEX SEPARATORS [Feige, Hajiaghayi, Lee ’04]
Next: Proof sketch of Structure Thm (algorithm to produce $\Delta$-separated S, T of size $\Omega(n)$; $\Delta = 1/\sqrt{\log n}$)
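Projection onto a random line
[Figure: unit vector v in $\mathbb{R}^d$ projected onto a random direction u; the projection is distributed roughly like a Gaussian with standard deviation $1/\sqrt{d}$ and density proportional to $e^{-t^2/2}$]
$\Pr_u[\text{projection exceeds } 2\sqrt{\log n}/\sqrt{d}\,] < 1/n^2$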
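The $1/n^2$ bound is the Gaussian tail evaluated at the threshold. A short calculation, assuming the threshold is $2\sqrt{\log n}$ standard deviations (each of size $1/\sqrt{d}$), natural logarithms, and ignoring constant factors in the tail estimate:

```latex
t = 2\sqrt{\log n}
\quad\Longrightarrow\quad
e^{-t^2/2} = e^{-2\log n} = \frac{1}{n^2} .
```

So no single unit vector is likely to project much further than $O(\sqrt{\log n}/\sqrt{d})$ from the origin along u.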
Algorithm to produce two $\Delta$-separated sets
[Figure: random direction u in $\mathbb{R}^d$; sets $S_u$ and $T_u$ whose projections on u are at least $0.01/\sqrt{d}$ apart]
Check if $S_u$ and $T_u$ have size $\Omega(n)$
If any $v_i \in S_u$ and $v_j \in T_u$ satisfy $|v_i - v_j|^2 \le \Delta$, delete them and repeat until no such $v_i$, $v_j$ remain
If $S_u$, $T_u$ still have size $\Omega(n)$, output them
Main difficulty: Show that whp only o(n) points get deleted
“Stretched pair”: $v_i$, $v_j$ such that $|v_i - v_j|^2 \le \Delta$ and $|\langle v_i - v_j, u \rangle| \ge 0.01/\sqrt{d}$
Obs: Deleted pairs are stretched and they form a matching.
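The procedure can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: the slide does not spell out how $S_u$ and $T_u$ are defined, so the sketch takes $S_u$ as the points whose projection is at least $\sigma/\sqrt{d}$ and $T_u$ as the points with projection at most 0 (with $\sigma = 0.01$); the function names are hypothetical, and the size check is left to the caller.

```python
import math
import random

def sq_dist(u, v):
    """Squared Euclidean distance |u - v|^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def separated_sets(vectors, delta, sigma=0.01, rng=random):
    """Project on a random direction, then delete close pairs across the gap.

    Assumption: S_u = {projection >= sigma/sqrt(d)}, T_u = {projection <= 0}.
    Returns index sets (S, T) with |v_i - v_j|^2 > delta for all i in S, j in T.
    """
    d = len(vectors[0])
    # Random unit direction u: a normalised Gaussian vector.
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in g))
    u = [x / norm for x in g]
    gap = sigma / math.sqrt(d)
    proj = [sum(a * b for a, b in zip(v, u)) for v in vectors]
    S = {i for i, p in enumerate(proj) if p >= gap}
    T = {i for i, p in enumerate(proj) if p <= 0.0}
    while True:
        # Look for a pair across the gap that is too close (a "stretched pair").
        pair = next(((i, j) for i in sorted(S) for j in sorted(T)
                     if sq_dist(vectors[i], vectors[j]) <= delta), None)
        if pair is None:
            return S, T
        # Delete both endpoints; the deleted pairs form a matching.
        S.discard(pair[0])
        T.discard(pair[1])
```

A caller would then check that both returned sets still contain a constant fraction of the points, and retry with a fresh direction otherwise.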
“Matching is of size o(n) whp”: naive argument fails
“Stretched pair”: $v_i$, $v_j$ such that $|v_i - v_j|^2 \le \Delta$ and $|\langle v_i - v_j, u \rangle| \ge 0.01/\sqrt{d}$
$0.01/\sqrt{d} = O(1/\sqrt{\Delta}) \times$ standard deviation
$\Rightarrow \Pr_u[v_i, v_j \text{ get stretched}] = \exp(-1/\Delta) = \exp(-\sqrt{\log n})$
E[# of stretched pairs] $= O(n^2) \times \exp(-\sqrt{\log n})$
[Figure: Ball($v_i$, $\Delta$) and direction u; $v_j$ lies at projection distance $0.01/\sqrt{d}$ from $v_i$]
Suppose matching of $\Omega(n)$ size exists with probability $\Omega(1)$…
…stretched pairs are almost everywhere you look!
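Generating a contradiction: the walk on stretched pairs
[Figure: walk of r steps from $v_i$ to $v_{final}$ along direction u; each step advances the projection by $0.01/\sqrt{d}$]
$|v_{final} - v_i| < \sqrt{r\Delta}$
$|\langle v_{final} - v_i, u \rangle| \ge 0.01\, r/\sqrt{d} = O(\sqrt{r/\Delta}) \times$ standard dev.
Contradiction (if r is large enough)!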
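Measure concentration (P. Lévy, Gromov etc.)
[Figure: unit sphere in $\mathbb{R}^d$ with a set A and its neighborhood $A_\gamma$]
A: measurable set with $\mu(A) \ge 1/4$
$A_\gamma$: points with distance $\le \gamma$ to A
$\mu(A_\gamma) \ge 1 - \exp(-\gamma^2 d)$
Reason: Isoperimetric inequality for spheres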
Embeddings of finite metric spaces into geometric spaces
Finite metric space (X, d), mapped by $f$ into $\mathbb{R}^k$ (with $\ell_2$ norm)
[Figure: points x, y at distance d(x, y) and their images under f]
Distortion of f is the minimum $C \ge 1$ such that $d(x, y) \le |f(x) - f(y)|_2 \le C\, d(x, y) \quad \forall x, y$
Thm (Bourgain’85): For every n-point metric space, a map exists with distortion O(log n)
[LLR’94]: Efficient algorithm to find the map; proof that O(log n) cannot be improved in general
Qs: Can O(log n) be improved when X is a geometric space, say $\ell_1$?
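For a concrete finite metric and a given map, the distortion can be computed directly. A minimal sketch with hypothetical names: it allows the map to be rescaled, so the distortion is the largest expansion ratio times the largest contraction ratio, and it assumes f sends distinct points to distinct images.

```python
import math
from itertools import combinations

def distortion(points, dist, f):
    """Distortion of the map f (dict: point -> coordinate tuple) into l_2.

    Computed as (max expansion) * (max contraction), which equals the
    smallest C achievable after rescaling f. Assumes f is injective.
    """
    expansion = contraction = 0.0
    for x, y in combinations(points, 2):
        ratio = math.dist(f[x], f[y]) / dist(x, y)
        expansion = max(expansion, ratio)
        contraction = max(contraction, 1.0 / ratio)
    return expansion * contraction
```

As an example, placing the shortest-path metric of the 4-cycle on the corners of a unit square preserves adjacent distances but shrinks the two diagonals from 2 to $\sqrt{2}$, giving distortion $\sqrt{2}$.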
Status report of this area
$\ell_1$ into $\ell_2$: best lower bound $\log^{0.5} n$ [Enflo’69]
$\ell_2^2$ into $\ell_1$: lower bound of [Zatloukal’04]; superconstant [Khot, Vishnoi’04]. Exactly the integrality gap of SDP for general SPARSEST CUT [LLR’94, AR’94]
$\ell_2^2$ into $\ell_2$: best lower bound $\log^{0.5} n$ [Enflo’69]
Best upper bounds: $\log n$ [Bourgain’85]; $\log^{0.75} n$ [Chawla, Gupta, Racke ’04]; $\log^{0.5} n \log\log n$ [A., Lee, Naor’04]
Fréchet’s recipe to embed metric space (X, d) into $\mathbb{R}^k$
Pick k suitable subsets $A_1, A_2, \ldots, A_k$ of X
Map $x \in X$ to $(d(x, A_1), d(x, A_2), \ldots, d(x, A_k))$
[Figure: point x and its distance to a subset $A_i$]
In recent results, the $A_i$’s are chosen using the [ARV] Structure Theorem and the “measured descent” idea of [Krauthgamer, Lee, Naor, and Mendel’04]
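The recipe itself is a one-liner once the subsets are fixed. The sketch below only implements the map; choosing good subsets is the hard part and is not attempted here. The function name and the example subsets are hypothetical.

```python
def frechet_embedding(points, dist, subsets):
    """Map each point x to (d(x, A_1), ..., d(x, A_k)).

    d(x, A) is the distance from x to the nearest element of A.
    """
    return {
        x: tuple(min(dist(x, a) for a in A) for A in subsets)
        for x in points
    }
```

Each coordinate is 1-Lipschitz, since $|d(x, A) - d(y, A)| \le d(x, y)$ by the triangle inequality, so the map never expands distances in any single coordinate.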
Expander flows (approximate certificates of expansion)
Expander flows: Motivation
G = (V, E)
Idea: Embed a D-regular (weighted) graph w in G such that
(*) $\forall S:\ w(S, S^c) = \Omega(D\,|S|)$ (“expander”)
Cf. Jerrum-Sinclair, Leighton-Rao (embed a complete graph)
Weighted graph w satisfies (*) iff $\lambda_L(w) = \Omega(1)$ [Cheeger]
Our Thm: If G has expansion $\alpha$, then a D-regular expander flow exists in it where $D = \alpha/\sqrt{\log n}$ (certifies expansion $= \Omega(D)$)
Example of expander flow: the n-cycle
Take any 3-regular expander on n nodes
Put a weight of 1/(3n) on each edge
Embed this into the n-cycle
Routing of edges does not exceed any capacity $\Rightarrow$ expansion $= \Omega(1/n)$
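The capacity claim can be checked numerically. A 3-regular graph has 3n/2 edges, so with weight 1/(3n) each the total routed weight is 1/2 and no cycle edge can carry more than that. The sketch below routes each demand edge along the shorter arc and reports the load per cycle edge; the function name is hypothetical, and the routing rule is an assumption made for illustration.

```python
def cycle_loads(n, demand_edges, weight):
    """Route each demand edge along the shorter arc of the n-cycle.

    Cycle edge k joins nodes k and (k + 1) % n. Returns the total weight
    routed over each cycle edge.
    """
    load = [0.0] * n
    for a, b in demand_edges:
        forward = (b - a) % n  # hops when walking a -> a+1 -> ... -> b
        if forward <= n - forward:
            start, hops = a, forward
        else:
            start, hops = b, n - forward
        for step in range(hops):
            load[(start + step) % n] += weight
    return load
```

Any 3-regular demand graph exercises the capacity check; a genuine expander is needed only for the expansion property (*) of the demand graph, which this sketch does not verify.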
New Result (A., Hazan, Kale; FOCS’04)
$O(n^2)$ time algorithm that given any graph G finds, for some D > 0,
a D-regular expander flow, and
a cut of expansion $O(D\sqrt{\log n})$
$\Rightarrow \Omega(D) \le \alpha(G) \le O(D\sqrt{\log n})$
Ingredients: Approximate eigenvalue computations; approximate flow computations (Garg-Könemann; Fleischer); random sampling (Benczur-Karger + some more)
Idea: Define a zero-sum game whose optimum solution is an expander flow; solve approximately using the Freund-Schapire approximate solver.
Expander flows: LP view
LP feasible $\Rightarrow \alpha(G) \ge \Omega(D)$
Thm [ARV]: $\exists\, \beta_0$ s.t. the LP is feasible with $D = \alpha/\sqrt{\log n}$
OPEN PROBLEMS
Better approximation factor than $O(\sqrt{\log n})$? (For general SPARSEST CUT, log log n “lowerbound”)
Better distortion bound for embedding $\ell_2^2$ into $\ell_1$? (Current upper bound vs. log log n lower bound.)
Combinatorial approximation algorithms for other problems? (Similar to the one for SPARSEST CUT from [A., Hazan, Kale])
O(m) time algorithm for SPARSEST CUT instead of $O(n^2)$? (Not known even for the Leighton-Rao’88 O(log n) approximation)
Other applications of expander flows? (Useful in results about Banach spaces [Naor, Rabani, Sinclair’04])
Looking forward to more progress… Thanks !
Open problems (circa April’04)
Better running time/combinatorial algorithm?
Improve approximation ratio to O(1); better rounding?? (our conjectures may be useful…)
Extend result to other expansion-like problems (multicut, general sparsest cut; MIN-2CNF deletion)
Resolve conjecture about embeddability of $\ell_2^2$ into $\ell_1$; of $\ell_1$ into $\ell_2$
Any applications of expander flows?
Progress since then:
$O(n^2)$ time [A., Hazan, Kale]
$\log^{3/4} n$ distortion [Chawla, Gupta, Racke]
Integrality gap is $\Omega(\log n)$ [Charikar]
Yes [Naor, Sinclair, Rabani]
Better embeddings of $\ell_p$ into $\ell_q$ [Lee]
Various new results
$O(n^2)$ time combinatorial algorithm for sparsest cut (does not use semidefinite programs) [A., Hazan, Kale’04]
New results about embeddings: (i) $\ell_p$ into $\ell_q$ [J. Lee’04]; (ii) $\ell_2^2$ and $\ell_1$ into $\ell_2$ [CGR’04] (approx for general sparsest cut)
Clearer explanation of expander flows and their connection to embeddings [NRS’04]
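Formal statement: $\exists\, \beta_0 > 0$ s.t. the following LP is feasible for $d = \alpha(G)/\sqrt{\log n}$
$f_p \ge 0 \quad \forall$ paths p in G
$\forall i:\ \sum_j \sum_{p \in P_{ij}} f_p = d$ (degree), where $P_{ij}$ = paths whose endpoints are i, j
$\forall S \subseteq V:\ \sum_{i \in S,\ j \in S^c} \sum_{p \in P_{ij}} f_p \ge \beta_0\, d\, |S|$ (demand graph is an expander)
$\forall e \in E:\ \sum_{p \ni e} f_p \le 1$ (capacity)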
A concrete conjecture (prove or refute)
G = (V, E); $\alpha = \alpha(G)$
For every distribution on n/3-balanced cuts $\{z_S\}$ (i.e., $\sum_S z_S = 1$) there exist $\Omega(n)$ disjoint pairs $(i_1, j_1), (i_2, j_2), \ldots$ such that for each k:
the distance between $i_k$, $j_k$ in G is $O(1/\alpha)$;
$i_k$, $j_k$ are across an $\Omega(1)$ fraction of cuts in $\{z_S\}$ (i.e., $\sum_{S:\ i_k \in S,\ j_k \in S^c} z_S = \Omega(1)$)
Conjecture $\Rightarrow$ existence of d-regular expander flows for $d = \alpha/\sqrt{\log n}$