Metric Embeddings As Computational Primitives Robert Krauthgamer Weizmann Institute of Science [Based on joint work with Alex Andoni]

October 29, 2008

Distance Estimation (DE)

Fix a metric M.
  - Example: Hamming or edit distance on {0,1}^d

Distance Estimation: given x, y ∈ M as input, compute d_M(x,y).
  - Exactly, or within approximation a ≥ 1
  - Decision version (for input R > 0): determine whether d(x,y) ≤ R or d(x,y) > a·R
  - A basic computational problem

We want efficient algorithms, but in what sense?
1. Classical: best runtime of an exact algorithm
  - Example: quadratic time for edit distance
2. Fast approximation: sublinear-time regime
  - Example: Hamming distance can be estimated in O(d/R) time
3. Communication complexity (variants: number of rounds, streaming)
  - Example: Hamming distance admits 1+ε approximation
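To illustrate the sublinear-time regime mentioned above, here is a minimal sketch of estimating Hamming distance by sampling coordinates. The function name `estimate_hamming` and its parameters are hypothetical, and the slide does not specify the algorithm; this is only one standard way to do it. Intuitively, on the order of d/R samples are needed before strings at distance about R show any mismatch at all, which matches the O(d/R) figure quoted above (constants and error analysis are omitted).

```python
import random


def estimate_hamming(x, y, num_samples, rng):
    """Estimate the Hamming distance of equal-length sequences x and y.

    Samples `num_samples` coordinates uniformly at random (with
    replacement) and rescales the observed mismatch rate by the length d.
    """
    d = len(x)
    mismatches = 0
    for _ in range(num_samples):
        i = rng.randrange(d)
        if x[i] != y[i]:
            mismatches += 1
    return d * mismatches / num_samples


if __name__ == "__main__":
    rng = random.Random(2008)
    x = [rng.randrange(2) for _ in range(10_000)]
    y = list(x)
    for i in range(0, 10_000, 10):   # flip every 10th bit: true distance 1000
        y[i] ^= 1
    print(estimate_hamming(x, y, num_samples=500, rng=rng))
```

The estimate is unbiased, but its accuracy depends on the number of samples relative to d/R; the sample count used in the demo is an arbitrary choice.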

Communication Complexity (CC)

[Figure: Alice (input x) and Bob (input y), sharing public randomness, send sketch(x) and sketch(y) to a Referee; goal: estimate d_M(x,y)]

Known, powerful methodology
Many connections/applications to computation and algorithms
  - E.g. quickly estimate similarity, Near Neighbor Search (NNS), or network proximity

Efficiency measured by:
  - Communication: information … few bits
  - Protocol rounds:
      0 (aka simultaneous/sketching): Referee decides based on sketch(x), sketch(y)
      1 (aka one-way protocol)
      arbitrary

From L_1-Embeddings to Protocols

Theorem [Kushilevitz-Ostrovsky-Rabani’98]: Hamming distance (L_1) admits 1+ε approximation using sketches of size O(1/ε^2).
  - Namely, a sketching protocol with O(1/ε^2) communication
  - Sketch size is optimal, even for one round [Woodruff’04]
  - Open: What is the optimal communication for more rounds?

Corollary: Every metric M that embeds into L_1 with distortion D > 1 admits (2D)-approximation by sketches of size O(1).
  - Provides several state-of-the-art communication bounds (e.g. for edit, block-edit, and planar earthmover distance)

Theorem [Saks-Sun’02, BarYossef-Jayram-Kumar-Sivakumar’04]: L_∞ on [m]^n with approximation m-1 requires Ω(n/m^2) communication.
  - I.e., the trivial protocol is optimal
  - Embedding into L_p for large p is not useful

Embedding into L_1 may be viewed as a “protocol” or “algorithm”.
Alternative perspective: protocols generalize L_1-embeddings.
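The sketching theorem above can be made concrete with a small parity-based sketch in the spirit of Kushilevitz-Ostrovsky-Rabani. This is an illustrative sketch under assumptions, not the construction from the paper: the names `parity_sketch` and `fraction_differing`, the inclusion probability 1/(2R), and the use of a shared seed as "public randomness" are all choices made here. Each sketch bit is the parity of the input on a random coordinate subset, so two sketch bits differ with probability (1 - (1 - 1/R)^D)/2, where D is the Hamming distance; this grows with D, which lets a referee compare D against the threshold R.

```python
import random


def parity_sketch(x, R, num_bits, seed):
    """Sketch a 0/1 vector x for distance threshold R.

    Both parties must use the same `seed` (public randomness) and the
    same vector length, so that they draw identical coordinate subsets.
    """
    rng = random.Random(seed)
    p = 1.0 / (2 * R)               # inclusion probability per coordinate
    sketch = []
    for _ in range(num_bits):
        bit = 0
        for xi in x:
            if rng.random() < p:    # coordinate is in this bit's subset
                bit ^= xi
        sketch.append(bit)
    return sketch


def fraction_differing(sx, sy):
    """Fraction of sketch positions where the two sketches disagree."""
    return sum(a != b for a, b in zip(sx, sy)) / len(sx)


if __name__ == "__main__":
    rng = random.Random(1)
    x = [rng.randrange(2) for _ in range(400)]
    near, far = list(x), list(x)
    for i in range(5):
        near[i] ^= 1                # Hamming distance 5
    for i in range(80):
        far[i] ^= 1                 # Hamming distance 80
    sx = parity_sketch(x, R=20, num_bits=200, seed=42)
    print(fraction_differing(sx, parity_sketch(near, 20, 200, 42)))
    print(fraction_differing(sx, parity_sketch(far, 20, 200, 42)))
```

The number of sketch bits controls how reliably the two cases are separated; the O(1/ε^2) bound on the slide concerns that dependence, which this demo does not attempt to verify.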

Protocols Generalize L_1-Embeddings

A strict generalization: includes embedding into L_2-squared
  - Which is provably richer than L_1 [Khot-Vishnoi’05, K.-Rabani’06, Lee-Naor’06] (though the precise gap between these two is open)
Even with a restricted number of rounds
  - Open: Can more rounds improve communication or approximation?

Can we characterize the metrics that admit efficient protocols? Some concrete questions:
1. What about arbitrary finite metrics?
  - Alternatively, what are the hardest metrics?
2. What about restricted families/topologies?
  - Specifically, what metrics are easy/efficient?
3. Are protocols stronger than embeddings? How much?
  - In particular, should we embed into spaces richer than L_1?

1. Arbitrary Metrics

Theorem [Andoni-K.]: Every n-point metric admits a one-way protocol with communication O((log n)/a), for all a ≥ 1. (Previously known for a = O(1) and a = C log n.)

1. Both parties compute s random partitions of M with diameter Δ = a·r, using the algorithm of [Calinescu-Karloff-Rabani’01]
2. Alice finds i such that x is padded in partition P_i, and sends [content not preserved in transcript]
3. Bob accepts iff hash(P_i(y)) = hash(P_i(x))

Analysis: By [Mendel-Naor’07], the probability of padding t > 0 is [formula not preserved in transcript] (plugging t = r). Letting s = n^{O(1/a)}, the protocol succeeds WHP using O(log s) bits.

Open: Sketching protocol? Is this bound optimal for all 1 ≤ a ≤ o(log n)?
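The three protocol steps above can be sketched as follows. This is a simplified illustration under several assumptions: the partition routine is a basic random-order, random-radius ball carving in the style of Calinescu-Karloff-Rabani rather than their exact algorithm; Alice's message is assumed to be the index i together with the identity of her cluster, which stands in for the hash; and the names `ckr_partition` and `one_way_protocol` are hypothetical. Clusters have diameter at most Δ = a·r, so points farther apart than a·r are never accepted, while points within r of a padded x are always accepted.

```python
import random


def ckr_partition(points, dist, delta, rng):
    """Random partition into clusters of diameter at most `delta`.

    Centers are visited in random order with one random radius in
    [delta/4, delta/2]; each point joins the first center within radius.
    Returns a dict mapping each point to its cluster's center.
    """
    order = list(points)
    rng.shuffle(order)
    radius = rng.uniform(delta / 4, delta / 2)
    return {p: next(c for c in order if dist(p, c) <= radius) for p in points}


def one_way_protocol(x, y, points, dist, r, a, s, seed):
    """Return True (accept), False (reject), or None (no padded partition)."""
    rng = random.Random(seed)       # public randomness shared by both parties
    partitions = [ckr_partition(points, dist, a * r, rng) for _ in range(s)]

    # Alice: find a partition whose cluster around x contains the ball B(x, r).
    message = None
    for i, part in enumerate(partitions):
        if all(part[z] == part[x] for z in points if dist(x, z) <= r):
            message = (i, part[x])  # cluster id used in place of a hash
            break
    if message is None:
        return None

    # Bob: accept iff y lies in the same cluster of partition i.
    i, cluster_of_x = message
    return partitions[i][y] == cluster_of_x


if __name__ == "__main__":
    pts = list(range(40))           # a path metric on 40 points
    d = lambda u, v: abs(u - v)
    print(one_way_protocol(10, 11, pts, d, r=2, a=4, s=8, seed=3))
    print(one_way_protocol(10, 30, pts, d, r=2, a=4, s=8, seed=3))
```

Sending the cluster identity costs more bits than a short hash would; it is used here only to keep the example free of hash collisions.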

1. Arbitrary Metrics: Lower Bound

Theorem [Andoni-K.]: For expanders, CC ≥ Ω((log n)/a) for all a ≥ 1.

Define 2 distributions over pairs (x,y) ∈ M:
  - μ_0 = uniform
  - μ_1 = random x, random walk of length r = (log_d n)/a to y (d = degree)

Lemma: A protocol with t bits implies functions A, B : M → {0,1} such that
  Pr_{μ_1}[A(x)=B(y)] − Pr_{μ_0}[A(x)=B(y)] ≥ Ω(2^{-t})

However, G^r = (E^r, V^r), defined by r-walks, has degree d^r and 2nd eigenvalue λ^r, and [formula not preserved in transcript]
Applying the expander mixing lemma: [formula not preserved in transcript]
Altogether, Ω(2^{-t}) ≤ (λ/d)^r ⇒ t ≥ Ω(r·log(d/λ)).

2. Easy Metrics

Theorem [Andoni-K.]: Every tree metric admits a 1+ε approximation sketching protocol using Õ(1/ε) bits.
  - This bound is optimal, even for complete binary trees

Compare with:
  - L_1 metrics in general are less efficient: they require Ω(1/ε^2) bits
  - Line (path) metric is significantly more efficient: it requires Θ(log(1/ε)) bits

Extends to product of k = O(1) lines or trees (R^k or H^k)

Open: 1+ε approximation protocol for planar metrics?
  - Related to planar distance labeling [Thorup’04]

3. Richer Host Spaces: Powers

Several metrics embed into L_2-squared with O(1) distortion:
  - Doubling metrics [Assouad’82]
  - Planar (and excluded-minor) metrics, via [Klein-Plotkin-Rao’93]

Shift metric = quotient of Hamming cube {0,1}^d wrt cyclic shifts
  - Open: Embedding shift metric into L_2-squared

In fact, the p-th power of L_1 (or L_2) admits a sketching protocol with 1+ε approximation and O(p^2/ε^2) bits
  - Because ||x-y||^p > R(1+ε) ⇒ ||x-y|| > R^{1/p}(1+ε)^{1/p} ≈ R^{1/p}(1+ε/p)

Open: How strict is the hierarchy L_1 ⊂ (L_1)^2 ⊂ … ⊂ (L_1)^p?
  - When allowing constant distortion
  - [Deza-Maehara’90]: n-point metrics embed isometrically into (L_2)^{O(n)}
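The reduction from the p-th power to the base norm can be checked numerically. The helper below (a hypothetical name, `base_thresholds`) converts a threshold test on ||x-y||^p with parameters (R, ε) into the equivalent test on ||x-y||; the resulting accuracy is roughly ε/p, which is where the O(p^2/ε^2) sketch size comes from once a sketch of size O(1/ε'^2) is applied with ε' ≈ ε/p. This is only a worked illustration of the inequality on the slide.

```python
def base_thresholds(R, eps, p):
    """Map a threshold test on ||x-y||^p to one on ||x-y||.

    ||x-y||^p > R(1+eps) holds exactly when
    ||x-y|| > R^(1/p) * (1+eps)^(1/p).
    Returns (R', eps') with R' = R^(1/p) and 1 + eps' = (1+eps)^(1/p).
    """
    base_R = R ** (1.0 / p)
    base_eps = (1.0 + eps) ** (1.0 / p) - 1.0
    return base_R, base_eps


if __name__ == "__main__":
    eps = 0.1
    for p in (1, 2, 4, 8):
        base_R, base_eps = base_thresholds(R=16.0, eps=eps, p=p)
        # Compare the exact accuracy eps' with the approximation eps/p.
        print(p, base_R, base_eps, eps / p)
```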

3. Richer Host Spaces: Products

The L_p-product of k copies of M, denoted ⊕_{L_p} M, is the space M^k = M × … × M equipped with the distance function
  d_{L_p}(x,y) = ( d_M(x_1,y_1)^p + … + d_M(x_k,y_k)^p )^{1/p}
  - E.g. the Z^3 grid is the L_1-product of 3 copies of the integer line

Theorem [Andoni-Indyk-K.’09]: The Ulam metric (edit distance on permutations) embeds with O(1) distortion into ⊕_{(L_2)^2} ⊕_{L_∞} L_1.
  - Leads to improved algorithms (NNS, sketching, sublinear DE)
  - Embedding into L_1 or into L_2-squared (in fact, even sketching) requires higher distortion Ω̃(log d) [Andoni-K.’07]

Open: Use similar products to improve edit distance?
  - Known distortion is 2^{Õ(√(log d))} into L_1

Open: Embedding expanders into ⊕_{L_∞} L_1? Into ⊕_{L_∞} L_1 (low dim)?
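The product distance defined above translates directly into code. The following is a minimal sketch; the function name `lp_product_distance` and the convention of passing `float("inf")` for the L_∞-product are choices made here for illustration.

```python
def lp_product_distance(x, y, base_dist, p):
    """Distance in the L_p-product of k copies of a metric space.

    x and y are k-tuples of points of the base space, `base_dist` is the
    base metric d_M, and p >= 1 (use float("inf") for the max-product).
    """
    coordinate_dists = [base_dist(a, b) for a, b in zip(x, y)]
    if p == float("inf"):
        return max(coordinate_dists)
    return sum(t ** p for t in coordinate_dists) ** (1.0 / p)


if __name__ == "__main__":
    line = lambda u, v: abs(u - v)  # the integer line as base metric
    # Z^3 grid as the L_1-product of three copies of the integer line.
    print(lp_product_distance((0, 0, 0), (1, 2, 3), line, p=1))
    print(lp_product_distance((0, 0, 0), (1, 2, 3), line, p=float("inf")))
```

Nested products, such as the host space in the Ulam theorem, can be modeled by passing one product distance as the `base_dist` of another.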

Conclusion

We discussed connections between communication protocols and embeddings, into L_1 and into other spaces
  - We view embeddings as an algorithmic framework (model of computation)
  - Alternatively, view protocols as a computational analogue of embeddings

Open: What metrics admit efficient protocols?
Open: Do protocols imply embeddings? Even under mild technical conditions, e.g.
  - Target space is (L_2)^p, for fixed p
  - M is closed under l_1-product (aka sum-product)
  - The regime of 1+ε approximation
