Published by Anna Fields. Modified over 9 years ago.
1
Metric Embeddings as Computational Primitives
Robert Krauthgamer, Weizmann Institute of Science
[Based on joint work with Alex Andoni]
2
October 29, 2008 Metric Embeddings as Computational Primitives
Distance Estimation (DE)
Fix a metric $M$. Example: Hamming or edit distance on $\{0,1\}^d$.
Distance Estimation: given $x, y \in M$ as input, compute $d_M(x,y)$, exactly or within approximation $a \ge 1$.
  Decision version (for input $R > 0$): determine whether $d(x,y) \le R$ or $d(x,y) > aR$.
A basic computational problem. We want efficient algorithms, but in what sense?
1. Classical: best runtime of an exact algorithm. Example: quadratic time for edit distance.
2. Fast approximation: sublinear-time regime. Example: Hamming distance can be estimated in $O(d/R)$ time.
3. Communication complexity (variants: number of rounds, streaming). Example: Hamming distance admits $1+\varepsilon$ approximation.
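The sublinear-time example for Hamming distance can be illustrated by coordinate sampling. The following is a minimal sketch, not the method from the talk: the function name `hamming_gap_test`, the constant `c`, and the midpoint threshold are assumptions chosen for illustration. It probes about $c \cdot d/R$ random coordinates and compares the number of mismatches with a threshold lying between the expected counts at distance $R$ and at distance $aR$.

```python
import random

def hamming_gap_test(x, y, R, a=2.0, c=20, rng=None):
    """Guess whether d(x, y) <= R (True) or d(x, y) > a*R (False).

    Hypothetical helper: probes roughly c*d/R random coordinates,
    where c is an illustrative constant, not a tuned value.
    """
    rng = rng or random.Random()
    d = len(x)
    m = max(1, int(c * d / R))  # number of probed coordinates
    mismatches = sum(x[i] != y[i] for i in (rng.randrange(d) for _ in range(m)))
    # Expected mismatches are m*k/d at distance k; use the midpoint
    # between the expectations at k = R and k = a*R as the threshold.
    threshold = m * R * (1 + a) / (2 * d)
    return mismatches <= threshold
```

The answer is only correct with some probability when the distance is near $R$ or $aR$; increasing `c` trades running time for confidence.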
3
Communication Complexity (CC)
[Diagram: Alice holds $x$, Bob holds $y$; they share public randomness and want to estimate $d_M(x,y)$. In the sketching model, Alice sends sketch(x) and Bob sends sketch(y) to a Referee.]
Known, powerful methodology.
Many connections/applications to computation and algorithms.
  E.g. quickly estimate similarity, Near Neighbor Search (NNS), or network proximity.
Efficiency measured by:
  Communication: information … few bits
  Protocol rounds:
    0 (aka simultaneous/sketching): Referee decides based on sketch(x), sketch(y)
    1 (aka one-way protocol)
    arbitrary
4
From $L_1$-Embeddings to Protocols
Theorem [Kushilevitz-Ostrovsky-Rabani'98]: Hamming distance ($L_1$) admits $1+\varepsilon$ approximation using sketches of size $O(1/\varepsilon^2)$.
  Namely, a sketching protocol with $O(1/\varepsilon^2)$ communication.
  Sketch size is optimal, even for one round [Woodruff'04].
  Open: What is the optimal communication for more rounds?
Corollary: Every metric $M$ that embeds into $L_1$ with distortion $D > 1$ admits $(2D)$-approximation by sketches of size $O(1)$.
  Provides several state-of-the-art communication bounds (e.g. for edit, block-edit, and planar earthmover distance).
Theorem [Saks-Sun'02, Bar-Yossef-Jayram-Kumar-Sivakumar'04]: $L_\infty$ on $[m]^n$ with approximation $m-1$ requires $\Omega(n/m^2)$ communication.
  I.e., the trivial protocol is optimal.
  Embedding into $L_p$ for large $p$ is not useful.
Embedding into $L_1$ may be viewed as a "protocol" or "algorithm".
Alternative perspective: Protocols generalize $L_1$-embeddings.
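A sketching protocol for Hamming distance can be illustrated with random parities. The following is a simplified sketch in the spirit of the Kushilevitz-Ostrovsky-Rabani construction, not a faithful reproduction of it: the names `make_sketcher`, `differ_prob`, and `referee`, the inclusion probability $1/(2R)$, and the midpoint threshold are assumptions made for illustration. Each sketch bit is the parity of the input on a shared random subset of coordinates; at Hamming distance $k$, a given bit differs between the two sketches with probability $(1 - (1 - 1/R)^k)/2$, which grows with $k$.

```python
import random

def make_sketcher(d, R, num_bits, seed):
    """Build a sketch function from public randomness (the shared seed).

    Each of the num_bits subsets keeps every coordinate independently
    with probability 1/(2R); this choice is an illustrative assumption.
    """
    rng = random.Random(seed)
    p = 1.0 / (2 * R)
    subsets = [[i for i in range(d) if rng.random() < p] for _ in range(num_bits)]

    def sketch(x):
        # One parity bit per random subset.
        return [sum(x[i] for i in s) % 2 for s in subsets]

    return sketch

def differ_prob(k, R):
    """Probability that a single sketch bit differs at Hamming distance k."""
    return (1 - (1 - 1.0 / R) ** k) / 2

def referee(sx, sy, R, a):
    """Return True for 'distance <= R', False for 'distance > a*R'."""
    frac = sum(b1 != b2 for b1, b2 in zip(sx, sy)) / len(sx)
    threshold = (differ_prob(R, R) + differ_prob(a * R, R)) / 2
    return frac <= threshold
```

Alice and Bob each call `sketch` on their own input with the same seed, and the referee sees only the two bit vectors. Taking `num_bits` on the order of $1/\varepsilon^2$ matches the sketch size stated in the theorem for $a = 1+\varepsilon$.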
5
Protocols Generalize $L_1$-Embeddings
A strict generalization: includes embedding into $L_2$-squared,
  which is provably richer than $L_1$ [Khot-Vishnoi'05, K.-Rabani'06, Lee-Naor'06] (though the precise gap between these two is open).
  Even with a restricted number of rounds.
  Open: Can more rounds improve communication or approximation?
Can we characterize the metrics that admit efficient protocols? Some concrete questions:
1. What about arbitrary finite metrics? Alternatively, what are the hardest metrics?
2. What about restricted families/topologies? Specifically, what metrics are easy/efficient?
3. Are protocols stronger than embeddings? How much? In particular, should we embed into spaces richer than $L_1$?
6
1. Arbitrary Metrics
Theorem [Andoni-K.]: Every $n$-point metric admits a one-way protocol with communication $O((\log n)/a)$, for all $a \ge 1$.
  (Previously known for $a = O(1)$ and $a = C \log n$.)
Protocol:
1. Both parties compute $s$ random partitions of $M$ with diameter $\Delta = a r$, using the algorithm of [Calinescu-Karloff-Rabani'01].
2. Alice finds $i$ such that $x$ is padded in partition $P_i$, and sends [message missing in source].
3. Bob accepts iff hash($P_i(y)$) = hash($P_i(x)$).
Analysis: By [Mendel-Naor'07], the probability of padding $t > 0$ is [formula missing in source] (plugging $t = r$).
  Letting $s = n^{O(1/a)}$, the protocol succeeds WHP using $O(\log s)$ bits.
Open: Sketching protocol? Is this bound optimal for all $1 \le a \le o(\log n)$?
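The three protocol steps can be illustrated on a small finite metric. This is a hedged sketch under several assumptions: the partition routine follows the usual Calinescu-Karloff-Rabani recipe (random order of centers, random radius in $[\Delta/4, \Delta/2]$); the transcript does not say what Alice sends, so the message `(i, cluster id of x)` is a guess; and the cluster id is sent directly instead of a short hash, which costs more bits than the stated bound. All function names are hypothetical.

```python
import random

def ckr_partition(points, dist, delta, rng):
    """One random partition whose clusters have diameter at most delta."""
    order = points[:]
    rng.shuffle(order)                        # random priority of centers
    beta = rng.uniform(delta / 4, delta / 2)  # random cluster radius
    # Each point joins the first center (in random order) within distance beta.
    return {p: next(c for c in order if dist(p, c) <= beta) for p in points}

def is_padded(x, t, points, dist, cluster):
    """True if the whole ball of radius t around x lies in x's cluster."""
    return all(cluster[z] == cluster[x] for z in points if dist(x, z) <= t)

def one_way_protocol(x, y, points, dist, r, a, s, seed):
    """Return (i, accepted); i is None if Alice finds no padded partition."""
    rng = random.Random(seed)  # public randomness shared by both parties
    partitions = [ckr_partition(points, dist, a * r, rng) for _ in range(s)]
    # Alice: find a partition in which x is padded.
    i = next((j for j, P in enumerate(partitions)
              if is_padded(x, r, points, dist, P)), None)
    if i is None:
        return None, False
    message = (i, partitions[i][x])  # ASSUMPTION: index plus cluster id
    # Bob: accept iff y falls in the same cluster of partition i.
    j, cluster_of_x = message
    return i, partitions[j][y] == cluster_of_x
```

If $d(x,y) \le r$ and Alice finds a padded partition, Bob always accepts; if $d(x,y) > a r$, the two points can never share a cluster, so Bob always rejects.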
7
1. Arbitrary Metrics: Lower Bound
Theorem [Andoni-K.]: For expanders, CC is $\Omega((\log n)/a)$ for all $a \ge 1$.
Define 2 distributions over pairs $(x,y)$ of points in $M$:
  distribution 0 = uniform;
  distribution 1 = random $x$, then a random walk of length $r = (\log_d n)/a$ to $y$ ($d$ = degree).
Lemma: A protocol with $t$ bits implies functions $A, B : M \to \{0,1\}$ such that
  $\Pr_1[A(x)=B(y)] - \Pr_0[A(x)=B(y)] \ge \Omega(2^{-t})$.
However, $G^r = (E^r, V^r)$, defined by $r$-walks, has degree $d^r$ and 2nd eigenvalue $\lambda^r$, and [formula missing in source].
Applying the expander mixing lemma: [formula missing in source].
Altogether, $\Omega(2^{-t}) \le (\lambda/d)^r \Rightarrow t \ge \Omega(r \log(d/\lambda))$.
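To see how the final inequality yields the bound in the theorem, one can substitute the walk length $r$. The derivation below is a worked step, not taken from the slides, and it assumes a spectral gap of the form $\lambda \le d^{1-c}$ for some constant $c > 0$ (for example, $\lambda = O(\sqrt{d})$ gives $c$ close to $1/2$ for large $d$).

```latex
\begin{align*}
t &\ge \Omega\bigl(r \log(d/\lambda)\bigr),
  \qquad r = \frac{\log_d n}{a} = \frac{\log n}{a \log d} \\
\lambda \le d^{1-c} &\;\Longrightarrow\; \log(d/\lambda) \ge c \log d \\
&\;\Longrightarrow\; t \ge \Omega\!\left(\frac{\log n}{a \log d} \cdot c \log d\right)
  = \Omega\!\left(\frac{c \log n}{a}\right)
\end{align*}
```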
8
2. Easy Metrics
Theorem [Andoni-K.]: Every tree metric admits a $1+\varepsilon$ approximation sketching protocol using $\tilde{O}(1/\varepsilon)$ bits.
  This bound is optimal, even for complete binary trees.
Compare with:
  $L_1$ metrics in general are less efficient: they require $\Omega(1/\varepsilon^2)$ bits.
  The line (path) metric is significantly more efficient: it requires $\Theta(\log(1/\varepsilon))$ bits.
Extends to a product of $k = O(1)$ lines or trees ($R^k$ or $H^k$).
Open: $1+\varepsilon$ approximation protocol for planar metrics?
  Related to planar distance labeling [Thorup'04].
9
3. Richer Host Spaces: Powers
Several metrics embed into $L_2$-squared with $O(1)$ distortion:
  Doubling metrics [Assouad'82]
  Planar (and excluded-minor) metrics, via [Klein-Plotkin-Rao'93]
  Shift metric = quotient of the Hamming cube $\{0,1\}^d$ with respect to cyclic shifts
    Open: Embedding the shift metric into $L_2$-squared
In fact, the $p$-th power of $L_1$ (or $L_2$) admits a sketching protocol with $1+\varepsilon$ approximation and $O(p^2/\varepsilon^2)$ bits.
  Because $\|x-y\|^p > R(1+\varepsilon) \iff \|x-y\| > R^{1/p}(1+\varepsilon)^{1/p} \approx R^{1/p}(1+\varepsilon/p)$.
Open: How strict is the hierarchy $L_1 \subseteq (L_1)^2 \subseteq \dots \subseteq (L_1)^p$, when allowing constant distortion?
  [Deza-Maehara'90]: $n$-point metrics embed isometrically into $(L_2)^{O(n)}$.
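The reduction behind the $O(p^2/\varepsilon^2)$ bound can be checked numerically. The helper below is a hypothetical illustration: it converts a threshold question about the $p$-th power of a norm into an equivalent question about the norm itself, and shows that the required accuracy shrinks from $\varepsilon$ to roughly $\varepsilon/p$, so a sketch of size $O(1/(\varepsilon/p)^2) = O(p^2/\varepsilon^2)$ suffices.

```python
def power_to_base_instance(R, eps, p):
    """Map the instance (R, eps) for ||x-y||^p to an instance for ||x-y||.

    Hypothetical helper: returns (base_R, base_eps) such that
    ||x-y||^p > R*(1+eps)  iff  ||x-y|| > base_R*(1+base_eps).
    """
    base_R = R ** (1.0 / p)
    base_eps = (1 + eps) ** (1.0 / p) - 1  # approximately eps / p
    return base_R, base_eps

base_R, base_eps = power_to_base_instance(R=16, eps=0.1, p=2)
# base_R is 4.0, and base_eps is about 0.0488, close to eps/p = 0.05.
```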
10
3. Richer Host Spaces: Products
The $L_p$-product of $k$ copies of $M$, denoted $\bigoplus_{L_p} M$, is the space $M^k = M \times \dots \times M$ equipped with the distance function
  $d_{L_p}(x,y) = \bigl( d_M(x_1,y_1)^p + \dots + d_M(x_k,y_k)^p \bigr)^{1/p}$.
  E.g. the $Z^3$ grid is the $L_1$-product of 3 copies of the integer line.
Theorem [Andoni-Indyk-K.'09]: The Ulam metric (edit distance on permutations) embeds with $O(1)$ distortion into $\bigoplus_{(L_2)^2} \bigoplus_{L_\infty} L_1$.
  Leads to improved algorithms (NNS, sketching, sublinear DE).
  Embedding into $L_1$ or into $L_2$-squared (in fact, even sketching) requires higher distortion $\tilde{\Omega}(\log d)$ [Andoni-K.'07].
Open: Use similar products to improve edit distance?
  Known distortion is $2^{\tilde{O}(\sqrt{\log d})}$ into $L_1$.
Open: Embedding expanders into $\bigoplus_{L_\infty} L_1$? Into $\bigoplus_{L_\infty} L_1$ (low dim)?
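The product definition translates directly into code. The following minimal sketch uses hypothetical function names: `lp_product_distance` evaluates the $L_p$-product distance over an arbitrary base metric, and `squared_l2_of_linf_of_l1` evaluates the distance in a nested host space of the shape named in the Ulam theorem (a squared-$L_2$ product of $L_\infty$ products of $L_1$), ignoring dimensions and scaling.

```python
def lp_product_distance(x, y, base_dist, p):
    """Distance in the L_p-product of copies of a base metric."""
    if p == float("inf"):
        return max(base_dist(a, b) for a, b in zip(x, y))
    return sum(base_dist(a, b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def squared_l2_of_linf_of_l1(X, Y):
    """Sum over outer blocks of (max over inner blocks of L_1 distance)^2."""
    return sum(
        max(sum(abs(u - v) for u, v in zip(xj, yj)) for xj, yj in zip(Xi, Yi)) ** 2
        for Xi, Yi in zip(X, Y)
    )

line = lambda a, b: abs(a - b)  # the integer line as base metric
# Z^3 grid as the L_1-product of 3 copies of the integer line:
print(lp_product_distance((0, 0, 0), (1, 2, 3), line, 1))  # prints 6.0
```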
11
Conclusion
We discussed connections between communication protocols and embeddings, into $L_1$ and into other spaces.
  We view embeddings as an algorithmic framework (model of computation).
  Alternatively, view protocols as a computational analogue of embeddings.
Open: What metrics admit efficient protocols?
Open: Do protocols imply embeddings? Even under mild technical conditions, e.g.:
  Target space is $(L_2)^p$, for fixed $p$
  $M$ is closed under $\ell_1$-product (aka sum-product)
  The regime of $1+\varepsilon$ approximation