Metric Embeddings As Computational Primitives Robert Krauthgamer Weizmann Institute of Science [Based on joint work with Alex Andoni]



October 29, 2008. Metric Embeddings as Computational Primitives

Slide 2: Distance Estimation (DE)

Fix a metric $M$
  - Example: Hamming or edit distance on $\{0,1\}^d$
Distance Estimation: Given $x,y \in M$ as input, compute $d_M(x,y)$
  - Exactly or within approximation $a \ge 1$
  - Decision version (for input $R>0$): determine if $d(x,y) \le R$ or $d(x,y) > a \cdot R$
  - A basic computational problem
We want efficient algorithms – but in what sense?
1. Classical – best runtime of exact algorithm
  - Example: quadratic time for edit distance
2. Fast approximation – sublinear time regime
  - Example: Hamming distance can be estimated in $O(d/R)$ time
3. Communication complexity (variants: # rounds, streaming)
  - Example: Hamming distance admits $1+\varepsilon$ approximation
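The sublinear-time example for Hamming distance can be illustrated by coordinate sampling. The sketch below is one standard way to obtain a tester for the decision version that reads only $O(d/R)$ coordinates; the function name `hamming_gap_test`, the oversampling constant `c`, and the midpoint threshold are illustrative assumptions, not taken from the talk.

```python
import random

def hamming_gap_test(x, y, R, a=4.0, c=30.0, rng=None):
    """Guess "close" (True) if Hamming(x, y) <= R, "far" (False) if > a*R.

    Reads only m = O(d/R) randomly chosen coordinates (for fixed a and c).
    Hypothetical helper: the constants a and c are illustrative choices.
    """
    rng = rng or random.Random()
    d = len(x)
    m = max(1, int(c * d / R))            # number of sampled coordinates
    mismatches = 0
    for _ in range(m):
        i = rng.randrange(d)              # sample with replacement
        if x[i] != y[i]:
            mismatches += 1
    # Expected mismatches: at most m*R/d when close, more than a*m*R/d when far.
    threshold = (m * R / d) * (1 + a) / 2
    return mismatches <= threshold
```

The answer is only required to be correct with good probability, and the saving over reading all $d$ coordinates appears once $R$ is larger than the constant `c`.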

Slide 3: Communication Complexity (CC)

[Figure: Alice holds $x$, Bob holds $y$, and they share public randomness; the goal is to estimate $d_M(x,y)$. In the sketching model, sketch($x$) and sketch($y$) are sent to a Referee.]

Known, powerful methodology
Many connections/applications to computation and algorithms
  - E.g. quickly estimate similarity, Near Neighbor Search (NNS), or network proximity
Efficiency measured by:
  - Communication (information) … few bits
  - Protocol rounds:
      0 (aka simultaneous/sketching): Referee decides based on sketch($x$), sketch($y$)
      1 (aka one-way protocol)
      arbitrary

Slide 4: From $L_1$-Embeddings to Protocols

Theorem [Kushilevitz-Ostrovsky-Rabani'98]: Hamming distance ($L_1$) admits $1+\varepsilon$ approximation using sketches of size $O(1/\varepsilon^2)$.
  - Namely, a sketching protocol with $O(1/\varepsilon^2)$ communication
  - Sketch size is optimal, even for one round [Woodruff'04]
  - Open: What is the optimal communication for more rounds?
Corollary: Every metric $M$ that embeds into $L_1$ with distortion $D>1$ admits $(2D)$-approximation by sketches of size $O(1)$.
  - Provides several state-of-the-art communication bounds (e.g. for edit, block-edit, and planar earthmover distance)
Theorem [Saks-Sun'02, BarYossef-Jayram-Kumar-Sivakumar'04]: $L_\infty$ on $[m]^n$ with approx. $m-1$ requires $\Omega(n/m^2)$ communication
  - I.e., the trivial protocol is optimal
  - Embedding into $L_p$ for large $p$ is not useful
Embedding into $L_1$ may be viewed as a "protocol" or "algorithm"
Alternative perspective: Protocols generalize $L_1$-embeddings
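A sketching protocol of this kind can be illustrated with random parities. The code below is a minimal sketch in the spirit of the Kushilevitz-Ostrovsky-Rabani result, not the authors' exact construction; the names `parity_sketch`, `differ_prob`, and `referee`, the sampling rate $1/(2R)$, and the midpoint threshold are assumptions made for illustration. Each sketch bit differs between $x$ and $y$ with probability $(1-(1-1/R)^D)/2$, where $D$ is the Hamming distance. This probability shifts by $\Theta(\varepsilon)$ between $D=R$ and $D=(1+\varepsilon)R$, so $k=O(1/\varepsilon^2)$ bits suffice for constant success probability.

```python
import random

def parity_sketch(x, R, k, seed):
    """k-bit sketch of a 0/1 vector x, tuned to distance scale R >= 1.

    Bit j is the parity of x on a random subset S_j in which every coordinate
    appears independently with probability 1/(2R). The seed stands in for
    public randomness, so both parties draw the same subsets.
    """
    rng = random.Random(seed)
    p = 1.0 / (2 * R)
    bits = []
    for _ in range(k):
        parity = 0
        for xi in x:
            if rng.random() < p:          # drawn for every coordinate, independent of x
                parity ^= xi
        bits.append(parity)
    return bits

def differ_prob(D, R):
    """Probability that one sketch bit differs when Hamming(x, y) = D."""
    return (1 - (1 - 1.0 / R) ** D) / 2

def referee(sk_x, sk_y, R, eps):
    """Return True for "distance <= R", False for "distance > (1+eps)*R"."""
    frac = sum(b != c for b, c in zip(sk_x, sk_y)) / len(sk_x)
    threshold = (differ_prob(R, R) + differ_prob((1 + eps) * R, R)) / 2
    return frac <= threshold
```

The referee sees only the two $k$-bit sketches, never $x$ or $y$, which is what makes this a 0-round (simultaneous) protocol.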

Slide 5: Protocols Generalize $L_1$-Embeddings

A strict generalization – includes embedding into $L_2$-squared
  - Which is provably richer than $L_1$ [Khot-Vishnoi'05, K.-Rabani'06, Lee-Naor'06] (though the precise gap between these two is open)
Even with restricted # rounds
  - Open: Can more rounds improve communication or approximation?
Can we characterize the metrics that admit efficient protocols?
Some concrete questions:
1. What about arbitrary finite metrics?
  - Alternatively, what are the hardest metrics?
2. What about restricted families/topologies?
  - Specifically, what metrics are easy/efficient?
3. Are protocols stronger than embeddings? How much?
  - In particular, should we embed into spaces richer than $L_1$?

Slide 6: 1. Arbitrary Metrics

Theorem [Andoni-K.]: Every $n$-point metric admits a one-way protocol with communication $O((\log n)/a)$, for all $a \ge 1$. (Previously known for $a=O(1)$ and $a=C\log n$)
1. Both parties compute $s$ random partitions of $M$ with diameter $\Delta = a \cdot r$, using the algorithm of [Calinescu-Karloff-Rabani'01]
2. Alice finds $i$ s.t. $x$ is padded in partition $P_i$, and sends [message not preserved in the transcript; step 3 suggests $i$ and hash($P_i(x)$)]
3. Bob accepts iff hash($P_i(y)$) = hash($P_i(x)$)
Analysis: By [Mendel-Naor'07], the probability of padding $t>0$ is [formula not preserved in the transcript] (plugging $t=r$)
Letting $s=n^{O(1/a)}$, the protocol succeeds WHP using $O(\log s)$ bits.
Open: Sketching protocol? Is this bound optimal for all $1 \le a \le o(\log n)$?
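The three protocol steps can be sketched on a small explicit metric. The code below is an illustrative toy, not the authors' implementation: the function names are hypothetical, the partition follows the usual random-order, random-radius recipe associated with Calinescu-Karloff-Rabani, and Alice sends the cluster identifier unhashed (a real protocol would hash it down to a few bits). The seed models public randomness.

```python
import random

def ckr_partition(points, dist, delta, rng):
    """Random partition whose clusters have diameter <= delta (CKR-style)."""
    order = list(points)
    rng.shuffle(order)                        # random priority order on centers
    beta = rng.uniform(delta / 4, delta / 2)  # random radius
    # Each point joins the first center in the order within distance beta;
    # a point is within distance 0 of itself, so a center always exists.
    return {p: next(c for c in order if dist(p, c) <= beta) for p in points}

def is_padded(x, r, points, dist, part):
    """True if the whole ball B(x, r) lies in x's cluster."""
    return all(part[z] == part[x] for z in points if dist(x, z) <= r)

def one_way_protocol(x, y, r, a, s, points, dist, seed):
    """Accept (True) if d(x,y) <= r; reject (False) if d(x,y) > a*r."""
    rng = random.Random(seed)                 # public randomness
    parts = [ckr_partition(points, dist, a * r, rng) for _ in range(s)]
    # Alice: find a partition in which x is padded.
    i = next((j for j, P in enumerate(parts)
              if is_padded(x, r, points, dist, P)), None)
    if i is None:
        return False                          # no padded partition: protocol fails
    message = (i, parts[i][x])                # cluster id sent unhashed for simplicity
    # Bob: compare his own cluster in partition i with Alice's.
    j, alice_cluster = message
    return parts[j][y] == alice_cluster
```

Pairs at distance more than $a \cdot r$ are always rejected, because every cluster has diameter at most $a \cdot r$. Pairs at distance at most $r$ are accepted whenever Alice finds a padded partition, which is where the choice of $s$ enters.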

Slide 7: 1. Arbitrary Metrics – Lower Bound

Theorem [Andoni-K.]: For expanders, CC $\ge \Omega((\log n)/a)$ for all $a \ge 1$.
Define 2 distributions over pairs $x,y \in M$:
  - $\mu_0$ = uniform;
  - $\mu_1$ = random $x$, random walk of length $r=(\log_d n)/a$ to $y$ ($d$ = degree)
Lemma: A protocol with $t$ bits implies functions $A,B : M \to \{0,1\}$ s.t. $\Pr_{\mu_1}[A(x)=B(y)] - \Pr_{\mu_0}[A(x)=B(y)] \ge \Omega(2^{-t})$
However, $G^r=(V^r,E^r)$ defined by $r$-walks has degree $d^r$ and 2nd eigenvalue $\lambda^r$, and [formula not preserved in the transcript]
Applying the expander mixing lemma: [formula not preserved in the transcript]
Altogether, $\Omega(2^{-t}) \le (\lambda/d)^r \Rightarrow t \ge \Omega(r \cdot \log(d/\lambda))$.
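The final implication can be spelled out as a short calculation. The sketch below writes $\lambda$ for the second eigenvalue and $c>0$ for the constant hidden in the $\Omega(\cdot)$. It also assumes a spectral gap of the form $\lambda \le d^{1-\delta}$ for a constant $\delta>0$; that assumption is made here for illustration and is not stated on the slide.

```latex
\begin{align*}
c \cdot 2^{-t} \le (\lambda/d)^r
  &\;\Longrightarrow\; t \ge r \log_2(d/\lambda) - \log_2(1/c), \\
\lambda \le d^{1-\delta},\quad r = \frac{\log_d n}{a}
  &\;\Longrightarrow\; t \ge \frac{\log_d n}{a}\cdot \delta \log_2 d - O(1)
   = \frac{\delta \log_2 n}{a} - O(1).
\end{align*}
```

Under that assumption the bound reads $t \ge \Omega((\log n)/a)$ whenever $(\log n)/a$ exceeds a suitable constant.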

Slide 8: 2. Easy Metrics

Theorem [Andoni-K.]: Every tree metric admits a $1+\varepsilon$ approximation sketching protocol using $\tilde{O}(1/\varepsilon)$ bits
  - This bound is optimal, even for complete binary trees
Compare with:
  - $L_1$ metrics in general are less efficient – require $\Omega(1/\varepsilon^2)$ bits
  - Line (path) metric is significantly more efficient – requires $\Theta(\log(1/\varepsilon))$ bits
Extends to product of $k=O(1)$ lines or trees ($\mathbb{R}^k$ or $H^k$)
Open: $1+\varepsilon$ approximation protocol for planar metrics?
  - Related to planar distance labeling [Thorup'04]

Slide 9: 3. Richer Host Spaces – Powers

Several metrics embed into $L_2$-squared with $O(1)$ distortion:
  - Doubling metrics [Assouad'82]
  - Planar (and excluded-minor) metrics, via [Klein-Plotkin-Rao'93]
Shift metric = quotient of Hamming cube $\{0,1\}^d$ wrt cyclic shifts
  - Open: Embedding shift metric into $L_2$-squared
In fact, the $p$-th power of $L_1$ (or $L_2$) admits a sketching protocol with $1+\varepsilon$ approximation and $O(p^2/\varepsilon^2)$ bits
  - Because $\|x-y\|^p > R(1+\varepsilon) \iff \|x-y\| > R^{1/p}(1+\varepsilon)^{1/p} \approx R^{1/p}(1+\varepsilon/p)$
Open: How strict is the hierarchy $L_1 \subseteq (L_1)^2 \subseteq \dots \subseteq (L_1)^p$?
  - When allowing constant distortion
  - [Deza-Maehara'90]: $n$-point metrics embed isometrically into $(L_2)^{O(n)}$
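The bit count for $p$-th powers follows from a reduction to the norm itself. As a sketch, assuming $0<\varepsilon\le 1$ and $p \ge 1$, and taking the $O(1/\varepsilon^2)$ sketch size for the underlying norm as given:

```latex
\begin{align*}
(1+\varepsilon)^{1/p}
  &= \exp\!\big(\tfrac{1}{p}\ln(1+\varepsilon)\big)
   \ge 1 + \tfrac{1}{p}\ln(1+\varepsilon)
   \ge 1 + \frac{\varepsilon}{2p}, \\
\|x-y\|^p \le R
  &\iff \|x-y\| \le R^{1/p}, \\
\|x-y\|^p > (1+\varepsilon)R
  &\;\Longrightarrow\; \|x-y\| > \Big(1+\frac{\varepsilon}{2p}\Big) R^{1/p}, \\
\text{sketch size with } \varepsilon' = \frac{\varepsilon}{2p}:\quad
  O(1/\varepsilon'^2) &= O(p^2/\varepsilon^2).
\end{align*}
```

So a $(1+\varepsilon')$-approximate decision protocol for the norm at threshold $R^{1/p}$ also decides the $p$-th power at threshold $R$.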

Slide 10: Richer Host Spaces – Products

The $L_p$-product of $k$ copies of $M$, denoted $\oplus_{L_p} M$, is the space $M^k = M \times \dots \times M$ equipped with distance function $d_{L_p}(x,y) = \left( d_M(x_1,y_1)^p + \dots + d_M(x_k,y_k)^p \right)^{1/p}$
  - E.g. the $\mathbb{Z}^3$ grid is the $L_1$-product of 3 copies of the integer line
Theorem [Andoni-Indyk-K.'09]: The Ulam metric (edit distance on permutations) embeds with $O(1)$ distortion into $\oplus_{(L_2)^2} \oplus_{L_\infty} L_1$.
  - Leads to improved algorithms (NNS, sketching, sublinear DE)
  - Embedding into $L_1$ or into $L_2$-squared (in fact, even sketching) requires higher distortion $\tilde{\Omega}(\log d)$ [Andoni-K.'07]
Open: Use similar products to improve edit distance?
  - Known distortion is $2^{\tilde{O}(\sqrt{\log d})}$ into $L_1$
Open: Embedding expanders into $\oplus_{L_\infty} L_1$? Into $\oplus_{L_\infty} L_1$ (low dim)?
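The product construction is easy to state in code. The sketch below builds product distances from a base distance function and then nests them in the shape of the host space in the Ulam theorem. All function names are hypothetical, and reading the outer $(L_2)^2$-product as a plain sum of squares (no root) is an assumption about the notation.

```python
def lp_product_distance(dist_m, p):
    """Distance on M^k: (sum_i dist_m(x_i, y_i)^p)^(1/p); p = inf gives the max."""
    def dist(x, y):
        parts = [dist_m(xi, yi) for xi, yi in zip(x, y)]
        if p == float("inf"):
            return max(parts)
        return sum(t ** p for t in parts) ** (1.0 / p)
    return dist

def l2_squared_product_distance(dist_m):
    """Sum of squared coordinate distances (assumed meaning of the (L_2)^2-product)."""
    return lambda x, y: sum(dist_m(xi, yi) ** 2 for xi, yi in zip(x, y))

def line(u, v):
    """Distance on the integer line."""
    return abs(u - v)

def l1(u, v):
    """L_1 distance between equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Z^3 grid as the L_1-product of 3 copies of the integer line.
grid = lp_product_distance(line, 1)

# Nested product: (L_2)^2-product of L_infinity-products of L_1.
nested = l2_squared_product_distance(lp_product_distance(l1, float("inf")))
```

For instance, `grid((0, 0, 0), (1, 2, 3))` sums the three coordinate gaps, and `nested` takes an $L_1$ distance per innermost vector, a maximum per block, and a sum of squares across blocks.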

Slide 11: Conclusion

We discussed connections between communication protocols and embeddings, into $L_1$ and into other spaces
  - We view embeddings as an algorithmic framework (model of computation)
  - Alternatively, view protocols as a computational analogue of embeddings
Open: What metrics admit efficient protocols?
Open: Do protocols imply embeddings? Even under mild technical conditions, e.g.
  - Target space is $(L_2)^p$, for fixed $p$
  - $M$ is closed under $\ell_1$-product (aka sum-product)
  - The regime of $1+\varepsilon$ approximation