Advances in Metric Embedding Theory. Yair Bartal, Hebrew University & Caltech. UCLA IPAM 2007.


Metric Spaces
 Metric space: (X,d), d: X × X → R⁺ with d(u,v) = d(v,u), d(v,w) ≤ d(v,u) + d(u,w), and d(u,u) = 0.
 Data representation: pictures (e.g. faces), web pages, DNA sequences, …
 Network: communication distance.

Metric Embedding
 Simple representation: translate metric data into an easy-to-analyze form and gain geometric structure, e.g. embed in low-dimensional Euclidean space.
 Algorithmic application: apply algorithms designed for a "nice" space to solve problems on "problematic" metric spaces.

Embedding Metric Spaces
 Metric spaces (X,d_X), (Y,d_Y); an embedding is a function f: X → Y.
 For an embedding f and u,v in X, let dist_f(u,v) = d_Y(f(u),f(v)) / d_X(u,v).
 Distortion: c = max_{u,v ∈ X} dist_f(u,v) / min_{u,v ∈ X} dist_f(u,v).
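A minimal sketch (not from the slides) of how these definitions translate into code: given the source distances and the embedded points, compute the per-pair ratios dist_f(u,v) and take the max/min ratio. The function names are illustrative.

```python
import numpy as np

def pairwise(points):
    """All pairwise Euclidean distances over pairs i < j of a point array."""
    P = np.asarray(points, dtype=float)
    diffs = P[:, None, :] - P[None, :, :]
    D = np.sqrt((diffs ** 2).sum(-1))
    i, j = np.triu_indices(len(P), k=1)
    return D[i, j]

def distortion(d_X, points_Y):
    """d_X: source distances over pairs i < j (same order as pairwise);
    points_Y: embedded points, one row per point."""
    ratios = pairwise(points_Y) / np.asarray(d_X, dtype=float)  # dist_f per pair
    return ratios.max() / ratios.min()
```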

Special Metric Spaces
 Euclidean space; the ℓ_p metric on R^n: ||x − y||_p = (Σ_i |x_i − y_i|^p)^{1/p}.
 Planar metrics.
 Tree metrics.
 Ultrametrics.
 Doubling metrics.

Embedding in Normed Spaces
 [Fréchet Embedding]: Any n-point metric space embeds isometrically in L_∞.
 Proof sketch: map each point x to the vector of its distances to all points, f(x) = (d(x,w))_{w∈X}; by the triangle inequality |d(x,w) − d(y,w)| ≤ d(x,y) for every w, with equality at w = y, so ||f(x) − f(y)||_∞ = d(x,y).
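A short sketch (not from the slides) of the Fréchet map just described, with a check of the isometry; it assumes the input is a valid distance matrix.

```python
import numpy as np

def frechet_embedding(D):
    """D: n x n distance matrix. Row i is the embedding of point i into R^n:
    its coordinates are the distances from i to every point."""
    return np.asarray(D, dtype=float)

def check_isometry(D):
    F = frechet_embedding(D)
    n = len(D)
    for i in range(n):
        for j in range(n):
            # l_infinity distance between rows equals the original distance
            assert abs(np.abs(F[i] - F[j]).max() - D[i][j]) < 1e-9
```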

Embedding in Normed Spaces
 [Bourgain 85]: Any n-point metric space embeds in L_p with distortion Θ(log n).
 [Johnson-Lindenstrauss 85]: Any n-point subset of Euclidean space embeds with distortion (1+ε) in dimension Θ(ε⁻² log n).
 [ABN 06, B 06]: Dimension Θ(log n); in fact Θ*(log n / loglog n).
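A sketch (not from the slides) of the random Gaussian projection commonly used to realize the Johnson-Lindenstrauss bound; the target dimension k ≈ ε⁻² log n is the standard choice, but the constant 8 here is purely illustrative.

```python
import numpy as np

def jl_project(points, eps, rng=np.random.default_rng(0)):
    """Project points (one row per point) into ~eps^-2 log n dimensions."""
    n, d = points.shape
    k = int(np.ceil(8 * np.log(n) / eps ** 2))   # illustrative constant
    G = rng.normal(size=(d, k)) / np.sqrt(k)     # scaled Gaussian matrix
    return points @ G                            # pairwise distances preserved
                                                 # up to (1 + eps) w.h.p.
```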

Embedding Metrics in their Intrinsic Dimension
 Definition: A metric space X has doubling constant λ if any ball of radius r > 0 can be covered by λ balls of half the radius.
 Doubling dimension: dim(X) = log λ.
 [ABN 07b]: Any n-point metric space X can be embedded into L_p with distortion O(log^{1+θ} n) and dimension O(dim(X)).
 Same embedding, using: nets, the Lovász Local Lemma.
 Distortion-dimension tradeoff.
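A rough empirical estimator (not from the talk) of the doubling constant from the definition above: for each ball B(x,r) we greedily build an r/2-net of it, and the net size estimates the number of half-radius balls needed to cover it. Function and parameter names are illustrative.

```python
def doubling_estimate(D, radii):
    """D: n x n distance matrix; radii: radii r to test.
    Returns the largest greedy cover size found (an estimate of lambda)."""
    n, best = len(D), 1
    for x in range(n):
        for r in radii:
            ball = [y for y in range(n) if D[x][y] <= r]
            centers = []
            for y in ball:                           # greedy r/2-net of B(x, r)
                if all(D[y][c] > r / 2 for c in centers):
                    centers.append(y)
            best = max(best, len(centers))
    return best
```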

Average Distortion
 A practical measure of the quality of an embedding: network embedding, multi-dimensional scaling, biology, vision, …
 Given a non-contracting embedding f: (X,d_X) → (Y,d_Y), avgdist(f) = (1 / C(n,2)) · Σ_{u≠v} dist_f(u,v).
 [ABN 06]: Every n-point metric space embeds into L_p with average distortion O(1), worst-case distortion Θ(log n), and dimension Θ(log n).

The ℓ_q-Distortion
 ℓ_q-distortion: dist_q(f) = (E[dist_f(u,v)^q])^{1/q}, the expectation taken over a uniformly random pair u,v; q = ∞ gives the worst-case distortion and q = 1 the average distortion.
 [ABN 06]: The ℓ_q-distortion is bounded by Θ(q).
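A small sketch (not from the slides) computing the ℓ_q-distortion from the per-pair ratios, directly from the definition above; it assumes the ratios come from a non-contracting embedding.

```python
import numpy as np

def lq_distortion(ratios, q):
    """ratios: dist_f(u,v) over all pairs u < v (assumed >= 1).
    q may be float('inf')."""
    r = np.asarray(ratios, dtype=float)
    if np.isinf(q):
        return r.max()                        # worst-case distortion
    return (np.mean(r ** q)) ** (1.0 / q)     # q = 1 gives the average distortion
```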

Dimension Reduction into Constant Dimension
 [B 07]: Any finite subset of Euclidean space embeds in dimension h with ℓ_q-distortion e^{O(q/h)} ≈ 1 + O(q/h).
 Corollary: Every finite metric space embeds into L_p in dimension h with bounded ℓ_q-distortion.

Local Embeddings
 Def: A k-local embedding has distortion D(k) if for every pair of k-nearest neighbors x,y: dist_f(x,y) ≤ D(k).
 [ABN 07c]: For fixed k, a k-local embedding into L_p with distortion Θ(log k) and dimension Θ(log k) (under a very weak growth bound condition).
 [ABN 07c]: A k-local embedding into L_p with distortion Õ(log k) on k-nearest neighbors, for all k simultaneously, and dimension Θ(log n).
 Same embedding method; Lovász Local Lemma.

Local Dimension Reduction
 [BRS 07]: For fixed k, any finite set of points in Euclidean space has a k-local embedding with distortion (1+ε) in dimension O(ε⁻² log k) (under a very weak growth bound condition).
 New embedding ideas; Lovász Local Lemma.

Time for a …

Metric Ramsey Problem
 Given a metric space, what is the largest subspace with some special structure, e.g. close to Euclidean?
 Graph theory: Every graph on n vertices contains either a clique or an independent set of size Ω(log n).
 Dvoretzky's theorem…
 [BFM 86]: Every n-point metric space contains a subspace of size Ω(c(ε) log n) which embeds in Euclidean space with distortion (1+ε).

Basic Structures: Ultrametric, k-HST [B 96]
 A k-HST is a rooted labeled tree: each node v carries a label Λ(v) ≥ 0, labels decrease by a factor of at least k from a node to its children, and leaves have label 0.
 The distance between leaves x,z is d(x,z) = Λ(lca(x,z)).
 An ultrametric k-embeds in a k-HST (moreover, this can be done so that labels are powers of k).
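A minimal sketch (not from the slides) of an HST as a labeled rooted tree, with the ultrametric distance d(x,z) = Λ(lca(x,z)) between leaves; it assumes both leaves belong to the same tree.

```python
class HSTNode:
    def __init__(self, label, children=None):
        self.label = label                 # diameter label Lambda(v); 0 at leaves
        self.children = children or []
        self.parent = None
        for c in self.children:
            c.parent = self

def hst_distance(x, z):
    """Ultrametric distance between two leaves of the same tree:
    the label of their lowest common ancestor."""
    ancestors = set()
    node = x
    while node is not None:
        ancestors.add(id(node))
        node = node.parent
    node = z
    while id(node) not in ancestors:
        node = node.parent
    return node.label
```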

Hierarchically Well-Separated Trees
 (Figure: a tree whose level labels decrease geometrically: Δ_2 ≤ Δ_1/k, Δ_3 ≤ Δ_2/k, …)

Properties of Ultrametrics
 An ultrametric is a tree metric.
 Ultrametrics embed isometrically in ℓ_2.
 [BM 04]: Any n-point ultrametric (1+ε)-embeds in ℓ_p^d, where d = O(ε⁻² log n).

A Metric Ramsey Phenomenon
 Consider n equally spaced points on the line.
 Choose a "Cantor-like" set of points and construct a binary tree over them.
 The resulting tree is a 3-HST, and the original subspace embeds in this tree with distortion 3.
 Size of subspace: n^{log_3 2} ≈ n^{0.63}.
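A small sketch (not from the slides) of the "Cantor-like" selection on {0, …, n−1}: repeatedly discard the middle third of each interval; the surviving set has roughly n^{log_3 2} points.

```python
def cantor_subset(lo, hi, min_len=1):
    """Indices surviving the middle-third construction on the range [lo, hi)."""
    if hi - lo <= min_len:
        return list(range(lo, hi))
    third = (hi - lo) // 3
    return (cantor_subset(lo, lo + third, min_len) +
            cantor_subset(hi - third, hi, min_len))

# Example: len(cantor_subset(0, 3**7)) == 2**7 points out of 3**7.
```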

Metric Ramsey Phenomena
 [BLMN 03, MN 06, B 06]: Any n-point metric space contains a subspace of size n^{1−ε} which embeds in an ultrametric with distortion Θ(1/ε).
 [B 06]: Any n-point metric space contains a subspace of linear size which embeds in an ultrametric with ℓ_q-distortion bounded by Õ(q).

Metric Ramsey Theorems
 Key ingredient: partitions.

Complete Representation via Ultrametrics?
 Goal: Given an n-point metric space, embed it into a single ultrametric with low distortion.
 Lower bound [RR 95]: Ω(n); in fact this holds even for embedding the n-cycle into arbitrary tree metrics.

Probabilistic Embedding
 [Karp 89]: The n-cycle probabilistically embeds into the n line metrics obtained by deleting a single edge, with distortion 2.
 If u,v are adjacent in the cycle C, then E[d_L(u,v)] = (n−1)/n + (n−1)/n < 2 = 2·d_C(u,v).
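A small sketch (not from the slides) that checks Karp's calculation empirically: delete a uniformly random edge of the n-cycle and measure the expected stretch of an adjacent pair, which should approach 2(n−1)/n < 2. Function names are illustrative.

```python
import random

def path_distance_after_deletion(n, u, v, deleted):
    """Distance between u and v on the path obtained from the n-cycle by
    deleting the edge (deleted, deleted+1 mod n)."""
    forward = (v - u) % n                      # clockwise arc from u to v
    uses_deleted = (deleted - u) % n < forward
    return n - forward if uses_deleted else forward

def estimate_stretch(n, trials=100000, rng=random.Random(0)):
    total = 0.0
    for _ in range(trials):
        deleted = rng.randrange(n)
        total += path_distance_after_deletion(n, 0, 1, deleted)
    return total / trials                      # expected distance of an adjacent pair

# estimate_stretch(100) is about 2 * 99 / 100 = 1.98
```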

Probabilistic Embedding
 [B 96,98,04, FRT 03]: Any n-point metric space probabilistically embeds into an ultrametric with distortion Θ(log n).
 [ABN 05,06, CDGKS 05]: The ℓ_q-distortion is Θ(q).

Probabilistic Embedding
 Key ingredient: probabilistic partitions.

Probabilistic Partitions
 P = {S_1, S_2, …, S_t} is a partition of X; P(x) is the cluster containing x.
 P is Δ-bounded if diam(S_i) ≤ Δ for all i.
 A probabilistic partition is a distribution over a set of partitions.
 It is (η,δ)-padded if Pr[B(x, ηΔ) ⊆ P(x)] ≥ δ for every x; call it η-padded when δ is a fixed constant.
 [B 96]: η = Ω(1/log n). [CKR 01 + FRT 03, ABN 06]: η(x) = Ω(1/log ρ(x,Δ)).

Partitions and Embedding [B 96, Rao 99, …]
 Let Δ = diam(X) and let Δ_i = 4^i be the scales.
 For each scale i, create a probabilistic Δ_i-bounded partition P_i that is η-padded.
 For each cluster choose σ_i(S) ~ Ber(½) i.i.d., and set f_i(x) = σ_i(P_i(x)) · d(x, X \ P_i(x)) (a sketch of one such coordinate appears below).
 Repeat O(log n) times.
 Distortion: O(η⁻¹ · log^{1/p} Δ). Dimension: O(log n · log Δ).
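A sketch (not from the talk) of one coordinate of this construction, f_i(x) = σ_i(P_i(x)) · d(x, X \ P_i(x)), using a simple random-shift ball-carving partition (in the spirit of CKR) as a stand-in for the Δ_i-bounded padded partitions; names and the choice of partition are illustrative.

```python
import random

def ball_carving_partition(D, delta, rng):
    """Random delta-bounded partition of points {0..n-1}: random center order,
    one random radius in [delta/4, delta/2); each point joins the first center
    that covers it. Returns a cluster id per point."""
    n = len(D)
    order = list(range(n))
    rng.shuffle(order)
    radius = rng.uniform(delta / 4, delta / 2)
    cluster = [None] * n
    for c in order:
        for x in range(n):
            if cluster[x] is None and D[c][x] <= radius:
                cluster[x] = c
    return cluster

def embedding_coordinate(D, delta, rng):
    """One random coordinate of the embedding at scale delta."""
    n = len(D)
    cluster = ball_carving_partition(D, delta, rng)
    sigma = {c: rng.randrange(2) for c in set(cluster)}   # Ber(1/2) per cluster
    coord = []
    for x in range(n):
        outside = [D[x][y] for y in range(n) if cluster[y] != cluster[x]]
        dist_out = min(outside) if outside else delta     # d(x, X \ P(x))
        coord.append(sigma[cluster[x]] * dist_out)
    return coord
```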

Time to …

Uniform Probabilistic Partitions
 In a uniform probabilistic partition, η: X → [0,1] and all points in a cluster have the same padding parameter.
 [ABN 06] Uniform partition lemma: There exists a uniform probabilistic Δ-bounded partition such that η(x) = 1/log ρ(x,Δ), where ρ(x,r) is the local growth rate of x at radius r.
 (Figure: clusters C_1, C_2 with padding parameters η(C_1), η(C_2).)

Embedding into a Single Dimension
 Let Δ_i = 4^i.
 For each scale i, create a uniformly padded probabilistic Δ_i-bounded partition P_i.
 For each cluster choose σ_i(S) ~ Ber(½) i.i.d., and set f_i(x) = σ_i(P_i(x)) · η_i⁻¹(x) · d(x, X \ P_i(x)).
1. Upper bound: |f(x)−f(y)| ≤ O(log n) · d(x,y).
2. Lower bound: E[|f(x)−f(y)|] ≥ Ω(d(x,y)).
3. Replicate D = Θ(log n) times to get high probability.

Upper Bound: |f(x)−f(y)| ≤ O(log n) · d(x,y)
 For all x,y ∈ X:
 P_i(x) ≠ P_i(y) implies f_i(x) ≤ η_i⁻¹(x) · d(x,y).
 P_i(x) = P_i(y) implies f_i(x) − f_i(y) ≤ η_i⁻¹(x) · d(x,y).
 Uses the uniform padding within each cluster.

Lower Bound:
 Take a scale i such that Δ_i ≈ d(x,y)/4.
 It must be that P_i(x) ≠ P_i(y).
 With probability ½: η_i⁻¹(x) · d(x, X \ P_i(x)) ≥ Δ_i.

Lower Bound: E[|f(x)−f(y)|] ≥ Ω(d(x,y))
 Let R denote the contribution of the remaining scales to |f(x)−f(y)|. Two cases:
1. R < Δ_i/2: with prob. ⅛, σ_i(P_i(x)) = 1 and σ_i(P_i(y)) = 0; then f_i(x) ≥ Δ_i and f_i(y) = 0, so |f(x)−f(y)| ≥ Δ_i/2 = Ω(d(x,y)).
2. R ≥ Δ_i/2: with prob. ¼, σ_i(P_i(x)) = 0 and σ_i(P_i(y)) = 0; then f_i(x) = f_i(y) = 0, so |f(x)−f(y)| ≥ Δ_i/2 = Ω(d(x,y)).

Partial Embedding & Scaling Distortion
 Definition: A (1−ε)-partial embedding has distortion D(ε) if at least a 1−ε fraction of the pairs satisfy dist_f(u,v) ≤ D(ε).
 Definition: An embedding has scaling distortion D(·) if it is a (1−ε)-partial embedding with distortion D(ε), for all ε > 0. [KSW 04]
 [ABN 05, CDGKS 05]: Partial distortion and dimension Θ(log(1/ε)).
 [ABN 06]: Scaling distortion Θ(log(1/ε)) for all metrics.

ℓ_q-Distortion vs. Scaling Distortion
 Upper bound D(ε) ≤ c·log(1/ε) on the scaling distortion:
 ½ of the pairs have distortion ≤ c·log 2 = c
 + ¼ of the pairs have distortion ≤ c·log 4 = 2c
 + ⅛ of the pairs have distortion ≤ c·log 8 = 3c
 …
 Average distortion = O(1).
 Worst-case distortion = O(log n).
 ℓ_q-distortion = O(min{q, log n}).
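For concreteness, here is the geometric-series calculation behind the bullets above written out, assuming (for simplicity) distortion exactly c·j on the j-th group of pairs:

```latex
\[
  \text{avgdist} \;\le\; \sum_{j\ge 1} 2^{-j}\,\bigl(c\,\log 2^{j}\bigr)
  \;=\; c \sum_{j\ge 1} \frac{j}{2^{j}} \;=\; 2c \;=\; O(1),
  \qquad
  \mathrm{dist}_q \;\le\; \Bigl(\sum_{j\ge 1} 2^{-j}\,(c\,j)^{q}\Bigr)^{1/q}
  \;=\; O(q).
\]
```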

Coarse Scaling Embedding into L_p
 Definition: For u ∈ X, r_ε(u) is the minimal radius such that |B(u, r_ε(u))| ≥ εn.
 Coarse scaling embedding: for each u ∈ X, preserves distances to every v with d(u,v) ≥ r_ε(u).
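A small sketch (not from the slides) computing r_ε(u) straight from the definition above; the function name is illustrative.

```python
import math

def r_eps(D, u, eps):
    """D: n x n distance matrix, u: point index, 0 < eps <= 1.
    Smallest radius whose ball around u contains at least eps*n points."""
    n = len(D)
    k = max(1, math.ceil(eps * n))       # need at least eps*n points in the ball
    return sorted(D[u])[k - 1]           # k-th smallest distance from u (u itself included)
```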

Scaling Distortion
 Claim: If d(x,y) ≥ r_ε(x) then 1 ≤ dist_f(x,y) ≤ O(log 1/ε).
 Let l be the scale with d(x,y) ≤ Δ_l < 4·d(x,y).
1. Lower bound: E[|f(x)−f(y)|] ≥ d(x,y).
2. Upper bound for high-diameter terms.
3. Upper bound for low-diameter terms.
4. Replicate D = Θ(log n) times to get high probability.

Upper Bound for high-diameter terms: |f(x)−f(y)| ≤ O(log 1/ε) · d(x,y)
 Scale l such that r_ε(x) ≤ d(x,y) ≤ Δ_l < 4·d(x,y).

Upper Bound for low-diameter terms: |f(x)−f(y)| = O(1) · d(x,y)
 Scale l such that d(x,y) ≤ Δ_l < 4·d(x,y).
 All lower levels i ≤ l are bounded by Δ_i.

Embedding into Trees with Constant Average Distortion
 [ABN 07a]: An embedding of any n-point metric into a single ultrametric; an embedding of any graph on n vertices into a spanning tree of the graph.
 Average distortion = O(1).
 ℓ_2-distortion = O(√log n).
 ℓ_q-distortion = Θ(n^{1−2/q}), for 2 < q ≤ ∞.

Conclusion
 Developing a mathematical theory of embeddings of finite metric spaces.
 Fruitful interaction between computer science and pure/applied mathematics.
 New concepts of embedding yield surprisingly strong properties.

Summary
 A unified framework for embedding finite metrics.
 Probabilistic embedding into ultrametrics.
 Metric Ramsey theorems.
 New measures of distortion.
 Embeddings with strong properties: optimal scaling distortion, constant average distortion, tight distortion-dimension tradeoff.
 Embedding metrics in their intrinsic dimension.
 Embeddings that strongly preserve locality.