Advances in Metric Embedding Theory

Presentation transcript:

Advances in Metric Embedding Theory ICMS - Geometry and Algorithms workshop Yair Bartal Hebrew University & Caltech IPAM UCLA

My (metric) Space

Metric Spaces Metric space: (X,d), d:X²→R⁺ d(u,v)=d(v,u) d(u,w) ≤ d(u,v) + d(v,w) d(u,u)=0 Data Representation: Pictures (e.g. faces), web pages, DNA sequences, … Network: communication distance
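
As a concrete illustration (not part of the original slides), here is a minimal Python sketch that checks these axioms for a finite metric given as a distance matrix; the function name is_metric and the numpy-based representation are my own choices.

```python
import numpy as np

def is_metric(D, tol=1e-9):
    """Check the metric axioms for a square distance matrix D, where D[u][v]
    is assumed to hold d(x_u, x_v) for a finite point set."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    if np.any(np.abs(np.diag(D)) > tol):       # d(u,u) = 0
        return False
    if np.any(D < -tol):                       # non-negativity
        return False
    if not np.allclose(D, D.T, atol=tol):      # symmetry: d(u,v) = d(v,u)
        return False
    for k in range(n):                         # triangle inequality through every k:
        if np.any(D > D[:, [k]] + D[[k], :] + tol):   # d(u,w) <= d(u,k) + d(k,w)
            return False
    return True

# Example: the shortest-path metric of the 4-cycle
C4 = [[0, 1, 2, 1],
      [1, 0, 1, 2],
      [2, 1, 0, 1],
      [1, 2, 1, 0]]
print(is_metric(C4))   # True
```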

Metric Embedding Simple Representation: translate metric data into an easy-to-analyze form and gain geometric structure, e.g. embed in low-dimensional Euclidean space. Algorithmic Application: apply algorithms for a “nice” space to solve problems on “problematic” metric spaces.

Embedding Metric Spaces Metric spaces (X,dX), (Y,dY). An embedding is a function f:X→Y. For an embedding f and u,v ∈ X let distf(u,v) = dY(f(u),f(v)) / dX(u,v). Distortion c = max{u,v ∈ X} distf(u,v) / min{u,v ∈ X} distf(u,v)
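
A small sketch of the distortion computation just defined, with distf(u,v) = dY(f(u),f(v)) / dX(u,v); the helper distortion and the distance-matrix representation are illustrative assumptions, not code from the talk.

```python
import numpy as np
from itertools import combinations

def distortion(DX, DY, f):
    """Distortion of the map sending point u of X to point f[u] of Y, where DX
    and DY are the distance matrices of X and Y (an illustrative helper)."""
    n = DX.shape[0]
    ratios = [DY[f[u], f[v]] / DX[u, v] for u, v in combinations(range(n), 2)]
    return max(ratios) / min(ratios)   # max expansion divided by min ratio

# Example: send the 4-cycle onto the points 0,1,2,3 of the line (identity map)
DX = np.array([[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]], dtype=float)
pos = np.arange(4.0)
DY = np.abs(pos[:, None] - pos[None, :])
print(distortion(DX, DY, [0, 1, 2, 3]))   # 3.0: the cycle edge {0,3} is stretched
```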

Special Metric Spaces Euclidean space lp metrics in Rⁿ: d(x,y) = ‖x−y‖p = (Σi |xi−yi|^p)^(1/p) Planar metrics Tree metrics Ultrametrics Doubling metrics

Embedding in Normed Spaces [Bourgain 85]: Any n-point metric space embeds in Lp with distortion Θ(log n) [Johnson-Lindenstrauss 85]: Any n-point subset of Euclidean space embeds with distortion (1+ε) in dimension Θ(ε^-2 log n) [ABN 06, B 06]: Dimension Θ(log n); in fact Θ*(log n / loglog n)
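
The Johnson-Lindenstrauss guarantee can be tried out with a standard random Gaussian projection; this is my own sketch (the constant 4 in the target dimension is illustrative, and this is not the construction of [ABN 06, B 06]).

```python
import numpy as np

def jl_project(X, eps, seed=0):
    """Project the rows of X into k = O(eps^-2 log n) dimensions using a random
    Gaussian matrix; w.h.p. all pairwise distances are preserved up to (1+eps)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = int(np.ceil(4 * np.log(n) / eps**2))   # illustrative constant
    G = rng.normal(size=(d, k)) / np.sqrt(k)   # entries N(0, 1/k)
    return X @ G

# Quick check on one pair: the ratio is typically within 1 +/- eps
X = np.random.default_rng(1).normal(size=(100, 500))
Y = jl_project(X, eps=0.5)
print(np.linalg.norm(Y[3] - Y[7]) / np.linalg.norm(X[3] - X[7]))
```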

Average Distortion Practical measure of the quality of an embedding Network embedding, multi-dimensional scaling, biology, vision, … Given a non-contracting embedding f:(X,dX)→(Y,dY), its average distortion is the average of distf(u,v) over all pairs u,v ∈ X. [ABN06]: Every n point metric space embeds into Lp with average distortion O(1), worst-case distortion Θ(log n) and dimension Θ(log n).

The lq-Distortion lq-distortion: distq(f) = (E[distf(u,v)^q])^(1/q), where the expectation is over a uniformly chosen pair u,v. [ABN 06]: lq-distortion is bounded by Θ(q)
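
A sketch of the lq-distortion as defined above, i.e. the q-th moment of distf(u,v) over a uniformly random pair (q = 1 gives the average distortion, q = ∞ the worst case); function and variable names are my own.

```python
import numpy as np
from itertools import combinations

def lq_distortion(DX, DY, q):
    """l_q-distortion of a non-contracting embedding, given the source distance
    matrix DX and the image distance matrix DY, over uniformly random pairs."""
    n = DX.shape[0]
    ratios = np.array([DY[u, v] / DX[u, v] for u, v in combinations(range(n), 2)])
    if np.isinf(q):
        return float(ratios.max())                 # worst-case distortion
    return float(np.mean(ratios ** q)) ** (1.0 / q)   # (E[dist_f(u,v)^q])^(1/q)

# q = 1 recovers the average distortion, q = float('inf') the worst-case distortion
```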

Time for a…

Metric Ramsey Problem Given a metric space, what is the largest size subspace which has some special structure, e.g. close to being Euclidean? Graph theory: Every graph of size n contains either a clique or an independent set of size Θ(log n) Dvoretzky’s theorem… [BFM 86]: Every n point metric space contains a subspace of size Ω(c(ε) log n) which embeds in Euclidean space with distortion (1+ε)

Basic Structures: Ultrametric, k-HST [B 96] An ultrametric is given by a labeled tree: every node v carries a label Γ(v), leaf labels are 0 (Γ(z)=0), labels do not increase from the root toward the leaves, and the distance between two leaves is the label of their lowest common ancestor, d(x,z) = Γ(lca(x,z)) = Γ(v). In a k-HST the label of each node is at most 1/k times the label of its parent, e.g. 0 = Γ(z) ≤ Γ(w) ≤ Γ(v)/k ≤ Γ(u)/k^2. An ultrametric k-embeds in a k-HST (moreover this can be done so that labels are powers of k).

Hierarchically Well-Separated Trees The diameters of the clusters decrease by a factor of at least k at every level: Δ2 ≤ Δ1/k, Δ3 ≤ Δ2/k, …

Properties of Ultrametrics An ultrametric is a tree metric. Ultrametrics embed isometrically in l2. [BM 04]: Any n-point ultrametric (1+ε)-embeds in lp^d, where d = O(ε^-2 log n).
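
A minimal sketch of an ultrametric presented as a labeled tree, as in the HST picture: the distance between two leaves is the label of their lowest common ancestor. The class name and the representation are assumptions made for illustration.

```python
class LabeledTree:
    """A rooted tree with a label on every internal node; leaves carry points.
    d(x, z) = label of lca(x, z) defines an ultrametric on the leaves."""
    def __init__(self, label=0.0, children=(), point=None):
        self.label, self.children, self.point = label, children, point

    def leaves(self):
        if self.point is not None:
            return [self.point]
        return [p for c in self.children for p in c.leaves()]

    def dist(self, x, z):
        if x == z:
            return 0.0
        for c in self.children:
            lv = c.leaves()
            if x in lv and z in lv:
                return c.dist(x, z)      # both below one child: recurse
        return self.label                # the pair splits here: lca is this node

# Example: a 2-HST on four points
T = LabeledTree(4.0, children=(
        LabeledTree(2.0, children=(LabeledTree(point='a'), LabeledTree(point='b'))),
        LabeledTree(2.0, children=(LabeledTree(point='c'), LabeledTree(point='d')))))
print(T.dist('a', 'b'), T.dist('a', 'c'))   # 2.0 4.0
```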

A Metric Ramsey Phenomenon Consider n equally spaced points on the line. Choose a “Cantor like” set of points, and construct a binary tree over them. The resulting tree is a 3-HST, and the original subspace embeds in this tree with distortion 3. Size of subspace: n^(log_3 2) ≈ n^0.63.
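
The size bound can be checked directly (my own sketch, assuming the Cantor-like subset keeps the first and last thirds of the range recursively).

```python
def cantor_subset(lo, hi):
    """Indices of a Cantor-like subset of the equally spaced points lo..hi-1."""
    n = hi - lo
    if n <= 2:
        return list(range(lo, hi))
    third = n // 3
    return cantor_subset(lo, lo + third) + cantor_subset(hi - third, hi)

S = cantor_subset(0, 3**6)                   # 729 equally spaced points
print(len(S), round((3**6) ** 0.6309))       # 64 ~= n^(log_3 2)
```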

Metric Ramsey Phenomena [BLMN 03, MN 06, B 06]: Any n-point metric space contains a subspace of size n^(1-ε) which embeds in an ultrametric with distortion Θ(1/ε) [B 06]: Deterministic construction [B 06]: Any n-point metric space contains a subspace of linear size which embeds in an ultrametric with lq-distortion bounded by Õ(q)

Metric Ramsey Theorems Key Ingredient: Partitions!

Complete Representation via Ultrametrics? Goal: Given an n point metric space, we would like to embed it into an ultrametric with low distortion. Lower Bound: Ω(n); in fact this holds even for embedding the n-cycle into arbitrary tree metrics [RR 95]

Probabilistic Embedding [Karp 89]: The n-cycle probabilistically-embeds into line metrics with distortion 2: delete a uniformly random edge of the cycle C to obtain a line L. If u,v are adjacent in C then E[dL(u,v)] = ((n-1)/n)·1 + (1/n)·(n-1) = 2(n-1)/n < 2 = 2·dC(u,v)
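
A quick simulation of Karp's embedding (an illustrative sketch with hypothetical helper names): cut the n-cycle at a uniformly random edge to get a line; no distance shrinks, and in expectation no distance grows by more than a factor of 2.

```python
def cycle_dist(u, v, n):
    return min(abs(u - v), n - abs(u - v))

def expected_line_dist(u, v, n):
    """E[d_L(u,v)] over the n equally likely choices of the deleted cycle edge."""
    total = 0.0
    for i in range(n):                                  # delete the edge (i, i+1 mod n)
        pu, pv = (u - i - 1) % n, (v - i - 1) % n       # positions along the resulting line
        total += abs(pu - pv)
    return total / n

n = 100
u, v = 3, 4                                             # adjacent on the cycle
print(expected_line_dist(u, v, n), 2 * cycle_dist(u, v, n))   # 1.98 vs 2
```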

Probabilistic Embedding [B 96,98,04, FRT 03]: Any n-point metric space probabilistically embeds into an ultrametric with distortion Θ(log n) [ABN 05,06, CDGKS 05]: lq-distortion is Θ(q)

Probabilistic Embedding Key Ingredient: Probabilistic Partitions

Probabilistic Partitions P={S1,S2,…,St} is a partition of X; P(x) is the cluster containing x. P is Δ-bounded if diam(Si)≤Δ for all i. A probabilistic partition P is a distribution over a set of partitions. P is (η,δ)-padded if Pr[B(x,ηΔ) ⊆ P(x)] ≥ δ for every x∈X; call P η-padded if δ=1/2. [B 96]: η=Θ(1/log n) [CKR01+FRT03]: η(x)= Ω(1/log ρ(x,Δ))
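
A sketch of a Δ-bounded random partition in the spirit of [CKR01]: pick one random radius in [Δ/4, Δ/2] and a random priority order of the points, and assign every point to the first center within that radius. This is illustrative code with hypothetical names, not the exact construction analyzed on the slide.

```python
import random

def random_partition(points, dist, Delta, seed=0):
    """Return a Delta-bounded partition as a dict mapping each point to its
    cluster's center. dist(u, v) is the metric; the padded-ness analysis of
    such partitions is the content of [CKR01, FRT03]."""
    rng = random.Random(seed)
    r = rng.uniform(Delta / 4.0, Delta / 2.0)   # one random radius for all clusters
    order = list(points)
    rng.shuffle(order)                          # random priority of the centers
    cluster = {}
    for x in points:
        for c in order:                         # the first center within distance r claims x
            if dist(x, c) <= r:
                cluster[x] = c
                break
    return cluster

# Each cluster has radius <= r <= Delta/2, hence diameter <= Delta
```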

Partitions and Embedding [B 96, Rao 99, …] Let Δ = diam(X) and let Δi=4^i be the scales. For each scale i, create a probabilistic Δi-bounded partition Pi that is η-padded. For each cluster choose σi(S)~Ber(½) i.i.d. fi(x)= σi(Pi(x))·d(x,X\Pi(x)) Repeat O(log n) times. Distortion: O(η^-1·log^(1/p) Δ). Dimension: O(log n·log Δ).
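
A simplified sketch of the scale-based coordinates described above: for every scale Δi = 4^i, draw a Δi-bounded random partition, flip σi(S) ~ Ber(½) per cluster, and set fi(x) = σi(Pi(x))·d(x, X\Pi(x)). One repetition per scale, a crude CKR-style partition, and all names are my own simplifications of the [B 96, Rao 99] scheme.

```python
import math
import random

def embed_coordinates(points, dist, seed=0):
    """One coordinate per scale Delta_i = 4^i: f_i(x) = sigma_i(P_i(x)) * d(x, X \\ P_i(x)).
    Simplified sketch: one repetition per scale (the slide repeats O(log n) times)
    and a CKR-style partition; assumes all positive distances are at least 1."""
    rng = random.Random(seed)
    diam = max(dist(u, v) for u in points for v in points)
    scales = [4 ** i for i in range(int(math.log(diam, 4)) + 2)]
    coords = {x: [] for x in points}
    for Delta in scales:
        # Delta-bounded random partition: one random radius plus a random priority order
        r = rng.uniform(Delta / 4.0, Delta / 2.0)
        order = list(points)
        rng.shuffle(order)
        P = {x: next(c for c in order if dist(x, c) <= r) for x in points}
        sigma = {c: rng.randint(0, 1) for c in order}           # sigma_i(S) ~ Ber(1/2)
        for x in points:
            # d(x, X \ P_i(x)): distance from x to the nearest point outside its cluster
            outside = [dist(x, y) for y in points if P[y] != P[x]]
            coords[x].append(sigma[P[x]] * (min(outside) if outside else Delta))
    return coords

# Example on five points of the line with dist(u, v) = |u - v|
pts = [0, 1, 2, 7, 30]
f = embed_coordinates(pts, lambda u, v: abs(u - v))
```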

Time to…

Uniform Probabilistic Partitions In a uniform probabilistic partition η:X→[0,1] all points in a cluster have the same padding parameter. [ABN 06] Uniform partition lemma: There exists a uniform probabilistic Δ-bounded partition such that for each x, η(x)=1/log ρ(v,Δ) for a suitable point v in the cluster of x, where the local growth rate ρ(x,r) of x at radius r compares the size of the ball B(x,r) to that of a ball around x of constant-factor smaller radius.

Embedding into a single dimension Let Δi=4^i. For each scale i, create uniformly padded probabilistic Δi-bounded partitions Pi. For each cluster choose σi(S)~Ber(½) i.i.d. fi(x)= σi(Pi(x))·ηi^-1(x)·d(x,X\Pi(x)) Upper bound: |f(x)-f(y)| ≤ O(log n)·d(x,y). Lower bound: E[|f(x)-f(y)|] ≥ Ω(d(x,y)) Replicate D=Θ(log n) times to get high probability.

Upper Bound: |f(x)-f(y)| ≤ O(log n) d(x,y) For all x,y∈X: - Pi(x)≠Pi(y) implies fi(x) ≤ ηi^-1(x)·d(x,y) - Pi(x)=Pi(y) implies fi(x)-fi(y) ≤ ηi^-1(x)·d(x,y) Use uniform padding in cluster

Lower Bound: Take a scale i such that Δi ≈ d(x,y)/4. It must be that Pi(x)≠Pi(y). With probability ½: ηi^-1(x)·d(x,X\Pi(x)) ≥ Δi

Lower bound: E[|f(x)-f(y)|] ≥ Ω(d(x,y)) Let R be the contribution of all other scales to |f(x)-f(y)|. Two cases: R < Δi/2: with prob. ⅛, σi(Pi(x))=1 and σi(Pi(y))=0; then fi(x) ≥ Δi and fi(y)=0, so |f(x)-f(y)| ≥ Δi/2 = Ω(d(x,y)). R ≥ Δi/2: with prob. ¼, σi(Pi(x))=0 and σi(Pi(y))=0, so fi(x)=fi(y)=0 and |f(x)-f(y)| ≥ R ≥ Δi/2 = Ω(d(x,y)).

Partial Embedding & Scaling Distortion Definition: A (1-ε)-partial embedding has distortion D(ε) if at least a (1-ε) fraction of the pairs satisfy distf(u,v) ≤ D(ε) Definition: An embedding has scaling distortion D(·) if it is a (1-ε)-partial embedding with distortion D(ε), for all ε>0 [KSW 04] [ABN 05, CDGKS 05]: Partial distortion and dimension Θ(log(1/ε)) [ABN06]: Scaling distortion Θ(log(1/ε)) for all metrics

lq-Distortion vs. Scaling Distortion Upper bound D(ε) = c log(1/ε) on scaling distortion: ½ of pairs have distortion ≤ c log 2 = c + ¼ of pairs have distortion ≤ c log 4 = 2c + ⅛ of pairs have distortion ≤ c log 8 = 3c …. Summing the series, average distortion ≤ Σ_{k≥1} 2^-k·ck = 2c = O(1) Worst case distortion = O(log n) lq-distortion = O(min{q, log n})

Coarse Scaling Embedding into Lp Definition: For u∈X, rε(u) is the minimal radius such that |B(u,rε(u))| ≥ εn. Coarse scaling embedding: For each u∈X, preserves distances to v s.t. d(u,v) ≥ rε(u).
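
A small helper spelling out the definition of rε(u) (illustrative code): the minimal radius whose ball around u contains at least εn points.

```python
import math

def r_eps(u, points, dist, eps):
    """Minimal radius r such that the closed ball B(u, r) contains >= eps*n points."""
    d_sorted = sorted(dist(u, v) for v in points)   # d(u, u) = 0 comes first
    k = max(math.ceil(eps * len(points)), 1)        # need at least eps*n points
    return d_sorted[k - 1]

# Example: on ten points of the line, the ball around 0 needs radius 2 to hold 30% of the points
pts = list(range(10))
print(r_eps(0, pts, lambda a, b: abs(a - b), 0.3))   # 2
```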

Scaling Distortion Claim: If d(x,y) ≥ rε(x) then 1 ≤ distf(x,y) ≤ O(log 1/ε) Let l be the scale such that d(x,y) ≤ Δl < 4d(x,y) Lower bound: E[|f(x)-f(y)|] ≥ Ω(d(x,y)) Upper bound for high diameter terms Upper bound for low diameter terms Replicate D=Θ(log n) times to get high probability.

Upper Bound for high diameter terms: |f(x)-f(y)| ≤ O(log 1/ε) d(x,y) Scale l such that rε(x)≤d(x,y) ≤ Δl < 4d(x,y).

Upper Bound for low diameter terms: |f(x)-f(y)| = O(1)·d(x,y) Scale l such that d(x,y) ≤ Δl < 4d(x,y). All lower levels i ≤ l are bounded by Δi.

Embedding into trees with Constant Average Distortion [ABN 07a]: An embedding of any n point metric into a single ultrametric. An embedding of any graph on n vertices into a spanning tree of the graph. Average distortion = O(1). L2-distortion = O(√log n). Lq-distortion = Θ(n^(1-2/q)), for 2<q≤∞

Embedding Metrics in their Intrinsic Dimension Definition: A metric space X has doubling constant λ if any ball with radius r>0 can be covered with λ balls of half the radius. Doubling dimension: dim(X) = log λ [ABN 07b]: Any n point metric space X can be embedded into Lp with distortion O(log^(1+θ) n) and dimension O(dim(X)) Same embedding, using: nets Lovász Local Lemma Distortion-Dimension Tradeoff
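
The doubling constant can be estimated empirically with a greedy covering heuristic; this is my own rough sketch (greedy covering only approximates the true λ), intended to make the definition concrete.

```python
def doubling_constant_estimate(points, dist):
    """Greedy estimate of the doubling constant: for every ball B(u, r), cover it
    greedily with balls of radius r/2 and record the largest count seen."""
    lam = 1
    radii = sorted({dist(u, v) for u in points for v in points if u != v})
    for u in points:
        for r in radii:
            ball = [v for v in points if dist(u, v) <= r]
            centers = []
            for v in ball:                       # greedy r/2-net of the ball
                if all(dist(v, c) > r / 2 for c in centers):
                    centers.append(v)
            lam = max(lam, len(centers))
    return lam   # dim(X) is roughly log_2 of this value

# Example: two small clusters in the plane
pts = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
d = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
print(doubling_constant_estimate(pts, d))
```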

Locality Preserving Embeddings Def: A k-local embedding has distortion D(k) if distf(x,y) ≤ D(k) for every pair x,y of k-nearest neighbors [ABN 07c]: For fixed k, a k-local embedding into Lp with distortion Θ(log k) and dimension Θ(log k) (under a very weak growth bound condition) [ABN 07c]: A k-local embedding into Lp with distortion Õ(log k) on neighbors, for all k simultaneously, and dimension Θ(log n) Same embedding appropriately scaled down Lovász Local Lemma

Summary Unified framework for embedding finite metrics. Probabilistic embedding into ultrametrics. Metric Ramsey theorems. New measures of distortion. Embeddings with strong properties: optimal scaling distortion, constant average distortion, tight distortion-dimension tradeoff, embedding metrics in their intrinsic dimension, embeddings that strongly preserve locality.

Embedding Examples (figure: Euclidean distortion of example spaces)

Embedding in Normed Spaces [Fréchet Embedding]: Any n-point metric space embeds isometrically in L∞ Proof: map x to f(x) = (d(x,w))_{w∈X}; by the triangle inequality |d(x,w)-d(y,w)| ≤ d(x,y) for every coordinate w, with equality at w=y, so ‖f(x)-f(y)‖∞ = d(x,y).
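
The Fréchet embedding is short enough to write out directly (a sketch of the proof idea above): one coordinate per point w, with coordinate w of f(x) equal to d(x,w); the example names are mine.

```python
import numpy as np

def frechet_embedding(D):
    """Isometric embedding into l_infinity: the image of point x is row x of D,
    i.e. coordinate w of f(x) is d(x, w)."""
    return np.asarray(D, dtype=float)

D = np.array([[0, 1, 2, 1],
              [1, 0, 1, 2],
              [2, 1, 0, 1],
              [1, 2, 1, 0]], dtype=float)    # shortest-path metric of the 4-cycle
F = frechet_embedding(D)
# ||f(x)-f(y)||_inf = max_w |d(x,w) - d(y,w)| = d(x,y); the max is attained at w = y
print(np.max(np.abs(F[0] - F[2])), D[0, 2])   # 2.0 2.0
```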

Basic Structure: Ultrametric A labeled tree with leaf labels 0 (Γ(z)=0) and labels non-decreasing toward the root, 0 = Γ(z) ≤ Γ(w) ≤ Γ(v) ≤ Γ(u); the distance between leaves is the label of their lowest common ancestor: d(x,z) = Γ(lca(x,z)) = Γ(v)

Upper Bound Recall fi(x)= σi(Pi(x))·d(x,X\Pi(x)). For all x,y∈X: Pi(x)≠Pi(y) implies d(x,X\Pi(x)) ≤ d(x,y); Pi(x)=Pi(y) implies |d(x,A)-d(y,A)| ≤ d(x,y), where A = X\Pi(x).

Lower bound: Take a scale i such that Δi≈d(x,y)/4. It must be that Pi(x)≠Pi(y) With probability ½ : d(x,X\Pi(x))≥ηΔi With probability ¼ : σi(Pi(x))=1 and σi(Pi(y))=0

Matrixpace Developing a mathematical theory of embeddings of finite metric spaces. Fruitful interaction between computer science and pure/applied mathematics. New concepts of embedding yield surprisingly strong properties.