1
Advances in Metric Embedding Theory. Ofer Neiman, Ittai Abraham, Yair Bartal. Hebrew University.
2
Talk Outline Current results: New method of embedding. New partition techniques. Constant average distortion. Extend notions of distortion. Optimal results for scaling embeddings. Tradeoff between distortion and dimension. Work in progress: Low dimension embedding for doubling metrics. Scaling distortion into a single tree. Nearest neighbors preserving embedding.
3
Embedding Metric Spaces. Metric spaces (X,d_X), (Y,d_Y). An embedding is a function f:X→Y. For a non-contracting embedding f and u,v in X, let dist_f(u,v) = d_Y(f(u),f(v)) / d_X(u,v). The embedding has distortion c if max_{u,v∈X} dist_f(u,v) ≤ c.
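As a concrete illustration, a minimal Python sketch computing the worst-case distortion of a given embedding (the function name and the interface, with d_X, d_Y, f supplied as callables, are illustrative, not from the talk):

```python
import itertools

def distortion(points, d_X, d_Y, f):
    """Worst-case distortion of a non-contracting embedding f: the
    maximum over all pairs of dist_f(u,v) = d_Y(f(u),f(v)) / d_X(u,v),
    which is >= 1 for every pair when f is non-contracting."""
    return max(d_Y(f(u), f(v)) / d_X(u, v)
               for u, v in itertools.combinations(points, 2))
```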
4
Low-Dimension Embeddings into L_p. For an arbitrary metric space on n points: [Bourgain 85]: distortion O(log n). [LLR 95]: distortion Θ(log n), dimension O(log^2 n). Can the dimension be reduced? For p=2, yes: by [JL], down to dimension O(log n). Theorem: embedding into L_p with distortion O(log n) and dimension O(log n), for any p. Theorem: distortion O(log^{1+θ} n), dimension Θ(log n / (θ log log n)).
5
Average Distortion Embeddings. In many practical uses, the quality of an embedding is measured by its average distortion: network embedding, multi-dimensional scaling, biology, vision. Theorem: every n-point metric space can be embedded into L_p with average distortion O(1), worst-case distortion O(log n), and dimension O(log n).
6
Variation on Distortion: the L_q-Distortion of an Embedding. Given a non-contracting embedding f from (X,d_X) to (Y,d_Y), define its L_q-distortion as dist_q(f) = E[dist_f(u,v)^q]^{1/q}, where the expectation is over a pair u,v chosen uniformly at random. The L_∞-distortion is the usual worst-case distortion, and the L_1-distortion is the average distortion. Thm: the L_q-distortion is bounded by O(q).
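The earlier sketch extends directly to the L_q-distortion (again an illustrative helper, not code from the talk):

```python
import itertools

def lq_distortion(points, d_X, d_Y, f, q):
    """L_q-distortion: the q-th moment of dist_f(u,v) over a uniformly
    random pair; q = float('inf') recovers the worst-case distortion
    and q = 1 the average distortion."""
    ratios = [d_Y(f(u), f(v)) / d_X(u, v)
              for u, v in itertools.combinations(points, 2)]
    if q == float('inf'):
        return max(ratios)
    return (sum(r ** q for r in ratios) / len(ratios)) ** (1.0 / q)
```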
7
Partial & Scaling Distortion. Definition: a (1-ε)-partial embedding has distortion D(ε) if at least a (1-ε) fraction of the pairs satisfy dist_f(u,v) ≤ D(ε). Definition: an embedding has scaling distortion D(·) if it is a (1-ε)-partial embedding with distortion D(ε), for all ε>0 simultaneously. [KSW 04]: introduced the problem in the context of network embeddings; initial results. [A+ 05]: partial distortion and dimension O(log(1/ε)) for all metrics; scaling distortion O(log(1/ε)) for doubling metrics. Thm: scaling distortion O(log(1/ε)) for all metrics.
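A hedged way to test these definitions empirically: compute all pair distortions, sort them, and read off quantiles (partial_distortion is an illustrative name; this is a checker, not the embedding itself):

```python
import itertools
import math

def partial_distortion(points, d_X, d_Y, f, eps):
    """Smallest D such that at least a (1-eps) fraction of the pairs
    satisfy dist_f(u,v) <= D: sort all pair distortions and read off
    the (1-eps)-quantile.  Evaluating it at eps = 1/2, 1/4, 1/8, ...
    tests scaling distortion empirically."""
    ratios = sorted(d_Y(f(u), f(v)) / d_X(u, v)
                    for u, v in itertools.combinations(points, 2))
    k = max(1, math.ceil((1 - eps) * len(ratios)))
    return ratios[k - 1]
```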
8
L_q-Distortion vs. Scaling Distortion. An upper bound of O(log 1/ε) on the scaling distortion implies: L_q-distortion = O(min{q, log n}), average distortion = O(1), worst-case distortion = O(log n). For any metric: ½ of the pairs have distortion ≤ c·log 2 = c; another ¼ of the pairs have distortion ≤ c·log 4 = 2c; another ⅛ have distortion ≤ c·log 8 = 3c; ...; another 1/n^2 have distortion ≤ 2c·log n. For ε < 1/n^2 no pairs are ignored, since there are only n(n-1)/2 < n^2 pairs. A lower bound of Ω(log 1/ε) on partial distortion implies: L_q-distortion = Ω(min{q, log n}).
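To see where O(min{q, log n}) comes from, integrate this dyadic decomposition (a back-of-the-envelope version of the argument; c' is an unspecified constant):

E[dist_f(u,v)^q] ≤ Σ_{j≥1} 2^{-j} · (c·j)^q ≤ (c'·q)^q,

hence dist_q(f) = E[dist_f(u,v)^q]^{1/q} = O(q). Since no pair has distortion above 2c·log n, the bound is also O(log n), giving O(min{q, log n}); taking q = 1 recovers average distortion O(1).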
9
Probabilistic Partitions. P = {S_1, S_2, …, S_t} is a partition of X; P(x) denotes the cluster containing x. P is Δ-bounded if diam(S_i) ≤ Δ for all i. A probabilistic partition P is a distribution over a set of partitions. P is η-padded if for every x ∈ X, Pr[B(x, η·Δ) ⊆ P(x)] ≥ ½.
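The talk does not fix a specific construction on this slide; as one standard example, here is a minimal Python sketch of random-radius ball carving, which produces Δ-bounded clusters and yields padding guarantees of the above type under suitable parameters (the construction choice is an assumption for illustration):

```python
import random

def ball_carving_partition(points, d, delta):
    """Random Delta-bounded partition by ball carving: visit centers in
    a random order, each grabbing all still-unassigned points within a
    radius drawn from [delta/4, delta/2].  Every cluster has diameter
    <= 2*(delta/2) = delta; a point far from all cluster boundaries
    is 'padded'."""
    centers = list(points)
    random.shuffle(centers)
    r = random.uniform(delta / 4, delta / 2)   # one shared random radius
    unassigned = set(points)
    clusters = []
    for c in centers:
        cluster = {x for x in unassigned if d(c, x) <= r}
        unassigned -= cluster
        if cluster:
            clusters.append(cluster)
    return clusters
```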
10
Partitions and Embedding. Let Δ_i = 4^i be the scales, where Δ = diam(X). For each scale i, create a probabilistic Δ_i-bounded partition P_i that is η-padded. For each cluster S choose σ_i(S) ~ Ber(½) i.i.d., and set f_i(x) = σ_i(P_i(x)) · d(x, X\P_i(x)). Repeat O(log n) times. Distortion: O(η^{-1} · log^{1/p} Δ). Dimension: O(log n · log Δ).
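A sketch of one block of the embedding built from such partitions, reusing ball_carving_partition from above (the handling of the top scale and the use of ball carving as the partition are assumptions for illustration):

```python
import math
import random

def embed_block(points, d, diam):
    """One block of embedding coordinates: coordinate i of point x is
    f_i(x) = sigma_i(P_i(x)) * d(x, X \\ P_i(x)) for scale
    Delta_i = 4**i, up to diam(X).  Repeating the block O(log n) times
    with fresh randomness gives dimension O(log n * log Delta)."""
    top = int(math.ceil(math.log(diam, 4)))
    coords = {x: [] for x in points}
    for i in range(top + 1):
        delta = 4 ** i
        for cluster in ball_carving_partition(points, d, delta):
            sigma = random.randint(0, 1)           # sigma_i(S) ~ Ber(1/2)
            outside = [y for y in points if y not in cluster]
            for x in cluster:
                # d(x, X \ P_i(x)); at the top scale the cluster may be
                # all of X -- cap the value at Delta_i by convention
                dist_out = min(d(x, y) for y in outside) if outside else delta
                coords[x].append(sigma * dist_out)
    return coords
```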
11
Upper Bound. Recall f_i(x) = σ_i(P_i(x)) · d(x, X\P_i(x)). For all x,y ∈ X: P_i(x) ≠ P_i(y) implies d(x, X\P_i(x)) ≤ d(x,y), since y itself lies outside P_i(x); P_i(x) = P_i(y) implies |d(x,A) - d(y,A)| ≤ d(x,y) for A = X\P_i(x), by the triangle inequality. In both cases |f_i(x) - f_i(y)| ≤ d(x,y).
12
Lower Bound. Take a scale i such that Δ_i ≈ d(x,y)/4. It must be that P_i(x) ≠ P_i(y), since the Δ_i-bounded cluster P_i(x) cannot contain both x and y. With probability ½ (padding): d(x, X\P_i(x)) ≥ ηΔ_i. With probability ¼: σ_i(P_i(x)) = 1 and σ_i(P_i(y)) = 0, in which case f_i(x) ≥ ηΔ_i and f_i(y) = 0.
13
η-Padded Partitions. The parameter η determines the quality of the embedding. [Bartal 96]: η = Ω(1/log n) for any metric space. [Rao 99]: η = Ω(1), used to embed planar metrics into L_2. [CKR 01 + FRT 03]: improved partitions with η(x) = log^{-1}(ρ(x,Δ)). [KLMN 03]: used to embed general and doubling metrics into L_p with distortion O(η^{-(1-1/p)} · log^{1/p} n), dimension O(log^2 n). Here the local growth rate of x at radius r is the ratio of the sizes of balls around x at comparable radii, ρ(x,r) = |B(x,2r)| / |B(x,r/2)| (up to the choice of constants).
14
Uniform Probabilistic Partitions. In a uniform probabilistic partition, the padding parameter is a function η: X → [0,1], and all points in a cluster share the same value of η. Uniform partition lemma: there exists a uniform probabilistic Δ-bounded partition such that for every x, η(x) = log^{-1} ρ(v,Δ), where v is the point of x's cluster minimizing the local growth rate ρ(·,Δ).
15
Embedding into One Dimension. Let Δ_i = 4^i. For each scale i, create a uniformly padded probabilistic Δ_i-bounded partition P_i. For each cluster choose σ_i(S) ~ Ber(½) i.i.d., and set f_i(x) = σ_i(P_i(x)) · η_i^{-1}(x) · d(x, X\P_i(x)); the single coordinate is f(x) = Σ_i f_i(x). 1. Upper bound: |f(x) - f(y)| ≤ O(log n) · d(x,y). 2. Lower bound: E[|f(x) - f(y)|] ≥ Ω(d(x,y)). 3. Replicate D = Θ(log n) times to get high probability.
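The one-dimensional variant differs from the earlier block only in that the scales are summed into a single coordinate and weighted by η_i^{-1}(x); a sketch, again reusing ball_carving_partition, with eta_inv a hypothetical helper standing in for the uniform-padding lemma:

```python
import math
import random

def embed_one_dim(points, d, diam, eta_inv):
    """Single coordinate f(x) = sum_i sigma_i(P_i(x)) * eta_i^{-1}(x) *
    d(x, X \\ P_i(x)).  eta_inv(x, i) is assumed to return the inverse
    padding parameter of x's cluster at scale i (hypothetical helper)."""
    top = int(math.ceil(math.log(diam, 4)))
    f = {x: 0.0 for x in points}
    for i in range(top + 1):
        delta = 4 ** i
        for cluster in ball_carving_partition(points, d, delta):
            if random.randint(0, 1):               # sigma_i(S) ~ Ber(1/2)
                outside = [y for y in points if y not in cluster]
                for x in cluster:
                    dist_out = min(d(x, y) for y in outside) if outside else delta
                    f[x] += eta_inv(x, i) * dist_out
    return f
```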
16
Upper Bound: |f(x) - f(y)| ≤ O(log n) · d(x,y). For all x,y ∈ X: P_i(x) ≠ P_i(y) implies f_i(x) ≤ η_i^{-1}(x) · d(x,y); P_i(x) = P_i(y) implies |f_i(x) - f_i(y)| ≤ η_i^{-1}(x) · d(x,y), using the uniform padding within the cluster (x and y share the same η_i).
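Where the O(log n) then comes from: the inverse padding parameters telescope across scales. Under an illustrative normalization of the local growth rate, ρ(x,Δ_i) = |B(x,2Δ_i)| / |B(x,Δ_i/2)| (the exact radii are an assumption here, not from the talk):

Σ_i η_i^{-1}(x) = Σ_i log ρ(x,Δ_i) = Σ_i [log |B(x,2Δ_i)| - log |B(x,Δ_i/2)|] = O(log n),

because with Δ_i = 4^i we have 2Δ_i = Δ_{i+1}/2, so each term cancels against its neighbor and only the boundary terms survive, each at most log n.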
17
Lower Bound. Take a scale i such that Δ_i ≈ d(x,y)/4. It must be that P_i(x) ≠ P_i(y). With probability ½ (the padding event): η_i^{-1}(x) · d(x, X\P_i(x)) ≥ Δ_i, so f_i(x) ≥ Δ_i whenever σ_i(P_i(x)) = 1.
18
Lower Bound: E[|f(x) - f(y)|] ≥ Ω(d(x,y)). Let R = |Σ_{j≠i} (f_j(x) - f_j(y))| be the contribution of all the other scales. Two cases: 1. R < Δ_i/2: with probability ⅛, σ_i(P_i(x)) = 1, σ_i(P_i(y)) = 0, and x is padded; then f_i(x) ≥ Δ_i and f_i(y) = 0, so |f(x) - f(y)| ≥ Δ_i - R ≥ Δ_i/2 = Ω(d(x,y)). 2. R ≥ Δ_i/2: with probability ¼, σ_i(P_i(x)) = σ_i(P_i(y)) = 0; then f_i(x) = f_i(y) = 0, so |f(x) - f(y)| = R ≥ Δ_i/2 = Ω(d(x,y)).
19
Coarse Scaling Embedding into L_p. Definition: for u ∈ X, r_ε(u) is the minimal radius such that |B(u, r_ε(u))| ≥ εn. Coarse scaling embedding: for each u ∈ X, preserves the distances outside B(u, r_ε(u)).
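Computing r_ε(u) is a one-liner over the sorted distances (a sketch; r_eps is an illustrative name):

```python
import math

def r_eps(u, points, d, eps):
    """Minimal radius r with |B(u, r)| >= eps * n: the distance from u
    to its ceil(eps*n)-th nearest neighbor, counting u itself."""
    k = max(1, math.ceil(eps * len(points)))
    return sorted(d(u, x) for x in points)[k - 1]
```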
20
Scaling Distortion. Claim: if d(x,y) > r_ε(x) then 1 ≤ dist_f(x,y) ≤ O(log 1/ε). Let l be the scale such that d(x,y) ≤ Δ_l < 4·d(x,y). The proof has four parts: 1. Lower bound: E[|f(x) - f(y)|] ≥ d(x,y). 2. Upper bound for high-diameter terms. 3. Upper bound for low-diameter terms. 4. Replicate D = Θ(log n) times to get high probability.
21
Upper Bound for High-Diameter Terms: |f(x) - f(y)| ≤ O(log 1/ε) · d(x,y). Take the scale l such that r_ε(x) ≤ d(x,y) ≤ Δ_l < 4·d(x,y). Since |B(x, Δ_l)| ≥ εn, the telescoping sum of η_i^{-1}(x) over the scales i ≥ l is O(log(n/|B(x,Δ_l)|)) = O(log 1/ε).
22
Upper Bound for Low-Diameter Terms: |f(x) - f(y)| ≤ O(1) · d(x,y). Take the scale l such that d(x,y) ≤ Δ_l < 4·d(x,y). Every lower level i ≤ l contributes at most Δ_i.
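Summing the per-level bounds is just a geometric series:

Σ_{i≤l} Δ_i = Σ_{i≤l} 4^i < (4/3)·Δ_l < (16/3)·d(x,y),

so the low-diameter levels together contribute O(1)·d(x,y).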
23
Embedding into L_p. A partition P is (η,δ)-padded if for every x ∈ X, Pr[B(x, η(x)·Δ) ⊆ P(x)] ≥ δ. Lemma: there exist (η,δ)-padded partitions with η(x) = log^{-1}(ρ(v,Δ)) · log(1/δ), where v = argmin_{u∈P(x)} ρ(u,Δ). Hierarchical partition: every cluster at level i refines a cluster at level i+1. Theorem: every n-point metric space can be embedded into L_p with dimension O(e^p · log n) such that for every q the L_q-distortion is O(min{q, log n}).
24
Embedding into L_p. Embedding into L_p with scaling distortion: use partitions with a small probability of padding, δ = e^{-p}; hierarchical uniform partitions; a combination with Matousek's sampling techniques.
25
Low Dimension Embeddings. Embedding with distortion O(log^{1+θ} n) and dimension Θ(log n / (θ log log n)): an optimal trade-off between distortion and dimension. Use partitions with a high probability of padding, δ = 1 - log^{-θ} n.
26
Additional Results: Weighted Averages. Embedding with weighted average distortion O(log Ψ) for weights with aspect ratio Ψ. Algorithmic applications: sparsest cut, uncapacitated quadratic assignment, multiple sequence alignment.
27
Low Dimension Embeddings for Doubling Metrics. Definition: a metric space has doubling constant λ if every ball of radius r > 0 can be covered by λ balls of half the radius; the doubling dimension is log λ. [GKL 03]: embedding doubling metrics, with tight distortion. Thm: embedding arbitrary metrics into L_p with distortion O(log^{1+θ} n) and dimension O(log λ). Same embedding, with similar techniques: uses nets (a greedy construction is sketched below) and the Lovász Local Lemma. Thm: embedding arbitrary metrics into L_p with distortion O(log^{1-1/p} λ · log^{1/p} n) and dimension Õ(log n · log λ); uses hierarchical partitions as well.
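The talk only names nets as a tool; for concreteness, a minimal sketch of the standard greedy r-net construction:

```python
def greedy_net(points, d, r):
    """Greedy r-net: a maximal subset of points that are pairwise more
    than r apart; every point of the space is then within r of some
    net point.  In a metric with doubling constant lambda, any ball of
    radius R contains at most lambda^O(log(R/r)) points of an r-net,
    which is what makes nets useful for dimension reduction here."""
    net = []
    for x in points:
        if all(d(x, y) > r for y in net):
            net.append(x)
    return net
```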
28
Scaling Distortion into Trees. [A+ 05]: probabilistic embedding into a distribution of ultrametrics with scaling distortion O(log(1/ε)). Thm: embedding into an ultrametric with scaling distortion O(√(1/ε)). Thm: every graph contains a spanning tree with scaling distortion O(√(1/ε)). These imply: average distortion = O(1), L_2-distortion = O(√log n). Can be viewed as a network design objective. Thm: probabilistic embedding into a distribution of spanning trees with scaling distortion Õ(log^2(1/ε)).
29
New Results: Nearest-Neighbors Preserving Embeddings. Definition: x,y are k-nearest neighbors if |B(x, d(x,y))| ≤ k. Thm: embedding into L_p with distortion Õ(log k) on k-nearest neighbors, for all k simultaneously, and dimension O(log n). Thm: for fixed k, embedding into L_p with distortion O(log k) and dimension O(log k). Practically the same embedding: every level is scaled down, higher levels more aggressively; uses the Lovász Local Lemma.
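The k-nearest-neighbor relation in the definition is straightforward to check directly (a sketch with illustrative names):

```python
def are_k_nearest_neighbors(x, y, points, d, k):
    """x and y are k-nearest neighbors when the closed ball around x of
    radius d(x,y) holds at most k points, i.e. at most k points of the
    space are as close to x as y is."""
    return sum(1 for z in points if d(x, z) <= d(x, y)) <= k
```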
30
Nearest-Neighbors Preserving Embeddings. Thm: probabilistic embedding into a distribution of ultrametrics with distortion Õ(log k) for all k-nearest neighbors. Thm: embedding into an ultrametric with distortion k-1 for all k-nearest neighbors. Applications: sparsest cut with "neighboring" demand pairs; approximate ranking / k-nearest-neighbor search.
31
Conclusions Unified framework for embedding arbitrary metrics. New measures of distortion. Embeddings with improved properties: Optimal scaling distortion. Constant average distortion. Tight distortion-dimension tradeoff. Embedding metrics in their doubling dimension. Nearest-neighbors preserving embedding. Constant average distortion spanning trees.