The Intrinsic Dimension of Metric Spaces Anupam Gupta Carnegie Mellon University 3ème cycle romand de Recherche Opérationnelle, 2010 Seminar, Lecture #4
Metric space M = (V, d): a set V of points; symmetric, non-negative distances d(x,y); triangle inequality d(x,y) ≤ d(x,z) + d(z,y); d(x,x) = 0.
in the previous two lectures… We saw: every metric is (almost) a (random) tree; every metric is (almost) a Euclidean metric; every Euclidean metric is (almost) a low-dimensional metric. Today: how to measure the complexity of a metric space, and how to do “better” dimension reduction
“better” dimension reduction Can we do “better” than random projections?
tight bounds on dimension reduction Theorem [Johnson Lindenstrauss 1984]: Given any n-point subset V of Euclidean space, one can map it into O(log n/ε²)-dimensional space while incurring distortion at most (1+ε). Theorem [Alon]: a lower bound of Ω(log n/(ε² log ε⁻¹)) dimensions, again for the “uniform/equidistant” metric U_n
so what do we do now? Note that these are uniform results (“all n-point Euclidean metrics have property blah(n)…”). Can we give a better “per-instance” guarantee (“for any n-point Euclidean metric M, we get property blah(M)…”)? Note: if the metric contains a copy of the equidistant metric U_t, we know we need Ω(log t) dimensions if we want O(1) distortion. Similar lower bounds hold if our metric contains almost-equidistant metrics. But are these the only obstructions to low-dimensionality?
more nuanced results To paraphrase the recently proven results: if a Euclidean metric embeds into R^k for some dimension k with distortion O(1), then we can find an embedding into R^{O(k)} with distortion O(log n) [Chan G. Talwar ’08, Abraham Bartal Neiman ’08]. JL says: we can find an embedding into R^{O(log n)}.
here’s how we do it Come up with convenient ways of capturing the property that there are no large near-equidistant metrics. Formally: define a notion of “intrinsic dimension” of a point set in Euclidean space. In fact, our definitions will hold for any metric space, not just for point sets in Euclidean space, and we will make use of this generality to get good algorithms…
rest of this talk Give two definitions of metric dimension: a) pointwise dimension b) doubling dimension Develop good algorithms for metrics with low pointwise/doubling dimension. Also, get “better” dimension reduction for metrics with low doubling dimension
pointwise dimension The pointwise dimension of a metric is at most k if for all points x and all radii r, (number of points within distance 2r of x) ≤ 2^k · (number of points within distance r of x)
pointwise dimension captures sets of regularly spaced points in R^k
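To make the definition concrete, here is a minimal Python sketch (mine, not from the lecture) that estimates the pointwise dimension of a finite point set from its distance matrix; the function names and the sampled radii are illustrative assumptions.

```python
import numpy as np

def ball_size(dist, x, r):
    """Number of points within distance r of point x (including x itself)."""
    return int(np.sum(dist[x] <= r))

def pointwise_dimension(dist, radii):
    """Smallest k such that |B(x, 2r)| <= 2^k * |B(x, r)| for all x and all tested radii r.

    dist: n x n matrix of pairwise distances; radii: radii at which to test the growth
    condition (the true definition quantifies over all radii)."""
    n = dist.shape[0]
    k = 0.0
    for x in range(n):
        for r in radii:
            ratio = ball_size(dist, x, 2 * r) / ball_size(dist, x, r)
            k = max(k, np.log2(ratio))
    return k

# Example: regularly spaced points on a line (pointwise dimension about 1).
pts = np.arange(16, dtype=float)
dist = np.abs(pts[:, None] - pts[None, :])
print(pointwise_dimension(dist, radii=[1, 2, 4]))
```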
using it for near-neighbor searching Given a metric space M = (X, d) and a subset S of X, preprocess it so that given a query point q we can return a point in S (approximately) closest to q.
using it for near-neighbor searching Query point q. Suppose I know a point x at distance d(q,x) = r. If I find a point y at distance ≤ r/2 from q, I have made progress. Allowed operations: we can sample points from “balls” around any point in the original point set S (i.e., sample from S ∩ Ball(point x, radius r)).
algorithm by picture Query point q. Suppose I know a point x at distance d(q,x) = r. Want to find a point y within distance r/2 of q. [Figure: x at distance r from q; the target ball B(q, r/2) and the sampling ball B(x, 3r/2).]
algorithm by picture Query point q. Suppose I know a point x at distance d(q,x) = r. Want to find a point y within distance r/2 of q. Suppose we sample from B(x, 3r/2) to find such a point y. Bad case: most points in this ball lie close to x. Not possible! This would cause a high “growth rate”!
algorithm by picture |B(x, 3r/2)| ≤ |B(q, 5r/2)| ≤ |B(q, 4r)| ≤ 2^{3k} |B(q, r/2)|, so the probability of hitting a good point is ≥ 1/2^{3k}. Near-neighbor algorithm: 1. Let x = closest point seen so far, with distance r = d(x,q). 2. Pick 2^{3k} samples from B(x, 3r/2). 3. If we see some point at distance ≤ r/2 from q, go to step 1; else output the closest point seen so far.
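Here is a small Python sketch of the sampling loop just described; it is my own illustration, not the authors' code. It assumes we can sample uniformly from S ∩ B(x, 3r/2) (here simulated by filtering an explicit point list) and uses 2^{3k} samples per round, as in the analysis.

```python
import random

def near_neighbor(points, dist, q, k, start):
    """Sampling-based approximate near-neighbor search in a metric of pointwise dimension k.

    points: list of points of S; dist(a, b): the metric; q: query point; start: any point of S.
    Each round samples from S intersected with B(x, 3r/2) and tries to halve the distance to q;
    if a round makes no progress, we stop and report the closest point seen (a sketch, so the
    output is only approximately closest)."""
    x = start
    r = dist(x, q)
    samples_per_round = 2 ** (3 * k)
    while r > 0:
        ball = [p for p in points if dist(p, x) <= 1.5 * r]   # S ∩ B(x, 3r/2)
        improved = False
        for _ in range(samples_per_round):
            y = random.choice(ball)
            if dist(y, q) <= r / 2:                            # progress: distance to q halved
                x, r = y, dist(y, q)
                improved = True
                break
        if not improved:
            break                                              # no progress this round: stop
    return x

# Toy usage: points on a line with the absolute-difference metric (pointwise dimension about 1).
pts = list(range(100))
d = lambda a, b: abs(a - b)
print(near_neighbor(pts, d, q=37.3, k=1, start=0))
```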
pointwise dimension used widely Used to model “tractable/simple” networks by [Kamoun Kleinrock], [Kleinberg Tardos], [Plaxton Rajaraman Richa], [Karger Ruhl], [Beygelzimer Kakade Langford]…
but… Drawback: the definition is not closed under taking subsets. (A subset of a low-dimensional metric can be high dimensional.) [Figure: a 2-dimensional set with a high-dimensional subset.]
and that’s not all Pointwise dimension is a somewhat restrictive notion (it is unclear whether “real” metrics have low pointwise dimension). We would like something a bit more robust and general.
new notion: doubling dimension The doubling dimension dim_D(M) is at most k if any set S of diameter D_S can be covered by 2^k sets of diameter ½·D_S.
doubling generalizes geometric dimension Take k-dimensional Euclidean space R^k. Claim: dim_D(R^k) = Θ(k). Easy to see for boxes (2^3 boxes cover a twice-larger box in R^3); the argument for spheres is a bit more involved.
“doubling metrics” Dimension at most k if every set S with diameter D_S can be covered by 2^k sets of diameter ½·D_S. A family of metric spaces is called “doubling” if there exists a constant k such that the doubling dimension of these metrics is bounded by k.
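As an illustration (my own sketch, not from the lecture), one can upper-bound the doubling dimension of a finite metric by greedily covering balls of radius 2r with balls of radius r; this uses the ball-based version of the definition, which agrees with the diameter-based one above up to constant factors in k.

```python
import numpy as np

def greedy_cover_size(dist, center, r):
    """Greedily cover B(center, 2r) with balls of radius r centered at points of the set.

    Returns the number of balls used; the greedily chosen centers form an r-net of the ball."""
    ball = np.where(dist[center] <= 2 * r)[0]
    uncovered = set(ball.tolist())
    count = 0
    while uncovered:
        c = uncovered.pop()                               # pick any uncovered point as a center
        uncovered -= {p for p in uncovered if dist[c, p] <= r}
        count += 1
    return count

def doubling_dimension_upper_bound(dist, radii):
    """log2 of the largest greedy cover number over all centers and tested radii.

    Upper-bounds the cover numbers at the tested radii; the ball-based doubling
    dimension quantifies over all radii."""
    n = dist.shape[0]
    worst = 1
    for x in range(n):
        for r in radii:
            worst = max(worst, greedy_cover_size(dist, x, r))
    return float(np.log2(worst))

# Example: an 8x8 planar grid; the bound should come out as a small constant.
xs, ys = np.meshgrid(np.arange(8), np.arange(8))
pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
print(doubling_dimension_upper_bound(dist, radii=[1.0, 2.0]))
```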
what is not a doubling metric? The equidistant metric U_t on t points has doubling dimension Θ(log t). Hence low doubling dimension captures the fact that the metric does not contain large equidistant metrics.
doubling generalizes pointwise dim Fact: If a metric has pointwise dimension k, it has doubling dimension O(k)
the picture thus far…
rest of this talk Give two definitions of metric dimension: a) pointwise dimension b) doubling dimension Develop good algorithms for metrics with low pointwise/doubling dimension. Also get “better” dimension reduction for metrics with low doubling dimension
some useful properties of doubling dimension
small near-equidistant metrics (restated) Fact: Suppose a metric (X,d) has doubling dimension k. If a subset S ⊆ X has all inter-point distances lying between δ and Δ, then there are at most (Δ/δ)^{O(k)} points in S. [Figure: a 2-dimensional set with spacing δ and diameter Δ has ≈ (Δ/δ)² points.]
the (simple) proof
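The slide's figure is not reproduced here; the following reconstructs the standard halving argument in LaTeX.

```latex
% Apply the doubling property t times: S (diameter <= \Delta) is covered by
% 2^{kt} sets of diameter <= \Delta / 2^t.  Once \Delta / 2^t < \delta, each
% such set contains at most one point of S, since inter-point distances are >= \delta.
\[
  |S| \;\le\; 2^{\,k\,(\lceil \log_2(\Delta/\delta) \rceil + 1)}
      \;=\; (\Delta/\delta)^{O(k)}.
\]
```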
advantages of this fact Theorem: Doubling metrics admit O(dim_D(M))-padded decompositions. Useful wherever padded decompositions are useful. E.g., one can prove that all doubling metrics embed into ℓ2 with distortion O(√(dim_D(M) · log n)).
btw, just to check Natural Q: Do all doubling metrics embed into ℓ2 with distortion O(1)?
a substantial generalization Many geometric algorithms can be extended to doubling spaces: near-neighbor search, compact routing, distance labeling, network triangulation, sensor placement, small-world networks, traveling salesman, sparse spanners, approximate inference, network design, clustering problems, well-separated pair decompositions, data structures, learnability, …
example application: distance labeling Assign labels L(x) to each host x in a metric space so that, looking just at L(x) and L(y), we can infer the distance d(x,y), i.e., f(L(x), L(y)) ≈ d(x,y). Result: labels with (O(1)/ε)^{dim} × log n bits that estimate distances within a (1 + ε) factor. Contrast with the lower bound of n-bit labels in general metrics for any factor < 2.
another example [Arora ’95] showed that TSP on R^k is (1+ε)-approximable in time n·(log n)^{O(1)} for any fixed k and ε. [Talwar ’04] extended this result to metrics with doubling dimension k.
example in action: sparse spanners for doubling metrics [Chan G. Maggs Zhou]
spanners Given a metric M = (V, d), a graph G = (V, E) is an (m, ε)-spanner if 1) the number of edges in G is m, and 2) d(x,y) ≤ d_G(x,y) ≤ (1 + ε)·d(x,y). A reasonable goal(?): ε = 0.1, m = O(n). Fact: for the equidistant metric U_n, if ε < 1 then G = K_n.
spanners for doubling metrics Theorem [Chan G. Maggs Zhou]: Given any metric M and any ε < ½, we can efficiently find an (m, ε)-spanner G with m = n·(1 + 1/ε)^{O(dim_D(M))}. Hence, for doubling metrics, linear-sized spanners! Independently proved by [Har-Peled Mendel], who give a better runtime. Generalizes a similar theorem for Euclidean metrics due to [Arya Das Narasimhan].
standard tool: nets Nets: a set of points N is an r-net of a set S if (i) d(u,v) ≥ r for any u, v ∈ N, and (ii) for every w ∈ S \ N, there is a u ∈ N with d(u,w) < r.
standard tool: nets Fact: if a metric has doubling dimension k and N is an r-net, then for any point x, |B(x, 2r) ∩ N| ≤ O(1)^k.
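Nets are easy to build greedily; below is a minimal Python sketch (names are mine), quadratic in the number of points.

```python
def r_net(points, dist, r):
    """Greedy r-net: the returned points are pairwise >= r apart, and every input point
    lies within distance < r of some net point."""
    net = []
    for p in points:
        if all(dist(p, u) >= r for u in net):
            net.append(p)
    return net

# Example: unit-spaced points on a line; a 2-net keeps every other point.
pts = list(range(10))
print(r_net(pts, lambda a, b: abs(a - b), r=2))   # -> [0, 2, 4, 6, 8]
```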
recursive nets Suppose all the points are at least unit distance apart. Take a 2-net N_1 of these points; now take a 4-net N_2 of this net; and so on… [Figure: nested nets at scales 2, 4, 8, 16.]
recursive nets N_0 = V; N_t is a 2^t-net of the set N_{t-1}; hence N_t is (almost) a 2^{t+1}-net of V.
the spanner construction N_0 = V; N_t is a 2^t-net of the set N_{t-1}; N_t is (almost) a 2^{t+1}-net of V.
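The construction details were on the slides' figures; the following Python sketch shows only the standard shape of such a construction (nested nets plus edges between nearby net points at each level), reusing the greedy net from the previous sketch. The parameter gamma, roughly 1/ε, and the constants are placeholders of mine, not the ones from the Chan-Gupta-Maggs-Zhou paper.

```python
def r_net(points, dist, r):
    """Greedy r-net (same helper as in the earlier sketch)."""
    net = []
    for p in points:
        if all(dist(p, u) >= r for u in net):
            net.append(p)
    return net

def doubling_spanner(points, dist, levels, gamma):
    """Sketch of a net-based spanner: N_0 = V, N_t = a 2^t-net of N_{t-1};
    connect every pair of level-t net points within distance gamma * 2^t.

    gamma (roughly 1/epsilon) trades sparsity against stretch. Illustrative only;
    the exact rule and constants of the paper differ."""
    edges = set()
    net = list(points)                            # N_0 = V (points assumed >= 1 apart)
    for t in range(levels + 1):
        if t > 0:
            net = r_net(net, dist, 2 ** t)        # N_t is a 2^t-net of N_{t-1}
        for i, u in enumerate(net):
            for v in net[i + 1:]:
                if dist(u, v) <= gamma * (2 ** t):
                    edges.add((u, v))             # edges between nearby net points at level t
    return edges

# Toy usage: 16 unit-spaced points on a line; prints the number of spanner edges.
pts = list(range(16))
print(len(doubling_spanner(pts, lambda a, b: abs(a - b), levels=4, gamma=4)))
```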
the sparsity Roughly: by the packing fact, each net point in N_t has at most (O(1/ε))^{O(dim_D(M))} other level-t net points within distance O(2^t/ε), and charging the edges carefully across the levels gives m = n·(1 + 1/ε)^{O(dim_D(M))}.
the stretch Roughly: to connect x and y, walk up the net hierarchy from both sides to the level t where 2^t ≈ ε·d(x,y); the two net points reached are close enough to be joined by a spanner edge, and the climbing detours cost only an O(ε) fraction of d(x,y).
rest of this talk Give two definitions of metric dimension: a) pointwise dimension b) doubling dimension Develop good algorithms for metrics with low pointwise/doubling dimension. Also get “better” dimension reduction for metrics with low doubling dimension
want to improve on J-L Theorem [1984]: Given any n-point subset V of Euclidean space, one can map it into O(log n/ε²)-dimensional space while incurring distortion at most (1+ε). We want the map to depend on the metric space. (JL just computes a random projection.) Fact: we need a non-linear map.
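For reference, here is a minimal numpy sketch of the random projection that JL uses; the map is oblivious, looking only at the shape of the data, which is exactly the behavior we want to improve on. The constant 8 in the target dimension is an illustrative choice of mine, not a tight constant.

```python
import numpy as np

def jl_project(X, eps, seed=0):
    """Johnson-Lindenstrauss style random projection.

    X: (n, d) array of points. Projects onto k = O(log n / eps^2) dimensions using a
    random Gaussian matrix scaled by 1/sqrt(k); with high probability all pairwise
    distances are preserved up to a (1 +/- eps) factor. Note the map never inspects
    the geometry of X, only its shape."""
    n, d = X.shape
    k = int(np.ceil(8 * np.log(n) / eps ** 2))       # the constant 8 is illustrative
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ G

# Quick check on random points: measure the worst pairwise distance distortion.
X = np.random.default_rng(1).standard_normal((100, 1000))
Y = jl_project(X, eps=0.25)
orig = np.linalg.norm(X[:, None] - X[None, :], axis=2)
proj = np.linalg.norm(Y[:, None] - Y[None, :], axis=2)
mask = orig > 0
print(np.max(np.abs(proj[mask] / orig[mask] - 1)))   # typically well below 0.25
```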
back to dimensionality reduction To paraphrase the best currently-known results: if a Euclidean metric embeds into R^k for some dimension k with distortion O(1), then the metric has doubling dimension O(k), and we can find an embedding into R^{O(k)} with distortion O(log n).
new: “per-instance” bounds Theorem [Chan G. Talwar]: Any metric with doubling dimension k embeds into Euclidean space with T dimensions, for any T ∈ [k log log n, log n], with distortion depending on T (special cases on the next slide). Independently, [Abraham Bartal Neiman] obtain similar results.
special cases of interest A comparison of the distortion using O(dim_D(M)) Euclidean dimensions vs. using O(log n) dimensions, for general metrics and for Euclidean metrics. Using O(log n) dimensions: this generalizes the result we talked about in Lecture #2 (any metric embeds into Euclidean space with O(log n) distortion); if the metric is doubling, this quantity is √(log n), and in general it is never more than O(log n); for Euclidean metrics this is just the Johnson-Lindenstrauss lemma. Using O(dim_D(M)) dimensions again generalizes the previous result.
thank you!