Ultra-low-dimensional embeddings of doubling metrics
T-H. Hubert Chan (Max Planck Institute, Saarbrücken), Anupam Gupta (Carnegie Mellon University), Kunal Talwar (Microsoft Research SVC)
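Doubling Dimension
Def: A metric has doubling dimension k if k is the smallest value such that every ball of radius 2R can be covered by 2^k balls of radius R.
ℝ^k has doubling dimension Θ(k), so this is an abstract analog of Euclidean dimension.
[Figure: a ball of radius 2R covered by balls of radius R]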
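The covering definition can be tried out on a finite point set. The following is a minimal sketch, not a method from the talk: it assumes Euclidean points, restricts ball centers to the point set, and uses a greedy cover, so it only gives an upper estimate of the true covering number. The names `greedy_cover_size` and `doubling_dimension_estimate` are hypothetical.

```python
import math
from itertools import product


def greedy_cover_size(ball, radius):
    """Count the radius-`radius` balls a greedy pass uses to cover `ball`.

    Centers are restricted to points of `ball` (an assumption), so the count
    is an upper estimate of the optimal cover size, not the exact value.
    """
    uncovered = list(ball)
    count = 0
    while uncovered:
        center = uncovered[0]
        # Drop every point that the new ball around `center` covers.
        uncovered = [p for p in uncovered if math.dist(center, p) > radius]
        count += 1
    return count


def doubling_dimension_estimate(points, radii):
    """log2 of the largest greedy cover found over all centers and radii."""
    worst = 1
    for center in points:
        for r in radii:
            ball = [p for p in points if math.dist(center, p) <= 2 * r]
            worst = max(worst, greedy_cover_size(ball, r))
    return math.log2(worst)


# An 8 x 8 grid in the plane: the estimate stays bounded as the grid grows.
grid = [(float(x), float(y)) for x, y in product(range(8), repeat=2)]
estimate = doubling_dimension_estimate(grid, radii=[1.0, 2.0])
```

Because the greedy centers are pairwise more than R apart and lie in a ball of radius 2R, a packing argument bounds the count by a constant for planar points, which is why the estimate does not grow with the number of grid points.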
Doubling metric
Def: A doubling metric is an n-point metric whose doubling dimension is a constant (independent of n).
Advantage: the definition is robust; it survives small perturbations of the points.
Points on a constant-dimensional manifold have constant doubling dimension! (Even with small noise.)
Low-Distortion Embedding
A map f: (X,d) → ℝ^m such that for all pairs x, y ∈ X:
d(x,y) ≤ ‖f(x) − f(y)‖₂ ≤ C · d(x,y)
C is the "distortion" of the embedding. Small C ⇒ f faithfully represents (X,d).
Goal: Given an n-point metric space (X,d) with doubling dimension k, find an embedding into ℝ^m with small distortion.
Ideally: dimension m and distortion C should both be O(k), independent of n, when (X,d) is Euclidean.
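To make the definition concrete, here is a small sketch that measures the distortion of a given map on a finite metric. It is illustrative only: the function name `distortion` and the 4-cycle example are assumptions, not taken from the talk. It computes the worst expansion times the worst contraction, which equals the smallest C in the inequality above once f is rescaled to be non-contracting.

```python
import math
from itertools import combinations


def distortion(points, d, f):
    """Worst expansion times worst contraction of f over all pairs.

    Assumes f is injective, so no pair of distinct points collapses.
    """
    expansion = 0.0
    contraction = 0.0
    for x, y in combinations(points, 2):
        ratio = math.dist(f(x), f(y)) / d(x, y)
        expansion = max(expansion, ratio)
        contraction = max(contraction, 1.0 / ratio)
    return expansion * contraction


# Shortest-path metric of a 4-cycle, mapped to the corners of a unit square.
cycle = [0, 1, 2, 3]


def cycle_metric(x, y):
    gap = abs(x - y)
    return min(gap, 4 - gap)


corners = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
c4_distortion = distortion(cycle, cycle_metric, corners.get)
```

Adjacent vertices keep distance 1, while opposite vertices go from distance 2 to √2, so this particular map has distortion √2.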
Our Results
Dimension-Distortion Trade-off Theorem: Take any (not necessarily Euclidean) metric space (X,d) with doubling dimension k = dim_D. Fix any integer T such that k log log n ≤ T ≤ ln n. Then there exists a map f: X → ℝ^T into T-dimensional Euclidean space with distortion O(log n · √(dim_D / T)).
Comments on Our Results
A non-linear technique for dimensionality reduction.
Interesting special cases of the trade-off:
Very low dimension: dimension k log log n for distortion ≈ O(log n).
Balanced trade-off: dimension log^{2/3} n for distortion O(k log^{2/3} n).
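The special cases can be checked numerically. The sketch below assumes the trade-off has the form log n · √(k / T), as stated in the published paper, with the constant hidden by O(·) set to 1 and natural logarithms throughout; the function name `distortion_bound` and the sample values of n and k are hypothetical.

```python
import math


def distortion_bound(n, k, T):
    """Evaluate log n * sqrt(k / T), with the hidden constant set to 1.

    T must lie in the theorem's range k * log log n <= T <= ln n.
    """
    lowest = k * math.log(math.log(n))
    highest = math.log(n)
    if not lowest <= T <= highest:
        raise ValueError("T must satisfy k log log n <= T <= ln n")
    return math.log(n) * math.sqrt(k / T)


n, k = 10**30, 2
very_low_T = k * math.log(math.log(n))   # very low dimension
balanced_T = math.log(n) ** (2.0 / 3.0)  # balanced trade-off
full_T = math.log(n)                     # largest allowed dimension

bounds = [distortion_bound(n, k, T) for T in (very_low_T, balanced_T, full_T)]
```

The bound shrinks as T grows: at T = k log log n it is only a √(log log n) factor below log n, and at T = ln n it drops to √(k · ln n).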
Tools
Randomized low-diameter partitioning of doubling metrics.
Coordinates at different scales, combined using random +1/-1 linear combinations (reminiscent of random projections).
Lovász Local Lemma, used to prove existence of an embedding with the desired bounds.
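As an illustration of the first tool, here is a standard ball-carving scheme in the style of Calinescu, Karloff and Rabani. It is a generic sketch and not necessarily the partitioning used in the paper: it draws a random radius in [Δ/4, Δ/2] and a random ordering of the points, then assigns each point to the first center in that ordering within the radius. The function name `low_diameter_partition` and its parameters are hypothetical.

```python
import math
import random


def low_diameter_partition(points, d, delta, rng):
    """Randomly partition `points` into clusters of diameter at most `delta`.

    Points must be hashable. Every cluster lies within a ball of radius
    at most delta / 2, so its diameter is at most delta.
    """
    radius = rng.uniform(delta / 4, delta / 2)
    order = list(points)
    rng.shuffle(order)
    clusters = {}
    for p in points:
        # p itself is always a valid center, so a match always exists.
        center = next(c for c in order if d(c, p) <= radius)
        clusters.setdefault(center, []).append(p)
    return list(clusters.values())


pts = [(float(x), float(y)) for x in range(6) for y in range(6)]
parts = low_diameter_partition(pts, math.dist, 3.0, random.Random(1))
```

Each cluster sits inside a ball of radius at most Δ/2 around its center, so the triangle inequality bounds its diameter by Δ; the randomness is what keeps nearby points from being separated too often.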
Future Work
Our results apply to all doubling metrics, and thus cannot beat distortion O(log n) (there are known lower bounds).
Future work: better results for Euclidean metrics?
Ideally: if (X,d) with doubling dimension k embeds into Euclidean space with distortion D, then we want an embedding into O(k)-dimensional Euclidean space with distortion O(D).
Paper at http://www.cs.cmu.edu/~hubert