Doubling Dimension: a short survey Anupam Gupta Carnegie Mellon University Barriers in Computational Complexity II, CCI, Princeton.

Metric space M = (V, d)
A (finite) set V of points with symmetric, non-negative distances d(x,y) that satisfy the triangle inequality d(x,y) ≤ d(x,z) + d(z,y).

doubling dimension
dim_D(M) is the smallest k such that every set S with diameter D_S can be covered by 2^k sets of diameter ½·D_S.
The doubling constant is λ = 2^{dim_D}.
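To make the definition concrete, here is a minimal Python sketch that estimates a doubling constant for a small finite metric. It uses the ball-based variant (cover each ball of radius r by balls of radius r/2), which matches the diameter-based definition above only up to constant factors in the dimension. It also tries only radii equal to pairwise distances and covers greedily, so the result is a heuristic estimate rather than the exact constant. The names `greedy_cover_size` and `doubling_constant_estimate` are hypothetical helpers, not from the talk.

```python
from itertools import combinations

def greedy_cover_size(points, dist, center, r):
    """Greedily cover the ball B(center, r) by balls of radius r/2
    centered at points of the ball; return how many balls were used."""
    uncovered = [p for p in points if dist(center, p) <= r]
    count = 0
    while uncovered:
        c = uncovered[0]  # any uncovered point becomes a new center
        uncovered = [p for p in uncovered if dist(c, p) > r / 2]
        count += 1
    return count

def doubling_constant_estimate(points, dist):
    """Largest greedy cover size over all centers and all radii that
    occur as pairwise distances (assumes at least two points)."""
    radii = {dist(p, q) for p, q in combinations(points, 2)}
    return max(greedy_cover_size(points, dist, c, r)
               for c in points for r in radii)

# Equidistant metric on 8 points versus 16 evenly spaced points on a line.
uniform = doubling_constant_estimate(list(range(8)),
                                     lambda a, b: 0 if a == b else 1)
line = doubling_constant_estimate(list(range(16)),
                                  lambda a, b: abs(a - b))
```

On the equidistant metric every half-radius ball contains only its center, so the estimate grows with the number of points, whereas on the line it stays bounded by a small constant.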

doubling generalizes geometric dimension
Take k-dimensional Euclidean space R^k. Claim: dim_D(R^k) = Θ(k).
Easy to see for boxes: 2^3 boxes cover a larger box in R^3. The argument for spheres is a bit more involved.

facts about doubling
The notion of doubling dimension behaves smoothly under metric distortion.
The definition is closed under taking submetrics.
Jargon: “doubling” = family of metrics with doubling dimension bounded by some absolute constant c independent of n.

useful property of doubling
Suppose a metric (X,d) has doubling dimension k. If a subset S ⊆ X of points has all inter-point distances lying between δ and Δ, then |S| ≤ (2Δ/δ)^k.
Proof: recursively apply the definition…

useful property of doubling (continued)
If all inter-point distances in S lie between δ and Δ, then |S| ≤ (2Δ/δ)^k. For example, this 2-dimensional set has O((Δ/δ)^2) points.
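The recursive argument behind the bound |S| ≤ (2Δ/δ)^k can be written out as follows. This is a sketch of the standard proof, assuming δ ≤ Δ; the sets C_i and the number of rounds t are notation introduced here, not taken from the slides.

```latex
\begin{align*}
&\text{$S$ has diameter at most } \Delta. \text{ Applying the covering step of the definition } t \text{ times gives}\\
&\qquad S \subseteq \bigcup_{i=1}^{2^{kt}} C_i, \qquad \operatorname{diam}(C_i) \le \Delta / 2^{t}.\\
&\text{Choose } t = \lfloor \log_2(\Delta/\delta) \rfloor + 1, \text{ so that } \Delta / 2^{t} < \delta \text{ and } 2^{t} \le 2\Delta/\delta.\\
&\text{Two distinct points of } S \text{ are at distance at least } \delta, \text{ so each } C_i \text{ contains at most one point of } S.\\
&\text{Hence } |S| \le 2^{kt} = (2^{t})^{k} \le (2\Delta/\delta)^{k}.
\end{align*}
```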

alternate characterization
Uniform metric: all non-zero distances equal to R.
2-uniform metric: all non-zero distances in [R, 2R].
Doubling dimension is k iff the largest 2-uniform submetric has ≈ 2^{O(k)} points.

what is not a doubling metric?
The equidistant metric U_n on n points has dimension Ω(log n).
Hence low doubling dimension captures the fact that the metric does not contain large (near-)equidistant submetrics.

the picture thus far…
Doubling dimension k; Euclidean dimension Θ(k); metrics with >> 2^k nearly-equidistant points.

btw, just to check
Natural Q: Do all doubling metrics embed into ℓ2 with distortion O(1)?
No. The Laakso fractals require Ω(√(log n)) distortion to embed into ℓ2 with any number of dimensions. [GKL’03]
In fact, the right behavior is Θ(√(dim_D · log n)). [KLMN’04, ABN’05, JLM’09]

a substantial(?) generalization
Many geometric algorithms can be extended to doubling spaces: near neighbor search, compact routing, distance labeling, network triangulation, sensor placements, small-world networks, traveling salesman, sparse spanners, approximate inference, network design, clustering problems, well-separated pair decomposition, data structures, learnability.
Doubling dimension k; Euclidean dimension Θ(k).

example application
Assign a label L(x) to each host x in a metric space so that, looking just at L(x) and L(y), one can infer the distance d(x,y): f(L(x), L(y)) ≈ d(x,y).
Results: labels with (O(1)/ε)^{dim} × log n bits give estimates within a (1 + ε) factor.
Contrast with the lower bound of n-bit labels in general metrics for any factor < 2.

another example
[Arora 95] showed that TSP on R^k was (1+ε)-approximable in time
[Talwar 04] extended the first result to metrics with doubling dimension k.
Can we get the PTAS as well?

example in action: sparse spanners for doubling metrics

spanners
Given a metric M = (V, d), a graph G = (V, E) is an (m, ε)-spanner if
1) the number of edges in G is m, and
2) d(x,y) ≤ d_G(x,y) ≤ (1 + ε) d(x,y).
A reasonable goal: ε = 0.1, m = O(n).
Fact: for the equidistant metric U_n, if ε < 1 then G = K_n.

spanners for doubling metrics
Theorem: Given any metric M and any ε < ½, we can efficiently find a spanner G with stretch (1 + ε) and number of edges m = n (1 + 1/ε)^{dim_D(M)}.
Hence, for doubling metrics, linear-sized spanners!
Generalizes a similar theorem for Euclidean metrics.

standard tool: nets
Nets: A set of points N is an r-net of a set S if
- d(u,v) ≥ r for any distinct u,v ∈ N
- for every w ∈ S \ N, there is a u ∈ N with d(u,w) < r

standard tool: nets (continued)
Fact: If a metric has doubling dimension k and N is an r-net, then B(x,2r) ∩ N has O(1)^k points.
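An r-net as defined above can be built with a single greedy pass. The following Python sketch is one simple way to do it, not necessarily the construction used in the talk; `greedy_net` is a hypothetical helper name and `dist` is any metric supplied by the caller.

```python
def greedy_net(points, dist, r):
    """Return an r-net of `points`: chosen points are pairwise at
    distance >= r, and every skipped point is within distance < r
    of some chosen point."""
    net = []
    for p in points:
        # Keep p only if it is far from everything chosen so far.
        if all(dist(p, q) >= r for q in net):
            net.append(p)
    return net

# Ten evenly spaced points on a line, net radius 3.
line_net = greedy_net(list(range(10)), lambda a, b: abs(a - b), 3)
```

A point is skipped only when some already chosen point lies within distance less than r, which is exactly the covering condition; the separation condition holds by construction. The pass uses O(n · |N|) distance evaluations.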

recursive nets
Suppose all the points were at least unit distance apart.
So you take a 2-net N_1 of these points.
Now you can take a 4-net N_2 of this net.
And so on…

recursive nets
N_0 = V
N_t is a 2^t-net of the set N_{t-1}
⇒ N_t is (almost) a 2^{t+1}-net of the set V

the spanner construction
N_0 = V; N_t is a 2^t-net of the set N_{t-1}, so N_t is (almost) a 2^{t+1}-net of the set V.
Connect each net point in N_t to the other net points in N_t at distance at most O(1/ε)·2^t.

the number of edges
The number of points in N_t within O(1/ε)·2^t of any given net point is at most O(1/ε)^k.
Number of levels = O(log diameter).
Number of nodes in the net at each level ≤ n.
Hence, number of edges ≤ n × log diameter × O(1/ε)^k.
Can be improved to n × O(1/ε)^k.
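The net-based construction can be sketched end to end in Python. This is an illustrative implementation under stated assumptions, not the talk's code. The constant hidden in O(1/ε) is taken to be γ = 4 + 16/ε, a choice for which the usual "walk up to net ancestors, cross one edge, walk back down" argument gives stretch at most 1 + ε. Nets are built greedily, and all function names are hypothetical.

```python
import heapq
import math

def greedy_net(points, dist, r):
    """Greedy r-net: kept points are pairwise >= r apart."""
    net = []
    for p in points:
        if all(dist(p, q) >= r for q in net):
            net.append(p)
    return net

def build_spanner(points, dist, eps):
    """Return {(u, v): weight} over point indices, using recursive nets."""
    gamma = 4 + 16 / eps  # assumed constant inside O(1/eps)
    d = lambda i, j: dist(points[i], points[j])
    edges = {}
    net, t = list(range(len(points))), 0  # N_0 = V
    while True:
        # Connect net points of N_t that are within gamma * 2^t.
        for a in range(len(net)):
            for b in range(a + 1, len(net)):
                w = d(net[a], net[b])
                if w <= gamma * 2 ** t:
                    edges[(net[a], net[b])] = w
        if len(net) <= 1:
            break
        t += 1
        net = greedy_net(net, d, 2 ** t)  # N_t is a 2^t-net of N_{t-1}
    return edges

def spanner_distances(n, edges, src):
    """Dijkstra from src over the spanner edges."""
    adj = [[] for _ in range(n)]
    for (u, v), w in edges.items():
        adj[u].append((v, w))
        adj[v].append((u, w))
    best = [math.inf] * n
    best[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > best[u]:
            continue
        for v, w in adj[u]:
            if du + w < best[v]:
                best[v] = du + w
                heapq.heappush(heap, (best[v], v))
    return best
```

The pairwise loop at each level is quadratic in the net size, which keeps the sketch short; an efficient implementation would locate nearby net points through the net hierarchy instead.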

the stretch factor

spanners for doubling metrics
Theorem: Given any metric M and any ε < ½, we can efficiently find an (m, ε)-spanner G with number of edges m = n (1 + 1/ε)^{dim_D(M)}.
Hence, for doubling metrics, linear-sized spanners!

example in action: TSP for doubling metrics

plan of attack
We have PTASs for TSP for points in constant-dimensional ℓ2.
If we could embed doubling metrics into constant-dimensional ℓ2 while maintaining distances to within (1+ε) (in expectation), we’d be done.
A completely ridiculous strategy, but maybe we’ll get somewhere.

embedding doubling trees into ℓ2
Recall: embedding doubling metrics into ℓ2 requires Ω(√(log n)) distortion, regardless of the dimension.
However…
Theorem: if a doubling metric is also a tree metric, it embeds into ℓ2 with distortion O(1) and dimension O(1) (poly(λ)).

embedding doubling metrics into doubling trees
Bad news: 2-d grids require Ω(log n) distortion to embed into distributions over trees.
Good news: all doubling metrics embed into distributions over doubling trees with distortion O(log n).

plan of attack (revised)
We have PTASs for TSP for points in constant-dimensional ℓ2.
If we could embed doubling metrics into constant-dimensional ℓ2 while maintaining distances to within (1+ε) (in expectation), we’d be done.

Arora’s simpler TSP idea
Given any TSP tour of length L in d-dimensional space, find B = (log n/δ)^d portals in each cluster and show there exists a portal-respecting tour which increases the length by ≤ δL.
Now use a dynamic program to find the best portal-respecting tour ⇒ runtime ~ (n log n) · B^B.

Arora’s simpler TSP idea (for doubling metrics)
Given any TSP tour of length L in d-dimensional space, find B = (log n/δ)^d portals in each cluster and show there exists a portal-respecting tour which increases the length by ≤ δL.
The OPT tour of length L* in the original doubling metric embeds into O(1)-dimensional space with length L = O(log n)·L*.
Define portals, choosing δ = ε/O(log n).
Increase in length = δL = εL*.
And now find the best portal-respecting tour in the original doubling metric!
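The choice of δ can be checked with a line of arithmetic. In the sketch below, c is a hypothetical constant standing in for the one hidden in the O(log n) distortion, and d is the (constant) dimension of the target space.

```latex
\begin{align*}
L &\le c \log n \cdot L^{*}, \qquad \delta = \frac{\varepsilon}{c \log n},\\
\delta L &\le \frac{\varepsilon}{c \log n} \cdot c \log n \cdot L^{*} = \varepsilon L^{*},\\
B &= \left(\frac{\log n}{\delta}\right)^{d} = \left(\frac{c \log^{2} n}{\varepsilon}\right)^{d}.
\end{align*}
```

So the portal-respecting tour costs at most (1 + ε)·L* when measured against the optimum in the original metric, at the price of a polylogarithmic number of portals per cluster.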

recap for TSP
Embedded the doubling metric randomly into doubling trees.
Embedded those into constant-dimensional ℓ2.
Used that to find clusters/portals and claim existence of a (1+ε)·OPT tour.
Found the best tour in the original metric using dynamic programming.
Talwar’s algorithm does it better: its dependence is on dim_D, not on λ.

open problem
Is there a PTAS for TSP on doubling metrics?
Can we embed doubling trees into ℓ2 of O(dim_D) dimensions with O(dim_D) distortion? (It suffices to consider unweighted doubling trees.)

low dimensional embeddings (and dimensionality reduction)

dimensionality reduction
If a Euclidean metric embeds into R^k for some dimension k with distortion O(1), then the Euclidean metric has doubling dimension O(k), and we want to efficiently find a Euclidean embedding into R^{O(k)} with distortion O(1).
We just saw: embed any metric with doubling dimension k into a distribution over 2^{O(k)}-dimensional ℓ1 spaces with distortion O(log n)·2^{O(k)}.

dimensionality reduction (continued)
Better: O(k)-dimensional ℓ2 space with distortion O*(log n).

a more general bound
Example Theorem: Any metric with doubling dimension dim_D embeds into Euclidean space with T dimensions with distortion (where T ∈ [dim_D log log n, log n])
All these techniques are ultimately limited by the fact that they embed all doubling metrics, and not just Euclidean ones.

special cases of interest
Table: distortion when using O(dim_D(M)) Euclidean dimensions versus O(log n) Euclidean dimensions, for general metrics and for Euclidean metrics.
Notes on the entries:
This generalizes the result we talked about in Lecture #2: any metric embeds into Euclidean space with O(log n) distortion.
This is just the Johnson-Lindenstrauss lemma.
If the metric is doubling, this quantity is √(log n). In general, this is never more than O(log n). Again, this generalizes the previous result.

weaken requirements?
Low-dimensional projection preserving near-neighbors: O(log dim_D · poly(ε^{-1}))-dimensional random projection [IN05?] (random projections also work for points on smooth manifolds).
Give a low-dimensional set of points approximating d(x,y)^{0.99}: again, can get similar dimensionality… [GK10, BRS10]

one more useful tool: “random metric decompositions”
Given a metric M, we want to partition it randomly into pieces of “small” diameter such that “nearby” vertices lie in different pieces only with “small” probability.

“padded” decompositions
A metric (V,d) admits β-padded decompositions if, for every Δ, we can output a random partition V = V_1 ⊎ V_2 ⊎ … ⊎ V_k such that
1. each V_j has diameter ≤ Δ
2. Pr[ B(x, ρ) split ] ≤
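As an illustration of what such a random partition can look like, here is a Python sketch in the style of the Calinescu-Karloff-Rabani ball-carving scheme: pick one random radius in [Δ/4, Δ/2] and a random order of centers, then assign each point to the first center within that radius. This is a commonly used scheme, stated here as an assumption; it is not necessarily the construction behind the theorem in the talk, and the sketch only guarantees the diameter condition, not any particular padding parameter. `random_decomposition` is a hypothetical name.

```python
import random

def random_decomposition(points, dist, delta, rng=random):
    """Randomly partition point indices into clusters of diameter <= delta."""
    radius = rng.uniform(delta / 4, delta / 2)  # one shared random radius
    order = list(range(len(points)))
    rng.shuffle(order)                          # random priority of centers
    clusters = {}
    for i in range(len(points)):
        for c in order:
            # The first center in the random order that reaches i claims it;
            # i itself is always a candidate, so every point is assigned.
            if dist(points[i], points[c]) <= radius:
                clusters.setdefault(c, []).append(i)
                break
    return list(clusters.values())
```

Every cluster lies inside a ball of radius at most Δ/2 around its center, so its diameter is at most Δ by the triangle inequality.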

the facts
Thm: Doubling metrics admit O(dim_D)-padded decompositions.
Useful wherever padded decompositions are useful.
E.g.: can prove that all doubling metrics embed into ℓ2 with distortion

last slide: some questions
For specific metric space problems, can we match the performance of their geometric counterparts?
Which problems admit algorithms whose performance can be parameterized using such a notion of dimension?
Are there other notions of dimension that are algorithmically significant?

thank you!