Small-size ε-nets for Axis-Parallel Rectangles and Boxes
Boris Aronov (Polytechnic Institute of NYU), Esther Ezra (Duke University), Micha Sharir (Tel-Aviv University)

Range Spaces
A range space (X, R) consists of:
X – the ground set (the "universe").
R – the ranges: subsets of X, so |R| ≤ 2^|X|.
Abstract form: hypergraphs. X – vertices. R – hyperedges.

Geometric Range Spaces
Specification: X ⊆ ℝ^d, R = set of simply-shaped regions in ℝ^d.
Example: X – points on the real line. R – intervals.
Example: X – points in the plane. R – halfplanes, disks, …
For simplicity, assume X is finite; then the number of distinct ranges |R| is polynomial in |X|.

ε-nets for range spaces
Given: a range space (X, R) with X finite, |X| = n, and a parameter 0 < ε < 1.
An ε-net for (X, R) is a subset N ⊆ X that hits every range Q ∈ R with |Q ∩ X| ≥ εn.
N is a hitting set for all the "heavy" ranges, that is, the ranges that capture at least an ε-fraction of the universe.
Example: points and intervals on the real line: |N| = 1/ε. The bound does not depend on n.

The hitting-set problem
A hitting set for (X, R) is a subset H ⊆ X such that Q ∩ H ≠ ∅ for every Q ∈ R.
Goal: find a smallest hitting set.
Useful applications: art-gallery problems, sensor networks, and more.

Hardness of hitting sets
Finding a hitting set of smallest size is NP-hard, even for geometric range spaces! Use an approximation algorithm instead.
Abstract range spaces [Chvátal 79]: greedy algorithm. Approximation factor: O(log |X|).
Geometric range spaces [Brönnimann, Goodrich 95], [Clarkson 93]: improved approximation factor O(log OPT), or smaller, where OPT = size of the smallest hitting set.
This is achieved via ε-nets: small-size ε-nets imply small approximation factors!
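The greedy algorithm for abstract range spaces can be sketched as follows. This is a minimal illustration, not code from the talk: the function name `greedy_hitting_set` and the representation of ranges as Python sets are assumptions made for the example.

```python
from typing import Dict, Hashable, Iterable, List, Set


def greedy_hitting_set(ranges: Iterable[Set[Hashable]]) -> List[Hashable]:
    """Greedy hitting set: repeatedly pick the element that hits the most unhit ranges."""
    unhit = [set(q) for q in ranges]
    if any(not q for q in unhit):
        raise ValueError("an empty range cannot be hit")
    hitting_set: List[Hashable] = []
    while unhit:
        # Count, for every element, how many still-unhit ranges contain it.
        counts: Dict[Hashable, int] = {}
        for q in unhit:
            for x in q:
                counts[x] = counts.get(x, 0) + 1
        best = max(counts, key=counts.get)
        hitting_set.append(best)
        # Discard the ranges that the chosen element hits.
        unhit = [q for q in unhit if best not in q]
    return hitting_set
```

For example, on the ranges {1, 2}, {2, 3}, {3, 4} the greedy rule first picks 2 or 3 (each hits two ranges) and then one more element for the remaining range.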

An upper bound for the ε-net size
The ε-net theorem [Haussler, Welzl 87]: if the ranges are simply-shaped regions, then, for any ε > 0, a random sample of size O((1/ε) log (1/ε)) is an ε-net, with constant probability.
The bound does not depend on n.
Remark: in fact, it is sufficient to assume that the number of ranges is only polynomial in n.
Is it optimal?

The lower bound
Theorem [Komlós, Pach, Woeginger 92]: The bound is tight!
The construction is artificial, on abstract hypergraphs (non-geometric!).
No lower bound better than Ω(1/ε) is known in geometry.
What is the actual bound? O(1/ε)?
Goal: obtain smaller bounds for geometric range spaces. Ideally O(1/ε) (achieved by points and intervals on the real line), but anything better than O((1/ε) log (1/ε)) is "exciting"!
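For points and intervals on the real line, a net of size at most 1/ε can be obtained by taking every k-th point in sorted order, with k = ⌈εn⌉. The sketch below illustrates this; the function names are hypothetical, and the points are assumed to be distinct.

```python
import math
from typing import List, Sequence


def interval_eps_net(points: Sequence[float], eps: float) -> List[float]:
    """Pick every k-th point in sorted order, where k = ceil(eps * n)."""
    pts = sorted(points)
    k = max(1, math.ceil(eps * len(pts)))
    # Indices k-1, 2k-1, ...: any k consecutive points contain one of them.
    return pts[k - 1 :: k]


def is_interval_eps_net(points: Sequence[float], net: Sequence[float], eps: float) -> bool:
    """Brute-force check: every run of k consecutive points contains a net point."""
    pts = sorted(points)
    n = len(pts)
    k = max(1, math.ceil(eps * n))
    chosen = set(net)
    # A heavy interval contains at least k consecutive points, so checking
    # all windows of exactly k consecutive points is sufficient.
    return all(any(p in chosen for p in pts[i : i + k]) for i in range(n - k + 1))
```

The net has ⌊n/k⌋ ≤ 1/ε points, independent of n.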

Previous results
Points and halfspaces in 2D, 3D: O(1/ε) [Matoušek 92], [Pyrga, Ray 08], [Har-Peled et al. 08].
Points and disks, or pseudo-disks, in 2D: O(1/ε) [Matoušek, Seidel, Welzl 90], [Pyrga, Ray 08].
(Figure: pseudo-disks.)

Our results
Points and axis-parallel rectangles in the plane: ε-net size is O((1/ε) log log (1/ε)).
Points and axis-parallel boxes in 3-space: ε-net size is O((1/ε) log log (1/ε)).
Points and α-fat triangles in the plane (each of the angles ≥ α): ε-net size is O((1/ε) log log (1/ε)).
Points uniformly distributed over the unit cube, and axis-parallel boxes in d-space: ε-net size is O((1/ε) log log (1/ε)).

Improved approximation factors for geometric hitting sets

Ranges                                                          Previous bound   New bound
Axis-parallel rectangles                                        log OPT          log log OPT
Axis-parallel 3-boxes                                           log OPT          log log OPT
α-fat triangles                                                 log OPT          log log OPT
Axis-parallel d-boxes (points uniformly distributed in [0,1]^d) log OPT          log log OPT

Main idea: use two-level sampling
Primary sampling step: obtain an initial sample S of ~1/ε points of X. On average, each heavy rectangle Q (one that contains at least εn points) must satisfy Q ∩ S ≠ ∅.
Second sampling step (repair step): in each heavy rectangle Q ∈ R with Q ∩ S = ∅, sample additional points to guarantee that Q is stabbed by the net.

The ε-net construction
Input: X, a set of n points. Parameter: r := 1/ε.
Primary sample: produce a random sample S ⊆ X of size |S| = r. Make S part of the output.
Apply the second sampling step in each empty rectangle…
Instead of processing all input rectangles, we consider a smaller set of representative rectangles.

The set of maximal S-empty rectangles
A maximal S-empty rectangle M satisfies int(M) ∩ S = ∅, and for each rectangle M' ⊋ M, int(M') ∩ S ≠ ∅.
M is defined by ≤ 4 points of S.
ℳ – the set of all maximal S-empty rectangles.
Apply the repair step on ℳ instead of on the input rectangles.

Why is it sufficient to consider ℳ?
For each heavy input rectangle Q with Q ∩ S = ∅ (otherwise, done!), expand Q until each of its sides touches a point of S or extends to ∞. The result is a maximal S-empty rectangle M that contains Q.
Since Q is heavy, a sufficiently large sample in M will hit Q, with high probability.
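The expansion of an S-empty rectangle can be sketched as below. This is an illustrative sketch only: the tuple layout `(x_lo, x_hi, y_lo, y_hi)`, the function name, and the choice to grow horizontally first are assumptions (a different order may reach a different maximal rectangle containing Q).

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_lo, x_hi, y_lo, y_hi)


def expand_to_maximal(q: Rect, sample: Iterable[Point]) -> Rect:
    """Grow an S-empty rectangle until each side touches a sample point or infinity."""
    x_lo, x_hi, y_lo, y_hi = q
    s = list(sample)
    if any(x_lo < x < x_hi and y_lo < y < y_hi for x, y in s):
        raise ValueError("q must have no sample point in its interior")
    # Horizontal growth: blocked by sample points inside the current y-range.
    x_lo = max([x for x, y in s if y_lo < y < y_hi and x <= x_lo], default=-math.inf)
    x_hi = min([x for x, y in s if y_lo < y < y_hi and x >= x_hi], default=math.inf)
    # Vertical growth: blocked by sample points inside the widened x-range.
    y_lo = max([y for x, y in s if x_lo < x < x_hi and y <= y_lo], default=-math.inf)
    y_hi = min([y for x, y in s if x_lo < x < x_hi and y >= y_hi], default=math.inf)
    return (x_lo, x_hi, y_lo, y_hi)
```

With the sample {(0, 5), (10, 5), (5, 0), (5, 10)} and Q = [4, 6] × [4, 6], the rectangle grows to [0, 10] × [0, 10], with one sample point on each side.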

The repair step [CF-90, CV-07]
Consider a heavy rectangle M with |M ∩ X| = t · n/r, 1 ≤ t ≤ log r. Here t is the excess of M, and r = 1/ε.
Second sampling step: construct a (1/t)-net N_M inside M by sampling O(t log t) points in M. The "universe" size is now t · n/r.
According to the ε-net theorem, each input (empty) rectangle Q ∈ R, Q ⊆ M, with |Q ∩ X| ≥ n/r, must be stabbed by N_M!

The final ε-net
Output: the union of S and ⋃_{M ∈ ℳ} N_M.
What is the expected size of the ε-net? r + E{ Σ_{t ≥ 1} t log t · |ℳ_t| }, where ℳ_t = set of rectangles in ℳ with excess t.
Exponential Decay Lemma [Chazelle, Friedman 90], [Agarwal, Matoušek, Schwarzkopf 98]: E{|ℳ_{≥t}|} = O(2^(-t) · E{|ℳ|}), where E{|ℳ|} is the expected number of maximal empty rectangles.
The number of heavy rectangles decreases exponentially!

An improved ε-net
Theorem: E{|ℳ|} = O(r log r), so E{|ℳ_{≥t}|} = O(r log r). With |S| = r and a repair step for every excess t ≥ 1, the expected ε-net size is O((1/ε) log (1/ε)). No improvement yet…
Key observation: use oversampling. Choose a slightly larger primary sample, |S| = c r log log r with c > 1, and repair only the heavy rectangles M (|M ∩ X| ≥ n/r) with excess t ≥ c log log r.
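A short calculation shows why the plain two-level scheme gives no improvement. It is a sketch that combines the expected-size expression with the Exponential Decay Lemma in the form stated on the slides; $N$ denotes the output net and $r = 1/\varepsilon$.

```latex
\begin{align*}
\mathbf{E}\,|N|
  &\le r + \sum_{t \ge 1} O(t \log t)\, \mathbf{E}\,|\mathcal{M}_t| \\
  &\le r + \sum_{t \ge 1} O(t \log t)\, \mathbf{E}\,|\mathcal{M}_{\ge t}|
     && \text{since } \mathcal{M}_t \subseteq \mathcal{M}_{\ge t} \\
  &= r + O\bigl(\mathbf{E}\,|\mathcal{M}|\bigr) \sum_{t \ge 1} t \log t \; 2^{-t}
     && \text{Exponential Decay Lemma} \\
  &= r + O(r \log r) \cdot O(1) = O(r \log r)
     && \text{as } \textstyle\sum_{t \ge 1} t \log t \, 2^{-t} \text{ converges.}
\end{align*}
```

The sum is dominated by its first few terms, so the repair cost is proportional to the total number of maximal empty rectangles, which is what oversampling is designed to avoid.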

What have we gained?
Now s = |S| = c r log log r, and the repair threshold is t = c log log r.
"On average", an S-empty rectangle now contains at most O(n/(r log log r)) << n/r points. So a heavy rectangle M cannot be an "average" S-empty rectangle. It is much heavier.
Exponential Decay Lemma: E{|ℳ_{≥t}|} = O(2^(-t) · E{|ℳ|}).
The number of maximal heavy S-empty rectangles is much smaller: E{|ℳ_{≥t}|} = O(s log s / polylog r) = o(r).
The number of heavy (empty) rectangles is only sublinear in r!
The expected ε-net size is O(r log log r).
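The effect of the threshold can be made explicit with a rough calculation. It is a sketch under simplifying assumptions: logarithms are base 2, $T = c \log\log r$, $\mathbf{E}\,|\mathcal{M}| = O(s \log s)$, and the Exponential Decay Lemma is used in the simplified form stated on the slides.

```latex
\begin{align*}
\sum_{t \ge T} O(t \log t)\, \mathbf{E}\,|\mathcal{M}_{\ge t}|
  &= O\bigl(\mathbf{E}\,|\mathcal{M}|\bigr) \sum_{t \ge T} t \log t \; 2^{-t}
   = O\bigl(s \log s \cdot T \log T \cdot 2^{-T}\bigr), \\
2^{-T} &= (\log r)^{-c}, \qquad s \log s = O(r \log r \, \log\log r), \\
\text{repair size} &= O\!\left( \frac{r \, (\log\log r)^2 \, \log\log\log r}{(\log r)^{c-1}} \right) = o(r)
  \quad \text{for } c > 1, \\
\mathbf{E}\,|N| &\le s + o(r) = O(r \log\log r)
  = O\!\left( \tfrac{1}{\varepsilon} \log\log \tfrac{1}{\varepsilon} \right).
\end{align*}
```

The primary sample now dominates the size of the net, and the repair step contributes only a lower-order term.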

Oversampling: a trick or a technique?
By oversampling at the primary step, we significantly decrease the size of the secondary sample.
Note: the number of maximal S-empty rectangles is O(s log s); however, we do not traverse all of them, only the heavy ones!
New concept: the sample points and the maximal S-empty rectangles are two different entities.

Thank you very much!

Bounding the number of maximal S-empty rectangles
Upper bound: O(s^2). Each rectangle is determined by its two opposite corners.
Problem: the bound O(s^2) is bad for the analysis, and yields an ε-net of size O(1/ε^2)!
Recall s = c (1/ε) log log (1/ε).

Quadratic lower bound construction
A staircase construction: each point in the upper staircase is matched with each point in the lower staircase, giving Ω(s^2) empty rectangles.
We can prune away most of these rectangles and remain with only O(s log s) rectangles.
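A small experiment makes the quadratic count concrete. The coordinates below are one possible staircase layout chosen for illustration (an assumption, not necessarily the figure from the talk): k points on the line x + y = 0 and k points on the line x + y = 2k, so that every lower/upper pair spans a rectangle with an empty interior.

```python
from itertools import product
from typing import List, Tuple

Point = Tuple[int, int]


def staircases(k: int) -> Tuple[List[Point], List[Point]]:
    """Two anti-diagonal staircases of k points each (s = 2k points in total)."""
    lower = [(i, -i) for i in range(1, k + 1)]
    upper = [(k + j, k - j) for j in range(1, k + 1)]
    return lower, upper


def count_empty_corner_rectangles(k: int) -> int:
    """Count lower/upper pairs whose spanned rectangle has no point in its interior."""
    lower, upper = staircases(k)
    pts = lower + upper
    count = 0
    for (ax, ay), (bx, by) in product(lower, upper):
        # (ax, ay) is the lower-left corner, (bx, by) the upper-right corner.
        is_rect = ax < bx and ay < by
        if is_rect and not any(ax < x < bx and ay < y < by for x, y in pts):
            count += 1
    return count
```

With s = 2k points the count is k^2 = s^2/4, which matches the Ω(s^2) bound.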

An O(s log s) bound for |ℳ|
Key observation: consider a vertical line l, and all points to its left.
Claim: the number of maximal S-empty rectangles anchored at l is only linear.
Next step: use a tree decomposition built on top of X in order to obtain the O(s log s) bound.
(Figure: vertical strips σ_v of the tree decomposition with splitting lines l1, l2, l'2, l3, l'3, l''3, and a rectangle Q with its anchored part Q'.)

Dual (Geometric) Range Spaces
Flip the roles of X and R, and obtain (R, X*).
R = set of regions in ℝ^d, X* = {R_p | p ∈ X}, R_p = {r | r ∈ R, r contains p}.
Example: R – intervals. X* – subsets of intervals containing a common point in ℝ^1.
Example: R – disks. X* – subsets of disks containing a common point in ℝ^2.
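For intervals on the real line, the dual ranges R_p can be computed directly from the definition. This is a minimal sketch; the function name and the representation of an interval as a closed pair `(lo, hi)` are assumptions made for the example.

```python
from typing import Dict, FrozenSet, List, Tuple

Interval = Tuple[float, float]  # closed interval [lo, hi]


def dual_ranges(points: List[float], intervals: List[Interval]) -> Dict[float, FrozenSet[int]]:
    """Map each point p to R_p, the set of (indices of) intervals that contain p."""
    return {
        p: frozenset(i for i, (lo, hi) in enumerate(intervals) if lo <= p <= hi)
        for p in points
    }
```

The size of R_p is the number of ranges that cover p.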

ε-nets for dual range spaces
An ε-net for (R, X*) is a subset N ⊆ R that covers all points at depth ≥ ε|R|, where depth(p) = number of ranges that cover p ∈ X.
An ε-net is a set cover for all the "deep" points.
Upper bound [Haussler, Welzl 87]: O((δ/ε) log (δ/ε)), where δ is the VC-dimension of (R, X*).

Previous results
Range spaces (R, X*) such that, for each T ⊆ R with |T| = m, the union ⋃T has (a small) complexity o(m log m): the ε-net size is o((1/ε) log (1/ε)) [Clarkson, Varadarajan 07].
Theorem [Clarkson, Varadarajan 07]: if the complexity of the union is O(m φ(m)), then the ε-net size is O((1/ε) φ(1/ε)). Here φ(·) is a slowly growing function.
In fact, this should be the complexity of the vertical decomposition of the complement of the union.

More about the Clarkson-Varadarajan technique
Example: disks (or pseudo-disks) and points. Input: a set T of m (pseudo-)disks. Union complexity: O(m) [Kedem et al. 86]. Hence the ε-net size is O(1/ε).
Example: fat triangles and points. Input: a set T of m α-fat triangles (each of the angles ≥ α). Union complexity: O(m log log m) [Matoušek et al. 1994]. Hence the ε-net size is O((1/ε) log log (1/ε)).

Our results: Dual
Theorem [Clarkson, Varadarajan 07]: if the complexity of the union is O(m φ(m)), then the ε-net size is O((1/ε) φ(1/ε)).
Using the oversampling concept:
Theorem (improvement!): if the complexity of the union is O(m φ(m)), then the ε-net size is O((1/ε) log φ(1/ε)).

Proof sketch
Draw a random sample S of s = (c/ε) log φ(1/ε) regions.
Construct the union of S, and decompose its complement into O(s φ(s)) "trapezoidal cells". Each cell Δ is defined by ≤ 4 regions.
Claim: with high probability, Δ meets ≤ (n/s) log s regions of the input.

Apply a repair step on the heavy cells: sample O(t log t) regions in each cell Δ that meets t · n/s regions, for t ≥ c log φ(1/ε).
Each point at depth ≥ εn is covered by at least one region.
Use the Exponential Decay Lemma to show: the number of regions sampled at the repair step is o(1/ε).
Overall ε-net size: O((1/ε) log φ(1/ε)).

New ε-net bounds
Fat triangles: union complexity O(m log log m), so the ε-net size is O((1/ε) log log log (1/ε)).
Locally γ-fat objects: union complexity O(m polylog m), so the ε-net size is O((1/ε) log log (1/ε)).
And several other improved bounds.
(Figure: an object O and a disk D, with area(D ∩ O) ≥ γ · area(D), 0 < γ ≤ 1.)

Open problems
Improve our upper bound O((1/ε) log log (1/ε)) for points and axis-parallel rectangles. Conservative goal: obtain a weak ε-net of size o((1/ε) log log (1/ε)). (The points of a weak ε-net are not necessarily chosen from X.)
Extend our bound to points and axis-parallel boxes in d ≥ 4. Best known upper bound: O((1/ε) log (1/ε)).
Dual range spaces for rectangles and points. Best known upper bound: O((1/ε) log (1/ε)). Can this be improved to O((1/ε) log log (1/ε))?

Motivation: approximation for geometric hitting sets
The Brönnimann-Goodrich technique / LP relaxation: if (X, R) admits an ε-net of size f(1/ε), then there exists a polynomial-time approximation algorithm that reports a hitting set of size O(f(OPT)).
Idea: assign weights to X such that each range Q ∈ R becomes heavy. Construct an ε-net for the weighted range space. Each range is hit by the ε-net.
Small-size ε-nets imply small approximation factors!
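The reweighting idea can be illustrated on points and intervals on the real line, where a weighted ε-net is easy to build by a single scan. This is a simplified sketch in the spirit of the iterative reweighting scheme, not the algorithm as presented in the talk: all function names, the doubling guess `k` for OPT, and the iteration budget are assumptions.

```python
import math
from typing import List, Optional, Sequence, Tuple

Interval = Tuple[float, float]  # closed interval [lo, hi]


def weighted_net(points: Sequence[float], weights: Sequence[int], eps: float) -> List[float]:
    """Weighted eps-net for intervals: every run between net points weighs < eps * W."""
    total = sum(weights)
    net, acc = [], 0
    for p, w in sorted(zip(points, weights)):
        if acc + w >= eps * total:
            net.append(p)  # this point closes a heavy run
            acc = 0
        else:
            acc += w
    return net


def missed(intervals: Sequence[Interval], net: Sequence[float]) -> Optional[Interval]:
    """Return some interval that the net does not hit, or None if all are hit."""
    for lo, hi in intervals:
        if not any(lo <= p <= hi for p in net):
            return (lo, hi)
    return None


def reweighting_hitting_set(points: Sequence[float], intervals: Sequence[Interval]) -> List[float]:
    """Double the weights inside missed intervals until the net hits every interval."""
    if any(not any(lo <= p <= hi for p in points) for lo, hi in intervals):
        raise ValueError("every interval must contain at least one point")
    n, k = len(points), 1
    while True:
        weights = [1] * n
        eps = 1.0 / (2 * k)  # k is the current guess for OPT
        budget = int(4 * k * math.log2(max(2, n / k))) + 1
        for _ in range(budget):
            net = weighted_net(points, weights, eps)
            q = missed(intervals, net)
            if q is None:
                return net
            # A missed interval is light, so doubling its points shifts weight
            # toward the points that a small hitting set must use.
            weights = [2 * w if q[0] <= p <= q[1] else w for p, w in zip(points, weights)]
        k *= 2  # the guess was too small; try again with a larger one
```

Each call to `weighted_net` returns at most 1/ε = 2k points, so the size of the output is tied to the guess k rather than to n.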

The repair step
On average, each heavy rectangle Q must satisfy Q ∩ S ≠ ∅. The number of "bad" rectangles is small.
It is sufficient to consider a set ℳ of maximal S-empty rectangles, instead of R.
ℳ is defined over the points of S, so |ℳ| = f(1/ε) does not depend on n, and the same holds for the number of points sampled at the repair step.

An O(s log s) bound for |ℳ|
Key observation: consider a vertical line l, and all points to its left.
Claim: the number of maximal S-empty rectangles anchored at l is only linear.
Handling a query rectangle Q: one of the halves Q' of Q contains at least n/(2r) points. Q' is anchored at l. Expand Q' on the "heavier" side of l.

Tree decomposition
Build a balanced binary tree T on X, sorted by x-coordinate. Each node is a vertical strip.
Stop the expansion of T when nodes have ≤ n/r points.
T has O(log r) = O(log s) levels.
At each level, the number of maximal S-empty anchored rectangles is O(s).
Overall (over all levels): O(s log s).

Query rectangle Q
For an input rectangle Q with ≥ n/r points: find the first (highest) node of T whose bounding line l meets Q.
Expand Q within the "heavier" strip σ_v bounded by l.
The maximal S-empty anchored rectangles comprise the representative set for R.
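The tree decomposition and the lookup of the highest splitting line that meets a query can be sketched as follows. The function names, the median split rule, and the level-by-level list representation are assumptions for illustration; `leaf_size` plays the role of n/r.

```python
from typing import List, Optional, Sequence, Tuple


def splitting_lines(xs: Sequence[float], leaf_size: int) -> List[List[float]]:
    """Per level, the x-coordinates of the vertical splitting lines of a balanced tree.

    A node is split at its median until it holds at most leaf_size points.
    """
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")
    levels: List[List[float]] = []
    nodes = [sorted(xs)]
    while any(len(node) > leaf_size for node in nodes):
        lines, children = [], []
        for node in nodes:
            if len(node) > leaf_size:
                mid = len(node) // 2
                lines.append((node[mid - 1] + node[mid]) / 2)  # line between the halves
                children += [node[:mid], node[mid:]]
        levels.append(lines)
        nodes = children
    return levels


def first_line_meeting(
    levels: List[List[float]], x_lo: float, x_hi: float
) -> Optional[Tuple[int, float]]:
    """Highest level and splitting line l with x_lo <= l <= x_hi, or None."""
    for depth, lines in enumerate(levels):
        for line in lines:
            if x_lo <= line <= x_hi:
                return depth, line
    return None
```

Scanning the levels from the top finds the highest node whose line meets the x-range of Q, because until that level Q lies inside a single strip.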

Bounding the ε-net size
Exponential Decay Lemma [Chazelle, Friedman 90], [Agarwal, Matoušek, Schwarzkopf 98]: E{|ℳ_t|} = O(2^(-t) · E{|ℳ'|}), where:
S' is a smaller random sample, each point chosen with probability s/(t · n).
ℳ_t – all maximal S-empty rectangles M with excess t_M ≥ t.
ℳ' – all maximal S'-empty rectangles.

Bounding the final ε-net size
A very useful tool: the Exponential Decay Lemma [Chazelle, Friedman 90], [Agarwal, Matoušek, Schwarzkopf 98]: E{|ℳ_t|} = O(2^(-t) · E{|ℳ|}), where ℳ_t is the set of all maximal S-empty rectangles M with excess t_M ≥ t.
The number of heavy rectangles decreases exponentially!

A nearly-linear bound for |ℳ|
Fix a node v of T and its strip σ_v: X_v = X ∩ σ_v, S_v = S ∩ σ_v.
Lemma: the number of maximal S_v-empty anchored rectangles in σ_v is O(|S_v|).
At a fixed level i of T, the overall number is O(s).
Overall: O(s log r).
(Figure: the strip σ_v and its entry side.)

The set-cover problem
Primal: a hitting set for (X, R) is a subset H ⊆ X such that Q ∩ H ≠ ∅ for every Q ∈ R.
Dual: a set cover for (X, R) is a subset S ⊆ R such that every x ∈ X is covered by S.
A set cover for (X, R) is a hitting set for (R, X*).
Finding a set cover of smallest size is NP-hard (even for geometric range spaces)!
Achieve improved approximation factors via ε-nets (using the Brönnimann-Goodrich technique / LP relaxation): an ε-net of size f(1/ε) yields a set cover of size O(f(OPT)).

ε-nets for dual range spaces
An ε-net for (R, X*) is a subset N ⊆ R that covers all points at depth ≥ ε|R|, where depth(p) = number of ranges that cover p ∈ X.
An ε-net is a set cover for all the deep points.
Example: intervals and points on the real line: |N| = 1/ε.

Extensions to axis-parallel boxes in 3-space
Use similar machinery, with s = c r log log r, and a 3-level range tree decomposition (by x-order, y-order, and z-order).
At each fixed triple-level of the tree, we have a subdivision of space into (clipped) orthants.

Axis-parallel boxes in 3-space
Fix an orthant ω. Consider the points in ω, and the set ℳ_ω of all maximal S-empty boxes anchored at the apex of ω.
All these boxes grow from a common point. They behave as maximal S-empty orthants!
Claim: |ℳ_ω| = O(s).
E{|⋃_ω ℳ_ω|} = E{|ℳ|} = O(s log^3 s).
The expected size of the ε-net is O((1/ε) log log (1/ε)).
