2IMA20 Algorithms for Geographic Data Spring 2016 Lecture 7: Labeling.

Label placement
• Positioning text with point, line, and area features on a map.
[Map example: Utrecht, Nieuwegein, Houten, Culemborg, Zeist]

Cartographic criteria
Placement:
• no overlap
• no ambiguity
• readable
• non-obscuring
Other criteria:
• harmony in fonts
• good choice of color
• size of text corresponds to importance
[Map example: Utrecht, Zeist, Bunnik, De Bilt, Odijk, Driebergen, Doorn]

The label placement problem
• Compute where the labels should be placed
• The label placement problem assumes that:
  - the font, text, typeface, color, etc. are given
  - all map features other than labels are given
  - only the positions of the labels are not given

Point label placement
• A set of n points is to be labeled, each with a text, where a label is a rectangle (the bounding box of the text)
• Objective: label as many points as possible without overlap, where each label must have a corner at its point (NP-hard)
[Map example: Utrecht, Zeist]

Point label placement
• Common: the 4-position or the 8-position model
• Solution methods: e.g. greedy, dynamic programming, simulated annealing, genetic algorithms, LP-based branch & cut
[Figure: the 4-position and the 8-position model]
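As a small illustration of the 4-position model, the sketch below lists the four candidate rectangles for one point. The function name, the tuple layout (x_min, y_min, x_max, y_max), and the parameters w and h are assumptions made for this example only; they do not come from the slides.

```python
from typing import List, Tuple

# Assumed rectangle layout: (x_min, y_min, x_max, y_max)
Rect = Tuple[float, float, float, float]


def four_position_candidates(px: float, py: float, w: float, h: float) -> List[Rect]:
    """Return the four w-by-h label rectangles that have a corner at (px, py)."""
    return [
        (px, py, px + w, py + h),      # point is the lower left corner
        (px - w, py, px, py + h),      # point is the lower right corner
        (px, py - h, px + w, py),      # point is the upper left corner
        (px - w, py - h, px, py),      # point is the upper right corner
    ]
```

Each point contributes four candidates, so n points give 4n candidate rectangles in total.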

Sliding labels
• 4-slider model: the label may touch the point anywhere on its boundary
• 2-slider model: the label may touch the point anywhere on its top or bottom edge
• The candidate positions are not yet discretized
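The 2-slider model can be pictured as a one-parameter family of label positions. The following is a minimal sketch under assumed conventions: the parameter t in [0, 1] and the argument side are hypothetical names, and the rectangle layout is (x_min, y_min, x_max, y_max).

```python
def two_slider_label(px, py, w, h, t, side="bottom"):
    """Label rectangle in the 2-slider model (illustrative parametrization).

    t = 0 puts the point at the left end of the chosen edge, t = 1 at the
    right end. side="bottom" means the point lies on the label's bottom
    edge, side="top" means it lies on the label's top edge.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if side not in ("bottom", "top"):
        raise ValueError("side must be 'bottom' or 'top'")
    x_min = px - t * w
    y_min = py if side == "bottom" else py - h
    return (x_min, y_min, x_min + w, y_min + h)
```

With t restricted to 0 or 1 this reduces to the four corner positions of the 4-position model.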

Sliding labels vs. fixed position labels
• How many more points can potentially be labeled?
• Is there an efficient (heuristic) algorithm for label placement in the slider model?
[Figure: example in which 4 labels is the optimum for the 4-position model and 6 labels is the optimum for the 2-slider or 4-slider model]

Sliding labels vs. fixed position labels
Lemma: For unit-size squares, the 2-slider model sometimes allows twice as many labels as the 4-position model, but never more than twice.
Proof (never more than twice): Consider an optimal labeling in the 2-slider model. At least half of its labels intersect the odd lines, or at least half intersect the even lines (see figure). Slide these labels into a corner position; this gives a valid 4-position labeling with at least half as many labels.
➨ Never more than twice as many

Sliding labels vs. fixed position labels
[Figure: a 2-slider solution next to a 4-position solution for the same points]
➨ Sometimes twice as many

Sliding labels vs. fixed position labels
• The 2-slider model can sometimes label twice as many points as the 4-position model, but never more than twice
• The 4-slider model can sometimes label twice as many points as the 2-slider model, but never more than twice
• The 4-slider model can sometimes label 1½ times as many points as any fixed position model
(… but we cannot label optimally in any model …)

Maximum non-intersecting subset
• 4-position: if the four candidate labels of a point are regarded as intersecting each other, label placement becomes the problem of finding a maximum non-intersecting subset of rectangles
• Also known as: maximum independent set (MIS) in rectangle intersection graphs
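To make the problem statement concrete, here is a brute-force sketch that finds a maximum non-intersecting subset by trying all subsets. It is exponential and only meant as a reference for tiny inputs. The helper names are hypothetical, rectangles are assumed to be (x_min, y_min, x_max, y_max) tuples, and rectangles that merely touch are assumed not to intersect.

```python
from itertools import combinations


def intersects(a, b):
    """True if the interiors of rectangles a and b overlap (touching is allowed)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def max_independent_rects(rects):
    """Exhaustive search for a largest pairwise non-intersecting subset."""
    for size in range(len(rects), 0, -1):
        for subset in combinations(rects, size):
            if all(not intersects(a, b) for a, b in combinations(subset, 2)):
                return list(subset)
    return []
```

Such an exhaustive search is useful mainly for checking the output of the heuristics on small test cases.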

Simple heuristics
• Assume labels have equal height but varying width
• Heuristic 1: choose any label, eliminate the intersecting candidates, and repeat
• Approximation factor?

First heuristic
• Approximation factor: Θ(1/n), because a single wide label can intersect Θ(n) pairwise disjoint narrow labels

Second heuristic
• Choose the shortest label, eliminate the intersecting candidates, and repeat
• Approximation factor?

Second heuristic
• Any chosen label can eliminate many candidate labels, but every eliminated label contains a corner of the chosen label!
• The chosen label together with the intersected (eliminated) rectangles has an MIS of size ≤ 4
➨ ¼-approximation (tight)
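The difference between the two heuristics can be seen on a small instance. The sketch below is an illustration, not the lecture's implementation: greedy_select and width are hypothetical names, rectangles are (x_min, y_min, x_max, y_max) tuples, and touching rectangles are assumed not to intersect. Processing the labels in a fixed order and skipping those that hit an already chosen label is equivalent to "choose, eliminate, repeat".

```python
def intersects(a, b):
    """True if the interiors of rectangles a and b overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def width(r):
    return r[2] - r[0]


def greedy_select(rects, key=None):
    """Choose labels one by one, skipping any that hits a chosen label.

    key=None keeps the input order (heuristic 1, "choose any label");
    key=width processes the shortest labels first (heuristic 2).
    """
    order = list(rects) if key is None else sorted(rects, key=key)
    chosen = []
    for r in order:
        if all(not intersects(r, c) for c in chosen):
            chosen.append(r)
    return chosen


# One wide label that overlaps ten disjoint narrow labels of the same height.
long_label = (0, 0, 10, 1)
short_labels = [(i, 0.5, i + 1, 1.5) for i in range(10)]
bad = greedy_select([long_label] + short_labels)             # picks only the wide label
good = greedy_select([long_label] + short_labels, key=width)  # picks the ten narrow labels
```

On this instance an unlucky arbitrary choice yields 1 label, while shortest-first yields 10.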

Factor-½ approximation algorithm
• Assume labels have equal height but varying width
• A greedy, left-to-right placement gives a ½-approximation (to follow)

Greedy algorithm (4-position)
1. Always choose the label with the leftmost right side
2. Remove labels that cannot be placed anymore
3. Repeat
[Figure: 4-position example; legend: placed label, not placeable, not yet placed, not yet placed with leftmost right edge]
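The three steps can be sketched in a few lines if efficiency is ignored. This is a simple quadratic-time illustration under assumptions not taken from the slides: features are given as (x, y, label_width) triples, all labels share the height h, touching labels are allowed, and the function name is hypothetical. Scanning the candidates in order of their right edge and skipping those that are no longer placeable has the same effect as removing them explicitly.

```python
def intersects(a, b):
    """True if the interiors of rectangles a and b overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def greedy_leftmost_right_edge(features, h):
    """Greedy 4-position labeling; returns {feature index: placed rectangle}."""
    cands = []
    for i, (x, y, w) in enumerate(features):
        for x0 in (x - w, x):          # label to the left or right of the point
            for y0 in (y - h, y):      # label below or above the point
                cands.append((x0 + w, i, (x0, y0, x0 + w, y0 + h)))
    cands.sort()                       # leftmost right edge first
    placed = {}
    for _, i, rect in cands:
        if i in placed:
            continue                   # this point already has a label
        if all(not intersects(rect, r) for r in placed.values()):
            placed[i] = rect
    return placed
```

For example, greedy_leftmost_right_edge([(0, 0, 2), (1, 0, 2)], h=1) labels both points.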

Greedy algorithm (4-position)
• Reference point: the lower left corner of a label; each point feature has 4 reference points
• Use efficient data structures to maintain the candidate reference points; only maintain candidates whose label does not intersect any placed label

Greedy algorithm (4-position)
• The algorithm discards all useless candidates immediately after a new label is placed
[Figure: the leftmost label and the regions where no reference points can lie]

Greedy algorithm (4-position)
• Heap: stores all reference points, ordered by “x-coordinate + label width” ➨ to find the leftmost candidate
• Priority search tree: contains all reference points of candidates ➨ to find useless candidates after a placement

Greedy algorithm (4-position)
1. Get the reference point p with minimum “x-coordinate + label width” from the heap
2. Place the label at reference point p
3. Search in the priority search tree for all reference points at which a label cannot be placed anymore
4. Delete these from the heap and the priority search tree
[Figure: region in which no candidates remain]
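The loop above might look as follows with Python's heapq module. This sketch departs from the slides in two labeled ways: the priority search tree is replaced by a plain linear scan (so the O(log n) bound per candidate is not achieved), and deletion from the heap is done lazily through an alive flag, because heapq has no delete operation. The function name and the input format, a list of (feature_id, rectangle) pairs, are assumptions.

```python
import heapq


def intersects(a, b):
    """True if the interiors of rectangles a and b overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def greedy_with_heap(candidates):
    """candidates: list of (feature_id, (x_min, y_min, x_max, y_max))."""
    # Key = x-coordinate of the reference point + label width = x_max.
    heap = [(rect[2], idx) for idx, (_, rect) in enumerate(candidates)]
    heapq.heapify(heap)
    alive = [True] * len(candidates)
    placed = {}
    while heap:
        _, idx = heapq.heappop(heap)
        if not alive[idx]:
            continue                   # lazily deleted earlier
        fid, rect = candidates[idx]
        placed[fid] = rect
        # Stand-in for the priority search tree query: linear scan for
        # candidates of the same feature or candidates hit by the new label.
        for j, (fj, rj) in enumerate(candidates):
            if alive[j] and (fj == fid or intersects(rect, rj)):
                alive[j] = False
    return placed
```

Swapping the linear scan for a real range-search structure would be the step needed to reach the running time claimed in the lecture.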

Greedy (4-position), efficiency
• The data structures allow each candidate label position to be handled in O(log n) time
• Overall running time: O(n log n)

Greedy (4-position), approximation
• Factor ½-approximation
• What is the maximum number of labels we could have placed among R and the label candidates that intersect it?
[Figure: label R]

Greedy (4-position), approximation
• All candidates that intersect R contain the upper right or lower right corner of R
• Hence, the maximum non-intersecting subset of R and the intersected candidates has size 2
• We choose 1, so the approximation factor is ½
[Figure: a candidate that ends to the left of R cannot exist, because R has the leftmost right edge among the remaining candidates]

Labels with varying heights
• Maximum non-intersecting subset in a set of axis-parallel rectangles
• The leftmost rectangle can intersect a large independent set ➨ the heuristic gives approximation factor Θ(1/n)

A PTAS for label placement
• Fixed height rectangles; maximum independent set: polynomial time approximation scheme
• More precisely: for any integer k > 1, a (k/(k+1))-approximation in time O(n log n + n^{2k-1})
  - 2/3-approximation in O(n^3) time
  - 3/4-approximation in O(n^5) time
  - 4/5-approximation in O(n^7) time
  - etc.

A PTAS for label placement
1. Optimal algorithm if all rectangles intersect one horizontal line
2. New ½-approximation algorithm
3. Dynamic programming for optimal sub-solutions
4. Shifting lemma to combine sub-solutions into a PTAS

PTAS for labels: one line
• Assume all rectangles intersect a horizontal line
• Greedy left to right (first one ending) gives an optimal solution
• Note: equal height is not needed (height is irrelevant)
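When all rectangles cross one horizontal line, two of them intersect exactly when their x-intervals overlap, so the problem is interval scheduling. A minimal sketch of the "first one ending" rule, with a hypothetical function name and (x_min, y_min, x_max, y_max) tuples, treating touching rectangles as compatible:

```python
def mis_on_line(rects):
    """Optimal independent set for rectangles that all cross one horizontal line.

    Only the x-extent matters: repeatedly take the rectangle whose right
    side comes first among those starting at or after the last chosen
    right side.
    """
    chosen = []
    last_right = float("-inf")
    for r in sorted(rects, key=lambda r: r[2]):
        if r[0] >= last_right:
            chosen.append(r)
            last_right = r[2]
    return chosen
```

The sort dominates the cost, so this runs in O(n log n) time.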

PTAS for labels: ½-approx.
• Assume labels have unit height
• Draw horizontal lines such that:
  - the separation between any two lines is > 1
  - each line intersects at least one rectangle
  - each rectangle is intersected by some line

PTAS for labels: ½-approx.
1. Compute the MIS of the rectangles on each line separately
2. Combine the MISs of the lines L_1, L_3, L_5, … into one solution
3. Combine the MISs of the lines L_2, L_4, L_6, … into a second solution
4. Return the larger of the two solutions
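The four steps can be sketched as below. Instead of computing explicit line positions, this illustration groups the rectangles by the line that would stab them: a new group starts whenever a rectangle's bottom is at least h above the bottom that opened the current group. That grouping rule, the function names, and the convention that touching rectangles do not intersect are assumptions of this sketch, not details from the slides.

```python
def mis_on_line(rects):
    """Optimal set for rectangles crossing a common horizontal line."""
    chosen, last_right = [], float("-inf")
    for r in sorted(rects, key=lambda r: r[2]):
        if r[0] >= last_right:
            chosen.append(r)
            last_right = r[2]
    return chosen


def half_approx(rects, h=1.0):
    """1/2-approximation for rectangles (x_min, y_min, x_max, y_max) of height h."""
    groups, base = [], None
    for r in sorted(rects, key=lambda r: r[1]):
        if base is None or r[1] >= base + h:
            base = r[1]                # this rectangle opens a new line
            groups.append([])
        groups[-1].append(r)
    per_line = [mis_on_line(g) for g in groups]
    # Groups two apart are vertically separated, so their sets can be merged.
    odd = [r for g in per_line[0::2] for r in g]    # lines L_1, L_3, L_5, ...
    even = [r for g in per_line[1::2] for r in g]   # lines L_2, L_4, L_6, ...
    return odd if len(odd) >= len(even) else even
```

All rectangles in one group share a common y-level, which is what allows the one-line greedy to be applied per group.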

PTAS for labels: ½-approx.
• Why a ½-approximation?
  - the combined MIS for the lines L_1, L_3, L_5, … is optimal for the rectangles intersected by those lines
  - the combined MIS for the lines L_2, L_4, L_6, … is optimal for the rectangles intersected by those lines
• By the pigeon-hole principle, the lines L_1, L_3, L_5, … or the lines L_2, L_4, L_6, … must contain at least half of the optimal MIS

PTAS for labels: two lines
• Can we compute an MIS for a set of rectangles intersected by two horizontal lines?
• Yes, using dynamic programming in O(n^3) time

PTAS for labels: two lines
• Assuming this result, we compute the OPT-MIS of L_1 ∪ L_2, of L_2 ∪ L_3, of L_3 ∪ L_4, and so on
[Figure: horizontal lines L_1, …, L_6]

PTAS for labels: two lines
• Note that the solution for L_1 ∪ L_2 and the solution for L_4 ∪ L_5 cannot have intersections
[Figure: horizontal lines L_1, …, L_6]

PTAS for labels: two lines
1. Compute the OPT-MIS for L_1 ∪ L_2, L_4 ∪ L_5, L_7 ∪ L_8, …
2. Compute the OPT-MIS for L_2 ∪ L_3, L_5 ∪ L_6, L_8 ∪ L_9, …
3. Compute the OPT-MIS for L_1, L_3 ∪ L_4, L_6 ∪ L_7, …
4. Choose the largest solution
[Figure: lines numbered 1, 2, 3, …, 9, … with the three groupings]

PTAS for labels: two lines
• Consider the real MIS M
• Claim: at least one of the 3 solutions contains at least 2/3 as many rectangles as M
• Why? Let M_i ⊆ M be those rectangles of M that intersect line L_i
• The rectangles of any M_i are considered in 2 out of 3 sub-problems

PTAS for labels: two lines
• Each solution is optimal for its own groups of lines (solution 1 for L_1 ∪ L_2, L_4 ∪ L_5, L_7 ∪ L_8, …), so it is at least as large as the part of M on those lines:

\[
\begin{aligned}
|M_1| + |M_2| + |M_4| + |M_5| + |M_7| + \dots &\le |\text{solution 1}| \\
|M_2| + |M_3| + |M_5| + |M_6| + |M_8| + \dots &\le |\text{solution 2}| \\
|M_1| + |M_3| + |M_4| + |M_6| + |M_7| + \dots &\le |\text{solution 3}| \\
2|M_1| + 2|M_2| + 2|M_3| + 2|M_4| + \dots = 2|M| &\le |\text{solution 1}| + |\text{solution 2}| + |\text{solution 3}|
\end{aligned}
\]

PTAS for labels: two lines
• 2|M| ≤ |solution 1| + |solution 2| + |solution 3|
➨ at least one of solutions 1, 2, and 3 must have size |solution i| ≥ 2|M| / 3 (pigeon-hole principle)
➨ 2/3-approximation

PTAS for labels: k lines
• The rectangles of every line L_i are considered in k out of k+1 sub-problems
[Figure: the k+1 shifted groupings of the lines 1, 2, …, k, k+1, k+2, …, 2k+1, 2k+2, 2k+3, …]

PTAS for labels: k lines
• We get k+1 solutions from k+1 sub-problems whose summed size is at least k|M| (= k times OPT)
• So one of the sub-problems gives a solution of size at least (k/(k+1)) |M|
➨ (k/(k+1))-approximation for any integer k > 0
➨ (1 - ε)-approximation for any real ε > 0
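The shifted groupings themselves are easy to enumerate. The sketch below assumes one particular indexing, which is a choice made here rather than something stated in the slides: grouping s leaves out every line whose number is congruent to s modulo k+1, and the remaining lines fall into runs of at most k consecutive lines. The function name is hypothetical.

```python
def shifted_groups(num_lines, k):
    """Return the k+1 groupings of lines 1..num_lines used by the shifting strategy.

    Grouping s skips every line i with i % (k + 1) == s; the lines between
    two skipped lines form one sub-problem of at most k consecutive lines.
    """
    schemes = []
    for s in range(k + 1):
        groups, current = [], []
        for i in range(1, num_lines + 1):
            if i % (k + 1) == s:
                if current:
                    groups.append(current)
                    current = []
            else:
                current.append(i)
        if current:
            groups.append(current)
        schemes.append(groups)
    return schemes
```

For k = 2 and nine lines this reproduces the three groupings of the two-line case: [[1, 2], [4, 5], [7, 8]], [[2, 3], [5, 6], [8, 9]], and [[1], [3, 4], [6, 7], [9]]. Every line is skipped in exactly one grouping and therefore appears in k of the k+1.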

PTAS by optimal sub-solutions
Shifting strategy (Hochbaum & Maass, 1985), introduced for covering and packing problems in VLSI design:
• Choose an integer k (for a (1 - 1/k)-approximation)
• Partition the problem into “narrow” sub-problems that can be solved optimally in time O(f(n, k)) (polynomial in n) and can be combined into one optimal solution to a “large” sub-problem
• Use a scheme of partitions into “narrow” sub-problems (each solution part must occur as a candidate solution part in many of the partitions in the scheme)

Optimal labeling: 2 lines
1. Normalize to integer coordinates
2. Set up a recurrence for the optimal solution
3. Use arrays to compute the recurrence (to re-use solutions to sub-problems)
➨ dynamic programming

Optimal labeling: 2 lines
Normalization:
1. Sort all left and right sides by x-coordinate
2. Normalize them to 0, 1, 2, …
3. Sort all bottom and top sides by y-coordinate
4. Normalize them to 0, 1, 2, …
[Figure: rectangles before and after normalization to coordinates 0, 1, …, 7]
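Normalization amounts to replacing every coordinate by its rank. A minimal sketch, assuming rectangles are (x_min, y_min, x_max, y_max) tuples and that equal coordinates should receive the same rank (the slides do not say how ties are handled); the function name is hypothetical.

```python
def normalize(rects):
    """Replace x- and y-coordinates by their ranks 0, 1, 2, ..."""
    xs = sorted({v for r in rects for v in (r[0], r[2])})
    ys = sorted({v for r in rects for v in (r[1], r[3])})
    x_rank = {v: i for i, v in enumerate(xs)}
    y_rank = {v: i for i, v in enumerate(ys)}
    return [(x_rank[r[0]], y_rank[r[1]], x_rank[r[2]], y_rank[r[3]]) for r in rects]
```

Ranks preserve the left-to-right and bottom-to-top order of all sides, so the intersection pattern of the rectangles is unchanged, and all coordinates become integers below 2n that can serve as array indices.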

Optimal labeling: 2 lines
Set up recurrence:
• Define A(p, q, t) to be the optimal number of rectangles in a certain sub-region: left of the polyline given by p, q, and t (green in the figure)
[Figure: example with A(p, q, t) = 2]

Optimal labeling: 2 lines
Set up recurrence: how can A(p, q, t) be expressed in A(., ., .) with smaller indices?
Case 1: no rectangle ends at q and is below t
➨ A(p, q, t) = A(p, q - 1, t)
[Figure: p, q, t, q - 1]

Optimal labeling: 2 lines
Set up recurrence
Case 2: a rectangle ends at q, is below t, and is right of p
➨ A(p, q, t) = max { A(p, q - 1, t), 1 + A(p, r, t) }
[Figure: p, q, t, r, q - 1]

Optimal labeling: 2 lines
Set up recurrence
Case 3: a rectangle ends at q, is below t, and is not right of p
➨ A(p, q, t) = max { A(p, q - 1, t), 1 + A(p, r, u) }
[Figure: p, q, t, r, u, q - 1]

Optimal labeling: 2 lines
• Set up recurrence: A(p, q, t) depends on A(…) with smaller indices only, and the value is determined by 1 of 3 cases (maximizing over a choice in 2 of them)
• A(p, q, t) can be determined in O(1) time if we know all A(…) with smaller indices
• Note: if several rectangles end at q we must be a bit more careful

Optimal labeling: 2 lines
1. Make an array A[max-p, max-q, max-t] with ≤ n^3 entries
2. Fill A[…] bottom up in O(1) time per entry
➨ the optimal solution for 2 lines is computed in O(n^3) time
• The total for the 2/3-approximation is also O(n^3) time

Optimal labeling: k lines
• Similar, but we need a (2k-1)-dimensional array: A(p_1, …, p_k, t_1, …, t_{k-1})
[Figure: p_1, …, p_k and t_1, t_2, …, t_{k-1}]

Optimal labeling: k lines
• The (2k-1)-dimensional array has O(n^{2k-1}) entries
• Each takes O(1) time to fill
• The approximation factor is k/(k+1) by the shifting strategy
• Running time is O(k n^{2k-1})

Literature
P. Agarwal, M. van Kreveld, and S. Suri. Label placement by maximum independent set in rectangles. Computational Geometry: Theory and Applications, 11:209-218, 1998.
