2IMA20 Algorithms for Geographic Data Spring 2016 Lecture 7: Labeling.

1 2IMA20 Algorithms for Geographic Data Spring 2016 Lecture 7: Labeling

2 Label placement: positioning text with point, line, and area features on a map. [Figure: map with the labels Utrecht, Nieuwegein, Houten, Culemborg, Zeist]

3 Cartographic criteria. Placement: no overlap, no ambiguity, readable, non-obscuring. Other criteria: harmony in fonts, good choice of color, size of text corresponds to importance. [Figure: map with the labels Utrecht, Zeist, Bunnik, De Bilt, Odijk, Driebergen, Doorn]

4 The label placement problem: compute where the labels should be placed. The problem assumes that font, text, typeface, color, etc. are given; that all map features other than labels are given; and that only the position of the labels is not given.

5 Point label placement. Given a set of n points to be labeled, each with a text, where a label is a rectangle (bounding box). Objective: label as many points as possible without overlap, where each label must have a corner at its point (NP-hard). [Figure: points labeled Utrecht and Zeist]

6 Point label placement. Common: the 4-position or the 8-position model. Solution methods include greedy, dynamic programming, simulated annealing, genetic algorithms, and LP-based branch & cut. [Figure: the 4-position and 8-position models]
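The 4-position model can be sketched in a few lines. This is only an illustration: the `Rect` type and the function name `four_position_candidates` are hypothetical names introduced here, not taken from the lecture.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """Axis-parallel rectangle: lower-left (x1, y1) and upper-right (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float


def four_position_candidates(px: float, py: float, w: float, h: float) -> List[Rect]:
    """Return the four w-by-h label rectangles that have a corner at (px, py)."""
    return [
        Rect(px, py, px + w, py + h),      # point is the lower-left corner
        Rect(px - w, py, px, py + h),      # point is the lower-right corner
        Rect(px, py - h, px + w, py),      # point is the upper-left corner
        Rect(px - w, py - h, px, py),      # point is the upper-right corner
    ]
```

Each point feature contributes four candidate rectangles; the labeling problem then asks for a largest conflict-free choice with at most one candidate per point.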

7 Sliding labels. The label may touch the point anywhere on its (top or bottom) boundary. Not yet discretized. [Figure: the 4-slider model and the 2-slider model]

8 Example

9 Sliding labels vs. fixed position labels. How many more points can potentially be labeled? Is there an efficient (heuristic) algorithm for label placement in the slider model? [Figure: example where 4 labels is the optimum in the 4-position model and 6 labels is the optimum in the 2-slider or 4-slider model]

10 Sliding labels vs. fixed position labels. Lemma: For unit-size squares, the 2-slider model sometimes allows twice as many labels as the 4-position model, but never more than twice.

11 Sliding labels vs. fixed position labels. Lemma: For unit-size squares, the 2-slider model sometimes allows twice as many labels as the 4-position model, but never more than twice. Proof: Consider an optimal labeling in the 2-slider model. At least half of the labels intersect the odd lines or the even lines. Slide these into corner position. ➨ Never more than twice as many.

12 Sliding labels vs. fixed position labels. [Figure: a 2-slider solution next to a 4-position solution] ➨ Sometimes twice as many.

13 Sliding labels vs. fixed position labels. The 2-slider model can sometimes label twice as many points as the 4-position model, but never more than twice. The 4-slider model can sometimes label twice as many points as the 2-slider model, but never more than twice. The 4-slider model can sometimes label 1½ times as many points as any fixed position model. (… but we cannot label optimally in any model …)

14 Maximum non-intersecting subset. 4-position: if the four label positions of a point are considered to intersect each other, then the problem becomes finding a maximum non-intersecting subset of rectangles. Also known as: maximum independent set (MIS) in rectangle intersection graphs.
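A minimal sketch of the rectangle intersection graph behind this formulation. Assumptions made here: rectangles are tuples (x1, y1, x2, y2), labels that only touch do not count as overlapping, and the helper names `overlaps` and `conflict_graph` are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the interiors of a and b intersect (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def conflict_graph(cands: List[Tuple[int, Rect]]) -> Dict[int, Set[int]]:
    """cands[i] = (point id, rectangle).

    Two candidates are adjacent if they belong to the same point or if their
    rectangles overlap; an independent set is then a valid labeling.
    """
    adj: Dict[int, Set[int]] = {i: set() for i in range(len(cands))}
    for i, j in combinations(range(len(cands)), 2):
        (pi, ri), (pj, rj) = cands[i], cands[j]
        if pi == pj or overlaps(ri, rj):
            adj[i].add(j)
            adj[j].add(i)
    return adj
```

The quadratic pairwise test is only for illustration; it says nothing about how the graph would be built efficiently.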

15 Simple heuristics. Assume labels have equal height but varying width. Heuristic 1: choose any label, eliminate the intersecting candidates, and repeat. Approximation factor?

16 First heuristic. Approximation factor: Θ(1/n).

17 Second heuristic. Choose the shortest label, eliminate the intersecting candidates, and repeat. Approximation factor?

18 Second heuristic. Any chosen label can eliminate many candidate labels, but every eliminated label contains a corner of the chosen label! The chosen label together with the intersected (eliminated) rectangles has an MIS of size ≤ 4 ➨ ¼-approximation (tight).
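The second heuristic can be sketched as follows, assuming equal-height rectangles given as (x1, y1, x2, y2) tuples and reading "shortest" as smallest width. The function names are hypothetical, and the linear scan over chosen labels is a simplification.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the interiors of a and b intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def shortest_first(rects: List[Rect]) -> List[Rect]:
    """Repeatedly pick the narrowest remaining rectangle and drop its conflicts.

    Scanning in order of increasing width and skipping anything that overlaps
    an already chosen rectangle is equivalent to 'choose, eliminate, repeat'.
    """
    chosen: List[Rect] = []
    for r in sorted(rects, key=lambda r: r[2] - r[0]):
        if all(not overlaps(r, c) for c in chosen):
            chosen.append(r)
    return chosen
```

For example, with one wide rectangle spanning two narrow ones, the two narrow rectangles are chosen and the wide one is eliminated.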

19 Factor-½ approximation algorithm. Assume labels have equal height but varying width. A greedy, left-to-right placement gives a ½-approximation (to follow).

20 Greedy algorithm (4-position)
1. Always choose the label with the leftmost right side
2. Remove labels that cannot be placed anymore
3. Repeat
[Figure: 4-position example; legend: placed label, not place-able, not yet placed, not yet placed with leftmost right edge]

21 Greedy algorithm (4-position). Reference point: the lower left corner of a label; each point feature has 4 reference points. Use efficient data structures to maintain candidate reference points; only maintain candidates for which the label doesn’t intersect any placed label. [Figure: reference points]

22 Greedy algorithm (4-position). The algorithm discards all useless candidates immediately after a new label is placed. [Figure: the leftmost label and the regions where no reference points can lie]

23 Greedy algorithm (4-position). Heap: data structure that stores all reference points sorted by “x-coordinate + label width” ➨ to find the leftmost candidate. Priority search tree: data structure that contains all reference points of candidates ➨ to find useless candidates after a placement.

24 Greedy algorithm (4-position)
1. Get reference point p with minimum “x-coordinate + label width” from the heap
2. Place the label at reference point p
3. Search in the priority search tree for all reference points at which the label cannot be placed anymore
4. Delete these from the heap and the priority search tree
[Figure: region marked “no candidates here”]

25 Greedy (4-position), efficiency. The data structures allow each candidate label position to be handled in O(log n) time. Overall running time: O(n log n).
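The greedy rule itself (without the heap and priority search tree) can be sketched as below. This is a simplified quadratic-time illustration, not the O(n log n) implementation from the slides: sorting by right edge stands in for the heap, and a linear scan over placed labels stands in for the priority search tree. The candidate representation (point id, rectangle) and the function names are assumptions.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Candidate = Tuple[int, Rect]              # (point id, label rectangle)


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the interiors of a and b intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def greedy_leftmost_right_edge(cands: List[Candidate]) -> List[Candidate]:
    """Place candidates in order of increasing right edge (x + label width).

    A candidate is skipped if its point is already labeled or if it overlaps
    a placed label, which mirrors 'remove labels that cannot be placed anymore'.
    """
    placed: List[Candidate] = []
    for pid, rect in sorted(cands, key=lambda c: c[1][2]):
        if all(pid != q and not overlaps(rect, r) for q, r in placed):
            placed.append((pid, rect))
    return placed
```

The output is the same as that of the heap-based procedure up to tie-breaking among equal right edges; only the running time differs.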

26 Greedy (4-position), approximation. Factor ½-approximation: what’s the maximum number of labels we could have placed among R and the label candidates intersecting R? [Figure: chosen label R with intersecting candidates]

27 Greedy (4-position), approximation. All candidates that intersect R contain the upper right or lower right corner of R. A candidate that intersects R without containing one of these corners cannot exist, because R is the leftmost non-chosen label. Hence, the maximum non-intersecting subset of R and the intersected candidates has size at most 2. We choose 1, so the approximation factor is ½.

28 Labels with varying heights. Maximum non-intersecting subset in a set of axis-parallel rectangles. The leftmost rectangle can intersect a large independent set ➨ the heuristic gives approximation factor Θ(1/n).

29 A PTAS for label placement. Fixed height rectangles; maximum independent set: polynomial time approximation scheme. More precisely: for any integer k > 1, a (k/(k+1))-approximation in time $O(n \log n + n^{2k-1})$:
2/3-approx. in $O(n^3)$ time
3/4-approx. in $O(n^5)$ time
4/5-approx. in $O(n^7)$ time
etc.

30 A PTAS for label placement
1. Optimal algorithm if all rectangles intersect one horizontal line
2. New ½-approximation algorithm
3. Dynamic programming for optimal sub-solutions
4. Shifting lemma to combine sub-solutions into a PTAS

31 PTAS for labels: one line. Assume all rectangles intersect a horizontal line. Greedy left to right (first one ending) gives an optimal solution. Note: equal height is not needed (height is irrelevant).
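The one-line case reduces to selecting non-overlapping intervals by earliest right end. A minimal sketch, assuming the line crosses the interior of every rectangle (so two rectangles overlap exactly when their x-intervals overlap) and that labels that only touch are allowed; `mis_on_line` is a hypothetical name.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def mis_on_line(rects: List[Rect]) -> List[Rect]:
    """Maximum non-overlapping subset of rectangles stabbed by one horizontal line.

    Greedy: scan by increasing right edge and keep a rectangle whenever it
    starts at or after the right edge of the last kept rectangle.
    """
    chosen: List[Rect] = []
    last_right = float("-inf")
    for r in sorted(rects, key=lambda r: r[2]):
        if r[0] >= last_right:
            chosen.append(r)
            last_right = r[2]
    return chosen
```

Only the x-extents are used, which matches the note that height is irrelevant in this case.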

32 PTAS for labels: ½-approx. Assume labels have unit height. Draw horizontal lines such that: the separation between any two lines is > 1; each line intersects at least one rectangle; each rectangle is intersected by some line.

33 PTAS for labels: ½-approx.
1. Compute the MIS of the rectangles intersected by each line
2. Take the union of the MISs for the lines $L_1, L_3, L_5, \ldots$
3. Take the union of the MISs for the lines $L_2, L_4, L_6, \ldots$
4. Return the larger of the two unions

34 PTAS for labels: ½-approx. Why a ½-approximation? The combined MIS for the lines $L_1, L_3, L_5, \ldots$ is optimal for the rectangles intersecting those lines, and the combined MIS for the lines $L_2, L_4, L_6, \ldots$ is also optimal for its rectangles. The pigeon-hole principle says that the lines $L_1, L_3, L_5, \ldots$ or the lines $L_2, L_4, L_6, \ldots$ must contain at least half of the optimal MIS.
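The ½-approximation can be sketched as below, under assumptions that are not in the slides: rectangles are (x1, y1, x2, y2) tuples of height exactly 1, they are treated as closed sets (labels that touch count as intersecting), and each new line is drawn through the top edge of the lowest rectangle not yet stabbed. All function names are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with y2 - y1 == 1


def stab_with_lines(rects: List[Rect]) -> List[List[Rect]]:
    """Group rectangles by stabbing line, from bottom to top.

    A new line is opened when a rectangle lies strictly above the current
    line, so consecutive lines are more than 1 apart.
    """
    groups: List[List[Rect]] = []
    line_y = None
    for r in sorted(rects, key=lambda r: r[1]):
        if line_y is None or r[1] > line_y:
            line_y = r[3]          # line through the top of this rectangle
            groups.append([])
        groups[-1].append(r)
    return groups


def mis_on_line(group: List[Rect]) -> List[Rect]:
    """Greedy by right edge; strict test because rectangles are closed here."""
    chosen: List[Rect] = []
    last_right = float("-inf")
    for r in sorted(group, key=lambda r: r[2]):
        if r[0] > last_right:
            chosen.append(r)
            last_right = r[2]
    return chosen


def half_approx(rects: List[Rect]) -> List[Rect]:
    """Return the larger of the odd-line and even-line combined solutions."""
    per_line = [mis_on_line(g) for g in stab_with_lines(rects)]
    odd = [r for g in per_line[0::2] for r in g]
    even = [r for g in per_line[1::2] for r in g]
    return odd if len(odd) >= len(even) else even
```

Rectangles on lines $L_i$ and $L_{i+2}$ cannot intersect because those lines are more than 2 apart, which is why the per-line solutions of the odd (or even) lines can simply be concatenated.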

35 PTAS for labels: two lines. Can we compute a MIS for a set of rectangles intersected by two horizontal lines? Yes, using dynamic programming in $O(n^3)$ time.

36 PTAS for labels: two lines. Assuming this result, we compute the OPT-MIS of $L_1 \cup L_2$ and $L_2 \cup L_3$ and $L_3 \cup L_4$ and … [Figure: horizontal lines $L_1, \ldots, L_6$]

37 PTAS for labels: two lines. Note that the solutions for $L_1 \cup L_2$ and for $L_4 \cup L_5$ cannot have intersections. [Figure: horizontal lines $L_1, \ldots, L_6$]

42 PTAS for labels: two lines
1. Compute the OPT-MIS for $L_1 \cup L_2$, $L_4 \cup L_5$, $L_7 \cup L_8$, …
2. Compute the OPT-MIS for $L_2 \cup L_3$, $L_5 \cup L_6$, $L_8 \cup L_9$, …
3. Compute the OPT-MIS for $L_1$, $L_3 \cup L_4$, $L_6 \cup L_7$, …
4. Choose the largest solution
[Figure: lines numbered 1, 2, 3, …, 9, …]

43 PTAS for labels: two lines. Consider the real MIS M. Claim: at least one of the 3 sub-problems contains at least 2/3 of the rectangles of M. Why? Let $M_i \subseteq M$ be those rectangles of M that intersect line $L_i$. The rectangles of any $M_i$ are considered in 2 out of 3 sub-problems. [Figure: lines numbered 1, 2, 3, …, 9, …]

44 PTAS for labels: two lines. Each of our 3 solutions is optimal for its sub-problem. For example, solution 1 is optimal for $L_1 \cup L_2$, $L_4 \cup L_5$, $L_7 \cup L_8$, …, so it is at least as large as $|M_1| + |M_2| + |M_4| + |M_5| + |M_7| + \ldots$ In total:
$|M_1| + |M_2| + |M_4| + |M_5| + |M_7| + \ldots \le |\text{solution 1}|$
$|M_2| + |M_3| + |M_5| + |M_6| + |M_8| + \ldots \le |\text{solution 2}|$
$|M_1| + |M_3| + |M_4| + |M_6| + |M_7| + \ldots \le |\text{solution 3}|$
Adding the three inequalities:
$2|M_1| + 2|M_2| + 2|M_3| + 2|M_4| + 2|M_5| + 2|M_6| + 2|M_7| + \ldots = 2|M| \le |\text{solution 1}| + |\text{solution 2}| + |\text{solution 3}|$

45 PTAS for labels: two lines. $2|M| \le |\text{solution 1}| + |\text{solution 2}| + |\text{solution 3}|$ ➨ at least one of solutions 1, 2, and 3 must have size $|\text{solution } i| \ge 2|M|/3$ (pigeon-hole principle) ➨ 2/3-approximation.

46 PTAS for labels: k lines. The rectangles of every line $L_i$ are considered in k out of k + 1 sub-problems. [Figure: the lines 1, 2, 3, … grouped in k + 1 different ways]

47 PTAS for labels: k lines. We get k + 1 solutions from k + 1 sub-problems whose summed size is at least k|M| (= k times OPT). So one of the sub-problems gives a solution of size at least (k / (k + 1)) · |M| ➨ (k / (k + 1))-approximation for any integer k > 0 ➨ (1 − ε)-approximation for any real ε > 0.
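The counting argument can be checked numerically. The sketch below assumes that the sub-problem with shift s leaves out exactly the lines $L_i$ with i mod (k + 1) = s, which matches the three groupings listed for k = 2; the function name `shifted_sums` and the sample sizes are made up for illustration.

```python
from typing import List


def shifted_sums(m: List[int], k: int) -> List[int]:
    """m[i-1] = |M_i|. Return, for each of the k+1 shifts, the number of
    rectangles of M that fall on the lines kept by that sub-problem."""
    return [
        sum(size for i, size in enumerate(m, start=1) if i % (k + 1) != s)
        for s in range(k + 1)
    ]


sizes = [3, 1, 4, 1, 5, 9, 2]   # made-up values of |M_1|, ..., |M_7|
k = 2
sums = shifted_sums(sizes, k)
total = sum(sizes)
# Every |M_i| is counted in exactly k of the k+1 sums.
print(sums, sum(sums) == k * total, max(sums) * (k + 1) >= k * total)
```

Because each sub-problem is solved optimally, its solution is at least as large as the corresponding sum, so the largest solution has size at least k/(k + 1) times |M|.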

48 PTAS by optimal sub-solutions. Shifting strategy (Hochbaum & Maass, 1985), introduced for covering and packing problems in VLSI design. Choose an integer k (for a (1 − 1/k)-approximation). Partition the problem into “narrow” sub-problems that can be solved optimally in time O(f(n, k)) (polynomial in n) and can be combined into one optimal solution to a “large” sub-problem. Use a scheme of partitions into “narrow” sub-problems (each solution part must occur as a candidate solution part in many of the partitions in the scheme).

49 Optimal labeling: 2 lines
1. Normalize to integer coordinates
2. Set up a recurrence for the optimal solution
3. Use arrays to compute the recurrence (to re-use solutions to sub-problems) ➨ dynamic programming

50 Optimal labeling: 2 lines. Normalization:
1. Sort all left and right sides by x-coordinate
2. Normalize them to 0, 1, 2, …
3. Sort all bottom and top sides by y-coordinate
4. Normalize them to 0, 1, 2, …
[Figure: rectangles before and after normalization, with coordinates 0 to 7]
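The normalization step amounts to replacing coordinates by their ranks. A minimal sketch, assuming rectangles are (x1, y1, x2, y2) tuples and that equal coordinates receive the same rank (the slides do not say how ties are handled); `normalize` is a hypothetical name.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def normalize(rects: List[Rect]) -> List[Tuple[int, int, int, int]]:
    """Replace every x- and y-coordinate by its rank 0, 1, 2, ... among the
    distinct left/right sides and bottom/top sides, respectively."""
    xs = sorted({x for r in rects for x in (r[0], r[2])})
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    x_rank = {x: i for i, x in enumerate(xs)}
    y_rank = {y: i for i, y in enumerate(ys)}
    return [(x_rank[r[0]], y_rank[r[1]], x_rank[r[2]], y_rank[r[3]]) for r in rects]
```

The ranks preserve the relative order of all sides, so intersections between rectangles are unchanged while the coordinates become small integers that can index an array.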

51 Optimal labeling: 2 lines. Set up recurrence: define A(p, q, t) to be the optimal number of rectangles in a certain sub-region. [Figure: sub-region with parameters p, q, t]

52 Optimal labeling: 2 lines. Set up recurrence: define A(p, q, t) to be the optimal number of rectangles in a certain sub-region: left of the green polyline. [Figure: polyline with parameters p, q, t; example with A(p, q, t) = 2]

53 Optimal labeling: 2 lines. Set up recurrence: how do we express A(p, q, t) in terms of A(·, ·, ·) with smaller indices? Case 1: no rectangle ends at q and is below t ➨ A(p, q, t) = A(p, q − 1, t). [Figure: polyline with p, q, t and q − 1]

54 Optimal labeling: 2 lines. Set up recurrence. Case 2: a rectangle ends at q and is below t and is right of p ➨ A(p, q, t) = max { A(p, q − 1, t), 1 + A(p, r, t) }. [Figure: polyline with p, q, t, r and q − 1]

55 Optimal labeling: 2 lines. Set up recurrence. Case 3: a rectangle ends at q and is below t and is not right of p ➨ A(p, q, t) = max { A(p, q − 1, t), 1 + A(p, r, u) }. [Figure: polyline with p, q, t, r, u and q − 1]

56 Optimal labeling: 2 lines. Set up recurrence: A(p, q, t) depends on A(…) with smaller indices only, and the value is determined by 1 of 3 cases (maximizing over a choice in 2 of them). A(p, q, t) can be determined in O(1) time if we know all A(…) with smaller indices. Note: if several rectangles end at q we must be a bit more careful.

57 Optimal labeling: 2 lines
1. Make an array A[max-p, max-q, max-t] with $\le n^3$ entries
2. Fill A[…] bottom up in O(1) time per entry ➨ the optimal solution for 2 lines is computed in $O(n^3)$ time
The total for the 2/3-approximation is also $O(n^3)$ time.

58 Optimal labeling: k lines. Similar, but we need a (2k − 1)-dimensional array $A(p_1, \ldots, p_k, t_1, \ldots, t_{k-1})$. [Figure: polyline with parameters $p_1, \ldots, p_k$ and $t_1, \ldots, t_{k-1}$]

59 Optimal labeling: k lines. The (2k − 1)-dimensional array has $O(n^{2k-1})$ entries. Each takes O(1) time to fill. The approximation factor is k / (k + 1) by the shifting strategy. Running time is $O(k n^{2k-1})$.
Literature: P. Agarwal, M. van Kreveld, and S. Suri. Label placement by maximum independent set in rectangles. Computational Geometry: Theory and Applications, 11:209-218, 1998.

