Divide and Conquer / Closest Pair of Points Yin Tat Lee


CSE 421: Divide and Conquer / Closest Pair of Points (Yin Tat Lee)

Stuff This Friday: a 20-minute course assessment run by the college. I won't be fired even if I get a terrible review, so you can say ANYTHING; it is only used to help me improve the teaching. HW4 and HW5 are out. (You will like HW5.) Midterm: Apr 30 (Monday). It covers Lecture 1 through Lecture 12 (Friday). There will be a midterm review where you can ask questions; I will be there, and I will bring an expert in grading (a TA) to help me.

Removing the Weight-Distinction Assumption Suppose edge weights are not distinct, and Kruskal's algorithm sorts the edges so that c(e_1) ≤ c(e_2) ≤ … ≤ c(e_m). Suppose Kruskal finds a tree T of weight c(T), but the optimal solution T* has cost c(T*) < c(T). Perturb each of the weights by a very small amount so that c′(e_1) < c′(e_2) < … < c′(e_m). If the perturbation is small enough, c′(T*) < c′(T). However, this contradicts the correctness of Kruskal's algorithm: the algorithm still finds T, and Kruskal's algorithm is correct when all weights are distinct.

Divide and Conquer Approach

Divide and Conquer We reduce a problem to several subproblems. Typically, each subproblem is at most a constant fraction of the size of the original problem. Recursively solve each subproblem, then merge the solutions. Examples: Mergesort, Binary Search, Strassen's Algorithm. (Figure: a recursion tree with log n levels: a root of size n, children of size n/2, then n/4, and so on.)

A Classical Example: Merge Sort (Figure: split the array into two halves of size n/2, recursively sort each half, then merge the sorted halves.)

Why Balanced Partitioning? An alternative "divide & conquer" algorithm: Split into n−1 and 1. Sort each subproblem. Merge them. Runtime: T(n) = T(n−1) + T(1) + n. Solution: T(n) = n + T(n−1) + T(1) = n + (n−1) + T(n−2) = n + (n−1) + (n−2) + T(n−3) = n + (n−1) + (n−2) + … + 1 = O(n²)

Reinventing Mergesort Suppose we've already invented Bubble-Sort, and we know it takes n² steps. Try just one level of divide & conquer: Bubble-Sort the first n/2 elements, Bubble-Sort the last n/2 elements, then merge the results. Time: 2T(n/2) + n = 2(n/2)² + n = n²/2 + n ≪ n². Almost twice as fast! D&C in a nutshell.

Reinventing Mergesort "The more dividing and conquering, the better": two levels of D&C would be almost 4 times faster, 3 levels almost 8, etc., even though the overhead is growing. Best is usually full recursion down to a small constant size (balancing "work" vs. "overhead"). In the limit: you've just rediscovered mergesort! Even unbalanced partitioning is good, but less good. Bubble-sort improved with a 0.1/0.9 split: (0.1n)² + (0.9n)² + n = 0.82n² + n. The 18% savings compounds significantly if you carry the recursion to more levels, actually giving O(n log n), but with a bigger constant. This is why Quicksort with a random splitter is good: badly unbalanced splits are rare, and not instantly fatal.
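Carried to the limit, the full recursion is mergesort itself. A minimal Python sketch (illustrative, not from the slides):

```python
def merge_sort(a):
    """Full-recursion divide & conquer: T(n) = 2T(n/2) + O(n) = O(n log n)."""
    if len(a) <= 1:          # small constant size: nothing to do
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # Merge the two sorted halves in linear time.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:]); out.extend(right[j:])
    return out

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```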

Finding the Root of a Function

Finding the Root of a Function Given a continuous function f and two points a < b such that f(a) ≤ 0 and f(b) ≥ 0, find an approximate root of f (a point c where f(c) = 0). f has a root in [a, b] by the intermediate value theorem. Note that the roots of f may be irrational, so we want to approximate a root to arbitrary precision! (Figure: f(x) = sin(x) − 100x + x⁴ on an interval [a, b].)

A Naive Approach Suppose we want an ε-approximation to a root. Divide [a, b] into n = (b−a)/ε intervals. For each interval, check whether f(x) ≤ 0 and f(x+ε) ≥ 0. This runs in time O(n) = O((b−a)/ε). Can we do faster?
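The naive scan in Python (a sketch; the test function x² − 2 is an illustrative choice, not from the slides):

```python
def naive_root(f, a, b, eps):
    """Scan all (b - a)/eps intervals; O((b - a)/eps) evaluations of f."""
    x = a
    while x < b:
        if f(x) <= 0 <= f(x + eps):  # sign change inside [x, x + eps]
            return x
        x += eps
    return b

r = naive_root(lambda t: t * t - 2, 0.0, 2.0, 1e-4)  # within 1e-4 of sqrt(2)
```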

Divide & Conquer (Binary Search) Bisection(a, b, ε): if b − a < ε then return a; else m ← (a+b)/2; if f(m) ≤ 0 then return Bisection(m, b, ε); else return Bisection(a, m, ε).
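A direct Python translation of the pseudocode (iterative rather than recursive, which changes nothing in the analysis; x² − 2 is again an illustrative test function):

```python
def bisection(f, a, b, eps):
    """Assumes f(a) <= 0 <= f(b); returns a point within eps of a root."""
    while b - a >= eps:
        m = (a + b) / 2
        if f(m) <= 0:
            a = m  # the invariant f(a) <= 0 <= f(b) keeps a root in [m, b]
        else:
            b = m  # root remains in [a, m]
    return a

root = bisection(lambda x: x * x - 2, 0.0, 2.0, 1e-9)  # approximates sqrt(2)
```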

Time Analysis Let n = (b−a)/ε be the number of intervals and m = (a+b)/2. Half of the intervals always lie to the left of m and half to the right, so T(n) = T(n/2) + O(1), i.e., T(n) = O(log n) = O(log((b−a)/ε)). Shameless plug: how about k dimensions? Yes, binary search still works for some functions; it takes O(k³ log(k/ε)).

Fast Exponentiation

Fast Exponentiation Power(a, n): Input: an integer n ≥ 0 and a number a. Output: aⁿ. Obvious algorithm: n−1 multiplications. Observation: if n is even, then aⁿ = a^(n/2) · a^(n/2).

Divide & Conquer (Repeated Squaring) Power(a, n) { if (n = 0) return 1; else if (n is even) compute x = Power(a, n/2) and return x·x; else compute x = Power(a, (n−1)/2) and return x·x·a } Time (# of multiplications): T(n) ≤ T(⌊n/2⌋) + 2 for n ≥ 1, and T(0) = 0. Solving it, we have T(n) ≤ T(⌊n/2⌋) + 2 ≤ T(⌊n/4⌋) + 2 + 2 ≤ … ≤ T(1) + 2 + ⋯ + 2 ≤ 2 log₂(n), since there are log₂(n) copies of 2.
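The recursion in runnable Python (a sketch mirroring the pseudocode):

```python
def power(a, n):
    """Repeated squaring: at most 2 * log2(n) multiplications."""
    if n == 0:
        return 1
    x = power(a, n // 2)  # n // 2 covers both the even and odd cases
    return x * x if n % 2 == 0 else x * x * a

print(power(3, 10))  # 59049
```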

Finding the Closest Pair of Points

Closest Pair of Points (general metric) Given n points and arbitrary distances between them, find the closest pair. Must look at all n(n−1)/2 pairwise distances, else any one you didn't check might be the shortest; i.e., you have to read the whole input. (Figure: one candidate pair, and all the rest of the n(n−1)/2 edges.)

Closest Pair of Points (1 dimension) Given n points on the real line, find the closest pair. E.g., given 11, 2, 4, 19, 4.8, 7, 8.2, 16, 11.5, 13, 1, find the closest pair. Fact: the closest pair is adjacent in the ordered list 1, 2, 4, 4.8, 7, 8.2, 11, 11.5, 13, 16, 19. So first sort, then scan adjacent pairs. Time: O(n log n) to sort, if needed, plus O(n) to scan adjacent pairs. Key point: we do not need to calculate distances between all pairs; we exploit geometry + ordering.
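Sort-then-scan in Python, using the slide's numbers (a sketch):

```python
def closest_pair_1d(points):
    """Sort (O(n log n)), then scan adjacent pairs (O(n))."""
    pts = sorted(points)
    # The closest pair must be adjacent in sorted order.
    gap, pair = min((b - a, (a, b)) for a, b in zip(pts, pts[1:]))
    return pair

print(closest_pair_1d([11, 2, 4, 19, 4.8, 7, 8.2, 16, 11.5, 13, 1]))  # (11, 11.5)
```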

Closest Pair of Points (2 dimensions) Given n points in the plane, find a pair with the smallest Euclidean distance between them. A fundamental geometric primitive: graphics, computer vision, geographic information systems, molecular modeling, air traffic control. A special case of nearest neighbor, Euclidean MST, Voronoi. Brute force: check all pairs of points in Θ(n²) time. Assumption: no two points have the same x coordinate.

Closest Pair of Points (2-dimensions) No single direction along which one can sort points to guarantee success!

Divide & Conquer Divide: draw a vertical line L with ≈ n/2 points on each side. Conquer: find the closest pair on each side, recursively. Combine the two solutions to find the closest pair overall, and return the best solution. How? (Figure: the left side's closest pair has distance 21, the right side's 12, and a pair crossing L has distance 8.)

Key Observation Suppose δ is the minimum distance over all pairs that lie entirely in the left or entirely in the right of L: δ = min(12, 21) = 12. Key observation: it suffices to consider points within δ of the line L. Almost the 1-D problem again: sort the points in the 2δ-strip by their y coordinate. (Figure: δ = 12; the strip points are numbered 1–7 by y coordinate.)

Almost 1D Problem Partition each side of L into δ/2 × δ/2 squares. Claim: no two points lie in the same δ/2 × δ/2 box. Proof: such points would be within √((δ/2)² + (δ/2)²) = δ·√(1/2) ≈ 0.7δ < δ. Let s_i have the i-th smallest y-coordinate among points in the 2δ-width strip. Claim: if |i − j| > 11, then the distance between s_i and s_j is > δ. Proof: there are only 11 boxes within δ of y(s_i).

Closest Pair (2 dimensions) if (n ≤ 2) return |p₁ − p₂|. Compute a separation line L such that half the points are on one side and half on the other side. δ₁ = Closest-Pair(left half); δ₂ = Closest-Pair(right half); δ = min(δ₁, δ₂). Delete all points further than δ from the separation line L. Sort the remaining points p[1]…p[m] by y-coordinate. for i = 1, 2, …, m: for k = 1, 2, …, 11: if i + k ≤ m: δ = min(δ, distance(p[i], p[i+k])). return δ.

Closest Pair Analysis Let D(n) be the number of pairwise distance calculations in the Closest-Pair algorithm: D(n) ≤ 1 if n = 1, and D(n) ≤ 2D(n/2) + 11n otherwise, ⇒ D(n) = O(n log n). BUT, that's only the number of distance calculations. What if we counted running time? T(n) ≤ 1 if n = 1, and T(n) ≤ 2T(n/2) + O(n log n) otherwise, ⇒ T(n) = O(n log² n).

Closest Pair (2 dimensions), Improved if (n ≤ 2) return |p₁ − p₂|. Compute a separation line L such that half the points are on one side and half on the other side. (δ₁, P₁) = Closest-Pair(left half); (δ₂, P₂) = Closest-Pair(right half); δ = min(δ₁, δ₂). P_sorted = merge(P₁, P₂) (merge the two halves by y-coordinate, as in mergesort). Let q be the points (in the order of P_sorted) that are within δ of the line L. for i = 1, 2, …, m: for k = 1, 2, …, 11: if i + k ≤ m: δ = min(δ, distance(q[i], q[i+k])). return δ and P_sorted. T(n) ≤ 1 if n = 1, and T(n) ≤ 2T(n/2) + O(n) otherwise, ⇒ T(n) = O(n log n).
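For concreteness, a Python sketch of the algorithm. This version re-sorts the strip by y inside each call, so it matches the O(n log² n) variant; carrying the y-sorted lists upward as on this slide removes the extra log factor. The helper names are illustrative, not from the slides.

```python
from math import dist, inf

def closest_pair(points):
    """Smallest Euclidean distance among the points (assumes distinct x's)."""
    px = sorted(points)                     # sort once by x-coordinate

    def solve(px):
        n = len(px)
        if n <= 3:                          # base case: brute force
            return min((dist(p, q) for i, p in enumerate(px) for q in px[i+1:]),
                       default=inf)
        mid = n // 2
        x_mid = px[mid][0]                  # separation line L
        d = min(solve(px[:mid]), solve(px[mid:]))
        # Keep only points within d of L, sorted by y-coordinate.
        strip = sorted((p for p in px if abs(p[0] - x_mid) < d),
                       key=lambda p: p[1])
        for i, p in enumerate(strip):
            for q in strip[i+1:i+12]:       # only the next 11 can be closer
                d = min(d, dist(p, q))
        return d

    return solve(px)
```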