Algorithm Design Strategy: Divide and Conquer. Revisit: Notes on Brute-force approach
More examples using the Divide and Conquer method: Introduction to the D&C approach; Finding the closest pair of points; Quicksort; Matrix multiplication algorithm; Large integer multiplication; Convex hull problem
Divide and Conquer Approach
Concept: - D&C is a general strategy for algorithm design. It involves three steps:
(1) Divide an instance of a problem into one or more smaller instances.
(2) Conquer (solve) each of the smaller instances. Unless a smaller instance is sufficiently small, use recursion to do this.
(3) If necessary, combine the solutions to the smaller instances to obtain the solution to the original instance (e.g., Mergesort).
Approach: - Recursion (top-down approach)
Example 1 Binary search in a sorted array of size n. Worst-case time complexity of the recursive binary search: W(n) = W(n/2) + 1, W(1) = 1. Solving the recurrence (n a power of 2) gives W(n) = lg n + 1 ∈ Θ(lg n)
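The recurrence can be checked on a small case. The following is a minimal sketch, not the lecture's own code: the function name `binary_search` and the probe counter are assumptions made for illustration. It counts how many array elements are inspected, which matches W(n) = lg n + 1 when the key is larger than every item.

```python
def binary_search(a, key, lo=0, hi=None):
    """Return (index or -1, number of elements probed) for key in sorted list a."""
    if hi is None:
        hi = len(a) - 1
    if lo > hi:                      # empty range: key is absent
        return -1, 0
    mid = (lo + hi) // 2
    if a[mid] == key:
        return mid, 1
    if key < a[mid]:                 # recurse on the left half
        idx, probes = binary_search(a, key, lo, mid - 1)
    else:                            # recurse on the right half
        idx, probes = binary_search(a, key, mid + 1, hi)
    return idx, probes + 1

data = list(range(0, 16, 2))         # 8 sorted items: 0, 2, ..., 14
print(binary_search(data, 6))        # (3, 1)
print(binary_search(data, 100))      # (-1, 4): lg 8 + 1 = 4 probes
```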
Example 2 Finding the closest pair of points
- Task: solve the problem of finding the closest pair of points in a set of points. The set consists of points in the two-dimensional plane, each defined by an X and a Y coordinate.
- Given a set P of N points, find p, q ∈ P such that the distance d(p, q) is minimum
- Application:
- traffic control systems: a system for controlling air or sea traffic might need to know which two vehicles are too close in order to detect potential collisions.
- computational geometry
Example 2 Finding the closest pair of points
- The "closest pair" refers to the pair of points in the set that has the smallest Euclidean distance. Distance between points p1 = (x1, y1) and p2 = (x2, y2): d(p1, p2) = sqrt((x1 - x2)^2 + (y1 - y2)^2)
- If there are two identical points in the set, then the closest pair distance in the set will obviously be zero.
Example 2 (brute-force approach) Find the two closest points in a set of n points (in the two-dimensional Cartesian plane). Brute-force algorithm Compute the distance between every pair of distinct points and return the indexes of the points for which the distance is the smallest.
Closest pair problem (Brute-force approach)
Example 2 Finding the closest pair of points - Brute-force algorithm: compute all the distances d(p, q) and take the minimum. Time complexity: O(n^2), since there are n(n-1)/2 pairs of distinct points
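The brute-force idea can be sketched in a few lines. This is an illustrative sketch only; the function name `closest_pair_brute_force` and the sample points are assumptions, not part of the notes. It returns the smallest distance together with the indexes of the two points.

```python
from itertools import combinations
from math import dist

def closest_pair_brute_force(points):
    """Return (distance, i, j) for the closest pair in a list of (x, y) points."""
    best = (float("inf"), -1, -1)
    # Examine each of the n(n-1)/2 pairs of distinct indexes exactly once.
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        d = dist(p, q)               # Euclidean distance
        if d < best[0]:
            best = (d, i, j)
    return best

pts = [(0, 0), (5, 4), (3, 1), (6, 4.5), (9, 9)]
d, i, j = closest_pair_brute_force(pts)
print(i, j, round(d, 3))             # 1 3 1.118
```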
Notes on Brute-force approach
A straightforward approach, usually based directly on the problem’s statement and definitions of the concepts involved
Examples:
- Computing a^n (a > 0, n a nonnegative integer)
- Computing n!
- Multiplying two matrices
- Searching for a key of a given value in a list
Notes on Brute-force approach
Strengths:
- wide applicability
- simplicity
- yields reasonable algorithms for some important problems (e.g., matrix multiplication, sorting, searching, string matching)
Weaknesses:
- rarely yields efficient algorithms
- some brute-force algorithms are unacceptably slow
- not as constructive as some other design techniques
Example 2 (D&C) Finding the closest pair of points - divide and conquer method:
- Divide: sort the points by x-coordinate; draw a vertical line so that roughly n/2 points lie on each side
- Conquer: find the closest pair on each side recursively
- Combine: find the closest pair with one point on each side
- Return: the best of the three solutions
Example 2 Find the closest pair in a strip of width 2d, where d is the smaller of the closest-pair distances found in the left and right halves. Example: d = min(12, 21) = 12
Example 2 For each point P in the strip, at most 4 points can reside within the left d × d square, and at most 8 points can reside within the d × 2d rectangle.
Method 1: sort all the points within the entire strip in y order. From top to bottom, for every point, check the distances to at most 4 to 6 points on the other side.
Method 2: sort all the points of the left strip, scan them, and compute the distances from every left point P (blue dot) to up to 6 points of the right strip (red dots).
Example 2 Algorithm: Note: if there are no coincident points, we can check 6 points.
Time complexity:
- If the strip points are sorted in every recursive call: T(n) = 2T(n/2) + O(n lg n), so T(n) = O(n lg^2 n)
- If the strip points are sorted by merging two sorted lists: T(n) = 2T(n/2) + O(n) = O(n lg n)
Proof of Correctness
Show: the closest pairs of points in the subdivisions (subproblems in the recursion) can be merged to find the closest pair of points at the upper level of the recursion (loop invariant).
If the closest pair of points in P lies in the middle strip area, with pl in the left part PL and pr in the right part PR, then the distance between pl and pr is less than d. At most 8 points of P can reside within this d × 2d rectangle.
Base: |P| <= 3 (never try to solve a subproblem consisting of only 1 point).
Quick Sort
Quick sort - Partition exchange sort
- Split the array into two parts by a pivot item: left part: all items are smaller than the pivot item; right part: all items are larger than the pivot item
Worst-case time complexity: W(n) = W(n-1) + n - 1 (for n > 0), W(0) = 0, so W(n) = n(n-1)/2 ∈ Θ(n^2) (the same complexity category as exchange sort, but the average-case time complexity is better than that of exchange sort)
Average-case time complexity: A(n) ≈ (n+1) 2 ln n ∈ Θ(n lg n)
Quick Sort Divide and conquer idea: divide the problem into two smaller sorting problems. Divide: select a splitting element (pivot) and PARTITION the array/list. Conquer: sort the two parts recursively. Combine: no need to combine results.
QUICKSORT(A, p, r)
  if p < r
    q <- PARTITION(A, p, r)
    QUICKSORT(A, p, q-1)
    QUICKSORT(A, q+1, r)
Partition(A, p, r)
  // "left" partition contains A[k] <= x for k = p to i
  // "right" partition contains all A[k] > x for k = i+1 to j-1
  x <- A[r]                    // pivot x is the last element
  i <- p - 1                   // "left" partition is empty
  for j <- p to r-1
    if A[j] <= x               // store in "left" partition
      i <- i + 1
      exchange A[i] <-> A[j]
  exchange A[i+1] <-> A[r]     // store pivot
  return i+1
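Partition Example: PARTITION(A, p, r) on A = 2 8 7 1 3 5 6 4, pivot x = A[r] = 4
start, i = p-1:               2 8 7 1 3 5 6 4
j = p,   2 <= 4, i = p:       2 8 7 1 3 5 6 4   (A[p] exchanged with itself)
j = p+1, 8 > 4:               2 8 7 1 3 5 6 4
j = p+2, 7 > 4:               2 8 7 1 3 5 6 4
j = p+3, 1 <= 4, i = p+1:     2 1 7 8 3 5 6 4
j = p+4, 3 <= 4, i = p+2:     2 1 3 8 7 5 6 4
j = p+5, 5 > 4:               2 1 3 8 7 5 6 4
j = p+6, 6 > 4:               2 1 3 8 7 5 6 4
exchange A[i+1] with A[r]:    2 1 3 4 7 5 6 8   (return i+1 = p+3)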
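The QUICKSORT and PARTITION pseudocode translates almost line for line into Python. The sketch below uses 0-based indexes and hypothetical function names `partition` and `quicksort`; it reproduces the example array 2 8 7 1 3 5 6 4.

```python
def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]                         # pivot is the last element
    i = p - 1                        # "left" partition a[p..i] starts empty
    for j in range(p, r):
        if a[j] <= x:                # a[j] belongs in the "left" partition
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]  # place the pivot between the partitions
    return i + 1

def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place; no combine step is needed."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)

a = [2, 8, 7, 1, 3, 5, 6, 4]
q = partition(a, 0, len(a) - 1)
print(q, a)                          # 3 [2, 1, 3, 4, 7, 5, 6, 8]
quicksort(a)
print(a)                             # [1, 2, 3, 4, 5, 6, 7, 8]
```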
Performance of quicksort Worst case partitioning? Asymptotic growth rate? T(n) = T(n-1) + Θ(n) = Θ(n^2) Best case partitioning? Asymptotic growth rate?
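Recursion Tree for Best Case (number of partition comparisons, total per depth): subproblem sizes are n at depth 0; n/2, n/2 at depth 1; four of size n/4 at depth 2; eight of size n/8 at depth 3; and so on. Each depth costs cn in total, and there are about lg n depths. T(n) <= 2T(n/2) + cn, so T(n) = O(n lg n) (sum over all depths = O(n lg n))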
Balanced partitioning Assume the partitioning element always produces 8 to 1 split We will see that quicksort is O(n lg n) In fact a 99 to 1 split would also be O(n lg n)
Balanced partitioning Assume that after each application of PARTITION, n/9 elements are to the left of the pivot and 8n/9 elements are to the right of the pivot. The longest path of calls to Quicksort is proportional to lg n and not n: the longest path of calls = 1 + log_{9/8} n = 1 + lg n / lg(9/8) = 1 + c lg n. Let n = 1,000,000. The longest path has about 118 calls to Quicksort. Note: the shortest path has 1 + log_9 n ≈ 1 + 7 = 8 calls
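The two path lengths for n = 1,000,000 can be checked numerically. This small sketch only evaluates the logarithms from the slide; the variable names are illustrative, and rounding the shortest path up to a whole number of calls is an assumption about how the slide obtained 1 + 7.

```python
from math import ceil, log

n = 1_000_000
longest = 1 + log(n, 9 / 8)          # always follow the 8n/9 side
shortest = 1 + ceil(log(n, 9))       # always follow the n/9 side, rounded up
print(round(longest), shortest)      # 118 8
```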
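Recursion tree for T(n) <= T(n/9) + T(8n/9) + cn (total per depth): depth 0: n, total cn; depth 1: n/9 and 8n/9, total cn; depth 2: n/81, 8n/81, 8n/81, 64n/81, total cn; depth 3: n/729, 8n/729, ..., 512n/729, total cn. Every depth down to log_9 n costs cn; below that, some branches have already reached subproblems of size 0 or 1, so each depth down to log_{9/8} n costs <= cn.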
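Intuition for the Average case Compare two ways of splitting n items: one bad split followed by one best split (n into 1 and n-1, then n-1 into (n-1)/2 and (n-1)/2) versus a single best split (n into 1 + (n-1)/2 and (n-1)/2). The bad split is "absorbed" by the good split: after 2 calls to PARTITION the array is split "evenly". Expect the average run time to be O(n lg n)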
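Average Time Complexity
A(n) = Σ_{p=1..n} (1/n) [A(p-1) + A(n-p)] + (n - 1)
where 1/n is the probability that the pivot point is p, A(p-1) + A(n-p) is the average time to sort the subarrays when the pivot point is p, and n - 1 is the time to partition.
Note: assume that the slot of the pivot point returned by partition is equally likely to be any of the numbers from 1 through n.
Solution: A(n) ∈ Θ(n lg n)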
Randomized algorithms An algorithm is randomized if: its behavior is determined not only by its input but also by values produced by a random number generator. Often simpler and faster algorithm Exploring the average-case behavior Expecting the split of the input array to be reasonably well balanced.
A randomized version of quicksort
RANDOMIZED-PARTITION(A, p, r)
  i <- RANDOM(p, r)
  exchange A[r] <-> A[i]
  return PARTITION(A, p, r)
RANDOMIZED-QUICKSORT(A, p, r)
  if p < r
    q <- RANDOMIZED-PARTITION(A, p, r)
    RANDOMIZED-QUICKSORT(A, p, q-1)
    RANDOMIZED-QUICKSORT(A, q+1, r)
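A runnable sketch of the randomized version follows. It assumes that RANDOM(p, r) returns a uniformly random index between p and r inclusive, which is modelled here with `random.randint`; the function names are illustrative and the partition step is inlined so the block is self-contained.

```python
import random

def randomized_partition(a, p, r):
    """Swap a random element into a[r], then partition a[p..r] around it."""
    i = random.randint(p, r)         # RANDOM(p, r): uniform over p..r inclusive
    a[r], a[i] = a[i], a[r]
    x, k = a[r], p - 1
    for j in range(p, r):
        if a[j] <= x:
            k += 1
            a[k], a[j] = a[j], a[k]
    a[k + 1], a[r] = a[r], a[k + 1]
    return k + 1

def randomized_quicksort(a, p=0, r=None):
    """Sort a[p..r] in place with a randomly chosen pivot at every level."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)

data = [5, 3, 1, 9, 8, 2, 4, 7]
randomized_quicksort(data)
print(data)                          # [1, 2, 3, 4, 5, 7, 8, 9]
```

Because the pivot is random, an already sorted input is no longer a guaranteed worst case; only an unlucky sequence of random choices produces the Θ(n^2) behaviour.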
Example (Alternative thought) Partition so that A[0]…A[s-1] are all ≤ A[s] and A[s+1]…A[n-1] are all ≥ A[s]. Exercise: quick-sort [5 3 1 9 8 2 4 7]
Example 4 Strassen’s matrix multiplication algorithm - Example: 2 by 2 matrix multiplication
m1 = (a11 + a22)(b11 + b22)
m2 = (a21 + a22) b11
m3 = a11 (b12 – b22)
m4 = a22 (b21 – b11)
m5 = (a11 + a12) b22
m6 = (a21 – a11)(b11 + b12)
m7 = (a12 – a22)(b21 + b22)
Example 4 (cont’d) Strassen’s matrix multiplication algorithm
- partition each n × n matrix into n/2 × n/2 sub-matrices (assume n is a power of 2)
- apply the seven basic calculations:
M1 = (A11 + A22)(B11 + B22)
M2 = (A21 + A22) B11
M3 = A11 (B12 – B22)
M4 = A22 (B21 – B11)
M5 = (A11 + A12) B22
M6 = (A21 – A11)(B11 + B12)
M7 = (A12 – A22)(B21 + B22)
Example 4 (cont’d) Strassen’s matrix multiplication example
Example 4 (cont’d) Strassen’s matrix multiplication: input: n, A, B; output: C
strassen_matrix_multiply(int n, A, B, C) {
  divide A into A11, A12, A21, A22;
  divide B into B11, B12, B21, B22;
  strassen_matrix_multiply(n/2, A11+A22, B11+B22, M1);
  strassen_matrix_multiply(n/2, A21+A22, B11, M2);
  ……
  strassen_matrix_multiply(n/2, A12-A22, B21+B22, M7);
  compose C11, …, C22 from M1, …, M7;
}
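A compact Python sketch of the recursion is given below. It assumes n is a power of 2 and represents matrices as lists of lists; the helper names `add`, `sub`, and `split` are illustrative. The slide leaves the composition step unspecified, so the formulas for C11 to C22 used here are the standard Strassen combination (C11 = M1 + M4 - M5 + M7, C12 = M3 + M5, C21 = M2 + M4, C22 = M1 + M3 - M2 + M6) rather than something taken from the notes.

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M):
    """Return the four n/2 x n/2 quadrants M11, M12, M21, M22."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def strassen(A, B):
    """Multiply two n x n matrices (n a power of 2) with 7 recursive products."""
    if len(A) == 1:                              # base case: 1 x 1 matrices
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)          # standard composition (assumed)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]
```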
Example 4 (cont’d) Strassen’s matrix multiplication algorithm - Every-case time complexity
Multiplications: T(n) = 7T(n/2), T(1) = 1, so T(n) = n^(lg 7) ≈ n^2.81 ∈ Θ(n^2.81) (whereas the brute-force approach is Θ(n^3))
Additions/Subtractions: T(n) = 7T(n/2) + 18(n/2)^2, T(1) = 0, so T(n) = 6n^(lg 7) – 6n^2 ∈ O(n^2.81)
Example 4 (cont’d) Strassen’s matrix multiplication algorithm - Proof of time complexity
Multiplications: T(n) = 7T(n/2), T(1) = 1
Guess solution: T(n) = 7^(lg n)
Induction base: n = 1, T(1) = 7^(lg 1) = 7^0 = 1
Induction hypothesis: T(n) = 7^(lg n)
Induction step: T(2n) = 7T(2n/2) = 7 · 7^(lg n) = 7^(lg n + 1) = 7^(lg(2n))
So the guess is correct.
Note: 7^(lg n) = n^(lg 7) ≈ n^2.81
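Both closed forms can also be checked numerically against their recurrences for small powers of 2. The sketch below is only a sanity check, not a proof; the function names `mults` and `adds` are illustrative, and the addition recurrence is the one with T(n) = 7T(n/2) + 18(n/2)^2, T(1) = 0.

```python
def mults(n):
    """Multiplications: T(n) = 7 T(n/2), T(1) = 1."""
    return 1 if n == 1 else 7 * mults(n // 2)

def adds(n):
    """Additions/subtractions: T(n) = 7 T(n/2) + 18 (n/2)^2, T(1) = 0."""
    return 0 if n == 1 else 7 * adds(n // 2) + 18 * (n // 2) ** 2

for k in range(6):
    n = 2 ** k                                   # lg n = k, so n^(lg 7) = 7^k
    assert mults(n) == 7 ** k
    assert adds(n) == 6 * 7 ** k - 6 * n ** 2

print(mults(32), adds(32))                       # 16807 94698
```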
Example 5 Large integer multiplication
- A large integer is divided into a number of smaller integers
- To compute u·v, write u = x·10^m + y and v = w·10^m + z, so that u·v = xw·10^(2m) + (xz + wy)·10^m + yz (each of the integers u and v is split into two integers of about equal size: (x, y) and (w, z))
- e.g., 123456 = 123·10^3 + 456; 123 and 456 both have 3 digits
Worst-case time complexity: W(n) = 4W(n/2) + cn, so W(n) ∈ Θ(n^2)
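The splitting scheme with four recursive products can be sketched directly. This is an illustration under stated assumptions: the function name `multiply` is hypothetical, the inputs are non-negative integers, and the recursion stops as soon as either factor has a single digit.

```python
def multiply(u, v):
    """Multiply non-negative integers by splitting each into two digit halves."""
    if u < 10 or v < 10:                 # a single-digit factor: multiply directly
        return u * v
    n = max(len(str(u)), len(str(v)))    # number of digits of the larger factor
    m = n // 2
    x, y = divmod(u, 10 ** m)            # u = x * 10^m + y
    w, z = divmod(v, 10 ** m)            # v = w * 10^m + z
    # Four recursive products of roughly half the size: xw, xz, wy, yz.
    return (multiply(x, w) * 10 ** (2 * m)
            + (multiply(x, z) + multiply(w, y)) * 10 ** m
            + multiply(y, z))

assert divmod(123456, 10 ** 3) == (123, 456)     # the split from the example
assert multiply(123456, 987654) == 123456 * 987654
print(multiply(123456, 987654))
```

With four half-size products per call, the running time satisfies the recurrence W(n) = 4W(n/2) + cn, so this version is no faster asymptotically than the schoolbook method.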
Combining the D&C algorithm with another simple algorithm
Switching point
- A recursive method may show no advantage for small n compared with alternative algorithms.
- A recursive algorithm requires a fair amount of overhead.
- We need to determine for what value of n it is at least as fast to call an alternative algorithm as it is to divide the instance further.
- The dividing process stops at a certain switching point (or threshold), where the recursive algorithm switches to the alternative algorithm.
Example - Switch from Mergesort to Exchange sort when n is smaller than a certain threshold t.
Exchange sort: W(n) = n(n-1)/2 (n <= t)
Mergesort: W(n) = 2W(n/2) + 32n (n > t)
At the optimal value t: 2W(t/2) + 32t = t(t-1)/2. Since t/2 <= t, W(t/2) = (t/2)(t/2 - 1)/2, so t^2/4 - t/2 + 32t = t^2/2 - t/2, which gives t = 128
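A threshold-based hybrid of the two sorts might look like the sketch below. The names `exchange_sort` and `hybrid_mergesort` are illustrative, and the default threshold of 128 is simply the value derived from the cost model above; on a real machine the best switching point would have to be measured.

```python
def exchange_sort(s):
    """Simple quadratic sort: compare every pair (i, j) with j > i."""
    for i in range(len(s)):
        for j in range(i + 1, len(s)):
            if s[j] < s[i]:
                s[i], s[j] = s[j], s[i]
    return s

def hybrid_mergesort(s, threshold=128):
    """Mergesort that switches to exchange sort once n <= threshold."""
    if len(s) <= max(1, threshold):      # switching point; at least 1 to terminate
        return exchange_sort(list(s))
    mid = len(s) // 2
    left = hybrid_mergesort(s[:mid], threshold)
    right = hybrid_mergesort(s[mid:], threshold)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):      # merge the two sorted halves
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(hybrid_mergesort([5, 3, 1, 9, 8, 2, 4, 7], threshold=2))
# [1, 2, 3, 4, 5, 7, 8, 9]
```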
When not to use a D&C algorithm Two cases: - An instance of size n is divided into two or more instances, each almost of size n (e.g., quicksort on an already sorted array; recursive computation of the nth Fibonacci term) - An instance of size n is divided into almost n instances of size n/c (c is a constant)
Example 6: Convex hull problem A polygon is a closed path of straight line segments in R^2. These segments are also called edges of the polygon, and the intersection of two adjacent edges is a vertex of the polygon. Thus every polygon with n vertices has n edges. A simple polygon is one which has no intersecting non-adjacent edges (see Fig.1). Every simple polygon divides R^2 into an interior and an exterior region. A simple polygon is convex if the internal angle formed at each vertex is smaller than 180°. If one were to walk along the polygon's path, one would make only right turns (or only left turns) at each vertex (see Fig.2).
Example 6: Convex hull problem The convex hull of a polygon P is the smallest-area convex polygon which encloses P. Informally, it is the shape of a rubber band stretched around P (see Fig.3). Similarly, the convex hull of a set of points S is the smallest-area convex polygon which encloses S. Note: the convex hull of a convex polygon P is P itself.
Brute-force algorithm for convex hull problem
Finding the convex hull based on line segments: a line segment connecting two points Pi and Pj of a set of n points is a part of its convex hull’s boundary if and only if all the other points of the set lie on the same side of the straight line through these two points. Repeating this test for every pair of points yields a list of line segments that make up the convex hull’s boundary.
Line through two points (x1, y1) and (x2, y2): ax + by = c, where a = y2 - y1, b = x1 - x2, c = x1y2 - y1x2
- For all points on one side of the line: ax + by > c
- For all points on the other side of the line: ax + by < c
- For all points on the line: ax + by = c
Time complexity: O(n^3)
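The pairwise side test can be sketched as follows. The function name `hull_edges_brute_force` is illustrative, and the sketch assumes the points are distinct and that no three of them are collinear; with collinear boundary points it would also report the longer overlapping segments.

```python
from itertools import combinations

def hull_edges_brute_force(points):
    """Return the point pairs whose segment is part of the convex hull boundary."""
    edges = []
    for (x1, y1), (x2, y2) in combinations(points, 2):   # O(n^2) candidate pairs
        a, b, c = y2 - y1, x1 - x2, x1 * y2 - y1 * x2    # line ax + by = c
        sides = [a * x + b * y - c for (x, y) in points] # O(n) side tests per pair
        if all(s >= 0 for s in sides) or all(s <= 0 for s in sides):
            edges.append(((x1, y1), (x2, y2)))
    return edges

pts = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 2)]
for edge in hull_edges_brute_force(pts):
    print(edge)          # the four sides of the square; (1, 2) is interior
```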
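Convex Hull (Quick-Hull) algorithm (Divide and Conquer) Side test used to find the points to the left of a line and the point Pmax farthest from line P1P2: given three points p1 (x1, y1), p2 (x2, y2), and p3 (x3, y3), let D be the determinant with rows (x1, y1, 1), (x2, y2, 1), (x3, y3, 1), that is, D = x1y2 + x3y1 + x2y3 - x3y2 - x2y1 - x1y3. D > 0 if and only if p3 is to the left of the directed line from p1 to p2, and |D|/2 is the area of the triangle p1p2p3.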
Convex Hull (Quick-Hull) algorithm (Divide and Conquer)
Convex hull: smallest convex set that includes the given points
- Assume the points are sorted by x-coordinate values
- Identify the extreme points P1 and P2 (leftmost and rightmost)
- Compute the upper hull recursively:
  - find the point Pmax that is farthest away from line P1P2
  - compute the upper hull of the points to the left of line P1Pmax
  - compute the upper hull of the points to the left of line PmaxP2
- Compute the lower hull in a similar manner
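The recursive structure of Quick-Hull can be sketched in Python as below. The names `cross`, `_upper`, and `quickhull` are illustrative. The sketch assumes a side test that is positive exactly when a point lies to the left of the directed line from p1 to p2, and it uses the same value as a measure of distance from the line, since for a fixed line that value is proportional to the distance. Collinear boundary points are left out of the result.

```python
def cross(p1, p2, p3):
    """Signed area test: > 0 iff p3 is to the left of the directed line p1 -> p2."""
    return ((p2[0] - p1[0]) * (p3[1] - p1[1])
            - (p2[1] - p1[1]) * (p3[0] - p1[0]))

def _upper(p1, p2, pts):
    """Hull vertices strictly left of p1 -> p2, listed in order from p1 to p2."""
    left = [p for p in pts if cross(p1, p2, p) > 0]
    if not left:
        return []
    pmax = max(left, key=lambda p: cross(p1, p2, p))     # farthest from line p1p2
    return _upper(p1, pmax, left) + [pmax] + _upper(pmax, p2, left)

def quickhull(points):
    """Return the hull vertices in clockwise order, starting at the leftmost point."""
    pts = sorted(set(points))                # sort by x (ties by y), drop duplicates
    if len(pts) < 3:
        return pts
    p1, p2 = pts[0], pts[-1]                 # leftmost and rightmost points
    upper = _upper(p1, p2, pts)              # points to the left of p1 -> p2
    lower = _upper(p2, p1, pts)              # points to the left of p2 -> p1
    return [p1] + upper + [p2] + lower

pts = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 2), (2, 1), (2, 2)]
print(quickhull(pts))                        # [(0, 0), (0, 4), (4, 4), (4, 0)]
```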
Efficiency of Quickhull Algorithm Finding the point farthest away from line P1P2 can be done in linear time. Time efficiency: worst case: Θ(n^2) (similar to quicksort); average case: Θ(n) (under reasonable assumptions about the distribution of the given points)