Theory of Algorithms: Divide and Conquer
Objectives
  To introduce the divide-and-conquer mind set
  To show a variety of divide-and-conquer solutions:
    Merge Sort
    Quick Sort
    Convex Hull by Divide-and-Conquer
    Strassen’s Matrix Multiplication
  To discuss the strengths and weaknesses of a divide-and-conquer strategy
Divide-and-Conquer
  Best known algorithm design strategy:
    Divide the instance of the problem into two or more smaller instances
    Solve the smaller instances recursively
    Obtain the solution to the original (larger) instance by combining these solutions
  Recurrence Templates apply
    Recurrences are of the form $T(n) = aT(n/b) + f(n)$, with $a \ge 1$, $b \ge 2$
  Silly Example (Addition):
    $a_0 + \dots + a_{n-1} = (a_0 + \dots + a_{n/2-1}) + (a_{n/2} + \dots + a_{n-1})$
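The "silly" addition example can be sketched in Python to make the divide, solve, and combine steps concrete. This is a minimal illustration and not part of the original slides; the function name `dc_sum` and its `lo`/`hi` parameters are assumptions chosen here.

```python
def dc_sum(a, lo=0, hi=None):
    """Sum a[lo:hi] by splitting the range in half and adding the two partial sums."""
    if hi is None:
        hi = len(a)
    n = hi - lo
    if n == 0:          # empty range
        return 0
    if n == 1:          # smallest instance: solved directly
        return a[lo]
    mid = lo + n // 2   # divide
    # solve both halves recursively, then combine with one addition
    return dc_sum(a, lo, mid) + dc_sum(a, mid, hi)
```

The number of additions satisfies A(n) = 2A(n/2) + 1, which is linear in n, so nothing is gained over a simple loop. That is what makes the example "silly".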
Divide-and-Conquer Illustrated
  [Figure: a problem of size n is split into subproblem 1 and subproblem 2, each of size n/2; the solutions to the two subproblems are combined into a solution to the original problem]
Mergesort
  Algorithm:
    Split A[1..n] in half and put a copy of each half into arrays B[1..⌊n/2⌋] and C[1..⌈n/2⌉]
    Recursively MergeSort arrays B and C
    Merge sorted arrays B and C into array A
  Merging:
    REPEAT until no elements remain in one of B or C:
      Compare the first elements in the remaining parts of B and C
      Copy the smaller into A, incrementing the index of the corresponding array
    Once all elements in one of B or C are processed, copy the remaining unprocessed elements from the other array into A
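The split, recurse, and merge steps can be sketched as follows. This is an illustrative version under one simplifying assumption: it returns a new sorted list instead of merging back into A as the slide describes. The names `merge` and `merge_sort` are chosen here.

```python
def merge(b, c):
    """Merge two sorted lists into one sorted list."""
    a = []
    i = j = 0
    # Repeat until one of b or c has no elements left
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:       # compare the first unprocessed elements
            a.append(b[i])
            i += 1
        else:
            a.append(c[j])
            j += 1
    # Copy whatever remains in the other list
    a.extend(b[i:])
    a.extend(c[j:])
    return a


def merge_sort(a):
    """Return a sorted copy of a using top-down mergesort."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2          # first half gets floor(n/2) elements
    b = merge_sort(a[:mid])
    c = merge_sort(a[mid:])
    return merge(b, c)
```

For example, `merge_sort([7, 2, 1, 6, 4])` returns `[1, 2, 4, 6, 7]`.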
Mergesort Example
  [Figure: recursion tree for mergesort on the list 7 2 1 6 4; sorted runs produced by merging on the way back up: 4 6, 2 7, 1 2 7, and finally 1 2 4 6 7]
Efficiency of Mergesort
  Recurrence: $C(n) = 2C(n/2) + C_{merge}(n)$ for $n > 1$, $C(1) = 0$
    $C_{merge}(n) = n - 1$ in the worst case, so the worst-case recurrence is $C_{worst}(n) = 2C_{worst}(n/2) + (n - 1)$
  All cases have the same efficiency: $\Theta(n \log n)$
  Number of comparisons is close to the theoretical minimum for comparison-based sorting: $\lceil \log_2 n! \rceil \approx n \log_2 n - 1.44n$
  Space requirement: $\Theta(n)$ (NOT in-place)
  Can be implemented without recursion (bottom-up)
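The remark that mergesort can be implemented without recursion can be illustrated with a bottom-up sketch: merge runs of width 1, then 2, then 4, and so on. The function name `merge_sort_bottom_up` is an assumption made here, and the version below favours clarity over memory economy by building a fresh list on every pass.

```python
def merge_sort_bottom_up(a):
    """Return a sorted copy of a using iterative (bottom-up) mergesort."""
    a = list(a)
    n = len(a)
    width = 1                      # current length of the sorted runs
    while width < n:
        merged = []
        # Merge neighbouring runs a[lo:lo+width] and a[lo+width:lo+2*width]
        for lo in range(0, n, 2 * width):
            left = a[lo:lo + width]
            right = a[lo + width:lo + 2 * width]
            i = j = 0
            while i < len(left) and j < len(right):
                if left[i] <= right[j]:
                    merged.append(left[i])
                    i += 1
                else:
                    merged.append(right[j])
                    j += 1
            merged.extend(left[i:])
            merged.extend(right[j:])
        a = merged
        width *= 2                 # runs are now twice as long
    return a
```

Each pass does linear work and the run width doubles every time, so there are about log n passes, matching the recurrence above.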
Quicksort
  Select a pivot (partitioning element)
  Rearrange the list into two sublists:
    All elements positioned before the pivot are ≤ the pivot
    Those positioned after the pivot are > the pivot
    Requires a pivoting algorithm
  Exchange the pivot with the last element in the first sublist
    The pivot is now in its final position
  QuickSort the two sublists
  [Figure: array layout p | A[i] ≤ p | A[i] > p]
The Partition Algorithm
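The slide's pseudocode did not survive extraction. The sketch below is one common way to write a two-index partition with the first element as pivot; it is an assumption about what was shown, not a reconstruction of it. The names `partition` and `quicksort` are chosen here. Elements equal to the pivot may end up on either side of it, a slight relaxation of the strict "> p" condition stated for the right sublist.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] (inclusive) around the pivot a[lo].

    Returns the final index of the pivot.
    """
    p = a[lo]
    i, j = lo + 1, hi
    while True:
        # i scans right for an element >= p (or runs off the end)
        while i <= hi and a[i] < p:
            i += 1
        # j scans left for an element <= p; a[lo] == p stops it at lo
        while a[j] > p:
            j -= 1
        if i >= j:                 # scans have met or crossed
            break
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
    # Exchange the pivot with the last element of the first sublist
    a[lo], a[j] = a[j], a[lo]
    return j


def quicksort(a, lo=0, hi=None):
    """Sort the list a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        s = partition(a, lo, hi)   # pivot is now in its final position s
        quicksort(a, lo, s - 1)
        quicksort(a, s + 1, hi)
```

Calling `partition` on `[5, 3, 1, 9, 8, 2, 7]` with `lo=0, hi=6` rearranges the list to `[2, 3, 1, 5, 8, 9, 7]` and returns `3`.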
Quicksort Example
  Partition 5 3 1 9 8 2 7 (pivot 5; i scans from the left, j from the right):
    5 3 1 9 8 2 7   i stops at 9, j stops at 2; swap them
    5 3 1 2 8 9 7   i stops at 8, j stops at 2; the scans have crossed, so swap the pivot with A[j]
    2 3 1 5 8 9 7   pivot 5 is in its final position
  Recursive Call Quicksort (2 3 1) & Quicksort (8 9 7)
    2 3 1 -> 2 1 3 -> 1 2 3
    8 9 7 -> 8 7 9 -> 7 8 9
  Recursive Call Quicksort (1) & Quicksort (3)
  Recursive Call Quicksort (7) & Quicksort (9)
Worst Case Efficiency of Quicksort
  In the worst case all splits are completely skewed
    For instance, an already sorted list!
  One subarray is empty, the other is reduced by only one element:
    Make n+1 comparisons
    Exchange pivot with itself
    Quicksort left = Ø, right = A[1..n-1]
  $C_{worst}(n) = (n+1) + n + \dots + 3 = (n+1)(n+2)/2 - 3 \in \Theta(n^2)$
General Efficiency of Quicksort
  Efficiency Cases:
    Best (split in the middle): $\Theta(n \log n)$
    Worst (sorted array): $\Theta(n^2)$
    Average (random arrays): $\Theta(n \log n)$
  Improvements (in combination 20-25% faster):
    Better pivot selection: median-of-three partitioning avoids the worst case in sorted files
    Switch to Insertion Sort on small subfiles
    Elimination of recursion
  Considered the method of choice for internal sorting of large files (n ≥ 10000)
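The three improvements can be combined in one sketch. Everything specific here is an assumption made for illustration: the cut-off value `CUTOFF = 10`, the helper names, and the choice to remove only one of the two recursive calls (the larger sublist is handled by a loop). The `partition` helper is the same first-element-pivot scheme used for plain quicksort.

```python
CUTOFF = 10  # assumed threshold below which insertion sort takes over


def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] (inclusive) in place."""
    for k in range(lo + 1, hi + 1):
        v = a[k]
        m = k - 1
        while m >= lo and a[m] > v:
            a[m + 1] = a[m]
            m -= 1
        a[m + 1] = v


def median_of_three(a, lo, hi):
    """Move the median of a[lo], a[mid], a[hi] to a[lo] to act as pivot."""
    mid = (lo + hi) // 2
    trio = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    m = trio[1][1]                 # index holding the median value
    a[lo], a[m] = a[m], a[lo]


def partition(a, lo, hi):
    """Partition a[lo..hi] around a[lo]; return the pivot's final index."""
    p = a[lo]
    i, j = lo + 1, hi
    while True:
        while i <= hi and a[i] < p:
            i += 1
        while a[j] > p:
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
    a[lo], a[j] = a[j], a[lo]
    return j


def quicksort_tuned(a, lo=0, hi=None):
    """Sort a in place with median-of-three, a small-subfile cut-off, and a loop
    in place of one recursive call."""
    if hi is None:
        hi = len(a) - 1
    while hi - lo + 1 > CUTOFF:
        median_of_three(a, lo, hi)
        s = partition(a, lo, hi)
        # Recurse on the smaller side, keep looping on the larger one
        if s - lo < hi - s:
            quicksort_tuned(a, lo, s - 1)
            lo = s + 1
        else:
            quicksort_tuned(a, s + 1, hi)
            hi = s - 1
    insertion_sort(a, lo, hi)      # finish the small subfile
```

Recursing only on the smaller side keeps the recursion depth logarithmic in n, which is why it is a common companion to the other two improvements.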
QuickHull Algorithm
  Sort points by increasing x-coordinate values
  Identify leftmost and rightmost extreme points P1 and P2 (part of hull)
  Compute upper hull:
    Find point Pmax that is farthest away from line P1P2
    Quickhull the points to the left of line P1Pmax
    Quickhull the points to the left of line PmaxP2
  Similarly compute lower hull
  [Figure: upper hull construction showing P1, Pmax and P2]
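Finding the Furthest Point
  Given three points in the plane $p_1 = (x_1, y_1)$, $p_2 = (x_2, y_2)$, $p_3 = (x_3, y_3)$
  Area of triangle $\triangle p_1 p_2 p_3 = \tfrac{1}{2} |D|$, where
    $D = \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} = x_1 y_2 + x_3 y_1 + x_2 y_3 - x_3 y_2 - x_2 y_1 - x_1 y_3$
  Properties of $D$:
    Positive iff $p_3$ is to the left of the directed line $p_1 p_2$
    $|D|$ is proportional to the distance of $p_3$ from the line $p_1 p_2$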
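The determinant D and the QuickHull steps fit together as in the following sketch. It assumes points are given as `(x, y)` tuples, drops duplicate points, and leaves out points that lie exactly on a hull edge; the names `det`, `_hull_side` and `quickhull` are chosen here for illustration.

```python
def det(p1, p2, p3):
    """D from the formula above: positive iff p3 is left of the line p1 -> p2."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return x1 * y2 + x3 * y1 + x2 * y3 - x3 * y2 - x2 * y1 - x1 * y3


def _hull_side(p1, p2, points):
    """Hull vertices strictly to the left of p1 -> p2, listed from p1 towards p2."""
    left = [p for p in points if det(p1, p2, p) > 0]
    if not left:
        return []
    # Largest D means largest triangle area, hence farthest from the line
    pmax = max(left, key=lambda p: det(p1, p2, p))
    return _hull_side(p1, pmax, left) + [pmax] + _hull_side(pmax, p2, left)


def quickhull(points):
    """Return the convex hull vertices, starting at the leftmost point."""
    pts = sorted(set(points))          # sort by x (then y), dropping duplicates
    if len(pts) < 3:
        return pts
    p1, p2 = pts[0], pts[-1]           # leftmost and rightmost extreme points
    upper = _hull_side(p1, p2, pts)    # points to the left of p1 -> p2
    lower = _hull_side(p2, p1, pts)    # points to the left of p2 -> p1
    return [p1] + upper + [p2] + lower
```

For the corners of a square plus its centre, `quickhull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])` returns `[(0, 0), (0, 2), (2, 2), (2, 0)]`; the interior point is discarded.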
Efficiency of Quickhull
  Finding the point farthest away from line P1P2 is linear in the number of points
  This gives the same efficiency as quicksort:
    Worst case: $\Theta(n^2)$
    Average case: $\Theta(n \log n)$
  If an initial sort is required, this can be accomplished in $\Theta(n \log n)$, with no increase in the asymptotic efficiency class
  Alternative Divide-and-Conquer Convex Hull:
    Graham’s scan and DCHull
    Also $\Theta(n \log n)$ but with lower coefficients
Strassen’s Matrix Multiplication
  Strassen observed [1969] that the product of two matrices can be computed as follows:
    $\begin{bmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{bmatrix} = \begin{bmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{bmatrix} \begin{bmatrix} B_{00} & B_{01} \\ B_{10} & B_{11} \end{bmatrix} = \begin{bmatrix} M_1 + M_4 - M_5 + M_7 & M_3 + M_5 \\ M_2 + M_4 & M_1 + M_3 - M_2 + M_6 \end{bmatrix}$
  Op Count:
    Each of $M_1, \dots, M_7$ requires 1 mult and 1 or 2 add/sub
    Total = 7 mults and 18 add/sub
    Compared with brute force, which requires 8 mults and 4 add/sub
    The gain is in the asymptotic behaviour
  NB: the $n/2 \times n/2$ submatrices are computed recursively by the same method
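The definitions of M1 to M7 did not survive on the slide. The sketch below supplies the usual definitions associated with Strassen's scheme, which is an assumption about what the slide showed, although they are consistent with the combination formulas for C00 to C11 given above. It works on square matrices stored as lists of lists whose size is a power of 2. The helper names are chosen here.

```python
def add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]


def sub(x, y):
    """Element-wise difference of two equally sized matrices."""
    return [[a - b for a, b in zip(r, s)] for r, s in zip(x, y)]


def split(m):
    """Split a square matrix into its four quadrants m00, m01, m10, m11."""
    h = len(m) // 2
    return ([row[:h] for row in m[:h]], [row[h:] for row in m[:h]],
            [row[:h] for row in m[h:]], [row[h:] for row in m[h:]])


def strassen(a, b):
    """Multiply two n x n matrices, n a power of 2, using 7 recursive products."""
    n = len(a)
    if n == 1:
        return [[a[0][0] * b[0][0]]]
    a00, a01, a10, a11 = split(a)
    b00, b01, b10, b11 = split(b)
    # Seven products, each with 1 multiplication and 1 or 2 additions/subtractions
    m1 = strassen(add(a00, a11), add(b00, b11))
    m2 = strassen(add(a10, a11), b00)
    m3 = strassen(a00, sub(b01, b11))
    m4 = strassen(a11, sub(b10, b00))
    m5 = strassen(add(a00, a01), b11)
    m6 = strassen(sub(a10, a00), add(b00, b01))
    m7 = strassen(sub(a01, a11), add(b10, b11))
    # Combine as in the slide's result matrix
    c00 = add(sub(add(m1, m4), m5), m7)
    c01 = add(m3, m5)
    c10 = add(m2, m4)
    c11 = add(sub(add(m1, m3), m2), m6)
    top = [r + s for r, s in zip(c00, c01)]
    bottom = [r + s for r, s in zip(c10, c11)]
    return top + bottom
```

For instance, `strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. In practice implementations usually switch to the brute-force method below some matrix size, because the extra additions dominate for small n.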
Efficiency of Strassen’s Algorithm
  If n is not a power of 2, matrices can be padded with zeros
  Number of multiplications:
    $M(n) = 7M(n/2)$ for $n > 1$, $M(1) = 1$
    Set $n = 2^k$ and apply backward substitution:
    $M(2^k) = 7M(2^{k-1}) = 7[7M(2^{k-2})] = 7^2 M(2^{k-2}) = \dots = 7^k$
    $M(n) = 7^{\log_2 n} = n^{\log_2 7} \approx n^{2.807}$
  Number of additions: $A(n) \in \Theta(n^{\log_2 7})$
  Other algorithms come closer to the lower limit of $n^2$ multiplications, but are even more complicated
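The order of growth of the additions is stated without a derivation. One way to fill in the step, assuming that each of the 18 additions/subtractions counted on the previous slide acts on n/2 × n/2 submatrices, and using the Master Theorem in its usual form:

```latex
\begin{align*}
A(n) &= 7A(n/2) + 18\,(n/2)^2 \quad \text{for } n > 1, \qquad A(1) = 0 \\
a &= 7,\quad b = 2,\quad f(n) \in \Theta(n^d) \text{ with } d = 2 \\
a &> b^d \;(7 > 4) \;\Longrightarrow\; A(n) \in \Theta(n^{\log_b a}) = \Theta(n^{\log_2 7})
\end{align*}
```

So the additions grow at the same rate as the multiplications, and the whole algorithm is in $\Theta(n^{\log_2 7})$.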
Strengths and Weaknesses of Divide-and-Conquer
  Strengths:
    Generally improves on Brute Force by one base efficiency class
    Easy to analyse using the Recurrence Templates
  Weaknesses:
    Often requires recursion, which introduces overheads
    Can be inapplicable, or inferior to simpler algorithmic solutions