Algorithm Design Paradigms
Reduce to a known problem. Divide and conquer, partitioning. Dynamic programming. Greedy method. Backtracking. Branch and bound. Recursion. Approximation. Geometric methods. Integer programming. Probabilistic techniques.
Reduce to a Known Problem
Example 1: Determine whether an array of n numbers contains repeated elements.
Solution 1: Compare each element to every other element: O(n^2).
Solution 2: Sort the n numbers (e.g., by mergesort), then determine in O(n) steps whether there is a repeat by scanning adjacent elements. Total: O(n log n) steps!
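A minimal sketch of Solution 2 in Python (the function name has_repeat and the use of the built-in sort are illustrative choices, not from the slides): after sorting, any duplicates are adjacent, so a single linear scan suffices.

def has_repeat(a):
    """Return True iff the list a contains a repeated element."""
    b = sorted(a)                    # O(n log n), e.g. by mergesort
    for i in range(len(b) - 1):      # O(n) scan: duplicates are now adjacent
        if b[i] == b[i + 1]:
            return True
    return False

print(has_repeat([3, 1, 4, 1, 5]))   # True
print(has_repeat([2, 7, 1, 8]))      # False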
Another Example
Given a list of n points in the plane, determine whether any 3 of them are collinear (lie on the same line).
Solution 1: Using a triple loop, compare all distinct triples of points; this takes O(n^3) time.
Solution 2: O(n^2 log n). For each point P in the list, compute the slope of the line connecting P to every other point Q and save these slopes in a list; then determine (as in Example 1) whether the list contains any duplicates. Two equal slopes through the same P mean three collinear points.
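A sketch of Solution 2 in Python, assuming distinct points with integer coordinates (the reduced-fraction slope representation avoids floating-point and vertical-line issues; these details are implementation choices, not from the slides):

from math import gcd

def slope_key(p, q):
    """Canonical (dx, dy) representation of the slope of the line pq."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    g = gcd(dx, dy)
    dx, dy = dx // g, dy // g
    if dx < 0 or (dx == 0 and dy < 0):   # fix the sign so equal slopes compare equal
        dx, dy = -dx, -dy
    return (dx, dy)

def any_three_collinear(points):
    """O(n^2 log n): for each P, sort the slopes to all other points and
    look for a duplicate (Example 1); a repeated slope through the same P
    means P and two other points lie on one line."""
    for i, p in enumerate(points):
        slopes = sorted(slope_key(p, q) for j, q in enumerate(points) if j != i)
        for s, t in zip(slopes, slopes[1:]):
            if s == t:
                return True
    return False

print(any_three_collinear([(0, 0), (1, 1), (2, 2), (3, 5)]))   # True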
Divide and Conquer
Divide the problem into a number of subproblems; there must be a base case (to stop the recursion).
Conquer (solve) each subproblem recursively.
Combine (merge) the solutions to the subproblems into a solution to the original problem.
Example: Find the MAX and MIN
Obvious strategy (needs 2n - 3 compares): find MAX (n - 1 compares), then find MIN of the remaining elements (n - 2 compares).
Nontrivial strategy: split the array in half; find the MAX and MIN of both halves; compare the 2 MAXes and compare the 2 MINs to get the overall MAX and MIN.
In this example, we only count comparisons and ignore all other operations.
procedure mm(i, j: integer; var lo, hi: integer);
{ Return in lo and hi the minimum and maximum of the global array a[i..j]. }
var
  m, min1, max1, min2, max2: integer;
begin
  if i = j then begin                      { one element: it is both min and max }
    lo := a[i]; hi := a[i]
  end
  else if i = j - 1 then begin             { two elements: one comparison }
    if a[i] < a[j] then begin lo := a[i]; hi := a[j] end
    else begin lo := a[j]; hi := a[i] end
  end
  else begin                               { split, solve both halves, combine }
    m := (i + j) div 2;
    mm(i, m, min1, max1);
    mm(m + 1, j, min2, max2);
    if min1 < min2 then lo := min1 else lo := min2;
    if max1 > max2 then hi := max1 else hi := max2
  end
end;
Analysis
The number of comparisons satisfies T(1) = 0, T(2) = 1, and T(n) = 2T(n/2) + 2 for n > 2. Solving this, we have T(n) = 3n/2 - 2 if n is a power of 2. This can be shown to be a lower bound.
Note
More accurately, for arbitrary n it should be T(n) = ⌈3n/2⌉ - 2.
Balancing
It is generally best to divide problems into subproblems of roughly EQUAL size. Very roughly, you want binary search instead of linear search. Less roughly, if in the MAX-MIN problem we divide the set by putting one element in subset 1 and the rest in subset 2, the execution tree degenerates into a path, and 2(n - 2) + 1 = 2n - 3 comparisons would be needed.
Mergesort
Obvious way to sort n numbers (selection sort): find the minimum (or maximum), then sort the remaining numbers recursively.
Analysis: requires (n-1) + (n-2) + ... + 1 = n(n-1)/2 = Θ(n^2) comparisons. Clearly this method does not use the balancing heuristic.
A Divide and Conquer Solution
The following requires only Θ(n log n) comparisons: Mergesort.
Divide: divide the given n-element array A into 2 subarrays of n/2 elements each.
Conquer: recursively sort the two subarrays.
Combine: merge the 2 sorted subarrays into 1 sorted array.
Analysis: T(n) = Θ(1) + 2T(n/2) + Θ(n) = Θ(n log n).
The Pseudocode
procedure mergesort(i, j: integer);
var
  m: integer;
begin
  if i < j then begin
    m := (i + j) div 2;     { split point }
    mergesort(i, m);        { sort the left half }
    mergesort(m + 1, j);    { sort the right half }
    merge(i, m, j)          { merge the two sorted halves }
  end
end;
The Merging Process
merge(x, y, z: integer) uses another array for temporary storage. Merging segments of sizes m and n takes m + n - 1 comparisons in the worst case.
[Illustration: the smaller of the two front elements is repeatedly copied into the auxiliary array; once the first or second half is exhausted, the remainder of the other half is copied over.]
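A sketch of the merging step in Python, using an auxiliary output list as described above (a list-returning mergesort is included only to show how merge is used; the Pascal pseudocode works in place on the global array instead):

def merge(left, right):
    """Merge two sorted lists: repeatedly take the smaller front element;
    when one list is exhausted, copy over the rest of the other.
    At most len(left) + len(right) - 1 comparisons."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])    # one half exhausted: append the remainder of the other
    out.extend(right[j:])
    return out

def mergesort(a):
    if len(a) <= 1:
        return a
    m = len(a) // 2
    return merge(mergesort(a[:m]), mergesort(a[m:]))

print(mergesort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]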
Summary: the general divide-and-conquer template.
function DandC(P: problem): solution;
begin
  if size(P) is small enough then
    S := solve(P)                            { base case: solve directly }
  else begin
    divide P into P1, P2, P3, ..., Pk;
    S1 := DandC(P1); S2 := DandC(P2); ...; Sk := DandC(Pk);
    S := merge(S1, S2, S3, ..., Sk)          { combine the subsolutions }
  end;
  return S
end;
Summary
Divide and conquer with balancing is a useful technique if the subproblems to be solved are independent (no redundant computations); the dividing and merging phases must also be efficient. The Fibonacci problem was an example where the subproblems were not independent. Usually, either dividing or merging the subproblems will be trivial. The problem is usually, but not always, divided into two parts. Divide into ONE part: binary search. Divide into more than 2 parts: critical path problem.
Two Dimensional Search
You are given an m × n matrix of numbers A, sorted in increasing order within rows and within columns. Assume m = O(n). Design an algorithm that finds the location of an arbitrary value x in the matrix, or reports that the item is not present. Is your algorithm optimal?
How about probing the middle of the matrix? It seems we can eliminate 1/4 of the data with one comparison, and this yields 3 subproblems, each about 1/4 of the size of the original problem. Is this approach optimal? What is the recurrence in this case? T(n) = 3T(n/4) + O(1) ????
Well, It is Wrong!
T(n) = 3T(n/4) + O(1) is not correct, because the subproblems are of side n/2 (each remaining quadrant is an (n/2) × (n/2) matrix). The correct recurrence for this solution is T(n) = 3T(n/2) + O(1), which gives T(n) = O(n^(log2 3)) ≈ O(n^1.59).
[Illustration: a probe at the middle splits the matrix into regions of entries smaller than (<), larger than (>), or possibly equal to (=) x; one quadrant can be discarded.]
A Θ(n) algorithm (start at the top-right corner):
1. c := n, r := 1.
2. If c = 1 or r = m, use binary search on the remaining row or column to locate x.
3. Compare x and A[r, c]:
   x = A[r, c]: report item found in position (r, c).
   x > A[r, c]: r := r + 1; go to step 2.
   x < A[r, c]: c := c - 1; go to step 2.
At most m + n comparisons are required.
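A sketch of the Θ(n) search in Python, 0-indexed and starting at the top-right corner (the name search_sorted_matrix and the return convention are illustrative; no separate binary-search phase is needed here because the loop simply stops at the matrix boundary):

def search_sorted_matrix(A, x):
    """A is an m-by-n list of lists, increasing along every row and column.
    Each comparison discards either one row or one column, so at most
    m + n comparisons are made.  Returns (row, col) or None."""
    m, n = len(A), len(A[0])
    r, c = 0, n - 1                 # top-right corner
    while r < m and c >= 0:
        if A[r][c] == x:
            return (r, c)
        elif A[r][c] < x:
            r += 1                  # row r's largest remaining entry is < x: drop the row
        else:
            c -= 1                  # column c's smallest remaining entry is > x: drop the column
    return None

A = [[1, 4, 7],
     [2, 5, 8],
     [3, 6, 9]]
print(search_sorted_matrix(A, 6))    # (2, 1)
print(search_sorted_matrix(A, 10))   # None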
Selection
The Problem: Given a sequence S of n elements and an integer k, determine the kth smallest element in S.
Special cases: k = 1 or k = n: Θ(n) time needed. k = n/2 (the median): trivial method -- O(n^2) steps; sort then select -- O(n log n) steps.
Lower bound: Ω(n).
A Linear Time Algorithm
procedure SELECT(S, k)
1. if |S| < Q then sort S and return the kth element directly, else subdivide S into |S|/Q subsequences of Q elements each (with up to Q - 1 leftover elements).
2. Sort each subsequence and determine its median.
3. Call SELECT recursively to find m, the median of the |S|/Q medians found in step 2.
4. Create three subsequences L, E, and G of the elements of S smaller than, equal to, and larger than m, respectively.
5. if |L| ≥ k then call SELECT(L, k) else if |L| + |E| ≥ k then return(m) else SELECT(G, k - |L| - |E|).
Analysis of Selection
Let t(n) be the running time of SELECT.
Step 1: O(n)
Step 2: O(n)
Step 3: t(n/Q)
Step 4: O(n)
Step 5: t(3n/4)
The Complexity
Take Q = 5: t(n) = t(n/Q) + t(3n/4) + O(n) = t(n/5) + t(3n/4) + O(n).
Since 1/5 + 3/4 < 1, we have t(n) = Θ(n).
Recall that the solution of the recurrence relation t(n) = t(pn) + t(qn) + cn, when 0 < p + q < 1, is Θ(n).
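A sketch of SELECT in Python with Q = 5 (k is 1-based as on the slides; the list comprehensions and helper names are implementation choices):

def select(S, k):
    """Return the kth smallest element of S in worst-case linear time."""
    Q = 5
    if len(S) < Q:                                        # step 1: small input, sort directly
        return sorted(S)[k - 1]
    groups = [S[i:i + Q] for i in range(0, len(S), Q)]    # step 1: groups of Q
    medians = [sorted(g)[len(g) // 2] for g in groups]    # step 2: group medians
    m = select(medians, (len(medians) + 1) // 2)          # step 3: median of medians
    L = [x for x in S if x < m]                           # step 4: three-way split
    E = [x for x in S if x == m]
    G = [x for x in S if x > m]
    if len(L) >= k:                                       # step 5: recurse on one side
        return select(L, k)
    elif len(L) + len(E) >= k:
        return m
    else:
        return select(G, k - len(L) - len(E))

print(select([7, 2, 9, 4, 1, 5, 8, 3, 6], 5))   # 5 (the median)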
Multiplying Two n Bit Numbers
Here we are using the log-cost model, counting bit operations. The naive pencil-and-paper algorithm uses n^2 multiplications and (n-1)^2 additions (plus carries). In fact, this is also divide and conquer.
Karatsuba's algorithm, 1962: O(n^1.59)
Let X and Y each contain n bits. Write X = ab and Y = cd, where a, b, c, d are n/2-bit numbers (so X = a·2^(n/2) + b and Y = c·2^(n/2) + d). Then
XY = (a·2^(n/2) + b)(c·2^(n/2) + d) = ac·2^n + (ad + bc)·2^(n/2) + bd.
This breaks the problem up into 4 subproblems of size n/2, which doesn't do us any good. Instead, Karatsuba observed that
XY = (2^n + 2^(n/2))·ac + 2^(n/2)·(a - b)(d - c) + (2^(n/2) + 1)·bd
   = ac·2^n + ac·2^(n/2) + 2^(n/2)·(a - b)(d - c) + bd·2^(n/2) + bd,
which needs only 3 multiplications of n/2-bit numbers: ac, bd, and (a - b)(d - c).
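A sketch of Karatsuba's recursion in Python on integers, splitting at h = n/2 bits so that X = a·2^h + b and Y = c·2^h + d (the 2^32 cutoff below is an arbitrary implementation choice):

def karatsuba(x, y):
    """Multiply x and y using 3 recursive multiplications:
    xy = ac*2^(2h) + (ac + (a-b)(d-c) + bd)*2^h + bd."""
    if x < 0 or y < 0:                              # handle signs separately
        s = -1 if (x < 0) != (y < 0) else 1
        return s * karatsuba(abs(x), abs(y))
    if x < 2**32 or y < 2**32:                      # small case: built-in multiply
        return x * y
    h = max(x.bit_length(), y.bit_length()) // 2
    a, b = x >> h, x & ((1 << h) - 1)               # x = a*2^h + b
    c, d = y >> h, y & ((1 << h) - 1)               # y = c*2^h + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    mid = karatsuba(a - b, d - c) + ac + bd         # = ad + bc
    return (ac << (2 * h)) + (mid << h) + bd

x, y = 3**200, 7**150
print(karatsuba(x, y) == x * y)   # True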
Polynomial multiplication
Straightforward multiplication: O(n^2).
Using the D&C approach: O(n^1.59).
Using the FFT technique: O(n log n).
A D&C Approach
A Modified D&C Solution: O(n^1.59)
Any idea for further improvement?
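A sketch of the same three-multiplication idea for polynomials in Python, with a polynomial stored as its list of coefficients (p[i] is the coefficient of x^i); the cutoff, helper names, and representation are all implementation choices, not from the slides:

from itertools import zip_longest

def padd(p, q):
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def psub(p, q):
    return [a - b for a, b in zip_longest(p, q, fillvalue=0)]

def poly_mul(p, q):
    """Karatsuba-style multiplication of non-empty coefficient lists."""
    if len(p) <= 16 or len(q) <= 16:                # small case: schoolbook O(n^2)
        out = [0] * (len(p) + len(q) - 1)
        for i, a in enumerate(p):
            for j, b in enumerate(q):
                out[i + j] += a * b
        return out
    m = max(len(p), len(q)) // 2
    p0, p1 = p[:m], p[m:]                           # p = p0 + x^m * p1
    q0, q1 = q[:m], q[m:]
    low = poly_mul(p0, q0)
    high = poly_mul(p1, q1)
    mid = psub(psub(poly_mul(padd(p0, p1), padd(q0, q1)), low), high)   # p0*q1 + p1*q0
    out = [0] * (len(p) + len(q) - 1)
    for i, c in enumerate(low):
        out[i] += c
    for i, c in enumerate(mid):
        out[i + m] += c
    for i, c in enumerate(high):
        out[i + 2 * m] += c
    return out

print(poly_mul([1, 2, 3], [4, 5]))   # [4, 13, 22, 15], i.e. 4 + 13x + 22x^2 + 15x^3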
Matrix Multiplication
Complexity (on a uniprocessor)
Best known lower bound: Ω(n^2) (assuming m = Θ(n) and k = Θ(n)).
Straightforward algorithm: O(n^3).
Strassen's algorithm: O(n^(log 7)) = O(n^2.81).
Best known sequential algorithm: O(n^2.376)?
The true complexity of this problem is still open.
The Straightforward Method
It takes O(mnk) = O(n^3) time.
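The triple loop as a short Python sketch (A is an m × n matrix, B is n × k, given as lists of lists; the function name is illustrative):

def mat_mul(A, B):
    """Multiply an m-by-n matrix A by an n-by-k matrix B in O(mnk) time."""
    m, n, k = len(A), len(B), len(B[0])
    C = [[0] * k for _ in range(m)]
    for i in range(m):
        for j in range(k):
            for t in range(n):            # inner product of row i and column j
                C[i][j] += A[i][t] * B[t][j]
    return C

print(mat_mul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]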
A D&C approach
Strassen's algorithm
T(n) = 7T(n/2) + O(n^2) = O(n^(log 7)) = O(n^2.81)
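A sketch of Strassen's recursion in Python with NumPy, assuming square matrices whose side is a power of two (the cutoff of 64 and the use of np.block are implementation choices, not from the slides):

import numpy as np

def strassen(A, B):
    """7 recursive products per level: T(n) = 7T(n/2) + O(n^2) = O(n^log 7)."""
    n = A.shape[0]
    if n <= 64:                               # small case: ordinary multiplication
        return A @ B
    m = n // 2
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    M1 = strassen(A11 + A22, B11 + B22)       # the 7 Strassen products
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    C11 = M1 + M4 - M5 + M7                   # combine with additions only
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])

A = np.random.rand(128, 128); B = np.random.rand(128, 128)
print(np.allclose(strassen(A, B), A @ B))     # True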
Quicksort
Quicksort is a simple divide-and-conquer sorting algorithm that outperforms heapsort in practice. To sort A[p..r], do the following:
Divide: rearrange the elements to generate two subarrays A[p..q] and A[q+1..r] such that every element in A[p..q] is at most as large as every element in A[q+1..r].
Conquer: recursively sort the two subarrays.
Combine: nothing special is necessary.
To partition, choose u = A[p] as a pivot, and move everything < u to the left and everything > u to the right.
Quicksort Although mergesort is O(n log n), it is quite inconvenient for implementation with arrays, since we need space to merge. In practice, the fastest sorting algorithm is Quicksort, which uses partitioning as its main idea.
Partitioning in Quicksort
How do we partition the array efficiently?
Choose the partition element to be the rightmost element.
Scan from the right for a smaller element; scan from the left for a larger element; exchange the two.
Repeat until the pointers cross; then swap the partitioning element into the position where the pointers cross.
[The original slides step through this process on a sample array, marking at each step the pair of elements to swap and, finally, the swap with the partitioning element once the pointers cross.]
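A sketch of this partitioning scheme and of quicksort in Python (0-indexed, rightmost element as pivot; the two inner while loops are the left-to-right and right-to-left scans from the slides):

def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i, j = lo, hi - 1
    while True:
        while i < hi and a[i] < pivot:        # scan from the left for a larger element
            i += 1
        while j > lo and a[j] > pivot:        # scan from the right for a smaller element
            j -= 1
        if i >= j:                            # pointers have crossed
            break
        a[i], a[j] = a[j], a[i]               # exchange the out-of-place pair
        i, j = i + 1, j - 1
    a[i], a[hi] = a[hi], a[i]                 # swap with the partitioning element
    return i

def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        q = partition(a, lo, hi)
        quicksort(a, lo, q - 1)               # left part: elements <= pivot
        quicksort(a, q + 1, hi)               # right part: elements >= pivot

a = [3, 6, 1, 8, 2, 9, 4]
quicksort(a)
print(a)   # [1, 2, 3, 4, 6, 8, 9]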
Analysis
Worst case: if A[1..n] is already sorted, then Partition splits A[1..n] into A[1] and A[2..n] without changing the order. If that happens, the running time C(n) satisfies C(n) = C(1) + C(n - 1) + Θ(n) = Θ(n^2).
Best case: Partition keeps splitting the subarrays into halves. If that happens, the running time C(n) satisfies C(n) ≈ 2C(n/2) + Θ(n) = Θ(n log n).
Analysis Average case (for random permutation of n elements): C(n) ≈ 1.38 n log n which is about 38% higher than the best case.
Comments
Sorting the smaller subfile first keeps the stack size at most O(log n).
Do not stack right subfiles of size < 2 in the recursive algorithm -- this saves a factor of 4.
Use a different pivot selection, e.g. choose the pivot to be the median of the first, last, and middle elements.
Randomized Quicksort: turn bad instances into good instances by picking the pivot at random.
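A sketch of the randomized variant (it reuses the partition function from the quicksort sketch above; random.randint is from Python's standard library):

import random

def randomized_partition(a, lo, hi):
    """Swap a uniformly random element into the pivot position, then
    partition as before; this makes the Θ(n^2) worst case unlikely for
    every fixed input, including already-sorted ones."""
    r = random.randint(lo, hi)            # uniform pivot choice
    a[r], a[hi] = a[hi], a[r]
    return partition(a, lo, hi)           # partition() as defined in the earlier sketch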