CSCI 256 Data Structures and Algorithm Analysis Lecture 10 Some slides by Kevin Wayne copyright 2005, Pearson Addison Wesley all rights reserved, and some by Iker Gondra
Review: Graph Lemmas for MST Algorithms
Edge inclusion lemma (also called the “Cut Property”): Let S be a nonempty proper subset of V, and let e = (u, v) be the minimum cost edge with u in S and v in V-S (edge costs assumed distinct). Then e is in every MST T of G.
Cycle Property: The most expensive edge on a cycle is never in an MST.
(Proofs in Lecture 9)
Prim’s Algorithm (grow a tree, T)
S = { s }; T = { };
while S != V
    choose the minimum cost edge e = (u,v) with u in S and v in V-S
    add e to T
    add v to S
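The loop above can be sketched in Python as follows. This is a minimal sketch, not the lecture's own implementation: the adjacency-list input format, the function name `prim_mst`, and the use of a binary heap with lazily discarded stale entries are all assumptions made for illustration. It also assumes the graph is connected and that node labels are mutually comparable (needed only to break ties between equal costs in the heap).

```python
import heapq

def prim_mst(graph, s):
    """Return the MST edges (u, v, cost) of a connected undirected graph.

    graph: dict mapping each node to a list of (neighbor, cost) pairs
           (an assumed input format, not given on the slide).
    s:     the start node.
    """
    in_tree = {s}        # the set S
    tree_edges = []      # the set T
    # Candidate edges (cost, u, v) whose endpoint u is already in S.
    frontier = [(cost, s, v) for v, cost in graph[s]]
    heapq.heapify(frontier)
    while frontier and len(in_tree) < len(graph):
        cost, u, v = heapq.heappop(frontier)
        if v in in_tree:
            continue     # stale entry: both ends are already in S
        # (u, v) is the cheapest edge leaving S, so add it to T and v to S.
        in_tree.add(v)
        tree_edges.append((u, v, cost))
        for w, c in graph[v]:
            if w not in in_tree:
                heapq.heappush(frontier, (c, v, w))
    return tree_edges
```

For example, on the four-node graph with edges a-b (1), b-c (2), c-d (3), a-c (4), b-d (6), starting from a, the sketch picks a-b, then b-c, then c-d, for a total cost of 6.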
Prove: Prim’s algorithm computes an MST
(1) The algorithm only adds edges belonging to every MST. Each iteration begins with a set S, a subset of V on which a partial spanning tree has been constructed. The algorithm then adds the node v not in S, together with the edge e = (u,v), that minimizes the attachment cost $\min_{e = (u,v) : u \in S} c_e$ over all nodes v in V-S. By definition, e is the cheapest edge with one end in S and the other in V-S, so by the Cut Property it is in every minimum spanning tree of G.
(2) The algorithm produces a spanning tree: each added edge joins a new node to the tree on S, so no cycle is created, and the loop ends only when S = V.
Divide and Conquer
Divide-and-conquer
– Break up problem into several parts
– Solve each part recursively
– Combine solutions to sub-problems into overall solution
Most common usage
– Break up problem of size n into two equal parts of size n/2
– Solve two parts recursively
– Combine two solutions into overall solution in linear time
Consequence
– Brute force: $n^2$
– Divide-and-conquer: $n \log n$
Divide et impera. Veni, vidi, vici. - Julius Caesar
Mergesort
– Divide array into two halves
– Recursively sort each half
– Merge two halves to make sorted whole
Example:
    input:   A L G O R I T H M S
    divide:  A L G O R | I T H M S      O(1)
    sort:    A G L O R | H I M S T      2T(n/2)
    merge:   A G H I L M O R S T        O(n)
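Mergesort
Array Mergesort(Array a) {
    n = a.Length;
    if (n <= 1) return a;              // base case: already sorted
    b = Mergesort(a[0 .. n/2 - 1]);    // left half (ranges are inclusive)
    c = Mergesort(a[n/2 .. n-1]);      // right half
    return Merge(b, c);                // combine the two sorted halves
}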
Merging
Merging: Combine two pre-sorted lists into a sorted whole
How to merge efficiently?
– Linear number of comparisons
– Use temporary array
– Challenge for the bored: In-place merge using only a constant amount of extra storage [Kronrod, 1969]
Example (merge in progress): inputs A G L O R and H I M S T, output so far A G H I
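The merge step with a temporary array, and the mergesort built on it, can be sketched in Python as below. This is an illustrative sketch rather than the lecture's code; the function names `merge` and `mergesort` and the choice to return new lists instead of sorting in place are assumptions.

```python
def merge(b, c):
    """Merge two sorted lists into one sorted list using a temporary list."""
    out = []
    i = j = 0
    # Each comparison moves one element to the output, so the number of
    # comparisons is at most len(b) + len(c) - 1.
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            out.append(b[i])
            i += 1
        else:
            out.append(c[j])
            j += 1
    # One input is exhausted; copy whatever remains of the other.
    out.extend(b[i:])
    out.extend(c[j:])
    return out

def mergesort(a):
    """Return a sorted copy of a."""
    n = len(a)
    if n <= 1:
        return list(a)
    mid = n // 2
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))
```

Using `<=` in the comparison takes ties from the left list first, which keeps the sort stable. Calling `mergesort(list("ALGORITHMS"))` reproduces the slide's example and yields the letters of AGHILMORST in order.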
Analysis of Mergesort
Def: T(n) = worst case running time on input of size n
Cost of mergesort?
– O(1) to divide the input into 2 pieces of size n/2
– T(n/2) on each piece of size n/2
– O(n) to combine the solutions from the two recursive calls
T(n) ≤ 2T(n/2) + cn for n > 2
T(2) ≤ c
Remark: assume parameters like n are even (rather than considering floors and ceilings); the asymptotic bounds are the same and the manipulation is cleaner.
Recurrence Analysis
Two solution methods
– Unrolling the recurrence
– Guess and verify
Unrolling the Mergesort Recurrence
– Analyse the first few levels
– Identify a pattern
– Sum over all levels of recursion
Unrolling the Mergesort Recurrence
Prove T(n) is bounded by O(n log n)
We have a single problem of size n, which takes time cn plus the time for the recursive calls. The recursive calls get us to the next level.
Level 0: cn
Level 1: cn/2 + cn/2 = cn
Level 2: 4(cn/4) = cn
Level j: $2^j (cn/2^j) = cn$
Last level: $2^{?} (cn/2^{?}) = cn$
How many levels? ($\log_2 n$)
So summing over all levels gives us: $cn \log_2 n$
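As a sanity check on the unrolling, the recurrence can be evaluated directly and compared with the bound cn log_2 n. The sketch below assumes n is a power of 2 and takes the recurrence with equality (the worst case it allows); the names `mergesort_cost` and `unrolled_bound` are hypothetical helpers introduced only for this illustration.

```python
def mergesort_cost(n, c=1):
    """Evaluate T(n) = 2T(n/2) + cn with T(2) = c, for n a power of 2, n >= 2."""
    if n == 2:
        return c
    return 2 * mergesort_cost(n // 2, c) + c * n

def unrolled_bound(n, c=1):
    """The bound c * n * log_2(n) obtained by summing cn over log_2(n) levels."""
    levels = n.bit_length() - 1   # equals log_2(n) when n is a power of 2
    return c * n * levels

# Compare the exact recurrence value with the bound for a few sizes.
for k in range(1, 6):
    n = 2 ** k
    print(n, mergesort_cost(n), unrolled_bound(n))
```

With c = 1 and n = 8 the recurrence gives 20 while the bound is 24, so the bound holds with a little room; the slack comes from the base case T(2) ≤ c being cheaper than a full level of cost cn.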
Substitution (Guess and Verify) for Mergesort Recurrence
Suppose we are given: T(n) ≤ 2T(n/2) + cn for n > 2, and T(2) ≤ c
Claim: $T(n) \le cn \log_2 n$
Prove this for n ≥ 2 by induction (in class), using strong induction.
Strong Induction
Let P(n) be a statement about the integers; we can use strong induction to prove P(n) holds (is true) for every n ≥ b as follows:
1. Base case: Prove P(b) is true.
2. Induction step: For any m > b, assume P(k) holds for all k with b ≤ k < m. Using this assumption, prove P(m) is true.
Conclude that P(n) is true for all n ≥ b.
Strong induction allowed us to prove that if T(n) ≤ 2T(n/2) + cn for n > 2, and T(2) ≤ c, then for all n ≥ 2, $T(n) \le cn \log_2 n$. So T(n) is O(n log n).
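For reference, one way to write out the induction behind this result is sketched below. It is a reconstruction under the slides' simplifying assumption that n is even at every step (for instance, n a power of 2), not a transcript of the in-class proof.

```latex
% Base case (n = 2): T(2) \le c \le 2c = c \cdot 2 \cdot \log_2 2.
% Induction step (n > 2): assume T(k) \le ck \log_2 k for all 2 \le k < n.
% Since 2 \le n/2 < n, the hypothesis applies to n/2:
\begin{align*}
T(n) &\le 2\,T(n/2) + cn \\
     &\le 2 \cdot c \cdot \frac{n}{2} \log_2\!\left(\frac{n}{2}\right) + cn \\
     &= cn\,(\log_2 n - 1) + cn \\
     &= cn \log_2 n .
\end{align*}
```

Strong induction is needed because the step for n relies on the claim at n/2, not at n - 1.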
Solve the more general recurrence relation
A more general class of divide and conquer algorithms creates q subproblems of size n/2 each, then combines the results in O(n) time. For these we see that:
T(n) ≤ q T(n/2) + cn for n > 2
T(2) ≤ c
Solve this by unrolling!
Example: if q = 3
Level 0: cn
Level 1: (3/2)cn
Level 2: (9/4)cn
At the j-th level there are $3^j$ problems of size $n/2^j$
Work performed at level j is $3^j \cdot cn/2^j = (3/2)^j cn$
There are $\log_2 n$ levels (j ranges from 0 to $\log_2 n - 1$)
Sum work over all levels (read details page 216): T(n) is $O(n^{\log_2 3})$
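The sum over levels for q = 3 is a geometric series, and it can be worked out as in the sketch below. The explicit constant 2c is what this particular derivation yields; the slide itself only points to the textbook for details.

```latex
\begin{align*}
T(n) &\le \sum_{j=0}^{\log_2 n - 1} \left(\frac{3}{2}\right)^{j} cn
      = cn \cdot \frac{(3/2)^{\log_2 n} - 1}{3/2 - 1} \\
     &\le 2cn \left(\frac{3}{2}\right)^{\log_2 n}
      = 2cn \cdot n^{\log_2 (3/2)}
      = 2cn \cdot n^{\log_2 3 - 1}
      = 2c\, n^{\log_2 3} .
\end{align*}
```

The key identity is $a^{\log_2 n} = n^{\log_2 a}$, applied with $a = 3/2$. Since $\log_2 3 \approx 1.585$, the running time grows faster than $n \log n$ but slower than $n^2$.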