CS 3343: Analysis of Algorithms


CS 3343: Analysis of Algorithms
Lecture 9: Review for midterm 1 and analysis of quick sort

Exam (midterm 1)
- Closed-book exam
- One cheat sheet allowed (limited to a single page of letter-size paper, double-sided)
- Thursday, Feb 23, class time + 5 minutes
- A basic calculator (no graphing) is allowed
- Do NOT use phones / tablets as calculators

Materials covered
- Up to Lecture 8 (Feb 2)
- O, Θ, Ω
  - Compare orders of growth
  - Prove O, Θ, Ω by definition
- Analyzing iterative algorithms
  - Use a loop invariant to prove correctness
  - Know how to count the number of basic operations and express the running time as the sum of a series
  - Know how to compute the sum of geometric and arithmetic series
- Analyzing recursive algorithms
  - Use induction to prove correctness
  - Define the running time using a recurrence
  - Solve recurrences using the recursion tree / iteration method
  - Solve recurrences using the master method
  - Prove bounds using the substitution method

Asymptotic notations (in terms of growth rate)
- O: ≤
- o: <
- Ω: ≥
- ω: >
- Θ: =

Mathematical definitions
- O(g(n)) = { f(n) : ∃ positive constants c and n0 such that 0 ≤ f(n) ≤ c g(n) ∀ n > n0 }
- Ω(g(n)) = { f(n) : ∃ positive constants c and n0 such that 0 ≤ c g(n) ≤ f(n) ∀ n > n0 }
- Θ(g(n)) = { f(n) : ∃ positive constants c1, c2, and n0 such that 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) ∀ n ≥ n0 }

Big-Oh
Claim: f(n) = 3n² + 10n + 5 ∈ O(n²)
Proof by definition:
  f(n) = 3n² + 10n + 5
       ≤ 3n² + 10n² + 5,   ∀ n > 1
       ≤ 3n² + 10n² + 5n², ∀ n > 1
       = 18n²,             ∀ n > 1
If we let c = 18 and n0 = 1, we have f(n) ≤ c n², ∀ n > n0. Therefore, by definition, f(n) = O(n²).

Use limits to compare orders of growth
Evaluate lim (n→∞) f(n) / g(n):
- = 0                   ⟹ f(n) ∈ o(g(n)), hence f(n) ∈ O(g(n))
- = c > 0 (a constant)  ⟹ f(n) ∈ Θ(g(n))
- = ∞                   ⟹ f(n) ∈ ω(g(n)), hence f(n) ∈ Ω(g(n))
L'Hôpital's rule: lim (n→∞) f(n) / g(n) = lim (n→∞) f′(n) / g′(n), provided both lim f(n) and lim g(n) are ∞ or both are 0.
Stirling's formula: n! ≈ √(2πn) (n/e)ⁿ
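If sympy is installed, the limit test is easy to check mechanically; this snippet (ours, not from the slide) verifies the Big-Oh claim from the previous slide:

    from sympy import symbols, limit, oo

    n = symbols('n', positive=True)
    # lim (3n^2 + 10n + 5) / n^2 as n -> infinity
    c = limit((3*n**2 + 10*n + 5) / n**2, n, oo)
    print(c)  # 3: a positive constant, so f(n) = Θ(n²), hence also O(n²)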

Useful rules for logarithms
For all a > 0, b > 0, c > 0, the following rules hold:
- log_b a = log_c a / log_c b = lg a / lg b    So: log₁₀ n = log₂ n / log₂ 10
- log_b aⁿ = n log_b a                         So: log 3ⁿ = n log 3 = Θ(n)
- b^(log_b a) = a                              So: 2^(log₂ n) = n
- log(ab) = log a + log b                      So: log(3n) = log 3 + log n = Θ(log n)
- log(a/b) = log a − log b                     So: log(n/2) = log n − log 2 = Θ(log n)
- log_b a = 1 / log_a b
- log_b 1 = 0

Useful rules for exponentials
For all a > 0, b > 0, c > 0, the following rules hold:
- a⁰ = 1  (0⁰ = ? Answer: does not exist)
- a¹ = a;  a⁻¹ = 1/a
- (aᵐ)ⁿ = aᵐⁿ = (aⁿ)ᵐ    So: (3ⁿ)² = 3²ⁿ = (3²)ⁿ = 9ⁿ
- aᵐ aⁿ = aᵐ⁺ⁿ           So: n²·n³ = n⁵;  2ⁿ·2² = 2ⁿ⁺² = 4·2ⁿ = Θ(2ⁿ)

More advanced dominance ranking

Sum of arithmetic series
If a1, a2, …, an is an arithmetic series, then
  Σ (i=1..n) aᵢ = n (a1 + an) / 2

Sum of geometric series
  Σ (i=0..n) rⁱ = (r^(n+1) − 1) / (r − 1) for r ≠ 1
- if r < 1: the sum is Θ(1)
- if r > 1: the sum is Θ(rⁿ)
- if r = 1: the sum is n + 1

Sum manipulation rules (linearity and index splitting)
  Σ c·aₖ = c Σ aₖ
  Σ (aₖ + bₖ) = Σ aₖ + Σ bₖ
  Σ (k=p..q) aₖ = Σ (k=p..m) aₖ + Σ (k=m+1..q) aₖ
Example: Σ (k=1..n) 2k = 2 Σ (k=1..n) k = n(n+1)

Analyzing non-recursive algorithms
- Decide on the parameter (input size)
- Identify the most-executed line (the basic operation)
- Is worst-case = average-case?
- T(n) = Σᵢ tᵢ
- Express T(n) = Θ(f(n))

Analysis of Insertion Sort

  Statement                          cost   times
  InsertionSort(A, n) {
    for j = 2 to n {                 c1     n
      key = A[j]                     c2     n−1
      i = j − 1;                     c3     n−1
      while (i > 0) and (A[i] > key) {
                                     c4     S
        A[i+1] = A[i]                c5     S−(n−1)
        i = i − 1                    c6     S−(n−1)
      }                                     0
      A[i+1] = key                   c7     n−1
    }                                       0
  }
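For concreteness, here is a runnable Python version of the pseudocode above (0-based indices; the function name and the shift counter are our additions for illustration):

    def insertion_sort(A):
        """Sort list A in place; return the number of inner-loop shifts,
        which tracks the dominant term S in the cost table above."""
        shifts = 0
        for j in range(1, len(A)):            # for j = 2 to n (0-based here)
            key = A[j]
            i = j - 1
            while i >= 0 and A[i] > key:      # inner loop runs t_j times
                A[i + 1] = A[i]               # shift larger elements right
                i -= 1
                shifts += 1
            A[i + 1] = key                    # insert key into the sorted prefix
        return shifts

    print(insertion_sort([1, 2, 3, 4]))       # 0 shifts: best case, Θ(n)
    print(insertion_sort([4, 3, 2, 1]))       # 6 shifts: worst case, Θ(n²)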

Best case
- Inner loop stops when A[i] ≤ key, or i = 0.
- Array already sorted: each key is compared once with the element to its left.
- S = Σ (j=1..n) tⱼ with tⱼ = 1 for all j, so S = n and T(n) = Θ(n).

Worst case
- Inner loop stops when A[i] ≤ key.
- Array originally in reverse order: each key is compared with every element of the sorted prefix.
- S = Σ (j=1..n) tⱼ with tⱼ = j, so
  S = Σ (j=1..n) j = 1 + 2 + 3 + … + n = n(n+1)/2 = Θ(n²).

Average case
- Inner loop stops when A[i] ≤ key.
- Array in random order: on average the key moves halfway through the sorted prefix.
- S = Σ (j=1..n) tⱼ with tⱼ = j/2 on average, so
  S = Σ (j=1..n) j/2 = ½ Σ (j=1..n) j = n(n+1)/4 = Θ(n²).

Use loop invariants to prove the correctness of Insertion Sort
Loop invariant (LI): at the start of each iteration of the for loop, the subarray A[1..j−1] consists of the elements originally in A[1..j−1], but in sorted order.
Proof by induction:
- Initialization: the LI is true at the start of the 1st iteration (j = 2), since A[1] is sorted by itself.
- Maintenance: if the LI is true at the start of the j-th iteration (i.e., A[1..j−1] has all the elements originally in A[1..j−1], but in sorted order), it remains true before the (j+1)-th iteration (i.e., A[1..j] has all the elements originally in A[1..j], in sorted order), because the while loop finds the right position in A[1..j−1] at which to insert A[j].
- Termination: when the loop terminates, j = n+1. By the LI, A[1..n] has all the elements originally in A[1..n], but in sorted order. Therefore the algorithm is correct.

Analyzing recursive algorithms
- Prove correctness using induction
- Define the running time as a recurrence
- Solve the recurrence:
  - Recursion tree (iteration) method
  - Substitution method
  - Master method

Correctness of merge sort
MERGE-SORT A[1..n]
1. If n = 1, done.
2. Recursively sort A[1..⌈n/2⌉] and A[⌈n/2⌉+1..n].
3. "Merge" the 2 sorted lists.
Proof:
- Base case: if n = 1, the algorithm will return the correct answer because A[1..1] is already sorted.
- Inductive hypothesis: assume that the algorithm correctly sorts the smaller subarrays A[1..⌈n/2⌉] and A[⌈n/2⌉+1..n].
- Step: if A[1..⌈n/2⌉] and A[⌈n/2⌉+1..n] are both correctly sorted, the whole array A[1..n] is sorted after merging.
Therefore, the algorithm is correct.

Analyzing merge sort
  MERGE-SORT A[1..n]                                T(n)
  1. If n = 1, done.                                Θ(1)
  2. Recursively sort A[1..⌈n/2⌉] and A[⌈n/2⌉+1..n] 2T(n/2)
  3. "Merge" the 2 sorted lists.                    f(n) = Θ(n)
T(n) = 2T(n/2) + Θ(n)
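A minimal Python sketch of the algorithm being analyzed may help connect the recurrence to code (the function name is ours; the slide gives only pseudocode):

    def merge_sort(A):
        """Return a sorted copy of A; running time T(n) = 2T(n/2) + Θ(n)."""
        if len(A) <= 1:                  # base case: already sorted, Θ(1)
            return A
        mid = len(A) // 2
        left = merge_sort(A[:mid])       # T(n/2)
        right = merge_sort(A[mid:])      # T(n/2)
        merged, i, j = [], 0, 0          # merge step: Θ(n)
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        return merged + left[i:] + right[j:]

    print(merge_sort([6, 10, 5, 8, 13, 3, 2, 11]))  # [2, 3, 5, 6, 8, 10, 11, 13]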

Recursive Insertion Sort
  RecursiveInsertionSort(A[1..n])
  1. if (n == 1) do nothing;
  2. RecursiveInsertionSort(A[1..n-1]);
  3. Find the index i in A such that A[i] <= A[n] < A[i+1];
  4. Insert A[n] after A[i];
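A runnable Python version of this recursive variant (0-based indices; names are ours), whose running time satisfies T(n) = T(n−1) + O(n) = O(n²):

    def recursive_insertion_sort(A, n=None):
        """Sort A[0..n-1] in place."""
        if n is None:
            n = len(A)
        if n <= 1:                               # step 1: nothing to do
            return
        recursive_insertion_sort(A, n - 1)       # step 2: sort the prefix
        key = A[n - 1]                           # steps 3-4: insert the last
        i = n - 2                                # element into the sorted prefix
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key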

Binary Search
  BinarySearch(A[1..N], value) {
    if (N == 0) return -1;         // not found
    mid = (1+N)/2;
    if (A[mid] == value)
      return mid;                  // found
    else if (A[mid] > value)
      return BinarySearch(A[1..mid-1], value);
    else
      return BinarySearch(A[mid+1..N], value);
  }
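Passing subarray copies, as the pseudocode suggests, would cost Θ(n) per call in Python; the sketch below (names and the 0-based convention are ours) passes indices instead, preserving T(n) = T(n/2) + Θ(1) = Θ(log n):

    def binary_search(A, value, lo=0, hi=None):
        """Return an index of value in sorted list A, or -1 if absent."""
        if hi is None:
            hi = len(A) - 1
        if lo > hi:                    # empty range: not found
            return -1
        mid = (lo + hi) // 2
        if A[mid] == value:
            return mid                 # found
        elif A[mid] > value:
            return binary_search(A, value, lo, mid - 1)
        else:
            return binary_search(A, value, mid + 1, hi)

    print(binary_search([2, 3, 5, 6, 8, 10, 11, 13], 8))  # 4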

Recursion tree
Solve T(n) = 2T(n/2) + n.
- The root costs n; its two children cost n/2 each; the four grandchildren cost n/4 each; and so on down to Θ(1) leaves.
- Each level of the tree sums to n, and the height is h = log n.
- #leaves = n, each costing Θ(1), for Θ(n) at the leaf level.
- Total: Θ(n log n).

Substitution method
Recurrence: T(n) = 2T(n/2) + n.
Guess: T(n) = O(n log n) (e.g., by the recursion tree method).
To prove it, we have to show T(n) ≤ c n log n for some c > 0 and for all n > n0.
Proof by induction: assume the bound is true for T(n/2); prove that it is also true for T(n). That is:
- Fact: T(n) = 2T(n/2) + n
- Assumption: T(n/2) ≤ c(n/2) log(n/2)
- Need to prove: T(n) ≤ c n log n

Proof
To prove T(n) = O(n log n), we need to show that T(n) ≤ c n log n for some positive c and all sufficiently large n. Assume this inequality is true for T(n/2), i.e., T(n/2) ≤ c(n/2) log(n/2). Substituting the r.h.s. of this inequality for T(n/2) in the recurrence, we have
  T(n) = 2T(n/2) + n
       ≤ 2 · c(n/2) log(n/2) + n
       = cn (log n − 1) + n
       = cn log n − (cn − n)
       ≤ cn log n   for c ≥ 1 and all n ≥ 0.
Therefore, by definition, T(n) = O(n log n).

Master theorem
T(n) = a T(n/b) + f(n)
Key: compare f(n) with n^(log_b a)
- CASE 1: f(n) = O(n^(log_b a − ε))  ⟹  T(n) = Θ(n^(log_b a)).
- CASE 2: f(n) = Θ(n^(log_b a))  ⟹  T(n) = Θ(n^(log_b a) log n).
- CASE 3: f(n) = Ω(n^(log_b a + ε)), and a f(n/b) ≤ c f(n) (the regularity condition)  ⟹  T(n) = Θ(f(n)).
Optional: extended case 2.
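As a quick check (this worked example is ours, not from the slide), here is the theorem applied to the merge sort recurrence:

  \[
  T(n) = 2T(n/2) + n:\quad a = 2,\; b = 2,\;
  n^{\log_b a} = n^{\log_2 2} = n,\quad
  f(n) = n = \Theta\bigl(n^{\log_2 2}\bigr)
  \;\Longrightarrow\; \text{Case 2: } T(n) = \Theta(n\log n).
  \]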

Analysis of Quick Sort

Quick sort
- Another divide-and-conquer sorting algorithm, like merge sort
- Anyone remember the basic idea?
- The worst-case and average-case running times?
- We will learn some new algorithm analysis tricks

Quick sort
Quicksort an n-element array:
- Divide: Partition the array into two subarrays around a pivot x such that elements in the lower subarray ≤ x ≤ elements in the upper subarray:
    [ ≤ x | x | ≥ x ]
- Conquer: Recursively sort the two subarrays.
- Combine: Trivial.
Key: a linear-time partitioning subroutine.

Partition
All the action takes place in the partition() function:
- Rearranges the subarray in place
- End result: two subarrays, with all values in the first ≤ all values in the second
- Returns the index of the "pivot" element separating the two subarrays:
    A[p..r] becomes [ ≤ x | x (at index q) | ≥ x ]

Pseudocode for quicksort
  QUICKSORT(A, p, r)
    if p < r
      then q ← PARTITION(A, p, r)
           QUICKSORT(A, p, q−1)
           QUICKSORT(A, q+1, r)
Initial call: QUICKSORT(A, 1, n)

Idea of partition
If we are allowed to use a second array, it would be easy:
  6 10 5 8 13 3 2 11     (input; pivot = 6)
  6 5 3 2 11 13 8 10     (elements ≤ pivot packed after it, elements ≥ pivot at the end)
  2 5 3 6 11 13 8 10     (swap the pivot into place between the two regions)

Another idea
Keep two iterators: one from the head, one from the tail.
  6 10 5 8 13 3 2 11     (input; pivot = 6)
  6 2 5 3 13 8 10 11     (out-of-place pairs swapped as the iterators move inward)
  3 2 5 6 13 8 10 11     (pivot swapped into its final position)

In-place Partition
(The slide animates the in-place partition; a step-by-step trace appears two slides below.)

Partition In Words
Partition(A, p, r):
- Select an element to act as the "pivot" (which?)
- Grow two regions, A[p..i] and A[j..r], such that:
  - all elements in A[p..i] <= pivot
  - all elements in A[j..r] >= pivot
- Increment i until A[i] > pivot
- Decrement j until A[j] < pivot
- Swap A[i] and A[j]
- Repeat until i >= j
- Swap A[j] and A[p]
- Return j
Note: this differs from the book's partition(), which uses two iterators that both move forward.

Partition Code
  Partition(A, p, r)
    x = A[p];              // pivot is the first element
    i = p;
    j = r + 1;
    while (TRUE) {
      repeat i++; until A[i] > x or i >= j;
      repeat j--; until A[j] < x or j < i;
      if (i < j) Swap(A[i], A[j]);
      else break;
    }
    swap(A[p], A[j]);
    return j;
What is the running time of partition()?
partition() runs in Θ(n) time.
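A self-contained Python rendering of this partition, together with the quicksort driver from the earlier pseudocode (our adaptation: 0-based indices, and the scan conditions are reordered so they cannot read past the array bounds):

    def partition(A, p, r):
        """Partition A[p..r] around pivot A[p]; return the pivot's final index."""
        x = A[p]                          # pivot is the first element
        i, j = p, r + 1
        while True:
            i += 1
            while i < j and A[i] <= x:    # advance i past elements <= pivot
                i += 1
            j -= 1
            while j >= i and A[j] >= x:   # retreat j past elements >= pivot
                j -= 1
            if i < j:
                A[i], A[j] = A[j], A[i]   # swap an out-of-place pair
            else:
                break
        A[p], A[j] = A[j], A[p]           # put the pivot between the regions
        return j

    def quicksort(A, p=0, r=None):
        """Sort A[p..r] in place."""
        if r is None:
            r = len(A) - 1
        if p < r:
            q = partition(A, p, r)
            quicksort(A, p, q - 1)
            quicksort(A, q + 1, r)

    data = [6, 10, 5, 8, 13, 3, 2, 11]
    quicksort(data)
    print(data)                           # [2, 3, 5, 6, 8, 10, 11, 13]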

Partition example (x = 6)
  6 10 5 8 13 3 2 11     scan: i stops at 10, j stops at 2
  6 2 5 8 13 3 10 11     swap A[i] and A[j]
  6 2 5 8 13 3 10 11     scan: i stops at 8, j stops at 3
  6 2 5 3 13 8 10 11     swap A[i] and A[j]
  6 2 5 3 13 8 10 11     scan: i and j cross
  3 2 5 6 13 8 10 11     final swap: pivot 6 lands at position q

Quick sort example
  6 10 5 8 11 3 2 13
  3 2 5 6 11 8 10 13
  2 3 5 6 10 8 11 13
  2 3 5 6 8 10 11 13
  2 3 5 6 8 10 11 13

Analysis of quicksort
- Assume all input elements are distinct.
- In practice, there are better partitioning algorithms for when duplicate input elements may exist.
- Let T(n) = worst-case running time on an array of n elements.

Worst-case of quicksort
- Input sorted or reverse sorted.
- Partition around the min or max element.
- One side of the partition always has no elements:
  T(n) = T(0) + T(n−1) + Θ(n) = T(n−1) + Θ(n) = Θ(n²)   (arithmetic series)

Worst-case recursion tree
T(n) = T(0) + T(n−1) + n
Expanding the recursion: the root costs n, with children T(0) and T(n−1); T(n−1) in turn costs (n−1), with children T(0) and T(n−2); and so on.
- The tree is a path of height n with per-level costs n, (n−1), (n−2), …, 1, plus a T(0) = Θ(1) leaf hanging off each level.
- The Θ(1) leaves contribute Θ(n) in total; the path costs sum to n + (n−1) + … + 1 = Θ(n²).
- T(n) = Θ(n) + Θ(n²) = Θ(n²)

Best-case analysis (For intuition only!)
If we're lucky, PARTITION splits the array evenly:
  T(n) = 2T(n/2) + Θ(n) = Θ(n log n)   (same as merge sort)
What if the split is always 1/10 : 9/10?
  T(n) = T(n/10) + T(9n/10) + Θ(n)
What is the solution to this recurrence?

Analysis of "almost-best" case
Draw the recursion tree for T(n) = T(n/10) + T(9n/10) + Θ(n):
- Every level of the tree sums to at most n.
- The shallowest branch bottoms out at depth log₁₀ n; the deepest at depth log_{10/9} n.
- There are O(n) leaves, each costing Θ(1).
Bounding from both sides:
  n log₁₀ n ≤ T(n) ≤ n log_{10/9} n + O(n)
so T(n) = Θ(n log n).

Quicksort Runtimes
- Best-case runtime: T_best(n) ∈ Θ(n log n)
- Worst-case runtime: T_worst(n) ∈ Θ(n²)
  Worse than mergesort? Why is it called quicksort then?
- Its average runtime: T_avg(n) ∈ Θ(n log n)
- Better even: the expected runtime of randomized quicksort is Θ(n log n)

Randomized quicksort
- Randomly choose an element as the pivot: every time we need to do a partition, throw a die to decide which element to use as the pivot
- Each element has probability 1/n of being selected
  Rand-Partition(A, p, r)
    d = random();                      // a random number between 0 and 1
    index = p + floor((r-p+1) * d);    // p <= index <= r
    swap(A[p], A[index]);
    Partition(A, p, r);                // now do partition using A[p] as pivot
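A Python sketch of the same idea (ours; it reuses the partition() from the earlier sketch, and random.randint stands in for the floor-of-random computation):

    import random

    def rand_partition(A, p, r):
        """Swap a uniformly random element of A[p..r] into position p, then partition."""
        index = random.randint(p, r)   # each index chosen with probability 1/(r-p+1)
        A[p], A[index] = A[index], A[p]
        return partition(A, p, r)      # pivot is now A[p]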

Running time of randomized quicksort T(0) + T(n–1) + dn if 0 : n–1 split, T(1) + T(n–2) + dn if 1 : n–2 split, M T(n–1) + T(0) + dn if n–1 : 0 split, T(n) = The expected running time is an average of all cases Expectation 5/29/2018

Taking the expectation over the n equally likely splits:
  E[T(n)] = (1/n) Σ (k=0..n−1) [T(k) + T(n−1−k)] + Θ(n) = (2/n) Σ (k=0..n−1) T(k) + Θ(n)

Solving recurrences
- Recursion tree (iteration) method: good for guessing an answer
- Substitution method: generic and rigorous, but may be hard to apply
- Master method: easy to learn, but useful in limited cases only; some tricks may help in other cases

Substitution method
The most general method to solve a recurrence (prove O and Ω separately):
1. Guess the form of the solution (e.g., using recursion trees or expansion).
2. Verify by induction (the inductive step).

Expected running time of Quicksort
- Guess: T(n) = O(n log n)
- We need to show that T(n) ≤ c n log n for some c and sufficiently large n
- We write T(n) instead of E[T(n)] for convenience

Fact: T(n) = (2/n) Σ (k=0..n−1) T(k) + Θ(n)
Need to show: T(n) ≤ c n log n
Assume: T(k) ≤ c k log k for 0 ≤ k ≤ n−1
Proof:
  T(n) = (2/n) Σ (k=0..n−1) T(k) + Θ(n)
       ≤ (2/n) Σ (k=1..n−1) c k log k + Θ(n)
       ≤ (2c/n) (½ n² log n − n²/8) + Θ(n)   (using the fact, proved on the next slides, that Σ (k=1..n−1) k log k ≤ ½ n² log n − n²/8)
       = c n log n − cn/4 + Θ(n)
       ≤ c n log n   if c ≥ 4.
Therefore, by definition, T(n) = O(n log n).

Tightly Bounding The Key Summation
Split the summation for a tighter bound:
  Σ (k=1..n−1) k lg k = Σ (k=1..⌈n/2⌉−1) k lg k + Σ (k=⌈n/2⌉..n−1) k lg k
Bound the lg k in the second term by lg n:
  ≤ Σ (k=1..⌈n/2⌉−1) k lg k + Σ (k=⌈n/2⌉..n−1) k lg n
Move the lg n outside the summation:
  = Σ (k=1..⌈n/2⌉−1) k lg k + lg n Σ (k=⌈n/2⌉..n−1) k

Tightly Bounding The Key Summation
The summation bound so far:
  Σ (k=1..n−1) k lg k ≤ Σ (k=1..⌈n/2⌉−1) k lg k + lg n Σ (k=⌈n/2⌉..n−1) k
Bound the lg k in the first term by lg(n/2):
  ≤ Σ (k=1..⌈n/2⌉−1) k lg(n/2) + lg n Σ (k=⌈n/2⌉..n−1) k
Use lg(n/2) = lg n − 1:
  = Σ (k=1..⌈n/2⌉−1) k (lg n − 1) + lg n Σ (k=⌈n/2⌉..n−1) k
Move (lg n − 1) outside the summation:
  = (lg n − 1) Σ (k=1..⌈n/2⌉−1) k + lg n Σ (k=⌈n/2⌉..n−1) k

Tightly Bounding The Key Summation
The summation bound so far:
  Σ (k=1..n−1) k lg k ≤ (lg n − 1) Σ (k=1..⌈n/2⌉−1) k + lg n Σ (k=⌈n/2⌉..n−1) k
Distribute the (lg n − 1):
  = lg n Σ (k=1..⌈n/2⌉−1) k − Σ (k=1..⌈n/2⌉−1) k + lg n Σ (k=⌈n/2⌉..n−1) k
The summations overlap in range; combine them:
  = lg n Σ (k=1..n−1) k − Σ (k=1..⌈n/2⌉−1) k
Apply the Gaussian series to the first summation:
  = ½ n(n−1) lg n − Σ (k=1..⌈n/2⌉−1) k

Tightly Bounding The Key Summation
The summation bound so far:
  Σ (k=1..n−1) k lg k ≤ ½ n(n−1) lg n − Σ (k=1..⌈n/2⌉−1) k
Rearrange the first term; apply the Gaussian series to lower-bound the second:
  ≤ ½ n² lg n − ½ n lg n − ½ (n/2)(n/2 − 1)
Multiply it all out:
  = ½ n² lg n − ½ n lg n − n²/8 + n/4
  ≤ ½ n² lg n − n²/8   (since ½ n lg n ≥ n/4 for n ≥ 2)

Tightly Bounding The Key Summation
Done: we have shown
  Σ (k=1..n−1) k lg k ≤ ½ n² lg n − n²/8   (for n ≥ 2),
which is exactly the fact used in the expected running time proof above.