10/13/2015 CS 3343: Analysis of Algorithms. Lecture 9: Review for midterm 1 and analysis of quick sort.


Exam (midterm 1)
- Closed-book exam
- One cheat sheet allowed (limited to a single page of letter-size paper, double-sided)
- Tuesday, Feb 24, 10:00 – 11:25pm
- Basic calculator (no graphing) is allowed

Materials covered
Up to Lecture 8 (Feb 6)
- Comparing functions (O, Θ, Ω): definition, limit method, L'Hopital's rule, Stirling's formula
- Analyzing iterative algorithms: know how to count the number of basic operations and express the running time as the sum of a series; know how to compute the sum of a series (geometric, arithmetic, or other frequently seen series)
- Analyzing recursive algorithms: define the recurrence; solve the recurrence using the recursion tree / iteration method; solve the recurrence using the master method; prove bounds using the substitution method

Asymptotic notations (in terms of growth rate)
- O: <=
- o: <
- Ω: >=
- ω: >
- Θ: =

Mathematical definitions
- O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 <= f(n) <= c g(n) for all n > n0 }
- Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 <= c g(n) <= f(n) for all n > n0 }
- Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 <= c1 g(n) <= f(n) <= c2 g(n) for all n >= n0 }

Big-Oh
Claim: f(n) = 3n^2 + 10n + 5 ∈ O(n^2)
Proof by definition:
f(n) = 3n^2 + 10n + 5
     <= 3n^2 + 10n^2 + 5, for all n > 1
     <= 3n^2 + 10n^2 + 5n^2, for all n > 1
     = 18n^2, for all n > 1
If we let c = 18 and n0 = 1, we have f(n) <= c n^2 for all n > n0. Therefore, by definition, f(n) = O(n^2).
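The final bound in the proof is easy to spot-check numerically; this small Python sketch simply verifies it over a range of n:

```python
# Spot-check the bound from the proof: 3n^2 + 10n + 5 <= 18 n^2 for all n > 1
assert all(3*n*n + 10*n + 5 <= 18*n*n for n in range(2, 10000))
print("bound holds for 2 <= n < 10000")
```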

Use limits to compare orders of growth
lim (n→∞) f(n)/g(n) =
- 0: f(n) ∈ o(g(n)), and so f(n) ∈ O(g(n))
- c > 0 (constant): f(n) ∈ Θ(g(n))
- ∞: f(n) ∈ ω(g(n)), and so f(n) ∈ Ω(g(n))
L'Hopital's rule: lim (n→∞) f(n)/g(n) = lim (n→∞) f'(n)/g'(n), on the condition that both lim f(n) and lim g(n) are ∞ or 0.
Stirling's formula: n! ≈ sqrt(2πn) (n/e)^n.
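The limit method can be approximated numerically by sampling the ratio f(n)/g(n) at increasing n. A minimal Python sketch (the functions chosen here, lg n versus sqrt(n), are illustrative choices, not from the slides):

```python
import math

# Sample f(n)/g(n) for f(n) = lg n, g(n) = sqrt(n) at n = 10, 100, ..., 10^7.
ratios = [math.log2(n) / math.sqrt(n) for n in (10**k for k in range(1, 8))]

# The ratios decrease toward 0, suggesting lg n is in o(sqrt(n)).
print(ratios)
```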

Useful rules for logarithms
For all a > 0, b > 0, c > 0, the following rules hold:
- log_b a = log_c a / log_c b = lg a / lg b (so log_10 n = log_2 n / log_2 10)
- log_b (a^n) = n log_b a (so log 3^n = n log 3 = Θ(n))
- b^(log_b a) = a (so 2^(log_2 n) = n)
- log (ab) = log a + log b (so log (3n) = log 3 + log n = Θ(log n))
- log (a/b) = log a − log b (so log (n/2) = log n − log 2 = Θ(log n))
- log_b a = 1 / log_a b
- log_b 1 = 0
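Each rule can be spot-checked numerically with Python's math module (the value of n below is arbitrary):

```python
import math

n = 37.0
assert math.isclose(math.log10(n), math.log2(n) / math.log2(10))  # log_b a = log_c a / log_c b
assert math.isclose(math.log(3**5), 5 * math.log(3))              # log a^n = n log a
assert math.isclose(2 ** math.log2(n), n)                         # b^(log_b a) = a
assert math.isclose(math.log(3 * n), math.log(3) + math.log(n))   # log(ab) = log a + log b
assert math.isclose(math.log(n / 2), math.log(n) - math.log(2))   # log(a/b) = log a - log b
assert math.isclose(math.log2(7), 1 / math.log(2, 7))             # log_b a = 1 / log_a b
print("all logarithm rules check out")
```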

Useful rules for exponentials
For all a > 0, b > 0, c > 0, the following rules hold:
- a^0 = 1 (0^0 = ? Answer: it does not exist)
- a^1 = a
- a^(-1) = 1/a
- (a^m)^n = a^(mn), and (a^m)^n = (a^n)^m (so (3^n)^2 = 3^(2n) = (3^2)^n = 9^n)
- a^m a^n = a^(m+n) (so n^2 · n^3 = n^5, and 2^n · 2^2 = 2^(n+2) = 4 · 2^n = Θ(2^n))

More advanced dominance ranking

Sum of arithmetic series
If a1, a2, …, an is an arithmetic series, then
sum_{i=1..n} a_i = n (a1 + an) / 2

Sum of geometric series
sum_{i=0..n} r^i = (r^(n+1) − 1) / (r − 1) if r ≠ 1, and n + 1 if r = 1.
Asymptotically, the sum is Θ(1) if r < 1, Θ(r^n) if r > 1, and Θ(n) if r = 1.
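Both closed forms are easy to confirm with exact integer arithmetic (n and r below are arbitrary):

```python
n, r = 50, 3

# Arithmetic series 1 + 2 + ... + n = n(a1 + an)/2 with a1 = 1, an = n
assert sum(range(1, n + 1)) == n * (n + 1) // 2

# Geometric series r^0 + r^1 + ... + r^n = (r^(n+1) - 1)/(r - 1) for r != 1
assert sum(r**i for i in range(n + 1)) == (r**(n + 1) - 1) // (r - 1)
print("series formulas verified")
```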

Sum manipulation rules
- sum c·a_i = c · sum a_i
- sum (a_i + b_i) = sum a_i + sum b_i
- sum_{i=p..q} a_i = sum_{i=p..r} a_i + sum_{i=r+1..q} a_i
Example: sum_{i=1..n} (2i + 1) = 2 · sum_{i=1..n} i + n = n(n+1) + n = Θ(n^2)

Analyzing non-recursive algorithms
- Decide the parameter (input size)
- Identify the most-executed line (basic operation)
- Is worst-case = average-case?
- T(n) = sum_i t_i
- Conclude T(n) = Θ(f(n))

Analysis of insertion sort

InsertionSort(A, n) {                      // cost  times
    for j = 2 to n {                       // c1    n
        key = A[j]                         // c2    n-1
        i = j - 1                          // c3    n-1
        while (i > 0) and (A[i] > key) {   // c4    S
            A[i+1] = A[i]                  // c5    S-(n-1)
            i = i - 1                      // c6    S-(n-1)
        }
        A[i+1] = key                       // c7    n-1
    }
}

Here S = sum_{j=2..n} t_j, where t_j is the number of while-loop tests made in iteration j.

Best case: the array is already sorted. The inner loop stops immediately when A[i] <= key, so t_j = 1 for all j. S = sum_{j=2..n} t_j = n − 1, and T(n) = Θ(n).

Worst case: the array is originally in reverse order. The inner loop runs all the way down, so t_j = j. S = sum_{j=2..n} j = 2 + 3 + … + n = n(n+1)/2 − 1 = Θ(n^2).

Average case: the array is in random order. On average t_j = j/2, so S = sum_{j=2..n} j/2 = (1/2) sum_{j=2..n} j = Θ(n^2).
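The three cases can be observed by instrumenting insertion sort to count how many times the inner-loop test A[i] > key is evaluated. A sketch in Python (0-based indexing, unlike the slides' 1-based pseudocode):

```python
def insertion_sort(a):
    """Sort a copy of a; return (sorted list, number of inner-loop comparisons)."""
    a = list(a)
    comps = 0
    for j in range(1, len(a)):          # slides' j = 2..n, shifted to 0-based
        key = a[j]
        i = j - 1
        while i >= 0:
            comps += 1                  # one evaluation of "A[i] > key"
            if a[i] <= key:
                break
            a[i + 1] = a[i]             # shift the larger element right
            i -= 1
        a[i + 1] = key
    return a, comps

n = 100
_, best = insertion_sort(range(n))            # sorted input: t_j = 1
_, worst = insertion_sort(range(n, 0, -1))    # reverse input: t_j = j
print(best, worst)   # n-1 = 99 comparisons vs. n(n-1)/2 = 4950
```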

Analyzing recursive algorithms
- Define the recurrence relation
- Solve the recurrence relation:
  - Recursion tree (iteration) method
  - Substitution method
  - Master method

Analyzing merge sort

MERGE-SORT A[1..n]                                      T(n)
1. If n = 1, done.                                      Θ(1)
2. Recursively sort A[1..⌈n/2⌉] and A[⌈n/2⌉+1..n].      2T(n/2)
3. "Merge" the 2 sorted lists.                          f(n) = Θ(n)

T(n) = 2T(n/2) + Θ(n)
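A direct Python rendering of the three steps (a sketch; the merge subroutine, only named in the slides, is inlined here):

```python
def merge_sort(a):
    # T(n) = 2T(n/2) + Theta(n): two half-size recursive sorts plus a linear merge
    if len(a) <= 1:                       # step 1: if n = 1, done
        return list(a)
    mid = len(a) // 2
    left = merge_sort(a[:mid])            # step 2: sort the first half
    right = merge_sort(a[mid:])           #         sort the second half
    merged, i, j = [], 0, 0               # step 3: merge the 2 sorted lists
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))   # [1, 2, 2, 3, 4, 5, 6, 7]
```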

Recursive insertion sort

RecursiveInsertionSort(A[1..n])
1. if (n == 1) do nothing;
2. RecursiveInsertionSort(A[1..n-1]);
3. Find index i in A such that A[i] <= A[n] < A[i+1];
4. Insert A[n] after A[i];

Binary search

BinarySearch(A[1..N], value) {
    if (N == 0) return -1;               // not found
    mid = (1+N)/2;
    if (A[mid] == value) return mid;     // found
    else if (A[mid] > value)
        return BinarySearch(A[1..mid-1], value);
    else
        return BinarySearch(A[mid+1..N], value);
}
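An equivalent runnable Python version (0-based, passing index bounds instead of slicing, so the returned index refers to the original array):

```python
def binary_search(a, value, lo=0, hi=None):
    """Return an index of value in sorted list a, or -1 if not found.
    T(n) = T(n/2) + Theta(1) = Theta(log n)."""
    if hi is None:
        hi = len(a) - 1
    if lo > hi:                 # empty range: not found
        return -1
    mid = (lo + hi) // 2
    if a[mid] == value:
        return mid
    elif a[mid] > value:
        return binary_search(a, value, lo, mid - 1)
    else:
        return binary_search(a, value, mid + 1, hi)

a = [1, 3, 5, 7, 9, 11]
print(binary_search(a, 7))   # 3
print(binary_search(a, 4))   # -1
```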

Recursion tree
Solve T(n) = 2T(n/2) + n.
The root costs n; its two children each cost n/2 (total n); the four grandchildren each cost n/4 (total n); and so on, so every level sums to n. The height is h = log n, and there are n leaves, each costing Θ(1). Total: Θ(n log n).

Substitution method
Recurrence: T(n) = 2T(n/2) + n.
Guess: T(n) = O(n log n) (e.g., by the recursion tree method).
To prove it, we have to show T(n) ≤ c n log n for some c > 0 and for all n > n0.
Proof by induction: assume it is true for T(n/2); prove that it is also true for T(n). This means:
- Fact: T(n) = 2T(n/2) + n
- Assumption: T(n/2) ≤ c(n/2) log(n/2)
- Need to prove: T(n) ≤ c n log n

Proof
To prove T(n) = O(n log n), we need to show that T(n) ≤ cn log n for some positive c and all sufficiently large n. Assume this inequality is true for T(n/2), which means T(n/2) ≤ c(n/2) log(n/2). Substituting the r.h.s. of this inequality for T(n/2) in the recurrence, we have
T(n) = 2T(n/2) + n
     ≤ 2 · c(n/2) log(n/2) + n
     = cn (log n − 1) + n
     = cn log n − (cn − n)
     ≤ cn log n, for c ≥ 1 and all n ≥ 1.
Therefore, by definition, T(n) = O(n log n).

Master theorem
T(n) = a T(n/b) + f(n). Key: compare f(n) with n^(log_b a).
- CASE 1: f(n) = O(n^(log_b a − ε)) ⇒ T(n) = Θ(n^(log_b a)).
- CASE 2: f(n) = Θ(n^(log_b a)) ⇒ T(n) = Θ(n^(log_b a) log n).
- CASE 3: f(n) = Ω(n^(log_b a + ε)) and a f(n/b) ≤ c f(n) for some c < 1 (the regularity condition) ⇒ T(n) = Θ(f(n)).
Optional: extended case 2.
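For driving functions of the simple polynomial form f(n) = n^d, the case test reduces to comparing d with log_b a. This helper is an illustrative sketch (not from the slides) that applies that comparison:

```python
import math

def master_theorem(a, b, d):
    """Solve T(n) = a T(n/b) + Theta(n^d), for f(n) = n^d only.
    For such f the regularity condition of case 3 holds automatically,
    since a f(n/b) = (a / b^d) f(n) with a / b^d < 1 when d > log_b a."""
    crit = math.log2(a) / math.log2(b)       # critical exponent log_b a
    if abs(d - crit) < 1e-9:
        return f"Theta(n^{d:g} log n)"       # case 2
    elif d < crit:
        return f"Theta(n^{crit:g})"          # case 1
    else:
        return f"Theta(n^{d:g})"             # case 3

print(master_theorem(2, 2, 1))   # merge sort, T(n) = 2T(n/2) + n: Theta(n^1 log n)
print(master_theorem(8, 2, 2))   # T(n) = 8T(n/2) + n^2: Theta(n^3)
print(master_theorem(2, 2, 2))   # T(n) = 2T(n/2) + n^2: Theta(n^2)
```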

Analysis of Quick Sort

Quick sort
- Another divide-and-conquer sorting algorithm, like merge sort
- Anyone remember the basic idea?
- The worst-case and average-case running time?
- Learn some new algorithm analysis tricks

Quick sort
Quicksort an n-element array:
1. Divide: Partition the array into two subarrays around a pivot x such that elements in the lower subarray are <= x and elements in the upper subarray are >= x.
2. Conquer: Recursively sort the two subarrays.
3. Combine: Trivial.
Key: a linear-time partitioning subroutine.

Partition
All the action takes place in the partition() function:
- Rearranges the subarray A[p..r] in place
- End result: two subarrays, with all values in the first subarray <= all values in the second
- Returns the index q of the "pivot" element separating the two subarrays

Pseudocode for quicksort

QUICKSORT(A, p, r)
    if p < r then
        q ← PARTITION(A, p, r)
        QUICKSORT(A, p, q–1)
        QUICKSORT(A, q+1, r)

Initial call: QUICKSORT(A, 1, n)

Idea of partition: if we are allowed to use a second array, it would be easy.

Another idea: keep two iterators, one from the head and one from the tail.

In-place partition

Partition in words
Partition(A, p, r):
- Select an element to act as the "pivot" (which?)
- Grow two regions, A[p..i] and A[j..r], such that all elements in A[p..i] <= pivot and all elements in A[j..r] >= pivot
- Increment i until A[i] > pivot
- Decrement j until A[j] < pivot
- Swap A[i] and A[j]
- Repeat until i >= j
- Swap A[j] and A[p]
- Return j
Note: this is different from the book's partition(), which uses two iterators that both move forward.

Partition code

Partition(A, p, r)
    x = A[p];                // pivot is the first element
    i = p;
    j = r + 1;
    while (TRUE) {
        repeat i++; until A[i] > x or i >= j;
        repeat j--; until A[j] < x or j < i;
        if (i < j) Swap(A[i], A[j]);
        else break;
    }
    swap(A[p], A[j]);
    return j;

What is the running time of partition()? partition() runs in Θ(n) time.
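A direct Python transcription of this partition scheme plus the quicksort driver (0-based indexing; like the analysis slides, it assumes distinct elements):

```python
def partition(a, p, r):
    x = a[p]                      # pivot is the first element
    i, j = p, r + 1
    while True:
        i += 1                    # repeat i++ until a[i] > x or i >= j
        while not (i >= j or a[i] > x):
            i += 1
        j -= 1                    # repeat j-- until a[j] < x or j < i
        while not (j < i or a[j] < x):
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            break
    a[p], a[j] = a[j], a[p]       # final swap puts the pivot in its place
    return j

def quicksort(a, p=0, r=None):
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)

data = [6, 10, 13, 5, 8, 3, 2, 11]
quicksort(data)
print(data)   # [2, 3, 5, 6, 8, 10, 11, 13]
```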

Partition example (figure): with pivot x = 6, the two iterators i and j scan toward each other, swapping out-of-place pairs until they cross; a final swap then places the pivot at position q between the two subarrays.

Quick sort example (figure)

Analysis of quicksort
- Assume all input elements are distinct.
- In practice, there are better partitioning algorithms for when duplicate input elements may exist.
- Let T(n) = worst-case running time on an array of n elements.

Worst-case of quicksort
- Input sorted or reverse sorted.
- Partition around the min or max element.
- One side of the partition always has no elements, giving T(n) = T(0) + T(n–1) + Θ(n) = Θ(n^2) (arithmetic series).

Worst-case recursion tree
T(n) = T(0) + T(n–1) + n

Expanding: the root costs n with children T(0) and T(n–1); T(n–1) becomes a node of cost (n–1) with children T(0) and T(n–2); and so on down to T(0). The tree has height n, with a T(0) = Θ(1) leaf hanging off each level.

T(n) = n · Θ(1) + (n + (n–1) + (n–2) + … + 1) = Θ(n) + Θ(n^2) = Θ(n^2)

Best-case analysis (for intuition only!)
If we're lucky, PARTITION splits the array evenly:
T(n) = 2T(n/2) + Θ(n) = Θ(n log n) (same as merge sort)
What if the split is always 1/10 : 9/10?
T(n) = T(n/10) + T(9n/10) + Θ(n)
What is the solution to this recurrence?

Analysis of the "almost-best" case
Recursion tree for T(n) = T(n/10) + T(9n/10) + n: each level sums to at most n. The shallowest leaf is at depth log_10 n, the deepest at depth log_{10/9} n, and there are O(n) leaves of cost Θ(1) in total. Therefore
n log_10 n ≤ T(n) ≤ n log_{10/9} n + Θ(n),
so T(n) = Θ(n log n).

Quicksort runtimes
- Best-case runtime: T_best(n) ∈ Θ(n log n)
- Worst-case runtime: T_worst(n) ∈ Θ(n^2)
- Worse than mergesort? Why is it called quicksort then?
- Its average runtime: T_avg(n) ∈ Θ(n log n)
- Better yet, the expected runtime of randomized quicksort is Θ(n log n)

Randomized quicksort
Randomly choose an element as the pivot:
- Every time we need to do a partition, throw a die to decide which element to use as the pivot
- Each element has probability 1/n of being selected

Rand-Partition(A, p, r)
    d = random();                     // a random number between 0 and 1
    index = p + floor((r-p+1) * d);   // p <= index <= r
    swap(A[p], A[index]);
    Partition(A, p, r);               // now do partition using A[p] as pivot
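An out-of-place Python sketch of randomized quicksort (illustrative only: it shows the uniform pivot choice but not the slides' in-place partition, and it tolerates duplicates by collecting equal elements separately):

```python
import random

def rand_quicksort(a):
    if len(a) <= 1:
        return list(a)
    x = a[random.randrange(len(a))]      # each element picked with probability 1/n
    less = [v for v in a if v < x]
    equal = [v for v in a if v == x]
    greater = [v for v in a if v > x]
    return rand_quicksort(less) + equal + rand_quicksort(greater)

data = [random.randrange(100) for _ in range(50)]
assert rand_quicksort(data) == sorted(data)
print("randomized quicksort sorted 50 elements correctly")
```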

Running time of randomized quicksort
The expected running time is an average of all cases:
T(n) = T(0) + T(n–1) + dn    if the split is 0 : n–1,
       T(1) + T(n–2) + dn    if the split is 1 : n–2,
       …
       T(n–1) + T(0) + dn    if the split is n–1 : 0.
Since each split occurs with probability 1/n, taking the expectation gives
T(n) = (1/n) sum_{k=0..n–1} (T(k) + T(n–1–k)) + dn = (2/n) sum_{k=0..n–1} T(k) + dn.


Solving recurrences
1. Recursion tree (iteration) method: good for guessing an answer
2. Substitution method: generic method, rigid, but may be hard
3. Master method: easy to learn, useful in limited cases only; some tricks may help in other cases

Substitution method
The most general method to solve a recurrence (prove O and Ω separately):
1. Guess the form of the solution (e.g., using recursion trees, or expansion)
2. Verify by induction (inductive step)

Expected running time of quicksort
Recurrence: T(n) = (2/n) sum_{k=0..n–1} T(k) + dn.
Guess: T(n) = O(n log n). We need to show that T(n) ≤ c n log n for some c and sufficiently large n. (We use T(n) instead of E[T(n)] for convenience.)

Fact: T(n) = (2/n) sum_{k=0..n–1} T(k) + dn
Need to show: T(n) ≤ c n log n
Assume: T(k) ≤ c k log k for 0 ≤ k ≤ n–1
Proof:
T(n) = (2/n) sum_{k=0..n–1} T(k) + dn
     ≤ (2/n) sum_{k=1..n–1} c k log k + dn
     ≤ (2c/n) ((1/2) n^2 log n − (1/8) n^2) + dn    (using the fact that sum_{k=1..n–1} k log k ≤ (1/2) n^2 log n − (1/8) n^2)
     = c n log n − (c/4) n + dn
     ≤ c n log n, if c ≥ 4d.
Therefore, by definition, T(n) = O(n log n).
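The guess can also be checked numerically by iterating the recurrence T(n) = (2/n) sum T(k) + dn directly. A sketch with d = 1 and base values T(0) = T(1) = 0 (both of which are assumptions for illustration):

```python
import math

def expected_cost(N, d=1.0):
    """Iterate T(n) = (2/n) * sum_{k=0}^{n-1} T(k) + d*n for n = 2..N."""
    T = [0.0] * (N + 1)       # T(0) = T(1) = 0, assumed base cases
    prefix = 0.0              # running sum T(0) + ... + T(n-1)
    for n in range(2, N + 1):
        T[n] = (2.0 / n) * prefix + d * n
        prefix += T[n]
    return T

T = expected_cost(2000)
# The c >= 4d bound from the proof holds at every computed point (c = 4, d = 1):
assert all(T[n] <= 4 * n * math.log2(n) for n in range(2, 2001))
print("T(n) <= 4 n lg n verified for n up to 2000")
```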

Tightly bounding the key summation sum_{k=1..n–1} k lg k

Split the summation for a tighter bound:
sum_{k=1..n–1} k lg k = sum_{k=1..⌈n/2⌉–1} k lg k + sum_{k=⌈n/2⌉..n–1} k lg k

The lg k in the first term is bounded by lg(n/2) = lg n – 1, and the lg k in the second term is bounded by lg n; move both outside their summations:
 ≤ (lg n – 1) sum_{k=1..⌈n/2⌉–1} k + lg n sum_{k=⌈n/2⌉..n–1} k

Distribute the (lg n – 1); the summations overlap in range, so combine them:
 = lg n sum_{k=1..n–1} k – sum_{k=1..⌈n/2⌉–1} k

Apply the Gaussian (arithmetic) series to both terms and multiply it all out:
 ≤ (1/2) n(n–1) lg n – (1/2)(n/2 – 1)(n/2)
 ≤ (1/2) n^2 lg n – (1/8) n^2, for all n ≥ 2.
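The closed-form bound this derivation aims at, sum_{k=1..n–1} k lg k ≤ (1/2) n^2 lg n – (1/8) n^2, can be sanity-checked numerically (an illustrative sketch, not part of the original slides):

```python
import math

def key_sum(n):
    # sum_{k=1}^{n-1} k*lg(k); the k = 1 term is 0, so start at k = 2
    return sum(k * math.log2(k) for k in range(2, n))

for n in (4, 16, 100, 1000, 5000):
    bound = 0.5 * n * n * math.log2(n) - n * n / 8.0
    assert key_sum(n) <= bound
print("summation bound verified for the sampled n")
```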