CS 3343: Analysis of Algorithms Lecture 2: Asymptotic Notations 9/20/2018
Outline Review of last lecture Order of growth Asymptotic notations Big O, big Ω, Θ 9/20/2018
How to express algorithms? English Pseudocode Real programming languages Increasing precision Ease of expression Describe the ideas of an algorithm in English. Use pseudocode to clarify sufficiently tricky details of the algorithm. 9/20/2018
Insertion Sort InsertionSort(A, n) { for j = 2 to n { } ▷ Pre condition: A[1..j-1] is sorted 1. Find position i in A[1..j-1] such that A[i] ≤ A[j] < A[i+1] 2. Insert A[j] between A[i] and A[i+1] ▷ Post condition: A[1..j] is sorted 1 j sorted 9/20/2018
Insertion Sort InsertionSort(A, n) { for j = 2 to n { key = A[j]; i = j - 1; while (i > 0) and (A[i] > key) { A[i+1] = A[i]; i = i – 1; } A[i+1] = key } } 1 i j Key sorted 9/20/2018
Use loop invariants to prove the correctness of loops A loop invariant (LI) is a formal statement about the variables in your program which holds true throughout the loop Proof by induction Initialization: the LI is true prior to the 1st iteration Maintenance: if the LI is true before the jth iteration, it remains true before the (j+1)th iteration Termination: when the loop terminates, the invariant shows the correctness of the algorithm 9/20/2018
Loop invariants and correctness of insertion sort Claim: at the start of each iteration of the for loop, the subarray consists of the elements originally in A[1..j-1] but in sorted order. Proof: by induction 9/20/2018
Prove correctness using loop invariants InsertionSort(A, n) { for j = 2 to n { key = A[j]; i = j - 1; ▷Insert A[j] into the sorted sequence A[1..j-1] while (i > 0) and (A[i] > key) { A[i+1] = A[i]; i = i – 1; } A[i+1] = key } } Loop invariant: at the start of each iteration of the for loop, the subarray A[1..j-1] consists of the elements originally in A[1..j-1] but in sorted order. 9/20/2018
Initialization InsertionSort(A, n) { for j = 2 to n { key = A[j]; i = j - 1; ▷Insert A[j] into the sorted sequence A[1..j-1] while (i > 0) and (A[i] > key) { A[i+1] = A[i]; i = i – 1; } A[i+1] = key } } Subarray A[1] is sorted. So loop invariant is true before the loop starts. 9/20/2018
Maintenance InsertionSort(A, n) { for j = 2 to n { key = A[j]; i = j - 1; ▷Insert A[j] into the sorted sequence A[1..j-1] while (i > 0) and (A[i] > key) { A[i+1] = A[i]; i = i – 1; } A[i+1] = key } } Assume loop variant is true prior to iteration j Loop variant will be true before iteration j+1 1 i j Key sorted 9/20/2018
The algorithm is correct! Termination InsertionSort(A, n) { for j = 2 to n { key = A[j]; i = j - 1; ▷Insert A[j] into the sorted sequence A[1..j-1] while (i > 0) and (A[i] > key) { A[i+1] = A[i]; i = i – 1; } A[i+1] = key } } The algorithm is correct! Upon termination, A[1..n] contains all the original elements of A in sorted order. 1 n j=n+1 Sorted 9/20/2018
Efficiency Correctness alone is not sufficient Brute-force algorithms exist for most problems E.g. use permutation to sort Problem: too slow! How to measure efficiency? Accurate running time is not a good measure It depends on input, computer, and implementation, etc. 9/20/2018
Machine-independent A generic uniprocessor random-access machine (RAM) model No concurrent operations Each simple operation (e.g. +, -, =, *, if, for) takes 1 step. Loops and subroutine calls are not simple operations. All memory equally expensive to access Constant word size Unless we are explicitly manipulating bits 9/20/2018
Analysis of insertion Sort InsertionSort(A, n) { for j = 2 to n { key = A[j] i = j - 1; while (i > 0) and (A[i] > key) { A[i+1] = A[i] i = i - 1 } A[i+1] = key } } How many times will this line execute? 9/20/2018
Analysis of insertion Sort InsertionSort(A, n) { for j = 2 to n { key = A[j] i = j - 1; while (i > 0) and (A[i] > key) { A[i+1] = A[i] i = i - 1 } A[i+1] = key } } How many times will this line execute? 9/20/2018
Analysis of insertion Sort Statement cost time__ InsertionSort(A, n) { for j = 2 to n { c1 n key = A[j] c2 (n-1) i = j - 1; c3 (n-1) while (i > 0) and (A[i] > key) { c4 S A[i+1] = A[i] c5 (S-(n-1)) i = i - 1 c6 (S-(n-1)) } 0 A[i+1] = key c7 (n-1) } 0 } S = t2 + t3 + … + tn where tj is number of while expression evaluations for the jth for loop iteration 9/20/2018
Analyzing Insertion Sort T(n) = c1n + c2(n-1) + c3(n-1) + c4S + c5(S - (n-1)) + c6(S - (n-1)) + c7(n-1) = c8S + c9n + c10 What can S be? Best case -- inner loop body never executed tj = 1 S = n - 1 T(n) = an + b is a linear function Worst case -- inner loop body executed for all previous elements tj = j S = 2 + 3 + … + n = n(n+1)/2 - 1 T(n) = an2 + bn + c is a quadratic function Average case Can assume that on average, we have to insert A[j] into the middle of A[1..j-1], so tj = j/2 S ≈ n(n+1)/4 T(n) is still a quadratic function Θ (n) Θ (n2) Θ (n2) 9/20/2018
Analysis of insertion Sort Statement cost time__ InsertionSort(A, n) { for j = 2 to n { c1 n key = A[j] c2 (n-1) i = j - 1; c3 (n-1) while (i > 0) and (A[i] > key) { c4 S A[i+1] = A[i] c5 (S-(n-1)) i = i - 1 c6 (S-(n-1)) } 0 A[i+1] = key c7 (n-1) } 0 } What are the basic operations (most executed lines)? 9/20/2018
Analysis of insertion Sort Statement cost time__ InsertionSort(A, n) { for j = 2 to n { c1 n key = A[j] c2 (n-1) i = j - 1; c3 (n-1) while (i > 0) and (A[i] > key) { c4 S A[i+1] = A[i] c5 (S-(n-1)) i = i - 1 c6 (S-(n-1)) } 0 A[i+1] = key c7 (n-1) } 0 } 9/20/2018
Analysis of insertion Sort Statement cost time__ InsertionSort(A, n) { for j = 2 to n { c1 n key = A[j] c2 (n-1) i = j - 1; c3 (n-1) while (i > 0) and (A[i] > key) { c4 S A[i+1] = A[i] c5 (S-(n-1)) i = i - 1 c6 (S-(n-1)) } 0 A[i+1] = key c7 (n-1) } 0 } 9/20/2018
Inner loop stops when A[i] <= key, or i = 0 What can S be? Inner loop stops when A[i] <= key, or i = 0 1 i j Key sorted S = j=1..n tj Best case: Worst case: Average case: 9/20/2018
Inner loop stops when A[i] <= key, or i = 0 Best case Inner loop stops when A[i] <= key, or i = 0 1 i j Key sorted Array already sorted S = j=1..n tj tj = 1 for all j S = n. T(n) = Θ (n) 9/20/2018
Inner loop stops when A[i] <= key Worst case Inner loop stops when A[i] <= key 1 i j Key sorted Array originally in reverse order sorted S = j=1..n tj tj = j S = j=1..n j = 1 + 2 + 3 + … + n = n (n+1) / 2 = Θ (n2) 9/20/2018
Average case Array in random order S = j=1..n tj Inner loop stops when A[i] <= key 1 i j Key sorted Array in random order S = j=1..n tj tj = j / 2 on average S = j=1..n j/2 = ½ j=1..n j = n (n+1) / 4 = Θ (n2) What if we use binary search? 9/20/2018
Asymptotic Analysis Running time depends on the size of the input Larger array takes more time to sort T(n): the time taken on input with size n Look at growth of T(n) as n→∞. “Asymptotic Analysis” Size of input is generally defined as the number of input elements In some cases may be tricky 9/20/2018
Asymptotic Analysis Ignore actual and abstract statement costs Order of growth is the interesting measure: Highest-order term is what counts As the input size grows larger it is the high order term that dominates 9/20/2018
Comparison of functions log2n n nlog2n n2 n3 2n n! 10 3.3 33 102 103 106 6.6 660 104 1030 10158 109 13 105 108 1012 17 1010 1015 20 107 1018 For a super computer that does 1 trillion operations per second, it will be longer than 1 billion years 9/20/2018
Order of growth 1 << log2n << n << nlog2n << n2 << n3 << 2n << n! (We are slightly abusing of the “<<“ sign. It means a smaller order of growth). 9/20/2018
Exact analysis is hard! Worst-case and average-case are difficult to deal with precisely, because the details are very complicated It may be easier to talk about upper and lower bounds of the function. 9/20/2018
Asymptotic notations O: Big-Oh Ω: Big-Omega Θ: Theta o: Small-oh ω: Small-omega 9/20/2018
Big O Informally, O (g(n)) is the set of all functions with a smaller or same order of growth as g(n), within a constant multiple If we say f(n) is in O(g(n)), it means that g(n) is an asymptotic upper bound of f(n) Intuitively, it is like f(n) ≤ g(n) What is O(n2)? The set of all functions that grow slower than or in the same order as n2 9/20/2018
So: But: n O(n2) n2 O(n2) 1000n O(n2) n2 + n O(n2) Intuitively, O is like ≤ 9/20/2018
Intuitively, o is like < small o Informally, o (g(n)) is the set of all functions with a strictly smaller growth as g(n), within a constant multiple What is o(n2)? The set of all functions that grow slower than n2 So: 1000n o(n2) But: n2 o(n2) Intuitively, o is like < 9/20/2018
Big Ω Informally, Ω (g(n)) is the set of all functions with a larger or same order of growth as g(n), within a constant multiple f(n) Ω(g(n)) means g(n) is an asymptotic lower bound of f(n) Intuitively, it is like g(n) ≤ f(n) So: n2 Ω(n) 1/1000 n2 Ω(n) But: 1000 n Ω(n2) Intuitively, Ω is like ≥ 9/20/2018
Intuitively, ω is like > small ω Informally, ω (g(n)) is the set of all functions with a larger order of growth as g(n), within a constant multiple So: n2 ω(n) 1/1000 n2 ω(n) n2 ω(n2) Intuitively, ω is like > 9/20/2018
Theta (Θ) Informally, Θ (g(n)) is the set of all functions with the same order of growth as g(n), within a constant multiple f(n) Θ(g(n)) means g(n) is an asymptotically tight bound of f(n) Intuitively, it is like f(n) = g(n) What is Θ(n2)? The set of all functions that grow in the same order as n2 9/20/2018
So: But: n2 Θ(n2) n2 + n Θ(n2) 100n2 + n Θ(n2) 100n2 + log2n Θ(n2) But: nlog2n Θ(n2) 1000n Θ(n2) 1/1000 n3 Θ(n2) Intuitively, Θ is like = 9/20/2018
Tricky cases How about sqrt(n) and log2 n? How about log2 n and log10 n How about 2n and 3n How about 3n and n!? 9/20/2018
Big-Oh Definition: lim n→∞ g(n)/f(n) > 0 (if the limit exists.) There exist For all Definition: O(g(n)) = {f(n): positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) n>n0} lim n→∞ g(n)/f(n) > 0 (if the limit exists.) Abuse of notation (for convenience): f(n) = O(g(n)) actually means f(n) O(g(n)) 9/20/2018
Big-Oh Claim: f(n) = 3n2 + 10n + 5 O(n2) Proof by definition To prove this claim by definition, we need to find some positive constants c and n0 such that f(n) <= cn2 for all n > n0. (Note: you just need to find one concrete example of c and n0 satisfying the condition.) 3n2 + 10n + 5 10n2 + 10n + 10 10n2 + 10n2 + 10n2, n 1 30 n2, n 1 Therefore, if we let c = 30 and n0 = 1, we have f(n) c n2, n ≥ n0. Hence according to the definition of big-Oh, f(n) = O(n2). Alternatively, we can show that lim n→∞ n2 / (3n2 + 10n + 5) = 1/3 > 0 9/20/2018
Big-Omega Definition: lim n→∞ f(n)/g(n) > 0 (if the limit exists.) Ω(g(n)) = {f(n): positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n) n>n0} lim n→∞ f(n)/g(n) > 0 (if the limit exists.) Abuse of notation (for convenience): f(n) = Ω(g(n)) actually means f(n) Ω(g(n)) 9/20/2018
Big-Omega Claim: f(n) = n2 / 10 = Ω(n) Proof by definition: f(n) = n2 / 10, g(n) = n Need to find a c and a no to satisfy the definition of f(n) Ω(g(n)), i.e., f(n) ≥ cg(n) for n > n0 n ≤ n2 / 10 when n ≥ 10 If we let c = 1 and n0 = 10, we have f(n) ≥ cn, n ≥ n0. Therefore, according to definition, f(n) = Ω(n). Alternatively: lim n→∞ f(n)/g(n) = lim n→∞ (n/10) = ∞ 9/20/2018
Theta Definition: lim n→∞ f(n)/g(n) = c > 0 and c < ∞ Θ(g(n)) = {f(n): positive constants c1, c2, and n0 such that 0 c1 g(n) f(n) c2 g(n), n n0 } lim n→∞ f(n)/g(n) = c > 0 and c < ∞ f(n) = O(g(n)) and f(n) = Ω(g(n)) Abuse of notation (for convenience): f(n) = Θ(g(n)) actually means f(n) Θ(g(n)) Θ(1) means constant time. 9/20/2018
Theta Claim: f(n) = 2n2 + n = Θ (n2) Proof by definition: Need to find the three constants c1, c2, and n0 such that c1n2 ≤ 2n2+n ≤ c2n2 for all n > n0 A simple solution is c1 = 2, c2 = 3, and n0 = 1 Alternatively, limn→∞(2n2+n)/n2 = 2 9/20/2018
More Examples Prove n2 + 3n + lg n is in O(n2) Want to find c and n0 such that n2 + 3n + lg n <= cn2 for n > n0 Proof: n2 + 3n + lg n <= 3n2 + 3n + 3lgn for n > 1 <= 3n2 + 3n2 + 3n2 <= 9n2 Or n2 + 3n + lg n <= n2 + n2 + n2 for n > 10 <= 3n2 9/20/2018
More Examples Prove n2 + 3n + lg n is in Ω(n2) Want to find c and n0 such that n2 + 3n + lg n >= cn2 for n > n0 n2 + 3n + lg n >= n2 for n > 0 n2 + 3n + lg n = O(n2) and n2 + 3n + lg n = Ω (n2) => n2 + 3n + lg n = Θ(n2) 9/20/2018
O, Ω, and Θ The definitions imply a constant n0 beyond which they are satisfied. We do not care about small values of n. 9/20/2018
Using limits to compare orders of growth lim f(n) / g(n) = c > 0 ∞ f(n) o(g(n)) f(n) O(g(n)) f(n) Θ (g(n)) n→∞ f(n) Ω(g(n)) f(n) ω (g(n)) 9/20/2018
logarithms compare log2n and log10n logab = logcb / logca log2n = log10n / log102 ~ 3.3 log10n Therefore lim(log2n / log10 n) = 3.3 log2n = Θ (log10n) 9/20/2018
Therefore, 2n o(3n), and 3n ω(2n) How about 2n and 2n+1? Compare 2n and 3n lim 2n / 3n = lim(2/3)n = 0 Therefore, 2n o(3n), and 3n ω(2n) How about 2n and 2n+1? 2n / 2n+1 = ½, therefore 2n = Θ (2n+1) n→∞ n→∞ 9/20/2018
L’ Hopital’s rule lim f(n) / g(n) = lim f(n)’ / g(n)’ You can apply this transformation as many times as you want, as long as the condition holds lim f(n) / g(n) = lim f(n)’ / g(n)’ Condition: If both lim f(n) and lim g(n) are or 0 n→∞ n→∞ 9/20/2018
In fact, log n o(nε), for any ε > 0 Compare n0.5 and log n lim n0.5 / log n = ? (n0.5)’ = 0.5 n-0.5 (log n)’ = 1 / n lim (n-0.5 / 1/n) = lim(n0.5) = ∞ Therefore, log n o(n0.5) In fact, log n o(nε), for any ε > 0 n→∞ 9/20/2018
Stirling’s formula (constant) 9/20/2018
Compare 2n and n! Therefore, 2n = o(n!) Compare nn and n! Therefore, nn = ω(n!) How about log (n!)? 9/20/2018
9/20/2018
More advanced dominance ranking 9/20/2018
Asymptotic notations O: Big-Oh Ω: Big-Omega Θ: Theta o: Small-oh ω: Small-omega Intuitively: O is like o is like < is like is like > is like = 9/20/2018