Big Oh Notation
The Greek letter Omicron (Ο) is used to denote the limit of asymptotic growth of an algorithm.
If an algorithm's processing time grows linearly with the input size n, then we say the algorithm is order n, or O(n).
This notation isolates an algorithm's run-time from other factors:
– Size of the problem set
– Initialization time
– Processor speed and instruction set
Big-Oh notation
Let b(x) be the bubble sort algorithm
We say b(x) is O(n²)
– This is read as "b(x) is big-oh of n²"
– This means that as the input size increases, the running time of bubble sort will increase in proportion to the square of the input size
  In other words, it is bounded by some constant times n²
Let l(x) be the linear (or sequential) search algorithm
We say l(x) is O(n)
– Meaning the running time of linear search increases in direct proportion to the input size
Big-Oh notation
Consider: b(x) is O(n²)
– That means that b(x)'s running time is less than (or equal to) some constant times n²
Consider: l(x) is O(n)
– That means that l(x)'s running time is less than (or equal to) some constant times n
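To make those two examples concrete, here is a minimal Python sketch (not part of the original slides) of both algorithms; the nested loops in bubble sort are what give the O(n²) growth, while linear search makes a single pass over the input, giving O(n).

def bubble_sort(items):
    """Bubble sort: the nested loops give O(n^2) comparisons in the worst case."""
    a = list(items)                      # work on a copy
    n = len(a)
    for i in range(n - 1):               # outer loop: n - 1 passes
        for j in range(n - 1 - i):       # inner loop: compare adjacent pairs
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]   # swap out-of-order neighbors
    return a

def linear_search(items, target):
    """Linear (sequential) search: one pass over the input, so O(n) comparisons."""
    for index, value in enumerate(items):
        if value == target:
            return index                 # found: return its position
    return -1                            # not found

print(bubble_sort([5, 1, 4, 2, 8]))       # [1, 2, 4, 5, 8]
print(linear_search([5, 1, 4, 2, 8], 4))  # 2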
Big-Oh proofs
Show that f(x) = x² + 2x + 1 is O(x²)
– In other words, show that x² + 2x + 1 ≤ c·x²
  Where c is some constant
  For all inputs greater than some value k
We know that 2x² ≥ 2x whenever x ≥ 1
And we know that x² ≥ 1 whenever x ≥ 1
So we can replace 2x + 1 with 3x²
– We then end up with x² + 3x² = 4x²
– This yields x² + 2x + 1 ≤ 4x² ≤ c·x²
Thus, for inputs of size 1 or greater, when the constant is 4 or greater, f(x) is O(x²)
We could have chosen different values for c and k
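The same argument can be written as one chain of inequalities (a LaTeX rendering of the step above):

\[
x^2 + 2x + 1 \;\le\; x^2 + 2x^2 + x^2 \;=\; 4x^2 \qquad \text{whenever } x \ge 1,
\]
so the witnesses $c = 4$ and $k = 1$ satisfy the definition of $O(x^2)$.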
Sample Big-Oh problems
Show that f(x) = x² + 1000 is O(x²)
– In other words, show that x² + 1000 ≤ c·x²
We know that x² > 1000 whenever x > 31
– Thus, we replace 1000 with x²
– This yields x² + x² = 2x² ≤ c·x²
Thus, f(x) is O(x²) for all x > 31 when c ≥ 2
Sample Big-Oh problems
Show that f(x) = 3x + 7 is O(x)
– In other words, show that 3x + 7 ≤ c·x
We know that 7 < x whenever x > 7
– Uh huh….
– So we replace 7 with x
– This yields 3x + x = 4x ≤ c·x
Thus, f(x) is O(x) for all x > 7 when c ≥ 4
A variant of the last question
Show that f(x) = 3x + 7 is O(x²)
– In other words, show that 3x + 7 ≤ c·x²
We know that 7 < x whenever x > 7
– Uh huh….
– So we replace 7 with x
– This yields 3x + x = 4x ≤ c·x²
– This holds for x > 7 even when c = 1, since 4x ≤ x² whenever x ≥ 4
Thus, f(x) is O(x²) for all x > 7 when c ≥ 1
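For reference, the three sample proofs above compress into these inequality chains (a LaTeX summary of the slides' own steps, with the chosen witnesses in parentheses):

\begin{align*}
x^2 + 1000 &\le x^2 + x^2 = 2x^2 && \text{for } x > 31 && (c = 2,\ k = 31)\\
3x + 7 &\le 3x + x = 4x && \text{for } x > 7 && (c = 4,\ k = 7)\\
3x + 7 &\le 4x \le x^2 && \text{for } x > 7 && (c = 1,\ k = 7)
\end{align*}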
What that means
If a function is O(x)
– Then it is also O(x²)
– And it is also O(x³)
Meaning an O(x) function will grow at a rate slower than or equal to x, x², x³, etc.
Function growth rates
For input size n = 1000:
– O(1): 1
– O(log n): ≈ 10
– O(n): 10³
– O(n log n): ≈ 10⁴
– O(n²): 10⁶
– O(n³): 10⁹
– O(n⁴): 10¹²
– O(n^c): 10^(3c), where c is a constant
– 2ⁿ: ≈ 10³⁰¹, n!: ≈ 4·10²⁵⁶⁷, nⁿ: 10³⁰⁰⁰
Many interesting problems fall into these categories
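The table can be reproduced with a short Python sketch (my own illustration, assuming n = 1000 and base-2 logarithms; exact integers are used so the huge values don't overflow):

import math

n = 1000
rows = [
    ("O(1)",       1),
    ("O(log n)",   round(math.log2(n))),      # ~ 10
    ("O(n)",       n),
    ("O(n log n)", round(n * math.log2(n))),  # ~ 10^4
    ("O(n^2)",     n ** 2),
    ("O(n^3)",     n ** 3),
    ("O(n^4)",     n ** 4),
    ("2^n",        2 ** n),
    ("n!",         math.factorial(n)),
    ("n^n",        n ** n),
]
for name, value in rows:
    if value < 10 ** 6:
        print(f"{name:11} -> {value}")                          # small enough to show exactly
    else:
        print(f"{name:11} -> about 10^{len(str(value)) - 1}")   # order of magnitude only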
Function growth rates
[Chart: the growth rates above, plotted on a logarithmic scale]
Integer factorization
Factoring a composite number into its component primes is O(2ⁿ)
– Where n is the number of bits in the number
Thus, if we choose 2048-bit numbers (as in RSA keys), it takes about 2²⁰⁴⁸ steps
– That's about 10⁶¹⁶ steps!
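That last figure is easy to sanity-check in Python, since it works with big integers exactly:

# 2**2048 written in decimal has 617 digits, i.e. it is roughly 3.2 * 10**616
print(len(str(2 ** 2048)))   # prints 617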
Formal Big-Oh definition
Let f and g be functions. We say that f(x) is O(g(x)) if there are constants c and k such that
|f(x)| ≤ c·|g(x)| whenever x > k
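The definition can be spot-checked numerically. Below is a minimal Python sketch (the helper name looks_big_oh and the sample range are my own, purely illustrative); it gives evidence only, not a proof, since the definition quantifies over all x > k.

def looks_big_oh(f, g, c, k, xs):
    """Return True if |f(x)| <= c*|g(x)| holds at every sample point x > k.

    Only evidence, not a proof -- the real definition covers *all* x > k."""
    return all(abs(f(x)) <= c * abs(g(x)) for x in xs if x > k)

f = lambda x: x ** 2 + 2 * x + 1
g = lambda x: x ** 2
print(looks_big_oh(f, g, c=4, k=1, xs=range(1, 10_000)))   # True
print(looks_big_oh(f, g, c=1, k=1, xs=range(1, 10_000)))   # False: c = 1 is too small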
Big Omega (Ω) and Big Theta (Θ)
If Big-Oh is a "less than or equal" relationship:
– then Big Omega is "greater than or equal": |f(x)| ≥ c·|g(x)| whenever x > k
– and Big Theta is "equal": if f(x) is O(g(x)) and Ω(g(x)), then it is Θ(g(x))
Examples: x is O(x²); x² is Ω(x); x² is Θ(x²)
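Stated formally, in the same style as the Big-Oh definition above (these are the standard definitions; the slide only sketches them):

\begin{align*}
f(x) \text{ is } O(g(x)) &\iff \exists\, c, k : |f(x)| \le c\,|g(x)| \text{ whenever } x > k\\
f(x) \text{ is } \Omega(g(x)) &\iff \exists\, c, k : |f(x)| \ge c\,|g(x)| \text{ whenever } x > k\\
f(x) \text{ is } \Theta(g(x)) &\iff f(x) \text{ is both } O(g(x)) \text{ and } \Omega(g(x))
\end{align*}

(For Ω and Θ, the constant c must be positive.)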
A useful recursive algorithm
Merge sort
procedure mergesort(L = a₁, …, aₙ)
  if n > 1 then
    m := floor(n/2)
    L₁ := a₁, a₂, …, aₘ
    L₂ := aₘ₊₁, aₘ₊₂, …, aₙ
    L := merge(mergesort(L₁), mergesort(L₂))
  {L is now sorted into elements of increasing order}
mergesort needs merge
procedure merge(L₁, L₂ : lists)
  L := empty list
  while L₁ and L₂ are both nonempty
  begin
    remove the smaller of the first elements of L₁ and L₂ from the list it is in and put it at the right end of L
    if removal of this element makes one list empty then remove all elements from the other list and append them to L
  end
  return L
  {L is the merged list with elements in increasing order}
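For completeness, here is a runnable Python version of the two procedures above (a direct translation of the pseudocode; list slicing is used to form L₁ and L₂):

def merge(left, right):
    """Merge two sorted lists into one sorted list (the merge procedure above)."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        # remove the smaller of the two front elements and append it to the result
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    # one list is empty: append whatever remains of the other
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def mergesort(items):
    """Recursively split in half, sort each half, then merge (O(n log n))."""
    n = len(items)
    if n <= 1:
        return list(items)
    m = n // 2                             # m := floor(n/2)
    return merge(mergesort(items[:m]), mergesort(items[m:]))

print(mergesort([8, 2, 4, 6, 9, 7, 10, 1, 5, 3]))   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]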
Time complexity
First: how many recursive calls are there for n inputs?
– Each call splits the list in half, so the recursion is about log₂ n levels deep (roughly 2n calls in total)
– Each level does at most O(n) work in merge
– So merge sort runs in O(n log n) time
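One way to make that count precise is the usual recurrence for merge sort's comparison count (assuming, for simplicity, that n is a power of 2 and that merging two halves costs at most n − 1 comparisons):

\[
T(n) = 2\,T\!\left(\frac{n}{2}\right) + (n - 1), \qquad T(1) = 0,
\]
which unrolls across the $\log_2 n$ levels of recursion to $T(n) = n\log_2 n - n + 1$, so merge sort is $O(n \log n)$.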
Satisfiability
Consider a Boolean expression of the form:
(x₁ ∨ x₂ ∨ x₃) ∧ (x₂ ∨ x₃ ∨ x₄) ∧ (¬x₁ ∨ x₄ ∨ x₅)
– This is a conjunction of disjunctions
Is such an expression satisfiable?
– In other words, can you assign truth values to all the xᵢ's such that the expression is true?
– The above problem is easy (only 3 clauses over 5 variables) – set x₁, x₂, and x₄ to true
  There are other possibilities: set x₁, x₂, and x₅ to true, etc.
– But consider an expression with 1000 variables and thousands of clauses
Satisfiability
If given a solution, it is easy to check whether it works
– Plug in the values – this can be done quickly, even by hand
However, there is no known efficient way to find such a solution
– The only definitive way to do so is to try all possible values for the n Boolean variables
– That means this takes O(2ⁿ) steps!
– Thus, there is no known polynomial-time algorithm for it
NP stands for "nondeterministic polynomial" (time) – not "not polynomial"
Cook's theorem (1971) states that SAT is NP-complete
– There still may be an efficient way to solve it, though!
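To make the 2ⁿ behavior concrete, here is a brute-force Python sketch for the small example from the previous slide (the literal encoding – a positive integer i for xᵢ, a negative one for ¬xᵢ – is my own choice):

from itertools import product

# (x1 ∨ x2 ∨ x3) ∧ (x2 ∨ x3 ∨ x4) ∧ (¬x1 ∨ x4 ∨ x5)
# a positive number i means x_i, a negative number -i means ¬x_i
clauses = [[1, 2, 3], [2, 3, 4], [-1, 4, 5]]
num_vars = 5

def satisfies(assignment, clauses):
    """Check a candidate assignment (tuple of booleans, index i-1 = x_i) -- this part is fast."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# Finding a satisfying assignment by brute force tries up to 2**num_vars candidates.
for candidate in product([False, True], repeat=num_vars):
    if satisfies(candidate, clauses):
        print("satisfiable, e.g.", candidate)
        break
else:
    print("unsatisfiable")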
NP Completeness
There are hundreds of NP-complete problems
– It has been shown that if you can solve one of them efficiently, then you can solve them all
– Example: the traveling salesman problem
  Given a number of cities and the costs of traveling from any city to any other city, what is the cheapest round-trip route that visits each city exactly once and then returns to the starting city?
Not all problems with O(2ⁿ) algorithms are NP-complete
– In particular, integer factorization (also O(2ⁿ)) is not thought to be NP-complete
NP Completeness
It is "widely believed" that there is no efficient solution to NP-complete problems
– In other words, everybody has that belief
If you could solve an NP-complete problem in polynomial time, you would be showing that P = NP
– And you'd get a million-dollar prize (and lots of fame!)
– If this were possible, it would be like proving that Newton's or Einstein's laws of physics were wrong
In summary:
– NP-complete problems are very difficult to solve, but easy to check the solutions of
– It is believed that there is no efficient way to solve them
An aside: inequalities
If you have an inequality you need to show: x < y
You can replace the lesser side with something greater: x + 1 < y
– If you can still show this to be true, then the original inequality is true
Consider showing that 15 < 20
– You can replace 15 with 16, and then show that 16 < 20
– Because 15 < 16 and 16 < 20, we have 15 < 20
An aside: inequalities
If you have an inequality you need to show: x < y
You can replace the greater side with something lesser: x < y − 1
– If you can still show this to be true, then the original inequality is true
Consider showing that 15 < 20
– You can replace 20 with 19, and then show that 15 < 19
– Because 15 < 19 and 19 < 20, we have 15 < 20
An aside: inequalities
What if you do such a replacement and can't show anything?
– Then you can't say anything about the original inequality
Consider showing that 15 < 20
– You can replace 20 with 10
– But you can't show that 15 < 10
– So you can't say anything one way or the other about the original inequality