CS 2133: Data Structures
Mathematics Review and Asymptotic Notation
Arithmetic Series Review
1 + 2 + 3 + . . . + n = ?
Let S_n = a + (a+d) + (a+2d) + (a+3d) + . . . + (a+(n-1)d).
Writing the same sum in reverse order: S_n = (a+(n-1)d) + (a+(n-2)d) + (a+(n-3)d) + . . . + a.
Adding the two forms term by term: 2S_n = [2a+(n-1)d] + [2a+(n-1)d] + . . . + [2a+(n-1)d] = n[2a+(n-1)d].
Hence S_n = (n/2)[2a + (n-1)d], and consequently 1 + 2 + 3 + … + n = n(n+1)/2.
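As a quick sanity check, here is a minimal C sketch (the function names are illustrative, not from the course) that compares the closed form S_n = (n/2)[2a + (n-1)d] against a direct loop:

    #include <stdio.h>

    /* Sum of the arithmetic series a, a+d, ..., a+(n-1)d by direct addition. */
    long arith_sum_loop(long a, long d, long n) {
        long sum = 0;
        for (long i = 0; i < n; i++)
            sum += a + i * d;
        return sum;
    }

    /* Same sum via the closed form S_n = (n/2)[2a + (n-1)d].
       The product n*(2a+(n-1)d) is always even, so integer division is exact. */
    long arith_sum_formula(long a, long d, long n) {
        return n * (2 * a + (n - 1) * d) / 2;
    }

    int main(void) {
        /* 1 + 2 + ... + 100 should be 100*101/2 = 5050. */
        printf("%ld %ld\n", arith_sum_loop(1, 1, 100), arith_sum_formula(1, 1, 100));
        return 0;
    }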
Problems
Find the sum of each of the following:
1 + 3 + 5 + . . . + 121 = ?
The first 50 terms of -3 + 3 + 9 + 15 + …
1 + 3/2 + 2 + 5/2 + . . . + 25 = ?
Geometric Series Review
1 + 2 + 4 + 8 + . . . + 2^n
1 + 1/2 + 1/4 + . . . + 2^-n
Theorem: S_n = a + ar + ar^2 + . . . + ar^n
r S_n = ar + ar^2 + . . . + ar^n + ar^(n+1)
S_n - r S_n = a - ar^(n+1)
What about the case where -1 < r < 1?
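Finishing the algebra from the slide (a standard result, stated here for reference): S_n - r S_n = a - ar^(n+1) gives S_n = a(1 - r^(n+1)) / (1 - r) for r ≠ 1. When -1 < r < 1, the term r^(n+1) goes to 0 as n goes to infinity, so the infinite series converges to a / (1 - r). For example, 1 + 1/2 + 1/4 + . . . = 1 / (1 - 1/2) = 2.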
Geometric Problems
What is the sum of each of the following?
3 + 9/4 + 27/16 + . . .
1/2 - 1/4 + 1/8 - 1/16 + . . .
Harmonic Series
H_n = 1 + 1/2 + 1/3 + . . . + 1/n
H_n grows like ln n: the difference H_n - ln n approaches γ ≈ 0.5772, which is Euler's constant.
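A small C sketch, written for this review (not part of the original slides), that tabulates H_n against ln n + γ to make the approximation concrete:

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        const double euler_gamma = 0.5772156649;  /* Euler's constant, approximately */
        double h = 0.0;                           /* running harmonic sum H_n */
        for (int n = 1; n <= 100000; n++) {
            h += 1.0 / n;
            if (n == 10 || n == 1000 || n == 100000)
                printf("n = %6d   H_n = %.6f   ln(n) + gamma = %.6f\n",
                       n, h, log((double)n) + euler_gamma);
        }
        return 0;
    }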
Just an Interesting Question
What is the optimal base to use in the representation of numbers?
Example: with base x, writing a number n takes about log_x n digit slots, and each slot must be able to hold one of x different values.
If the cost of a representation is (values per slot) times (number of slots), we want to minimize x * log_x n.
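One way to finish the thought (this step is added here, not spelled out on the slide): treat x as a continuous variable and minimize f(x) = x * log_x n = x * (ln n / ln x). Then f'(x) = (ln n)(ln x - 1)/(ln x)^2, which is zero when ln x = 1, i.e. x = e ≈ 2.718. So the cost is minimized at base e, and among integer bases, base 3 comes closest (3/ln 3 ≈ 2.73 versus 2/ln 2 ≈ 2.89).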
Logarithm Review
ln n = log_e n is called the natural logarithm.
lg n = log_2 n is called the binary logarithm.
How many bits are required to represent the number n in binary? (Answer: ⌊lg n⌋ + 1 bits, for n ≥ 1.)
Logarithm Rules
The logarithm to the base b of x, denoted log_b x, is defined to be that number y such that b^y = x.
log_b (x1 * x2) = log_b x1 + log_b x2
log_b (x1 / x2) = log_b x1 - log_b x2
log_b x^c = c log_b x
log_b x > 0 if x > 1
log_b x = 0 if x = 1
log_b x < 0 if 0 < x < 1
Additional Rules
For all real a > 0, b > 0, c > 0 and n:
log_b a = log_c a / log_c b
log_b (1/a) = - log_b a
log_b a = 1 / log_a b
log_b a^n = n log_b a
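A quick C sketch (illustrative, not from the slides) that checks the change-of-base rule numerically and counts the bits needed to represent n:

    #include <stdio.h>
    #include <math.h>

    /* log base b of x via the change-of-base rule: log_b x = log_c x / log_c b. */
    double log_base(double b, double x) {
        return log(x) / log(b);
    }

    /* Number of bits needed to write n in binary: floor(lg n) + 1 for n >= 1. */
    int bits_needed(unsigned long n) {
        int bits = 0;
        while (n > 0) { bits++; n >>= 1; }
        return bits;
    }

    int main(void) {
        printf("log_2 1024 = %.1f\n", log_base(2.0, 1024.0));     /* expect 10.0 */
        printf("bits needed for 1000 = %d\n", bits_needed(1000)); /* floor(lg 1000)+1 = 10 */
        return 0;
    }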
Asymptotic Performance
In this course, we care most about the asymptotic performance of an algorithm: how does the algorithm behave as the problem size gets very large?
Running time
Memory requirements
Coming up: the asymptotic performance of two search algorithms, and a formal introduction to asymptotic notation.
Input Size
Time and space complexity are generally functions of the input size (e.g., sorting, multiplication).
How we characterize input size depends on the problem:
Sorting: number of input items
Multiplication: total number of bits
Graph algorithms: number of nodes and edges
Etc.
Running Time
Number of primitive steps that are executed.
Except for the time of executing a function call, most statements require roughly the same amount of time:
y = m * x + b
c = 5 / 9 * (t - 32)
z = f(x) + g(y)
We can be more exact if need be.
Analysis
Worst case: provides an upper bound on running time, an absolute guarantee.
Average case: provides the expected running time. Very useful, but treat with care: what is "average"?
Random (equally likely) inputs?
Real-life inputs?
An Example: Insertion Sort
InsertionSort(A, n) {
   for i = 2 to n {
      key = A[i]
      j = i - 1
      while (j > 0) and (A[j] > key) {
         A[j+1] = A[j]
         j = j - 1
      }
      A[j+1] = key
   }
}
Insertion Sort
What is the precondition for this loop?
InsertionSort(A, n) {
   for i = 2 to n {
      key = A[i]
      j = i - 1
      while (j > 0) and (A[j] > key) {
         A[j+1] = A[j]
         j = j - 1
      }
      A[j+1] = key
   }
}
Insertion Sort
InsertionSort(A, n) {
   for i = 2 to n {
      key = A[i]
      j = i - 1
      while (j > 0) and (A[j] > key) {
         A[j+1] = A[j]
         j = j - 1
      }
      A[j+1] = key
   }
}
How many times will this loop execute?
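For reference, a runnable C version of the pseudocode above (a sketch added for this review; the array indexing is shifted to 0-based, so the loop bounds differ slightly from the slides):

    #include <stdio.h>

    /* Sorts A[0..n-1] in place, mirroring the slide's pseudocode with 0-based indexing. */
    void insertion_sort(int A[], int n) {
        for (int i = 1; i < n; i++) {
            int key = A[i];
            int j = i - 1;
            /* Shift larger elements of the sorted prefix A[0..i-1] one slot right. */
            while (j >= 0 && A[j] > key) {
                A[j + 1] = A[j];
                j = j - 1;
            }
            A[j + 1] = key;
        }
    }

    int main(void) {
        int A[] = {5, 2, 4, 6, 1, 3};
        int n = sizeof A / sizeof A[0];
        insertion_sort(A, n);
        for (int i = 0; i < n; i++) printf("%d ", A[i]);
        printf("\n");
        return 0;
    }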
Insertion Sort
Statement                                        Effort
InsertionSort(A, n) {
   for i = 2 to n {                              c1 n
      key = A[i]                                 c2 (n-1)
      j = i - 1                                  c3 (n-1)
      while (j > 0) and (A[j] > key) {           c4 T
         A[j+1] = A[j]                           c5 (T - (n-1))
         j = j - 1                               c6 (T - (n-1))
      }                                          0
      A[j+1] = key                               c7 (n-1)
   }                                             0
}
T = t_2 + t_3 + … + t_n, where t_i is the number of while-condition evaluations for the i-th iteration of the for loop.
Analyzing Insertion Sort
T(n) = c1 n + c2 (n-1) + c3 (n-1) + c4 T + c5 (T - (n-1)) + c6 (T - (n-1)) + c7 (n-1)
     = c8 T + c9 n + c10
What can T be?
Best case: the inner loop body is never executed, so t_i = 1 for every i and T(n) is a linear function of n.
Worst case: the inner loop body is executed for all previous elements, so t_i = i and T(n) is a quadratic function of n:
T = 2 + 3 + 4 + . . . + n = n(n+1)/2 - 1
Analysis Simplifications
Ignore actual and abstracted statement costs.
Order of growth is the interesting measure: the highest-order term is what counts.
Remember, we are doing asymptotic analysis: as the input size grows larger, it is the high-order term that dominates.
Upper Bound Notation
We say InsertionSort's run time is O(n^2). Properly we should say the run time is in O(n^2).
Read O as "Big-O" (you'll also hear it called "order").
In general, a function f(n) is O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c g(n) for all n ≥ n0.
Formally, O(g(n)) = { f(n) : ∃ positive constants c and n0 such that f(n) ≤ c g(n) ∀ n ≥ n0 }
Big O Example
Show using the definition that 5n + 4 ∈ O(n), where g(n) = n.
First we must find a c and an n0; we then need to show that f(n) ≤ c g(n) for every n ≥ n0.
Clearly 5n + 4 ≤ 6n whenever n ≥ 4. Hence c = 6 and n0 = 4 satisfy the requirements.
Insertion Sort Is O(n^2)
Proof: Suppose the runtime is an^2 + bn + c. If any of a, b, and c are less than 0, replace the constant with its absolute value.
an^2 + bn + c ≤ (a + b + c)n^2 + (a + b + c)n + (a + b + c) ≤ 3(a + b + c)n^2 for n ≥ 1.
Let c' = 3(a + b + c) and let n0 = 1.
Question: Is InsertionSort O(n^3)? Is InsertionSort O(n)?
Big O Fact
A polynomial of degree k is O(n^k).
Proof: Suppose f(n) = b_k n^k + b_(k-1) n^(k-1) + … + b_1 n + b_0. Let a_i = |b_i|. Then
f(n) ≤ a_k n^k + a_(k-1) n^(k-1) + … + a_1 n + a_0 ≤ (a_k + a_(k-1) + … + a_1 + a_0) n^k for n ≥ 1, which is c n^k with c equal to the sum of the a_i.
Lower Bound Notation
We say InsertionSort's run time is Ω(n).
In general, a function f(n) is Ω(g(n)) if ∃ positive constants c and n0 such that 0 ≤ c g(n) ≤ f(n) ∀ n ≥ n0.
Proof: Suppose the run time is an + b. Assume a and b are positive (what if b is negative?). Then an ≤ an + b for all n ≥ 1, so c = a and n0 = 1 satisfy the definition.
Asymptotic Tight Bound
A function f(n) is Θ(g(n)) if ∃ positive constants c1, c2, and n0 such that c1 g(n) ≤ f(n) ≤ c2 g(n) ∀ n ≥ n0.
Theorem: f(n) is Θ(g(n)) iff f(n) is both O(g(n)) and Ω(g(n)).
Proof: someday
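A sketch of the deferred proof (added here; it follows directly from the definitions): if f ∈ Θ(g) with constants c1, c2, n0, then the right half of c1 g(n) ≤ f(n) ≤ c2 g(n) gives f ∈ O(g) and the left half gives f ∈ Ω(g). Conversely, if f ∈ O(g) with constants c2, n0' and f ∈ Ω(g) with constants c1, n0'', then both inequalities hold for every n ≥ max(n0', n0''), so f ∈ Θ(g).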
Notation
Θ(g) is the set of all functions f such that there exist positive constants c1, c2, and n0 such that 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for every n ≥ n0.
(Picture on the slide: f(n) is sandwiched between c1 g(n) below and c2 g(n) above for large n.)
Growth Rate Theorems
1. The power n^α is in O(n^β) iff α ≤ β (with α, β > 0), and n^α is in o(n^β) iff α < β.
2. log_b n ∈ o(n^α) for any b and any α > 0.
3. n^α ∈ o(c^n) for any α > 0 and c > 1.
4. log_a n ∈ O(log_b n) for any a and b.
5. c^n ∈ O(d^n) iff c ≤ d, and c^n ∈ o(d^n) iff c < d.
6. Any constant function f(n) = c is in O(1).
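A couple of quick applications of these theorems (examples added here, not from the slide): by theorem 2, log_2 n ∈ o(n^0.5), so logarithms grow more slowly than any positive power; by theorem 3, n^100 ∈ o(2^n), so every polynomial is eventually dominated by every exponential with base greater than 1; by theorem 5, 2^n ∈ o(3^n), but 2^n ∉ O(1.5^n).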
Big O Relationships
1. o(f) ⊆ O(f)
2. If f ∈ o(g) then O(f) ⊆ o(g)
4. If f ∈ O(g) then f(n) + g(n) ∈ O(g)
5. If f ∈ O(f') and g ∈ O(g') then f(n) * g(n) ∈ O(f'(n) * g'(n))
Theorem: log(n!) ∈ Θ(n log n)
Case 1: n log n ∈ O(log(n!)).
log(n!) = log(n * (n-1) * (n-2) * · · · * 3 * 2 * 1)
        = log(n * (n-1) * · · · * (n/2) * (n/2 - 1) * · · · * 2 * 1)
        ≥ log((n/2) * (n/2) * · · · * (n/2) * 1 * 1 * · · · * 1)    [replace each of the largest n/2 factors by n/2 and the rest by 1]
        = log((n/2)^(n/2)) = (n/2) log(n/2), which is Ω(n log n).
Case 2: log(n!) ∈ O(n log n).
log(n!) = log n + log(n-1) + log(n-2) + . . . + log 2 + log 1
        < log n + log n + log n + . . . + log n = n log n.
The Little o
Theorem: If log(f) ∈ o(log(g)) and g(n) → ∞ as n → ∞, then f ∈ o(g).
Note the above theorem does not apply to big O: log(n^2) ∈ O(log n), but n^2 ∉ O(n).
Application: Show that 2^n ∈ o(n^n).
Taking logs of the two functions, log(2^n) = n log_2 2 = n and log(n^n) = n log_2 n. Since n ∈ o(n log_2 n), the theorem implies 2^n ∈ o(n^n).
Theorem: L'Hôpital's Rule
If f and g are differentiable, and f(n) → ∞ and g(n) → ∞ (or both → 0) as n → ∞, then lim f(n)/g(n) = lim f'(n)/g'(n), provided the latter limit exists.
This is the standard tool for evaluating the limit of f(n)/g(n) when comparing growth rates.
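A worked example of using the rule to compare growth rates (added for illustration): does log n grow more slowly than n^α for α > 0? Consider lim (ln n) / n^α. Both numerator and denominator go to ∞, so by L'Hôpital's rule the limit equals lim (1/n) / (α n^(α-1)) = lim 1/(α n^α) = 0. Hence ln n ∈ o(n^α), matching growth rate theorem 2.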
Practical Complexity
(These slides contained charts comparing the growth of common running-time functions at increasing input sizes; the plots are not reproduced here.)
Other Asymptotic Notations
A function f(n) is o(g(n)) if for every positive constant c there exists an n0 such that f(n) < c g(n) ∀ n ≥ n0.
A function f(n) is ω(g(n)) if for every positive constant c there exists an n0 such that c g(n) < f(n) ∀ n ≥ n0.
Intuitively:
o() is like <
O() is like ≤
Ω() is like ≥
ω() is like >
Θ() is like =
Comparing Functions
Definition: The function f is said to dominate g if f(n)/g(n) increases without bound as n increases without bound, i.e., for any c > 0 there exists an n0 > 0 such that f(n) > c g(n) for every n > n0.
Little o Complexity
o(g) is the set of all functions that are dominated by g, i.e., the set of all f such that for every c > 0 there exists an n_c > 0 such that f(n) < c g(n) for every n > n_c.
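For example (added for illustration): n ∈ o(n^2), since n^2/n = n increases without bound, so g(n) = n^2 dominates f(n) = n. On the other hand, 2n ∉ o(n): for any c ≤ 2, the inequality 2n < c n fails no matter how large n is.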
Up Next Solving recurrences Substitution method Master theorem