Data Structure Algorithm Analysis TA: Abbas Sarraf
Objectives How to predict an algorithm’s performance How well an algorithm scales up How to compare different algorithms for a problem
Asymptotic Notations Big-O, “bounded above by”: T(n) = O(f(n)) – For some c > 0 and N, T(n) ≤ c·f(n) whenever n > N. Big-Omega, “bounded below by”: T(n) = Ω(f(n)) – For some c > 0 and N, T(n) ≥ c·f(n) whenever n > N. – Same as f(n) = O(T(n)). Big-Theta, “bounded above and below”: T(n) = Θ(f(n)) – T(n) = O(f(n)) and also T(n) = Ω(f(n))
By Pictures [Figure: one plot per notation; the horizontal axis marks 0 and N] Big-Oh (most commonly used) – bounded above Big-Omega – bounded below Big-Theta – bounded above and below
Examples What is the big-O of 2n^2 + 1000n + 5? T(n) = 2n^2 + 1000n + 5; f(n) = n^2. Is T(n) ≤ c·f(n), i.e. 2n^2 + 1000n + 5 ≤ c·n^2? For c = 3, n0 = 1001, check: 2(1001^2) + 1000(1001) + 5 = 3,005,007; 3(1001^2) = 3,006,003. Answer: O(n^2). An algorithm that actually takes n^5 + n^3 + 7 steps has complexity O(n^5) (proof in the book, p. 54). This holds only for addition, where lower-order terms are dropped from a sum. If the algorithm takes n^5 · n^3 steps, its complexity is O(n^8).
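The O(n^5) claim can be checked directly against the definition of big-O. The derivation below is a sketch with constants chosen here for convenience (c = 9, N = 1); the book's proof may use different ones.

```latex
\begin{align*}
n^5 + n^3 + 7 &\le n^5 + n^5 + 7n^5
  && \text{since } n^3 \le n^5 \text{ and } 7 \le 7n^5 \text{ for } n \ge 1 \\
&= 9\,n^5 .
\end{align*}
```

So T(n) ≤ 9·n^5 whenever n > 1, which is exactly the condition for T(n) = O(n^5).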
Cost
sum = 0;                               ---> 1
sum = sum + next;                      ---> 1
Total Cost: 2

Cost
for( int i = 1; i <= n; i++ )          ---> 1 + (n+1) + n = 2n+2
    sum++;                             ---> n
Total Cost: 3n + 2

Cost
k = 0;                                 ---> 1
for( int i = 0; i < n; i++ )           ---> 2n+2
    for( int j = 0; j < n; j++ )       ---> n(2n+2) = 2n^2 + 2n
        k++;                           ---> n^2
Total Cost: 3n^2 + 4n + 3
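The last total can be checked by instrumenting the nested loop. The sketch below assumes the same cost model as the slide (each assignment, loop test, and increment costs 1); the function name nestedLoopCost is made up for this illustration.

```cpp
// Tallies unit-cost steps for the doubly nested loop.
// Assumed cost model: every assignment, loop test, and increment costs 1.
long long nestedLoopCost(long long n) {
    long long cost = 0;
    long long k = 0;  cost += 1;              // k = 0
    long long i = 0;  cost += 1;              // i = 0
    while (true) {
        cost += 1;                            // test i < n
        if (!(i < n)) break;
        long long j = 0;  cost += 1;          // j = 0
        while (true) {
            cost += 1;                        // test j < n
            if (!(j < n)) break;
            ++k;  cost += 1;                  // k++
            ++j;  cost += 1;                  // j++
        }
        ++i;  cost += 1;                      // i++
    }
    (void)k;  // k is only there to mirror the loop body being counted
    return cost;
}
```

For any n ≥ 0 the returned value equals 3n^2 + 4n + 3, so the dominant term, and hence the O(n^2) class, comes from the innermost statement.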
Both algorithms are O(n).
Example
Time complexity: O(n^3)
int maxSum = 0;
for( int i = 0; i < a.size( ); i++ )
    for( int j = i; j < a.size( ); j++ ) {
        int thisSum = 0;
        for( int k = i; k <= j; k++ )
            thisSum += a[ k ];
        if( thisSum > maxSum )
            maxSum = thisSum;
    }
return maxSum;
[Figure: growth-rate curves labeled n, n log n, n^2, n^3, and 2^n]
Class O(1) Function/order – Constant time Examples – Find the ith element in an array. – A[i] Remarks – The running time of the algorithm doesn't depend on the value of n.
Class O(log_a n) Function/order – Logarithmic time Examples – binary search Remarks – Typically achieved by dividing the problem into smaller segments and only looking at one input element in each segment. – Binary Search: each step (a recursive call or a loop iteration) cuts the remaining range in half, so a search over n elements takes at most about log_2 n steps.
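The halving argument can be seen in a small iterative binary search that also counts how many elements it inspects. This is a sketch, assuming a sorted std::vector<int>; the name binarySearch and the probes out-parameter are illustrative additions, not part of the slides.

```cpp
#include <vector>

// Returns the index of target in the sorted vector a, or -1 if absent.
// probes receives the number of elements inspected.
int binarySearch(const std::vector<int>& a, int target, int& probes) {
    probes = 0;
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi) {
        ++probes;
        int mid = lo + (hi - lo) / 2;   // avoids overflow of lo + hi
        if (a[mid] == target) return mid;
        if (a[mid] < target) lo = mid + 1;   // discard the left half
        else                 hi = mid - 1;   // discard the right half
    }
    return -1;
}
```

With n = 1024 elements the loop never inspects more than 11 of them, i.e. floor(log_2 n) + 1.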
Class O(n) Function/order – Linear time Examples – Find the minimum element, printing, listing Remarks – Typically achieved by examining each element in the input once. – The running time of the algorithm is proportional to the value of n.
Class O(n * log(n)) Examples – heapsort, mergesort Remarks – Typically achieved by dividing the problem into subproblems, solving the subproblems independently, and then combining the results. Unlike the O(log n) algorithms, each element in the subproblems must be examined.
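Mergesort follows the divide, solve, combine pattern described above. The following is a minimal top-down sketch on std::vector<int>; it is one common way to write it, not necessarily the version the course uses, and the half-open range convention [lo, hi) is a choice made here.

```cpp
#include <cstddef>
#include <vector>

// Sorts the half-open range a[lo, hi) using tmp as scratch space.
void mergeSort(std::vector<int>& a, std::size_t lo, std::size_t hi,
               std::vector<int>& tmp) {
    if (hi - lo < 2) return;                 // 0 or 1 element: already sorted
    std::size_t mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid, tmp);              // solve the left subproblem
    mergeSort(a, mid, hi, tmp);              // solve the right subproblem
    // Combine: every element of both halves is examined once.
    std::size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    for (k = lo; k < hi; ++k) a[k] = tmp[k];
}

// Convenience wrapper that sorts the whole vector.
void mergeSort(std::vector<int>& a) {
    std::vector<int> tmp(a.size());
    mergeSort(a, 0, a.size(), tmp);
}
```

There are about log_2 n levels of recursion and the merges on each level touch all n elements, which gives the n * log(n) total.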
Class O(n^2) Function/order – Quadratic time Examples – bubblesort, insertion sort Remarks – Typically achieved by examining all pairs of data elements – Comparisons: 1 + 2 + ... + n = n(n + 1)/2 = n^2/2 + n/2 = O(n^2)
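The arithmetic-series count shows up directly when comparisons are tallied in bubblesort. The sketch below is a plain bubblesort without early exit; its passes make n-1, n-2, ..., 1 comparisons, so the total is n(n-1)/2, the same kind of sum as above with n-1 in place of n. The function name is illustrative.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts a in ascending order and returns the number of element comparisons.
long long bubbleSortCountComparisons(std::vector<int>& a) {
    long long comparisons = 0;
    const std::size_t n = a.size();
    for (std::size_t pass = 0; pass + 1 < n; ++pass) {
        // After each pass the largest remaining element is in its final
        // position, so the scanned range shrinks by one.
        for (std::size_t i = 0; i + 1 < n - pass; ++i) {
            ++comparisons;
            if (a[i] > a[i + 1]) std::swap(a[i], a[i + 1]);
        }
    }
    return comparisons;
}
```

For 6 elements this performs 5 + 4 + 3 + 2 + 1 = 15 comparisons regardless of the input order.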
Example -> what if “if”?!
sum = 0;
for (i = 0; i < n; i++) {
    if (is_even(i)) {
        for (j = 0; j < n; j++)
            sum++;
    } else
        sum = sum + n;
}
O(n^2): the outer loop runs n times. The “true” clause runs for about half the values of i and costs O(n) each time (the innermost loop); the “false” clause runs for the other half and costs O(1). Total: (n/2)·n + (n/2)·1 = O(n^2). Bounding every iteration by the more expensive branch gives the same answer: n·O(n) = O(n^2).
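The count for this loop can be confirmed by tallying how often sum is updated. This sketch assumes is_even(i) simply tests parity and replaces it with i % 2 == 0; the function name countSumUpdates is made up for the illustration.

```cpp
// Counts the updates to sum performed by the if/else loop.
// Assumption: is_even(i) is a parity test, written here as i % 2 == 0.
long long countSumUpdates(long long n) {
    long long updates = 0;
    long long sum = 0;
    for (long long i = 0; i < n; ++i) {
        if (i % 2 == 0) {
            for (long long j = 0; j < n; ++j) {   // O(n) branch
                ++sum;
                ++updates;
            }
        } else {                                   // O(1) branch
            sum += n;
            ++updates;
        }
    }
    (void)sum;  // only the number of updates is of interest here
    return updates;
}
```

For n = 10 the even values of i contribute 5·10 updates and the odd values contribute 5, for 55 in total, which grows like n^2/2.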
Average, Best, and Worst-Case Insertion Sort:
We let t_j be the number of times the while loop test in line 5 is executed for that value of j.
Insertion Sort Best Case (t_j = 1): T(n) = c_1 n + c_2 (n - 1) + c_4 (n - 1) + c_5 (n - 1) + c_8 (n - 1) = (c_1 + c_2 + c_4 + c_5 + c_8) n - (c_2 + c_4 + c_5 + c_8) = O(n)
Insertion Sort Worst Case? – t_j = j Average? (Probabilistic) – E{t_j} = j / 2 (how?)
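The best and worst cases can be observed by counting the while-loop tests directly. The sketch below is a 0-indexed C++ insertion sort, so its loop index is one less than the 1-indexed j of the pseudocode; the function name and the counting are additions for illustration.

```cpp
#include <cstddef>
#include <vector>

// Sorts a in ascending order and returns the total number of times the
// inner loop condition is evaluated (the sum of all t_j).
long long insertionSortCountTests(std::vector<int>& a) {
    long long tests = 0;
    for (std::size_t j = 1; j < a.size(); ++j) {
        int key = a[j];
        std::size_t i = j;                 // candidate slot for key
        while (true) {
            ++tests;                       // one evaluation of the loop test
            if (!(i > 0 && a[i - 1] > key)) break;
            a[i] = a[i - 1];               // shift the larger element right
            --i;
        }
        a[i] = key;
    }
    return tests;
}
```

On an already sorted input of 5 elements every test fails immediately, giving 4 tests in total (linear). On a reversed input of 5 elements the counts are 2 + 3 + 4 + 5 = 14, which is n(n+1)/2 - 1 and therefore quadratic.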
Average, Best, and Worst-Case On which input instances should the algorithm’s performance be judged? Average case: – Real world distributions difficult to predict Best case: – Seems unrealistic Worst case: – Gives an absolute guarantee – We will use the worst-case measure.
Example: Given A_1, …, A_n, find the maximum value of A_i + A_{i+1} + ··· + A_j (the answer is 0 if the max value is negative)
Time complexity: O(n^3)
int maxSum = 0;
for( int i = 0; i < a.size( ); i++ )
    for( int j = i; j < a.size( ); j++ ) {
        int thisSum = 0;
        for( int k = i; k <= j; k++ )
            thisSum += a[ k ];
        if( thisSum > maxSum )
            maxSum = thisSum;
    }
return maxSum;
Algorithm 2 Idea: Given the sum from i to j-1, we can compute the sum from i to j in constant time. This eliminates one nested loop and reduces the running time to O(n^2).
int maxSum = 0;
for( int i = 0; i < a.size( ); i++ ) {
    int thisSum = 0;
    for( int j = i; j < a.size( ); j++ ) {
        thisSum += a[ j ];
        if( thisSum > maxSum )
            maxSum = thisSum;
    }
}
return maxSum;
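To see that dropping the innermost loop does not change the answer, the cubic and quadratic versions can be wrapped as functions and compared on the same input. This is a sketch; the function names are invented here, and the sample input in the note below is the sequence shown on the Algorithm 3 slide.

```cpp
#include <cstddef>
#include <vector>

// O(n^3): recomputes every subsequence sum from scratch.
int maxSubSumCubic(const std::vector<int>& a) {
    int maxSum = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = i; j < a.size(); ++j) {
            int thisSum = 0;
            for (std::size_t k = i; k <= j; ++k)
                thisSum += a[k];
            if (thisSum > maxSum) maxSum = thisSum;
        }
    return maxSum;
}

// O(n^2): extends the running sum from i..j-1 to i..j in constant time.
int maxSubSumQuadratic(const std::vector<int>& a) {
    int maxSum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        int thisSum = 0;
        for (std::size_t j = i; j < a.size(); ++j) {
            thisSum += a[j];
            if (thisSum > maxSum) maxSum = thisSum;
        }
    }
    return maxSum;
}
```

Both functions return 7 for the input 2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2 (the subsequence 4, 1, -3, 4, -1, 2), and 0 when every element is negative.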
Algorithm 3 Time complexity clearly O(n). But why does it work? I.e., proof of correctness. Example input: 2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2
int maxSum = 0, thisSum = 0;
for( int j = 0; j < a.size( ); j++ ) {
    thisSum += a[ j ];
    if ( thisSum > maxSum )
        maxSum = thisSum;
    else if ( thisSum < 0 )
        thisSum = 0;
}
return maxSum;
Proof of Correctness The max subsequence cannot start or end at a negative A_i. More generally, the max subsequence cannot have a prefix with a negative sum. Thus, if we ever find that A_i through A_j sums to < 0, then we can advance i to j+1. – Proof. Suppose j is the first index after i at which the sum becomes < 0. – The max subsequence cannot start at any p between i and j: the sum of A_i through A_{p-1} is non-negative, so starting at i would have been at least as good.
Algorithm 3
int maxSum = 0, thisSum = 0;
for( int j = 0; j < a.size( ); j++ ) {
    thisSum += a[ j ];
    if ( thisSum > maxSum )
        maxSum = thisSum;
    else if ( thisSum < 0 )
        thisSum = 0;
}
return maxSum;
The algorithm resets whenever the prefix sum is < 0. Otherwise, it forms new sums and updates maxSum, all in one pass.
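The one-pass algorithm can be packaged as a function and run on the sample sequence from the Algorithm 3 slide. This is a sketch; the name maxSubSumLinear is chosen here, and size_t indexing replaces int only to avoid a signed/unsigned comparison.

```cpp
#include <cstddef>
#include <vector>

// O(n): one pass, resetting the running sum whenever it drops below 0.
// Returns 0 for an empty vector or when every element is negative.
int maxSubSumLinear(const std::vector<int>& a) {
    int maxSum = 0;
    int thisSum = 0;
    for (std::size_t j = 0; j < a.size(); ++j) {
        thisSum += a[j];
        if (thisSum > maxSum)
            maxSum = thisSum;      // new best sum ending at j
        else if (thisSum < 0)
            thisSum = 0;           // negative prefix: restart at j + 1
    }
    return maxSum;
}
```

On 2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2 the running sum goes 2, 5, 3, 4, -1 (reset), 4, 5, 2, 6, 5, 7, so the function returns 7.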