CSE 326: Data Structures Asymptotic Analysis Ben Lerner Summer 2007 Lecture 2
Office Hours, etc. Ben Lerner Tue 10:45-11:45 CSE 212 (except today) Ioannis Giotis Mon 10:45-11:45 CSE 220 Or by appointment. (Comments? Conflicts?) TODO: Subscribe to the mailing list if you haven’t!
Comparing Two Algorithms GOAL: Sort a list of names as fast as possible “Who cares how I do it, I’ll add more memory!” “I’ll use C++ instead of Java – wicked fast!” “I’ll buy a faster CPU” “Ooh look, the –O4 flag!” “Can’t I just get the data pre-sorted??” Why bother?
Comparing Two Algorithms What we want: a rough estimate that ignores details – really, independent of details. Coding tricks, CPU speed, compiler optimizations, … These would help any algorithm equally. Raw measured running time alone is not a good enough measure – it depends on all of these details.
Big-O Analysis Ignores “details” What details? CPU speed Programming language used Amount of memory Compiler Order of input Size of input … sorta.
Analysis of Algorithms We talked last time about efficiency; let’s refine this further. Efficiency measures: how long the program runs (time complexity) and how much memory it uses (space complexity). Why analyze at all? Decide which algorithm to implement before actually coding it. Given code, get a sense for where the bottlenecks are (or will be) without actually measuring. Insight: suggests alternative, better algorithms. Confidence: the algorithm will work well in practice – and gives your boss a reason to pay you right away!
Asymptotic Analysis One “detail” we won’t ignore – problem size, the number of input elements. Complexity as a function of input size n: T(n) = 4n + 5 T(n) = 0.5 n log n - 2n + 7 T(n) = 2^n + n^3 + 3n What happens as n grows? Asymptotic = performance as n → ∞
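A minimal sketch (not from the original slides; the range of n and the output format are just for illustration) that plugs growing values of n into the three example functions, showing how the 2^n term quickly dominates:

#include <cmath>
#include <cstdio>

int main() {
    for (int n = 4; n <= 64; n *= 2) {
        double t1 = 4.0 * n + 5;                                        // T(n) = 4n + 5
        double t2 = 0.5 * n * std::log2((double)n) - 2.0 * n + 7;       // T(n) = 0.5 n log n - 2n + 7
        double t3 = std::pow(2.0, n) + std::pow((double)n, 3) + 3.0 * n; // T(n) = 2^n + n^3 + 3n
        std::printf("n=%2d  4n+5=%8.0f  0.5n*log(n)-2n+7=%8.1f  2^n+n^3+3n=%22.0f\n",
                    n, t1, t2, t3);
    }
    return 0;
}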
Why Asymptotic Analysis? Most algorithms are fast for small n Time difference too small to be noticeable External things dominate (OS, disk I/O, …) BUT n is often large in practice Databases, internet, graphics, … Time difference really shows up as n grows!
Exercise - Searching 2 3 5 16 37 50 73 75 126 bool ArrayFind( int array[], int n, int key ) { // Insert your algorithm here } What algorithm would you choose to implement this code snippet? Ultimately we want to analyze algorithms, so let’s generate an algorithm to try out. The point of these “Hannah takes a break” slides is to encourage students to come up with the answers themselves. What I say should be minimal, and there really shouldn’t be any point in the lecture when I present the “right answer”, because that encourages students not to say anything if they come to expect it. Hopefully, students will pick linear and binary search.
Analyzing Code Basic Java operations – constant time. Consecutive statements – sum of times. Conditionals – test plus the larger branch. Loops – sum over all iterations. Function calls – cost of the function body. Recursive functions – solve a recurrence relation. Best case / worst case (what are these?) Okay, now that we have some algorithms to serve as examples, let’s analyze them! Here are some hints before we begin – see the sketch below. Let’s try it!
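A minimal sketch (not from the slides; the function and the exact counting convention are hypothetical) showing how the rules above turn code into a running-time expression:

int sumFirstK( int array[], int n, int k ) {
    int total = 0;                          // basic operation: constant time
    for( int i = 0; i < k && i < n; i++ ) { // loop: sum the body cost over all k iterations
        if( array[i] > 0 )                  // conditional: test plus the larger branch
            total += array[i];              // basic operation: constant time
    }
    return total;                           // basic operation: constant time
}
// Consecutive statements: add the pieces up.
// Roughly T(k) = c1 + c2*k for small constants c1, c2 – linear in k.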
Linear Search Analysis bool LinearArrayFind( int array[], int n, int key ) { for( int i = 0; i < n; i++ ) { if( array[i] == key ) // Found it! return true; } return false; } T(n) = ? (here we are looking for exact runtimes) Best case: the key is at array[0] – a small constant number of operations, T(n) = 4 under the counting convention used in these slides. Worst case: the key is absent (or last) – T(n) = 3n + 2, i.e. linear. The exact constants depend on which operations you count; note also that the average case depends on whether the key is found at all.
Binary Search Analysis bool BinArrayFind( int array[], int low, int high, int key ) { // The subarray is empty if( low > high ) return false; // Search this subarray recursively int mid = (high + low) / 2; if( key == array[mid] ) { return true; } else if( key < array[mid] ) { return BinArrayFind( array, low, mid-1, key ); } else { return BinArrayFind( array, mid+1, high, key ); } } Best case: 4 – the key is at array[mid] on the first call. Worst case: log n? Uh-oh – we don’t know how to calculate the runtime exactly yet (although CSE 142/143 should have taught us that it’s O(log n)). Let’s go to the next slide; we’ll come back and fill in these numbers later. It turns out the recursion gives T(n) = 4 log_2 n + 4 in the worst case (and most of the time).
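A minimal test driver (not part of the original slides; it assumes the LinearArrayFind and BinArrayFind functions above are declared in scope):

#include <cstdio>

int main() {
    int a[] = { 2, 3, 5, 16, 37, 50, 73, 75, 126 };  // the sorted array from the exercise slide
    int n = 9;
    std::printf("linear find 37: %d\n", LinearArrayFind(a, n, 37));      // expect 1 (true)
    std::printf("binary find 37: %d\n", BinArrayFind(a, 0, n - 1, 37));  // expect 1 (true)
    std::printf("linear find 40: %d\n", LinearArrayFind(a, n, 40));      // expect 0 (false)
    std::printf("binary find 40: %d\n", BinArrayFind(a, 0, n - 1, 40));  // expect 0 (false)
    return 0;
}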
Solving Recurrence Relations For a problem of size n, the time is the time to get ready for the recursive call (4) plus the time for that call: T(n) = 4 + T(n/2); T(1) = 4 (we ignore floors for simplicity). 1. Determine the recurrence relation. What is the base case (or cases)? 2. “Expand” the original relation to find an equivalent general expression in terms of the number of expansions. By repeated substitution, until we see the pattern: T(n) = 4 + T(n/2) = 4 + (4 + T(n/4)) = 4 + (4 + (4 + T(n/8))) = 4*3 + T(n/2^3) = … = 4k + T(n/2^k) after k expansions. 3. Find a closed-form expression by setting the number of expansions to a value which reduces the problem to a base case. Since the base case is n = 1, we need k such that n/2^k = 1; multiplying by 2^k gives n = 2^k, so k = log_2 n (all logs are base 2). Setting k = log_2 n: T(n) = 4 log_2 n + T(1) = 4 log_2 n + 4.
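A quick sanity check (not from the slides): compute T(n) directly from the recurrence T(n) = 4 + T(n/2), T(1) = 4, and compare it with the closed form 4 log_2 n + 4 for powers of two, where the closed form is exact:

#include <cmath>
#include <cstdio>

int T(int n) {
    if (n == 1) return 4;   // base case
    return 4 + T(n / 2);    // one expansion of the recurrence
}

int main() {
    for (int n = 1; n <= 64; n *= 2) {
        std::printf("n=%2d  recurrence=%d  closed form=%g\n",
                    n, T(n), 4 * std::log2((double)n) + 4);
    }
    return 0;
}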
Linear Search vs Binary Search
             Linear search      Binary search
Best case:   4 (key at [0])     4 (key at [mid])
Worst case:  3n + 2             4 log n + 4
So … which algorithm is better? What tradeoffs can you make? Some students will probably say “binary search, because it’s O(log n), whereas linear search is O(n)”. But the point of this discussion is that big-O notation obscures some important factors (like constants!) and we really don’t know the input size. To make a meaningful comparison, we need more information: (1) what our priorities are (runtime? memory footprint?), (2) what the input size is (or, even better, what the input is!), (3) our platform/machine – we saw on the earlier chart that architecture makes a difference, (4) other factors. Big-O notation gives us a way to compare algorithms without this information, but the cost is a loss of precision. Which is better depends on constants, input size, good/bad input for the algorithm, the machine, …
Fast Computer vs. Slow Computer (Chart: the y-axis is time – lower is better; the Pentium IV was the newer/faster machine.) With the same algorithm, the faster machine wins!
Fast Computer vs. Smart Programmer (round 1) With different algorithms, constants matter! Does linear search beat out binary search?
Fast Computer vs. Smart Programmer (round 2) Binary search wins out – eventually, the better algorithm wins!
Asymptotic Analysis Asymptotic analysis looks at the order of the running time of the algorithm – a valuable tool when the input gets “large”. It ignores the effects of different machines or different implementations of the same algorithm. Intuitively, to find the asymptotic runtime, throw away the constants and low-order terms: Linear search is T(n) = 3n + 2, which is O(n). Binary search is T(n) = 4 log_2 n + 4, which is O(log n). The point of all those pretty pictures is that, while constants matter, they don’t matter as much as the order for “sufficiently large” input (where “large” depends, of course, on the constants). Log bases don’t matter either – more on that in a second. Remember: the fastest algorithm has the slowest-growing function for its runtime.
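A small worked check (not on the original slide; the constants c and n_0 below are just one valid choice) that the two claims above fit the upcoming definition of O:

\[
3n + 2 \le 5n \ \text{ for all } n \ge 1, \quad\text{so } 3n+2 \in O(n) \text{ with } c = 5,\ n_0 = 1.
\]
\[
4\log_2 n + 4 \le 8\log_2 n \ \text{ for all } n \ge 2, \quad\text{so } 4\log_2 n + 4 \in O(\log n) \text{ with } c = 8,\ n_0 = 2.
\]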
Asymptotic Analysis Eliminate low-order terms: 0.5 n log n + 2n + 7 → 0.5 n log n; n^3 + 2^n + 3n → 2^n. Eliminate coefficients: 4n → n; 0.5 n log n → n log n; n log(n^2) = 2 n log n → n log n. Do you buy that coefficients and low-order terms don’t matter? When might they matter? (Linked-list memory usage.) We didn’t get very precise in our analysis of the UWID info finder; why? We didn’t know the machine we’d use. Any base-x log is equivalent to a base-2 log within a constant factor: log_x B = log_2 B / log_2 x (more generally, log_A B = log_x B / log_x A).
Properties of Logs We will assume logs are base 2 unless explicitly stated otherwise. log(AB) = log A + log B (proof sketch below). Similarly, log(A/B) = log A – log B. log(A^B) = B log A. Any base-k log is equivalent to a base-2 log, up to a constant factor.
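A short proof sketch filling in the slide’s empty “Proof:” line (all logs base 2):

\[
\text{Let } x = \log_2 A \text{ and } y = \log_2 B, \text{ so } A = 2^x \text{ and } B = 2^y.
\]
\[
AB = 2^x \cdot 2^y = 2^{x+y} \quad\Rightarrow\quad \log_2(AB) = x + y = \log_2 A + \log_2 B.
\]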
Order Notation: Intuition Why do we eliminate these constants, etc.? This is not the whole picture! f(n) = n^3 + 2n^2 g(n) = 100n^2 + 1000 Although not yet apparent, as n gets “sufficiently large”, f(n) will be “greater than or equal to” g(n).
Definition of Order Notation Upper bound: T(n) = O(f(n)) (“Big-O”) – there exist positive constants c and n’ such that T(n) ≤ c f(n) for all n ≥ n’. Lower bound: T(n) = Ω(g(n)) (“Omega”) – there exist positive constants c and n’ such that T(n) ≥ c g(n) for all n ≥ n’. Tight bound: T(n) = Θ(f(n)) (“Theta”) – when both hold: T(n) = O(f(n)) and T(n) = Ω(f(n)). We’ll use some specific terminology to describe asymptotic behavior. There are some analogies here that you might find useful.
Definition of Order Notation Back to our two functions f and g from before. O( f(n) ) is a set (or class) of functions: g(n) ∈ O( f(n) ) iff there exist positive constants c and n0 such that g(n) ≤ c f(n) for all n ≥ n0. Example: 100n^2 + 1000 ≤ 5 (n^3 + 2n^2) for all n ≥ 19, so g(n) ∈ O( f(n) ).
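A quick numeric check (not from the slides) of the inequality on this slide – it holds from n = 19 onward and fails at n = 18:

#include <cstdio>

int main() {
    for (long long n = 15; n <= 30; n++) {
        long long g  = 100 * n * n + 1000;            // g(n) = 100n^2 + 1000
        long long cf = 5 * (n * n * n + 2 * n * n);   // c * f(n) with c = 5
        std::printf("n=%lld  g(n)=%lld  5*f(n)=%lld  %s\n",
                    n, g, cf, g <= cf ? "g <= 5f" : "g >  5f");
    }
    return 0;
}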
Notation Notes Sometimes you’ll see g(n) = O( f(n) ). This is equivalent to g(n) ∈ O( f(n) ). But note that the reverse, O( f(n) ) = g(n), is MEANINGLESS.
Order Notation: Example 100n^2 + 1000 ≤ 5 (n^3 + 2n^2) for all n ≥ 19, so g(n) ∈ O( f(n) ). Wait – the picture seems to show the crossover point at around n = 100, not 19? What happened? The point is that big-O captures the relationship between f(n) and g(n) (the fact that f(n) is eventually “greater than or equal to” g(n)); it doesn’t tell you exactly where the crossover point is. Remember, in big-O notation the constants on the two functions don’t really matter. If we pick c = 1, the crossover happens around n = 100.
Big-O: Common Names constant: O(1) logarithmic: O(log n) (log_k n, log n^2 ∈ O(log n)) linear: O(n) log-linear: O(n log n) quadratic: O(n^2) cubic: O(n^3) polynomial: O(n^k) (k is a constant) exponential: O(c^n) (c is a constant > 1)
Meet the Family O( f(n) ) is the set of all functions asymptotically less than or equal to f(n). o( f(n) ) is the set of all functions asymptotically strictly less than f(n). Ω( f(n) ) is the set of all functions asymptotically greater than or equal to f(n). ω( f(n) ) is the set of all functions asymptotically strictly greater than f(n). Θ( f(n) ) is the set of all functions asymptotically equal to f(n). (Θ is pronounced “theta”.)
Meet the Family, Formally g(n) ∈ O( f(n) ) iff there exist c and n0 such that g(n) ≤ c f(n) for all n ≥ n0. g(n) ∈ o( f(n) ) iff for every c there exists an n0 such that g(n) < c f(n) for all n ≥ n0 (equivalent to: lim as n → ∞ of g(n)/f(n) = 0). g(n) ∈ Ω( f(n) ) iff there exist c and n0 such that g(n) ≥ c f(n) for all n ≥ n0. g(n) ∈ ω( f(n) ) iff for every c there exists an n0 such that g(n) > c f(n) for all n ≥ n0 (equivalent to: lim as n → ∞ of g(n)/f(n) = ∞). g(n) ∈ Θ( f(n) ) iff g(n) ∈ O( f(n) ) and g(n) ∈ Ω( f(n) ). Note how they all look suspiciously like copy-and-paste definitions – make sure students notice that the only differences are the relation (less-than vs. less-than-or-equal, etc.) and whether it’s “there exists a c” or “for all c”.
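A small worked example (not on the slide) using the limit characterizations above:

\[
\lim_{n\to\infty} \frac{n}{n^2} = \lim_{n\to\infty} \frac{1}{n} = 0
\quad\Rightarrow\quad n \in o(n^2),
\]
\[
\lim_{n\to\infty} \frac{n^2}{n} = \lim_{n\to\infty} n = \infty
\quad\Rightarrow\quad n^2 \in \omega(n).
\]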
Big-Omega et al. Intuitively
Asymptotic Notation   Mathematical Relation
O                     ≤
Ω                     ≥
Θ                     =
o                     <
ω                     >
In fact, it’s not just the intuitive chart – it’s the chart of definitions! Notice how similar the formal definitions all were; they only differed in the relations highlighted in blue.
Pros and Cons of Asymptotic Analysis Let’s do a think-pair-share: have the students (in teams of 2 or 3) come up with some of the pros and cons of asymptotic analysis, then come together and share as a group. Some points I hope to get out: Pros: quick-and-dirty comparison of algorithms; separates the algorithm from the architecture. Cons: less precise (we lose the constants); separating the algorithm from the architecture can hurt (graphics, etc.); doesn’t capture implementation complexity.
Perspective: Kinds of Analysis Running time may depend on the actual input data, not just the length of the input. Distinguish: worst case (your worst enemy is choosing the input), best case, average case (assumes some probabilistic distribution of inputs), amortized (average time over many operations).
Types of Analysis Two orthogonal axes: bound flavor and analysis case. Bound flavor: upper bound (O, o), lower bound (Ω, ω), asymptotically tight (Θ). Analysis case: worst case (adversary), average case, best case, “amortized”. Any bound flavor can be applied to any analysis case. For example, we’ll later prove that sorting in the worst case takes at least n log n time – that’s a lower bound on a worst case. Average case is hard! What does “average” mean? For example, what’s the average case for searching an unordered list (as precisely as possible, not asymptotically)? The tempting answer of n/2 is wrong – it’s about n, not n/2. Why? You have to search the whole thing if the element is not there (see the worked example below). Note there are two senses of “tight”; I’ll try to avoid the terminology “asymptotically tight” and stick with the lower definition of tight. O(∞) is not tight!
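A small worked example (not on the slide; it assumes that when the key is present it is equally likely to be at any of the n positions) backing up the note above:

\[
E[\text{comparisons} \mid \text{key present}] \;=\; \sum_{i=1}^{n} \frac{i}{n} \;=\; \frac{n+1}{2},
\qquad
E[\text{comparisons} \mid \text{key absent}] \;=\; n.
\]

If unsuccessful searches occur a constant fraction of the time, the overall average is pulled above n/2 toward n, but either way it is Θ(n).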
Show 16 n^3 log_8(10n^2) + 100 n^2 = O(n^3 log n) by eliminating low-order terms and constant coefficients:
16 n^3 log_8(10n^2) + 100 n^2
→ 16 n^3 log_8(10n^2)              (drop the low-order term 100 n^2)
→ n^3 log_8(10n^2)                 (drop the constant coefficient 16)
= n^3 (log_8 10 + log_8(n^2))      (log of a product)
= n^3 log_8 10 + n^3 log_8(n^2)
→ n^3 log_8(n^2)                   (drop the low-order term n^3 log_8 10)
= 2 n^3 log_8 n                    (log of a power)
→ n^3 log_8 n                      (drop the coefficient 2)
= n^3 log_8(2) log n = n^3 log(n) / 3   (change of base to base 2)
→ n^3 log n                        (drop the coefficient 1/3)
So 16 n^3 log_8(10n^2) + 100 n^2 = O(n^3 log n).