Presentation is loading. Please wait.

Presentation is loading. Please wait.

SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan

Similar presentations


Presentation on theme: "SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan"— Presentation transcript:

1 SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan ola@cs.duke.edu http://www.cs.duke.edu/~ola

2 SIGCSE 2004 2 Analysis l Vocabulary to discuss performance and to reason about alternative algorithms and implementations  It’s faster! It’s more elegant! It’s safer! It’s cooler! l Use mathematics to analyze the algorithm, l Implementation is another matter  cache, compiler optimizations, OS, memory,…

3 SIGCSE 2004 3 What do we need? l Need empirical tests and mathematical tools  Compare by running 30 seconds vs. 3 seconds, 5 hours vs. 2 minutes Two weeks to implement code l We need a vocabulary to discuss tradeoffs

4 SIGCSE 2004 4 Analyzing Algorithms l Three solutions to online problem sort1  Sort strings, scan looking for runs  Insert into Set, count each unique string  Use map of (String,Integer) to process l We want to discuss trade-offs of these solutions  Ease to develop, debug, verify  Runtime efficiency  Vocabulary for discussion

5 SIGCSE 2004 5 What is big-Oh about? l Intuition: avoid details when they don’t matter, and they don’t matter when input size (N) is big enough  For polynomials, use only leading term, ignore coefficients: linear, quadratic y = 3x y = 6x-2 y = 15x + 44 y = x 2 y = x 2 -6x+9 y = 3x 2 +4x

6 SIGCSE 2004 6 O-notation, family of functions first family is O(n), the second is O(n 2 )  Intuition: family of curves, same shape  More formally: O(f(n)) is an upper- bound, when n is large enough the expression cf(n) is larger  Intuition: linear function: double input, double time, quadratic function: double input, quadruple the time

7 SIGCSE 2004 7 Reasoning about algorithms l We have an O(n) algorithm,  For 5,000 elements takes 3.2 seconds  For 10,000 elements takes 6.4 seconds  For 15,000 elements takes ….? l We have an O(n 2 ) algorithm  For 5,000 elements takes 2.4 seconds  For 10,000 elements takes 9.6 seconds  For 15,000 elements takes …?

8 SIGCSE 2004 8 More on O-notation, big-Oh l Big-Oh hides/obscures some empirical analysis, but is good for general description of algorithm  Compare algorithms in the limit  20N hours v. N 2 microseconds: which is better?

9 SIGCSE 2004 9 More formal definition O-notation is an upper-bound, this means that N is O(N), but it is also O(N 2 ) ; we try to provide tight bounds. Formally:  A function g(N) is O(f(N)) if there exist constants c and n such that g(N) n cf(N) g(N) x = n

10 SIGCSE 2004 10 Big-Oh calculations from code l Search for element in array:  What is complexity (using O-notation)?  If array doubles, what happens to time? for(int k=0; k < a.length; k++) { if (a[k].equals(target)) return true; } return false; l Best case? Average case? Worst case?

11 SIGCSE 2004 11 Measures of complexity l Worst case  Good upper-bound on behavior  Never get worse than this l Average case  What does average mean?  Averaged over all inputs? Assuming uniformly distributed random data?

12 SIGCSE 2004 12 Some helpful mathematics l 1 + 2 + 3 + 4 + … + N  N(N+1)/2 = N 2 /2 + N/2 is O(N 2 ) l N + N + N + …. + N (total of N times)  N*N = N 2 which is O(N 2 ) l 1 + 2 + 4 + … + 2 N  2 N+1 – 1 = 2 x 2 N – 1 which is O(2 N )

13 SIGCSE 2004 13 10 6 instructions/sec, runtimes N O(log N)O(N)O(N log N)O(N 2 ) 10 0.0000030.000010.0000330.0001 100 0.0000070.000100.0006640.1000 1,000 0.0000100.001000.0100001.0 10,000 0.0000130.010000.1329001.7 min 100,000 0.0000170.100001.6610002.78 hr 1,000,000 0.0000201.019.911.6 day 1,000,000,000 0.00003016.7 min18.3 hr318 centuries

14 SIGCSE 2004 14 Multiplying and adding big-Oh l Suppose we do a linear search then we do another one  What is the complexity?  If we do 100 linear searches?  If we do n searches on an array of size n?

15 SIGCSE 2004 15 Multiplying and adding l Binary search followed by linear search?  What are big-Oh complexities? Sum?  50 binary searches? N searches? l What is the number of elements in the list (1,2,2,3,3,3)?  What about (1,2,2, …, n,n,…,n)?  How can we reason about this?

16 SIGCSE 2004 16 Reasoning about big-O l Given an n-list: (1,2,2,3,3,3, …n,n,…n)  If we remove all n’s, how many left?  If we remove all larger than n/2?

17 SIGCSE 2004 17 Reasoning about big-O l Given an n-list: (1,2,2,3,3,3, …n,n,…n)  If we remove every other element?  If we remove all larger than n/1000?  Remove all larger than square root n?

18 SIGCSE 2004 18 Money matters while (n != 0) { count++; n = n/2; } l Penny on a statement each time executed  What’s the cost?

19 SIGCSE 2004 19 Money matters void stuff(int n){ for(int k=0; k < n; k++) System.out.println(k); } while (n != 0) { stuff(n); n = n/2; } l Is a penny always the right cost/statement?  What’s the cost ?

20 SIGCSE 2004 20 Find k-th largest l Given an array of values, find k-th largest  0 th largest is the smallest  How can I do this? l Do it the easy way… l Do it the fast way …

21 SIGCSE 2004 21 Easy way, complexity? public int find(int[] list, int index) { Arrays.sort(list); return list[index]; }

22 SIGCSE 2004 22 Fast way, complexity? public int find(int[] list, int index) { return findHelper(list,index, 0, list.length-1); }

23 SIGCSE 2004 23 Fast way, complexity? private int findHelper(int[] list, int index, int first, int last) { int lastIndex = first; int pivot = list[first]; for(int k=first+1; k <= last; k++){ if (list[k] <= pivot){ lastIndex++; swap(list,lastIndex,k); } swap(list,lastIndex,first);

24 SIGCSE 2004 24 Continued… if (lastIndex == index) return list[lastIndex]; else if (index < lastIndex ) return findHelper(list,index, first,lastIndex-1); else return findHelper(list,index, lastIndex+1,last);

25 SIGCSE 2004 25 Recurrences int length(ListNode list) { if (0 == list) return 0; else return 1+length(list.getNext()); } l What is complexity? justification? l T(n) = time of length for an n-node list T(n) = T(n-1) + 1 T(0) = O(1)

26 SIGCSE 2004 26 Recognizing Recurrences l T must be explicitly identified l n is measure of size of input/parameter  T(n) time to run on an array of size n T(n) = T(n/2) + O(1) binary search O( log n ) T(n) = T(n-1) + O(1) sequential search O( n )

27 SIGCSE 2004 27 Recurrences T(n) = 2T(n/2) + O(1) tree traversal O( n ) T(n) = 2T(n/2) + O(n) quicksort O( n log n) T(n) = T(n-1) + O(n) selection sort O( n 2 )

28 SIGCSE 2004 28 Compute b n, b is BigInteger l What is complexity using BigInteger.pow?  What method does it use?  How do we analyze it? l Can we do exponentiation ourselves?  Is there a reason to do so?  What techniques will we use?

29 SIGCSE 2004 29 Correctness and Complexity? public BigInteger pow(BigInteger b, int expo) { if (expo == 0) { return BigInteger.ONE; } BigInteger half = pow(b,expo/2); half = half.multiply(half); if (expo % 2 == 0) return half; else return half.multiply(b); }

30 SIGCSE 2004 30 Correctness and Complexity? public BigInteger pow(BigInteger b, int expo) { BigInteger result =BigInteger.ONE; while (expo != 0){ if (expo % 2 != 0){ result = result.multiply(b); } expo = expo/2; b = b.multiply(b); } return result; }

31 SIGCSE 2004 31 Solve and Analyze l Given N words (e.g., from a file)  What are 20 most frequently occurring?  What are k most frequently occurring? l Proposals?  Tradeoffs in efficiency  Tradeoffs in implementation


Download ppt "SIGCSE 2004 1 Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan"

Similar presentations


Ads by Google