1 Introduction to Analysing Costs 2013-T2 Lecture 10 School of Engineering and Computer Science, Victoria University of Wellington. Marcus Frean, Rashina Hoda, and Peter Andreae. COMP 103. Marcus Frean

2 RECAP-TODAY
RECAP
• using collections, implementing ArrayList, iterators, comparators
TODAY
• introduction to analysing costs
Announcements

3 Analysing Costs (in general)
How can we determine the costs of a program?
• Time:
  • Run the program and count the milliseconds/minutes/days.
  • Count the number of steps/operations the algorithm will take.
• Space:
  • Measure the amount of memory the program occupies.
  • Count the number of elementary data items the algorithm stores.
• Programs or algorithms? Both.
  • programs: called “benchmarking”
  • algorithms: called “analysis”

4 Benchmarking: program cost
Measure:
• actual programs
• on real machines
• with specific input
• measure elapsed time: System.currentTimeMillis() → time from the system clock in milliseconds (long)
• measure real memory
Problems:
• what input? ⇒ use large data sets; don’t include user input
• other users/processes? ⇒ minimise them; average over many runs
• which computer? ⇒ specify details
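
A minimal Java sketch of the benchmarking recipe on this slide: a fixed, large input, no user input, and an average over several runs. The class name Benchmark, the helpers averageMillis and makeData, and the workload (sorting a copy of a random array) are illustrative assumptions, not part of the lecture.

```java
import java.util.Arrays;
import java.util.Random;

class Benchmark {
    /** Runs the task `runs` times and returns the mean elapsed time in milliseconds. */
    static double averageMillis(Runnable task, int runs) {
        long total = 0;
        for (int r = 0; r < runs; r++) {
            long start = System.currentTimeMillis();   // system clock, in ms (long)
            task.run();
            total += System.currentTimeMillis() - start;
        }
        return (double) total / runs;
    }

    /** Builds a reproducible data set (fixed seed) so every run sees the same input. */
    static int[] makeData(int n, long seed) {
        Random rng = new Random(seed);
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = rng.nextInt();
        }
        return data;
    }

    public static void main(String[] args) {
        int[] data = makeData(1_000_000, 42L);          // large data set, no user input
        // Sort a fresh copy each time so every run does the same amount of work.
        double ms = averageMillis(() -> Arrays.sort(data.clone()), 10);
        System.out.println("mean time over 10 runs: " + ms + " ms");
        // To "specify details", report the machine and JVM alongside the number.
        System.out.println(System.getProperty("os.name") + ", Java "
                + System.getProperty("java.version"));
    }
}
```

The measured time still includes copying the array and any interference from other processes, which is why the slide asks for averaging and for the machine details to be stated.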

5 Analysis: Algorithm “complexity”
• Abstract away from the details of
  • the hardware
  • the operating system
  • the programming language
  • the compiler
  • the program
  • the specific input
• Measure number of “steps” as a function of the data size.
  • best case (easy, but useless)
  • worst case (usually easy)
  • average case (harder)
• We could try to construct an expression for the number of steps, for example:
  • 3.47 n^2 - 67n + 53 steps
  • 3n log(n) - 5n + 6 steps

6 Analysis: Notation
• We only care about the cost when it is large, and we don’t care about the constant multipliers:
  3.47 n^2 + 102n + 10064 steps → O(n^2)
  3n log(n) + 12n steps → O(n log n)
• Lower-order terms become insignificant for large n.
• Multiplication factors don’t tell us how things “scale up”.
• We care about how cost grows with input size.
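
As a worked check of why the lower-order terms drop out, take the first expression from this slide and divide it by its dominant term n^2:

```latex
\[
\frac{3.47\,n^2 + 102\,n + 10064}{n^2}
  = 3.47 + \frac{102}{n} + \frac{10064}{n^2}
  \;\longrightarrow\; 3.47 \quad \text{as } n \to \infty
\]
% At n = 1000 the ratio is already 3.47 + 0.102 + 0.010064 = 3.582064.
```

The ratio settles at a constant, so the cost scales like n^2. The constant 3.47 is exactly the multiplier that big-O notation discards.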

7 Big-O costs
• “Asymptotic cost”, or “big-O” cost: describes how cost grows with input size.
Examples:
  O(1)      constant     cost doesn’t grow with n at all: it’s a fixed cost
  O(n)      linear       cost grows linearly with n
  O(log n)  logarithmic  cost grows by 1 every time n doubles
  O(n^2)    quadratic    cost grows in proportion to n*n (bad!!)
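
The growth rates above can be seen by counting loop iterations as n doubles. This is a small sketch; the class GrowthDemo and its method names are made up for illustration.

```java
class GrowthDemo {
    /** One step per item: O(n). */
    static long linearSteps(int n) {
        long steps = 0;
        for (int i = 0; i < n; i++) {
            steps++;
        }
        return steps;
    }

    /** One step per halving of n: O(log n). */
    static long logSteps(int n) {
        long steps = 0;
        for (int m = n; m > 1; m /= 2) {
            steps++;
        }
        return steps;
    }

    /** One step per pair (i, j): O(n^2). */
    static long quadraticSteps(int n) {
        long steps = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                steps++;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // Double n each time and watch how each count responds.
        for (int n = 8; n <= 64; n *= 2) {
            System.out.println("n=" + n
                    + "  linear=" + linearSteps(n)
                    + "  log=" + logSteps(n)
                    + "  quadratic=" + quadraticSteps(n));
        }
    }
}
```

Each time n doubles, the linear count doubles, the logarithmic count goes up by one, and the quadratic count is multiplied by four.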

8 Big-O costs
  N   O(1)      O(N)    O(log N)     O(N log N)  O(N^2)     O(N^3)  O(N!)
      Constant  Linear  Logarithmic              Quadratic  Cubic   Factorial
  1   5         1       0.00         0.00        1          1       1
  5   5         5       2.32         11.61       25         125     120
 10   5         10      3.32         33.22       100        1000    3628800
 15   5         15      3.91         58.60       225        3375    1.31E+12
 20   5         20      4.32         86.44       400        8000    2.43E+18

9 Problem: What is a “step”?
• We should probably count these:
  • actions in the innermost loop (happen the most times)
  • actions that happen every time round (not inside an “if”)
  • actions involving “data values” (rather than indexes)
  public E remove (int index){
    if (index < 0 || index >= count) throw new ….Exception();
    E ans = data[index];
    for (int i=index+1; i< count; i++)
      data[i-1]=data[i];          ← Key Step
    count--;
    data[count] = null;
    return ans;
  }
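
One way to make the key step concrete is to count how often it runs. The sketch below wraps the remove method from this slide in a hypothetical class CountingList. The keySteps counter, the append helper, the fixed capacity, and the choice of IndexOutOfBoundsException are all assumptions added for illustration.

```java
class CountingList<E> {
    private final Object[] data;
    private int count = 0;
    long keySteps = 0;                       // number of times the key step has run

    CountingList(int capacity) {
        data = new Object[capacity];
    }

    /** Adds at the end; assumes the fixed capacity is large enough. */
    void append(E item) {
        data[count++] = item;
    }

    int size() {
        return count;
    }

    @SuppressWarnings("unchecked")
    E remove(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException();
        }
        E ans = (E) data[index];
        for (int i = index + 1; i < count; i++) {
            data[i - 1] = data[i];           // key step: moves one data value
            keySteps++;
        }
        count--;
        data[count] = null;
        return ans;
    }

    public static void main(String[] args) {
        CountingList<String> list = new CountingList<>(10);
        for (int i = 0; i < 10; i++) {
            list.append("item" + i);
        }
        list.remove(0);                      // every later item has to move down
        System.out.println("key steps after remove(0): " + list.keySteps);
    }
}
```

Only the assignment inside the loop is counted: it is in the innermost loop, runs every time round, and moves a data value, which matches the three criteria on the slide.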

10 ArrayList: get, set, remove
• Assume some List contains n items.
• Cost of get and set:
  • best, worst, average: ⇒ constant number of steps, regardless of n
• Cost of remove:
  • worst case: what is the worst case? how many steps?
  • average case: what is the average case (half way)? how many steps?
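
A sketch of one way to answer the questions above. It counts only the item moves in the shifting loop of remove(index), and it assumes (the slide does not state this) that every index from 0 to n-1 is equally likely to be removed:

```latex
\[
\text{moves}(i) = n - 1 - i, \qquad
\text{worst case: } i = 0 \;\Rightarrow\; n - 1 \text{ moves, which is } O(n)
\]
\[
\text{average case: } \frac{1}{n}\sum_{i=0}^{n-1} (n - 1 - i)
  = \frac{1}{n}\cdot\frac{n(n-1)}{2}
  = \frac{n-1}{2}, \text{ which is also } O(n)
\]
```

Removing half way along costs about half as many moves as removing at the front, but halving is only a constant factor, so both cases are linear.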

11 ArrayList: add (add at some position)
  public void add(int index, E item){
    if (index < 0 || index > count) throw new IndexOutOfBoundsException();
    ensureCapacity();
    for (int i=count; i > index; i--)
      data[i]=data[i-1];
    data[index]=item;
    count++;
  }
• Cost of add(index, value):
  • key step?
  • worst case:
  • average case:
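
The questions on this slide can be explored by running the same loop as add(index, item) and counting iterations instead of moving data. The class AddCost and its methods are hypothetical, and the average assumes each of the n+1 legal positions is equally likely, which is an assumption rather than something the slide states.

```java
class AddCost {
    /** Counts how often data[i] = data[i-1] would run when adding at `index` to a list of `count` items. */
    static int shifts(int count, int index) {
        if (index < 0 || index > count) {
            throw new IndexOutOfBoundsException();
        }
        int steps = 0;
        for (int i = count; i > index; i--) {
            steps++;                          // stands in for data[i] = data[i-1]
        }
        return steps;
    }

    /** Largest number of shifts over all legal insert positions 0..n. */
    static int worstCase(int n) {
        int worst = 0;
        for (int index = 0; index <= n; index++) {
            worst = Math.max(worst, shifts(n, index));
        }
        return worst;
    }

    /** Mean number of shifts, assuming every position 0..n is equally likely. */
    static double averageCase(int n) {
        long total = 0;
        for (int index = 0; index <= n; index++) {
            total += shifts(n, index);
        }
        return total / (n + 1.0);
    }

    public static void main(String[] args) {
        int n = 100;
        System.out.println("worst case shifts:   " + worstCase(n));
        System.out.println("average case shifts: " + averageCase(n));
    }
}
```

Under these assumptions the worst case is adding at index 0 (n shifts) and the average is n/2 shifts, so both are O(n). Adding at the end needs no shifts at all.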

12 ArraySet costs
Costs:
• contains, add, remove: O(n)
• All the cost is in the search!
• How can we speed up the search?
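
To illustrate why the search dominates, here is a minimal sketch of an array-backed set that counts comparisons. The class SimpleArraySet, its comparisons counter, the String element type, and the fixed capacity are assumptions for illustration, not the ArraySet from the course.

```java
class SimpleArraySet {
    private final String[] data;
    private int count = 0;
    long comparisons = 0;                     // number of equals() calls made by the search

    SimpleArraySet(int capacity) {
        data = new String[capacity];
    }

    /** Linear search: looks at items one by one until it finds a match or runs out. */
    private int indexOf(String item) {
        for (int i = 0; i < count; i++) {
            comparisons++;
            if (data[i].equals(item)) {
                return i;
            }
        }
        return -1;
    }

    boolean contains(String item) {
        return indexOf(item) >= 0;
    }

    /** Must search first to avoid duplicates; assumes capacity is sufficient. */
    boolean add(String item) {
        if (indexOf(item) >= 0) {
            return false;
        }
        data[count++] = item;
        return true;
    }

    /** Order does not matter in a set, so the last item fills the gap in one step. */
    boolean remove(String item) {
        int i = indexOf(item);
        if (i < 0) {
            return false;
        }
        data[i] = data[--count];
        data[count] = null;
        return true;
    }

    public static void main(String[] args) {
        SimpleArraySet set = new SimpleArraySet(1000);
        for (int i = 0; i < 1000; i++) {
            set.add("k" + i);
        }
        set.comparisons = 0;
        set.contains("absent");               // worst case: every item is checked
        System.out.println("comparisons for a missing item: " + set.comparisons);
    }
}
```

In this sketch, apart from the search, add and remove each do a constant amount of work, so all three operations cost O(n) only because of indexOf.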

