Introduction to Analysing Costs
COMP 103, 2015-T2, Lecture 10
Thomas Kuehne
Marcus Frean, Rashina Hoda, and Peter Andreae
School of Engineering and Computer Science, Victoria University of Wellington
RECAP: iterators, comparators, comparables
TODAY: introduction to analysing cost
Analysing Costs (in general)
How can we determine the costs of a program?
Time:
- Run the program and count the milliseconds/minutes/days.
- Count the number of steps/operations the algorithm will take.
Space:
- Measure the amount of memory the program occupies.
- Count the number of elementary data items the algorithm stores.
Does this apply to programs or algorithms? Both.
- programs: called "benchmarking"
- algorithms: called "analysis"
Benchmarking: program cost
Measure actual programs, on real machines, with specific input:
- measure elapsed time: System.currentTimeMillis() returns the time from the system clock in milliseconds
- measure real memory
Problems:
- what input? ⇒ use large data sets, don't include user input
- other users/processes? ⇒ minimise them, average over many runs
- which computer? ⇒ specify the details
- how to compare cross-platform? ⇒ measure cost at an abstract level (see the sketch below)
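A minimal benchmarking sketch along these lines, assuming a placeholder task (summing an array) as the code under test; the repetition count and warm-up run are illustrative choices, not part of the lecture:

    // Rough benchmark harness: times a task several times and averages the results.
    public class Benchmark {

        // The operation being measured; replace with the code under test.
        static long task(int[] data) {
            long sum = 0;
            for (int x : data) sum += x;
            return sum;
        }

        public static void main(String[] args) {
            int n = 1_000_000;                 // large data set, as suggested above
            int[] data = new int[n];
            for (int i = 0; i < n; i++) data[i] = i;

            int runs = 20;
            task(data);                        // one untimed warm-up run

            long totalMillis = 0;
            for (int r = 0; r < runs; r++) {
                long start = System.currentTimeMillis();
                task(data);
                totalMillis += System.currentTimeMillis() - start;
            }
            System.out.println("average time: " + (totalMillis / (double) runs) + " ms");
        }
    }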
Analysis: Algorithm "complexity"
Abstract away from the details of:
- the hardware, the operating system
- the programming language, the compiler
- the program, the specific input
Measure the number of "steps" as a function of the data size:
- best case (easy, but not interesting)
- worst case (usually easy)
- average case (harder)
The precise number of steps is not required:
  3.47 n² + 53 steps
  3 n log(n) + 5n - 3 steps
Rather, we are interested in the complexity class.
Analysis: The Big Picture
We only care about the shape with which cost grows with input size.
[Plot: cost vs. input size for f1(n) = n² + 2 and f2(n) = n; the quadratic curve dominates as n grows.]
Analysis: Notation (Big-O)
f(n) is O(g(n)) if there are two positive constants, s and n0, such that
  f(n) ≤ s |g(n)|   for all n ≥ n0.
[Plot: cost vs. input size, showing f(n) staying below s·g(n) for all n beyond n0.]
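As a small worked example (not from the slides): take f(n) = 3n + 5. Choosing s = 4 and n0 = 5 gives 3n + 5 ≤ 4n for all n ≥ 5, so f(n) is O(n). The same f(n) is also O(n²), but O(n) is the tighter, more informative class.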
Big-O Notation
We only care about large input sets and ignore constant factors:
  3.47 n² steps            ⇒ O(n²)
  3 n log n + 12 n steps   ⇒ O(n log n)
Lower-order terms become insignificant for large n.
Multiplication factors don't tell us how things "scale up".
We care about how cost grows with input size.
"Asymptotic cost", or "big-O" cost, describes how cost grows with input size.
Big-O classes, with examples:
  O(1)      constant:     cost is independent of n (a fixed cost!)
  O(log n)  logarithmic:  cost grows by 1 every time n doubles (slow growth!)
  O(n)      linear:       cost grows linearly with n (can often be beaten)
  O(n²)     quadratic:    cost grows like n² (severely limits problem size)
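A small illustrative sketch (not from the slides) of where these classes typically come from; the methods are made up for the example:

    public class GrowthExamples {

        // O(1): one array access, independent of n.
        static int first(int[] data) {
            return data[0];
        }

        // O(log n): binary search halves the remaining range each iteration.
        static boolean binarySearch(int[] sorted, int target) {
            int low = 0, high = sorted.length - 1;
            while (low <= high) {
                int mid = (low + high) / 2;
                if (sorted[mid] == target) return true;
                if (sorted[mid] < target) low = mid + 1;
                else high = mid - 1;
            }
            return false;
        }

        // O(n): one pass over the data.
        static long sum(int[] data) {
            long total = 0;
            for (int x : data) total += x;
            return total;
        }

        // O(n²): nested loops compare all pairs.
        static boolean hasDuplicate(int[] data) {
            for (int i = 0; i < data.length; i++)
                for (int j = i + 1; j < data.length; j++)
                    if (data[i] == data[j]) return true;
            return false;
        }
    }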
Big-O costs
Common classes, from cheapest to most expensive:
  O(1)        constant
  O(log N)    logarithmic
  O(N)        linear
  O(N log N)
  O(N²)       quadratic
  O(N³)       cubic
  O(N!)       factorial
[Plot: cost vs. input size for O(1), O(log n), O(n), O(n log n), O(n²), and O(n³).]
Manageable Problem Sizes
[Table: for algorithms costing n, n log n, n², n³, and 2ⁿ steps, the largest problem size N that can be handled in 1 minute, 1 hour, 1 day, 1 week, and 1 year.]
High asymptotic cost severely limits problem size.
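A rough illustration (assuming, purely for the sake of the numbers, a machine doing 10⁹ steps per second): an O(n²) algorithm on n = 10⁶ items needs about 10¹² steps, roughly 17 minutes; an O(2ⁿ) algorithm on just n = 60 items needs about 1.2 × 10¹⁸ steps, well over 30 years.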
What is a "step"?
Of importance: comparing data, moving data, anything you consider to be "expensive".

  public E remove(int index) {
      if (index < 0 || index >= count)
          throw new IndexOutOfBoundsException();
      E ans = data[index];
      for (int i = index + 1; i < count; i++)   // ← key step: moving data
          data[i - 1] = data[i];
      count--;
      data[count] = null;
      return ans;
  }
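To make "counting steps" concrete, here is a small self-contained sketch (not from the slides): a stripped-down array-backed list whose remove method counts its key step, the element moves:

    // Minimal array-backed list that counts element moves in remove().
    public class CountingList {
        private Object[] data = new Object[100];
        private int count = 0;
        private int moves = 0;               // number of key steps performed

        public void add(Object item) {       // append at the end (no shifting needed)
            data[count++] = item;
        }

        public Object remove(int index) {
            if (index < 0 || index >= count)
                throw new IndexOutOfBoundsException();
            Object ans = data[index];
            for (int i = index + 1; i < count; i++) {
                data[i - 1] = data[i];       // key step: moving data
                moves++;
            }
            count--;
            data[count] = null;
            return ans;
        }

        public static void main(String[] args) {
            CountingList list = new CountingList();
            for (int i = 0; i < 10; i++) list.add(i);
            list.remove(0);                  // shifts the remaining 9 items
            System.out.println("moves so far: " + list.moves);   // prints 9
        }
    }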
ArrayList: get, set, remove
Assume an ArrayList contains n items.
Cost of get and set:
  best, worst, average cases?
Cost of remove:
  worst case: what is the worst case? how many steps?
  average case: what is the average case? how many steps?
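For reference, a sketch of what get and set look like in a typical array-backed list (the class name and Object[] backing array are illustrative, not the course's code); each touches exactly one array slot, regardless of n:

    public class GetSetSketch<E> {
        private Object[] data = new Object[100];
        private int count = 0;

        public void add(E item) { data[count++] = item; }   // append, for building a test list

        @SuppressWarnings("unchecked")
        public E get(int index) {
            if (index < 0 || index >= count)
                throw new IndexOutOfBoundsException();
            return (E) data[index];                          // one array access
        }

        public E set(int index, E item) {
            E old = get(index);                              // one access (plus the bounds check)
            data[index] = item;                              // one array write
            return old;
        }
    }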
ArrayList: add (at some position)

  public void add(int index, E item) {
      if (index < 0 || index > count)
          throw new IndexOutOfBoundsException();
      ensureCapacity();
      for (int i = count; i > index; i--)
          data[i] = data[i - 1];
      data[index] = item;
      count++;
  }

Cost of add(index, value):
  key step?
  worst case:
  average case:
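A companion sketch (not from the slides) counting how many element moves add(index, item) performs on a list of n items; adding at the front shifts everything, adding at the end shifts nothing:

    // Counts element moves performed by add(index, item) on an array-backed list of n items.
    public class AddCost {
        static int movesForAdd(int n, int index) {
            // add(index, item) shifts every element at or after 'index' one slot right:
            // that is (n - index) moves.
            return n - index;
        }

        public static void main(String[] args) {
            int n = 10;
            System.out.println("add at 0 (front): " + movesForAdd(n, 0) + " moves");   // worst case: n
            System.out.println("add at n/2:       " + movesForAdd(n, n / 2) + " moves");
            System.out.println("add at n (end):   " + movesForAdd(n, n) + " moves");   // best case: 0
        }
    }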
ArraySet costs
Costs of contains, add, and remove: O(n)
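A sketch of why these are linear, assuming a set stored in an unsorted array (class and field names are illustrative): contains must scan until it finds the item or reaches the end, and add and remove both rely on such a scan.

    // Simplified unsorted-array set: every operation may have to scan all n items.
    public class SimpleArraySet<E> {
        private Object[] data = new Object[100];   // capacity growth omitted for brevity
        private int count = 0;

        public boolean contains(E item) {
            for (int i = 0; i < count; i++)        // up to n comparisons
                if (data[i].equals(item)) return true;
            return false;
        }

        public void add(E item) {
            if (contains(item)) return;            // O(n) duplicate check
            data[count++] = item;                  // the append itself is O(1)
        }

        public boolean remove(E item) {
            for (int i = 0; i < count; i++) {
                if (data[i].equals(item)) {
                    data[i] = data[--count];       // order doesn't matter in a set:
                    data[count] = null;            // fill the gap with the last item
                    return true;
                }
            }
            return false;
        }
    }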