Algorithm Analysis Lakshmish Ramaswamy
What Constitutes Good Software?
  Code correctness
  Good design
  Code reusability
  OO design can help us achieve both good design and reusability
  But what about performance?
    – Software can become useless if performance is poor
Factors Affecting Performance
  Platform where software runs
    – Resources
    – Operating system
    – Load on the platform
  Programming language
  Compiler
  Algorithms used
Algorithm
  A clearly specified set of instructions for solving a particular problem
  Independent of programming language
  Software can comprise several algorithms
    – Example: course maintenance software
  Algorithms vary in terms of time and memory requirements
  Algorithm analysis: the science of determining time and space requirements
Chapter Overview
  Estimation of running time of an algorithm
  Mathematical technique to describe/reason about running time
    – Comparing algorithms without implementing them
  Techniques for reducing time
  Example – Searching (Binary search)
Algorithm Analysis
  Reasoning about algorithms can be complex
    – Performance of different algorithms may vary with input
  Time requirement generally depends upon the amount of input data
    – Searching a 1-million-word file takes much longer than searching a 10-word file
  Running time is a function of input size
    – Exact value may depend upon many factors
An Example
How to Compare Algorithms?
  In general, look for trends
    – Not individual points on the curve
  An algorithm is better if its running time increases slowly with increasing input size
    – Linear is better than O(N log N), which is better than quadratic
  Do you always need to choose the best algorithm?
    – Not if you know the range of input values
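To make the trend concrete, the short Python sketch below tabulates N, N log N and N^2 for a few input sizes. The function name `growth_row`, the base-2 logarithm and the sample sizes are choices made for this illustration only; they are not taken from the slides.

```python
import math


def growth_row(n: int) -> tuple[int, float, int]:
    """Return (N, N log2 N, N^2) for one input size (hypothetical helper)."""
    return n, n * math.log2(n), n * n


# Print one row per sample size; the quadratic column pulls away quickly.
for n in (16, 1024, 1_048_576):
    linear, n_log_n, quadratic = growth_row(n)
    print(f"N={linear:>9,}  N log N={n_log_n:>13,.0f}  N^2={quadratic:>17,}")
```

For small N the three columns are close together, which is why the choice of algorithm matters less when the input range is known to be small.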
More on Functions
  Usually expressed as a sum of components (terms)
  Dominant term – Component that has the maximum impact on the growth rate
    10N^3 + N^2 + 40N + 80
  Functions compared in terms of dominant terms
  Cubic function – Dominant term is 3rd power
  Quadratic function – Dominant term is 2nd power
Why Analyze Growth Rates?
  Dominant term contributes most of the value of the function for large input sizes
    – For N = 1000, about 99.99% of the value is contributed by the 10N^3 term
  Constants are not meaningful across platforms
    – Compiler, architecture and OS have significant impact on the constants
  We are mostly concerned with large inputs
    – Algorithms have comparable performance on small inputs
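The 99.99% figure can be checked with a few lines of Python. This is a sketch that assumes the example cubic is 10N^3 + N^2 + 40N + 80; the function name `dominant_share` is made up for this illustration.

```python
def dominant_share(n: int) -> float:
    """Fraction of f(n) = 10n^3 + n^2 + 40n + 80 contributed by the 10n^3 term.

    The polynomial is an assumed example, not a measured running time.
    """
    cubic = 10 * n**3
    total = cubic + n**2 + 40 * n + 80
    return cubic / total


# The share of the cubic term approaches 100% as n grows.
for n in (10, 100, 1000):
    print(f"N={n:>5}: {dominant_share(n):.4%} of the value comes from 10N^3")
```

The share rises with N, which is the reason lower-order terms are dropped when describing growth rates.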
Example (Contd.)
Growth Rates of Functions
The Big-Oh Notation
  Captures the dominant term of the function
  The constants associated with the dominant term are not included
  If dominant term is 10N^2, function is O(N^2)
  If dominant term is 0.5N^3, function is O(N^3)
  If dominant term is 2N log(N), function is O(N log N)
Example Algorithms
  Finding minimum (maximum) element in an array
  Finding average (mean) of an array
  Calculating variance of a population
  Calculating closest points in a plane
  Calculating collinear points in a plane
Minimum Element
  ALGORITHM:
    min <- array[0]
    i <- 1
    while (i < numElements)
      if (array[i] < min)
        min <- array[i]
      endif
      i <- i + 1
    endwhile
  Fixed amount of work (constant) for each element
  Linear or O(N) running time
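A minimal Python sketch of the minimum-element scan, mirroring the pseudocode loop. The function name `find_min` and the empty-input check are additions for this illustration.

```python
def find_min(array):
    """Return the smallest element using a single pass: O(N) time."""
    if not array:
        raise ValueError("array must contain at least one element")
    minimum = array[0]
    i = 1
    while i < len(array):
        # Constant work per element: one comparison, at most one assignment.
        if array[i] < minimum:
            minimum = array[i]
        i += 1  # advance to the next element
    return minimum


print(find_min([7, 3, 9, -2, 5]))
```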
Average of Array Elements
  ALGORITHM:
    sum <- 0.0
    i <- 0
    while (i < numElements)
      sum <- sum + array[i]
      i <- i + 1
    endwhile
    average <- sum/numElements
  Again an O(N) algorithm
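The same single-pass pattern in Python, as a sketch. The name `average` and the empty-input check are assumptions of this example.

```python
def average(array):
    """Return the arithmetic mean using one pass over the data: O(N) time."""
    if not array:
        raise ValueError("array must contain at least one element")
    total = 0.0
    for value in array:
        total += value  # constant work per element
    return total / len(array)


print(average([2, 4, 6, 8]))
```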
Variance of Population in Array
  ALGORITHM:
    FIRST CALCULATE AVERAGE (use previous algorithm)
    varSum <- 0
    i <- 0
    while (i < numElements)
      varSum <- varSum + (array[i]-average)^2
      i <- i + 1
    endwhile
    variance <- varSum/numElements
  Two phases
    – Constant amount of work per element in each phase
  O(N) algorithm
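A Python sketch of the two-phase computation. The function name `population_variance` is chosen for this illustration; the two passes together still do a constant amount of work per element, so the total stays O(N).

```python
def population_variance(array):
    """Return the population variance using two linear passes: O(N) time."""
    n = len(array)
    if n == 0:
        raise ValueError("array must contain at least one element")
    # Phase 1: compute the average.
    mean = sum(array) / n
    # Phase 2: accumulate squared deviations from the average.
    var_sum = 0.0
    for value in array:
        var_sum += (value - mean) ** 2
    return var_sum / n


print(population_variance([2, 4, 4, 4, 5, 5, 7, 9]))
```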
Calculating Closest Points in Plane
  ALGORITHM:
    Select any two points in the set
    Mindist <- Euclidean distance between the selected points
    For every pair of points
      if Mindist > distance between pair
        Mindist <- distance between pair
      endif
    Endfor
  N(N-1)/2 comparisons
  O(N^2) algorithm
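The brute-force pair scan can be sketched in Python with `itertools.combinations`, which yields each of the N(N-1)/2 unordered pairs once. The function name `closest_pair_distance` and the representation of points as (x, y) tuples are assumptions of this example.

```python
import math
from itertools import combinations


def closest_pair_distance(points):
    """Return the smallest Euclidean distance between any two points.

    Examines every unordered pair, so the running time is O(N^2).
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    return min(math.dist(p, q) for p, q in combinations(points, 2))


sample = [(0, 0), (3, 4), (1, 1), (10, 10)]
print(closest_pair_distance(sample))
```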
Collinear Points in Plane
  Problem: Given a set of points in a plane, determine if any three form a straight line
  ALGORITHM:
    For every triplet of points
      Check if they form a straight line  // How do you do this?
      If yes, output and break
    Endfor
  Number of triplets = N(N-1)(N-2)/6
  O(N^3) algorithm
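The slide leaves the straight-line check open. One common option, assumed here rather than taken from the slides, is the cross-product test: three points p, q, r lie on one line exactly when the cross product of (q - p) and (r - p) is zero. The sketch below uses that test with hypothetical function names; the exact comparison with zero is reliable for integer coordinates, while floating-point coordinates would need a tolerance.

```python
from itertools import combinations


def are_collinear(p, q, r):
    """True if p, q, r lie on one line (zero cross product of q-p and r-p)."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return cross == 0


def find_collinear_triplet(points):
    """Return the first collinear triplet found, or None.

    Checks all N(N-1)(N-2)/6 triplets in the worst case: O(N^3) time.
    """
    for p, q, r in combinations(points, 3):
        if are_collinear(p, q, r):
            return p, q, r  # output and stop at the first match
    return None


print(find_collinear_triplet([(0, 0), (1, 2), (2, 1), (4, 2)]))
```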