1
Comparing Algorithms Unit 1.2
2
Why do we need to study algorithms?
An algorithm is a set of unambiguous instructions for solving a problem. Why do we need to study algorithms? What sets computer science apart from other disciplines is its emphasis on precisely defined procedural solutions: specific instructions for getting answers. Would computers exist without algorithms? There may be more than one algorithm for solving a particular problem, and different algorithms may solve it at dramatically different speeds and with different memory requirements.
3
Correctness, alone, is not sufficient.
As you commence your career and future in CS, you will begin to see that problems may have more than one solution! If you create an algorithm that solves a problem, how do you measure its success? Correctness, alone, is not sufficient. We also care about:
How quickly the algorithm calculates its output (speed)
How much memory it consumes (space)
4
Computational Complexity – measure of how economical the algorithm is with time and space.
Time – how economical is the algorithm in terms of time? Does it take a long time to calculate the required results? The time complexity of an algorithm indicates how much time the algorithm requires to solve a particular problem.
Space – how economical is the algorithm in terms of memory? Does it require a large amount of memory to calculate the required results? The space complexity of an algorithm indicates how much memory the algorithm needs.
5
Consider a linear search algorithm on the following array:
15 6 10 18 7 3 1 9
Worst-case complexity: the complexity computed when the input happens to require the longest time or greatest workload. For example, if we are searching for the value 9, every item in the array must be accessed!
Best-case complexity: the complexity computed when the input happens to require the shortest time or smallest workload. For example, if we are searching for the value 15, only one comparison is needed!
Average-case complexity: the complexity calculated by averaging the times for every possible input. For example, finding an item in this array takes roughly n/2 ≈ 4 comparisons on average (exactly (1 + 2 + … + 8) / 8 = 4.5 if each position is equally likely).
The complexity of a problem is taken to be the worst-case complexity of the most efficient algorithm that solves it.
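To make these cases concrete, here is a short Python sketch (not from the original slides; the function name and counter are illustrative) of a linear search instrumented to count comparisons, run against the array above:

```python
def linear_search(items, target):
    """Return (index, comparisons); index is None if target is absent."""
    comparisons = 0
    for i, value in enumerate(items):
        comparisons += 1              # one basic operation per element examined
        if value == target:
            return i, comparisons
    return None, comparisons

data = [15, 6, 10, 18, 7, 3, 1, 9]
print(linear_search(data, 15))   # best case:  (0, 1), found immediately
print(linear_search(data, 9))    # worst case: (7, 8), every element examined
```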
6
How do we measure time complexity?
It is pointless to use a stopwatch! The measured speed of an algorithm would depend on the speed of the microprocessor and on the compiler used to generate the machine code: a supercomputer would outperform a personal computer. Therefore, a standard unit of time measurement would be unreliable.
7
How do we measure time complexity?
It is sufficient to identify the operation that contributes the most to the total running time (this operation is called the basic operation) and to compute the number of times this operation is executed. Therefore, the established method for analysing an algorithm's time efficiency is based on counting the number of times the algorithm's basic operation is executed on inputs of size n.
8
For example, consider these two algorithms, which both calculate the sum of the first n integers.
SUB sumIntegersMethod1(n)
  sum ← 0
  FOR i ← 1 TO n
    sum ← sum + i
  END FOR
  RETURN sum
ENDSUB

SUB sumIntegersMethod2(n)
  sum ← n * (n + 1) / 2
  RETURN sum
ENDSUB

The first algorithm performs one operation (sum ← 0) outside the loop and n operations inside the FOR loop, a total of n + 1 operations. As n increases, the extra operation to initialise the sum becomes insignificant: the larger the value of n, the more time the algorithm takes to run. Its order of magnitude, or time complexity, is basically n. The second algorithm, however, takes the same amount of time whatever the value of n. Its time complexity is constant.
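A runnable sketch of both methods, translated into Python (the slides use pseudocode; the Python names are illustrative):

```python
def sum_integers_loop(n):
    """O(n): one addition per value from 1 to n."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_integers_formula(n):
    """O(1): a single arithmetic expression, whatever the value of n."""
    return n * (n + 1) // 2

assert sum_integers_loop(1000) == sum_integers_formula(1000) == 500500
```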
9
Big O notation and constant time, O(1)
Big O notation (with a capital letter O, not a zero) is a notation used in complexity theory, computer science, and mathematics to describe the behaviour of functions in relation to the size of the input. Basically, it tells you how fast a function grows or declines.

Constant time – written as O(1)

Sequence of statements:
statement 1;  // basic operation 1
statement 2;  // basic operation 2
...
statement k;

The total time is found by counting the number of basic operations required to achieve a result. As you can see from the sequence above, k statements are executed no matter what the input size n is. If n = 100, the number of basic operations performed is k; if n = 3, the number of basic operations is still k.
10
GetFirstElementNull(elements: array of integers)
Constant time – written as O(1). But beware! We are not interested in the exact number of operations being performed; we are interested in how the number of operations relates to the problem size. So let's look at a better example that takes into account the size of the input (n):

GetFirstElementNull(elements: array of integers)
{
    return elements[0];
}

No matter what the input size, whether the array has 10 elements or 100 elements, this algorithm always executes in the same amount of time.
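The same idea as a minimal Python sketch; the function name mirrors the slide's example but is otherwise illustrative:

```python
def get_first_element(elements):
    """O(1): a single index operation, regardless of len(elements)."""
    return elements[0]

print(get_first_element([15, 6, 10]))              # one operation
print(get_first_element(list(range(1_000_000))))   # still one operation
```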
11
Order of growth for O(1) algorithms
(pronounced "big O of one")
[Graph: time (y-axis) against input size n (x-axis), showing a flat, constant line.]
The time taken to run the algorithm does NOT vary with the input size.
12
Linear time – Written as O(n)
for i in 1 .. N loop
    sequence of statements
end loop;

Look at the above for loop. The loop executes N times, so the sequence of statements also executes N times. If we assume the statements are O(1), the total time for the for loop is N * O(1), which is O(N) overall.
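A minimal Python sketch of the same pattern, with a counter standing in for the "sequence of statements" (illustrative only):

```python
def do_work(n):
    """Executes the loop body n times: O(n)."""
    operations = 0
    for _ in range(n):
        operations += 1   # stands in for the O(1) sequence of statements
    return operations

print(do_work(100))    # 100
print(do_work(1000))   # 1000, growing in direct proportion to n
```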
13
Linear time Written as O(n)
Similarly, if you consider a linear search, the number of items that need to be accessed depends on the size of the array. Remember that we are typically interested in the worst-case complexity.
15 6 10 18 7 3 1 9   n = 8 (size of input). To find the value 9, we need 8 comparisons.
7 3 1 9   n = 4. To find the value 9, we need 4 comparisons.
15 9   n = 2. To find the value 9, we need 2 comparisons.
The number of basic operations required in a linear search is directly proportional to the input size.
14
Order of growth for O(N) algorithms
Linear time
[Graph: time (y-axis) against input size n (x-axis), showing a straight line.]
The time taken to run the algorithm varies with the input size: it is directly proportional.
15
Quadratic time Written as O(n²) Bubblesort
O(N²) represents an algorithm whose performance is directly proportional to the square of the size of the input data set. This is common with algorithms that involve nested iterations over the data set; deeper nested iterations result in O(N³), O(N⁴), and so on.

Bubble sort: if you recall the bubble sort, you saw that the mechanism worked by swapping adjacent pairs and moving larger values toward the right. With 5 values, we need to make 4 comparisons. Why only 4? Well, it would be dumb to compare a value with itself! And so with n values we make n - 1 comparisons the first time we pass through the array.
1st pass: (n - 1) comparisons. Array: 15 6 18 10 20
16
n(n - 1)/2 Let’s use the number of comparisons as a unit of measure.
So in the first pass we made n - 1 comparisons, but we know that the largest value is now furthest to the right, so we only have to make n - 2 comparisons in the second pass:
2nd pass: (n - 1) + (n - 2) comparisons so far. Array: 15 6 10 18 20
If we keep going, we are left with one comparison at the end, so the total number of comparisons needed is:
(n - 1) + (n - 2) + … + 1
To simplify, we can dig out our old GCSE maths book and work out that this series is equivalent to n(n - 1)/2. This might seem a leap of faith, but you can check it for yourself!
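An illustrative Python sketch that counts the comparisons a basic bubble sort makes (no early exit), so the total can be checked against n(n - 1)/2:

```python
def bubble_sort_comparisons(items):
    """Bubble sort that returns the number of adjacent comparisons made."""
    a = list(items)
    comparisons = 0
    for end in range(len(a) - 1, 0, -1):   # passes make n-1, n-2, ..., 1 comparisons
        for i in range(end):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return comparisons

n = 5
print(bubble_sort_comparisons([15, 6, 18, 10, 20]))  # 10
print(n * (n - 1) // 2)                              # 10, matching n(n-1)/2
```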
17
499,999,500,000 comparisons! n(n - 1)/2 is the same as n²/2 - n/2
Let's multiply this out: n(n - 1)/2 is the same as n²/2 - n/2. Let's say we have a million values in an array, rather than 5.
1,000,000²/2 - 1,000,000/2
= 500,000,000,000 - 500,000
= 499,999,500,000
This tells us that sorting an array of 1,000,000 values will take 499,999,500,000 comparisons! What we can extrapolate is that for very large values of n, the bubble sort performs worse and worse. You can see the detrimental effect of the squared term (n²).
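A two-line Python check of that arithmetic:

```python
n = 1_000_000
print(n * (n - 1) // 2)      # 499999500000
print(n**2 // 2 - n // 2)    # 499999500000, the same value
```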
18
To say that, given 1,000,000 values, a bubble sort will take exactly that many comparisons is a generalisation, and it isn't the main lesson to take from this small number-crunching exercise. What is more important is that the n² term dominates in terms of magnitude! Subtracting 500,000 doesn't really affect things, as this value is very small in comparison:
500,000,000,000 - 500,000   (n²/2 - n/2)
As computer scientists, we can ignore these lower-order terms, as their impact on the final number of comparisons (i.e. 499,999,500,000) is very small. Therefore, we can really simplify this and just take into account the term that matters most, i.e. n². Remember, this does not give you an exact number of comparisons, and we really don't care about being exact. Is it 499,999,500,000 comparisons or 500,000,000,000 comparisons? Who cares! What we're interested in is the general order of growth. So we say that bubble sort is in the order of n², or O(n²). Remember, this is the worst-case complexity.
19
Order of growth for O(N²) algorithms
Polynomial time
[Graph: time (y-axis) against input size n (x-axis), showing a curve that rises increasingly steeply.]
The time taken to run the algorithm varies in proportion to the square of the input size.
20
Logarithmic Written as O(log N) using base-2
Remember the binary search? Take a sorted array of 1000 items. How many comparisons will it take to find whether an item exists? Let's say, in the worst case, the item is located at element 2. (First check that the array is sorted!) Consider the following manual and very slow method of tracking the midpoint, (first + last) div 2:
Comparison 1: (1 + 1000) div 2 = 500
Comparison 2: repeat above, (1 + 499) div 2 = 250
Comparison 3: (1 + 249) div 2 = 125
Comparison 4: (1 + 124) div 2 = 62
Comparison 5: (1 + 61) div 2 = 31
Comparison 6: (1 + 30) div 2 = 15
Comparison 7: (1 + 14) div 2 = 7
Comparison 8: (1 + 6) div 2 = 3
Comparison 9: (1 + 2) div 2 = 1
Comparison 10: the only other unchecked element is 2.
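An illustrative iterative binary search in Python, instrumented to count probes of the array (the names and structure are assumptions, not the slides' code); for a 1000-element array it never needs more than 10:

```python
def binary_search(items, target):
    """Return (found, comparisons) for a sorted list."""
    low, high = 0, len(items) - 1
    comparisons = 0
    while low <= high:
        mid = (low + high) // 2
        comparisons += 1              # one probe of the array per loop iteration
        if items[mid] == target:
            return True, comparisons
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return False, comparisons

data = list(range(1, 1001))                           # a sorted array of 1000 items
print(binary_search(data, 2))                         # (True, 10): a worst-case target
print(max(binary_search(data, t)[1] for t in data))   # 10, the worst case overall
```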
21
Logarithmic – Written as O(log N), using base 2
It took us, at worst, 10 comparisons to search an array of 1000 items. Knowing that binary search is O(log N), you could alternatively ask: what power of 2 first reaches 1000, i.e. 2ⁿ ≥ 1000?
2⁹ = 512 (not enough)
2¹⁰ = 1024 (enough)
or, equivalently, about 1 + log₂ n comparisons. So, in the worst case, a maximum of 10 comparisons is needed to search an array of 1000 items. The log N term makes algorithms like binary search extremely efficient when dealing with large data sets.

Iterations for O(N) algorithms    Iterations for O(log N) algorithms
100                               7
1000                              10
1 million                         20
1 billion                         30
1 billion billion                 60
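The right-hand column can be reproduced with a quick sketch using Python's math module, since the worst-case count is roughly ⌈log₂ n⌉:

```python
import math

for n in [100, 1_000, 1_000_000, 1_000_000_000, 10**18]:
    print(f"{n:>22,}  ->  {math.ceil(math.log2(n))} iterations")
# 100 -> 7, 1,000 -> 10, 1,000,000 -> 20, 1,000,000,000 -> 30, 10^18 -> 60
```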
22
Order of growth for O(log N) algorithms
Logarithmic time
[Graph: time (y-axis) against input size n (x-axis), showing a curve that rises quickly at first and then flattens out.]
The time taken to run the algorithm increases (or decreases) in line with a logarithm of the input size.
23
Exponential time – Written as O(2ⁿ) or O(aⁿ)
O(2ⁿ) denotes an algorithm whose growth doubles with each addition to the input data set. The growth curve of an O(2ⁿ) function is exponential: starting off very shallow, then rising dramatically. Its execution time grows exponentially with input size. An example of an O(2ⁿ) function is the recursive calculation of Fibonacci numbers.
Order of growth for O(2ⁿ) algorithms: [Graph: time (y-axis) against input size n (x-axis), showing a curve that rises extremely steeply.]
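The naive recursive Fibonacci mentioned above, as an illustrative Python sketch with a call counter; the number of calls grows exponentially with n:

```python
calls = 0

def fib(n):
    """Naive recursive Fibonacci: roughly O(2^n) calls."""
    global calls
    calls += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

for n in (10, 20, 30):
    calls = 0
    fib(n)
    print(n, calls)   # 10 -> 177, 20 -> 21891, 30 -> 2692537 calls
```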
24
Merge Sort Time Complexity?
By the way, for the merge sort a common mistake is to assume its time complexity is O(log N). This is not true. The "divide and conquer" mechanism it uses does indeed perform in logarithmic time, O(log N), similar to a binary search; in other words, the data set is divided O(log N) times. However, remember that afterwards the sub-lists covering all N items must be merged back together, so the time complexity has to be multiplied by a factor of N. So N × log N is a truer reflection of its time complexity. Merge sort time complexity is O(N log N).
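For reference, a compact merge sort sketch in Python (illustrative, not the slides' code): each of the O(log N) levels of division merges all N items back together.

```python
def merge_sort(items):
    """Sort a list in O(N log N) time using divide and conquer."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])    # divide: O(log N) levels of splitting
    right = merge_sort(items[mid:])
    merged = []                       # conquer: merge touches every item once per level
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([15, 6, 10, 18, 7, 3, 1, 9]))  # [1, 3, 6, 7, 9, 10, 15, 18]
```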
25
Summary :) The value of Big O notation is that you can find the most efficient solution or algorithm for your problem. Remember that Big O uses an upper bound (worst case), so it is a good measure of how scalable your solution is. As a broad rule of thumb:
O(1) algorithms scale the best, as they never take any longer to run (constant time).
O(log N) algorithms are the next most efficient (logarithmic time).
O(N) algorithms are the next most efficient (linear time).
O(N²) is considered to be the point beyond which algorithms start to become unsolvable within an acceptable time frame (polynomial time).
O(2ⁿ) algorithms are the least efficient and are considered intractable (exponential time).