Presentation is loading. Please wait.

Presentation is loading. Please wait.

Efficiency & SORTING CITS1001.

Similar presentations


Presentation on theme: "Efficiency & SORTING CITS1001."— Presentation transcript:

1 Efficiency & SORTING CITS1001

2 Listen to the sound of sorting
Various algorithms Quicksort Or Google for “sound of sorting”

3 Scope of this lecture Linear Search Sorting algorithms and efficiency
References: Wirth, Algorithms + Data Structures = Programs, Chapter 2 Knuth, The Art of Computer Programming, Volume 3, Sorting and Searching This lecture is based on powerpoint slides originally by Gordon Royle, UWA

4 Why study sorting algorithms?
NOT so you can reproduce them in your Java applications If you want to sort a collection of objects in Java, use the Collections library A list l may be sorted by calling Collections.sort(l); See for a tutorial

5 So, why study sorting algorithms?
Consider sorting as an introduction to algorithmic thinking “Algorithms and data structures are the basics of CS and engineering. If you learn them your thinking process improves. Your coding style improves. If you read good code (and understand) you become a better coder. Where do you find better code than those few lines of precise, elegantly crafter, peer reviewed by millions of people? And sorting is the foundation of many other things.” Maruf Maniruzzaman, Software Engineer at Microsoft

6 Searching Searching refers to the process of finding data items that match certain criteria We may just want a yes/no answer to a question, or we may want additional details as well Find out whether any students got a mark of 49 Find out which students got a mark of 49 The simplest searching technique is called linear search, which involves looking through each element in turn until we find one that matches the criteria

7 Our favourite student class
public class Student { private String studentID; private int mark; public Student(String studentID, int mark) { this.studentID = studentID; this.mark = mark; } public String getStudentID() { return studentID; public int getMark() { return mark; A skeleton version of a possible Student class in a student records system

8 A collection of students
We consider a class list being stored as an ArrayList The question we consider is how to retrieve the data for a student with a given student number So we will write a method with the following signature public Student findStudent(ArrayList<Student> classlist, String id) The method returns a (reference to a) Student object The student ID we want is the other parameter The arraylist of students is a parameter

9 Linear search public Student findStudent(ArrayList<Student> classlist, String id) { for (Student s : classlist ) if (s.getStudentID().equals(id)) return s; return null; }

10 Comments If the arraylist does contain the desired value, the method returns the object as soon as it is found If the arraylist does not contain the desired value, the method returns null after checking every element without success We have shown the general situation of finding an object in a collection of objects

11 Performance of linear search
How fast does linear search work on an collection of n items? We can identify three situations Best case, when the input is the most convenient possible Worst case, when the input is the least convenient possible Average case, averaged over all the inputs In the best case, linear search finds the item at the first position of the array, so it needs 1 comparison In the worst case, linear search does not find the item, so it performs n comparisons unsuccessfully To calculate the average case performance, we would need some problem-specific assumptions about the input data

12 Linear search is too slow
If we have very large amounts of data, then linear search is not feasible For example, we can view a telephone directory as a very large array of objects, with each object consisting of a name and a number If you are asked to find out which person has phone number , how long would it take you to do this by linear search? However, if I ask you to find out the phone number of a specific person, then you can do it much, much faster How do you do it? How can we program a computer to do this?

13 Sorted collections The reason that is quick, while
is slow, is because the collection (i.e. the phone book) is sorted into alphabetical order, and somehow this allows us to find an entry much more quickly (we will see why later) Most useful databases are sorted – dictionaries, indexes, etc. Name Phone number Phone number Name

14 Sorting Before we examine how to efficiently search in a sorted collection, we consider how to sort the collection We again start with the “plain vanilla” example – sorting an array of integers into increasing order Later we will extend this to sorting arrays of objects according to various other criteria (alphabetical, etc.) before 6 8 1 15 12 2 7 4 after 1 2 4 6 7 8 12 15

15 The basic set up We will implement a number of sorting methods, all of which operate on an array of integers We will develop these as a utility class called Sorter – a class with no instance variables, but just static methods (cf. Math) Each method will have a similar signature, where the only thing that will vary is the name of the sorting technique public static void nameSort(int[] a) Each method receives an array as a parameter, and will sort that array “in place” i.e. the method returns nothing, but a gets updated

16 bubbleSort The idea behind bubbleSort is to systematically compare pairs of elements, exchanging them if they are out of order If the array contains n elements, then we view the algorithm as consisting of n–1 “passes” In the first pass we compare Element 0 with element 1, exchange if necessary Element 1 with element 2, exchange if necessary Element n-2 with element n-1, exchange if necessary

17 The first pass After the first pass, the largest element will be at the end 6 8 1 15 12 2 7 4 6 8 1 15 12 2 7 4 6 1 8 15 12 2 7 4 6 1 8 15 12 2 7 4 6 1 8 12 15 2 7 4 6 1 8 12 2 15 7 4 6 1 8 12 2 7 15 4 6 1 8 12 2 7 4 15

18 The second pass The second pass doesn’t need to make the last comparison 6 1 8 12 2 7 4 15 1 6 8 12 2 7 4 15 1 6 8 12 2 7 4 15 1 6 8 12 2 7 4 15 1 6 8 2 12 7 4 15 1 6 8 2 7 12 4 15 1 6 8 2 7 4 12 15

19 The third pass The third pass can omit the last two comparisons 1 6 8
2 7 4 12 15 1 6 8 2 7 4 12 15 1 6 8 2 7 4 12 15 1 6 2 8 7 4 12 15 1 6 2 7 8 4 12 15 1 6 2 7 4 8 12 15

20 The fourth pass The fourth pass is even shorter 1 6 2 7 4 8 12 15 1 6

21 The fifth and sixth passes
1 2 6 4 7 8 12 15 1 2 6 4 7 8 12 15 1 2 6 4 7 8 12 15 1 2 4 6 7 8 12 15 1 2 4 6 7 8 12 15 1 2 4 6 7 8 12 15 1 2 4 6 7 8 12 15

22 Why does it work? We need to have some argument or “proof” that this works We claim that This is true after the first pass, because the largest element in the array is encountered at some stage and then “swapped all the way to the end” of the array The same argument – applied to the remainder of the array – shows that the second pass puts the second largest element into place; repeating this argument n times gives the result After i passes, the largest i elements in the array are in their correct positions

23 Coding bubblesort public static void bubbleSort(int[] a) {
for (int pass = 1; pass < a.length; pass++) for (int j = 0; j < a.length-pass; j++) if (a[j] > a[j+1]) swap(a, j, j+1); }

24 Sorting students public static void bubbleSort(Student[] a) { for (int pass = 1; pass < a.length; pass++) for (int j = 0; j < a.length-pass; j++) if (/* a[j] and a[j+1] out of order */) swap(a, j, j+1); } Almost identical code, except that we need to get the right boolean condition to check when two students are in the “wrong order”

25 What order do we want? The precise form of the statement depends on whether we want to sort students: Alphabetically according to their studentId Numerically according to their mark In addition, the desired sort could be ascending (smaller values first) or descending (smaller values last) Suppose that we want to sort the students into normal (ascending) alphabetical order by studentId

26 For alphabetic order The comparison between the two Student objects a[j] and a[j+1] first needs to obtain the two ids to compare, so it will involve the two Strings String s1 = a[j].getStudentID(); String s2 = a[j+1].getStudentID(); To compare two Strings we use the compareTo method if (s1.compareTo(s2) > 0) { // Swap the two Students }

27 Selection sort When sorting n items, Selection Sort works as follows
The procedure has n–1 stages Select the smallest element in the array, and swap it with the element in position 0 Then select the smallest element in the array starting from position 1, and swap it with the element in position 1 Then select the smallest element in the array starting from position 2, and swap it with the element in position 2 Etc. This algorithm has the following properties After i stages, the first i items in the array are the i smallest items, in order At the (i+1)th stage, the (i+1)th smallest item is placed in the (i+1)th slot in the array

28 Coding selectionSort public static void selectionSort(int[] a) {
for (int pass = 0; pass < a.length – 1; pass++) { int smallest = pass; for (int j = pass + 1; j < a.length; j++) if (a[j] < a[smallest]) smallest = j; swap(a, smallest, pass); }

29 Insertion Sort (like sorting cards)
In card games, it is common to pick up your cards as they are dealt, and to sort them into order as they arrive For example, suppose your first three cards are Next you pick up a 9 of clubs

30 Inserting a card The new card is then inserted into the correct position

31 Insertion sort We can develop this idea into an algorithm called Insertion Sort When sorting n items, Insertion Sort works as follows The procedure has n–1 stages Compare the second item in the array with the first item; make them ordered Compare the third item in the array with the first two items; make them ordered Etc. This algorithm has the following properties After i stages, the first i+1 items are sorted although they aren’t the smallest At the (i+1)th stage, the item originally in position i+2 is placed in its correct position relative to the first i+1 items

32 Example Initial array Stage 0: Move the first element into position (do nothing) Stage 1: Examine the second element and insert it into position (again do nothing) 6 8 1 15 12 2 7 4 6 8 1 15 12 2 7 4 6 8 1 15 12 2 7 4

33 Stage 2 This element is out of position, so it will have to be inserted 6 8 1 15 12 2 7 4 6 8 15 12 2 7 4 1 6 8 15 12 2 7 4 1 6 8 15 12 2 7 4 1 1 6 8 15 12 2 7 4

34 Stages 3 & 4 Stage 3 1 6 8 15 12 2 7 4 Stage 4 1 6 8 15 12 2 7 4 1 6 8 15 2 7 4 12 1 6 8 15 2 7 4 12 1 6 8 12 15 2 7 4

35 Stage 5 1 6 8 12 15 2 7 4 1 6 8 12 15 7 4 2 1 6 8 12 15 7 4 2 1 6 8 12 15 7 4 2 1 6 8 12 15 7 4 2 1 6 8 12 15 7 4 2 1 2 6 8 12 15 7 4

36 Stage 6 1 2 6 8 12 15 7 4 1 2 6 8 12 15 4 7 1 2 6 8 12 15 4 7 1 2 6 8 12 15 4 7 1 2 6 8 12 15 4 7 1 2 6 7 8 12 15 4

37 Final stage 1 2 6 7 8 12 15 4 1 2 6 7 8 12 15 4 1 2 6 7 8 12 15 4 1 2 6 7 8 12 15 4 1 2 6 7 8 12 15 4 1 2 6 7 8 12 15 4 1 2 6 7 8 12 15 4 1 2 4 6 7 8 12 15

38 Code for insertionSort
public static void insertionSort(int[] a) { for (int pass = 1; pass < a.length; pass++) { int tmp = a[pass]; // new element to insert int pos = pass - 1; // move out-of-order elements up to make space while (pos >= 0 && a[pos] > tmp) { a[pos+1] = a[pos]; pos--; } // insert the new element in the right place a[pos+1] = tmp;

39 Code dissection public static void insertionSort(int[] a) { for (int pass = 1; pass < a.length; pass++) { int tmp = a[pass]; int pos = pass-1; while (pos >= 0 && a[pos] > tmp) { a[pos+1] = a[pos]; pos--; } a[pos+1] = tmp; The body of the for-loop contains the code for one stage or “pass” of the algorithm

40 Code dissection public static void insertionSort(int[] a) { for (int pass=1;pass<a.length; pass++) { int tmp = a[pass]; int pos = pass-1; while (pos >= 0 && a[pos] > tmp) { a[pos+1] = a[pos]; pos--; } a[pos+1] = tmp; The variable tmp stores the value that is to be inserted; the variable pos will eventually indicate the position where it should be inserted

41 Code dissection public static void insertionSort(int[] a) { for (int pass=1;pass<a.length; pass++) { int tmp = a[pass]; int pos = pass-1; while (pos >= 0 && a[pos] > tmp) { a[pos+1] = a[pos]; pos--; } a[pos+1] = tmp; This code does the work of shifting each element in turn one space along if it is bigger than the value to be inserted. We also need to ensure that we don’t fall off the left-hand end of the array!

42 Code dissection public static void insertionSort(int[] a) { for (int pass=1;pass<a.length; pass++) { int tmp = a[pass]; int pos = pass-1; while (pos >= 0 && a[pos] > tmp) { a[pos+1] = a[pos]; pos--; } a[pos+1] = tmp; The while loop finishes when we have found the correct position for a[pass], so it is now inserted into this position

43 Code dissection public static void insertionSort(int[] a) { for (int pass=1;pass<a.length; pass++) { int tmp = a[pass]; int pos = pass-1; while (pos >= 0 && a[pos] > tmp) { a[pos+1] = a[pos]; pos--; } a[pos+1] = tmp; Note that if a[pass] is already in the correct spot, the while loop does nothing and a[pass] goes back into the same place

44 Efficiency & SORTING II
CITS1001

45 Scope of this lecture Quicksort and mergesort Performance comparison
Binary search

46 Recursive sorting What if we take a completely different approach?
All of the algorithms so far build up the “sorted part” of the array one element at a time What if we take a completely different approach? Faster algorithms split the elements to be sorted into groups, sort the groups separately, then combine the results There are two principal approaches “Intelligent” splitting and “simple” combining Simple splitting and intelligent combining These are divide-and-conquer algorithms

47 Quicksort When sorting n items, Quick Sort works as follows
Choose one of the items p to be the pivot Partition the items into L (items smaller than p) and U (items larger than p) L’ = sort(L) U’ = sort(U) The sorted array is then L’ + p + U’ , in that order Intelligent splitting, and simple combining

48 Behaviour of quicksort
6 8 1 7 12 9 2 15 Items smaller than the pivot Items larger than the pivot 6 1 2 8 15 12 9 Sort Choose a pivot (7) Sort 1 2 6 8 9 12 15 Append 1 2 6 7 8 9 12 15

49 Second level 12 9 8 15 Items smaller than the pivot
Items larger than the pivot 8 15 12 Choose a pivot (9) Sort 12 15 Append 8 9 12 15

50 Code for quickSort What if low == high?
public static void quickSort(int[] a) { qsort(a, 0, a.length – 1); } // sort a[l..u] inclusive private static void qsort(int[] a, int low, int high) { if (low < high) { int p = partition(a, low, high); qsort(a, low, p – 1); qsort(a, p + 1, high); If low == high we are done – nothing left to sort

51 Code for partition private static int partition(int[] a, int low, int high) { // this code always uses the last element a[high] as the pivot int si = low; for (int i = low; i < high; i++) { if (a[i] <= a[high]) { // swap small elements to the front swap(a, i, si++); } //swap the pivot to be between smalls and larges swap(a, si, u); return si;

52 Behaviour of partition
6 8 1 15 12 2 9 7 si a[u] 6 8 1 15 12 2 9 7 si a[u] a[0] < a[u], so a[0] ↔ a[si] and si++ 6 1 8 15 12 2 9 7 si a[u] a[2] < a[u], so a[2] ↔ a[si] and si++ 6 1 2 15 12 8 9 7 si a[u] a[5] < a[u], so a[5] ↔ a[si] and si++ 6 1 2 7 12 8 9 15 si a[u] a[7] ↔ a[si], return si

53 Or … partition around middle element
private static void qsortM(int[] a, int l, int u) { int pivotValue = a[(l+u)/2]; //pivot is middle of the array int i=l; //process low partition from the left int j=u; //process high partition from the right while (i<=j) { //until all elements are in position while (a[i] < pivotValue) { i++; } //skip over in place elements while (a[j] > pivotValue) { j--; } //skip over in place elements if (i<=j) { swap(a,i++,j--); //swap out of place elements } if (l<j) { qsort(a,l,j); } // qsort lower half if (i<u) { qsort(a,i,u); } // qsort upper half

54 Mergesort When sorting n items, Merge Sort works as follows
Let F be the front half of the array, and B be the back half F’ = sort(F) B’ = sort(B) Merge F’ and B’ to get the sorted list – repeatedly compare their first elements and take the smaller one Simple splitting, and intelligent combining

55 Behaviour of mergesort
6 8 1 15 12 2 9 7 Front half Back half 6 8 1 15 12 2 9 7 Sort Sort 1 6 8 15 2 7 9 12 Merge 1 2 6 7 8 9 12 15

56 Second level 12 2 9 7 Front half Back half 12 2 9 7 Sort Sort 2 12 7 9
Merge 2 7 9 12

57 Code for mergeSort Again, if l == u, there is only one element: no sorting is needed public static void mergeSort(int[] a){ msort(a, 0, a.length - 1); } // sort a[l..u] inclusive private static void msort(int[] a, int l, int u){ if (l < u) {int m = (l + u) / 2; msort(a, l, m); msort(a, m + 1, u); merge(a, l, m, u);}

58 Code for merge // merge a[l..m] with a[m+1..u]
private static void merge(int[] a, int l, int m, int u) { while (l <= m && a[l] <= a[m + 1]) l++; // small elements on the 1st list needn't be moved if (l <= m) // if the 1st list is exhausted, we're done while (u >= m + 1 && a[u] >= a[m]) u--; // large elements on the 2nd list needn't be moved int start = l; // record the start and finish points of the 1st list int finish = m++; int[] b = new int[u - l + 1]; // this is where we will put the sorted list int z = 0; while (m <= u) // while the 2nd list is alive, copy the smallest element to b if (a[l] <= a[m]) b[z++] = a[l++]; else b[z++] = a[m++]; while (z < b.length) b[z++] = a[l++]; // copy the rest of the 1st list for (int i = 0; i < b.length; i++) a[start + i] = b[i]; // copy the sorted list back from b }

59 Efficiency experiment
Is there any difference between the performance of all these sorting algorithms? After all they all achieve the same result… Which one(s) are more efficient? Why? Experiment: use the provided Sorter class to estimate the execution time of each algorithm for sorting a large, disordered array Graph your results

60 Performance Comparison
Algorithm Time to sort (ms) 1,000 items 10,000 items 100,000 items 1,000,000 items Bubble 1 192 19,295 1,822,273 Selection 47 4,220 396,780 Insertion 21 954 111,746 Merge 2 13 166 Quick (E) 7 97 Quick (M) 8 88 2017 data. Java 8 iMAC 3.2 GHz Intel Core i5 16 GB 1867 MHz DDR3

61 Analysis Why are quicksort and mergesort so much faster?
The first three algorithms all reduce the number of items to be sorted by one in each pass And each pass takes linear time Therefore their overall run-time is n2, i.e. quadratic Multiplying the number of items by 10 multiplies run-time by 102 = 100 Quicksort and mergesort reduce the number of items by half at each level And each level takes linear time Therefore their overall run-time is nlog2n Multiplying the number of items by 10 multiplies run-time by 10 and a bit

62 A note on the accuracy of such tests
Assessing the execution time of Java code this way is not completely accurate You will not always get the same results Activities such as garbage collection may affect the times Or just if your computer is running other applications concurrently We “average out” unrepresentative examples by using Random data Multiple runs

63 (Finally) back to searching
Linear search was inefficient for large datasets But if we can assume that our data is sorted, we can use a much more efficient approach reminiscent of quickSort and mergeSort When looking for an item x, compare x with the middle item if x equals the middle item, we’re done if x is bigger than the middle item, search the top half of the data if x is smaller than the middle item, search the bottom half of the data In each step, we delete half of the data from consideration!

64 Code for binary search public static int binarySearch(int[] a, int x)
{ int l = 0; int u = a.length - 1; // we know that x can only be in the interval l…u while (l <= u) // have we failed yet? int m = (l + u) / 2; // m is the middle index if (a[m] == x) return m; else if (a[m] < x) l = m + 1; // restrict the search to m+1…u // a[m] > x u = m - 1; // restrict the search to l…m-1 } return -1;

65 Summary We study sorting algorithms because they provide good examples of many of the features that affect the run-time of program code. When checking the efficiency of your own code, consider Number of loops, and depth of nesting Number of comparison operations Number of swap (or similar) operations


Download ppt "Efficiency & SORTING CITS1001."

Similar presentations


Ads by Google