Copyright © 2013 by John Wiley & Sons. All rights reserved. SORTING AND SEARCHING CHAPTER Slides by Rick Giles 14
Chapter Goals To study several sorting and searching algorithms To appreciate that algorithms for the same task can differ widely in performance To understand the big-oh notation To estimate and compare the performance of algorithms To write code to measure the running time of a program Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 2
Contents Selection Sort Profiling the Selection Sort Algorithm Analyzing the Performance of the Selection Sort Algorithm Merge Sort Analyzing the Merge Sort Algorithm Searching Problem Solving: Estimating the Running Time of an Algorithm Sorting and Searching in the Java Library Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 3
14.1 Selection Sort Sorts an array by repeatedly finding the smallest element of the unsorted tail region and moving it to the front Example: sorting an array of integers Copyright © 2013 by John Wiley & Sons. All rights reserved. Page
Sorting an Array of Integers Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 5 Find the smallest and swap it with the first element Find the next smallest. It is already in the correct place Find the next smallest and swap it with first element of unsorted portion Repeat When the unsorted portion is of length 1, we are done
SelectionSorter.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 6 Continued
SelectionSorter.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 7
SelectionSortDemo.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 8
ArrayUtil.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 9 Continued
ArrayUtil.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 10
SelectionSortDemo.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 11
14.2 Profiling the Selection Sort Algorithm Want to measure the time the algorithm takes to execute Exclude the time the program takes to load Exclude output time Create a StopWatch class to measure execution time of an algorithm It can start, stop and give elapsed time Use System.currentTimeMillis method Create a StopWatch object Start the stopwatch just before the sort Stop the stopwatch just after the sort Read the elapsed time Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 12
StopWatch.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 13 Continued
StopWatch.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 14 Continued
StopWatch.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 15 Continued
SelectionSortTimer.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 16 Continued
SelectionSortTimer.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 17
Selection Sort on Various Array Sizes* Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 18 nMilliseconds 10, ,0002,148 30,0004,796 40,0009,192 50,00013,321 60,00019,299 * Obtained with a 2GHz Pentium processor, Java 6, Linux
Time Taken by Selection Sort Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 19
14.3 Analyzing the Performance of the Selection Sort Algorithm (1) In an array of size n, count how many times an array element is visited To find the smallest, visit n elements + 2 visits for the swap To find the next smallest, visit (n - 1) elements + 2 visits for the swap The last term is 2 elements visited to find the smallest + 2 visits for the swap Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 20
14.3 Analyzing the Performance of the Selection Sort Algorithm (2) The number of visits: n (n - 1) (n - 2) This can be simplified to n 2 /2 + 5n/2 - 3 5n/2 - 3 is small compared to n 2 /2 — so ignore it Also ignore the 1/2 — it cancels out when comparing ratios Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 21
14.3 Analyzing the Performance of the Selection Sort Algorithm (3) The number of visits is of the order n 2 Using big-Oh notation: The number of visits is O(n 2 ) Multiplying the number of elements in an array by 2 multiplies the processing time by 4 Big-Oh notation “f(n) = O(g(n))” expresses that f grows no faster than g To convert to big-Oh notation: Locate fastest-growing term, and ignore constant coefficient Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 22
14.4 Merge Sort Sorts an array by Cutting the array in half Recursively sorting each half Merging the sorted halves Much more efficient than selection sort Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 23
Merge Sort Example Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 24 Divide an array in half and sort each half Merger the two sorted arrays into a single sorted array
sort Method Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 25
MergeSorter.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 26 Continued
MergeSorter.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 27 Continued
MergeSorter.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 28 Continued
MergeSorter.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 29
MergeSortDemo.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 30
14.5 Analyzing the Merge Sort Algorithm (1) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 31 nMerge Sort (milliseconds)Selection Sort (milliseconds) 10, ,000732,148 30, ,796 40, ,192 50, ,321 60, ,299
Analyzing the Merge Sort Algorithm (2) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 32
Analyzing the Merge Sort Algorithm (3) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 33 In an array of size n, count how many times an array element is visited Assume n is a power of 2: n = 2 m Calculate the number of visits to create the two sub-arrays and then merge the two sorted arrays 3 visits to merge each element or 3n visits 2n visits to create the two sub-arrays total of 5n visits
Analyzing the Merge Sort Algorithm (4) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 34 Let T(n) denote the number of visits to sort an array of n elements then T(n) = T(n/2) + T(n/2) + 5n or T(n) = 2T(n/2) + 5n The visits for an array of size n/2 is: T(n/2) = 2T(n/4) + 5n/2 So T(n) = 2 × 2T(n/4) +5n + 5n The visits for an array of size n/4 is: T(n/4) = 2T(n/8) + 5n/4 So T(n) = 2 × 2 × 2T(n/8) + 5n + 5n + 5n
Analyzing the Merge Sort Algorithm (5) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 35 Repeating the process k times: T(n) = 2 k T(n/2 k ) +5nk Since n = 2 m, when k=m: T(n) = 2 m T(n/2 m ) +5nm T(n) = nT(1) +5nm T(n) = n + 5nlog 2 (n)
Analyzing the Merge Sort Algorithm (6) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 36 To establish growth order Drop the lower-order term n Drop the constant factor 5 Drop the base of the logarithm since all logarithms are related by a constant factor We are left with n log(n) Using big-Oh notation: Number of visits is O(nlog(n))
Merge Sort vs Selection Sort Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 37 Selection sort is an O(n 2 ) algorithm Merge sort is an O(nlog(n)) algorithm The nlog(n) function grows much more slowly than n 2
14.6 Searching Linear search: Examines all values in an array until it finds a match or reaches the end Also called sequential search Number of visits for a linear search of an array of n elements: The average search visits n/2 elements The maximum visits is n A linear search locates a value in an array in O(n) steps Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 38
LinearSearcher.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 39
LinearSearchDemo.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 40 Continued
LinearSearchDemo.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 41
Binary Search (1) Locates a value in a sorted array by Determining whether the value occurs in the first or second half Then repeating the search in one of the halves Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 42
Binary Search (2) To search 15: 15 ≠ 17: We don’t have a match Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 43
BinarySearcher.java Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 44 Continued
BinarySearcher.java (cont.) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 45
Analyzing Binary Search (1) Count the number of visits to search a sorted array of size n We visit one element (the middle element) then search either the left or right subarray Thus: T(n) = T(n/2) + 1 Using the same equation, T(n/2) = T(n/4) + 1 Substituting into the original equation: T(n) = T(n/4) + 2 This generalizes to: T(n) = T(n/2 k ) + k Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 46
Analyzing Binary Search (2) Assume n is a power of 2, n = 2m, where m = log 2 (n) Then: T(n) = 1 + log 2 (n) Binary search is an O(log(n)) algorithm Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 47
14.7 Problem Solving: Estimating the Running Time of an Algorithm Differentiating between O(nlog(n)) and O(n 2 ) has great practical implications Being able to estimate running times of other algorithms is an important skill Will practice estimating the running times of array algorithms Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 48
Linear Time (1) An algorithm to count how many elements of an array have a particular value: int count = 0; for (int i = 0; i < a.length; i++) { if (a[i] == value) { count++; } } Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 49
Linear Time (2) To visualize the pattern of array element visits: Imagine the array as a sequence of light bulbs As the i th element gets visited, imagine the i th light bulb lighting up When each visit involves a fixed number of actions, the running time is n times a constant, or O(n) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 50
Linear Time (3) When you don’t always run to the end of the array; e.g.: boolean found = false; for (int i = 0; !found && i < a.length; i++) { if (a[i] == value) { found = true; } } Loop can stop in middle: Still O(n), because in some cases the match may be at the very end of the array Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 51
Quadratic Time (1) What if we do a lot of work with each visit? Example: find the most frequent element in an array Obvious for small array For larger array? Put counts in a second array of same length Take the maximum of the counts, 3, look up where the 3 occurs in the counts, and find the corresponding value, 7 Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 52
Quadratic Time (2) First estimate how long it takes to compute the counts for (int i = 0; i < a.length; i++) { counts[i] = Count how often a[i] occurs in a } Still visit each element once, but work per visit is much larger Each counting action is O(n) When we do O(n) work in each step, total running time is O(n 2 ) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 53
Quadratic Time (3) Algorithm has three phases: 1.Compute all counts 2.Compute the maximum 3.Find the maximum in the counts First phase is O(n 2 ), second and third are each O(n) The big-Oh running time for doing several steps in a row is the largest of the big-Oh times for each step Thus algorithm for finding the most frequent element is O(n 2 ) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 54
Logarithmic Time (1) An algorithm that cuts the size of work in half in each step runs in O(log(n)) time E.g. binary search, merge sort To improve our algorithm for finding the most frequent element, first sort the array: Sorting costs O(n log(n)) time If we can complete the algorithm in O(n) time, have a better algorithm that the pervious O(n 2 ) algorithm Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 55
Logarithmic Time (2) While traversing the sorted array: As long as you find a value that is equal to its predecessor, increment a counter When you find a different value, save the counter and start counting anew Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 56
Logarithmic Time (3) int count = 0; for (int i = 0; i < a.length; i++) { count++; if (i == a.length - 1 || a[i] != a[i + 1]) { counts[i] = count; count = 0; } Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 57
Logarithmic Time (4) Constant amount of work per iteration, even though it visits two elements: 2n is still O(n) Entire algorithm is now O(log(n)) Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 58
14.8 Sorting and Searching in the Java Library When you write Java programs, you don’t have to implement your own sorting algorithms Arrays and Collections classes provide sorting and searching methods Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 59
Sorting The Arrays class contains static sort methods To sort an array of integers: int[] a =... ; Arrays.sort(a); That sort method uses the Quicksort algorithm (see Special Topic 14.3) To sort an array of Arrays List use Collections.sort method: ArrayList names =... ; Collections.sort(names); That sort method uses the merge sort algorithm Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 60
Binary Search Arrays and Collections classes contains static binarySearch methods These methods implement the binary search algorithm, with a useful enhancement: If the value is not found in the array, return -k -1, where k is the position before which the element should be inserted E.g. int[] a = { 1, 4, 9 }; int v = 7; int pos = Arrays.binarySearch(a, v); // Returns –3; v should be inserted before position 2 Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 61
Comparing Objects (1) Arrays and ArrayLists classes provide sort and binarySearch methods for objects These methods cannot know how to compare arbitrary objects Methods require that the objects belong to a class that implements the Comparable interface with a single method: public interface Comparable { int compareTo(Object otherObject); } These methods cannot know how to compare arbitrary objects Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 62
Comparing Objects (2) Call a.compareTo(b) must return a negative number if a should come before b, 0 if a and b are the same, and a positive number otherwise Several classes in the standard Java Library implement the Comparable interface E.g. String, Date Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 63
Comparing Objects (3) You can implement the Comparable interface for your classes as well E.g. to sort a collection of countries by area, class Country can implement this interface and supply a compareTo method: Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 64
Comparing Objects (3) You can implement the Comparable interface for your classes as well E.g. to sort a collection of countries by area, class Country can implement this interface and supply a compareTo method: public class Country implements Comparable { public int compareTo(Object otherObject) { Country other = (Country) otherObject; if (area < other.area) { return -1; } else if (area == other.area) { return 0; } else { return 1; } } Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 65
Summary: Selection Sort Algorithm The selection sort algorithm sorts an array by repeatedly finding the smallest element of the unsorted tail region and moving it to the front. Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 66
Summary: Measuring the Running Time of an Algorithm To measure the running time of a method, get the current time immediately before and after the method call. Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 67
Summary: Big-Oh Notation Computer scientists use the big-Oh notation to describe the growth rate of a function. Selection sort is an O(n 2 ) algorithm. Doubling the data set means a fourfold increase in processing time. Insertion sort is an O(n 2 ) algorithm. Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 68
Summary: Contrasting Running Times of Merge Sort and Selection Sort Algorithms Merge sort is an O(n log(n)) algorithm. The n log(n) function grows much more slowly than n 2. Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 69
Summary: Running Times of Linear Search and Binary Search Algorithms A linear search examines all values in an array until it finds a match or reaches the end. A linear search locates a value in an array in O(n) steps. A binary search locates a value in a sorted array by determining whether the value occurs in the first or second half, then repeating the search in one of the halves. A binary search locates a value in a sorted array in O(log(n)) steps. Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 70
Summary: Practice Developing Big-Oh Estimates of Algorithms A loop with n iterations has O(n) running time if each step consists of a fixed number of actions. A loop with n iterations has O(n 2 ) running time if each step takes O(n) time. The big-Oh running time for doing several steps in a row is the largest of the big-Oh times for each step. A loop with n iterations has O(n 2 ) running time if the i th step takes O(i) time. An algorithm that cuts the size of work in half in each step runs in O(log(n)) time. Copyright © vby John Wiley & Sons. All rights reserved. Page 71
Summary: Use the Java Library Methods for Sorting and Searching Data The Arrays class implements a sorting method that you should use for your Java programs. The Collections class contains a sort method that can sort array lists. The sort methods of the Arrays and Collections classes sort objects of classes that implement the Comparable interface. Copyright © 2013 by John Wiley & Sons. All rights reserved. Page 72