Sort & Search Algorithms
Definition
Sorting is the process of taking a list of objects that can be placed in a linear order (a_0, a_1, ..., a_{n-1}), e.g., numbers, and returning a reordering (a'_0, a'_1, ..., a'_{n-1}) such that a'_0 ≤ a'_1 ≤ ... ≤ a'_{n-1}.
Equivalently, sorting converts an Abstract List into an Abstract Sorted List.
Arrays are used for both input and output.
Classifications
Sorting algorithms are classified by the main action they perform:
- Insertion
- Exchanging
- Selection
- Merging
- Distribution
Run-time
The run time of a sorting algorithm usually falls into one of two categories:
- Θ(n ln(n)) (faster)
- O(n^2) (traditional)
Each algorithm has average- and worst-case scenarios, and the run time may change significantly from one scenario to the other.
Sub-optimal Sorting Algorithms
Bogosort
Algorithm:
1. Randomly order the objects, and
2. Check if they're sorted; if not, go back to Step 1.
Run-time analysis:
- average: Θ(n·n!), since there are n! permutations and each attempt costs Θ(n)
- worst: unbounded
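The two bogosort steps can be sketched in Java as follows. This is only an illustration: the class and method names (BogosortSketch, shuffle, isSorted, bogosort) are hypothetical, and the Fisher-Yates shuffle is one assumed way to "randomly order the objects".

```java
import java.util.Arrays;
import java.util.Random;

final class BogosortSketch {
    // Returns true if a[0] <= a[1] <= ... <= a[n - 1].
    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }

    // Fisher-Yates shuffle (an assumed choice): each permutation is equally likely.
    static void shuffle(int[] a, Random rng) {
        for (int i = a.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
    }

    // Step 1: randomly order. Step 2: check; if not sorted, go back to Step 1.
    // Returns how many shuffles were needed.
    static long bogosort(int[] a, Random rng) {
        long shuffles = 0;
        do {
            shuffle(a, rng);
            shuffles++;
        } while (!isSorted(a));
        return shuffles;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 2};
        long shuffles = bogosort(data, new Random());
        System.out.println(Arrays.toString(data) + " after " + shuffles + " shuffles");
    }
}
```

Only try this on very small arrays: with 4 distinct entries there are 24 permutations, but with 12 there are already 479,001,600.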
Insertion Sort
Consider the following observations:
- A list with one element is sorted.
- In general, if we have a sorted list of k items, we can insert a new item to create a sorted list of size k + 1.
For example, consider a sorted array containing eight entries, and suppose we want to insert 14 into it, leaving the resulting array sorted.
Background
Starting at the back, if an entry is greater than 14, copy it one position to the right.
Once an entry that is not greater than 14 is found (or the front of the array is reached), insert 14 into the resulting vacancy.
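The copy-to-the-right step can be sketched as follows. The eight array values are made up for illustration (the original figure is not reproduced here), and the names SortedInsertSketch and insertSorted are hypothetical.

```java
import java.util.Arrays;

final class SortedInsertSketch {
    // Inserts value into the sorted prefix a[0..size-1].
    // Assumes a.length > size, so there is room for one more entry.
    static void insertSorted(int[] a, int size, int value) {
        int j = size;
        // Starting at the back, copy every larger entry one place to the right.
        while (j > 0 && a[j - 1] > value) {
            a[j] = a[j - 1];
            j--;
        }
        // a[j] is now the vacancy where the new value belongs.
        a[j] = value;
    }

    public static void main(String[] args) {
        // Eight sorted entries (illustrative values) plus one free slot at the end.
        int[] a = {5, 7, 12, 19, 21, 26, 33, 40, 0};
        insertSorted(a, 8, 14);
        System.out.println(Arrays.toString(a)); // [5, 7, 12, 14, 19, 21, 26, 33, 40]
    }
}
```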
The Algorithm
For any unsorted list:
- Treat the first element as a sorted list of size 1.
- Then, given a sorted list of size k - 1, insert the kth item of the unsorted list into it.
- The sorted list is now of size k.
Repeat this for each element.
The Algorithm
Code for inserting the item at index k into the sorted entries before it would be:
    for ( int j = k; j > 0; j-- ) {
        if ( array[j - 1] > array[j] ) {
            swap( array[j - 1], array[j] );
        } else {
            // As soon as we don't need to swap, the item is in place
            break;
        }
    }
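Wrapping that inner loop in an outer loop over k gives the whole sort. A minimal Java sketch, assuming an int array and an inline three-step swap (the class name InsertionSortSketch is hypothetical):

```java
import java.util.Arrays;

final class InsertionSortSketch {
    static void insertionSort(int[] array) {
        // array[0..k-1] is already sorted; insert array[k] into it.
        for (int k = 1; k < array.length; k++) {
            for (int j = k; j > 0; j--) {
                if (array[j - 1] > array[j]) {
                    int tmp = array[j - 1];
                    array[j - 1] = array[j];
                    array[j] = tmp;
                } else {
                    // As soon as we don't need to swap, the item is in place.
                    break;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {34, 8, 64, 51, 32, 21};
        insertionSort(data);
        System.out.println(Arrays.toString(data)); // [8, 21, 32, 34, 51, 64]
    }
}
```

On an already sorted array the inner loop breaks immediately for every k, so the sort does only n - 1 comparisons; a reverse-sorted array forces every possible swap.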
Bubble Sort
Some thoughts about bubble sort:
- The Jargon File states that bubble sort is “the generic bad algorithm”.
- Donald Knuth comments that “the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems”.
Implementation
- Starting with the first item, assume that it is the largest.
- Compare it with the second item: if the first is larger, swap the two; otherwise, assume that the second item is the largest.
- Continue up the array, either swapping or redefining the largest item.
After one pass, the largest item must be the last in the list.
Repeat n - 1 times (each pass can stop one entry earlier than the previous one), after which all entries will be in place.
Implementation
The default algorithm:
    void bubble( int[] array, int n ) {
        for ( int i = n - 1; i > 0; i-- ) {
            // After this pass, the largest of array[0..i] is at index i
            for ( int j = 0; j < i; j++ ) {
                if ( array[j] > array[j + 1] ) {
                    swap( array[j], array[j + 1] );
                }
            }
        }
    }
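A runnable Java sketch of the same two loops is shown below. The swapped flag is an addition that is not part of the default algorithm above: it is a common refinement that stops early when a pass makes no swaps. The class name BubbleSortSketch is hypothetical.

```java
import java.util.Arrays;

final class BubbleSortSketch {
    static void bubbleSort(int[] array) {
        for (int i = array.length - 1; i > 0; i--) {
            boolean swapped = false; // refinement, not in the default algorithm
            for (int j = 0; j < i; j++) {
                if (array[j] > array[j + 1]) {
                    int tmp = array[j];
                    array[j] = array[j + 1];
                    array[j + 1] = tmp;
                    swapped = true;
                }
            }
            // A pass without swaps means the array is already sorted.
            if (!swapped) {
                return;
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {34, 8, 64, 51, 32, 21};
        bubbleSort(data);
        System.out.println(Arrays.toString(data)); // [8, 21, 32, 34, 51, 64]
    }
}
```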
Binary search (13.1)
binary search: Locates a target value in a sorted array/list by successively eliminating half of the array from consideration.
- How many elements will it need to examine? O(log N)
- Can be implemented with a loop or recursively
Example: Searching the array below for the value 42:
index  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
value  -4 20 22 25 30 36 42 50 56 68 85 92 103
(min, mid, and max mark the two ends and the midpoint of the range still under consideration.)
Binary search code (ParamArrayTest.cs)
    // Returns the index of an occurrence of target in a,
    // or a negative number if the target is not found.
    // Precondition: elements of a are in sorted order
    public static int binarySearch(int[] a, int target) {
        int min = 0;
        int max = a.length - 1;
        while (min <= max) {
            int mid = (min + max) / 2;
            if (a[mid] < target) {
                min = mid + 1;
            } else if (a[mid] > target) {
                max = mid - 1;
            } else {
                return mid;   // target found
            }
        }
        return -1;            // target not found
    }
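To make the O(log N) claim concrete, the sketch below runs the same loop but also counts how many elements it examines. The names BinarySearchProbeSketch and searchCountingProbes are hypothetical, and the test array is made up. The midpoint is computed as min + (max - min) / 2, which gives the same result as (min + max) / 2 but cannot overflow for very large arrays.

```java
final class BinarySearchProbeSketch {
    // Returns {index, probes}: index is -1 if target is absent,
    // probes is the number of array elements examined.
    // Precondition: elements of a are in sorted order.
    static int[] searchCountingProbes(int[] a, int target) {
        int min = 0;
        int max = a.length - 1;
        int probes = 0;
        while (min <= max) {
            int mid = min + (max - min) / 2; // overflow-safe midpoint
            probes++;
            if (a[mid] < target) {
                min = mid + 1;
            } else if (a[mid] > target) {
                max = mid - 1;
            } else {
                return new int[] {mid, probes};
            }
        }
        return new int[] {-1, probes};
    }

    public static void main(String[] args) {
        int[] a = new int[1024];
        for (int i = 0; i < a.length; i++) {
            a[i] = 2 * i; // sorted even numbers
        }
        int[] result = searchCountingProbes(a, 2 * 777);
        System.out.println("index " + result[0] + " found with " + result[1] + " probes");
    }
}
```

For 1024 elements the loop never examines more than 11 entries, whether or not the target is present.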
Recursion
Sometimes, the best way to solve a problem is by solving a smaller version of the exact same problem first.
Recursion is a technique that solves a problem by solving a smaller problem of the same type.
A recursive solution must have one or more base cases; neglecting the base case causes infinite recursion (in practice, a stack overflow).
Problems defined recursively
There are many problems whose solution can be defined recursively.
Example: n factorial
Recursive solution:
    n! = 1              if n = 0
    n! = (n-1)! * n     if n > 0
Closed-form solution:
    n! = 1                        if n = 0
    n! = 1*2*3*...*(n-1)*n        if n > 0
Coding the factorial function
Recursive implementation:
    int Factorial(int n)
    {
        if (n == 0)   // base case
            return 1;
        else
            return n * Factorial(n - 1);
    }
Example file: Fibonacci.cs
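The recursive and closed-form definitions can be compared side by side. The Java sketch below is an illustration under a few assumptions: it uses long so that results up to 20! fit, and it rejects negative n, which the listing above does not handle (a negative argument would never reach the base case). The names FactorialSketch, factorial, and factorialIterative are hypothetical.

```java
final class FactorialSketch {
    // Recursive definition: n! = 1 if n = 0, otherwise (n-1)! * n.
    static long factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n == 0) { // base case
            return 1;
        }
        return n * factorial(n - 1);
    }

    // Closed form: 1 * 2 * 3 * ... * (n-1) * n, computed with a loop.
    static long factorialIterative(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        long product = 1;
        for (int i = 2; i <= n; i++) {
            product *= i;
        }
        return product;
    }

    public static void main(String[] args) {
        System.out.println(factorial(5));          // 120
        System.out.println(factorialIterative(5)); // 120
    }
}
```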
Recursive binary search (13.3)
Write a recursive binarySearch method. If the target value is not found, return -1.
    int index = binarySearch(data, 42);    // 10
    int index2 = binarySearch(data, 66);   // -1
index  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
value  -4 20 22 25 30 36 42 50 56 68 85 92 103
    // Returns the index of an occurrence of the given value in
    // the given array, or a negative number if not found.
    // Precondition: elements of a are in sorted order
    public static int binarySearch(int[] a, int target) {
        return binarySearch(a, target, 0, a.length - 1);
    }

    // Recursive helper to implement search behavior.
    private static int binarySearch(int[] a, int target, int min, int max) {
        if (min > max) {
            return -1;     // base case: empty range, target not found
        } else {
            int mid = (min + max) / 2;
            if (a[mid] < target) {          // too small; go right
                return binarySearch(a, target, mid + 1, max);
            } else if (a[mid] > target) {   // too large; go left
                return binarySearch(a, target, min, mid - 1);
            } else {
                return mid;    // target found; a[mid] == target
            }
        }
    }
HashTable
Arrays use nonnegative integer indexes as keys. Sometimes associating these integer keys with the objects to be stored is impractical, so we develop a scheme for using arbitrary keys.
When an application needs to store something, the scheme converts the application key rapidly to an index. When the application later has a key for which it wants to retrieve the data, it simply applies the same conversion to the key to find the array index where the data resides.
The scheme we describe here is the basis of a technique called hashing, in which we store data in a data structure called a hash table.
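The key-to-index conversion can be sketched as a tiny hash table. This is an illustration under stated assumptions, not how any library class is implemented: the names TinyHashTableSketch, indexFor, put, and get are hypothetical; Java's hashCode stands in for the key's hash function; and each array slot holds a small list so that two keys converting to the same index do not overwrite each other (the text above does not say how such collisions are handled).

```java
import java.util.ArrayList;
import java.util.List;

final class TinyHashTableSketch {
    private static final class Entry {
        final String key;
        int value;
        Entry(String key, int value) { this.key = key; this.value = value; }
    }

    // One list ("bucket") per index; the bucket count is fixed for simplicity.
    private final List<List<Entry>> buckets = new ArrayList<>();

    TinyHashTableSketch(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // The conversion: arbitrary key -> index in [0, capacity).
    int indexFor(String key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    void put(String key, int value) {
        List<Entry> bucket = buckets.get(indexFor(key));
        for (Entry e : bucket) {
            if (e.key.equals(key)) {
                e.value = value; // key already stored: replace its value
                return;
            }
        }
        bucket.add(new Entry(key, value));
    }

    // Applies the same conversion to find the data; null if the key is absent.
    Integer get(String key) {
        for (Entry e : buckets.get(indexFor(key))) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyHashTableSketch table = new TinyHashTableSketch(8);
        table.put("apple", 3);
        table.put("pear", 5);
        System.out.println(table.get("apple")); // 3
        System.out.println(table.get("kiwi"));  // null
    }
}
```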
HashTable
A hash function performs a calculation that determines where to place data in the hash table. The hash function is applied to the key in a key/value pair of objects.
Class Hashtable can accept any object as a key. For this reason, class object defines method GetHashCode, which all objects inherit.
Example: Let's write a program that counts the number of occurrences of each word in a string read from the console. To split the sentence into words, we will use this:
    // split input text into tokens
    string[] words = Regex.Split( input, @"\s+" );
HashTable solution.
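A minimal sketch of the word-count program, written in Java as an analog of the C# Hashtable approach: java.util.HashMap plays the role of Hashtable and String.split with the same \s+ pattern plays the role of Regex.Split. The class and method names are hypothetical, and lower-casing each word before counting is an assumption, not something the exercise states.

```java
import java.util.HashMap;
import java.util.Map;

final class WordCountSketch {
    static Map<String, Integer> countWords(String input) {
        Map<String, Integer> counts = new HashMap<>();
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return counts; // no words at all
        }
        // Split input text into tokens on runs of whitespace.
        for (String word : trimmed.split("\\s+")) {
            String key = word.toLowerCase(); // assumption: counting ignores case
            if (counts.containsKey(key)) {
                counts.put(key, counts.get(key) + 1);
            } else {
                counts.put(key, 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("the cat and the hat");
        System.out.println(counts.get("the")); // 2
        System.out.println(counts.size());     // 4
    }
}
```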
HashTable
- Hashtable method ContainsKey determines whether a key is in the hash table.
- Read-only property Keys returns an ICollection that contains all the keys.
- Hashtable property Count returns the number of key/value pairs in the Hashtable.
If you use a foreach statement with a Hashtable object, the iteration variable will be of type DictionaryEntry. The enumerator of a Hashtable (or any other class that implements IDictionary) uses the DictionaryEntry structure to store key/value pairs. This structure provides properties Key and Value for retrieving the key and value of the current element.
If you do not need the key, class Hashtable also provides a read-only Values property that gets an ICollection of all the values stored in the Hashtable.
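The same operations can be tried in Java, shown here only as an analog of the C# members above: containsKey for ContainsKey, keySet() for Keys, size() for Count, Map.Entry for DictionaryEntry, and values() for Values. The class MapTourSketch and its methods are hypothetical; entries are copied into a TreeMap before printing so that the order is predictable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class MapTourSketch {
    // Iterates key/value pairs, like foreach over DictionaryEntry in C#.
    static String report(Map<String, Integer> table) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Integer> entry : new TreeMap<>(table).entrySet()) {
            out.append(entry.getKey()).append('=').append(entry.getValue()).append('\n');
        }
        return out.toString();
    }

    // Uses only the values, for when the keys are not needed.
    static int total(Map<String, Integer> table) {
        int sum = 0;
        for (int value : table.values()) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new HashMap<>();
        table.put("the", 2);
        table.put("cat", 1);
        System.out.println(table.containsKey("cat")); // true
        System.out.println(table.size());             // 2
        System.out.print(report(table));              // cat=1 then the=2
        System.out.println(total(table));             // 3
    }
}
```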
Stack & Queue
Stack: Push, Pop, Peek. Example: StackTest.cs
Queue: Enqueue, Dequeue, Peek. Example: QueueTest.cs
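The difference between the two structures shows up in the order items come back out. The sketch below uses Java's java.util.ArrayDeque as a stand-in for both (push/pop/peek for the stack; offer/poll/peek as the Java counterparts of Enqueue/Dequeue/Peek). It is not the content of StackTest.cs or QueueTest.cs, and the names StackQueueSketch, popAll, and dequeueAll are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Queue;

final class StackQueueSketch {
    // Stack: last in, first out.
    static List<Integer> popAll(int... items) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int item : items) {
            stack.push(item);
        }
        List<Integer> order = new ArrayList<>();
        while (stack.peek() != null) { // peek looks at the top without removing it
            order.add(stack.pop());
        }
        return order;
    }

    // Queue: first in, first out.
    static List<Integer> dequeueAll(int... items) {
        Queue<Integer> queue = new ArrayDeque<>();
        for (int item : items) {
            queue.offer(item); // enqueue
        }
        List<Integer> order = new ArrayList<>();
        while (queue.peek() != null) { // peek looks at the front without removing it
            order.add(queue.poll());   // dequeue
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(popAll(1, 2, 3));     // [3, 2, 1]
        System.out.println(dequeueAll(1, 2, 3)); // [1, 2, 3]
    }
}
```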