Chapter 7: Sorting Algorithms
Insertion Sort
Sorting Algorithms
Insertion Sort
Shell Sort
Heap Sort
Merge Sort
Quick Sort

Assumptions
Elements to sort are placed in an array of length N.
Elements can be compared.
Sorting can be performed in main memory.

Complexity
Simple sorting algorithms: O(N^2)
Shellsort: o(N^2)
Advanced sorting algorithms: O(N log N)
In general, any sort based on comparisons: Ω(N log N)
Insertion Sort
PRE: array of N elements (from 0 to N-1)
POST: array sorted
1. An array of one element is sorted.
2. Assume that the first p elements are sorted.
   for j = p to N-1:
   take the j-th element and find a place for it among the first j sorted elements.
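Insertion Sort
    int j, p;
    comparable tmp;                  // element type; must support comparison
    for (p = 1; p < N; p++) {
        tmp = a[p];                  // element to insert into the sorted prefix a[0..p-1]
        for (j = p; j > 0 && tmp < a[j-1]; j--)
            a[j] = a[j-1];           // shift each larger element one slot to the right
        a[j] = tmp;                  // drop the element into the freed slot
    }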
Analysis of the Insertion Sort
Insert the N-th element: at most N-1 comparisons, N-1 movements
Insert the (N-1)-st element: at most N-2 comparisons, N-2 movements
...
Insert the 2nd element: 1 comparison, 1 movement
Worst case: 2*(1 + 2 + ... + (N-1)) = 2 * N(N-1)/2 = N(N-1) = Θ(N^2)
Almost sorted array: O(N)
Average complexity: Θ(N^2)
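To make the best-case and worst-case counts concrete, the following Java sketch instruments the insertion sort loop with a comparison counter. The class and method names are hypothetical, and int elements are assumed for simplicity.

```java
class InsertionSortCount {
    // Sorts a in place and returns the number of element comparisons made.
    static long sortCountingComparisons(int[] a) {
        long comparisons = 0;
        for (int p = 1; p < a.length; p++) {
            int tmp = a[p];
            int j = p;
            while (j > 0) {
                comparisons++;                 // one comparison of tmp with a[j-1]
                if (tmp >= a[j - 1]) break;    // correct slot found
                a[j] = a[j - 1];               // shift the larger element right
                j--;
            }
            a[j] = tmp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6};
        int[] reversed = {6, 5, 4, 3, 2, 1};
        System.out.println(sortCountingComparisons(sorted));    // N-1 = 5
        System.out.println(sortCountingComparisons(reversed));  // N(N-1)/2 = 15
    }
}
```

For N = 6, the already sorted input needs one comparison per inserted element, while the reverse-sorted input compares each inserted element against the whole sorted prefix.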
A lower bound for simple sorting algorithms
An inversion: an ordered pair (A_i, A_j) such that i < j but A_i > A_j.
Example: 10, 6, 7, 15, 3, 1
Inversions are: (10,6), (10,7), (10,3), (10,1), (6,3), (6,1), (7,3), (7,1), (15,3), (15,1), (3,1)
Swapping
Swapping two adjacent elements that are out of order removes exactly one inversion.
A sorted array has no inversions.
Sorting an array that contains i inversions requires at least i swaps of adjacent elements.
How many inversions are there in an average unsorted array?

Theorems
Theorem 1: The average number of inversions in an array of N distinct elements is N(N-1)/4.
Theorem 2: Any algorithm that sorts by exchanging adjacent elements requires Ω(N^2) time on average.
For a sorting algorithm to run in less than quadratic time, it must do something other than swap adjacent elements.
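The link between inversions and insertion sort can be checked directly. The sketch below (hypothetical class and method names, int elements assumed) counts the inversions of the example sequence 10, 6, 7, 15, 3, 1 by brute force and also counts how many one-position shifts insertion sort performs on it.

```java
class InversionDemo {
    // Counts inversions by testing every pair (i, j) with i < j: O(N^2).
    static int countInversions(int[] a) {
        int count = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = i + 1; j < a.length; j++)
                if (a[i] > a[j]) count++;
        return count;
    }

    // Insertion sort that returns the number of one-position element shifts.
    static int insertionSortShifts(int[] a) {
        int shifts = 0;
        for (int p = 1; p < a.length; p++) {
            int tmp = a[p];
            int j;
            for (j = p; j > 0 && tmp < a[j - 1]; j--) {
                a[j] = a[j - 1];   // moving a[j-1] past tmp removes one inversion
                shifts++;
            }
            a[j] = tmp;
        }
        return shifts;
    }

    public static void main(String[] args) {
        int[] data = {10, 6, 7, 15, 3, 1};
        System.out.println(countInversions(data));              // 11
        System.out.println(insertionSortShifts(data.clone()));  // 11
    }
}
```

Both counts are 11 for this input: each shift moves one larger element past the element being inserted, which removes one inversion.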
Shell Sort
Improves on insertion sort.
Compares elements that are far apart, then elements that are less far apart, and finally adjacent elements (as in an insertion sort).

Idea
Arrange the data sequence in a two-dimensional array.
Sort the columns of the array.
Repeat the process, each time with a smaller number of columns.
Example
The data sequence is arranged in an array with 7 columns (left), then the columns are sorted (right). (figure omitted)

Example (cont)
In the last step there is only one column. Only a 6, an 8 and a 9 have to move a short distance to reach their correct positions.
Implementation
A one-dimensional array that is indexed appropriately.
An increment sequence that determines how far apart the elements to be sorted are.

Increment sequence
Determines how far apart the elements to be sorted are: h_1, h_2, ..., h_t with h_1 = 1.
h_k-sorted array: all elements spaced a distance h_k apart are sorted relative to each other.

Correctness of the algorithm
Shellsort only works because an array that is h_k-sorted remains h_k-sorted when it is then h_(k-1)-sorted.
Subsequent sorts do not undo the work done by previous phases.
The last step (with h = 1) is an insertion sort on the whole array.
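The property that later phases do not undo earlier ones can be observed with a small experiment. This is only a sketch: the names isHSorted and gapInsertionSort are hypothetical, the input array is illustrative data, and int elements are assumed.

```java
class HSortedDemo {
    // True if every element is >= the element h positions before it.
    static boolean isHSorted(int[] a, int h) {
        for (int i = h; i < a.length; i++)
            if (a[i - h] > a[i]) return false;
        return true;
    }

    // One Shellsort phase: insertion sort on elements spaced gap apart.
    static void gapInsertionSort(int[] a, int gap) {
        for (int p = gap; p < a.length; p++) {
            int tmp = a[p];
            int j;
            for (j = p; j >= gap && tmp < a[j - gap]; j -= gap)
                a[j] = a[j - gap];
            a[j] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = {81, 94, 11, 96, 12, 35, 17, 95, 28, 58, 41, 75, 15};
        gapInsertionSort(a, 5);
        gapInsertionSort(a, 3);
        // After the 3-sort the array is still 5-sorted.
        System.out.println(isHSorted(a, 5) && isHSorted(a, 3));
        gapInsertionSort(a, 1);
        System.out.println(isHSorted(a, 5) && isHSorted(a, 3) && isHSorted(a, 1));
    }
}
```

Both checks print true: the array stays 5-sorted after the 3-sort, and the final pass with gap 1 leaves it fully sorted.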
Java code for Shell sort
    int j, p, gap;
    comparable tmp;                                // element type; must support comparison
    for (gap = N/2; gap > 0; gap = gap/2)          // Shell's original increments
        for (p = gap; p < N; p++) {
            tmp = a[p];
            for (j = p; j >= gap && tmp < a[j-gap]; j = j - gap)
                a[j] = a[j-gap];                   // shift within the gap-spaced subsequence
            a[j] = tmp;
        }
Increment sequences
1. Shell's original sequence: N/2, N/4, ..., 1 (repeatedly divide by 2).
2. Hibbard's increments: 1, 3, 7, ..., 2^k - 1
3. Knuth's increments: 1, 4, 13, ..., (3^k - 1)/2
4. Sedgewick's increments: 1, 5, 19, 41, 109, ..., given by 9·4^k - 9·2^k + 1 or 4^k - 3·2^k + 1.
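As an illustration of how a different increment sequence plugs into the algorithm, the sketch below generates Hibbard's and Knuth's increments below a bound n and runs Shellsort with a supplied gap list. The class name, method names and the list-based interface are assumptions made for this example, and int elements are assumed.

```java
import java.util.ArrayList;
import java.util.List;

class GapSequences {
    // Hibbard: 1, 3, 7, ..., 2^k - 1, kept below n. Each term is 2*previous + 1.
    static List<Integer> hibbard(int n) {
        List<Integer> gaps = new ArrayList<>();
        for (long h = 1; h < n; h = 2 * h + 1) gaps.add((int) h);
        return gaps;
    }

    // Knuth: 1, 4, 13, ..., (3^k - 1)/2, kept below n. Each term is 3*previous + 1.
    static List<Integer> knuth(int n) {
        List<Integer> gaps = new ArrayList<>();
        for (long h = 1; h < n; h = 3 * h + 1) gaps.add((int) h);
        return gaps;
    }

    // Shellsort driven by an ascending gap list that starts with 1;
    // the gaps are applied from largest to smallest.
    static void shellSort(int[] a, List<Integer> gaps) {
        for (int g = gaps.size() - 1; g >= 0; g--) {
            int gap = gaps.get(g);
            for (int p = gap; p < a.length; p++) {
                int tmp = a[p];
                int j;
                for (j = p; j >= gap && tmp < a[j - gap]; j -= gap)
                    a[j] = a[j - gap];
                a[j] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(hibbard(100));  // [1, 3, 7, 15, 31, 63]
        System.out.println(knuth(100));    // [1, 4, 13, 40]
        int[] a = {81, 94, 11, 96, 12, 35, 17, 95, 28, 58, 41, 75, 15};
        shellSort(a, hibbard(a.length));
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Because every generated list ends its descent with gap 1, the final pass is a plain insertion sort and the result is sorted regardless of which sequence is chosen; only the running time differs.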
Analysis
Shellsort's worst-case performance using Hibbard's increments is Θ(n^(3/2)).
The average performance is thought to be about O(n^(5/4)).
The exact complexity of this algorithm is still being debated.
For mid-sized data it performs nearly as well as, if not better than, the faster O(n log n) sorts.
Heap Sort
Basic Idea
Complexity
Example

Idea
Store N elements in a binary heap.
Perform the deleteMin operation N times, storing each element deleted from the heap in another array.
Copy the array back.
Using two arrays is not very efficient.
Improvement: use one array for both the binary heap and the sorted elements.

Improvements
Use the same array to store the deleted elements instead of using another array.
After each deletion we get a vacant position in the array: the last cell of the heap.
There we store the deleted element, which becomes part of the sorted sequence.

Improvements (cont)
When all the elements have been deleted and stored in the same array following the above method, they will be in reversed order.
What is the remedy for this?
Store the elements in the binary heap in reverse order of priority (the largest element at the root, removed with deleteMax). Then at the end the elements in the array will be in the correct order.
Complexity
Sorts in O(N log N) time by performing N deleteMax operations.
- Each deleteMax operation takes O(log N) time.
- Performing deleteMax N times therefore takes O(N log N) time.
Used for general-purpose sorting; guarantees O(N log N).
Example
1. Consider the values of the elements as priorities and build the heap.
2. Start the deleteMax operations, storing each deleted element at the end of the heap array.

Example (cont)
Note that we use only one array, treating its parts differently: while sorting, one part of the array is the heap and the rest is the sorted array.
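The one-array scheme can be sketched in Java as follows. This is an illustrative version rather than a definitive implementation: the class and method names are hypothetical, int elements are assumed, and the array is indexed from 0, so the children of index i are 2i+1 and 2i+2. The input values are illustrative.

```java
class HeapSortDemo {
    // Restores the max-heap property below index hole, looking only at a[0..size-1].
    static void percolateDown(int[] a, int hole, int size) {
        int tmp = a[hole];
        int child;
        for (; 2 * hole + 1 < size; hole = child) {
            child = 2 * hole + 1;                                  // left child
            if (child + 1 < size && a[child + 1] > a[child])
                child++;                                           // pick the greater child
            if (a[child] > tmp)
                a[hole] = a[child];                                // move child up, hole moves down
            else
                break;
        }
        a[hole] = tmp;                                             // fill the hole
    }

    static void heapSort(int[] a) {
        // Build the heap, starting from the last node that has a child.
        for (int i = a.length / 2 - 1; i >= 0; i--)
            percolateDown(a, i, a.length);
        // Repeated deleteMax: the maximum goes to the cell vacated by the last heap element.
        for (int end = a.length - 1; end > 0; end--) {
            int max = a[0];
            a[0] = a[end];
            a[end] = max;                 // a[end..] is now the sorted part
            percolateDown(a, 0, end);     // heap shrinks to a[0..end-1]
        }
    }

    public static void main(String[] args) {
        int[] a = {15, 19, 10, 7, 17, 16};
        heapSort(a);
        System.out.println(java.util.Arrays.toString(a));  // [7, 10, 15, 16, 17, 19]
    }
}
```

Using a max-heap means each deleted maximum lands at the end of the shrinking heap region, so the array ends up in ascending order without a second array or a final reversal.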
Build the Heap
We start with the element at position SIZE/2, comparing the item with its children. The hole is percolated down to position 6 and the item is inserted there.
Next we compare position 2 with its children. 19 is greater than 7 and 17, so it stays in place, and we continue with position 1.
The hole at position 1 is percolated down to position 2, the position of the greater child.
One of the children of the hole at position 2, item 17, is greater than 15, so we percolate the hole down to that child's position.
The heap is built.
Sorting
DeleteMax the top element.
Store the last heap element (10) in a temporary place. Move the deleted maximum (19) to the place where the last heap element was, which is the current available position in the sorted portion of the array. A hole is created at the top.
Percolate down the hole (two steps).
Fill the hole.
Sorting (cont)
DeleteMax the top element.
Store the last heap element (10) in a temporary place. Move the deleted maximum (17) to the place where the last heap element was, which is the current available position in the sorted portion of the array. A hole is created at the top.
Percolate down the hole.
Fill the hole.
Sorting (cont)
DeleteMax the top element.
Store the last heap element (7) in a temporary place. Move the deleted maximum (16) to the place where the last heap element was, which is the current available position in the sorted portion of the array. A hole is created at the top.
Percolate down the hole.
Fill the hole.
Sorting (cont)
DeleteMax the top element.
Store the last heap element (10) in a temporary place. Move the deleted maximum (15) to the place where the last heap element was, which is the current available position in the sorted portion of the array. A hole is created at the top.
Percolate down the hole. Since 10 is greater than the children of the hole, the hole does not move and 10 has to be inserted in it.
Fill the hole.
Sorting (cont)
DeleteMax the top element.
Store the last heap element (7) in a temporary place. Move the deleted maximum (10) to the place where the last heap element was, which is the current available position in the sorted portion of the array. A hole is created at the top.
Fill the hole. The hole has no children, so it has to be filled.
Sorted array
7 is the last element from the heap, so now the array is sorted.