CSCE 210 Data Structures and Algorithms
Prof. Amr Goneid, AUC
Part 8b. Sorting (2): (n log n) Algorithms

Sorting (2): (n log n) Algorithms
1. General
2. Heap Sort
3. Merge Sort
4. Quick Sort

1. General
We examine here 3 advanced sorting algorithms:
Heap Sort (based on Priority Queues)
Merge Sort (based on Divide & Conquer)
Quick Sort (based on Divide & Conquer)
Heap Sort and Merge Sort have a worst case complexity of O(n log n). Quick Sort is O(n log n) in the best and average cases, but O(n^2) in the worst case.

2. Heap Sort
Heap sort is a sorting algorithm based on Priority Queues. The idea is to insert all array elements into a minimum heap, then remove the top of the heap, one element at a time, back into the array.

HeapSort Algorithm V1
// To sort an array X[ ] of n elements
heapsort(X[1..n], n)
{
    int i;
    PQ Heap(n);
    for (i = 1 to n) Heap.insert(X[i]);
    for (i = 1 to n) X[i] = Heap.remove();
}

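The V1 pseudocode can be sketched in C++ as follows. This is a minimal sketch, not the course's implementation: `std::priority_queue` with `std::greater` stands in for the slides' `PQ` class (an assumption), the function name `heapSortV1` is hypothetical, and the array is 0-based rather than 1-based.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Heap sort, version 1: insert every element into a minimum heap, then
// remove the top of the heap repeatedly back into the array.
// std::priority_queue is an assumed stand-in for the slides' PQ class.
void heapSortV1(std::vector<int>& x) {
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;

    for (int value : x) {                          // n insertions, O(log n) each
        heap.push(value);
    }
    for (std::size_t i = 0; i < x.size(); ++i) {   // n removals, O(log n) each
        assert(!heap.empty());
        x[i] = heap.top();                         // smallest remaining element
        heap.pop();
    }
}
```

Calling `heapSortV1(v)` on a `std::vector<int>` sorts it in ascending order. The heap is a second container of n elements, which is why this version is not in-place.
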
Analysis of HeapSort V1
Worst case cost of insertion: This happens when the data are in descending order. In this case, every newly inserted element is smaller than everything already in the minimum heap, so it has to move all the way up to the root, i.e. O(h) operations, where h is the height of the heap. Since a complete tree has a height of O(log n), the worst case cost of inserting n elements into a heap is O(n log n).

Analysis of HeapSort V1
Worst case cost of removal: Removing the top moves the last element to the root and then moves it down at most h = O(log n) levels, so the worst case cost of removing one element is O(log n). Removing all n elements from the heap will then cost O(n log n).
Worst case cost of HeapSort: Therefore, the total worst case cost for heapsort is O(n log n) + O(n log n) = O(n log n).

Performance of Heap Sort V1
The complexity of HeapSort is O(n log n)
In-Place Sort: No (uses heap array)
Stable Algorithm: No
This technique is satisfactory for medium to large data sets

HeapSort Algorithm V2
// To sort an array X[ ] of n elements using Heapify Algorithm
heapsort(X[1..n], n)
{
    heap_size = n;
    Build-MaxHeap(X[1..n], n)
    for i = n downto 2
    {
        swap(X[1], X[i])              // move current maximum to its final place
        heap_size = heap_size - 1
        Heapify(X, heap_size, 1)      // restore the max-heap in X[1..heap_size]
    }
}

Analysis of HeapSort V2
Build-MaxHeap takes O(n) time
Each of the (n-1) calls to Heapify takes O(log n)
Hence Heapsort takes: T(n) = O(n) + (n-1) O(log n) = O(n log n)
Does not use extra space (In-Place algorithm)

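The in-place V2 scheme can be sketched in C++ as below. This is an illustrative sketch under stated assumptions: the slides do not show Heapify or Build-MaxHeap, so the bodies of `heapify` and `buildMaxHeap` here are the usual sift-down formulation, the names are hypothetical, and indices are 0-based (children of i are 2i+1 and 2i+2) instead of the slides' 1-based arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sift x[i] down until the subtree rooted at i is a max-heap again.
// Only the first heapSize elements of x belong to the heap.
void heapify(std::vector<int>& x, std::size_t heapSize, std::size_t i) {
    assert(heapSize <= x.size());
    while (true) {
        std::size_t largest = i;
        const std::size_t left = 2 * i + 1;    // 0-based children
        const std::size_t right = 2 * i + 2;
        if (left < heapSize && x[left] > x[largest]) largest = left;
        if (right < heapSize && x[right] > x[largest]) largest = right;
        if (largest == i) return;              // heap property holds
        std::swap(x[i], x[largest]);
        i = largest;                           // continue one level down
    }
}

// Build a max-heap bottom-up, starting from the last internal node: O(n).
void buildMaxHeap(std::vector<int>& x) {
    for (std::size_t i = x.size() / 2; i-- > 0;) {
        heapify(x, x.size(), i);
    }
}

// Heap sort, version 2: no extra array is used.
void heapSortV2(std::vector<int>& x) {
    buildMaxHeap(x);
    for (std::size_t heapSize = x.size(); heapSize > 1; --heapSize) {
        std::swap(x[0], x[heapSize - 1]);      // maximum goes to its final slot
        heapify(x, heapSize - 1, 0);           // O(log n) repair of the root
    }
}
```

The sorted part grows from the right end of the array while the heap shrinks on the left, which is what makes the extra heap array of V1 unnecessary.
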
Divide & Conquer Algorithms

3. MergeSort
(a) Merging
Definition: Combine two or more sorted sequences of data into a single sorted sequence.
Formal Definition: The input is two sorted sequences, A = {a_1, ..., a_n} and B = {b_1, ..., b_m}. The output is a single sequence, merge(A,B), which is a sorted permutation of {a_1, ..., a_n, b_1, ..., b_m}.

Practical Situation
Array A holds two adjacent sorted parts, A1 = A[p..q] and A2 = A[q+1..r].
Copy A to an auxiliary array B, giving B1 = B[p..q] and B2 = B[q+1..r].
Merge B1 and B2 back to A.

Merge Algorithm
Merge(A, p, q, r)
{
    // B1 = B[p..q], B2 = B[q+1..r]
    copy A[p..r] to B[p..r] and set A[p..r] to empty
    while (neither B1 nor B2 empty)
    {
        compare first items of B1 & B2
        remove smaller of the two from its list (on a tie, take from B1)
        add to end of A
    }
    concatenate remaining list to end of A
    return A
}

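The merge step can be sketched in C++ as follows. This is a sketch under assumptions: the function name `mergeRuns` is hypothetical, indices are 0-based and inclusive, and the auxiliary array `b` is a local copy of only `a[p..r]`, so `b[0]` corresponds to `a[p]`.

```cpp
#include <cassert>
#include <vector>

// Merge the sorted runs a[p..q] and a[q+1..r] (inclusive, 0-based) into one
// sorted run a[p..r], going through an auxiliary copy b.
void mergeRuns(std::vector<int>& a, int p, int q, int r) {
    assert(0 <= p && p <= q && q <= r && r < static_cast<int>(a.size()));
    std::vector<int> b(a.begin() + p, a.begin() + r + 1);  // b[0] is a[p]

    const int end1 = q - p + 1;   // one past the end of B1 inside b
    const int end2 = r - p + 1;   // one past the end of B2 inside b
    int i = 0;                    // first unused item of B1
    int j = end1;                 // first unused item of B2
    int z = p;                    // next position to fill in a

    while (i < end1 && j < end2) {
        // "<=" takes from B1 on ties, which keeps the merge stable.
        a[z++] = (b[i] <= b[j]) ? b[i++] : b[j++];
    }
    while (i < end1) a[z++] = b[i++];   // concatenate what is left of B1
    while (j < end2) a[z++] = b[j++];   // or what is left of B2
}
```

For example, with `a = {1, 4, 7, 2, 3, 9}`, the call `mergeRuns(a, 0, 2, 5)` leaves `a` as `{1, 2, 3, 4, 7, 9}`.
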
Example: step-by-step merge of array B back into array A (figure sequence; index labels i, j, z, p, q, q+1, r)

Worst Case Analysis
L1 and L2 are the two sorted lists being merged; |L1| = size of L1, |L2| = size of L2.
In the worst case |L1| = |L2| = n/2 and both lists become empty at about the same time, so everything has to be compared.
Each comparison adds one item to A, and the very last item is added without a comparison, so the worst case is T(n) = |A| - 1 = |L1| + |L2| - 1 = n - 1 = O(n) comparisons.

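The n - 1 bound can be checked by counting comparisons directly. The sketch below is only an illustration: the function name `mergeCountingComparisons` and the separate output vector are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Merge two sorted lists into out and return the number of element
// comparisons that were needed.
std::size_t mergeCountingComparisons(const std::vector<int>& l1,
                                     const std::vector<int>& l2,
                                     std::vector<int>& out) {
    out.clear();
    std::size_t i = 0, j = 0, comparisons = 0;
    while (i < l1.size() && j < l2.size()) {
        ++comparisons;                        // one comparison per item added
        if (l1[i] <= l2[j]) out.push_back(l1[i++]);
        else                out.push_back(l2[j++]);
    }
    // Whatever remains is appended without further comparisons.
    while (i < l1.size()) out.push_back(l1[i++]);
    while (j < l2.size()) out.push_back(l2[j++]);
    assert(out.size() == l1.size() + l2.size());
    return comparisons;
}
```

Merging the interleaved lists `{1, 3, 5, 7}` and `{2, 4, 6, 8}` takes 7 = n - 1 comparisons, the worst case for n = 8. Merging `{1, 2, 3, 4}` with `{5, 6, 7, 8}` takes only 4, because the first list empties early and the second is appended as a block.
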
(b) MergeSort Methodology
Invented by Von Neumann in 1945
Recursive Divide-And-Conquer

MergeSort Methodology
Divides the sequence into two subsequences of equal size, sorts the subsequences and then merges them into one sorted sequence
Fast, but uses extra space

Methodology (continued)
Divide: Divide the n-element sequence into two subsequences of n/2 elements each
Conquer: Sort the two sub-arrays recursively
Combine: Merge the two sorted subsequences to produce a sorted sequence of n elements

Algorithm
MergeSort(A, p, r)   // Mergesort array A[ ] locations p..r
{
    if (p < r)       // if there are 2 or more elements
    {
        q = (p+r)/2;             // Divide in the middle
        MergeSort(A, p, q);      // Conquer both halves
        MergeSort(A, q+1, r);
        Merge(A, p, q, r);       // Combine solutions
    }
}

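A runnable C++ sketch of the recursion is given below. It is one possible arrangement, not the slides' code: the names `mergeSortRange` and `mergeSort` are hypothetical, indices are 0-based, the combine step is written inline, and a single scratch buffer `buf` is allocated once and shared by all levels (a design choice assumed here to avoid allocating at every merge). The midpoint is computed as `p + (r - p) / 2`, which equals `(p+r)/2` but cannot overflow.

```cpp
#include <cassert>
#include <vector>

// Sort a[p..r] (inclusive); buf is scratch space with the same size as a.
void mergeSortRange(std::vector<int>& a, std::vector<int>& buf, int p, int r) {
    assert(buf.size() == a.size());
    if (p >= r) return;                    // 0 or 1 element: already sorted

    const int q = p + (r - p) / 2;         // divide in the middle
    mergeSortRange(a, buf, p, q);          // conquer left half
    mergeSortRange(a, buf, q + 1, r);      // conquer right half

    // Combine: copy a[p..r] to buf, then merge the two runs back into a.
    for (int k = p; k <= r; ++k) buf[k] = a[k];
    int i = p, j = q + 1, z = p;
    while (i <= q && j <= r) {
        a[z++] = (buf[i] <= buf[j]) ? buf[i++] : buf[j++];  // stable on ties
    }
    while (i <= q) a[z++] = buf[i++];
    while (j <= r) a[z++] = buf[j++];
}

// Convenience wrapper that sorts the whole vector.
void mergeSort(std::vector<int>& a) {
    std::vector<int> buf(a.size());
    mergeSortRange(a, buf, 0, static_cast<int>(a.size()) - 1);
}
```

Calling `mergeSort(v)` sorts `v` in ascending order. An empty vector is handled because the initial range 0..-1 fails the `p < r` test immediately.
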
Merge Sort Example (figure)

Performance of MergeSort
MergeSort divides the array into two halves at each level, so the number of levels is O(log n)
At each level, merging will cost O(n) in total
Therefore, the complexity of MergeSort is O(n log n)
In-Place Sort: No (uses extra array)
Stable Algorithm: Yes
This technique is satisfactory for large data sets

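The same result follows from the recurrence for the running time. The derivation below is a sketch that assumes n is a power of 2 and writes the O(n) merge cost as cn for some constant c (both are simplifying assumptions, not stated on the slides).

```latex
\begin{align*}
T(1) &= c, \qquad T(n) = 2\,T(n/2) + c\,n \\
T(n) &= 4\,T(n/4) + 2\,c\,n \\
     &= 2^{k}\,T(n/2^{k}) + k\,c\,n
        && \text{after } k \text{ levels} \\
     &= n\,T(1) + c\,n\log_2 n
        && \text{with } k = \log_2 n \\
     &= c\,n + c\,n\log_2 n = O(n \log n)
\end{align*}
```

Each of the log n levels contributes cn, which matches the "levels times cost per level" argument above.
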
4. QuickSort
(a) Methodology
Invented by Sir Tony Hoare in 1962
Recursive Divide-And-Conquer algorithm
Partitions array around a Pivot, then sorts parts independently
Fast, in-place sorting

Methodology (continued)
Divide: Array a[p..r] is rearranged into two nonempty sub-arrays a[p..q] and a[q+1..r] such that each element of the first is <= each element of the second (the index q is computed as part of this process)
Conquer: Sort the two sub-arrays recursively
Combine: No work is needed. The sub-arrays are sorted in place, so a[p..r] is sorted.

(b) Algorithm
QuickSort(a, p, r)
{
    if (p < r)
    {
        q = partition(a, p, r);
        QuickSort(a, p, q);
        QuickSort(a, q+1, r);
    }
}

Partitioning
int partition(a, p, r)
{
    pivot = a[p];      // pivot is initially the first element
    i = p-1; j = r+1;
    while (true)
    {
        do { j--; } while (a[j] > pivot);
        do { i++; } while (a[i] < pivot);
        if (i < j) swap(a[i], a[j]);
        else return j;
    }
    // j is the location of last element in left part
}

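The partition routine and the recursion translate almost line for line into C++. The sketch below assumes 0-based inclusive indices and a `std::vector<int>`; the name `hoarePartition` is hypothetical, chosen here to avoid confusion with the standard library's `std::partition`.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Partition a[p..r] (inclusive) around pivot = a[p].
// Returns j with p <= j < r such that every element of a[p..j] is <=
// every element of a[j+1..r].
int hoarePartition(std::vector<int>& a, int p, int r) {
    assert(0 <= p && p < r && r < static_cast<int>(a.size()));
    const int pivot = a[p];
    int i = p - 1;
    int j = r + 1;
    while (true) {
        do { --j; } while (a[j] > pivot);   // scan left for an item <= pivot
        do { ++i; } while (a[i] < pivot);   // scan right for an item >= pivot
        if (i < j) std::swap(a[i], a[j]);   // both are on the wrong side
        else return j;                      // indices met or crossed
    }
}

void quickSort(std::vector<int>& a, int p, int r) {
    if (p < r) {
        const int q = hoarePartition(a, p, r);
        quickSort(a, p, q);        // left part includes index q
        quickSort(a, q + 1, r);
    }
}
```

As an illustration (the array is chosen for this example), partitioning `{8, 3, 25, 6, 10, 17, 1, 2, 18, 5}` with pivot 8 returns q = 4 and leaves `{5, 3, 2, 6, 1}` on the left and `{17, 10, 25, 18, 8}` on the right. Note that the pivot itself does not necessarily end at position q, which is why the left recursive call is on `a[p..q]` and not `a[p..q-1]`.
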
Example: using 1st element as pivot (Pivot = 1st element = 8)
Partitioning (figure sequence): i and j start at the two ends p and r and move toward each other until they cross; q = final j separates the Left part from the Right part.
Example (continued): Final Array (figure)

Performance of QuickSort
Partitioning will cost O(n) at each level
Best Case: Pivot is the median value. Quicksort divides the array into two equal halves at each level, so the number of levels is O(log n). Therefore, the complexity of QuickSort is O(n log n)
Worst Case: Pivot is the minimum (maximum) value. The number of levels is O(n). Therefore, the complexity of QuickSort is O(n^2)
In-Place Sort: Yes
Stable Algorithm: No
This technique is satisfactory for large data sets

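The two cases can also be written as recurrences. This is a sketch that takes the partitioning cost as cn for some constant c and, in the worst case, assumes every partition splits off a single element (both are simplifying assumptions).

```latex
\begin{align*}
\text{Best case:}\quad
  T(n) &= 2\,T(n/2) + c\,n
        \;\Rightarrow\; T(n) = O(n \log n) \\[4pt]
\text{Worst case:}\quad
  T(n) &= T(n-1) + c\,n \\
       &= T(1) + c\sum_{k=2}^{n} k \\
       &= T(1) + c\left(\frac{n(n+1)}{2} - 1\right) = O(n^{2})
\end{align*}
```

With the first element as pivot, an array that is already sorted produces exactly this worst-case split at every level.
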
See Internet Animation Sites, for example: demo.html

Median-of-Three Partitioning
Take the elements in the first, middle and last locations of the array. The one whose value lies between the other two is the median of three, and it is used as the pivot.
In this case, there is always at least one element below and one element above the pivot. The worst case is therefore unlikely.

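One way to combine median-of-three with a partition routine that uses `a[p]` as the pivot is to find the median's position and swap it to the front first. The sketch below assumes that arrangement; the function names `medianOfThreeIndex` and `moveMedianToFront` are hypothetical and the slides do not show this code.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Return the index (p, middle, or r) that holds the median of the three
// values a[p], a[middle], a[r].
int medianOfThreeIndex(const std::vector<int>& a, int p, int r) {
    assert(0 <= p && p <= r && r < static_cast<int>(a.size()));
    int lo = p;
    int mid = p + (r - p) / 2;
    int hi = r;
    // Sort the three indices by the values they point to.
    if (a[mid] < a[lo]) std::swap(lo, mid);
    if (a[hi] < a[mid]) std::swap(mid, hi);
    if (a[mid] < a[lo]) std::swap(lo, mid);
    return mid;                      // index of the middle value
}

// Put the median-of-three at a[p], so that a partition routine which
// takes a[p] as the pivot will use it.
void moveMedianToFront(std::vector<int>& a, int p, int r) {
    std::swap(a[p], a[medianOfThreeIndex(a, p, r)]);
}
```

On the sorted array `{1, 2, 3, 4, 5, 6, 7}` the three candidates are 1, 4 and 7, so the pivot becomes 4 instead of the minimum 1, and the sorted-input worst case of first-element pivoting is avoided.
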
A Better Choice: Random Pivot
A better solution is to use a randomizer to select the pivot element. The O(n^2) worst case is still possible, but it becomes very unlikely and no longer depends on the initial order of the data.
Pick a random element from the sub-array (p..r) as the partition element.
Do this only if (r-p) > 5, to avoid the cost of the randomizer on small sub-arrays.

A Better Choice: Random Pivot
void RQuickSort(a, p, r)
{
    if (p < r)
    {
        if ((r-p) > 5)
        {
            int m = rand()%(r-p+1) + p;   // random index in p..r
            swap(a[p], a[m]);             // partition() will use a[p] as pivot
        }
        q = partition(a, p, r);
        RQuickSort(a, p, q);
        RQuickSort(a, q+1, r);
    }
}

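A self-contained C++ sketch of the randomized version follows. It departs from the pseudocode in one labeled way: instead of `rand() % (r-p+1) + p`, it draws the index from `std::uniform_int_distribution` with a `std::mt19937` generator passed in by the caller (a substitution assumed here so that runs can be seeded and reproduced). The names `hoarePartition` and `rQuickSort` are hypothetical, and indices are 0-based.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Partition a[p..r] around pivot = a[p]; returns the last index of the
// left part, following the partition scheme used in this section.
int hoarePartition(std::vector<int>& a, int p, int r) {
    assert(0 <= p && p < r && r < static_cast<int>(a.size()));
    const int pivot = a[p];
    int i = p - 1;
    int j = r + 1;
    while (true) {
        do { --j; } while (a[j] > pivot);
        do { ++i; } while (a[i] < pivot);
        if (i < j) std::swap(a[i], a[j]);
        else return j;
    }
}

// Quicksort with a randomly chosen pivot on sub-arrays of more than
// 6 elements; smaller sub-arrays keep the first element as pivot.
void rQuickSort(std::vector<int>& a, int p, int r, std::mt19937& rng) {
    if (p < r) {
        if (r - p > 5) {
            std::uniform_int_distribution<int> pick(p, r);  // inclusive range
            std::swap(a[p], a[pick(rng)]);   // random element becomes pivot
        }
        const int q = hoarePartition(a, p, r);
        rQuickSort(a, p, q, rng);
        rQuickSort(a, q + 1, r, rng);
    }
}
```

A typical call creates one generator, for example `std::mt19937 rng(12345);`, and then runs `rQuickSort(v, 0, static_cast<int>(v.size()) - 1, rng)`. Already-sorted input, which is the worst case for the first-element pivot, is then no worse than any other ordering in expectation.
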