Sorting Algorithms 2
Quicksort
General Quicksort Algorithm:
Select an element from the array to be the pivot
Rearrange the elements of the array into a left and right subarray
  All values in the left subarray are < pivot
  All values in the right subarray are > pivot
Independently sort the subarrays
No merging required, as left and right are independent problems [ Parallelism?!? ]
Quicksort
void quicksort(int* arrayOfInts, int first, int last)
{
    int pivot;
    if (first < last)
    {
        pivot = partition(arrayOfInts, first, last);
        quicksort(arrayOfInts, first, pivot - 1);
        quicksort(arrayOfInts, pivot + 1, last);
    }
}
Quicksort
int partition(int* arrayOfInts, int first, int last)
{
    int temp;
    int p = first;                                  // set pivot = first index
    for (int k = first + 1; k <= last; k++)         // for every other index
    {
        if (arrayOfInts[k] <= arrayOfInts[first])   // if data is smaller
        {
            p = p + 1;                              // update final pivot location
            swap(arrayOfInts[k], arrayOfInts[p]);
        }
    }
    swap(arrayOfInts[p], arrayOfInts[first]);
    return p;
}
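For reference, a minimal self-contained driver (not from the slides) showing how the routines above could be invoked; the swap calls are taken to be std::swap, and the card values are made up for illustration:

#include <iostream>
#include <utility>   // std::swap

int partition(int* arrayOfInts, int first, int last)
{
    int p = first;                                  // pivot value is arrayOfInts[first]
    for (int k = first + 1; k <= last; k++)
        if (arrayOfInts[k] <= arrayOfInts[first])
            std::swap(arrayOfInts[k], arrayOfInts[++p]);
    std::swap(arrayOfInts[p], arrayOfInts[first]);  // move pivot into its final slot
    return p;
}

void quicksort(int* arrayOfInts, int first, int last)
{
    if (first < last)
    {
        int pivot = partition(arrayOfInts, first, last);
        quicksort(arrayOfInts, first, pivot - 1);
        quicksort(arrayOfInts, pivot + 1, last);
    }
}

int main()
{
    int cards[5] = {7, 9, 3, 5, 12};                // hypothetical sample values
    quicksort(cards, 0, 4);                         // sort indices 0..4 inclusive
    for (int c : cards) std::cout << c << ' ';      // prints: 3 5 7 9 12
    return 0;
}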
Partition Step Through
partition(cards, 0, 4), pivot value = cards[0]:
  P = 0, K = 1:  cards[1] < cards[0]?  No
  P = 0, K = 2:  cards[2] < cards[0]?  Yes  =>  P = 1;  temp = cards[2], cards[2] = cards[1], cards[1] = temp
  P = 1, K = 3:  cards[3] < cards[0]?  Yes  =>  P = 2;  temp = cards[3], cards[3] = cards[2], cards[2] = temp
  P = 2, K = 4:  cards[4] < cards[0]?  No
Finally: temp = cards[2], cards[2] = cards[first], cards[first] = temp
return p = 2
Complexity of Quicksort
Worst case is O(n²)
  What does the worst case correspond to?
  Already sorted or nearly sorted input
  Partitioning leaves heavily unbalanced subarrays
On average it is O(n log₂ n), and the average case is what occurs most of the time.
Complexity of Quicksort
Recurrence Relation: [Average Case]
T(n) = 2T(n/2) + O(n)
  2 subproblems, each ½ the size (if the pivot is good)
  Partition is O(n)
a = 2, b = 2, k = 1, so a = b^k  (2 = 2¹)
Master Theorem: O(n log₂ n)
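As a sanity check (not on the slides), the average-case recurrence can also be unrolled directly, with c standing for the constant hidden in the O(n) partition cost:

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     = ...
     = 2^(log₂ n) T(1) + cn log₂ n
     = n T(1) + cn log₂ n
     = O(n log₂ n)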
Complexity of Quicksort
Recurrence Relation: [Worst Case]
Partition separates the array into subproblems of size (n-1) and 1
Can't use the Master Theorem:
  b (the subproblem size ratio) keeps changing: (n-1)/n, (n-2)/(n-1), (n-3)/(n-2), ...
Note the sum of the partition work:
  n + (n-1) + (n-2) + (n-3) + ... + 1 = n(n+1)/2 = O(n²)
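The same conclusion from the worst-case recurrence, unrolled (a sketch, with c the constant hidden in the O(n) partition):

T(n) = T(n-1) + cn
     = T(n-2) + c(n-1) + cn
     = ...
     = c [n + (n-1) + (n-2) + ... + 2 + 1]
     = c · n(n+1)/2
     = O(n²)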
Complexity of Quicksort
Requires stack space to implement the recursion
Worst case: O(n) stack space
  If the pivot breaks the array into 1-element and (n-1)-element subarrays
Average case: O(log n)
  Pivot splits evenly
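One standard way to keep the stack at O(log n) even with bad pivots (not covered on the slides, shown only as a sketch) is to recurse on the smaller subarray and loop on the larger one; this assumes the partition() routine from the earlier slide:

// Sketch: quicksort variant that recurses only on the smaller partition,
// so the recursion depth stays O(log n) even when pivots are poor.
void quicksortSmallFirst(int* arrayOfInts, int first, int last)
{
    while (first < last)
    {
        int pivot = partition(arrayOfInts, first, last);
        if (pivot - first < last - pivot)                        // left side is smaller
        {
            quicksortSmallFirst(arrayOfInts, first, pivot - 1);  // recurse on the small side
            first = pivot + 1;                                   // loop on the large side
        }
        else                                                     // right side is smaller
        {
            quicksortSmallFirst(arrayOfInts, pivot + 1, last);
            last = pivot - 1;
        }
    }
}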
MergeSort
General Mergesort Algorithm:
Recursively split subarrays in half
Merge sorted subarrays
  Splitting comes first in the recursive call, so it continues until we have one-item subarrays
  One-item subarrays are by definition sorted
Merge recombines subarrays so the result is sorted
  1+1 item subarrays => 2 item subarrays
  2+2 item subarrays => 4 item subarrays
  Use the fact that the subarrays are sorted to simplify the merge algorithm
MergeSort
void mergesort(int* array, int* tempArray, int low, int high, int size)
{
    if (low < high)
    {
        int middle = (low + high) / 2;
        mergesort(array, tempArray, low, middle, size);
        mergesort(array, tempArray, middle + 1, high, size);
        merge(array, tempArray, low, middle, high, size);
    }
}
MergeSort
void merge(int* array, int* tempArray, int low, int middle, int high, int size)
{
    int i, j, k;
    for (i = low; i <= high; i++) { tempArray[i] = array[i]; }  // copy into temp array
    i = low;  j = middle + 1;  k = low;
    while ((i <= middle) && (j <= high))    // merge
    {
        if (tempArray[i] <= tempArray[j])   // if lhs item is smaller
            array[k++] = tempArray[i++];    // put it in the final array, advance final position and lhs index
        else
            array[k++] = tempArray[j++];    // else put rhs item in the final array, advance final position and rhs index
    }
    while (i <= middle)                     // one of the two will run out
        array[k++] = tempArray[i++];        // copy the rest of the lhs data
}                                           // only need to copy lhs leftovers; rhs data is already in the right place
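A minimal self-contained driver (not from the slides) showing how the caller supplies the temporary array; the unused size parameter is dropped here, and the initial data order is only a guess at the worked example that follows:

#include <iostream>

// Condensed copies of the merge/mergesort routines above, so the driver compiles on its own.
void merge(int* array, int* tempArray, int low, int middle, int high)
{
    for (int i = low; i <= high; i++) tempArray[i] = array[i];
    int i = low, j = middle + 1, k = low;
    while (i <= middle && j <= high)
    {
        if (tempArray[i] <= tempArray[j]) array[k++] = tempArray[i++];
        else                              array[k++] = tempArray[j++];
    }
    while (i <= middle) array[k++] = tempArray[i++];   // rhs leftovers are already in place
}

void mergesort(int* array, int* tempArray, int low, int high)
{
    if (low < high)
    {
        int middle = (low + high) / 2;
        mergesort(array, tempArray, low, middle);
        mergesort(array, tempArray, middle + 1, high);
        merge(array, tempArray, low, middle, high);
    }
}

int main()
{
    int data[5] = {18, 3, 20, 9, 5};   // values from the worked example; initial order is a guess
    int temp[5];                       // caller supplies the O(n) temporary array
    mergesort(data, temp, 0, 4);
    for (int d : data) std::cout << d << ' ';   // prints: 3 5 9 18 20
    return 0;
}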
MergeSort Example Recursively Split
MergeSort Example Recursively Split
MergeSort Example Merge
MergeSort Example
Merging two one-card subarrays: compare Temp[i] and Temp[j].
Temp[i] < Temp[j]?  Yes, so Temp[i] is copied into the array at position k.
With only 2 cards this is not very interesting; think of it as a swap.
MergeSort Example
Temp[i] < Temp[j]?  No, so Temp[j] is copied into the array at position k.
Update j and k by 1 => hit the limit of the inner while loop, as j > high.
Now copy the remaining lhs items until i > middle.
MergeSort Example
The 2-card swaps above produce the sorted halves; the final merge then combines lhs [3, 18, 20] and rhs [5, 9]:
  i=0, j=3: take 3    i=1, j=3: take 5    i=1, j=4: take 9
  i=1, j=5: j > high, so copy the rest of the lhs: 18 (i=2), then 20 (i=3)
Final result: 3 5 9 18 20
Complexity of MergeSort
Recurrence relation: T(n) = 2T(n/2) + O(n)
  2 subproblems, each ½ the size
  Merging is O(n) for any subproblem (always moving forward in the array)
a = 2, b = 2, k = 1, so a = b^k  (2 = 2¹)
Master Theorem: O(n log₂ n)
Always O(n log₂ n), in both the average and worst case
  Doesn't rely on the quality of a pivot choice
Space Complexity of Mergesort
Needs an additional O(n) temporary array
Depth of recursion (extra stack space): always O(log₂ n)
Tradeoffs
When is it more useful to:
  Just search?
  Quicksort or Mergesort first, then search?
Assume Z searches over n items:
  Linear search on random (unsorted) data: Z * O(n)
  Fast sort plus binary search: O(n log₂ n) + Z * log₂ n
Tradeoffs
Sorting pays off when:
  Z * n <= n log₂ n + Z log₂ n
  Z (n - log₂ n) <= n log₂ n
  Z <= (n log₂ n) / (n - log₂ n)
  Z <= (n log₂ n) / n            [approximation, since log₂ n << n]
  Z <= log₂ n                    [approximation]
Whereas before (against an O(n²) sort) it took about N searches to make up the cost of sorting, now only about log₂ N are needed.
1,000,000 items: about 20 searches (log₂ 1,000,000 ≈ 19.9) instead of 1,000,000.
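A quick numerical check of this break-even point (a sketch, not from the slides), comparing the exact expression with the log₂ n approximation:

#include <cmath>
#include <initializer_list>
#include <iostream>

int main()
{
    // Break-even number of searches Z, beyond which sorting first wins:
    // exact form Z = n*log2(n) / (n - log2(n)), approximated by log2(n).
    for (double n : {1e3, 1e6, 1e9})
    {
        double lg = std::log2(n);
        double exact = n * lg / (n - lg);
        std::cout << "n = " << n << "  break-even Z ~ " << exact
                  << "  (approximation log2 n = " << lg << ")\n";
    }
    return 0;
}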
How Fast?
Without more specific knowledge of what is being sorted, O(n log₂ n) is the fastest sort possible.
Only available operations: Compare, Swap
Proof: a decision tree, which describes how the sort operates
  Every vertex represents a comparison, every branch a result
  Moving down the tree traces one possible run through the algorithm
How Fast?
Decision tree for sorting three keys K1, K2, K3:
  Root: K1 <= K2?
  Yes branch: K2 <= K3?  Yes -> stop [1,2,3];  No -> K1 <= K3?  Yes -> [1,3,2], No -> [3,1,2]
  No branch:  K1 <= K3?  Yes -> stop [2,1,3];  No -> K2 <= K3?  Yes -> [2,3,1], No -> [3,2,1]
How Fast?
There are n! possible "stop" nodes: effectively all permutations of the n numbers in the array.
Thus any decision tree representing a sorting algorithm must have n! leaves.
The height of this type of tree (a binary tree) determines the number of leaves:
  A tree of height k has at most 2^(k-1) leaves
  So the height must be at least log₂(n!) + 1
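As a quick check against the three-key tree above (a worked example, not on the slides): n = 3 gives 3! = 6 leaves, so the height must be at least log₂ 6 + 1 ≈ 3.6, i.e. 4 levels. The deepest leaves sit below 3 comparisons, which is exactly the worst case of the tree shown.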
How Fast?
A path from the top to the bottom of the tree is a trace of one run of the algorithm.
Need to show that log₂(n!) is lower bounded by n log₂ n:
  n! = (n)(n-1)(n-2)(n-3) ... (3)(2)(1)
     > (n)(n-1)(n-2)(n-3) ... ceil(n/2)        // doing fewer multiplies
     > ceil(n/2)^ceil(n/2)                     // each remaining factor is at least ceil(n/2)
     > approximately (n/2)^(n/2)
  log₂ n! > log₂ (n/2)^(n/2)
  log₂ n! > (n/2) log₂ (n/2)                   // an exponent inside a log comes out front as a multiplier
  log₂ n! > (n/2)(log₂ n - log₂ 2)             // division inside a log becomes subtraction
  log₂ n! > (n/2)(log₂ n - 1)
  log₂ n! > (n/2)(log₂ n) - (n/2)
  log₂ n! > (1/2) [n log₂ n - n]
So log₂ n! grows on the order of n log₂ n, and no comparison-based sort can beat O(n log₂ n).
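For reference (not on the slides), Stirling's approximation gives the same bound directly:

  log₂ n! = n log₂ n - n log₂ e + O(log₂ n)
          ≈ n log₂ n - 1.44 n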