FASTER SORTING using RECURSION : QUICKSORT 2014-T2 Lecture 16 School of Engineering and Computer Science, Victoria University of Wellington COMP 103 Marcus Frean
2 RECAP-TODAY
RECAP
  MergeSort cost: O(n log n)
TODAY
  QuickSort
ANNOUNCEMENTS
  Test on Thursday – 4pm
    Room      FROM      TO
    MCLT103   Abubakr   Geuze
    HMLT002   Gibb      Jones
    HMLT205   Julian    Pappafloratos
    KKLT303   Patel     Zhu
  sorting algorithms put to music: (quicksort is at 38 sec)
3 MergeSort Cost (analysis order)
[Diagram: recursion levels 1 to 4; each level consists of two mergesorts followed by a merge]
4 Divide and Conquer Sorts
To sort:
  Split
  Sort each part (recursive)
  Combine
Where does the work happen?
  MergeSort: split is trivial, combine does all the work
  QuickSort: split does all the work, combine is trivial
[Diagram: Array -> Split -> SubArrays -> Sort (each SubArray is itself sorted by Split, Sort, Combine) -> SortedSubArrays -> Combine -> Sorted Array]
5 QuickSort
Uses Divide and Conquer, but does its work in the “split” step:
  split the array into parts by choosing a “pivot” item, and making sure that
    all items < pivot are in the left part
    all items > pivot are in the right part
    (items equal to the pivot may end up in either part)
  then (recursively) sort each part
  note: it won’t usually be an equal split
Here's how we start it off:
    public static <E> void quickSort(E[] data, int size, Comparator<E> comp) {
        quickSort(data, 0, size, comp);
    }
The “wrapper” version starts it off; the recursive version carries it on (and on...)
6 QuickSort – the recursion
    public static <E> void quickSort(E[] data, int low, int high, Comparator<E> comp) {
        if (high - low < 2)         // zero or one item: nothing to sort
            return;
        if (high - low < 4)         // only two or three items to sort
            sort3(data, low, high, comp);
        else {                      // split into two parts, mid = index of boundary
            int mid = partition(data, low, high, comp);
            quickSort(data, low, mid, comp);
            quickSort(data, mid, high, comp);
        }
    }
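The recursion calls sort3 for ranges of two or three items, but the slides do not show it. Here is a minimal sketch of what such a helper might look like. The names sort3 and swap come from the slides; the bodies and the class name SmallSortSketch are assumptions, not the lecture's own code.

```java
import java.util.Comparator;

class SmallSortSketch {
    /** Assumed helper: swap the items at positions i and j. */
    static <E> void swap(E[] data, int i, int j) {
        E tmp = data[i];
        data[i] = data[j];
        data[j] = tmp;
    }

    /** Assumed helper: sort the 2 or 3 items in data[low..high-1]. */
    static <E> void sort3(E[] data, int low, int high, Comparator<E> comp) {
        // put the first two items in order
        if (comp.compare(data[low], data[low + 1]) > 0) swap(data, low, low + 1);
        if (high - low == 3) {
            // move the largest of the three to the end, then fix the first pair again
            if (comp.compare(data[low + 1], data[low + 2]) > 0) swap(data, low + 1, low + 2);
            if (comp.compare(data[low], data[low + 1]) > 0) swap(data, low, low + 1);
        }
    }
}
```

With this version a range of three items costs at most three comparisons, so the small cases stay cheap.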
7 QuickSort: simplest version
1. Choose a pivot
2. Use the pivot to partition the array
[Diagram: the array with the pivot marked, then the partitioned array with each part labelled “not yet sorted”]
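One way to read the "simplest version" is a quicksort that partitions into new lists rather than rearranging the array in place. The sketch below rests on that assumption. The class name SimpleQuickSortSketch and the use of three lists (smaller, equal, larger) are illustrative choices, not taken from the slides.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SimpleQuickSortSketch {
    /** Returns a new sorted list; the input list is left unchanged. */
    static <E> List<E> quickSort(List<E> items, Comparator<E> comp) {
        if (items.size() < 2) return new ArrayList<>(items);
        E pivot = items.get(0);              // simple pivot choice, for illustration only
        List<E> smaller = new ArrayList<>();
        List<E> equal = new ArrayList<>();
        List<E> larger = new ArrayList<>();
        for (E item : items) {               // the "split" step does all the work
            int c = comp.compare(item, pivot);
            if (c < 0) smaller.add(item);
            else if (c > 0) larger.add(item);
            else equal.add(item);
        }
        List<E> result = quickSort(smaller, comp);   // sort each part recursively
        result.addAll(equal);                        // the "combine" step is trivial
        result.addAll(quickSort(larger, comp));
        return result;
    }
}
```

This version is easy to follow but needs extra memory for the lists, which is what the in-place version avoids.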
8 QuickSort: in-place version
1. Choose a pivot
2. Use the pivot to partition the array
[Diagram: the array with markers low, high and left; the final value of left gets returned; each part is labelled “not yet sorted”]
9 QuickSort: coding in-place partition
    /** Partition into small items (low..mid-1) and large items (mid..high-1) */
    private static <E> int partition(E[] data, int low, int high, Comparator<E> comp) {
        // E pivot = data[low];     // simple but poor choice!
        E pivot = median(data[low], data[high-1], data[(low+high)/2], comp);
        int left = low - 1;
        int right = high;
        while (left <= right) {
            do {
                left++;             // just skip over items on the left < pivot
            } while (left < high && comp.compare(data[left], pivot) < 0);
            do {
                right--;            // just skip over items on the right > pivot
            } while (right >= low && comp.compare(data[right], pivot) > 0);
            if (left < right)
                swap(data, left, right);
        }
        return left;
    }
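The partition code relies on two helpers, median and swap, that the slides do not show. The sketch below supplies plausible versions of both and runs the partition step once on the 16-item array used in the lecture's trace. The helper bodies, the class name PartitionSketch and the byLetter comparator (which compares only the first character, so q1 and q2 count as equal) are assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;

class PartitionSketch {
    /** Assumed helper: swap the items at positions i and j. */
    static <E> void swap(E[] data, int i, int j) {
        E tmp = data[i];
        data[i] = data[j];
        data[j] = tmp;
    }

    /** Assumed helper: the middle one of three values under comp. */
    static <E> E median(E a, E b, E c, Comparator<E> comp) {
        if (comp.compare(a, b) > 0) { E t = a; a = b; b = t; }  // now a <= b
        if (comp.compare(c, a) < 0) return a;                   // c < a <= b
        if (comp.compare(c, b) > 0) return b;                   // a <= b < c
        return c;                                               // a <= c <= b
    }

    /** The partition step from the slide, using the median-of-three pivot. */
    static <E> int partition(E[] data, int low, int high, Comparator<E> comp) {
        E pivot = median(data[low], data[high - 1], data[(low + high) / 2], comp);
        int left = low - 1;
        int right = high;
        while (left <= right) {
            do { left++; } while (left < high && comp.compare(data[left], pivot) < 0);
            do { right--; } while (right >= low && comp.compare(data[right], pivot) > 0);
            if (left < right) swap(data, left, right);
        }
        return left;
    }

    public static void main(String[] args) {
        String[] data = "p a1 r f e q2 w q1 t z2 x c v b z1 a2".split(" ");
        Comparator<String> byLetter = Comparator.comparing(s -> s.charAt(0));
        int mid = partition(data, 0, data.length, byLetter);
        System.out.println(mid + " " + Arrays.toString(data));
    }
}
```

With these helpers the pivot for the whole array is p (the median of p, a2 and t), and the call returns 6 with the array rearranged as in the "p ->6" row of the trace.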
10 QuickSort
data array : [p  a1 r  f  e  q2 w  q1 t  z2 x  c  v  b  z1 a2 ]
indexes    : [0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 ]
do         : [p  a1 r  f  e  q2 w  q1 t  z2 x  c  v  b  z1 a2 ]
p ->6      : [a2 a1 b  f  e  c  w  q1 t  z2 x  q2 v  r  z1 p  ]
do 0..6    : [a2 a1 b  f  e  c                                ]
c ->4      : [a2 a1 b  c  e  f                                ]
do 0..4    : [a2 a1 b  c                                      ]
b ->3      : [a2 a1 b  c                                      ]
do 0..3    : [a2 a1 b                                         ]
done 0..3  : [a2 a1 b                                         ]
do 3..4    : [         c                                      ]
done 0..4  : [a2 a1 b  c                                      ]
do 4..6    : [            e  f                                ]
done 4..6  : [            e  f                                ]
done 0..6  : [a2 a1 b  c  e  f                                ]
do         : [                  w  q1 t  z2 x  q2 v  r  z1 p  ]
q2 ->8     : [                  p  q2 t  z2 x  q1 v  r  z1 w  ]
do 6..8    : [                  p  q2                         ]
done 6..8  : [                  p  q2                         ]
11 QuickSort
do         : [                  w  q1 t  z2 x  q2 v  r  z1 p  ]
           : [                  p  q2 t  z2 x  q1 v  r  z1 w  ]
do 6..8    : [                  p  q2                         ]
done 6..8  : [                  p  q2                         ]
do         : [                        t  z2 x  q1 v  r  z1 w  ]
->12       : [                        t  r  v  q1 x  z2 z1 w  ]
do         : [                        t  r  v  q1             ]
->10       : [                        q1 r  v  t              ]
do         : [                        q1 r                    ]
done       : [                        q1 r                    ]
do         : [                              v  t              ]
done       : [                              t  v              ]
done       : [                        q1 r  t  v              ]
do         : [                                    x  z2 z1 w  ]
->13       : [                                    w  z2 z1 x  ]
do         : [                                    w           ]
do         : [                                       z2 z1 x  ]
done       : [                                       x  z2 z1 ]
done       : [                                    w  x  z2 z1 ]
done       : [                        q1 r  t  v  w  x  z2 z1 ]
done       : [                  p  q2 q1 r  t  v  w  x  z2 z1 ]
done       : [a2 a1 b  c  e  f  p  q2 q1 r  t  v  w  x  z2 z1 ]
12 QuickSort
    public static <E> void quickSort(E[] data, int low, int high, Comparator<E> comp) {
        // note: unlike the version on slide 6, ranges of one or two items are left untouched here
        if (high > low + 2) {
            int mid = partition(data, low, high, comp);
            quickSort(data, low, mid, comp);
            quickSort(data, mid, high, comp);
        }
    }
Cost of Quick Sort: three steps
  partition: has to compare (high-low) pairs
  first recursive call
  second recursive call
13 QuickSort Cost
If Quicksort divides the array exactly in half, then:
    C(n) = n + 2 C(n/2)
         = n log(n) comparisons = O(n log(n))    (best case)
If Quicksort divides the array into 1 and n-1:
    C(n) = n + (n-1) + (n-2) + (n-3) + …
         ≈ n^2/2 comparisons = O(n^2)            (worst case)
Average case? Very hard to analyse; still O(n log(n)), and very good.
Quicksort is “in place”, MergeSort is not.
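The two recurrences can be evaluated directly to see how far apart the best and worst cases are. The sketch below does this under two assumptions that the slide does not state: a range of fewer than two items costs nothing, and n is a power of two in the best case so that the halves are exact. The class and method names are illustrative.

```java
class QuickSortCostSketch {
    /** Best case: every partition splits exactly in half, C(n) = n + 2 C(n/2). */
    static long bestCase(long n) {
        return n < 2 ? 0 : n + 2 * bestCase(n / 2);
    }

    /** Worst case: every partition splits into 1 and n-1, C(n) = n + C(n-1). */
    static long worstCase(long n) {
        long total = 0;
        for (long k = n; k >= 2; k--) total += k;   // n + (n-1) + ... + 2
        return total;
    }

    public static void main(String[] args) {
        for (long n = 16; n <= 1024; n *= 4) {
            System.out.println(n + ": best " + bestCase(n) + ", worst " + worstCase(n));
        }
    }
}
```

For n = 1024 the best case comes to 10240 comparisons (1024 × 10), while the worst case comes to 524799, roughly n^2/2.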
14 Stable or Unstable? Faster if almost-sorted?
MergeSort:
  Stable: doesn’t jump any item over an unsorted region
    ⇒ two equal items preserve their order
  Same cost on all input
  “natural merge” variant doesn’t sort already sorted regions
    ⇒ will be very fast: O(n) on almost sorted lists
QuickSort:
  Unstable: partition “jumps” items to the other end
    ⇒ two equal items likely to reverse their order
  Cost depends on choice of pivot
    simplest choice is very slow: O(n^2) even on almost sorted lists
    better choice (median of three) ⇒ O(n log(n)) on almost sorted lists
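Stability can be checked on the array from the QuickSort trace, where the digits mark the original order of equal letters. The final row of that trace starts with a2 a1, so QuickSort reversed the two a items. For comparison, the sketch below sorts the same data with Java's Arrays.sort for object arrays, which is documented to be a stable merge sort. The class name and the first-letter comparator are assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;

class StabilitySketch {
    /** Sorts a copy by first letter only, so a1 and a2 compare as equal. */
    static String[] stableSort(String[] items) {
        String[] copy = items.clone();
        Arrays.sort(copy, Comparator.comparing((String s) -> s.charAt(0)));
        return copy;
    }

    public static void main(String[] args) {
        String[] data = "p a1 r f e q2 w q1 t z2 x c v b z1 a2".split(" ");
        System.out.println(Arrays.toString(stableSort(data)));
    }
}
```

The stable sort keeps a1 before a2, q2 before q1 and z2 before z1, which is exactly their order in the input.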
15 Some Big-O costs revisited
Implementing Collections:
  ArrayList:       O(n) to add/remove, except at end
  Stack:           O(1)
  ArraySet:        O(n) (cost of searching)
  SortedArraySet:  O(log(n)) to search (with binary search)
                   O(n) to add/remove (cost of moving up/down)
                   O(n^2) to add n items
                   O(n log(n)) to initialise with n items (with fast sorting)
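The SortedArraySet costs follow from how a sorted array is searched and updated. The sketch below is a hypothetical, String-only version: the class SortedArraySetSketch and its methods are illustrative, not the course's actual SortedArraySet. It uses the standard library's Arrays.binarySearch for the O(log(n)) search and System.arraycopy for the O(n) shift on insertion.

```java
import java.util.Arrays;

class SortedArraySetSketch {
    private String[] data = new String[8];
    private int count = 0;

    /** O(log(n)): binary search over the sorted prefix data[0..count-1]. */
    boolean contains(String item) {
        return Arrays.binarySearch(data, 0, count, item) >= 0;
    }

    /** O(n): finding the position is fast, but later items must move up by one. */
    boolean add(String item) {
        int pos = Arrays.binarySearch(data, 0, count, item);
        if (pos >= 0) return false;                  // already present
        int insertAt = -pos - 1;                     // binarySearch encodes the insertion point
        if (count == data.length) data = Arrays.copyOf(data, 2 * count);
        System.arraycopy(data, insertAt, data, insertAt + 1, count - insertAt);
        data[insertAt] = item;
        count++;
        return true;
    }

    int size() {
        return count;
    }
}
```

Adding n items one at a time repeats the O(n) shift n times, which gives the O(n^2) total. Filling the array first and sorting it once with a fast sort gives the O(n log(n)) initialisation.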
16 the test this Thursday
20% of final grade, and takes 45 mins
    Room      FROM      TO
    MCLT103   Abubakr   Geuze
    HMLT002   Gibb      Jones
    HMLT205   Julian    Pappafloratos
    KKLT303   Patel     Zhu
Please be on time, and bring your ID card.
“Exam conditions”:
  sit in alternate rows, so someone can walk right along every 2nd row
  sit in alternate seats
  put your ID card where it can be seen
  (but no need to leave bags at the front as happens in final exams)
What’s in?
  everything up to now
  no complex coding questions on recursion or the fast sorts
  old tests are a guide for what to expect