1 Sorting in Linear Time How can we do better? CountingSort RadixSort BucketSort
2 Counting Sort No comparisons between elements, but it depends on assumptions about the values being sorted: each of the n input elements is an integer in the range 0 to k. When k = O(n), the sort runs in Θ(n) time. The algorithm: Input: A[1..n], where A[j] ∈ {0, 1, …, k}. Output: B[1..n], sorted. Notice: elements are not sorted in place. Also: C[0..k] for auxiliary storage.
3 Counting Sort (cont)
Counting-Sort(A, B, k)
1.  for i ← 0 to k
2.      C[i] ← 0
3.  for j ← 1 to Length(A)
4.      C[A[j]] ← C[A[j]] + 1
5.  // C[i] now contains the number of elements = i
6.  for i ← 1 to k
7.      C[i] ← C[i] + C[i-1]
8.  // C[i] now contains the number of elements ≤ i
9.  for j ← Length(A) downto 1
10.     B[C[A[j]]] ← A[j]
11.     C[A[j]] ← C[A[j]] - 1
Execution counts for lines 1, 2, 3, 4, 6, 7, 9, 10: k+2, k+1, n+1, n, k+1, k, n+1, n
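The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the slides' own code: it uses 0-based lists, so the count is decremented before placing each element, and the function name counting_sort and the sample input are assumptions made for the example.

```python
def counting_sort(a, k):
    """Return a new sorted list; every element of a must be an int in 0..k."""
    count = [0] * (k + 1)
    for x in a:                    # count[v] = number of elements equal to v
        count[x] += 1
    for v in range(1, k + 1):      # count[v] = number of elements <= v
        count[v] += count[v - 1]
    out = [None] * len(a)
    for x in reversed(a):          # right-to-left scan keeps equal keys in input order
        count[x] -= 1              # lists are 0-based, so decrement before placing
        out[count[x]] = x
    return out


print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))  # [0, 0, 2, 2, 3, 3, 3, 5]
```

The three loops run k, n, and n times respectively, which matches the execution counts listed for the pseudocode.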
5 Analysis of CountingSort: The total running time is Θ(n + k), which is Θ(n) when k = O(n). The Ω(n lg n) lower bound does not apply because CountingSort isn’t a comparison sort. CountingSort is stable.
6 Radix Sort This sort was originally used to sort computer punch-card decks. It is currently used for multi-key sorts, for example year/month/day. Consider each digit of the number as a separate key.
7 Radix Sort (cont) Idea 1: Sort on the most significant digit n, then on digit n-1, etc. Problem: old sorters sort into 10 bins, but each subsequent recursive sort requires all 10 bins, so the operator must set aside the other 9 piles. Idea 2: Sort on the least significant digit, then on the next least significant digit, etc.
8 Radix Sort (cont)
Radix-Sort(A, d)
for i ← 1 to d    // digit 1 is the least significant
    use a stable sort to sort array A on digit i
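As a rough sketch of Radix-Sort, the loop below uses Python's built-in sorted() as the stable per-digit sort (its stability is guaranteed by the language). The function name radix_sort, the base parameter, and the sample numbers are assumptions chosen for illustration.

```python
def radix_sort(a, d, base=10):
    """Sort non-negative ints that have at most d digits in the given base."""
    for i in range(d):  # i = 0 is the least significant digit
        # sorted() is stable, so ties on digit i keep the order of earlier passes
        a = sorted(a, key=lambda x: (x // base ** i) % base)
        print(f"after pass on digit {i + 1}: {a}")
    return a


radix_sort([329, 457, 657, 839, 436, 720, 355], 3)
# after pass on digit 1: [720, 355, 436, 457, 657, 329, 839]
# after pass on digit 2: [720, 329, 436, 839, 355, 457, 657]
# after pass on digit 3: [329, 355, 436, 457, 657, 720, 839]
```

Swapping sorted() for a counting sort on the extracted digit gives the Θ(n + k) cost per pass.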
9 Radix Sort (cont) Radix-n is another way of indicating the sort used: it implies each digit can differentiate among n different symbols. In the previous case we assumed radix-10. This is where the name Radix Sort comes from.
10 Analysis of RadixSort If each digit is in the range 1 to k (or 0 to k-1), use CountingSort for each pass. Each pass over a digit is Θ(n + k). For d digits: Θ(d(n + k)). If d is a constant and k = O(n), T(n) = Θ(n).
11 Proof by induction that RadixSort works Base Case: d = 1. Since there is only one digit, sorting by that digit sorts the list. Inductive Step: Assume it holds for d-1 digits. Show that it works for d digits. A radix sort of d digits is the same as a radix sort of d-1 digits followed by a sort on digit d. By our inductive hypothesis, the sort on d-1 digits works, so the elements are in order according to their low-order d-1 digits.
12 Proof by induction that RadixSort works (cont) The sort on digit d will order the elements by their d-th digit. Consider two elements a and b, with d-th digits a_d and b_d respectively. If a_d < b_d, the sort will put a before b, which is correct, since a < b regardless of their low-order digits.
13 Proof by induction that RadixSort works (cont) If a_d > b_d, the sort will put a after b, which is correct, since a > b regardless of their low-order digits. If a_d = b_d, the sort will leave a and b in the same order they were in, because it is stable. But that order is already correct, since the correct order of a and b is determined by the low-order d-1 digits when their d-th digits are equal.
14 Radix Sort Example Show how n integers in the range 1 to n² can be sorted in Θ(n) time. Subtract 1 from each number, so they’re in the range 0 to n²-1; we’ll add the 1 back after they are sorted. Use a radix-n sort: each digit requires n symbols, and log_n n² = 2 digits are needed (d = 2, k = n), i.e., treat the numbers as 2-digit numbers in radix n. Each digit ranges from 0 to n-1.
15 Radix Sort Example (cont) Sort these 2-digit numbers with radix sort. There are 2 calls to counting sort, for Θ(2(n + n)) = Θ(n) time. The passes to subtract 1 and add 1 each take Θ(n) time. Hence, the total running time is Θ(n).
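The example can be sketched as two stable counting-sort passes in radix n, one on the low digit (v mod n) and one on the high digit (v div n). The function name sort_range_n_squared and the sample input are hypothetical, and the code assumes every input value really lies in 1..n².

```python
def sort_range_n_squared(a):
    """Sort n integers, each in 1..n*n, with two stable counting-sort passes."""
    n = len(a)
    if n < 2:
        return list(a)
    vals = [x - 1 for x in a]  # shift into 0..n*n-1
    # Low digit first, then high digit; each digit is in 0..n-1.
    for digit in (lambda v: v % n, lambda v: v // n):
        count = [0] * n
        for v in vals:                 # count occurrences of each digit value
            count[digit(v)] += 1
        for i in range(1, n):          # prefix sums: number of digits <= i
            count[i] += count[i - 1]
        out = [0] * n
        for v in reversed(vals):       # right-to-left scan keeps the pass stable
            count[digit(v)] -= 1
            out[count[digit(v)]] = v
        vals = out
    return [v + 1 for v in vals]       # add the 1 back


print(sort_range_n_squared([25, 1, 13, 7, 25]))  # [1, 7, 13, 25, 25]
```

Each pass does Θ(n + n) work, and the shift and unshift passes add Θ(n) each, in line with the total derived above.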
16 Bucket Sort Bucket Sort assumes that the input values are uniformly distributed over the range [0,1), i.e., 0 ≤ x < 1. Procedure: Divide the range [0,1) into n equal-sized subintervals (buckets) and distribute the inputs into them. Sort each bucket and concatenate the buckets. Expected running time: T(n) = Θ(n).
18 Bucket Sort (cont)
Bucket-Sort(A)                                        cost
1. n ← Length(A)                                      1
2. for i ← 1 to n                                     n+1
3.     insert A[i] into list B[⌊n·A[i]⌋]              n
4. for i ← 0 to n-1                                   n+1
5.     sort list B[i] with InsertionSort              n calls to InsertionSort
6. concatenate the lists B[0], B[1], …, B[n-1]        n
   together in order
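A minimal Python sketch of Bucket-Sort follows, assuming every input is a float x with 0 ≤ x < 1. The helper insertion_sort and the sample data are illustrative choices, not taken from the slides.

```python
def insertion_sort(lst):
    """Sort lst in place; quadratic in len(lst) in the worst case."""
    for j in range(1, len(lst)):
        key = lst[j]
        i = j - 1
        while i >= 0 and lst[i] > key:
            lst[i + 1] = lst[i]
            i -= 1
        lst[i + 1] = key


def bucket_sort(a):
    """Return a sorted copy of a; assumes 0 <= x < 1 for every x in a."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        buckets[int(n * x)].append(x)  # floor(n*x) is in 0..n-1 because x < 1
    result = []
    for b in buckets:                  # sort each bucket, then concatenate in order
        insertion_sort(b)
        result.extend(b)
    return result


data = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
print(bucket_sort(data))
# [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]
```

If the inputs cluster in a few buckets instead of being spread uniformly, the insertion sorts dominate and the running time degrades toward quadratic.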
19 Analysis of Bucket Sort All lines except Line 5 take Θ(n) time. What is the cost of the calls to InsertionSort? Let n_i be the random variable denoting the number of elements in bucket B[i]. We know that InsertionSort runs in O(n²) time on n elements, so sorting bucket B[i] takes O(n_i²) time.
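20 Analysis of Bucket Sort (cont) The running time of Bucket Sort is
\[ T(n) = \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2) \]
Taking expectations of both sides:
\[
\begin{aligned}
E[T(n)] &= E\Big[\Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)\Big] \\
        &= \Theta(n) + \sum_{i=0}^{n-1} E\big[O(n_i^2)\big] && \text{(by linearity of expectation)} \\
        &= \Theta(n) + \sum_{i=0}^{n-1} O\big(E[n_i^2]\big) && \text{(by equation C.21)}
\end{aligned}
\qquad \text{(EQ 8.1)}
\]
We claim that
\[ E[n_i^2] = 2 - \frac{1}{n} \qquad \text{(EQ 8.2)} \]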
21 Analysis of Bucket Sort (cont) To prove EQ 8.2, define indicator random variables X_ij = I{A[j] falls into bucket i} for i = 0, 1, …, n-1 and j = 1, 2, …, n. Thus
\[ n_i = \sum_{j=1}^{n} X_{ij} \]
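22 Analysis of Bucket Sort (cont)
\[
\begin{aligned}
E[n_i^2] &= E\Big[\Big(\sum_{j=1}^{n} X_{ij}\Big)^2\Big] \\
         &= E\Big[\sum_{j=1}^{n} \sum_{k=1}^{n} X_{ij} X_{ik}\Big] \\
         &= E\Big[\sum_{j=1}^{n} X_{ij}^2 + \sum_{1 \le j \le n} \; \sum_{\substack{1 \le k \le n \\ k \ne j}} X_{ij} X_{ik}\Big] \\
         &= \sum_{j=1}^{n} E[X_{ij}^2] + \sum_{1 \le j \le n} \; \sum_{\substack{1 \le k \le n \\ k \ne j}} E[X_{ij} X_{ik}]
\end{aligned}
\qquad \text{(EQ 8.3)}
\]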
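23 Analysis of Bucket Sort (cont) Evaluate the two summations separately. Indicator random variable X_ij is 1 with probability 1/n and 0 otherwise, so by Lemma 5.1 (E[X_A] = Pr{A}):
\[ E[X_{ij}^2] = 1^2 \cdot \frac{1}{n} + 0^2 \cdot \Big(1 - \frac{1}{n}\Big) = \frac{1}{n} \]
When k ≠ j, the variables X_ij and X_ik are independent, so
\[ E[X_{ij} X_{ik}] = E[X_{ij}] \, E[X_{ik}] = \frac{1}{n} \cdot \frac{1}{n} = \frac{1}{n^2} \]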
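24 Analysis of Bucket Sort (cont) Substitute these into EQ 8.3:
\[
\begin{aligned}
E[n_i^2] &= \sum_{j=1}^{n} \frac{1}{n} + \sum_{1 \le j \le n} \; \sum_{\substack{1 \le k \le n \\ k \ne j}} \frac{1}{n^2} \\
         &= n \cdot \frac{1}{n} + n(n-1) \cdot \frac{1}{n^2} \\
         &= 1 + \frac{n-1}{n} \\
         &= 2 - \frac{1}{n}
\end{aligned}
\]
Which proves EQ 8.2.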
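EQ 8.2 can also be checked numerically. The sketch below assumes the n inputs are independent and uniform, so that the bucket size n_i follows a Binomial(n, 1/n) distribution, and computes E[n_i²] exactly with rational arithmetic. The function name expected_bucket_size_squared is hypothetical.

```python
from fractions import Fraction
from math import comb


def expected_bucket_size_squared(n):
    """Exact E[n_i^2] when n_i ~ Binomial(n, 1/n) (assumed input model)."""
    p = Fraction(1, n)
    # Sum k^2 * Pr{n_i = k} over all possible bucket sizes k = 0..n.
    return sum(k * k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1))


for n in (1, 2, 5, 10, 50):
    assert expected_bucket_size_squared(n) == 2 - Fraction(1, n)
print(expected_bucket_size_squared(10))  # 19/10
```

Because the check uses Fraction rather than floats, the equality with 2 - 1/n is exact for each tested n.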
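25 Analysis of Bucket Sort (cont) Use the expected value from EQ 8.2 in EQ 8.1:
\[ E[T(n)] = \Theta(n) + \sum_{i=0}^{n-1} O\Big(2 - \frac{1}{n}\Big) = \Theta(n) + n \cdot O\Big(2 - \frac{1}{n}\Big) = \Theta(n) \]
Thus, when input is drawn from a uniform distribution, BucketSort runs in linear expected time.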