Sorting CSE 373 Data Structures.

Slides:



Advertisements
Similar presentations
Quick Sort, Shell Sort, Counting Sort, Radix Sort AND Bucket Sort
Advertisements

DIVIDE AND CONQUER APPROACH. General Method Works on the approach of dividing a given problem into smaller sub problems (ideally of same size).  Divide.
Spring 2015 Lecture 5: QuickSort & Selection
Quicksort CS 3358 Data Structures. Sorting II/ Slide 2 Introduction Fastest known sorting algorithm in practice * Average case: O(N log N) * Worst case:
25 May Quick Sort (11.2) CSE 2011 Winter 2011.
1 Today’s Material Divide & Conquer (Recursive) Sorting Algorithms –QuickSort External Sorting.
Quicksort. 2 Introduction * Fastest known sorting algorithm in practice * Average case: O(N log N) * Worst case: O(N 2 ) n But, the worst case seldom.
Quicksort.
Divide and Conquer Sorting
CSE 326: Data Structures Sorting Ben Lerner Summer 2007.
Chapter 7 (Part 2) Sorting Algorithms Merge Sort.
Sorting Chapter 6 Chapter 6 –Insertion Sort 6.1 –Quicksort 6.2 Chapter 5 Chapter 5 –Mergesort 5.2 –Stable Sorts Divide & Conquer.
CSE 373 Data Structures Lecture 19
Sorting (Part II: Divide and Conquer) CSE 373 Data Structures Lecture 14.
CSE 373 Data Structures Lecture 15
1 Data Structures and Algorithms Sorting. 2  Sorting is the process of arranging a list of items into a particular order  There must be some value on.
1 Time Analysis Analyzing an algorithm = estimating the resources it requires. Time How long will it take to execute? Impossible to find exact value Depends.
Sorting HKOI Training Team (Advanced)
Computer Science 101 Fast Searching and Sorting. Improving Efficiency We got a better best case by tweaking the selection sort and the bubble sort We.
HKOI 2006 Intermediate Training Searching and Sorting 1/4/2006.
1 Lecture 16: Lists and vectors Binary search, Sorting.
Merge Sort. What Is Sorting? To arrange a collection of items in some specified order. Numerical order Lexicographical order Input: sequence of numbers.
1 Today’s Material Iterative Sorting Algorithms –Sorting - Definitions –Bubble Sort –Selection Sort –Insertion Sort.
1 Sorting Algorithms Sections 7.1 to Comparison-Based Sorting Input – 2,3,1,15,11,23,1 Output – 1,1,2,3,11,15,23 Class ‘Animals’ – Sort Objects.
Sorting CSIT 402 Data Structures II. 2 Sorting (Ascending Order) Input ›an array A of data records ›a key value in each data record ›a comparison function.
Review 1 Selection Sort Selection Sort Algorithm Time Complexity Best case Average case Worst case Examples.
Computer Science 101 Fast Algorithms. What Is Really Fast? n O(log 2 n) O(n) O(n 2 )O(2 n )
Divide and Conquer Sorting Algorithms COMP s1 Sedgewick Chapters 7 and 8.
CSE 326: Data Structures Lecture 23 Spring Quarter 2001 Sorting, Part 1 David Kaplan
Today’s Material Sorting: Definitions Basic Sorting Algorithms
Algorithm Design Techniques, Greedy Method – Knapsack Problem, Job Sequencing, Divide and Conquer Method – Quick Sort, Finding Maximum and Minimum, Dynamic.
Advanced Sorting.
Advanced Sorting 7 2  9 4   2   4   7
Chapter 11 Sorting Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and Mount.
Sorting.
Sorting.
Chapter 7 Sorting Spring 14
Parallel Sorting Algorithms
Data Structures and Algorithms
Description Given a linear collection of items x1, x2, x3,….,xn
Algorithm Design and Analysis (ADA)
Sorting Chapter 13 Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved
Quick Sort (11.2) CSE 2011 Winter November 2018.
Unit-2 Divide and Conquer
Bin Sort, Radix Sort, Sparse Arrays, and Stack-based Depth-First Search CSE 373, Copyright S. Tanimoto, 2002 Bin Sort, Radix.
Data Structures Review Session
8/04/2009 Many thanks to David Sun for some of the included slides!
Topic: Divide and Conquer
Lecture No 6 Advance Analysis of Institute of Southern Punjab Multan
Parallel Sorting Algorithms
Sub-Quadratic Sorting Algorithms
CSE373: Data Structure & Algorithms Lecture 21: Comparison Sorting
Simple Sorting Methods: Bubble, Selection, Insertion, Shell
Bin Sort, Radix Sort, Sparse Arrays, and Stack-based Depth-First Search CSE 373, Copyright S. Tanimoto, 2001 Bin Sort, Radix.
CSE 326: Data Structures Sorting
EE 312 Software Design and Implementation I
CSE 332: Data Abstractions Sorting I
Quicksort.
CSE 373: Data Structures and Algorithms
Analysis of Algorithms
CSE 373 Data Structures and Algorithms
Algorithms: Design and Analysis
The Selection Problem.
Design and Analysis of Algorithms
CSE 332: Sorting II Spring 2016.
Linear Time Sorting.
Quicksort.
Advanced Sorting Methods: Shellsort
Presentation transcript:

Sorting CSE 373 Data Structures

Reading Reading Goodrich and Tamassia, Chapter 10 2/23/2019 CSE 373 AU 04 -- Sorting

Outline Definition of Sorting Time, Space, and Stability considerations Bubble sort – a simple but inefficient algorithm Insertion sort – O(n2) but usually good when n < 10. Merge Sort (uses Divide-and-Conquer strategy) Quicksort – O(n2) worst case, but efficient in practice. Bin Sort and Radix Sort 2/23/2019 CSE 373 AU 04 -- Sorting

Sorting Input an array A of data records (Note: we have seen how to sort when elements are in linked lists: Mergesort) a key value in each data record a comparison function which imposes a consistent ordering on the keys (e.g., integers) Output reorganize the elements of A such that For any i and j, if i < j then A[i]  A[j] 2/23/2019 CSE 373 AU 04 -- Sorting

Space How much space does the sorting algorithm require in order to sort the collection of items? Is copying needed? O(n) additional space In-place sorting – no copying – O(1) additional space Somewhere in between for “temporary”, e.g. O(logn) space External memory sorting – data so large that does not fit in memory 2/23/2019 CSE 373 AU 04 -- Sorting

Time How fast is the algorithm? The definition of a sorted array A says that for any i<j, A[i] < A[j] This means that you need to at least check on each element at the very minimum, I.e., at least O(N) And you could end up checking each element against every other element, which is O(N2) The big question is: How close to O(N) can you get? 2/23/2019 CSE 373 AU 04 -- Sorting

Stability Stability: Does it rearrange the order of input data records which have the same key value (duplicates)? E.g. Phone book sorted by name. Now sort by county – is the list still sorted by name within each county? Extremely important property for databases A stable sorting algorithm is one which does not rearrange the order of duplicate keys 2/23/2019 CSE 373 AU 04 -- Sorting

n2 n·log2n n Faster is better! log2n

Bubble Sort “Bubble” elements to to their proper place in the array by comparing elements i and i+1, and swapping if A[i] > A[i+1] Bubble every element towards its correct position last position has the largest element then bubble every element except the last one towards its correct position then repeat until done or until the end of the quarter, whichever comes first ... 2/23/2019 CSE 373 AU 04 -- Sorting

Bubblesort bubble(A[1..n]: integer array, n : integer): { i, j : integer; for i = 1 to n-1 do for j = 2 to n–i+1 do if A[j-1] > A[j] then SWAP(A[j-1],A[j]); } SWAP(a,b) : { t :integer;6 t:=a; a:=b; b:=t; 6 5 3 2 7 1 5 6 3 2 7 1 5 3 6 2 7 1 1 2 3 4 5 6 2/23/2019 CSE 373 AU 04 -- Sorting

Put the largest element in its place larger value? 2 3 8 8 1 2 3 8 7 9 10 12 23 18 15 16 17 14 swap 1 2 3 7 8 9 10 12 23 18 15 16 17 14 9 10 12 23 23 1 2 3 7 8 9 10 12 23 18 15 16 17 14 swap 1 2 3 7 8 9 10 12 18 23 15 16 17 14 swap 1 2 3 7 8 9 10 12 18 15 23 16 17 14 swap 1 2 3 7 8 9 10 12 18 15 16 23 17 14 swap 1 2 3 7 8 9 10 12 18 15 16 17 23 14 swap 1 2 3 7 8 9 10 12 18 15 16 17 14 23 2/23/2019 CSE 373 AU 04 -- Sorting

Put 2nd largest element in its place larger value? 2 3 7 8 9 10 12 18 18 1 2 3 7 8 9 10 12 18 15 16 17 14 23 swap 1 2 3 7 8 9 10 12 15 18 16 17 14 23 swap 1 2 3 7 8 9 10 12 15 16 18 17 14 23 swap 1 2 3 7 8 9 10 12 15 16 17 18 14 23 swap 1 2 3 7 8 9 10 12 15 16 17 14 18 23 Two elements done, only n-2 more to go ... 2/23/2019 CSE 373 AU 04 -- Sorting

Bubble Sort: Just Say No “Bubble” elements to to their proper place in the array by comparing elements i and i+1, and swapping if A[i] > A[i+1] We bubblize for i=1 to n (i.e, n times) Each bubblization is a loop that makes n-i comparisons This is O(n2) 2/23/2019 CSE 373 AU 04 -- Sorting

Insertion Sort What if first k elements of array are already sorted? 4, 7, 12, 5, 19, 16 We can shift the tail of the sorted elements list down and then insert next element into proper position and we get k+1 sorted elements 4, 5, 7, 12, 19, 16 2/23/2019 CSE 373 AU 04 -- Sorting

Insertion Sort Is Insertion sort in place? Running time = ? 2 1 4 i j InsertionSort(A[1..N]: integer array, N: integer) { i, j, temp: integer ; for i = 2 to N { temp := A[i]; j := i; while j > 1 and A[j-1] > temp { A[j] := A[j-1]; j := j–1;} A[j] = temp; } Is Insertion sort in place? Running time = ? 1 2 3 2 1 4 i j 2/23/2019 CSE 373 AU 04 -- Sorting

Example 1 2 3 8 7 9 10 12 23 18 15 16 17 14 1 2 3 7 8 9 10 12 23 18 15 16 17 14 1 2 3 7 8 9 10 12 18 23 15 16 17 14 1 2 3 7 8 9 10 12 18 15 23 16 17 14 1 2 3 7 8 9 10 12 15 18 23 16 17 14 1 2 3 7 8 9 10 12 15 18 16 23 17 14 1 2 3 7 8 9 10 12 15 16 18 23 17 14 2/23/2019 CSE 373 AU 04 -- Sorting

Example 1 2 3 7 8 9 10 12 15 16 18 17 23 14 1 2 3 7 8 9 10 12 15 16 17 18 23 14 1 2 3 7 8 9 10 12 15 16 17 18 14 23 1 2 3 7 8 9 10 12 15 16 17 14 18 23 1 2 3 7 8 9 10 12 15 16 14 17 18 23 1 2 3 7 8 9 10 12 15 14 16 17 18 23 1 2 3 7 8 9 10 12 14 15 16 17 18 23 2/23/2019 CSE 373 AU 04 -- Sorting

Insertion Sort Characteristics In place and Stable Running time Worst case is O(N2) reverse order input must copy every element every time Good sorting algorithm for almost sorted data Each item is close to where it belongs in sorted order. 2/23/2019 CSE 373 AU 04 -- Sorting

“Divide and Conquer” Very important strategy in computer science: Divide problem into smaller parts Independently solve the parts Combine these solutions to get overall solution Idea 1: Divide array into two halves, recursively sort left and right halves, then merge two halves  Mergesort Idea 2 : Partition array into items that are “small” and items that are “large”, then recursively sort the two sets  Quicksort 2/23/2019 CSE 373 AU 04 -- Sorting

Mergesort Divide it in two at the midpoint Conquer each side in turn (by recursively sorting) Merge two halves together 8 2 9 4 5 3 1 6 2/23/2019 CSE 373 AU 04 -- Sorting

Mergesort Example 8 2 9 4 5 3 1 6 Divide 8 2 9 4 5 3 1 6 Divide 8 2 8 2 9 4 5 3 1 6 Divide 8 2 9 4 5 3 1 6 Divide 1 element 8 2 9 4 5 3 1 6 Merge 2 8 4 9 3 5 1 6 Merge 2 4 8 9 1 3 5 6 Merge 1 2 3 4 5 6 8 9 2/23/2019 CSE 373 AU 04 -- Sorting

Auxiliary Array The merging requires an auxiliary array. 2 4 8 9 1 3 5 6 Auxiliary array 2/23/2019 CSE 373 AU 04 -- Sorting

Auxiliary Array The merging requires an auxiliary array. 2 4 8 9 1 3 5 6 Auxiliary array 1 2/23/2019 CSE 373 AU 04 -- Sorting

Auxiliary Array The merging requires an auxiliary array. 2 4 8 9 1 3 5 6 Auxiliary array 1 2 3 4 5 2/23/2019 CSE 373 AU 04 -- Sorting

Merging i j normal target Left completed first i j target copy 2/23/2019 CSE 373 AU 04 -- Sorting

Merging i j Right completed first target first second 2/23/2019 CSE 373 AU 04 -- Sorting

Merging Algorithm Merge(A[], T[] : integer array, left, right : integer) : { mid, i, j, k, l, target : integer; mid := (right + left)/2; i := left; j := mid + 1; target := left; while i < mid and j < right do if A[i] < A[j] then T[target] := A[i] ; i:= i + 1; else T[target] := A[j]; j := j + 1; target := target + 1; if i > mid then //left completed// for k := left to target-1 do A[k] := T[k]; if j > right then //right completed// k : = mid; l := right; while k > i do A[l] := A[k]; k := k-1; l := l-1; } 2/23/2019 CSE 373 AU 04 -- Sorting

Recursive Mergesort Mergesort(A[], T[] : integer array, left, right : integer) : { if left < right then mid := (left + right)/2; Mergesort(A,T,left,mid); Mergesort(A,T,mid+1,right); Merge(A,T,left,right); } MainMergesort(A[1..n]: integer array, n : integer) : { T[1..n]: integer array; Mergesort[A,T,1,n]; 2/23/2019 CSE 373 AU 04 -- Sorting

Iterative Mergesort uses 2 arrays; alternates between them Merge by 1 2/23/2019 CSE 373 AU 04 -- Sorting

Iterative Mergesort Need of a last copy Merge by 1 Merge by 2 2/23/2019 CSE 373 AU 04 -- Sorting

Iterative Mergesort How do you handle non-powers of 2? IterativeMergesort(A[1..n]: integer array, n : integer) : { //precondition: n is a power of 2// i, m, parity : integer; T[1..n]: integer array; m := 2; parity := 0; while m < n do for i = 1 to n – m + 1 by m do if parity = 0 then Merge(A,T,i,i+m-1); else Merge(T,A,i,i+m-1); parity := 1 – parity; m := 2*m; if parity = 1 then for i = 1 to n do A[i] := T[i]; } How do you handle non-powers of 2? How can the final copy be avoided? 2/23/2019 CSE 373 AU 04 -- Sorting

Mergesort Analysis Let T(N) be the running time for an array of N elements Mergesort divides array in half and calls itself on the two halves. After returning, it merges both halves using a temporary array Each recursive call takes T(N/2) and merging takes O(N) 2/23/2019 CSE 373 AU 04 -- Sorting

Mergesort Recurrence Relation The recurrence relation for T(N) is: T(1) < a base case: 1 element array  constant time T(N) < 2T(N/2) + bN Sorting N elements takes the time to sort the left half plus the time to sort the right half plus an O(N) time to merge the two halves T(N) = O(n log n) 2/23/2019 CSE 373 AU 04 -- Sorting

Properties of Mergesort Not in-place Requires an auxiliary array (O(n) extra space) Stable Make sure that left is sent to target on equal values. Iterative Mergesort reduces copying. 2/23/2019 CSE 373 AU 04 -- Sorting

Quicksort Quicksort uses a divide and conquer strategy, but does not require the O(N) extra space that MergeSort does Partition array into left and right sub-arrays Choose an element of the array, called pivot the elements in left sub-array are all less than pivot elements in right sub-array are all greater than pivot Recursively sort left and right sub-arrays Concatenate left and right sub-arrays in O(1) time 2/23/2019 CSE 373 AU 04 -- Sorting

“Four easy steps” To sort an array S 1. If the number of elements in S is 0 or 1, then return. The array is sorted. 2. Pick an element v in S. This is the pivot value. 3. Partition S-{v} into two disjoint subsets, S1 = {all values xv}, and S2 = {all values xv}. 4. Return QuickSort(S1), v, QuickSort(S2) 2/23/2019 CSE 373 AU 04 -- Sorting

The steps of QuickSort S S1 S2 S1 S2 S select pivot value partition S 81 31 57 43 13 75 92 65 26 S1 S2 partition S 31 75 43 13 65 81 92 26 57 QuickSort(S1) and QuickSort(S2) S1 S2 13 26 31 43 57 65 75 81 92 S 13 26 31 43 57 65 75 81 92 Voila! S is sorted [Weiss] 2/23/2019 CSE 373 AU 04 -- Sorting

Details, details Implementing the actual partitioning Picking the pivot want a value that will cause |S1| and |S2| to be non-zero, and close to equal in size if possible Dealing with cases where the element equals the pivot 2/23/2019 CSE 373 AU 04 -- Sorting

Quicksort Partitioning Need to partition the array into left and right sub-arrays the elements in left sub-array are  pivot elements in right sub-array are  pivot How do the elements get to the correct partition? Choose an element from the array as the pivot Make one pass through the rest of the array and swap as needed to put elements in partitions 2/23/2019 CSE 373 AU 04 -- Sorting

Partitioning:Choosing the pivot One implementation (there are others) median3 finds pivot and sorts left, center, right Median3 takes the median of leftmost, middle, and rightmost elements An alternative is to choose the pivot randomly (need a random number generator; “expensive”) Another alternative is to choose the first element (but can be very bad. Why?) Swap pivot with next to last element 2/23/2019 CSE 373 AU 04 -- Sorting

Partitioning in-place Set pointers i and j to start and end of array Increment i until you hit element A[i] > pivot Decrement j until you hit elmt A[j] < pivot Swap A[i] and A[j] Repeat until i and j cross Swap pivot (at A[N-2]) with A[i] 2/23/2019 CSE 373 AU 04 -- Sorting

Example Choose the pivot as the median of three 1 2 3 4 5 6 7 8 9 8 1 4 9 3 5 2 7 6 Median of 0, 6, 8 is 6. Pivot is 6 1 4 9 7 3 5 2 6 8 Place the largest at the right and the smallest at the left. Swap pivot with next to last element. i j

Example Move i to the right up to A[i] larger than pivot. j 1 4 9 7 3 5 2 6 8 i j 1 4 9 7 3 5 2 6 8 i j 1 4 9 7 3 5 2 6 8 i j 1 4 2 7 3 5 9 6 8 Move i to the right up to A[i] larger than pivot. Move j to the left up to A[j] smaller than pivot. Swap 2/23/2019 CSE 373 AU 04 -- Sorting

Example Cross-over i > j S1 < pivot pivot S2 > pivot 1 4 2 7 1 4 2 7 3 5 9 6 8 i j 1 4 2 7 3 5 9 6 6 8 i j 1 4 2 5 3 7 9 6 8 i j 1 4 2 5 3 7 9 6 8 j i 1 4 2 5 3 7 9 6 6 8 Cross-over i > j j i 1 4 2 5 3 6 9 7 8 S1 < pivot pivot S2 > pivot

Recursive Quicksort Don’t use quicksort for small arrays. Quicksort(A[]: integer array, left,right : integer): { pivotindex : integer; if left + CUTOFF  right then pivot := median3(A,left,right); pivotindex := Partition(A,left,right-1,pivot); Quicksort(A, left, pivotindex – 1); Quicksort(A, pivotindex + 1, right); else Insertionsort(A,left,right); } Don’t use quicksort for small arrays. CUTOFF = 10 is reasonable. 2/23/2019 CSE 373 AU 04 -- Sorting

Quicksort Best Case Performance Algorithm always chooses best pivot and splits sub-arrays in half at each recursion T(0) = T(1) = O(1) constant time if 0 or 1 element For N > 1, 2 recursive calls plus linear time for partitioning T(N) = 2T(N/2) + O(N) Same recurrence relation as Mergesort T(N) = O(N log N) 2/23/2019 CSE 373 AU 04 -- Sorting

Quicksort Worst Case Performance Algorithm always chooses the worst pivot – one sub-array is empty at each recursion T(N)  a for N  C T(N)  T(N-1) + bN  T(N-2) + b(N-1) + bN  T(C) + b(C+1)+ … + bN  a +b(C + (C+1) + (C+2) + … + N) T(N) = O(N2) Fortunately, average case performance is O(N log N) (see text for proof) 2/23/2019 CSE 373 AU 04 -- Sorting

Properties of Quicksort Not stable because of long distance swapping. No iterative version (without using a stack). Pure quicksort not good for small arrays. “In-place”, but uses auxiliary storage because of recursive call (O(logn) space). O(n log n) average case performance, but O(n2) worst case performance. 2/23/2019 CSE 373 AU 04 -- Sorting

Folklore “Quicksort is the best in-memory sorting algorithm.” Truth Quicksort uses very few comparisons on average. Quicksort does have good performance in the memory hierarchy. Small footprint Good locality 2/23/2019 CSE 373 AU 04 -- Sorting

Bin Sort (a.k.a. Bucket Sort) Assumption: The range of possible key values is {0, ..., N-1}, where N is not too much bigger than the number of data elements, n. (It could be smaller.) Example: Sorting a collection of 67 two-digit numbers. N = 100 n = 67 2/23/2019 CSE 373 AU 04 -- Sorting

Bin Sort: The Algorithm Create an array Head[0..N-1] of N headers of empty linked lists. Create another array (parallel to the first), Tail[0..N-1] of (initially null) links to the tails of the same lists. For each  key,data  record,  ki, di  insert it onto the end of the list at Head[ki], (which is Tail[ki]). Concatenate the N lists together and return this. 2/23/2019 CSE 373 AU 04 -- Sorting

Bin Sort: Example Sort  2, A, 3, X, 2, B, 1,C, 0,D, 1,E  Head[0] Head[1] Head[2] Head[3] Head[4] Tail[0] Tail[1] Tail[2] Tail[3] Tail[4] 2/23/2019 CSE 373 AU 04 -- Sorting

Bin Sort: Example after 1st record has been inserted Sort  3, X, 2, B, 1,C, 0,D, 1,E  Head[0] Head[1] Head[2] 2, A <------------ Head[3] Head[4] Tail[0] Tail[1] Tail[2] Tail[3] Tail[4] 2/23/2019 CSE 373 AU 04 -- Sorting

Bin Sort: Example after all records have been inserted Head[0] 0,D <------------- Head[1] 1,C, 1,E <----- Head[2] 2,A, 2,B <----- Head[3] 3, X <------------- Head[4] Tail[0] Tail[1] Tail[2] Tail[3] Tail[4] 2/23/2019 CSE 373 AU 04 -- Sorting

Bin Sort: Example after all lists have been concatenated Head[0] 0,D <------------- Head[1] 1,C, 1,E <----- Head[2] 2,A, 2,B <----- Head[3] 3, X <------------- Head[4] Tail[0] Tail[1] Tail[2] Tail[3] Tail[4]  0,D, 1,C, 1,E, 2, A , 2, B, 3, X 2/23/2019 CSE 373 AU 04 -- Sorting

Bin Sort: Comments We really didn’t have to store the full records on the lists; the key information was implicit in which list the record was on. The sorting operations was stable; the relative ordering of records was preserved except when their keys were out of order. 2/23/2019 CSE 373 AU 04 -- Sorting

Bin Sort: Time and Space Creating the arrays and performing the concatenation takes (N). Inserting the data records takes (n). Overall time complexity is (N+n). Space requirements: (N) for Head and Tail arrays, (n) for holding the records. Overall space complexity is (N+n). If we know that in some class of sorting problems N = n or N = c n, then Bin Sort gives us a linear-time, linear space sorting algorithm, which has a lower asymptotic growth rate than the well known n log n time algorithms such as Heapsort. 2/23/2019 CSE 373 AU 04 -- Sorting

Can Bin Sort be Adapted for Large N? What if the range of possible keys is relatively large, but still bounded at some practical value like 10,000,000? Can we still use Bin Sort effectively? Yes, but in a new way. Break each key into a list of subkeys and perform one Bin Sort run for each subkey. This is called “radix sort”... 2/23/2019 CSE 373 AU 04 -- Sorting

Radix Sort Suppose N, the bound on the range of possible key values, is large. For example, N = 10,000,000? We can use each digit of a key as a separate subkey. Consider a key such as 3,870,215 to be a list of 7 separate subkeys: 3,8,7,0,2,1,5 Sort all the records using the last (least signficant ) subkey. Then, using that partially sorted output as input to another sorting step, sort by the second-to-last subkey. Keep doing this until the most significant subkey has been used. 2/23/2019 CSE 373 AU 04 -- Sorting

Radix Sort: Example N = 100, n = 5 32, 51, 31, 52, 81 Phase 1: Sort by the rightmost digit (1’s position)  51, 31 , 81, 32, 52 Phase 2: Sort by the next digit (10’s position)  31, 32, 51, 52, 81 2/23/2019 CSE 373 AU 04 -- Sorting

Radix Sort: Why does it work? Subkey sorting steps are applied in the order from least significant to most significant. Bin Sort is stable, so each phase preserves the useful work done by the previous phase. The subkeys are independent and cover the entire original key. 2/23/2019 CSE 373 AU 04 -- Sorting

Radix Sort: Time and Space Space requirement: (m+n), where m is the maximum number of possible values of a subkey. Time requirement: (k ( m + n)), where k is the number of phases. If we assume m < n, and that k is ( log N), then the time requirement is ( n log N). However, depending upon the data to be sorted, we might be able to assume that N is constant, making log N also constant, and then have a time complexity of (n). 2/23/2019 CSE 373 AU 04 -- Sorting

Radix Sort: Alternative radices Our example used radix 10; each subkey was a digit. Writing each key in binary, we can easily use radix 2. A naturally efficient radix for strings of limited length would be 256; each 1-byte character serves as a subkey. (But, lexicographic ordering is not completely consistent with ASCII codes.) 2/23/2019 CSE 373 AU 04 -- Sorting

Radix Sort: Irregular Subkeys The subkeys for each phase do not have to have the same ranges. To sort playing cards, we can let the less significant subkey be suit, and the more significant subkey be rank. Clubs < Diamonds < Hearts < Spades 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10 < J < Q < K < A 2/23/2019 CSE 373 AU 04 -- Sorting