CSE 326 Sorting David Kaplan Dept of Computer Science & Engineering Autumn 2001
SortingCSE 326 Autumn Outline Sorting: The Problem Space Sorting by Comparison Lower bound for comparison sorts Insertion Sort Heap Sort Merge Sort Quick Sort External Sorting Comparison of Sorting by Comparison
SortingCSE 326 Autumn Sorting: The Problem Space General problem Given a set of N orderable items, put them in order Without (significant) loss of generality, assume: Items are integers Ordering is (Most sorting problems map to the above in linear time.)
SortingCSE 326 Autumn Lower Bound for Sorting by Comparison Sorting by Comparison Only information available to us is the set of N items to be sorted Only operation available to us is pairwise comparison between 2 items What is the best running time we can possibly achieve?
SortingCSE 326 Autumn Decision Tree Analysis of Sorting by Comparison
SortingCSE 326 Autumn Max depth of decision tree How many permutations are there of N numbers? How many leaves does the tree have? What’s the shallowest tree with a given number of leaves? What is therefore the worst running time (number of comparisons) by the best possible sorting algorithm?
SortingCSE 326 Autumn Lower Bound for Comparison Sort Stirling’s approximation: Any comparison sort of n items must have a decision tree with n! leaves, of height log(n!) n log(n) is a lower bound for log(n!) Thus, no comparison sort can be faster than O(n log(n))
SortingCSE 326 Autumn Insertion Sort Basic idea After k th pass, ensure that first k+1 elements are sorted On k th pass, swap (k+1) th element to left as necessary Start After Pass 1 After Pass After Pass What if array is initially sorted? Reverse sorted?
SortingCSE 326 Autumn Why Insertion Sort is Slow Inversion: a pair (i,j) such that i Array[j] Array of size N can have (N 2 ) inversions average number of inversions in a random set of elements is N(N-1)/4 Insertion Sort only swaps adjacent elements only removes 1 inversion!
SortingCSE 326 Autumn HeapSort Sorting via Priority Queue (Heap) Shove items into a priority queue, take them out smallest to largest Worst Case: Best Case:
SortingCSE 326 Autumn MergeSort Merging Cars by key [Aggressiveness of driver]. Most aggressive goes first. MergeSort (Table [1..n]) Split Table in half Recursively sort each half Merge two sorted halves together
SortingCSE 326 Autumn MergeSort Analysis Running Time Worst case? Best case? Average case? Other considerations besides running time?
SortingCSE 326 Autumn QuickSort Basic idea: Pick a “pivot” Divide into less-than & greater-than pivot Sort each side recursively Picture from PhotoDisc.com
SortingCSE 326 Autumn QuickSort Partition Pick pivot: Partition with cursors <> <> 2 goes to less-than
SortingCSE 326 Autumn QuickSort Partition (cont’d) <> 6, 8 swap less/greater-than ,5 less-than 9 greater-than Partition done. Recursively sort each side.
SortingCSE 326 Autumn Analyzing QuickSort Picking pivot: constant time Partitioning: linear time Recursion: time for sorting left partition (say of size i) + time for right (size N-i-1) T(1) = b T(N) = T(i) + T(N-i-1) + cN where i is the number of elements smaller than the pivot
SortingCSE 326 Autumn QuickSort Worst case Pivot is always smallest element. T(N) = T(i) + T(N-i-1) + cN T(N)= T(N-1) + cN = T(N-2) + c(N-1) + cN = T(N-k) + = O(N 2 )
SortingCSE 326 Autumn Optimizing QuickSort Choosing the Pivot Randomly choose pivot Good theoretically and practically, but call to random number generator can be expensive Pick pivot cleverly “Median-of-3” rule takes Median(first, middle, last). Works well in practice. Cutoff Use simpler sorting technique below a certain problem size (Weiss suggests using insertion sort, with a cutoff limit of 5-20)
SortingCSE 326 Autumn QuickSort Best Case Pivot is always median element. T(N) = T(i) + T(N-i-1) + cN T(N)= 2T(N/2 - 1) + cN
SortingCSE 326 Autumn QuickSort Average Case Assume all size partitions equally likely, with probability 1/N details: Weiss pg
SortingCSE 326 Autumn External Sorting When you just ain’t got enough RAM … e.g. Sort 10 billion numbers with 1 MB of RAM. Databases need to be very good at this MergeSort Good for Something! Basis for most external sorting routines Can sort any number of records using a tiny amount of main memory in extreme case, keep only 2 records in memory at once!
SortingCSE 326 Autumn External MergeSort Split input into two tapes Each group of 1 records is sorted by definition, so merge groups of 1 to groups of 2, again split between two tapes Merge groups of 2 into groups of 4 Repeat until data entirely sorted log N passes
SortingCSE 326 Autumn Better External MergeSort Suppose main memory can hold M records: Read in groups of M records and sort them (e.g. with QuickSort) Number of passes reduced to log(N/M) k-way mergesort reduces number of passes to log k (N/M) Requires 2k output devices (e.g. mag tapes) But wait, there’s more … Polyphase merge does a k-way mergesort using only k+1 output devices (plus k th -order Fibonacci numbers!)
SortingCSE 326 Autumn Sorting by Comparison Summary Sorting algorithms that only compare adjacent elements are (N 2 ) worst case – but may be (N) best case HeapSort - (N log N) both best and worst case Suffers from two test-ops per data move MergeSort - (N log N) running time Suffers from extra-memory problem QuickSort - (N 2 ) worst case, (N log N) best and average case In practice, median-of-3 almost always gets us (N log N) Big win comes from {sorting in place, one test-op, few swaps}! Any comparison-based sorting algorithm is (N log N) External sorting: MergeSort with (log N/M) passes
SortingCSE 326 Autumn Sorting: The Problem Space Revisited General problem Given a set of N orderable items, put them in order Without (significant) loss of generality, assume: Items are integers Ordering is Most sorting problems can be mapped to the above in linear time But what if we have more information available to us?
SortingCSE 326 Autumn Sorting in Linear Time Sorting by Comparison Only information available to us is the set of N items to be sorted Only operation available to us is pairwise comparison between 2 items Best running time is O(N log(N)), given these constraints What if we relax the constraints? Know something in advance about item values
SortingCSE 326 Autumn BinSort (a.k.a. BucketSort) If keys are known to be in {1, …, K} Have array of size K Put items into correct bin (cell) of array, based on key
SortingCSE 326 Autumn BinSort example K=5 list=(5,1,3,4,3,2,1,1,5,4,5) Bins in array key = 11,1,1 key = 22 key = 33,3 key = 44,4 key = 55,5,5 Sorted list: 1,1,1,2,3,3,4,4,5,5,5
SortingCSE 326 Autumn BinSort Pseudocode procedure BinSort (List L,K) LinkedList bins[1..K] // Each element of array bins is linked list. // Could also BinSort with array of arrays. For Each number x in L bins[x].Append(x) End For For i = 1..K For Each number x in bins[i] Print x End For
SortingCSE 326 Autumn BinSort Running Time K is a constant BinSort is linear time K is variable Not simply linear time K is large (e.g ) Impractical
SortingCSE 326 Autumn BinSort is “stable” Stable Sorting algorithm Items in input with the same key end up in the same order as when they began. Important if keys have associated values Critical for RadixSort
SortingCSE 326 Autumn Mr. Radix Herman Hollerith invented and developed a punch-card tabulation machine system that revolutionized statistical computation. Born in Buffalo, New York, the son of German immigrants, Hollerith enrolled in the City College of New York at age 15 and graduated from the Columbia School of Mines with distinction at the age of 19. His first job was with the U.S. Census effort of Hollerith successively taught mechanical engineering at the Massachusetts Institute of Technology and worked for the U.S. Patent Office. The young engineer developed an electrically actuated brake system for the railroads, but the Westinghouse steam-actuated brake prevailed. Hollerith began working on the tabulating system during his days at MIT, filing for the first patent in He developed a hand-fed 'press' that sensed the holes in punched cards; a wire would pass through the holes into a cup of mercury beneath the card closing the electrical circuit. This process triggered mechanical counters and sorter bins and tabulated the appropriate data. Hollerith's system-including punch, tabulator, and sorter-allowed the official 1890 population count to be tallied in six months, and in another two years all the census data was completed and defined; the cost was $5 million below the forecasts and saved more than two years' time. His later machines mechanized the card-feeding process, added numbers, and sorted cards, in addition to merely counting data. In 1896 Hollerith founded the Tabulating Machine Company, forerunner of Computer Tabulating Recording Company (CTR). He served as a consulting engineer with CTR until retiring in In 1924 CTR changed its name to IBM- the International Business Machines Corporation. Herman Hollerith Born February 29, Died November 17, 1929 Art of Compiling Statistics; Apparatus for Compiling Statistics Patent Nos. 395,781; 395,782; 395,783 Inducted 1990 Source: National Institute of Standards and Technology (NIST) Virtual Museum -
SortingCSE 326 Autumn RadixSort Radix = “The base of a number system” (Webster’s dictionary) alternate terminology: radix is number of bits needed to represent 0 to base-1; can say “base 8” or “radix 3” Idea: BinSort on each digit, bottom up.
SortingCSE 326 Autumn RadixSort – magic! It works. Input list: 126, 328, 636, 341, 416, 131, 328 BinSort on lower digit: 341, 131, 126, 636, 416, 328, 328 BinSort result on next-higher digit: 416, 126, 328, 328, 131, 636, 341 BinSort that result on highest digit: 126, 131, 328, 328, 341, 416, 636
SortingCSE 326 Autumn RadixSort – how it works
SortingCSE 326 Autumn Not magic. It provably works. Keys K-digit numbers base B Claim: after i th BinSort, least significant i digits are sorted e.g. B=10, i=3, keys are 1776 and comes before 1776 for last 3 digits.
SortingCSE 326 Autumn RadixSort Proof by Induction Base case: i=0. 0 digits are sorted (that wasn’t hard!) Induction step: assume for i, prove for i+1. consider two numbers: X, Y. Say X i is i th digit of X (from the right) X i+1 < Y i+1 then i+1 th BinSort will put them in order X i+1 > Y i+1, same thing X i+1 = Y i+1, order depends on last i digits. Induction hypothesis says already sorted for these digits. (Careful about ensuring that your BinSort preserves order aka “stable”…)
SortingCSE 326 Autumn What data types can you RadixSort? Any type T that can be BinSorted Any type T that can be broken into parts A and B, such that: You can reconstruct T from A and B A can be RadixSorted B can be RadixSorted A is always more significant than B, in ordering
SortingCSE 326 Autumn RadixSort Examples 1-digit numbers can be BinSorted 2 to 5-digit numbers can be BinSorted without using too much memory 6-digit numbers, broken up into A=first 3 digits, B=last 3 digits. A and B can reconstruct original 6-digits A and B each RadixSortable as above A more significant than B
SortingCSE 326 Autumn RadixSorting Strings 1 Character can be BinSorted Break strings into characters Need to know length of biggest string (or calculate this on the fly). Null-pad shorter strings Running time: N is number of strings L is length of longest string RadixSort takes O(N*L)
SortingCSE 326 Autumn Evaluating Sorting Algorithms What factors other than asymptotic complexity could affect performance? Suppose two algorithms perform exactly the same number of instructions. Could one be better than the other?
SortingCSE 326 Autumn Memory Hierarchy Stats (made up 1 ) CPU cyclesSize L1 (on chip) cache032 KB L2 cache8512 KB RAM35256 MB Hard Drive500,0008 GB 1 But plausible : - )
SortingCSE 326 Autumn Memory Hierarchy Exploits Locality of Reference Idea: small amount of fast memory Keep frequently used data in the fast memory LRU replacement policy Keep recently used data in cache To free space, remove Least Recently Used data
SortingCSE 326 Autumn Cache Details (simplified) Main Memory Cache Cache line size (4 adjacent memory cells)
SortingCSE 326 Autumn Traversing an Array One miss for every 4 accesses in a traversal cache misses cache hits
SortingCSE 326 Autumn Iterative MergeSort Poor Cache Performance Cache Size no temporal locality!
SortingCSE 326 Autumn Recursive MergeSort Better Cache Performance Cache Size
SortingCSE 326 Autumn Quicksort Pretty Good Cache Performance Initial partition causes a lot of cache misses As subproblems become smaller, they fit into cache Generally good cache performance
SortingCSE 326 Autumn Radix Sort Lousy Cache Performance On each BinSort Sweep through input list – cache misses along the way (bad!) Append to output list – indexed by pseudorandom digit (ouch!) Truly evil for large Radix (e.g ), which reduces # of passes
SortingCSE 326 Autumn Sorting Summary Linear-time sorting is possible if we know more about the input (e.g. that all input values fall within a known range) BinSort Moderately useful O(n) categorizer Most useful as building block for RadixSort RadixSort O(n), but input must conform to fairly narrow profile Poor cache performance Memory Hierarchy and Cache Cache locality can have major impact on performance of sorting, and other, algorithms