CSC 213 – Large Scale Programming
Bucket Sort
Buckets, B, is an array of Sequences
Sorts a Collection, C, in two phases:
  1. Remove each element v from C & add it to B[v]
  2. Move the elements from each bucket back to C
Bucket Sort Example
Bucket-Sort Algorithm
Algorithm bucketSort(Sequence C)
  B = new Sequence[10]   // & instantiate each Sequence
  // Phase 1
  for each element v in C
    B[v].addLast(v)
  endfor
  // Phase 2
  loc = 0
  for each Sequence b in B
    for each element v in b
      C.set(loc, v)
      loc += 1
    endfor
  endfor
  return C
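The two phases translate almost line for line into Java. The following is a minimal sketch, not the course's reference code: it assumes `java.util.List` stands in for the Sequence ADT, and the class name `BucketSortSketch` and the `numBuckets` parameter are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class BucketSortSketch {
    /** Sorts values in the range [0, numBuckets) in place and returns the same list. */
    static List<Integer> bucketSort(List<Integer> c, int numBuckets) {
        // One bucket per legal value; each bucket is itself a sequence.
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
        // Phase 1: the value itself is the bucket index.
        for (int v : c) {
            if (v < 0 || v >= numBuckets) {
                throw new IllegalArgumentException("value is not a legal index: " + v);
            }
            buckets.get(v).add(v);
        }
        // Phase 2: copy the buckets back into c in index order.
        int loc = 0;
        for (List<Integer> b : buckets) {
            for (int v : b) {
                c.set(loc, v);
                loc += 1;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        List<Integer> c = new ArrayList<>(List.of(7, 1, 3, 7, 3, 7));
        System.out.println(bucketSort(c, 10)); // prints [1, 3, 3, 7, 7, 7]
    }
}
```

Note that no two elements are ever compared; the only operations are indexing and appending.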
Bucket Sort Properties
For this to work, values must be legal indices
  Only non-negative integers smaller than the number of buckets can be used
Sorting occurs without comparing objects
Example of a stable sort
  Preserves the relative ordering of objects with the same value
  (BUBBLE-SORT & MERGE-SORT are other stable sorts)
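Stability is easiest to see when the sorted objects carry more than their key. The sketch below is an illustration under assumptions: it sorts words by length, using a key function (`ToIntFunction`) to pick the bucket, and it returns a new list instead of writing back into the input. The names `StableBucketSortSketch` and `bucketSortBy` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

final class StableBucketSortSketch {
    /** Bucket sorts c by key(v); every key must lie in [0, numBuckets). */
    static <T> List<T> bucketSortBy(List<T> c, int numBuckets, ToIntFunction<T> key) {
        List<List<T>> buckets = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
        // Appending at the end of a bucket keeps elements with equal keys
        // in their original relative order; this is what makes the sort stable.
        for (T v : c) {
            buckets.get(key.applyAsInt(v)).add(v);
        }
        List<T> out = new ArrayList<>();
        for (List<T> b : buckets) {
            out.addAll(b);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> words = List.of("pear", "fig", "plum", "yam", "kiwi");
        // Words of equal length stay in input order.
        System.out.println(bucketSortBy(words, 10, String::length));
        // prints [fig, yam, pear, plum, kiwi]
    }
}
```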
Bucket Sort Extensions
Use a Comparator for BUCKET-SORT
  Get the index for v using compare(v, null)
Comparator for booleans could return
  0 when v is false
  1 when v is true
Comparator for US states could return
  Annual per capita consumption of Jello
  Consumption of Jello overall, in cubic feet
  State's ranking by population:
    1 California  2 Texas  3 New York  4 Florida  5 Illinois
    6 Pennsylvania  7 Ohio  8 Michigan  9 Georgia
Bucket Sort Extensions
Extended BUCKET-SORT works with many types
  Needs a limited set of possible values for this to work
  Needs a way of enumerating the values in the set
  ("enumeration" is a subtle hint)
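The slide does not say what the hint points to. One plausible reading is Java's `enum` types, whose constants form a limited, enumerable set and whose `ordinal()` gives a ready-made bucket index. The sketch below works under that assumption; the `Size` enum and the class name are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class EnumBucketSortSketch {
    // Hypothetical enum; the declaration order defines the sort order.
    enum Size { SMALL, MEDIUM, LARGE }

    /** Bucket sorts enum values by ordinal(), one bucket per constant. */
    static <E extends Enum<E>> List<E> bucketSort(List<E> c, Class<E> type) {
        // getEnumConstants() enumerates every legal value of the type.
        E[] values = type.getEnumConstants();
        List<List<E>> buckets = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            buckets.add(new ArrayList<>());
        }
        for (E v : c) {
            buckets.get(v.ordinal()).add(v);
        }
        List<E> out = new ArrayList<>();
        for (List<E> b : buckets) {
            out.addAll(b);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Size> sizes = List.of(Size.LARGE, Size.SMALL, Size.MEDIUM, Size.SMALL);
        System.out.println(bucketSort(sizes, Size.class));
        // prints [SMALL, SMALL, MEDIUM, LARGE]
    }
}
```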
d-Tuples
Combination of d values such as (k_1, k_2, …, k_d)
  k_i is the i-th dimension of the tuple
A point (x, y, z) is a 3-tuple
  x is the 1st dimension's value
  y is the 2nd dimension's value
  z is the 3rd dimension's value
Lexicographic Order
Assume a & b are both d-tuples
  a = (a_1, a_2, …, a_d)
  b = (b_1, b_2, …, b_d)
Can say a < b if and only if
  a_1 < b_1 OR
  a_1 = b_1 && (a_2, …, a_d) < (b_2, …, b_d)
Order these 2-tuples using the previous definition
  (3, 4) (7, 8) (3, 2) (1, 4) (4, 8)
In order:
  (1, 4) (3, 2) (3, 4) (4, 8) (7, 8)
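The recursive definition unrolls into a loop: the first dimension where the tuples differ decides the order. Here is a small sketch that represents a d-tuple as an `int[]` (an assumption made for the example) and applies the definition to the five 2-tuples above; `lessThan` and `sortTuples` are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LexicographicSketch {
    /** Returns true when a < b lexicographically; a and b must have the same length d. */
    static boolean lessThan(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] < b[i]) return true;   // a_i < b_i decides the order
            if (a[i] > b[i]) return false;  // a_i > b_i decides it the other way
            // a_i == b_i: fall through and compare the remaining dimensions
        }
        return false; // all dimensions equal, so a is not less than b
    }

    /** Sorts the tuples in place using lessThan as the ordering. */
    static void sortTuples(List<int[]> tuples) {
        tuples.sort((x, y) -> lessThan(x, y) ? -1 : (lessThan(y, x) ? 1 : 0));
    }

    public static void main(String[] args) {
        List<int[]> tuples = new ArrayList<>(List.of(
            new int[] {3, 4}, new int[] {7, 8}, new int[] {3, 2},
            new int[] {1, 4}, new int[] {4, 8}));
        sortTuples(tuples);
        for (int[] t : tuples) {
            System.out.print(Arrays.toString(t) + " ");
        }
        System.out.println();
    }
}
```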
Radix-Sort
Very fast sort for data expressed as a d-tuple
  Cheats to win: it never compares elements, so it can beat the Ω(n log n) lower bound on comparison-based sorting
Sort performed using d calls to bucket sort
  Sorts from the least to the most important dimension of the tuple
Luckily, lots of data are d-tuples
  A String is a d-tuple of char
  Can sort an int by its digits
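To make "d calls to bucket sort" concrete, here is a sketch that treats each String as a d-tuple of char and bucket sorts once per position, from the last character to the first. It rests on two assumptions that the slides do not state: every string has exactly d characters, and every character is ASCII (so 128 buckets suffice). The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

final class StringRadixSortSketch {
    /** Sorts strings of length exactly d; assumes every char is ASCII (< 128). */
    static List<String> radixSort(List<String> c, int d) {
        final int radix = 128; // assumption: one bucket per ASCII character
        List<String> current = new ArrayList<>(c);
        // Least important dimension (last char) first, most important last.
        for (int pos = d - 1; pos >= 0; pos--) {
            List<List<String>> buckets = new ArrayList<>();
            for (int i = 0; i < radix; i++) {
                buckets.add(new ArrayList<>());
            }
            // One stable bucket-sort pass keyed on the char at this position.
            for (String s : current) {
                buckets.get(s.charAt(pos)).add(s);
            }
            List<String> next = new ArrayList<>();
            for (List<String> b : buckets) {
                next.addAll(b);
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        List<String> words = List.of("dog", "cat", "cab", "ace", "dab");
        System.out.println(radixSort(words, 3)); // prints [ace, cab, cat, dab, dog]
    }
}
```

The result is only correct because each pass is stable: ties on the current character keep the order established by the earlier, less important passes.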
Radix-Sort For ints
Represent an int as a d-tuple of digits, e.g. 62 is (6, 2) in decimal digits and (1, 1, 1, 1, 1, 0) in bits
  Decimal digits need 10 buckets for sorting
  Ordering by bits needs 2 buckets
O(d∙n) time needed to run RADIX-SORT
  d is the length of the longest element in the input
  In most cases the value of d is constant (d = 31 for a non-negative int)
  Radix sort then takes O(n) time, since the constant factor d is ignored
Radix-Sort In Action
List of 4-bit integers sorted using RADIX-SORT
Finding Value of a Digit
Digits are numbered from i = 0 at the least significant end; / is integer division
Value of the i-th digit (base-10) of k: ((k / 10^i) % 10)
Value of the i-th bit (base-2) of k: ((k / 2^i) % 2)
Value of the i-th hex digit (base-16) of k: ((k / 16^i) % 16)
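All three formulas are the same computation with a different base. A minimal sketch, assuming k is non-negative and that position 0 is the least significant digit; `digit` is a hypothetical helper, not a library method.

```java
final class DigitSketch {
    /** Returns the i-th digit of non-negative k in the given base (i = 0 is least significant). */
    static int digit(int k, int i, int base) {
        // Dividing i times by the base is the same as dividing by base^i,
        // but avoids computing a power that might overflow an int.
        for (int step = 0; step < i; step++) {
            k /= base;
        }
        return k % base;
    }

    public static void main(String[] args) {
        System.out.println(digit(6210, 1, 10)); // prints 1
        System.out.println(digit(13, 2, 2));    // prints 1 (13 is 1101 in binary)
        System.out.println(digit(0x2F, 1, 16)); // prints 2
    }
}
```

For base 2 the same value can also be read with a shift and a mask, `(k >> i) & 1`.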
Radix Sort
Algorithm radixSort(Sequence C)
  for i = 0 to 31
    B = new Sequence[2]   // & instantiate each Sequence
    for each element v in C
      digit = (v / (1 << i)) % 2
      B[digit].addLast(v)
    endfor
    loc = 0
    for each Sequence b in B
      for each element v in b
        C.set(loc, v)
        loc += 1
      endfor
    endfor
  endfor
  return C
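A possible Java rendering of the bit-by-bit radix sort follows. It is a sketch with a few deliberate departures from the pseudocode, all of them assumptions made for the example: it sorts an `int[]` rather than a Sequence, reads each bit with `(v >> i) & 1`, rejects negative values, and stops after bit 30 because bit 31 of a non-negative int is always 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RadixSortSketch {
    /** Sorts non-negative ints in place, one stable two-bucket pass per bit. */
    static int[] radixSort(int[] c) {
        for (int v : c) {
            if (v < 0) {
                throw new IllegalArgumentException("negative value: " + v);
            }
        }
        // Bits 0..30 hold the value of a non-negative int; bit 31 is the sign bit.
        for (int i = 0; i < 31; i++) {
            List<Integer> zeros = new ArrayList<>(); // bucket for bit value 0
            List<Integer> ones = new ArrayList<>();  // bucket for bit value 1
            for (int v : c) {
                if (((v >> i) & 1) == 0) {
                    zeros.add(v);
                } else {
                    ones.add(v);
                }
            }
            // Copy bucket 0 and then bucket 1 back, preserving order inside each.
            int loc = 0;
            for (int v : zeros) {
                c[loc++] = v;
            }
            for (int v : ones) {
                c[loc++] = v;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        int[] data = {9, 1, 13, 7, 14, 2}; // arbitrary 4-bit values chosen for the example
        System.out.println(Arrays.toString(radixSort(data)));
        // prints [1, 2, 7, 9, 13, 14]
    }
}
```

Each pass touches every element once, so the total work is 31 passes over n elements, which matches the O(d∙n) bound with d fixed.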
For Next Lecture
Weekly assignment doubles down on last week
  Due at the regular time tomorrow
Reviewing requirements for program #2
  1st preliminary deadline is today
  Spend time working on this: design saves coding
Reading on the Graph ADT for Wednesday
  Note: these have nothing to do with bar charts
  What are mathematical graphs?
  Why are they the basis of everything in CS?