Bucket sort.


Bucket sort

Outline
In this lesson, we will:
- Describe the bucket sort algorithm
- Look at an implementation
- Consider the run time of this algorithm

Sorting
Suppose we have the following numbers:

    3 2 4 3 2 1 2 3 4 3 2 3 3 2 1 1 4 3 4 1 2 4 3 3 2 2 4 1 2 4 4 3 2 0 1

How could you sort these numbers quickly?
How about just counting how often each number appears:
- 0 appears once
- 1 appears 6 times
- 2 appears 10 times
- 3 appears 10 times, as well
- 4 appears 8 times
Thus, the sorted array is:

    0 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4
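The counting step above can be checked with a short sketch. The function name count_occurrences and the use of std::array are assumptions made for illustration and do not come from the slides; the sketch also assumes every entry lies between 0 and 4.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the slides): count how often each of the
// values 0 through 4 appears in the first 'capacity' entries of 'array'.
// Assumes every entry is between 0 and 4, inclusive.
std::array<std::size_t, 5> count_occurrences( int const array[],
                                              std::size_t const capacity ) {
    std::array<std::size_t, 5> count{};   // all five counters start at zero

    for ( std::size_t k{0}; k < capacity; ++k ) {
        // Guard the assumption stated above before indexing the counters
        assert( (array[k] >= 0) && (array[k] <= 4) );
        ++count[array[k]];
    }

    return count;
}
```

Calling this on the 35 numbers above yields one counter per value, and the counters sum to the number of entries.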

Sorting algorithms
This nice algorithm only works because we are not sorting that many different numbers
Suppose we know, a priori, that we are only sorting numbers from 0 to 9
What could we do in this case?
- Create an array of capacity 10 and initialize all of the entries to zero
- Walk through the array being sorted, incrementing the counter that corresponds to each entry
- Then, use these counts to copy the values back to the array in sorted order
This is called bucket sort, because we are essentially putting the numbers we find into different buckets

Sorting algorithms
Here is the algorithm:
- First, create 10 counters (an array) and ensure they are all zero
- Go through the array, and for each entry in the array, increment the corresponding counter
- Now, for each value from 0 to 9, copy into the array as many entries as we counted

Sorting algorithms
Looking at these steps:
- Go through the array, and for each entry in the array, increment the corresponding counter
  This requires a for-loop going through each entry of the array

    for ( std::size_t k{0}; k < capacity; ++k ) {
        // Record what we found
    }

- Now, for each value from 0 to 9, copy into the array as many entries as we counted
  Inside a for-loop going from 0 to 9, we will have to copy to the array as many entries as were found in the array

    // k is the next position in the array to be written
    std::size_t k{0};

    for ( int j{0}; j < 10; ++j ) {
        for ( std::size_t i{0}; i < count[j]; ++i, ++k ) {
            array[k] = j;
        }
    }

Bucket sort
Here is an implementation:

    #include <cstddef>

    template <typename T>
    void bucket_sort( T array[], std::size_t const capacity ) {
        std::size_t count[10]{0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

        // Count the number of objects being sorted
        for ( std::size_t k{0}; k < capacity; ++k ) {
            ++count[array[k]];
        }

        // Copy the appropriate number of each back to the array
        std::size_t posn{0};

        for ( T i{0}; i < 10; ++i ) {
            // count[i] is the number of times the value i was found
            for ( std::size_t k{0}; k < count[i]; ++k, ++posn ) {
                array[posn] = i;
            }
        }
    }

Bucket sort
Questions:
- What happens if you pass this function an array that has a value outside of 0 through 9?
- We are still using templates:
  Does it make sense to use bucket sort on an array of type double?
  Which types would this work for?
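As one possible response to the first question: in the implementation shown earlier, an entry outside 0 through 9 is used directly as an index into the array of ten counters, so the access is out of bounds and the behaviour is undefined. The sketch below is a hypothetical variant, with the assumed name bucket_sort_checked, that rejects such input before anything is written back. Throwing std::out_of_range is only one of several reasonable design choices.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical variant (not from the slides): validate each entry while
// counting, so the array is left untouched if any value is out of range.
template <typename T>
void bucket_sort_checked( T array[], std::size_t const capacity ) {
    std::size_t count[10]{};   // all ten counters start at zero

    for ( std::size_t k{0}; k < capacity; ++k ) {
        if ( (array[k] < T{0}) || (array[k] > T{9}) ) {
            throw std::out_of_range{
                "bucket_sort_checked: value outside 0 through 9"
            };
        }

        ++count[array[k]];
    }

    // Copy the appropriate number of each value back to the array
    std::size_t posn{0};

    for ( int i{0}; i < 10; ++i ) {
        for ( std::size_t k{0}; k < count[i]; ++k, ++posn ) {
            array[posn] = static_cast<T>( i );
        }
    }

    // Every entry that was counted has been written back exactly once
    assert( posn == capacity );
}
```

Because the check happens during the counting pass, an exception leaves the original array unmodified.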

Bucket sort
In general, we will not be sorting numbers from 0 to 9; more likely, we will know that we are sorting numbers from some minimum value to some maximum value
We can deal with this by either:
- Explicitly searching for both the minimum and maximum values in the array
- Requiring that the user explicitly pass the minimum and maximum values:

    template <typename T>
    void bucket_sort( T array[], std::size_t const capacity,
                      T const &minimum, T const &maximum );
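A minimal sketch of the second option follows, using the signature declared above. The body is an assumption about how such a function might be written, not the course's reference solution. It assumes T is an integer type, that every entry lies between minimum and maximum, and that maximum - minimum does not overflow T.

```cpp
#include <cassert>
#include <cstddef>

// Sketch only: bucket sort for values known to lie in [minimum, maximum].
template <typename T>
void bucket_sort( T array[], std::size_t const capacity,
                  T const &minimum, T const &maximum ) {
    assert( minimum <= maximum );

    // One counter for each value from minimum to maximum, inclusive
    std::size_t const range{
        static_cast<std::size_t>( maximum - minimum ) + 1
    };

    // Dynamically allocated counters, all initialized to zero
    std::size_t *count{ new std::size_t[range]{} };

    // The counter for the value v is stored at index v - minimum
    for ( std::size_t k{0}; k < capacity; ++k ) {
        assert( (minimum <= array[k]) && (array[k] <= maximum) );
        ++count[array[k] - minimum];
    }

    // Copy the appropriate number of each value back to the array
    std::size_t posn{0};

    for ( std::size_t i{0}; i < range; ++i ) {
        for ( std::size_t k{0}; k < count[i]; ++k, ++posn ) {
            array[posn] = minimum + static_cast<T>( i );
        }
    }

    delete[] count;
    count = nullptr;
}
```

Subtracting minimum when indexing is what allows negative values, or large values in a narrow band, to be sorted with a small array of counters.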

Sorting algorithms
In your homework, you will be required to implement both of these algorithms
- You will have to use dynamically allocated memory
- You must use the smallest array possible
For example, sorting the following array should only require an array of capacity 12 (why?):

    int array[20]{
        592359245, 592359239, 592359243, 592359236, 592359243,
        592359243, 592359242, 592359242, 592359237, 592359237,
        592359241, 592359245, 592359243, 592359236, 592359235,
        592359236, 592359243, 592359240, 592359236, 592359246
    };
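The "why?" can be explored with a small sketch that searches for the smallest and largest entries and reports how many counters would be needed. The helper name counters_required is hypothetical, and the sketch assumes a non-empty array whose largest and smallest entries differ by less than the largest int.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the slides): the number of counters needed
// to bucket sort 'array', that is, maximum - minimum + 1.
std::size_t counters_required( int const array[],
                               std::size_t const capacity ) {
    assert( capacity > 0 );

    int minimum{ array[0] };
    int maximum{ array[0] };

    // A single pass finds both the smallest and the largest entries
    for ( std::size_t k{1}; k < capacity; ++k ) {
        if ( array[k] < minimum ) {
            minimum = array[k];
        }

        if ( array[k] > maximum ) {
            maximum = array[k];
        }
    }

    return static_cast<std::size_t>( maximum - minimum ) + 1;
}
```

For the array above, the smallest entry is 592359235 and the largest is 592359246, so the result is 592359246 - 592359235 + 1 = 12.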

Run times
We know that sorting this array using bucket sort is the correct approach:

    int array[100]{
        8, 8, 6, 5, 9, 9, 6, 9, 5, 6, 5, 6, 7, 8, 9, 8, 7, 5, 7, 7,
        5, 9, 9, 7, 5, 9, 9, 6, 6, 5, 8, 6, 5, 9, 8, 8, 8, 8, 6, 8,
        7, 5, 6, 5, 9, 6, 5, 9, 6, 8, 8, 7, 6, 6, 9, 8, 8, 8, 5, 5,
        6, 8, 9, 9, 7, 7, 6, 7, 5, 8, 5, 6, 9, 9, 5, 7, 7, 9, 6, 5,
        8, 9, 6, 6, 6, 6, 5, 5, 8, 5, 8, 7, 9, 7, 8, 6, 7, 8, 6, 7
    };

Question: Would bucket sort be good for sorting this array?

    int array[20]{
        316774203, 1618161201, 1323603860, 76338464, 1379741408,
        1378001070, 1890103888, 42, 1817480507, 347706477
    };

Why or why not?
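One way to reason about the two arrays above is to compare the number of entries with the number of counters that bucket sort must allocate and walk through. The symbols n (entries) and m (counters) are notation introduced here, not taken from the slides, and the last line considers only the ten values listed.

```latex
\begin{align*}
  T(n, m) &= \Theta(n + m), \qquad m = \text{maximum} - \text{minimum} + 1 \\
  \text{first array:}  \quad n &= 100, \quad m = 9 - 5 + 1 = 5 \\
  \text{second array:} \quad n &= 10,  \quad m = 1890103888 - 42 + 1 = 1890103847
\end{align*}
```

In the first case the counters are negligible next to the entries; in the second, almost all of the work and memory goes to counters that are never incremented.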

Summary
Following this lesson, you now
- Understand the bucket sort algorithm
- Know that this is a fast algorithm under specific circumstances
- Know that this algorithm cannot be used in general


Colophon These slides were prepared using the Georgia typeface. Mathematical equations use Times New Roman, and source code is presented using Consolas. The photographs of lilacs in bloom appearing on the title slide and accenting the top of each other slide were taken at the Royal Botanical Gardens on May 27, 2018 by Douglas Wilhelm Harder. Please see https://www.rbg.ca/ for more information.

Disclaimer These slides are provided for the ECE 150 Fundamentals of Programming course taught at the University of Waterloo. The material in them reflects the authors’ best judgment in light of the information available to them at the time of preparation. Any reliance on these course slides by any party for any other purpose is the responsibility of such parties. The authors accept no responsibility for damages, if any, suffered by any party as a result of decisions made or actions based on these course slides for any other purpose than that for which they were intended.