Published by Buddy Richardson. Modified over 9 years ago.
Sorting
Dr. Ahmad R. Hadaegh, California State University San Marcos (CSUSM)

The efficiency of data handling can often be substantially increased if the data are sorted.
For example, it is practically impossible to find a name in a telephone directory if the entries are not sorted.
To sort a set of items such as numbers or words, two properties must be considered:
The number of comparisons required to arrange the data
The number of data movements

Depending on the sorting algorithm, the exact number of comparisons or movements may not be easy to determine.
Therefore, the numbers of comparisons and movements are approximated with big-O notation.
Some sorting algorithms may move data more often than they compare it.
It is up to the programmer to decide which algorithm is more appropriate for a specific set of data.
For example, if only small keys such as integers or characters are compared, then comparisons are relatively fast and inexpensive.
But if large, complex objects must be compared, then comparisons can be quite costly.

If, on the other hand, the data items being moved are large and movements are frequent, then movement rather than comparison becomes the determining factor.
Further, a simple method may be only 20% less efficient than a more elaborate algorithm.
If sorting is used only once in a while and only for small data sets, then a more complicated algorithm may not be worth using.
However, if the data set is large, 20% can make a significant difference and should not be ignored.
Let's look at different sorting algorithms now.

Insertion Sort
Start with the first two elements of the array, data[0] and data[1].
If they are out of order, interchange them.
Next, data[2] is considered and placed into its proper position:
If data[2] is smaller than data[0], it is placed before data[0] by shifting data[0] and data[1] down by one position.
Otherwise, if data[2] is between data[0] and data[1], only data[1] is shifted down and data[2] is placed in the second position.
Otherwise, data[2] remains where it is in the array.
Next, data[3] is considered and the same process repeats, and so on.

Algorithm and code for insertion sort

InsertionSort(data[], n)
    for (i = 1; i < n; i++)
        move all elements data[j] greater than data[i] down by one position;
        place data[i] in its proper position;

template<class T>
void InsertionSort(T data[], int n) {
    for (int i = 1; i < n; i++) {
        T tmp = data[i];               // element to insert
        int j;                         // declared outside the loop because it is used after it
        for (j = i; j > 0 && tmp < data[j-1]; j--)
            data[j] = data[j-1];       // shift larger elements down
        data[j] = tmp;                 // place tmp in its proper position
    }
}

Example of Insertion Sort

5 2 3 8 1    initial array
5 5 3 8 1    tmp = 2: moving 5 down
2 5 3 8 1    put tmp = 2 in position 1
2 5 5 8 1    tmp = 3: moving 5 down
2 3 5 8 1    put tmp = 3 in position 2
2 3 5 8 1    tmp = 8: since 5 is less than 8, no shifting is required
2 3 5 8 8    tmp = 1: moving 8 down
2 3 5 5 8    moving 5 down
2 3 3 5 8    moving 3 down
2 2 3 5 8    moving 2 down
1 2 3 5 8    put tmp = 1 in position 1

Advantage of insertion sort: if the data are already sorted, they remain sorted and essentially no movement is necessary.
Disadvantage of insertion sort: an item that is already in its right place may have to be moved temporarily in one iteration and moved back to its original place in a later one.

Complexity of Insertion Sort
Best case: the data are already sorted. It takes O(n) to go through the elements.
Worst case: the data are in reverse order, so the i-th item requires (i-1) movements.
Total movements = 1 + 2 + ... + (n-1) = n(n-1)/2, which is O(n^2).
Average case: approximately half of the worst case, which is still O(n^2).
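
The best-case and worst-case counts can be checked by instrumenting the algorithm. The sketch below is only an illustration: the names SortStats and CountedInsertionSort are hypothetical and do not come from the slides, and it uses std::vector instead of a raw array for convenience.

```cpp
#include <cstddef>
#include <vector>

// Counters for one run; the names are hypothetical, not from the slides.
struct SortStats {
    long comparisons = 0;  // key comparisons tmp < data[j-1]
    long shifts = 0;       // element moves inside the inner loop
};

// Insertion sort that also counts key comparisons and element shifts.
template <class T>
SortStats CountedInsertionSort(std::vector<T>& data) {
    SortStats stats;
    for (std::size_t i = 1; i < data.size(); i++) {
        T tmp = data[i];
        std::size_t j = i;
        while (j > 0) {
            stats.comparisons++;
            if (!(tmp < data[j - 1])) break;  // tmp is already in position
            data[j] = data[j - 1];            // shift the larger element down
            stats.shifts++;
            j--;
        }
        data[j] = tmp;
    }
    return stats;
}
```

For five elements, sorted input needs 4 comparisons and no shifts, while reversed input needs 1 + 2 + 3 + 4 = 10 shifts, which is n(n-1)/2 for n = 5.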

Selection Sort
Select the minimum in the array and swap it with the first element.
Then select the second minimum in the array and swap it with the second element.
And so on until everything is sorted.

Algorithm and code for selection sort

SelectionSort(data[], n)
    for (i = 0; i < n-1; i++)
        select the smallest element among data[i] ... data[n-1];
        swap it with data[i];

// swap() is std::swap (from <utility>) or an equivalent helper
template<class T>
void SelectionSort(T data[], int n) {
    int i, j, least;
    for (i = 0; i < n-1; i++) {                // start at 0 so that data[0] is placed too
        for (j = i+1, least = i; j < n; j++)
            if (data[j] < data[least])
                least = j;                     // index of the smallest element so far
        swap(data[least], data[i]);
    }
}

Example of Selection Sort

5 2 3 8 1    the first minimum, 1, is found by searching the entire array
1 2 3 8 5    swap 1 with the first position
1 2 3 8 5    the second minimum is 2; swap it with the second position (no change)
1 2 3 8 5    the third minimum is 3; swap it with the third position (no change)
1 2 3 5 8    the fourth minimum is 5; swap it with the fourth position

Complexity of Selection Sort
The number of comparisons is the same in each case (best case, average case and worst case); when the swap is done unconditionally on every pass, so is the number of movements.
The number of comparisons is
Total = (n-1) + (n-2) + (n-3) + ... + 1 = n(n-1)/2, which is O(n^2).
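
The total can be written as a sum over the passes of the outer loop. This is only the standard arithmetic-series identity; the index i is assumed to run over the outer loop passes, where pass i compares data[i] ... data[n-1] and so makes n-1-i comparisons.

```latex
\sum_{i=0}^{n-2} (n-1-i) \;=\; (n-1) + (n-2) + \cdots + 1 \;=\; \frac{n(n-1)}{2} \;=\; O(n^2)
```

For example, with n = 5 the passes make 4 + 3 + 2 + 1 = 10 comparisons, and 5(5-1)/2 = 10.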

Bubble Sort
Start from the bottom of the array and move the required elements up (i.e., bubble the elements up).
Two adjacent elements are interchanged if they are found to be out of order with respect to each other.
First, data[n-1] and data[n-2] are compared and swapped if they are not in order.
Then data[n-2] and data[n-3] are compared and swapped if they are not in order.
And so on.

Algorithm and code for bubble sort

BubbleSort(data[], n)
    for (i = 0; i < n-1; i++)
        for (j = n-1; j > i; --j)
            swap elements in positions j and j-1 if they are out of order;

template<class T>
void BubbleSort(T data[], int n) {
    for (int i = 0; i < n-1; i++)
        for (int j = n-1; j > i; --j)
            if (data[j] < data[j-1])
                swap(data[j], data[j-1]);      // bubble the smaller element up
}

Example of Bubble Sort

5 2 3 8 1    initial array

Iteration 1: start from the last element up to the first element and bubble the smaller elements up
5 2 3 1 8    swap
5 2 1 3 8    swap
5 1 2 3 8    swap
1 5 2 3 8    swap

Iteration 2: start from the last element up to the second element and bubble the smaller elements up
1 5 2 3 8    no swap
1 5 2 3 8    no swap
1 2 5 3 8    swap

Iteration 3: start from the last element up to the third element and bubble the smaller elements up
1 2 5 3 8    no swap
1 2 3 5 8    swap

Iteration 4: start from the last element up to the fourth element and bubble the smaller elements up
1 2 3 5 8    no swap

Complexity of Bubble Sort
The number of comparisons is the same in each case (best case, average case and worst case).
The number of comparisons is
Total = (n-1) + (n-2) + (n-3) + ... + 1 = n(n-1)/2, which is O(n^2).
The number of swaps depends on the input: none if the data are already sorted, and one per comparison, n(n-1)/2 in total, if the data are in reverse order.

Comparing bubble sort with insertion sort and selection sort, we can say that:
In the average case, bubble sort makes approximately twice as many comparisons as insertion sort and the same number of moves.
Bubble sort also makes, on average, as many comparisons as selection sort and n times more moves.
Among these three sorts, insertion sort is generally the better algorithm because, if the array is already sorted, its running time is only O(n), which is faster than the other two algorithms.
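
The claim about already sorted input can be checked by counting comparisons. The two functions below are a sketch written for this comparison only; the names BubbleComparisons and InsertionComparisons are hypothetical, and both work on a copy of the input.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort as on the slides, returning the number of key comparisons.
long BubbleComparisons(std::vector<int> data) {
    long comparisons = 0;
    const int n = static_cast<int>(data.size());
    for (int i = 0; i < n - 1; i++)
        for (int j = n - 1; j > i; --j) {
            comparisons++;
            if (data[j] < data[j - 1]) std::swap(data[j], data[j - 1]);
        }
    return comparisons;
}

// Insertion sort, returning the number of key comparisons.
long InsertionComparisons(std::vector<int> data) {
    long comparisons = 0;
    for (std::size_t i = 1; i < data.size(); i++) {
        int tmp = data[i];
        std::size_t j = i;
        while (j > 0) {
            comparisons++;
            if (!(tmp < data[j - 1])) break;  // tmp is already in position
            data[j] = data[j - 1];
            j--;
        }
        data[j] = tmp;
    }
    return comparisons;
}
```

On the sorted input 1 2 3 4 5 6, bubble sort still makes 6*5/2 = 15 comparisons, while insertion sort makes only n-1 = 5.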

Shell Sort
Shell sort works on the idea that it is easier and faster to sort many short lists than it is to sort one large list.
Select an increment value k (the best value for k is not necessarily clear).
Sort each sequence consisting of every k-th element (use some simple sorting technique).
Decrease k and repeat the above step until k = 1.

Example of Shell Sort

4 7 10 2 5 12 1 9 6 3 8 11    initial array

Choose k = 4 first and sort each sub-list made of every 4th element
4 7 10 2 5 12 1 9 6 3 8 11    sub-list (4, 5, 6) is already in order
4 3 10 2 5 7 1 9 6 12 8 11    sub-list (7, 12, 3) becomes (3, 7, 12)
4 3 1 2 5 7 8 9 6 12 10 11    sub-list (10, 1, 8) becomes (1, 8, 10); sub-list (2, 9, 11) is already in order

Now choose k = 2, and then k = 1, applying insertion sort each time
4 3 1 2 5 7 8 9 6 12 10 11    array after the k = 4 pass
1 3 4 2 5 7 6 9 8 12 10 11    k = 2: elements in positions 0, 2, 4, ... sorted
1 2 4 3 5 7 6 9 8 11 10 12    k = 2: elements in positions 1, 3, 5, ... sorted
1 2 3 4 5 6 7 8 9 10 11 12    k = 1: ordinary insertion sort

Complexity of shell sort
Shell sort works well on data that is almost sorted: O(n log 2 n)
Deeper analysis of Shell sort is quite difficult.
It can be shown in practice that it is approximately O(n^(3/2)).

Algorithm of shell sort

ShellSort(data[], n)
    determine numbers h_t, h_(t-1), ..., h_1 of ways of dividing array data into sub-arrays;
    for (h = h_t; t > 1; t--, h = h_t)
        divide data into h sub-arrays;
        for (i = 1; i <= h; i++)
            sort sub-array data_i;
    sort array data;

Code for shell sort

template<class T>
void ShellSort(T data[], int arrsize) {
    int i, j, hCnt, h, k;
    int increments[20];
    // create an appropriate number of increments h: 1, 4, 13, 40, ...
    for (h = 1, i = 0; h < arrsize; i++) {
        increments[i] = h;
        h = 3*h + 1;
    }
    // loop on the number of different increments h
    for (i = i-1; i >= 0; i--) {
        h = increments[i];
        // loop on the number of sub-arrays h-sorted in the i-th pass
        for (hCnt = h; hCnt < 2*h; hCnt++) {
            // insertion sort for the sub-array containing every h-th element of array data
            for (j = hCnt; j < arrsize; ) {
                T tmp = data[j];
                k = j;
                while (k-h >= 0 && tmp < data[k-h]) {
                    data[k] = data[k-h];
                    k = k - h;
                }
                data[k] = tmp;
                j = j + h;
            }
        }
    }
}
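
The code picks its increments with h = 3h + 1, while the worked example uses 4, 2, 1. To make a single pass easier to inspect, the sketch below separates the increment generation from the h-sorting pass. The names ShellIncrements, HSort and ShellSortSketch are hypothetical, and std::vector is used in place of the raw array.

```cpp
#include <cstddef>
#include <vector>

// Increments 1, 4, 13, 40, ... (h = 3h + 1) that are smaller than n.
std::vector<int> ShellIncrements(int n) {
    std::vector<int> increments;
    for (int h = 1; h < n; h = 3 * h + 1) increments.push_back(h);
    return increments;
}

// One pass: insertion sort applied to elements that are h positions apart.
template <class T>
void HSort(std::vector<T>& data, int h) {
    const int n = static_cast<int>(data.size());
    for (int j = h; j < n; j++) {
        T tmp = data[j];
        int k = j;
        while (k - h >= 0 && tmp < data[k - h]) {
            data[k] = data[k - h];
            k -= h;
        }
        data[k] = tmp;
    }
}

// Shell sort: run the passes from the largest increment down to 1.
template <class T>
void ShellSortSketch(std::vector<T>& data) {
    std::vector<int> increments = ShellIncrements(static_cast<int>(data.size()));
    for (std::size_t i = increments.size(); i-- > 0;) HSort(data, increments[i]);
}
```

For the 12-element example array, ShellIncrements(12) yields only 1 and 4, and calling HSort with h = 4 reproduces the array shown after the k = 4 pass.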

Heap Sort
Heap sort uses a heap, as described in the earlier lectures.
As we said before, a heap is a binary tree with the following two properties:
The value of each node is not less than the values stored in each of its children.
The tree is perfectly balanced, and the leaves in the last level are all in the leftmost positions.

The procedure is:
The data are first transformed into a heap.
After this step the data are not necessarily sorted; however, we know that the largest element is at the root.
Thus, starting with the heap:
Swap the root with the last element.
Restore the heap property for all elements except the last one.
Repeat the process on the remaining elements until you are done.

Algorithm and code for heap sort

HeapSort(data[], n)
    transform data into a heap;
    for (i = n-1; i > 0; i--)
        swap the root with the element in position i;
        restore the heap property for the tree data[0] ... data[i-1];

template<class T>
void HeapSort(T data[], int size) {
    int i;
    for (i = size/2 - 1; i >= 0; i--)
        MoveDown(data, i, size-1);     // creates the heap
    for (i = size-1; i >= 1; --i) {
        swap(data[0], data[i]);        // move the largest item to data[i]
        MoveDown(data, 0, i-1);        // restores the heap
    }
}
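
The slide calls MoveDown without listing it, since it belongs to the earlier heap lectures. The sketch below shows what such a helper usually looks like, assuming the array layout in which the children of position i sit at positions 2i+1 and 2i+2. The body is an assumption, not the lecture's own code, and HeapSortSketch is a hypothetical name used so that the example is complete.

```cpp
#include <utility>

// Assumed helper: sift data[first] down within data[first..last] until
// neither child is larger. Children of position i are at 2i+1 and 2i+2.
template <class T>
void MoveDown(T data[], int first, int last) {
    int largest = 2 * first + 1;                       // left child
    while (largest <= last) {
        if (largest < last && data[largest] < data[largest + 1])
            largest++;                                 // right child is larger
        if (!(data[first] < data[largest])) break;     // heap property holds
        std::swap(data[first], data[largest]);
        first = largest;
        largest = 2 * first + 1;
    }
}

// Heap sort as on the slide, written against the helper above.
template <class T>
void HeapSortSketch(T data[], int size) {
    for (int i = size / 2 - 1; i >= 0; i--)
        MoveDown(data, i, size - 1);                   // build the heap
    for (int i = size - 1; i >= 1; --i) {
        std::swap(data[0], data[i]);                   // largest item goes to data[i]
        MoveDown(data, 0, i - 1);                      // restore the heap on data[0..i-1]
    }
}
```

Building the heap starts at position size/2 - 1 because every position after it is a leaf and already satisfies the heap property on its own.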

Example of Heap Sort

We first transform the data into a heap.
2 8 6 1 10 15 3 12 11    initial array; the initial tree is formed from it level by level

[Figures: tree drawings of every step. The array contents at each step are listed below in array order, assuming the move-down step exchanges a node with its larger child.]

We turn the array into a heap first
2 8 6 12 10 15 3 1 11    move down 1 (position 3)
2 8 15 12 10 6 3 1 11    move down 6 (position 2)
2 12 15 11 10 6 3 1 8    move down 8 (position 1)
15 12 6 11 10 2 3 1 8    move down 2 (position 0); the array is now a heap

Now we start to sort the elements. In each step, swap the root with the last element of the heap and then restore the heap; elements after | are in their final positions.
8 12 6 11 10 2 3 1 | 15    swap the root with the last element
12 11 6 8 10 2 3 1 | 15    restore the heap
1 11 6 8 10 2 3 | 12 15    swap the root with the last element
11 10 6 8 1 2 3 | 12 15    restore the heap
3 10 6 8 1 2 | 11 12 15    swap the root with the last element
10 8 6 3 1 2 | 11 12 15    restore the heap
2 8 6 3 1 | 10 11 12 15    swap the root with the last element
8 3 6 2 1 | 10 11 12 15    restore the heap
1 3 6 2 | 8 10 11 12 15    swap the root with the last element
6 3 1 2 | 8 10 11 12 15    restore the heap
2 3 1 | 6 8 10 11 12 15    swap the root with the last element
3 2 1 | 6 8 10 11 12 15    restore the heap
1 2 | 3 6 8 10 11 12 15    swap the root with the last element
2 1 | 3 6 8 10 11 12 15    restore the heap
1 | 2 3 6 8 10 11 12 15    swap the root with the last element

Place the elements into the array using breadth-first traversal
1 2 3 6 8 10 11 12 15

Complexity of heap sort
Heap sort requires a lot of movement, which can be inefficient for large objects.
The first phase, where we turn the array into a heap, requires O(n) steps.
In the second phase, where we sort the elements while maintaining the heap, we exchange the root with the element in position i n-1 times and restore the heap n-1 times. Each restore takes O(log n), so this phase requires O(n-1) swaps plus O(n log n) operations to restore the heap.
Total = O(n) + O(n log n) + O(n-1) = O(n log n)

Quick Sort
This is generally regarded as one of the fastest sorting methods in practice. In this scheme:
One of the elements in the array is chosen as the pivot.
Then the array is divided into sub-arrays:
The elements smaller than the pivot go into one sub-array.
The elements bigger than the pivot go into another sub-array.
The pivot goes in the middle, between these two sub-arrays.
Then each sub-array is partitioned the same way as the original array, and the process repeats recursively.

Algorithm of quick sort

QuickSort(array[])
    if length(array) > 1
        choose a pivot;
        // partition array into array1 and array2
        while there are elements left in array
            include the element either in array1    // if element <= pivot
            or in array2;                           // if element >= pivot
        QuickSort(array1);
        QuickSort(array2);

Complexity of quick sort
The best case is when the arrays are always partitioned equally.
For the best case, the running time is O(n log n).
The running time for the average case is also O(n log n).
The worst case happens if the pivot is always either the smallest or the largest element in the array. In the worst case, the running time is O(n^2).
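
The pseudocode can be followed almost literally if the two sub-arrays are built as separate containers. The sketch below does that for clarity: it is an out-of-place version that copies elements, not the usual in-place quicksort. The name QuickSortSketch and the choice of the middle element as pivot are assumptions made for the illustration.

```cpp
#include <cstddef>
#include <vector>

// Out-of-place quicksort that mirrors the pseudocode: split around a pivot
// into two new vectors, sort each recursively, then join the pieces.
template <class T>
std::vector<T> QuickSortSketch(const std::vector<T>& array) {
    if (array.size() <= 1) return array;

    const std::size_t pivotIndex = array.size() / 2;  // middle element as pivot
    const T pivot = array[pivotIndex];

    std::vector<T> array1, array2;
    for (std::size_t i = 0; i < array.size(); i++) {
        if (i == pivotIndex) continue;                // the pivot goes in the middle
        if (array[i] < pivot) array1.push_back(array[i]);
        else array2.push_back(array[i]);              // elements >= pivot
    }

    std::vector<T> result = QuickSortSketch(array1);
    result.push_back(pivot);
    const std::vector<T> right = QuickSortSketch(array2);
    result.insert(result.end(), right.begin(), right.end());
    return result;
}
```

Because the pivot itself is left out of both sub-arrays, each recursive call works on strictly fewer elements, so the recursion always terminates even when all keys are equal.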

Code for quick sort

template<class T>
void quicksort(T data[], int first, int last) {
    int lower = first+1, upper = last;
    swap(data[first], data[(first+last)/2]);    // move the middle element (the pivot) to the front
    T pivot = data[first];
    while (lower <= upper) {
        while (data[lower] < pivot)
            lower++;
        while (pivot < data[upper])
            upper--;
        if (lower < upper)
            swap(data[lower++], data[upper--]);
        else lower++;
    }
    swap(data[upper], data[first]);             // put the pivot in its final position
    if (first < upper-1)
        quicksort(data, first, upper-1);
    if (upper+1 < last)
        quicksort(data, upper+1, last);
}

template<class T>
void quicksort(T data[], int n) {
    int i, max;
    if (n < 2)
        return;
    for (i = 1, max = 0; i < n; i++)            // find the largest element
        if (data[max] < data[i])
            max = i;
    swap(data[n-1], data[max]);                 // the largest element goes last and stops the scan of lower
    quicksort(data, 0, n-2);
}

Example of Quick Sort

81 31 75 13 43 57 92 65 26 0    initial array
Select pivot: 65
Partition: 0 13 26 43 31 57 | 65 | 92 81 75
Recursively apply quicksort to both partitions.
The result will ultimately be a sorted array:
0 13 26 31 43 57 65 75 81 92

Radix Sort
Radix refers to the base of the number system. For example, the radix for decimal numbers is 10, for hexadecimal numbers it is 16, and for the English alphabet it is 26.
Radix sort has also been called bin sort in the past.
The name bin sort comes from the mechanical devices that were used to sort keypunched cards.
Cards would be directed into bins and returned to the deck in a new order, and then redirected into bins again.
For integer data, the repeated passes of a radix sort focus on the ones place value, then on the tens place value, then on the hundreds place value, etc.
For character-based data, the focus would be placed on the rightmost character, then on the second character from the right, etc.

Algorithm and Code for Radix Sort
Assuming the numbers to be sorted are all decimal integers

RadixSort(array[])
    for (d = 1; d <= the position of the leftmost digit of the longest number; d++)
        distribute all numbers among piles 0 through 9 according to the d-th digit;
        put all integers on one list;

// assumes non-negative keys
void radixsort(long data[], int n) {
    int i, j, k;
    long factor;                     // 1, 10, 100, ...: selects the current digit
    const int radix = 10;            // because digits go from 0 to 9
    const int digits = 10;           // maximum number of digits in a key
    Queue<long> queues[radix];       // Queue is the queue class assumed by the slides
    for (i = 0, factor = 1; i < digits; factor = factor*radix, i++) {
        for (j = 0; j < n; j++)
            queues[(data[j] / factor) % radix].enqueue(data[j]);
        for (j = k = 0; j < radix; j++)
            while (!queues[j].empty())
                data[k++] = queues[j].dequeue();
    }
}

Example of Radix Sort
Assume the data are:
459 254 472 534 649 239 432 654 477

Radix sort will arrange the values into 10 bins based upon the ones place value (bins not listed are empty)
2: 472 432
4: 254 534 654
7: 477
9: 459 649 239

The sublists are collected and made into one large bin (in the order given)
472 432 254 534 654 477 459 649 239

Then radix sort will arrange the values into 10 bins based upon the tens place value
3: 432 534 239
4: 649
5: 254 654 459
7: 472 477

The sublists are collected and made into one large bin (in the order given)
432 534 239 649 254 654 459 472 477

Radix sort will arrange the values into 10 bins based upon the hundreds place value (done!)
2: 239 254
4: 432 459 472 477
5: 534
6: 649 654

The sublists are collected and the numbers are sorted
239 254 432 459 472 477 534 649 654
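
The three passes of this example can be reproduced with standard containers. The sketch below uses std::queue in place of the Queue class the slides assume; the names RadixPass and RadixSortSketch are hypothetical, and the keys are assumed to be non-negative.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// One distribution pass: factor = 1 selects the ones digit, 10 the tens digit, ...
// Keys are assumed to be non-negative.
void RadixPass(std::vector<long>& data, long factor) {
    const int radix = 10;
    std::queue<long> bins[radix];
    for (long value : data) bins[(value / factor) % radix].push(value);

    std::size_t k = 0;
    for (int b = 0; b < radix; b++)          // collect the bins in order 0..9
        while (!bins[b].empty()) {
            data[k++] = bins[b].front();
            bins[b].pop();
        }
}

// Full radix sort for keys with at most `digits` decimal digits.
void RadixSortSketch(std::vector<long>& data, int digits) {
    long factor = 1;
    for (int d = 0; d < digits; d++, factor *= 10) RadixPass(data, factor);
}
```

Each bin is a queue so that values leave a bin in the order they entered it; this keeps every pass stable, which is what lets the later passes build on the earlier ones.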

Another Example of Radix Sort
Assume the data are:
9 54 472 534 39 43 654 77
To make it simple, rewrite the numbers so that they all have three digits:
009 054 472 534 039 043 654 077

Radix sort will arrange the values into 10 bins based upon the ones place value (bins not listed are empty)
2: 472
3: 043
4: 054 534 654
7: 077
9: 009 039

The sublists are collected and made into one large bin (in the order given)
472 043 054 534 654 077 009 039

Then radix sort will arrange the values into 10 bins based upon the tens place value
0: 009
3: 534 039
4: 043
5: 054 654
7: 472 077

The sublists are collected and made into one large bin (in the order given)
009 534 039 043 054 654 472 077

Radix sort will arrange the values into 10 bins based upon the hundreds place value (done!)
0: 009 039 043 054 077
4: 472
5: 534
6: 654

The sublists are collected and the numbers are sorted
009 039 043 054 077 472 534 654

Assume the data are:
area book close team new place print
To sort the above elements using radix sort you need 26 buckets, one for each letter. You also need one more bucket for a character that represents a space and has the lowest value. Suppose that character is the question mark “?”.
You can rewrite the data as follows:
area? book? close team? new?? place print
Now all words have 5 characters and it is easy to compare them with each other.
To do the sorting, start from the rightmost character, place the data into the appropriate buckets and collect them. Then place them into buckets based on the second character from the right and collect them again, and so on.
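
The same procedure for words can be sketched with 27 buckets. The names BucketOf and RadixSortWords are hypothetical, and the code assumes the words contain only lowercase letters, so the pad character '?' can take bucket 0.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Bucket index: 0 for the pad character '?', 1..26 for 'a'..'z'.
// Assumes lowercase words; any other character would need its own mapping.
int BucketOf(char c) { return c == '?' ? 0 : c - 'a' + 1; }

// Radix sort for lowercase words, rightmost character first.
// Shorter words are padded on the right with '?'.
std::vector<std::string> RadixSortWords(std::vector<std::string> words) {
    std::size_t width = 0;
    for (const std::string& w : words)
        if (w.size() > width) width = w.size();
    for (std::string& w : words) w.resize(width, '?');    // "new" -> "new??"

    for (std::size_t pos = width; pos-- > 0;) {            // rightmost character first
        std::vector<std::vector<std::string>> buckets(27);
        for (const std::string& w : words) buckets[BucketOf(w[pos])].push_back(w);
        words.clear();
        for (const auto& bucket : buckets)                 // collect buckets in order
            words.insert(words.end(), bucket.begin(), bucket.end());
    }

    for (std::string& w : words)                           // strip the padding again
        w.erase(w.find_last_not_of('?') + 1);
    return words;
}
```

Giving the pad character the lowest bucket is what makes a shorter word sort before a longer word that starts with it.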

Complexity of Radix Sort
The complexity is O(n).
The key size (for example, the maximum number of digits) is a factor, but the relationship is still linear: for keys of at most 3 digits, for example, 3n is still O(n).
Although O(n) is theoretically an impressive running time for a sort, it does not include the cost of the queue implementation.
Further, if the radix r (the base) is large and a large amount of data has to be sorted, then radix sort requires r queues of size at most n, and the number r*n is O(rn), which can be substantially large depending on the size of r.