Published by Jeffrey Morton. Modified over 9 years ago.
1
Joe Meehean
2
Problem
  arrange comparable items in a list into sorted order
Most sorting algorithms involve comparing item values
We assume items define the <, >, and == operators
3
void sort(Iterator begin, Iterator end)
  data items must overload the < operator
void sort(Iterator begin, Iterator end, Comparator cmp)
  Comparator is a comparison functor
  cmp(a, b) returns true if a should go before b in the sorted list
Often implemented using quick sort
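A minimal sketch of both overloads in use, assuming the C++ standard library's std::sort. The Student type, the ByGradeDescending functor, and the two helper functions are hypothetical names made up for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical item type, used only for this illustration.
struct Student {
    std::string name;
    int grade;
};

// Comparison functor: returns true if a should go before b.
struct ByGradeDescending {
    bool operator()(const Student& a, const Student& b) const {
        return a.grade > b.grade;
    }
};

// Two-argument form: relies on operator< of the item type.
std::vector<int> sortedAscending(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Three-argument form: the functor decides the order.
std::vector<Student> sortedByGrade(std::vector<Student> v) {
    std::sort(v.begin(), v.end(), ByGradeDescending());
    return v;
}
```

The functor form is useful when the item type has no natural operator<, or when the same items need different orderings in different places.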
4
Aspects we care about
  run time
  memory cost
Types of algorithms
  comparison vs. non-comparison
Examples of important sorting algorithms
Google Search: real-world example
5
Obvious algorithms are O(N^2)
Clever ones are O(NlogN)
  special-purpose sorts can go even faster
Additional considerations
  does the algorithm always take worst-case time?
  what is the average case?
  what happens when the list is already sorted?
6
In-place
  sorts the list with constant extra memory
  e.g., temporary variables
Not in-place
  requires additional memory in relation to input size
  e.g., another parallel list
7
Comparison sort
  compare items in the list
  place smaller items near the front
  fastest worst case: O(NlogN)
Non-comparison sort
  sort using special properties of items
  use/extrapolate additional information
  e.g., a non-comparison sort can run in O(Range + N)
8
Sorting algorithm based on heaps
Idea
  insert items from the unsorted list into a heap
  use heap::removeMin to get items out of the heap in sorted order
  put items back into the list in sorted order
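The idea above can be sketched as follows, assuming std::priority_queue stands in for the heap class (its top/pop pair plays the role of heap::removeMin). The function name heapSortWithExtraHeap is made up for this illustration.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Sorts v by pushing every item into a separate min-heap and popping them back.
// Not in-place: the heap holds a second copy of all N items.
void heapSortWithExtraHeap(std::vector<int>& v) {
    // std::greater turns std::priority_queue into a min-heap.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    for (int x : v) {
        heap.push(x);            // N inserts, O(log N) each
    }
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] = heap.top();       // smallest remaining item
        heap.pop();              // N removals, O(log N) each
    }
}
```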
9
Problems with this approach
Complexity not ideal
  inserting N items into a heap is O(NlogN)
  removing N items from the heap is O(NlogN)
  it would be better if we could build the heap in less than O(NlogN)
Memory cost
  not in-place
  need the original list plus a heap
10
Can “heapify” a vector/array in O(N)
  convert an unsorted vector into a max heap
For each parent node (N/2 down to 0)
  make sure it's larger than its children
  if it's not, swap the parent with its largest child
  shiftDown(int pos, K val)
Minor complication
  the vector starts at 0, not 1 like a normal heap
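A minimal sketch of heapify on a 0-based vector of ints. The shiftDown signature here takes the vector and a heap size rather than the (int pos, K val) form named on the slide; that change is an assumption made to keep the example self-contained.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Moves the item at index pos down within the first n items of v
// until neither child is larger (max heap, 0-based indices).
void shiftDown(std::vector<int>& v, std::size_t pos, std::size_t n) {
    while (true) {
        std::size_t left = 2 * pos + 1;    // children of pos when the root is at 0
        std::size_t right = left + 1;
        std::size_t largest = pos;
        if (left < n && v[left] > v[largest]) largest = left;
        if (right < n && v[right] > v[largest]) largest = right;
        if (largest == pos) return;        // parent already larger than both children
        std::swap(v[pos], v[largest]);
        pos = largest;
    }
}

// Converts an unsorted vector into a max heap in place.
void heapify(std::vector<int>& v) {
    // Visit the parent nodes from the last one back to the root.
    for (std::size_t i = v.size() / 2; i-- > 0;) {
        shiftDown(v, i, v.size());
    }
}
```

On the vector 1 4 7 3 0 6 8 5 9 2 this yields the max heap 9 5 8 4 2 6 7 1 3 0.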
11-28
[Animation: heapify applied to the vector 1 4 7 3 0 6 8 5 9 2 (positions 0-9). Each parent P, from position 4 back to position 0, is compared with its children L and R and swapped with the larger child until it is larger than both. Result: the max heap 9 5 8 4 2 6 7 1 3 0.]
29
O(N) proof is somewhat complex
  see Weiss 6.4.3 if interested
Intuitively it is faster because
  we only need to shift down half the nodes
  starting at the bottom reduces the number of shift downs
  by contrast, inserting the nodes one at a time does a shift for every insert (all N nodes)
30
Removing an item from the heap creates a space at the end
This space is where the largest item should go in the finished array
Why don't we just put it there?
  recall that in heap::removeMax we return h[first] and replace h[first] with h[last]
  instead, let's swap h[first] with h[last]
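The swap trick can be sketched as a complete in-place heap sort. This is an illustration under the same assumptions as a 0-based max heap of ints; the function names and the shiftDown signature are chosen for this example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Moves the item at index pos down within the first n items of v
// until neither child is larger (max heap, 0-based indices).
void shiftDown(std::vector<int>& v, std::size_t pos, std::size_t n) {
    while (true) {
        std::size_t left = 2 * pos + 1;
        std::size_t right = left + 1;
        std::size_t largest = pos;
        if (left < n && v[left] > v[largest]) largest = left;
        if (right < n && v[right] > v[largest]) largest = right;
        if (largest == pos) return;
        std::swap(v[pos], v[largest]);
        pos = largest;
    }
}

// Sorts v in ascending order using only constant extra memory.
void heapSort(std::vector<int>& v) {
    std::size_t n = v.size();
    // Phase 1: heapify, O(N).
    for (std::size_t i = n / 2; i-- > 0;) {
        shiftDown(v, i, n);
    }
    // Phase 2: repeatedly move the max into the space freed at the end.
    for (std::size_t last = n; last-- > 1;) {
        std::swap(v[0], v[last]);   // largest remaining item reaches its final slot
        shiftDown(v, 0, last);      // heap now covers v[0..last-1]
    }
}
```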
31-39
[Animation: in-place heap sort of the max heap 9 5 8 4 2 6 7 1 3 0. The root is swapped with the last item of the heap, the heap shrinks by one, and the new root is shifted down; the sorted region grows from the right end of the vector. Result: 0 1 2 3 4 5 6 7 8 9.]
40
[Figure: item value plotted against position in the array.]
41
Heapify: O(N)
In-place conversion of heap into sorted array: O(NlogN)
O(N) + O(NlogN) = O(NlogN)
Costs the same if the array was sorted to begin with
43
Fundamental idea
  if all values in sorted array A are less than all values in sorted array B, we can easily combine them
  an array of size 1 is sorted
[Figure: sorted array A = 0 1 2 3 4 and sorted array B = 5 6 7 8 9, combined into one sorted array.]
44
1) If the number of items in A is one or zero, return
2) Choose a value from A to be the pivot
3) Partition A into sub-lists
   all values ≤ pivot into the left part
   all values ≥ pivot into the right part
4) Return quicksort(L-part) + pivot + quicksort(R-part)
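The four steps can be followed literally in a short, not in-place sketch. The pivot choice (the middle item) and sending ties to the right part are assumptions; the steps themselves leave both open.

```cpp
#include <cstddef>
#include <vector>

// Returns a sorted copy of a, building new left and right sub-lists at each level.
std::vector<int> quicksort(const std::vector<int>& a) {
    if (a.size() <= 1) return a;                      // step 1
    std::size_t pivotIndex = a.size() / 2;            // step 2 (assumed choice)
    int pivot = a[pivotIndex];
    std::vector<int> left, right;                     // step 3
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (i == pivotIndex) continue;                // the pivot is re-added in step 4
        if (a[i] < pivot) left.push_back(a[i]);
        else right.push_back(a[i]);                   // values equal to the pivot go right
    }
    std::vector<int> result = quicksort(left);        // step 4
    result.push_back(pivot);
    std::vector<int> sortedRight = quicksort(right);
    result.insert(result.end(), sortedRight.begin(), sortedRight.end());
    return result;
}
```

This version allocates new vectors at every level, which is why practical implementations partition in place instead.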
45-59
[Animation: quick sort of an 8-item array holding the values 0-7. The first pivot is 4; each partition is then split around its own pivot until every sub-list has at most one item.]
In practice, items are already in the correct place when we get to the bottom. All work is done on the way down.
60
Goal: choose the median value so that the left and right arrays are the same size
If we choose the smallest value
  each partition only reduces the problem by one
  the sorting tree height will be N instead of log N
Same if we choose the largest
61
Actually finding the median is O(N)
Choose the 1st item
  very bad if A is already sorted or reverse sorted
Choose a random item (index)
  OK if you have a fast, accurate random number generator
  we don't
62
Median of 3
  reduces comparisons by 14%
Compare the center, left, and right items
  choose the median as the pivot
  only works if >= 3 items are to be sorted
Partitioning optimization
  place the smallest of the 3 in left (≤ pivot)
  place the largest of the 3 in right (≥ pivot)
  place the pivot in the center
63-74
[Animation: median-of-3 partitioning of the array 7 5 6 2 3 0 1 4.]
Choose the median of 3 and place the three items in the right positions: 2 5 6 4 3 0 1 7 (pivot 4 in the center)
Swap the pivot with right − 1: 2 5 6 1 3 0 4 7
Use indices lo and hi to partition the remainder of the array
  increment lo until it finds a value ≥ pivot
  decrement hi until it finds a value ≤ pivot
When lo and hi stop
  swap the items at lo and hi
  increment lo, decrement hi
Repeat until lo > hi
Restore the pivot: swap it with the item at lo
Done: 2 0 3 1 4 5 6 7
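The partitioning steps above can be sketched as an in-place quick sort on ints. The function names and the special handling of two-item ranges are assumptions made for this illustration, not code taken from the slides.

```cpp
#include <utility>
#include <vector>

// Orders a[left], a[center], a[right] among themselves, then parks the
// median (the pivot) at right - 1. Returns the pivot value.
int medianOfThree(std::vector<int>& a, int left, int right) {
    int center = left + (right - left) / 2;
    if (a[center] < a[left]) std::swap(a[center], a[left]);
    if (a[right] < a[left]) std::swap(a[right], a[left]);
    if (a[right] < a[center]) std::swap(a[right], a[center]);
    std::swap(a[center], a[right - 1]);
    return a[right - 1];
}

// Sorts a[left..right] (inclusive bounds) in place.
void quickSort(std::vector<int>& a, int left, int right) {
    if (right - left < 1) return;                   // zero or one item
    if (right - left == 1) {                        // two items: too few for median of 3
        if (a[right] < a[left]) std::swap(a[left], a[right]);
        return;
    }
    int pivot = medianOfThree(a, left, right);
    int lo = left;
    int hi = right - 1;
    while (true) {
        while (a[++lo] < pivot) {}                  // stops on a value >= pivot
        while (pivot < a[--hi]) {}                  // stops on a value <= pivot
        if (lo >= hi) break;
        std::swap(a[lo], a[hi]);
    }
    std::swap(a[lo], a[right - 1]);                 // restore the pivot
    quickSort(a, left, lo - 1);
    quickSort(a, lo + 1, right);
}

// Convenience overload that sorts the whole vector.
void quickSort(std::vector<int>& a) {
    quickSort(a, 0, static_cast<int>(a.size()) - 1);
}
```

Because a[left] ≤ pivot and a[right − 1] = pivot after medianOfThree, the two inner loops need no explicit bounds checks: each is guaranteed to stop on one of those sentinels.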
75
[Figure: item value plotted against position in the array.]
76
Values equal to the pivot
  should lo stop when A[lo] ≥ pivot, or only when A[lo] > pivot?
  worst case: the entire list is the same value
  if we didn't swap for duplicates, lo would go all the way to the right
    uneven partition
  best to swap values that are equal to the pivot
77
Small lists
  insertion sort is faster for N < 20
Quick sort is recursive
  will always, eventually, sort lists < 20
  large lists are broken down into small ones
Commonly, quick sort is used until each sub-list is <= 10 items
  then the sub-list is sorted using insertion sort
  reduces run time by 15%
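A minimal sketch of the insertion sort that would take over for small sub-lists. The inclusive left/right bounds are an assumption, chosen so that it could be called on one sub-range of a larger array.

```cpp
#include <vector>

// Sorts a[left..right] (inclusive bounds) by inserting each item
// into the already-sorted run to its left.
void insertionSort(std::vector<int>& a, int left, int right) {
    for (int i = left + 1; i <= right; ++i) {
        int item = a[i];
        int j = i;
        // Slide larger items one slot to the right to open a gap for item.
        while (j > left && item < a[j - 1]) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = item;
    }
}
```

A quick sort with a cutoff would call this instead of recursing once right - left falls below the chosen threshold.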
78
At each level, N comparisons to pivots
Levels: worst case
  pivot is the smallest or largest value
  call graph is linear: N levels
Levels: best case
  pivot is the median value
  call graph is a balanced binary tree: log2 N levels
Worst case: O(N^2)
Best case: O(NlogN)
No better/worse if the array is already sorted
79
Quick sort
  A: select the 1st item as the pivot
  B: select a random item as the pivot
A
  O(NlogN) on average
  O(N^2) if the list is already mostly sorted
B
  O(NlogN) on average
  O(N^2) if the random numbers always select the smallest item
80
Algorithms A and B would be identical if all inputs were equally likely
  but partially sorted lists are far more likely
Bad inputs
  A is always bad
  B has different run times for the same input
    depends on the random numbers selected
    unlike inputs, all random numbers are equally likely
  B is more likely to have the average run time
82
General comparison sorts cannot beat O(N log N) in the worst case
Non-comparison sorts
  use extra information about items
  can sort faster than O(N log N)
  bucket sort
  radix sort
83
Used for sorting non-negative integers with a small range (less than M)
Algorithm
  for input A_1, A_2, A_3, ..., A_N, each a non-negative int less than M
  make an array count of size M, initialized to zeros
  for each input A_i => count[A_i]++
  scan count, printing each int as many times as we've seen it
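The algorithm above can be sketched as follows. Writing the values back into the input vector instead of printing them is an assumption made here so that the result can be checked; the function name is chosen for this example.

```cpp
#include <cstddef>
#include <vector>

// Sorts a in place. Every item must satisfy 0 <= item < m.
// Uses O(M) extra memory and O(N + M) time.
void bucketSort(std::vector<int>& a, int m) {
    std::vector<int> count(static_cast<std::size_t>(m), 0);
    for (int x : a) {
        ++count[x];                       // tally each value: O(N)
    }
    std::size_t j = 0;                    // next free slot in a
    for (int value = 0; value < m; ++value) {
        // Emit value once per occurrence seen: O(M) over the whole scan, plus N writes.
        for (int k = 0; k < count[value]; ++k) {
            a[j++] = value;
        }
    }
}
```

For example, with m = 10 the input 2 3 6 4 3 0 1 9 becomes 0 1 2 3 3 4 6 9.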
84-104
[Animation: bucket sort of A = 2 3 6 4 3 0 1 9 (max = 9). A first pass tallies each value of A in a count array with positions 0-9; a second pass scans count and writes the values back into A in sorted order. Result: A = 0 1 2 3 3 4 6 9.]
105
Not in-place
  requires an extra M memory
Complexity
  scan the original list: O(N)
  scan the “count” list: O(M)
  total: O(M + N)
106
Requires items that are sequences of comparables
  numbers (sequence of digits)
  strings (sequence of characters)
Useful for short sequences of comparables
Idea
  sort each position in the sequence separately
107
Use an auxiliary array of queues
  array must be large enough to store queues for the full range of digits
  0-9 for numbers
  a-z for words
Process sequences from right to left
  least significant “digit” first
Each pass evaluates the next digit
  store each item in a queue in the auxiliary array based on the value of the current digit
  dequeue items back into the original array
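A minimal sketch of this scheme for non-negative decimal integers, assuming the caller passes the maximum number of digits. The function name and the digits parameter are made up for this illustration.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Sorts non-negative ints that have at most `digits` decimal digits.
void radixSort(std::vector<int>& a, int digits) {
    std::vector<std::queue<int>> buckets(10);        // one queue per digit 0-9
    int divisor = 1;                                 // selects the current digit
    for (int pass = 0; pass < digits; ++pass) {
        // Distribute: each item goes to the queue for its current digit.
        for (int x : a) {
            buckets[(x / divisor) % 10].push(x);
        }
        // Collect: dequeue in digit order back into the original array.
        std::size_t i = 0;
        for (std::queue<int>& q : buckets) {
            while (!q.empty()) {
                a[i++] = q.front();
                q.pop();
            }
        }
        divisor *= 10;                               // move one digit to the left
    }
}
```

Queues matter here because they preserve the order produced by earlier passes: items with the same current digit leave the queue in the order they entered it.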
108-126
[Animation: radix sort with ten queues, one per digit 0-9.]
Input: [132, 355, 104, 327, 111, 285, 391, 543, 123, 535]
Pass 1 (ones digit)
  queues: 1: [111, 391]  2: [132]  3: [543, 123]  4: [104]  5: [355, 285, 535]  7: [327]
  array: [111, 391, 132, 543, 123, 104, 355, 285, 535, 327]
Pass 2 (tens digit)
  queues: 0: [104]  1: [111]  2: [123, 327]  3: [132, 535]  4: [543]  5: [355]  8: [285]  9: [391]
  array: [104, 111, 123, 327, 132, 535, 543, 355, 285, 391]
Pass 3 (hundreds digit)
  queues: 1: [104, 111, 123, 132]  2: [285]  3: [327, 355, 391]  5: [535, 543]
  array: [104, 111, 123, 132, 285, 327, 355, 391, 535, 543]
127
Each pass
  puts items in the correct queue: O(N)
  moves items from the queues back to the array: O(N)
  pass total: O(N)
# of passes depends on # of digits
O(N * #digits)
128
Sort every number between 0 and 4,294,967,296 (4 billion something)
Merge sort
  N log2 N
  N = 4 billion
  log2(4 billion) = 32
  4 billion * 32 = 128 billion operations
129
Radix sort
  O(N * #digits)
  N = 4 billion
  # of digits = 10
  4 billion * 10 = 40 billion operations
Radix sort is about 3 times faster
130
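Sort       Best            Worst           Avg             In-Place
Selection  O(N^2)          O(N^2)          O(N^2)          Yes
Insertion  O(N)            O(N^2)          O(N^2)          Yes
Heap       O(NlogN)        O(NlogN)        O(NlogN)        Yes
Merge      O(NlogN)        O(NlogN)        O(NlogN)        No
Quick      O(NlogN)        O(N^2)          O(NlogN)        Yes
Bucket     O(N + M)        O(N + M)        O(N + M)        No
Radix      O(N * #digits)  O(N * #digits)  O(N * #digits)  No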