Simple Sorting Methods: Bubble, Selection, Insertion, Shell
Outline Importance of sorting Sorting algorithms bubble sort insertion sort selection sort shell sort
Searching Want to know if collection contains value keep looking until find it… …or determine it isn’t there Unsorted collections linear search: O(N) Sorted collections binary search: O(log N)
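A minimal Python sketch of the two searches named on the slide above. The function names and the -1 "not found" convention are illustrative assumptions, not part of the slides.

```python
from typing import Sequence


def linear_search(items: Sequence[int], target: int) -> int:
    """Look at every item in turn; works on unsorted data, O(N)."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1  # looked at everything: it isn't there


def binary_search(sorted_items: Sequence[int], target: int) -> int:
    """Halve the search range each step; needs sorted data, O(log N)."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1
```

Binary search only pays off if the collection is already sorted, which is the motivation for the sorting methods that follow.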
O(N) vs. O(log N) Small differences at small numbers 3N vs. 6 log N for N = 30? 90 vs. about 29 (log base 2) Large differences at large numbers 3N vs. 6 log N for N = 1,000,000? 3,000,000 vs. about 120 Will eventually overcome any amount of overhead (3N vs. 6000 log N?)
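The cost comparison can be checked numerically. This is a rough sketch that assumes logarithms are base 2 and treats the constants 3, 6 and 6000 from the slide as per-step overheads; the function names are made up for illustration.

```python
import math


def linear_cost(n: int, overhead: float = 3) -> float:
    """Cost model overhead * N for a linear search."""
    return overhead * n


def log_cost(n: int, overhead: float = 6) -> float:
    """Cost model overhead * log2(N) for a binary search."""
    return overhead * math.log2(n)


n = 1_000_000
print(linear_cost(n), round(log_cost(n)))        # 3N vs. 6 log N
print(linear_cost(n), round(log_cost(n, 6000)))  # 3N vs. 6000 log N
```

Even with a thousand times more overhead per step, the logarithmic cost stays well below the linear one at N = 1,000,000.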
Some Simple Sorting Methods Bubble sort “swap” items that are out of order Selection sort “select” items and place in their final position Insertion sort “insert” items into a sorted sub-array Shell sort “insert” using longer distances, then shorter
Bubble Sort Simplest algorithm: look at items next to each other if they’re out of order, swap them keep going until everything in order Items percolate toward their final positions big items move down quickly small items move up slowly
Bubble Sort Start at low end For each pair in the list… if lower item is bigger, swap the pair biggest item will move to the end repeat until no more moves
Example list: 6 100 3 –2 8
    after pass 1: 6 3 –2 8 100
    after pass 2: 3 –2 6 8 100
    after pass 3: –2 3 6 8 100 (a fourth pass makes no moves)
Bubble Sort
to BubbleSort( List a )
    for i ← Length(a) down to 1
        for j ← 1 .. i - 1
            if (a[j-1] > a[j])
                Swap(a[j], a[j-1]);
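One way to render the bubble sort pseudocode as runnable Python, keeping the same loop bounds. The name `bubble_sort` and the in-place convention are choices made for this sketch.

```python
def bubble_sort(a: list) -> None:
    """Sort the list in place by swapping adjacent out-of-order pairs."""
    for i in range(len(a), 0, -1):      # i = Length(a) down to 1
        for j in range(1, i):           # j = 1 .. i - 1
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]  # Swap(a[j], a[j-1])


lst = [6, 100, 3, -2, 8]
bubble_sort(lst)
print(lst)
```

After each outer pass the largest remaining item sits at position i - 1, which is why the inner loop can stop one position earlier every time.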
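Bubble Sort Complexity
for i ← Length(a) down to 1
    for j ← 1 .. i - 1
        if (a[j-1] > a[j])              // 1 comparison
            Swap(a[j], a[j-1]);         // 3 assignments
Work per inner step: 1 + (0 or 3), so best 1, worst 4, average 2.5
$$\sum_{i=1}^{N} \sum_{j=0}^{i-1} \bigl(1 + (0 \text{ or } 3)\bigr)$$
$$\sum_{i=1}^{N} \sum_{j=1}^{i} 2.5 = 2.5 \sum_{i=1}^{N} i = 2.5\,\frac{N(N+1)}{2} = 1.25\,(N^2 + N)$$
(the sums count i inner steps per pass; the loop as written makes i – 1, which changes only the lower-order term)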
Bubble Sort Considered Bad Not generally very good lots of work moving things around Can be improved track where last swap took place everything above there is sorted Improved version OK if list almost sorted (Head-2-Head sorting link uses this version) just need to know that ahead of time
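A sketch of the improvement described above: remember where the last swap happened and treat everything from there up as sorted. The function name and the returned comparison count are additions for illustration; this is not necessarily the version used by the Head-2-Head link.

```python
def bubble_sort_last_swap(a: list) -> int:
    """Sort in place; return the number of element comparisons made."""
    comparisons = 0
    end = len(a)            # items at index >= end are in final position
    while end > 1:
        last_swap = 0
        for j in range(1, end):
            comparisons += 1
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                last_swap = j   # nothing above index j moved after this
        end = last_swap         # no swaps at all ends the loop
    return comparisons
```

On an already-sorted list this makes a single pass of N - 1 comparisons, which is why the improved version is acceptable when the list is known to be almost sorted.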
Exercise Show the evolution of the following array under bubble sort. Show the result after each (top-level) pass thru the array [15, 3, 21, 45, 7]
Selection Sort Based on “copying” a list in order select the smallest item in the list add it to end of new list cross it off the old list But do it all in one array front part of list is sorted won’t need to be changed again – everything left “unsorted” is bigger
Selection Sort Start at low end For each position in the list… find the smallest unsorted element swap it into position Example list: 6 100 3 –2 8 (animation of the passes; the same list is traced pass by pass in the top-level trace)
Selection Sort Top-Level Trace
lst ← [6, 100, 3, -2, 8]; SelectionSort( lst );
        0 1 2 3 4
lst == 6 100 3 -2 8
    1st pass: find smallest item in list, swap it with first item in list, i.e. swap -2 with 6
lst == [-2] 100 3 6 8
    2nd pass: find smallest remaining item in rest of list, swap it into the second position, i.e. swap 3 with 100
lst == [-2 3] 100 6 8
    3rd pass: swap 6 with 100
lst == [-2 3 6] 100 8
    4th pass: swap 8 with 100
lst == [-2 3 6 8] 100
Exercise Show the evolution of the following array under selection sort. Show the result after each (top-level) pass [15, 3, 21, 45, 7]
Finding the Smallest Each pass requires finding the smallest item remaining unsorted need its location in the array, so we can swap it on ith pass search locations i–1 .. N–1 can start assuming first of those is smallest
Selection Sort
to SelectionSort( List a )
    for i ← 0 .. Length(a) – 2
        p ← i;
        for j ← i + 1 .. Length(a) – 1
            if (a[j] < a[p])
                p ← j;
        Swap(a[p], a[i]);
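The selection sort pseudocode translates almost line for line into Python. In this sketch the function also records a snapshot after each top-level pass, so the result can be compared with the top-level trace; the snapshot list and the function name are additions for illustration.

```python
def selection_sort_passes(a: list) -> list:
    """Selection-sort the list in place; return a copy after each pass."""
    snapshots = []
    for i in range(len(a) - 1):             # i = 0 .. Length(a) - 2
        p = i                               # assume first unsorted item is smallest
        for j in range(i + 1, len(a)):      # j = i + 1 .. Length(a) - 1
            if a[j] < a[p]:
                p = j                       # remember location of the smallest
        a[p], a[i] = a[i], a[p]             # Swap(a[p], a[i])
        snapshots.append(list(a))
    return snapshots


for state in selection_sort_passes([6, 100, 3, -2, 8]):
    print(state)
```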
Exercise List all the comparisons made between array elements while selection-sorting the following list: [2, 8, 12, 4, 9]
Work in Selection Sort
to SelectionSort( List a )
    for i ← 1 .. Length(a) – 1
        p ← i;
        for j ← i + 1 .. Length(a)
            if (a[j] < a[p])            // comparison
                p ← j;
        temp ← a[p];                    // assignment
        a[p] ← a[i];                    // assignment
        a[i] ← temp;                    // assignment
Counting Ops in Selection Sort Outer loop runs i from 1 to N – 1 inner runs j from i+1 to N
$$
\begin{aligned}
W_{Sel,worst}(N) &= \sum_{i=1}^{N-1} \Bigl(3 + \sum_{j=i+1}^{N} 1\Bigr) \\
&= \sum_{i=1}^{N-1} \Bigl(3 + \sum_{j=1}^{N} 1 - \sum_{j=1}^{i} 1\Bigr) \\
&= \sum_{i=1}^{N-1} (3 + N - i) \\
&= \sum_{i=1}^{N-1} (3 + N) - \sum_{i=1}^{N-1} i \\
&= (N - 1)(3 + N) - \frac{(N - 1)N}{2} \\
&= N^2 + 2N - 3 - \frac{N^2 - N}{2} \\
&= \frac{N^2 + 5N - 6}{2}
\end{aligned}
$$
Magnitude of Selection Sort Average case same as worst case second smallest item could be anywhere always need to look at every remaining element Worst case: $W_{Sel,worst}(N) = (N^2 + 5N - 6)/2 = O(N^2)$ Average case: $W_{Sel,ave}(N) = (N^2 + 5N - 6)/2 = O(N^2)$
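The operation count can be checked by instrumenting the sort. This sketch uses the same counting convention as the slides (one operation per comparison between array elements, three assignments per swap); the function name is illustrative.

```python
def selection_sort_ops(a: list) -> int:
    """Selection-sort in place; return comparisons + swap assignments."""
    ops = 0
    n = len(a)
    for i in range(n - 1):
        p = i
        for j in range(i + 1, n):
            ops += 1                    # one comparison of array elements
            if a[j] < a[p]:
                p = j
        a[p], a[i] = a[i], a[p]         # swap, counted as 3 assignments
        ops += 3
    return ops


n = 10
print(selection_sort_ops(list(range(n, 0, -1))), (n * n + 5 * n - 6) // 2)
```

The count does not depend on the order of the input, which matches the claim that the average case equals the worst case.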
Insertion Sort Based on inserting into an ordered list For when you’re creating a list Keep it sorted right from the start Insert each item into its proper place... ...shifting other items up as required But all items already in the array “split” array into sorted list part and input part
Insertion Sort Start at low end For each unsorted element… find its place make space insert it Example list: 6 100 3 –2 8 (animation of the insertions; the same list is traced pass by pass in the top-level trace)
Insertion Sort, Top Level Loop Multiple passes thru the list At start of ith pass, first i items are in order list of length 1 always “in order” At end of ith pass, first i+1 items in order thus need only N – 1 passes in all On ith pass, “insert” i+1st item into sorted part of list
Insertion Sort Top-Level Trace
lst ← [6, 100, 3, -2, 8]; InsertionSort( lst );
        0 1 2 3 4
lst == [6] 100 3 -2 8
    1st pass: list from 0 to 0 is sorted, insert item 1 into the sorted part of the list, i.e. insert 100 into [6]
lst == [6 100] 3 -2 8
    2nd pass: list from 0 to 1 is sorted, insert item 2 into the sorted part of the list, i.e. insert 3 into [6, 100]
lst == [3 6 100] -2 8
    3rd pass: insert -2 into [3, 6, 100]
lst == [-2 3 6 100] 8
    4th pass: insert 8 into [-2, 3, 6, 100]
lst == [-2 3 6 8 100]
Exercise Show the evolution of the following array under insertion sort. Show the result after each (top-level) pass [15, 3, 21, 45, 7] [17, 4, 8, 3]
Insertion on Pass i Move i+1st item into temporary storage Starting at the “end” of the sorted part… …move items up until you find where “new” item goes find a smaller item fall off the front of the list
Insertion on Pass 2 Trace
lst == [6, 100], 3, -2, 8
Insert 3 into [6, 100]
          0    1    2      temp
lst == [  6  100    3]     3
lst == [  6  100  100]     3
lst == [  6    6  100]     3
lst == [  3    6  100]     3
Insertion on Pass 4 Trace
lst == [-2, 3, 6, 100], 8
Insert 8 into [-2, 3, 6, 100]
     0    1    2    3    4      temp
[   -2    3    6  100    8]     8
[   -2    3    6  100  100]     8
[   -2    3    6    8  100]     8
Insertion Sort Pseudo-Code
to InsertionSort( List a )
    for i ← 0 .. Length(a) – 2
        p ← i + 1
        temp ← a[p];
        while (p > 0 && a[p – 1] > temp)
            a[p] ← a[p – 1];
            p ← p – 1;
        a[p] ← temp;
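The insertion sort pseudocode can be written in Python as follows. As a sketch, the function also returns a snapshot after each top-level pass so it can be compared with the top-level trace; the snapshot list and the function name are additions for illustration.

```python
def insertion_sort_passes(a: list) -> list:
    """Insertion-sort the list in place; return a copy after each pass."""
    snapshots = []
    for i in range(len(a) - 1):         # i = 0 .. Length(a) - 2
        p = i + 1
        temp = a[p]                     # item being inserted
        while p > 0 and a[p - 1] > temp:
            a[p] = a[p - 1]             # shift a bigger item up one place
            p -= 1
        a[p] = temp                     # drop the item into the gap
        snapshots.append(list(a))
    return snapshots


for state in insertion_sort_passes([6, 100, 3, -2, 8]):
    print(state)
```

The inner loop stops either when it finds a smaller (or equal) item or when it falls off the front of the list, the two cases named on the "Insertion on Pass i" slide.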
Exercise Show the evolution of the following array under insertion sort. Show the result after each assignment into the array (that is, inner as well as outer loops): [17, 4, 8, 3]
Work in Insertion Sort
to InsertionSort( List a )
    for i ← 1 .. Length(a) – 1
        p ← i + 1
        temp ← a[p];                            // assignment
        while (p > 1 && a[p – 1] > temp)        // comparison
            a[p] ← a[p – 1];                    // assignment
            p ← p – 1;
        a[p] ← temp;                            // assignment
Counting Ops in Insertion Sort Let N be the number of elements in the list (the size of the array/vector) Outer loop iterates N – 1 times contains two assignments Inner loop iterates at most i times worst case is a list in reverse order, where every item must be compared contains one comparison & one assignment
Counting Ops in Insertion Sort Outer loop runs i from 1 to N – 1 inner runs p from i+1 down to 2 (“worst case”)
$$
\begin{aligned}
W_{Ins,worst}(N) &= \sum_{i=1}^{N-1} \Bigl(2 + \sum_{p=2}^{i+1} 2\Bigr) \\
&= \sum_{i=1}^{N-1} (2 + 2i) \\
&= 2(N - 1) + 2\,\frac{(N - 1)N}{2} \\
&= 2N - 2 + N^2 - N \\
&= N^2 + N - 2
\end{aligned}
$$
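The worst-case count can be checked by instrumenting insertion sort and feeding it a reverse-ordered list. This sketch follows the slides' convention (count comparisons between array elements and assignments involving array elements, ignore index arithmetic); the function name is illustrative.

```python
def insertion_sort_ops(a: list) -> int:
    """Insertion-sort in place; return element comparisons + assignments."""
    ops = 0
    for i in range(len(a) - 1):
        p = i + 1
        temp = a[p]
        ops += 1                        # assignment: temp <- a[p]
        while p > 0:
            ops += 1                    # comparison: a[p - 1] > temp
            if not a[p - 1] > temp:
                break
            a[p] = a[p - 1]
            ops += 1                    # assignment: a[p] <- a[p - 1]
            p -= 1
        a[p] = temp
        ops += 1                        # assignment: a[p] <- temp
    return ops


n = 10
print(insertion_sort_ops(list(range(n, 0, -1))), n * n + n - 2)
```

On a reverse-ordered list every pass runs off the front of the list, so pass i costs exactly 2 + 2i operations.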
Average Case Analysis On average, we expect inner loop of insertion to only go ½ way to the front
$$
\begin{aligned}
W_{Ins,ave}(N) &= \sum_{i=1}^{N-1} \bigl(2 + \tfrac{1}{2}(2i)\bigr) \\
&= 2(N - 1) + \frac{(N - 1)N}{2} \\
&= 2N - 2 + \frac{N^2 - N}{2} \\
&= \frac{N^2 + 3N - 4}{2}
\end{aligned}
$$
Comparing Selection & Insertion $W_{Sel,worst}(N) = (N^2 + 5N - 6)/2$ $W_{Ins,worst}(N) = N^2 + N - 2$ N = 1000 gives: Selection sort: 502,497 operations Insertion sort: 1,000,998 operations Selection looks faster but that’s worst-case time for Insertion sort
Average Selection & Insertion $W_{Sel,ave}(N) = (N^2 + 5N - 6)/2$ $W_{Ins,ave}(N) = (N^2 + 3N - 4)/2$ N = 1000 gives: Selection sort: 502,497 operations Insertion sort: 501,498 operations They do about the same amount of work and both are $O(N^2)$
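The figures for N = 1000 come straight from the closed-form counts, which can be evaluated with a few lines of Python. The function names are made up for this sketch.

```python
def selection_ops(n: int) -> int:
    """Selection sort, worst and average case: (N^2 + 5N - 6) / 2."""
    return (n * n + 5 * n - 6) // 2


def insertion_worst_ops(n: int) -> int:
    """Insertion sort, worst case: N^2 + N - 2."""
    return n * n + n - 2


def insertion_average_ops(n: int) -> int:
    """Insertion sort, average case: (N^2 + 3N - 4) / 2."""
    return (n * n + 3 * n - 4) // 2


for n in (10, 100, 1000):
    print(n, selection_ops(n), insertion_worst_ops(n), insertion_average_ops(n))
```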
Exercise Show the evolution of each of the following arrays under both insertion and selection sorts [99, 14, 21, 12, 5] [3, 7, 14, 5, 2, 8] [9, 8, 7, 6, 5, 4, 3, 2, 1]
Shell Sort Sort far apart items first then sort the ones that are closer together Sort items separated by a given “gap” every fifth element, for example Use any sorting method insertion sort, e.g. Reduce the gap size & repeat stop when you’ve sorted with a gap of 1
Shell Sort: Gap = 5
List: 81 94 11 32 12 24 17 29 28 18 41 77 75 15
Start at location 6 Up by 1 each time Use insertion sort on items 5 apart
    24 vs. 81: Swap    17 vs. 94: Swap    29 vs. 11: OK    28 vs. 32: Swap    18 vs. 12: OK
Shell Sort: Gap = 5 (cont)
List: 24 17 11 28 12 81 94 29 32 18 41 77 75 15
    41 vs. 81: Swap, then 41 vs. 24: OK
    77 vs. 94: Swap, then 77 vs. 17: OK
    75 vs. 29: OK
    15 vs. 32: Swap, then 15 vs. 28: Swap
Shell Sort: Gap = 5 (cont)
List: 24 17 11 15 12 41 77 29 28 18 81 94 75 32
List is now “5-sorted”; the sub-lists of items 5 apart are each in order:
    24 41 81 | 17 77 94 | 11 29 75 | 15 28 32 | 12 18
Shell Sort: Gap = 3 Continue with a smaller gap
List: 24 17 11 15 12 41 77 29 28 18 81 94 75 32
Sub-lists of items 3 apart before sorting:
    24 15 77 18 75 | 17 12 29 81 32 | 11 41 28 94
Shell Sort: Gap = 3 (cont)
Sub-lists of items 3 apart after sorting:
    15 18 24 75 77 | 12 17 29 32 81 | 11 28 41 94
List is now 3-sorted: 15 12 11 18 17 28 24 29 41 75 32 94 77 81
Shell Sort: Gap = 1 Now it’s just normal insertion sort but everything’s “pretty close” to where it’s going to end up
List: 15 12 11 18 17 28 24 29 41 75 32 94 77 81
Sorted: 11 12 15 17 18 24 28 29 32 41 75 77 81 94
Number of assignments: shell sort: 13 + 13 + 15 = 41 insertion sort: 57
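The gap = 5, 3, 1 walk-through can be reproduced with a single helper that runs insertion sort over items a fixed gap apart. This is a sketch; the helper name `gap_insertion_pass` is an assumption, not something from the slides.

```python
def gap_insertion_pass(a: list, gap: int) -> None:
    """Insertion-sort, in place, every sub-list of items `gap` apart."""
    for i in range(gap, len(a)):        # start at index gap, up by 1 each time
        p, temp = i, a[i]
        while p >= gap and a[p - gap] > temp:
            a[p] = a[p - gap]           # shift the bigger item up by one gap
            p -= gap
        a[p] = temp


lst = [81, 94, 11, 32, 12, 24, 17, 29, 28, 18, 41, 77, 75, 15]
for gap in (5, 3, 1):
    gap_insertion_pass(lst, gap)
    print(gap, lst)
```

With a gap of 1 the helper is ordinary insertion sort, so the final pass always leaves the list fully sorted regardless of the earlier gaps.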
Gap Sizes Any descending sequence will do so long as it ends at 1 Some sequences: N/2, N/4, N/8, …, 1 (integer division) not especially good $A = 2^{\lfloor \log_2 N \rfloor} - 1$, A/2, A/4, …, 1 N/3, N/9, N/27, …, 1 … other sequences may be even better
Gap Sizes
N = 13
    insertion sort                      52 comps, 65 stores
    1st gap sequence: 6, 3, 1           41 comps, 77 stores
    2nd gap sequence: 7, 3, 1           46 comps, 81 stores
    3rd gap sequence: 4, 1              37 comps, 63 stores
N = 130 (random values)
    insertion sort                      4487 comps, 4618 stores
    1st: 65, 32, 16, 8, 4, 2, 1         1383 comps, 2236 stores
    2nd: 127, 63, 31, 15, 7, 3, 1       1059 comps, 1792 stores
    3rd: 43, 14, 4, 1                   1082 comps, 1592 stores
List shown (13 items): 17 24 12 32 11 94 81 15 77 41 18 28 29
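The three gap sequences listed for N = 13 and N = 130 can be generated with integer division. This sketch assumes that "A/2, A/4" means repeated integer halving and that a final gap of 1 is appended whenever the division skips it; the function names are illustrative.

```python
def _divide_down(start: int, divisor: int) -> list:
    """start, start // divisor, ... down to 1 (1 is appended if skipped)."""
    gaps = []
    gap = start
    while gap >= 1:
        gaps.append(gap)
        gap //= divisor
    if not gaps or gaps[-1] != 1:
        gaps.append(1)                  # every sequence must end at 1
    return gaps


def halving_gaps(n: int) -> list:
    """N/2, N/4, N/8, ..., 1"""
    return _divide_down(n // 2, 2)


def power_of_two_minus_one_gaps(n: int) -> list:
    """A, A/2, A/4, ..., 1 with A = 2^floor(log2 N) - 1"""
    a = (1 << (n.bit_length() - 1)) - 1
    return _divide_down(a, 2)


def thirds_gaps(n: int) -> list:
    """N/3, N/9, N/27, ..., 1"""
    return _divide_down(n // 3, 3)


for n in (13, 130):
    print(n, halving_gaps(n), power_of_two_minus_one_gaps(n), thirds_gaps(n))
```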
Shell Sort
to ShellSort( List a )
    gap ← Length(a) / 2;                        /* integer division; or … */
    while (gap ≥ 1)
        if (gap % 2 == 0) ++gap;                /* much better! */
        for i ← gap .. Length(a)–1
            p ← i, temp ← a[p];
            while (p ≥ gap && a[p–gap] > temp)
                a[p] ← a[p–gap], p ← p–gap;
            a[p] ← temp;
        gap ← gap / 2;                          /* integer division; or … */
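A Python sketch of the shell sort pseudocode, using the N/2 gap sequence with the "bump even gaps to odd" adjustment. It assumes the operators lost from the slide are integer division and ≥; the function name is illustrative.

```python
def shell_sort(a: list) -> None:
    """Sort the list in place with gapped insertion sorts."""
    gap = len(a) // 2
    while gap >= 1:
        if gap % 2 == 0:
            gap += 1                    # only-odd-numbers version
        for i in range(gap, len(a)):
            p, temp = i, a[i]
            while p >= gap and a[p - gap] > temp:
                a[p] = a[p - gap]
                p -= gap
            a[p] = temp
        gap //= 2                       # gap 1 becomes 0 and ends the loop


lst = [81, 94, 11, 32, 12, 24, 17, 29, 28, 18, 41, 77, 75, 15]
shell_sort(lst)
print(lst)
```

Swapping in a different gap sequence only changes the first and last lines of the loop; the gapped insertion in the middle stays the same.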
Exercise Show the evolution of the following arrays under shell sort. Use length/3 as the starting gap: [15, 3, 21, 45, 7, 17, 4] [13, 11, 20, 15, 16, 6, 5, 8]
Complexity of Shell Sort Depends on the gap sequence $O(N^2)$ for N/2 version $O(N^{3/2})$ if we do the only-odd-numbers version $O(N^{3/2})$ for $2^{\log N} - 1$ version …, 63, 31, 15, 7, 3, 1 some others have this as well some versions have $O(N^{4/3})$ one version has $O(N^{1+\sqrt{8\ln(5/2)/\ln N}})$ others we don’t even know!
Comments All those sort methods are $O(N^2)$ in worst case doubling size of array quadruples time Shell sort can be $O(N^{3/2})$, but still not good enough Want something better! Recursion offers a way up next time: recursion after that: recursive sorting; other fast sorting