1 Data Structures and Algorithms Sorting I Gal A. Kaminka Computer Science Department
2 What we’ve done so far
We’ve talked about complexity: O(), run-time and space requirements
We’ve talked about ADTs
Implementations for:
  Stacks (2 implementations)
  Queues (2 implementations)
  Sets (4 implementations)
3 Sorting
Take a set of items, order unknown
  Set: linked list, array, file on disk, …
Return an ordered set of the items
For instance:
  Sorting names alphabetically
  Sorting by height
  Sorting by color
4 Sorting Algorithms
Issues of interest:
  Running time in worst case, other cases
  Space requirements
    In-place algorithms: require only constant extra space
The importance of empirical testing
Often critical to optimize sorting
5 Short Example: Bubble Sort
Key: “large unsorted elements bubble up”
Make several sequential passes over the set
Every pass, fix local pairs that are not in order
Considered inefficient, but useful as a first example
6 (Naïve) BubbleSort(array A, length n)
for i ← n downto 2            // note: going down
    for j ← 2 to i            // loop does swaps in [1..i]
        if A[j-1] > A[j]
            swap(A[j-1], A[j])
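The pseudocode above can be sketched in Python as follows. This is a minimal 0-based translation, not the course's reference code; the function name `bubble_sort_naive` and the sample data are assumptions made for illustration.

```python
def bubble_sort_naive(a):
    """Sort the list a in place with the naive bubble sort; return it for convenience."""
    n = len(a)
    # i is the length of the still-unsorted prefix; it shrinks by one per pass.
    for i in range(n, 1, -1):
        # Compare adjacent pairs inside a[0..i-1] (the 0-based view of A[1..i]).
        for j in range(1, i):
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
    return a


# Sample data is hypothetical, chosen only to exercise the function.
print(bubble_sort_naive([25, 57, 48, 37, 12, 92, 86, 33]))
# prints [12, 25, 33, 37, 48, 57, 86, 92]
```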
7-9 [Figure: bubble sort trace, Pass 1 through Pass 7]
10 Bubble Sort Features
Worst case: input sorted in reverse order
  Passes: n-1
  Comparisons in each pass: (n-k), where k is the pass number
  Total number of comparisons: (n-1)+(n-2)+(n-3)+…+1 = n^2/2 - n/2 = O(n^2)
In-place: no auxiliary storage
Best case: already sorted, still O(n^2)
  Many redundant passes with no swaps
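As a quick check of the comparison count, the sketch below runs the same two loops as the naive version and only counts how often the inner comparison would execute. The helper name `count_comparisons_naive` is hypothetical.

```python
def count_comparisons_naive(n):
    """Count the comparisons the naive bubble sort makes on any input of length n."""
    count = 0
    for i in range(n, 1, -1):      # passes: i = n down to 2
        for j in range(1, i):      # i - 1 adjacent comparisons in this pass
            count += 1
    return count


# The count matches the closed form n^2/2 - n/2 = n(n-1)/2, regardless of input order.
for n in (2, 8, 100):
    assert count_comparisons_naive(n) == n * (n - 1) // 2

print(count_comparisons_naive(8))  # prints 28
```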
11 BubbleSort(array A, length n)
i ← n
quit ← false
while (i > 1 AND NOT quit)        // note: going down
    quit ← true
    for j ← 2 to i                // loop does swaps in [1..i]
        if A[j-1] > A[j]
            swap(A[j-1], A[j])    // bubble max toward i
            quit ← false
    i ← i-1
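A possible Python rendering of the early-exit version is sketched below. Returning the number of passes is an addition made here for illustration (it is not in the pseudocode), and the names `bubble_sort` and `done` are assumptions.

```python
def bubble_sort(a):
    """Sort a in place, stopping early after a pass with no swaps.

    Returns the number of passes performed (an illustrative extra).
    """
    i = len(a)
    done = False
    passes = 0
    while i > 1 and not done:
        done = True                # assume sorted until a swap proves otherwise
        passes += 1
        for j in range(1, i):
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                done = False       # a swap happened, so another pass is needed
        i -= 1
    return passes


print(bubble_sort([1, 2, 3, 4]))   # prints 1: one verifying pass
print(bubble_sort([4, 3, 2, 1]))   # prints 3: n-1 passes on reversed input
```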
12 Bubble Sort Features
Best case: already sorted
  O(n): one pass over the set, verifying that it is sorted
Total number of exchanges
  Best case: none
  Worst case: O(n^2)
Lots of exchanges: a problem with large items
13 Selection Sort
Observation: Bubble Sort uses lots of exchanges
  These always float the largest unsorted element up
We can save exchanges:
  Move the largest item up only after it is identified
More passes, but fewer total operations
  Same number of comparisons
  Many fewer exchanges
14 SelectSort(array A, length n)
for i ← n downto 2                    // note: we are going down
    largest_index ← 1
    for j ← 2 to i                    // loop finds max in [1..i]
        if A[j] > A[largest_index]
            largest_index ← j
    swap(A[i], A[largest_index])      // put max in i
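The selection sort pseudocode above might look like this in Python. It is a 0-based sketch; the function name `selection_sort` and the sample data are assumptions.

```python
def selection_sort(a):
    """Sort a in place: each pass finds the max of a[0..i] and swaps it into a[i]."""
    for i in range(len(a) - 1, 0, -1):
        largest_index = 0
        for j in range(1, i + 1):          # scan a[1..i] for a larger item
            if a[j] > a[largest_index]:
                largest_index = j
        # Exactly one exchange per pass, even if the max is already in place.
        a[i], a[largest_index] = a[largest_index], a[i]
    return a


# Sample data is hypothetical.
print(selection_sort([25, 57, 48, 37, 12, 92, 86, 33]))
# prints [12, 25, 33, 37, 48, 57, 86, 92]
```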
15 [Figure: selection sort trace, Initial and Pass 1 through Pass 7]
16 Selection Sort Summary
Best case: already sorted
  Passes: n-1
  Comparisons in each pass: (n-k), where k is the pass number
  # of comparisons: (n-1)+(n-2)+…+1 = O(n^2)
Worst case: same
In-place: no external storage
Very few exchanges: always n-1 (better than Bubble Sort)
17 Selection Sort vs. Bubble Sort
Selection sort:
  More comparisons than bubble sort in the best case: O(n^2)
  But fewer exchanges: O(n)
  Good for small sets/cheap comparisons, large items
Bubble sort:
  Many exchanges: O(n^2) in the worst case
  O(n) on sorted input
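To make the difference in exchanges concrete, the sketch below counts swaps for both algorithms on a reverse-sorted input. Both helper names are hypothetical, and each works on a copy so the caller's list is left untouched.

```python
def bubble_exchanges(items):
    """Return the number of swaps bubble sort performs on a copy of items."""
    a = list(items)
    swaps = 0
    for i in range(len(a), 1, -1):
        for j in range(1, i):
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                swaps += 1
    return swaps


def selection_exchanges(items):
    """Return the number of swaps selection sort performs on a copy of items."""
    a = list(items)
    swaps = 0
    for i in range(len(a) - 1, 0, -1):
        largest_index = max(range(i + 1), key=lambda k: a[k])
        a[i], a[largest_index] = a[largest_index], a[i]
        swaps += 1                 # one exchange per pass
    return swaps


reversed_input = list(range(8, 0, -1))       # [8, 7, ..., 1]
print(bubble_exchanges(reversed_input))      # prints 28, i.e. n(n-1)/2
print(selection_exchanges(reversed_input))   # prints 7, i.e. n-1
```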
18 Insertion Sort
Improve on # of comparisons
Key idea: keep part of the array always sorted
  As in selection sort, each pass puts one item in its place within the sorted part
  As in bubble sort, “bubble” items into place
19 InsertSort(array A, length n)
for i ← 2 to n                    // A[1..1] is already sorted
    y ← A[i]
    j ← i-1
    while (j > 0 AND y < A[j])
        A[j+1] ← A[j]             // shift things up
        j ← j-1
    A[j+1] ← y                    // put A[i] in the right place
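A 0-based Python sketch of the insertion sort pseudocode above follows. The function name `insertion_sort` and the sample data are assumptions.

```python
def insertion_sort(a):
    """Sort a in place by inserting a[i] into the already-sorted prefix a[0..i-1]."""
    for i in range(1, len(a)):
        y = a[i]
        j = i - 1
        # Shift larger items one slot to the right to open a gap for y.
        while j >= 0 and y < a[j]:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = y
    return a


# Sample data is hypothetical.
print(insertion_sort([25, 57, 48, 37, 12, 92, 86, 33]))
# prints [12, 25, 33, 37, 48, 57, 86, 92]
```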
20-23 [Figure: insertion sort trace, Initial and Pass 1 through Pass 7]
24 Insertion Sort Summary
Best case: already sorted, O(n)
Worst case: O(n^2) comparisons
# of exchanges: O(n^2)
In-place: no external storage
In practice, best for small sets (<30 items)
  BubbleSort does more comparisons!
Very efficient on nearly-sorted inputs
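The best-case, worst-case and nearly-sorted behaviour can be illustrated by counting key comparisons. The sketch below is an instrumented variant of insertion sort; the helper name `insertion_comparisons` and the three inputs are assumptions chosen for illustration.

```python
def insertion_comparisons(items):
    """Return how many key comparisons insertion sort makes on a copy of items."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        y = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1       # one comparison of y against a[j]
            if y >= a[j]:
                break              # y belongs right after a[j]
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = y
    return comparisons


print(insertion_comparisons([1, 2, 3, 4, 5, 6, 7, 8]))  # prints 7: sorted, n-1
print(insertion_comparisons([8, 7, 6, 5, 4, 3, 2, 1]))  # prints 28: reversed, n(n-1)/2
print(insertion_comparisons([1, 2, 3, 5, 4, 6, 7, 8]))  # prints 8: one adjacent pair swapped
```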
25 Divide-and-Conquer
An algorithm design technique:
  Divide a problem of size N into sub-problems
  Solve all sub-problems
  Merge/combine the sub-solutions
This can result in VERY substantial improvements
26-27 Small Example: f(x)
if x = 0 OR x = 1
    return 1
else
    return f(x-1) + f(x-2)
What is this function? Fibonacci!
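The function f can be transcribed directly into Python to see the values it produces. This is a literal sketch of the pseudocode, with the base cases as given on the slide (f(0) = f(1) = 1).

```python
def f(x):
    """Direct transcription of the slide's recursive definition."""
    if x == 0 or x == 1:
        return 1
    return f(x - 1) + f(x - 2)


print([f(x) for x in range(8)])
# prints [1, 1, 2, 3, 5, 8, 13, 21]
```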
28 Divide-and-Conquer in Sorting
Mergesort
  O(n log n) always, but O(n) storage
Quick sort
  O(n log n) average, O(n^2) worst
  Good in practice (>30 items), O(log n) storage
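As a preview of how divide-and-conquer applies to sorting, here is one possible top-down mergesort sketch. The slides do not give an implementation, so the structure, the function name `merge_sort`, and the choice to return a new list (which is where the O(n) extra storage comes from) are all assumptions.

```python
def merge_sort(a):
    """Return a new sorted list: divide, sort each half recursively, then merge."""
    if len(a) <= 1:
        return list(a)                     # base case: nothing to divide
    mid = len(a) // 2
    left = merge_sort(a[:mid])             # solve the sub-problems
    right = merge_sort(a[mid:])

    # Merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:            # <= keeps equal items in original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])                # at most one of these is non-empty
    merged.extend(right[j:])
    return merged


# Sample data is hypothetical.
print(merge_sort([25, 57, 48, 37, 12, 92, 86, 33]))
# prints [12, 25, 33, 37, 48, 57, 86, 92]
```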