Sorting
15-211 Fundamental Data Structures and Algorithms
Aleks Nanevski
February 17, 2004
Announcements
Reading: Chapter 8
Quiz #1 available on Thursday
Quiz will be much easier to complete if you solved Homework 5
Introduction to Sorting
The Problem
We are given a sequence of items a_1, a_2, a_3, …, a_n-1, a_n.
We want to rearrange them so that they are in non-decreasing order. More precisely, we need a permutation f such that a_f(1) ≤ a_f(2) ≤ a_f(3) ≤ … ≤ a_f(n-1) ≤ a_f(n).
A Constraint: Comparison-Based Sorting
While we are rearranging the items, we will only use queries of the form a_i ≤ a_j, or variants thereof (<, >, ≥, and so forth).
Comparison-based sorting
The important point here is that the algorithm can only make comparisons such as
    if (a[i] < a[j]) …
We are not allowed to look at pieces of the elements a[i] and a[j]. For example, if these elements are numbers, we are not allowed to compare their most significant digits.
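As a small illustration (not from the slides), a comparison-based routine can only learn about elements through an ordering query; the helper name below is made up:

    import java.util.Comparator;

    // The only question a comparison-based sort may ask:
    // how do a[i] and a[j] compare under the ordering?
    static <T> boolean inverted(T[] a, int i, int j, Comparator<T> cmp) {
        return cmp.compare(a[i], a[j]) > 0;   // no peeking at digits, bits, etc.
    }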
Flips and inversions
An unsorted array: 24 47 13 99 105 222
Two elements are inverted if A[i] > A[j] for i < j. A flip is an inversion of two consecutive elements. (Here 24 > 13 is an inversion, and 47 > 13 is a flip.)
Example
In the list 3 2 1 6 5 4 9 8 7 there are 9 inversions:
(3, 2), (2, 1), (3, 1), (6, 5), (5, 4), (6, 4), (9, 8), (8, 7), (9, 7)
of which 6 are flips:
(3, 2), (2, 1), (6, 5), (5, 4), (9, 8), (8, 7)
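To make the definitions concrete, here is a brute-force counter (a sketch, not from the slides); on the list above it reports 9 inversions and 6 flips:

    // Count inversions: pairs (i, j) with i < j and a[i] > a[j].  O(n^2).
    static int countInversions(int[] a) {
        int count = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = i + 1; j < a.length; j++)
                if (a[i] > a[j]) count++;
        return count;
    }

    // Count flips: inversions of two consecutive elements.  O(n).
    static int countFlips(int[] a) {
        int count = 0;
        for (int i = 0; i + 1 < a.length; i++)
            if (a[i] > a[i + 1]) count++;
        return count;
    }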
Naïve sorting algorithms
Bubble sort: scan for flips, until all are fixed.
3 2 1 6 5 4
2 3 1 6 5 4
2 1 3 6 5 4
2 1 3 5 6 4
2 1 3 5 4 6
Etc.
What is the running time? O(n²)
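A sketch of bubble sort as described here: scan for flips and fix them until a full pass finds none.

    // Bubble sort: repeatedly fix flips until none remain.  O(n^2) worst case.
    static void bubbleSort(int[] a) {
        boolean swapped = true;
        while (swapped) {
            swapped = false;
            for (int i = 0; i + 1 < a.length; i++) {
                if (a[i] > a[i + 1]) {                  // found a flip
                    int tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;
                    swapped = true;                     // another pass is needed
                }
            }
        }
    }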
Insertion sort
for i = 1 to n-1 do: insert a[i] in the proper place in a[0:i-1]
Invariant: after step i, the subarray a[0:i] (the first i+1 elements) is sorted.
Insertion sort
105 47 13 99 30 222
47 105 13 99 30 222
13 47 105 99 30 222
13 47 99 105 30 222
13 30 47 99 105 222
The sorted subarray is the growing prefix.
How fast is insertion sort?
To insert a[i] into a[0:i-1], slide all elements larger than a[i] to the right.
    tmp = a[i];
    for (j = i; j > 0 && a[j-1] > tmp; j--)
        a[j] = a[j-1];    // slide larger elements right
    a[j] = tmp;
How fast is insertion sort?
To insert a[i] into a[0:i-1], slide all elements larger than a[i] to the right.
    tmp = a[i];
    for (j = i; j > 0 && a[j-1] > tmp; j--)
        a[j] = a[j-1];    // slide larger elements right
    a[j] = tmp;
This takes O(#inversions) sliding steps, so it is very fast if the array is nearly sorted to begin with.
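Wrapping the insertion step in the outer loop gives the complete algorithm (a sketch consistent with the snippet above):

    // Insertion sort: grow a sorted prefix, inserting a[i] each round.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int tmp = a[i];
            int j;
            for (j = i; j > 0 && a[j - 1] > tmp; j--)
                a[j] = a[j - 1];        // slide larger elements right
            a[j] = tmp;                 // drop a[i] into its place
        }
    }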
How many inversions?
Consider the worst case, the reversed array: n n-1 … 3 2 1 0
In this case, there are 0 + 1 + … + (n-2) + (n-1) + n = n(n+1)/2 inversions.
How many inversions?
What about the average case? Consider a permutation p = x_1 x_2 x_3 … x_n-1 x_n and let rev(p) be its reversal. Any pair (x_i, x_j) is an inversion in exactly one of p and rev(p). There are n(n-1)/2 pairs (x_i, x_j), hence the average number of inversions in a permutation is n(n-1)/4, which is O(n²).
How long does it take to sort?
Can we do better than O(n²)?
In the worst case?
In the average case?
Heapsort
Remember heaps: a complete binary tree with each parent smaller than its children.
buildHeap has O(n) worst-case running time.
deleteMin has O(log n) worst-case running time.
Heapsort:
Build heap. O(n)
DeleteMin until empty. O(n log n)
Total worst case: O(n log n)
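A minimal heapsort sketch using java.util.PriorityQueue as the binary min-heap; note that adding elements one at a time is O(n log n), whereas the bulk buildHeap mentioned above is O(n):

    import java.util.PriorityQueue;

    // Heapsort: build a min-heap, then deleteMin until empty.
    static void heapSort(int[] a) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int x : a)
            heap.add(x);                 // O(log n) per insert this way
        for (int i = 0; i < a.length; i++)
            a[i] = heap.poll();          // deleteMin: O(log n) each
    }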
(Chart: n² vs. n log n)
Sorting in O(n log n)
Heapsort establishes the fact that sorting can be accomplished in O(n log n) worst-case running time.
Heapsort in practice
The average-case analysis for heapsort is somewhat complex. In practice, heapsort consistently tends to use nearly n log n comparisons.
So, while the worst case is better than n², other algorithms sometimes work better.
Shellsort
Shellsort, like insertion sort, is based on swapping inverted pairs. It achieves O(n^(3/2)) running time with a suitable gap sequence (see the book for details).
Idea: sort in phases, with insertion sort at the end.
Remember: insertion sort is good when the array is nearly sorted.
Shellsort
Definition: to k-sort means to sort by insertion the items that are k positions apart. Therefore, a 1-sort is simple insertion sort.
Now, perform a series of k-sorts for k = m_1 > m_2 > … > 1. (A code sketch follows the example below.)
Shellsort algorithm
105 47 13 17 30 222 5 19
After 4-sort: 30 47 5 17 105 222 13 19
After 2-sort: 5 17 13 19 30 47 105 222
After 1-sort: 5 13 17 19 30 47 105 222
Note: many inversions are fixed in one exchange.
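A shellsort sketch; the example above uses the gap sequence 4, 2, 1, and the halving sequence below (Shell's original) is one common choice:

    // Shellsort: k-sort for a decreasing sequence of gaps, ending with k = 1.
    static void shellSort(int[] a) {
        for (int k = a.length / 2; k > 0; k /= 2) {     // gap sequence
            for (int i = k; i < a.length; i++) {        // insertion sort with
                int tmp = a[i];                         // stride k
                int j;
                for (j = i; j >= k && a[j - k] > tmp; j -= k)
                    a[j] = a[j - k];
                a[j] = tmp;
            }
        }
    }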
Recursive Sorting
Recursive sorting
Intuitively, divide the problem into pieces and then recombine the results.
If the array is length 1, then done.
If the array is length N > 1, then split it in half and sort each half. Then combine the results.
An example of a divide-and-conquer algorithm.
Divide-and-conquer
Divide-and-conquer is important
We will see several examples of divide-and-conquer in this course.
Recursive sorting
If the array is length 1, then done. Otherwise:
Split into two smaller pieces.
Sort each piece.
Combine the sorted pieces.
Analysis of recursive sorting
Suppose it takes time T(N) to sort N elements, and suppose also that it takes time N to combine the two sorted arrays. Then:
T(1) = 1
T(N) = 2T(N/2) + N, for N > 1
Solving for T gives the running time of the recursive sorting algorithm.
Remember recurrence relations?
Systems of equations such as
T(1) = 1
T(N) = 2T(N/2) + N, for N > 1
are called recurrence relations (or sometimes recurrence equations).
A solution
A solution for
T(1) = 1
T(N) = 2T(N/2) + N
is given by
T(N) = N log N + N
which is O(N log N).
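Unfolding the recurrence shows where this solution comes from (for N a power of 2):

T(N) = 2T(N/2) + N
     = 4T(N/4) + 2N
     = 8T(N/8) + 3N
     = …
     = N·T(1) + N log N
     = N log N + N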
Divide-and-Conquer Theorem
Theorem: Let a, b, c > 0. The recurrence relation
T(1) = b
T(N) = aT(N/c) + bN
for any N which is a power of c has upper-bound solutions
T(N) = O(N) if a < c
T(N) = O(N log N) if a = c
T(N) = O(N^(log_c a)) if a > c
For recursive sorting, a = 2, b = 1, c = 2, so the a = c case applies.
Upper-bounds
Corollary: Dividing a problem into p pieces, each of size N/p, using only a linear amount of work, results in an O(N log N) algorithm.
Upper-bounds
Proof of this theorem left for later.
Mergesort
Mergesort is the most basic recursive sorting algorithm.
Divide the array in halves A and B.
Recursively mergesort each half.
Combine A and B by successively looking at the first elements of A and B and moving the smaller one to the result array.
Note: one should be careful to avoid creating lots of intermediate arrays.
Mergesort
But we don't actually want to create all of these intermediate arrays!
Mergesort
Use simple indexes to perform the split.
Use a single extra array to hold each intermediate result.
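A mergesort sketch along these lines, with index-based splits and one shared temporary array:

    // Mergesort: split by indexes, recurse, merge into tmp, copy back.
    static void mergeSort(int[] a) {
        mergeSort(a, new int[a.length], 0, a.length - 1);
    }

    static void mergeSort(int[] a, int[] tmp, int lo, int hi) {
        if (lo >= hi) return;                     // length 0 or 1: done
        int mid = (lo + hi) / 2;
        mergeSort(a, tmp, lo, mid);               // sort left half
        mergeSort(a, tmp, mid + 1, hi);           // sort right half
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++)            // move the smaller front
            tmp[k] = (j > hi || (i <= mid && a[i] <= a[j])) ? a[i++] : a[j++];
        for (int k = lo; k <= hi; k++)
            a[k] = tmp[k];                        // copy merged run back
    }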
Analysis of mergesort
Mergesort generates almost exactly the same recurrence relation shown before:
T(1) = 1
T(N) = 2T(N/2) + N - 1, for N > 1
Thus, mergesort is O(N log N).
Quicksort
Quicksort was invented in 1960 by Tony Hoare. Although it has O(N²) worst-case performance, on average it is O(N log N). More importantly, it is the fastest known comparison-based sorting algorithm in practice.
Quicksort idea
Choose a pivot.
Rearrange so that the pivot is in the "right" spot.
Recurse on each half and conquer!
Quicksort algorithm
If array A has 1 (or 0) elements, then done. Otherwise:
Choose a pivot element x from A.
Divide A - {x} into two arrays:
B = {y ∈ A | y ≤ x}
C = {y ∈ A | y ≥ x}
Quicksort arrays B and C.
The result is B + {x} + C.
Quicksort algorithm
105 47 13 17 30 222 5 19
[5 17 13]  19  [47 30 222 105]        (pivot 19)
[5] 13 [17]    [30] 47 [222 105]      (pivots 13 and 47)
               [105]  [222]
In practice, insertion sort is used once the arrays get "small enough".
Rearranging in place
85 24 63 50 17 31 96 45      (choose pivot 50)
85 24 63 45 17 31 96 50      (swap the pivot to the end; L scans from the left, R from the right)
31 24 63 45 17 85 96 50      (L stopped at 85, R stopped at 31: swap them)
Rearranging in place
31 24 63 45 17 85 96 50      (L stops at 63, R stops at 17: swap them)
31 24 17 45 63 85 96 50      (L and R have crossed)
31 24 17 45 50 85 96 63      (swap the element at L with the pivot: 50 is now in its final position)
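A sketch of in-place quicksort in this style, assuming the chosen pivot has already been swapped to the last position of the range (pivot selection and the small-array insertion-sort cutoff are omitted):

    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;                     // 0 or 1 elements: done
        int p = partition(a, lo, hi);             // pivot assumed at a[hi]
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }

    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int L = lo, R = hi - 1;
        while (L <= R) {
            while (L <= R && a[L] < pivot) L++;   // L stops at element >= pivot
            while (L <= R && a[R] > pivot) R--;   // R stops at element <= pivot
            if (L >= R) break;
            int t = a[L]; a[L] = a[R]; a[R] = t;  // swap the stopped pair
            L++; R--;
        }
        int t = a[L]; a[L] = a[hi]; a[hi] = t;    // pivot into final position
        return L;
    }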
Quicksort is fast but hard to do
Quicksort, in the early 1960s, was famous for being incorrectly implemented many times. (More about invariants next time.)
Quicksort is very fast in practice, faster than mergesort because quicksort can be done "in place".
Informal analysis
If there are duplicate elements, then the algorithm does not specify which subarray, B or C, should get them. Ideally, they split down the middle.
Also, it is not specified how to choose the pivot. Ideally it is the median value of the array, but this would be expensive to compute.
As a result, it is possible that quicksort will show O(N²) behavior.
Worst-case behavior
105 47 13 17 30 222 5 19
pivot 5:  47 13 17 30 222 19 105
pivot 13: 47 105 17 30 222 19
pivot 17: 47 105 19 30 222
…
If we always pick the smallest (or largest) possible pivot, then quicksort takes O(n²) steps.
Analysis of quicksort
Assume a random pivot.
T(0) = 1
T(1) = 1
T(N) = T(i) + T(N-i-1) + cN, for N > 1
where i is the size of the left subarray.
Worst-case analysis
If the pivot is always the smallest element, then:
T(0) = 1
T(1) = 1
T(N) = T(0) + T(N-1) + cN ≈ T(N-1) + cN, for N > 1
which unfolds to c(N + (N-1) + … + 2) + T(1), that is, O(N²).
See the book for details on this solution.
Best-case analysis
In the best case, the pivot is always the median element.
In that case, the splits are always "down the middle".
Hence, same behavior as mergesort, that is, O(N log N).
Average-case analysis
Consider the quicksort tree:
105 47 13 17 30 222 5 19
[5 17 13]  19  [47 30 222 105]
[5] 13 [17]    [30] 47 [222 105]
               [105]  [222]
Average-case analysis
At each level of the tree there are at most N elements in total, so the time spent at each level is O(N).
On average, how many levels are there? That is, what is the expected height of the tree?
If on average there are O(log N) levels, then quicksort is O(N log N) on average.
Average-case analysis
We'll answer this question next time…
Summary of quicksort
A fast sorting algorithm in practice.
Can be implemented in-place.
But it is O(N²) in the worst case.
Average-case performance?