Slide 1: Sorting (G64ADS Advanced Data Structures)
Slide 2: Insertion sort
1) Initially p = 1.
2) Let the first p elements be sorted.
3) Insert the (p+1)th element into its proper place in the list, so that the first p+1 elements are sorted.
4) Increment p and go to step 3.
Slide 3: Insertion sort (diagram)
Slide 4: Insertion sort
- Consists of N - 1 passes.
- Pass p (for p = 1 through N - 1) ensures that the elements in positions 0 through p are in sorted order:
  - the elements in positions 0 through p - 1 are already sorted;
  - move the element in position p left until its correct place is found among the first p + 1 elements.
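The passes described above can be sketched in Python (a minimal illustration, not the course's reference code):

```python
def insertion_sort(a):
    """Sort list a in place using insertion sort; returns a for convenience."""
    for p in range(1, len(a)):           # passes 1 .. N-1
        tmp = a[p]                       # element to insert
        j = p
        # shift larger sorted elements one slot to the right
        while j > 0 and a[j - 1] > tmp:
            a[j] = a[j - 1]
            j -= 1
        a[j] = tmp                       # drop tmp into its correct place
    return a
```

On the example used in the next slides, `insertion_sort([34, 8, 64, 51, 32, 21])` yields `[8, 21, 32, 34, 51, 64]`.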
Slide 5: Insertion sort
To sort the following numbers in increasing order: 34 8 64 51 32 21
p = 1; tmp = 8; 34 > tmp, so the second element a[1] is set to 34: {8, 34}. We have reached the front of the list, so the 1st position a[0] = tmp = 8.
After 1st pass: 8 34 64 51 32 21 (first 2 elements are sorted)
Slide 6: Insertion sort (continued)
p = 2; tmp = 64; 34 < 64, so stop immediately: the 3rd position keeps tmp = 64.
After 2nd pass: 8 34 64 51 32 21 (first 3 elements are sorted)
p = 3; tmp = 51; 51 < 64, so we have 8 34 64 64 32 21; 34 < 51, so stop at the 2nd position and set the 3rd position = tmp.
After 3rd pass: 8 34 51 64 32 21 (first 4 elements are sorted)
p = 4; tmp = 32; 32 < 64, so 8 34 51 64 64 21; 32 < 51, so 8 34 51 51 64 21; next 32 < 34, so 8 34 34 51 64 21; next 32 > 8, so stop at the 1st position and set the 2nd position = 32.
After 4th pass: 8 32 34 51 64 21
p = 5; tmp = 21; ...
After 5th pass: 8 21 32 34 51 64
Slide 7: Insertion sort: worst-case running time
- The inner loop is executed at most p times, for each p = 1..N-1.
- Overall: 1 + 2 + 3 + ... + (N-1) = N(N-1)/2 = O(N^2).
- Space requirement: O(1) extra space, since the sort is done in place.
Slide 8: Heapsort
(1) Build a binary heap of N elements: the minimum element is at the top of the heap.
(2) Perform N DeleteMin operations: the elements are extracted in sorted order.
(3) Record these elements in a second array and then copy the array back.
Slide 9: Heapsort: analysis
(1) Build a binary heap of N elements by repeatedly inserting: O(N log N) time.
(2) Perform N DeleteMin operations; each DeleteMin takes O(log N), so O(N log N) total.
(3) Record these elements in a second array and copy the array back: O(N).
- Total time complexity: O(N log N).
- Memory requirement: uses an extra array, O(N).
Slide 10: Heapsort with no extra memory
- Observation: after each DeleteMin, the size of the heap shrinks by 1.
- We can use the last cell just freed up to store the element that was just deleted; after the last DeleteMin, the array contains the elements in decreasing sorted order.
- To sort the elements in decreasing order, use a min heap.
- To sort the elements in increasing order, use a max heap (the parent holds a larger element than its children).
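The max-heap, no-extra-memory scheme above can be sketched as follows (a minimal Python illustration, not the course's reference code):

```python
def heapsort(a):
    """In-place heapsort: build a max heap, then repeatedly swap the
    maximum into the cell freed at the end of the shrinking heap."""
    n = len(a)

    def percolate_down(i, size):
        # Restore the max-heap property for the subtree rooted at i.
        while 2 * i + 1 < size:
            child = 2 * i + 1
            if child + 1 < size and a[child + 1] > a[child]:
                child += 1                    # pick the larger child
            if a[child] <= a[i]:
                break
            a[i], a[child] = a[child], a[i]
            i = child

    for i in range(n // 2 - 1, -1, -1):       # build the max heap
        percolate_down(i, n)
    for end in range(n - 1, 0, -1):           # N-1 deleteMax steps
        a[0], a[end] = a[end], a[0]           # max goes to the freed cell
        percolate_down(0, end)
    return a
```

Because the deleted maximum is stored in the freed cell each time, the array ends up in increasing order with no second array.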
Slide 11: Heapsort with no extra memory (example)
Sort in increasing order: use a max heap. Delete 97.
Slide 12: Mergesort
Based on the divide-and-conquer strategy:
- Divide the list into two smaller lists of about equal size.
- Sort each smaller list recursively.
- Merge the two sorted lists to get one sorted list.
Questions:
- How do we divide the list, and what is the running time?
- How do we merge the two sorted lists, and what is the running time?
Slide 14: Mergesort: divide
- If the input list is a linked list, dividing takes Θ(N) time: we scan the linked list, stop at the (N/2)th entry, and cut the link.
- If the input list is an array A[0..N-1], dividing takes O(1) time: we can represent a sublist by two integers left and right; to divide A[left..right], we compute center = (left+right)/2 and obtain A[left..center] and A[center+1..right].
- Try left = 0, right = 50: center = 25.
Slide 15: Mergesort
Divide-and-conquer strategy:
- recursively mergesort the first half and the second half;
- merge the two sorted halves together.
Slide 16: Mergesort (code)
Slide 17: Mergesort: merge
- Input: two sorted arrays A and B.
- Output: one sorted array C.
- Three counters, Actr, Bctr, and Cctr, initially set to the beginning of their respective arrays.
(1) The smaller of A[Actr] and B[Bctr] is copied to the next entry in C, and the appropriate counters are advanced.
(2) When either input list is exhausted, the remainder of the other list is copied to C.
Slide 18: Mergesort: merge (example)
Slide 19: Mergesort: merge (example, continued)
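The three-counter merge and the recursive mergesort described on slide 17 can be sketched in Python (an illustrative version returning a new list, not the course's reference code):

```python
def merge(A, B):
    """Merge two sorted lists into one sorted list using the
    three-counter scheme (Actr, Bctr, Cctr)."""
    C = []
    actr = bctr = 0
    while actr < len(A) and bctr < len(B):
        if A[actr] <= B[bctr]:            # copy the smaller entry to C
            C.append(A[actr]); actr += 1
        else:
            C.append(B[bctr]); bctr += 1
    C.extend(A[actr:])                    # one of these remainders is empty
    C.extend(B[bctr:])
    return C

def mergesort(a):
    """Recursively sort each half, then merge the two sorted halves."""
    if len(a) <= 1:
        return a
    center = len(a) // 2
    return merge(mergesort(a[:center]), mergesort(a[center:]))
```

For example, `mergesort([34, 8, 64, 51, 32, 21])` yields `[8, 21, 32, 34, 51, 64]`.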
Slide 20: Mergesort: analysis
- Merge takes O(m1 + m2), where m1 and m2 are the sizes of the two sublists.
- Space requirement: merging two sorted lists requires linear extra memory, plus additional work to copy to the temporary array and back.
Slide 21: Mergesort: analysis
- Let T(N) denote the worst-case running time of mergesort to sort N numbers; assume N is a power of 2.
- Divide step: O(1) time. Conquer step: 2T(N/2) time. Combine step: O(N) time.
- Recurrence equation: T(1) = 1; T(N) = 2T(N/2) + N.
Slide 22: Mergesort: analysis
Since N = 2^k, we have k = log2 N.
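The recurrence T(N) = 2T(N/2) + N, with T(1) = 1, can be solved by the standard telescoping argument (dividing each equation by its argument):

```latex
\begin{aligned}
\frac{T(N)}{N} &= \frac{T(N/2)}{N/2} + 1\\
\frac{T(N/2)}{N/2} &= \frac{T(N/4)}{N/4} + 1\\
&\;\;\vdots\\
\frac{T(2)}{2} &= \frac{T(1)}{1} + 1
\end{aligned}
```

Adding all k = log2 N equations cancels the intermediate terms, giving T(N)/N = T(1) + log2 N, hence T(N) = N log2 N + N = O(N log N).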
Slide 23: Quicksort
- Divide-and-conquer approach to sorting.
- Like mergesort, except:
  - we don't divide the array in half;
  - we partition the array based on elements being less than or greater than some element of the array (the pivot).
- Worst-case running time O(N^2); average-case running time O(N log N).
- Fastest generic sorting algorithm in practice.
- Even faster if we use a simple sort (e.g., insertion sort) when the array is small.
Slide 24: Quicksort algorithm
Given an array S, modify S so its elements are in increasing order:
1. If the size of S is 0 or 1, return.
2. Pick any element v in S as the pivot.
3. Partition S - {v} into two disjoint groups:
   S1 = {x ∈ S - {v} | x ≤ v}
   S2 = {x ∈ S - {v} | x ≥ v}
4. Return QuickSort(S1), followed by v, followed by QuickSort(S2).
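The four steps above can be transcribed almost literally into Python (a sketch, not the in-place version developed on the later slides; to keep the partition disjoint, elements equal to the pivot go to S1 here rather than to either side):

```python
def quicksort(S):
    """Quicksort following the four steps on the slide:
    pick a pivot, partition, recurse, concatenate."""
    if len(S) <= 1:                          # step 1
        return S
    v = S[0]                                 # step 2: any element as pivot
    rest = S[1:]
    S1 = [x for x in rest if x <= v]         # step 3: partition S - {v}
    S2 = [x for x in rest if x > v]
    return quicksort(S1) + [v] + quicksort(S2)   # step 4
```

This version allocates new lists at every level; the in-place partitioning strategy on the next slides avoids that extra memory.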
Slide 25: Quicksort example
Slide 26: Why so fast?
- Mergesort always divides the array in half.
- Quicksort might divide the array into subproblems of size 1 and N-1 (this happens when the pivot is the smallest or largest element), leading to O(N^2) performance, so we need to choose the pivot wisely (but efficiently).
- Mergesort requires a temporary array for the merge step; quicksort can partition the array in place. This more than makes up for bad pivot choices.
Slide 27: Picking the pivot
- Choosing the first element: good for a random array, but what if the array is already or nearly sorted?
- Choosing a random pivot: good in practice if truly random, but it is still possible to get some bad choices, and it requires executing a random number generator.
Slide 28: Picking the pivot
- Best choice of pivot? The median of the array, but the median is expensive to calculate.
- Estimate the median as the median of three elements: choose the first, middle, and last elements.
- Median-of-three has been shown to reduce running time (comparisons) by 14%.
Slide 29: Partitioning strategy
- Partitioning is conceptually straightforward, but easy to do inefficiently.
- A good strategy:
  - Swap the pivot with the last element S[right].
  - Set i = left and j = right - 1.
  - While (i < j):
    - increment i until S[i] > pivot;
    - decrement j until S[j] < pivot;
    - if (i < j), swap S[i] and S[j].
  - Swap the pivot and S[i].
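The partitioning strategy above can be sketched in Python. This sketch assumes the pivot has already been chosen and swapped into S[right], and it uses the stop-on-equal variant discussed on slide 33 (scan while strictly less/greater than the pivot), so arrays of identical elements partition evenly:

```python
def partition(S, left, right):
    """Partition S[left..right] in place around the pivot S[right].
    Returns the pivot's final index; S[left..idx-1] <= pivot <= S[idx+1..right]."""
    pivot = S[right]
    i, j = left, right - 1
    while True:
        while i < right and S[i] < pivot:    # advance i to an element >= pivot
            i += 1
        while j > left and S[j] > pivot:     # retreat j to an element <= pivot
            j -= 1
        if i < j:
            S[i], S[j] = S[j], S[i]          # swap the out-of-place pair
            i += 1; j -= 1
        else:
            break
    S[i], S[right] = S[right], S[i]          # restore the pivot to position i
    return i
```

For example, partitioning `[8, 1, 4, 9, 0, 3, 5, 2, 7, 6]` over the whole range (pivot 6, already last) leaves 6 at its final position with smaller elements to its left and larger to its right.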
Slide 30: Partitioning example
Slide 31: Partitioning example (continued)
Slide 32: Partitioning strategy
How to handle duplicates? Consider the case where all elements are equal.
- Current approach: skip over elements equal to the pivot.
  - No swaps (good).
  - But then i = right - 1 and the array is partitioned into N-1 and 1 elements.
  - Worst-case O(N^2) performance.
Slide 33: Partitioning strategy
How to handle duplicates? Alternative approach: don't skip elements equal to the pivot.
- Increment i while S[i] < pivot; decrement j while S[j] > pivot.
- This adds some unnecessary swaps, but results in perfect partitioning for an array of identical elements.
- Identical elements are unlikely in the input array, but more likely in recursive calls to QuickSort.
Slide 34: Small arrays
- When S is small, generating lots of recursive calls on small sub-arrays is expensive.
- General strategy: when N < threshold, use a sort that is more efficient for small arrays (e.g., insertion sort). Good thresholds range from 5 to 20.
- This also avoids the issue of finding a median-of-three pivot for an array of size 2 or less.
- Has been shown to reduce running time by 15%.
Slide 35: QuickSort implementation (code)
Slide 36: QuickSort implementation (code, continued)
Slide 37: QuickSort implementation (code, continued)
Slide 38: Analysis of QuickSort
- Let i be the number of elements sent to the left partition.
- Compute the running time T(N) for an array of size N:
  T(0) = T(1) = O(1)
  T(N) = T(i) + T(N - i - 1) + O(N)
Slide 39: Analysis of QuickSort (continued)
Slide 40: Analysis of QuickSort (continued)
Slide 41: Comparison Sorting
Slide 42: Comparison Sorting (continued)
Slide 43: Comparison Sorting (continued)
Slide 44: Lower Bound on Sorting
- The best worst-case sorting algorithm so far is O(N log N). Can we do better? Can we prove a lower bound on the sorting problem?
- Preview: for comparison sorting, no, we can't do better; we can show a lower bound of Ω(N log N).
Slide 45: Decision Trees
A decision tree is a binary tree:
- each node represents a set of possible orderings of the array elements;
- each branch represents the outcome of a particular comparison;
- each leaf represents a particular ordering of the original array elements.
Slide 46: Decision Trees (example)
Slide 47: Decision Tree for Sorting
- The logic of every sorting algorithm that uses comparisons can be represented by a decision tree.
- In the worst case, the number of comparisons used by the algorithm equals the depth of the deepest leaf.
- In the average case, the number of comparisons is the average of the depths of all leaves.
- There are N! different orderings of N elements.
Slide 48: Lower Bound for Comparison Sorting
- Lemma 7.1
- Lemma 7.2
- Theorem 7.6
- Theorem 7.7
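The lemma/theorem numbering follows Weiss's textbook; the combined counting argument can be sketched as:

```latex
\begin{aligned}
&\text{A binary tree of depth } d \text{ has at most } 2^d \text{ leaves;}\\
&\text{equivalently, a binary tree with } L \text{ leaves has depth} \ge \lceil \log_2 L \rceil.\\
&\text{A decision tree for sorting } N \text{ elements has at least } N! \text{ leaves,}\\
&\text{so any comparison sort needs} \ge \log_2(N!) \text{ comparisons in the worst case, and}\\
&\log_2(N!) \ge \log_2\!\left(\left(\tfrac{N}{2}\right)^{N/2}\right) = \tfrac{N}{2}\log_2\tfrac{N}{2} = \Omega(N\log N).
\end{aligned}
```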
Slide 49: Linear Sorting
- Some constraints on the input array allow sorting faster than Θ(N log N) (no comparisons).
- CountingSort:
  - Given an array A of N integer elements, each less than M.
  - Create an array C of size M, where C[i] is the number of i's in A.
  - Use C to place the elements into a new sorted array B.
  - Running time Θ(N + M) = Θ(N) if M = Θ(N).
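The CountingSort steps above can be sketched in Python (for plain integer keys; a stable, prefix-sum variant would be needed to carry satellite data):

```python
def counting_sort(A, M):
    """CountingSort: every element of A is an integer in [0, M).
    C[i] counts occurrences of i; B is the sorted output array."""
    C = [0] * M
    for x in A:                    # count each key: Theta(N)
        C[x] += 1
    B = []
    for i in range(M):             # emit each key C[i] times: Theta(N + M)
        B.extend([i] * C[i])
    return B
```

No comparisons between elements occur; the keys index directly into C, which is why the Ω(N log N) comparison-sort lower bound does not apply.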
Slide 50: Linear Sorting
- BucketSort:
  - Assume the N elements of A are uniformly distributed over the range [0,1).
  - Create N equal-sized buckets over [0,1).
  - Add each element of A to the appropriate bucket.
  - Sort each bucket (e.g., with insertion sort).
  - Return the concatenation of the buckets.
  - Average-case running time Θ(N), assuming each bucket contains Θ(1) elements.
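The BucketSort steps above can be sketched in Python (a minimal illustration assuming, as the slide does, values in [0, 1)):

```python
def bucket_sort(A):
    """BucketSort for N values assumed uniform over [0, 1):
    N equal-sized buckets, insertion-sort each bucket, concatenate."""
    n = len(A)
    if n == 0:
        return []
    buckets = [[] for _ in range(n)]
    for x in A:
        buckets[int(x * n)].append(x)     # bucket k covers [k/n, (k+1)/n)
    out = []
    for b in buckets:
        for p in range(1, len(b)):        # insertion sort: buckets are tiny
            tmp, j = b[p], p
            while j > 0 and b[j - 1] > tmp:
                b[j] = b[j - 1]; j -= 1
            b[j] = tmp
        out.extend(b)                     # concatenation of sorted buckets
    return out
```

Under the uniformity assumption each bucket holds Θ(1) elements on average, so the per-bucket insertion sorts cost Θ(N) in total.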
Slide 51: External Sorting
- What if the N elements we wish to sort do not fit in memory?
- Our existing sort algorithms are then inefficient: each comparison potentially requires a disk access.
- We want to minimize disk accesses.
Slide 52: External Mergesort
- N = number of elements in the array A to be sorted; M = number of elements that fit in memory; K = ceil(N/M).
- Approach:
  - Read in M elements of A, sort them using quicksort, and write them back to disk: O(M log M). Repeat K times until all of A is processed.
  - Create K input buffers and 1 output buffer, each of size M/(K+1).
  - Perform a K-way merge: O(N).
  - Update the input buffers one disk page at a time; write the output buffer one disk page at a time.
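The K-way merge phase can be sketched in Python, with in-memory lists standing in for the K sorted runs on disk (a sketch of the merge logic only; a real external sort reads and writes disk pages through the buffers described above):

```python
import heapq

def k_way_merge(runs):
    """Merge K sorted runs using a min-heap of the current front
    elements, one (value, run index, position) entry per run."""
    heap = [(run[0], k, 0) for k, run in enumerate(runs) if run]
    heapq.heapify(heap)
    out = []
    while heap:
        val, k, i = heapq.heappop(heap)    # overall smallest front element
        out.append(val)
        if i + 1 < len(runs[k]):           # refill from the same run
            heapq.heappush(heap, (runs[k][i + 1], k, i + 1))
    return out
```

Each of the N output elements costs O(log K) heap work, giving O(N log K) comparisons for the merge phase, which is O(N) per pass in the sense used on the slide.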
Slide 53: External Mergesort
- T(N,M) = O(K * M log M + N)
         = O((N/M) * M log M + N)
         = O(N log M + N)
         = O(N log M)
- Disk accesses (all sequential): with page size P, accesses = 4N/P (everything is read and written twice).
Slide 54: Summary