
Slide 1: G64ADS Advanced Data Structures - Sorting

Slide 2: Insertion sort
1) Initially p = 1
2) Let the first p elements be sorted.
3) Insert the (p+1)th element properly in the list so that now p+1 elements are sorted.
4) Increment p and go to step (3).

Slide 3: Insertion sort (figure)

Slide 4: Insertion sort
- Consists of N - 1 passes
- For pass p = 1 through N - 1, ensures that the elements in positions 0 through p are in sorted order
  - elements in positions 0 through p - 1 are already sorted
  - move the element in position p left until its correct place is found among the first p + 1 elements
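The transcript does not include code for these slides; here is a minimal Java sketch of the pass structure just described (method and variable names are my own):

```java
// Insertion sort following the pass structure described above:
// for pass p = 1 .. N-1, a[0..p-1] is already sorted and a[p] is moved
// left until its correct place among the first p+1 elements is found.
public static void insertionSort(int[] a) {
    for (int p = 1; p < a.length; p++) {
        int tmp = a[p];
        int j = p;
        // shift elements larger than tmp one position to the right
        while (j > 0 && a[j - 1] > tmp) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = tmp;   // drop tmp into its correct slot
    }
}
```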

Slide 5: Insertion sort
To sort the following numbers in increasing order: 34 8 64 51 32 21
p = 1; tmp = 8; 34 > tmp, so the second element a[1] is set to 34. We have reached the front of the list, so the 1st position a[0] = tmp = 8, giving {8, 34, ...}.
After 1st pass: 8 34 64 51 32 21 (first 2 elements are sorted)

Slide 6: Insertion sort
p = 2; tmp = 64; 34 < 64, so stop at the 3rd position and set the 3rd position = 64.
After 2nd pass: 8 34 64 51 32 21 (first 3 elements are sorted)
p = 3; tmp = 51; 51 < 64, so we have 8 34 64 64 32 21; 34 < 51, so stop at the 2nd position and set the 3rd position = tmp.
After 3rd pass: 8 34 51 64 32 21 (first 4 elements are sorted)
p = 4; tmp = 32; 32 < 64, so 8 34 51 64 64 21; 32 < 51, so 8 34 51 51 64 21; next 32 < 34, so 8 34 34 51 64 21; next 32 > 8, so stop at the 1st position and set the 2nd position = 32.
After 4th pass: 8 32 34 51 64 21
p = 5; tmp = 21, ...
After 5th pass: 8 21 32 34 51 64

Slide 7: Insertion sort: worst-case running time
- Inner loop is executed p times, for each p = 1..N-1
  - Overall: 1 + 2 + 3 + ... + (N-1) = N(N-1)/2 = O(N^2)
- Space requirement is O(?)

Slide 8: Heapsort
(1) Build a binary heap of N elements
  - the minimum element is at the top of the heap
(2) Perform N DeleteMin operations
  - the elements are extracted in sorted order
(3) Record these elements in a second array and then copy the array back

Slide 9: Heapsort: Analysis
(1) Build a binary heap of N elements
  - repeatedly insert N elements → O(N log N) time
(2) Perform N DeleteMin operations
  - each DeleteMin operation takes O(log N) → O(N log N)
(3) Record these elements in a second array and then copy the array back
  - O(N)
- Total time complexity: O(N log N)
- Memory requirement: uses an extra array, O(N)

Slide 10: Heapsort – No Extra Memory
- Observation: after each DeleteMin, the size of the heap shrinks by 1
- We can use the last cell just freed up to store the element that was just deleted
  - after the last DeleteMin, the array will contain the elements in decreasing sorted order
- To sort the elements in decreasing order, use a min heap
- To sort the elements in increasing order, use a max heap
  - the parent has a larger element than the child
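A minimal Java sketch of this in-place scheme (increasing order, max heap). Note that it builds the heap bottom-up in O(N) rather than by the repeated insertions described on slide 9; the helper names are my own.

```java
// In-place heapsort: build a max heap, then repeatedly move the root
// (current maximum) into the cell freed at the end of the shrinking heap.
public static void heapSort(int[] a) {
    int n = a.length;
    // build a max heap by percolating down from the last internal node
    for (int i = n / 2 - 1; i >= 0; i--) {
        percolateDown(a, i, n);
    }
    // repeatedly "DeleteMax": swap root with last heap cell, shrink heap by 1
    for (int end = n - 1; end > 0; end--) {
        int tmp = a[0]; a[0] = a[end]; a[end] = tmp;
        percolateDown(a, 0, end);
    }
}

// Restore the max-heap property for the subtree rooted at i (heap size = size).
private static void percolateDown(int[] a, int i, int size) {
    int tmp = a[i];
    int child;
    for (; 2 * i + 1 < size; i = child) {
        child = 2 * i + 1;
        if (child + 1 < size && a[child + 1] > a[child]) {
            child++;               // pick the larger child
        }
        if (a[child] > tmp) {
            a[i] = a[child];       // move the larger child up
        } else {
            break;
        }
    }
    a[i] = tmp;
}
```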

Slide 11: Heapsort – No Extra Memory (example figure: sort in increasing order using a max heap; delete 97)

Slide 12: Mergesort
- Based on divide-and-conquer strategy
  - Divide the list into two smaller lists of about equal sizes
  - Sort each smaller list recursively
  - Merge the two sorted lists to get one sorted list
- How to divide the list? Running time?
- How to merge the two sorted lists? Running time?

Slide 14: Mergesort: Divide
- If the input list is a linked list, dividing takes Θ(N) time
  - we scan the linked list, stop at the ⌊N/2⌋th entry and cut the link
- If the input list is an array A[0..N-1], dividing takes O(1) time
  - we can represent a sublist by two integers left and right: to divide A[left..right], we compute center = (left+right)/2 and obtain A[left..center] and A[center+1..right]
  - Try left = 0, right = 50: center = ?

Slide 15: Mergesort
- Divide-and-conquer strategy
  - recursively mergesort the first half and the second half
  - merge the two sorted halves together

Slide 16: Mergesort (example figure)

Slide 17: Mergesort: Merge
- Input: two sorted arrays A and B
- Output: a sorted output array C
- Three counters: Actr, Bctr, and Cctr
  - initially set to the beginning of their respective arrays
(1) The smaller of A[Actr] and B[Bctr] is copied to the next entry in C, and the appropriate counters are advanced
(2) When either input list is exhausted, the remainder of the other list is copied to C
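A Java sketch of the merge step just described, keeping the slide's counter names Actr, Bctr, Cctr (it assumes C has length A.length + B.length):

```java
// Merge two sorted arrays A and B into the output array C.
// Actr, Bctr and Cctr scan A, B and C respectively.
public static void merge(int[] A, int[] B, int[] C) {
    int Actr = 0, Bctr = 0, Cctr = 0;
    // copy the smaller front element until one input is exhausted
    while (Actr < A.length && Bctr < B.length) {
        if (A[Actr] <= B[Bctr]) {
            C[Cctr++] = A[Actr++];
        } else {
            C[Cctr++] = B[Bctr++];
        }
    }
    // copy whatever remains of the other list
    while (Actr < A.length) C[Cctr++] = A[Actr++];
    while (Bctr < B.length) C[Cctr++] = B[Bctr++];
}
```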

Slides 18-19: Mergesort: Merge (example figures)

Slide 20: Mergesort: Analysis
- Merge takes O(m1 + m2) time, where m1 and m2 are the sizes of the two sublists
- Space requirement
  - merging two sorted lists requires linear extra memory
  - additional work to copy to the temporary array and back

Slide 21: Mergesort: Analysis
- Let T(N) denote the worst-case running time of mergesort to sort N numbers
- Assume that N is a power of 2
- Divide step: O(1) time
- Conquer step: 2 T(N/2) time
- Combine step: O(N) time
- Recurrence equation:
  - T(1) = 1
  - T(N) = 2T(N/2) + N

Slide 22: Mergesort: Analysis
Since N = 2^k, we have k = log2 N.
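The worked figure for this slide is not in the transcript; a sketch of the standard telescoping argument, starting from the recurrence on slide 21:

```latex
\begin{align*}
T(N) &= 2\,T(N/2) + N
  \quad\Longrightarrow\quad
  \frac{T(N)}{N} = \frac{T(N/2)}{N/2} + 1 \\
\frac{T(N)}{N} &= \frac{T(N/4)}{N/4} + 2 = \cdots = \frac{T(1)}{1} + \log_2 N \\
T(N) &= N + N\log_2 N = O(N \log N).
\end{align*}
```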

Slide 23: Quicksort
- Divide-and-conquer approach to sorting
- Like MergeSort, except
  - Don't divide the array in half
  - Partition the array based on elements being less than or greater than some element of the array (the pivot)
- Worst case running time O(N^2)
- Average case running time O(N log N)
- Fastest generic sorting algorithm in practice
  - Even faster if we use a simple sort (e.g., InsertionSort) when the array is small

Slide 24: Quicksort Algorithm
Given array S, modify S so its elements are in increasing order:
1. If the size of S is 0 or 1, return
2. Pick any element v in S as the pivot
3. Partition S - {v} into two disjoint groups
   - S1 = {x ∈ (S - {v}) | x ≤ v}
   - S2 = {x ∈ (S - {v}) | x ≥ v}
4. Return QuickSort(S1), followed by v, followed by QuickSort(S2)

Slide 25: Quicksort Example (figure)

Slide 26: Why so fast?
- MergeSort always divides the array in half
- QuickSort might divide the array into subproblems of size 1 and N-1
  - When?
  - Leading to O(N^2) performance
  - Need to choose the pivot wisely (but efficiently)
- MergeSort requires a temporary array for the merge step
- QuickSort can partition the array in place
  - This more than makes up for bad pivot choices

Slide 27: Picking the Pivot
- Choosing the first element
  - What if the array is already or nearly sorted?
  - Good for a random array
- Choosing a random pivot
  - Good in practice if truly random
  - Still possible to get some bad choices
  - Requires execution of a random number generator

Slide 28: Picking the Pivot
- Best choice of pivot?
  - The median of the array, but the median is expensive to calculate
- Estimate the median as the median of three elements
  - Choose the first, middle and last elements
  - Has been shown to reduce running time (comparisons) by 14%

Slide 29: Partitioning Strategy
- Partitioning is conceptually straightforward, but easy to do inefficiently
- A good strategy:
  - Swap the pivot with the last element S[right]
  - Set i = left
  - Set j = (right - 1)
  - While (i < j):
    - Increment i until S[i] > pivot
    - Decrement j until S[j] < pivot
    - If (i < j), then swap S[i] and S[j]
  - Swap the pivot and S[i]
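The "QuickSort Implementation" slides further on are not in the transcript; here is a Java sketch in the spirit of the strategy above. It uses median-of-three pivot selection (slide 28), parks the pivot at a[right-1] rather than a[right] (a common variant), stops the scans on elements equal to the pivot (the alternative approach of slide 33), and falls back to insertion sort on small sub-arrays (slide 34). All names and the cutoff value are my own.

```java
// QuickSort sketch: median-of-three pivot plus the scan-and-swap partitioning
// described above. Scans stop on elements equal to the pivot, which keeps
// partitions balanced when many keys are equal.
private static final int CUTOFF = 10;   // switch to insertion sort below this size

public static void quickSort(int[] a) {
    quickSort(a, 0, a.length - 1);
}

private static void quickSort(int[] a, int left, int right) {
    if (right - left + 1 <= CUTOFF) {        // small sub-array: insertion sort
        insertionSort(a, left, right);
        return;
    }
    int pivot = median3(a, left, right);     // also parks the pivot at a[right-1]
    int i = left, j = right - 1;
    for (;;) {
        while (a[++i] < pivot) { }           // stop at an element >= pivot
        while (a[--j] > pivot) { }           // stop at an element <= pivot
        if (i < j) swap(a, i, j);
        else break;
    }
    swap(a, i, right - 1);                   // restore the pivot between the partitions
    quickSort(a, left, i - 1);
    quickSort(a, i + 1, right);
}

// Order first/middle/last, then hide the median (the pivot) at a[right-1]
// so it acts as a sentinel for the i-scan; a[left] is the sentinel for the j-scan.
private static int median3(int[] a, int left, int right) {
    int center = (left + right) / 2;
    if (a[center] < a[left])   swap(a, left, center);
    if (a[right]  < a[left])   swap(a, left, right);
    if (a[right]  < a[center]) swap(a, center, right);
    swap(a, center, right - 1);
    return a[right - 1];
}

private static void swap(int[] a, int x, int y) {
    int tmp = a[x]; a[x] = a[y]; a[y] = tmp;
}

// Insertion sort restricted to a[left..right], used for small sub-arrays.
private static void insertionSort(int[] a, int left, int right) {
    for (int p = left + 1; p <= right; p++) {
        int tmp = a[p];
        int j = p;
        while (j > left && a[j - 1] > tmp) { a[j] = a[j - 1]; j--; }
        a[j] = tmp;
    }
}
```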

Slides 30-31: Partitioning Example (figures)

Slide 32: Partitioning Strategy
- How to handle duplicates?
- Consider the case where all elements are equal
- Current approach: skip over elements equal to the pivot
  - No swaps (good)
  - But then i = (right - 1) and the array is partitioned into N-1 and 1 elements
  - Worst case O(N^2) performance

Slide 33: Partitioning Strategy
- How to handle duplicates?
- Alternative approach: don't skip elements equal to the pivot
  - Increment i while S[i] < pivot
  - Decrement j while S[j] > pivot
  - Adds some unnecessary swaps
  - But results in perfect partitioning for an array of identical elements
  - Unlikely for the input array, but more likely for recursive calls to QuickSort

Slide 34: Small Arrays
- When S is small, generating lots of recursive calls on small sub-arrays is expensive
- General strategy
  - When N < threshold, use a sort more efficient for small arrays (e.g., InsertionSort)
  - Good thresholds range from 5 to 20
  - Also avoids the issue of finding a median-of-three pivot for an array of size 2 or less
  - Has been shown to reduce running time by 15%

Slides 35-37: QuickSort Implementation (code figures; a Java sketch appears after slide 29 above)

Slide 38: Analysis of QuickSort
- Let i be the number of elements sent to the left partition
- Compute the running time T(N) for an array of size N
  - T(0) = T(1) = O(1)
  - T(N) = T(i) + T(N - i - 1) + O(N)
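The derivation slides that follow are not in the transcript; a sketch of the worst case, where the pivot is always the smallest remaining element so i = 0 at every level of recursion:

```latex
\begin{align*}
T(N) &= T(N-1) + cN \\
     &= T(N-2) + c(N-1) + cN \\
     &\;\;\vdots \\
     &= T(1) + c\sum_{k=2}^{N} k \;=\; O(N^2).
\end{align*}
```

In the average case, where each value of i from 0 to N-1 is equally likely, the recurrence solves to O(N log N), matching slide 23.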

Slides 39-40: Analysis of QuickSort (derivation figures)

Slides 41-43: Comparison Sorting (summary tables not in transcript)

Slide 44: Lower Bound on Sorting
- The best worst-case sorting algorithm (so far) is O(N log N)
- Can we do better?
- Can we prove a lower bound on the sorting problem?
- Preview
  - For comparison sorting, no, we can't do better
  - Can show a lower bound of Ω(N log N)

Slide 45: Decision Trees
- A decision tree is a binary tree
- Each node represents a set of possible orderings of the array elements
- Each branch represents an outcome of a particular comparison
- Each leaf of the decision tree represents a particular ordering of the original array elements

Slide 46: Decision Trees (example figure)

Slide 47: Decision Tree for Sorting
- The logic of every sorting algorithm that uses comparisons can be represented by a decision tree
- In the worst case, the number of comparisons used by the algorithm equals the depth of the deepest leaf
- In the average case, the number of comparisons is the average of the depths of all leaves
- There are N! different orderings of N elements
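Combining the last two bullets gives the usual counting argument (a sketch; the lemmas and theorems cited on the next slide make it precise):

```latex
\begin{align*}
&\text{A binary tree of depth } d \text{ has at most } 2^{d} \text{ leaves, so a decision tree} \\
&\text{with at least } N! \text{ leaves has depth } d \ge \log_2(N!). \\
&\log_2(N!) \;\ge\; \log_2\!\Big(\big(\tfrac{N}{2}\big)^{N/2}\Big)
            \;=\; \tfrac{N}{2}\log_2\tfrac{N}{2} \;=\; \Omega(N \log N).
\end{align*}
```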

Slide 48: Lower Bound for Comparison Sorting
- Lemma 7.1
- Lemma 7.2
- Theorem 7.6
- Theorem 7.7

Slide 49: Linear Sorting
- Some constraints on the input array allow faster than Θ(N log N) sorting (no comparisons)
- CountingSort
  - Given an array A of N integer elements, each less than M
  - Create an array C of size M, where C[i] is the number of i's in A
  - Use C to place elements into a new sorted array B
  - Running time Θ(N+M) = Θ(N) if M = Θ(N)
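A Java sketch of the CountingSort just described, keeping the slide's array names A, B and C (it assumes the keys are nonnegative, i.e. in the range 0..M-1):

```java
// CountingSort: A holds N integers in [0, M); C[i] counts occurrences of key i;
// B receives the elements in sorted order. Runs in Theta(N + M) time.
public static int[] countingSort(int[] A, int M) {
    int[] C = new int[M];
    for (int x : A) {
        C[x]++;                    // count each key
    }
    int[] B = new int[A.length];
    int pos = 0;
    for (int i = 0; i < M; i++) {  // emit each key as many times as it was seen
        for (int k = 0; k < C[i]; k++) {
            B[pos++] = i;
        }
    }
    return B;
}
```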

Slide 50: Linear Sorting
- BucketSort
  - Assume the N elements of A are uniformly distributed over the range [0,1)
  - Create N equal-sized buckets over [0,1)
  - Add each element of A into the appropriate bucket
  - Sort each bucket (e.g., with InsertionSort)
  - Return the concatenation of the buckets
  - Average case running time Θ(N)
    - Assumes each bucket will contain Θ(1) elements
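A Java sketch of BucketSort under the same assumptions (keys uniform in [0,1)); the choice of per-bucket sort and all names are mine:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// BucketSort: N keys assumed uniformly distributed over [0,1).
// Each bucket receives Theta(1) keys on average, so expected total time is Theta(N).
public static void bucketSort(double[] a) {
    int n = a.length;
    List<List<Double>> buckets = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
        buckets.add(new ArrayList<>());
    }
    for (double x : a) {
        buckets.get((int) (x * n)).add(x);   // bucket i covers [i/n, (i+1)/n)
    }
    int pos = 0;
    for (List<Double> bucket : buckets) {
        Collections.sort(bucket);            // tiny bucket: any simple sort will do
        for (double x : bucket) {
            a[pos++] = x;
        }
    }
}
```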

Slide 51: External Sorting
- What if the N elements we wish to sort do not fit in memory?
- Obviously, our existing sort algorithms are inefficient
  - Each comparison potentially requires a disk access
- We want to minimize disk accesses

Slide 52: External Mergesorting
- N = number of elements in array A to be sorted
- M = number of elements that fit in memory
- K = ⌈N/M⌉
- Approach
  - Read in M elements of A, sort them using QuickSort, and write them back to disk: O(M log M)
  - Repeat the above K times until all of A is processed
  - Create K input buffers and 1 output buffer, each of size M/(K+1)
  - Perform a K-way merge: O(N)
    - Update input buffers one disk page at a time
    - Write the output buffer one disk page at a time
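The disk I/O bookkeeping is beyond a short sketch, but the K-way merge at the heart of this approach can be illustrated in memory: a min-heap of size K always exposes the smallest unconsumed element across the K sorted runs. In the Java sketch below the arrays stand in for the on-disk runs and all names are mine; each of the N outputs costs O(log K), which matches the slide's O(N) when K is treated as small.

```java
import java.util.PriorityQueue;

// K-way merge of K sorted runs (in-memory arrays standing in for sorted chunks on disk).
public static int[] kWayMerge(int[][] runs) {
    int total = 0;
    for (int[] run : runs) total += run.length;

    // heap entries: {value, run index, position within that run}
    PriorityQueue<int[]> heap =
        new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
    for (int r = 0; r < runs.length; r++) {
        if (runs[r].length > 0) heap.add(new int[]{runs[r][0], r, 0});
    }

    int[] out = new int[total];
    int pos = 0;
    while (!heap.isEmpty()) {
        int[] top = heap.poll();             // smallest remaining element
        out[pos++] = top[0];
        int r = top[1], next = top[2] + 1;
        if (next < runs[r].length) {         // refill the heap from the same run
            heap.add(new int[]{runs[r][next], r, next});
        }
    }
    return out;
}
```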

Slide 53: External Mergesort
- T(N,M) = O(K * M log M + N)
- T(N,M) = O((N/M) * M log M + N)
- T(N,M) = O(N log M + N)
- T(N,M) = O(N log M)
- Disk accesses (all sequential)
  - P = page size
  - Accesses = 4N/P (read-all/write-all twice)

Slide 54: Summary

