
1 Sorting Data Structures and Algorithms (60-254)

2 Sorting
Sorting is one of the most well-studied problems in Computer Science.
The ultimate reference on the subject is “The Art of Computer Programming, Vol. 3: Sorting and Searching” by D. E. Knuth.

3 Formal Statement
Given a sequence of n numbers a_1, a_2, …, a_n, find a permutation π of the numbers 1, 2, …, n such that a_π(1) ≤ a_π(2) ≤ … ≤ a_π(n).
Permutation examples:
3, 2, 1 → π(1) = 3, π(2) = 2, π(3) = 1
2, 1, 3 → π(1) = 2, π(2) = 1, π(3) = 3
1, 3, 2 → π(1) = 1, π(2) = 3, π(3) = 2
… are all permutations of 1, 2, 3.

4 Comparison Sorts
A comparison sort sorts by comparing elements pairwise. We study these comparison sorts:
Insertion Sort
Shellsort
Mergesort
Quicksort

5 Insertion Sort
Sort the sequence 3, 1, 5, 4, 2:
Sort 3 → 3
Sort 3, 1 → 1, 3
Sort 1, 3, 5 → 1, 3, 5
Sort 1, 3, 5, 4 → 1, 3, 4, 5
Sort 1, 3, 4, 5, 2 → 1, 2, 3, 4, 5

6 Incremental sorting
In general, at the i-th step, a_1, a_2, a_3, …, a_(i-1), a_i are already sorted, i.e. a_π(1) ≤ a_π(2) ≤ … ≤ a_π(i) for some permutation π of 1, 2, …, i.
In the next step, a_(i+1) has to be inserted in the correct position.
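To make the incremental step concrete, here is a minimal Python sketch of insertion sort (not from the slides; the function name and style are illustrative):

def insertion_sort(a):
    # Sort the list a in place by inserting a[i] into the already sorted prefix a[0..i-1].
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift elements of the sorted prefix that are larger than key one slot to the right.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

# Example: insertion_sort([3, 1, 5, 4, 2]) returns [1, 2, 3, 4, 5], matching slide 5.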

7 Analysis of Insertion Sort
What is the worst-case input? Elements in decreasing order!!
Example: 5, 4, 3, 2, 1
Step                               # of comparisons
5                                  0
5, 4 → 4, 5                        1
4, 5, 3 → 3, 4, 5                  2
3, 4, 5, 2 → 2, 3, 4, 5            3
2, 3, 4, 5, 1 → 1, 2, 3, 4, 5      4

8 Worst case
In general, to insert a_(i+1) in its proper place, w.r.t. the sorted preceding i numbers a_1, a_2, …, a_i, we can make i comparisons in the worst case.
Thus, the total number of comparisons is 1 + 2 + … + (n-1) = n(n-1)/2 = O(n²).

9 Shellsort
Due to Donald Shell.
For example, Shellsort the sequence: 81, 94, 11, 96, 12, 35, 17, 95 (1)
Step 1: Sort all subsequences of elements that are four positions apart:
81, 12 → 12, 81
94, 35 → 35, 94
11, 17 → 11, 17
96, 95 → 95, 96
Results in: 12, 35, 11, 95, 81, 94, 17, 96 (2)

10 Shellsort
Step 2: Sort all subsequences of (2) = 12, 35, 11, 95, 81, 94, 17, 96 that are two positions apart:
12, 11, 81, 17 → 11, 12, 17, 81
35, 95, 94, 96 → 35, 94, 95, 96
Results in: 11, 35, 12, 94, 17, 95, 81, 96 (3)
Step 3: Sort all subsequences of (3) that are one position apart:
11, 35, 12, 94, 17, 95, 81, 96 → 11, 12, 17, 35, 81, 94, 95, 96 (4)
Sequence (4) is sorted!!

11 Observations
h_1, h_2, h_3 = 4, 2, 1 is called a gap sequence.
Different gap sequences are possible; every one of them must end with 1.
Shell’s gap sequence: h_1 = n/2, h_i = h_(i-1)/2 (down to h_k = 1).
All subsequences were sorted using insertion sort.
In Step 3, we sorted the entire sequence using insertion sort!
Advantage over straightforward insertion sort?
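As a rough illustration, here is a Python sketch of Shellsort using Shell's gap sequence n/2, n/4, …, 1 (an illustrative sketch, not the slides' code):

def shellsort(a):
    # Shellsort with Shell's original gap sequence: n/2, n/4, ..., 1.
    n = len(a)
    gap = n // 2
    while gap > 0:
        # Gapped insertion sort: after this pass, every subsequence of
        # elements that are gap positions apart is sorted.
        for i in range(gap, n):
            key = a[i]
            j = i
            while j >= gap and a[j - gap] > key:
                a[j] = a[j - gap]
                j -= gap
            a[j] = key
        gap //= 2
    return a

# shellsort([81, 94, 11, 96, 12, 35, 17, 95]) returns [11, 12, 17, 35, 81, 94, 95, 96].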

12 Example
Insertion sort on: 81, 94, 11, 96, 12, 35, 17, 95
Sorted prefix                        # of comparisons
81                                   0
81, 94                               1
11, 81, 94                           2
11, 81, 94, 96                       1
11, 12, 81, 94, 96                   4
11, 12, 35, 81, 94, 96               4
11, 12, 17, 35, 81, 94, 96           5
11, 12, 17, 35, 81, 94, 95, 96       2
Total # of comparisons: 19

13 Example
Insertion sort on: 11, 35, 12, 94, 17, 95, 81, 96
Sorted prefix                        # of comparisons
11                                   0
11, 35                               1
11, 12, 35                           2
11, 12, 35, 94                       1
11, 12, 17, 35, 94                   3
11, 12, 17, 35, 94, 95               1
11, 12, 17, 35, 81, 94, 95           3
11, 12, 17, 35, 81, 94, 95, 96       1
Total # of comparisons: 12

14 Analysis of Shellsort
A clever choice of gap sequence leads to a subquadratic algorithm.
That is, for an n-element sequence, the number of comparisons is O(n^(3/2)) when using the Hibbard sequence: 1, 3, 7, …, 2^k - 1.

15 Mergesort
Sort: 81, 94, 11, 96 | 12, 35, 17, 95
Mergesort(81, 94, 11, 96)
Mergesort(12, 35, 17, 95)
Merge the two sorted lists from the above two lines.
This is a divide-and-conquer algorithm.

16 Divide
(Diagram: the list is recursively split into halves until one-element lists remain.)

17 Conquer
Merge two sorted lists:
MS(81, 94, 11, 96) = 11, 81, 94, 96 (1)
MS(12, 35, 17, 95) = 12, 17, 35, 95 (2)
Compare 11 and 12 → output 11 → move index in list (1)
Compare 12 and 81 → output 12 → move index in list (2)
Compare 17 and 81 → output 17 → move index in list (2)

18 Number of Comparisons
A total of seven comparisons to generate the sorted list 11, 12, 17, 35, 81, 94, 95, 96.
This is the maximum! If the lists were 81, 94, 95, 96 and 11, 12, 17, 35, we would need only four comparisons.
The algorithm follows…

19 Procedure Mergesort(A)
  n ← size of A
  if (n > 1)
    A1 ← A[1 … n/2]        // create a new array A1
    A2 ← A[n/2 + 1 … n]    // create a new array A2
    Mergesort(A1)
    Mergesort(A2)
    Merge(A, A1, A2)
  else
    // A has only one element → do nothing!

20 Procedure Merge(A, A1, A2)
  n1 ← size of A1
  n2 ← size of A2
  i ← 1; j ← 1; k ← 1
  while (i <= n1 and j <= n2)
    if (A1[i] < A2[j])
      A[k] ← A1[i]; i ← i + 1
    else
      A[k] ← A2[j]; j ← j + 1
    k ← k + 1
  for m ← i to n1
    A[k] ← A1[m]; k ← k + 1
  for m ← j to n2
    A[k] ← A2[m]; k ← k + 1
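A runnable Python version of the two procedures above (a sketch that returns new lists rather than sorting in place, unlike the pseudocode):

def mergesort(a):
    # Divide: split a in half, sort each half recursively, then merge.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))

def merge(left, right):
    # Conquer: merge two sorted lists with at most len(left) + len(right) - 1 comparisons.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])    # copy the remainder of whichever list is not exhausted
    out.extend(right[j:])
    return out

# mergesort([81, 94, 11, 96, 12, 35, 17, 95]) returns [11, 12, 17, 35, 81, 94, 95, 96].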

21 Theorem
To merge two sorted lists, each of length n, we need at most 2n - 1 comparisons.

22 Complexity of Mergesort
T(n) = 2 T(n/2) + O(n)   for n ≥ 2
T(n) = 1                 for n = 1
Solution: T(n) = O(n log n)
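A short sketch of why this recurrence gives O(n log n) (assuming, for simplicity, that n is a power of two and writing the O(n) term as cn): unrolling gives T(n) = 2T(n/2) + cn = 4T(n/4) + 2cn = … = 2^k T(n/2^k) + k·cn. Taking k = log2 n yields T(n) = n·T(1) + cn·log2 n = O(n log n).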

23 A Partitioning Game
Given L = 5, 3, 2, 6, 4, 1, 3, 7.
Partition L into L1 and L2 such that every element in list L1 is less than or equal to every element in list L2.
How?

24 Split
Let a = first element of L.
Make every element of list L1 ≤ a ≤ every element of list L2.
How? Using two indices: lx = left index, rx = right index.

25 Initial configuration
5, 3, 2, 6, 4, 1, 3, 7   (lx at the left end, rx at the right end)
Rules:
lx moves right until it meets an element ≥ 5
rx moves left until it meets an element ≤ 5
Exchange the elements and continue, until the indices meet or cross.

26 Intermediate configurations
(Diagram: lx and rx sweep inward over 5, 3, 2, 6, 4, 1, 3, 7, exchanging out-of-place elements.)
lx and rx have crossed!!

27 Intermediate configurations
L1 = 3, 3, 2, 1, 4
L2 = 6, 5, 7
Now, do the same with the lists L1 and L2.
Initial configuration for L1: 3, 3, 2, 1, 4 (lx at the left end, rx at the right end); 3 = first element.

28 Intermediate configuration for L1
3, 3, 2, 1, 4
Exchange and continue: 1, 3, 2, 3, 4
Exchange and continue: 1, 2, 3, 3, 4
The left and right indices have crossed!

29 Quicksort
We have new lists L11 = 1, 2 and L12 = 3, 3, 4, with which we continue to do the same.
Partitioning stops once we have a list with only one element.
All this, done in place, gives us the following sorted list:
L_sorted = 1, 2, 3, 3, 4, 5, 6, 7
This is Quicksort!!!

30 Partition – Formal Description
Procedure Partition(L, p, q)
  a ← L[p]
  lx ← p - 1
  rx ← q + 1
  while true
    repeat rx ← rx - 1   // move right index until L[rx] ≤ a
    repeat lx ← lx + 1   // move left index until L[lx] ≥ a
    if (lx < rx)
      exchange(L[lx], L[rx])
    else
      return rx          // indices have crossed

31 Quicksort
Procedure Quicksort(L, p, q)
  if (p < q)
    r ← Partition(L, p, q)
    Quicksort(L, p, r)
    Quicksort(L, r+1, q)
To sort the entire array, the initial call is Quicksort(L, 1, n), where n is the length of L.
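For reference, a Python rendering of the partition/quicksort pseudocode above (a sketch; 0-indexed, with both bounds inclusive as in the slides):

def partition(L, p, q):
    # Partition L[p..q] (inclusive) around a = L[p]; returns an index r such that
    # every element of L[p..r] is <= every element of L[r+1..q].
    a = L[p]
    lx, rx = p - 1, q + 1
    while True:
        rx -= 1
        while L[rx] > a:        # move right index until L[rx] <= a
            rx -= 1
        lx += 1
        while L[lx] < a:        # move left index until L[lx] >= a
            lx += 1
        if lx < rx:
            L[lx], L[rx] = L[rx], L[lx]   # exchange and continue
        else:
            return rx                     # indices have met or crossed

def quicksort(L, p, q):
    if p < q:
        r = partition(L, p, q)
        quicksort(L, p, r)
        quicksort(L, r + 1, q)

# To sort the whole list: quicksort(L, 0, len(L) - 1).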

32 Observations
The choice of the partitioning element a is important: it determines how the lists are split.
Desirable: to split the list evenly.
How?...

33 Undesirable Partitioning
(Diagram: a list of size n is split into a very small list and a list of size n-1, and this repeats at every level.)

34 Example
Such an undesirable partitioning is possible if we take the following sorted sequence: 3, 4, 5, 6, 7, 8, 9, 10, and we partition as described above.

35 Desirable Partitioning
(Diagram: each partitioning step splits the list into two halves of roughly equal size.)

36 Choosing the pivot
Can we choose the partitioning element to steer between the two extremes?
Heuristics: median-of-three.
Find the median of the first, middle and last element, or find the median of three randomly chosen elements.
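One possible way to code the median-of-three heuristic in Python (an illustrative helper, not from the slides); the chosen pivot can then be swapped to the front before partitioning:

def median_of_three_index(L, p, q):
    # Return whichever of the indices p, middle, q holds the median value.
    mid = (p + q) // 2
    candidates = sorted([p, mid, q], key=lambda idx: L[idx])
    return candidates[1]

# Usage sketch: swap L[p] with L[median_of_three_index(L, p, q)],
# then call partition(L, p, q) as before.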

37 Analysis
Worst-case behavior:
T(n) = T(n-1) + O(n)
T(n) = n + (n-1) + … + 2 = O(n²)
since to partition a list of size n-i (i ≥ 0) into two lists of size 1 and n-i-1, we need to look at all n-i elements.

38 Best-case Behavior
T(n) = 2T(n/2) + O(n), so T(n) = O(n log n), where the O(n) term is the time to partition a list of size n into two lists of size n/2 each.

39 Average-case Behavior
T_avg(n) = O(n log n)
T(n) = T(αn) + T(βn) + O(n), where α + β = 1.

40 Sorting – in Linear Time??
Yes… but… only under certain assumptions on the input data.
Linear-time sorting techniques:
Counting sort
Radix sort
Bucket sort

41 Counting Sort
Assumption: the input elements are integers in the range 1 to k, where k = O(n).
Example: Sort the list L = 3, 6, 4, 1, 3, 4, 1, 4 using counting sort.

42 Example
A (indices 1…8): 3, 6, 4, 1, 3, 4, 1, 4
C (indices 1…6): counters, initially empty
B (indices 1…8): output array, initially empty
The input is in array A.
C[i] counts the # of times i occurs in the input at first.
The sorted array is stored in B.

43 Example - Continued
Count of the # of times i occurs in A:
  i:    1  2  3  4  5  6
  C[i]: 2  0  2  3  0  1
Cumulative counters:
  i:    1  2  3  4  5  6
  C[i]: 2  2  4  7  7  8
Now C[i] contains the count of the number of elements in the input that are ≤ i.

44 Example - Continued
Go through array A. The first element is 3.
From C[3] we know that 4 elements are ≤ 3, so B[4] = 3. Decrease C[3] by 1.
  B (1…8): _, _, _, 3, _, _, _, _
  C (1…6): 2, 2, 3, 7, 7, 8

45 Example - Continued
The next element in A is 6. C[6] = 8, so eight elements are ≤ 6; B[8] = 6. Decrease C[6] by 1.
  B (1…8): _, _, _, 3, _, _, _, 6
  C (1…6): 2, 2, 3, 7, 7, 7
…and so on.

46 Example - Continued
A[3] = 4, C[4] = 7, so B[7] = 4. Decrease C[4] by 1.
  B (1…8): _, _, _, 3, _, _, 4, 6
  C (1…6): 2, 2, 3, 6, 7, 7

47 Example - Continued
A[4] = 1, C[1] = 2, so B[2] = 1. Decrease C[1] by 1.
  B (1…8): _, 1, _, 3, _, _, 4, 6
  C (1…6): 1, 2, 3, 6, 7, 7

48 Example - Continued
A[5] = 3, C[3] = 3, so B[3] = 3. Decrease C[3] by 1.
  B (1…8): _, 1, 3, 3, _, _, 4, 6
  C (1…6): 1, 2, 2, 6, 7, 7

49 Example - Continued
A[6] = 4, C[4] = 6, so B[6] = 4. Decrease C[4] by 1.
  B (1…8): _, 1, 3, 3, _, 4, 4, 6
  C (1…6): 1, 2, 2, 5, 7, 7

50 Example - Continued
A[7] = 1, C[1] = 1, so B[1] = 1. Decrease C[1] by 1.
  B (1…8): 1, 1, 3, 3, _, 4, 4, 6
  C (1…6): 0, 2, 2, 5, 7, 7

51 Example - Continued
A[8] = 4, C[4] = 5, so B[5] = 4. Decrease C[4] by 1.
  B (1…8): 1, 1, 3, 3, 4, 4, 4, 6
  C (1…6): 0, 2, 2, 4, 7, 7

52 Formal Algorithm
Procedure CountingSort(A, B, k, n)
  for i ← 1 to k
    C[i] ← 0
  for i ← 1 to n
    C[A[i]] ← C[A[i]] + 1      // C[i] now contains a counter of how often i occurs
  for i ← 2 to k
    C[i] ← C[i] + C[i-1]       // C[i] now contains the # of elements ≤ i
  for i ← n downto 1
    B[C[A[i]]] ← A[i]
    C[A[i]] ← C[A[i]] - 1
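A direct Python transcription of CountingSort (a sketch; Python lists are 0-indexed, so positions in B are shifted by one relative to the 1-indexed pseudocode):

def counting_sort(A, k):
    # A contains integers in the range 1..k; returns a new sorted list B.
    n = len(A)
    C = [0] * (k + 1)              # C[1..k]; index 0 is unused
    for x in A:
        C[x] += 1                  # C[i] = number of times i occurs
    for i in range(2, k + 1):
        C[i] += C[i - 1]           # C[i] = number of elements <= i
    B = [0] * n
    for x in reversed(A):          # scan A right to left so equal elements keep their order
        B[C[x] - 1] = x            # C[x] is a 1-based position; subtract 1 for 0-based B
        C[x] -= 1
    return B

# counting_sort([3, 6, 4, 1, 3, 4, 1, 4], 6) returns [1, 1, 3, 3, 4, 4, 4, 6].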

53 Without using a second array B…
Procedure Single_Array_CountingSort(A, k, n)
  for i ← 1 to k
    C[i] ← 0
  for i ← 1 to n
    C[A[i]] ← C[A[i]] + 1      // C[i] now contains a counter of how often i occurs
  pos ← 1
  for i ← 1 to k
    for j ← 1 to C[i]
      A[pos] ← i
      pos ← pos + 1

54 Analysis of Single-Array Counting Sort
The first and second for loops take k = O(n) steps.
Then, two nested for loops… O(n²)???
A more accurate upper bound?... Yes…
For each i, the inner for loop executes C[i] times, so the two nested for loops execute C[1] + C[2] + … + C[k] steps in total.
Theorem: C[1] + C[2] + … + C[k] = n.
Proof (sketch): The second for loop is executed n times, and each step increases one element of C by 1. q.e.d.

55 Discussion
Complexity: If the list is of size n and k = O(n), then T(n) = O(n).
Stability: A sorting method is stable if equal elements are output in the same order they had in the input.
Theorem: Counting Sort is stable.

56 Lower Bounds on Sorting
Theorem: For any comparison sort of n elements, T(n) = Ω(n log n).
Remark: T(n) = Ω(g(n)) means that T(n) grows at least as fast as g(n).
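A sketch of the standard argument behind this theorem (not spelled out on the slide): any comparison sort can be viewed as a binary decision tree in which every internal node is one comparison and every leaf is one of the n! possible output orderings. A binary tree of height h has at most 2^h leaves, so 2^h ≥ n!, and therefore h ≥ log2(n!) ≥ (n/2) log2(n/2) = Ω(n log n). Hence some input forces at least Ω(n log n) comparisons.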

