Presentation is loading. Please wait.

Presentation is loading. Please wait.

Sorting. 2 The Sorting Problem Input: A sequence of n numbers a 1, a 2,..., a n Output: A permutation (reordering) a 1 ’, a 2 ’,..., a n ’ of the input.

Similar presentations


Presentation on theme: "Sorting. 2 The Sorting Problem Input: A sequence of n numbers a 1, a 2,..., a n Output: A permutation (reordering) a 1 ’, a 2 ’,..., a n ’ of the input."— Presentation transcript:

1 Sorting

2 2 The Sorting Problem Input: A sequence of n numbers a 1, a 2,..., a n Output: A permutation (reordering) a 1 ’, a 2 ’,..., a n ’ of the input sequence such that a 1 ’ ≤ a 2 ’ ≤ · · · ≤ a n ’

3 3 Structure of data

4 4 Why Study Sorting Algorithms? There are a variety of situations that we can encounter Do we have randomly ordered keys? Are all keys distinct? How large is the set of keys to be ordered? Need guaranteed performance? Various algorithms are better suited to some of these situations

5 5 Some Definitions Internal Sort The data to be sorted is all stored in the computer’s main memory. External Sort Some of the data to be sorted might be stored in some external, slower, device. In Place Sort The amount of extra space required to sort the data is constant with the input size.

6 6 Stability A STABLE sort preserves relative order of records with equal keys Sorted on first key: Sort file on second key: Records with key value 3 are not in order on first key!!

7 7 Insertion Sort Idea: like sorting a hand of playing cards Start with an empty left hand and the cards facing down on the table. Remove one card at a time from the table, and insert it into the correct position in the left hand compare it with each of the cards already in the hand, from right to left The cards held in the left hand are sorted these cards were originally the top cards of the pile on the table

8 8 To insert 12, we need to make room for it by moving first 36 and then 24. Insertion Sort 6 10 24 12 36

9 9 6 10 24 Insertion Sort 36 12

10 10 Insertion Sort 6 10 24 36 12

11 11 Insertion Sort 5 2 4 6 1 3 input array left sub-array right sub-array at each iteration, the array is divided in two sub-arrays: sorted unsorted

12 12 Insertion Sort

13 13 INSERTION-SORT Alg.: INSERTION-SORT(A) for j ← 2 to n do key ← A[ j ] Insert A[ j ] into the sorted sequence A[1.. j -1] i ← j - 1 while i > 0 and A[i] > key do A[i + 1] ← A[i] i ← i – 1 A[i + 1] ← key Insertion sort – sorts the elements in place a8a8 a7a7 a6a6 a5a5 a4a4 a3a3 a2a2 a1a1 12345678 key

14 14 INSERTION-SORT void insertion_sort ( int n, int a[n] ) { int i,j,e; for(j=1; j<n; j++) { e = a[j]; i = j-1; while ((i >= 0) && (e < a[i])){ a[i+1] = a[i]; i--; } a[i+1] = e; } a8a8 a7a7 a6a6 a5a5 a4a4 a3a3 a2a2 a1a1 12345678 key

15 15 Insertion Sort - Summary Advantages Good running time for “almost sorted” arrays  (n) Disadvantages  (n 2 ) running time in worst and average case  n 2 /2 comparisons and exchanges

16 16 Bubble Sort Idea: Repeatedly pass through the array Swaps adjacent elements that are out of order Easier to implement, but slower than Insertion sort 123n i 1329648 j

17 17 Example 1329648 i = 1j 3129648 j 3219648 j 3291648 j 3296148 j 3296418 j 3296481 j 3296481 i = 2j 3964821 i = 3j 9648321 i = 4j 9684321 i = 5j 9864321 i = 6j 9864321 i = 7 j

18 18 Bubble Sort Alg.: BUBBLESORT(A) for i  1 to length[A] do for j  length[A] downto i + 1 do if A[j] < A[j -1] then exchange A[j]  A[j-1] 1329648 i = 1j i

19 19 Bubble Sort void bubble_sort ( int n, int a[n]) { int switched = 1; int i,j,tmp; for(i = 0; ((i < n-1) && (switched == 1)); i++) { switched = 0; for(j=n-1; j > i; j--) { 1329648 i = 1j i if(a[j]< a[j-1]) { switched = 1; tmp = a[j]; a[j] = a[j-1]; a[j-1] = tmp; }

20 20 Selection Sort Idea: Find the smallest element in the array Exchange it with the element in the first position Find the second smallest element and exchange it with the element in the second position Continue until the array is sorted Disadvantage: Running time depends only slightly on the amount of order in the file

21 21 Example 1329648 8329641 8349621 8649321 8964321 8694321 9864321 9864321

22 22 Selection Sort Alg.: SELECTION-SORT(A) n ← length[A] for j ← 1 to n - 1 do smallest ← j for i ← j + 1 to n do if A[i] < A[smallest] then smallest ← i exchange A[j] ↔ A[smallest] 1329648

23 23 Selection Sort void selection_sort ( int n, int a[n] ) { int i,j,smallest,tmp; for(j=0; j<n-1; j++) { smallest = j; for(i=j+1; i <n; i++) if (a[i] < a[smallest]) smallest = i; 1329648 if(smallest != j) { tmp = a[j]; a[j] = a[smallest]; a[smallest] = tmp; }

24 24 Sorting Insertion sort Design approach: Sorts in place: Best case: Worst case: Bubble Sort Design approach: Sorts in place: Running time: Yes  (n)  (n 2 ) incremental Yes  (n 2 ) incremental

25 25 Sorting Selection sort Design approach: Sorts in place: Running time: Merge Sort Design approach: Sorts in place: Running time: Yes  (n 2 ) incremental No Let’s see!! divide and conquer

26 26 Divide-and-Conquer Divide the problem into a number of sub-problems Similar sub-problems of smaller size Conquer the sub-problems Solve the sub-problems recursively Sub-problem size small enough  solve the problems in straightforward manner Combine the solutions of the sub-problems Obtain the solution for the original problem

27 27 Merge Sort Approach To sort an array A[p.. r]: Divide Divide the n-element sequence to be sorted into two subsequences of n/2 elements each Conquer Sort the subsequences recursively using merge sort When the size of the sequences is 1 there is nothing more to do Combine Merge the two sorted subsequences

28 28 Merge Sort Alg.: MERGE-SORT (A, p, r) if p < r Check for base case then q ←  (p + r)/2  Divide MERGE-SORT (A, p, q) Conquer MERGE-SORT (A, q + 1, r) Conquer MERGE (A, p, q, r) Combine Initial call: MERGE-SORT (A, 1, n) 12345678 6231742 5 p r q

29 29 Example – n Power of 2 12345678 q = 4 6231742 5 1234 742 5 5678 623 1 12 2 5 34 74 56 3 1 78 62 1 5 2 2 3 4 4 7 1 6 3 7 2 8 6 5 Divide

30 30 Example – n Power of 2 1 5 2 2 3 4 4 7 1 6 3 7 2 8 6 5 12345678 7654322 1 1234 754 2 5678 632 1 12 5 2 34 74 56 3 1 78 62 Conquer and Merge

31 31 Example – n Not a Power of 2 62537416274 1234567891011 q = 6 416274 123456 62537 7891011 q = 9 q = 3 274 123 416 456 537 789 62 1011 74 12 2 3 16 45 4 6 37 78 5 9 2 10 6 11 4 1 7 2 6 4 1 5 7 7 3 8 Divide

32 32 Example – n Not a Power of 2 77665443221 1234567891011 764421 123456 76532 7891011 742 123 641 456 753 789 62 1011 2 3 4 6 5 9 2 10 6 11 4 1 7 2 6 4 1 5 7 7 3 8 74 12 61 45 73 78 Conquer and Merge

33 33 Merging Input: Array A and indices p, q, r such that p ≤ q < r Subarrays A[p.. q] and A[q + 1.. r] are sorted Output: One single sorted subarray A[p.. r] 12345678 6321754 2 p r q

34 34 Merging Idea for merging: Two piles of sorted cards Choose the smaller of the two top cards Remove it and place it in the output pile Repeat the process until one pile is empty Take the remaining input pile and place it face-down onto the output pile 12345678 6321754 2 p r q A1  A[p, q] A2  A[q+1, r] A[p, r]

35 35 Example: MERGE(A, 9, 12, 16) prq

36 36 Example: MERGE(A, 9, 12, 16)

37 37 Example (cont.)

38 38 Example (cont.)

39 39 Example (cont.) Done!

40 40 Merge - Pseudocode Alg.: MERGE(A, p, q, r) 1. Compute n 1 and n 2 2. Copy the first n 1 elements into L[1.. n 1 + 1] and the next n 2 elements into R[1.. n 2 + 1] 3. L[n 1 + 1] ←  ; R[n 2 + 1] ←  4. i ← 1; j ← 1 5. for k ← p to r 6. do if L[ i ] ≤ R[ j ] 7. then A[k] ← L[ i ] 8. i ← i + 1 9. else A[k] ← R[ j ] 10. j ← j + 1 pq 7542 6321 rq + 1 L R   12345678 6321754 2 p r q n1n1 n2n2

41 41 MergeSort void merge_sort ( int a[], int p, int r ) { if (p >= r) return; else if (p == r-1) { if (a[p] > a[r]){ int tmp = a[r]; a[r] = a[p]; a[p] = tmp; } else{ int q = (p+r)/2; merge_sort(a,p,q) ; merge_sort(a,q+1,r); { /* merge sorted halves */ int i,j,k,aux[r-p+1]; for(i=p,j=q+1,k=0; ((i<=q) && (j<=r)); k++) aux[k] = ( a[i] < a[j] ? a[i++] : a[j++] ); while (i<=q) aux[k++] = a[i++]; while (j<=r) aux[k++] = a[j++]; for(i=p,k=0; i<=r; i++,k++) a[i] = aux[k]; }

42 42 Running Time of Merge (assume last for loop) Initialization (copying into temporary arrays):  (n 1 + n 2 ) =  (n) Adding the elements to the final array: - n iterations, each taking constant time   (n) Total time for Merge:  (n)

43 43 Analyzing Divide-and Conquer Algorithms The recurrence is based on the three steps of the paradigm: T(n) – running time on a problem of size n Divide the problem into a subproblems, each of size n/b: takes D(n) Conquer (solve) the subproblems aT(n/b) Combine the solutions C(n)  (1) if n ≤ c T(n) = aT(n/b) + D(n) + C(n) otherwise

44 44 MERGE-SORT Running Time Divide: compute q as the average of p and r: D(n) =  (1) Conquer: recursively solve 2 subproblems, each of size n/2  2T (n/2) Combine: MERGE on an n -element subarray takes  (n) time  C(n) =  (n)  (1) if n =1 T(n) = 2T(n/2) +  (n) if n > 1

45 45 Merge Sort - Discussion Running time insensitive of the input Advantages: Guaranteed to run in  (nlgn) Disadvantage Requires extra space  N

46 46 Sorting Challenge 1 Problem: Sort a file of huge records with tiny keys Example application: Reorganize your MP-3 files Which method to use? A. merge sort, guaranteed to run in time  NlgN B. selection sort C. bubble sort D. a custom algorithm for huge records/tiny keys E. insertion sort

47 47 Sorting Files with Huge Records and Small Keys Insertion sort or bubble sort? NO, too many exchanges Selection sort? YES, it takes linear time for exchanges Merge sort or custom method? Probably not: selection sort simpler, does less swaps

48 48 Sorting Challenge 2 Problem: Sort a huge randomly-ordered file of small records Application: Process transaction record for a phone company Which sorting method to use? A. Bubble sort B. Selection sort C. Mergesort guaranteed to run in time  NlgN D. Insertion sort

49 49 Sorting Huge, Randomly - Ordered Files Selection sort? NO, always takes quadratic time Bubble sort? NO, quadratic time for randomly-ordered keys Insertion sort? NO, quadratic time for randomly-ordered keys Mergesort? YES, it is designed for this problem

50 50 Sorting Challenge 3 Problem: sort a file that is already almost in order Applications: Re-sort a huge database after a few changes Doublecheck that someone else sorted a file Which sorting method to use? A. Mergesort, guaranteed to run in time  NlgN B. Selection sort C. Bubble sort D. A custom algorithm for almost in-order files E. Insertion sort

51 51 Sorting Files That are Almost in Order Selection sort? NO, always takes quadratic time Bubble sort? NO, bad for some definitions of “almost in order” Ex: B C D E F G H I J K L M N O P Q R S T U V W X Y Z A Insertion sort? YES, takes linear time for most definitions of “almost in order” Mergesort or custom method? Probably not: insertion sort simpler and faster

52 52 Quicksort Sort an array A[p…r] Divide Partition the array A into 2 subarrays A[p..q] and A[q+1..r], such that each element of A[p..q] is smaller than or equal to each element in A[q+1..r] Need to find index q to partition the array ≤ A[p…q]A[q+1…r]

53 53 Quicksort Conquer Recursively sort A[p..q] and A[q+1..r] using Quicksort Combine Trivial: the arrays are sorted in place No additional work is required to combine them The entire array is now sorted A[p…q]A[q+1…r] ≤

54 54 QUICKSORT Alg.: QUICKSORT (A, p, r) if p < r then q  PARTITION (A, p, r) QUICKSORT (A, p, q) QUICKSORT (A, q+1, r) Recurrence: Initially: p=1, r=n PARTITION()) T(n) = T(q) + T(n – q) + f(n)

55 55 QUICKSORT void quick_sort ( int a[], int p, int r) { if (p < r) { int i; i = partition(a, p,r); quick_sort(a, p, i-1); quick_sort(a, i+1,r); }

56 56 Partitioning the Array Choosing PARTITION() There are different ways to do this Each has its own advantages/disadvantages Hoare partition Select a pivot element x around which to partition Grows two regions A[p…i]  x x  A[j…r] A[p…i]  xx  A[j…r] i j

57 57 Example 73146235 ij 75146233 ij 75146233 ij 75641233 ij 73146235 ij A[p…r] 75641233 ij A[p…q]A[q+1…r] pivot x=5

58 58 Example

59 59 Partitioning the Array Alg. PARTITION (A, p, r) 1. x  A[p] 2. i  p – 1 3. j  r + 1 4. while TRUE 5. do repeat j  j – 1 6. until A[j] ≤ x 7. do repeat i  i + 1 8. until A[i] ≥ x 9. if i < j 10. then exchange A[i]  A[j] 11. else return j Running time:  (n) n = r – p + 1 73146235 i j A: arar apap ij=q A: A[p…q]A[q+1…r] ≤ p r Each element is visited once!

60 60 Partitioning the Array int partition ( int a[], int p, int r) { int x,tmp,i,j; x = a[p]; j = r+1; i = p-1; while (i < j){ do i++; while ((a[i] <= x) && (i < r)); do j--; while (a[j] > x); if (i < j) { tmp = a[i]; a[i] = a[j]; a[j] = tmp; } } /* end of while : i == j */ a[p] = a[j]; a[j] = x; return(j); }

61 61 Recurrence Alg.: QUICKSORT (A, p, r) if p < r then q  PARTITION (A, p, r) QUICKSORT (A, p, q) QUICKSORT (A, q+1, r) Recurrence: Initially: p=1, r=n T(n) = T(q) + T(n – q) + n

62 62 Worst Case Partitioning Worst-case partitioning One region has one element and the other has n – 1 elements Maximally unbalanced Recurrence: q=1 T(n) = T(1) + T(n – 1) + n, T(1) =  (1) T(n) = T(n – 1) + n = n n - 1 n - 2 n - 3 2 1 1 1 1 1 1 n n n n - 1 n - 2 3 2  (n 2 )

63 63 Best Case Partitioning Best-case partitioning Partitioning produces two regions of size n/2 Recurrence: q=n/2 T(n) = 2T(n/2) +  (n)=  (nlgn)


Download ppt "Sorting. 2 The Sorting Problem Input: A sequence of n numbers a 1, a 2,..., a n Output: A permutation (reordering) a 1 ’, a 2 ’,..., a n ’ of the input."

Similar presentations


Ads by Google