
Sorting Data Structures and Algorithms (60-254)

Sorting
Sorting is one of the most well-studied problems in Computer Science. The ultimate reference on the subject is: “The Art of Computer Programming, Vol. 3: Sorting and Searching”, by D. E. Knuth.

Formal Statement
Given a sequence of n numbers a_1, a_2, …, a_n, find a permutation π of the numbers 1, 2, …, n such that a_π(1) ≤ a_π(2) ≤ … ≤ a_π(n).
Permutation examples:
3, 2, 1 → π(1) = 3, π(2) = 2, π(3) = 1
2, 1, 3 → π(1) = 2, π(2) = 1, π(3) = 3
1, 3, 2 → π(1) = 1, π(2) = 3, π(3) = 2
… are all permutations of 1, 2, 3.

Comparison Sorts
A comparison sort sorts by comparing elements pairwise. We study these comparison sorts:
Insertion Sort
Shellsort
Mergesort
Quicksort

Insertion Sort
Sort the sequence 3, 1, 5, 4, 2:
Sort 3 → 3
Sort 3, 1 → 1, 3
Sort 1, 3, 5 → 1, 3, 5
Sort 1, 3, 5, 4 → 1, 3, 4, 5
Sort 1, 3, 4, 5, 2 → 1, 2, 3, 4, 5

Incremental Sorting
In general, at the i-th step, a_1, a_2, a_3, …, a_{i-1}, a_i are already sorted, i.e. a_π(1) ≤ a_π(2) ≤ … ≤ a_π(i) for some permutation π of 1, 2, …, i. In the next step, a_{i+1} has to be inserted in the correct position.

Analysis of Insertion Sort
What is the worst-case input? Elements in decreasing order!! Example: 5, 4, 3, 2, 1.
Step (# of comparisons):
5 (0)
5, 4 → 4, 5 (1)
4, 5, 3 → 3, 4, 5 (2)
3, 4, 5, 2 → 2, 3, 4, 5 (3)
2, 3, 4, 5, 1 → 1, 2, 3, 4, 5 (4)

Worst Case
In general, to insert a_{i+1} in its proper place, w.r.t. the sorted preceding i numbers a_1, a_2, …, a_i, we can make i comparisons in the worst case. Thus, the total number of comparisons is 1 + 2 + … + (n − 1) = n(n − 1)/2 = O(n^2).
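
As a sketch of how this incremental approach might look in code (Python is used here as an illustration; the slides themselves give no implementation, and the comparison counter is an addition to make the worst-case count visible):

```python
def insertion_sort(a):
    """Sort the list a in place; return the number of comparisons made."""
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements one slot right until key's position is found.
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons
```

On the worst-case input 5, 4, 3, 2, 1 this makes 1 + 2 + 3 + 4 = 10 comparisons, matching the per-step counts above.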

Shellsort
Due to Donald Shell. For example, Shellsort the sequence 81, 94, 11, 96, 12, 35, 17, 95 (1).
Step 1: Sort all subsequences of elements that are four positions apart:
81, 12 → 12, 81
94, 35 → 35, 94
11, 17 → 11, 17
96, 95 → 95, 96
Results in: 12, 35, 11, 95, 81, 94, 17, 96 (2)

Shellsort
Step 2: Starting from 12, 35, 11, 95, 81, 94, 17, 96 (2), sort all subsequences of (2) that are two positions apart:
12, 11, 81, 17 → 11, 12, 17, 81
35, 95, 94, 96 → 35, 94, 95, 96
Results in: 11, 35, 12, 94, 17, 95, 81, 96 (3)
Step 3: Sort all elements of (3) that are one position apart:
11, 35, 12, 94, 17, 95, 81, 96 → 11, 12, 17, 35, 81, 94, 95, 96 (4)
Sequence (4) is sorted!

Observations
h_1, h_2, h_3 = 4, 2, 1 is called a gap sequence. Different gap sequences are possible, but every one of them must end with 1.
Shell's gap sequence: h_1 = n/2, h_i = h_{i-1}/2 (down to h_k = 1).
All subsequences were sorted using insertion sort; in Step 3, we sorted the entire sequence using insertion sort!
What is the advantage over straightforward insertion sort?
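
A Python sketch of Shellsort with Shell's own gap sequence n/2, n/4, …, 1 (the function name is illustrative; for n = 8 this yields the gap sequence 4, 2, 1 used in the example above):

```python
def shellsort(a):
    """Sort the list a in place using Shell's gap sequence n/2, n/4, ..., 1."""
    n = len(a)
    gap = n // 2
    while gap >= 1:
        # Insertion-sort each subsequence of elements that are gap positions apart.
        for i in range(gap, n):
            key = a[i]
            j = i - gap
            while j >= 0 and a[j] > key:
                a[j + gap] = a[j]
                j -= gap
            a[j + gap] = key
        gap //= 2
    return a
```

After the gap-4 pass on the slides' input, the list is 12, 35, 11, 95, 81, 94, 17, 96, exactly sequence (2) above.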

Example
Insertion sort on 81, 94, 11, 96, 12, 35, 17, 95, with the number of comparisons per step:
81
81, 94 (1)
11, 81, 94 (2)
11, 81, 94, 96 (1)
11, 12, 81, 94, 96 (4)
11, 12, 35, 81, 94, 96 (4)
11, 12, 17, 35, 81, 94, 96 (5)
11, 12, 17, 35, 81, 94, 95, 96 (2)
Total # of comparisons: 19

Example
Insertion sort on 11, 35, 12, 94, 17, 95, 81, 96 (the sequence produced by the earlier Shellsort passes), with the number of comparisons per step:
11
11, 35 (1)
11, 12, 35 (2)
11, 12, 35, 94 (1)
11, 12, 17, 35, 94 (3)
11, 12, 17, 35, 94, 95 (1)
11, 12, 17, 35, 81, 94, 95 (3)
11, 12, 17, 35, 81, 94, 95, 96 (1)
Total # of comparisons: 12
The earlier passes left the sequence nearly sorted, so the final insertion sort needs only 12 comparisons instead of 19.

Analysis of Shellsort
A clever choice of gap sequence leads to a subquadratic algorithm. That is, for an n-element sequence, the number of comparisons is O(n^(3/2)) when using the Hibbard sequence: 1, 3, 7, …, 2^k − 1.

Mergesort
Sort: 81, 94, 11, 96 | 12, 35, 17, 95
Mergesort(81, 94, 11, 96)
Mergesort(12, 35, 17, 95)
Merge the two sorted lists from the above two lines.
This is a divide-and-conquer algorithm.

Divide
(Figure: the list is recursively split into halves until single-element lists remain.)

Conquer
Merge two sorted lists:
MS(81, 94, 11, 96) = 11, 81, 94, 96 (1)
MS(12, 35, 17, 95) = 12, 17, 35, 95 (2)
Compare 11 and 12; output 11; move index in list (1).
Compare 12 and 81; output 12; move index in list (2).
Compare 17 and 81; output 17; move index in list (2).

Number of Comparisons
A total of seven comparisons is needed to generate the sorted list 11, 12, 17, 35, 81, 94, 95, 96. This is the maximum! For if the lists were 81, 94, 95, 96 and 11, 12, 17, 35, we would need only four comparisons. The algorithm follows…

Procedure Mergesort(A)
n ← size of A
if (n > 1)
  A1 ← A[1 … n/2]  // create a new array A1
  A2 ← A[n/2 + 1 … n]  // create a new array A2
  Mergesort(A1)
  Mergesort(A2)
  Merge(A, A1, A2)
else
  // A has only one element – do nothing!

Procedure Merge(A, A1, A2)
n1 ← size of A1
n2 ← size of A2
i ← 1; j ← 1; k ← 1
while (i <= n1 and j <= n2)
  if (A1[i] < A2[j])
    A[k] ← A1[i]; i ← i + 1
  else
    A[k] ← A2[j]; j ← j + 1
  k ← k + 1
for m ← i to n1
  A[k] ← A1[m]; k ← k + 1
for m ← j to n2
  A[k] ← A2[m]; k ← k + 1
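
A runnable Python version of the two procedures above, as a sketch: it returns a new list rather than overwriting A, and uses 0-based indexing in place of the slides' 1-based pseudocode.

```python
def merge(a1, a2):
    """Merge two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(a1) and j < len(a2):
        if a1[i] < a2[j]:
            out.append(a1[i])
            i += 1
        else:
            out.append(a2[j])
            j += 1
    out.extend(a1[i:])  # copy whichever tail remains
    out.extend(a2[j:])
    return out

def mergesort(a):
    """Split the list in half, sort each half recursively, and merge."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))
```

Merging the two four-element lists from the example takes at most 2·4 − 1 = 7 comparisons, as the theorem below states.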

Theorem
To merge two sorted lists, each of length n, we need at most 2n − 1 comparisons.

Complexity of Mergesort
T(n) = 2 T(n/2) + O(n) for n > 1
T(n) = 1 for n = 1
Solution: T(n) = O(n log n)

A Partitioning Game
Given L = 5, 3, 2, 6, 4, 1, 3, 7, partition L into L1 and L2 such that every element in list L1 is less than or equal to every element in list L2. How?

Split
Let a = the first element of L. Make every element of list L1 less than or equal to a, and a less than or equal to every element of list L2. How? Using two indices: lx = left index, rx = right index.

Initial Configuration
5, 3, 2, 6, 4, 1, 3, 7
lx starts at the left end, rx at the right end.
Rules:
lx moves right until it meets an element ≥ 5
rx moves left until it meets an element ≤ 5
Exchange the elements and continue until the indices meet or cross.

Intermediate Configurations
5, 3, 2, 6, 4, 1, 3, 7 → exchange 5 and the last 3 → 3, 3, 2, 6, 4, 1, 5, 7 → exchange 6 and 1 → 3, 3, 2, 1, 4, 6, 5, 7
lx and rx have crossed!

Intermediate Configurations
L1 = 3, 3, 2, 1, 4
L2 = 6, 5, 7
Now, do the same with the lists L1 and L2. Initial configuration for L1: 3, 3, 2, 1, 4, with lx at the left end, rx at the right end, and 3 = first element (the new partitioning element).

Intermediate Configuration for L1
3, 3, 2, 1, 4
Exchange and continue: 1, 3, 2, 3, 4
Exchange and continue: 1, 2, 3, 3, 4
The left and right indices have crossed!

Quicksort
We have new lists L11 = 1, 2 and L12 = 3, 3, 4, with which we continue to do the same. Partitioning stops once we have a list with only one element. All this, done in place, gives us the sorted list: L_sorted = 1, 2, 3, 3, 4, 5, 6, 7. This is Quicksort!!!

Partition – Formal Description
Procedure Partition(L, p, q)
a ← L[p]
lx ← p − 1
rx ← q + 1
while true
  repeat rx ← rx − 1 until L[rx] ≤ a  // move right index left
  repeat lx ← lx + 1 until L[lx] ≥ a  // move left index right
  if (lx < rx)
    exchange(L[lx], L[rx])
  else
    return rx  // indices have crossed

Quicksort
Procedure Quicksort(L, p, q)
if (p < q)
  r ← Partition(L, p, q)
  Quicksort(L, p, r)
  Quicksort(L, r + 1, q)
To sort the entire array, the initial call is Quicksort(L, 1, n), where n is the length of L.
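
A Python sketch of this Hoare-style partition and Quicksort, translated to 0-based indexing (so the initial call becomes quicksort(L, 0, n − 1)):

```python
def partition(L, p, q):
    """Partition L[p..q] around a = L[p]; return the split index rx."""
    a = L[p]
    lx, rx = p - 1, q + 1
    while True:
        rx -= 1
        while L[rx] > a:  # move right index left until L[rx] <= a
            rx -= 1
        lx += 1
        while L[lx] < a:  # move left index right until L[lx] >= a
            lx += 1
        if lx < rx:
            L[lx], L[rx] = L[rx], L[lx]
        else:
            return rx  # indices have crossed

def quicksort(L, p, q):
    """Sort L[p..q] in place by recursive partitioning."""
    if p < q:
        r = partition(L, p, q)
        quicksort(L, p, r)
        quicksort(L, r + 1, q)
```

On L = 5, 3, 2, 6, 4, 1, 3, 7 the first partition call produces exactly the split L1 = 3, 3, 2, 1, 4 and L2 = 6, 5, 7 traced above.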

Observations
The choice of the partitioning element a is important: it determines how the lists are split. Desirable: to split the list evenly. How?

Undesirable Partitioning
A list of size n is split into a list of size 1 and a list of size n − 1 at every step.

Example
Such an undesirable partitioning is possible if we take the following sorted sequence: 3, 4, 5, 6, 7, 8, 9, 10, and we partition as described above.

Desirable Partitioning
(Figure: each list of size n is split into two lists of size n/2.)

Choosing the Pivot
Can we choose the partitioning element to steer between the two extremes? Heuristics: median-of-three. Find the median of the first, middle and last elements, or find the median of three randomly chosen elements.
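
The median-of-three heuristic might be sketched as follows (a hypothetical helper, not part of the slides' pseudocode; 0-based indices):

```python
def median_of_three(L, p, q):
    """Return the median of the first, middle and last elements of L[p..q],
    a reasonable pivot candidate for Quicksort."""
    mid = (p + q) // 2
    return sorted([L[p], L[mid], L[q]])[1]
```

On the sorted sequence 3, 4, 5, 6, 7, 8, 9, 10 this picks 6 rather than 3, giving a nearly even split instead of the undesirable 1 / (n − 1) split.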

Analysis
Worst-case behavior:
T(n) = T(n − 1) + O(n)
T(n) = n + (n − 1) + … + 2 = O(n^2)
since to partition a list of size n − i (i ≥ 0) into two lists of size 1 and n − i − 1, we need to look at all n − i elements.

Best-Case Behavior
T(n) = 2 T(n/2) + O(n)
T(n) = O(n log n)
where T(n) is the time to sort a list of size n, partitioned into two lists of size n/2 each.

Average-Case Behavior
T_avg(n) = O(n log n)
T(n) = T(αn) + T(βn) + O(n), where α + β = 1

Sorting – in Linear Time??
Yes… but only under certain assumptions on the input data. Linear-time sorting techniques:
Counting sort
Radix sort
Bucket sort

Counting Sort
Assumption: input elements are integers in the range 1 to k, where k = O(n).
Example: Sort the list L = 3, 6, 4, 1, 3, 4, 1, 4 using counting sort.

Example
Three arrays are used:
A – the input is in array A
C – C[i] counts the number of times i occurs in the input at first
B – the sorted array is stored in B

Example - Continued
Count the number of times each value i occurs in A:
C = 2, 0, 2, 3, 0, 1  (for i = 1, …, 6)
Cumulative counters:
C = 2, 2, 4, 7, 7, 8
Now C[i] contains the count of the number of elements in the input that are ≤ i.

Example - Continued
Go through array A. The first element is 3. From C[3] we know that 4 elements are ≤ 3, so B[4] = 3. Decrease C[3] by 1.

Example - Continued
The next element in A is 6. C[6] = 8, so eight elements are ≤ 6, and B[8] = 6. Decrease C[6] by 1. …and so on.

Example - Continued
A[3] = 4, C[4] = 7, B[7] = 4

Example - Continued
A[4] = 1, C[1] = 2, B[2] = 1

Example - Continued
A[5] = 3, C[3] = 3, B[3] = 3

Example - Continued
A[6] = 4, C[4] = 6, B[6] = 4

Example - Continued
A[7] = 1, C[1] = 1, B[1] = 1

Example - Continued
A[8] = 4, C[4] = 5, B[5] = 4

Formal Algorithm
Procedure CountingSort(A, B, k, n)
for i ← 1 to k
  C[i] ← 0
for i ← 1 to n
  C[A[i]] ← C[A[i]] + 1
// C[i] now contains a counter of how often i occurs
for i ← 2 to k
  C[i] ← C[i] + C[i − 1]
// C[i] now contains the # of elements ≤ i
for i ← n downto 1
  B[C[A[i]]] ← A[i]
  C[A[i]] ← C[A[i]] − 1
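
A direct Python translation of CountingSort, as a sketch: 0-based output positions replace the pseudocode's 1-based array B.

```python
def counting_sort(A, k):
    """Stably sort a list of integers in the range 1..k; return the result."""
    C = [0] * (k + 1)            # C[i] = number of times i occurs (index 0 unused)
    for x in A:
        C[x] += 1
    for i in range(2, k + 1):    # cumulative: C[i] = number of elements <= i
        C[i] += C[i - 1]
    B = [0] * len(A)
    for x in reversed(A):        # backwards, as in the pseudocode, for stability
        B[C[x] - 1] = x          # C[x] elements are <= x, so x goes to slot C[x]
        C[x] -= 1
    return B
```

On the example input 3, 6, 4, 1, 3, 4, 1, 4 with k = 6, the cumulative counters come out as 2, 2, 4, 7, 7, 8, exactly as in the walkthrough above.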

Without using a second array B…
Procedure Single_Array_CountingSort(A, k, n)
for i ← 1 to k
  C[i] ← 0
for i ← 1 to n
  C[A[i]] ← C[A[i]] + 1
// C[i] now contains a counter of how often i occurs
pos ← 1
for i ← 1 to k
  for j ← 1 to C[i]
    A[pos] ← i
    pos ← pos + 1

Analysis of Single-Array Counting Sort
The first and second for loops take k = O(n) steps. Then, two nested for loops… O(n^2)??? A more accurate upper bound?? Yes: for each i, the inner for loop executes C[i] times, so the two nested for loops execute C[1] + C[2] + … + C[k] = n steps in total.
Theorem: C[1] + C[2] + … + C[k] = n.
Proof (sketch): The second for loop is executed n times, and at each step one element of C is increased by 1. q.e.d.

Discussion
Complexity: If the list is of size n and k = O(n), then T(n) = O(n).
Stability: A sorting method is stable if equal elements are output in the same order they had in the input.
Theorem: Counting Sort is stable.
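
Stability matters when elements carry extra data. A small sketch (the helper name and record format are illustrative, not from the slides) that sorts key/value pairs by an integer key and lets equal keys keep their input order:

```python
def counting_sort_by_key(items, key, k):
    """Stable counting sort of records by an integer key in the range 1..k."""
    C = [0] * (k + 2)
    for item in items:
        C[key(item) + 1] += 1        # histogram, shifted one slot right
    for i in range(1, k + 2):
        C[i] += C[i - 1]             # C[key] = first output slot for that key
    out = [None] * len(items)
    for item in items:               # a forward pass preserves input order
        out[C[key(item)]] = item
        C[key(item)] += 1
    return out
```

Sorting (3, 'a'), (1, 'b'), (3, 'c'), (1, 'd') by the first component keeps 'a' before 'c' and 'b' before 'd', which is exactly what stability promises.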

Lower Bounds on Sorting
Theorem: For any comparison sort of n elements, T(n) = Ω(n log n).
Remark: T(n) = Ω(g(n)) means that T(n) grows at least as fast as g(n).