Sorting and Asymptotic Complexity

Lecture 14, CS2110 – Spring 2013

InsertionSort

Many people sort cards this way. Invariant of the main loop: a[0..i-1] is sorted.

    // Sort a[], an array of int
    for (int i = 1; i < a.length; i++) {
        // Push a[i] down to its sorted position in a[0..i]
        int temp = a[i];
        int k;
        for (k = i; 0 < k && temp < a[k-1]; k--)
            a[k] = a[k-1];
        a[k] = temp;
    }

Worst-case: O(n²) (reverse-sorted input)
Best-case: O(n) (sorted input)
Expected-case: O(n²); the expected number of inversions is n(n-1)/4

Works especially well when the input is nearly sorted.

SelectionSort

Another common way for people to sort cards.

    // Sort a[], an array of int
    for (int i = 0; i < a.length; i++) {
        int m = index of the minimum of a[i..];
        Swap a[i] and a[m];
    }

Runtime: worst-case O(n²), best-case O(n²), expected-case O(n²).

Invariant of the main loop: a[0..i-1] is sorted, and everything in it is <= everything in a[i..]:

    a:  sorted, smaller values | larger values
        0                        i             length

Each iteration swaps the minimum value of the unsorted section into a[i]. Finding the index of the minimum is O(size of the section). Putting everything at a high level (index of minimum, swap) lets one understand the algorithm; a runnable version appears below.
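
A minimal runnable version of the pseudocode above, assuming int data (the method name selectionSort is ours, not from the slides):

    /** Sort a[] in place using selection sort. O(n²) comparisons. */
    public static void selectionSort(int[] a) {
        for (int i = 0; i < a.length; i++) {
            // Find the index m of the minimum of a[i..]
            int m = i;
            for (int k = i + 1; k < a.length; k++) {
                if (a[k] < a[m]) m = k;
            }
            // Swap a[i] and a[m], extending the sorted section by one
            int temp = a[i]; a[i] = a[m]; a[m] = temp;
        }
    }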

Divide & Conquer? It often pays to Break the problem into smaller subproblems, Solve the subproblems separately, and then Assemble a final solution This technique is called divide-and-conquer Caveat: It won’t help unless the partitioning and assembly processes are inexpensive Can we apply this approach to sorting?

MergeSort

The quintessential divide-and-conquer algorithm: divide the array into equal parts, sort each part, then merge.

Q1: How do we divide the array into two equal parts? A1: Find the middle index: a.length/2.
Q2: How do we sort the parts? A2: Call MergeSort recursively!
Q3: How do we merge the sorted subarrays? A3: Write some (easy) code.

Merging Sorted Arrays A and B into C

Example: A = {4, 7, 7, 8, 9} and B = {1, 3, 4, 6, 8}. After copying {4, 7} from A and {1, 3, 4, 6} from B, the merged array C contains 1 3 4 4 6 7.

Invariant: A[0..i-1] and B[0..j-1] have been copied into C[0..k-1], and C[0..k-1] is sorted. Next, put A[i] into C[k], because A[i] < B[j]; then increase i and k.

Merging Sorted Arrays A and B into C

- Create array C of size: size of A + size of B.
- i = 0; j = 0; k = 0; // initially, nothing copied
- Copy the smaller of A[i] and B[j] into C[k]; increment k and whichever of i and j was used.
- When either A or B becomes empty, copy the remaining elements from the other array (B or A, respectively) into C.

The invariant says what has been done so far: A[0..i-1] and B[0..j-1] have been placed in C[0..k-1], and C[0..k-1] is sorted. A code sketch follows.
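
A minimal Java sketch of the merge step just described (the method name merge is ours, not from the slides). Using <= on equal elements keeps the merge stable:

    /** Return a new sorted array containing all elements of sorted arrays a and b. */
    public static int[] merge(int[] a, int[] b) {
        int[] c = new int[a.length + b.length];
        int i = 0, j = 0, k = 0; // initially, nothing copied
        // Invariant: a[0..i-1] and b[0..j-1] are in c[0..k-1], which is sorted
        while (i < a.length && j < b.length) {
            if (a[i] <= b[j]) c[k++] = a[i++]; // copy the smaller value
            else              c[k++] = b[j++];
        }
        while (i < a.length) c[k++] = a[i++]; // b empty: copy the rest of a
        while (j < b.length) c[k++] = b[j++]; // a empty: copy the rest of b
        return c;
    }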

MergeSort Analysis

Outline (code on the course website):
- Split the array into two halves
- Recursively sort each half
- Merge the two halves

Merge: combine two sorted arrays into one sorted array. Rule: always choose the smallest item. Time: O(n), where n is the total size of the two arrays.

Runtime recurrence, where T(n) is the time to sort an array of size n:
T(1) = 1
T(n) = 2T(n/2) + O(n)

One can show by induction that T(n) is O(n log n); alternatively, one can see it by looking at the tree of recursive calls, unrolled below.
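
A sketch of the recursion-tree argument, assuming the merge at size n costs cn for some constant c:

\[
T(n) = 2T(n/2) + cn = 4T(n/4) + 2cn = \dots = 2^{d}\,T(n/2^{d}) + d\,cn,
\]

so with \(2^{d} = n\), that is, after \(d = \log_2 n\) levels of halving,

\[
T(n) = n\,T(1) + cn\log_2 n = O(n \log n).
\]

Each of the log₂ n levels of the call tree does O(n) total merge work.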

MergeSort Notes

Asymptotic complexity: O(n log n), much faster than O(n²).

Disadvantage: MergeSort needs extra storage for temporary arrays. In practice this can be a real drawback, even though MergeSort is asymptotically optimal for sorting. MergeSort can be done in place, but it is very tricky (and slows execution significantly).

Are there good sorting algorithms that do not use so much extra storage? Yes: QuickSort.

QuickSort Idea

To sort b[h..k], which has an arbitrary value x in b[h], first swap array values around until b[h..k] looks like this:

    before:  b:  x |      ?
                 h   h+1       k

    after:   b:  <= x | x | >= x
                 h      j        k

x is called the pivot. Then sort b[h..j-1] and b[j+1..k]. How do you do that? Recursively!

QuickSort Example

    20 31 24 19 45 56  4 65  5 72 14 99          pivot is b[h] = 20

After partitioning (j marks the pivot; the two segments are not yet sorted):

    19  4  5 14 | 20 | 31 24 45 56 65 72 99

Recursively sorting the two segments gives:

     4  5 14 19 | 20 | 24 31 45 56 65 72 99      sorted

QuickSort is an in-place algorithm: partitioning rearranges the elements within the array itself.

In-Place Partitioning

On the previous slide we just moved the items to partition them, but done literally that would require an extra array to copy them into. The developer of QuickSort came up with a better idea: in-place partitioning cleverly splits the data within the array.

In-Place Partitioning: Key Issues

- How do we choose a pivot?
- How do we partition the array in place?

Partitioning in place takes O(n) time (next slide) and requires no extra space.

Choosing a pivot: the ideal pivot is the median, since it splits the array in half, but computing the median of an unsorted array is O(n) and quite complicated. Popular heuristics:
- use the first array value (not good)
- use the middle array value
- use the median of the first, middle, and last values (good! see the sketch below)
- choose a random element
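
A small sketch of the median-of-three heuristic (the method name medianOfThree is ours, not from the slides). It returns the index of the median of b[h], b[m], b[k]; the caller can then swap that element into b[h] before partitioning:

    /** Return the index (h, m, or k) of the median of b[h], b[m], b[k]. */
    public static int medianOfThree(int[] b, int h, int m, int k) {
        int a = b[h], c = b[m], d = b[k];
        if (a < c) {
            if (c < d) return m;       // a < c < d: median is b[m]
            return (a < d) ? k : h;    // d <= c: median is the larger of a, d
        }
        if (a < d) return h;           // c <= a < d: median is b[h]
        return (c < d) ? k : m;        // d <= a: median is the larger of c, d
    }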

In-Place Partitioning

Change b[h..k] from this (the precondition):

    b:  x |      ?
        h   h+1       k

to this (the postcondition), by repeatedly swapping array elements:

    b:  <= x | x | >= x
        h      j        k

Do it one swap at a time, keeping the array looking like the invariant below, which is created by combining (generalizing) the pre- and postcondition diagrams. At each step, swap b[j+1] with something:

    b:  <= x | x |  ?  | >= x
        h      j       t        k

Start with: j = h; t = k.

In-Place Partitioning

    b:  <= x | x |  ?  | >= x
        h      j       t        k

Initially, with j = h and t = k, this diagram looks like the precondition diagram.

    j = h; t = k;
    while (j < t) {
        if (b[j+1] <= x) {
            Swap b[j+1] and b[j]; j = j+1;
        } else {
            Swap b[j+1] and b[t]; t = t-1;
        }
    }

The loop terminates when j = t, so the "?" segment is empty and the diagram looks like the postcondition diagram.

Given this simple invariant, it is easy to see how to develop the algorithm: initialization must make b[j+1..t] the whole unknown segment; stop when that segment is empty; to make progress, move b[j+1] into either the leftmost or the rightmost segment. This version is O(n); it can be made more efficient, but the first task is to show, as simply as possible, one that works.
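
A minimal runnable version of this loop (the method name partition and the helper swap are ours, not from the slides):

    /** Partition b[h..k] around the pivot x = b[h]. Return the final
     *  position j of the pivot, with b[h..j-1] <= b[j] <= b[j+1..k]. */
    public static int partition(int[] b, int h, int k) {
        int x = b[h]; // the pivot
        int j = h, t = k;
        // Invariant: b[h..j-1] <= x, b[j] = x, b[j+1..t] unknown, b[t+1..k] >= x
        while (j < t) {
            if (b[j+1] <= x) {
                swap(b, j+1, j); j = j+1;   // grow the "<= x" segment
            } else {
                swap(b, j+1, t); t = t-1;   // grow the ">= x" segment
            }
        }
        return j; // j = t: the "?" segment is empty
    }

    private static void swap(int[] b, int i, int j) {
        int temp = b[i]; b[i] = b[j]; b[j] = temp;
    }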

In-Place Partitioning, Pictorially

How can we move all the blues to the left of all the reds?
- Keep two indices, LEFT and RIGHT.
- Initialize LEFT at the start of the array and RIGHT at the end.
- Invariant: all elements to the left of LEFT are blue, and all elements to the right of RIGHT are red.
- Keep advancing the indices until they pass, maintaining the invariant.

When neither LEFT nor RIGHT can advance while maintaining the invariant, swap the red and blue elements pointed to by LEFT and RIGHT. After the swap, the indices can continue to advance until the next conflict.

Once the indices cross, partitioning is done. If you replace blue with "<= p" and red with ">= p", this is exactly what QuickSort's partitioning needs. Notice that after partitioning, the array is partially sorted: recursive calls on the partitioned subarrays will sort them, and there is no need to copy or move arrays, since we partitioned in place. The next slide shows the whole algorithm, with partitioning as a function call.

QuickSort Procedure

    /** Sort b[h..k]. */
    public static void QS(int[] b, int h, int k) {
        if (b[h..k] has < 2 elements) return;  // base case
        // partition does the partition algorithm and
        // returns the position j of the pivot
        int j = partition(b, h, k);
        // We know b[h..j-1] <= b[j] <= b[j+1..k],
        // so we need to sort b[h..j-1] and b[j+1..k]
        QS(b, h, j-1);
        QS(b, j+1, k);
    }

QuickSort versus MergeSort

    /** Sort b[h..k] */
    public static void QS(int[] b, int h, int k) {
        if (k - h < 1) return;
        int j = partition(b, h, k);
        QS(b, h, j-1);
        QS(b, j+1, k);
    }

    /** Sort b[h..k] */
    public static void MS(int[] b, int h, int k) {
        if (k - h < 1) return;
        MS(b, h, (h+k)/2);
        MS(b, (h+k)/2 + 1, k);
        merge(b, h, (h+k)/2, k);
    }

One processes the array and then recurses; the other recurses and then processes the array.

QuickSort Analysis

Runtime analysis (worst case): partition can repeatedly produce a split in which the pivot p ends up at one edge, with everything else (all > p) on the other side. The runtime recurrence is then T(n) = T(n-1) + n, which can be solved to show that worst-case T(n) is O(n²). Space can also be O(n), the maximum depth of recursion; this can be fixed by sorting the shorter of the two segments recursively and the longer one iteratively, as sketched below.

Runtime analysis (expected case): a more complex recurrence, which can be solved to show that expected T(n) is O(n log n).

Improve the constant factor by avoiding QuickSort on small sets: use InsertionSort (for example) for sets of size, say, <= 9. The definition of "small" depends on the language, machine, etc.
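
A sketch of the O(log n)-stack-depth variant (our code, reusing the partition method sketched earlier): recurse only on the smaller segment and handle the larger one in the loop, so each recursive call at most halves the problem size:

    /** Sort b[h..k] with O(log n) worst-case recursion depth. */
    public static void QS(int[] b, int h, int k) {
        while (k - h >= 1) {             // at least 2 elements remain
            int j = partition(b, h, k);
            if (j - h < k - j) {         // left segment b[h..j-1] is smaller
                QS(b, h, j-1);           // recurse on the smaller segment
                h = j + 1;               // continue the loop on the larger one
            } else {
                QS(b, j+1, k);
                k = j - 1;
            }
        }
    }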

Sorting Algorithm Summary

We discussed: InsertionSort, SelectionSort, MergeSort, QuickSort.
Other sorting algorithms: HeapSort (will revisit), ShellSort (in text), BubbleSort (nice name), RadixSort, BinSort, CountingSort.

Why so many? Do computer scientists have some kind of sorting fetish or what?
- Stable sorts: Ins, Sel, Mer
- Worst-case O(n log n): Mer, Hea
- Expected O(n log n): Mer, Hea, Qui
- Best for nearly-sorted sets: Ins
- No extra space: Ins, Sel, Hea
- Fastest in practice: Qui
- Least data movement: Sel

Lower Bound for Comparison Sorting

Goal: determine the minimum time required to sort n items. Note that we want worst-case, not best-case, time; best-case doesn't tell us much (e.g., InsertionSort takes O(n) time on already-sorted input). We want to know the worst-case time for the best possible algorithm. How can we prove anything about the best possible algorithm? We look for characteristics common to all sorting algorithms: limit attention to comparison-based algorithms and count the number of comparisons.

Comparison Trees

Comparison-based algorithms make decisions based on comparisons of data elements; each decision is a node of the form "a[i] < a[j]? yes/no". This gives a comparison tree. If the algorithm fails to terminate for some input, the comparison tree is infinite. The height of the comparison tree is the worst-case number of comparisons for that algorithm. One can show: any correct comparison-based sorting algorithm must make at least n log n comparisons in the worst case.

Lower Bound for Comparison Sorting

Say we have a correct comparison-based algorithm, and suppose we want to sort the elements of an array b[]; assume the elements of b[] are distinct. Any permutation of the elements is initially possible; when the algorithm is done, b[] is sorted. But the algorithm could not have taken the same path in the comparison tree on two different input permutations: the same path means the same sequence of element moves, which cannot sort both permutations.

Lower Bound for Comparison Sorting

How many input permutations are possible? n!, which is roughly 2^(n log n), since log₂(n!) is about n log₂ n by Stirling's approximation. For a comparison-based sorting algorithm to be correct, it must have at least that many leaves in its comparison tree. To have at least n! ≈ 2^(n log n) leaves, the tree must have height at least n log n, since the branching is only binary: the number of nodes can at most double at each depth. Therefore the longest path has length at least n log n, and that is the algorithm's worst-case running time.
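
The counting argument in symbols, for a binary comparison tree of height h:

\[
2^{h} \;\ge\; \#\text{leaves} \;\ge\; n! \quad\Longrightarrow\quad h \;\ge\; \log_2(n!) \;=\; \Theta(n \log n),
\]

where the last step uses Stirling's approximation, \(\log_2(n!) = n\log_2 n - \Theta(n)\).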

java.lang.Comparable<T> Interface

    public int compareTo(T x);

Returns a negative, zero, or positive value:
- negative if this comes before x
- 0 if this.equals(x)
- positive if this comes after x

Many classes implement Comparable: String, Double, Integer, Character, Date, ... If a class implements Comparable, its method compareTo is considered to define that class's natural ordering. Comparison-based sorting methods should work with Comparable for maximum generality. An example appears below.
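
A small sketch of a class with a natural ordering (the class Pair is ours, not from the slides); Arrays.sort uses compareTo:

    import java.util.Arrays;

    /** A point, ordered by x-coordinate, with ties broken by y. */
    public class Pair implements Comparable<Pair> {
        public final int x, y;

        public Pair(int x, int y) { this.x = x; this.y = y; }

        /** Return negative, zero, or positive as this comes before,
         *  equals, or comes after p in the natural ordering. */
        @Override public int compareTo(Pair p) {
            if (x != p.x) return Integer.compare(x, p.x);
            return Integer.compare(y, p.y);
        }

        public static void main(String[] args) {
            Pair[] a = { new Pair(2, 1), new Pair(1, 3), new Pair(1, 2) };
            Arrays.sort(a); // natural ordering: (1, 2), (1, 3), (2, 1)
            for (Pair p : a) System.out.println("(" + p.x + ", " + p.y + ")");
        }
    }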