Sorting and Asymptotic Complexity


Sorting and Asymptotic Complexity
Lecture 12, CS2110 – Spring 2014

The file searchSortAlgorithms.zip on the course website (lecture notes for lectures 12 and 13) contains all of the searching/sorting algorithms. Download it and look at the algorithms.

Execution of logarithmic-space Quicksort

/** Sort b[h..k]. */
public static void QS(int[] b, int h, int k) {
    int h1= h; int k1= k;
    // inv: b[h..k] is sorted if b[h1..k1] is
    while (h1 < k1) {  // while size of b[h1..k1] > 1
        int j= partition(b, h1, k1);
        // b[h1..j-1] <= b[j] <= b[j+1..k1]
        if (j - h1 < k1 - j) {  // b[h1..j-1] is smaller than b[j+1..k1]
            QS(b, h1, j-1);  h1= j+1;
        } else {
            QS(b, j+1, k1);  k1= j-1;
        }
    }
}

The last lecture ended by presenting this algorithm; there was no time to explain it. We now show how it is executed, in order to illustrate how the invariant is maintained. (A sketch of partition follows this slide.)
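
The deck specifies partition only by its postcondition b[h1..j-1] <= b[j] <= b[j+1..k1]. Here is a minimal sketch consistent with that specification and with the trace on the next slides (pivot taken at b[h]); it is our assumption, not necessarily the lecture's actual code:

/** Partition b[h..k] around the pivot b[h] and return j such that
  * b[h..j-1] <= b[j] <= b[j+1..k]. A sketch (assumed, not the
  * lecture's code); uses the simple pivot-at-front scheme. */
public static int partition(int[] b, int h, int k) {
    int pivot= b[h];
    int j= h;  // inv: b[h+1..j] <= pivot and b[j+1..i-1] > pivot
    for (int i= h+1; i <= k; i++) {
        if (b[i] <= pivot) {
            j= j+1;
            int t= b[i];  b[i]= b[j];  b[j]= t;  // swap b[i] and b[j]
        }
    }
    int t= b[h];  b[h]= b[j];  b[j]= t;  // put the pivot in position j
    return j;
}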

Call QS(b, 0, 11);

Initially, h is 0 and k is 11. The initialization stores 0 and 11 in h1 and k1. The invariant is true since h = h1 and k = k1.

index:  0  1  2  3  4  5  6  7  8  9 10 11
b:      3  4  8  7  6  8  9  1  2  5  7  9
h = h1 = 0,  k = k1 = 11,  j is not yet defined

Call QS(b, 0, 11);

The assignment to j partitions b, making it look like what is below. The two partitions are b[0..1] and b[3..11], on either side of the pivot b[2].

index:  0  1  2  3  4  5  6  7  8  9 10 11
b:      2  1  3  7  6  8  9  4  8  5  7  9
h = h1 = 0,  k = k1 = 11,  j = 2

Call QS(b, 0, 11);

The left partition b[0..1] is the smaller one, so it is sorted recursively by the call QS(b, h1, j-1). The picture shows the result.

index:  0  1  2  3  4  5  6  7  8  9 10 11
b:      1  2  3  7  6  8  9  4  8  5  7  9
h = h1 = 0,  k = k1 = 11,  j = 2

Call QS(b, 0, 11);

The assignment h1= j+1 is done, so h1 is now 3. Do you see that the invariant is true again? If the partition b[h1..k1] = b[3..11] is sorted, then so is b[h..k]. Each iteration of the loop keeps the invariant true and reduces the size of b[h1..k1].

index:  0  1  2  3  4  5  6  7  8  9 10 11
b:      1  2  3  7  6  8  9  4  8  5  7  9
h = 0,  h1 = 3,  k = k1 = 11,  j = 2

Divide & Conquer!

It often pays to:
1. Break the problem into smaller subproblems,
2. Solve the subproblems separately, and then
3. Assemble a final solution.

This technique is called divide-and-conquer. Caveat: it won't help unless the partitioning and assembly processes are inexpensive. We did this in Quicksort: partition the array and then sort the two partitions.

MergeSort

The quintessential divide-and-conquer algorithm: divide the array into two equal parts, sort each part (recursively), then merge.

Q1: How do we divide the array into two equal parts?
A1: Find the middle index: b.length/2.

Q2: How do we sort the parts?
A2: Call MergeSort recursively!

Q3: How do we merge the sorted subarrays?
A3: With a linear-time merge, shown on the next slides.

Merging Sorted Arrays A and B into C

The picture shows the situation after copying {4, 7} from A and {1, 3, 4, 6} from B into C:

A:  4  7 | 7  8  9               i = 2
B:  1  3  4  6 | 8               j = 4
C:  1  3  4  4  6  7 | _ _ _ _   k = 6   (C: merged array)

A[0..i-1] and B[0..j-1] have been copied into C[0..k-1], and C[0..k-1] is sorted; these two sentences form the invariant. Next, put A[i] in C[k], because A[i] < B[j]; then increase k and i.

Merging Sorted Arrays A and B into C

1. Create array C of size: (size of A) + (size of B).
2. i= 0; j= 0; k= 0; // initially, nothing has been copied
3. Repeatedly copy the smaller of A[i] and B[j] into C[k], incrementing k and whichever of i and j was used.
4. When either A or B becomes empty, copy the remaining elements from the other array (B or A, respectively) into C.

The invariant tells what has been done so far: A[0..i-1] and B[0..j-1] have been placed in C[0..k-1], and C[0..k-1] is sorted. A code sketch follows.
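
In the MergeSort code on the next slide, the two "arrays" are the adjacent sorted segments b[h..e] and b[e+1..k], and C is a temporary array. Here is a minimal sketch of that merge, written to match the call merge(b, h, (h+k)/2, k); it is our reconstruction from the outline above, not necessarily the lecture's exact code:

/** Given sorted segments b[h..e] and b[e+1..k], sort b[h..k].
  * A sketch reconstructed from the outline above. */
public static void merge(int[] b, int h, int e, int k) {
    int[] c= new int[k - h + 1];  // temporary array C
    int i= h;    // next element of b[h..e] to copy
    int j= e+1;  // next element of b[e+1..k] to copy
    int t= 0;    // next free slot of c
    // inv: b[h..i-1] and b[e+1..j-1] have been copied into
    //      c[0..t-1], and c[0..t-1] is sorted
    while (i <= e && j <= k) {
        if (b[i] <= b[j]) { c[t]= b[i]; i= i+1; }
        else              { c[t]= b[j]; j= j+1; }
        t= t+1;
    }
    while (i <= e) { c[t]= b[i]; i= i+1; t= t+1; }  // rest of b[h..e]
    while (j <= k) { c[t]= b[j]; j= j+1; t= t+1; }  // rest of b[e+1..k]
    System.arraycopy(c, 0, b, h, c.length);  // copy merged result back
}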

MergeSort

/** Sort b[h..k] */
public static void MS(int[] b, int h, int k) {
    if (h >= k) return;           // 0 or 1 element: already sorted
    MS(b, h, (h+k)/2);
    MS(b, (h+k)/2 + 1, k);
    merge(b, h, (h+k)/2, k);      // merge the 2 sorted segments
}

QuickSort versus MergeSort

/** Sort b[h..k] */
public static void QS(int[] b, int h, int k) {
    if (h >= k) return;
    int j= partition(b, h, k);
    QS(b, h, j-1);
    QS(b, j+1, k);
}

/** Sort b[h..k] */
public static void MS(int[] b, int h, int k) {
    if (h >= k) return;
    MS(b, h, (h+k)/2);
    MS(b, (h+k)/2 + 1, k);
    merge(b, h, (h+k)/2, k);  // merge the 2 sorted segments
}

One processes the array and then recurses; the other recurses and then processes the array.

MergeSort Analysis

Outline:
1. Split the array into two halves.
2. Recursively sort each half.
3. Merge the two halves.

Merge combines two sorted arrays into one sorted array in time O(n), where n is the total size of the two arrays.

Runtime recurrence, where T(n) is the time to sort an array of size n:
    T(1) = 1
    T(n) = 2T(n/2) + O(n)

One can show by induction that T(n) is O(n log n). Alternatively, one can see that T(n) is O(n log n) by looking at the tree of recursive calls, as unrolled below.
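
To see the tree-of-calls argument concretely, write the O(n) term as cn and unroll the recurrence (our expansion, not from the slides):

    T(n) = 2T(n/2) + cn
         = 4T(n/4) + cn + cn
         = 8T(n/8) + cn + cn + cn
         ...
         = 2^d T(n/2^d) + d*cn
         = n*T(1) + cn log n       (at depth d = log n)

Each of the log n levels of the call tree does cn total work, so T(n) is O(n log n).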

MergeSort Notes

Asymptotic complexity: O(n log n). Advantage: much faster than O(n²). Disadvantage: needs extra storage for temporary arrays, which in practice can matter, even though MergeSort is asymptotically optimal for sorting. MergeSort can be done in place, but doing so is very tricky (and slows execution significantly).

Is there a good sorting algorithm that does not use so much extra storage? Yes: QuickSort, which, when done properly, uses O(log n) space.

QuickSort Analysis

Runtime analysis (worst case): partition can produce a pivot p with an empty segment on one side and all n-1 remaining elements (each > p) on the other. The runtime recurrence is then T(n) = T(n-1) + n, which can be solved to show that worst-case T(n) is O(n²). Space can also be O(n), the maximum depth of recursion; this is fixed by sorting the smaller of the two segments recursively and the other iteratively, as in the logarithmic-space QS shown earlier.

Runtime analysis (expected case): the recurrence is more complex, but it can be solved to show that expected T(n) is O(n log n).

Improve the constant factor by avoiding QuickSort on small segments: use InsertionSort (for example) for segments of size, say, ≤ 9. The definition of "small" depends on the language, machine, etc. A sketch of this hybrid follows.
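
Here is a minimal sketch of the small-segment optimization; the cutoff value and the helper insertionSort are illustrative assumptions, not the lecture's code:

static final int CUTOFF= 9;  // "small" threshold; machine-dependent

/** Sort b[h..k], switching to InsertionSort on small segments.
  * A sketch; the cutoff and structure are assumptions. */
public static void QSHybrid(int[] b, int h, int k) {
    if (k - h + 1 <= CUTOFF) { insertionSort(b, h, k); return; }
    int j= partition(b, h, k);
    QSHybrid(b, h, j-1);
    QSHybrid(b, j+1, k);
}

/** Sort b[h..k] by insertion sort. */
static void insertionSort(int[] b, int h, int k) {
    for (int i= h+1; i <= k; i++) {
        int v= b[i];  // insert b[i] into sorted b[h..i-1]
        int t= i-1;
        while (t >= h && b[t] > v) { b[t+1]= b[t]; t= t-1; }
        b[t+1]= v;
    }
}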

Sorting Algorithm Summary

We discussed: InsertionSort, SelectionSort, MergeSort, QuickSort.
Other sorting algorithms: HeapSort (will revisit), ShellSort (in text), BubbleSort (nice name), RadixSort, BinSort, CountingSort.

Why so many? Do computer scientists have some kind of sorting fetish or what?
Stable sorts: Ins, Sel, Mer
Worst-case O(n log n): Mer, Hea
Expected O(n log n): Mer, Hea, Qui
Best for nearly-sorted sets: Ins
No extra space: Ins, Sel, Hea
Fastest in practice: Qui
Least data movement: Sel

A sorting algorithm is stable if equal values stay in the same order: b[i] = b[j] and i < j means that b[i] will precede b[j] in the result.

Lower Bound for Comparison Sorting

Goal: determine the minimum time required to sort n items. Note: we want worst-case, not best-case, time. Best-case doesn't tell us much; e.g., InsertionSort takes O(n) time on already-sorted input. We want to know the worst-case time for the best possible algorithm.

How can we prove anything about the best possible algorithm? We want to find characteristics that are common to all sorting algorithms, so we limit attention to comparison-based algorithms and try to count the number of comparisons.

Comparison Trees

Comparison-based algorithms make decisions based on comparisons of data elements. This gives a comparison tree: each internal node is a comparison such as "a[i] < a[j]?", with a yes branch and a no branch. If the algorithm fails to terminate for some input, the comparison tree is infinite. The height of the comparison tree represents the worst-case number of comparisons for that algorithm. We can show that any correct comparison-based algorithm must make at least n log n comparisons in the worst case.

Lower Bound for Comparison Sorting

Say we have a correct comparison-based algorithm, and suppose we want to sort the elements of an array b[]. Assume the elements of b[] are distinct. Any permutation of the elements is initially possible, and when the algorithm is done, b[] is sorted. But the algorithm cannot take the same path in the comparison tree on two different input permutations: it would then perform the same rearrangement on both, and at least one result would not be sorted.

Lower Bound for Comparison Sorting

How many input permutations are possible? n!, which is roughly 2^(n log n). For a comparison-based sorting algorithm to be correct, it must have at least that many leaves in its comparison tree. To have at least n! ≈ 2^(n log n) leaves, the tree must have height at least n log n, since branching is binary: the number of nodes can at most double at each depth. Therefore the longest path must have length at least n log n, and that is the worst-case running time.
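
The estimate n! ≈ 2^(n log n) comes from taking logarithms (our check, not from the slides):

    log(n!) = log n + log(n-1) + ... + log 1 <= n log n
    log(n!) >= (n/2) log(n/2)   (keeping only the n/2 largest terms)

so log(n!) is Θ(n log n), i.e., n! = 2^(Θ(n log n)).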

Interface java.lang.Comparable<T>

public int compareTo(T x);

Returns a negative, zero, or positive value:
  negative if this is before x
  0 if this.equals(x)
  positive if this is after x

Many classes implement Comparable: String, Double, Integer, Character, Date, ... If a class implements Comparable, its method compareTo is considered to define that class's natural ordering. Comparison-based sorting methods should work with Comparable for maximum generality.
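
As an illustration (our example, not from the slides), here is a class that implements Comparable to define its natural ordering, which java.util.Arrays.sort then uses:

import java.util.Arrays;

/** A point ordered by its distance from the origin (illustrative). */
public class Pt implements Comparable<Pt> {
    private final int x, y;

    public Pt(int x, int y) { this.x= x; this.y= y; }

    /** Return a negative, zero, or positive value as this point is
      * closer to, as far from, or farther from the origin than p. */
    @Override
    public int compareTo(Pt p) {
        return Integer.compare(x*x + y*y, p.x*p.x + p.y*p.y);
    }

    public static void main(String[] args) {
        Pt[] pts= { new Pt(3, 4), new Pt(1, 1), new Pt(0, 2) };
        Arrays.sort(pts);  // sorts by the natural ordering above
    }
}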