Analysis of Algorithms CS 477/677

Presentation transcript:

Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 8

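General Selection Problem
Select the i-th order statistic (the i-th smallest element) from a set of n distinct numbers.
Idea: partition the input array as in Quicksort (use RANDOMIZED-PARTITION), then recurse on only one side of the partition, depending on where i falls relative to the pivot.
Let the subarray A[p . . r] be partitioned around the pivot at index q, and let k = q - p + 1 be the number of elements in A[p . . q].
If i = k, the pivot is the answer. If i < k, search the low partition A[p . . q - 1] for the i-th smallest element. If i > k, search the high partition A[q + 1 . . r] for the (i - k)-th smallest element.
CS 477/677 - Lecture 7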
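The idea above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the function names, the 0-based inclusive indices, and the Lomuto-style partition are assumptions made for the example.

```python
import random

def randomized_partition(a, p, r):
    """Partition a[p..r] (inclusive) around a random pivot; return the pivot's final index."""
    s = random.randint(p, r)          # pick the pivot position uniformly at random
    a[s], a[r] = a[r], a[s]           # move the pivot to the end
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]   # put the pivot between the two sides
    return i + 1

def randomized_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; reorders a in place."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1                     # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return randomized_select(a, p, q - 1, i)
    return randomized_select(a, q + 1, r, i - k)

data = [3, 2, 9, 0, 7, 5, 4, 8, 6, 1]
print(randomized_select(list(data), 0, len(data) - 1, 3))  # 3rd smallest -> 2
```

Only one side of the partition is ever searched, which is what separates selection from a full Quicksort.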

A Better Selection Algorithm
Selection can be performed in O(n) time in the worst case.
Idea: guarantee a good split when partitioning, because the running time depends on how balanced the resulting partitions are.
Use a modified version of PARTITION that takes as input the element around which to partition.

Selection in O(n) Worst Case
1. Divide the n elements into groups of 5, giving ⌈n/5⌉ groups (the last group may have fewer than 5 elements).
2. Find the median of each of the ⌈n/5⌉ groups: sort each group with insertion sort, then pick its median.
3. Use SELECT recursively to find the median x of the ⌈n/5⌉ medians.
4. Partition the input array around x, using the modified version of PARTITION. If x is the k-th smallest element, there are k - 1 elements on the low side of the partition and n - k on the high side.
5. If i = k, return x. Otherwise, use SELECT recursively: find the i-th smallest element on the low side if i < k, or the (i - k)-th smallest element on the high side if i > k.
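A compact Python sketch of these steps is shown below. It departs from the slide in two labeled ways: it uses Python's built-in `sorted` on each group of 5 instead of insertion sort, and it partitions into new lists instead of calling an in-place PARTITION. It assumes distinct input values, as the lecture does; the name `select` and the sample data are illustrative.

```python
def select(a, i):
    """Return the i-th smallest (1-based) element of a list of distinct numbers."""
    if len(a) <= 5:
        return sorted(a)[i - 1]                 # small input: sort directly

    # Steps 1-2: split into groups of 5 and take the median of each group.
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Step 3: recursively find the median x of the medians.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 4: partition around x (new lists here, for clarity).
    low = [v for v in a if v < x]
    high = [v for v in a if v > x]
    k = len(low) + 1                            # x is the k-th smallest element

    # Step 5: return x or recurse on exactly one side.
    if i == k:
        return x
    if i < k:
        return select(low, i)
    return select(high, i - k)

# A permutation of 0..100 (37 is coprime to 101), so the median is 50.
data = [(37 * j) % 101 for j in range(101)]
print(select(data, 51))  # -> 50
```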

Analysis of Running Time
Step 1: making groups of 5 elements takes O(n).
Step 2: sorting ⌈n/5⌉ groups in O(1) time each takes O(n).
Step 3: calling SELECT on ⌈n/5⌉ medians takes time T(⌈n/5⌉).
Step 4: partitioning the n-element array around x takes O(n).
Step 5: recursion on one partition takes time that depends on the size of the partition.

Analysis of Running Time
First determine an upper bound on the sizes of the partitions, to see how bad the split can be.
Consider the following representation: each column is one group of 5, with the elements in each column sorted, and the columns are ordered by their medians.

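Analysis of Running Time
At least half of the medians found in step 2 are ≥ x.
All but two of these groups (the group containing x and the possibly incomplete last group) contribute 3 elements > x, so there are at least ⌈½⌈n/5⌉⌉ - 2 groups with 3 elements > x.
At least 3(⌈½⌈n/5⌉⌉ - 2) ≥ 3n/10 - 6 elements are greater than x. By a symmetric argument, at least 3n/10 - 6 elements are smaller than x.
SELECT is therefore called on at most n - (3n/10 - 6) = 7n/10 + 6 elements in step 5.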

Recurrence for the Running Time
Step 1: making groups of 5 elements takes O(n).
Step 2: sorting ⌈n/5⌉ groups in O(1) time each takes O(n).
Step 3: calling SELECT on ⌈n/5⌉ medians takes time T(⌈n/5⌉).
Step 4: partitioning the n-element array around x takes O(n).
Step 5: recursion on one partition takes time ≤ T(7n/10 + 6).
T(n) = T(⌈n/5⌉) + T(7n/10 + 6) + O(n)
We will show that T(n) = O(n).

Substitution
T(n) = T(⌈n/5⌉) + T(7n/10 + 6) + O(n)
Show that T(n) ≤ cn for some constant c > 0 and all n ≥ n0. Let an bound the O(n) term, for some constant a > 0.
T(n) ≤ c⌈n/5⌉ + c(7n/10 + 6) + an
     ≤ cn/5 + c + 7cn/10 + 6c + an
     = 9cn/10 + 7c + an
     = cn + (-cn/10 + 7c + an)
     ≤ cn   if -cn/10 + 7c + an ≤ 0
For n > 70 this condition is equivalent to c ≥ 10a(n/(n - 70)).
Choose n0 > 70 and obtain the value of c. For example, n0 = 140 gives n/(n - 70) ≤ 2 for all n ≥ n0, so any c ≥ 20a works.
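The bound can also be checked numerically. The sketch below evaluates the recurrence directly under several assumptions that are not in the slides: the O(n) term is taken to be exactly a·n with a = 1, the threshold is n0 = 140 with c = 20a, the arguments are rounded as ⌈n/5⌉ and ⌊7n/10⌋ + 6, and inputs smaller than n0 are charged a·n. The names `A`, `C`, `N0`, and `T` are hypothetical.

```python
from functools import lru_cache

A = 1          # assumed constant hidden in the O(n) term
N0 = 140       # assumed threshold n0 (> 70)
C = 20 * A     # candidate constant c, since n/(n - 70) <= 2 for n >= 140

@lru_cache(maxsize=None)
def T(n):
    """Evaluate T(n) = T(ceil(n/5)) + T(floor(7n/10) + 6) + A*n for n >= N0."""
    if n < N0:
        return A * n                       # assumed cost of small inputs
    ceil_fifth = -(-n // 5)                # integer ceiling of n/5
    return T(ceil_fifth) + T(7 * n // 10 + 6) + A * n

# Largest observed ratio T(n)/n over a range of n; it should never exceed C.
worst_ratio = max(T(n) / n for n in range(1, 5001))
print(worst_ratio <= C)  # -> True
```

A numeric check like this does not replace the induction, but it is a quick way to catch an arithmetic slip in the choice of c and n0.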

How Fast Can We Sort?
Insertion sort, Bubble sort, Selection sort: Θ(n²) in the worst case
Merge sort: Θ(n lg n)
Quicksort: Θ(n lg n) on average (Θ(n²) in the worst case)
What is common to all these algorithms? They sort by making comparisons between the input elements.
To sort n elements, comparison sorts must make Ω(n lg n) comparisons in the worst case.

Lower Bounds for Sorting
Comparison sorts use only comparisons between elements to gain information about an input sequence a1, a2, …, an.
We perform tests ai < aj, ai ≤ aj, ai = aj, ai ≥ aj, or ai > aj to determine the relative order of ai and aj.
We will show that every comparison-based algorithm needs Ω(n lg n) comparisons in the worst case.
We assume that all the input elements are distinct.

Decision Tree Model
A decision tree represents the comparisons made by a sorting algorithm on an input of a given size; it models all possible execution traces.
Control, data movement, and other operations are ignored: count only the comparisons.
[Figure: decision tree for insertion sort on three elements. Each internal node is a comparison, each leaf is a permutation, and one root-to-leaf path is one execution trace.]
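To make the model concrete, the sketch below runs insertion sort on every permutation of three distinct values and records only the outcomes of the comparisons. Each recorded sequence corresponds to one root-to-leaf path in the decision tree. The function name and the True/False encoding of outcomes are choices made for this illustration.

```python
from itertools import permutations

def insertion_sort_trace(values):
    """Insertion-sort a copy of values; return the tuple of comparison outcomes."""
    a = list(values)
    trace = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            greater = a[i] > key      # the only kind of operation we count
            trace.append(greater)
            if not greater:
                break
            a[i + 1] = a[i]           # shift the larger element right
            i -= 1
        a[i + 1] = key
    return tuple(trace)

traces = {p: insertion_sort_trace(p) for p in permutations((1, 2, 3))}
for perm, trace in traces.items():
    print(perm, trace)

print(len(set(traces.values())))            # distinct leaves -> 6
print(max(len(t) for t in traces.values())) # tree height -> 3
```

The six input orders produce six different traces, so the tree has 3! = 6 reachable leaves, and the longest trace (3 comparisons) is the height of the tree.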

Decision Tree Model
All n! permutations of n elements must appear as leaves of the decision tree.
Worst-case number of comparisons = the length of the longest path from the root to a leaf = the height of the decision tree.

Decision Tree Model
Goal: find a lower bound on the running time of any comparison sort algorithm.
Equivalently, find a lower bound on the heights of all decision trees, over all algorithms.

Lemma
Any binary tree of height h has at most 2^h leaves.
Proof: induction on h.
Basis: h = 0 ⇒ the tree has one node, which is a leaf, and 2^0 = 1.
Inductive step: assume the claim is true for h - 1. Extend the height of the tree by one more level: each leaf becomes parent to at most two new leaves.
No. of leaves for a tree of height h ≤ 2 · (no. of leaves for a tree of height h - 1) ≤ 2 · 2^(h-1) = 2^h.

Lower Bound for Comparison Sorts
Theorem: Any comparison sort algorithm requires Ω(n lg n) comparisons in the worst case.
Proof: Consider a decision tree of height h with l leaves. How many leaves does the tree have?
At least n! (each of the n! permutations of the input appears as some leaf) ⇒ n! ≤ l.
At most 2^h leaves ⇒ n! ≤ l ≤ 2^h ⇒ h ≥ lg(n!) = Ω(n lg n).
We can beat the Ω(n lg n) running time if we use operations other than comparisons!
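The quantity ⌈lg(n!)⌉ from the proof is easy to tabulate. The short sketch below prints it next to n lg n for a few values of n; the function name `min_comparisons` is made up for the example.

```python
import math

def min_comparisons(n):
    """ceil(lg(n!)): the decision-tree lower bound on worst-case comparisons."""
    return math.ceil(math.log2(math.factorial(n)))

for n in (3, 4, 10, 100, 1000):
    print(n, min_comparisons(n), round(n * math.log2(n)))
```

For n = 3 the bound is 3 comparisons and for n = 4 it is 5, and as n grows the bound tracks n lg n up to lower-order terms.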

COUNTING-SORT
Input: A[1 . . n], each element an integer in the range 0 to k. Output: B[1 . . n]. Auxiliary storage: C[0 . . k].
Alg.: COUNTING-SORT(A, B, n, k)
    for i ← 0 to k
        do C[i] ← 0
    for j ← 1 to n
        do C[A[j]] ← C[A[j]] + 1
    (C[i] now contains the number of elements equal to i)
    for i ← 1 to k
        do C[i] ← C[i] + C[i - 1]
    (C[i] now contains the number of elements ≤ i)
    for j ← n downto 1
        do B[C[A[j]]] ← A[j]
           C[A[j]] ← C[A[j]] - 1
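A direct Python transcription of the pseudocode might look like the sketch below. The main differences are assumptions of the example: Python lists are 0-based, so an element's final position is C[x] - 1, and the function returns a new list instead of filling a caller-supplied B. The sample input is illustrative.

```python
def counting_sort(a, k):
    """Return a sorted copy of a, whose elements are integers in the range 0..k."""
    c = [0] * (k + 1)
    for x in a:                  # c[i] = number of elements equal to i
        c[x] += 1
    for i in range(1, k + 1):    # c[i] = number of elements <= i
        c[i] += c[i - 1]
    b = [None] * len(a)
    for x in reversed(a):        # right-to-left scan, as in the pseudocode
        b[c[x] - 1] = x          # 0-based position of x in the output
        c[x] -= 1
    return b

print(counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5))  # -> [0, 0, 2, 2, 3, 3, 3, 5]
```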

Example
[Figure: COUNTING-SORT traced on an 8-element array A, showing C and B as the algorithm proceeds.]

Example (cont.)
[Figure: continuation of the trace, placing the remaining elements of A into B and updating C after each placement.]

Analysis of Counting Sort
Alg.: COUNTING-SORT(A, B, n, k)
    for i ← 0 to k                          Θ(k)
        do C[i] ← 0
    for j ← 1 to n                          Θ(n)
        do C[A[j]] ← C[A[j]] + 1
    (C[i] now contains the number of elements equal to i)
    for i ← 1 to k                          Θ(k)
        do C[i] ← C[i] + C[i - 1]
    (C[i] now contains the number of elements ≤ i)
    for j ← n downto 1                      Θ(n)
        do B[C[A[j]]] ← A[j]
           C[A[j]] ← C[A[j]] - 1
Overall time: Θ(n + k)

Analysis of Counting Sort
Overall time: Θ(n + k)
In practice we use counting sort when k = O(n) ⇒ the running time is Θ(n).
Counting sort is stable: numbers with the same value appear in the output array in the same order as in the input array.
Stability is important when additional data is carried around with the sorted keys.
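Stability is easiest to see when each key carries extra data. The sketch below sorts (key, name) records by key with a counting sort; the records, the `key` parameter, and the function name are invented for the illustration.

```python
def counting_sort_by_key(records, key, k):
    """Stable counting sort of records whose key(record) is an integer in 0..k."""
    count = [0] * (k + 1)
    for rec in records:
        count[key(rec)] += 1
    for i in range(1, k + 1):
        count[i] += count[i - 1]
    out = [None] * len(records)
    for rec in reversed(records):     # right-to-left keeps equal keys in input order
        count[key(rec)] -= 1
        out[count[key(rec)]] = rec
    return out

students = [(3, "Ana"), (1, "Bo"), (3, "Cy"), (1, "Di"), (2, "Ed")]
print(counting_sort_by_key(students, key=lambda r: r[0], k=3))
# -> [(1, 'Bo'), (1, 'Di'), (2, 'Ed'), (3, 'Ana'), (3, 'Cy')]
```

"Bo" stays ahead of "Di" and "Ana" stays ahead of "Cy", exactly as in the input. Scanning the input left to right in the last loop would still sort by key but would reverse the order of equal keys.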

Radix Sort
Treats keys as numbers written in base k.
A d-digit number occupies a field of d columns, and sorting looks at one column at a time.
For d-digit numbers, sort on the least significant digit first, then continue with the next least significant digit, until all digits have been sorted.
Requires only d passes through the list.

RADIX-SORT
Alg.: RADIX-SORT(A, d)
    for i ← 1 to d
        do use a stable sort to sort array A on digit i
Digit 1 is the lowest-order digit; digit d is the highest-order digit.
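One way to realize RADIX-SORT in Python is sketched below, using a stable counting sort on each digit. It assumes non-negative integers with at most d digits in the given base; the helper name, the `base` parameter, and the sample numbers are choices made for this example.

```python
def counting_sort_by_digit(a, exp, base):
    """Stable sort of non-negative integers by the digit (x // exp) % base."""
    count = [0] * base
    for x in a:
        count[(x // exp) % base] += 1
    for d in range(1, base):
        count[d] += count[d - 1]
    out = [0] * len(a)
    for x in reversed(a):             # right-to-left scan keeps the sort stable
        d = (x // exp) % base
        count[d] -= 1
        out[count[d]] = x
    return out

def radix_sort(a, d, base=10):
    """Sort non-negative integers with at most d base-`base` digits."""
    for i in range(d):                # least significant digit first
        a = counting_sort_by_digit(a, base ** i, base)
    return a

print(radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
# -> [329, 355, 436, 457, 657, 720, 839]
```

Replacing the per-digit sort with an unstable one would break the algorithm, because each pass relies on the previous passes' order being preserved among equal digits.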

Analysis of Radix Sort
Given n numbers of d digits each, where each digit may take up to k possible values, RADIX-SORT correctly sorts the numbers in Θ(d(n + k)) time.
One pass of sorting per digit takes Θ(n + k), assuming that we use counting sort.
There are d passes (one for each digit).

Correctness of Radix Sort
We use induction on the number d of passes through the digits.
Basis: if d = 1, there is only one digit, and the claim is trivial.
Inductive step: assume the numbers are sorted on digits 1, 2, . . . , d - 1. Now sort on the d-th digit, and consider two numbers a and b with d-th digits a_d and b_d:
If a_d < b_d, the sort puts a before b. Correct: a < b regardless of the low-order digits.
If a_d > b_d, the sort puts a after b. Correct: a > b regardless of the low-order digits.
If a_d = b_d, the sort leaves a and b in the same order, because it is stable. Correct: a and b are already sorted on the low-order d - 1 digits.

Bucket Sort
Assumption: the input is generated by a random process that distributes elements uniformly over [0, 1).
Idea: divide [0, 1) into n equal-sized buckets, distribute the n input values into the buckets, sort each bucket, then go through the buckets in order, listing the elements in each one.
Input: A[1 . . n], where 0 ≤ A[i] < 1 for all i
Output: the elements of A in sorted order
Auxiliary array: B[0 . . n - 1] of linked lists, each list initially empty

BUCKET-SORT
Alg.: BUCKET-SORT(A, n)
    for i ← 1 to n
        do insert A[i] into list B[⌊n·A[i]⌋]
    for i ← 0 to n - 1
        do sort list B[i] with insertion sort
    concatenate lists B[0], B[1], . . . , B[n - 1] together in order
    return the concatenated lists
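A Python sketch of BUCKET-SORT follows. It uses Python lists in place of linked lists, which is an implementation choice of this example, and it assumes every input value lies in [0, 1). The ten sample values are the ones used in the lecture's bucket sort example.

```python
def insertion_sort(lst):
    """Sort lst in place with insertion sort."""
    for j in range(1, len(lst)):
        key = lst[j]
        i = j - 1
        while i >= 0 and lst[i] > key:
            lst[i + 1] = lst[i]
            i -= 1
        lst[i + 1] = key

def bucket_sort(a):
    """Return a sorted copy of a, whose elements are floats in [0, 1)."""
    n = len(a)
    buckets = [[] for _ in range(n)]        # B[0 .. n-1], initially empty
    for x in a:
        buckets[int(n * x)].append(x)       # bucket index is floor(n * x)
    for b in buckets:
        insertion_sort(b)
    return [x for b in buckets for x in b]  # concatenate the buckets in order

data = [.78, .17, .39, .26, .72, .94, .21, .12, .23, .68]
print(bucket_sort(data))
# -> [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]
```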

Example - Bucket Sort
Input A[1 . . 10]: .78 .17 .39 .26 .72 .94 .21 .12 .23 .68
Buckets B[0 . . 9] after distributing the elements:
B[0]: empty
B[1]: .17 .12
B[2]: .26 .21 .23
B[3]: .39
B[4]: empty
B[5]: empty
B[6]: .68
B[7]: .78 .72
B[8]: empty
B[9]: .94

Example - Bucket Sort
Buckets after sorting each list with insertion sort:
B[0]: empty
B[1]: .12 .17
B[2]: .21 .23 .26
B[3]: .39
B[4]: empty
B[5]: empty
B[6]: .68
B[7]: .72 .78
B[8]: empty
B[9]: .94
Concatenate the lists from 0 to n - 1 together, in order:
.12 .17 .21 .23 .26 .39 .68 .72 .78 .94

Correctness of Bucket Sort
Consider two elements A[i], A[j]. Assume without loss of generality that A[i] ≤ A[j].
Then ⌊n·A[i]⌋ ≤ ⌊n·A[j]⌋, so A[i] belongs to the same bucket as A[j] or to a bucket with a lower index than that of A[j].
If A[i], A[j] belong to the same bucket: insertion sort puts them in the proper order.
If A[i], A[j] are put in different buckets: concatenation of the lists puts them in the proper order.

Analysis of Bucket Sort
Alg.: BUCKET-SORT(A, n)
    for i ← 1 to n                                      O(n)
        do insert A[i] into list B[⌊n·A[i]⌋]
    for i ← 0 to n - 1                                  Θ(n) expected, under the uniform-input assumption
        do sort list B[i] with insertion sort
    concatenate lists B[0], B[1], . . . , B[n - 1]      O(n)
        together in order
    return the concatenated lists
Overall time: Θ(n) expected

Conclusion
Any comparison sort needs Ω(n lg n) comparisons in the worst case to sort an array of n numbers.
We can achieve a better running time if we can make certain assumptions about the input data:
Counting sort: each of the n input elements is an integer in the range 0 to k.
Radix sort: the input elements are integers represented with d digits.
Bucket sort: the input numbers are uniformly distributed over the interval [0, 1).

Readings: Chapters 8 and 9