COSC 3101A - Design and Analysis of Algorithms 6 Lower Bounds for Sorting Counting / Radix / Bucket Sort Many of these slides are taken from Monica Nicolescu,


COSC 3101A - Design and Analysis of Algorithms 6 Lower Bounds for Sorting Counting / Radix / Bucket Sort Many of these slides are taken from Monica Nicolescu, Univ. of Nevada, Reno,

6/08/2004 Lecture 6 COSC3101A
Selection
General Selection Problem:
– select the i-th smallest element from a set of n distinct numbers
– that element is larger than exactly i - 1 other elements
Idea:
– Partition the input array
– Recurse on one side of the partition to look for the i-th element
[Figure: array A with indices p, q, r; if i < k, search the low partition; if i > k, search the high partition]
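
The partition-and-recurse idea can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: the names `partition` and `quickselect` are hypothetical, the pivot is simply the last element of the current range, and the input is assumed to hold distinct numbers as stated above.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi]; return the pivot's final index."""
    pivot = a[hi]
    q = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[q], a[j] = a[j], a[q]
            q += 1
    a[q], a[hi] = a[hi], a[q]
    return q


def quickselect(a, i):
    """Return the i-th smallest element (1-based) of a list of distinct numbers."""
    if not 1 <= i <= len(a):
        raise ValueError("i must be between 1 and len(a)")
    a = list(a)  # work on a copy so the caller's list is untouched
    lo, hi = 0, len(a) - 1
    while True:
        q = partition(a, lo, hi)
        k = q - lo + 1  # rank of the pivot within a[lo..hi]
        if i == k:
            return a[q]
        if i < k:
            hi = q - 1  # search the low partition
        else:
            lo = q + 1  # search the high partition
            i -= k
```

With a fixed pivot choice like this one, an unlucky input can produce very unbalanced partitions, which is what motivates choosing the pivot more carefully.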

A Better Selection Algorithm
Selection can be performed in O(n) time in the worst case
Idea: guarantee a good split when partitioning
– Running time is influenced by how “balanced” the resulting partitions are
Use a modified version of PARTITION
– Takes as input the element around which to partition

Selection in O(n) Worst Case
1. Divide the n elements into groups of 5 ⇒ ⌈n/5⌉ groups
2. Find the median of each of the ⌈n/5⌉ groups
3. Use SELECT recursively to find the median x of the ⌈n/5⌉ medians
4. Partition the input array around x, using the modified version of PARTITION
5. If i = k then return x. Otherwise, use SELECT recursively:
– Find the i-th smallest element on the low side if i < k
– Find the (i-k)-th smallest element on the high side if i > k
[Figure: array A with group medians x1, x2, x3, …, x⌈n/5⌉; after partitioning around x, k - 1 elements lie on the low side and n - k elements on the high side]
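
The five steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the slides' implementation: the function name `select` mirrors SELECT, the input is assumed to contain distinct numbers, and the modified PARTITION is replaced by building two new lists, which costs extra memory but keeps the code short.

```python
def select(a, i):
    """Return the i-th smallest element (1-based) of a list of distinct numbers."""
    if not 1 <= i <= len(a):
        raise ValueError("i must be between 1 and len(a)")
    if len(a) <= 5:
        return sorted(a)[i - 1]  # constant-size base case

    # Steps 1-2: split into groups of 5 and take each group's median.
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Step 3: recursively find the median x of the medians.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 4: partition around x (assumption: new lists instead of in-place PARTITION).
    low = [v for v in a if v < x]
    high = [v for v in a if v > x]
    k = len(low) + 1  # x is the k-th smallest element

    # Step 5: return x or recurse on the side that contains the answer.
    if i == k:
        return x
    if i < k:
        return select(low, i)
    return select(high, i - k)
```

For example, `select(data, (len(data) + 1) // 2)` returns the lower median of `data`.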

Analysis of Running Time
First determine an upper bound for the sizes of the partitions
– See how bad the split can be
Consider the following representation
– Each column represents one group (elements in columns are sorted)
– Columns are sorted by their medians

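Analysis of Running Time
At least half of the medians found in step 2 are ≥ x
All but two of these groups (the group containing x and the possibly incomplete last group) contribute 3 elements > x
– At least ⌈(1/2)⌈n/5⌉⌉ - 2 groups with 3 elements > x
– At least 3(⌈(1/2)⌈n/5⌉⌉ - 2) ≥ 3n/10 - 6 elements greater than x
By symmetry, at least 3n/10 - 6 elements are smaller than x
SELECT is called on at most n - (3n/10 - 6) = 7n/10 + 6 elements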

Recurrence for the Running Time
Step 1: making groups of 5 elements takes O(n) time
Step 2: sorting ⌈n/5⌉ groups in O(1) time each takes O(n) time
Step 3: calling SELECT on ⌈n/5⌉ medians takes T(⌈n/5⌉) time
Step 4: partitioning the n-element array around x takes O(n) time
Step 5: recursing on one partition takes time ≤ T(7n/10 + 6)
T(n) = T(⌈n/5⌉) + T(7n/10 + 6) + O(n)
Show that T(n) = O(n)

Substitution
T(n) = T(⌈n/5⌉) + T(7n/10 + 6) + O(n)
Show that T(n) ≤ cn for some constant c > 0 and all n ≥ n0, writing the O(n) term as an for a constant a > 0
T(n) ≤ c⌈n/5⌉ + c(7n/10 + 6) + an
≤ cn/5 + c + 7cn/10 + 6c + an
= 9cn/10 + 7c + an
= cn + (-cn/10 + 7c + an)
≤ cn if -cn/10 + 7c + an ≤ 0, that is, c ≥ 10a(n/(n - 70)) when n > 70
– choose n0 > 70 and obtain the value of c (for example, n0 = 140 gives n/(n - 70) ≤ 2, so any c ≥ 20a works)

How Fast Can We Sort?
Insertion sort, Bubble Sort, Selection Sort: Θ(n²)
Merge sort: Θ(n lg n)
Quicksort
What is common to all these algorithms?
– These algorithms sort by making comparisons between the input elements
To sort n elements, comparison sorts must make Ω(n lg n) comparisons in the worst case

Lower-Bounds for Sorting
Comparison sorts use comparisons between elements to gain information about an input sequence ⟨a_1, a_2, …, a_n⟩
We perform tests a_i < a_j, a_i ≤ a_j, a_i = a_j, a_i ≥ a_j, or a_i > a_j to determine the relative order of a_i and a_j
We assume that all the input elements are distinct

Decision Tree Model
Represents the comparisons made by a sorting algorithm on an input of a given size: models all possible execution traces
Control, data movement, and other operations are ignored
Count only the comparisons
[Figure: decision tree for insertion sort on three elements; labels: node, leaf (one execution trace)]

Decision Tree Model
Each of the n! permutations on n elements must appear as one of the leaves in the decision tree
The length of the longest path from the root to a leaf represents the worst-case number of comparisons
– This is equal to the height of the decision tree
Goal: find a lower bound on the heights of all decision trees in which each permutation appears as a reachable leaf
– Equivalent to finding a lower bound on the running time of any comparison sort algorithm

Lemma
Any binary tree of height h has at most 2^h leaves
Proof: induction on h
Basis: h = 0 ⇒ tree has one node, which is a leaf; 2^h = 1
Inductive step: assume true for h - 1
– Extend the height of the tree with one more level
– Each leaf becomes parent to at most two new leaves
No. of leaves at level h ≤ 2 × (no. of leaves at level h - 1) ≤ 2 × 2^(h-1) = 2^h

Lower Bound for Comparison Sorts
Theorem: Any comparison sort algorithm requires Ω(n lg n) comparisons in the worst case.
Proof: Need to determine the height of a decision tree in which each permutation appears as a reachable leaf
Consider a decision tree of height h with l leaves, corresponding to a comparison sort of n elements
Each of the n! permutations of the input appears as some leaf ⇒ n! ≤ l
A binary tree of height h has no more than 2^h leaves ⇒ n! ≤ l ≤ 2^h
(take logarithms) ⇒ h ≥ lg(n!) = Ω(n lg n)
We can beat the Ω(n lg n) running time if we use other operations than comparisons!
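
To make the bound h ≥ lg(n!) concrete, the short Python sketch below computes the smallest integer h with 2^h ≥ n! for small n. The function name `min_worst_case_comparisons` is a hypothetical label for illustration. One standard way to see that lg(n!) = Ω(n lg n) is the estimate n! ≥ (n/2)^(n/2), which gives lg(n!) ≥ (n/2) lg(n/2).

```python
from math import factorial


def min_worst_case_comparisons(n):
    """Smallest integer h with 2**h >= n!, i.e. ceil(lg(n!)).

    Uses exact integer arithmetic: for an integer m >= 1,
    (m - 1).bit_length() equals ceil(log2(m)).
    """
    return (factorial(n) - 1).bit_length()


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 10):
        print(n, min_worst_case_comparisons(n))
```

For n = 3 the bound is 3 comparisons, which matches the height of the decision tree for three elements: with only 2 comparisons a tree has at most 4 leaves, fewer than the 3! = 6 permutations.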

Counting Sort
Assumption:
– The elements to be sorted are integers in the range 0 to k
Idea:
– Determine for each input element x, the number of elements smaller than x
– Place element x into its correct position in the output array
Input: A[1..n], where A[j] ∈ {0, 1, ..., k}, j = 1, 2, ..., n
– Array A and values n and k are given as parameters
Output: B[1..n], sorted
– B is assumed to be already allocated and is given as a parameter
Auxiliary storage: C[0..k]

COUNTING-SORT
Alg.: COUNTING-SORT(A, B, n, k)
for i ← 0 to k
    do C[i] ← 0
for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1
▹ C[i] contains the number of elements equal to i
for i ← 1 to k
    do C[i] ← C[i] + C[i-1]
▹ C[i] contains the number of elements ≤ i
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] - 1
[Figure: arrays A[1..n], C[0..k] and B[1..n], with index j scanning A]
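
A minimal Python sketch of COUNTING-SORT follows. It is an illustration rather than a line-by-line transcription: Python lists are 0-based, so the count is decremented before it is used as an output index, and the function returns a new list instead of filling a preallocated B. The name `counting_sort` is chosen here for illustration.

```python
def counting_sort(a, k):
    """Return a sorted copy of a, whose elements are integers in 0..k."""
    c = [0] * (k + 1)
    for x in a:
        c[x] += 1              # c[i] = number of elements equal to i
    for i in range(1, k + 1):
        c[i] += c[i - 1]       # c[i] = number of elements <= i
    b = [None] * len(a)
    for x in reversed(a):      # scan right to left, as in the pseudocode
        c[x] -= 1              # 0-based position of x in the output
        b[c[x]] = x
    return b
```

For example, `counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5)` returns `[0, 0, 2, 2, 3, 3, 3, 5]`.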

Example
[Figure: step-by-step trace of COUNTING-SORT showing the arrays A, C and B]

Example (cont.)
[Figure: continuation of the COUNTING-SORT trace showing the arrays A, B and C]

Analysis of Counting Sort
Alg.: COUNTING-SORT(A, B, n, k)
for i ← 0 to k
    do C[i] ← 0                        Θ(k)
for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1           Θ(n)
▹ C[i] contains the number of elements equal to i
for i ← 1 to k
    do C[i] ← C[i] + C[i-1]            Θ(k)
▹ C[i] contains the number of elements ≤ i
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] - 1           Θ(n)
Overall time: Θ(n + k)

Analysis of Counting Sort
Overall time: Θ(n + k)
In practice we use COUNTING sort when k = O(n) ⇒ running time is Θ(n)
Counting sort is stable
– Numbers with the same value appear in the same order in the output array
– Important when satellite data is carried around with the sorted keys
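
Stability is easiest to see when each key carries satellite data. The sketch below is a hypothetical variant, with the name `counting_sort_by_key` and its `key` parameter introduced only for this illustration; it sorts whole records by an integer key in 0..k.

```python
def counting_sort_by_key(records, key, k):
    """Stable counting sort of records by key(record), an integer in 0..k."""
    c = [0] * (k + 1)
    for r in records:
        c[key(r)] += 1
    for i in range(1, k + 1):
        c[i] += c[i - 1]
    out = [None] * len(records)
    for r in reversed(records):  # right-to-left scan preserves input order of ties
        c[key(r)] -= 1
        out[c[key(r)]] = r
    return out


records = [(3, "a"), (1, "b"), (3, "c"), (1, "d")]
result = counting_sort_by_key(records, key=lambda r: r[0], k=3)
# Records with equal keys keep their input order: "b" before "d", "a" before "c".
```

Scanning the input from left to right in the last loop would still sort the keys but would reverse the order of equal keys, which breaks stability.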

Radix Sort
Considers keys as numbers in a base-R number system
– A d-digit number will occupy a field of d columns
Sorting looks at one column at a time
– For a d-digit number, sort on the least significant digit first
– Continue sorting on the next least significant digit, until all digits have been sorted
– Requires only d passes through the list
Usage:
– Sort records of information that are keyed by multiple fields: e.g., year, month, day

RADIX-SORT
Alg.: RADIX-SORT(A, d)
for i ← 1 to d
    do use a stable sort to sort array A on digit i
▹ 1 is the lowest-order digit, d is the highest-order digit
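
A possible Python rendering of RADIX-SORT is sketched below, using a stable counting sort on each digit. The assumptions are that keys are non-negative integers with at most d digits in the chosen base; the `base` parameter and the name `radix_sort` are illustrative additions.

```python
def radix_sort(a, d, base=10):
    """Return a sorted list of non-negative integers with at most d base-`base` digits."""
    for i in range(d):               # i = 0 is the lowest-order digit
        divisor = base ** i
        # Stable counting sort on digit i.
        count = [0] * base
        for x in a:
            count[(x // divisor) % base] += 1
        for digit in range(1, base):
            count[digit] += count[digit - 1]
        out = [0] * len(a)
        for x in reversed(a):        # right-to-left scan keeps the pass stable
            digit = (x // divisor) % base
            count[digit] -= 1
            out[count[digit]] = x
        a = out
    return a
```

For example, `radix_sort([329, 457, 657, 839, 436, 720, 355], 3)` returns `[329, 355, 436, 457, 657, 720, 839]`.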

Analysis of Radix Sort
Given n numbers of d digits each, where each digit may take up to k possible values, RADIX-SORT correctly sorts the numbers in Θ(d(n + k)) time
– One pass of sorting per digit takes Θ(n + k), assuming that we use counting sort
– There are d passes (one for each digit)

Correctness of Radix sort
We use induction on the number of passes (one per digit)
Basis: If d = 1, there’s only one digit, trivial
Inductive step: assume the numbers are sorted on their low-order digits 1, 2, ..., d-1
– Now sort on the d-th digit
– If a_d < b_d, sort will put a before b: correct, since a < b regardless of the low-order digits
– If a_d > b_d, sort will put a after b: correct, since a > b regardless of the low-order digits
– If a_d = b_d, sort will leave a and b in the same order, because we use a stable sort for the digits. The result is correct since a and b are already sorted on the low-order d-1 digits

Bucket Sort
Assumption:
– the input is generated by a random process that distributes elements uniformly over [0, 1)
Idea:
– Divide [0, 1) into n equal-sized buckets
– Distribute the n input values into the buckets
– Sort each bucket
– Go through the buckets in order, listing elements in each one
Input: A[1..n], where 0 ≤ A[i] < 1 for all i
Output: elements a_i sorted
Auxiliary array: B[0..n - 1] of linked lists, each list initially empty

BUCKET-SORT
Alg.: BUCKET-SORT(A, n)
for i ← 1 to n
    do insert A[i] into list B[⌊nA[i]⌋]
for i ← 0 to n - 1
    do sort list B[i] with insertion sort
concatenate lists B[0], B[1], ..., B[n-1] together in order
return the concatenated lists
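
BUCKET-SORT can be sketched in Python as shown below. This is a minimal illustration assuming every input value x satisfies 0 ≤ x < 1; Python lists stand in for the linked lists, and the helper name `insertion_sort` is chosen here for illustration.

```python
def insertion_sort(lst):
    """Sort lst in place with insertion sort."""
    for j in range(1, len(lst)):
        key = lst[j]
        i = j - 1
        while i >= 0 and lst[i] > key:
            lst[i + 1] = lst[i]
            i -= 1
        lst[i + 1] = key


def bucket_sort(a):
    """Return a sorted list of the values in a, each assumed to lie in [0, 1)."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        buckets[int(n * x)].append(x)  # int() is the floor for non-negative values
    result = []
    for b in buckets:                  # visit buckets in index order
        insertion_sort(b)
        result.extend(b)               # concatenation
    return result
```

The output is sorted for any input in [0, 1); the uniform-distribution assumption only affects how evenly the buckets fill, and therefore the running time.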

Example - Bucket Sort
[Figure: input values distributed into the linked lists B[0..n - 1], e.g. .72, .23, .68, .39]

Example - Bucket Sort
[Figure: buckets after sorting each list, e.g. .78, .26, .68, .39]
Concatenate the lists from 0 to n - 1 together, in order

Correctness of Bucket Sort
Consider two elements A[i], A[j]
Assume without loss of generality that A[i] ≤ A[j]
Then ⌊nA[i]⌋ ≤ ⌊nA[j]⌋
– A[i] belongs to the same bucket as A[j] or to a bucket with a lower index than that of A[j]
If A[i], A[j] belong to the same bucket:
– insertion sort puts them in the proper order
If A[i], A[j] are put in different buckets:
– concatenation of the lists puts them in the proper order

Analysis of Bucket Sort
Alg.: BUCKET-SORT(A, n)
for i ← 1 to n
    do insert A[i] into list B[⌊nA[i]⌋]              O(n)
for i ← 0 to n - 1
    do sort list B[i] with insertion sort           Θ(n) (expected, for uniformly distributed input)
concatenate lists B[0], B[1], ..., B[n-1] together in order      O(n)
return the concatenated lists
Overall time: Θ(n)

Conclusion
Any comparison sort needs Ω(n lg n) comparisons in the worst case to sort an array of n numbers
We can achieve a better running time for sorting if we can make certain assumptions on the input data:
– Counting sort: each of the n input elements is an integer in the range 0 to k
– Radix sort: the elements in the input are integers represented with d digits
– Bucket sort: the numbers in the input are uniformly distributed over the interval [0, 1)

Readings
Chapter 8