Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8


Some of these slides are courtesy of D. Plaisted et al., UNC, and M. Nicolescu, UNR.

Comparison-based Sorting
A comparison sort may use only comparisons between pairs of elements to gain order information about a sequence. Hence a lower bound on the number of comparisons is a lower bound on the complexity of any comparison-based sorting algorithm.
All the sorts we have analyzed so far are comparison sorts, and the best worst-case complexity so far is Θ(n lg n).
To sort n elements, a comparison sort must make Ω(n lg n) comparisons in the worst case. We prove this Ω(n lg n) lower bound for any comparison sort, which implies that merge sort and heapsort are asymptotically optimal.

Decision Tree Model
Full binary tree: every node is either a leaf or has degree 2.
The tree represents the comparisons made by a sorting algorithm on an input of a given size; it models all possible execution traces.
Control, data movement, and other operations are ignored; only the comparisons are counted.
[Figure: decision tree for insertion sort on three elements, with an internal node, a leaf, and one execution trace marked.]

Decision Tree Model
Each of the n! permutations of the n elements must appear as one of the leaves of the decision tree.
Worst-case number of comparisons = the length of the longest path from the root to a leaf = the height of the decision tree.
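To make the link between tree height and worst-case comparisons concrete, here is a minimal Python sketch. It runs insertion sort on every permutation of n distinct keys, counts key comparisons, and takes the maximum, which is the height of that algorithm's decision tree. The function names are illustrative and not from the slides; the brute-force search is only practical for small n.

```python
from itertools import permutations
from math import ceil, factorial, log2


def insertion_sort_comparisons(items):
    """Insertion-sort a copy of items; return the number of key comparisons."""
    a = list(items)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1      # one comparison of a[i] against key
            if a[i] <= key:
                break
            a[i + 1] = a[i]       # shift the larger element right
            i -= 1
        a[i + 1] = key
    return comparisons


def worst_case_comparisons(n):
    """Height of insertion sort's decision tree on n distinct keys (brute force)."""
    return max(insertion_sort_comparisons(p) for p in permutations(range(n)))


def information_bound(n):
    """Smallest height any decision tree with n! leaves can have: ceil(lg n!)."""
    return ceil(log2(factorial(n)))
```

For three elements both quantities equal 3, so insertion sort's tree is as shallow as possible there; for four elements insertion sort needs 6 comparisons in the worst case while the bound is 5.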

Decision Tree Model
Goal: find a lower bound on the running time of any comparison sort algorithm.
To do so, find a lower bound on the heights of all decision trees, over all algorithms.

Lemma
Any binary tree of height h has at most 2^h leaves.
Proof: by induction on h.
Basis: h = 0. The tree has one node, which is a leaf, and 2^h = 2^0 = 1.
Inductive step: assume the claim holds for height h - 1 (at most 2^(h-1) leaves). Extend the height of the tree by one more level: each leaf becomes the parent of at most two new leaves (exactly two in a complete binary tree). Then
no. of leaves of a tree of height h ≤ 2 · (no. of leaves of a tree of height h - 1) ≤ 2 · 2^(h-1) = 2^h.

Lower Bound for Comparison Sorts
Theorem: Any comparison sort algorithm requires Ω(n lg n) comparisons in the worst case.
Proof: Consider a decision tree of height h with l leaves. How many leaves does the tree have?
At least n!, since each of the n! permutations of the input appears as some leaf: n! ≤ l.
At most 2^h, by the lemma: l ≤ 2^h.
Therefore n! ≤ l ≤ 2^h, so h ≥ lg(n!) = Ω(n lg n) (by equation 3.18).
Which comparison sorts are asymptotically optimal? Merge sort and heapsort.
We can beat the Ω(n lg n) running time if we use operations other than comparisons!
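The step lg(n!) = Ω(n lg n) is cited from equation 3.18. As a sketch, the same bound can be checked with an elementary argument that avoids Stirling's approximation by keeping only the larger half of the terms of the sum; this derivation is an addition, not taken from the slides.

```latex
\begin{align*}
\lg(n!) &= \sum_{i=1}^{n} \lg i \\
        &\ge \sum_{i=\lceil n/2 \rceil}^{n} \lg i
            && \text{(drop the smaller terms)} \\
        &\ge \frac{n}{2}\,\lg\frac{n}{2}
            && \text{(at least $n/2$ terms, each $\ge \lg(n/2)$)} \\
        &= \frac{n}{2}\lg n - \frac{n}{2} \;=\; \Omega(n \lg n).
\end{align*}
```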

Beating the lower bound
We can beat the lower bound if we don't base our sort on comparisons:
- Counting sort for keys in [0..k], k = O(n)
- Radix sort for keys with a fixed number of "digits"
- Bucket sort for random keys (uniformly distributed)

Counting Sort
Assumption: the elements to be sorted are integers in the range 0 to k.
Idea: determine, for each input element x, the number of elements smaller than or equal to x, then place x directly into its correct position in the output array.
[Figure: input array A, count array C, and sorted output array B.]

Counting Sort
Alg.: COUNTING-SORT(A, B, n, k)
    for i ← 0 to k
        do C[i] ← 0
    for j ← 1 to n
        do C[A[j]] ← C[A[j]] + 1
    // C[i] now contains the number of elements equal to i
    for i ← 1 to k
        do C[i] ← C[i] + C[i - 1]
    // C[i] now contains the number of elements ≤ i
    for j ← n downto 1
        do B[C[A[j]]] ← A[j]
           C[A[j]] ← C[A[j]] - 1
[Figure: arrays A[1..n], C[0..k], and B[1..n].]
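The pseudocode translates almost line for line into Python. The sketch below assumes 0-based lists, so the count is decremented before it is used as an output index; the function name and the choice to return a new list rather than fill a caller-supplied B are illustrative choices, not part of the slides.

```python
def counting_sort(a, k):
    """Return a new sorted list. Assumes every element of a is an integer in 0..k."""
    count = [0] * (k + 1)
    for x in a:
        count[x] += 1                  # count[i] = number of elements equal to i
    for i in range(1, k + 1):
        count[i] += count[i - 1]       # count[i] = number of elements <= i
    out = [None] * len(a)
    for x in reversed(a):              # right to left, as in "for j <- n downto 1"
        count[x] -= 1                  # 0-based output: decrement first
        out[count[x]] = x
    return out
```

For example, `counting_sort([2, 5, 3, 0, 2, 3, 0, 3], 5)` returns `[0, 0, 2, 2, 3, 3, 3, 5]`.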

Example
Input (read off the step labels): A[1..8] = 2 5 3 0 2 3 0 3, with keys in 0..5.
CountingSort(A, B, k)
1. for i ← 0 to k
2.     do C[i] ← 0
3. for j ← 1 to length[A]
4.     do C[A[j]] ← C[A[j]] + 1
5. for i ← 1 to k
6.     do C[i] ← C[i] + C[i - 1]
7. for j ← length[A] downto 1
8.     do B[C[A[j]]] ← A[j]
9.        C[A[j]] ← C[A[j]] - 1
[Figure: trace of the final loop (lines 7-9), showing B and C after placing A[8] = 3, A[7] = 0, A[6] = 3, and A[5] = 2.]

Example (cont.)
[Figure: the trace continues with A[4] = 0, A[3] = 3, A[2] = 5, and A[1] = 2, after which B holds the sorted output 0 0 2 2 3 3 3 5.]

Counting-Sort (A, B, k)
1. for i ← 0 to k
2.     do C[i] ← 0
3. for j ← 1 to length[A]
4.     do C[A[j]] ← C[A[j]] + 1
5. for i ← 1 to k
6.     do C[i] ← C[i] + C[i - 1]
7. for j ← length[A] downto 1
8.     do B[C[A[j]]] ← A[j]
9.        C[A[j]] ← C[A[j]] - 1
Lines 1-2: O(k). Lines 3-4: O(n). Lines 5-6: O(k). Lines 7-9: O(n).
Overall time: Θ(n + k).
Stable, but not in place.
No comparisons made: it uses the actual values of the elements to index into an array.
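Stability only becomes visible when elements carry more than their key. The following sketch sorts arbitrary records by an integer key in 0..k and shows that records with equal keys keep their input order. The `key` parameter and the function name are assumptions made for illustration.

```python
def counting_sort_by_key(records, k, key):
    """Stable counting sort of records; key(r) must be an integer in 0..k."""
    count = [0] * (k + 1)
    for r in records:
        count[key(r)] += 1
    for i in range(1, k + 1):
        count[i] += count[i - 1]       # count[i] = number of records with key <= i
    out = [None] * len(records)
    for r in reversed(records):        # scanning right to left preserves input order
        count[key(r)] -= 1
        out[count[key(r)]] = r
    return out


pairs = [(3, "a"), (0, "b"), (3, "c"), (2, "d"), (0, "e")]
by_first = counting_sort_by_key(pairs, 3, key=lambda r: r[0])
# (0, "b") stays ahead of (0, "e"), and (3, "a") ahead of (3, "c")
```

Scanning the input left to right in the last loop would still sort correctly but would reverse the order of equal keys, which would break radix sort when it is used as the per-digit sort.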

Radix Sort
Radix sort was used by card-sorting machines. Card sorters worked on one column at a time; it is the algorithm for using the machine that extends the technique to multi-column sorting. The human operator was part of the algorithm!
Key idea: sort on the "least significant digit" first, then on the remaining digits in order of increasing significance.
The sorting method used on each digit must be "stable".
If we start with the "most significant digit", we'll need extra storage.

Radix Sort
Treats keys as numbers written in base k.
A d-digit number occupies a field of d columns.
Sorting looks at one column at a time: for d-digit numbers, sort on the least significant digit first, then continue with the next least significant digit until all digits have been sorted.
Requires only d passes through the list.

An Example
Input    After sorting on LSD    After sorting on middle digit    After sorting on MSD
392      631                     928                              356
356      392                     631                              392
446      532                     532                              446
928      495                     446                              495
631      356                     356                              532
532      446                     392                              631
495      928                     495                              928

Radix-Sort(A, d)
RadixSort(A, d)
1. for i ← 1 to d
       do use a stable sort to sort array A on digit i
Digit 1 is the lowest-order digit; digit d is the highest-order digit.
Correctness of Radix Sort
By induction on the number of digits sorted. Assume that radix sort works for d - 1 digits, and show that it works for d digits.
Radix sort of d digits ≡ radix sort of the low-order d - 1 digits followed by a sort on digit d.
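A minimal Python sketch of this loop, assuming non-negative integer keys and using a stable counting sort on each digit. The `base` parameter and the function name are illustrative and do not appear in the slides.

```python
def radix_sort(a, d, base=10):
    """LSD radix sort; assumes non-negative integers with at most d digits in `base`."""
    for i in range(d):                     # digit 0 here is the lowest-order digit
        divisor = base ** i
        count = [0] * base
        for x in a:
            count[(x // divisor) % base] += 1
        for v in range(1, base):
            count[v] += count[v - 1]       # count[v] = number of keys with digit <= v
        out = [0] * len(a)
        for x in reversed(a):              # right-to-left scan keeps the pass stable
            digit = (x // divisor) % base
            count[digit] -= 1
            out[count[digit]] = x
        a = out                            # the next pass sorts this pass's output
    return a
```

Calling it with d = 1, 2, and 3 on the input 392, 356, 446, 928, 631, 532, 495 gives the order after sorting on the least significant digit, after the middle digit, and the fully sorted list.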

Correctness of Radix sort
We use induction on the number d of passes through the digits.
Basis: if d = 1, there is only one digit, so the claim is trivial.
Inductive step: assume the numbers are sorted on digits 1, 2, ..., d-1. Now sort on the d-th digit, writing a_d and b_d for the d-th digits of a and b:
- If a_d < b_d, the sort puts a before b. Correct, since a < b regardless of the low-order digits.
- If a_d > b_d, the sort puts a after b. Correct, since a > b regardless of the low-order digits.
- If a_d = b_d, the sort leaves a and b in the same order (because it is stable), and a and b are already sorted on the low-order d-1 digits.

Analysis of Radix Sort
Given n numbers of d digits each, where each digit may take up to k possible values, RADIX-SORT correctly sorts the numbers in Θ(d(n + k)) time.
One pass of sorting per digit takes Θ(n + k), assuming that we use counting sort.
There are d passes (one for each digit).

Bucket Sort
Assumption: the input is generated by a random process that distributes elements uniformly over [0, 1).
(A discrete uniform distribution is a discrete probability distribution in which all values of a finite set of possible values are equally probable.)
Idea:
- Divide [0, 1) into n equal-sized buckets
- Distribute the n input values into the buckets
- Sort each bucket
- Go through the buckets in order, listing the elements in each one
Input: A[1 . . n], where 0 ≤ A[i] < 1 for all i
Output: elements in A sorted
Auxiliary array: B[0 . . n - 1] of linked lists, each list initially empty
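The four steps can be sketched in Python as follows. This version assumes floats in [0, 1), uses Python lists in place of linked lists, and sorts each bucket with insertion sort; the function names are illustrative.

```python
def insertion_sort(b):
    """Sort the list b in place."""
    for j in range(1, len(b)):
        key = b[j]
        i = j - 1
        while i >= 0 and b[i] > key:
            b[i + 1] = b[i]
            i -= 1
        b[i + 1] = key


def bucket_sort(a):
    """Return a new sorted list. Assumes 0 <= x < 1 for every x in a."""
    n = len(a)
    buckets = [[] for _ in range(n)]       # B[0 .. n-1], each initially empty
    for x in a:
        index = min(int(n * x), n - 1)     # bucket i covers [i/n, (i+1)/n); min() is a guard
        buckets[index].append(x)
    result = []
    for b in buckets:                      # go through the buckets in order
        insertion_sort(b)
        result.extend(b)
    return result
```

With the ten inputs .78 .17 .39 .26 .72 .94 .21 .12 .23 .68, bucket 1 receives .17 and .12, bucket 2 receives .26, .21, and .23, and the concatenated result is sorted.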

Example - Bucket Sort
Input A[1..10]: .78 .17 .39 .26 .72 .94 .21 .12 .23 .68
Buckets B[0..9] after distributing the input (/ marks the end of a list):
0: /
1: .17 .12 /
2: .26 .21 .23 /
3: .39 /
4: /
5: /
6: .68 /
7: .78 .72 /
8: /
9: .94 /

Example - Bucket Sort
Buckets after sorting each list:
0: /
1: .12 .17 /
2: .21 .23 .26 /
3: .39 /
4: /
5: /
6: .68 /
7: .72 .78 /
8: /
9: .94 /
Concatenate the lists from 0 to n - 1 together, in order:
.12 .17 .21 .23 .26 .39 .68 .72 .78 .94

Analysis
The analysis relies on no bucket getting too many values.
All lines except the insertion sorting take O(n) altogether.
Intuitively, if each bucket gets a constant number of elements, it takes O(1) time to sort each bucket ⇒ O(n) sort time for all buckets.
We "expect" each bucket to have few elements, since the average is 1 element per bucket. But we need to do a careful analysis.
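The careful analysis hinges on the expected squared bucket size, because insertion sort on a bucket of n_i elements costs O(n_i^2). With n uniform inputs and n buckets, each n_i is binomial with parameters n and 1/n, so E[n_i^2] = Var[n_i] + (E[n_i])^2 = (1 - 1/n) + 1 = 2 - 1/n, a constant. The short simulation below is a sanity check of that value, not a proof; the function name, trial count, and seed are arbitrary choices.

```python
import random


def mean_squared_bucket_size(n, trials=20, seed=0):
    """Estimate E[n_i^2] when n uniform values in [0, 1) are dropped into n buckets."""
    rng = random.Random(seed)              # fixed seed so the estimate is reproducible
    total = 0.0
    for _trial in range(trials):
        sizes = [0] * n
        for _item in range(n):
            sizes[int(n * rng.random())] += 1
        total += sum(s * s for s in sizes) / n   # average of n_i^2 over the buckets
    return total / trials
```

For n = 1000 the estimate should land close to 2 - 1/1000, which is why the total insertion-sort cost over all buckets is O(n) in expectation.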

Summary
Non-Comparison Based Sorts: Running Time

                 worst-case      average-case    best-case       in place
Counting Sort    O(n + k)        O(n + k)        O(n + k)        no
Radix Sort       O(d(n + k'))    O(d(n + k'))    O(d(n + k'))    no
Bucket Sort                      O(n)                            no

Counting sort assumes input elements are in the range [0, 1, 2, ..., k] and uses array indexing to count the number of occurrences of each value.
Radix sort assumes each integer consists of d digits, and each digit is in the range [1, 2, ..., k'].
Bucket sort requires advance knowledge of the input distribution (it sorts n numbers uniformly distributed in the range [0, 1) in O(n) time).