MS 101: Algorithms Instructor Neelima Gupta

Slides:



Advertisements
Similar presentations
Sorting in Linear Time Introduction to Algorithms Sorting in Linear Time CSE 680 Prof. Roger Crawfis.
Advertisements

Analysis of Algorithms
Sorting in linear time (for students to read) Comparison sort: –Lower bound:  (nlgn). Non comparison sort: –Bucket sort, counting sort, radix sort –They.
Analysis of Algorithms CS 477/677 Linear Sorting Instructor: George Bebis ( Chapter 8 )
Sorting in Linear Time Comp 550, Spring Linear-time Sorting Depends on a key assumption: numbers to be sorted are integers in {0, 1, 2, …, k}. Input:
Non-Comparison Based Sorting
CSE 3101: Introduction to the Design and Analysis of Algorithms
1 Sorting in Linear Time How can we do better?  CountingSort  RadixSort  BucketSort.
Mudasser Naseer 1 5/1/2015 CSC 201: Design and Analysis of Algorithms Lecture # 9 Linear-Time Sorting Continued.
CSE332: Data Abstractions Lecture 14: Beyond Comparison Sorting Dan Grossman Spring 2010.
Counting Sort Non-comparison sort. Precondition: n numbers in the range 1..k. Key ideas: For each x count the number C(x) of elements ≤ x Insert x at output.
CSCE 3110 Data Structures & Algorithm Analysis
Lecture 5: Linear Time Sorting Shang-Hua Teng. Sorting Input: Array A[1...n], of elements in arbitrary order; array size n Output: Array A[1...n] of the.
CS 253: Algorithms Chapter 8 Sorting in Linear Time Credit: Dr. George Bebis.
Comp 122, Spring 2004 Keys into Buckets: Lower bounds, Linear-time sort, & Hashing.
Comp 122, Spring 2004 Lower Bounds & Sorting in Linear Time.
Lecture 5: Master Theorem and Linear Time Sorting
CSE 326: Data Structures Sorting Ben Lerner Summer 2007.
Analysis of Algorithms CS 477/677
1 Today’s Material Lower Bounds on Comparison-based Sorting Linear-Time Sorting Algorithms –Counting Sort –Radix Sort.
Data Structures, Spring 2004 © L. Joskowicz 1 Data Structures – LECTURE 5 Linear-time sorting Can we do better than comparison sorting? Linear-time sorting.
David Luebke 1 7/2/2015 Linear-Time Sorting Algorithms.
Lower Bounds for Comparison-Based Sorting Algorithms (Ch. 8)
David Luebke 1 8/17/2015 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8
Sorting in Linear Time Lower bound for comparison-based sorting
CSE 373 Data Structures Lecture 15
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
Instructor Neelima Gupta Introduction to some tools to designing algorithms through Sorting Iterative Divide and Conquer.
1 Sorting in O(N) time CS302 Data Structures Section 10.4.
David Luebke 1 10/13/2015 CS 332: Algorithms Linear-Time Sorting Algorithms.
CSC 41/513: Intro to Algorithms Linear-Time Sorting Algorithms.
Introduction to Algorithms Jiafen Liu Sept
CSE332: Data Abstractions Lecture 14: Beyond Comparison Sorting Dan Grossman Spring 2012.
CS 61B Data Structures and Programming Methodology July 28, 2008 David Sun.
Analysis of Algorithms CS 477/677
Fall 2015 Lecture 4: Sorting in linear time
Mudasser Naseer 1 11/5/2015 CSC 201: Design and Analysis of Algorithms Lecture # 8 Some Examples of Recursion Linear-Time Sorting Algorithms.
Instructor Neelima Gupta Table of Contents Review of Lower Bounding Techniques Decision Trees Linear Sorting Selection Problems.
COSC 3101A - Design and Analysis of Algorithms 6 Lower Bounds for Sorting Counting / Radix / Bucket Sort Many of these slides are taken from Monica Nicolescu,
1 Algorithms CSCI 235, Fall 2015 Lecture 17 Linear Sorting.
Analysis of Algorithms CS 477/677 Lecture 8 Instructor: Monica Nicolescu.
CS6045: Advanced Algorithms Sorting Algorithms. Heap Data Structure A heap (nearly complete binary tree) can be stored as an array A –Root of tree is.
Linear Sorting. Comparison based sorting Any sorting algorithm which is based on comparing the input elements has a lower bound of Proof, since there.
Sorting Lower Bounds n Beating Them. Recap Divide and Conquer –Know how to break a problem into smaller problems, such that –Given a solution to the smaller.
Lecture 5 Algorithm Analysis Arne Kutzner Hanyang University / Seoul Korea.
CS6045: Advanced Algorithms Sorting Algorithms. Sorting So Far Insertion sort: –Easy to code –Fast on small inputs (less than ~50 elements) –Fast on nearly-sorted.
David Luebke 1 6/26/2016 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
1 Chapter 8-1: Lower Bound of Comparison Sorts. 2 About this lecture Lower bound of any comparison sorting algorithm – applies to insertion sort, selection.
David Luebke 1 7/2/2016 CS 332: Algorithms Linear-Time Sorting: Review + Bucket Sort Medians and Order Statistics.
Lower Bounds & Sorting in Linear Time
Sorting.
Linear-Time Sorting Continued Medians and Order Statistics
MCA 301: Design and Analysis of Algorithms
Introduction to Algorithms
Algorithm Design and Analysis (ADA)
Lecture 5 Algorithm Analysis
Linear Sorting Sections 10.4
Keys into Buckets: Lower bounds, Linear-time sort, & Hashing
Ch8: Sorting in Linear Time Ming-Te Chi
Lecture 5 Algorithm Analysis
Linear Sort "Our intuition about the future is linear. But the reality of information technology is exponential, and that makes a profound difference.
Linear Sorting Sorting in O(n) Jeff Chastine.
Linear Sort "Our intuition about the future is linear. But the reality of information technology is exponential, and that makes a profound difference.
Lower Bounds & Sorting in Linear Time
Linear Sorting Section 10.4
Linear-Time Sorting Algorithms
Lecture 5 Algorithm Analysis
Algorithms CSCI 235, Spring 2019 Lecture 18 Linear Sorting
Lecture 5 Algorithm Analysis
Presentation transcript:

MS 101: Algorithms Instructor Neelima Gupta

Table of Contents Lower Bounds Linear-Time Sorting Algorithms

Lower Bounds Please understand the following statements carefully. Any algorithm that sorts by removing at most one inversion per comparison does at least n(n-1)/2 comparisons in the worst case. –Proof will be done later. Hence Insertion Sort is optimal in this category of algorithms.

Optimal? What do you mean by the term “Optimal”? Answer: If an algorithm runs as fast in the worst case as it is possible (in the best case), we say that the algorithm is optimal. i.e if the worst case performance of an algorithm matches the lower bound, the algorithm is said to be “optimal”

How Fast Can We Sort? We will provide a lower bound, then beat it –How do you suppose we’ll beat it? First, an observation: all of the sorting algorithms so far are comparison sorts –The only operation used to gain ordering information about a sequence is the pairwise comparison of two elements –Theorem: all comparison sorts are  (n lg n) A comparison sort must do at least  (n) comparisons (why?) What about the gap between  (n) and  (n lg n)?

Decision Trees Decision trees provide an abstraction of comparison sorts –A decision tree represents the comparisons made by a comparison sort. Every thing else ignored –(Draw examples on board) What do the leaves represent? How many leaves must there be?

Decision Trees Decision trees can model comparison sorts. For a given algorithm: –One tree for each n –Tree paths are all possible execution traces –What’s the longest path in a decision tree for insertion sort? For merge sort? What is the asymptotic height of any decision tree for sorting n elements? Answer:  (n lg n) (now let’s prove it…)

Lower Bound For Comparison Sorting Thm: Any decision tree that sorts n elements has height  (n lg n) What’s the minimum # of leaves? What’s the maximum # of leaves of a binary tree of height h? Clearly the minimum # of leaves is less than or equal to the maximum # of leaves

Lower Bound For Comparison Sorting So we have… n!  2 h Taking logarithms: lg (n!)  h Stirling’s approximation tells us: Thus:

Lower Bound For Comparison Sorting So we have Thus the minimum height of a decision tree is  (n lg n)

Lower Bound For Comparison Sorts Thus the time to comparison sort n elements is  (n lg n) Corollary: Heapsort and Mergesort are asymptotically optimal comparison sorts But the name of this lecture is “Sorting in linear time”! –How can we do better than  (n lg n)?

Sorting In Linear Time Counting sort –No comparisons between elements! –But…depends on assumption about the numbers being sorted We assume numbers are in the range 1.. k –The algorithm: Input: A[1..n], where A[j]  {1, 2, 3, …, k} Output: B[1..n], sorted (notice: not sorting in place) Also: Array C[1..k] for auxiliary storage (constant for constant k) Can be made in place by computing the cumulative frequencies. It is stable. Ref : Cormen for implementation.

Counting Sort (skip) 1CountingSort(A, B, k) 2for i=1 to k 3C[i]= 0; 4for j=1 to n 5C[A[j]] += 1; 6for i=2 to k 7C[i] = C[i] + C[i-1]; 8for j=n downto 1 9B[C[A[j]]] = A[j]; 10C[A[j]] -= 1; Work through example: A={ }, k = 4

Counting Sort(skip) 1CountingSort(A, B, k) 2for i=1 to k 3C[i]= 0; 4for j=1 to n 5C[A[j]] += 1; 6for i=2 to k 7C[i] = C[i] + C[i-1]; 8for j=n downto 1 9B[C[A[j]]] = A[j]; 10C[A[j]] -= 1; What will be the running time? Takes time O(k) Takes time O(n)

Counting Sort Total time: O(n + k) –Usually, k = O(n) –Thus counting sort runs in O(n) time But sorting is  (n lg n)! –No contradiction--this is not a comparison sort (in fact, there are no comparisons at all!) –Notice that this algorithm is stable

Counting Sort Cool! Why don’t we always use counting sort? Because it depends on range k of elements Could we use counting sort to sort 32 bit integers? Why or why not? Answer: no, k too large (2 32 = 4,294,967,296)

Radix Sort Intuitively, you might sort on the most significant digit, then the second msd, etc. Problem: lots of intermediate piles to keep track of Key idea: sort the least significant digit first RadixSort(A, d) for i=1 to d StableSort(A) on digit i

Radix Sort Can we prove it will work? Sketch of an inductive argument (induction on the number of passes): –Assume lower-order digits {j: j<i}are sorted –Show that sorting next digit i leaves array correctly sorted If two digits at position i are different, ordering numbers by that digit is correct (lower-order digits irrelevant) If they are the same, numbers are already sorted on the lower- order digits. Since we use a stable sort, the numbers stay in the right order

Radix Sort What sort will we use to sort on digits? Counting sort is obvious choice: –Sort n numbers on digits that range from 1..k –Time: O(n + k) Each pass over n numbers with d digits takes time O(n+k), so total time O(dn+dk) –When d is constant and k=O(n), takes O(n) time How many bits in a computer word?

Radix Sort Problem: sort 1 million 64-bit numbers –Treat as four-digit radix 2 16 numbers –Can sort in just four passes with radix sort! Compares well with typical O(n lg n) comparison sort –Requires approx lg n = 20 operations per number being sorted So why would we ever use anything but radix sort?

Radix Sort In general, radix sort based on counting sort is –Fast –Asymptotically fast (i.e., O(n)) –Simple to code –A good choice To think about: Can radix sort be used on floating-point numbers?

Bucket Sort BUCKET_SORT (A) 1 n ← length [A], k  number of buckets 2 For i = 1 to n do 3 Insert A[i] into an appropriate bucket. 4 For i = 1 to k do 5 Sort ith bucket using any reasonable comparison sort. 6 Concatenate the buckets together in order

An example

Analysis Let S(m) denote the number of comparisons for a bucket with m keys. Let n i be the no of keys in the i th bucket. Total number of comparisons = ∑ k i=1 S(n i ) Total number of comparisons =

Analysis contd.. Let, S(m)=Θ(m (log m)) If the keys are uniformly distributed the expected size of each bucket=n/k Total number of comparisons = k(n/k) log(n/k) = n log(n/k). If k=n/10, then n log(10) comparisons would be done and running time would be linear in n.

How should we implement the buckets? Linked Lists or Arrays? Linked list saves space as some buckets may have few entries and some may have lots With linked fast algorithms like Quick Sort,Heap Sort cannot be used.

Can we use bucket sort recursively? No, With linked list implementation it involves too much of bookkeeping. With array implementation takes too much space.

When is bucket sort most suitable? When the input is uniformly distributed