CSE332: Data Abstractions
Lecture 14: Beyond Comparison Sorting
Dan Grossman
Spring 2010
The Big Picture
Surprising amount of juicy computer science: 2-3 lectures…
Simple algorithms: O(n²)  (insertion sort, selection sort, shell sort, …)
Fancier algorithms: O(n log n)  (heap sort, merge sort, quick sort (avg), …)
Comparison lower bound: Ω(n log n)
Specialized algorithms: O(n)  (bucket sort, radix sort)
Handling huge data sets  (external sorting)
How fast can we sort?
Heapsort & mergesort have O(n log n) worst-case running time
Quicksort has O(n log n) average-case running time
These bounds are all tight, actually Θ(n log n)
So maybe we need to dream up another algorithm with a lower asymptotic complexity, such as O(n) or O(n log log n)
–Instead: prove that this is impossible
Assuming our comparison model: the only operation an algorithm can perform on data items is a 2-element comparison
Permutations
Assume we have n elements to sort
–And for simplicity, none are equal (no duplicates)
How many permutations (possible orderings) of the elements?
Example, n=3:
  a[0]<a[1]<a[2]   a[0]<a[2]<a[1]   a[1]<a[0]<a[2]
  a[1]<a[2]<a[0]   a[2]<a[0]<a[1]   a[2]<a[1]<a[0]
In general, n choices for the least element, then n-1 for the next, then n-2 for the next, …
–n(n-1)(n-2)…(2)(1) = n! possible orderings
Describing every comparison sort
So every sorting algorithm has to “find” the right answer among the n! possible answers
Starts “knowing nothing” and gains information with each comparison
–Intuition: at best, each comparison can eliminate half of the remaining possibilities
Can represent this process as a decision tree
–Nodes are “remaining possibilities”
–Edges are “answers from a comparison”
–This is not a data structure; it’s what our proof uses to represent “the most any algorithm could know”
Decision tree for n=3
[Figure: the decision tree. The root holds all 3! = 6 orderings of a, b, c; each comparison, starting with a ? b, splits the remaining possibilities along its two outcomes.]
The leaves contain all the possible orderings of a, b, c
What the decision tree tells us
A binary tree because each comparison has 2 outcomes
–No duplicate elements
–Assume algorithm not so dumb as to ask redundant questions
Because any data is possible, any algorithm needs to ask enough questions to produce all n! answers
–Each answer is a leaf (no more questions to ask)
–So the tree must be big enough to have n! leaves
–Running any algorithm on any input will at best correspond to one root-to-leaf path in the decision tree
–So no algorithm can have worst-case running time better than the height of the decision tree
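As a concrete instance of this counting argument (an aside, not on the original slide): for n = 5 there are 5! = 120 possible orderings, so the decision tree of any comparison sort needs at least 120 leaves, and hence any comparison sort needs at least 7 comparisons in the worst case:

  h \ge \lceil \log_2 5! \rceil = \lceil \log_2 120 \rceil = 7
  \quad (\text{since } 2^6 = 64 < 120 \le 128 = 2^7)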
Example
[Figure: the same decision tree for n = 3, with one root-to-leaf path highlighted; shaded nodes show the remaining possible orders and the highlighted leaf shows the actual order.]
Where are we
Proven: no comparison sort can have worst-case running time better than the height of a binary tree with n! leaves
–Turns out the average case is the same asymptotically
Now: show that a binary tree with n! leaves has height Ω(n log n)
–The factorial function grows very quickly
Conclusion: (comparison) sorting is Ω(n log n)
–This is an amazing computer-science result: it proves that all the clever programming in the world cannot sort in linear time
Lower bound on height
The height of a binary tree with L leaves is at least log₂ L
So the height of our decision tree, h:
  h ≥ log₂(n!)                                  property of binary trees
    = log₂(n·(n-1)·(n-2)…(2)(1))                definition of factorial
    = log₂ n + log₂(n-1) + … + log₂ 1           property of logarithms
    ≥ log₂ n + log₂(n-1) + … + log₂(n/2)        drop smaller terms (each ≥ 0)
    ≥ (n/2) log₂(n/2)                           each of the n/2 terms left is ≥ log₂(n/2)
    = (n/2)(log₂ n − log₂ 2)                    property of logarithms
    = (1/2) n log₂ n − (1/2) n                  arithmetic
So h is Ω(n log n)
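As an aside beyond the slides, Stirling's approximation gives the same Ω(n log n) bound with a sharper constant:

  \log_2(n!) = n\log_2 n - n\log_2 e + O(\log n) \approx n\log_2 n - 1.44\,n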
The Big Picture
Surprising amount of juicy computer science: 2-3 lectures…
Simple algorithms: O(n²)  (insertion sort, selection sort, shell sort, …)
Fancier algorithms: O(n log n)  (heap sort, merge sort, quick sort (avg), …)
Comparison lower bound: Ω(n log n)
Specialized algorithms: O(n)  (bucket sort, radix sort)   … huh???
Handling huge data sets  (external sorting)
How can O(n) be possible? Change the model: assume we can do more than just compare items!
BucketSort (a.k.a. BinSort)
If all values to be sorted are known to be integers between 1 and K (or any small range), create an array of size K and put each element in its proper bucket (a.k.a. bin)
–If the data are only integers, we don't even need to store anything more than a count of how many times each bucket has been used
Output the result via a linear pass through the array of buckets

Example: K=5, input (5,1,3,4,3,2,1,1,5,4,5)
  count array: bucket 1 → 3, bucket 2 → 1, bucket 3 → 2, bucket 4 → 2, bucket 5 → 3
  output: 1,1,1,2,3,3,4,4,5,5,5
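A minimal sketch of this counting variant (my own code, not from the slides; it assumes int keys known to lie in 1..K):

// Counting variant of bucket sort for integer keys in 1..K: O(n + K).
public class BucketSortSketch {
    public static void bucketSort(int[] a, int K) {
        int[] count = new int[K + 1];              // count[v] = occurrences of value v
        for (int v : a)
            count[v]++;
        int i = 0;
        for (int v = 1; v <= K; v++)               // linear pass over the buckets
            for (int c = 0; c < count[v]; c++)
                a[i++] = v;
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 3, 4, 3, 2, 1, 1, 5, 4, 5};      // the slide's example input
        bucketSort(a, 5);
        System.out.println(java.util.Arrays.toString(a)); // [1, 1, 1, 2, 3, 3, 4, 4, 5, 5, 5]
    }
}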
Analyzing bucket sort
Good when the range, K, is smaller than (or not much larger than) the number of elements, n
–Don't spend time doing lots of comparisons of duplicates!
Bad when K is much larger than n
–Wasted space; wasted time during the final linear O(K) pass
Overall: O(n+K)
–Linear in n, but also linear in K
–The Ω(n log n) lower bound does not apply because this is not a comparison sort
For data in addition to integer keys, keep a list at each bucket (see the sketch below)
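For that last bullet, a hedged sketch of bucket sort with a list per bucket, so records with equal keys keep their input order; the generic signature and key extractor are illustrative choices, not from the slides:

import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

public class BucketSortWithLists {
    // Stable bucket sort for records whose integer keys lie in 1..K:
    // each bucket's list preserves append order, so equal keys stay in input order.
    public static <T> void bucketSortByKey(List<T> items, int K, ToIntFunction<T> key) {
        List<List<T>> buckets = new ArrayList<>();
        for (int b = 0; b <= K; b++)
            buckets.add(new ArrayList<>());
        for (T item : items)
            buckets.get(key.applyAsInt(item)).add(item);
        items.clear();
        for (int b = 1; b <= K; b++)               // linear output pass, buckets in key order
            items.addAll(buckets.get(b));
    }
}

For example (with a hypothetical record type), bucketSortByKey(people, 120, p -> p.getAge()) would sort a List<Person> by age while keeping people of the same age in their original order.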
Radix sort
Radix = “the base of a number system”
–Examples will use 10 because we are used to that
–In implementations, use larger radixes
–For example, for ASCII strings, might use 128
Idea:
–Bucket sort on one digit at a time
  Number of buckets = radix
  Starting with the least significant digit
  Keeping each pass stable
–After k passes (digits), the last k digits are sorted
Aside: origins go back to the 1890 U.S. census
Example (radix = 10)
The slides step through a three-pass example on a list of numbers; the concrete values appear only in the figures:
–First pass: bucket sort by the ones digit, then read the buckets out in order
–Second pass: stable bucket sort of that result by the tens digit
–Third pass: stable bucket sort of that result by the 100s digit; the list is now fully sorted
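Since the concrete numbers in this example live only in the slides' figures, here is a small LSD radix sort sketch in the same spirit (radix 10, one stable bucket pass per digit); the code and the sample input are mine, not the slides':

import java.util.ArrayList;
import java.util.List;

public class RadixSortSketch {
    // LSD radix sort for non-negative ints: one stable bucket pass per digit,
    // least significant digit first, so after pass k the last k digits are sorted.
    public static void radixSort(int[] a, int numDigits) {
        int divisor = 1;
        for (int pass = 0; pass < numDigits; pass++) {
            List<List<Integer>> buckets = new ArrayList<>();
            for (int b = 0; b < 10; b++)
                buckets.add(new ArrayList<>());
            for (int x : a)
                buckets.get((x / divisor) % 10).add(x);   // digit examined this pass
            int i = 0;
            for (List<Integer> bucket : buckets)          // read buckets in order: stable
                for (int x : bucket)
                    a[i++] = x;
            divisor *= 10;
        }
    }

    public static void main(String[] args) {
        int[] a = {170, 45, 75, 90, 802, 24, 2, 66};      // illustrative input only
        radixSort(a, 3);
        System.out.println(java.util.Arrays.toString(a)); // [2, 24, 45, 66, 75, 90, 170, 802]
    }
}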
Analysis
Input size: n
Number of buckets = radix: B
Number of passes = “digits”: P
Work per pass is one bucket sort: O(B+n)
Total work is O(P(B+n))
Compared to comparison sorts, sometimes a win, but often not
–Example: strings of English letters up to length 15
  Run-time is roughly 15(52 + n)
  This is less than n log n only once n > about 33,000
–Of course, the cross-over point depends on constant factors of the implementations, plus P and B
–And radix sort can have poor locality properties
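A quick check of that 33,000 cross-over figure, ignoring constant factors (an aside, not on the slide):

  15(52+n) \approx 15n \ \text{for large } n, \qquad
  15n < n\log_2 n \iff \log_2 n > 15 \iff n > 2^{15} = 32{,}768 \approx 33{,}000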
Last word on sorting
Simple O(n²) sorts can be fastest for small n
–selection sort, insertion sort (the latter is linear for mostly-sorted input)
–good for “below a cut-off” to help divide-and-conquer sorts
O(n log n) sorts
–heap sort: in place, but not stable and not easily parallelizable
–merge sort: not in place, but stable and works as an external sort
–quick sort: in place, but not stable and O(n²) in the worst case; often fastest, but depends on the costs of comparisons/copies
Ω(n log n) is the worst-case and average-case lower bound for sorting by comparisons
Non-comparison sorts
–bucket sort: good for a small number of key values
–radix sort: uses fewer buckets and more phases
Best way to sort? It depends!