Sorting and Lower Bounds 15-211 Fundamental Data Structures and Algorithms Peter Lee February 25, 2003.

Slides:



Advertisements
Similar presentations
Algorithms Analysis Lecture 6 Quicksort. Quick Sort Divide and Conquer.
Advertisements

Theory of Computing Lecture 3 MAS 714 Hartmut Klauck.
Sorting Comparison-based algorithm review –You should know most of the algorithms –We will concentrate on their analyses –Special emphasis: Heapsort Lower.
CSE332: Data Abstractions Lecture 14: Beyond Comparison Sorting Dan Grossman Spring 2010.
§7 Quicksort -- the fastest known sorting algorithm in practice 1. The Algorithm void Quicksort ( ElementType A[ ], int N ) { if ( N < 2 ) return; pivot.
CSC 213 – Large Scale Programming or. Today’s Goals  Begin by discussing basic approach of quick sort  Divide-and-conquer used, but how does this help?
Quick Sort, Shell Sort, Counting Sort, Radix Sort AND Bucket Sort
CSCE 3110 Data Structures & Algorithm Analysis
Chapter 4: Divide and Conquer Master Theorem, Mergesort, Quicksort, Binary Search, Binary Trees The Design and Analysis of Algorithms.
Using Divide and Conquer for Sorting
Liang, Introduction to Java Programming, Eighth Edition, (c) 2011 Pearson Education, Inc. All rights reserved Chapter 24 Sorting.
Quicksort CS 3358 Data Structures. Sorting II/ Slide 2 Introduction Fastest known sorting algorithm in practice * Average case: O(N log N) * Worst case:
25 May Quick Sort (11.2) CSE 2011 Winter 2011.
Quicksort COMP171 Fall Sorting II/ Slide 2 Introduction * Fastest known sorting algorithm in practice * Average case: O(N log N) * Worst case: O(N.
Data Structures Data Structures Topic #13. Today’s Agenda Sorting Algorithms: Recursive –mergesort –quicksort As we learn about each sorting algorithm,
CSC 2300 Data Structures & Algorithms March 23, 2007 Chapter 7. Sorting.
Sorting Chapter Sorting Consider list x 1, x 2, x 3, … x n We seek to arrange the elements of the list in order –Ascending or descending Some O(n.
1 Sorting Problem: Given a sequence of elements, find a permutation such that the resulting sequence is sorted in some order. We have already seen: –Insertion.
Quick-Sort     29  9.
© 2004 Goodrich, Tamassia Quick-Sort     29  9.
CS 171: Introduction to Computer Science II Quicksort.
CS 206 Introduction to Computer Science II 04 / 27 / 2009 Instructor: Michael Eckmann.
CSC 2300 Data Structures & Algorithms March 27, 2007 Chapter 7. Sorting.
Chapter 4: Divide and Conquer The Design and Analysis of Algorithms.
Sorting. Introduction Assumptions –Sorting an array of integers –Entire sort can be done in main memory Straightforward algorithms are O(N 2 ) More complex.
TTIT33 Algorithms and Optimization – Dalg Lecture 2 HT TTIT33 Algorithms and optimization Lecture 2 Algorithms Sorting [GT] 3.1.2, 11 [LD] ,
Analysis of Algorithms CS 477/677
CSC 2300 Data Structures & Algorithms March 20, 2007 Chapter 7. Sorting.
1 Today’s Material Lower Bounds on Comparison-based Sorting Linear-Time Sorting Algorithms –Counting Sort –Radix Sort.
Lecture 8 Sorting. Sorting (Chapter 7) We have a list of real numbers. Need to sort the real numbers in increasing order (smallest first). Important points.
Chapter 7 (Part 2) Sorting Algorithms Merge Sort.
Sorting Chapter 6 Chapter 6 –Insertion Sort 6.1 –Quicksort 6.2 Chapter 5 Chapter 5 –Mergesort 5.2 –Stable Sorts Divide & Conquer.
Lecture 8 Sorting. Sorting (Chapter 7) We have a list of real numbers. Need to sort the real numbers in increasing order (smallest first). Important points.
© 2006 Pearson Addison-Wesley. All rights reserved10 A-1 Chapter 10 Algorithm Efficiency and Sorting.
CS 146: Data Structures and Algorithms July 14 Class Meeting Department of Computer Science San Jose State University Summer 2015 Instructor: Ron Mak
Sorting in Linear Time Lower bound for comparison-based sorting
CSE 373 Data Structures Lecture 15
CSC – 332 Data Structures Sorting
Sorting and Lower bounds Fundamental Data Structures and Algorithms Ananda Guna January 27, 2005.
CHAPTER 09 Compiled by: Dr. Mohammad Omar Alhawarat Sorting & Searching.
CSC 41/513: Intro to Algorithms Linear-Time Sorting Algorithms.
Quicksort, Mergesort, and Heapsort. Quicksort Fastest known sorting algorithm in practice  Caveats: not stable  Vulnerable to certain attacks Average.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
Sorting Fun1 Chapter 4: Sorting     29  9.
Analysis of Algorithms CS 477/677
1 Sorting Algorithms Sections 7.1 to Comparison-Based Sorting Input – 2,3,1,15,11,23,1 Output – 1,1,2,3,11,15,23 Class ‘Animals’ – Sort Objects.
Survey of Sorting Ananda Gunawardena. Naïve sorting algorithms Bubble sort: scan for flips, until all are fixed Etc...
Sorting Chapter Sorting Consider list x 1, x 2, x 3, … x n We seek to arrange the elements of the list in order –Ascending or descending Some O(n.
CS 206 Introduction to Computer Science II 04 / 22 / 2009 Instructor: Michael Eckmann.
Sorting: Implementation Fundamental Data Structures and Algorithms Klaus Sutner February 24, 2004.
Review 1 Selection Sort Selection Sort Algorithm Time Complexity Best case Average case Worst case Examples.
CS 146: Data Structures and Algorithms July 14 Class Meeting Department of Computer Science San Jose State University Summer 2015 Instructor: Ron Mak
Divide And Conquer A large instance is solved as follows:  Divide the large instance into smaller instances.  Solve the smaller instances somehow. 
Sorting Fundamental Data Structures and Algorithms Aleks Nanevski February 17, 2004.
CSE 326: Data Structures Lecture 23 Spring Quarter 2001 Sorting, Part 1 David Kaplan
Sorting: Implementation Fundamental Data Structures and Algorithms Margaret Reid-Miller 24 February 2004.
Fundamental Structures of Computer Science February 20, 2003 Ananda Guna Introduction to Sorting.
SORTING AND ASYMPTOTIC COMPLEXITY Lecture 13 CS2110 – Fall 2009.
Intro. to Data Structures Chapter 7 Sorting Veera Muangsin, Dept. of Computer Engineering, Chulalongkorn University 1 Chapter 7 Sorting Sort is.
Sorting Fundamental Data Structures and Algorithms Klaus Sutner February 17, 2004.
CS6045: Advanced Algorithms Sorting Algorithms. Sorting So Far Insertion sort: –Easy to code –Fast on small inputs (less than ~50 elements) –Fast on nearly-sorted.
Sorting and Lower Bounds Fundamental Data Structures and Algorithms Klaus Sutner February 19, 2004.
Sorting and Runtime Complexity CS255. Sorting Different ways to sort: –Bubble –Exchange –Insertion –Merge –Quick –more…
Liang, Introduction to Java Programming, Seventh Edition, (c) 2009 Pearson Education, Inc. All rights reserved Chapter 26 Sorting.
Chapter 11 Sorting Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and Mount.
Fundamental Data Structures and Algorithms
Quick-Sort 9/13/2018 1:15 AM Quick-Sort     2
The Selection Problem.
Presentation transcript:

Sorting and Lower Bounds Fundamental Data Structures and Algorithms Peter Lee February 25, 2003

Announcements  Quiz #2 available today  Open until Wednesday midnight  Midterm exam next Tuesday  Tuesday, March 4, 2003, in class  Review session in Thursday’s class  Homework #4 is out  You should finish Part 1 this week!  Reading:  Chapter 8

Recap

Naïve sorting algorithms  Bubble sort  Insertion sort.

Heapsort  Build heap. O(N)  DeleteMin until empty. O(Nlog N)  Total worst case: O(Nlog N)

Shellsort  Example with sequence 3, Several inverted pairs fixed in one exchange.

Divide-and-conquer

Analysis of recursive sorting  Suppose it takes time T(N) to sort N elements.  Suppose also it takes time N to combine the two sorted arrays.  Then:  T(1) = 1  T(N) = 2T(N/2) + N, for N>1  Solving for T gives the running time for the recursive sorting algorithm.

Divide-and-Conquer Theorem  Theorem: Let a, b, c  0.  The recurrence relation  T(1) = b  T(N) = aT(N/c) + bN  for any N which is a power of c  has upper-bound solutions  T(N) = O(N)if a<c  T(N) = O(Nlog N)if a=c  T(N) = O(N log c a )if a>c a=2, b=1, c=2 for rec. sorting

Exact solutions  It is sometimes possible to derive closed-form solutions to recurrence relations.  Several methods exist for doing this.  Telescoping-sum method  Repeated-substitution method

Mergesort  Mergesort is the most basic recursive sorting algorithm.  Divide array in halves A and B.  Recursively mergesort each half.  Combine A and B by successively looking at the first elements of A and B and moving the smaller one to the result array.  Note: Should be a careful to avoid creating of lots of result arrays.

Mergesort LLRL Use simple indexes to perform the split. Use a single extra array to hold each intermediate result.

Analysis of mergesort  Mergesort generates almost exactly the same recurrence relations shown before.  T(1) = 1  T(N) = 2T(N/2) + N - 1, for N>1  Thus, mergesort is O(Nlog N).

Comparison-based sorting Recall that these are all examples of comparison-based sorting algorithms: Items are stored in an array. Can be moved around in the array. Can compare any two array elements. Comparison has 3 possible outcomes:

Non-comparison-based sorting  If we can do more than just compare pairs of elements, we can sometimes sort more quickly  Two simple examples are bucket sort and radix sort

Bucket Sort

Bucket sort  In addition to comparing pairs of elements, we require these additional restrictions:  all elements are non-negative integers  all elements are less than a predetermined maximum value

Bucket sort

Bucket sort characteristics  Runs in O(N) time.  Easy to implement each bucket as a linked list.  Is stable:  If two elements (A,B) are equal with respect to sorting, and they appear in the input in order (A,B), then they remain in the same order in the output.

Radix Sort

Radix sort  Another sorting algorithm that goes beyond comparison is radix sort Each sorting step must be stable.

Radix sort characteristics  Each sorting step can be performed via bucket sort, and is thus O(N).  If the numbers are all b bits long, then there are b sorting steps.  Hence, radix sort is O(bN).  Also, radix sort can be implemented in-place (just like quicksort).

Not just for binary numbers  Radix sort can be used for decimal numbers and alphanumeric strings

Why comparison-based?  Bucket and radix sort are much faster than any comparison-based sorting algorithm  Unfortunately, we can’t always live with the restrictions imposed by these algorithms  In such cases, comparison-based sorting algorithms give us general solutions

Back to Quick Sort

Review: Quicksort algorithm  If array A has 1 (or 0) elements, then done.  Choose a pivot element x from A.  Divide A-{x} into two arrays:  B = {yA | yx}  C = {yA | yx}  Quicksort arrays B and C.  Result is B+{x}+C.

Implementation issues  Quick sort can be very fast in practice, but this depends on careful coding  Three major issues: 1.doing quicksort in-place 2.picking the right pivot 3.avoiding quicksort on small arrays

1. Doing quicksort in place LR LR LR

1. Doing quicksort in place LR RL LR

2. Picking the pivot  In real life, inputs to a sorting routine are often partially sorted  why does this happen?  So, picking the first or last element to be the pivot is usually a bad choice  One common strategy is to pick the middle element  this is an OK strategy

2. Picking the pivot  A more sophisticated approach is to use random sampling  think about opinion polls  For example, the median-of-three strategy:  take the median of the first, middle, and last elements to be the pivot

3. Avoiding small arrays  While quicksort is extremely fast for large arrays, experimentation shows that it performs less well on small arrays  For small enough arrays, a simpler method such as insertion sort works better  The exact cutoff depends on the language and machine, but usually is somewhere between 10 and 30 elements

Putting it all together LR LR LR

Putting it all together LR RL LR

A complication!  What should happen if we encounter an element that is equal to the pivot?  Four possibilities:  L stops, R keeps going  R stops, L keeps going  L and R stop  L and R keep going

Quiz Break

Red-green quiz  What should happen if we encounter an element that is equal to the pivot?  Four possibilities:  L stops, R keeps going  R stops, L keeps going  L and R stop  L and R keep going  Explain why your choice is the only reasonable one

Quick Sort Analysis

Worst-case behavior

Best-case analysis  In the best case, the pivot is always the median element.  In that case, the splits are always “down the middle”.  Hence, same behavior as mergesort.  That is, O(Nlog N).

Average-case analysis  Consider the quicksort tree:

Average-case analysis  The time spent at each level of the tree is O(N).  So, on average, how many levels?  That is, what is the expected height of the tree?  If on average there are O(log N) levels, then quicksort is O(Nlog N) on average.

Expected height of qsort tree  Assume that pivot is chosen randomly.  When is a pivot “good”? “Bad”? Probability of a good pivot is 0.5. After good pivot, each child is at most 3/4 size of parent.

Expected height of qsort tree  So, if we descend k levels in the tree, each time being lucky enough to pick a “good” pivot, the maximum size of the k th child is:  N(3/4)(3/4) … (3/4) (k times)  = N(3/4) k  But on average, only half of the pivots will be good, so  N(3/4) k/2 = 2log 4/3 N = O(log N)

Summary of quicksort  A fast sorting algorithm in practice.  Can be implemented in-place.  But is O(N 2 ) in the worst case.  O(Nlog N) average-case performance.

Lower Bound for the Sorting Problem

How fast can we sort?  We have seen several sorting algorithms with O(Nlog N) running time.  In fact, O(Nlog N) is a general lower bound for the sorting algorithm.  A proof appears in Weiss.  Informally…

Upper and lower bounds N d  g(N) T(N) T(N) = O(f(N)) T(N) = (g(N)) c  f(N)

Decision tree for sorting a<b<c a<c<b b<a<c b<c<a c<a<b c<b<a a<b<c a<c<b c<a<b b<a<c b<c<a c<b<a a<b<c a<c<b c<a<b a<b<ca<c<b b<a<c b<c<a c<b<a b<a<cb<c<a a<bb<a b<cc<bc<aa<c b<cc<ba<cc<a N! leaves. So, tree has height log(N!). log(N!) =  (Nlog N).

Summary on sorting bound  If we are restricted to comparisons on pairs of elements, then the general lower bound for sorting is  (Nlog N).  A decision tree is a representation of the possible comparisons required to solve a problem.

External Sorting

External sorting  In many real-world situations, the amount of data to be sorted is much more than can be stored in memory  So, it is important in some cases to use algorithms that work well when sorting data stored externally  See tomorrow’s recitation…

World’s Fastest Sorters

Sorting competitions  There are several world-wide sorting competitions  Unix CoSort has achieved 1GB in under one minute, on a single Alpha  Berkeley’s NOW-sort sorted 8.4GB of disk data in under one minute, using a network of 95 workstations  Sandia Labs was able to sort 1TB of data in under 50 minutes, using a 144- node multiprocessor machine