CSCI 333 Data Structures Chapter 6 14 and 16 October 2002

Notes with the dark blue background were prepared by the textbook author, Clifford A. Shaffer, Department of Computer Science, Virginia Tech. Copyright © 2000, 2001.

There's nothing in your head the sorting hat can't see. So try me on and I will tell you where you ought to be. — The Sorting Hat, Harry Potter and the Sorcerer's Stone, J. K. Rowling

Sorting Each record contains a field called the key. Keys can be compared, which defines a linear order on the records. Measures of cost: comparisons and swaps.

Insertion Sort (1)

Insertion Sort (2)

template <class Elem, class Comp>
void inssort(Elem A[], int n) {
  for (int i=1; i<n; i++)
    for (int j=i; (j>0) && (Comp::lt(A[j], A[j-1])); j--)
      swap(A, j, j-1);
}

Best case: 0 swaps, n - 1 comparisons. Worst case: n²/2 swaps and comparisons. Average case: n²/4 swaps and comparisons. This algorithm has the best possible best-case performance. This will be important later.
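
The listings on these slides rely on a comparator class with static lt/gt methods and a swap(A, i, j) helper that the deck does not show. A minimal sketch of that scaffolding, assuming integer keys (IntCompare and this swap are illustrative, not the textbook's exact code):

#include <utility>

// Comparator in the Comp::lt / Comp::gt style used by the listings.
class IntCompare {
public:
  static bool lt(int a, int b) { return a < b; }
  static bool gt(int a, int b) { return a > b; }
};

// Exchange A[i] and A[j]; the sorting routines call this helper.
template <class Elem>
inline void swap(Elem A[], int i, int j) { std::swap(A[i], A[j]); }

Example call: inssort<int, IntCompare>(A, n);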

Bubble Sort (1)

Bubble Sort (2)

template <class Elem, class Comp>
void bubsort(Elem A[], int n) {
  for (int i=0; i<n-1; i++)
    for (int j=n-1; j>i; j--)
      if (Comp::lt(A[j], A[j-1]))
        swap(A, j, j-1);
}

Best case: 0 swaps, n²/2 comparisons. Worst case: n²/2 swaps and comparisons. Average case: n²/4 swaps and n²/2 comparisons. This algorithm has no redeeming features whatsoever. It's a shame this algorithm typically gets taught to intro students in preference to the far superior insertion sort algorithm.

Selection Sort (1)

Selection Sort (2)

template <class Elem, class Comp>
void selsort(Elem A[], int n) {
  for (int i=0; i<n-1; i++) {          // Select the i'th record
    int lowindex = i;                  // Remember its index
    for (int j=n-1; j>i; j--)          // Find the least value
      if (Comp::lt(A[j], A[lowindex]))
        lowindex = j;
    swap(A, i, lowindex);              // Put it in place
  }
}

Best case: 0 swaps (n - 1 as written), n²/2 comparisons. Worst case: n - 1 swaps and n²/2 comparisons. Average case: O(n) swaps and n²/2 comparisons. This is essentially a bubble sort with the swaps deferred; it can be a significant improvement when swapping is expensive. The algorithm could be implemented to require 0 swaps in the best case by testing that the two locations being swapped are not the same, but this test is likely more expensive than the work saved on average.

Pointer Swapping
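
When records are large, keep an array of pointers to the records and swap the pointers, so each exchange moves one word rather than a whole record. A minimal sketch, assuming a hypothetical Record type with an int key (not code from the deck):

#include <algorithm>

struct Record {
  int key;
  char payload[256];   // large record body; never copied during the sort
};

// Insertion sort on an array of pointers: only the pointers are exchanged.
void inssort_ptr(Record* P[], int n) {
  for (int i = 1; i < n; i++)
    for (int j = i; (j > 0) && (P[j]->key < P[j-1]->key); j--)
      std::swap(P[j], P[j-1]);
}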

Summary

                 Insertion    Bubble     Selection
Comparisons:
  Best Case      Θ(n)         Θ(n²)      Θ(n²)
  Average Case   Θ(n²)        Θ(n²)      Θ(n²)
  Worst Case     Θ(n²)        Θ(n²)      Θ(n²)
Swaps:
  Best Case      0            0          Θ(n)
  Average Case   Θ(n²)        Θ(n²)      Θ(n)
  Worst Case     Θ(n²)        Θ(n²)      Θ(n)

Exchange Sorting All of the sorts so far rely on exchanges of adjacent records. What is the average number of exchanges required? There are n! permutations. Consider a permutation X and its reverse, X'. Together, every such pair requires n(n-1)/2 exchanges, so the average over all inputs is at least n(n-1)/4, which is Ω(n²).

The bad sorts: Insertion sort, Selection sort, Bubble sort.

Shellsort Any series of increments will work, provided that the last one is 1 (a regular insertion sort pass). Shellsort takes advantage of Insertion Sort's best-case performance.

Shellsort

// Modified version of Insertion Sort: sort the sublist at stride incr
template <class Elem, class Comp>
void inssort2(Elem A[], int n, int incr) {
  for (int i=incr; i<n; i+=incr)
    for (int j=i; (j>=incr) && (Comp::lt(A[j], A[j-incr])); j-=incr)
      swap(A, j, j-incr);
}

template <class Elem, class Comp>
void shellsort(Elem A[], int n) {        // Shellsort
  for (int i=n/2; i>2; i/=2)             // For each increment
    for (int j=0; j<i; j++)              // Sort each sublist
      inssort2<Elem,Comp>(&A[j], n-j, i);
  inssort2<Elem,Comp>(A, n, 1);          // Final pass: regular insertion sort
}

Any increments will work, provided that the last one is 1 (regular insertion sort). Shellsort takes advantage of Insertion Sort's best-case performance.

Quicksort

template <class Elem, class Comp>
void qsort(Elem A[], int i, int j) {
  if (j <= i) return;                   // List too small to sort
  int pivotindex = findpivot(A, i, j);
  swap(A, pivotindex, j);               // Put pivot at end
  // k will be the first position on the right side
  int k = partition<Elem,Comp>(A, i-1, j, A[j]);
  swap(A, k, j);                        // Put pivot in place
  qsort<Elem,Comp>(A, i, k-1);
  qsort<Elem,Comp>(A, k+1, j);
}

template <class Elem>
int findpivot(Elem A[], int i, int j) { return (i+j)/2; }

Initial call: qsort(A, 0, n-1);

Quicksort Partition

template <class Elem, class Comp>
int partition(Elem A[], int l, int r, Elem& pivot) {
  do {                         // Move the bounds inward until they meet
    while (Comp::lt(A[++l], pivot));                 // Move l right and
    while ((r != 0) && Comp::gt(A[--r], pivot));     // r left
    swap(A, l, r);             // Swap out-of-place values
  } while (l < r);             // Stop when they cross
  swap(A, l, r);               // Reverse the last, wasted swap
  return l;                    // Return first position on right side
}

The cost for partition is Θ(n).

Partition Example

Quicksort Example

Cost of Quicksort Best case: always partition in half. Worst case: bad partition. Average case: T(n) = n + 1 + (1/(n-1)) Σ_{k=1}^{n-1} (T(k) + T(n-k)). Best and average cases are Θ(n log n); the worst case is Θ(n²). Optimizations for Quicksort: a better pivot, a better algorithm for small sublists, and eliminating recursion.
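
A common choice for the "better pivot" optimization is median-of-three selection. A minimal sketch in the style of the deck's findpivot (this variant is an assumption, not the routine shown above):

// Return the index of the median of A[i], A[mid], and A[j].
template <class Elem, class Comp>
int findpivot_med3(Elem A[], int i, int j) {
  int mid = (i + j) / 2;
  int a = i, b = mid, c = j;
  if (Comp::lt(A[b], A[a])) { int t = a; a = b; b = t; }   // order first pair
  if (Comp::lt(A[c], A[b])) { int t = b; b = c; c = t; }   // order second pair
  if (Comp::lt(A[b], A[a])) { int t = a; a = b; b = t; }   // re-check first pair
  return b;   // b now indexes the median of the three sampled values
}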

Sorting inventors. Quicksort: Tony Hoare, now at Microsoft Research in Cambridge. Shellsort: Donald L. Shell, then at General Electric.

Mergesort

List mergesort(List inlist) {
  if (inlist.length() <= 1) return inlist;
  List l1 = half of the items from inlist;
  List l2 = other half of the items from inlist;
  return merge(mergesort(l1), mergesort(l2));
}

Mergesort Implementation

template <class Elem, class Comp>
void mergesort(Elem A[], Elem temp[], int left, int right) {
  int mid = (left+right)/2;
  if (left == right) return;
  mergesort<Elem,Comp>(A, temp, left, mid);
  mergesort<Elem,Comp>(A, temp, mid+1, right);
  for (int i=left; i<=right; i++)      // Copy subarray to temp
    temp[i] = A[i];
  int i1 = left; int i2 = mid + 1;
  for (int curr=left; curr<=right; curr++) {
    if (i1 == mid+1)                   // Left sublist exhausted
      A[curr] = temp[i2++];
    else if (i2 > right)               // Right sublist exhausted
      A[curr] = temp[i1++];
    else if (Comp::lt(temp[i1], temp[i2]))
      A[curr] = temp[i1++];
    else A[curr] = temp[i2++];
  }
}

Mergesort is tricky to implement. The problem is to have the recursion work correctly with the space that stores the values. We do not want to continuously create new arrays. This implementation copies the sorted sublists into a temporary array and merges back to the original array.

Optimized Mergesort

template <class Elem, class Comp>
void mergesort(Elem A[], Elem temp[], int left, int right) {
  if ((right-left) <= THRESHOLD) {     // Small sublist: use insertion sort
    inssort<Elem,Comp>(&A[left], right-left+1);
    return;
  }
  int i, j, k, mid = (left+right)/2;
  if (left == right) return;
  mergesort<Elem,Comp>(A, temp, left, mid);
  mergesort<Elem,Comp>(A, temp, mid+1, right);
  for (i=mid; i>=left; i--) temp[i] = A[i];                  // Copy left half
  for (j=1; j<=right-mid; j++) temp[right-j+1] = A[j+mid];   // Copy right half, reversed
  for (i=left, j=right, k=left; k<=right; k++)               // Merge toward the middle
    if (temp[i] < temp[j]) A[k] = temp[i++];
    else A[k] = temp[j--];
}

Two optimizations. First, use insertion sort to sort small sublists. Second, have the two sublists run toward each other, so that their high ends meet in the middle. In this way, there is no need to test for end of list.

Mergesort Cost Mergesort's cost is Θ(n log n) in the best, average, and worst cases. Mergesort is also good for sorting linked lists: send records to alternating linked lists, mergesort each, then merge the sublists.
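
A minimal sketch of the linked-list variant described above: deal the nodes alternately onto two sublists, sort each recursively, then merge. The Node type here is illustrative, not the textbook's:

struct Node { int key; Node* next; };

// Merge two sorted lists into one sorted list.
static Node* merge(Node* a, Node* b) {
  Node dummy{}; Node* tail = &dummy;
  while (a && b) {
    if (a->key <= b->key) { tail->next = a; a = a->next; }
    else                  { tail->next = b; b = b->next; }
    tail = tail->next;
  }
  tail->next = (a != nullptr) ? a : b;   // append whatever remains
  return dummy.next;
}

Node* list_mergesort(Node* head) {
  if (head == nullptr || head->next == nullptr) return head;
  Node* l1 = nullptr; Node* l2 = nullptr;
  while (head != nullptr) {              // send nodes to alternating sublists
    Node* nxt = head->next;
    head->next = l1; l1 = head; head = nxt;
    if (head != nullptr) {
      nxt = head->next;
      head->next = l2; l2 = head; head = nxt;
    }
  }
  return merge(list_mergesort(l1), list_mergesort(l2));
}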

Heapsort

template <class Elem, class Comp>
void heapsort(Elem A[], int n) {   // Heapsort
  Elem mval;
  maxheap<Elem,Comp> H(A, n, n);   // Build the heap in place
  for (int i=0; i<n; i++)          // Now sort
    H.removemax(mval);             // Put max at end of the heap's array
}

Use a max-heap, so that elements end up sorted within the array. The time to build the heap is Θ(n). The time to remove the n elements is Θ(n log n). So, if we only remove k elements, the total cost is Θ(n + k log n). Compare to sorting with a BST: the heap has no overhead and is always balanced; further, the time to build the BST is expensive, while removing the elements once the BST is built is cheap.
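
A minimal sketch of the k-largest idea from the note above, assuming the maxheap interface used on this slide (construct from the array, then call removemax with an output parameter):

// Place the k largest values of A[0..n-1] into dest[0..k-1], largest first.
// Cost: Θ(n) to build the heap plus Θ(k log n) for the removals.
template <class Elem, class Comp>
void klargest(Elem A[], int n, Elem dest[], int k) {
  maxheap<Elem,Comp> H(A, n, n);   // heapify in Θ(n) time
  for (int i = 0; i < k && i < n; i++)
    H.removemax(dest[i]);          // each removal costs O(log n)
}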

Heapsort Example (1)

Heapsort Example (2)

Binsort (1)

A simple, efficient sort:

for (i=0; i<n; i++)
  B[A[i]] = A[i];

Ways to generalize: make each bin the head of a list; allow more keys than records. This sort is Θ(n), but it only works on a permutation of the values from 0 to n-1.

Binsort (2)

template <class Elem>
void binsort(Elem A[], int n) {
  List<Elem> B[MaxKeyValue];
  Elem item;
  for (int i=0; i<n; i++) B[A[i]].append(A[i]);   // Drop each record in its bin
  for (int i=0; i<MaxKeyValue; i++)               // Output the bins in order
    for (B[i].setStart(); B[i].getValue(item); B[i].next())
      output(item);
}

Cost might first appear to be Θ(n). Actually, it is Θ(n + MaxKeyValue). Unfortunately, there is no necessary relationship between the size of n and the size of MaxKeyValue. Thus, this algorithm could be Θ(n²) or arbitrarily worse.

Radix Sort (1)

Radix Sort (2)

template <class Elem, class Comp>
void radix(Elem A[], Elem B[], int n, int k, int r, int cnt[]) {
  // cnt[i] stores the number of records in bin[i]
  int j;
  for (int i=0, rtok=1; i<k; i++, rtok*=r) {   // For each of the k digits
    for (j=0; j<r; j++) cnt[j] = 0;            // Initialize cnt
    // Count the number of records for each bin on this pass
    for (j=0; j<n; j++) cnt[(A[j]/rtok)%r]++;
    // cnt[j] will be the index of the last slot of bin j
    for (j=1; j<r; j++) cnt[j] = cnt[j-1] + cnt[j];
    for (j=n-1; j>=0; j--)                     // Put records into bins
      B[--cnt[(A[j]/rtok)%r]] = A[j];
    for (j=0; j<n; j++) A[j] = B[j];           // Copy B back to A
  }
}
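
A small driver sketch for the routine above, assuming Elem is int, r = 10 (decimal digits), and k = 2 digit positions, and reusing the IntCompare comparator sketched after the Insertion Sort slide (radix itself never calls it, but the template parameter needs a type):

#include <iostream>

int main() {
  int A[] = {27, 91, 1, 97, 17, 23, 84, 28, 72, 5, 67, 25};
  const int n = sizeof(A) / sizeof(A[0]);
  int B[n];          // scratch array of the same size as A
  int cnt[10];       // one counter per bin (r = 10)
  radix<int, IntCompare>(A, B, n, 2, 10, cnt);   // two decimal digits suffice
  for (int i = 0; i < n; i++) std::cout << A[i] << ' ';
  std::cout << '\n';
  return 0;
}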

Radix Sort Example

Radix Sort Cost Cost: Θ(nk + rk). How do n, k, and r relate? If the key range is small, then this can be Θ(n). If there are n distinct keys, then the length of a key must be at least log n. Thus, Radix Sort is Θ(n log n) in the general case. r is small and can be viewed as a constant.
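
The step from key length to Θ(n log n) can be made explicit; a short sketch (in LaTeX) for n distinct keys written in radix r:

r^k \ge n \;\Longrightarrow\; k \ge \log_r n = \frac{\log_2 n}{\log_2 r},
\qquad\text{so}\quad \Theta(nk + rk) = \Theta(n \log n) \text{ for constant } r.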

Empirical Comparison (1) Windows 98

Empirical Comparison (2) Linux

Sorting Lower Bound We would like to know a lower bound for all possible sorting algorithms. Sorting is O(n log n) in the average and worst cases because we know of algorithms with this upper bound. Sorting I/O takes Ω(n) time. We will now prove an Ω(n log n) lower bound for sorting.

Decision Trees

Lower Bound Proof There are n! permutations. A sorting algorithm can be viewed as determining which permutation has been input. Each leaf node of the decision tree corresponds to one permutation. A tree with n nodes has Ω(log n) levels, so a tree with n! leaves has Ω(log n!) = Ω(n log n) levels. Which node in the decision tree corresponds to the worst case?
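
The bound Ω(log n!) = Ω(n log n) follows from a short estimate of log(n!), sketched here:

\log_2(n!) \;=\; \sum_{i=1}^{n} \log_2 i
\;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i
\;\ge\; \frac{n}{2}\log_2\frac{n}{2}
\;=\; \Omega(n \log n).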

Sorting animations: "Sorting Out Sorting," a 1981 30-minute film by Ronald Baecker; Jason Harrison's Sorting Algorithms Demo; John Morris's data structure course notes, with animations by Woi Ang; and more animations by Woi Ang.