CSE 326: Data Structures Lecture #16 Sorting Things Out Bart Niswonger Summer Quarter 2001.

Slides:



Advertisements
Similar presentations
Sorting in Linear Time Introduction to Algorithms Sorting in Linear Time CSE 680 Prof. Roger Crawfis.
Advertisements

Introduction to Algorithms Quicksort
Analysis of Algorithms
Analysis of Algorithms CS 477/677 Linear Sorting Instructor: George Bebis ( Chapter 8 )
Non-Comparison Based Sorting
Linear Sorts Counting sort Bucket sort Radix sort.
MS 101: Algorithms Instructor Neelima Gupta
CSE332: Data Abstractions Lecture 14: Beyond Comparison Sorting Dan Grossman Spring 2010.
§7 Quicksort -- the fastest known sorting algorithm in practice 1. The Algorithm void Quicksort ( ElementType A[ ], int N ) { if ( N < 2 ) return; pivot.
Quick Sort, Shell Sort, Counting Sort, Radix Sort AND Bucket Sort
CSCE 3110 Data Structures & Algorithm Analysis
Using Divide and Conquer for Sorting
25 May Quick Sort (11.2) CSE 2011 Winter 2011.
1 Sorting Problem: Given a sequence of elements, find a permutation such that the resulting sequence is sorted in some order. We have already seen: –Insertion.
1 SORTING Dan Barrish-Flood. 2 heapsort made file “3-Sorting-Intro-Heapsort.ppt”
Comp 122, Spring 2004 Lower Bounds & Sorting in Linear Time.
CSC 2300 Data Structures & Algorithms March 27, 2007 Chapter 7. Sorting.
Sorting. Introduction Assumptions –Sorting an array of integers –Entire sort can be done in main memory Straightforward algorithms are O(N 2 ) More complex.
Divide and Conquer Sorting
CSE 326: Data Structures Sorting Ben Lerner Summer 2007.
Analysis of Algorithms CS 477/677
CSC 2300 Data Structures & Algorithms March 20, 2007 Chapter 7. Sorting.
1 CSE 326: Data Structures: Sorting Lecture 16: Friday, Feb 14, 2003.
1 CSE 326: Data Structures: Sorting Lecture 17: Wednesday, Feb 19, 2003.
CS 146: Data Structures and Algorithms July 14 Class Meeting Department of Computer Science San Jose State University Summer 2015 Instructor: Ron Mak
David Luebke 1 8/17/2015 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
Computer Algorithms Lecture 11 Sorting in Linear Time Ch. 8
Sorting in Linear Time Lower bound for comparison-based sorting
CSE 373 Data Structures Lecture 15
Merge Sort. What Is Sorting? To arrange a collection of items in some specified order. Numerical order Lexicographical order Input: sequence of numbers.
David Luebke 1 10/13/2015 CS 332: Algorithms Linear-Time Sorting Algorithms.
CSC 41/513: Intro to Algorithms Linear-Time Sorting Algorithms.
Introduction to Algorithms Jiafen Liu Sept
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
Sorting Fun1 Chapter 4: Sorting     29  9.
CSE332: Data Abstractions Lecture 14: Beyond Comparison Sorting Dan Grossman Spring 2012.
CS 61B Data Structures and Programming Methodology July 28, 2008 David Sun.
Analysis of Algorithms CS 477/677
Fall 2015 Lecture 4: Sorting in linear time
1 CSE 326: Data Structures Sorting It All Out Henry Kautz Winter Quarter 2002.
September 26, 2005Copyright © Erik D. Demaine and Charles E. Leiserson L5.1 Introduction to Algorithms 6.046J/18.401J Prof. Erik Demaine LECTURE5.
Bucket & Radix Sorts. Efficient Sorts QuickSort : O(nlogn) – O(n 2 ) MergeSort : O(nlogn) Coincidence?
CSE 326: Data Structures Lecture 24 Spring Quarter 2001 Sorting, Part B David Kaplan
Mudasser Naseer 1 11/5/2015 CSC 201: Design and Analysis of Algorithms Lecture # 8 Some Examples of Recursion Linear-Time Sorting Algorithms.
CS 361 – Chapters 8-9 Sorting algorithms –Selection, insertion, bubble, “swap” –Merge, quick, stooge –Counting, bucket, radix How to select the n-th largest/smallest.
CSE373: Data Structure & Algorithms Lecture 22: More Sorting Linda Shapiro Winter 2015.
CSE 326 Sorting David Kaplan Dept of Computer Science & Engineering Autumn 2001.
1 CSE 326: Data Structures A Sort of Detour Henry Kautz Winter Quarter 2002.
CS 146: Data Structures and Algorithms July 14 Class Meeting Department of Computer Science San Jose State University Summer 2015 Instructor: Ron Mak
COSC 3101A - Design and Analysis of Algorithms 6 Lower Bounds for Sorting Counting / Radix / Bucket Sort Many of these slides are taken from Monica Nicolescu,
1 CSE 326: Data Structures Sorting in (kind of) linear time Zasha Weinberg in lieu of Steve Wolfman Winter Quarter 2000.
Sorting divide and conquer. Divide-and-conquer  a recursive design technique  solve small problem directly  divide large problem into two subproblems,
CSE 326: Data Structures Lecture 23 Spring Quarter 2001 Sorting, Part 1 David Kaplan
Linear Sorting. Comparison based sorting Any sorting algorithm which is based on comparing the input elements has a lower bound of Proof, since there.
SORTING AND ASYMPTOTIC COMPLEXITY Lecture 13 CS2110 – Fall 2009.
CS6045: Advanced Algorithms Sorting Algorithms. Sorting So Far Insertion sort: –Easy to code –Fast on small inputs (less than ~50 elements) –Fast on nearly-sorted.
Sorting and Runtime Complexity CS255. Sorting Different ways to sort: –Bubble –Exchange –Insertion –Merge –Quick –more…
1 CSE 326: Data Structures Sorting by Comparison Zasha Weinberg in lieu of Steve Wolfman Winter Quarter 2000.
Lower Bounds & Sorting in Linear Time
Chapter 11 Sorting Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and Mount.
Linear-Time Sorting Continued Medians and Order Statistics
Introduction to Algorithms
Quick Sort (11.2) CSE 2011 Winter November 2018.
Ch8: Sorting in Linear Time Ming-Te Chi
Linear Sort "Our intuition about the future is linear. But the reality of information technology is exponential, and that makes a profound difference.
Lower Bounds & Sorting in Linear Time
Linear-Time Sorting Algorithms
CSE 326: Data Structures Sorting
CSE 326: Data Structures: Sorting
The Selection Problem.
Presentation transcript:

CSE 326: Data Structures Lecture #16 Sorting Things Out Bart Niswonger Summer Quarter 2001

Unix Tutorial!! Tuesday, July 31 st –10:50am, Sieg 322 Printing worksheet Shell different shell quotes : ' ` scripting, #! alias variables / environment redirection, piping Useful tools grep, egrep/grep -e sort cut file tr find, xargs diff, patch which, locate, whereis Finding info Techniques Resources (ACM webpage, web, internal docs) Process management File management/permissions Filesystem layout

Today’s Outline Project –Rules of competition Sorting by comparison –Simple : SelectionSort; BubbleSort; InsertionSort –Quick : QuickSort –Good Worst Case : MergeSort; HeapSort

Sorting: The Problem Space General problem Given a set of N orderable items, put them in order Without (significant) loss of generality, assume: –Items are integers –Ordering is  Most sorting problems map to the above in linear time.

Selection Sort 1.Find the smallest element, put it first 2.Find the next smallest element, put it second 3.Find the next smallest, put it next …etc.

Selection Sort procedure SelectionSort (Array[1..N] For i=1 to N-1 Find the smallest entry in Array[i..N] Let j be the index of that entry Swap(Array[i],Array[j]) End For While other people are coding QuickSort/MergeSort Twiddle thumbs End While

HeapSort Use a Priority Queue (Heap) Shove everything into a queue, take them out smallest to largest.

QuickSort 1. Basic idea: Pick a pivot. 2. Partition into less-than & greater-than pivot. 3. Sort each side recursively.

2 goes to less-than QuickSort Partition Pick pivot Partition with cursors <> <> <> 6, 8 swap less/greater-than ,5 less-than 9 greater-than Partition done. Recursively sort each side.

Analyzing QuickSort Picking pivot: constant time Partitioning: linear time Recursion: time for sorting left partition (say of size i) + time for right (size N-i-1) T(1) = b T(N) = T(i) + T(N-i-1) + cN where i is the number of elements smaller than the pivot

QuickSort : Worst Case What is the worst case?

Optimizing QuickSort Choosing the Pivot –Randomly choose pivot Good theoretically and practically, but call to random number generator can be expensive –Pick pivot cleverly “Median-of-3” rule takes element at Median(first value, last value). Works well in practice. Cutoff –Use simpler sorting technique below a certain problem size Weiss suggests using insertion sort, with a cutoff limit of 5-20

QuickSort : Best Case T(N) = T(i) + T(N-i-1) + cN T(N)= 2T(N/2 - 1) + cN < 2T(N/2) + cN < 4T(N/4) + c(2(N / 2) + N) < 8T(N/8) + cN( ) < kT(N/k) + cN log k = O(N log N)

QuickSort : Average Case Assume all size partitions equally likely, with probability 1/N details: Weiss pg

Merging Cars by key [Aggressiveness of driver]. Most aggressive goes first. MergeSort (Collection [1..n]) 1.Split Collection in half 2.Recursively sort each half 3.merge two sorted halves together merge (C1[1..n], C2[1..n]) i1=1, i2=1 while i1<n and i2<n if C1[i1] < C2[i2] Next is C1[i1] i1++ else Next is C2[i2] i2++ end If end while MergeSort

MergeSort Analysis Running Time –Worst case? –Best case? –Average case? Other considerations besides running time?

Is This The Best We Can Do? Sorting by Comparison –Only information available to us is the set of N items to be sorted –Only operation available to us is pairwise comparison between 2 items What is the best running time we can possibly achieve?

Decision Tree Analysis B<C A<C C<A C<B A<C C<A B<C C<B A<B facts Internal node, with facts known so far Leaf node, with ordering of A,B,C C<A Edge, with result of one comparison B<A A<B C<B B<A C<A B A CA B C A C BC A BB C AC B A A<B A B C

How deep is Decision Tree? How many permutations are there of N numbers? How many leaves does the tree have? What’s the shallowest tree with a given number of leaves? What is therefore the worst running time (number of comparisons) by the best possible sorting algorithm?

Lower Bound for log(n!) Stirling’s approximation:

Is This The Best We Can Do? Sorting by Comparison –Only information available to us is the set of N items to be sorted –Only operation available to us is pairwise comparison between 2 items What happens if we relax these constraints?

BinSort (a.k.a. BucketSort) Requires: –Knowing the keys to be in {1, …, K} –Having an array of size K Works by: Putting items into correct bin (cell) of array, based on key

BinSort example K=5 list=(5,1,3,4,3,2,1,1,5,4,5) Bins in array key = 11,1,1 key = 22 key = 33,3 key = 44,4 key = 55,5,5 Sorted list: 1,1,1,2,3,3,4,4,5,5,5

BinSort Pseudocode procedure BinSort (List L,K) LinkedList bins[1..K] // Each element of array bins is linked list. // Could also BinSort with array of arrays. For Each number x in L bins[x].Append(x) End For For i = 1..K For Each number x in bins[i] Print x End For

BinSort Running Time K is a constant –BinSort is linear time K is variable –Not simply linear time K is large (e.g ) –Impractical

BinSort is “stable” Definition: Stable Sorting Algorithm Items in input with the same key end up in the same order as when they began. BinSort is stable –Important if keys have associated values –Critical for RadixSort

Mr. Radix Herman Hollerith invented and developed a punch-card tabulation machine system that revolutionized statistical computation. Born in Buffalo, New York, the son of German immigrants, Hollerith enrolled in the City College of New York at age 15 and graduated from the Columbia School of Mines with distinction at the age of 19. His first job was with the U.S. Census effort of Hollerith successively taught mechanical engineering at the Massachusetts Institute of Technology and worked for the U.S. Patent Office. Hollerith began working on the tabulating system during his days at MIT, filing for the first patent in He developed a hand-fed 'press' that sensed the holes in punched cards; a wire would pass through the holes into a cup of mercury beneath the card closing the electrical circuit. This process triggered mechanical counters and sorter bins and tabulated the appropriate data. Hollerith's system-including punch, tabulator, and sorter-allowed the official 1890 population count to be tallied in six months, and in another two years all the census data was completed and defined; the cost was $5 million below the forecasts and saved more than two years' time. His later machines mechanized the card-feeding process, added numbers, and sorted cards, in addition to merely counting data. In 1896 Hollerith founded the Tabulating Machine Company, forerunner of Computer Tabulating Recording Company (CTR). He served as a consulting engineer with CTR until retiring in In 1924 CTR changed its name to IBM- the International Business Machines Corporation. Herman Hollerith Born February 29, Died November 17, 1929 Art of Compiling Statistics; Apparatus for Compiling Statistics Source: National Institute of Standards and Technology (NIST) Virtual Museum -

RadixSort Radix = “The base of a number system” (Webster’s dictionary) –alternate terminology: radix is number of bits needed to represent 0 to base-1; can say “base 8” or “radix 3” Idea: BinSort on each digit, bottom up.

RadixSort – magic! It works. Input list: 126, 328, 636, 341, 416, 131, 328 BinSort on lower digit: 341, 131, 126, 636, 416, 328, 328 BinSort result on next-higher digit: 416, 126, 328, 328, 131, 636, 341 BinSort that result on highest digit: 126, 131, 328, 328, 341, 416, 636

Not magic. It provably works. Keys –K-digit numbers –base B Claim: after i th BinSort, least significant i digits are sorted. –e.g. B=10, i=3, keys are 1776 and comes before 1776 for last 3 digits.

RadixSort Proof by Induction Base case: –i=0. 0 digits are sorted (that wasn’t hard!) Induction step –assume for i, prove for i+1. –consider two numbers: X, Y. Say X i is i th digit of X (from the right) X i+1 < Y i+1 then i+1 th BinSort will put them in order X i+1 > Y i+1, same thing X i+1 = Y i+1, order depends on last i digits. Induction hypothesis says already sorted for these digits. (Careful about ensuring that your BinSort preserves order aka “stable”…)

What types can you RadixSort? Any type T that can be BinSorted Any type T that can be broken into parts A and B, such that: –You can reconstruct T from A and B –A can be RadixSorted –B can be RadixSorted –A is always more significant than B, in ordering

Example: 1-digit numbers can be BinSorted 2 to 5-digit numbers can be BinSorted without using too much memory 6-digit numbers, broken up into A=first 3 digits, B=last 3 digits. –A and B can reconstruct original 6-digits –A and B each RadixSortable as above –A more significant than B

RadixSorting Strings 1 Character can be BinSorted Break strings into characters Need to know length of biggest string (or calculate this on the fly). Null-pad shorter strings Running time: –N is number of strings –L is length of longest string –RadixSort takes O(N*L)

To Do Finish Project III (due Wednesday!) Finish reading chapter 7

Coming Up More Algorithms! Sorting Project III due (Wednesday) Unix Tutorial (Tuesday, tomorrow!)