Nyhoff, ADTs, Data Structures and Problem Solving with C++, Second Edition, © 2005 Pearson Education, Inc. All rights reserved. 0-13-140909-3 1 Sorting.


Sorting
Chapter 13

Chapter Contents
13.1 Some O(n²) Sorting Schemes
13.2 Heaps, Heapsort, and Priority Queues
13.3 Quicksort
13.4 Mergesort
13.5 Radix Sort

Chapter Objectives
Describe the three kinds of sorting methods
  – Selection, exchange, and insertion
Look at examples of each kind of sort that are O(n²) sorts
  – Simple selection, bubble, and insertion sorts
Study heaps and show how they are used for an efficient selection sort, heapsort
  – Look at implementation of priority queues using heaps
Study quicksort in detail as an example of the divide-and-conquer strategy for an efficient exchange sort
Study mergesort as an example of a sort usable with sequential files
Look at radix sort as an example of a non-comparison-based sort

Sorting
Consider a list x₁, x₂, x₃, …, xₙ
We seek to arrange the elements of the list in order
  – Ascending or descending
Some O(n²) schemes
  – Easy to understand and implement
  – Inefficient for large data sets

Categories of Sorting Schemes
Classification based on where the data to be sorted resides
  – Internal sort (e.g., main memory): the focus of the study in this text
  – External sort (e.g., secondary memory)
Classification based on the general approach used to carry out the sorting
  – Selection sort
  – Exchange sort
  – Insertion sort

Categories of Sorting Algorithms
Selection sort
  – Make passes through a list
  – On each pass, reposition some element correctly (e.g., the smallest element)

Time Complexity of Selection Sort
Worst-case computing time
  – Number of comparisons: (n-1) + (n-2) + … + 1 = n(n-1)/2, which is O(n²)
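The comparison count can be checked with a minimal sketch of simple selection sort. The function name `selectionSort`, the 0-based `std::vector` storage, and the returned comparison counter are assumptions made for illustration; they are not taken from the text.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Simple selection sort (illustrative sketch): on each pass, find the
// smallest element of the unsorted part x[i..n-1] and swap it into slot i.
// Returns the number of element comparisons performed.
template <typename T>
std::size_t selectionSort(std::vector<T>& x) {
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < x.size(); ++j) {
            ++comparisons;
            if (x[j] < x[smallest]) smallest = j;
        }
        if (smallest != i) std::swap(x[i], x[smallest]);
    }
    return comparisons;
}
```

For a list of n = 7 items the counter comes out as 7·6/2 = 21, regardless of the initial order, since every pass scans the whole unsorted part.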

Categories of Sorting Algorithms
Exchange sort
  – Idea: systematically interchange pairs of elements that are out of order
  – Bubble sort does this: if a pair is out of order, exchange; if in order, do not exchange

Bubble Sort Algorithm
1. Initialize numCompares to n - 1
2. While numCompares != 0, do the following
   a. Set last = 1 // location of last element in a swap
   b. For i = 1 to numCompares
        If x[i] > x[i + 1]
          Swap x[i] and x[i + 1] and set last = i
   c. Set numCompares = last - 1
   End while

Time Complexity of Bubble Sort
Worst-case computing time (e.g., list in reverse sorted order)
  – Number of comparisons (or exchanges): (n-1) + (n-2) + … + 1 = n(n-1)/2, which is O(n²)
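A minimal sketch of bubble sort with the "last swap" shortcut follows. It assumes 0-based `std::vector` storage, so the 1-based pair x[i], x[i + 1] of the pseudocode becomes `x[i - 1]`, `x[i]`, and it assumes the first pass makes n - 1 comparisons. The name `bubbleSort` is illustrative.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort (illustrative sketch). After each pass, everything beyond the
// position of the last swap is already in place, so the next pass stops there.
template <typename T>
void bubbleSort(std::vector<T>& x) {
    if (x.size() < 2) return;
    std::size_t numCompares = x.size() - 1;
    while (numCompares != 0) {
        std::size_t last = 1;  // 1-based position of the last swap in this pass
        for (std::size_t i = 1; i <= numCompares; ++i) {
            // 1-based x[i] and x[i+1] live at x[i-1] and x[i] here
            if (x[i - 1] > x[i]) {
                std::swap(x[i - 1], x[i]);
                last = i;
            }
        }
        numCompares = last - 1;
    }
}
```

On an already sorted list no swap occurs, `last` stays 1, and the loop ends after a single pass; a reversed list forces the full n(n-1)/2 comparisons.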

Categories of Sorting Algorithms
Insertion sort
  – Repeatedly insert a new element into an already sorted list
  – Note this works well with a linked list implementation
All of these have computing time O(n²)

Algorithm for Linear Insertion Sort
For i = 2 to n do the following
  a. Set nextElement = x[i] and x[0] = nextElement
  b. Set j = i
  c. While nextElement < x[j - 1] do the following
       Set x[j] equal to x[j - 1]
       Decrement j by 1
     End while
  d. Set x[j] equal to nextElement
End for

Example of Insertion Sort
Given list to be sorted: 67, 33, 21, 84, 49, 50, 75
  – Note sequence of steps carried out
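Linear insertion sort can be sketched as below. This version assumes a 0-based `std::vector` and replaces the x[0] sentinel with an explicit `j > 0` test; the name `insertionSort` is illustrative.

```cpp
#include <cstddef>
#include <vector>

// Linear insertion sort (illustrative sketch): x[0..i-1] is kept sorted and
// x[i] is inserted into it by shifting larger elements one slot to the right.
template <typename T>
void insertionSort(std::vector<T>& x) {
    for (std::size_t i = 1; i < x.size(); ++i) {
        T nextElement = x[i];
        std::size_t j = i;
        // The j > 0 test plays the role of the sentinel stored in x[0].
        while (j > 0 && nextElement < x[j - 1]) {
            x[j] = x[j - 1];
            --j;
        }
        x[j] = nextElement;
    }
}
```

Applied to 67, 33, 21, 84, 49, 50, 75, the sorted prefix grows as 33 67, then 21 33 67, then 21 33 67 84, and so on until the whole list is ordered.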

Improved Schemes
We seek improved computing times for sorts of large data sets
The chapter presents schemes that can be proven to have average computing time O(n log₂ n)
It must be said that there is no such thing as a universally good sorting scheme
  – Results may depend on just how out of order the list is

Comparisons of Sorts
Sort of a randomly generated list of 500 items
  – Note: times are on 1970s hardware

Algorithm            Type of Sort    Time (sec)
Simple selection     Selection
Heapsort             Selection
Bubble sort          Exchange
2-way bubble sort    Exchange
Quicksort            Exchange
Linear insertion     Insertion
Binary insertion     Insertion
Shell sort           Insertion

Indirect Sorts
The items being sorted may be large structures
  – Data transfer/swapping time can be unacceptable
An alternative is an indirect sort
  – Uses an index table to store the positions of the objects
  – Manipulates the index table to establish the ordering
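A small sketch of an indirect sort is given here. The `Record` structure and the function name `sortedIndexTable` are hypothetical, and `std::sort` stands in for whichever sorting scheme is used; only the index table is rearranged, never the records themselves.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical "large" record; only the key takes part in comparisons.
struct Record {
    int key;
    std::string payload;
};

// Builds an index table 0, 1, 2, ... and sorts the indices by the keys they
// refer to. The records in `items` are left where they are.
std::vector<std::size_t> sortedIndexTable(const std::vector<Record>& items) {
    std::vector<std::size_t> index(items.size());
    std::iota(index.begin(), index.end(), std::size_t{0});
    std::sort(index.begin(), index.end(),
              [&items](std::size_t a, std::size_t b) {
                  return items[a].key < items[b].key;
              });
    return index;
}
```

Visiting `items[index[0]]`, `items[index[1]]`, ... then yields the records in key order while each swap moved only a single integer.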

Heaps
A heap is a binary tree with two properties:
1. It is complete
   – Each level of the tree is completely filled
   – Except possibly the bottom level (nodes in leftmost positions)
2. It satisfies the heap-order property
   – Data in each node >= data in its children

Heaps
Which of the following are heaps?
[Figure: three candidate trees labeled A, B, and C]

Implementing a Heap
Use an array or vector
Number the nodes from top to bottom
  – Number the nodes on each row from left to right
Store the data in the i-th node in the i-th location of the array (vector)

Implementing a Heap
Note the placement of the nodes in the array

Implementing a Heap
In an array implementation, the children of the i-th node are at myArray[2*i] and myArray[2*i+1]
The parent of the i-th node is at myArray[i/2]

Basic Heap Operations
Constructor
  – Set mySize to 0, allocate array
Empty
  – Check value of mySize
Retrieve max (or min) item
  – Return root of the binary tree, myArray[1]

Basic Heap Operations
Delete max (or min) item
  – The max item is the root; replace it with the last node in the tree
  – The result is called a semiheap
  – Then interchange the root with the larger of its two children
  – Continue this with the resulting subtree(s)

Percolate Down Algorithm
1. Set c = 2 * r
2. While c <= n do the following
   a. If c < n and myArray[c] < myArray[c + 1]
        Increment c by 1
   b. If myArray[r] < myArray[c]
        i. Swap myArray[r] and myArray[c]
        ii. Set r = c
        iii. Set c = 2 * c
      Else
        Terminate repetition
   End while
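A sketch of percolate down for a max-heap follows. It assumes the 1-based layout used in these slides, with the heap in `myArray[1..n]` and `myArray[0]` unused, and it loops while the child index `c` is still inside the heap. The signature is an assumption for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Percolate down (illustrative sketch) on a 1-based max-heap stored in
// myArray[1..n]; myArray[0] is an unused placeholder. Restores heap order in
// the subtree rooted at r, assuming both of r's subtrees are already heaps.
template <typename T>
void percolateDown(std::vector<T>& myArray, std::size_t r, std::size_t n) {
    std::size_t c = 2 * r;  // left child of r
    while (c <= n) {
        // If a right child exists and is larger, use it instead.
        if (c < n && myArray[c] < myArray[c + 1]) ++c;
        if (myArray[r] < myArray[c]) {
            std::swap(myArray[r], myArray[c]);
            r = c;
            c = 2 * c;
        } else {
            break;  // heap order holds at r; stop
        }
    }
}
```

For the semiheap 1, 9, 8, 4, 5 (positions 1 to 5), the root 1 is swapped with 9 and then with 5, giving 9, 5, 8, 4, 1.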

Basic Heap Operations
Insert an item
  – Amounts to a percolate up routine
  – Place the new item at the end of the array
  – Interchange it with its parent so long as it is greater than its parent
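Insertion by percolating up might look like the sketch below. It assumes a 1-based max-heap held in a `std::vector` whose element 0 is an unused placeholder, so the heap size is `myArray.size() - 1`; the name `heapInsert` is illustrative.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Heap insert (illustrative sketch) for a 1-based max-heap; myArray[0] is an
// unused placeholder that must already be present in the vector.
template <typename T>
void heapInsert(std::vector<T>& myArray, const T& item) {
    myArray.push_back(item);               // place new item at the end
    std::size_t loc = myArray.size() - 1;  // its 1-based position
    // Percolate up: swap with the parent at loc / 2 while the parent is smaller.
    while (loc > 1 && myArray[loc / 2] < myArray[loc]) {
        std::swap(myArray[loc], myArray[loc / 2]);
        loc /= 2;
    }
}
```

Inserting 10 into the heap 9, 5, 8, 4, 1 places it at position 6, swaps it with 8 at position 3, and then with 9 at the root.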

Heapsort
Given a list of numbers in an array
  – Stored in a complete binary tree
Convert to a heap
  – Begin at the last node that is not a leaf
  – Apply percolate down to this subtree
  – Continue until reaching the root

Heapsort
Algorithm for converting a complete binary tree to a heap, called "heapify"
For r = n/2 down to 1:
  Apply percolate_down to the subtree in myArray[r], …, myArray[n]
End for
Puts largest element at root

Heapsort
Now swap element 1 (root of tree) with the last element
  – This puts the largest element in its correct location
Use percolate down on the remaining sublist
  – Converts from semiheap to heap

Heapsort
Again swap the root with the rightmost leaf
Continue this process with the shrinking sublist

Heapsort Algorithm
1. Consider x as a complete binary tree; use heapify to convert this tree to a heap
2. For i = n down to 2:
   a. Interchange x[1] and x[i] (puts largest element at end)
   b. Apply percolate_down to convert the binary tree corresponding to the sublist x[1] .. x[i-1] back to a heap
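The two heapsort steps can be sketched as follows. The code assumes the 1-based layout used above, so the list to sort sits in `x[1..n]` with `x[0]` as an unused placeholder; the function names and signatures are assumptions for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Restores max-heap order in the subtree rooted at r within x[1..n].
template <typename T>
void percolateDown(std::vector<T>& x, std::size_t r, std::size_t n) {
    std::size_t c = 2 * r;
    while (c <= n) {
        if (c < n && x[c] < x[c + 1]) ++c;  // pick the larger child
        if (x[r] < x[c]) {
            std::swap(x[r], x[c]);
            r = c;
            c = 2 * c;
        } else {
            break;
        }
    }
}

// Heapsort (illustrative sketch): sorts x[1..n] ascending; x[0] is unused.
template <typename T>
void heapSort(std::vector<T>& x) {
    if (x.size() < 3) return;  // zero or one real element: nothing to do
    std::size_t n = x.size() - 1;
    // Step 1, heapify: percolate down from the last non-leaf to the root.
    for (std::size_t r = n / 2; r >= 1; --r) percolateDown(x, r, n);
    // Step 2: move the current largest to the end, then repair the
    // semiheap that remains in x[1..i-1].
    for (std::size_t i = n; i >= 2; --i) {
        std::swap(x[1], x[i]);
        percolateDown(x, 1, i - 1);
    }
}
```

No auxiliary array is needed; the sorted region simply grows from the right end of the same vector as the heap shrinks.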

Heap Algorithms in STL
Found in the <algorithm> library
  – make_heap(): heapify
  – push_heap(): insert
  – pop_heap(): delete
  – sort_heap(): heapsort
Note the program that illustrates these operations, Fig. 13.1
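The four STL heap algorithms can be exercised in a short sketch like this one. The wrapper function `stlHeapDemo` is hypothetical and is not the program from Fig. 13.1; the `std::` calls themselves are the standard ones from `<algorithm>`.

```cpp
#include <algorithm>
#include <vector>

// Illustrative sketch: builds a heap from v, inserts newItem, removes the
// largest value, and heapsorts what remains. Returns the removed value and
// leaves v sorted in ascending order.
int stlHeapDemo(std::vector<int>& v, int newItem) {
    std::make_heap(v.begin(), v.end());  // heapify: largest is at v.front()
    v.push_back(newItem);
    std::push_heap(v.begin(), v.end());  // insert: percolates v.back() up
    std::pop_heap(v.begin(), v.end());   // delete: moves largest to v.back()
    int largest = v.back();
    v.pop_back();
    std::sort_heap(v.begin(), v.end());  // heapsort the remaining heap
    return largest;
}
```

Note that `push_heap` and `pop_heap` do not change the container size; the caller pairs them with `push_back` and `pop_back`.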

Priority Queue
A collection of data elements
  – Items stored in order by priority
  – Higher priority items removed ahead of lower priority items
Operations
  – Constructor
  – Insert
  – Find, remove smallest/largest (priority) element
  – Replace
  – Change priority
  – Delete an item
  – Join two priority queues into a larger one

Priority Queue
Implementation possibilities
  – As a list (array, vector, linked list)
  – As an ordered list
  – Best is to use a heap: basic operations have O(log₂ n) time
STL priority queue adapter uses a heap
  – Note operations in table of Fig in text, page 751
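A brief sketch of the STL adapter in use: `std::priority_queue` from `<queue>` removes the largest element first by default. The helper name `drainByPriority` is hypothetical.

```cpp
#include <queue>
#include <vector>

// Illustrative sketch: pushes every item into a std::priority_queue and then
// removes them one by one; the highest-priority (largest) item comes out first.
std::vector<int> drainByPriority(const std::vector<int>& items) {
    std::priority_queue<int> pq;  // max-heap ordering by default
    for (int item : items) pq.push(item);
    std::vector<int> out;
    while (!pq.empty()) {
        out.push_back(pq.top());  // look at the highest-priority item
        pq.pop();                 // remove it
    }
    return out;
}
```

A smallest-first queue can be obtained by supplying `std::greater<int>` as the comparison type.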

Quicksort
A more efficient exchange sorting scheme than bubble sort
  – A typical exchange involves elements that are far apart
  – Fewer interchanges are required to correctly position an element
Quicksort uses a divide-and-conquer strategy
  – A recursive approach
  – The original problem is partitioned into simpler subproblems
  – Each subproblem is considered independently
Subdivision continues until the subproblems obtained are simple enough to be solved directly

Quicksort
Choose some element called a pivot
Perform a sequence of exchanges so that
  – All elements that are less than this pivot are to its left, and
  – All elements that are greater than the pivot are to its right
This divides the (sub)list into two smaller sublists, each of which may then be sorted independently in the same way

Quicksort
If the list has 0 or 1 elements, return. // the list is sorted
Else do:
  Pick an element in the list to use as the pivot.
  Split the remaining elements into two disjoint groups:
    SmallerThanPivot = {all elements < pivot}
    LargerThanPivot = {all elements > pivot}
  Return the list rearranged as:
    Quicksort(SmallerThanPivot), pivot, Quicksort(LargerThanPivot).

Quicksort Example
Given to sort: 75, 70, 65, 84, 98, 78, 100, 93, 55, 61, 81, 68
Select, arbitrarily, the first element, 75, as pivot
Search from the right for an element <= 75; stop at the first one found
Search from the left for an element > 75; stop at the first one found
Swap these two elements, and then repeat this process

Quicksort Example
75, 70, 65, 68, 61, 55, 100, 93, 78, 98, 81, 84
When done, swap with pivot
This SPLIT operation places pivot 75 so that all elements to its left are <= 75 and all elements to its right are > 75
  – View code for split() template
75 is now placed appropriately
Need to sort the sublists on either side of 75

Quicksort Example
Need to sort (independently): 55, 70, 65, 68, 61 and 100, 93, 78, 98, 81, 84
Let the pivot be 55; look from each end for values larger/smaller than 55, and swap
Same for the 2nd list, where the pivot is 100
Sort the resulting sublists in the same manner until each sublist is trivial (size 0 or 1)
View quicksort() recursive function

Quicksort
Note the visual example of a quicksort on an array

Quicksort Performance
O(n log₂ n) is the average-case computing time
  – If the pivot results in sublists of approximately the same size
O(n²) worst case
  – List already ordered, or elements in reverse order
  – When Split() repeatedly results, for example, in one empty sublist

Improvements to Quicksort
Quicksort is a recursive function
  – A stack of activation records must be maintained by the system to manage recursion
  – The deeper the recursion is, the larger this stack will become
The depth of the recursion and the corresponding overhead can be reduced
  – Sort the smaller sublist at each stage first

Improvements to Quicksort
Another improvement aimed at reducing the overhead of recursion is to use an iterative version of Quicksort()
To do so, use a stack to store the first and last positions of the sublists sorted "recursively"

Improvements to Quicksort
An arbitrary pivot gives a poor partition for nearly sorted lists (or lists in reverse order)
Virtually all the elements go into either SmallerThanPivot or LargerThanPivot
  – All through the recursive calls
Quicksort takes quadratic time to do essentially nothing at all

Improvements to Quicksort
A better method for selecting the pivot is the median-of-three rule
  – Select the median of the first, middle, and last elements in each sublist as the pivot
Often the list to be sorted is already partially ordered
The median-of-three rule will select a pivot closer to the middle of the sublist than will the "first-element" rule
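Pivot selection by the median-of-three rule can be sketched like this. The function name `medianOfThreeIndex` is hypothetical; it returns the position of the median so that a caller could swap that element into the pivot slot before splitting.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Median-of-three (illustrative sketch): returns the index, among first,
// middle, and last, whose element is the median of the three values.
template <typename T>
std::size_t medianOfThreeIndex(const std::vector<T>& x,
                               std::size_t first, std::size_t last) {
    std::size_t lo = first;
    std::size_t md = first + (last - first) / 2;
    std::size_t hi = last;
    // Order the three indices by the values they refer to.
    if (x[md] < x[lo]) std::swap(lo, md);
    if (x[hi] < x[md]) std::swap(md, hi);
    if (x[md] < x[lo]) std::swap(lo, md);
    return md;
}
```

For a list that is already in ascending or descending order, the rule picks the middle element, which splits the sublist roughly in half instead of producing an empty sublist.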

Improvements to Quicksort
For small files (n <= 20), quicksort is worse than insertion sort
  – Small files occur often because of recursion
Use an efficient sort (e.g., insertion sort) for small files
Better yet, use Quicksort() until the sublists are of a small size and then apply an efficient sort like insertion sort

Mergesort
Sorting schemes are either
  – internal: designed for data items stored in main memory
  – external: designed for data items stored in secondary memory
Previous sorting schemes were all internal sorting algorithms:
  – They required direct access to list elements (not possible for sequential files)
  – They made many passes through the list (not practical for files)

Mergesort
Mergesort can be used both as an internal and an external sort
The basic operation in mergesort is merging
  – Combining two lists that have previously been sorted
  – The resulting list is also sorted

Merge Algorithm
1. Open File1 and File2 for input, File3 for output
2. Read first element x from File1 and first element y from File2
3. While neither eof File1 nor eof File2
     If x < y then
       a. Write x to File3
       b. Read a new x value from File1
     Otherwise
       a. Write y to File3
       b. Read a new y value from File2
   End while
4. If eof File1 encountered, copy rest of File2 into File3.
   If eof File2 encountered, copy rest of File1 into File3

Binary Merge Sort
Given a single file
Split into two files

Binary Merge Sort
Merge first one-element "subfile" of F1 with first one-element subfile of F2
  – Gives a sorted two-element subfile of F
Continue with the rest of the one-element subfiles

Binary Merge Sort
Split again
Merge again as before
Each time, the size of the sorted subgroups doubles

Binary Merge Sort
Last splitting gives two files, each in order
Last merging yields a single file, entirely in order
Note that we are always limited to subfiles whose size is some power of 2

Natural Merge Sort
Allows sorted subfiles of other sizes
  – Number of phases can be reduced when the file contains longer "runs" of ordered elements
Consider the file to be sorted; note the groups that are already in order

Natural Merge Sort
Copy alternate groupings into two files
  – Use the sub-groupings, not a power of 2
Look for possible larger groupings

Natural Merge Sort
Merge the corresponding subfiles
On EOF for F2, copy the remaining groups from F1

Natural Merge Sort
Split again, alternating groups
Merge again, now two subgroups
One more split and one more merge give the sorted file

Natural Merge Sort
Note Split algorithm for natural merge sort, page 785 of text
Note Merge algorithm for natural merge sort, page 785 of text
Note Mergesort algorithm which uses the Split and Merge routines, page 786 of text
Worst case for natural merge sort is O(n log₂ n)

Radix Sort
Based on examining digits in some base-b numeric representation of items (or keys)
Least significant digit radix sort
  – Processes digits from right to left
  – Used in early punched-card sorting machines
Create groupings of items with the same value in the specified digit
  – Collect in order and create groupings with the next significant digit
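A least-significant-digit radix sort can be sketched as follows, assuming base b = 10 and non-negative integer keys. The function name `radixSortLSD` and the use of ten `std::vector` groups are illustrative choices, not taken from the text.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// LSD radix sort (illustrative sketch) for unsigned integers in base 10.
// Each pass groups the items by one digit, starting with the rightmost, and
// collects the groups in order 0..9. Because items keep their relative order
// inside a group, earlier passes are preserved by later ones.
void radixSortLSD(std::vector<unsigned>& items) {
    if (items.empty()) return;
    unsigned largest = *std::max_element(items.begin(), items.end());
    // place = 1, 10, 100, ... until the largest key has no more digits.
    for (unsigned long long place = 1; largest / place > 0; place *= 10) {
        std::array<std::vector<unsigned>, 10> groups;
        for (unsigned item : items) {
            groups[(item / place) % 10].push_back(item);
        }
        items.clear();
        for (const auto& group : groups) {
            items.insert(items.end(), group.begin(), group.end());
        }
    }
}
```

No key is ever compared with another key; the number of passes equals the number of digits in the largest key.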