Merge and Radix Sorts: Data Structures, Fall 2007, 13th



Merge Sort (1/13)
Before looking at the merge sort algorithm to sort n records, let us see how one may merge two sorted lists to get a single sorted list.
Merging
  The first merging algorithm, Program 7.7, uses O(n) additional space.
  It merges the sorted lists (list[i], …, list[m]) and (list[m+1], …, list[n]) into a single sorted list, (sorted[i], …, sorted[n]).

Merge (using O(n) space)
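The program shown on this slide did not survive in the transcript. A minimal sketch of an O(n)-space merge, assuming plain int keys instead of the records with a key field used in the slides, might look like this. The index convention follows the description above: the runs are list[i..m] and list[m+1..n].

```c
#include <assert.h>

/* Merge the sorted runs list[i..m] and list[m+1..n] into sorted[i..n].
   Assumption: keys are plain ints; the slides sort records by a key field. */
void merge(const int list[], int sorted[], int i, int m, int n)
{
    assert(i <= m && m <= n);

    int j = m + 1;  /* next unread element of the second run */
    int k = i;      /* next free position in sorted[] */

    while (i <= m && j <= n) {
        /* <= takes from the first run on ties, which keeps the merge stable */
        if (list[i] <= list[j])
            sorted[k++] = list[i++];
        else
            sorted[k++] = list[j++];
    }
    /* copy whatever remains of the run that was not exhausted */
    while (i <= m)
        sorted[k++] = list[i++];
    while (j <= n)
        sorted[k++] = list[j++];
}
```

Each element is copied exactly once, so the merge takes time linear in the number of records merged, and sorted[] is the O(n) additional space.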

Merge Sort (3/13)
O(1) space merge
Steps in an O(1) space merge when the total number of records, n, is a perfect square and the number of records in each of the files to be merged is a multiple of √n:
  Step 1: Identify the √n records with the largest keys. This is done by following right to left along the two files to be merged.
  Step 2: Exchange the records of the second file that were identified in Step 1 with those just to the left of those identified from the first file, so that the √n records with the largest keys form a contiguous block.

Merge Sort (4/13)
O(1) space merge (contd)
  Step 3: Swap the block of the √n largest records with the leftmost block (unless it is already the leftmost block). Sort the rightmost block.
  Step 4: Reorder the blocks, excluding the block of largest records, into nondecreasing order of the last key in the blocks.

Merge Sort (5/13)
O(1) space merge (contd)
  Step 5: Perform as many merge substeps as needed to merge the √n - 1 blocks other than the block with the largest keys.
[Figure: example sequence divided into blocks of six records, before and after a merge substep.]

Merge Sort (6/13)
Merge substeps in the example:
  6, 7, 8 are merged.
  Segment one is merged (i.e., 0, 2, 4, 6, 8, a). Change place marker (longest sorted sequence of records).
  Segment one is merged (i.e., b, c, e, g, i, j, k). Change place marker.
  Segment one is merged (i.e., o, p, q). No other segment remains. Sort the largest keys.
  Step 6: Sort the block with the largest keys.

When selection sort is used to implement Step 4, each block is regarded as a single record with key equal to that of the last record in the block. Sorting the √n blocks this way takes O(n) time: O(n) key comparisons plus O(√n) block swaps of √n records each.
The total time is O(n).
The additional space used is O(1).
Example: input list (26, 5, 77, 1, 61, 11, 59, 15, 48, 19)

Selection sort

    void selectionSort(int numbers[], int array_size)
    {
        int i, j;
        int min, temp;

        for (i = 0; i < array_size - 1; i++) {
            min = i;  /* index of the smallest key found in numbers[i..] */
            for (j = i + 1; j < array_size; j++) {
                if (numbers[j] < numbers[min])
                    min = j;
            }
            /* swap the smallest remaining key into position i */
            temp = numbers[i];
            numbers[i] = numbers[min];
            numbers[min] = temp;
        }
    }

Time complexity: O(n^2)

Merge Sort (7/13)
Iterative merge sort
  1. We assume that the input sequence has n sorted lists, each of length 1.
  2. We merge these lists pairwise to obtain n/2 lists of size 2.
  3. We then merge the n/2 lists pairwise, and so on, until a single list remains.
Analysis
  The total number of passes is the ceiling of log2 n.
  Each pass merges pairs of sorted lists in linear time, O(n).
  The total computing time is O(n log n).

Merge Sort (8/13)
merge_pass
  Performs one pass of the merge sort: it merges adjacent pairs of subfiles from list into sorted.
  Invokes merge (Program 7.7) to merge the sorted sublists.
  Parameters: n, the number of elements in the list; length, the length of the subfile.
[Figure: trace of merge_pass on a 10-element array, indices [0] to [9], with n = 10 and length = 2, showing list and sorted.]

merge_sort: performs a merge sort on the file.
[Figure: trace of merge_sort on a 10-element array, indices [0] to [9], with n = 10. The length takes the values 1, 2, 4, 8, 16 while the passes alternate between list and extra.]
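The code for merge_pass and merge_sort is not preserved in the transcript. The sketch below is one way to write them, assuming int keys, 0-based indexing, and a heap-allocated scratch array named extra. The return code of merge_sort is an addition for error reporting, not something the slides specify.

```c
#include <assert.h>
#include <stdlib.h>

/* Merge sorted runs list[i..m] and list[m+1..n] into sorted[i..n]. */
static void merge(const int list[], int sorted[], int i, int m, int n)
{
    int j = m + 1, k = i;
    while (i <= m && j <= n)
        sorted[k++] = (list[i] <= list[j]) ? list[i++] : list[j++];
    while (i <= m) sorted[k++] = list[i++];
    while (j <= n) sorted[k++] = list[j++];
}

/* One pass: merge adjacent pairs of subfiles of the given length
   from list into sorted. n is the number of elements in the list. */
static void merge_pass(const int list[], int sorted[], int n, int length)
{
    int i;
    for (i = 0; i + 2 * length <= n; i += 2 * length)
        merge(list, sorted, i, i + length - 1, i + 2 * length - 1);
    if (i + length < n)                 /* one full subfile plus a short one */
        merge(list, sorted, i, i + length - 1, n - 1);
    else                                /* a single leftover subfile: copy it */
        for (; i < n; i++)
            sorted[i] = list[i];
}

/* Sort list[0..n-1]. Returns 0 on success, -1 if the scratch
   array cannot be allocated (assumed convention). */
int merge_sort(int list[], int n)
{
    assert(n >= 0);
    if (n < 2)
        return 0;

    int *extra = malloc((size_t)n * sizeof *extra);
    if (extra == NULL)
        return -1;

    /* Passes run in pairs (list -> extra, extra -> list), so the
       result always ends up back in list. */
    for (int length = 1; length < n; ) {
        merge_pass(list, extra, n, length);
        length *= 2;
        merge_pass(extra, list, n, length);
        length *= 2;
    }
    free(extra);
    return 0;
}
```

For n = 10 the passes use length = 1, 2, 4, 8 and the loop stops once length reaches 16, which matches the ceiling of log2 10 = 4 passes stated in the analysis.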

Merge Sort (10/13)
Recursive merge sort concept


listmerge: takes two sorted chains and returns an integer that points to the start of the sorted list.
The link field in each record is initially set to -1.
Since the elements are numbered from 0 to n-1, list[n] is free and we use it to store the start pointer.

rmerge: sorts the list list[lower], …, list[upper]. The link field in each record is initially set to -1.
Call: start = rmerge(list, 0, n-1);
[Figure: trace of rmerge on a 10-element array, indices [0] to [9], showing the values of lower, upper, and middle.]
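The bodies of listmerge and rmerge are missing from the transcript. The following is a sketch under stated assumptions: the record type is reduced to an int key and an int link, and n is passed as an explicit extra parameter so that list[n] can serve as the scratch head record (the call on the slide takes only three arguments, so n is presumably available some other way there).

```c
#include <assert.h>

typedef struct {
    int key;
    int link;  /* index of the next record in the chain, -1 at the end */
} element;

/* Merge the sorted chains starting at first and second.
   list[n] is used as a dummy head; returns the index of the first record. */
int listmerge(element list[], int first, int second, int n)
{
    int tail = n;  /* last record of the merged chain so far */

    while (first != -1 && second != -1) {
        if (list[first].key <= list[second].key) {
            list[tail].link = first;
            tail = first;
            first = list[first].link;
        } else {
            list[tail].link = second;
            tail = second;
            second = list[second].link;
        }
    }
    /* append whichever chain still has records */
    list[tail].link = (first != -1) ? first : second;
    return list[n].link;
}

/* Sort list[lower..upper] by relinking; every link must be -1 on entry.
   Returns the index of the record with the smallest key. */
int rmerge(element list[], int lower, int upper, int n)
{
    assert(lower <= upper);
    if (lower == upper)
        return lower;  /* a single record is already a sorted chain */

    int middle = lower + (upper - lower) / 2;
    int left = rmerge(list, lower, middle, n);
    int right = rmerge(list, middle + 1, upper, n);
    return listmerge(list, left, right, n);
}
```

Because only link fields change, no records are moved; the array must have room for n + 1 elements, and the sorted order is read by following links from the returned index until -1.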

Merge Sort (13/13)
Variation: natural merge sort
  We can modify merge_sort to take into account the prevailing order within the input list.
  In this implementation we make an initial pass over the data to determine the sequences of records that are in order.
  The merge sort then uses these initially ordered sublists for the remainder of the passes.
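As an illustration of the idea, here is a simplified sketch with int keys. It departs from the description above in one respect, which is an assumption made for brevity: instead of recording the run boundaries in one initial pass, it rescans for ordered runs at the start of every pass. The name natural_merge_sort and the return code are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Merge sorted runs list[i..m] and list[m+1..n] into sorted[i..n]. */
static void merge(const int list[], int sorted[], int i, int m, int n)
{
    int j = m + 1, k = i;
    while (i <= m && j <= n)
        sorted[k++] = (list[i] <= list[j]) ? list[i++] : list[j++];
    while (i <= m) sorted[k++] = list[i++];
    while (j <= n) sorted[k++] = list[j++];
}

/* Sort list[0..n-1] by repeatedly merging adjacent ordered runs.
   Returns 0 on success, -1 if allocation fails. */
int natural_merge_sort(int list[], int n)
{
    assert(n >= 0);
    if (n < 2)
        return 0;

    int *extra = malloc((size_t)n * sizeof *extra);
    if (extra == NULL)
        return -1;

    for (;;) {
        int runs = 0;  /* number of runs written to extra in this pass */
        int i = 0;
        while (i < n) {
            int m = i;  /* end of the run that starts at i */
            while (m + 1 < n && list[m] <= list[m + 1])
                m++;
            if (m == n - 1) {  /* last run has no partner: copy it */
                memcpy(extra + i, list + i, (size_t)(n - i) * sizeof *list);
                runs++;
                break;
            }
            int e = m + 1;  /* end of the following run */
            while (e + 1 < n && list[e] <= list[e + 1])
                e++;
            merge(list, extra, i, m, e);
            runs++;
            i = e + 1;
        }
        memcpy(list, extra, (size_t)n * sizeof *list);
        if (runs <= 1)  /* a single run covers the whole list: sorted */
            break;
    }
    free(extra);
    return 0;
}
```

On input that is already sorted, the first pass finds a single run and the function stops after one linear scan, which is the benefit of exploiting the prevailing order.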

Heap Sort (1/3)
The challenges of merge sort
  Merge sort requires additional storage proportional to the number of records in the file being sorted.
  By using the O(1) space merge algorithm, the space requirement can be reduced to O(1), but the resulting sort is significantly slower than the original one.
Heap sort
  Requires only a fixed amount of additional storage.
  Slightly slower than merge sort using O(n) additional space.
  Faster than merge sort using O(1) additional space.
  The worst-case and average computing time is O(n log n), the same as merge sort.
  Unstable.

adjust: adjusts the binary tree to establish the heap.
Code annotations: /* compare root and max. child */ and /* move to parent */
[Figure: trace of adjust on a 10-node binary tree stored in positions [1] to [10], with root = 1, showing rootkey, n, and child.]

Heap Sort (3/3)
heapsort: sorts into ascending order using a max heap.
[Figure: trace of heapsort on a 10-node tree stored in positions [1] to [10], with n = 10 and loop index i; labels: bottom-up, top-down.]
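The code for adjust and heapsort is not preserved in the transcript. A minimal sketch, assuming int keys and the 1-based layout shown in the figures (position 0 of the array is unused), could read as follows. The sort routine is named heap_sort here, a deliberate change, because some C libraries already declare a function called heapsort.

```c
#include <assert.h>

/* Sift the key at position root down until the subtree rooted there
   is a max heap. The tree occupies list[1..n]; list[0] is unused. */
void adjust(int list[], int root, int n)
{
    int rootkey = list[root];
    int child = 2 * root;  /* left child */

    while (child <= n) {
        /* pick the larger of the two children */
        if (child < n && list[child] < list[child + 1])
            child++;
        if (rootkey >= list[child])  /* compare root and max. child */
            break;
        list[child / 2] = list[child];  /* move to parent */
        child *= 2;
    }
    list[child / 2] = rootkey;
}

/* Sort list[1..n] into ascending order using a max heap. */
void heap_sort(int list[], int n)
{
    assert(n >= 0);

    /* build the heap bottom-up, starting from the last internal node */
    for (int i = n / 2; i > 0; i--)
        adjust(list, i, n);

    /* repeatedly move the maximum to the end and re-adjust from the top */
    for (int i = n - 1; i > 0; i--) {
        int temp = list[1];
        list[1] = list[i + 1];
        list[i + 1] = temp;
        adjust(list, 1, i);
    }
}
```

Only a constant number of scalar variables are used besides the array itself, which is the fixed amount of additional storage mentioned on the first heap sort slide.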

Radix Sort (1/8)
We consider the problem of sorting records that have several keys.
  These keys are labeled $K^0$ (most significant key), $K^1$, …, $K^{r-1}$ (least significant key).
  Let $K_i^j$ denote key $K^j$ of record $R_i$.
  A list of records $R_0, \dots, R_{n-1}$ is lexically sorted with respect to the keys $K^0, K^1, \dots, K^{r-1}$ iff
  $(K_i^0, K_i^1, \dots, K_i^{r-1}) \le (K_{i+1}^0, K_{i+1}^1, \dots, K_{i+1}^{r-1}), \quad 0 \le i < n-1$

Radix Sort (2/8)
Example
  Sorting a deck of cards on two keys, suit and face value, in which the keys have the ordering relation:
    $K^0$ [Suit]: ♣ < ♦ < ♥ < ♠
    $K^1$ [Face value]: 2 < 3 < 4 < … < 10 < J < Q < K < A
  Thus, a sorted deck of cards has the ordering: 2♣, …, A♣, …, 2♠, …, A♠
Two approaches to sort:
  1. MSD (Most Significant Digit) first: sort on $K^0$, then $K^1$, …
  2. LSD (Least Significant Digit) first: sort on $K^{r-1}$, then $K^{r-2}$, …

Radix Sort (3/8)
MSD first
  1. MSD sort first, e.g., bin sort with four bins, one per suit.
  2. LSD sort second, within each bin.
  Result: 2♣, …, A♣, …, 2♠, …, A♠

Radix Sort (4/8)
LSD first
  1. LSD sort first, e.g., face sort with 13 bins: 2, 3, 4, …, 10, J, Q, K, A.
  2. MSD sort second (may not be needed: we can just classify these 13 piles into 4 separate piles by considering them from face 2 to face A).
  Simpler than the MSD approach because we do not have to sort the subpiles independently.
  Result: one pile per suit, each ordered 2, …, A.

Radix Sort (5/8)
We can also use an LSD or MSD sort when we have only one logical key, if we interpret this key as a composite of several keys.
Example:
  Integer: the digit in the far right position is the least significant and the digit in the far left position is the most significant.
  Range: 0 ≤ K ≤ 999, using an LSD or MSD sort on three keys ($K^0$, $K^1$, $K^2$), each a digit from 0 to 9.
  Since an LSD sort does not require the maintenance of independent subpiles, it is easier to implement.
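To make the composite-key interpretation concrete, the sketch below splits an integer in the range 0 to 999 into its three decimal digits, with index 0 holding the most significant digit. The function name split_key is hypothetical; the constants are the ones used for the range and radix in this example.

```c
#include <assert.h>

#define MAX_DIGIT 3    /* keys range over 0 to 999 */
#define RADIX_SIZE 10  /* decimal digits */

/* Store the digits of value in key[0..MAX_DIGIT-1], where key[0] is
   the most significant digit and key[MAX_DIGIT-1] the least significant. */
void split_key(int value, int key[MAX_DIGIT])
{
    assert(value >= 0 && value <= 999);

    for (int i = MAX_DIGIT - 1; i >= 0; i--) {
        key[i] = value % RADIX_SIZE;  /* peel off the rightmost digit */
        value /= RADIX_SIZE;
    }
}
```

For instance, 271 becomes the key tuple (2, 7, 1) and 9 becomes (0, 0, 9), so comparing tuples lexically agrees with comparing the integers.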

Radix Sort (6/8)
Radix sort
  Decompose the sort key into digits using a radix r.
  Ex: when r = 10, we get the common base 10 or decimal decomposition of the key.
LSD radix r sort
  The records are $R_0, \dots, R_{n-1}$.
  Keys are d-tuples $(x_0, x_1, \dots, x_{d-1})$ with $0 \le x_i < r$.
  Each record has a link field, and the input list is stored as a dynamically linked list.
  We implement the bins as queues:
    front[i], 0 ≤ i < r, points to the first record in bin i
    rear[i], 0 ≤ i < r, points to the last record in bin i

    #define MAX_DIGIT 3   /* 0 to 999 */
    #define RADIX_SIZE 10
    typedef struct list_node *list_pointer;
    struct list_node {
        int key[MAX_DIGIT];
        list_pointer link;
    };

LSD Radix Sort
Time complexity: O(MAX_DIGIT(RADIX_SIZE + n))
  There are MAX_DIGIT passes.
  Each pass takes O(RADIX_SIZE) to initialize and relink the bins and O(n) to distribute the records.
  Here RADIX_SIZE = 10 and MAX_DIGIT = 3.
[Figure: the initial input chain, and the bins f[0] to f[9] and r[0] to r[9] with the chain after the first pass, i = 2.]

Radix Sort (8/8)
Simulation of radix_sort
[Figure: the bins f[0] to f[9] and r[0] to r[9], with the chain after the second pass, i = 1, and the chain after the third pass, i = 0.]
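The body of radix_sort is not preserved in the transcript. A sketch consistent with the description above (linked input list, bins as queues with front and rear pointers, passes from i = MAX_DIGIT - 1 down to 0) might look like this. The details of the relinking loop are an assumption about how the chain is rebuilt between passes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DIGIT 3    /* keys range over 0 to 999 */
#define RADIX_SIZE 10

typedef struct list_node *list_pointer;
struct list_node {
    int key[MAX_DIGIT];  /* key[0] is the most significant digit */
    list_pointer link;
};

/* LSD radix sort of the chain starting at ptr; returns the new head. */
list_pointer radix_sort(list_pointer ptr)
{
    list_pointer front[RADIX_SIZE], rear[RADIX_SIZE];

    for (int i = MAX_DIGIT - 1; i >= 0; i--) {
        for (int j = 0; j < RADIX_SIZE; j++)  /* empty all bins */
            front[j] = rear[j] = NULL;

        /* distribute: append each record to the queue for its digit,
           which preserves the order from the previous pass */
        while (ptr != NULL) {
            int digit = ptr->key[i];
            assert(digit >= 0 && digit < RADIX_SIZE);
            if (front[digit] == NULL)
                front[digit] = ptr;
            else
                rear[digit]->link = ptr;
            rear[digit] = ptr;
            ptr = ptr->link;
        }

        /* collect: chain the non-empty bins together, working from
           bin 9 down to bin 0 so the head ends up at the lowest bin */
        ptr = NULL;
        for (int j = RADIX_SIZE - 1; j >= 0; j--) {
            if (front[j] != NULL) {
                rear[j]->link = ptr;
                ptr = front[j];
            }
        }
    }
    return ptr;
}
```

Each pass does O(RADIX_SIZE) work on the bins and O(n) work on the records, which gives the O(MAX_DIGIT(RADIX_SIZE + n)) bound; because records are appended at the rear of each queue, every pass is stable, which is what makes the LSD order of passes correct.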

Summary of Internal Sorting (1/2)
Insertion Sort
  Works well when the list is already partially ordered.
  The best sorting method for small n.
Merge Sort
  The best worst-case behavior: O(n log n).
  Requires more storage than heap sort.
  Slightly more overhead than quick sort.
Quick Sort
  The best average behavior.
  Worst-case complexity is O(n^2).
Radix Sort
  Performance depends on the size of the keys and the choice of the radix.

Summary of Internal Sorting (2/2)
Analysis of the average running times