Mergesort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos


Mergesort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington

Mergesort Overview
- Mergesort is a very popular sorting algorithm.
- The worst-case time complexity is Θ(N log N).
- Another algorithm, quicksort, has a worst-case complexity of Θ(N^2). Thus, mergesort is preferable if we want a theoretical guarantee that the running time will never exceed O(N log N).
- Mergesort is stable, as long as the merge step takes the item from the left half when two keys are equal. A sorting method is called stable if it does not change the relative order of items whose keys are equal.
- Disadvantage: mergesort typically requires Θ(N) extra space, to copy the data.
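To make the stability property concrete, here is a minimal sketch of a mergesort over records that carry a key and a tag. The `Record` type, the function names, and the index-range interface are assumptions made for this illustration and are not taken from the slides. The point to look at is the tie-break in the merge: on equal keys the item from the left run is written first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type for this sketch: sorted by key, tag marks input order. */
typedef struct {
    int key;
    char tag;
} Record;

/* Merge sorted runs a[lo..mid) and a[mid..hi), using aux as scratch space. */
static void merge_records(Record *a, Record *aux, int lo, int mid, int hi) {
    memcpy(aux + lo, a + lo, sizeof(Record) * (size_t)(hi - lo));
    int i = lo, j = mid;
    for (int k = lo; k < hi; k++) {
        if (i == mid)                     a[k] = aux[j++]; /* left run exhausted */
        else if (j == hi)                 a[k] = aux[i++]; /* right run exhausted */
        else if (aux[j].key < aux[i].key) a[k] = aux[j++]; /* right is strictly smaller */
        else                              a[k] = aux[i++]; /* ties go to the left run */
    }
}

static void sort_range(Record *a, Record *aux, int lo, int hi) {
    if (hi - lo <= 1) return;
    int mid = lo + (hi - lo) / 2;
    sort_range(a, aux, lo, mid);
    sort_range(a, aux, mid, hi);
    merge_records(a, aux, lo, mid, hi);
}

/* Sorts a[0..n) by key. Returns 0 on success, -1 if the scratch array
   cannot be allocated. */
int sort_records(Record *a, int n) {
    assert(n >= 0);
    if (n <= 1) return 0;
    Record *aux = malloc(sizeof(Record) * (size_t)n);
    if (aux == NULL) return -1;
    sort_range(a, aux, 0, n);
    free(aux);
    return 0;
}
```

Sorting the records (2,a), (1,b), (2,c), (1,d), (2,e) with this sketch gives (1,b), (1,d), (2,a), (2,c), (2,e): within each key, the tags stay in their input order.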

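Mergesort Implementation
Top-level function: allocates memory for a scratch array, aux. Note that the size of the scratch array is the same as the size of the input array A. Then, we just call a recursive helper function, msort_help, that does all the work.

    #include <stdlib.h>

    void msort_help(Item * A, int length, Item * aux);
    void merge(Item *D, Item *B, int M, Item *C, int P, Item * aux);

    /* Note: BSD and macOS <stdlib.h> already declare a mergesort();
       rename this function on those systems. */
    void mergesort(Item * A, int length)
    {
      Item * aux = malloc(sizeof(Item) * length);
      if (aux == NULL) return;  /* allocation failed, A is left unchanged */
      msort_help(A, length, aux);
      free(aux);
    }

    void msort_help(Item * A, int length, Item * aux)
    {
      if (length <= 1) return;
      int M = length/2;
      Item * C = &(A[M]);
      int P = length-M;
      msort_help(A, M, aux);
      msort_help(C, P, aux);
      merge(A, A, M, C, P, aux);
    }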

Mergesort Implementation
Recursive helper (msort_help), basic approach:
- Split A into two halves.
  - Left half: pointer A, length M = length/2.
  - Right half: pointer C = &(A[M]), length P = length - M.
- Sort each half, via a recursive call to msort_help itself.
- Merge the results by calling merge(A, A, M, C, P, aux) (we will see how in a bit).

Merging Two Sorted Sets

    void merge(Item *D, Item *B, int M, Item *C, int P, Item * aux)
    {
      int T = M+P;
      int i, j, k;
      for (i = 0; i < M; i++) aux[i] = B[i];       /* copy B to aux[0..M-1] */
      for (j = 0; j < P; j++) aux[j+M] = C[j];     /* copy C to aux[M..T-1] */
      for (i = 0, j = M, k = 0; k < T; k++)
      {
        if (i == M) { D[k] = aux[j++]; continue; } /* B is exhausted */
        if (j == T) { D[k] = aux[i++]; continue; } /* C is exhausted */
        /* Assuming less() is a strict comparison, this takes the item
           from B on equal keys, which keeps the sort stable. */
        if (less(aux[j], aux[i])) D[k] = aux[j++];
        else D[k] = aux[i++];
      }
    }

Summary:
- Assumption: arrays B and C are already sorted.
- Merges B and C, and produces the sorted array D.
- The result array pointer D is passed as input. It must already have enough memory for M+P items.


Merging Two Sorted Sets
Note that the merge function is where we use the aux array. Why do we need it?
- Using aux allows D to share memory with B and C.
- We first copy the contents of both B and C to aux.
- Then, as we write into D, possibly overwriting B and C, aux still has the data we need.
- This is exactly the situation in msort_help, which calls merge(A, A, M, C, P, aux): D and B are the same pointer, and C points into the same array.
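The top-level function, the recursive helper, and the merge step can be put together into one compilable unit. The following is a sketch under stated assumptions, not the course's reference code: `Item` is taken to be `int`, `less` is taken to be plain `<`, the top-level function is named `merge_sort` (to avoid a clash with the `mergesort` that BSD and macOS declare in `<stdlib.h>`), and ties are resolved in favor of the left half so that the sort is stable.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumptions for this sketch: items are ints and less() is plain '<'. */
typedef int Item;
#define less(a, b) ((a) < (b))

/* Merge sorted arrays B (length M) and C (length P) into D.
   D may share memory with B and C, because both are copied to aux first. */
static void merge(Item *D, Item *B, int M, Item *C, int P, Item *aux) {
    int T = M + P;
    int i, j, k;
    for (i = 0; i < M; i++) aux[i] = B[i];
    for (j = 0; j < P; j++) aux[j + M] = C[j];
    for (i = 0, j = M, k = 0; k < T; k++) {
        if (i == M) { D[k] = aux[j++]; continue; }  /* B exhausted */
        if (j == T) { D[k] = aux[i++]; continue; }  /* C exhausted */
        if (less(aux[j], aux[i])) D[k] = aux[j++];  /* C item strictly smaller */
        else D[k] = aux[i++];                       /* ties go to B */
    }
}

/* Recursive helper: sort each half, then merge the halves back into A. */
static void msort_help(Item *A, int length, Item *aux) {
    if (length <= 1) return;
    int M = length / 2;
    Item *C = &(A[M]);
    int P = length - M;
    msort_help(A, M, aux);
    msort_help(C, P, aux);
    merge(A, A, M, C, P, aux);
}

/* Top-level function: allocate the scratch array once, then recurse. */
void merge_sort(Item *A, int length) {
    assert(length >= 0);
    if (length <= 1) return;
    Item *aux = malloc(sizeof(Item) * (size_t)length);
    if (aux == NULL) return;  /* allocation failed, A is left unchanged */
    msort_help(A, length, aux);
    free(aux);
}
```

Calling `merge_sort` on the array 35, 30, 17, 10, 60 with length 5 leaves it as 10, 17, 30, 35, 60. Allocating aux once at the top level, rather than inside every recursive call, keeps the extra space at Θ(N).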

Merge Execution
Array B (positions 0 to 4): 10 35 75 80 90
Array C (positions 0 to 4): 17 30 40 60 70
Initial state: arrays D and aux (positions 0 to 9 each) are empty.
Step 1: Copy B to aux. aux (positions 0 to 4): 10 35 75 80 90
Then copy C to aux. aux (positions 0 to 9): 10 35 75 80 90 17 30 40 60 70
Main loop: i indexes the B part of aux, j indexes the C part of aux, k is the next position to write in D. The contents of aux, B, and C do not change during the loop.

    k    i    j    Array D (positions 0 to k-1)
    0    0    5    (empty)                          main loop initialization
    1    1    5    10
    2    1    6    10 17
    3    1    7    10 17 30
    4    2    7    10 17 30 35
    5    2    8    10 17 30 35 40
    6    2    9    10 17 30 35 40 60
    7    2    10   10 17 30 35 40 60 70
    8    3    10   10 17 30 35 40 60 70 75
    9    4    10   10 17 30 35 40 60 70 75 80
    10   5    10   10 17 30 35 40 60 70 75 80 90    k=10, so we are done!
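The execution above can be reproduced with an instrumented copy of the merge function. This is a sketch for checking the trace by hand: the name `merge_traced` and the two output arrays `trace_i` and `trace_j` are hypothetical additions, `Item` is assumed to be `int`, and the comparison is written so that ties take the item from B.

```c
#include <assert.h>

typedef int Item;  /* assumption for this sketch: items are ints */

/* Same merge logic as in the slides, but after writing D[k] it also
   records the current values of i and j. trace_i and trace_j are
   hypothetical extras; each must have room for M + P entries. */
void merge_traced(Item *D, const Item *B, int M, const Item *C, int P,
                  Item *aux, int *trace_i, int *trace_j) {
    int T = M + P;
    int i, j, k;
    assert(M >= 0 && P >= 0);
    for (i = 0; i < M; i++) aux[i] = B[i];      /* copy B to aux */
    for (j = 0; j < P; j++) aux[j + M] = C[j];  /* copy C to aux */
    for (i = 0, j = M, k = 0; k < T; k++) {
        if (i == M)               D[k] = aux[j++];  /* B part exhausted */
        else if (j == T)          D[k] = aux[i++];  /* C part exhausted */
        else if (aux[j] < aux[i]) D[k] = aux[j++];  /* C item strictly smaller */
        else                      D[k] = aux[i++];  /* otherwise take from B */
        trace_i[k] = i;  /* state after the write, i.e. when k becomes k+1 */
        trace_j[k] = j;
    }
}
```

With B = 10 35 75 80 90 and C = 17 30 40 60 70, entry k of the two trace arrays holds the values of i and j at the point where the loop counter becomes k+1. For example, `trace_i[6]` is 2 and `trace_j[6]` is 10, the state in which the C part of aux has just been exhausted.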

Mergesort Execution Trace
mergesort([35, 30, 17, 10, 60]) → [10, 17, 30, 35, 60]
    mergesort([35, 30]) → [30, 35]
        mergesort([35]) → [35]
        mergesort([30]) → [30]
        merge([35], [30]) → [30, 35]
    mergesort([17, 10, 60]) → [10, 17, 60]
        mergesort([17]) → [17]
        mergesort([10, 60]) → [10, 60]
            mergesort([10]) → [10]
            mergesort([60]) → [60]
            merge([10], [60]) → [10, 60]
        merge([17], [10, 60]) → [10, 17, 60]
    merge([30, 35], [10, 17, 60]) → [10, 17, 30, 35, 60]

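Mergesort Complexity
Let T(N) be the worst-case running time of mergesort for sorting N items. What is the recurrence for T(N)?
Each call makes two recursive calls on halves of size N/2 and then merges them in time linear in N, so T(N) = 2T(N/2) + N.
Let N = 2^n. Then:

$$
\begin{aligned}
T(N) &= 2\,T(2^{n-1}) + 2^n \\
     &= 2^2\,T(2^{n-2}) + 2 \cdot 2^n \\
     &= 2^3\,T(2^{n-3}) + 3 \cdot 2^n \\
     &= 2^4\,T(2^{n-4}) + 4 \cdot 2^n \\
     &\;\;\vdots \\
     &= 2^n\,T(2^{n-n}) + n \cdot 2^n \\
     &= N \cdot \text{constant} + N \lg N \\
     &= \Theta(N \lg N).
\end{aligned}
$$

In the last steps, T(2^{n-n}) = T(1) is a constant, 2^n = N, and n = lg N.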
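The expansion can be checked numerically. The sketch below evaluates the recurrence directly and compares it with the closed form, under one assumption that the slides leave open: the base case is taken to be T(1) = 1, so the closed form becomes N + N lg N. Both function names are hypothetical.

```c
#include <assert.h>

/* Evaluates T(N) = 2*T(N/2) + N directly, with the assumed base case
   T(1) = 1. N must be a power of 2 so that N/2 is exact at every level. */
unsigned long recurrence_T(unsigned long N) {
    assert(N >= 1 && (N & (N - 1)) == 0);
    if (N == 1) return 1;
    return 2 * recurrence_T(N / 2) + N;
}

/* Closed form from the expansion: 2^n * T(1) + n * 2^n, which is
   N + N lg N when T(1) = 1. N must be a power of 2. */
unsigned long closed_form_T(unsigned long N) {
    unsigned long lg = 0;
    for (unsigned long m = N; m > 1; m /= 2) lg++;  /* lg = log2(N) */
    return N + N * lg;
}
```

For N = 8, both functions return 32: the recurrence gives T(2) = 4, T(4) = 12, T(8) = 32, and the closed form gives 8 + 8 * 3 = 32.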

Mergesort Variations
- Mergesort with insertion sort when N <= cut-off items.
- Alternate between array and aux.
  - Will copy data only once (not twice) per recursive call.
- No need for linear extra space.
  - More complicated, somewhat slower.
- Bottom-up mergesort, no recursive calls.
- Mergesort using lists and not arrays.
  - Both top-down and bottom-up implementations.
- Bitonic sequence (first increasing, then decreasing).
  - Can eliminate the index boundary check for the subarrays.
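As an illustration of one of these variations, here is a minimal sketch of bottom-up mergesort. It is one possible formulation, not code from the slides: `Item` is assumed to be `int`, the names `merge_runs` and `merge_sort_bottom_up` are hypothetical, and the merge works on index ranges of a single array rather than on separate pointers. Instead of recursing, it merges runs of width 1, then 2, then 4, and so on.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int Item;  /* assumption for this sketch: items are ints */

/* Merge sorted runs A[lo..mid) and A[mid..hi) back into A, using aux
   as scratch space. Ties take the item from the left run. */
static void merge_runs(Item *A, Item *aux, int lo, int mid, int hi) {
    memcpy(aux + lo, A + lo, sizeof(Item) * (size_t)(hi - lo));
    int i = lo, j = mid;
    for (int k = lo; k < hi; k++) {
        if (i == mid)             A[k] = aux[j++];  /* left run exhausted */
        else if (j == hi)         A[k] = aux[i++];  /* right run exhausted */
        else if (aux[j] < aux[i]) A[k] = aux[j++];  /* right strictly smaller */
        else                      A[k] = aux[i++];  /* otherwise take left */
    }
}

/* Bottom-up mergesort: no recursive calls. Returns 0 on success, -1 if
   the scratch array cannot be allocated. Assumes length is small enough
   that 2 * width does not overflow an int. */
int merge_sort_bottom_up(Item *A, int length) {
    assert(length >= 0);
    if (length <= 1) return 0;
    Item *aux = malloc(sizeof(Item) * (size_t)length);
    if (aux == NULL) return -1;
    for (int width = 1; width < length; width *= 2) {
        /* Merge each pair of adjacent runs; a trailing run with no
           partner is left as it is for the next pass. */
        for (int lo = 0; lo + width < length; lo += 2 * width) {
            int mid = lo + width;
            int hi = (mid + width < length) ? mid + width : length;
            merge_runs(A, aux, lo, mid, hi);
        }
    }
    free(aux);
    return 0;
}
```

On the array 35, 30, 17, 10, 60 the passes produce 30 35 10 17 60 (width 1), then 10 17 30 35 60 (width 2), and the width 4 pass merges the last item into place without changing the order.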