Mergesort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos

Slides:

Advertisements

Similar presentations

Mergesort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.

Advertisements

Introduction to Algorithms Quicksort

Efficient Sorts. Divide and Conquer Divide and Conquer : chop a problem into smaller problems, solve those – Ex: binary search.

Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 5.

1 Sorting Problem: Given a sequence of elements, find a permutation such that the resulting sequence is sorted in some order. We have already seen: –Insertion.

Mergesort, Analysis of Algorithms Jon von Neumann and ENIAC (1945)

Insertion sort, Merge sort COMP171 Fall Sorting I / Slide 2 Insertion sort 1) Initially p = 1 2) Let the first p elements be sorted. 3) Insert the.

Sorting - Merge Sort Cmput Lecture 12 Department of Computing Science University of Alberta ©Duane Szafron 2000 Some code in this lecture is based.

Merge sort, Insertion sort

Merge sort, Insertion sort. Sorting I / Slide 2 Sorting * Selection sort or bubble sort 1. Find the minimum value in the list 2. Swap it with the value.

Design and Analysis of Algorithms – Chapter 51 Divide and Conquer (I) Dr. Ying Lu RAIK 283: Data Structures & Algorithms.

Sorting Chapter 6 Chapter 6 –Insertion Sort 6.1 –Quicksort 6.2 Chapter 5 Chapter 5 –Mergesort 5.2 –Stable Sorts Divide & Conquer.

1 Time Analysis Analyzing an algorithm = estimating the resources it requires. Time How long will it take to execute? Impossible to find exact value Depends.

Merge Sort. What Is Sorting? To arrange a collection of items in some specified order. Numerical order Lexicographical order Input: sequence of numbers.

Quicksort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.

Review 1 Selection Sort Selection Sort Algorithm Time Complexity Best case Average case Worst case Examples.

Divide and Conquer Sorting Algorithms CS /02/05 HeapSort Slide 2 Copyright 2005, by the authors of these slides, and Ateneo de Manila University.

QUICKSORT 2015-T2 Lecture 16 School of Engineering and Computer Science, Victoria University of Wellington COMP 103 Marcus Frean.

Priority Queues, Heaps, and Heapsort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.

Divide and Conquer Sorting Algorithms COMP s1 Sedgewick Chapters 7 and 8.

PREVIOUS SORTING ALGORITHMS  BUBBLE SORT –Time Complexity: O(n 2 ) For each item, make (n –1) comparisons Gives: Comparisons = (n –1) + (n – 2)

Lecture # 6 1 Advance Analysis of Algorithms. Divide-and-Conquer Divide the problem into a number of subproblems Similar sub-problems of smaller size.

Nothing is particularly hard if you divide it into small jobs. Henry Ford Nothing is particularly hard if you divide it into small jobs. Henry Ford.

Today’s Material Sorting: Definitions Basic Sorting Algorithms

329 3/30/98 CSE 143 Searching and Sorting [Sections 12.4, ]

Priority Queues, Heaps, and Heapsort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.

Advanced Sorting.

Prof. U V THETE Dept. of Computer Science YMA

CMPT 438 Algorithms.

Example Algorithms CSE 2320 – Algorithms and Data Structures

Chapter 11 Sorting Acknowledgement: These slides are adapted from slides provided with Data Structures and Algorithms in C++, Goodrich, Tamassia and Mount.

Sorts, CompareTo Method and Strings

Fundamentals of Algorithms MCS - 2 Lecture # 11

Sorting Mr. Jacobs.

Subject Name: Design and Analysis of Algorithm Subject Code: 10CS43

Lecture No.45 Data Structures Dr. Sohail Aslam.

COMP 53 – Week Seven Big O Sorting.

Simple Sorting Algorithms

David Kauchak cs201 Spring 2014

Sorting by Tammy Bailey

Chapter 4 Divide-and-Conquer

COSC160: Data Structures Linked Lists

Algorithm Analysis CSE 2011 Winter September 2018.

MergeSort Source: Gibbs & Tamassia.

More on Merge Sort CS 244 This presentation is not given in class

Algorithm Design and Analysis (ADA)

Chapter 4: Divide and Conquer

Quicksort analysis Bubble sort

Hassan Khosravi / Geoffrey Tien

Mergesort Based on divide-and-conquer strategy

CS Two Basic Sorting Algorithms Review Exchange Sorting Merge Sorting

Lecture No 6 Advance Analysis of Institute of Southern Punjab Multan

CSE373: Data Structure & Algorithms Lecture 21: Comparison Sorting

CSE 332: Data Abstractions Sorting I

ITEC 2620M Introduction to Data Structures

CSE 373 Data Structures and Algorithms

Algorithms: Design and Analysis

Divide & Conquer Algorithms

Merge Sort (11.1) CSE 2011 Winter April 2019.

Data Structures & Algorithms

Sorting and Asymptotic Complexity

Analysis of Algorithms

Alexandra Stefan Last updated: 4/16/2019

Algorithms Sorting.

Module 8 – Searching & Sorting Algorithms

Divide and Conquer Merge sort and quick sort Binary search

Advanced Sorting Methods: Shellsort

CMPT 225 Lecture 10 – Merge Sort.

Presentation transcript:

Mergesort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington

Mergesort Overview Mergesort is a very popular sorting algorithm. The worst-case time complexity is Θ(Ν log N). Another algorithm, quicksort, has a worst-case complexity of Θ(Ν2). Thus, mergesort is preferable if we want a theoretical guarantee that the time will never exceed O(N log N). Mergesort is stable. A sorting method is called stable if it does not change the order of items whose keys are equal. Disadvantage: mergesort typically requires extra Θ(Ν) space, to copy the data.

Mergesort Implementation Top-level function: Allocates memory for a scratch array. Note that the size of the scratch array is the same as the size of the input array A. Then, we just call a recursive helper function that does all the work. void mergesort(Item * A, int length) { Item * aux = malloc(sizeof(Item) * length); msort_help(A, length, aux); free(aux); } void msort_help(Item * A, int length, Item * aux) if (length <= 1) return; int M = length/2; Item * C = &(A[M]); int P = length-M; msort_help(A, M, aux); msort_help(C, P, aux); merge(A, A, M, C, P, aux);

Mergesort Implementation Recursive helper, basic approach: Split A to two halves. Left half: Pointer to A. Length M = length(A)/2. Right half: Pointer C = &(A[M]). Length P = length(A) - M. Sort each half. Done via recursive call to itself. Merge the results (we will see how in a bit). void mergesort(Item * A, int length) { Item * aux = malloc(sizeof(Item) * length); msort_help(A, length, aux); free(aux); } void msort_help(Item * A, int length, Item * aux) if (length <= 1) return; int M = length/2; Item * C = &(A[M]); int P = length-M; msort_help(A, M, aux); msort_help(C, P, aux); merge(A, A, M, C, P, aux);

Merging Two Sorted Sets void merge(Item *D, Item *B, int M, Item *C, int P, Item * aux ) { int T = M+P; int i, j, k; for (i = 0; i < M; i++) aux[i] = B[i]; for (j = 0; j < P; j++) aux[j+M] = C[j]; for (i = 0, j = M, k = 0; k < T; k++) if (i == M) { D[k] = aux[j++]; continue; } if (j == T) { D[k] = aux[i++]; continue; } if (less(aux[i], aux[j])) D[k] = aux[i++]; else D[k] = aux[j++]; } Summary: Assumption: arrays B and C are already sorted. Merges B and C, produces sorted array D: Result array pointer D is passed as input. It must already have enough memory.

Merging Two Sorted Sets void merge(Item *D, Item *B, int M, Item *C, int P, Item * aux ) { int T = M+P; int i, j, k; for (i = 0; i < M; i++) aux[i] = B[i]; for (j = 0; j < P; j++) aux[j+M] = C[j]; for (i = 0, j = M, k = 0; k < T; k++) if (i == M) { D[k] = aux[j++]; continue; } if (j == T) { D[k] = aux[i++]; continue; } if (less(aux[i], aux[j])) D[k] = aux[i++]; else D[k] = aux[j++]; } Note that here is where we use the aux array. Why do we need it?

Merging Two Sorted Sets void merge(Item *D, Item *B, int M, Item *C, int P, Item * aux ) { int T = M+P; int i, j, k; for (i = 0; i < M; i++) aux[i] = B[i]; for (j = 0; j < P; j++) aux[j+M] = C[j]; for (i = 0, j = M, k = 0; k < T; k++) if (i == M) { D[k] = aux[j++]; continue; } if (j == T) { D[k] = aux[i++]; continue; } if (less(aux[i], aux[j])) D[k] = aux[i++]; else D[k] = aux[j++]; } Note that here is where we use the aux array. Why do we need it? Using aux allows for D to share memory with B and C. We first copy the contents of both B and C to aux. Then, as we write into D, possibly overwriting B and C, aux still has the data we need.

Merge Execution Array D: Array aux: Array B: Array C: Initial state position 1 2 3 4 5 6 7 8 9 value Array aux: position 1 2 3 4 5 6 7 8 9 value Array B: position 1 2 3 4 value 10 35 75 80 90 Initial state Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: Step 1: position 1 2 3 4 5 6 7 8 9 value Array aux: position 1 2 3 4 5 6 7 8 9 value 10 35 75 80 90 Array B: position 1 2 3 4 value 10 35 75 80 90 Step 1: Copy B to aux. Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: Main loop position k=0 1 2 3 4 5 6 7 8 9 value Array aux: position i=0 1 2 3 4 j=5 6 7 8 9 value 10 35 75 80 90 17 30 40 60 70 Array B: position 1 2 3 4 value 10 35 75 80 90 Main loop initialization Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 2 3 4 5 k=1 2 3 4 5 6 7 8 9 value 10 Array aux: position i=1 2 3 4 j=5 6 7 8 9 value 10 35 75 80 90 17 30 40 60 70 Array B: position 1 2 3 4 value 10 35 75 80 90 Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 1 3 4 5 1 k=2 3 4 5 6 7 8 9 value 10 17 Array aux: position i=1 2 3 4 5 j=6 7 8 9 value 10 35 75 80 90 17 30 40 60 70 Array B: position 1 2 3 4 value 10 35 75 80 90 Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 1 2 4 5 1 2 k=3 4 5 6 7 8 9 value 10 17 30 Array aux: position i=1 2 3 4 5 6 j=7 8 9 value 10 35 75 80 90 17 30 40 60 70 Array B: position 1 2 3 4 value 10 35 75 80 90 Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 1 2 3 5 1 2 3 k=4 5 6 7 8 9 value 10 17 30 35 Array aux: position 1 i=2 3 4 5 6 j=7 8 9 value 10 35 75 80 90 17 30 40 60 70 Array B: position 1 2 3 4 value 10 35 75 80 90 Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 1 2 3 4 1 2 3 4 k=5 6 7 8 9 value 10 17 30 35 40 Array aux: position 1 i=2 3 4 5 6 7 j=8 9 value 10 35 75 80 90 17 30 40 60 70 Array B: position 1 2 3 4 value 10 35 75 80 90 Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 1 2 3 4 1 2 3 4 5 k=6 7 8 9 value 10 17 30 35 40 60 Array aux: position 1 i=2 3 4 5 6 7 8 j=9 value 10 35 75 80 90 17 30 40 60 70 Array B: position 1 2 3 4 value 10 35 75 80 90 Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 1 2 3 4 1 2 3 4 5 6 k=7 8 9 value 10 17 30 35 40 60 70 Array aux: position 1 i=2 3 4 5 6 7 8 9 value 10 35 75 80 90 17 30 40 60 70 j=10 Array B: position 1 2 3 4 value 10 35 75 80 90 Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 1 2 3 4 1 2 3 4 5 6 7 k=8 9 value 10 17 30 35 40 60 70 75 Array aux: position 1 2 i=3 4 5 6 7 8 9 value 10 35 75 80 90 17 30 40 60 70 j=10 Array B: position 1 2 3 4 value 10 35 75 80 90 Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 1 2 3 4 1 2 3 4 5 6 7 8 k=9 value 10 17 30 35 40 60 70 75 80 Array aux: position 1 2 3 i=4 5 6 7 8 9 value 10 35 75 80 90 17 30 40 60 70 j=10 Array B: position 1 2 3 4 value 10 35 75 80 90 Array C: Position 1 2 3 4 Value 17 30 40 60 70

Merge Execution Array D: Array aux: Array B: Array C: position 1 2 3 4 5 6 7 8 9 value 10 17 30 35 40 60 70 75 80 90 k=10 Array aux: position 1 2 3 4 i=5 6 7 8 9 value 10 35 75 80 90 17 30 40 60 70 j=10 Array B: position 1 2 3 4 value 10 35 75 80 90 k=10, so we are done! Array C: Position 1 2 3 4 Value 17 30 40 60 70

Mergesort Execution Trace

Mergesort Execution Trace mergesort([35, 30, 17, 10, 60]) mergesort([35, 30])

Mergesort Execution Trace mergesort([35, 30, 17, 10, 60]) mergesort([35, 30]) [30,35] mergesort([35])  [35] mergesort([30])  [30] merge([35],[30])  [30,35]

Mergesort Execution Trace mergesort([35, 30, 17, 10, 60]) mergesort([35, 30]) [30,35] mergesort([35])  [35] mergesort([30])  [30] merge([35],[30])  [30,35] mergesort([17, 10, 60]) mergesort([17])  [17] mergesort([10, 60])

Mergesort Execution Trace mergesort([35, 30, 17, 10, 60]) mergesort([35, 30]) [30,35] mergesort([35])  [35] mergesort([30])  [30] merge([35],[30])  [30,35] mergesort([17, 10, 60]) mergesort([17])  [17] mergesort([10, 60])  [10, 60] mergesort([10])  [10] mergesort([60])  [60] merge([10],[60])  [10,60]

Mergesort Execution Trace mergesort([35, 30, 17, 10, 60]) mergesort([35, 30]) [30,35] mergesort([35])  [35] mergesort([30])  [30] merge([35],[30])  [30,35] mergesort([17, 10, 60])  [10,17,60] mergesort([17])  [17] mergesort([10, 60])  [10, 60] mergesort([10])  [10] mergesort([60])  [60] merge([10],[60])  [10,60] merge([17], [10,60])  [10,17,60]

Mergesort Execution Trace mergesort([35, 30, 17, 10, 60])  [10, 17, 30, 35, 60] mergesort([35, 30]) [30,35] mergesort([35])  [35] mergesort([30])  [30] merge([35],[30])  [30,35] mergesort([17, 10, 60])  [10,17,60] mergesort([17])  [17] mergesort([10, 60]) mergesort([10])  [10] mergesort([60])  [60] merge([10],[60])  [10,60] merge([17], [10,60])  [10,17,60] merge([30, 35], [10,17,60])  [10, 17, 30, 35, 60]

Mergesort Complexity Let T(N) be the worst-case running time complexity of mergesort for sorting N items. What is the recurrence for T(N)?

Mergesort Complexity Let T(N) be the worst-case running time complexity of mergesort for sorting N items. T(N) = 2T(N/2) + N. Let N = 2n. T(N) = 2T(2n-1) + 2n = 22T(2n-2) + 2*2n = 23T(2n-3) + 3*2n = 24T(2n-4) + 4*2n ... = 2nT(2n-n) + n*2n = N * constant + N * lg N = Θ(N lg N).

Mergesort Variations Mergesort with insertion sort when N<=cut-off items. Alternate between array and aux Will copy data only once (not twice) per recursive call. No need for linear extra space. More complicated, somewhat slower. Bottom-up mergesort, no recursive calls. Mergesort using lists and not arrays. Both top-down and bottom-up implementations Bitonic sequence (first increasing, then decreasing) Can eliminate the index boundary check for the subarrays.