Mergesort. Dale Roberts, Lecturer, Computer Science, IUPUI. Department of Computer and Information Science, School of Science, IUPUI.



Why Does It Matter?

Run time (nanoseconds) to solve a problem of size N:

                 N = 1,000     10,000       100,000      1 million     10 million
  1.3 N^3        1.3 seconds   22 minutes   15 days      41 years      41 millennia
  10 N^2         10 msec       1 second     1.7 minutes  2.8 hours     1.7 weeks
  47 N log2 N    0.5 msec      6 msec       78 msec      0.94 seconds  11 seconds
  48 N           0.048 msec    0.48 msec    4.8 msec     48 msec       0.48 seconds

Max size problem solved in one:

                 second        minute       hour         day
  1.3 N^3        920           3,600        14,000       41,000
  10 N^2         10,000        77,000       600,000      2.9 million
  47 N log2 N    1 million     49 million   2.4 billion  50 billion
  48 N           21 million    1.3 billion  76 billion   1,800 billion

Multiplying N by 10 multiplies the 1.3 N^3 run time by 1,000.

Orders of Magnitude

Speeds (imperial units) with everyday examples:

  in / decade        Continental drift
  1 ft / year        Hair growing
  3.4 in / day       Glacier
  1.2 ft / hour      Gastro-intestinal tract
  2 ft / minute      Ant
  2.2 mi / hour      Human walk
  220 mi / hour      Propeller airplane
  mi / min           Space shuttle
  620 mi / sec       Earth in galactic orbit
  62,000 mi / sec    1/3 speed of light

Seconds, and what they amount to:

  1         1 second
  10^2      1.7 minutes
  10^3      17 minutes
  10^4      2.8 hours
  10^5      1.1 days
  10^6      1.6 weeks
  10^7      3.8 months
  10^8      3.1 years
  10^9      3.1 decades
  10^10     3.1 centuries ("forever")
  ~10^17    age of the universe

Powers of 2: 2^10 is about a thousand, 2^20 about a million, 2^30 about a billion.

Impact of Better Algorithms

Example 1: N-body simulation. Simulate gravitational interactions among N bodies.
  Physicists want N = number of atoms in the universe.
  Brute-force method: N^2 steps.
  Appel (1981): N log N steps, enables new research.

Example 2: Discrete Fourier Transform (DFT). Breaks down waveforms (sound) into periodic components.
  Foundation of signal processing: CD players, JPEG, analyzing astronomical data, etc.
  Grade-school method: N^2 steps.
  Runge-König (1924), Cooley-Tukey (1965). FFT algorithm: N log N steps, enables new technology.

Mergesort

Mergesort (divide-and-conquer). Divide array into two halves.

  A L G O R I T H M S
        divide
  A L G O R   I T H M S

Mergesort

Mergesort (divide-and-conquer). Divide array into two halves. Recursively sort each half.

  A L G O R I T H M S
        divide
  A L G O R   I T H M S
        sort
  A G L O R   H I M S T

Mergesort

Mergesort (divide-and-conquer). Divide array into two halves. Recursively sort each half. Merge the two halves to make a sorted whole.

  A L G O R I T H M S
        divide
  A L G O R   I T H M S
        sort
  A G L O R   H I M S T
        merge
  A G H I L M O R S T

Merging

Merge. Keep track of the smallest remaining element in each sorted half. Insert the smaller of the two into the auxiliary array. Repeat until done.

Sorted halves:  A G L O R   H I M S T

  auxiliary array        next smallest
  (empty)                A
  A                      G
  A G                    H
  A G H                  I
  A G H I                L
  A G H I L              M
  A G H I L M            O
  A G H I L M O          R
  A G H I L M O R        S   (first half exhausted)
  A G H I L M O R S      T   (first half exhausted)
  A G H I L M O R S T        (both halves exhausted)

Implementing Mergesort

mergesort (see Sedgewick Program 8.3); uses a scratch array.

Item aux[MAXN];

void mergesort(Item a[], int left, int right)
{
    int mid = (right + left) / 2;
    if (right <= left) return;
    mergesort(a, left, mid);
    mergesort(a, mid + 1, right);
    merge(a, left, mid, right);
}

Implementing Merge (Idea 0)

void mergeAB(Item c[], Item a[], int N, Item b[], int M)
{
    int i, j, k;
    for (i = 0, j = 0, k = 0; k < N + M; k++) {
        if (i == N) { c[k] = b[j++]; continue; }
        if (j == M) { c[k] = a[i++]; continue; }
        c[k] = (less(a[i], b[j])) ? a[i++] : b[j++];
    }
}

Implementing Mergesort

merge (see Sedgewick Program 8.2): copy to a temporary array, then merge the two sorted sequences.

void merge(Item a[], int left, int mid, int right)
{
    int i, j, k;
    /* copy left half into aux[left..mid] */
    for (i = mid + 1; i > left; i--) aux[i - 1] = a[i - 1];
    /* copy right half into aux[mid+1..right] in reverse order,
       so the two halves face each other and no exhaustion test is needed */
    for (j = mid; j < right; j++) aux[right + mid - j] = a[j + 1];
    /* merge: i moves up from left, j moves down from right */
    for (k = left; k <= right; k++)
        if (ITEMless(aux[i], aux[j])) a[k] = aux[i++];
        else                          a[k] = aux[j--];
}

Mergesort Demo

The demo is a dynamic representation of the algorithm in action, sorting an array a containing a permutation of the integers 1 through N. For each i, the array element a[i] is depicted as a black dot plotted at position (i, a[i]). Thus, the end result of each sort is a diagonal of black dots going from (1, 1) at the bottom left to (N, N) at the top right. Each time an element is moved, a green dot is left at its old position: the moving black dots give a dynamic representation of the progress of the sort, and the green dots give a history of the data-movement cost. The auxiliary array used in the merging operation is shown to the right of the array a[], going from (N+1, 1) to (2N, 2N).

Computational Complexity

Framework to study the efficiency of algorithms. Example: sorting.

  Machine model: count fundamental operations (here, the number of comparisons).
  Upper bound: an algorithm that solves the problem (worst case). N log2 N, from mergesort.
  Lower bound: a proof that no algorithm can do better. N log2 N - N log2 e.
  Optimal algorithm: lower bound ~ upper bound. Mergesort.

Decision Tree (sorting three elements a1, a2, a3)

a1 < a2?
  YES: a2 < a3?
    YES: print a1, a2, a3
    NO:  a1 < a3?
      YES: print a1, a3, a2
      NO:  print a3, a1, a2
  NO:  a1 < a3?
    YES: print a2, a1, a3
    NO:  a2 < a3?
      YES: print a2, a3, a1
      NO:  print a3, a2, a1

Comparison-Based Sorting Lower Bound

Theorem. Any comparison-based sorting algorithm must use Omega(N log2 N) comparisons.

Proof. The worst case is dictated by the height h of the decision tree. There are N! different orderings of the input, with one (or more) leaves corresponding to each ordering, so a binary tree with N! leaves must have height

  h >= log2(N!) >= log2((N/e)^N) = N log2 N - N log2 e

by Stirling's formula.

Food for thought: what if we don't use comparisons? Stay tuned for radix sort.

Mergesort Analysis

How long does mergesort take? The bottleneck is merging (and copying): merging two files of size N/2 requires N comparisons. Let T(N) be the number of comparisons needed to mergesort N elements; to make the analysis cleaner, assume N is a power of 2.

Claim. T(N) = N log2 N.

Note: mergesort makes the same number of comparisons for ANY file, even one that is already sorted. We'll prove the claim several different ways to illustrate standard proof techniques.

Profiling Mergesort Empirically

void merge(Item a[], int left, int mid, int right)
{
    int i, j, k;
    for (i = mid + 1; i > left; i--) aux[i - 1] = a[i - 1];
    for (j = mid; j < right; j++) aux[right + mid - j] = a[j + 1];
    for (k = left; k <= right; k++)
        if (ITEMless(aux[i], aux[j])) a[k] = aux[i++];
        else a[k] = aux[j--];
}

void mergesort(Item a[], int left, int right)
{
    int mid = (right + left) / 2;
    if (right <= left) return;
    mergesort(a, left, mid);
    mergesort(a, mid + 1, right);
    merge(a, left, mid, right);
}

Striking feature of the profile (prof.out): all the counts are SMALL. Number of comparisons for N = 1,000: theory ~ N log2 N = 9,966; actual = 9,976.

Sorting Analysis Summary

Running time estimates, assuming a home PC executes 10^8 comparisons/second and a supercomputer executes 10^12 comparisons/second:

                        home PC        supercomputer
  Insertion sort (N^2)
    thousand            instant        instant
    million             2.8 hours      1 second
    billion             317 years      1.6 weeks
  Mergesort (N log N)
    thousand            instant        instant
    million             1 second       instant
    billion             18 minutes     instant
  Quicksort (N log N)
    thousand            instant        instant
    million             0.3 seconds    instant
    billion             6 minutes      instant

Lesson 1: good algorithms are better than supercomputers.
Lesson 2: great algorithms are better than good ones.

Acknowledgements

Sorting methods are discussed in our Sedgewick text. Slides and demos are from our text's website at princeton.edu. Special thanks to Kevin Wayne in helping to prepare this material.

Extra Slides

Proof by Picture of Recursion Tree

  level                  nodes                      cost per level
  0                      T(N)                       N
  1                      T(N/2), T(N/2)             2 (N/2) = N
  2                      T(N/4) x 4                 4 (N/4) = N
  ...                    ...                        ...
  k                      T(N / 2^k) x 2^k           2^k (N / 2^k) = N
  ...                    ...                        ...
  log2 N - 1             T(2) x (N/2)               (N/2) (2) = N

There are log2 N levels and each level costs N, so the total is N log2 N.

Proof by Telescoping

Claim. T(N) = N log2 N (when N is a power of 2).

Proof. For N > 1, the recurrence T(N) = 2 T(N/2) + N with T(1) = 0 gives

  T(N)/N = 2 T(N/2)/N + 1
         = T(N/2)/(N/2) + 1
         = T(N/4)/(N/4) + 1 + 1
           ...
         = T(N/N)/(N/N) + log2 N
         = log2 N

so T(N) = N log2 N.

Mathematical Induction

Mathematical induction is a powerful and general proof technique in discrete mathematics. To prove a theorem true for all integers N >= 0:

  Base case: prove it true for N = 0.
  Induction hypothesis: assume it is true for an arbitrary N.
  Induction step: show it is then true for N + 1.

Claim: 1 + 2 + ... + N = N(N+1)/2 for all N >= 0.

Proof (by mathematical induction).
  Base case (N = 0): 0 = 0(0+1)/2.
  Induction hypothesis: assume 1 + 2 + ... + N = N(N+1)/2.
  Induction step:
    1 + 2 + ... + N + (N+1) = (1 + 2 + ... + N) + (N+1)
                            = N(N+1)/2 + (N+1)
                            = (N+2)(N+1)/2.

Proof by Induction

Claim. T(N) = N log2 N (when N is a power of 2).

Proof (by induction on N).
  Base case: N = 1. T(1) = 0 = 1 * log2 1.
  Inductive hypothesis: T(N) = N log2 N.
  Goal: show that T(2N) = 2N log2 (2N).

  T(2N) = 2 T(N) + 2N
        = 2 N log2 N + 2N
        = 2N (log2 (2N) - 1) + 2N
        = 2N log2 (2N).

Proof by Induction

What if N is not a power of 2? T(N) satisfies the following recurrence:

  T(1) = 0
  T(N) = T(⌈N/2⌉) + T(⌊N/2⌋) + N   for N > 1.

Claim. T(N) <= N ⌈log2 N⌉.

Proof. See supplemental slides.

Proof by Induction

Claim. T(N) <= N ⌈log2 N⌉.

Proof (by induction on N).
  Base case: N = 1. T(1) = 0.
  Define n1 = ⌊N/2⌋ and n2 = ⌈N/2⌉, so that T(N) = T(n1) + T(n2) + N.
  Induction step: assume the claim holds for 1, 2, ..., N - 1. Using n1 <= n2
  and the fact that ⌈log2 n2⌉ <= ⌈log2 N⌉ - 1,

    T(N) = T(n1) + T(n2) + N
        <= n1 ⌈log2 n1⌉ + n2 ⌈log2 n2⌉ + N
        <= (n1 + n2) ⌈log2 n2⌉ + N
         = N ⌈log2 n2⌉ + N
        <= N (⌈log2 N⌉ - 1) + N
         = N ⌈log2 N⌉.