Mergesort, Analysis of Algorithms
John von Neumann and ENIAC (1945)

2  Why Does It Matter?

Run time (nanoseconds) as a function of problem size N:

                                   1.3 N^3         10 N^2          47 N log2 N     48 N
  Time to solve a problem of size
    1,000                          1.3 seconds     10 msec         0.5 msec        0.048 msec
    10,000                         22 minutes      1 second        6 msec          0.48 msec
    100,000                        15 days         1.7 minutes     78 msec         4.8 msec
    million                        41 years        2.8 hours       0.94 seconds    48 msec
    10 million                     41 millennia    1.7 weeks       11 seconds      0.48 seconds
  Max size problem solved in one
    second                         920             10,000          1 million       21 million
    minute                         3,600           77,000          49 million      1.3 billion
    hour                           14,000          600,000         2.4 billion     76 billion
    day                            41,000          2.9 million     50 billion      1.8 trillion
  N multiplied by 10,
  time multiplied by               1,000           100             10+             10
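
A quick sanity check on two of the entries (the arithmetic behind the table):

\[
1.3\,N^3 \text{ ns at } N = 1{,}000:\quad 1.3 \times (10^3)^3 \text{ ns} = 1.3 \times 10^9 \text{ ns} = 1.3 \text{ seconds};
\qquad
48\,N \text{ ns at } N = 10^7:\quad 48 \times 10^7 \text{ ns} \approx 0.48 \text{ seconds}.
\]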

3  Orders of Magnitude

  Seconds      Equivalent
  1            1 second
  10^2         1.7 minutes
  10^3         17 minutes
  10^4         2.8 hours
  10^5         1.1 days
  10^6         1.6 weeks
  10^7         3.8 months
  10^8         3.1 years
  10^9         3.1 decades
  10^10        3.1 centuries
  ...          forever
  (age of universe ≈ 10^17 seconds)

  Powers of 2: 2^10 ≈ thousand, 2^20 ≈ million, 2^30 ≈ billion.

  Meters per second    Imperial units      Example
  10^-10               in / decade         Continental drift
  10^-8                1 ft / year         Hair growing
  10^-6                3.4 in / day        Glacier
  10^-4                1.2 ft / hour       Gastro-intestinal tract
  10^-2                2 ft / minute       Ant
  1                    2.2 mi / hour       Human walk
  10^2                 220 mi / hour       Propeller airplane
  10^4                 mi / min            Space shuttle
  10^6                 620 mi / sec        Earth in galactic orbit
  10^8                 62,000 mi / sec     1/3 speed of light

4  Impact of Better Algorithms

Example 1: N-body simulation.
  - Simulate gravitational interactions among N bodies.
      physicists want N = # atoms in universe
  - Brute force method: N^2 steps.
  - Appel (1981): N log N steps, enables new research.

Example 2: Discrete Fourier Transform (DFT).
  - Breaks down waveforms (sound) into periodic components.
      foundation of signal processing: CD players, JPEG, analyzing astronomical data, etc.
  - Grade school method: N^2 steps.
  - Runge-König (1924), Cooley-Tukey (1965): FFT algorithm, N log N steps, enables new technology.

5  Mergesort

Mergesort (divide-and-conquer)
  - Divide array into two halves.

      input    A L G O R I T H M S
      divide   A L G O R   I T H M S

6  Mergesort

Mergesort (divide-and-conquer)
  - Divide array into two halves.
  - Recursively sort each half.

      input    A L G O R I T H M S
      divide   A L G O R   I T H M S
      sort     A G L O R   H I M S T

7  Mergesort

Mergesort (divide-and-conquer)
  - Divide array into two halves.
  - Recursively sort each half.
  - Merge two halves to make sorted whole.

      input    A L G O R I T H M S
      divide   A L G O R   I T H M S
      sort     A G L O R   H I M S T
      merge    A G H I L M O R S T

8  Mergesort Analysis

How long does mergesort take?
  - Bottleneck = merging (and copying).
      merging two files of size N/2 requires N comparisons
  - T(N) = comparisons to mergesort N elements.
      to make the analysis cleaner, assume N is a power of 2

Claim. T(N) = N log2 N.
  - Note: same number of comparisons for ANY file, even one that is already sorted.
  - We'll prove it several different ways to illustrate standard techniques.
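
The slides never write the recurrence out explicitly; in recurrence form, the claim the proofs below address (assuming, as above, that merging the two sorted halves costs exactly N comparisons and that a one-element file needs none) is

\[
T(N) =
\begin{cases}
0 & \text{if } N = 1,\\[2pt]
2\,T(N/2) + N & \text{if } N > 1 \text{ ($N$ a power of 2).}
\end{cases}
\]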

9  Proof by Picture of Recursion Tree

  depth 0:      T(N)                  1 subproblem of size N            cost N
  depth 1:      T(N/2)  T(N/2)        2 subproblems of size N/2         cost 2 (N/2) = N
  depth 2:      T(N/4)  ...           4 subproblems of size N/4         cost 4 (N/4) = N
  ...
  depth k:      T(N/2^k)  ...         2^k subproblems of size N/2^k     cost 2^k (N/2^k) = N
  ...
  last level:   T(2)  ...             N/2 subproblems of size 2         cost (N/2) (2) = N

  log2 N levels, N per level  =>  total = N log2 N.

10  Proof by Telescoping

Claim. T(N) = N log2 N (when N is a power of 2).
Proof. For N > 1 (derivation below):
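
Written out, the telescoping argument (assuming T(1) = 0 and T(N) = 2 T(N/2) + N, as above) runs

\[
\frac{T(N)}{N}
= \frac{2\,T(N/2) + N}{N}
= \frac{T(N/2)}{N/2} + 1
= \frac{T(N/4)}{N/4} + 1 + 1
= \cdots
= \frac{T(N/N)}{N/N} + \log_2 N
= \log_2 N,
\]

so T(N) = N log2 N.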

11  Mathematical Induction

Mathematical induction.
  - Powerful and general proof technique in discrete mathematics.
  - To prove a theorem true for all integers N ≥ 0:
      Base case: prove it is true for N = 0.
      Induction hypothesis: assume it is true for an arbitrary N.
      Induction step: show it is then true for N + 1.

Claim: 0 + 1 + 2 + ... + N = N(N+1)/2 for all N ≥ 0.
Proof (by mathematical induction):
  - Base case (N = 0): 0 = 0(0+1)/2.
  - Induction hypothesis: assume 0 + 1 + 2 + ... + N = N(N+1)/2.
  - Induction step:
      0 + 1 + 2 + ... + N + (N+1) = (0 + 1 + 2 + ... + N) + (N+1)
                                  = N(N+1)/2 + (N+1)
                                  = (N+2)(N+1)/2.

12  Proof by Induction

Claim. T(N) = N log2 N (when N is a power of 2).
Proof. (by induction on N)
  - Base case: N = 1 (T(1) = 0 = 1 · log2 1).
  - Inductive hypothesis: T(N) = N log2 N.
  - Goal: show that T(2N) = 2N log2 (2N).
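
The inductive step, written out (using T(2N) = 2 T(N) + 2N and the hypothesis):

\[
T(2N) = 2\,T(N) + 2N
      = 2N \log_2 N + 2N
      = 2N\,\bigl(\log_2 (2N) - 1\bigr) + 2N
      = 2N \log_2 (2N).
\]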

13  Proof by Induction

What if N is not a power of 2?
  - T(N) satisfies the recurrence given below.

Claim. T(N) ≤ N ⌈log2 N⌉.
Proof. See supplemental slides.
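
In its usual form for mergesort on an arbitrary N (split into a ⌈N/2⌉ half and a ⌊N/2⌋ half), the recurrence is

\[
T(N) =
\begin{cases}
0 & \text{if } N = 1,\\[2pt]
T\!\bigl(\lceil N/2 \rceil\bigr) + T\!\bigl(\lfloor N/2 \rfloor\bigr) + N & \text{if } N > 1.
\end{cases}
\]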

14  Computational Complexity

Framework to study the efficiency of algorithms. Example: sorting.
  - MACHINE MODEL = count fundamental operations.
      here, count the number of comparisons
  - UPPER BOUND = algorithm that solves the problem (worst case).
      N log2 N, from mergesort
  - LOWER BOUND = proof that no algorithm can do better.
      N log2 N - N log2 e
  - OPTIMAL ALGORITHM: lower bound ~ upper bound.
      mergesort

15  Decision Tree

Decision tree for sorting three elements a1, a2, a3 (internal nodes are comparisons, leaves print an ordering):

  a1 < a2?
    YES: a2 < a3?
           YES: print a1, a2, a3
           NO:  a1 < a3?
                  YES: print a1, a3, a2
                  NO:  print a3, a1, a2
    NO:  a2 < a3?
           YES: a1 < a3?
                  YES: print a2, a1, a3
                  NO:  print a2, a3, a1
           NO:  print a3, a2, a1
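
The same tree as straight-line code, as a concrete illustration (a sketch that is not on the original slides; the function name sort3 is made up here). Each path from the root to a print performs two or three comparisons:

  #include <stdio.h>

  /* Print a1, a2, a3 in sorted order by following the decision tree above. */
  static void sort3(int a1, int a2, int a3)
  {
      if (a1 < a2) {
          if (a2 < a3)       printf("%d %d %d\n", a1, a2, a3);
          else if (a1 < a3)  printf("%d %d %d\n", a1, a3, a2);
          else               printf("%d %d %d\n", a3, a1, a2);
      } else {
          if (a2 < a3) {
              if (a1 < a3)   printf("%d %d %d\n", a2, a1, a3);
              else           printf("%d %d %d\n", a2, a3, a1);
          } else             printf("%d %d %d\n", a3, a2, a1);
      }
  }

  int main(void)
  {
      sort3(3, 1, 2);   /* prints: 1 2 3 */
      return 0;
  }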

16  Comparison-Based Sorting Lower Bound

Theorem. Any comparison-based sorting algorithm must use Ω(N log2 N) comparisons.
Proof. Worst case dictated by tree height h.
  - N! different orderings.
  - One (or more) leaves corresponding to each ordering.
  - A binary tree with N! leaves must have height h ≥ log2(N!) ≥ N log2 N - N log2 e  (by Stirling's formula).

Food for thought. What if we don't use comparisons? Stay tuned for radix sort.
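
Spelling out the step the slide credits to Stirling's formula (which gives N! ≥ (N/e)^N):

\[
2^h \;\ge\; \#\text{leaves} \;\ge\; N!
\quad\Longrightarrow\quad
h \;\ge\; \log_2 (N!) \;\ge\; \log_2 \left(\tfrac{N}{e}\right)^{N} = N \log_2 N - N \log_2 e.
\]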

Extra Slides

18  Proof by Induction

Claim. T(N) ≤ N ⌈log2 N⌉.
Proof. (by induction on N)
  - Base case: N = 1.
  - Define n1 = ⌊N / 2⌋ and n2 = ⌈N / 2⌉.
  - Induction step: assume true for 1, 2, ..., N - 1.
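
A sketch of how the induction step goes through with these definitions (assuming the recurrence for general N given on slide 13):

\[
T(N) \le T(n_1) + T(n_2) + N
     \le n_1 \lceil \log_2 n_1 \rceil + n_2 \lceil \log_2 n_2 \rceil + N
     \le (n_1 + n_2)\bigl(\lceil \log_2 N \rceil - 1\bigr) + N
     = N \lceil \log_2 N \rceil,
\]

where the middle step uses \( \lceil \log_2 n_2 \rceil \le \lceil \log_2 N \rceil - 1 \), which holds because \( n_2 = \lceil N/2 \rceil \le 2^{\lceil \log_2 N \rceil}/2 \).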

19  Implementing Mergesort

mergesort (see Sedgewick Program 8.3):

  Item aux[MAXN];                 /* scratch array shared with merge() */

  void mergesort(Item a[], int left, int right)
  {
      int mid = (right + left) / 2;
      if (right <= left) return;  /* 0 or 1 element: already sorted */
      mergesort(a, left, mid);
      mergesort(a, mid + 1, right);
      merge(a, left, mid, right);
  }

20  Implementing Mergesort

merge (see Sedgewick Program 8.2):

  void merge(Item a[], int left, int mid, int right)
  {
      int i, j, k;

      /* copy to temporary array: first half in order, second half reversed,
         so the two sequences face each other and no sentinels are needed */
      for (i = mid+1; i > left; i--) aux[i-1] = a[i-1];
      for (j = mid; j < right; j++) aux[right+mid-j] = a[j+1];

      /* merge the two sorted sequences back into a[left..right]:
         i moves right from the left end, j moves left from the right end */
      for (k = left; k <= right; k++)
          if (ITEMless(aux[i], aux[j]))
               a[k] = aux[i++];
          else a[k] = aux[j--];
  }

21  Profiling Mergesort Empirically

The slide repeats the mergesort and merge code above, annotated with per-line execution counts from a profiling run (prof.out).

Striking feature: all the numbers are SMALL!
  # comparisons: theory ~ N log2 N = 9,966; actual = 9,976 (for N = 1,000).
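
The count can be reproduced with an instrumented, self-contained version of the code above (a sketch: Item, MAXN, and ITEMless are defined locally here as plain int machinery in the spirit of Sedgewick's conventions, and every comparison bumps a counter):

  #include <stdio.h>

  #define MAXN 100000
  #define N    1000
  typedef int Item;

  static long cmps = 0;                               /* comparisons performed */
  static int ITEMless(Item a, Item b) { cmps++; return a < b; }

  static Item aux[MAXN];                              /* scratch array */

  static void merge(Item a[], int left, int mid, int right)
  {
      int i, j, k;
      for (i = mid+1; i > left; i--) aux[i-1] = a[i-1];
      for (j = mid; j < right; j++) aux[right+mid-j] = a[j+1];
      for (k = left; k <= right; k++)
          if (ITEMless(aux[i], aux[j])) a[k] = aux[i++];
          else                          a[k] = aux[j--];
  }

  static void mergesort(Item a[], int left, int right)
  {
      int mid = (right + left) / 2;
      if (right <= left) return;
      mergesort(a, left, mid);
      mergesort(a, mid + 1, right);
      merge(a, left, mid, right);
  }

  int main(void)
  {
      Item a[N];
      unsigned x = 12345;
      for (int i = 0; i < N; i++) {                   /* fill with pseudo-random values */
          x = x * 1103515245u + 12345u;
          a[i] = (Item)(x >> 16);
      }
      mergesort(a, 0, N - 1);
      printf("comparisons = %ld\n", cmps);            /* prints 9976; theory ~ N log2 N = 9,966 */
      return 0;
  }

Because this merge always performs exactly right - left + 1 comparisons, the count is 9,976 for any input of size 1,000, matching the earlier note that mergesort makes the same number of comparisons for every file.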

22  Sorting Analysis Summary

Running time estimates:
  - Home PC executes 10^8 comparisons/second.
  - Supercomputer executes 10^12 comparisons/second.

                Insertion Sort (N^2)      Mergesort (N log N)     Quicksort (N log N)
  computer      home         super        home        super       home       super
  thousand      instant      instant      instant     instant     instant    instant
  million       2.8 hours    1 second     1 sec       instant     0.3 sec    instant
  billion       317 years    1.6 weeks    18 min      instant     6 min      instant

Lesson 1: good algorithms are better than supercomputers.
Lesson 2: great algorithms are better than good ones.
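
Where an entry comes from, as a quick check (insertion sort on a billion items, on each machine):

\[
\frac{(10^9)^2 \text{ comparisons}}{10^{8} \text{ comparisons/second}} = 10^{10} \text{ seconds} \approx 317 \text{ years},
\qquad
\frac{(10^9)^2}{10^{12}} = 10^{6} \text{ seconds} \approx 1.6 \text{ weeks}.
\]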