Analysis of Algorithms

Analyzing Algorithms
We need methods and metrics to analyze algorithms for:
- Correctness: methods for proving correctness
- Efficiency: time complexity, asymptotic analysis

Lecture Outline
- Short review of asymptotic analysis
  - Asymptotic notations: upper bound, lower bound, tight bound
- Running time estimation in complex cases
  - Summations
  - Recurrences: the substitution method, the recursion tree method, the Master Theorem

Review - Asymptotic Analysis
- The running time depends on the size of the input. T(n) is the time taken on an input of size n.
- What really interests us is the rate of growth, or order of growth, of the running time: we look at the growth of T(n) as n→∞.
- Worst-case and average-case running times are difficult to compute precisely, so we calculate upper and lower bounds of the function.

Review - Asymptotic Notations
- O (Big-Oh): asymptotic upper bound
- Ω (Big-Omega): asymptotic lower bound
- Θ (Theta): asymptotically tight bound
[CLRS], chap 3

Big O
- O(g(n)) is the set of all functions with a smaller or the same order of growth as g(n), within a constant multiple.
- f(n) ∈ O(g(n)), read "f(n) is in O(g(n))", means that g(n) is an asymptotic upper bound of f(n).
- Intuitively, it is like f(n) ≤ g(n).
- We write f(n) = O(g(n)).
[CLRS], Fig. 3.1

Examples
For g(n) = n^2, the functions f(n) in O(n^2) are n^2 itself and every function with a smaller order of growth.

Big Ω
- Ω(g(n)) is the set of all functions with a larger or the same order of growth as g(n), within a constant multiple.
- f(n) ∈ Ω(g(n)) means that g(n) is an asymptotic lower bound of f(n).
- Intuitively, it is like g(n) ≤ f(n).
[CLRS], Fig. 3.1

Examples
For g(n) = n^2, the functions f(n) in Ω(n^2) are n^2 itself and every function with a larger order of growth.

Theta (Θ)
- Informally, Θ(g(n)) is the set of all functions with the same order of growth as g(n), within a constant multiple.
- f(n) ∈ Θ(g(n)) means that g(n) is an asymptotically tight bound of f(n).
- Intuitively, it is like f(n) = g(n).
[CLRS], Fig. 3.1

Examples
The functions f(n) in Θ(n^2) are those with exactly the same order of growth as n^2.
f(n) is in Θ(g(n)) if f(n) is both in O(g(n)) and in Ω(g(n)).
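The three definitions above can be spot-checked numerically. The sketch below is only an illustration: the functions f and h, the constants c and the thresholds n0 are hypothetical choices, and testing a finite range of n is a sanity check, not a proof.

```python
def bounded_above(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every n in [n0, n_max] (finite spot check only)."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


def bounded_below(f, g, c, n0, n_max=10_000):
    """Check f(n) >= c * g(n) for every n in [n0, n_max] (finite spot check only)."""
    return all(f(n) >= c * g(n) for n in range(n0, n_max + 1))


def g(n):
    return n * n                    # the reference function g(n) = n^2


def f(n):
    return 3 * n * n + 5 * n + 2    # hypothetical example, same order of growth as n^2


def h(n):
    return 100 * n                  # hypothetical example, smaller order of growth


# f is bounded above by 4*g from n0 = 6 on and below by 3*g from n0 = 1 on,
# which is what membership in both O(n^2) and Omega(n^2) asks for.
f_upper_ok = bounded_above(f, g, c=4, n0=6)
f_lower_ok = bounded_below(f, g, c=3, n0=1)

# h stays below g from n0 = 100 on, but falls below 0.1*g once n exceeds 1000.
h_upper_ok = bounded_above(h, g, c=1, n0=100)
h_lower_ok = bounded_below(h, g, c=0.1, n0=1)
```

Here f passes both checks, as expected for a function in Θ(n^2), while h passes only the upper-bound check for the constants tried.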

Running Time Estimation
- In practice, estimating the running time T(n) means finding a function f(n) such that T(n) is in O(f(n)) or T(n) is in Θ(f(n)).
- If we prove that T(n) is in O(f(n)), we only guarantee that T(n) "is not worse than f(n)". Beware of overapproximations: an upper bound that is too generous says little about the actual running time.
- If we prove that T(n) is in Θ(f(n)), we actually determine the order of growth of the running time.

Running Time Estimation
- Simplifying assumption: each statement takes the same unit amount of time.
- Usually we can assume that the statements within a loop are executed as many times as the maximum permitted by the loop control.
- Running time estimation can be difficult in more complex situations:
  - Summations (a loop is executed many times, each time with a different cost)
  - Recurrences (recursive algorithms)

Summations - Example
  For i=1 to n do              (n iterations)
    For j=1 to i do            (i <= n iterations)
      For k=1 to i do          (i <= n iterations)
        something_simple       (O(1))
Bounding each loop by n iterations gives O(n^3). But what about Θ?
The function something_simple is executed exactly S(n) times:
S(n) = Σ_{i=1..n} i^2 = n(n+1)(2n+1)/6
so S(n) is in Θ(n^3).
See also: [CLRS], Appendix A, Summation formulas
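The count S(n) can be checked by running the triple loop and comparing it with the closed form. This is a minimal sketch; the function names are illustrative, and incrementing a counter stands in for something_simple.

```python
def count_something_simple(n):
    """Run the triple loop and count how many times the innermost statement executes."""
    count = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            for k in range(1, i + 1):
                count += 1  # stands in for something_simple
    return count


def sum_of_squares(n):
    """Closed form n(n+1)(2n+1)/6 for the sum of i^2, i = 1..n."""
    return n * (n + 1) * (2 * n + 1) // 6


# The loop count and the closed form agree for every n tried.
counts_match = all(count_something_simple(n) == sum_of_squares(n) for n in range(0, 40))
```

Since n(n+1)(2n+1)/6 lies between n^3/3 and n^3 for n >= 1, the count is bounded above and below by constant multiples of n^3, which is the Θ(n^3) claim.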

Recurrences - Example
  MERGE-SORT(A[p..r])
    if p < r
      q = floor((p+r)/2)
      MERGE-SORT(A[p..q])
      MERGE-SORT(A[q+1..r])
      MERGE(A[p..r], q)
To sort an array of n numbers, call MERGE-SORT(A[1..n]).
For recursive algorithms, we get a recurrence relation for the running time function T(n):
  T(n) = Θ(1)               if n = 1
  T(n) = 2*T(n/2) + Θ(n)    if n > 1
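A runnable version of the pseudocode above might look as follows. It is a sketch under stated assumptions: the slide does not show MERGE, so the merge helper here is one common way to write it, and the indices are 0-based and inclusive rather than the 1-based A[1..n] of the slide.

```python
def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive bounds) in place."""
    left = a[p:q + 1]
    right = a[q + 1:r + 1]
    i = j = 0
    k = p
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
        k += 1
    # Exactly one of the two runs still has elements; copy them over.
    a[k:r + 1] = left[i:] + right[j:]


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place; with the defaults, sort the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = (p + r) // 2          # floor of the midpoint
        merge_sort(a, p, q)       # T(n/2)
        merge_sort(a, q + 1, r)   # T(n/2)
        merge(a, p, q, r)         # Θ(n) work


data = [7, 2, 9, 4, 3, 8, 6, 1]
merge_sort(data)
```

The two recursive calls and the linear-time merge correspond to the 2*T(n/2) and Θ(n) terms of the recurrence.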

Solving Recurrences
The recurrence has to be solved in order to find T(n) as a function of n.
General methods for solving recurrences:
- Substitution method
- Recursion-tree method

The Substitution Method
The substitution method for solving recurrences: do a few substitution steps in the recurrence relation until you can guess the solution (the formula), then prove it by mathematical induction.
Example: applying the substitution method to the MergeSort recurrence.

Substitution method: T(n) = 2*T(n/2) + n
The recurrence also gives T(n/2) = 2*T(n/2^2) + n/2. Substituting T(n/2) in the first relation, we obtain:
  T(n) = 2^2*T(n/2^2) + 2*n/2 + n = 2^2*T(n/2^2) + 2*n
  T(n/2^2) = 2*T(n/2^3) + n/2^2
  T(n) = 2^3*T(n/2^3) + 2^2*n/2^2 + 2*n = 2^3*T(n/2^3) + 3*n
  ...
  T(n) = 2^k*T(n/2^k) + k*n                  (we assume)
  T(n/2^k) = 2*T(n/2^(k+1)) + n/2^k          (we know by the recurrence formula)
Substituting T(n/2^k) in the assumed relation, we obtain:
  T(n) = 2^(k+1)*T(n/2^(k+1)) + (k+1)*n
which proves the assumption T(n) = 2^k*T(n/2^k) + k*n by induction.
How many steps k = x are needed to eliminate the recurrence? Until n/2^x = 1, that is x = log2 n. Then:
  T(n) = n*T(1) + n*log2 n, which is in Θ(n * log2 n)
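The closed form obtained above can be compared with a direct evaluation of the recurrence. The sketch assumes T(1) = 1 and restricts n to powers of two so that n/2 is always exact; both are illustrative choices, and the function names are hypothetical.

```python
def merge_sort_recurrence(n):
    """Evaluate T(n) = 2*T(n/2) + n directly, with T(1) = 1 (assumed) and n a power of two."""
    if n == 1:
        return 1
    return 2 * merge_sort_recurrence(n // 2) + n


def merge_sort_closed_form(n):
    """Closed form n*T(1) + n*log2(n) for n a power of two, with T(1) = 1."""
    log2_n = n.bit_length() - 1   # exact log2 for powers of two
    return n * 1 + n * log2_n


# Compare the two on n = 1, 2, 4, ..., 2^19.
closed_form_matches = all(
    merge_sort_recurrence(2 ** k) == merge_sort_closed_form(2 ** k) for k in range(20)
)
```

For example, T(8) = 2*T(4) + 8 = 2*12 + 8 = 32, and the closed form gives 8*1 + 8*3 = 32.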

The Recursion Tree Method
The recursion tree method for solving recurrences:
- It converts the recurrence into a tree of the recursive function calls.
- Each node has a cost, representing the workload of the corresponding call (without the recursive calls made from it).
- The total workload is obtained by adding the workloads of all nodes.
- It uses techniques for bounding summations.
Example: applying the recursion tree method to the MergeSort recurrence.

Recursion tree: T(n) = 2*T(n/2) + n
The recursion tree has to be expanded until it reaches its leaves.
To compute the total running time we have to sum up the costs of all the nodes:
- Find out how many levels there are
- Find out the workload on each level

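Recursion tree: T(n) = 2*T(n/2) + n
  Level 0:  n                               (level total: n)
  Level 1:  n/2  n/2                        (level total: n)
  Level 2:  n/4  n/4  n/4  n/4              (level total: n)
  ...
  Leaves:   T(1) T(1) ... T(1)  (n leaves)  (level total: n*T(1))
The tree has height log2 n and each level has a workload of n, so T(n) is in Θ(n * log2 n).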
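The per-level bookkeeping of the recursion tree can be reproduced with a short loop. This is a sketch assuming n is a power of two and T(1) = 1; the function name is illustrative.

```python
def level_workloads(n):
    """Total workload on each level of the recursion tree of T(n) = 2*T(n/2) + n.

    Assumes n is a power of two and that each leaf costs T(1) = 1.
    """
    workloads = []
    nodes, size = 1, n
    while size > 1:
        workloads.append(nodes * size)  # `nodes` calls, each doing `size` work
        nodes *= 2                      # every call spawns two sub-calls
        size //= 2                      # each on half the input
    workloads.append(nodes * 1)         # the leaves: `nodes` calls costing T(1) = 1
    return workloads


levels_for_8 = level_workloads(8)
total_for_8 = sum(levels_for_8)
```

For n = 8 this yields four levels (three internal levels plus the leaves), each with workload 8, for a total of 32, matching n*T(1) + n*log2 n.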

Solving Recurrences
- A particular case is the family of recurrences of the form T(n)=aT(n/b)+f(n), with f(n)=c*n^k.
- This form of recurrence appears frequently in divide-and-conquer algorithms.
- The Master Theorem provides bounds for recurrences of this particular form:
  - 3 cases, according to the values of a, b and k
  - it can be proved by substitution or by recursion tree
[CLRS], chap 4

T(n)=aT(n/b)+f(n) Recursion tree

T(n)=aT(n/b)+f(n) Recursion tree
- What is the height of the tree?
- How many nodes are there on each level?
- How many leaves are there?
- What is the workload on each non-leaf level?
- What is the workload on the leaf level?

T(n)=aT(n/b)+f(n) [CLRS] Fig 4.4

T(n)=aT(n/b)+f(n)
Intuitively, T(n) grows slowly if:
- b is large
- a is small
- f(n) = O(n^k) with k small
[CLRS], Fig 4.4

T(n)=aT(n/b)+f(n)
From the recursion tree, T(n) is the sum of two parts:
- the workload per level i, summed over all non-leaf levels
- the workload in the leaves

T(n)=aT(n/b)+f(n), f(n)=c*n^k
The sum of the workloads over the levels is a geometric series with ratio a/b^k.
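The formulas on these slides did not survive in the transcript. The following is a hedged reconstruction of the standard recursion-tree sum, assuming n is a power of b and T(1) = Θ(1): level i of the tree has a^i nodes, each working on an input of size n/b^i, and there are a^(log_b n) = n^(log_b a) leaves.

```latex
\begin{align*}
T(n) &= \sum_{i=0}^{\log_b n - 1} a^i \, f\!\left(\frac{n}{b^i}\right)
        + \Theta\!\left(n^{\log_b a}\right) \\
     &= \sum_{i=0}^{\log_b n - 1} a^i \, c \left(\frac{n}{b^i}\right)^{k}
        + \Theta\!\left(n^{\log_b a}\right) \\
     &= c\, n^k \sum_{i=0}^{\log_b n - 1} \left(\frac{a}{b^k}\right)^{i}
        + \Theta\!\left(n^{\log_b a}\right)
\end{align*}
```

Whether the ratio a/b^k is smaller than, equal to, or larger than 1 decides which part of this expression dominates, which is where the three cases of the Master Theorem come from.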

Math review See also: [CLRS] – Appendix A - Summation formulas

Applying the math review See also: [CLRS] – Appendix A - Summation formulas

Applying the math

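T(n)=aT(n/b)+f(n), f(n)=c*n^k
We just proved the Master Theorem. The solution of the recurrence relation is:
- if a < b^k: T(n) is in Θ(n^k)
- if a = b^k: T(n) is in Θ(n^k * log n)
- if a > b^k: T(n) is in Θ(n^(log_b a))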

Merge-sort revisited
  T(n) = Θ(1)               if n = 1
  T(n) = 2*T(n/2) + Θ(n)    if n > 1
The recurrence relation of Merge-sort is a case of the Master Theorem, with a=2, b=2, k=1.
Case a = b^k => Θ(n^k * log n); with k=1 => Θ(n * log n)
Conclusion: to solve a recurrence of the form T(n)=aT(n/b)+f(n), f(n)=c*n^k, we can either:
- memorize the result of the Master Theorem and apply it directly, or
- do the reasoning (by substitution or by recursion tree) on the particular case
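Applying the theorem directly amounts to comparing a with b^k. The helper below is a minimal sketch of that comparison for recurrences of the form T(n)=aT(n/b)+c*n^k; the function name and the string output format are illustrative choices, not part of the lecture.

```python
import math


def master_theorem(a, b, k):
    """Return the Theta bound for T(n) = a*T(n/b) + c*n^k as a string.

    Only valid for this restricted form, with a >= 1, b > 1 and k >= 0.
    """
    if a < 1 or b <= 1 or k < 0:
        raise ValueError("expected a >= 1, b > 1 and k >= 0")
    b_to_k = b ** k
    if math.isclose(a, b_to_k):
        return f"Theta(n^{k} * log n)"            # all levels cost about the same
    if a < b_to_k:
        return f"Theta(n^{k})"                    # the root level dominates
    return f"Theta(n^{math.log(a, b):.3f})"       # the leaves dominate: n^(log_b a)


merge_sort_bound = master_theorem(a=2, b=2, k=1)
```

For Merge-sort (a=2, b=2, k=1) the comparison lands in the a = b^k case and the helper returns "Theta(n^1 * log n)", in agreement with the slide.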

Master Theorem - Applicability
- The results of the theorem are valid ONLY for particular recurrences, those of the form T(n)=aT(n/b)+f(n), f(n)=c*n^k.
- For all other types of recurrences, you have to apply the substitution method or the recursion tree method.
- Examples: recursive factorial, recursive Fibonacci. The Master Theorem does NOT apply to this kind of recurrence!

Difficult recurrences
Some recurrences can be difficult to solve mathematically, so we cannot directly determine a tight bound (Theta) for their running times. In these cases, we can apply one of the following strategies:
- Strategy 1: Try to determine a lower bound (Omega) and an upper bound (O).
- Strategy 2: Guess a function for the tight bound and prove that it satisfies the recurrence formula (the "guess and prove" approach).

Example: Recursive Fibonacci
  function Fibonacci(n: integer) returns integer is:
    if (n == 1) or (n == 2)
      return 1
    else
      return Fibonacci(n-1) + Fibonacci(n-2)
  T(n) = c1                        if n <= 2
  T(n) = T(n-1) + T(n-2) + c2      if n > 2
This recurrence is difficult to solve by substitution or recursion tree. We have to try something else.
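To get a feel for how fast this recurrence grows, the function can be written in Python next to a counter of the calls it makes. This is an illustrative sketch: call_count is a hypothetical helper that follows the recurrence with c1 = c2 = 1, that is, it counts one unit per call.

```python
def fibonacci(n):
    """Recursive Fibonacci exactly as in the pseudocode (n >= 1)."""
    if n == 1 or n == 2:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)


def call_count(n):
    """Number of calls made by fibonacci(n): the recurrence T(n) with c1 = c2 = 1."""
    if n <= 2:
        return 1
    return call_count(n - 1) + call_count(n - 2) + 1


fib_10 = fibonacci(10)
calls_10 = call_count(10)
```

With these constants the number of calls is 2*Fibonacci(n) - 1, so the cost of the computation grows as fast as the Fibonacci numbers themselves.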

Example: Computing lower and upper bounds
We should always try to do our best: find the highest lower bound that we can prove, and the lowest upper bound that we can prove.
If we can prove the same function as both lower bound and upper bound, then we have found the tight bound (Theta).

Example: Upper bound for Fibonacci
  T(n) = c1                        if n <= 2
  T(n) = T(n-1) + T(n-2) + c2      if n > 2
T(n) is in O(f(n)) if there exist a > 0 and n0 > 0 such that T(n) <= a*f(n) for all n >= n0.
The running time defined by this recurrence does not decrease as n grows, so T(n-2) <= T(n-1).
Taking this into account (replacing T(n-2) with the larger T(n-1)), the Fibonacci recurrence leads to:
  T(n) = T(n-1) + T(n-2) + c2 <= 2*T(n-1) + c2

Example: Upper bound for Fibonacci (cont)
  T(n) = T(n-1) + T(n-2) + c2 <= 2*T(n-1) + c2
By substitution:
  T(n) <= 2^k*T(n-k) + c2*(2^(k-1) + 2^(k-2) + ... + 2^2 + 2 + 1)
The substitution stops when n-k = 1, that is k = n-1, which gives T(n) <= 2^(n-1)*c1 + c2*(2^(n-1) - 1).
It follows that T(n) <= a*2^n for a suitable constant a, so T(n) is in O(2^n).

Example: Lower bound for Fibonacci
  T(n) = c1                        if n <= 2
  T(n) = T(n-1) + T(n-2) + c2      if n > 2
T(n) is in Omega(f(n)) if there exist b > 0 and n0 > 0 such that T(n) >= b*f(n) for all n >= n0.
The running time defined by this recurrence does not decrease as n grows, so T(n-2) <= T(n-1), T(n-3) <= T(n-2), ..., T(n-k) <= T(n-k+1).
Taking this into account (replacing T(n-1) with the smaller T(n-2)), the Fibonacci recurrence leads to:
  T(n) = T(n-1) + T(n-2) + c2 >= 2*T(n-2) + c2

Example: Lower bound for Fibonacci (cont)
  T(n) = T(n-1) + T(n-2) + c2 >= 2*T(n-2) + c2
By substitution:
  T(n) >= 2^k*T(n-2*k) + c2*(2^(k-1) + ... + 2^2 + 2 + 1)
The substitution stops when n-2*k reaches the base case, that is when k is about n/2.
It follows that T(n) >= b*2^(n/2) for a suitable constant b, so T(n) is in Omega(2^(n/2)).
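Both bounds can be spot-checked by evaluating the recurrence iteratively. The sketch assumes c1 = c2 = 1 and uses the illustrative constants b = 0.5 and a = 1; checking a finite range of n only illustrates the bounds, it does not prove them.

```python
def fib_cost(n, c1=1, c2=1):
    """Evaluate T(n) = T(n-1) + T(n-2) + c2 with T(1) = T(2) = c1, iteratively."""
    if n <= 2:
        return c1
    older, newer = c1, c1                 # T(1), T(2)
    for _ in range(3, n + 1):
        older, newer = newer, newer + older + c2
    return newer


# Check b * 2^(n/2) <= T(n) <= a * 2^n with b = 0.5 and a = 1 (assumed constants).
bounds_hold = all(
    0.5 * 2 ** (n / 2) <= fib_cost(n) <= 1 * 2 ** n for n in range(1, 60)
)
```

The gap between the two bounds widens quickly as n grows, which is why a tighter estimate is worth looking for.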

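Example: Guess and prove
From the upper and lower bounds we found that the Fibonacci running time satisfies:
  b*2^(n/2) <= T(n) <= a*2^n
We presume that T(n) grows like x^n and look for the value of x. Substituting into T(n) = T(n-1) + T(n-2) + c2 and ignoring the constant c2:
  x^n = x^(n-1) + x^(n-2)
  x^2 - x - 1 = 0 => x = (1 + sqrt(5))/2 ≈ 1.618 (the positive root)
This value lies between 2^(1/2) ≈ 1.414 and 2, so it is consistent with the two bounds. The Fibonacci running time is Θ(x^n).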
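The guessed growth rate can be compared with the recurrence itself: if T(n) behaves like x^n, the ratio T(n+1)/T(n) should approach x. The sketch below assumes c1 = c2 = 1 and uses illustrative names.

```python
import math

GOLDEN_RATIO = (1 + math.sqrt(5)) / 2     # positive root of x^2 - x - 1 = 0


def fib_cost(n, c1=1, c2=1):
    """Evaluate T(n) = T(n-1) + T(n-2) + c2 with T(1) = T(2) = c1, iteratively."""
    if n <= 2:
        return c1
    older, newer = c1, c1                 # T(1), T(2)
    for _ in range(3, n + 1):
        older, newer = newer, newer + older + c2
    return newer


# Ratio of consecutive values of T for a moderately large n.
growth_ratio = fib_cost(41) / fib_cost(40)
```

The ratio settles near 1.618, which supports the guess; a complete argument would still prove the Θ(x^n) bound by induction.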

Conclusions
- We estimate asymptotic complexity with: upper bound (Big-O), lower bound (Big-Omega) and tight bound (Big-Theta).
- Determining the asymptotic complexity of recursive algorithms can be difficult. For this, you have to solve the recurrence relation that describes the recursive algorithm.
- General methods for solving recurrence relations are the substitution method and the recursion-tree method.
- For certain particular types of recurrences (the divide-and-conquer type) the result is also given by the Master Theorem.
- Sometimes solving a recurrence relation is mathematically difficult and we cannot calculate the tight bound. In this case we can apply one of the following methods:
  - introduce approximations, which help us determine only lower bounds and upper bounds
  - guess and prove

Bibliography
Review of the analysis of algorithms: [CLRS], chap 3 (Growth of functions) and chap 4 (Recurrences, Master Theorem), or [Manber], chap 3

And there is more to this subject ...
- The running time can also vary with the input values, not only with the input size: average case vs. worst case.
  - Worst case: the running time of a program is guaranteed to stay below a certain bound (as a function of the input size), no matter what the input is. This approach is needed in critical software.
  - In other applications it is enough to analyse average performance. This may need probabilistic analysis.
- Randomized algorithms: random values guide the behaviour, in the hope of achieving good performance in the "average case". Their running time and/or their output are random variables. [CLRS ch 5]
- Amortized analysis: provides a worst-case performance guarantee on a sequence of operations. While each operation has its own worst-case guarantee, a specific sequence of these operations may cost less in total than the sum of the individual worst cases. [CLRS ch 17] A small example will follow later, with disjoint sets.