Recursion In this Chapter we Study: Recursion Divide and Conquer Dynamic Programming.

1 Recursion In this Chapter we Study: Recursion Divide and Conquer Dynamic Programming

2 Recursion involves: Self-reference (as in a recursive definition, or in a recursive method). Each recursive call should lead to a problem of smaller size. The chain of self-reference is terminated by a base case, in which the solution to the problem is trivial. General form of a recursive function:
if (condition for which the problem is trivial)   // fundamental (base) case
  trivial solution
else                                              // general case
  recursive call to the function on a smaller case

3 Let us illustrate this with some examples. Example 1: The factorial of a non-negative integer n is defined as follows: n! = 1 if n = 0; n! = n × (n − 1)! if n > 0. Example 2: The power n of a number x is defined as follows: x^n = x if n = 1; x^n = x × (x^⌊n/2⌋)² if n is odd; x^n = (x^(n/2))² otherwise.
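The power definition above can be turned directly into code. This is a sketch under the slide's definition (base case n = 1, so n ≥ 1 is assumed); the function name `power` is ours. Halving n at each call gives only O(log n) multiplications.

```c
#include <assert.h>

/* Recursive power following the definition on the slide:
   x^1 = x; if n is odd, x^n = x * (x^(n/2))^2; otherwise (x^(n/2))^2,
   where n/2 is integer division. Assumes n >= 1. */
long power(long x, unsigned n) {
    if (n == 1)
        return x;                 /* trivial (base) case */
    long half = power(x, n / 2);  /* one recursive call on a smaller case */
    if (n % 2 == 1)
        return x * half * half;   /* n odd */
    else
        return half * half;       /* n even */
}
```

For example, power(2, 10) performs only 4 recursive calls instead of 10 multiplications.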

4 Example 3: The GCD of two non-negative integers m and n, not both 0, is defined as follows: gcd(m, n) = m if n = 0; gcd(m, n) = gcd(n, m mod n) if n > 0. Example 4: The product of two positive integers m and n is defined as follows: P(m, n) = m if n = 1; P(m, n) = m + P(m, n − 1) if n > 1.
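Both definitions above transcribe directly into recursive functions; the names `gcd` and `P` follow the slide.

```c
/* Direct transcriptions of the two recursive definitions above.
   gcd assumes m, n >= 0 and not both 0; P assumes m, n >= 1. */
int gcd(int m, int n) {
    if (n == 0) return m;        /* base case */
    return gcd(n, m % n);        /* m mod n is strictly smaller than n */
}

int P(int m, int n) {            /* product by repeated addition */
    if (n == 1) return m;
    return m + P(m, n - 1);
}
```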

5 Example 5: Towers of Hanoi. A classical example of a problem which is nicely solved by recursion. We have: n disks in decreasing order of size, stacked on a peg, say A. We want to: move these disks to another peg, say B, so that they end up in the same decreasing order of size, using a third peg, say C, as workspace, and under the following rules: we can move only one disk at a time, and we can never place a larger disk on top of a smaller one.

6 [Figure: the three pegs A, B and C, showing the disks before, during and after the moves.]

7 Recursive Solution: Move(n, A, B, C) = Move(n − 1, A, C, B) + Move(1, A, B, C) + Move(n − 1, C, B, A), where the arguments are (how many, from, to, workspace).
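The three-step scheme above can be sketched as follows; the function name `hanoi` and the returned move count are ours (the count, 2^n − 1, is the standard result for n disks).

```c
#include <stdio.h>

/* Sketch of the recursive scheme: move n-1 disks out of the way,
   move the largest disk, then move the n-1 disks back on top of it.
   Returns the number of single-disk moves performed (2^n - 1). */
long hanoi(int n, char from, char to, char work) {
    if (n == 0) return 0;                       /* nothing to move */
    long moves = hanoi(n - 1, from, work, to);  /* Move(n-1, A, C, B) */
    printf("move disk %d: %c -> %c\n", n, from, to); /* Move(1, A, B, C) */
    moves += 1;
    moves += hanoi(n - 1, work, to, from);      /* Move(n-1, C, B, A) */
    return moves;
}
```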

8 Example 6: Another nice example of recursion, already discussed with stacks; in fact there is a great similarity between stacks and recursion! It involves printing the digits of a number n, say n = 1234, from right to left, giving 4321. What is the solution? We can extract the digits in the reverse of the order in which we want to print them.

9 Recursive Solution. Algorithm RecursiveReverseNumber. Input: a non-negative number n. Output: the digits of n in reverse order.
procedure printDecimal(int n)
  print n mod 10
  if (n >= 10) printDecimal(n div 10)
What if we want to print the digits in the same order? Idea: use a stack.
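The algorithm above in C, plus a buffer-building variant (the name `reverseDigits` is ours) so the reversed digits can be inspected rather than only printed.

```c
#include <stdio.h>

/* printDecimal from the slide: print the last digit, then recurse
   on the rest of the number, which is also printed reversed. */
void printDecimal(int n) {
    printf("%d", n % 10);          /* last digit first */
    if (n >= 10)
        printDecimal(n / 10);      /* then the rest, also reversed */
}

/* Same idea, but writing the reversed digits into buf.
   buf must be large enough (12 bytes covers any non-negative int). */
char *reverseDigits(int n, char *buf) {
    char *p = buf;
    do {
        *p++ = '0' + n % 10;
        n /= 10;
    } while (n > 0);
    *p = '\0';
    return buf;
}
```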

10 Implementation of Recursion. Recursion is implemented using a stack of activation records. Each activation record stores all parameters of the current call, so that the call can be completed when we return to it. Why a stack? Because calls are completed in the reverse of the order in which they are generated. Example: compute the sum of the integers from 1 to n: 1 + 2 + ... + n. That can also be written as n + (sum of the integers from 1 to n − 1), i.e. n + (1 + 2 + ... + (n − 1)), or n + Sum(n − 1).

11
int Sum(int n) {
  if (n == 1)              // fundamental (base) case
    return 1;
  else                     // general case
    return n + Sum(n - 1);
}
Trace of Sum(4): call 1: Sum(4) returns 4 + Sum(3) = 4 + 6 = 10; call 2: Sum(3) returns 3 + Sum(2) = 3 + 3 = 6; call 3: Sum(2) returns 2 + Sum(1) = 2 + 1 = 3; call 4: Sum(1) reaches n == 1 and returns 1.

12 Proof of correctness of recursive algorithms. The proof of correctness of a recursive algorithm is often a proof by induction. Formulating a proof by induction: 1. State the proposition P(n) and the range of n for which you are trying to prove it. 2. Verify the base case: that is, verify that P(n) is true for the smallest value of n in the range. 3. Formulate the inductive hypothesis: that is, assume P(k) is true for some k in the range. 4. Prove the induction step: that is, prove that the proposition holds for the next value k + 1. We prove the following: if P(k) is true, then P(k + 1) is also true. 5. Conclude that P(n) is true for all n in the stated range.

13 Example 1:

14 Example 2:

15 Correctness of the factorial algorithm. The factorial of an integer n is defined as follows: 0! = 1, and n! = 1 × 2 × 3 × 4 × ... × n for all n ≥ 1. A recursive method to calculate the factorial of n:
public static int factorial(int n) {
  if (n == 0) return 1;
  else return n * factorial(n - 1);
}
In order to prove the correctness of the factorial algorithm, we need to prove the following proposition: n! = n × (n − 1)! is true for all n ≥ 1.

16 Proof by induction. 1. Proposition: n! = n × (n − 1)! for all n ≥ 1. 2. Verify the base case: 1! = 1 × (1 − 1)! = 1 × 0! = 1, so the proposition is true for n = 1. 3. Induction hypothesis: k! = k × (k − 1)! is true for n = k. 4. Prove the induction step: prove that (k + 1)! = (k + 1) × k!. Proof: (k + 1)! = 1 × 2 × ... × k × (k + 1) = (k + 1) × k!. 5. Conclusion: the proposition n! = n × (n − 1)! is true for all n ≥ 1.

17 Consider the following problem: compute the value of the Fibonacci numbers 1, 1, 2, 3, 5, 8, 13, 21, 34, ... Recursive definition: fib(n) = 1 if n = 1 or n = 2; fib(n) = fib(n − 1) + fib(n − 2) if n > 2. Recursive implementation:
int fib(int n) {
  if (n == 1 || n == 2) return 1;
  else return fib(n - 1) + fib(n - 2);
}
Iterative implementation:
int iterativeFib(int n) {
  int prev1 = 1, prev2 = 1, current = 1;
  for (int i = 3; i <= n; i++) {
    current = prev1 + prev2;
    prev2 = prev1;
    prev1 = current;
  }
  return current;
}

18 The recursive solution for the Fibonacci numbers is simple but highly inefficient! The tree of function calls grows exponentially: many calls have the same parameters! The iterative solution is much more efficient. If n = 35: the iterative program performs 33 additions, while the recursive program performs about 18.5 million calls!!!

19 Complexity of the recursive algorithm. The recursive algorithm fiboRec that calculates the n-th Fibonacci number is exponential: it runs in O(2^n) time. The proof of the exponential complexity requires proving the two following propositions. Proposition 1: fiboRec(n) performs f_n − 1 additions before it terminates (f_n is the n-th Fibonacci number). Proposition 2: the number of additions f_n − 1 performed by fiboRec(n) is greater than some exponential function of n.

20 Proof of proposition 1:

21 Proof of proposition 2:

22 The complexity of the iterative algorithm for Fibonacci is O(n) !

23 Recursion or iteration? In general there exist both an iterative and a recursive solution for each problem. A recursive approach is a good solution when the problem is intrinsically recursive; in this case iterative solutions are much more complicated (e.g. Towers of Hanoi). In general iterative solutions are more efficient than recursive ones: recursion may recompute the same values (e.g. Fibonacci), and recursion implies many function calls, each of which implies transferring parameters and saving the state of the program. Recursive calls can also need a lot of memory.

24 Proof by induction that if n ≥ 5, then n² < 2ⁿ. Base case, n = 5: 25 < 32, true. Induction hypothesis: k² < 2^k for some k ≥ 5. Note that for k ≥ 5 we have k² ≥ 5k > 2k + 1. Induction step: (k + 1)² = k² + 2k + 1 < k² + k² = 2k² < 2 × 2^k = 2^(k+1). So the proposition holds for all n ≥ 5.

25 What is the big-Oh complexity of the following algorithm? What is the return value of the call Func(15)?
Func(n) {
  if (n ≤ 0)
    return 10
  else
    return Func(n - 2) - 2
}
Func(15) = Func(13) − 2 = Func(11) − 2 − 2 = Func(9) − 2 − 2 − 2 = ... = Func(−1) − 2 − 2 − 2 − 2 − 2 − 2 − 2 − 2 = 10 − 16 = −6. There are about n/2 calls, each performing one subtraction, so the complexity is O(n).
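The pseudocode above in C (lower-case `func` is our rendition), which lets the trace be checked directly:

```c
/* The slide's Func: each call subtracts 2 from n, so func(n) makes
   about n/2 recursive calls, each with O(1) work: O(n) time overall. */
int func(int n) {
    if (n <= 0)
        return 10;            /* base case */
    return func(n - 2) - 2;   /* one smaller subproblem */
}
```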

26 Proof by induction that 1² + 2² + ... + n² = n(n + 1)(2n + 1)/6 for n ≥ 0. Base case, n = 0: 0 = 0, true. Assume the formula is true for a given n. Then 1 + 4 + 9 + ... + n² + (n + 1)² = n(n + 1)(2n + 1)/6 + (n + 1)² = (n + 1)[n(2n + 1) + 6(n + 1)]/6 = (n + 1)(2n² + 7n + 6)/6. We also have (n + 2)(2(n + 1) + 1) = (n + 2)(2n + 3) = 2n² + 7n + 6. Then the sum equals (n + 1)(n + 2)(2(n + 1) + 1)/6, which is exactly the formula for n + 1.

27 Divide-and-Conquer. An important paradigm in algorithm design. Divide: recursively break down a problem into two (or more) sub-problems of the same (or related) type, until these become simple enough to be solved directly (base case). Conquer: the solutions to the sub-problems are then combined to give a solution to the original problem.

28 Example: Linear and Binary search Linear search Given an unsorted array, find the index position of a value x in the array. Algorithm: Compare x to all values in the array until the value of x is found or the end of the array is reached. If x is found in the array return its index position, else return -1. Complexity: The algorithm is linear in the worst case.

29 Binary search example: a sorted array of 15 elements, a = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28] at indices 0..14, and the call x = BinarySearch(a, 0, 14, 25). At each step the range [low, high] shrinks: the middle element 14 is tested first, then 22, then 26, then 24, after which the key 25 is reported not found.

30 Given a sorted array, find the index position of a value key in the array.
int BinarySearch(const int a[], int low, int high, int key) {
  // a[low..high] sorted in increasing order
  int mid;
  if (low > high)              // fundamental case: not found
    return -1;
  else {
    mid = (low + high) / 2;
    if (a[mid] == key)         // fundamental case: found in the middle
      return mid;
    else if (key < a[mid])     // search in the lower half
      return BinarySearch(a, low, mid - 1, key);
    else                       // search in the higher half
      return BinarySearch(a, mid + 1, high, key);
  }
}
The complexity in the worst case: O(log n).

31 An iterative version of the binary search:
int BinarySearch(const int a[], int low, int high, int key) {
  // a[low..high] sorted in increasing order
  int mid;
  while (low <= high) {
    mid = (low + high) / 2;
    if (a[mid] == key)         // found in the middle
      return mid;
    else if (key < a[mid])     // search in lower half
      high = mid - 1;
    else                       // search in higher half
      low = mid + 1;
  }
  return -1;                   // key not found
}
In fact, most recursive algorithms can be written in an iterative form. The complexity class is the same, but the iterative program is more efficient in practice (no function-call overhead).

32 How to choose the best algorithm for a given problem? Look at the complexity of each part of the algorithm, and determine how many times each part will be applied. Example with linear/binary search. Linear search: build the array in O(n); search the array in O(n). Binary search: build the array in O(n); sort the array in O(n log n); search the array in O(log n). So: for a problem with few searches, use linear search; with many searches, use binary search.

33 Example: Maximum Contiguous Subsequence (MCS) problem. Split the sequence into a first half a_1, a_2, ..., a_⌊n/2⌋ and a second half a_⌊n/2⌋+1, a_⌊n/2⌋+2, ..., a_n. Find the maximum contiguous subsequence of the first half; find the maximum contiguous subsequence of the second half. Is the MCS simply the maximum of these two?

34 What if the maximum contiguous subsequence straddles both halves? Example: -1, 2, 3, | 4, 5, -6. The MCS of the first half is 2, 3; the MCS of the second half is 4, 5. However, the MCS of the given sequence is 2, 3, 4, 5: it straddles both halves! Observation: for a straddling sequence, a_⌊n/2⌋ is the last element of the first half and a_⌊n/2⌋+1 is the first element of the second half.

35 In the example: last element = 3, first element = 4. Computing a maximum straddling sequence for a_1, a_2, a_3, a_4, | a_5, a_6, a_7, a_8: find the maximum of a_4, a_4 + a_3, a_4 + a_3 + a_2, a_4 + a_3 + a_2 + a_1. Let the maximum be max1.

36 Find the maximum of a_5, a_5 + a_6, a_5 + a_6 + a_7, a_5 + a_6 + a_7 + a_8. Let the maximum be max2. Required maximum: max1 + max2. Required subsequence: glue the subsequence for max1 and the subsequence for max2.

37 Complexity of finding the straddling sequence. Theorem: for an n-element sequence, the complexity of finding a maximum straddling sequence is O(n). Proof: we look at each element exactly once! Putting things together: S = a_1, a_2, ..., a_n; S1 = a_1, ..., a_⌊n/2⌋; S2 = a_⌊n/2⌋+1, ..., a_n. MCS(S) = max( MCS(S1), MCS(S2), maximum straddling sequence ). MCS(S1) and MCS(S2) are found recursively.
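The complete scheme above, as a C sketch (the function name `mcs` is ours). It assumes the answer is a non-empty subsequence, as in the slides.

```c
/* Recursive maximum-contiguous-subsequence sum: best of the left half,
   the right half, and the best straddling sequence across the middle. */
int mcs(const int a[], int low, int high) {
    if (low == high)
        return a[low];                      /* one element: base case */
    int mid = (low + high) / 2;
    int left  = mcs(a, low, mid);           /* MCS(S1) */
    int right = mcs(a, mid + 1, high);      /* MCS(S2) */

    /* best straddling sequence: grow left from mid, right from mid+1 */
    int sum = 0, max1 = a[mid];
    for (int i = mid; i >= low; i--) {
        sum += a[i];
        if (sum > max1) max1 = sum;
    }
    sum = 0;
    int max2 = a[mid + 1];
    for (int i = mid + 1; i <= high; i++) {
        sum += a[i];
        if (sum > max2) max2 = sum;
    }
    int best = max1 + max2;                 /* glue the two pieces */
    if (left > best)  best = left;
    if (right > best) best = right;
    return best;
}
```

On the slide's example -1, 2, 3, 4, 5, -6 this returns 14, the sum of the straddling subsequence 2, 3, 4, 5.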

38 Complexity of recursive MCS. How do we compute the global complexity? The running time satisfies the recurrence T(n) = 2 T(n/2) + O(n), (1) where T(n/2) is the time to solve a problem of size n/2 and O(n) is the complexity of finding a maximum straddling subsequence.

39 What is a solution to (1)? Approach 1: the tree of recursive calls has about 1 + log₂ n levels, and O(n) work is needed to go from one level of the tree to the next higher level. Therefore, we have a global complexity of O(n log n).

40 The tree of recursive calls: the root does n work, its two children n/2 each, the next level n/4 each, then n/8, and so on; each of the log n + 1 levels totals n work, for a total of O(n log n).

41 Approach 2: unfold the recurrence T(n) ≤ 2 T(n/2) + c n:
T(n) ≤ 2¹ T(n/2) + c n
2¹ T(n/2) ≤ 2² T(n/4) + c n
2² T(n/4) ≤ 2³ T(n/8) + c n
2³ T(n/8) ≤ 2⁴ T(n/16) + c n
...
Adding up and cancelling the common terms on both sides, we have T(n) ≤ c n + c n + c n + ... (log₂ n terms), so T(n) ≤ c n log₂ n. Recursive MCS takes T(n) = O(n log n), whereas non-recursive MCS takes T(n) = O(n). Divide-and-conquer is not always the best solution!!

42 Another example: find both the maximum and the minimum of a set S of n elements. Naive solution: find the maximum with n − 1 comparisons, then find the minimum of the remaining elements with n − 2 comparisons; total: 2n − 3 comparisons. Can we do better? ... Yes ...

43 Divide-and-Conquer algorithm:
procedure maxMin(S)
  if |S| = 2    // S = {a, b}
    return (max(a, b), min(a, b))
  else
    divide S into two subsets S1 and S2, each with half of the elements
    (max1, min1) ← maxMin(S1)
    (max2, min2) ← maxMin(S2)
    return (max(max1, max2), min(min1, min2))
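A C sketch of the procedure above. Two adaptations are ours: results come back through pointers (C has no pairs), and a one-element case is added so that subsets of odd size also work.

```c
/* Divide-and-conquer maxMin over a[low..high]. */
void maxMin(const int a[], int low, int high, int *mx, int *mn) {
    if (low == high) {                    /* one element */
        *mx = *mn = a[low];
    } else if (high == low + 1) {         /* two elements: 1 comparison */
        if (a[low] > a[high]) { *mx = a[low];  *mn = a[high]; }
        else                  { *mx = a[high]; *mn = a[low];  }
    } else {
        int mid = (low + high) / 2;
        int mx1, mn1, mx2, mn2;
        maxMin(a, low, mid, &mx1, &mn1);      /* (max1, min1) */
        maxMin(a, mid + 1, high, &mx2, &mn2); /* (max2, min2) */
        *mx = mx1 > mx2 ? mx1 : mx2;          /* 2 more comparisons */
        *mn = mn1 < mn2 ? mn1 : mn2;
    }
}
```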

44 Analysis of the divide-and-conquer algorithm for maxMin. Number of comparisons? Let T(n) = number of comparisons on n elements: T(2) = 1, and T(n) = 2 T(n/2) + 2 for n > 2. Solution: T(4) = 2 T(2) + 2 = 4; T(8) = 2 T(4) + 2 = 10; T(16) = 2 T(8) + 2 = 22; ...; T(2^k) = 3 × 2^(k−1) − 2 (proof by induction). With n = 2^k, that is (3/2) n − 2 comparisons, versus 2n − 3 for the naive algorithm.

45 However... T(n) = O(n) for both the naive and the divide-and-conquer algorithm! For the curious: approximately 3n/2 − 2 comparisons are both necessary and sufficient to find the maximum and minimum of a set of n elements.

46 Dynamic Programming. Reminder: the complexity of the recursive Fibonacci algorithm is O(2^n), while the complexity of the iterative one is O(n)! Why? Because the recursive algorithm performs the same computations a huge number of times. Idea: memorize the values in an array t, where t[i] holds a Fibonacci number (t = 1, 1, 2, 3, 5, ...). Dynamic programming algorithm for the Fibonacci numbers: t[0] = t[1] = 1; for i = 2 to n { t[i] = t[i-1] + t[i-2] }.
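The table-filling idea above in C (name `fibDP` is ours; here the table is 1-indexed to match fib(1) = fib(2) = 1 from the earlier slide, so n ≥ 1 is assumed).

```c
/* Dynamic-programming Fibonacci: each value is computed once and
   memorized in the array t. O(n) time, O(n) space. */
long fibDP(int n) {
    long t[n + 1];               /* t[i] holds the i-th Fibonacci number */
    t[1] = 1;
    if (n >= 2) t[2] = 1;
    for (int i = 3; i <= n; i++)
        t[i] = t[i - 1] + t[i - 2];
    return t[n];
}
```

For n = 35 this performs 33 additions, versus the roughly 18.5 million calls of the plain recursive version.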

47 Essence of this method: solve a problem by combining the solutions of its subproblems. It applies to recursive problems composed of overlapping (dependent) subproblems. In fact, it is a transformation of a recursive algorithm using a data structure to store intermediate solutions: each subproblem is solved only once and stored in an array for later use. It can lower the complexity from exponential to polynomial, and it has numerous applications to optimization problems.

48 Another example of an optimization problem: given coins worth c1, c2, ..., cn cents, make change for k cents using the minimum number of coins of the above denominations. In order that this always be possible, we assume that c1 = 1. Let min(k) denote the minimum number of coins needed to make k cents of change. Then min(k) = min{ min(r) + min(k − r) } over all 1 ≤ r ≤ ⌊k/2⌋, which we can compute once we know min(1), min(2), ..., min(k − 1).

49 Example: c1 = 1 cent, c2 = 5 cents, c3 = 10 cents, c4 = 25 cents. min(1) = 1, and min(2) = min(1) + min(1) = 2. We also know min(5) = min(10) = min(25) = 1. Knowing min(2), we get min(3) = min(1) + min(2) = 1 + 2 = 3. Again, knowing min(2) and min(3): min(4) = min{ min(1) + min(3), min(2) + min(2) } = min{4, 4} = 4.

50
# of cents | Min | How the solution is found
1          | 1   | min(1)
2          | 2   | min{min(1)+min(1)}
3          | 3   | min{min(1)+min(2)}
4          | 4   | min{min(1)+min(3), min(2)+min(2)}
5          | 1   | 5 is itself a coin
6          | 2   | min{min(1)+min(5), min(2)+min(4), min(3)+min(3)}
7          | 3   | min{min(1)+min(6), min(2)+min(5), min(3)+min(4)}

51
8          | 4   | min{min(1)+min(7), min(2)+min(6), min(3)+min(5), min(4)+min(4)}
9          | 5   | min{min(1)+min(8), min(2)+min(7), min(3)+min(6), min(4)+min(5)}
10         | 1   | 10 is itself a coin
Then min(11) = min{ min(1)+min(10), min(2)+min(9), min(3)+min(8), min(4)+min(7), min(5)+min(6) } = min{ 1+1, 2+5, 3+4, 4+3, 1+2 } = min{ 2, 7, 7, 7, 3 } = 2.

52 Why don’t we use recursion? Quite inefficient !! What do we use instead? Dynamic Programming… The algorithm uses one arrays: coinsUsed: Stores the minimum number of coins needed to make change of k cents, k = 1,…, maxChange. coinsUsed[0]  0; for cents  1 to maxChange do // maxChange = k minCoins  cents for j  1 to diffCoins do // diffCoins = n if (coins[j] > cents) continue// Cannot use coin j if (coinsUsed[cents - coins[j]] + 1 < minCoins) minCoins  coinsUsed[cents – coins[j]] + 1 coinsUsed[cents]  minCoins Print “minimum number of coins:”, coinsUsed[maxChange]

53 Time complexity: O(nk), where n = number of coins of different denominations and k = amount of change we want to make.

54 Another example, from bioinformatics: sequence alignment by dynamic programming. An important problem in bioinformatics is determining the level of similarity between two protein sequences; it is very useful for discovering the function of a new protein. A protein sequence is represented by a sequence of characters, each character corresponding to one amino acid. To align: match the largest number of characters between the two sequences in order to maximize a similarity score. Three possible configurations: identity (A aligned with A); substitution or mismatch (A aligned with C); insertion or deletion, i.e. a gap (A aligned with -, or - aligned with A). Example:
A T - G G - T
A A C - G C T

55 A score is associated with each configuration, and the global score is the sum of the scores of all positions in the alignment. Example with identity score 4, mismatch score -1, gap score -2:
A T - G G - T
A A C - G C T
Score: 4 - 1 - 2 - 2 + 4 - 2 + 4 = 5
Problem: to find the best alignment naively, all the possible alignments would have to be constructed.

56 The solution is defined recursively. Let P = p_1, p_2, ..., p_n be the first sequence and Q = q_1, q_2, ..., q_m the second sequence. We define F(i, j) as the score of the best alignment between p_1, p_2, ..., p_i and q_1, q_2, ..., q_j.

57 Finding the score of (i, j): there are three ways to build the alignment of 1...i with 1...j. Align p_i with q_j on top of the alignment of 1...i-1 with 1...j-1; or align p_i with a gap on top of the alignment of 1...i-1 with 1...j; or align a gap with q_j on top of the alignment of 1...i with 1...j-1.

58 Cédric Notredame (13/10/2015). Finding the score of (i, j): in order to compute the score of 1...i vs 1...j, all we need are the scores of 1...i-1 vs 1...j-1, of 1...i vs 1...j-1, and of 1...i-1 vs 1...j.

59 Formalizing the algorithm: F(i, j) = max{ F(i-1, j-1) + Mat[i, j], F(i-1, j) + Gap, F(i, j-1) + Gap }, where Mat[i, j] is the match/mismatch score for p_i and q_j. The direct application of the recursive formula is in O(c^n). There is overlap between the subproblems => dynamic programming.
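The recurrence above as a C sketch, using the slide's scores (Match = 2, MisMatch = -1, Gap = -1). The function name `alignScore` is ours, and it returns only the best score; a full aligner would also keep trace-back pointers to recover the alignment itself.

```c
#include <string.h>

#define MATCH 2
#define MISMATCH -1
#define GAP -1

static int max3(int a, int b, int c) {
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Fill the F matrix row by row and return F(n, m),
   the best global alignment score of p against q. */
int alignScore(const char *p, const char *q) {
    int n = strlen(p), m = strlen(q);
    int F[n + 1][m + 1];
    for (int i = 0; i <= n; i++) F[i][0] = i * GAP;  /* limits: all gaps */
    for (int j = 0; j <= m; j++) F[0][j] = j * GAP;
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= m; j++) {
            int s = (p[i-1] == q[j-1]) ? MATCH : MISMATCH;
            F[i][j] = max3(F[i-1][j-1] + s,   /* align p_i with q_j  */
                           F[i-1][j] + GAP,   /* gap in q            */
                           F[i][j-1] + GAP);  /* gap in p            */
        }
    return F[n][m];
}
```

For example, aligning FAST against FAT gives score 5 (F A S T over F A - T: 2 + 2 - 1 + 2).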

60 Arranging everything in a table: we use a two-dimensional array D, indexed by the prefixes of the two sequences (e.g. FAST down the side and FAT across the top), to store the scores of all the partial alignments.

61 Taking care of the limits (with Match = 2, MisMatch = -1, Gap = -1): the first row and first column correspond to aligning a prefix against the empty sequence, i.e. all gaps. So the corner cell is 0, the first column holds -1, -2, -3, -4 (e.g. FA against -- scores -2, FAS against --- scores -3) and the first row holds -1, -2, -3 (e.g. FAT against --- scores -3).

62 Filling up the matrix.

63 [Figure: the filled score matrix for FAST vs FAT with Match = 2, MisMatch = -1, Gap = -1; the bottom-right cell holds the best global alignment score, 5.]

64 Delivering the alignment: trace-back. Starting from the bottom-right cell (the optimal alignment score, here the score of 1...4 vs 1...3) and following back the choices that produced each cell yields the alignment
F A S T
F A - T
The dynamic programming algorithm's complexity is in O(n²) (more precisely O(nm) for sequences of lengths n and m).

65 [Figure: the filled dynamic-programming matrix for the alignment between GCTCTGCGAATA and CGTTGAGATACT (match = 2, mismatch = -1, gap = -2).] The resulting alignment:
G C T C T G C G A - A T A
- C G T T G A G A T A C T

