Data Structures and Algorithms (60-254) Recursion
Topics Recursion Divide and Conquer Dynamic Programming The Greedy Paradigm
Recursion Recursion involves: Self-reference as in a recursive definition, or in a recursive method The chain of self-reference is terminated by a base case, which we must define separately. Let us illustrate this with some examples.
Examples
Example 1: The factorial of a non-negative integer, n, is defined as follows:
$n! = n \cdot (n-1)!$ if $n \ge 1$, and $n! = 1$ if $n = 0$.
Example 2: The GCD of two non-negative integers, not both 0, is defined as follows:
$\gcd(m, n) = \gcd(n, m \bmod n)$ if $n > 0$, and $\gcd(m, n) = m$ if $n = 0$.
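The two definitions above translate almost line for line into code. The following is a minimal Python sketch; the function names `factorial` and `gcd` are chosen here for illustration and are not taken from the slides.

```python
def factorial(n):
    """n! for a non-negative integer n, following the recursive definition."""
    if n == 0:                     # base case: terminates the self-reference
        return 1
    return n * factorial(n - 1)    # recursive case


def gcd(m, n):
    """GCD of two non-negative integers, not both 0."""
    if n == 0:                     # base case
        return m
    return gcd(n, m % n)           # recursive case


print(factorial(5))   # 120
print(gcd(48, 18))    # 6
```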
Examples
Example 3: The product of two positive integers m and n is defined as follows:
$P(m, n) = m + P(m, n-1)$ if $n > 1$, and $P(m, n) = m$ if $n = 1$.
Example 4: A classical example of a problem that is nicely solved by recursion is:
Towers of Hanoi
We have: n disks in order of decreasing size, stacked on a peg, say A.
We want to: move these disks to another peg, say C, so that they are in the same decreasing order of size, using a third peg, say B, as workspace, and under the following rules:
We can move only one disk at a time.
We cannot place a larger disk on top of a smaller one in any move.
Recursive Solution
Move(n, A, C, B): the arguments are (how many disks, from, to, workspace).
Move(n, A, C, B) = Move(n-1, A, B, C) + Move(1, A, C, B) + Move(n-1, B, C, A)
Homework: How many moves does it take to solve the problem with n disks?
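The decomposition above can be sketched in Python as follows. The `moves` list used to record each single-disk move is an assumption made for illustration; the slides only give the three-part recursion.

```python
def move(n, source, target, workspace, moves):
    """Record the single-disk moves that transfer n disks from source to target."""
    if n == 1:                                      # base case: one disk moves directly
        moves.append((source, target))
        return
    move(n - 1, source, workspace, target, moves)   # clear the n-1 smaller disks
    move(1, source, target, workspace, moves)       # move the largest disk
    move(n - 1, workspace, target, source, moves)   # put the smaller disks back on top


moves = []
move(3, "A", "C", "B", moves)
for src, dst in moves:
    print(f"move the top disk from {src} to {dst}")
```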
Examples Example 5: Another nice example of recursion: already discussed in Stacks. Involves printing the digits of a number n, say n = 1234 from right to left, say 4321 The digits are input in the reverse order that we want to print them. We want to postpone printing the least significant digits until we have printed the more significant one. Recursion is one way to achieve this.
Algorithm RecursiveReverseNumber
Input: A non-negative number, n
Output: The digits of n in reverse order.
procedure printDecimal(int n)
    Print n mod 10
    if (n >= 10)
        printDecimal(n div 10)
Compare this 3-line recursive algorithm with the 10-line iterative (stack-based) algorithm from chapter 3.
What if we want to print the digits in the same order?
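As a rough illustration, here is printDecimal in Python, next to one possible variant that produces the digits in their original order by recursing before emitting. Both functions return lists rather than printing so that the results are easy to inspect; that choice, and the names `digits_reversed` and `digits_in_order`, are assumptions of this sketch.

```python
def digits_reversed(n):
    """Digits of n, least significant first (the order printDecimal prints them)."""
    digits = [n % 10]                        # emit the last digit first ...
    if n >= 10:
        digits += digits_reversed(n // 10)   # ... then handle the remaining digits
    return digits


def digits_in_order(n):
    """Digits of n, most significant first: recurse first, emit afterwards."""
    digits = digits_in_order(n // 10) if n >= 10 else []
    digits.append(n % 10)
    return digits


print(*digits_reversed(1234))   # 4 3 2 1
print(*digits_in_order(1234))   # 1 2 3 4
```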
Implementation of Recursion Recursion is implemented by using a stack of activation records. Each activation record stores all parameters related to the present call, so that it can be completed when we return to this call. Why a stack? Because calls are completed in the reverse order in which they are generated.
Tail recursion optimization
Tail recursion is a special case of recursion where the last operation is a recursive call.
Compilers can easily transform tail-recursive code into equivalent iterative code.
This can dramatically decrease the stack space used and increase efficiency.
Recursive version:
printDecimal(int n)
    Print n mod 10
    if (n >= 10)
        printDecimal(n div 10)
Equivalent iterative version:
loop:
    Print n mod 10
    if (n >= 10)
        n = n div 10; goto loop
Homework: Does C do tail recursion optimization? C++? Java? C#?
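The same transformation can be done by hand. The sketch below uses the GCD definition from the earlier examples, which is also tail recursive, and rewrites it as a loop. CPython, the reference Python implementation, does not perform this optimization itself, so the loop here is written manually; the function names are illustrative.

```python
def gcd_tail_recursive(m, n):
    """Tail-recursive GCD: the recursive call is the very last operation."""
    if n == 0:
        return m
    return gcd_tail_recursive(n, m % n)   # nothing left to do after this call


def gcd_loop(m, n):
    """The same computation with the tail call replaced by a jump (a loop)."""
    while n != 0:
        m, n = n, m % n                   # rebind the parameters, then "goto loop"
    return m


print(gcd_tail_recursive(1071, 462), gcd_loop(1071, 462))
```

Because no work remains after the recursive call returns, the activation record of the caller is never needed again, which is why the parameters can simply be overwritten.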
Proving Program Correctness Close connection between recursive programs and mathematical induction Can use mathematical induction to aid in proving program correctness for recursive programs How? Prove the program produces the correct output for the base case. Assume it produces the correct output for the case n=k Prove it produces the correct output for n=k+1 Homework: Prove correctness of the recursive towers of Hanoi implementation
An Inefficient Use of Recursion
Recursion is a powerful tool, but it has limitations.
Example: computing Fibonacci numbers.
The nth Fibonacci number is given by: $F_n = F_{n-1} + F_{n-2}$ if $n \ge 2$, with $F_0 = 0$ and $F_1 = 1$.
Here are the first few: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …
Algorithm Fibonacci (Recursive)
Input: A non-negative integer n
Output: The nth Fibonacci number
procedure Fibonacci(int n)
    if (n == 0 or n == 1)
        return n
    else
        return Fibonacci(n-1) + Fibonacci(n-2)
Inefficient
Why is it inefficient to use recursion for Fibonacci?
Homework: Design an iterative Fibonacci algorithm
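One way to explore the question is to count how many calls the recursive algorithm makes. The sketch below instruments the recursive procedure so that it also returns a call count; the function name `fib_calls` and the returned pair are assumptions added for this experiment, not part of the original algorithm.

```python
def fib_calls(n):
    """Return (F(n), number of calls made) for the naive recursive algorithm."""
    if n == 0 or n == 1:
        return n, 1                            # one call, no further recursion
    a, calls_a = fib_calls(n - 1)
    b, calls_b = fib_calls(n - 2)
    return a + b, calls_a + calls_b + 1        # +1 for this call itself


for n in (5, 10, 20, 25):
    value, calls = fib_calls(n)
    print(f"n={n}: F(n)={value}, calls={calls}")
```

The call count grows about as fast as the Fibonacci numbers themselves, because the same subproblems are recomputed over and over.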
Divide-and-Conquer An important paradigm … in algorithm design Divide: Problem into sub-problems Conquer: By merging solutions of sub-problems … into solution to the original problem.
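Maximum Contiguous Subsequence Problem
Find the maximum contiguous subsequence (MCS), i.e., the run of consecutive elements with the largest sum.
Find the maximum contiguous subsequence of the first half.
Find the maximum contiguous subsequence of the second half.
Is the MCS the maximum of these two?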
Straddling sequence What if max. contiguous subsequence straddles both halves? Example: -1, 2, 3, | 4, 5, -6 MCS in first half is 2, 3 MCS in second half is 4,5 However …
Observation
The MCS of the given sequence is: 2, 3, 4, 5
It straddles both halves!
Observation: For a straddling sequence, $a_{n/2}$ is the last element in the first half, and $a_{n/2+1}$ is the first element in the second half.
Computing a max. straddling sequence
In the example: Last element = 3, First element = 4
Computing a max. straddling sequence: $a_1, a_2, a_3, a_4 \mid a_5, a_6, a_7, a_8$
Find the maximum (max1) of:
$a_4$
$a_4 + a_3$
$a_4 + a_3 + a_2$
$a_4 + a_3 + a_2 + a_1$
Computing a max. straddling sequence
Find the maximum of:
$a_5$
$a_5 + a_6$
$a_5 + a_6 + a_7$
$a_5 + a_6 + a_7 + a_8$
Let the maximum be max2
Required maximum: max1 + max2
Required subsequence: Glue the subsequence for max1 and the subsequence for max2
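The two scans above can be sketched in Python as follows. The 0-based, half-open index convention (`a[lo:mid]` is the first half, `a[mid:hi]` the second) and the function name are assumptions of this sketch.

```python
def max_straddling(a, lo, mid, hi):
    """Best sum of a run that ends the first half a[lo:mid] and starts the
    second half a[mid:hi]. Returns (sum, start, end) with the run a[start:end]."""
    total, max1, start = 0, float("-inf"), mid - 1
    for i in range(mid - 1, lo - 1, -1):     # grow leftwards from the last element
        total += a[i]
        if total > max1:
            max1, start = total, i

    total, max2, end = 0, float("-inf"), mid
    for j in range(mid, hi):                 # grow rightwards from the first element
        total += a[j]
        if total > max2:
            max2, end = total, j

    return max1 + max2, start, end + 1       # glue the two pieces together


a = [-1, 2, 3, 4, 5, -6]
best, start, end = max_straddling(a, 0, 3, 6)
print(best, a[start:end])   # 14 [2, 3, 4, 5]
```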
Complexity Theorem: For an n-element sequence, the complexity of finding a maximum straddling sequence is O(n) Proof: We look at each element exactly once!
Putting things together …
$S = a_1, a_2, \dots, a_{n/2}, a_{n/2+1}, a_{n/2+2}, \dots, a_n$
$S_1 = a_1, a_2, \dots, a_{n/2}$
$S_2 = a_{n/2+1}, a_{n/2+2}, \dots, a_n$
Max(S) = Max( Max(S1), Max(S2), Max(straddling sequence) )
Max(S1) and Max(S2) are found recursively.
Complexity of Recursive MCS
$T(n) = 2\,T(n/2) + O(n)$   (1)
How? $T(n/2)$ = time to solve a problem of size n/2 (there are two such subproblems)
O(n) = complexity of finding a maximum straddling subsequence
What is a solution to (1)?
Approach 1
# of levels in the tree of recursive calls? $1 + \log_2 n$
O(n) work is needed to go from one level to the next higher level.
Therefore, we have O(n log n)
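A minimal Python sketch of the complete divide-and-conquer procedure follows. It returns only the maximum sum, assumes a non-empty input and a non-empty subsequence, and uses 0-based half-open ranges; these conventions and the function name are assumptions of the sketch.

```python
def max_contiguous_sum(a, lo=0, hi=None):
    """Maximum sum of a non-empty contiguous run of a[lo:hi], by divide and conquer."""
    if hi is None:
        hi = len(a)
    if hi - lo == 1:                          # base case: a single element
        return a[lo]
    mid = (lo + hi) // 2
    left = max_contiguous_sum(a, lo, mid)     # Max(S1)
    right = max_contiguous_sum(a, mid, hi)    # Max(S2)

    # Max(straddling sequence): best suffix of S1 plus best prefix of S2.
    total, max1 = 0, float("-inf")
    for i in range(mid - 1, lo - 1, -1):
        total += a[i]
        max1 = max(max1, total)
    total, max2 = 0, float("-inf")
    for j in range(mid, hi):
        total += a[j]
        max2 = max(max2, total)

    return max(left, right, max1 + max2)


print(max_contiguous_sum([-1, 2, 3, 4, 5, -6]))   # 14
```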
Graphically A total of… O (n log n)
Approach 2
Write the inequality for each problem size, then multiply it by the factor shown on the left:
$2^0$:  $T(n) \le 2\,T(n/2) + c\,n$
$2^1$:  $T(n/2) \le 2\,T(n/4) + c\,n/2$
$2^2$:  $T(n/4) \le 2\,T(n/8) + c\,n/4$
$2^3$:  $T(n/8) \le 2\,T(n/16) + c\,n/8$
………
Adding up, we have:
$2^0 T(n) + 2^1 T(n/2) + 2^2 T(n/4) + 2^3 T(n/8) + \dots \le 2\,T(n/2) + 2^2 T(n/4) + 2^3 T(n/8) + \dots + c\,n + 2\,c\,n/2 + 2^2 c\,n/4 + \dots$
Moral of the story
Cancelling the terms common to both sides (and ignoring the lower-order base-case term):
$T(n) \le c\,n + c\,n + c\,n + \dots$   ($\log_2 n$ terms)
$T(n) \le c\,n \log_2 n$
Recursive MCS takes $T(n) = O(n \log n)$, whereas non-recursive MCS takes $T(n) = O(n)$.
Divide-and-conquer is not always the best solution!
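The slides do not show which linear-time algorithm is meant. One well-known candidate is Kadane's algorithm, sketched below under the same assumptions as before (non-empty input, non-empty subsequence); it is offered as an illustration of an O(n) solution, not as the method from the course.

```python
def max_contiguous_sum_linear(a):
    """Maximum sum of a non-empty contiguous run of a, in one pass (Kadane)."""
    best = current = a[0]
    for x in a[1:]:
        current = max(x, current + x)   # extend the current run or start a new one at x
        best = max(best, current)
    return best


print(max_contiguous_sum_linear([-1, 2, 3, 4, 5, -6]))   # 14
```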
A General Recurrence Relation
For $T(n) = A\,T(n/B) + O(n^k)$, where $A \ge 1$ and $B > 1$.
Such recurrences do occur in practice…
Matrix Multiplication
$T(n) = 8\,T(n/2) + O(n^2)$   (1)
or
$T(n) = 7\,T(n/2) + O(n^2)$   (2)
where T(n) is the time complexity of multiplying two n×n matrices.
In (1), A = 8, B = 2, k = 2
In (2), A = 7, B = 2, k = 2
In both cases $A > B^2$.
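For reference, the standard solution of this recurrence (often called the master theorem) has three cases, depending on how A compares with $B^k$. This is the usual textbook statement, not something recovered from the slide itself:

```latex
T(n) =
\begin{cases}
O\!\left(n^{\log_B A}\right) & \text{if } A > B^k \\[4pt]
O\!\left(n^k \log n\right)   & \text{if } A = B^k \\[4pt]
O\!\left(n^k\right)          & \text{if } A < B^k
\end{cases}
```

As a check, the recursive MCS recurrence has A = 2, B = 2, k = 1, so $A = B^k$ and the middle case gives O(n log n), in agreement with the two approaches above.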
Tree of Recursive Calls (n = 8)
Divide and Conquer
Another example: Find both the maximum and minimum of a set of n elements, S
Naive solution:
Find maximum: n - 1 comparisons
Find minimum (of the remaining n - 1 elements): n - 2 comparisons
Total: 2n - 3 comparisons
Can we do better?? ... Yes …
Algorithm Divide-and-Conquer
procedure maxMin(S)
    if |S| = 2   // S = {a, b}
        return (max(a, b), min(a, b))
    else
        divide S into two subsets, say S1 and S2, each with half of the elements
        (max1, min1) ← maxMin(S1)
        (max2, min2) ← maxMin(S2)
        return (max(max1, max2), min(min1, min2))
Number of comparisons?
T(n) = # of comparisons on n elements
$T(n) = 1$ if $n = 2$
$T(n) = 2\,T(n/2) + 2$ if $n > 2$
Solution:
$n = 2^2$:  $T(4) = 2\,T(2) + 2 = 4$
$n = 2^3$:  $T(8) = 2\,T(4) + 2 = 10$
$n = 2^4$:  $T(16) = 2\,T(8) + 2 = 22$
…
$n = 2^k$:  $T(2^k) = 3 \cdot 2^{k-1} - 2$   (guess the pattern and prove by induction)
$= \tfrac{3}{2}\,n - 2$, vs. $2n - 3$ for the naive solution
Solving by Repeated Substitution
$T(n) = 1$ if $n = 2$;  $T(n) = 2\,T(n/2) + 2$ if $n > 2$
$T(n) = 2\,T(n/2) + 2$
$= 2\,[2\,T(n/2^2) + 2] + 2$   {since $T(n/2) = 2\,T(n/2^2) + 2$}
$= 2^2\,T(n/2^2) + 2^2 + 2^1$
$= 2^2\,[2\,T(n/2^3) + 2] + 2^2 + 2^1$   {since $T(n/2^2) = 2\,T(n/2^3) + 2$}
$= 2^3\,T(n/2^3) + 2^3 + 2^2 + 2^1$
…
$= 2^k\,T(n/2^k) + 2^k + \dots + 2^3 + 2^2 + 2^1$   {continue until $n/2^k = 2$ (base case), so $2^k = n/2$}
$= (n/2)\,T(2) + 2\,(2^{k-1} + \dots + 2^3 + 2^2 + 2^1 + 2^0)$
$= n/2 + 2\,(2^k - 1)$
$= n/2 + 2\,(n/2 - 1)$
$= n/2 + 2n/2 - 2$
$= 3n/2 - 2$
Note: $2^{k-1} + \dots + 2^3 + 2^2 + 2^1 + 2^0$ is a geometric series.
Sum $= a\,(1 - r^k)/(1 - r)$ where $a = 1$, $r = 2$, so Sum $= (1 - 2^k)/(1 - 2) = 2^k - 1$
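Here is a minimal Python sketch of maxMin that also counts element comparisons, so the count can be checked against the naive 2n - 3. Like the pseudocode, it assumes the number of elements is a power of two and at least 2; the comparison counter and the name `max_min` are additions for illustration.

```python
def max_min(s):
    """Return (max, min, comparisons) for a list whose length is a power of two >= 2."""
    if len(s) == 2:
        a, b = s
        return (a, b, 1) if a > b else (b, a, 1)     # one comparison settles both
    half = len(s) // 2
    max1, min1, c1 = max_min(s[:half])
    max2, min2, c2 = max_min(s[half:])
    largest = max1 if max1 > max2 else max2          # one comparison
    smallest = min1 if min1 < min2 else min2         # one comparison
    return largest, smallest, c1 + c2 + 2


print(max_min([3, 1, 4, 1, 5, 9, 2, 6]))   # (9, 1, 10)
```

For n = 8 the naive approach needs 2n - 3 = 13 comparisons.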
Note However… T(n) = O(n) !!! for both the naive and the divide-and-conquer algorithm For the curious!! Approximately 3n/2 – 2 comparisons are both necessary and sufficient to find the maximum and minimum of a set of n elements.
Dynamic Programming Another Paradigm… Given coins worth c1, c2, …, cn cents, Make up change for k cents, using the minimum number of coins of the above denominations. In order that this be always possible, We assume that c1 = 1
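Coins problem
The essence of this method is to record (in a table) the solutions to all smaller subproblems.
Let min(k) denote the minimum number of coins needed to make k cents of change, with min(0) = 0.
Then, $\min(k) = \min_{0 \le r \le k/2} \{\, \min(r) + \min(k - r) \,\}$ if we know min(1), min(2), …, min(k-1).
(The r = 0 term only applies when k is itself one of the denominations, in which case min(k) = 1.)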
Example c1 = 1 cent c2 = 5 cents c3 = 10 cents c4 = 25 cents min(1) = 1, and min(2) = min(1) + min(1) = 2 Already know the value of min(2), then min(3) = min(1) + min(2) = 1 + 2 = 3 Again, already know values of min(2) and min(3), so: min(4) = min{ min(0) + min(4), min(1) + min(3), min(2) + min(2) } = 4
Example
# of cents   Min   How solution is found?
1            1     min{min(1)+min(0)}
2            2     min{min(0)+min(2), min(1)+min(1)}
3            3     min{min(0)+min(3), min(1)+min(2)}
4            4     min{min(0)+min(4), min(1)+min(3), min(2)+min(2)}
5            1     min{min(0)+min(5), min(1)+min(4), min(2)+min(3)}
6            2     min{min(0)+min(6), min(1)+min(5), min(2)+min(4), min(3)+min(3)}
7            3     min{min(0)+min(7), min(1)+min(6), min(2)+min(5), min(3)+min(4)}
8            4     min{min(0)+min(8), min(1)+min(7), min(2)+min(6), min(3)+min(5), min(4)+min(4)}
9            5     min{min(0)+min(9), min(1)+min(8), min(2)+min(7), min(3)+min(6), min(4)+min(5)}
10           1     min{min(0)+min(10), min(1)+min(9), min(2)+min(8), min(3)+min(7), min(4)+min(6), min(5)+min(5)}
Dynamic Programming Solution Why don’t we use recursion? Quite inefficient !! What do we use instead? Dynamic Programming… The algorithm uses two arrays: coinsUsed: Stores the minimum number of coins needed to make change of k cents, k = 1,…, maxChange. lastCoin: Stores the last coin that was used by the minimum (stored in coinsUsed) It is used to go backwards and list the coin denominations that compose the solution (after the minimum has been found).
Coins
coinsUsed[0] ← 0; lastCoin[0] ← 1
for cents ← 1 to maxChange do              // maxChange = k
    minCoins ← cents
    newCoin ← 1
    for j ← 1 to diffCoins do              // diffCoins = n
        if (coins[j] > cents) continue     // Cannot use coin j
        if (coinsUsed[cents - coins[j]] + 1 < minCoins)
            minCoins ← coinsUsed[cents - coins[j]] + 1
            newCoin ← coins[j]
    coinsUsed[cents] ← minCoins
    lastCoin[cents] ← newCoin
Print "minimum number of coins:", coinsUsed[maxChange]
Time Complexity
O(n k), where n = number of coins of different denominations and k = amount of change we want to make
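The pseudocode above can be sketched in Python as follows, including the backward walk through lastCoin that lists the coins in the solution. As in the slides, it assumes a 1-cent coin is among the denominations; the function name `make_change` and the returned pair are choices made for this sketch.

```python
def make_change(coins, max_change):
    """Return (minimum number of coins, list of coins used) for max_change cents.
    Assumes 1 is among the denominations, so every amount can be made."""
    coins_used = [0] * (max_change + 1)
    last_coin = [1] * (max_change + 1)
    for cents in range(1, max_change + 1):
        min_coins = cents                 # worst case: all 1-cent coins
        new_coin = 1
        for coin in coins:
            if coin > cents:
                continue                  # cannot use this coin
            if coins_used[cents - coin] + 1 < min_coins:
                min_coins = coins_used[cents - coin] + 1
                new_coin = coin
        coins_used[cents] = min_coins
        last_coin[cents] = new_coin

    chosen = []                           # go backwards to list the coins
    remaining = max_change
    while remaining > 0:
        chosen.append(last_coin[remaining])
        remaining -= last_coin[remaining]
    return coins_used[max_change], chosen


count, chosen = make_change([1, 5, 10, 25], 63)
print("minimum number of coins:", count, "using", chosen)
```

The two nested loops run k and n times respectively, which is where an O(n k) running time comes from.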
Greedy Paradigm
An interesting question: Can we reach a global optimum by a process of incremental optimization?
We can, but it is not always guaranteed to succeed.
Coin change: Greedy does not always work!
Greedy strategy: take n1 of the largest coin, n2 of the second largest coin, …, nk of the smallest coin.
Counter-examples
Coin denominations: 1, 4, 5, 6 cents. Make change for 9 cents.
Greedy: 6 + 1 + 1 + 1 (four coins)
Optimal: 4 + 5 (just two coins)
Coin denominations: 1, 10 and 25 cents. Make change for 31 cents.
Greedy: 25 + 1 + 1 + 1 + 1 + 1 + 1, i.e., 7 coins required
Optimal solution: 10 + 10 + 10 + 1, i.e., 4 coins required
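Both counter-examples can be reproduced with a short Python sketch that runs the greedy strategy next to a small dynamic-programming table. The function names are illustrative, and `min_coins` assumes that a 1-cent coin is available.

```python
def greedy_change(coins, amount):
    """Take as many as possible of the largest coin, then the next largest, and so on."""
    used = []
    for coin in sorted(coins, reverse=True):
        count, amount = divmod(amount, coin)
        used += [coin] * count
    return used


def min_coins(coins, amount):
    """Optimal number of coins by dynamic programming (assumes 1 is in coins)."""
    best = [0] * (amount + 1)
    for cents in range(1, amount + 1):
        best[cents] = 1 + min(best[cents - c] for c in coins if c <= cents)
    return best[amount]


for coins, amount in (([1, 4, 5, 6], 9), ([1, 10, 25], 31)):
    greedy = greedy_change(coins, amount)
    print(coins, amount, "greedy:", len(greedy), "optimal:", min_coins(coins, amount))
```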
Generating Permutations Problem: Give an algorithm that computes the different ways of making k cents in change. ?
Generating Permutations
Write a recursive program that generates all permutations of 1, 2, 3, …, n
For example, if n = 3, the output should be:
1 2 3
2 1 3
1 3 2
3 1 2
3 2 1
2 3 1
O(n!)
Someone asked about O(n!)
Generating all the permutations of a set of n elements takes factorial time, since there are n! of them.
The Traveling Salesman problem has a naïve solution that is O(n!), but there is a dynamic programming solution that is $O(n^2 2^n)$
The following is a list of common types of orders and their names:
Notation           Name
O(1)               Constant
O(log(n))          Logarithmic
O(log(log(n)))     Double logarithmic
O(n)               Linear
O(n log(n))        Loglinear, Linearithmic, Quasilinear or Supralinear
O(n^2)             Quadratic
O(n^3)             Cubic
O(n^c)             Polynomial (different class for each c > 1)
O(c^n)             Exponential (different class for each c > 1)
O(n!)              Factorial
O(n^n)             Yuck!
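One possible recursive approach, sketched in Python: pick each element in turn as the first one, then permute the rest. This version produces the permutations in lexicographic order, which differs from the order shown in the sample output, so it should be read as one illustration rather than as the intended solution.

```python
def permutations(items):
    """Return all permutations of the list items, each as a list."""
    if len(items) <= 1:                       # base case: nothing left to rearrange
        return [list(items)]
    result = []
    for i, first in enumerate(items):
        rest = items[:i] + items[i + 1:]      # everything except the chosen element
        for perm in permutations(rest):
            result.append([first] + perm)
    return result


for perm in permutations([1, 2, 3]):
    print(*perm)
```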