CS 341: Algorithms Trevor Brown trevor.brown@uwaterloo.ca DC 2338, Office hour M3-4pm
This time: longest common subsequence via DP (partially covered); memoization vs. dynamic programming; minimum length triangulation via DP
Problem: Longest Common Subsequence (LCS)
Examples
X = aaaaa, Y = bbbbb: Z = LCS(X,Y) = ε (the empty sequence)
X = abcde, Y = bcd: Z = LCS(X,Y) = bcd
X = abcde, Y = labef: Z = LCS(X,Y) = abe
Thinking about subproblems
Entire problem: the number of characters in LCS(X,Y). How to reduce the problem size? Reduce the size of X or Y. Define X' and Y' as follows:
$X = x_1 x_2 x_3 x_4 \dots x_{m-1} x_m$ and $X' = x_1 x_2 x_3 x_4 \dots x_{m-1}$
$Y = y_1 y_2 y_3 y_4 \dots y_{n-1} y_n$ and $Y' = y_1 y_2 y_3 y_4 \dots y_{n-1}$
Consider an optimal solution Z
Can we express Z in terms of X' and Y' instead of X and Y? By definition, Z = LCS(X,Y). Write $Z = z_1 z_2 \dots z_{\ell-1} z_\ell$, where X' is X without its last character $x_m$ and Y' is Y without its last character $y_n$.
Consider an optimal solution Z: $z_\ell$ matches both
Suppose $z_\ell$ matches both $x_m$ and $y_n$. Then $Z = LCS(X', Y') + z_\ell$. Here $x_m$ is consumed by being matched with $y_n$, and $y_n$ is consumed by being matched with $x_m$, so the rest of Z must come from X' and Y'.
Consider an optimal solution Z: $z_\ell$ matches only $x_m$
Suppose $z_\ell$ matches only $x_m$ (so $x_m \ne y_n$). Then $Z = LCS(X, Y')$. Here $x_m$ may still be needed by Z (it might be matched with something in Y'), but $y_n$ is not needed by Z. Remove it to shrink the problem size!
Consider an optimal solution Z: $z_\ell$ matches only $y_n$
Suppose $z_\ell$ matches only $y_n$ (so $x_m \ne y_n$). Then $Z = LCS(X', Y)$. Here $x_m$ is not needed by Z, but $y_n$ may still be needed by Z.
Consider an optimal solution Z: $z_\ell$ matches neither
Suppose $z_\ell$ matches neither $x_m$ nor $y_n$. Take $Z = LCS(X', Y')$. Neither $x_m$ nor $y_n$ is needed by Z. Note that $x_m \ne y_n$, or else we could improve Z by adding them!
Four cases
Let $X_i = (x_1, \dots, x_i)$, $Y_j = (y_1, \dots, y_j)$ and let $c[i,j] = |LCS(X_i, Y_j)|$ be the length of their longest common subsequence.
Case 1: $z_\ell$ matches both (so $x_m = y_n$): $Z = LCS(X', Y') + z_\ell$
Case 2: $z_\ell$ matches only $x_m$ (so $x_m \ne y_n$): $Z = LCS(X, Y')$
Case 3: $z_\ell$ matches only $y_n$ (so $x_m \ne y_n$): $Z = LCS(X', Y)$
Case 4: $z_\ell$ matches neither (recall $x_m \ne y_n$): $Z = LCS(X', Y')$
We don't know $z_\ell$! How to identify case 1 vs 2-4? Check whether $x_m = y_n$. How to differentiate between cases 2-4 without knowing $z_\ell$? Try all 3 possibilities in the recurrence and maximize length!
In-class exercise: derive the recurrence for $c[i,j]$ (part 1) and give pseudocode to solve the problem (part 2)
In-class exercise Part 1: derive $c[i,j]$
Using the four cases, with $X_i = (x_1, \dots, x_i)$, $Y_j = (y_1, \dots, y_j)$ and $c[i,j] = |LCS(X_i, Y_j)|$, fill in the blanks:
$$c[i,j] = \begin{cases} \text{???} & \text{if } i = 0 \text{ or } j = 0 \\ \text{???} & \text{if } i,j \ge 1 \text{ and } x_i = y_j \\ \text{???} & \text{if } i,j \ge 1 \text{ and } x_i \ne y_j \end{cases}$$
In-class exercise Part 1: derive $c[i,j]$ (solution)
$$c[i,j] = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0 \\ c[i-1,j-1] + 1 & \text{if } i,j \ge 1 \text{ and } x_i = y_j \\ \max\{c[i,j-1],\, c[i-1,j],\, c[i-1,j-1]\} & \text{if } i,j \ge 1 \text{ and } x_i \ne y_j \end{cases}$$
Can simplify! Observe that $c[i-1,j-1] \le c[i,j-1]$, because the former only has a subset of the input to the latter. Therefore $c[i-1,j-1]$ can never exceed the other two terms, and it can be dropped from the max.
In-class exercise Part 1: derive $c[i,j]$ (simplified)
$$c[i,j] = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0 \\ c[i-1,j-1] + 1 & \text{if } i,j \ge 1 \text{ and } x_i = y_j \\ \max\{c[i,j-1],\, c[i-1,j]\} & \text{if } i,j \ge 1 \text{ and } x_i \ne y_j \end{cases}$$
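Suppose X = gdvegta and Y = gvcekst
Fill in the table of $c[i,j]$ values using the recurrence:
$$c[i,j] = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0 \\ c[i-1,j-1] + 1 & \text{if } i,j \ge 1 \text{ and } x_i = y_j \\ \max\{c[i,j-1],\, c[i-1,j]\} & \text{if } i,j \ge 1 \text{ and } x_i \ne y_j \end{cases}$$
[Figure: table of $c[i,j]$ for X and Y, with selected cells labelled Question 1 through Q7.]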
Exercise part 2: pseudocode
$$c[i,j] = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0 \\ c[i-1,j-1] + 1 & \text{if } i,j \ge 1 \text{ and } x_i = y_j \\ \max\{c[i,j-1],\, c[i-1,j]\} & \text{if } i,j \ge 1 \text{ and } x_i \ne y_j \end{cases}$$
Give pseudocode to compute c[i,j] for all i,j and return the length of LCS(X,Y). Assume c[] already exists.
Start here: ???
Remaining code: ???
Complexity? Space and time are both $\Theta(nm)$.
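One possible answer to part 2, written as a minimal Python sketch rather than the pseudocode developed in class. The function names `lcs_length_table` and `lcs_length` are illustrative, and the sketch allocates the table itself instead of assuming c[] already exists. Strings are 0-indexed in Python, so $x_i$ corresponds to `x[i - 1]`.

```python
def lcs_length_table(x: str, y: str) -> list:
    """Fill c[i][j] = |LCS(X_i, Y_j)| bottom-up, row by row."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay 0: the base case i = 0 or j = 0.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                # x_i = y_j: extend the LCS of the two shorter prefixes.
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                # x_i != y_j: drop the last character of X or of Y.
                c[i][j] = max(c[i][j - 1], c[i - 1][j])
    return c


def lcs_length(x: str, y: str) -> int:
    """Return the length of LCS(X, Y), read from the bottom-right cell."""
    return lcs_length_table(x, y)[len(x)][len(y)]
```

Each of the $(m+1)(n+1)$ cells is filled once in constant time, which matches the $\Theta(nm)$ time and space stated on the slide.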
Computing the LCS (not its length)
[Figure: the table of $c[i,j]$ values, with axes $i$ and $j$, computed from the recurrence for $c[i,j]$.]
Saving the direction to the predecessor subproblem in $\pi$
For each cell, record which case of the recurrence for $c[i,j]$ produced its value. A +1 step means $x_i$ is in the LCS! If there are multiple possible sequences with the same length |LCS(X,Y)|, then $x_i$ is in some such sequence.
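How to obtain LCS = gvet from this table?
Example: follow the predecessors back from the last cell, prepending a character at each +1 step: seq = t, seq = et, seq = vet, seq = gvet. Done: seq = gvet.
[Figure: trace-back path through the table, with annotations marking which characters are in the LCS; for example, the final "a" of X is not.]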
Following predecessors to compute the LCS
Complexity of this trace-back (recall $|X| = m$, $|Y| = n$): space $O(nm)$, time $O(n+m)$.
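The trace-back can be sketched in Python as follows. This is an illustration under assumptions, not the code shown in lecture: the function name `lcs`, the direction labels `"diag"`, `"left"` and `"up"`, and the tie-breaking rule (prefer `"left"` when both neighbours are equal) are all choices made here. A different tie-breaking rule may return a different LCS of the same length.

```python
def lcs(x: str, y: str) -> str:
    """Return one longest common subsequence of x and y."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    # pi[i][j] records which subproblem c[i][j] was computed from.
    pi = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                pi[i][j] = "diag"  # the +1 case: x_i is in the LCS
            elif c[i][j - 1] >= c[i - 1][j]:
                c[i][j] = c[i][j - 1]
                pi[i][j] = "left"
            else:
                c[i][j] = c[i - 1][j]
                pi[i][j] = "up"

    # Trace back from the bottom-right cell. Every step decreases i or j
    # (or both), so the loop runs at most m + n times.
    seq = []
    i, j = m, n
    while i > 0 and j > 0:
        if pi[i][j] == "diag":
            seq.append(x[i - 1])
            i -= 1
            j -= 1
        elif pi[i][j] == "left":
            j -= 1
        else:
            i -= 1
    # Characters were collected last-to-first, so reverse them.
    return "".join(reversed(seq))
```

For X = gdvegta and Y = gvcekst this collects t, e, v, g in that order, mirroring the seq = t, et, vet, gvet progression in the example.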
Memoization: an alternative to DP
Example: using memoization to compute Fibonacci numbers efficiently
Comparing with traditional recursion
Memoization reduces this tree to a line with right-hanging leaves. Number of recursive calls = $O(n)$ instead of $\sim 2^n$. If M[n] is already computed, don't recurse!
[Figure: recursion tree for Fibonacci. Nodes computed for the first time are marked "Done!", repeated nodes are marked "Already done!", and the subtrees below them are marked "Calls not needed because of memoization".]
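A minimal Python sketch of the memoized recursion. The table name `M` follows the slide; the convention $F_0 = 0$, $F_1 = 1$ and the use of a dictionary for `M` are assumptions made here.

```python
from typing import Dict, Optional


def fib(n: int, M: Optional[Dict[int, int]] = None) -> int:
    """Return the n-th Fibonacci number, assuming F_0 = 0 and F_1 = 1."""
    if M is None:
        M = {}  # fresh memo table for each top-level call
    if n in M:
        # M[n] is already computed: don't recurse.
        return M[n]
    if n <= 1:
        result = n
    else:
        result = fib(n - 1, M) + fib(n - 2, M)
    M[n] = result  # save the answer before returning it
    return result
```

With the table in place, each value from 0 to n is computed once, and every other call returns immediately from `M`, which is the "line with right-hanging leaves" shape described above.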
Problem: minimum length triangulation
Input: $n$ points $q_1, \dots, q_n$ in 2D space that form a convex $n$-gon $P$. Input points are sorted in clockwise order around the center of $P$.
Find: a triangulation of $P$ such that the sum of the perimeters of the $n-2$ triangles is minimized.
Output: the sum of the perimeters of the triangles in that triangulation of $P$.
[Example input on blackboard]
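How hard is this problem?
How many triangulations are there? The number of triangulations of a convex $n$-gon is the $(n-2)$-th Catalan number:
$$C_{n-2} = \frac{1}{n-1}\binom{2n-4}{n-2}$$
It can be shown that $C_{n-2} \in \Theta\left(4^n / (n-2)^{3/2}\right)$, so trying every triangulation takes exponential time.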
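To get a feel for how quickly this count grows, the formula can be evaluated directly. This is a small sketch; the function name `num_triangulations` is illustrative, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def num_triangulations(n: int) -> int:
    """Number of triangulations of a convex n-gon: the Catalan number C_{n-2}."""
    if n < 3:
        raise ValueError("a polygon needs at least 3 vertices")
    k = n - 2
    # C_k = binom(2k, k) / (k + 1); with k = n - 2 this is
    # binom(2n - 4, n - 2) / (n - 1). The division is always exact.
    return comb(2 * k, k) // (k + 1)
```

For example, a triangle has 1 triangulation, a quadrilateral has 2, a pentagon has 5, and a hexagon has 14.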
Problem decomposition
Recurrence relation
How to fill in the table? [blackboard]
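The recurrence and table-filling order were worked out on the blackboard and are not recorded in these slides. As a hedged illustration only, here is one standard interval formulation, which may differ from the one derived in class. Assume `cost[i][j]` is the minimum total perimeter for triangulating the sub-polygon on vertices $q_i, \dots, q_j$; the edge $q_i q_j$ must belong to some triangle $q_i q_k q_j$, and the sketch tries every such $k$. The names `min_triangulation`, `cost` and `perimeter` are hypothetical, and points are 0-indexed.

```python
from math import dist


def min_triangulation(points: list) -> float:
    """Minimum sum of triangle perimeters over all triangulations of a
    convex polygon whose vertices are given in order around the boundary."""
    n = len(points)
    if n < 3:
        raise ValueError("a polygon needs at least 3 vertices")

    def perimeter(i: int, k: int, j: int) -> float:
        a, b, c = points[i], points[k], points[j]
        return dist(a, b) + dist(b, c) + dist(c, a)

    # cost[i][j] = 0 when j - i < 2 (fewer than 3 vertices, no triangle).
    cost = [[0.0] * n for _ in range(n)]
    # Fill by increasing gap j - i, so both smaller subproblems
    # cost[i][k] and cost[k][j] are ready when needed.
    for gap in range(2, n):
        for i in range(n - gap):
            j = i + gap
            cost[i][j] = min(
                cost[i][k] + cost[k][j] + perimeter(i, k, j)
                for k in range(i + 1, j)
            )
    return cost[0][n - 1]
```

Under these assumptions the table has $\Theta(n^2)$ entries and each takes $O(n)$ time to fill, giving $\Theta(n^3)$ time overall, compared with the exponential number of triangulations a brute-force search would examine.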
Next time: graph algorithms. Maybe: a big-picture overview of the algorithmic design paradigms we've seen so far (brute force, divide and conquer, dynamic programming, greedy). Pros/cons of each? When to use each?