Download presentation
Presentation is loading. Please wait.
1
Longest common subsequence (LCS)
2
The longest common subsequence (LCS) problem
A string : A = b a c a d A subsequence of A: deleting 0 or more symbols from A (not necessarily consecutive). e.g. ad, ac, bac, acad, bacad, bcd. Common subsequences of A = b a c a d and B = a c c b a d c b : ad, ac, bac, acad. The longest common subsequence (LCS) of A and B: a c a d. Subsequence need not be consecutive, but must be in order.
3
Longest common subsequence
INPUT: two strings OUTPUT: longest common subsequence ACTGAACTCTGTGCACT TGACTCAGCACAAAAAC
4
Longest common subsequence
INPUT: two strings OUTPUT: longest common subsequence ACTGAACTCTGTGCACT TGACTCAGCACAAAAAC
5
Longest common subsequence
INPUT: two strings OUTPUT: longest common subsequence ACTGAACTCTGTGCACT TGACTCAGCACAAAAAC
6
Longest common subsequence
INPUT: two strings OUTPUT: longest common subsequence ACTGAACTCTGTGCACT TGACTCAGCACAAAAAC
7
Longest common subsequence
INPUT: two strings OUTPUT: longest common subsequence ACTGAACTCTGTGCACT TGACTCAGCACAAAAAC
8
Naïve /Brute Force Algorithm
For every subsequence of X, check whether it’s a subsequence of Y . X has 2m subsequences. Each subsequence takes Θ(n) time to check: scan Y for first letter, for second, and so on. Time: Θ(n2m).
9
Optimal Substructure Theorem
Let Z = z1, , zk be any LCS of X x1, . . , xk and Y y1, . . , yk . 1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1. 2. If xm yn, then either zk xm and Z is an LCS of Xm-1 and Y . or zk yn and Z is an LCS of X and Yn-1.
10
Optimal Substructure Proof: (case 1: xm = yn)
Theorem Let Z = z1, , zk be any LCS of X and Y . 1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1. Proof: (case 1: xm = yn) Any sequence Z’ that does not end in xm = yn can be made longer by adding xm = yn to the end. Therefore, longest common subsequence (LCS) Z must end in xm = yn. Zk-1 is a common subsequence of Xm-1 and Yn-1, and there is no longer CS of Xm-1 and Yn-1, or Z would not be an LCS.
11
Optimal Substructure Proof: (case 2: xm yn, and zk xm)
Theorem Let Z = z1, , zk be any LCS of X and Y . 2. If xm yn, then either zk xm and Z is an LCS of Xm-1 and Y . Proof: (case 2: xm yn, and zk xm) Since Z does not end in xm, Z is a common subsequence of Xm-1 and Y, and there is no longer CS of yn-1 and x, or Z would not be an LCS.
12
Optimal Substructure Proof: (case 2: xm yn, and zk yn)
Theorem Let Z = z1, , zk be any LCS of X and Y . 2. If xm yn, then zk yn and Z is an LCS of X and Yn-1. Proof: (case 2: xm yn, and zk yn) Since Z does not end in yn, Z is a common subsequence of yn-1 and x, and there is no longer CS of Xm-1 and Y, or Z would not be an LCS.
13
Recursive Solution Sequences x1,…,xn, and y1,…,ym
LCS(i,j) = length of a longest common subsequence of x1,…,xi and y1,…,yj if xi = yj then LCS(i,j) =
14
Recursive Solution Sequences x1,…,xn, and y1,…,ym
LCS(i,j) = length of a longest common subsequence of x1,…,xi and y1,…,yj if xi = yj then LCS(i,j) = 1 + LCS(i-1,j-1)
15
Recursive Solution Sequences x1,…,xn, and y1,…,ym
LCS(i,j) = length of a longest common subsequence of x1,…,xi and y1,…,yj if xi yj then LCS(i,j) = max (LCS(i-1,j),LCS(i,j-1)) xi and yj cannot both be in LCS
16
Recursive Solution Sequences x1,…,xn, and y1,…,ym
LCS(i,j) = length of a longest common subsequence of x1,…,xi and y1,…,yj if xi = yj then LCS(i,j) = 1 + LCS(i-1,j-1) if xi yj then LCS(i,j) = max (LCS(i-1,j),LCS(i,j-1))
17
Recursive Solution Define c[i, j] = length of LCS of Xi and Yj .
We want c[m,n].
18
Constructing an LCS
19
LCS Example What is the Longest Common Subsequence of X and Y?
We’ll see how LCS algorithm works on the following example: X = ABCB Y = BDCAB What is the Longest Common Subsequence of X and Y? LCS(X, Y) = BCB X = A B C B Y = B D C A B
20
ABCB BDCAB X = ABCB; m = |X| = 4 Y = BDCAB; n = |Y| = 5
j i Yj B D C A B Xi A 1 B 2 3 C 4 B X = ABCB; m = |X| = 4 Y = BDCAB; n = |Y| = 5 Allocate array c[5,4]
21
ABCB BDCAB for i = 1 to m c[i,0] = 0 for j = 1 to n c[0,j] = 0
Yj B D C A B Xi A 1 B 2 3 C 4 B for i = 1 to m c[i,0] = 0 for j = 1 to n c[0,j] = 0
22
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 B 2 3 C if ( Xi == Yj )
A 1 B 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 4 B
23
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 B 2 3 C if ( Xi == Yj )
A 1 B 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 4 B
24
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 B 2 3 C
A 1 1 B 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 4 B
25
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 3 C
A 1 1 1 B 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 4 B
26
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 3 C
A 1 1 1 B 2 1 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 4 B
27
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 3 C
A 1 1 1 B 2 1 1 1 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 4 B
28
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 3 C
A 1 1 1 B 2 1 1 1 1 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 4 B
29
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C
A 1 1 1 B 2 1 1 1 1 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 4 B
30
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C
A 1 1 1 B 2 1 1 1 1 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 1 1 4 B
31
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C
A 1 1 1 B 2 1 1 1 1 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 1 1 2 4 B
32
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C
A 1 1 1 B 2 1 1 1 1 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 1 1 2 2 4 B
33
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C
A 1 1 1 B 2 1 1 1 1 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 1 1 2 2 2 4 B
34
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C
A 1 1 1 B 2 1 1 1 1 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 1 1 2 2 2 4 B 1
35
ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C
A 1 1 1 B 2 1 1 1 1 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 1 1 2 2 2 4 B 1 1 2 2
36
3 ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C
A 1 1 1 B 2 1 1 1 1 2 3 C if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 b[i,j] = “ ” if ( c[i-1,j] >= c[i,j] ) c[i,j] = c[i-1,j] b[i,j] = “ ” else c[i,j] = c[i,j-1] 1 1 2 2 2 3 4 B 1 1 2 2
37
Computing the Length of an LCS
38
Analysis of an LCS O(m*n) Running time and memory: O(mn) and O(mn).
since each c[i,j] is calculated in constant time, and there are m*n elements in the array Running time and memory: O(mn) and O(mn).
39
How to find actual LCS 3 j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 2 B 1
Xi A 1 1 1 2 B 1 1 1 1 2 3 C 1 1 2 2 2 4 3 B 1 1 2 2
40
Finding LCS 3 j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3
A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 2 2 2 3 4 B 1 1 2 2
41
How to find actual LCS 3 j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 2 B 1
Xi A 1 1 1 2 B 1 1 1 1 2 3 C 1 1 2 2 2 4 3 B 1 1 2 2 Initial call is PRINT-LCS (b, X,m, n). When b[i, j ] = , we have extended LCS by one character. So LCS = entries with in them. Time: O(m+n) (no of characters to be checked)
42
3 B C B LCS (reversed order): B C B
j i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 2 2 2 3 4 B 1 1 2 2 B C B LCS (reversed order): B C B (this string turned out to be a palindrome) LCS (straight order):
43
LCS Example What is the Longest Common Subsequence of X and Y?
We’ll see how LCS algorithm works on the following example: X = PRESIDENT Y = PROVIDENCE What is the Longest Common Subsequence of X and Y?
45
Output: PRIDEN
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.