Longest Common Subsequence

Slides:



Advertisements
Similar presentations
Dynamic Programming Nithya Tarek. Dynamic Programming Dynamic programming solves problems by combining the solutions to sub problems. Paradigms: Divide.
Advertisements

Lecture 8: Dynamic Programming Shang-Hua Teng. Longest Common Subsequence Biologists need to measure how similar strands of DNA are to determine how closely.
CPSC 335 Dynamic Programming Dr. Marina Gavrilova Computer Science University of Calgary Canada.
Overview What is Dynamic Programming? A Sequence of 4 Steps
Algorithms Dynamic programming Longest Common Subsequence.
COMP8620 Lecture 8 Dynamic Programming.
David Luebke 1 5/4/2015 CS 332: Algorithms Dynamic Programming Greedy Algorithms.
1 Longest Common Subsequence (LCS) Problem: Given sequences x[1..m] and y[1..n], find a longest common subsequence of both. Example: x=ABCBDAB and y=BDCABA,
November 7, 2005Copyright © by Erik D. Demaine and Charles E. Leiserson Dynamic programming Design technique, like divide-and-conquer. Example:
© 2004 Goodrich, Tamassia Dynamic Programming1. © 2004 Goodrich, Tamassia Dynamic Programming2 Matrix Chain-Products (not in book) Dynamic Programming.
Lecture 8: Dynamic Programming Shang-Hua Teng. First Example: n choose k Many combinatorial problems require the calculation of the binomial coefficient.
Dynamic Programming 0-1 Knapsack These notes are taken from the notes by Dr. Steve Goddard at
1 Dynamic Programming Jose Rolim University of Geneva.
Lecture 7 Topics Dynamic Programming
Longest Common Subsequence
Dynamic Programming – Part 2 Introduction to Algorithms Dynamic Programming – Part 2 CSE 680 Prof. Roger Crawfis.
ADA: 7. Dynamic Prog.1 Objective o introduce DP, its two hallmarks, and two major programming techniques o look at two examples: the fibonacci.
1 0-1 Knapsack problem Dr. Ying Lu RAIK 283 Data Structures & Algorithms.
Greedy Methods and Backtracking Dr. Marina Gavrilova Computer Science University of Calgary Canada.
6/4/ ITCS 6114 Dynamic programming Longest Common Subsequence.
Introduction to Algorithms Jiafen Liu Sept
2/19/ ITCS 6114 Dynamic programming 0-1 Knapsack problem.
CSC 213 Lecture 19: Dynamic Programming and LCS. Subsequences (§ ) A subsequence of a string x 0 x 1 x 2 …x n-1 is a string of the form x i 1 x.
9/27/10 A. Smith; based on slides by E. Demaine, C. Leiserson, S. Raskhodnikova, K. Wayne Adam Smith Algorithm Design and Analysis L ECTURE 16 Dynamic.
TU/e Algorithms (2IL15) – Lecture 4 1 DYNAMIC PROGRAMMING II
CS583 Lecture 12 Jana Kosecka Dynamic Programming Longest Common Subsequence Matrix Chain Multiplication Greedy Algorithms Many slides here are based on.
Dynamic Programming Csc 487/687 Computing for Bioinformatics.
Dynamic Programming Typically applied to optimization problems
Merge Sort 5/28/2018 9:55 AM Dynamic Programming Dynamic Programming.
Many slides here are based on D. Luebke slides
Algorithm Design and Analysis (ADA)
David Meredith Dynamic programming David Meredith
CS 3343: Analysis of Algorithms
Advanced Algorithms Analysis and Design
Least common subsequence:
JinJu Lee & Beatrice Seifert CSE 5311 Fall 2005 Week 10 (Nov 1 & 3)
Dynamic Programming CISC4080, Computer Algorithms CIS, Fordham Univ.
CS200: Algorithm Analysis
Chapter 8 Dynamic Programming.
CSCE 411 Design and Analysis of Algorithms
Dynamic Programming.
CS Algorithms Dynamic programming 0-1 Knapsack problem 12/5/2018.
Greedy Algorithms Many optimization problems can be solved more quickly using a greedy approach The basic principle is that local optimal decisions may.
Dynamic Programming Dr. Yingwu Zhu Chapter 15.
CS 3343: Analysis of Algorithms
Dynamic Programming 1/15/2019 8:22 PM Dynamic Programming.
Merge Sort Dynamic Programming
Dynamic Programming Dynamic Programming 1/15/ :41 PM
Dynamic Programming.
CS6045: Advanced Algorithms
Dynamic Programming.
Lecture 8. Paradigm #6 Dynamic Programming
Dynamic programming & Greedy algorithms
CS 332: Algorithms Amortized Analysis Continued
Introduction to Algorithms: Dynamic Programming
Dynamic Programming.
CSC 413/513- Intro to Algorithms
Algorithm Design Techniques Greedy Approach vs Dynamic Programming
Longest Common Subsequence
Dynamic Programming II DP over Intervals
Algorithms and Data Structures Lecture X
Merge Sort 4/28/ :13 AM Dynamic Programming Dynamic Programming.
Longest common subsequence (LCS)
Lecture 5 Dynamic Programming
Analysis of Algorithms CS 477/677
0-1 Knapsack problem.
Longest Common Subsequence
Dynamic Programming.
Longest Common Subsequence
Algorithm Course Dr. Aref Rashad
Presentation transcript:

Longest Common Subsequence CS 332 - Algorithms Dynamic programming Longest Common Subsequence 2/22/2019

Dynamic programming It is used, when the solution can be recursively described in terms of solutions to subproblems (optimal substructure) Algorithm finds solutions to subproblems and stores them in memory for later use More efficient than “brute-force methods”, which solve the same subproblems over and over again 2/22/2019

Longest Common Subsequence (LCS) Application: comparison of two DNA strings Ex: X= {A B C B D A B }, Y= {B D C A B A} Longest Common Subsequence: X = A B C B D A B Y = B D C A B A Brute force algorithm would compare each subsequence of X with the symbols in Y 2/22/2019

LCS Algorithm if |X| = m, |Y| = n, then there are 2m subsequences of x; we must compare each with Y (n comparisons) So the running time of the brute-force algorithm is O(n 2m) Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the final solution. Subproblems: “find LCS of pairs of prefixes of X and Y” 2/22/2019

LCS Algorithm First we’ll find the length of LCS. Later we’ll modify the algorithm to find LCS itself. Define Xi, Yj to be the prefixes of X and Y of length i and j respectively Define c[i,j] to be the length of LCS of Xi and Yj Then the length of LCS of X and Y will be c[m,n] 2/22/2019

LCS recursive solution We start with i = j = 0 (empty substrings of x and y) Since X0 and Y0 are empty strings, their LCS is always empty (i.e. c[0,0] = 0) LCS of empty string and any other string is empty, so for every i and j: c[0, j] = c[i,0] = 0 2/22/2019

LCS recursive solution When we calculate c[i,j], we consider two cases: First case: x[i]=y[j]: one more symbol in strings X and Y matches, so the length of LCS Xi and Yj equals to the length of LCS of smaller strings Xi-1 and Yi-1 , plus 1 2/22/2019

LCS recursive solution Second case: x[i] != y[j] As symbols don’t match, our solution is not improved, and the length of LCS(Xi , Yj) is the same as before (i.e. maximum of LCS(Xi, Yj-1) and LCS(Xi-1,Yj) Why not just take the length of LCS(Xi-1, Yj-1) ? 2/22/2019

LCS Length Algorithm LCS-Length(X, Y) 1. m = length(X) // get the # of symbols in X 2. n = length(Y) // get the # of symbols in Y 3. for i = 1 to m c[i,0] = 0 // special case: Y0 4. for j = 1 to n c[0,j] = 0 // special case: X0 5. for i = 1 to m // for all Xi 6. for j = 1 to n // for all Yj 7. if ( Xi == Yj ) 8. c[i,j] = c[i-1,j-1] + 1 9. else c[i,j] = max( c[i-1,j], c[i,j-1] ) 10. return c 2/22/2019

LCS Example We’ll see how LCS algorithm works on the following example: X = ABCB Y = BDCAB What is the Longest Common Subsequence of X and Y? LCS(X, Y) = BCB X = A B C B Y = B D C A B 2/22/2019

LCS Example (0) ABCB BDCAB X = ABCB; m = |X| = 4 j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 B 2 3 C 4 B X = ABCB; m = |X| = 4 Y = BDCAB; n = |Y| = 5 Allocate array c[5,4] 2/22/2019

LCS Example (1) ABCB BDCAB for i = 1 to m c[i,0] = 0 j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 B 2 3 C 4 B for i = 1 to m c[i,0] = 0 for j = 1 to n c[0,j] = 0 2/22/2019

LCS Example (2) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 B 2 3 C A 1 B 2 3 C 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (3) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 B 2 3 C A 1 B 2 3 C 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (4) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 B 2 3 A 1 1 B 2 3 C 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (5) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 A 1 1 1 B 2 3 C 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (6) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 A 1 1 1 B 2 1 3 C 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (7) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 A 1 1 1 B 2 1 1 1 1 3 C 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (8) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 A 1 1 1 B 2 1 1 1 1 2 3 C 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (10) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (11) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 2 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (12) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 2 2 2 4 B if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (13) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 2 2 2 4 B 1 if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (14) ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 2 2 2 4 B 1 1 2 2 if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Example (15) 3 ABCB BDCAB j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 2 2 2 3 4 B 1 1 2 2 if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1 else c[i,j] = max( c[i-1,j], c[i,j-1] ) 2/22/2019

LCS Algorithm Running Time LCS algorithm calculates the values of each entry of the array c[m,n] So what is the running time? O(m*n) since each c[i,j] is calculated in constant time, and there are m*n elements in the array 2/22/2019

How to find actual LCS So far, we have just found the length of LCS, but not LCS itself. We want to modify this algorithm to make it output Longest Common Subsequence of X and Y Each c[i,j] depends on c[i-1,j] and c[i,j-1] or c[i-1, j-1] For each c[i,j] we can say how it was acquired: For example, here c[i,j] = c[i-1,j-1] +1 = 2+1=3 2 2 2 3 2/22/2019

How to find actual LCS - continued Remember that So we can start from c[m,n] and go backwards Whenever c[i,j] = c[i-1, j-1]+1, remember x[i] (because x[i] is a part of LCS) When i=0 or j=0 (i.e. we reached the beginning), output remembered letters in reverse order 2/22/2019

Finding LCS 3 j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 2 2 2 3 4 B 1 1 2 2 2/22/2019

Finding LCS (2) 3 B C B LCS (reversed order): B C B j 0 1 2 3 4 5 i Yj B D C A B Xi A 1 1 1 B 2 1 1 1 1 2 3 C 1 1 2 2 2 3 4 B 1 1 2 2 B C B LCS (reversed order): B C B (this string turned out to be a palindrome) LCS (straight order): 2/22/2019

Knapsack problem Given some items, pack the knapsack to get the maximum total value. Each item has some weight and some value. Total weight that we can carry is no more than some fixed number W. So we must consider weights of items as well as their value. Item # Weight Value 1 1 8 2 3 6 3 5 5 2/22/2019

Knapsack problem There are two versions of the problem: (1) “0-1 knapsack problem” and (2) “Fractional knapsack problem” (1) Items are indivisible; you either take an item or not. Solved with dynamic programming (2) Items are divisible: you can take any fraction of an item. Solved with a greedy algorithm. 2/22/2019