Linear space LCS algorithm


Divide-and-Conquer: Linear Space LCS Algorithm
2019/5/22 chapter25

The Complexity of the Normal Algorithm for LCS

Time complexity: O(nm), where n and m are the lengths of the two sequences.
Space complexity: O(nm). For some applications the sequence length can be 10,000, so the table needs 100,000,000 entries, which may be too large for some computers. For most computers it is hard to handle sequences of length 100,000 this way.
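As a concrete reference point, here is a sketch of the normal quadratic-space algorithm described above (function and variable names are mine, not from the slides): it fills the full table c and then backtracks through it to recover one LCS.

```python
def lcs_full_table(s1, s2):
    """Standard LCS: O(nm) time and O(nm) space (the whole table is kept)."""
    n, m = len(s2), len(s1)            # rows indexed by s2, columns by s1
    c = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s2[i - 1] == s1[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Backtrack from c[n][m] to recover one LCS.
    out, i, j = [], n, m
    while i > 0 and j > 0:
        if s2[i - 1] == s1[j - 1]:     # diagonal match: this symbol is in the LCS
            out.append(s1[j - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1                     # value came from the cell above
        else:
            j -= 1                     # value came from the cell to the left
    return "".join(reversed(out))
```

On the slides' example, lcs_full_table("ABCABCBA", "BACBBA") returns "BACBA", the length-5 LCS of the worked matrix.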

Is it possible to use O(n+m) space? Yes.

Main idea: as soon as a cell is no longer needed for computing the values of other cells, release it. O(n) space is enough when both sequences have length n.
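The length-only version of this idea keeps just two rows of the table at a time. A minimal sketch (names are mine):

```python
def lcs_length(s1, s2):
    """LCS length in O(n+m) space: only two rows of the table live at once."""
    prev = [0] * (len(s1) + 1)         # row i-1 of the c table
    for ch in s2:
        cur = [0]                      # row i, built left to right
        for j, a in enumerate(s1, 1):
            cur.append(prev[j - 1] + 1 if ch == a else max(prev[j], cur[j - 1]))
        prev = cur                     # row i-1 is released here
    return prev[-1]
```

For the slides' example, lcs_length("ABCABCBA", "BACBBA") returns 5.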

Original method for LCS

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0  0  1  1  1  1  1  1  1
A  0  1  1  1  2  2  2  2  2
C  0  1  1  2  2  2  3  3  3
B  0  1  2  2  2  3  3  4  4
B  0  1  2  2  2  3  3  4  4
A  0  1  2  2  3  3  3  4  5

Steps for computing c[n,m], the last cell in the matrix.

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0
A  0
C  0
B  0
B  0
A  0

Initial values.

Steps for computing c[n,m]

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0  0  1  1  1  1  1  1  1
A  0
C  0
B  0
B  0
A  0

Steps for computing c[n,m]

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0  0  1  1  1  1  1  1  1
A  0  1  1  1  2  2  2  2  2
C  0
B  0
B  0
A  0

The 2nd row is no longer useful.


Steps for computing c[n,m]

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0  0  1  1  1  1  1  1  1
A  0  1  1  1  2  2  2  2  2
C  0  1  1  2  2  2  3  3  3
B  0
B  0
A  0

The 3rd row is no longer useful.


Steps for computing c[n,m]

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0  0  1  1  1  1  1  1  1
A  0  1  1  1  2  2  2  2  2
C  0  1  1  2  2  2  3  3  3
B  0  1  2  2  2  3  3  4  4
B  0
A  0

The 4th row is no longer useful.


Steps for computing c[n,m]

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0  0  1  1  1  1  1  1  1
A  0  1  1  1  2  2  2  2  2
C  0  1  1  2  2  2  3  3  3
B  0  1  2  2  2  3  3  4  4
B  0  1  2  2  2  3  3  4  4
A  0


Steps for computing c[n,m]

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0  0  1  1  1  1  1  1  1
A  0  1  1  1  2  2  2  2  2
C  0  1  1  2  2  2  3  3  3
B  0  1  2  2  2  3  3  4  4
B  0  1  2  2  2  3  3  4  4
A  0  1  2  2  3  3  3  4  5

Linear Space Algorithm for LCS

At any time, the space required is at most one column plus two rows, i.e., O(n+m). But how do we backtrack to recover the LCS itself?

Backtracking using linear space

Simple method: go back one step at a time, re-computing the sub-matrix at each step, and repeat until we reach the 1st row or the 1st column. Time required: O(n^3) (when n = m).
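A sketch of this simple method (naming is mine): each backtracking decision is made by re-computing, in linear space, the values of the two neighbouring cells. Each decision costs up to O(nm), and there are up to n+m decisions, giving roughly O(n^3) when n = m.

```python
def lcs_by_recompute(s1, s2):
    """LCS via repeated linear-space re-computation: about O(n^3) time, O(n+m) space."""
    def lcs_len(a, b):
        # Linear-space LCS length of a and b (two rows at a time).
        prev = [0] * (len(a) + 1)
        for ch in b:
            cur = [0]
            for j, x in enumerate(a, 1):
                cur.append(prev[j - 1] + 1 if ch == x else max(prev[j], cur[j - 1]))
            prev = cur
        return prev[-1]

    out, i, j = [], len(s2), len(s1)
    while i > 0 and j > 0:
        if s2[i - 1] == s1[j - 1]:                 # diagonal match: take it
            out.append(s1[j - 1])
            i, j = i - 1, j - 1
        elif lcs_len(s1[:j], s2[:i - 1]) >= lcs_len(s1[:j - 1], s2[:i]):
            i -= 1                                 # c[i,j] came from the cell above
        else:
            j -= 1                                 # c[i,j] came from the cell to the left
    return "".join(reversed(out))
```

On the slides' example this recovers "BACBA", the same LCS as the quadratic-space algorithm.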

Backtracking

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0  0  1  1  1  1  1  1  1
A  0  1  1  1  2  2  2  2  2
C  0  1  1  2  2  2  3  3  3
B  0  1  2  2  2  3  3  4  4
B  0  1  2  2  2  3  3  4  4
A  0  1  2  2  3  3  3  4  5

Starting from the last cell, repeat two steps: re-compute the cells in the blue box (the sub-matrix above and to the left of the current cell), then go back by one step (or by two steps on a diagonal match). Each round of re-computation costs O(nm) time in the worst case, and the boxes shrink as we move toward the 1st row or 1st column.

An Algorithm with O(nm) time and O(n+m) space: divide and conquer.

Main ideas: let n be the length of both sequences and K = n/2. Once we have c[n,n] (the last cell), we know how to decompose the two sequences: find the LCS of s1[1…K] and s2[1…J], and the LCS of s1[K+1…n] and s2[J+1…n]. The key point is to know J by the time c[n,n] is computed.

Example for Decomposition

      A  B  C  A  B  C  B  A
   0  0  0  0  0  0  0  0  0
B  0  0  1  1  1  1  1  1  1
A  0  1  1  1  2  2  2  2  2
C  0  1  1  2  2  2  3  3  3
B  0  1  2  2  2  3  3  4  4
B  0  1  2  2  2  3  3  4  4
A  0  1  2  2  3  3  3  4  5

Time: O(n^2). The red rectangle will be re-computed.

Re-computation of the two sub-LCS problems

LCS of s1[1…4] = ABCA and s2[1…2] = BA:

      A  B  C  A
   0  0  0  0  0
B  0  0  1  1  1
A  0  1  1  1  2

LCS of s1[5…8] = BCBA and s2[3…6] = CBBA:

      B  C  B  A
   0  0  0  0  0
C  0  0  1  1  1
B  0  1  1  2  2
B  0  1  1  2  2
A  0  1  1  2  3

LCS = BA + CBA = BACBA.

How do we know J when c[n,m] is computed?

Main idea: when computing a cell c[i,j] with j >= K, we also keep track of the cell c[J,K] in the middle column that passes its value (possibly via a long path) to c[i,j]. We use an array d[i,j] to store this J; the value of J is passed from cell to cell while c[i,j] is computed.

The array d[i,j] that stores the J values

      A  B  C  A  B  C  B  A
   x  x  x  x  0  0  0  0  0
B  x  x  x  x  1  0  0  0  0
A  x  x  x  x  2  2  2  2  0
C  x  x  x  x  3  2  2  2  2
B  x  x  x  x  4  3  2  2  2
B  x  x  x  x  5  3  2  2  2
A  x  x  x  x  6  6  3  2  2

d[6,8] = 2 indicates where to decompose.
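The c and d rows can be computed together in one linear-space pass. The sketch below (naming is mine; the tie-breaking rule is one consistent choice and may differ from the slides in intermediate cells) returns the LCS length together with J = d[n,m]:

```python
def lcs_length_and_crossing(s1, s2):
    """Linear-space pass returning (LCS length, J), where J is the row at
    which an optimal path crosses the middle column K = len(s1)//2."""
    m = len(s1)
    K = m // 2
    prev, prev_d = [0] * (m + 1), [0] * (m + 1)    # rows i-1 of c and d
    for i, ch in enumerate(s2, 1):
        cur, cur_d = [0] * (m + 1), [0] * (m + 1)
        for j in range(1, m + 1):
            if ch == s1[j - 1]:                    # diagonal move: inherit J diagonally
                cur[j], cur_d[j] = prev[j - 1] + 1, prev_d[j - 1]
            elif prev[j] >= cur[j - 1]:            # from above: inherit J from above
                cur[j], cur_d[j] = prev[j], prev_d[j]
            else:                                  # from the left: inherit J leftward
                cur[j], cur_d[j] = cur[j - 1], cur_d[j - 1]
            if j == K:
                cur_d[j] = i                       # the path is at row i in column K
        prev, prev_d = cur, cur_d
    return prev[m], prev_d[m]
```

On the slides' example this returns (5, 2), matching d[6,8] = 2 in the table above.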

The divide-and-conquer algorithm

1. Compute c[n,n] and d[n,n] (assume |s1| = |s2| = n).
2. Cut s1 into s1[1…K] and s1[K+1…n]; cut s2 into s2[1…J] and s2[J+1…n].
3. Recursively compute the LCS L1 of s1[1…K] and s2[1…J], and the LCS L2 of s1[K+1…n] and s2[J+1…n].
4. Combine L1 and L2 to get the LCS L1L2.

Space: O(n+m). Time: O(nm) (about twice that of the original algorithm).
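Putting the pieces together, here is a self-contained sketch of the whole divide-and-conquer algorithm (naming is mine; with a different tie-breaking rule it may return a different, equally long LCS):

```python
def lcs_linear_space(s1, s2):
    """LCS in O(nm) time and O(n+m) working space via divide and conquer."""
    def crossing(a, b):
        # (LCS length, row J where an optimal path crosses column K = len(a)//2)
        m, K = len(a), len(a) // 2
        prev, prev_d = [0] * (m + 1), [0] * (m + 1)
        for i, ch in enumerate(b, 1):
            cur, cur_d = [0] * (m + 1), [0] * (m + 1)
            for j in range(1, m + 1):
                if ch == a[j - 1]:
                    cur[j], cur_d[j] = prev[j - 1] + 1, prev_d[j - 1]
                elif prev[j] >= cur[j - 1]:
                    cur[j], cur_d[j] = prev[j], prev_d[j]
                else:
                    cur[j], cur_d[j] = cur[j - 1], cur_d[j - 1]
                if j == K:
                    cur_d[j] = i
            prev, prev_d = cur, cur_d
        return prev[m], prev_d[m]

    if not s1 or not s2:
        return ""
    if len(s1) == 1:                       # base case: a single column
        return s1 if s1 in s2 else ""
    K = len(s1) // 2
    _, J = crossing(s1, s2)                # where to cut s2
    return lcs_linear_space(s1[:K], s2[:J]) + lcs_linear_space(s1[K:], s2[J:])
```

For the slides' example, lcs_linear_space("ABCABCBA", "BACBBA") returns "BACBA" = "BA" + "CBA", the same decomposition as in the example slide.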

Example: S1 = ABCABCBA and s2 = BACBBA

Level 1: ABCABCBA & BACBBA                                              work nm
Level 2: ABCA & BA;  BCBA & CBBA                                        work 0.5nm
Level 3: AB & B;  CA & A;  BC & CB;  BA & BA                            work 0.25nm
Level 4: A & ε;  B & B;  C & ε;  A & A;  B & ε;  C & CB;  B & B;  A & A  work 0.125nm

Total time: nm(1 + Σ_{i=1}^{log n} 0.5^i) ≤ 2nm.

Time complexity

Level 1: compute the n×m tables c[i,j] and d[i,j]: nm cells.
Level 2: an (n/2)×m1 table and an (n/2)×m2 table, with m1 + m2 = m; subtotal (n/2)m.
Level 3: tables of sizes (n/4)×m1, (n/4)×m2, (n/4)×m3, (n/4)×m4, with m1 + m2 + m3 + m4 = m; subtotal (n/4)m.
…
Level log n: 1×m1 + 1×m2 + … + 1×mq, with m1 + m2 + … + mq = m; subtotal m.

Total: nm(1 + Σ_{i=1}^{log n} 0.5^i) ≤ 2nm.

More on dynamic programming algorithms


Backtracking is used to get the schedule. Time complexity: O(n) if the jobs are sorted; total time O(n log n), including sorting.
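The slides for this example are not in the transcript; a common DP matching the description (sort the jobs, O(n) backtracking, O(n log n) overall) is weighted interval scheduling, so here is a sketch under that assumption (all names are mine, not from the slides):

```python
import bisect

def max_weight_schedule(jobs):
    """Weighted interval scheduling DP: returns (best total weight, chosen jobs).
    jobs is a list of (start, finish, weight) tuples."""
    jobs = sorted(jobs, key=lambda j: j[1])        # sort by finish time: O(n log n)
    finishes = [f for _, f, _ in jobs]
    n = len(jobs)
    # p[i]: number of jobs (in sorted order) finishing no later than job i starts
    p = [bisect.bisect_right(finishes, jobs[i][0]) for i in range(n)]
    # opt[i]: best weight using the first i jobs
    opt = [0] * (n + 1)
    for i in range(1, n + 1):
        _, _, w = jobs[i - 1]
        opt[i] = max(opt[i - 1], w + opt[p[i - 1]])
    # Backtracking to recover the schedule: O(n)
    chosen, i = [], n
    while i > 0:
        _, _, w = jobs[i - 1]
        if w + opt[p[i - 1]] >= opt[i - 1]:        # job i is in an optimal schedule
            chosen.append(jobs[i - 1])
            i = p[i - 1]
        else:
            i -= 1
    return opt[n], list(reversed(chosen))
```

For example, max_weight_schedule([(0, 3, 5), (1, 4, 1), (3, 6, 5), (5, 8, 4)]) returns (10, [(0, 3, 5), (3, 6, 5)]).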
