Space-Saving Strategies for Computing Δ-points

Space-Saving Strategies for Computing Δ-points Kun-Mao Chao (趙坤茂) Department of Computer Science and Information Engineering National Taiwan University, Taiwan http://www.csie.ntu.edu.tw/~kmchao

Δ-points
S-(i, j): the best score of a path from (0, 0) to (i, j).
S+(i, j): the best score of a path from (i, j) to (M, N).
Δ-points: the entries (i, j) with S-(i, j) + S+(i, j) >= Δ.
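The definitions above can be sketched directly with two full dynamic-programming matrices, which costs O(MN) time and O(MN) space. The function names are hypothetical, and the default scores (match 8, mismatch -5, gap -3) and the strings CTTAACT and CGGATCAT are taken from the example slides; treat this as a minimal illustration rather than the presenter's code.

```python
def forward_scores(a, b, match=8, mismatch=-5, gap=-3):
    """S-(i, j): best score of a path from (0, 0) to (i, j)."""
    m, n = len(a), len(b)
    s = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        s[i][0] = s[i - 1][0] + gap
    for j in range(1, n + 1):
        s[0][j] = s[0][j - 1] + gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            s[i][j] = max(s[i - 1][j - 1] + sub,
                          s[i - 1][j] + gap,
                          s[i][j - 1] + gap)
    return s


def backward_scores(a, b, match=8, mismatch=-5, gap=-3):
    """S+(i, j): best score of a path from (i, j) to (M, N).

    Computed as the forward matrix of the reversed strings, re-indexed.
    """
    m, n = len(a), len(b)
    r = forward_scores(a[::-1], b[::-1], match, mismatch, gap)
    return [[r[m - i][n - j] for j in range(n + 1)] for i in range(m + 1)]


def delta_points(a, b, delta, match=8, mismatch=-5, gap=-3):
    """All (i, j) with S-(i, j) + S+(i, j) >= delta, in row-major order."""
    s_minus = forward_scores(a, b, match, mismatch, gap)
    s_plus = backward_scores(a, b, match, mismatch, gap)
    return [(i, j)
            for i in range(len(a) + 1)
            for j in range(len(b) + 1)
            if s_minus[i][j] + s_plus[i][j] >= delta]


if __name__ == "__main__":
    print(delta_points("CTTAACT", "CGGATCAT", 14))
```

Setting delta to the optimal score returns exactly the entries that lie on some optimal path; lowering it admits entries on near-optimal paths as well.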

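Match: 8 Mismatch: -5 Gap symbol: -3
Dynamic-programming matrix for CTTAACT (rows) against CGGATCAT (columns); the optimal score, 14, is the bottom-right entry.

          C    G    G    A    T    C    A    T
     0   -3   -6   -9  -12  -15  -18  -21  -24
C   -3    8    5    2   -1   -4   -7  -10  -13
T   -6    5    3    0   -3    7    4    1   -2
T   -9    2    0   -2   -5    5    2   -1    9
A  -12   -1   -3   -5    6    3    0   10    7
A  -15   -4   -6   -8    3    1   -2    8    5
C  -18   -7   -9  -11    0   -2    9    6    3
T  -21  -10  -12  -14   -3    8    6    4   14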

An optimal alignment:

C T T A A C - T
C G G A T C A T

8 - 5 - 5 + 8 - 5 + 8 - 3 + 8 = 14
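The column-by-column sum above can be checked with a few lines of code. This is a small sketch; the function name is hypothetical, and it assumes gaps are written as an ASCII hyphen.

```python
def alignment_score(top, bottom, match=8, mismatch=-5, gap=-3):
    """Score a gapped alignment column by column ('-' marks a gap)."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    total = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            total += gap        # gap symbol
        elif x == y:
            total += match      # identical letters
        else:
            total += mismatch   # different letters
    return total


if __name__ == "__main__":
    print(alignment_score("CTTAAC-T", "CGGATCAT"))  # 14
```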

S- matrix (Match: 8, Mismatch: -5, Gap symbol: -3)
The S- matrix is the forward dynamic-programming matrix for CTTAACT (rows) against CGGATCAT (columns): S-(i, j) is the entry in row i, column j. The first row is 0, -3, -6, ..., -24, and S-(7, 8) = 14.

S+ matrix, boundary entries (Match: 8, Mismatch: -5, Gap symbol: -3)
Last row: S+(7, j) = -24, -21, -18, -15, -12, -9, -6, -3, 0 for j = 0, ..., 8.
Last column: S+(i, 8) = -21, -18, -15, -12, -9, -6, -3, 0 for i = 0, ..., 7.

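S+ matrix (Match: 8, Mismatch: -5, Gap symbol: -3)
Row i, column j holds S+(i, j); rows follow CTTAACT, columns follow CGGATCAT.

          C    G    G    A    T    C    A    T
    14    3    6    8   10   12    1  -10  -21
C    3    6    8   11   13    2    4   -7  -18
T    5    8   11   13   16    5    7   -4  -15
T    7   10   13   16    5    8   10   -1  -12
A    9   12   15   18    8   10   13    2   -9
A   -2    1    4    7   10   13    3    5   -6
C  -13  -10   -7   -4   -1    2    5    8   -3
T  -24  -21  -18  -15  -12   -9   -6   -3    0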

S- and S+ matrix (Match: 8, Mismatch: -5, Gap symbol: -3)
[Figure: the matrix for CTTAACT (rows) against CGGATCAT (columns), with both S-(i, j) and S+(i, j) shown in each cell.]

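S- + S+ matrix (Match: 8, Mismatch: -5, Gap symbol: -3)
Row i, column j holds S-(i, j) + S+(i, j).

          C    G    G    A    T    C    A    T
    14    0    0   -1   -2   -3  -17  -31  -45
C    0   14   13   13   12   -2   -3  -17  -31
T   -1   13   14   13   13   12   11   -3  -17
T   -2   12   13   14    0   13   12   -2   -3
A   -3   11   12   13   14   13   13   12   -2
A  -17   -3   -2   -1   13   14    1   13   -1
C  -31  -17  -16  -15   -1    0   14   14    0
T  -45  -31  -30  -29  -15   -1    0    1   14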

S- + S+ matrix, Δ = 14 (Match: 8, Mismatch: -5, Gap symbol: -3)
The Δ-points are the entries with S- + S+ >= 14: (0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (6, 7), (7, 8).

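S- + S+ matrix, Δ = 13 (Match: 8, Mismatch: -5, Gap symbol: -3)
The Δ-points are the entries with S- + S+ >= 13. Lowering Δ from 14 to 13 widens the set from 9 entries to 21: every entry scoring 14, plus (1, 2), (1, 3), (2, 1), (2, 3), (2, 4), (3, 2), (3, 5), (4, 3), (4, 5), (4, 6), (5, 4), (5, 7).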

The leftmost/rightmost Δ-paths
For simple scoring schemes, finding the leftmost Δ-path and the rightmost Δ-path is easy. For affine gap penalties, it is more complicated.

Two alignments may not intersect!

Method 1: O(MN) time; O(MN) space

Method 2: O(M^2 N) time; O(N) space
Each row takes O(MN) time. In total, O(M) x O(MN) = O(M^2 N).
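One plausible reading of this bound is sketched below: S- is advanced one row at a time, while the S+ row is recomputed from the bottom of the matrix for every row, so only a few rows of length N + 1 are alive at once. The helper names are hypothetical and the recomputation scheme is an assumption, not something the slide spells out.

```python
def forward_row(prev, ai, b, match, mismatch, gap):
    """S-(i, .) from S-(i-1, .), where ai is the i-th letter of a."""
    cur = [prev[0] + gap]
    for j in range(1, len(b) + 1):
        sub = match if ai == b[j - 1] else mismatch
        cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
    return cur


def backward_row(a, b, i, match, mismatch, gap):
    """S+(i, .) recomputed from row M down to row i with one rolling row."""
    n = len(b)
    row = [gap * (n - j) for j in range(n + 1)]  # S+(M, .)
    for r in range(len(a) - 1, i - 1, -1):
        new = [0] * (n + 1)
        new[n] = row[n] + gap
        for j in range(n - 1, -1, -1):
            sub = match if a[r] == b[j] else mismatch
            new[j] = max(row[j + 1] + sub, row[j] + gap, new[j + 1] + gap)
        row = new
    return row


def delta_points_linear_space(a, b, delta, match=8, mismatch=-5, gap=-3):
    """Yield Δ-points row by row; O(M^2 N) time, O(N) working space."""
    n = len(b)
    s_minus = [gap * j for j in range(n + 1)]  # S-(0, .)
    for i in range(len(a) + 1):
        if i > 0:
            s_minus = forward_row(s_minus, a[i - 1], b, match, mismatch, gap)
        s_plus = backward_row(a, b, i, match, mismatch, gap)  # O(MN) per row
        for j in range(n + 1):
            if s_minus[j] + s_plus[j] >= delta:
                yield (i, j)


if __name__ == "__main__":
    print(list(delta_points_linear_space("CTTAACT", "CGGATCAT", 14)))
```

The function is a generator so that the output itself does not have to be held in memory; the O(N) figure refers to working space only.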

Method 3: O(MN) time; O(N) space

Method 4: O(MN log M) time; O(N log M) space

Method 4: O(MN log M) time; O(N log M) space (cont’d)
[Figure: O(log M) layers over the M rows, each layer storing O(N) entries.]

The computation of S-(i, j) and S+(i, j) inside a block

Method 5: O(MN log min {M, N}) time; O(M+N) space

Method 5: O(MN log min {M, N}) time; O(M+N) space (cont’d)
[Figure: O(log min {M, N}) layers over the M rows; figure labels: 4(M+N), 2(M+N), M+N, 1/2(M+N), 1/4(M+N).]

Method 6: O(MN log log min {M, N}) time; O(M+N) space
[Figure labeled "Real Size", with sides M and N and the fractions 1/2^5, 1/2^3, 1/2^10, 1/2^5, 1/2^2, 1/2^9, 1/2^19.]

Method 7: O((1/ε) MN) time; O((1/ε) M^ε N) space
Here we use ε = 1/2 to illustrate the idea. Solve each M^(1/2) x N problem.
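For ε = 1/2, one way to read the idea is as row checkpointing: a forward pass keeps S- only on every k-th row with k about M^(1/2), and a backward sweep then rebuilds S- inside one block of k rows at a time while S+ is carried upward as a single rolling row. The sketch below follows that assumed interpretation; the function names and the choice k = isqrt(M) are illustrative, not taken from the slides.

```python
import math


def forward_row(prev, ai, b, match, mismatch, gap):
    """S-(i, .) from S-(i-1, .)."""
    cur = [prev[0] + gap]
    for j in range(1, len(b) + 1):
        sub = match if ai == b[j - 1] else mismatch
        cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
    return cur


def backward_step(nxt, ai, b, match, mismatch, gap):
    """S+(i, .) from S+(i+1, .), where ai is letter i+1 of a (1-based)."""
    n = len(b)
    cur = [0] * (n + 1)
    cur[n] = nxt[n] + gap
    for j in range(n - 1, -1, -1):
        sub = match if ai == b[j] else mismatch
        cur[j] = max(nxt[j + 1] + sub, nxt[j] + gap, cur[j + 1] + gap)
    return cur


def delta_points_checkpointed(a, b, delta, match=8, mismatch=-5, gap=-3):
    """Yield Δ-points from the last row upward; about sqrt(M) * N space."""
    m, n = len(a), len(b)
    k = max(1, math.isqrt(m))  # block height, an assumed choice
    # Pass 1: forward sweep, keeping S- only on rows 0, k, 2k, ...
    row = [gap * j for j in range(n + 1)]
    checkpoints = {0: row}
    for i in range(1, m + 1):
        row = forward_row(row, a[i - 1], b, match, mismatch, gap)
        if i % k == 0:
            checkpoints[i] = row
    # Pass 2: process blocks from the bottom, one M^(1/2) x N problem each.
    s_plus = [gap * (n - j) for j in range(n + 1)]  # S+(M, .)
    hi = m
    while hi >= 0:
        lo = (hi // k) * k  # checkpoint row at or above hi
        block = [checkpoints[lo]]  # rebuild S- for rows lo..hi
        for i in range(lo + 1, hi + 1):
            block.append(forward_row(block[-1], a[i - 1], b,
                                     match, mismatch, gap))
        for i in range(hi, lo - 1, -1):
            if i < m:
                s_plus = backward_step(s_plus, a[i], b, match, mismatch, gap)
            for j in range(n + 1):
                if block[i - lo][j] + s_plus[j] >= delta:
                    yield (i, j)
        hi = lo - 1


if __name__ == "__main__":
    print(sorted(delta_points_checkpointed("CTTAACT", "CGGATCAT", 14)))
```

Every row of S- is computed twice and every row of S+ once, so the time stays O(MN), while the stored rows number roughly M^(1/2) checkpoints plus one block of M^(1/2) rows.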

Method 8: O((1/ε) MN) time; O((1/ε) M^(1+ε) + N) space
Here we use ε = 1/2 to illustrate the idea. Solve each M^(1/2) x M problem.

Methods
Method 1: O(MN) time; O(MN) space
Method 2: O(M^2 N) time; O(N) space
Method 3: O(MN) time; O(N) space
Method 4: O(MN log M) time; O(N log M) space
Method 5: O(MN log min {M, N}) time; O(M+N) space
Method 6: O(MN log log min {M, N}) time; O(M+N) space
Method 7: O((1/ε) MN) time; O((1/ε) M^ε N) space
Method 8: O((1/ε) MN) time; O((1/ε) M^(1+ε) + N) space

Bonus points
O(MN) time; O(M+N) space
o(MN log log min {M, N}) time; O(M+N) space
O((1/ε) MN) time; o((1/ε) M^(1+ε) + N) space