Space-Saving Strategies for Computing Δ-points Kun-Mao Chao (趙坤茂) Department of Computer Science and Information Engineering National Taiwan University, Taiwan http://www.csie.ntu.edu.tw/~kmchao
Δ-points. S-(i, j): the best score of a path from (0, 0) to (i, j). S+(i, j): the best score of a path from (i, j) to (M, N). The Δ-points are the points (i, j) with S-(i, j) + S+(i, j) >= Δ.
Example. Match: 8; mismatch: -5; gap symbol: -3. The sequences are CGGATCAT (columns) and CTTAACT (rows). [Figure: the score matrix filled in row by row, starting with 0 -3 -6 -9 -12 -15 -18 -21 -24 and -3 8 5 2 -1 -4 -7 -10 -13; the optimal score, 14, is the bottom-right entry.]
An optimal alignment, traced back through the score matrix:
C T T A A C - T
C G G A T C A T
Score: 8 - 5 - 5 + 8 - 5 + 8 - 3 + 8 = 14
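The column-by-column arithmetic on this slide can be checked with a short sketch. The function name `alignment_score` and the use of `-` for the gap symbol are choices made here for illustration, not something the slides define.

```python
MATCH, MISMATCH, GAP = 8, -5, -3  # scoring scheme from the slides


def alignment_score(top: str, bottom: str) -> int:
    """Sum the column scores of a gapped alignment; '-' marks a gap symbol."""
    if len(top) != len(bottom):
        raise ValueError("both rows of an alignment must have the same length")
    total = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            total += GAP        # a letter aligned with a gap symbol
        elif x == y:
            total += MATCH      # identical letters
        else:
            total += MISMATCH   # different letters
    return total


score = alignment_score("CTTAAC-T", "CGGATCAT")
print(score)  # 14
```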
S- Matrix (match: 8; mismatch: -5; gap symbol: -3)
        C   G   G   A   T   C   A   T
    0  -3  -6  -9 -12 -15 -18 -21 -24
C  -3   8   5   2  -1  -4  -7 -10 -13
T  -6   5   3   0  -3   7   4   1  -2
T  -9   2   0  -2  -5   5   2  -1   9
A -12  -1  -3  -5   6   3   0  10   7
A -15  -4  -6  -8   3   1  -2   8   5
C -18  -7  -9 -11   0  -2   9   6   3
T -21 -10 -12 -14  -3   8   6   4  14
S+ Matrix (match: 8; mismatch: -5; gap symbol: -3), filled in from the last row and last column back to (0, 0)
        C   G   G   A   T   C   A   T
   14   3   6   8  10  12   1 -10 -21
C   3   6   8  11  13   2   4  -7 -18
T   5   8  11  13  16   5   7  -4 -15
T   7  10  13  16   5   8  10  -1 -12
A   9  12  15  18   8  10  13   2  -9
A  -2   1   4   7  10  13   3   5  -6
C -13 -10  -7  -4  -1   2   5   8  -3
T -24 -21 -18 -15 -12  -9  -6  -3   0
S- and S+ Matrix (match: 8; mismatch: -5; gap symbol: -3). [Figure: the S- and S+ matrices for CTTAACT and CGGATCAT overlaid, showing both values at each point (i, j).]
S- + S+ Matrix (match: 8; mismatch: -5; gap symbol: -3)
        C   G   G   A   T   C   A   T
   14   0   0  -1  -2  -3 -17 -31 -45
C   0  14  13  13  12  -2  -3 -17 -31
T  -1  13  14  13  13  12  11  -3 -17
T  -2  12  13  14   0  13  12  -2  -3
A  -3  11  12  13  14  13  13  12  -2
A -17  -3  -2  -1  13  14   1  13  -1
C -31 -17 -16 -15  -1   0  14  14   0
T -45 -31 -30 -29 -15  -1   0   1  14
With Δ = 14, the Δ-points are the entries equal to 14. With Δ = 13, the entries equal to 13 are Δ-points as well.
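The matrices in this example can be reproduced with a short sketch that stores both S- and S+ in full, which costs O(MN) time and O(MN) space. It assumes the linear gap scoring of the example; the names `forward`, `backward`, and `delta_points` are illustrative. S+ is obtained by running the same recurrence on the reversed sequences, which is valid here because the scoring scheme is symmetric.

```python
MATCH, MISMATCH, GAP = 8, -5, -3  # scoring scheme from the example


def sub(x, y):
    """Score for aligning letter x with letter y."""
    return MATCH if x == y else MISMATCH


def forward(a, b):
    """S-[i][j]: best score of a path from (0, 0) to (i, j)."""
    m, n = len(a), len(b)
    s = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(1, n + 1):
        s[0][j] = s[0][j - 1] + GAP
    for i in range(1, m + 1):
        s[i][0] = s[i - 1][0] + GAP
        for j in range(1, n + 1):
            s[i][j] = max(
                s[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),  # diagonal step
                s[i - 1][j] + GAP,                          # vertical step
                s[i][j - 1] + GAP,                          # horizontal step
            )
    return s


def backward(a, b):
    """S+[i][j]: best score of a path from (i, j) to (M, N)."""
    m, n = len(a), len(b)
    r = forward(a[::-1], b[::-1])  # same recurrence on reversed sequences
    return [[r[m - i][n - j] for j in range(n + 1)] for i in range(m + 1)]


def delta_points(a, b, delta):
    """All (i, j) with S-(i, j) + S+(i, j) >= delta, in row-major order."""
    s_minus, s_plus = forward(a, b), backward(a, b)
    return [
        (i, j)
        for i in range(len(a) + 1)
        for j in range(len(b) + 1)
        if s_minus[i][j] + s_plus[i][j] >= delta
    ]


print(delta_points("CTTAACT", "CGGATCAT", 14))
# [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (6, 7), (7, 8)]
```

With Δ equal to the optimal score, the reported points are exactly those lying on optimal paths; lowering Δ widens the set.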
The leftmost/rightmost Δ-paths. For simple scoring schemes, finding the leftmost Δ-path and the rightmost Δ-path is easy. For affine gap penalties, it is more complicated.
Two alignments may not intersect!
Method 1: O(MN) time; O(MN) space
Method 2: O(M^2 N) time; O(N) space. Each row takes O(MN) time. In total, O(M) x O(MN) = O(M^2 N).
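The slide gives only the bounds, so the following is one reading that is consistent with them rather than the author's procedure: sweep S- forward one row at a time, and for every row recompute the matching S+ row from the bottom of the matrix, keeping only O(N) values at any moment. All function names are illustrative, and the scoring scheme is the one from the example.

```python
MATCH, MISMATCH, GAP = 8, -5, -3  # scoring scheme from the example


def sub(x, y):
    return MATCH if x == y else MISMATCH


def forward_row(prev, ch, b):
    """Given S-(i-1, .), return S-(i, .); ch is the i-th letter of a."""
    cur = [prev[0] + GAP]
    for j in range(1, len(b) + 1):
        cur.append(max(prev[j - 1] + sub(ch, b[j - 1]),
                       prev[j] + GAP,
                       cur[j - 1] + GAP))
    return cur


def backward_row(nxt, ch, b):
    """Given S+(i+1, .), return S+(i, .); ch is the (i+1)-th letter of a."""
    n = len(b)
    cur = [0] * (n + 1)
    cur[n] = nxt[n] + GAP
    for j in range(n - 1, -1, -1):
        cur[j] = max(nxt[j + 1] + sub(ch, b[j]),
                     nxt[j] + GAP,
                     cur[j + 1] + GAP)
    return cur


def s_plus_row(a, b, i):
    """Recompute S+(i, .) from row M upward: O((M - i) N) time, O(N) space."""
    row = [GAP * (len(b) - j) for j in range(len(b) + 1)]  # S+(M, .)
    for k in range(len(a) - 1, i - 1, -1):
        row = backward_row(row, a[k], b)
    return row


def delta_points_rowwise(a, b, delta):
    """Yield Δ-points row by row while holding only O(N) scores."""
    n = len(b)
    minus = [GAP * j for j in range(n + 1)]  # S-(0, .)
    for i in range(len(a) + 1):
        if i > 0:
            minus = forward_row(minus, a[i - 1], b)
        plus = s_plus_row(a, b, i)  # the O(MN) cost paid for every row
        for j in range(n + 1):
            if minus[j] + plus[j] >= delta:
                yield (i, j)


print(list(delta_points_rowwise("CTTAACT", "CGGATCAT", 14)))
```

The generator form keeps the working storage at a constant number of length-(N+1) rows; the output itself is streamed rather than stored.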
Method 3: O(MN) time; O(N) space
Method 4: O(MN log M) time; O(N log M) space
Method 4: O(MN log M) time; O(N log M) space (cont'd). [Figure: O(log M) layers over the M rows, each layer storing O(N) values.]
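The slides state the bounds and show O(log M) layers of O(N) values, but not the procedure. A divide-and-conquer sketch that meets these bounds is given below as an assumption: given S- on the top row and S+ on the bottom row of a band, compute both on the middle row, then recurse on the two halves. Each recursion level keeps two rows of length N+1, and there are O(log M) levels. Function names are illustrative.

```python
MATCH, MISMATCH, GAP = 8, -5, -3  # scoring scheme from the example


def sub(x, y):
    return MATCH if x == y else MISMATCH


def forward_row(prev, ch, b):
    """Given S-(i-1, .), return S-(i, .); ch is the i-th letter of a."""
    cur = [prev[0] + GAP]
    for j in range(1, len(b) + 1):
        cur.append(max(prev[j - 1] + sub(ch, b[j - 1]),
                       prev[j] + GAP, cur[j - 1] + GAP))
    return cur


def backward_row(nxt, ch, b):
    """Given S+(i+1, .), return S+(i, .); ch is the (i+1)-th letter of a."""
    n = len(b)
    cur = [0] * (n + 1)
    cur[n] = nxt[n] + GAP
    for j in range(n - 1, -1, -1):
        cur[j] = max(nxt[j + 1] + sub(ch, b[j]),
                     nxt[j] + GAP, cur[j + 1] + GAP)
    return cur


def delta_points_dc(a, b, delta):
    m, n = len(a), len(b)
    out = []

    def emit(i, minus, plus):
        out.extend((i, j) for j in range(n + 1) if minus[j] + plus[j] >= delta)

    def solve(lo, hi, minus_lo, plus_hi):
        # Invariant: minus_lo is S-(lo, .) and plus_hi is S+(hi, .).
        # Reports the rows strictly between lo and hi.
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        minus_mid = minus_lo
        for i in range(lo + 1, mid + 1):
            minus_mid = forward_row(minus_mid, a[i - 1], b)
        plus_mid = plus_hi
        for i in range(hi - 1, mid - 1, -1):
            plus_mid = backward_row(plus_mid, a[i], b)
        solve(lo, mid, minus_lo, plus_mid)
        emit(mid, minus_mid, plus_mid)
        solve(mid, hi, minus_mid, plus_hi)

    top = [GAP * j for j in range(n + 1)]           # S-(0, .)
    bottom = [GAP * (n - j) for j in range(n + 1)]  # S+(M, .)
    minus_m, plus_0 = top, bottom
    for i in range(1, m + 1):                       # S-(M, .) in O(N) space
        minus_m = forward_row(minus_m, a[i - 1], b)
    for i in range(m - 1, -1, -1):                  # S+(0, .) in O(N) space
        plus_0 = backward_row(plus_0, a[i], b)
    emit(0, top, plus_0)
    solve(0, m, top, bottom)
    if m > 0:
        emit(m, minus_m, bottom)
    return out


print(delta_points_dc("CTTAACT", "CGGATCAT", 14))
```

Every recursion level sweeps each row of its band a constant number of times, so one level costs O(MN) in total and the O(log M) levels give O(MN log M) time.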
The computation of S-(i, j) and S+(i, j) inside a block
Method 5: O(MN log min {M, N}) time; O(M+N) space
Method 5: O(MN log min {M, N}) time; O(M+N) space (cont'd). [Figure: O(log min {M, N}) layers, labeled 4(M+N), 2(M+N), M+N, 1/2(M+N), 1/4(M+N).]
Method 6: O(MN log log min {M, N}) time; O(M+N) space. [Figure "Real Size": an M x N grid with scale labels 1/2^5, 1/2^3, 1/2^10, 1/2^5, 1/2^2, 1/2^9, 1/2^19.]
Method 7: O((1/ε) MN) time; O((1/ε) M^ε N) space. Here we use ε = 1/2 to illustrate the idea: solve each M^(1/2) x N problem. [Figure: the M rows cut into strips of M^(1/2) rows, with S- and S+ computed inside each strip.]
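For ε = 1/2, one way to realize these bounds is row checkpointing, sketched here as an assumption since the slide does not give the details. A backward sweep keeps S+ only on every k-th row, with k about M^(1/2). A forward sweep then handles one strip of k rows at a time, rebuilding the S+ rows of that strip from the checkpoint below it. At most about 2 M^(1/2) rows of length N+1 are held at once. Function names are illustrative.

```python
import math

MATCH, MISMATCH, GAP = 8, -5, -3  # scoring scheme from the example


def sub(x, y):
    return MATCH if x == y else MISMATCH


def forward_row(prev, ch, b):
    """Given S-(i-1, .), return S-(i, .); ch is the i-th letter of a."""
    cur = [prev[0] + GAP]
    for j in range(1, len(b) + 1):
        cur.append(max(prev[j - 1] + sub(ch, b[j - 1]),
                       prev[j] + GAP, cur[j - 1] + GAP))
    return cur


def backward_row(nxt, ch, b):
    """Given S+(i+1, .), return S+(i, .); ch is the (i+1)-th letter of a."""
    n = len(b)
    cur = [0] * (n + 1)
    cur[n] = nxt[n] + GAP
    for j in range(n - 1, -1, -1):
        cur[j] = max(nxt[j + 1] + sub(ch, b[j]),
                     nxt[j] + GAP, cur[j + 1] + GAP)
    return cur


def delta_points_checkpoint(a, b, delta):
    m, n = len(a), len(b)
    k = max(1, math.isqrt(m))  # strip height, about sqrt(M)

    # Pass 1: backward sweep, keeping S+ only on rows that are multiples
    # of k and on row M.
    row = [GAP * (n - j) for j in range(n + 1)]  # S+(M, .)
    checkpoints = {m: row}
    for i in range(m - 1, -1, -1):
        row = backward_row(row, a[i], b)
        if i % k == 0:
            checkpoints[i] = row

    # Pass 2: forward sweep, one strip of at most k rows at a time.
    out = []
    minus = [GAP * j for j in range(n + 1)]  # S-(0, .)
    for start in range(0, m + 1, k):
        last = min(start + k - 1, m)
        base = min(start + k, m)  # nearest stored S+ row at or below the strip
        plus = {base: checkpoints[base]}
        for i in range(base - 1, start - 1, -1):  # rebuild S+ inside the strip
            plus[i] = backward_row(plus[i + 1], a[i], b)
        for i in range(start, last + 1):
            if i > 0:
                minus = forward_row(minus, a[i - 1], b)
            out.extend((i, j) for j in range(n + 1)
                       if minus[j] + plus[i][j] >= delta)
    return out


print(delta_points_checkpoint("CTTAACT", "CGGATCAT", 14))
```

Each row of S+ is computed twice and each row of S- once, so the running time stays O(MN). Smaller values of ε would nest this scheme, which is where the 1/ε factor in the bounds would come from under this reading.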
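Method 8: O((1/ε) MN) time; O((1/ε) M^(1+ε) + N) space. Here we use ε = 1/2 to illustrate the idea: solve each M^(1/2) x M problem. [Figure: blocks of M^(1/2) x M inside the M x N matrix, with S- and S+ computed inside each block and O(N) values kept alongside.]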
Methods
Method 1: O(MN) time; O(MN) space
Method 2: O(M^2 N) time; O(N) space
Method 3: O(MN) time; O(N) space
Method 4: O(MN log M) time; O(N log M) space
Method 5: O(MN log min {M, N}) time; O(M+N) space
Method 6: O(MN log log min {M, N}) time; O(M+N) space
Method 7: O((1/ε) MN) time; O((1/ε) M^ε N) space
Method 8: O((1/ε) MN) time; O((1/ε) M^(1+ε) + N) space
Bonus points
O(MN) time; O(M+N) space
o(MN log log min {M, N}) time; O(M+N) space
O((1/ε) MN) time; o((1/ε) M^(1+ε) + N) space