Piecewise linear gap alignment.
Motivation
Sequences are sometimes similar over some regions but different over other regions. Chao et al. propose a generalized global alignment algorithm for comparing sequences with intermittent similarities, that is, an ordered list of similar regions separated by different regions. The algorithm introduces the “difference block”, which represents a long gap and costs a fixed score.
GCGCTCCGGGACGCCTTCCGCCGTCGGGAGCCCTACAACTACCTGCAGAGGGCCTATTAC
+++++++++++++++++++++++++||||||| ||||||||||||||||||||||| |||
                         GGGAGCCTTACAACTACCTGCAGAGGGCCTACTAC

CAGGTGGGGAGCGGGCCGGGCAG                                  TAG
|||||| ||---||||||| |||+++++++++++++++++++++++++++++++++++++
CAGGTGCGG   GGGCCGGCCAGGGTGCTACCCCAAGCCTACTGACTGTCTTACTGG

CCTTCCCCAGAGCCCCCTAGCCGCAGGCACCAGAGGGTCCAAGACAAGACTGGAAGGGCA
+++++++++++++++++++++++|| ||  ||| | ||||| || ||  ||||| | | |
                       CAAGCTTCAGCGAGTCCAGGAGAAAGCTGGGAAGCCC

CCTCGGGTTCGG      GAGGAGCTGTGAGTGGCT
|  ||||| |||------||||| |||||| |||||++++++++++++++++++++++++
CGCCGGGTCCGGGTCCGAGAGGAACTGTGAATGGCTGAGCCTGCTTCTCGAGGATCAGGC

Legend: "|" match, blank mismatch, "-" gap, "+" difference block. The two aligned regions, labeled Local alignment 1 and Local alignment 2, are separated by difference blocks.
O(NC) algorithm
The diagonal-wise method can improve the computation time of the generalized global alignment. However, the diagonal-wise method minimizes penalty scores. Once difference blocks are allowed, the score is always a single difference block penalty whenever the similarity is not good enough.
ACCGGTCTTGAAGCGTGTGACGTGGGCAGGGGAATTCCCGTGAGCCTAAGTGTCCCGCGCTA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
CCGTTTGGAAACCGGGTGGGGG++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++AAAACGGTTGCAAATGCCCTTTAATGGGGCCGATGGGAAA
+++++++++++++++++++++++++++++++++++
AACTGCCGTAACGTTTAGGCTAAAGCCCTGCTACG

The alignment score is one difference block penalty.
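Piecewise linear gap
We can use a piecewise linear gap penalty in the alignment instead. Each position of a gap is charged an extension penalty that depends on how far into the gap it lies: positions 1 to L1 are penalized g1 each, positions L1+1 to L2 are penalized g2 each, and positions beyond L2 are penalized g3 each. The gap open penalty is charged once, at the first position of the gap.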
Example with L1 = 3, L2 = 6, g1 = 3, g2 = 2, g3 = 1, substitution = 4, match = 0, gap open = 3:

ACCGTT--CTTGTGGCAAAC
A-CGTTAAATTGT-------
06000063400006332221

The third row gives the penalty of each column; the total is 38.
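The column penalties in the example can be reproduced with a short script. This is only a sketch under the reading used above (the k-th position of a gap is charged g1, g2 or g3 depending on k, and the gap open penalty is added to the first position); the function names `gap_position_penalty` and `score_alignment` are hypothetical and do not come from the slides.

```python
def gap_position_penalty(k, g1, g2, g3, L1, L2):
    """Extension penalty for the k-th position (1-based) of a gap."""
    if k <= L1:
        return g1
    if k <= L2:
        return g2
    return g3


def score_alignment(row_a, row_b, *, op, g1, g2, g3, L1, L2, sub, match=0):
    """Return (per-column penalties, total) for two gapped rows of equal length.

    Assumes '-' marks a gap and that no column has a gap in both rows.
    """
    assert len(row_a) == len(row_b)
    penalties = []
    run_a = run_b = 0  # length of the gap run currently open in each row
    for x, y in zip(row_a, row_b):
        run_a = run_a + 1 if x == "-" else 0
        run_b = run_b + 1 if y == "-" else 0
        run = run_a or run_b
        if run:
            p = gap_position_penalty(run, g1, g2, g3, L1, L2)
            if run == 1:
                p += op  # the gap open penalty is charged once per gap
        else:
            p = match if x == y else sub
        penalties.append(p)
    return penalties, sum(penalties)


cols, total = score_alignment(
    "ACCGTT--CTTGTGGCAAAC",
    "A-CGTTAAATTGT-------",
    op=3, g1=3, g2=2, g3=1, L1=3, L2=6, sub=4,
)
print("".join(map(str, cols)), total)  # 06000063400006332221 38
```

Note that this only scores a given alignment; it does not search for the best one.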
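An intuitive way to implement the piecewise linear gap
Keep track of the lengths of the current deletion and insertion gaps in each cell, so that the gap extension penalty matching that length can be applied. This method has some problems: the cheapest gap ending at a cell may be a short one that is still charged g1, while a slightly more expensive but longer gap would be cheaper to extend, so storing a single gap length per cell can miss the optimal alignment.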
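Gap cost C as a function of gap length L (piecewise linear: it starts at the gap open penalty, rises with slope g1 up to L1, with slope g2 between L1 and L2, and with slope g3 beyond L2):

C = op + g1*L, for L <= L1
C = op + g1*L1 + g2*(L-L1) = C1 + g2*L, for L1 < L <= L2
C = op + g1*L1 + g2*(L2-L1) + g3*(L-L2) = C2 + g3*L, for L > L2

where C1 and C2 are the intercepts of the second and third segments:

C1 = op + (g1-g2)*L1
C2 = op + (g1-g2)*L1 + (g2-g3)*L2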
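The reason the intercepts C1 and C2 are useful can be checked numerically: when g1 >= g2 >= g3 and L1 <= L2, the piecewise cost equals the minimum of three straight lines, op + g1*L, C1 + g2*L and C2 + g3*L. The sketch below assumes exactly that ordering of the penalties; the function names are hypothetical.

```python
def piecewise_cost(L, op, g1, g2, g3, L1, L2):
    """Gap cost for a gap of length L >= 1, summed segment by segment."""
    return (op
            + g1 * min(L, L1)
            + g2 * max(0, min(L, L2) - L1)
            + g3 * max(0, L - L2))


def three_line_cost(L, op, g1, g2, g3, L1, L2):
    """Same cost written as the minimum of three affine functions of L."""
    C1 = op + (g1 - g2) * L1
    C2 = C1 + (g2 - g3) * L2
    return min(op + g1 * L, C1 + g2 * L, C2 + g3 * L)


# Parameters of the earlier example: op=3, g1=3, g2=2, g3=1, L1=3, L2=6.
for L in range(1, 21):
    assert piecewise_cost(L, 3, 3, 2, 1, 3, 6) == three_line_cost(L, 3, 3, 2, 1, 3, 6)
print(piecewise_cost(7, 3, 3, 2, 1, 3, 6))  # 19
```

Because each line is an ordinary affine gap penalty, each one can be handled by its own pair of gap tables, and the minimum can be taken cell by cell.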
Piecewise linear gap
Add tables D1 and I1 for deletion and insertion gaps and pre-penalize them with C1 when a gap is opened; each gap extension in D1 and I1 is penalized g2. Add another pair of tables, D2 and I2, pre-penalized with C2; each gap extension in D2 and I2 is penalized g3. The original tables D and I keep the gap open penalty and g1, and each cell of S takes the minimum over all tables.
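The table scheme can be sketched as a plain O(nm) dynamic program. This is an illustrative implementation under stated assumptions, not the authors' code: it minimizes penalties, assumes g1 >= g2 >= g3, treats D tables as vertical gaps and I tables as horizontal gaps, and uses the hypothetical name `piecewise_align_cost`. The tier list holds (pre-penalty, extension penalty) for D/I, D1/I1 and D2/I2 in that order.

```python
INF = float("inf")


def piecewise_align_cost(x, y, *, op, g1, g2, g3, L1, L2, sub, match=0):
    """Minimum global alignment penalty of x (rows) and y (columns)."""
    C1 = op + (g1 - g2) * L1
    C2 = C1 + (g2 - g3) * L2
    tiers = [(op, g1), (C1, g2), (C2, g3)]  # D/I, D1/I1, D2/I2
    n, m = len(x), len(y)
    S = [[INF] * (m + 1) for _ in range(n + 1)]
    # D[t][i][j]: best penalty ending with a vertical gap in tier t.
    # I[t][i][j]: best penalty ending with a horizontal gap in tier t.
    D = [[[INF] * (m + 1) for _ in range(n + 1)] for _ in tiers]
    I = [[[INF] * (m + 1) for _ in range(n + 1)] for _ in tiers]
    S[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = INF
            if i > 0 and j > 0:
                best = S[i - 1][j - 1] + (match if x[i - 1] == y[j - 1] else sub)
            for t, (pre, g) in enumerate(tiers):
                if i > 0:
                    # Open a new gap (pre-penalty + first extension) or extend.
                    D[t][i][j] = min(S[i - 1][j] + pre + g, D[t][i - 1][j] + g)
                    best = min(best, D[t][i][j])
                if j > 0:
                    I[t][i][j] = min(S[i][j - 1] + pre + g, I[t][i][j - 1] + g)
                    best = min(best, I[t][i][j])
            S[i][j] = best
    return S[n][m]


# Two-tier setting of the worked table that follows (g3 = g2, so D2/I2 add nothing).
print(piecewise_align_cost("CCCTT", "ACTTAGCC",
                           op=5, g1=3, g2=1, g3=1, L1=3, L2=3, sub=5))  # 28
```

Setting g3 = g2 makes C2 equal to C1, so the third pair of tables duplicates the second and the sketch reduces to the two-tier case.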
Example: op = 5, g1 = 3, g2 = 1, L1 = 3, Sub = 5 (a match costs 0), so C1 = 5 + 2*3 = 11.
Columns: sequence ACTTAGCC. Rows: sequence CCCTT. Each cell holds D, D1, I, I1 and S; "-" marks an undefined entry.

        -   A   C   T   T   A   G   C   C
-  D    -   -   -   -   -   -   -   -   -
   D1   -   -   -   -   -   -   -   -   -
   I    -   8  11  14  17  20  23  26  29
   I1   -  12  13  14  15  16  17  18  19
   S    0   8  11  14  15  16  17  18  19

C  D    8  16  19  22  23  24  25  26  27
   D1  12  20  23  26  27  28  29  30  31
   I    -  16  13  16  19  22  25  28  25
   I1   -  20  17  18  19  20  21  22  23
   S    8   5   8  16  19  20  21  17  18

C  D   11  13  16  24  26  27  28  25  26
   D1  13  17  20  27  28  29  30  29  30
   I    -  19  21  13  16  19  22  25  28
   I1   -  23  24  17  18  19  20  21  22
   S   11  13   5  13  16  19  20  21  17

C  D   14  16  13  21  24  27  28  28  25
   D1  14  18  17  25  28  30  31  30  29
   I    -  22  24  21  18  21  24  27  28
   I1   -  26  27  25  22  23  24  25  26
   S   14  16  13  10  18  21  24  20  21

T  D   17  19  16  18  26  29  31  28  28
   D1  15  19  18  22  29  31  32  31  30
   I    -  23  26  24  21  18  21  24  27
   I1   -  27  28  28  25  22  23  24  25
   S   15  19  16  13  10  18  21  24  25

T  D   20  22  19  21  18  26  29  31  31
   D1  16  20  19  23  22  30  33  32  31
   I    -  24  27  27  24  21  23  26  29
   I1   -  28  29  30  28  25  26  27  28
   S   16  20  19  16  13  15  23  26  28

The alignment score is S = 28, in the bottom-right cell.