1
Alignment: Global, local, repeated and overlapping Jan
2
1. Global alignment by the Needleman-Wunsch algorithm
From scoring matrix S, with linear gap penalty d
Matrix F:
   0      -d      -2d      …      -nd
  -d    F(1,1)  F(2,1)  F(…,1)  F(n,1)
 -2d    F(1,2)  F(2,2)  F(…,2)  F(n,2)
   …    F(1,…)  F(2,…)  F(…,…)  F(n,…)
 -md    F(1,m)  F(2,m)  F(…,m)  F(n,m)
Total score: F(n,m)
Trace back from F(n,m)
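The fill and traceback described on this slide can be sketched as follows. This is a minimal illustration, not the presenter's code: the function name `needleman_wunsch`, the simple match/mismatch scores standing in for the scoring matrix S, and the default gap penalty `d=2` are all assumptions chosen for the example.

```python
def needleman_wunsch(x, y, match=1, mismatch=-1, d=2):
    """Global alignment with a linear gap penalty d.

    Returns (total score F(n,m), aligned x, aligned y).
    """
    n, m = len(x), len(y)
    # Assumed stand-in for the scoring matrix S.
    s = lambda a, b: match if a == b else mismatch

    # F[i][j] is the best score for aligning x[:i] with y[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = -i * d          # boundary: -d, -2d, ..., -nd
    for j in range(1, m + 1):
        F[0][j] = -j * d          # boundary: -d, -2d, ..., -md

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                          F[i - 1][j] - d,
                          F[i][j - 1] - d)

    # Trace back from F(n,m) to F(0,0).
    ax, ay = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return F[n][m], "".join(reversed(ax)), "".join(reversed(ay))
```

For example, `needleman_wunsch("AC", "A")` pays one gap penalty for the unmatched C and returns `(-1, "AC", "A-")`.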
3
2. Local alignment by the Smith-Waterman algorithm
To break uninformative pairwise combinations
From scoring matrix S, with linear gap penalty
Matrix F:
 F(1,1)  F(2,1)  F(…,1)  F(n,1)
 F(1,2)  F(2,2)  F(…,2)  F(n,2)
 F(1,…)  F(2,…)  F(…,…)  F(n,…)
 F(1,m)  F(2,m)  F(…,m)  F(n,m)
Total score: max over F
Trace back from the max cell
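A minimal sketch of the local variant follows, under the usual Smith-Waterman conventions, which the slide does not spell out: boundary cells are 0, every cell is floored at 0, and the traceback stops at the first 0. The function name `smith_waterman` and the match/mismatch scores are assumptions for illustration.

```python
def smith_waterman(x, y, match=2, mismatch=-1, d=2):
    """Local alignment with a linear gap penalty d.

    Returns (best score, aligned piece of x, aligned piece of y).
    """
    n, m = len(x), len(y)
    # Assumed stand-in for the scoring matrix S.
    s = lambda a, b: match if a == b else mismatch

    # Boundary row and column stay at 0.
    F = [[0] * (m + 1) for _ in range(n + 1)]
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(0,
                          F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                          F[i - 1][j] - d,
                          F[i][j - 1] - d)
            if F[i][j] > best:
                best, bi, bj = F[i][j], i, j

    # Trace back from the max cell until a cell with score 0 is reached.
    ax, ay = [], []
    i, j = bi, bj
    while i > 0 and j > 0 and F[i][j] > 0:
        if F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))
```

With these scores, `smith_waterman("TTTACGTTT", "GGACGGG")` keeps only the shared ACG and drops the uninformative flanks.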
4
3. Repeated matches
Could be used to search for a repeated domain or motif in a protein
Care is required in implementing this algorithm, as it is asymmetric in the sense that
x: represents the sequence (length n) in which we look for repeated matches
y: represents the sequence (length m) containing the domain or motif
5
3. Repeated matches
Traceback from F(n+1,0) records the best alignment
The global alignment contains matched and unmatched regions
Only matches above a threshold, T, are recorded
Scores are computed relative to this threshold
Changing T will thus affect the outcome of the algorithm
If T is too large, some matches will be missed
If T is too small, match regions may be split and too many weak matches may be found
Variation of this algorithm: Waterman-Eggert algorithm (1987)
6
3. Repeated matches (local alignment), with threshold value T
Matrix F:
 F(1,0)  F(2,0)  F(…,0)  F(n,0)  F(n+1,0)
 F(1,1)  F(2,1)  F(…,1)  F(n,1)
 F(1,2)  F(2,2)  F(…,2)  F(n,2)
 F(1,…)  F(2,…)  F(…,…)  F(n,…)
 F(1,m)  F(2,m)  F(…,m)  F(n,m)
Total score: F(n+1,0)
Trace back from F(n+1,0)
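The threshold bookkeeping is easier to follow in code. The sketch below assumes the standard textbook recurrences for this method, which the slides do not state: F(i,0) = max{F(i-1,0), F(i-1,j) - T} and F(i,j) = max{F(i,0), F(i-1,j-1) + s, F(i-1,j) - d, F(i,j-1) - d}, with row 0 set to 0. Here x (length n) is taken to be the sequence scanned for repeats and y (length m) the domain or motif, which is the reading that matches the F(n+1,0) indexing. The function name `repeated_matches` and the scores are illustrative.

```python
def repeated_matches(x, y, T, match=1, mismatch=-1, d=2):
    """Find non-overlapping matches of y inside x that score above T.

    Returns (total score F(n+1,0), list of (start, end) slices of x).
    """
    n, m = len(x), len(y)
    # Assumed stand-in for the scoring matrix S.
    s = lambda a, b: match if a == b else mismatch

    F = [[0] * (m + 1) for _ in range(n + 2)]
    for i in range(1, n + 2):
        # Either the previous residue of x is unmatched,
        # or a match ends there and pays the threshold T.
        F[i][0] = max([F[i - 1][0]] + [F[i - 1][j] - T for j in range(1, m + 1)])
        if i == n + 1:
            break  # the extra cell F(n+1,0) closes the matrix
        for j in range(1, m + 1):
            F[i][j] = max(F[i][0],
                          F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                          F[i - 1][j] - d,
                          F[i][j - 1] - d)

    # Trace back from F(n+1,0), collecting the matched regions of x.
    segments = []
    i, j, end = n + 1, 0, None
    while i > 0 or j > 0:
        if j == 0:
            if F[i][0] == F[i - 1][0]:
                i -= 1                      # unmatched region
            else:
                i -= 1                      # a match ends here: enter it
                j = max(range(1, m + 1), key=lambda k: F[i][k])
                end = i
            continue
        if F[i][j] == F[i][0]:
            j = 0                           # match starts right after x[:i]
        elif i > 0 and F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] - d:
            i -= 1
        else:
            j -= 1
        if j == 0:
            segments.append((i, end))       # x[i:end] is one recorded match
    segments.reverse()
    return F[n + 1][0], segments
```

With `x = "ACGTTACG"` and `y = "ACG"`, a threshold of `T=2` records both copies, each contributing 3 - T = 1 to the total, while `T=5` is too large and records none.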
7
4. Overlap matches
Global alignment strategy which does not penalize overlapping ends
Algorithm is as per global alignment, except at initialization and a slight alteration of traceback
Traceback starts from the maximum recorded element in the mth row and the nth column, i.e., max{F(n,0), F(n,1), …, F(n,m), F(n-1,m), F(n-2,m), …, F(0,m)}
Traceback ends when the 0th row or column is reached
Hitting the 0th row corresponds to the end of sequence x
Hitting the 0th column corresponds to the end of sequence y
8
4. Overlap matches (global alignment)
Matrix F:
 F(1,0)  F(2,0)  F(…,0)  F(n,0)
 F(1,1)  F(2,1)  F(…,1)  F(n,1)
 F(1,2)  F(2,2)  F(…,2)  F(n,2)
 F(1,…)  F(2,…)  F(…,…)  F(n,…)
 F(1,m)  F(2,m)  F(…,m)  F(n,m)
Total score: max over the mth row and the nth column
Trace back from the max cell
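A minimal sketch of the overlap variant follows. It assumes the usual initialization for this method, F(i,0) = 0 and F(0,j) = 0, which the slides mention only as a change "at initialization". The function name `overlap_alignment` and the scores are illustrative.

```python
def overlap_alignment(x, y, match=1, mismatch=-1, d=2):
    """Global-style alignment that does not penalize overhanging ends.

    Returns (best score, aligned piece of x, aligned piece of y).
    """
    n, m = len(x), len(y)
    # Assumed stand-in for the scoring matrix S.
    s = lambda a, b: match if a == b else mismatch

    # Zero boundary: leading overhangs are free.
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                          F[i - 1][j] - d,
                          F[i][j - 1] - d)

    # Start from the max over F(n,0..m) and F(0..n,m): trailing overhangs are free.
    candidates = [(F[n][j], n, j) for j in range(m + 1)]
    candidates += [(F[i][m], i, m) for i in range(n + 1)]
    best, i, j = max(candidates, key=lambda c: c[0])

    # Trace back until the 0th row or column is reached.
    ax, ay = [], []
    while i > 0 and j > 0:
        if F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))
```

For example, `overlap_alignment("AAAACGT", "CGTTTT")` aligns the CGT suffix of x with the CGT prefix of y and charges nothing for the unaligned AAAA and TTT ends.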