1
©CMBI 2005 Sequence Alignment
In phylogeny, one wants to line up residues that came from a common ancestor. For information transfer, one wants to line up residues at similar positions in the structure.
gap = insertion or deletion
2
Global versus Local Alignment
[Figure with two panels, labelled "Global" and "Local"]
3
Global Alignment
Align two sequences from “head to toe”, i.e. from the 5’ end to the 3’ end (nucleotide sequences) or from the N-terminus to the C-terminus (proteins).
Algorithm published by: Needleman, S.B. and Wunsch, C.D. (1970) “A general method applicable to the search for similarities in the amino acid sequence of two proteins”. J. Mol. Biol. 48:443-453.
4
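Global Alignment
    a   a   c   t   t   g   a   g   c   -
c   .   .   .   .   .   .   .   .   .  -6
t   .   .   .   .   .   .   .   .   .  -5
g   .   .   .   .   .   .   .   .   .  -4
a   .   .   .   .   .   .   .   .   .  -3
g   .   .   .   .   .   .   .   .   .  -2
t   .   .   .   .   .   .   .   .   .  -1
-  -9  -8  -7  -6  -5  -4  -3  -2  -1   0
(. = cell not yet filled)
We fill this matrix backwards, starting from the bottom-right corner, using a very simple scoring scheme: identity = 1, other = 0, gaps cost -1.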
5
Global Alignment
(matrix as on slide 4)
Score = where you came from + gap penalty + similarity score
6
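Global Alignment
    a   a   c   t   t   g   a   g   c   -
c   .   .   .   .   .   .   .   .   .  -6
t   .   .   .   .   .   .   .   .   .  -5
g   .   .   .   .   .   .   .   .   .  -4
a   .   .   .   .   .   .   .   .   .  -3
g   .   .   .   .   .   .   .   .   .  -2
t   .   .   .   .   .   .   .   .   0  -1
-  -9  -8  -7  -6  -5  -4  -3  -2  -1   0
For the cell in the bottom sequence row (t), column c: the diagonal move gives 0 + 0 = 0; a move with a gap gives -1 + 0 - 1 = -2. The higher value, 0, is entered.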
7
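Global Alignment
    a   a   c   t   t   g   a   g   c   -
c   .   .   .   .   .   .  -1  -2  -4  -6
t   .   .   .   .   .   .   0  -1  -4  -5
g   .   .   .   .   .   3   1   0  -3  -4
a   .   .   .   .   .   1   2   0  -2  -3
g   .   .   .   .   .   0   0   1  -1  -2
t   .   .   .   .   .  -3  -2  -1   0  -1
-  -9  -8  -7  -6  -5  -4  -3  -2  -1   0
For the cell in the third row (g), sixth column (g): the diagonal move gives 2 + 1 = 3; the two moves with a gap each give 1 + 1 - 1 = 1. The highest value, 3, is entered.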
8
Global Alignment
    a   a   c   t   t   g   a   g   c   -
c   3   4   5   4   3   1  -1  -2  -4  -6
t   1   2   3   4   4   2   0  -1  -4  -5
g  -2  -1   0   1   2   3   1   0  -3  -4
a  -2  -2  -2  -1   0   1   2   0  -2  -3
g  -5  -4  -3  -2  -1   0   0   1  -1  -2
t  -6  -5  -4  -3  -3  -3  -2  -1   0  -1
-  -9  -8  -7  -6  -5  -4  -3  -2  -1   0
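The backward fill shown on slides 4 to 8 can be sketched in a few lines of Python. This is a minimal illustration, not the original authors' code: the function name `fill_matrix` and its parameters are made up for this example. It follows the slide's rule literally (score = where you came from + gap penalty + similarity score), which means the similarity score is added for every move, including moves with a gap.

```python
def fill_matrix(seq_h, seq_v, match=1, mismatch=0, gap=-1):
    """Fill the score matrix backwards, from the bottom-right corner.

    seq_h runs along the columns, seq_v along the rows (hypothetical names).
    """
    rows, cols = len(seq_v), len(seq_h)
    # One extra row and one extra column for the gap character "-".
    F = [[0] * (cols + 1) for _ in range(rows + 1)]
    # Border: each step away from the bottom-right corner costs one gap.
    for j in range(cols + 1):
        F[rows][j] = gap * (cols - j)
    for i in range(rows + 1):
        F[i][cols] = gap * (rows - i)
    # Interior cells, bottom row first, right to left.
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            similarity = match if seq_v[i] == seq_h[j] else mismatch
            came_from = max(
                F[i + 1][j + 1],      # diagonal move, no gap
                F[i + 1][j] + gap,    # move from the cell below, with a gap
                F[i][j + 1] + gap,    # move from the cell to the right, with a gap
            )
            F[i][j] = came_from + similarity
    return F


matrix = fill_matrix("aacttgagc", "ctgagt")
for label, row in zip("ctgagt-", matrix):
    print(label, " ".join(f"{value:3d}" for value in row))
```

Run on the two sequences from the slides, this should print the same values as the completed matrix on slide 8.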
9
Global Alignment
(completed matrix as on slide 8)
Alignment:
aacttgagc
--ct-gagt
10
Global Alignment
(completed matrix as on slide 8)
Alignment:
aacttgagc
--c-tgagt
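Reading an alignment back out of a filled matrix (the traceback) can be sketched as follows. Note the assumptions: this sketch uses the common textbook form of the Needleman-Wunsch recurrence, which fills the matrix forwards and adds the similarity score only on diagonal moves, so its cell values are not the ones on the slides. The function name `needleman_wunsch` and the tie-breaking order in the traceback are choices made for this example; when several alignments score equally, only one of them is returned.

```python
def needleman_wunsch(a, b, match=1, mismatch=0, gap=-1):
    """Global alignment, textbook recurrence; returns (score, row_a, row_b)."""
    n, m = len(a), len(b)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            similarity = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + similarity,  # align a[i-1] with b[j-1]
                          F[i - 1][j] + gap,             # gap in b
                          F[i][j - 1] + gap)             # gap in a
    # Traceback: walk from the last cell to the first, re-deriving each move.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        similarity = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + similarity:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return F[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


score, top, bottom = needleman_wunsch("aacttgagc", "ctgagt")
print(score)
print(top)
print(bottom)
```

Which of the equally scoring alignments is printed depends on the order in which the three moves are tested during the traceback.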
11
Local Alignment
Locate region(s) with a high degree of similarity in two sequences.
Algorithm published by: Smith, T.F. and Waterman, M.S. (1981) “Identification of common molecular subsequences”. J. Mol. Biol. 147:195-197.
12
Local Alignment
    a   a   c   t   t   g   a   g   c   -
c   3   4   5   4   3   1   0   0   1   0
t   1   2   3   4   4   2   1   0   0   0
g   2   1   0   1   2   3   1   1   0   0
a   2   2   1   0   1   1   2   0   0   0
g   0   0   1   1   0   1   0   1   0   0
t   0   0   0   1   1   0   0   0   0   0
-   0   0   0   0   0   0   0   0   0   0
Alignment:
cttgag
ct-gag
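The local matrix on slide 12 differs from the global one in two ways that can be read off the numbers: the border is all zeros, and no cell drops below zero. A minimal sketch, assuming the same backward fill and the same scoring scheme as in the global example (identity = 1, other = 0, gap = -1); the function name `fill_local_matrix` is made up for this example.

```python
def fill_local_matrix(seq_h, seq_v, match=1, mismatch=0, gap=-1):
    """Backward fill for local alignment: zero border, scores floored at 0."""
    rows, cols = len(seq_v), len(seq_h)
    # The extra last row and last column stay at 0, so an alignment may end anywhere.
    F = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            similarity = match if seq_v[i] == seq_h[j] else mismatch
            came_from = max(F[i + 1][j + 1],      # diagonal move, no gap
                            F[i + 1][j] + gap,    # from the cell below, with a gap
                            F[i][j + 1] + gap)    # from the cell to the right, with a gap
            # Never go below zero: a poor region is simply not part of the alignment.
            F[i][j] = max(0, came_from + similarity)
    return F


local = fill_local_matrix("aacttgagc", "ctgagt")
best = max(max(row) for row in local)
start = next((i, j) for i, row in enumerate(local)
             for j, value in enumerate(row) if value == best)
print(best, start)
```

The highest cell marks where the best local alignment starts; for the slide's sequences it sits in row c, column c, which is where the alignment cttgag / ct-gag begins.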
13
Gap Penalty Functions
Linear: penalty rises monotonically with the length of the gap.
Affine: penalty has a gap-opening component and a separate length component.
Probabilistic: penalties may depend upon the character of the residues involved.
Other functions: penalty first rises fast, but levels off at greater gap lengths.
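Three of these penalty shapes can be written as small functions of the gap length. This is only an illustration: the function names, the default parameter values, and the choice of a logarithm for the levelling-off case are assumptions made here, not values from the slides. Penalties are returned as positive costs.

```python
import math


def linear_gap(length, per_residue=1.0):
    """Linear: the cost grows in proportion to the gap length."""
    return per_residue * length


def affine_gap(length, opening=5.0, extension=0.5):
    """Affine: a one-off opening cost plus a cost per gap position."""
    if length <= 0:
        return 0.0
    return opening + extension * length


def levelling_gap(length, scale=3.0):
    """Rises fast for short gaps, then levels off (a logarithm is one option)."""
    return scale * math.log(1 + length)


for length in (1, 2, 5, 10, 50):
    print(length, linear_gap(length), affine_gap(length),
          round(levelling_gap(length), 2))
```

With an affine penalty, one long gap is cheaper than several short gaps of the same total length, because the opening cost is paid only once.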
14
Significance of Alignment
How significant is the alignment that we have found? Put differently: how different is the alignment score that we found from the scores obtained by aligning random sequences to our sequence?
15
Calculating Significance
Repeat N times (N > 100):
  Randomise sequence A by shuffling its residues.
  Align the randomised sequence A with sequence B and calculate the alignment score S.
Calculate the mean and standard deviation of the N random scores.
Calculate the Z-score: Z = (S_genuine - mean(S_random)) / s.d.
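The shuffling procedure above can be sketched as follows. The assumptions are stated in the code as well: `global_score` is a compact global alignment scorer using the textbook recurrence and the simple scheme from the earlier slides (identity = 1, other = 0, gap = -1), the sample standard deviation is used, and the function names, the default N = 200, and the fixed random seed are choices made for this example.

```python
import random
import statistics


def global_score(a, b, match=1, mismatch=0, gap=-1):
    """Best global alignment score (textbook recurrence, two rows of memory)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            similarity = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + similarity,  # align the two residues
                            prev[j] + gap,             # gap in b
                            curr[j - 1] + gap))        # gap in a
        prev = curr
    return prev[-1]


def z_score(seq_a, seq_b, n=200, seed=0):
    """Z = (S_genuine - mean of random scores) / standard deviation."""
    rng = random.Random(seed)  # fixed seed (assumption) so runs are repeatable
    genuine = global_score(seq_a, seq_b)
    random_scores = []
    for _ in range(n):
        residues = list(seq_a)
        rng.shuffle(residues)  # same composition, random order
        random_scores.append(global_score("".join(residues), seq_b))
    mean = statistics.mean(random_scores)
    sd = statistics.stdev(random_scores)  # raises ZeroDivisionError below if sd is 0
    return (genuine - mean) / sd


print(z_score("aacttgagc", "ctgagt"))
```

Shuffling keeps the residue composition of sequence A intact, so the random scores reflect what composition alone can achieve. For sequences as short as the slide example the Z-score is not very informative; the procedure is meant for sequences of realistic length.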
16
Significance of Alignment
[Figure: labels "Random matches" and "Genuine match"; axis "Alignment score"]
17
Significance of Alignment
[Figure: labels "Random matches" and "Random match"; axis "Alignment score"]