Published by Charity McCoy. Modified over 8 years ago.
2
©CMBI 2001 Alignment
Most alignment programs create an alignment that represents what happened during evolution at the DNA level. To carry over information from a well-studied sequence to a new one, we need an alignment that represents the protein structures as they are today.
3
The amino acids
Most information that enters the alignment procedure comes from the physico-chemical properties of the amino acids.
Example: which is the better alignment (left or right)?
CPISRTWASIFRCW
CPISRT---LFRCW   CPISRTL---FRCW
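A minimal sketch of how physico-chemical properties can decide between the two candidates. The residue classes and the scores (identity 2, same class 1, different class 0, gap column -1) are illustrative assumptions, not values from the slides or from any published matrix.

```python
# Illustrative residue classes (an assumption made for this sketch).
GROUPS = [set("AVLIM"), set("FWY"), set("ST"), set("NQ"),
          set("DE"), set("KRH"), set("C"), set("G"), set("P")]

def column_score(a, b):
    """Score one alignment column with the assumed toy scheme."""
    if a == "-" or b == "-":
        return -1          # gap column
    if a == b:
        return 2           # identical residues
    if any(a in g and b in g for g in GROUPS):
        return 1           # different residues, same property class
    return 0               # different property classes

def alignment_score(top, bottom):
    """Sum the column scores of two equal-length aligned strings."""
    assert len(top) == len(bottom)
    return sum(column_score(a, b) for a, b in zip(top, bottom))

TOP = "CPISRTWASIFRCW"
LEFT = "CPISRT---LFRCW"    # L is paired with I
RIGHT = "CPISRTL---FRCW"   # L is paired with W

left_score = alignment_score(TOP, LEFT)
right_score = alignment_score(TOP, RIGHT)
print(left_score, right_score)  # prints: 18 17
```

Under these assumptions the two candidates differ only in the column that holds L: pairing it with I (both aliphatic) scores higher than pairing it with W.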
4
A difficult alignment problem
AYAYAYAYSY
LGLPLPLPLP
5
A difficult alignment problem solved
AYAYAYAYSY
AGAPAPAPSP
LGLPLPLPLP
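Why the middle sequence helps can be checked by counting identical positions. This is a small sketch; the variable names are made up for the illustration, and the sequences are compared without gaps.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

first = "AYAYAYAYSY"
bridge = "AGAPAPAPSP"   # the intermediate sequence
last = "LGLPLPLPLP"

direct = identity(first, last)
via_first = identity(first, bridge)
via_last = identity(bridge, last)
print(direct, via_first, via_last)  # prints: 0.0 0.5 0.5
```

The two outer sequences share no residue at any position, but each shares half of its positions with the intermediate one.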
6
Alignment order
MIESAYTDSW
QFEKSYVTDY

-MIESAYTDSW
QFEKSYVTDY-
7
Alignment order
MIESAYTDSW
QFEKSYVTDY
QWERTYASNF

-MIESAYTDSW
QFEKSYVTDY-
QWERTYASNF-
8
Alignment order
Conclusion: first align the sequences that are most similar to each other. That way you ‘build up information’ while making the alignments that are most likely to be correct.
9
Alignment order
In order to know which sequences look most like each other, you need to do all pairwise alignments first. This is what CLUSTAL does.
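A rough sketch of the "all pairwise alignments first" idea: score every pair with a basic global alignment and rank the pairs, most similar first. The scoring values (match 1, mismatch -1, gap -2) and the names `s1`, `s2`, `s3` are assumptions for the illustration; CLUSTAL's own scoring and tree construction are more elaborate than this.

```python
from itertools import combinations

def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (score only, no traceback)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def pairs_by_similarity(seqs):
    """Return (score, name_a, name_b) for all pairs, best-scoring pair first."""
    scored = [(nw_score(seqs[x], seqs[y]), x, y)
              for x, y in combinations(sorted(seqs), 2)]
    return sorted(scored, reverse=True)

# Hypothetical labels for the three sequences from the earlier slides.
seqs = {"s1": "MIESAYTDSW", "s2": "QFEKSYVTDY", "s3": "QWERTYASNF"}
ranking = pairs_by_similarity(seqs)
for score, x, y in ranking:
    print(x, y, score)
```

The pair at the top of the ranking would be aligned first, and the remaining sequences added afterwards.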
10
Step 1 [figure; labels: D, E]
11
Step 2 [figure; labels: D, E, A, B]
12
Step 3 [figure; labels: D, E, C, A, B]
13
Step 4 [figure; labels: D, E, C, A, B]
14
Other algorithms
Multi-sequence alignment can also be done with an iterative ‘profile’ alignment.
A) Make an alignment of a few well-aligned sequences
B) Align all sequences using this profile
15
1. What is a profile?
Normally, we use a PAM-like matrix to determine the score for each possible match in an alignment. This assumes that every match of I with E deserves the same score, wherever it occurs. But it doesn’t.
16
2. What is a profile?
QWERTYIPASEF
QWEKSFIPGSEY
NWERTMVPVSEM
QFEKTYLPSSEY
NFIKTLMPATEF
QYIRSLIPAGEM
NYIQSLIPSTEL
QFIRSLFPSSEI
  1   2   3
At 1, E and I are both OK.
At 2, I is OK, but E surely not.
At 3, E is OK, but I surely not.
17
3. What is a profile?
The knowledge about which residue types are good for a certain position can be expressed in a profile. A profile holds, for each position, 20 scores for the 20 residue types, and sometimes also two values for gap open and gap elongation.
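A minimal sketch of such a profile, built from the eight aligned sequences of the previous slide. The log-odds formula, the pseudocount of 1 and the uniform background of 1/20 are assumptions chosen for the illustration; gap values are left out.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

ALIGNMENT = [
    "QWERTYIPASEF", "QWEKSFIPGSEY", "NWERTMVPVSEM", "QFEKTYLPSSEY",
    "NFIKTLMPATEF", "QYIRSLIPAGEM", "NYIQSLIPSTEL", "QFIRSLFPSSEI",
]

def build_profile(alignment, pseudocount=1.0):
    """Return one dict per column with 20 log-odds scores (uniform background)."""
    n = len(alignment)
    profile = []
    for column in zip(*alignment):
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (column.count(aa) + pseudocount) / (n + 20 * pseudocount)
            scores[aa] = math.log2(freq / (1 / 20))
        profile.append(scores)
    return profile

profile = build_profile(ALIGNMENT)

# Columns 3, 7 and 11 of the alignment are the positions marked 1, 2 and 3.
for label, col in (("1", 2), ("2", 6), ("3", 10)):
    print(label, round(profile[col]["E"], 2), round(profile[col]["I"], 2))
```

With these assumptions E and I both score positively at position 1, only I at position 2, and only E at position 3, which a single PAM-like matrix value for an I/E match cannot express.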
18
Back to other algorithms
Multi-sequence alignment can also be done with an iterative ‘profile’ alignment.
A) Make an alignment of a few well-aligned sequences
B) Align all sequences using this profile
19
Conserved, variable, or in-between
QWERTYASDFGRGH
QWERTYASDTHRPM
QWERTNMKDFGRKC
QWERTNMKDTHRVW
Gray = conserved
Black = variable
Green = correlated mutations
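The three column types can be told apart by counting distinct residues per column. This is a simple sketch; the one-letter labels `c`, `v` and `i` are made up here, and "in-between" is used for every column that is neither fully conserved nor fully variable.

```python
ALIGNMENT = [
    "QWERTYASDFGRGH",
    "QWERTYASDTHRPM",
    "QWERTNMKDFGRKC",
    "QWERTNMKDTHRVW",
]

def classify_columns(alignment):
    """Label each column: c = conserved, v = variable, i = in-between."""
    labels = []
    for column in zip(*alignment):
        distinct = len(set(column))
        if distinct == 1:
            labels.append("c")      # one residue type in every sequence
        elif distinct == len(column):
            labels.append("v")      # a different residue in every sequence
        else:
            labels.append("i")      # residues shared by subsets of sequences
    return "".join(labels)

labels = classify_columns(ALIGNMENT)
print(labels)  # prints: ccccciiiciicvv
```

The columns labelled `i` are the ones where subsets of the sequences agree with each other.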
20
Correlated mutations determine the tree shape
1 AGASDFDFGHKM
2 AGASDFDFRRRL
3 AGLPDFMNGHSI
4 AGLPDFMNRRRV
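One way to see how columns suggest a grouping of the sequences is to record, for each column, which sequences share a residue, and to count how often each grouping occurs. The helper name `column_split` and the counting approach are assumptions of this sketch, not a tree-building method from the slides.

```python
from collections import Counter, defaultdict

NAMES = [1, 2, 3, 4]
SEQS = ["AGASDFDFGHKM", "AGASDFDFRRRL", "AGLPDFMNGHSI", "AGLPDFMNRRRV"]

def column_split(column, names):
    """Group sequence names by the residue they carry in this column."""
    groups = defaultdict(set)
    for name, residue in zip(names, column):
        groups[residue].add(name)
    return frozenset(frozenset(g) for g in groups.values())

split_counts = Counter()
for column in zip(*SEQS):
    split = column_split(column, NAMES)
    # Skip fully conserved and fully variable columns: they group nothing.
    if 1 < len(split) < len(NAMES):
        split_counts[split] += 1

for split, count in split_counts.most_common():
    print(sorted(sorted(g) for g in split), count)
```

For these four sequences, four columns group 1 with 2 and 3 with 4, while two columns group 1 with 3 and 2 with 4.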
21
Correlation = Information
1, 2 and 5 bind calcium; 3 and 4 don’t. Which residues bind calcium?
  123456789012345
1 ASDFNTDEKLRTTYI
2 ASDFSTDEKLKTTYI
3 LSFFTTDTKLATIYI
4 LSHFLTDLKLATIYI
5 ASDFTTDEKLALTYI
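A small sketch of how the question can be approached mechanically: look for columns where all calcium binders share one residue that none of the non-binders has. This criterion is an assumption made for the illustration, and it only yields candidates; chemical knowledge is still needed to choose among them.

```python
SEQS = {
    1: "ASDFNTDEKLRTTYI",
    2: "ASDFSTDEKLKTTYI",
    3: "LSFFTTDTKLATIYI",
    4: "LSHFLTDLKLATIYI",
    5: "ASDFTTDEKLALTYI",
}
BINDERS = [1, 2, 5]
NON_BINDERS = [3, 4]

def candidate_columns(seqs, binders, non_binders):
    """1-based columns conserved among binders and absent from non-binders."""
    length = len(next(iter(seqs.values())))
    hits = []
    for col in range(length):
        binder_residues = {seqs[k][col] for k in binders}
        other_residues = {seqs[k][col] for k in non_binders}
        if len(binder_residues) == 1 and not (binder_residues & other_residues):
            hits.append(col + 1)
    return hits

candidates = candidate_columns(SEQS, BINDERS, NON_BINDERS)
print(candidates)                          # prints: [1, 3, 8, 13]
print([SEQS[1][c - 1] for c in candidates])  # prints: ['A', 'D', 'E', 'T']
```

Columns 3 and 8 carry D and E in the binders, the kind of residue one would look at first when asking what contacts calcium.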