Published by Mary Rodgers. Modified over 9 years ago.
1
Burkhard Morgenstern, Institut für Mikrobiologie und Genetik
Foundations of Bioinformatics (Grundlagen der Bioinformatik)
Multiple Sequence Alignment
June 2007
2
'Progressive' Alignment
Most popular approach to (global) multiple sequence alignment: progressive alignment
Since the mid-eighties: Feng/Doolittle, Higgins/Sharp, Taylor, …
3
'Progressive' Alignment
WCEAQTKNGQGWVPSNYITPVN
WWRLNDKEGYVPRNLLGLYP
AVVIQDNSDIKVVPKAKIIRD
YAVESEAHPGSFQPVAALERIN
WLNYNETTGERGDFPGTYVEYIGRKKISP
4
'Progressive' Alignment
WCEAQTKNGQGWVPSNYITPVN
WWRLNDKEGYVPRNLLGLYP
AVVIQDNSDIKVVPKAKIIRD
YAVESEAHPGSFQPVAALERIN
WLNYNETTGERGDFPGTYVEYIGRKKISP
Guide tree
5
'Progressive' Alignment
WCEAQTKNGQGWVPSNYITPVN
WW--RLNDKEGYVPRNLLGLYP-
AVVIQDNSDIKVVP--KAKIIRD
YAVESEASFQPVAALERIN
WLNYNEERGDFPGTYVEYIGRKKISP
Profile alignment, “once a gap - always a gap”
6
'Progressive' Alignment
WCEAQTKNGQGWVPSNYITPVN
WW--RLNDKEGYVPRNLLGLYP-
AVVIQDNSDIKVVP--KAKIIRD
YAVESEASVQ--PVAALERIN------
WLN-YNEERGDFPGTYVEYIGRKKISP
Profile alignment, “once a gap - always a gap”
7
'Progressive' Alignment
WCEAQTKNGQGWVPSNYITPVN-
WW--RLNDKEGYVPRNLLGLYP-
AVVIQDNSDIKVVP--KAKIIRD
YAVESEASVQ--PVAALERIN------
WLN-YNEERGDFPGTYVEYIGRKKISP
Profile alignment, “once a gap - always a gap”
8
'Progressive' Alignment
WCEAQTKNGQGWVPSNYITPVN--------
WW--RLNDKEGYVPRNLLGLYP--------
AVVIQDNSDIKVVP--KAKIIRD-------
YAVESEA---SVQ--PVAALERIN------
WLN-YNE---ERGDFPGTYVEYIGRKKISP
Profile alignment, “once a gap - always a gap”
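The "once a gap - always a gap" rule can be illustrated with a minimal Python sketch. It assumes the column-to-column path between two already aligned groups has been computed elsewhere (for example by a profile-profile dynamic program, which is not shown). The function name `merge_profiles`, the `path` representation, and the toy sequences are hypothetical and chosen only for illustration. The point is that existing gap characters in a group are never removed; new gaps are only ever added as whole columns.

```python
def merge_profiles(group_a, group_b, path):
    """Merge two aligned groups along a given column path.

    group_a, group_b: lists of equal-length aligned rows (gaps as '-').
    path: list of (i, j) column indices into group_a / group_b;
          None means "insert a gap column into that group here".
    Existing gaps inside each group are copied unchanged
    ("once a gap - always a gap").
    """
    out_a = [[] for _ in group_a]
    out_b = [[] for _ in group_b]
    for i, j in path:
        for k, row in enumerate(group_a):
            out_a[k].append(row[i] if i is not None else "-")
        for k, row in enumerate(group_b):
            out_b[k].append(row[j] if j is not None else "-")
    return ["".join(r) for r in out_a + out_b]


# Toy example (hypothetical data): a two-sequence profile and a single sequence.
profile = ["E-YENS", "ERYENS"]
single = ["ERYAS"]
# Assumed path: columns 0-3 pair up, column 4 of the profile faces a new gap,
# column 5 of the profile pairs with column 4 of the single sequence.
path = [(0, 0), (1, 1), (2, 2), (3, 3), (4, None), (5, 4)]
merged = merge_profiles(profile, single, path)
print("\n".join(merged))
```

The gap already present in the first row of the profile survives the merge untouched, which is exactly why an early misplaced gap cannot be corrected later in a progressive scheme.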
9
'Progressive' Alignment
WCEAQTKNGQGWVPSNYITPVN--------
WW--RLNDKEGYVPRNLLGLYP--------
AVVIQDNSDIKVVP--KAKIIRD-------
YAVESEA---SVQ--PVAALERIN------
WLN-YNE---ERGDFPGTYVEYIGRKKISP
Most important implementation: CLUSTAL W
10
'Progressive' Alignment
CLUSTAL W; Thompson et al., 1994 (~17,000 citations)
Pairwise distances as 1 - percent identity (expressed as a fraction)
Calculate un-rooted tree with Neighbor Joining
Define root as central position in tree
Define sequence weights based on tree
Gap penalties calculated based on various parameters
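The first step above, turning pairwise identity into a distance, can be sketched in a few lines of Python. This is a simplified illustration, not CLUSTAL W's actual code: it assumes the two rows come from an existing pairwise alignment, and it assumes identity is counted over columns where neither row has a gap (other denominators are possible). The function name `identity_distance` is hypothetical.

```python
def identity_distance(row_a: str, row_b: str) -> float:
    """Distance = 1 - fraction of identical residues in a pairwise alignment.

    Assumption: columns containing a gap ('-') in either row are ignored.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = [(x, y) for x, y in zip(row_a, row_b) if x != "-" and y != "-"]
    if not compared:
        return 1.0  # nothing comparable: treat as maximally distant
    identical = sum(x == y for x, y in compared)
    return 1.0 - identical / len(compared)


# Toy rows (hypothetical data)
d_close = identity_distance("E-YENS", "ERYENS")  # all 5 compared columns match
d_far = identity_distance("ERYENS", "ERYA-S")    # 4 of 5 compared columns match
print(d_close, round(d_far, 3))
```

A matrix of such distances over all sequence pairs is the input that a tree-building method such as Neighbor Joining would then consume.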
11
Tools for multiple sequence alignment
Problems with traditional approach:
Results depend on gap penalty
Heuristic guide tree determines alignment; alignment used for phylogeny reconstruction
Algorithm produces global alignments.
12
Tools for multiple sequence alignment
Problems with traditional approach:
But: Many sequence families share only local similarity
E.g. sequences share one conserved motif
13
Local sequence alignment
Find common motif in sequences; ignore the rest
EYENS
ERYENS
ERYAS
14
Local sequence alignment
Find common motif in sequences; ignore the rest
E-YENS
ERYENS
ERYA-S
15
Local sequence alignment
Find common motif in sequences; ignore the rest
Local alignment
E-YENS
ERYENS
ERYA-S
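For two sequences, "find the common motif, ignore the rest" is what the classical Smith-Waterman algorithm does. The sketch below covers only the pairwise case with a linear gap cost; the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) and the test sequences are arbitrary assumptions for illustration, not parameters taken from the slides.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Pairwise local alignment with a linear gap penalty.

    Returns (best_score, aligned_a, aligned_b) for one optimal local alignment.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets an alignment restart anywhere: this makes it local.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical sequences: a shared motif embedded in unrelated flanks.
result = smith_waterman("XXERYENSZZ", "QQERYENSWW")
print(result)
```

Only the shared motif is reported; the unrelated flanking residues are left out of the alignment instead of being forced into columns, as a global method would do.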
16
Local sequence alignment Traditional alignment approaches: Either global or local methods!
17
New question: sequence families with multiple local similarities
Neither local nor global methods are applicable
18
New question: sequence families with multiple local similarities Alignment possible if order conserved
19
The DIALIGN approach
30
Consistency!
31
The DIALIGN approach
49
T-COFFEE
C. Notredame, D. Higgins, J. Heringa (2000), T-Coffee: A novel method for fast and accurate multiple sequence alignment, J. Mol. Biol.
Problem: progressive alignment can go wrong if mistakes are made at an early stage.
Example …
50
T-COFFEE
SeqA GARFIELD THE LAST FAT CAT
SeqB GARFIELD THE FAST CAT
SeqC GARFIELD THE VERY FAST CAT
SeqD THE FAT CAT
52
T-COFFEE
53
Idea: consider different pairwise alignments (local and global) and check how these alignments support each other
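One way to make "support each other" concrete is the library extension step described for T-Coffee: a residue pair gains weight whenever a third residue is aligned to both of its members, with the weaker of the two links as the contribution. The sketch below is a simplified illustration under that assumption; the data layout (residues as `(sequence_name, position)` tuples, pairs as `frozenset` keys), the function name `extend_library`, and the toy weights are all hypothetical.

```python
from collections import defaultdict


def extend_library(primary):
    """Add triplet support to a primary library of aligned residue pairs.

    primary: dict mapping frozenset({x, y}) -> weight, where x and y are
             residues written as (sequence_name, position) from different
             sequences.
    Extended weight of (x, y) = primary weight (if any)
                              + sum over residues z of min(W(x, z), W(z, y)).
    """
    neighbours = defaultdict(dict)
    for pair, weight in primary.items():
        x, y = tuple(pair)
        neighbours[x][y] = weight
        neighbours[y][x] = weight

    extended = defaultdict(float)
    for pair, weight in primary.items():
        extended[pair] += weight

    # Every residue z that is linked to both x and y supports the pair (x, y).
    for z, partners in neighbours.items():
        linked = list(partners.items())
        for a in range(len(linked)):
            for b in range(a + 1, len(linked)):
                (x, wx), (y, wy) = linked[a], linked[b]
                if x[0] == y[0]:
                    continue  # residues of the same sequence are never aligned
                extended[frozenset((x, y))] += min(wx, wy)
    return dict(extended)


# Toy primary library (hypothetical weights) for one residue in each of A, B, C.
a0, b0, c0 = ("SeqA", 0), ("SeqB", 0), ("SeqC", 0)
primary = {
    frozenset((a0, b0)): 88,
    frozenset((a0, c0)): 77,
    frozenset((c0, b0)): 100,
}
extended = extend_library(primary)
print(extended[frozenset((a0, b0))])
```

Pairs that are consistent with many other pairwise alignments end up with high weights, so a progressive alignment guided by these weights is less likely to commit to a spurious early match.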
54
T-COFFEE
56
T-COFFEE
Less sensitive to spurious pairwise similarities
Can handle local homologies better than CLUSTAL
57
Evaluation of multi-alignment methods
Alignment evaluation by comparison to trusted benchmark alignments.
'True' alignment known by information about structure or evolution.
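Comparison to a benchmark is commonly quantified with a sum-of-pairs style score: the fraction of residue pairs aligned in the reference that the program's alignment also puts in the same column. The Python sketch below is a simplified version of that idea, assuming gaps are written as `-` or `.` and that every column of the reference counts (real benchmark scoring may restrict the comparison to trusted regions). The function names and the toy alignments are hypothetical.

```python
def aligned_pairs(rows):
    """Set of residue pairs that share a column.

    Each residue is identified as (row_index, residue_index_without_gaps).
    """
    counters = [0] * len(rows)
    pairs = set()
    for col in range(len(rows[0])):
        present = []
        for s, row in enumerate(rows):
            if row[col] not in "-.":
                present.append((s, counters[s]))
                counters[s] += 1
        for a in range(len(present)):
            for b in range(a + 1, len(present)):
                pairs.add((present[a], present[b]))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Toy data (hypothetical): same sequences, two different alignments.
reference = ["E-YENS", "ERYENS", "ERYA-S"]
candidate = ["EYENS-", "ERYENS", "ERYAS-"]
sp = sum_of_pairs_score(candidate, reference)
print(sp)
```

A score of 1.0 means every reference pair was recovered; the ungapped candidate above recovers only half of them.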
58
Evaluation of multi-alignment methods
For protein alignment:
M. McClure et al. (1994): 4 protein families, known functional sites
J. Thompson et al. (1999): Benchmark database, 130 known 3D structures (BAliBASE)
T. Lassmann & E. Sonnhammer (2002): BAliBASE + simulated evolution (ROSE)
59
Evaluation of multi-alignment methods
61
Evaluation of multi-alignment methods
62
Evaluation of multi-alignment methods
BAliBASE reference alignments

1aboA   1 .NLFVALYDfvasgdntlsitkGEKLRVLgynhn..............gE
1ycsB   1 kGVIYALWDyepqnddelpmkeGDCMTIIhrede............deiE
1pht    1 gYQYRALYDykkereedidlhlGDILTVNkgslvalgfsdgqearpeeiG
1ihvA   1 .NFRVYYRDsrd......pvwkGPAKLLWkg.................eG
1vie    1 .drvrkksga.........awqGQIVGWYctnlt.............peG

1aboA  36 WCEAQt..kngqGWVPSNYITPVN......
1ycsB  39 WWWARl..ndkeGYVPRNLLGLYP......
1pht   51 WLNGYnettgerGDFPGTYVEYIGrkkisp
1ihvA  27 AVVIQd..nsdiKVVPRRKAKIIRd.....
1vie   28 YAVESeahpgsvQIYPVAALERIN......

Key: alpha helix RED, beta strand GREEN, core blocks UNDERSCORE
63
5 categories of benchmark sequences (globally related, internal gaps, end gaps)
CLUSTAL W and PRRP perform well on globally related sequences; DIALIGN is superior for local similarities
Conclusion: there is no single best multi-alignment program!
64
Evaluation of multi-alignment methods T. Lassmann & E. Sonnhammer (2002): BAliBASE + simulated evolution (ROSE)
66
Result: DIALIGN best for distantly related sequences, T-COFFEE best for closely related sequences