Published by Dorthy Lang. Modified over 9 years ago.
1
T-COFFEE, a novel method for combining biological information
Cédric Notredame
2
Potential Uses of A Multiple Sequence Alignment?

chite  ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat  --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr  KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse  -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
       ***. :::.:... :.. *. *: *

chite  AATAKQNYIRALQEYERNGG-
wheat  ANKLKGEYNKAIAAYNKGESA
trybr  AEKDKERYKREM---------
mouse  AKDDRIRYDNEMKSWEEQMAE
       * :.*. :

Extrapolation
Motifs/Patterns
Phylogeny
Profiles
Struc. Prediction

Multiple Alignments Are CENTRAL to MOST Bioinformatics Techniques.
3
Why Is It Difficult To Compute A Multiple Sequence Alignment?

A CROSSROAD PROBLEM
BIOLOGY: What is A Good Alignment
COMPUTATION: What is THE Good Alignment

chite  ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat  --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr  KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse  -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP
       ***. :::.:... :.. *. *: *
4
Why Is It Difficult To Compute A Multiple Sequence Alignment?

BIOLOGY
CIRCULAR PROBLEM...
Good Sequences
Good Alignment
COMPUTATION
5
Dynamic Programming Using A Substitution Matrix Progressive Alignment
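As a rough illustration of dynamic programming with a substitution matrix, here is a minimal Needleman-Wunsch style scoring sketch. The function names, the toy substitution function and the linear gap penalty are all assumptions made for this example; they are not taken from the slides or from any alignment package.

```python
def global_alignment_score(seq_a, seq_b, substitution, gap=-4):
    """Best global alignment score of seq_a vs seq_b.

    `substitution` is any function returning a score for two residues
    (a stand-in for a real substitution matrix); `gap` is a linear
    gap penalty. Both are illustrative assumptions.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: a prefix aligned entirely against gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell keeps the best of: align two residues, or open a gap
    # in one of the two sequences.
    for i in range(1, rows):
        for j in range(1, cols):
            match = score[i - 1][j - 1] + substitution(seq_a[i - 1], seq_b[j - 1])
            gap_in_b = score[i - 1][j] + gap
            gap_in_a = score[i][j - 1] + gap
            score[i][j] = max(match, gap_in_b, gap_in_a)

    return score[-1][-1]


def toy_matrix(x, y):
    """Hypothetical substitution scores: +2 for identity, -1 otherwise."""
    return 2 if x == y else -1
```

With the toy scores, `global_alignment_score("ACGT", "AGT", toy_matrix)` finds three matches and one gap. A progressive aligner would apply the same recursion to pairs of sequences or profiles while following a guide tree.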
6
The T-Coffee Algorithm
7
Progressive Alignment Principle and its Limitations…
8
The Extended Library Principle…
10
The Triplet Assumption SEQ A SEQ B
11
Weighting And Extension Extension=Using Information from Other Sequences Weighting=Using The surrounding Information (Coffee)
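The extension step can be sketched in a few lines. The sketch assumes that each primary constraint is a pair of residues with a weight, and that a triplet A-C-B contributes the minimum of its two weights to the pair A-B. The data layout and the name `extend_library` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def extend_library(primary):
    """Return extended weights for a primary library.

    `primary` maps ((seq_x, pos_x), (seq_y, pos_y)) -> weight.
    Assumption: the extended weight of a pair is its direct weight plus,
    for every residue of a third sequence linked to both, the minimum
    of the two link weights.
    """
    # Index every residue by the residues it is directly paired with.
    links = defaultdict(dict)
    for (r1, r2), weight in primary.items():
        links[r1][r2] = weight
        links[r2][r1] = weight

    extended = defaultdict(int)

    # Direct contribution, stored once per pair with the key sorted.
    for r1, neighbours in links.items():
        for r2, weight in neighbours.items():
            if r1 < r2:
                extended[(r1, r2)] += weight

    # Triplet contribution: r1 and r2 are both linked to `middle`.
    for middle, neighbours in links.items():
        items = sorted(neighbours.items())
        for a in range(len(items)):
            for b in range(a + 1, len(items)):
                (r1, w1), (r2, w2) = items[a], items[b]
                if len({r1[0], r2[0], middle[0]}) == 3:  # three distinct sequences
                    extended[(r1, r2)] += min(w1, w2)

    return dict(extended)


# Toy primary library over sequences A, B and C (weights are made up).
toy_primary = {
    (("A", 1), ("B", 1)): 88,
    (("A", 1), ("C", 1)): 77,
    (("C", 1), ("B", 1)): 100,
}
toy_extended = extend_library(toy_primary)
```

In this toy case the pair A1/B1 ends with a weight of 165: 88 from the direct alignment plus min(77, 100) contributed through sequence C.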
12
T-Coffee Progressive Alignment Notredame, Higgins, Heringa, 2000 Dynamic Programming Using The extended Library
13
Local Alignment Global Alignment Extension Multiple Sequence Alignment Mixing Local and Global Alignments
14
What is a library?

2
Seq1 MySeq
Seq2 MyotherSeq
#1 2
1 1 25
3 8 70
….

3
Seq1 anotherseq
Seq2 atsecondone
Seq3 athirdone
#1 2
1 1 25
#1 3
3 8 70
….

Extension+T-Coffee
Library Based Multiple Sequence Alignment
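To make the library layout concrete, here is a small parser sketch for the simplified format shown on this slide: a sequence count, one "name sequence" line per sequence, then "#i j" headers followed by "residue_i residue_j weight" lines. The function name `parse_library` is hypothetical, and the file format accepted by the actual program may carry additional fields that this sketch ignores.

```python
def parse_library(text):
    """Parse the simplified library layout shown on the slide.

    Returns (sequences, constraints) where sequences is a list of
    (name, sequence) tuples and constraints is a list of
    ((seq_i, seq_j), residue_i, residue_j, weight) tuples.
    Indices are kept 1-based, as on the slide.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_seq = int(lines[0])
    sequences = [tuple(ln.split(None, 1)) for ln in lines[1:1 + n_seq]]

    constraints = []
    pair = None
    for ln in lines[1 + n_seq:]:
        if ln.startswith("#"):
            # "#1 2" announces constraints between sequences 1 and 2.
            first, second = ln[1:].split()
            pair = (int(first), int(second))
        else:
            if pair is None:
                raise ValueError("constraint line before any '#i j' header")
            res_i, res_j, weight = (int(field) for field in ln.split())
            constraints.append((pair, res_i, res_j, weight))
    return sequences, constraints


example = """3
Seq1 anotherseq
Seq2 atsecondone
Seq3 athirdone
#1 2
1 1 25
#1 3
3 8 70
"""
sequences, constraints = parse_library(example)
```

Each constraint reads as "residue 1 of Seq1 aligned with residue 1 of Seq2, weight 25", which is the kind of record the extension step operates on.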
15
Validating T-Coffee
16
What Is BaliBase?

BaliBase is a collection of reference Multiple Alignments.
The Structures of the Sequences are known and were used to assemble the MALN (multiple alignment).
Evaluation is carried out by Comparing the Structure Based Reference Alignment With its Sequence Based Counterpart.
17
BaliBase DALI, Sap … Method X Comparison
18
Validation Using BaliBase T-Coffee Results
19
Validation Using BaliBase
24
Choosing The Right Method (MAFFT evaluation)
26
Taking T-Coffee Further: Using Structures
27
Mixing Heterogeneous Information With T-Coffee

Local Alignment
Global Alignment
Multiple Sequence Alignment
Multiple Alignment
Structural
Specialist
28
Why Do We Want To Mix Sequences and Structures?

Sequences are Cheap and Common. Structures are Expensive and Rare.
STRUCTURE FUNCTION
We WANT to use Structural information in multiple alignments:
To help the alignment
To extrapolate from Structures to Sequences.
29
Better gap penalties (ClustalW). Helping an Alignment With Structures? Low gap penalties high gap penalties
30
Better gap penalties (ClustalW). Helping an Alignment With Structures? Revealing Very Distant Relationships 1hstA 1tc3c
31
Is It Possible to Use Structural Information?

Any_pair: THE new T-Coffee method
Struct Vs Struct: SAP
Seq Vs Struct: FUGUE
Seq Vs Seq: Local, Global
Evaluation on HOMSTRAD
32
Is It Possible to Use Structural Information?
Validation of Any_pair on the HOMSTRAD Database (Orla O’Sullivan, Des Higgins and C. Notredame)

Data       Method    Result
Seq        CW        35.2 %
Seq        TC        38.4 %
1 Struc    TC+FU     41.9 %
2 Struc    TC+SA     41.8 %
2 Struc    TC+SA+FU  51.7 %
ALL Struc  TC+SA     66.7 %

CW: Clustal W
TC: T-Coffee default
SA: T-Coffee Using SAP
FU: T-Coffee Using FUGUE

Result: % of columns correctly aligned as judged from the HOMSTRAD reference Alignment
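The "% of columns correctly aligned" measure can be sketched as follows. This is one plausible reading of the measure, not the evaluation code used for the table: a reference column counts as correct when the test alignment contains a column pairing exactly the same residues. The function names and the dictionary layout are assumptions.

```python
def alignment_columns(alignment):
    """Turn an alignment into a list of columns of residue indices.

    `alignment` maps sequence name -> gapped string (all the same length).
    Each column is a tuple with, per sequence (in sorted name order),
    the 0-based index of the residue in that column, or None for a gap.
    """
    names = sorted(alignment)
    next_index = {name: 0 for name in names}
    columns = []
    for position in range(len(alignment[names[0]])):
        column = []
        for name in names:
            if alignment[name][position] == "-":
                column.append(None)
            else:
                column.append(next_index[name])
                next_index[name] += 1
        columns.append(tuple(column))
    return columns


def column_score(reference, test):
    """Percentage of reference columns reproduced exactly in `test`."""
    reference_columns = alignment_columns(reference)
    test_columns = set(alignment_columns(test))
    found = sum(1 for column in reference_columns if column in test_columns)
    return 100.0 * found / len(reference_columns)


toy_reference = {"a": "AC-GT", "b": "ACTGT"}
toy_test = {"a": "ACG-T", "b": "ACTGT"}
toy_score = column_score(toy_reference, toy_test)
```

In the toy case three of the five reference columns are recovered. Stricter or looser variants (for example, scoring only columns without gaps) are equally defensible and would give different percentages.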
33
Of the Importance of being Trustworthy… Identifying Good Bits in an Alignment
34
How Good Is my Alignment?

cah2_human  NGPEHWHK-DFPIAKGERQSPVDIDTHTAKYDP------------SLKPLSVS--YDQAT
cahp_mouse  --GVEWGL-VFPDANGEYQSPINLNSREARYDP------------SLLDVRLSPNYVVCR
cah4_rat    SGPEQWTG----DCKKNQQSPINIVTSKTKLNP------------SLTPFTFVG-YDQKK
ptpg_mouse  YGPEHWVT-SSVSCGGSHQSPIDILDHHARVGD------------EYQELQLDG-FDNES
cah6_human  LDEAHWPQ-HYPACGGQRQSPINLQRTKVRYNP------------SLKGLNMTGYETQAG
cah_dunsa   -VGFDWTGGVCVNTGTSKQSPINIETDSLAEESERLGTADDTSRLALKGLLSS--SYQLT
cahh_varv   --------------MSQQLSPINIETKKAISNA------------RLKPLNIH--YNESK
cah2_chlre  EGKDGAG-NPWVCKTGRKQSPINVPQYHVLDGK------------GSK--IATGLQTQWS
            **:::

cah2_human  ---------SLRILNNGHAFNVEFDD-SQDKAVLK--------------------GGPLD
cahp_mouse  ---------DCEVTNDGHTIQVILKS----KSVLS--------------------GGPLP
cah4_rat    ---------KWEVKNNQHSVEMSLGE----DIYIF--------------------GGDLP
ptpg_mouse  SN-------KTWMKNTGKTVAILLKD----DYFVS--------------------GAGLP
cah6_human  ---------EFPMVNNGHTVQIGLPS----TMRMT--------------------VAD-G
cah_dunsa   ---------SEVAINLEQDMQFSFNAPDEDLPQLT--------------------IGGVV
cahh_varv   ---------PTTIQNTGKLVRINFKG-----GYLS--------------------GGFLP
cah2_chlre  YPDLMSNGSSVQVINNGHTIQVQWTY----DYAGHATIAIPAMRNQSNRIVDVLEMRPND
            * :..

cah2_human  G----TYRLIQFHFHWGSLD--GQGSEHTVDKKKYAAELHLVHWNTK-YGDFGKAVQQPD
cahp_mouse  Q--GQEFELYEVRFHWGREN--QRGSEHTVNFKAFPMELHLIHWNSTLFGSIDEAVGKPH
cah4_rat    T----QYKAIQLHLHWSEES--NKGSEHSIDGKHFAMEMHVVHKKMTTGDKVQDSDSKD-
ptpg_mouse  G----RFKAEKVEFHWGHSNG-SAGSEHSVNGRRFPVEMQIFFYNPDDFDSFQTAISENR
cah6_human  I----VYIAQQMHFHWGGASSEISGSEHTVDGIRHVIEIHIVHYNS-KYKTYDIAQDAPD
cah_dunsa   H----TFKPVQIHFH-------HFASEHAIDGQLYPLEAHMVMASQN-DGS--------D
cahh_varv   N----EYVLSSLHIYWGKED--DYGSNHLIDVYKYSGEINLVHWNKKKYSSYEEAKKHDD
cah2_chlre  ASDRVTAVPTQFHFH--------STSEHLLAGKIFPLELHIVHKVTD---KLEACKG--G
            ...:: *:* :. * ::.
35
Measuring The Local Reliability: CORE Measure of Reliability

cah2_human  NGPEHWHK-DFPIAKGERQSPVDIDTHTAKYDPSLKPLSVS
cahp_mouse  --GVEWGL-VFPDANGEYQSPINLNSREARYDPSLLDVRLS
cah4_rat    SGPEQWTG----DCKKNQQSPINIVTSKTKLNPSLTPFTFV
ptpg_mouse  YGPEHWVT-SSVSCGGSHQSPIDILDHHARVGDEYQELQLD
cah6_human  LDEAHWPQ-HYPACGGQRQSPINLQRTKVRYNPSLKGLNMT

Core(Q) = Σ Escore(Q,x) / (N * Max Escore)
36
CORE index Specificity ( ) and Sensitivity ( ) 0.48
37
What is the Local Quality of my Alignment II I
38
Using Consistency For Automatic Annotation?

T-COFFEE, Version_1.24(Wed Nov 15 18:31:29 PST 2000)
Notredame, Higgins, Heringa, JMB(302)pp205-217,2000
CPU TIME:11 sec.
SCORE=39
*
 BAD AVG GOOD
*
cah2_human  : 42
cah4_rat    : 41
cah6_human  : 40
cahp_mouse  : 43
cah_dunsa   : 33

cah2_human  77664444-454555557666665554444444------------33322222-
cah4_rat    54553332----233445655555554444444------------443323221
cah6_human  44333443-333344445555444444444444------------444433331
cahp_mouse  --633453-333345565554444334444455------------555444331
cah_dunsa   -34334320212223456555555543333333ERLGTADDTSRL22222111-
cah2_chlre  7663333-0333334566666555444343322------------222--1110
ptpg_mouse  67763343-333334445444433333333333------------332222221
cahh_varv   --------------5555555555554444433------------33322211-
Cons        655433430333334455555554444444443------------333322221

cah2_human  -11121---------22223334333322321-00011222-------------
cah4_rat    -22222---------23333344443344442----22222-------------
cah6_human  001122---------22233344333333433----22222-------------
cahp_mouse  022333---------34344455554444543----33334-------------
cah_dunsa   -11111---------11111111111111110P00000111-------------
cah2_chlre  00000000DLMSNGS11223333333433332----22111ATIAIPAMRNQSN
ptpg_mouse  -1111100-------12234445444544433----33333-------------
cahh_varv   -11222---------22233333333333322-----1122-------------
Cons        01112100-------22233334333333332-00022222-------------
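One way to use such a score listing for annotation is to flag the residues whose score digit is low. The sketch below lowercases them. It assumes that digits run from 0 (BAD) to 9 (GOOD) as the scale in the output suggests; the threshold of 3 and the function name `mask_unreliable` are arbitrary choices for illustration.

```python
def mask_unreliable(aligned_sequence, score_line, threshold=3):
    """Lowercase residues whose per-residue score is below `threshold`.

    `aligned_sequence` is a gapped sequence row and `score_line` the
    matching row of score digits. Positions whose score character is
    not a digit (gaps, unscored residues) are left untouched.
    """
    if len(aligned_sequence) != len(score_line):
        raise ValueError("sequence and score rows must have the same length")

    masked = []
    for residue, score in zip(aligned_sequence, score_line):
        if score.isdigit() and int(score) < threshold:
            masked.append(residue.lower())
        else:
            masked.append(residue)
    return "".join(masked)


# Toy rows (not taken from the listing above).
toy_masked = mask_unreliable("NGPEHWHK-D", "77662144-4")
```

Here the two residues scored 2 and 1 come back in lowercase, so the trusted portion of the row can be read at a glance.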
39
Evaluating An Alignment Not Generated With T-Coffee:
t_coffee -infile CLUSTALW_ALN -in Library -do_score
40
Running T-Coffee ONLINE
41
WHERE ? Cedric.notredame@europe.com igs-server.cnrs-mrs.fr/~cnotred igs-server.cnrs-mrs.fr/Tcoffee
42
The T-Coffee Server
44
ES45, 4Proc 1 Gb RAM
45
T-Coffee Server HP/Compaq-ES45/4-2G
46
The T-Coffee Server
47
Data Input
48
The Right Parameters
49
The T-Coffee Server
50
Evaluating An Alignment
52
The T-Coffee Server
59
Future…
60
Large Scale…
61
Tailor Made…
63
WHO ?

WHO USES T-Coffee ?
Dali Domain Dictionary
Pfam
SwissProt

WHO Makes T-Coffee ?
Cédric Notredame
Des Higgins
Chantal Abergel
Olivier Poirot
Orla O’Sullivan