Integrating Biological Information In Multiple Sequence Alignments Confronting Bits and Pieces of Information Cédric Notredame CNRS-Marseille, France www.tcoffee.org.

1 Integrating Biological Information In Multiple Sequence Alignments Confronting Bits and Pieces of Information Cédric Notredame CNRS-Marseille, France www.tcoffee.org

2 Building and Using Models 35.67 Angstrom

3 Computing the Correct Alignment is a Complicated Problem

4 T-Coffee and Consistency…

5

6

7

8

9

10

11 Too Many Methods for ONE Alignment: M-Coffee

12 Combining Many MSAs into ONE [Diagram: MUSCLE, MAFFT, ClustalW, ??????? combined by T-Coffee]

13 Comparing Methods MAFFT

14 Going Further

15 Place your Bets…

16 Where to Trust Your Alignments [Legend: Most Methods Agree / Most Methods Disagree]
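
The idea on this slide can be illustrated with a small sketch: a residue pair that most methods place in the same column is more trustworthy than one that few methods agree on. This is only an illustration under stated assumptions, not M-Coffee's actual scoring; the function names (`aligned_pairs`, `pair_support`) and the toy alignments are hypothetical.

```python
from collections import Counter
from itertools import combinations


def aligned_pairs(msa):
    """Return the set of residue pairs that an MSA places in the same column.

    `msa` maps sequence name -> gapped string ('-' is a gap). Each pair is
    ((name_a, index_a), (name_b, index_b)) with 0-based ungapped indices.
    """
    names = sorted(msa)
    length = len(msa[names[0]])
    counters = {name: 0 for name in names}
    pairs = set()
    for col in range(length):
        present = []
        for name in names:
            if msa[name][col] != "-":
                present.append((name, counters[name]))
                counters[name] += 1
        pairs.update(combinations(present, 2))
    return pairs


def pair_support(msas):
    """Fraction of the input MSAs that support each aligned residue pair."""
    counts = Counter()
    for msa in msas:
        counts.update(aligned_pairs(msa))
    return {pair: n / len(msas) for pair, n in counts.items()}


# Toy alignments of the same two sequences, standing in for three methods.
method_1 = {"seqA": "GARFIELD", "seqB": "GAR-IELD"}
method_2 = {"seqA": "GARFIELD", "seqB": "GAR-IELD"}
method_3 = {"seqA": "GARFIELD", "seqB": "GARI-ELD"}

support = pair_support([method_1, method_2, method_3])
```

Pairs with support close to 1 are the ones all methods agree on; pairs with low support mark the regions where the alignment deserves less trust.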

17 When Sequences Are not Enough: 3D-Coffee and Expresso

18 3D-Coffee: Combining Sequences and Structures Within Multiple Sequence Alignments

19 Expresso: Finding the Right Structure. Template-based Alignment of the Source Sequences; Template-Source Alignment. Why Not Use Structure-Based Alignments?

20 Expresso: Finding the Right Structure [Diagram labels: Sources; Templates; BLAST; SAP; Template Alignment; Source-Template Alignment; Library; Template-based Alignment of the Source Sequences; Remove Templates]

21 3D-Coffee: Combining Sequences and Structures Within Multiple Sequence Alignments

22 Improving The Evaluation

23 More Than Structure based Alignments
Structural Correctness Is Only the Easy Side of the Coin. In practice, MSAs are intermediate models used to generate other models:

Data       Model Type    Benchmark
Homology   Profile       Yes
Evolution  Trees         No
Structure  3D-Structure  CASP
Function   Annotation    No

24 Conclusion
Template based Multiple Sequence Alignments: projecting any relevant information onto the sequences.
Using this information: need for new evaluation procedures (Functional Analysis, Phylogenetic Analysis, Homology Search (Profiles), Homology Modelling).
Integrating data: making sure your bits of data can fight with one another.

25 Fabrice Armougom (CNRS, FR), Sebastien Moretti (CNRS, FR), Olivier Poirot (CNRS, FR), Frederic Reinier (CRS4, IT), Karsten Suhre (CNRS, FR), Vladimir Saudek (Sanofi-Aventis, FR), Des Higgins (UCD, IE), Orla O’Sullivan (UCD, IE), Iain Wallace (UCD, IE), Victor Jongeneel (SIB/VitalIT, CH), Bruno Nyfler (VitalIT, CH), Roger Hersch (EPFL, CH), Pierre Dumas (EPFL, CH), Basile Schaeli (EPFL, CH). www.tcoffee.org cedric.notredame@europe.com

26

27 Turning Data into Models
Data: Columbus considered that the landmass occupied 225°, leaving only 135° of water (Marinus of Tyre, 70 AD). Columbus believed that 1° represented only 56 miles (Alfraganus, IXth century). He knew there was an island named Japan off the coast of China…
Model: Circumference of the Earth of 25,255 km at most; Canary Islands to Japan: 3,700 km (Reality: 12,000 km).

28 T-Coffee Results: Validation Using BAliBASE

29 Structures Vs Sequences

30 ClustalW: The Progressive Algorithm

31 The More Structures The Merrier [Plot: average improvement over T-Coffee vs. structure/sequence ratio]

32

33 T-Coffee and Consistency… Each library line is a soft constraint (a wish). You can’t satisfy them all. You must satisfy as many as possible (the easy ones).
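
The "soft constraint" view can be made concrete with a minimal sketch: treat the library as a set of weighted wishes about residue pairs, and score a candidate alignment by the total weight of the wishes it satisfies. The data layout, the function name `satisfied_weight`, and the weights are assumptions made for illustration; this is not T-Coffee's library format or objective function.

```python
def satisfied_weight(library, alignment):
    """Sum the weights of the library constraints that an alignment satisfies.

    `library` maps ((name_a, i), (name_b, j)) -> weight: a wish that residue i
    of sequence a shares a column with residue j of sequence b.
    `alignment` maps sequence name -> gapped string ('-' is a gap).
    """
    # Map every residue to the column where the alignment places it.
    column_of = {}
    for name, row in alignment.items():
        index = 0
        for col, char in enumerate(row):
            if char != "-":
                column_of[(name, index)] = col
                index += 1
    total = 0.0
    for (res_a, res_b), weight in library.items():
        col_a = column_of.get(res_a)
        if col_a is not None and col_a == column_of.get(res_b):
            total += weight
    return total


# Hypothetical library with two incompatible wishes about residue 3 of seqB.
library = {
    (("seqA", 4), ("seqB", 3)): 0.9,  # strong wish
    (("seqA", 3), ("seqB", 3)): 0.4,  # weaker wish that conflicts with it
    (("seqA", 0), ("seqB", 0)): 1.0,
}
candidate_1 = {"seqA": "GARFIELD", "seqB": "GAR-IELD"}
candidate_2 = {"seqA": "GARFIELD", "seqB": "GARI-ELD"}

score_1 = satisfied_weight(library, candidate_1)  # keeps the 0.9 and 1.0 wishes
score_2 = satisfied_weight(library, candidate_2)  # keeps the 0.4 and 1.0 wishes
```

No candidate can satisfy both wishes about residue 3 of `seqB`, so the preferred alignment is the one that keeps the heavier wish.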

34 Consistency Based Algorithms: T-Coffee
Gotoh (1990) – Iterative strategy using consistency
Martin Vingron (1991) – Dot matrix multiplications – Accurate but too stringent
Dialign (1996, Morgenstern) – Consistency – Agglomerative assembly
T-Coffee (2000, Notredame) – Consistency – Progressive algorithm
ProbCons (2004, Do) – T-Coffee with a Bayesian treatment
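
What "consistency" means in these methods can be sketched as follows: the evidence for aligning residue x with residue y is reinforced whenever some residue z of a third sequence is linked to both. The sketch below adds min(w(x, z), w(z, y)) for every such z, which is a simplified reading of the triplet idea; the data layout and the function name `extend_library` are assumptions, and none of the tools listed above is implemented this way line for line.

```python
from collections import defaultdict


def extend_library(library):
    """Re-weight residue pairs using support routed through third sequences.

    `library` maps frozenset({(name_a, i), (name_b, j)}) -> weight, where the
    two residues always belong to different sequences.
    """
    # For each residue, the residues it is linked to and the link weights.
    neighbours = defaultdict(dict)
    for pair, weight in library.items():
        x, y = tuple(pair)
        neighbours[x][y] = weight
        neighbours[y][x] = weight

    extended = defaultdict(float)
    for pair, weight in library.items():
        extended[pair] += weight  # direct evidence
    for z, linked in neighbours.items():
        residues = list(linked)
        for i, x in enumerate(residues):
            for y in residues[i + 1:]:
                if x[0] == y[0]:
                    continue  # same sequence: not an alignable pair
                # z supports x~y as strongly as its weakest link.
                extended[frozenset((x, y))] += min(linked[x], linked[y])
    return dict(extended)


# Hypothetical three-sequence library on one residue per sequence.
a0, b0, c0 = ("A", 0), ("B", 0), ("C", 0)
library = {
    frozenset((a0, b0)): 0.5,
    frozenset((a0, c0)): 0.8,
    frozenset((c0, b0)): 0.6,
}
extended = extend_library(library)
```

In this toy case the A–B pair gains support through C because both A–C and C–B are in the library, so its weight rises above the direct evidence alone.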

35 The Right Mix of Methods

36 What’s in a Multiple Alignment?
Structural Criteria – Residues are arranged so that those playing a similar role end up in the same column.
Evolutionary Criteria – Residues are arranged so that those having the same ancestor end up in the same column.
Similarity Criteria – As many similar residues as possible in the same column.
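
The similarity criterion is the easiest of the three to turn into a number. As a minimal sketch, the code below counts identical residue pairs per column and sums over columns (a sum-of-pairs score). Using plain identity as the measure of "similar" is an assumption made to keep the example short; practical scoring schemes rely on substitution matrices and gap penalties. The function names are hypothetical.

```python
from itertools import combinations


def column_identity_score(column):
    """Count pairs of identical residues in one column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    return sum(1 for a, b in combinations(residues, 2) if a == b)


def sum_of_pairs(rows):
    """Sum the per-column identity scores over an alignment.

    `rows` is a list of gapped strings of equal length.
    """
    return sum(column_identity_score(col) for col in zip(*rows))


rows = ["GARFIELD", "GAR-IELD", "GARFI-LD"]
score = sum_of_pairs(rows)
```

A score like this says nothing about structural or evolutionary correctness, which is why the three criteria on the slide can disagree.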

37 How Do We Perform In The Twilight Zone? Consistency Based Methods Have an Edge. Hard to Tell Methods Apart. Sequence Alignment is NOT Solved.

38 T-Coffee and Consistency…

39 Three Types of Algorithms. Progressive: ClustalW. Iterative: MUSCLE. Consistency Based: T-Coffee and ProbCons.

40 3D-Coffee: Combining Sequences and Structures Within Multiple Sequence Alignments

41

42 What’s in a Multiple Alignment?
The MSA contains what you put inside: – Structural Similarity – Evolutionary Similarity – Sequence Similarity
You can view your MSA as: – A record of evolution – A summary of a protein family – A collection of experiments made for you by Nature…

