Phylogeny Workshop. By Eyal Privman, The Bioinformatics Unit, G.S. Wise Faculty of Life Science, Tel Aviv University, Israel. November 2009

Presentation transcript:

1 Phylogeny Workshop. By Eyal Privman, The Bioinformatics Unit, G.S. Wise Faculty of Life Science, Tel Aviv University, Israel. November 2009

2 Why should we care about phylogeny? "Nothing in biology makes sense except in the light of evolution" (Theodosius Dobzhansky, 1973)

3 Alignment and phylogeny are mutually dependent (diagram labels: unaligned sequences, sequence alignment, MSA, phylogeny reconstruction, inaccurate tree building)

4 Alignment and phylogeny are both challenging: 25% of residues are aligned incorrectly (based on BAliBASE, a large representative set of proteins)

5 Alignment and phylogeny are both challenging: 5% of tree branches are wrong (based on simulations of 100 protein sequences)

6 Multiple sequence alignment (MSA): progressive alignment (diagram labels: sequences A to E, pairwise distance table, guide tree, MSA, iterative)
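
The first two stages named on this slide (pairwise distance table, then guide tree) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not what MAFFT or PRANK actually do: it assumes the input sequences already have equal length, so the distance can be a simple fraction of mismatching positions (real aligners derive distances from pairwise alignments or k-mer counts), and it builds the guide tree with UPGMA-style average linkage. All function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("toy distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_table(seqs):
    """Pairwise distance table keyed by frozenset({name1, name2})."""
    return {
        frozenset((m, n)): p_distance(seqs[m], seqs[n])
        for m, n in combinations(seqs, 2)
    }


def guide_tree(seqs):
    """Build a guide tree (nested tuples) by UPGMA-style average linkage."""
    table = distance_table(seqs)
    # Each cluster maps its tree (a name or a nested tuple) to its leaf names.
    clusters = {name: [name] for name in seqs}

    def cluster_distance(t1, t2):
        pairs = [(x, y) for x in clusters[t1] for y in clusters[t2]]
        return sum(table[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two closest clusters; the joining order is the order in
        # which a progressive aligner would merge sequences and profiles.
        t1, t2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        clusters[(t1, t2)] = clusters.pop(t1) + clusters.pop(t2)
    return next(iter(clusters))


if __name__ == "__main__":
    toy = {"A": "AAAAAAAA", "B": "AAAAAAAT", "C": "TTTTAAAA", "D": "TTTTAATT"}
    print(guide_tree(toy))
```

A progressive aligner would then walk this tree from the leaves upward, aligning the closest pair first and adding the remaining sequences or sub-alignments in guide-tree order.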

7 Multiple sequence alignment (MSA): Several advanced MSA programs are available. Today we will use two: MAFFT, the fastest and one of the most accurate, and PRANK, which differs from all other MSA programs in its correct treatment of insertions and deletions.

8 MAFFT Web server & download: Efficiency-tuned variants: quick & dirty or slow but accurate. Reference: Kazutaka Katoh, Kazuharu Misawa, Kei-ichi Kuma and Takashi Miyata. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 2002, Vol. 30. © 2002 Oxford University Press

9 Choosing a MAFFT strategy (figure axis: from quick & dirty to slow but accurate)

10 Choosing a MAFFT strategy (figure axis: from quick & dirty to slow but accurate)

11 Choosing a MAFFT strategy (figure axis: from quick & dirty to slow but accurate)

12 Choosing a MAFFT strategy (figure axis: from quick & dirty to slow but accurate). Schematic alignments for three strategies, where X is an alignable residue, o an unalignable residue, and - a gap: L-INS-i (one alignable block flanked by unalignable regions), G-INS-i (sequences alignable over their full length), E-INS-i (several alignable blocks separated by unalignable regions)
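
The workshop uses the MAFFT web server, but the same strategies can be selected when running a downloaded copy from the command line. The sketch below only builds the command; it assumes an executable called `mafft` is on the PATH, and the flag combinations are the ones the MAFFT manual gives for these strategies (check `mafft --help` for your version). The helper name `mafft_command` and the output file name are hypothetical.

```python
import shlex

# Flags as documented in the MAFFT manual for the accuracy-oriented
# strategies; "auto" lets MAFFT pick a strategy based on the data size.
MAFFT_STRATEGIES = {
    "auto": ["--auto"],
    "L-INS-i": ["--localpair", "--maxiterate", "1000"],
    "G-INS-i": ["--globalpair", "--maxiterate", "1000"],
    "E-INS-i": ["--genafpair", "--maxiterate", "1000"],
}


def mafft_command(fasta_path, strategy="auto"):
    """Return the argument list for running MAFFT with the chosen strategy."""
    if strategy not in MAFFT_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    return ["mafft", *MAFFT_STRATEGIES[strategy], fasta_path]


if __name__ == "__main__":
    # MAFFT writes the alignment to standard output, so redirect it to a file.
    cmd = mafft_command("fahA.fas", "L-INS-i")
    print(shlex.join(cmd), "> fahA.mafft.fas")
```

Keeping the strategies in a dictionary makes it easy to run the same input through several of them and compare the resulting alignments.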

13 MAFFT output. Saving the output: choose a format (Clustal or Fasta), or click "Reformat" to convert to a selection of other formats, then save the page as a text file. The output page also offers a colored view of the alignment.

14 PRANK

15 Classical alignment errors for HIV env

16 PRANK Web server:

17 PRANK output: If you need a different format, copy the results to the READSEQ sequence converter:

18 Downloadable PRANK
– PRANK: a command-line program
– PRANKSTER: a program with a graphical user interface

19
1. Download and unzip the sequence files from my homepage (Google "Eyal Privman" and look for the workshop materials under "Teaching"). Open "fahA.fas" in Notepad: these are 65 protein sequences in FASTA format.
2. Run PRANKSTER, open the "fahA.fas" file, and run "Alignment" → "Make alignment".
3. While you wait: copy the sequences into the MAFFT web server and run the "automatic" "moderate" strategy. Which strategy did MAFFT choose for you? Click "Reformat", choose "phylip|phylip4", and save as "fahA.mafft.phylip".
4. When PRANKSTER finishes, click File → Save, and save the MSA in Phylip format under the name "fahA.prank.phylip".

20 Phylogeny reconstruction. Different approaches (algorithms / programs):
– Distance-based methods (e.g. neighbor-joining, as in ClustalW) → fast but inaccurate
– Maximum parsimony (e.g. MEGA)
– Maximum likelihood methods (e.g. PhyML, RAxML) → accurate but slower
– Bayesian methods (e.g. MrBayes) → most accurate but very slow
(Diagram labels: sequences A to E, pairwise distance table, guide tree, MSA)
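
To make one of these approaches concrete: maximum parsimony prefers the tree that explains the alignment with the fewest character changes. The sketch below counts those changes for a given tree with Fitch's algorithm. It is a toy illustration under stated assumptions (a rooted binary tree written as nested tuples, an alignment held in a dictionary, gaps treated as an ordinary character); real programs such as MEGA also search over many candidate trees. The function names are hypothetical.

```python
def fitch(tree, states):
    """Return (possible ancestral states, number of changes) for one site."""
    if isinstance(tree, str):
        # A leaf: its state is observed, no change is needed below it.
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children can share a state, so no extra change at this node.
        return common, left_cost + right_cost
    # No shared state: one change is required on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all alignment columns."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    )


if __name__ == "__main__":
    toy = {"A": "AAG", "B": "AAG", "C": "TAC", "D": "TAC"}
    print(parsimony_score((("A", "B"), ("C", "D")), toy))  # 2 changes
    print(parsimony_score((("A", "C"), ("B", "D")), toy))  # 4 changes
```

In this toy alignment the tree grouping A with B and C with D needs two changes while the alternative needs four, so parsimony would prefer the first.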

21 PhyML: the most widely used maximum likelihood (ML) program. Web server & download: Accepts an input MSA in PHYLIP format only, in either the interleaved or the sequential layout.
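
The difference between the two PHYLIP layouts is easiest to see by generating both from the same alignment. The following is a minimal sketch, assuming an aligned FASTA file as input and the strict PHYLIP convention of names padded to 10 characters; the function names are hypothetical, and the web servers' "Reformat" options remain the simpler route for the workshop exercises.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            words = line[1:].split()
            records.append([words[0] if words else "unnamed", ""])
        elif records:
            records[-1][1] += line
    return [(name, seq) for name, seq in records]


def to_phylip(records, interleaved=False, block=60):
    """Format aligned (name, sequence) pairs as strict PHYLIP text."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    length = lengths.pop()
    names = [name[:10].ljust(10) for name, _ in records]
    if len(set(names)) != len(names):
        raise ValueError("names are not unique in their first 10 characters")
    # Header line: number of sequences, then alignment length.
    lines = [f" {len(records)} {length}"]
    if not interleaved:
        # Sequential: each sequence is written in full before the next one.
        lines += [n + seq for n, (_, seq) in zip(names, records)]
    else:
        # Interleaved: the first block carries the names, later blocks do not.
        for start in range(0, length, block):
            if start:
                lines.append("")
            for n, (_, seq) in zip(names, records):
                prefix = n if start == 0 else ""
                lines.append(prefix + seq[start:start + block])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = read_fasta(">seqA\nAC-GTT\n>seqB\nACGGTA\n")
    print(to_phylip(demo))
    print(to_phylip(demo, interleaved=True, block=3))
```

Truncating names to 10 characters can make two sequences indistinguishable, which is why the sketch refuses to continue in that case rather than write an ambiguous file.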

22 Downloadable PhyML: less user-friendly, but lets you use local computing power.
– Run "phyml.bat"
– Drag the file from Windows Explorer to the blue window
– Enter "d" to switch from DNA to AA
– Enter "y" to run

23
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the PhyML web server (don't forget to choose "Amino-acids" and enter your email address).
2. Run it with the local installation of "phyml.bat".
You should end up with a file: "fahA.prank.phylip_phyml_tree.txt"
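
For repeated runs, the interactive menu can be bypassed by passing options on the command line. The sketch below only assembles such a command; it assumes an executable named `phyml` on the PATH (the name differs between downloads) and uses the `-i`, `-d` and `-m` options described in the PhyML manual. The helper names are hypothetical, and the predicted output name simply follows the pattern stated on the slide.

```python
def phyml_command(alignment_path, datatype="aa", model=None):
    """Return the argument list for a non-interactive PhyML run."""
    if datatype not in ("aa", "nt"):
        raise ValueError("datatype must be 'aa' (protein) or 'nt' (DNA)")
    cmd = ["phyml", "-i", alignment_path, "-d", datatype]
    if model:
        # Optional substitution model name, e.g. "LG" or "WAG" for proteins.
        cmd += ["-m", model]
    return cmd


def expected_tree_file(alignment_path):
    """Name of the tree file, following the pattern given on the slide."""
    return alignment_path + "_phyml_tree.txt"


if __name__ == "__main__":
    print(" ".join(phyml_command("fahA.prank.phylip")))
    print(expected_tree_file("fahA.prank.phylip"))
```

To actually launch the run, the list can be passed to `subprocess.run(cmd, check=True)`.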

24 RAxML Web server: Uses a maximum likelihood (ML) methodology similar to PhyML's but is much faster, which means faster results, or better results in the same run time.

25 Downloadable RAxML: a command-line program: (on that page you will also find instructions for running on Windows, and the RAxML manual). easyRAx takes care of some of the RAxML options for you (m/easyRAx.html), but its installation is somewhat more complex.

26
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the RAxML web server (don't forget to tick "Protein sequences" and enter your email address).
Save the resulting tree file as: "fahA.prank.phylip.raxml"

27 FigTree: tree visualization and figure creation. Manipulate a node, a clade, or a taxon.

28
1. Open "fahA.prank.phylip_phyml_tree.txt" in FigTree.
2. Play around with the different options and make a pretty figure!
   1. Find out how to color specific clades, as shown on the slide.
   2. Try each of the three options under "Layout".
   3. Export a figure in PDF format (File → Export Graphic…).
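
Before styling the tree, it can be worth checking that every sequence made it into the tree file. PhyML writes its tree in Newick format, so the taxon names can be pulled out with a short script. This is a minimal sketch that assumes unquoted names without spaces, parentheses, commas or colons; the function name is hypothetical.

```python
import re

# A leaf name directly follows "(" or ","; labels that follow ")" belong to
# internal nodes (e.g. support values), and numbers after ":" are branch lengths.
LEAF_PATTERN = re.compile(r"(?<=[(,])\s*([^\s(),:;]+)")


def leaf_names(newick):
    """Return the taxon names of a Newick tree in order of appearance."""
    return LEAF_PATTERN.findall(newick)


if __name__ == "__main__":
    with open("fahA.prank.phylip_phyml_tree.txt") as handle:
        names = leaf_names(handle.read())
    # The workshop data set contains 65 sequences.
    print(len(names), "taxa found")
```

If the count differs from the number of sequences in the alignment, the usual suspects are duplicate or truncated names introduced when converting to PHYLIP format.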

29 Thanks for your attention and happy phylogeny …