Published by Annabel Gibbs. Modified over 9 years ago.
Molecular Evolution and Phylogeny: From Basic Concepts to Advanced Applications
By Ofir Cohen, The Bioinformatics Unit, G.S. Wise Faculty of Life Sciences, Tel Aviv University, Israel, 2011
http://ibis.tau.ac.il/twiki/bin/view/Bioinformatics/Phylogeny
2 of ~28 Introduction: The tree concept. Darwin's teachings: common descent and tree-like evolution.
3 of ~28 Introduction: The tree concept. Common descent: modern evidence. "The unity of life is no less remarkable than its diversity" (Theodosius Dobzhansky)
4 of ~28 Introduction: The tree concept. What is a phylogenetic tree?
Phylogenetic tree: a (hypothetical) historical pattern of evolutionary relationships among organisms (Greek: phylon = race, genetic = birth).
Horizontal branch lengths are proportional to evolutionary distance (unit: substitutions per site).
5 of ~28 Introduction: Anecdotes. Molecular evidence of HIV transmission in a criminal case. Metzker, Michael L. et al. (2002) Proc. Natl. Acad. Sci. USA 99, 14292-14297
6 of ~28 Introduction: Anecdotes. Criminal investigation
August 1994: a nurse tests negative for HIV and breaks off a messy 10-year affair with a doctor.
Three weeks later, the doctor gives his ex-mistress a vitamin B-12 shot.
January 1995: the nurse tests positive for both HIV and hepatitis C.
The doctor's office records from that day are missing (but are eventually found).
The doctor had drawn blood samples from a known HIV patient and a known hepatitis C patient on the same day as the vitamin B-12 shot.
The nurse had never had contact with either patient.
This is circumstantial evidence that the doctor injected blood from one of his patients into his ex-girlfriend. How can this be proved using a phylogenetic approach?
7 of ~28 Introduction: Anecdotes. HIV: short background. Extreme heterogeneity: within each patient there are many different viral strains ("quasi-species").
8 of ~28 Introduction: Anecdotes. History of the virus: gp120. [Figure: phylogenetic tree of gp120 sequences with groups labeled PATIENT, VICTIM and CONTROLS. ©2002 National Academy of Sciences, U.S.A.]
9 of ~28 Introduction: Anecdotes. History of the virus: RT. [Figure: phylogenetic tree of RT sequences with groups labeled VICTIM and PATIENT.]
Source sequences that are paraphyletic with respect to the recipient sequences (that is, the recipient sequences are nested within the source sequences) provide evidence for the direction of transmission.
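The nesting argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is rooted and written as nested tuples, the toy tree and the labels P1, P2 (patient), V1, V2 (victim), C1, C2 (controls) are invented for the example and are not the sequences from the case, and the function name `is_paraphyletic_source` is hypothetical.

```python
def clades(tree):
    """Return (leaf set of `tree`, list of the leaf sets of all its subtrees)."""
    if isinstance(tree, str):              # a leaf is just its name
        leaf_set = frozenset([tree])
        return leaf_set, [leaf_set]
    leaf_set = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaf_set |= child_leaves
        found.extend(child_clades)
    found.append(leaf_set)                 # the subtree rooted here is a clade too
    return leaf_set, found


def is_paraphyletic_source(tree, source, recipient):
    """True if the recipient sequences form a clade nested inside the source.

    Working definition assumed here: the recipient set is a clade, source plus
    recipient together form a clade, and the source alone does not.
    """
    source, recipient = frozenset(source), frozenset(recipient)
    _, all_clades = clades(tree)
    return (
        recipient in all_clades
        and (source | recipient) in all_clades
        and source not in all_clades
    )


# Invented toy tree: victim sequences nested within the patient sequences.
toy_tree = (("P1", ("P2", ("V1", "V2"))), ("C1", "C2"))
patient = {"P1", "P2"}
victim = {"V1", "V2"}

print(is_paraphyletic_source(toy_tree, patient, victim))  # True
print(is_paraphyletic_source(toy_tree, victim, patient))  # False
```

The asymmetry of the two calls is the point: the topology supports transmission from patient to victim but not the reverse. A real analysis would also need branch support values, which this sketch ignores.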
10 of ~28 Phylogenetic analysis: not only among organisms. Cancer phylogeny: a phylogeny of acute myeloid leukemia (AML) subtypes. Riester et al. 2010; Liu et al. 2009
11 of ~28 Phylogenetic analysis: not only in biology. Language evolution (Russell and Atkinson, 2003). Researchers study the evolution of languages by treating them like genomes: instead of COGs (gene families), they analyze cognates (word families).
12 of ~28 Comparative genomics: "All life is one". Compare homologous sequences.
13 of ~28 Newick format with branch lengths:
(A:0.3,((B1:0.1,B2:0.1):0.3,(C1:0.1,C2:0.1):0.5):0.3);
Tree viewer (FigTree): http://tree.bio.ed.ac.uk/software/figtree/
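To make the notation concrete, here is a minimal sketch of a Newick reader in Python. It is a simplification: it assumes plain unquoted names, no comments, no whitespace inside the string, and it treats a missing branch length as 0. The function names `parse_newick` and `tip_depths` are chosen for this example only.

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts with name, length, children."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":               # internal node: read its children
            pos += 1
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1
        start = pos                        # optional node name
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        length = 0.0                       # optional ':length' suffix
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    return parse_node()


def tip_depths(node, depth=0.0):
    """Sum branch lengths from the root down to every tip."""
    depth += node["length"]
    if not node["children"]:
        return {node["name"]: depth}
    depths = {}
    for child in node["children"]:
        depths.update(tip_depths(child, depth))
    return depths


tree = parse_newick("(A:0.3,((B1:0.1,B2:0.1):0.3,(C1:0.1,C2:0.1):0.5):0.3);")
print({name: round(d, 2) for name, d in tip_depths(tree).items()})
# {'A': 0.3, 'B1': 0.7, 'B2': 0.7, 'C1': 0.9, 'C2': 0.9}
```

The printed values are root-to-tip distances in the same unit as the branch lengths (substitutions per site), which is what the horizontal axis of a tree viewer displays.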
14 of ~28 Alignment and phylogeny are mutually dependent. [Diagram labels: unaligned sequences, sequence alignment, MSA, phylogeny reconstruction, inaccurate tree building.]
15 of ~28 Multiple sequence alignment (MSA): progressive alignment. [Diagram labels: sequences A-E, pairwise distance table, guide tree (tips A, D, C, B, E), MSA, iterative.]
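The first stage, the pairwise distance table, can be sketched as follows. This is a simplification: real aligners estimate these distances from unaligned sequences (for example from pairwise alignments or shared k-mers), whereas the sketch assumes each pair is already aligned to equal length and uses the plain proportion of differing sites (p-distance). The toy sequences and function names are invented for the example.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same aligned length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_table(aligned):
    """Return {(name1, name2): p-distance} for every pair of sequences."""
    return {
        (n1, n2): p_distance(aligned[n1], aligned[n2])
        for n1, n2 in combinations(aligned, 2)
    }


toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "AC-TTCGA"}
for pair, dist in distance_table(toy).items():
    print(pair, round(dist, 3))
```

A guide tree is then built from this table by a clustering method, and the sequences are aligned in the order the guide tree dictates.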
16 of ~28 Multiple sequence alignment (MSA). Several advanced MSA programs are available. Today we will use two:
MAFFT: among the fastest and one of the most accurate.
PRANK: distinct from other MSA programs in its phylogeny-aware treatment of insertions and deletions.
17 of ~28 MAFFT. Web server and download: http://align.bmr.kyushu-u.ac.jp/mafft/online/server/
Efficiency-tuned variants: quick and dirty, or slow but accurate.
Reference: Katoh K., Misawa K., Kuma K. and Miyata T. (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30(14), 3059-3066.
18-20 of ~28 Choosing a MAFFT strategy. [Figure, shown over three slides: MAFFT strategies ranked from quick and dirty to slow but accurate.]
21 of ~28 Choosing a MAFFT strategy (from quick and dirty to slow but accurate)
L-INS-i
ooooooooooooooooooooooooooooooooXXXXXXXXXXX-XXXXXXXXXXXXXXX------------------
--------------------------------XX-XXXXXXXXXXXXXXX-XXXXXXXXooooooooooo-------
------------------ooooooooooooooXXXXX----XXXXXXXX---XXXXXXXooooooooooo-------
--------ooooooooooooooooooooooooXXXXX-XXXXXXXXXX----XXXXXXXoooooooooooooooooo
--------------------------------XXXXXXXXXXXXXXXX----XXXXXXX------------------
G-INS-i
XXXXXXXXXXX-XXXXXXXXXXXXXXX
XX-XXXXXXXXXXXXXXX-XXXXXXXX
XXXXX----XXXXXXXX---XXXXXXX
XXXXX-XXXXXXXXXX----XXXXXXX
XXXXXXXXXXXXXXXX----XXXXXXX
E-INS-i
oooooooooXXX------XXXX---------------------------------XXXXXXXXXXX-XXXXXXXXXXXXXXXooooooooooooo
---------XXXXXXXXXXXXXooo------------------------------XXXXXXXXXXXXXXXXXX-XXXXXXXX-------------
-----ooooXXXXXX---XXXXooooooooooo----------------------XXXXX----XXXXXXXXXXXXXXXXXXooooooooooooo
---------XXXXX----XXXXoooooooooooooooooooooooooooooooooXXXXX-XXXXXXXXXXXX--XXXXXXX-------------
---------XXXXX----XXXX---------------------------------XXXXX---XXXXXXXXXX--XXXXXXXooooo--------
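For readers who use the downloadable MAFFT rather than the web server, the three strategies named on this slide correspond to command-line options. The sketch below only builds the command; it does not run MAFFT. The option names follow the MAFFT manual as I understand it (in the diagrams, X marks alignable residues, o unalignable residues, and - gaps); the helper name `mafft_command` is made up for this example, and the local installation should be checked with `mafft --help` before relying on it.

```python
import shlex

# Strategy name -> options, as documented in the MAFFT manual.
MAFFT_STRATEGY_FLAGS = {
    "auto": ["--auto"],                                   # let MAFFT choose
    "L-INS-i": ["--localpair", "--maxiterate", "1000"],   # one alignable domain
    "G-INS-i": ["--globalpair", "--maxiterate", "1000"],  # globally alignable
    "E-INS-i": ["--genafpair", "--maxiterate", "1000"],   # several domains
}


def mafft_command(strategy, fasta_path):
    """Return the argument list for running MAFFT with the given strategy."""
    try:
        flags = MAFFT_STRATEGY_FLAGS[strategy]
    except KeyError:
        raise ValueError(f"unknown strategy: {strategy!r}") from None
    return ["mafft", *flags, fasta_path]


print(shlex.join(mafft_command("L-INS-i", "fahA.fas")))
# mafft --localpair --maxiterate 1000 fahA.fas
```

MAFFT writes the alignment to standard output, so on a command line the result is normally redirected to a file.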
22 of ~28 MAFFT output. Saving the output: choose a format (Clustal or Fasta), or click "Reformat" to convert to a selection of other formats, then save the page as a text file. A colored view of the alignment is also shown.
23 of ~28 PRANK
24 of ~28 Classical alignment errors for HIV env
25 of ~28 PRANK. Web server: http://www.ebi.ac.uk/goldman-srv/webPRANK/
26 of ~28 PRANK output. If you need a different format, copy the results to the READSEQ sequence converter: http://www-bimas.cit.nih.gov/molbio/readseq/
27 of ~28 Downloadable PRANK: http://www.ebi.ac.uk/goldman-srv/prank/prank/
PRANK: a command-line program.
PRANKSTER: a program with a graphical user interface.
28
1. Download and unzip the sequence files from my homepage (Google "Ofir Cohen" and look for the workshop materials under "Teaching"). Open "fahA.fas" in Notepad: these are 65 protein sequences in FASTA format.
2. Run PRANKSTER, open the "fahA.fas" file, and run "Alignment" > "Make alignment".
3. While you wait: copy the sequences into the MAFFT web server and run the "automatic" "moderate" strategy. Which strategy did MAFFT choose for you? Click "Reformat", choose "phylip|phylip4", and save as "fahA.mafft.phylip".
4. When PRANKSTER finishes, click File > Save, and save the MSA in Phylip format under the name "fahA.prank.phylip".
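Before aligning, it can help to confirm that the input file really contains what step 1 says. The following is a minimal FASTA reader sketch; the function name `read_fasta` is chosen for this example, and the inline sequences are invented stand-ins for the contents of "fahA.fas".

```python
def read_fasta(lines):
    """Read FASTA text (an iterable of lines) into {name: sequence}."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("empty FASTA header")
            name = header[0]               # keep only the first word as the name
            if name in records:
                raise ValueError(f"duplicate sequence name: {name}")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[name].append(line)     # sequences may span several lines
    return {n: "".join(parts) for n, parts in records.items()}


# Invented two-sequence example; for the exercise, pass an open file instead:
#     with open("fahA.fas") as handle:
#         sequences = read_fasta(handle)
example = """>seq1 first toy protein
MKV
LLA
>seq2
MKI
""".splitlines()

sequences = read_fasta(example)
print(len(sequences), "sequences")         # 2 sequences
```

Run on the workshop file, `len(sequences)` should report 65 if the download is complete.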
29 of ~28 Phylogeny reconstruction. Different approaches (algorithms / programs):
Distance-based methods (e.g. neighbor-joining, as in ClustalW): fast but less accurate.
Maximum parsimony (e.g. MEGA).
Maximum likelihood methods (e.g. PhyML, RAxML): accurate but slower.
Bayesian methods (e.g. MrBayes): most accurate but very slow.
[Diagram labels: sequences A-E, pairwise distance table, guide tree (tips A, D, C, B, E), MSA.]
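To show why distance-based methods are fast, here is a sketch of the first step of neighbor-joining: from a distance table alone, pick the pair of taxa to join. It uses the standard criterion Q(i,j) = (n-2)·d(i,j) - sum_k d(i,k) - sum_k d(j,k) and joins the pair with the smallest Q. The distances below are made-up numbers for taxa A-E, not data from the workshop, and the function name is hypothetical; a full implementation would repeat this step on a shrinking table.

```python
from itertools import combinations


def nj_first_join(names, dist):
    """Return the pair with the smallest neighbor-joining Q value, and that value.

    `dist` maps frozenset({a, b}) to the distance between taxa a and b.
    """
    n = len(names)
    if n < 3:
        raise ValueError("neighbor-joining needs at least three taxa")

    def d(a, b):
        return 0.0 if a == b else dist[frozenset((a, b))]

    row_sum = {a: sum(d(a, b) for b in names) for a in names}
    best_pair, best_q = None, None
    for a, b in combinations(names, 2):
        q = (n - 2) * d(a, b) - row_sum[a] - row_sum[b]
        if best_q is None or q < best_q:
            best_pair, best_q = (a, b), q
    return best_pair, best_q


# Hypothetical distances, for illustration only.
names = ["A", "B", "C", "D", "E"]
raw = {
    ("A", "B"): 5, ("A", "C"): 9, ("A", "D"): 9, ("A", "E"): 8,
    ("B", "C"): 10, ("B", "D"): 10, ("B", "E"): 9,
    ("C", "D"): 8, ("C", "E"): 7, ("D", "E"): 3,
}
dist = {frozenset(pair): value for pair, value in raw.items()}

print(nj_first_join(names, dist))          # (('A', 'B'), -50.0)
```

Note that A and B are joined even though D and E have the smallest raw distance: the Q criterion corrects for taxa that are far from everything else. The whole procedure needs only arithmetic on the table, which is why it is so much cheaper than likelihood-based searches.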
30 of ~28 PhyML. One of the most widely used maximum likelihood (ML) programs. Web server and download: http://www.atgc-montpellier.fr/phyml/
Accepts input MSA in PHYLIP format only, either interleaved or sequential. [Figures: an interleaved example and a sequential example.]
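As a rough illustration of the sequential layout, the sketch below writes an alignment as PHYLIP text: a header line with the number of sequences and the number of columns, then one line per sequence with the name followed by the residues. It assumes names of at most 10 characters without spaces, pads them to 10 characters and adds a separating space; individual programs differ in how strictly they read names, so the target program's documentation remains the authority. The function name and toy sequences are invented for the example.

```python
def to_phylip_sequential(aligned):
    """Format {name: aligned sequence} as sequential PHYLIP text."""
    if not aligned:
        raise ValueError("no sequences given")
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    lines = [f"{len(aligned)} {n_sites}"]  # header: sequence count, column count
    for name, seq in aligned.items():
        if len(name) > 10 or any(ch.isspace() for ch in name):
            raise ValueError(f"name not usable in this sketch: {name!r}")
        lines.append(f"{name.ljust(10)} {seq}")
    return "\n".join(lines) + "\n"


print(to_phylip_sequential({"seqA": "MKV-L", "seqB": "MKI-L"}), end="")
```

In the interleaved variant the same header is used, but the sequences are split into blocks of columns, and the names appear only in the first block.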
31 of ~28 Downloadable PhyML. Less user-friendly, but allows using local computer power.
Run "phyml.bat".
Drag the file from Windows Explorer to the blue window.
Enter "d" to switch from DNA to AA.
Enter "y" to run.
32
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the PhyML web server (don't forget to choose "Amino-acids" and enter your email).
2. Run it with the local installation of "phyml.bat".
You should end up with a file: "fahA.prank.phylip_phyml_tree.txt"
33 of ~28 RAxML. Web server: http://phylobench.vital-it.ch/raxml-bb/
Similar maximum likelihood (ML) methodology to PhyML, but much faster: faster results, or better results in the same run time.
34 of ~28 Downloadable RAxML. A command-line program: http://icwww.epfl.ch/~stamatak/index-Dateien/Page443.htm (on that page you will also find instructions for running on Windows, and the RAxML manual).
easyRAx takes care of some of the RAxML options for you: http://projects.exeter.ac.uk/ceem/easyRAx.html, but its installation is somewhat more complex.
35
1. Give "fahA.prank.phylip" or "fahA.mafft.phylip" as input to the RAxML web server (don't forget to tick "Protein sequences" and enter your email).
Save the resulting tree file as: "fahA.prank.phylip.raxml"
36 of ~28 FigTree: tree visualization and figure creation. [Screenshot labels: manipulate a node, manipulate a clade, manipulate a taxon.]
37 of ~28
1. Open "fahA.prank.phylip_phyml_tree.txt" in FigTree.
2. Play around with the different options and make a pretty figure!
   1. Find out how to color specific clades, as below.
   2. Try each of the three options under "Layout".
   3. Export a figure in PDF format (File > Export Graphic...).
38 of ~28 Final questions... Thanks for your attention.