Phylogenetics.


Review: Multiple sequence alignment
ClustalW steps:
1. Pairwise alignments
2. UPGMA or Neighbor-Joining tree based on pairwise scores (guide tree)
3. Multiple alignment informed by guide tree

Multiple sequence alignment (Phylip format)

20 372
ThNM012b   TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM012    TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM043    TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThlanugQH0 TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM069    TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM070    TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM037    TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM076    TTCCGCCGGG GGGGTNGTCC CNNGGCTCGG TGTGCCCCCG GGGCCCGTGC
ThNM032    TTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TGTGCCCCCG GGGCCCGTGC
ThNM075    CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCGCGTGC
ThNM007    CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCGCGTGC
Talthermo  CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCGCGTGC
ThNM073    CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCCCGTGC
ThNM002    CTCCGCCGGG GGGGTCGTCC CGGGGCGCGG TTT--TGCCG GGGCCCGTGC
AfumHQ6310 GGCCGCCGGG GAGGC-CTTG CGC------- -----CCCC- GGGCCCGCGC
ThNM0026A  -GCCGCCGGG GAGGC-CTTG CGC------- -----CCCC- GGGCCCGCGC
ThNM025a   -GCCGCCGGG GAGGC-CTTG CGC------- -----CCCC- GGGCCCGCGC
Aspni5     -GCCGCCGGG GGGGCGCCTC TGC------- -----CCCCC GGGCCCGTGC
ThNM001    --CCGCCGGG GGGCGTGTCC CGC------- -----CCCC- GGGCCCGCGC
ThaurantT8 --CCGCCGGG GGGCGTGTCC CGC------- -----CCCC- GGGCCCGCGC

Newick tree format

( Physella_anatina/gb|AY651175.:0.00993,
Physa_heterostropha/gb|AY6511:0.00165,
Physa_acuta/gb|AY651188.1|:0.00598) :0.00687) :0.00137,
Physella_virgata/gb|AY651170.1:0.00474,
Lymnaea_stagnalis/gb|EF489390.:0.07009,
Lymnaea_neotropica/emb|AM49400:0.07367) :0.00980,
Biomphalaria_glabrata/gi|34538:0.08976) :0.09811);
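To make the Newick layout concrete, here is a minimal sketch of a reader for it in Python. The tree in the transcript is truncated (its parentheses do not balance), so the example string below is a hypothetical, well-formed tree that only borrows a few taxon names and branch lengths from it. The function names `parse_newick` and `leaf_names` are illustrative, and the sketch handles only plain labels and branch lengths, not quoted labels or comments.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    text = "".join(text.split())  # drop all whitespace
    if not text.endswith(";"):
        raise ValueError("a Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume '('
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this group
                    break
                else:
                    raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
        # Optional label (taxon name for tips, often empty for internal nodes).
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        # Optional branch length after ':'.
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return (name, length, children)

    root = parse_node()
    if text[pos] != ";":
        raise ValueError("trailing characters before ';'")
    return root


def leaf_names(node):
    """Return the tip labels of a parsed tree, left to right."""
    name, _length, children = node
    if not children:
        return [name]
    return [leaf for child in children for leaf in leaf_names(child)]


# Hypothetical topology; names and lengths are borrowed from the slide only for flavor.
EXAMPLE = "((Physa_acuta:0.00598,Physa_heterostropha:0.00165):0.00687,Lymnaea_stagnalis:0.07009);"
tree = parse_newick(EXAMPLE)
print(leaf_names(tree))
```

Each pair of parentheses groups the descendants of one internal node, and the number after a colon is the length of the branch leading to that node or tip.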

Branch length may or may not reflect distance or time

[Figure: three rooted trees, each with tips A, B, C]
Three possible rooted trees with three taxa (an unrooted tree with 3 taxa has only one possible topology, so it carries no information)

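[Figure: four unrooted trees, each with tips A, B, C, D]
Four possible unrooted trees with four taxa (three fully resolved topologies, AB|CD, AC|BD and AD|BC; a fourth is possible only as the unresolved star tree)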
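The tree counts on these slides follow a standard formula for strictly bifurcating trees: (2n-5)!! unrooted and (2n-3)!! rooted topologies for n taxa, where !! is the double factorial. A small Python sketch (the function names are illustrative) shows how quickly these numbers grow. Note that the formula counts only fully resolved trees, so it gives 3 unrooted trees for four taxa; a count of four has to include an unresolved tree.

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1 (returns 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n_taxa):
    """Number of fully resolved unrooted topologies for n_taxa >= 3 labeled tips."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    return double_factorial(2 * n_taxa - 5)


def count_rooted_trees(n_taxa):
    """Number of fully resolved rooted topologies for n_taxa >= 2 labeled tips."""
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa")
    return double_factorial(2 * n_taxa - 3)


for n in (3, 4, 5, 10, 20):
    print(n, count_unrooted_trees(n), count_rooted_trees(n))
```

With 20 taxa, as in the alignment shown earlier, the number of candidate trees is already far too large to enumerate, which is why tree-building methods rely on heuristics.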

Important terms
 - Characters and character states
 - Ancestral versus derived (not primitive versus advanced)

Some examples of ancestral and derived characters
[Figure: tree with tips Invertebrate animals, Fish, Humans, Dogs, Birds; characters marked on the branches: upright posture, loss of body hair, feathers and feathered wings, bony limbs, bony skeleton, nervous system]
Note: Ancestral and derived are relative terms. In this tree, a character is ancestral with respect to nodes higher in the tree, but derived with respect to nodes lower in the tree.

The problem of taxonomy not reflecting phylogeny

Polyphyletic groups contain taxa that are not derived from a single common ancestor
The old groups “algae” and “fungi” were polyphyletic
[Figure: tree with tips Brown Algae, Oomycete Fungi, Green Plants, Green Algae, True Fungi, Animals]

Paraphyletic groups contain a common ancestor and some, but not all, of its descendants: a taxonomic group of equal status is nested within the group but excluded from it

The old “Reptilian” class is paraphyletic
[Figure: tree with tips Crocodiles, Birds, Lizards]

Estimating phylogeny (phylogenetics)
Distance methods (not “phylogenetic”)
 - Examples: UPGMA, Fitch-Margoliash, Neighbor Joining
 - Begin with a single measure of similarity or distance for every pair of taxa
Phylogenetic methods (“phylogenetic”)
 - Examples: parsimony, maximum likelihood, Bayesian (MrBayes)
 - Look at multiple discrete characters and use differences among character states to infer phylogeny
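As an illustration of how a distance method turns pairwise values into a tree, the following is a minimal UPGMA sketch in Python. It repeatedly merges the two closest clusters and averages their distances to every other cluster, weighted by cluster size. The distance values are made up for the example, branch lengths are not computed, and the function name `upgma` and the nested-tuple output are choices made here, not part of the slides.

```python
from itertools import combinations


def upgma(names, distances):
    """Cluster taxa by UPGMA and return the topology as nested tuples.

    `distances` maps a pair (a, b) of taxon names to their distance.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of taxa in it
    d = {frozenset(pair): value for pair, value in distances.items()}
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))] + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    (tree,) = sizes
    return tree


# Hypothetical distances for four taxa.
example = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10,
}
print(upgma(["A", "B", "C", "D"], example))
```

UPGMA assumes a constant rate of change along all lineages; Neighbor Joining drops that assumption but starts from the same kind of pairwise distance table.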

Distance (also sometimes called numerical) methods rely on a single numerical value that expresses the difference or similarity for any given pairwise comparison. With DNA data this value is usually obtained by dividing the number of matching nucleotide positions by the average length of the two sequences compared.

Species 1: 3510188 CTGATCCGAGGTCAACCTTGGGTT-GTGAAGGTCGTTTTACGGCTGGAAC 3510237
                   |||||||||||||||||||||| | | |||||||||||||||||||||||
Species 2:     562 CTGATCCGAGGTCAACCTTGGGGTCGCGAAGGTCGTTTTACGGCTGGAAC 513
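The calculation described here can be sketched in a few lines of Python, using the two aligned sequences from the slide. The function name `pairwise_similarity` is illustrative, and turning the similarity into a distance as 1 minus similarity is an assumption made for the example, not something the slide specifies.

```python
def pairwise_similarity(aligned_a, aligned_b, gap="-"):
    """Matching positions divided by the average ungapped length of the two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    length_a = len(aligned_a) - aligned_a.count(gap)
    length_b = len(aligned_b) - aligned_b.count(gap)
    return matches / ((length_a + length_b) / 2)


species_1 = "CTGATCCGAGGTCAACCTTGGGTT-GTGAAGGTCGTTTTACGGCTGGAAC"
species_2 = "CTGATCCGAGGTCAACCTTGGGGTCGCGAAGGTCGTTTTACGGCTGGAAC"

similarity = pairwise_similarity(species_1, species_2)
distance = 1 - similarity  # assumed conversion from similarity to distance
print(round(similarity, 4), round(distance, 4))
```

For this pair, 47 of the 50 aligned columns match and the average sequence length is 49.5, so the similarity is 47 / 49.5, or about 0.95.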


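[Figure: Character #2 mapped onto the three possible trees for taxa A, B, C, D]
Tree #1 (tip order A B C D): - + - +
Tree #2 (tip order A C B D): - - + +
Tree #3 (tip order A D B C): - + + -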
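One way to see how a single character discriminates among the three trees is to count the minimum number of state changes each tree requires, which is the idea behind parsimony. The Python sketch below uses Fitch's counting procedure. The character states (A = -, B = +, C = -, D = +) are read off the figure residue as far as it allows, and the pairing of tips into groups for each tree, like the function name `fitch_changes`, is an assumption based on the tip order shown.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a binary tree.

    `tree` is a nested tuple of taxon names; `states` maps name -> state.
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:
            return common  # children agree on at least one state: no change needed
        changes += 1       # children disagree: one change on this part of the tree
        return left | right

    visit(tree)
    return changes


character_2 = {"A": "-", "B": "+", "C": "-", "D": "+"}

trees = {
    "Tree #1": (("A", "B"), ("C", "D")),
    "Tree #2": (("A", "C"), ("B", "D")),
    "Tree #3": (("A", "D"), ("B", "C")),
}
for label, tree in trees.items():
    print(label, fitch_changes(tree, character_2))
```

Under these assumptions Tree #2 needs one change while Trees #1 and #3 need two each, so this character favors Tree #2. A full parsimony analysis sums such counts over all characters.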

Some potential pitfalls with molecular data

Paralogs versus Orthologs
Orthologs - homologous genes that reflect speciation
Paralogs - homologous genes that reflect gene duplication = members of a gene family in a single organism (examples: alpha versus beta hemoglobin; red versus green visual pigment proteins)
It is important to distinguish between these when doing comparative analyses (it’s sometimes hard to tell)

The Problem of Multiple Hits

Among other problems, this causes “long-branch attraction”
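A common way to compensate for multiple hits, though not one named in these slides, is to correct the observed proportion of differing sites with a substitution model. The sketch below uses the Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3), which assumes all substitutions are equally likely. The function name is illustrative.

```python
import math


def jukes_cantor_distance(p_observed):
    """Estimated substitutions per site from the observed proportion of differing sites."""
    if not 0 <= p_observed < 0.75:
        raise ValueError("observed proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p_observed / 3)


for p in (0.01, 0.10, 0.30, 0.60):
    print(p, round(jukes_cantor_distance(p), 4))
```

The corrected distance is always at least as large as the observed one, and the gap widens as sequences diverge, because more of the real changes are hidden by later changes at the same site.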

Scoring in phylogenetic methods is model dependent

General idea applies to protein amino-acid sequences as well

Can convert to scoring matrix based on log probabilities
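As a rough illustration of that conversion, the Python sketch below turns a table of substitution probabilities into log-odds scores, log2 of the probability of seeing base y opposite base x divided by the background frequency of y. All the numbers are invented for the example (0.90 for no change, 0.06 for a transition, 0.02 for each transversion, and equal background frequencies of 0.25); they do not come from the slides.

```python
import math

BASES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def substitution_probabilities(p_same=0.90, p_transition=0.06, p_transversion=0.02):
    """Illustrative table: probs[x][y] is the probability of y given x."""
    probs = {}
    for x in BASES:
        probs[x] = {}
        for y in BASES:
            if x == y:
                probs[x][y] = p_same
            elif (x, y) in TRANSITIONS:
                probs[x][y] = p_transition
            else:
                probs[x][y] = p_transversion
    return probs


def log_odds_scores(probs, background=0.25):
    """Convert probabilities to log2 odds against a uniform background frequency."""
    return {
        x: {y: math.log2(p / background) for y, p in row.items()}
        for x, row in probs.items()
    }


scores = log_odds_scores(substitution_probabilities())
for x in BASES:
    print(x, [round(scores[x][y], 2) for y in BASES])
```

Because the scores are logarithms, the score of a whole alignment is the sum of the per-position scores, which corresponds to multiplying the underlying probabilities.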

7 1527
Physa_hete ---------- ---------- ---------- --------AA CATTATATTT
Physa_acut ---------- ---------- ---------- --------AC CATTATATTT
Physella_a ---------- ---------- ---------- --------AA CATTATATTT
Physella_v ---------- ---------- ---------- --------AA CATTATATTT
Lymnaea_st ---------- ---------- ---------- ---------- -----TTTAT
Lymnaea_ne ---------- ---------- ---------- GATATTGGTA CTTTATATAT
Biomphalar TTGCGTTGAC TCTTTTCAAC AAACCATAAA GATATTGGTA CTTTGTACAT

AATTTTTGGG ATCTGGTGTG GATTGGTCGG TACAGGTTTA AGCTTGTTAA
AATTTTTGGT GTTTGATGCG GTTTAGTGGG AACAGGTTTA TCCTTATTAA
AATCTTTGGA ATCTGATGCG GGTTAGTAGG GACTGGATTG TCTTTATTAA
AATTTTTGGA ATTTGGTGTG GTCTAGTTGG TACTGGATTA TCATTATTGA
etc.

Bootstrap Analysis
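In bootstrap analysis, the columns of the alignment are resampled with replacement to build pseudo-replicate alignments of the same length, a tree is estimated from each, and the support for a grouping is the fraction of replicate trees that contain it. The Python sketch below covers only the resampling step. The function names are illustrative, and the toy alignment is the last ten columns of four rows from the first block of the snail alignment in these slides.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement; every taxon gets the same columns."""
    names = list(alignment)
    length = len(alignment[names[0]])
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(alignment[name][c] for c in columns) for name in names
    }


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Generate n_replicates pseudo-replicate alignments from one random stream."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


toy_alignment = {
    "Physa_hete": "CATTATATTT",
    "Physa_acut": "CATTATATTT",
    "Lymnaea_ne": "CTTTATATAT",
    "Biomphalar": "CTTTGTACAT",
}

for replicate in bootstrap_replicates(toy_alignment, 3, seed=1):
    for name, sequence in replicate.items():
        print(name, sequence)
    print()
```

Resampling whole columns keeps each site's pattern across taxa intact while varying how often each site is counted, so groupings supported by only a few sites tend to appear in fewer replicates.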
