Darwin’s Tree of Life, July 1837. https://tree.opentreeoflife.org (2.3 million species). Phylogenetic inference from genomic data.


1 Darwin’s Tree of Life, July 1837. https://tree.opentreeoflife.org (2.3 million species). Phylogenetic inference from genomic data.

2 Historical context. Data types used, from earlier to later: physical traits; simple genetic markers (AFLP, RFLP); a few SNPs; single gene sequences (or part of a gene); multiple genes (from a few to many); many SNPs (tens of thousands to millions); whole mitochondrial genomes. Multiple genes or SNPs are concatenated to form a single string. Complexity has increased as computing capability has increased and sequencing costs have dropped.

3 How do you construct a phylogeny? Parsimony: the tree with the fewest changes is the best. Several trees can be the same length (equally parsimonious). The number of trees you have to investigate grows rapidly as you include more sequences, so you often have to do heuristic searches (many trees are never actually examined during analysis). This was the primary method used for homologous morphological traits.
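
To make the parsimony criterion and the growth in tree number concrete, here is a minimal Python sketch. It counts unrooted bifurcating trees with the standard (2n-5)!! formula and scores a toy alignment on two fixed trees with Fitch's algorithm, a standard way to count the minimum number of changes (the slides do not say which algorithm was used). The nested-tuple tree encoding, the function names and the toy data are assumptions made for illustration only.

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa >= 3, i.e. (2n-5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):  # 3 * 5 * ... * (2n-5)
        count *= k
    return count


def fitch_site(tree, states):
    """Fitch pass for one site on a rooted binary tree.

    tree   : a leaf name (str) or a (left, right) tuple (hypothetical encoding)
    states : dict mapping leaf name -> character at this site
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch_site(tree[0], states)
    right_set, right_changes = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:  # children agree on at least one state: no extra change
        return common, left_changes + right_changes
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over all sites of an alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total


# Toy data (invented): four taxa, three sites.
toy_alignment = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CAT"}
tree_ab_cd = (("A", "B"), ("C", "D"))
tree_ac_bd = (("A", "C"), ("B", "D"))

for n in (4, 5, 10):
    print(n, "taxa:", num_unrooted_trees(n), "unrooted trees")
print("length of ((A,B),(C,D)):", parsimony_length(tree_ab_cd, toy_alignment))
print("length of ((A,C),(B,D)):", parsimony_length(tree_ac_bd, toy_alignment))
```

On this toy alignment the tree grouping A with B is shorter, so parsimony prefers it. With only 10 taxa there are already more than two million unrooted trees, which is why exhaustive searches give way to heuristics.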

4 How do you construct a phylogeny? Distance methods: calculate pairwise distances between all sequences, then use the distance matrix to infer the phylogeny (neighbor joining or minimum evolution). These methods are fast and can be bootstrapped more easily than other methods (less computation required).
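
The first step of a distance method, and the column resampling behind a bootstrap, can be sketched in a few lines of Python. This is an illustrative sketch only: it uses the simple proportion of differing sites (p-distance) as the distance measure, it stops at the distance matrix rather than implementing neighbor joining, and the function names and toy sequences are invented for the example.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps and missing data."""
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in "-?N" or b in "-?N":
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment):
    """All pairwise p-distances as a dict keyed by (taxon, taxon)."""
    names = sorted(alignment)
    return {(x, y): p_distance(alignment[x], alignment[y])
            for x in names for y in names}


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


# Toy sequences (invented for illustration).
toy = {"human": "ACGTACGT", "chimp": "ACGTACGA", "mouse": "ACTTGCGA"}
dist = distance_matrix(toy)
print(dist[("human", "chimp")], dist[("human", "mouse")], dist[("chimp", "mouse")])

# Each replicate only needs a new distance matrix, which is cheap to compute.
replicate = bootstrap_replicate(toy, random.Random(1))
print(distance_matrix(replicate)[("human", "mouse")])
```

Each bootstrap replicate would then be passed to the same tree-building step (for example neighbor joining), and the support for a group is the fraction of replicate trees that contain it.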

5 How do you construct a phylogeny? Likelihood methods: based on probability; computationally intensive. They utilize models of DNA evolution, which describe the rate at which one nucleotide replaces another during evolution. Models may have equal rates or different rates (transitions vs. transversions). Examples: JC69, GTR, HKY, etc. Maximum likelihood: picks the tree under which your data are most probable, given a specific evolutionary model. Bayesian methods: use posterior probabilities and find the tree that best fits your data.
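
As a small worked example of likelihood under a substitution model, the sketch below uses JC69 (the equal-rates model) for just two aligned sequences. It evaluates the log-likelihood as a function of the distance between them and checks by grid search that the maximum sits at the well-known analytic JC69 distance, d = -(3/4) ln(1 - 4p/3). Real tree inference maximizes the likelihood over whole trees; the two-sequence case, the site counts and the function names here are simplifying assumptions for illustration.

```python
import math


def jc69_log_likelihood(d, n_same, n_diff):
    """Log-likelihood of two aligned sequences at distance d under JC69.

    d is the expected number of substitutions per site (d > 0);
    n_same / n_diff are the counts of identical / differing sites.
    """
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # probability a site shows the same nucleotide
    p_diff = 0.25 - 0.25 * e   # probability of one specific different nucleotide
    # 0.25 is the equilibrium frequency of the starting nucleotide.
    return n_same * math.log(0.25 * p_same) + n_diff * math.log(0.25 * p_diff)


def jc69_distance(p):
    """Analytic JC69 distance for an observed proportion of differences p < 0.75."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def grid_ml_distance(n_same, n_diff, step=0.0001, max_d=2.0):
    """Brute-force search for the distance with the highest log-likelihood."""
    best_d, best_ll = None, -math.inf
    for i in range(1, int(max_d / step) + 1):
        d = i * step
        ll = jc69_log_likelihood(d, n_same, n_diff)
        if ll > best_ll:
            best_d, best_ll = d, ll
    return best_d


# Invented counts: 100 sites, 10 of which differ.
n_same, n_diff = 90, 10
analytic = jc69_distance(n_diff / (n_same + n_diff))
numeric = grid_ml_distance(n_same, n_diff)
print("analytic JC69 distance:", round(analytic, 4))
print("grid-search ML distance:", round(numeric, 4))
```

The corrected distance is slightly larger than the raw proportion of differences (0.1) because the model accounts for multiple substitutions at the same site.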

6 Methods of phylogenomic inference. Supermatrix: build one tree from the concatenated genes. The matrix can be partitioned so that each gene utilizes a different model/rate. Combined sequences will be different lengths because some genes do not exist in all taxa. Supertree: build an optimal tree for each trait (e.g. each gene); each tree will not include all taxa. Combine those trees into one supertree. Figure from Delsuc, Brinkmann and Philippe 2005.
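
The supermatrix idea can be sketched as follows. One common way to handle genes that are absent in some taxa is to pad those taxa with a missing-data symbol so that every row has the same length, and to record where each gene starts and ends so that a partitioned analysis can assign each gene its own model. The function name, the choice of "?" as the missing symbol and the toy genes are assumptions for illustration, not a description of any particular program.

```python
def build_supermatrix(gene_alignments, missing="?"):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments : list of dicts, one per gene, mapping taxon -> aligned sequence
    Returns (supermatrix, partitions) where supermatrix maps taxon -> concatenated
    sequence and partitions is a list of 1-based inclusive (start, end) columns.
    """
    taxa = sorted({taxon for gene in gene_alignments for taxon in gene})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene in gene_alignments:
        lengths = {len(seq) for seq in gene.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a gene must have equal length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa lacking this gene are padded with the missing-data symbol.
            pieces[taxon].append(gene.get(taxon, missing * length))
        partitions.append((start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Toy genes (invented): taxon B has no sequence for the second gene.
gene1 = {"A": "ACGT", "B": "ACGA", "C": "ACTA"}
gene2 = {"A": "GG", "C": "GT"}
matrix, parts = build_supermatrix([gene1, gene2])
for taxon, seq in matrix.items():
    print(taxon, seq)
print("partitions:", parts)
```

The fraction of "?" cells in the result is a quick measure of how much missing data the concatenated dataset carries.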

7 Number of characters (i.e. genes) vs. number of species. Many species, few genes: if you want to sample the most species, you usually have to focus on a few genes; limited by budget and time constraints. Few species, many genes: it's easy to get more genes if you have the whole genome available; as more genomes are sequenced, such datasets will become more common. Sequence-based methods: the goal is many species, many genes, but that isn’t always realistic. How will you deal with missing data? What if certain genes don’t exist in all taxa? Studies have shown that phylogenies can be very tolerant of missing data. Is it better to include a species that only has part of the data you are looking at? Which strategy is most appropriate for your study? What consequences will choosing poorly have?

8 Potential pitfalls? Incomplete taxon sampling; incomplete lineage sorting; recombination across genomes; horizontal gene transfer. Example: based on complete mitochondrial genome sequences. “If one is interested in inferring the evolutionary history of life, a much broader sample of taxa (perhaps sequenced for far less than full genomes) will result in a much more accurate estimate of phylogeny than will complete genomes of only a small number of taxa.” Quote from David Hillis et al.

9 Incomplete lineage sorting & gene flow

10 Multiple lines of evidence from genomic data. Remember from Monday the example from crocodilians: 3 datasets generated from genomes (UCEs, protein-coding genes, transposable elements). The morphological phylogeny results in a different tree. Figure labels: Alligatoridae; Crocodylidae; Tomistoma schlegelii; Gavialis gangeticus; ~80 my, ~8 species; ~20 my, ~13 species.

11 The root of mammals? Mammalian classification in 1945: three main divisions in mammals, based primarily on non-dental skeletal morphology. Class Mammalia: Prototheria; Theria, comprising Metatheria (marsupials) and Eutheria (placental mammals). Then comes the ‘bushy’ bit of the mammalian tree.

12 The root of mammals? Jump to 1992-1999.

13 The root of mammals? Jump to 2004.

14 The root of mammals? 2009: retrotransposon evidence suggests that these diverged nearly simultaneously.

15 Darwin’s Finches. “Evolution of Darwin’s finches and their beaks revealed by genome sequencing” (Nature 2015).

16 Darwin’s Finches. Key findings: based on about 45 million variable sites; morphology ≠ genetics; extensive interspecific gene flow; some species are a result of hybridization; a 240 kb region containing ALX1, a gene encoding a transcription factor, is strongly associated with beak shape.

17 Darwin’s Finches


20 Tree of life? Nature 2004. Genome fusions and horizontal gene transfers make building a prokaryote/eukaryote tree difficult. Using 34 prokaryote and eukaryote genomes, the study found evidence that eukaryotes resulted from a fusion of a photosynthetic prokaryote and another prokaryote.

21 The Tree of Life. Published 11 Apr 2016. Constructed with 16 ribosomal protein sequences (more resolution than using 1 gene). 3,081 genomes used, of which 1,011 were new genomes. Includes 1 representative per genus for all genera with high-quality genomes available. New genomes: uncultivable bacteria.

22 Papers and websites used for this presentation:
http://psb.stanford.edu/psb-online/proceedings/psb02/phylogeneticIntro.pdf
Hug LA, et al. (2016) A new view of the tree of life. Nature Microbiology.
Delsuc F, Brinkmann H, Philippe H (2005) Phylogenomics and the reconstruction of the tree of life. Nature Reviews Genetics.
Simpson GG (1945) The principles of classification and a classification of mammals. Bull. Am. Museum Nat. Hist. 85: 1-350.
Shoshani J (1986) Mammalian phylogeny: comparison of morphological and molecular results. Mol. Biol. Evol. 3(3): 222-242.
Edwards S, et al. (2016) Implementing and testing the multispecies coalescent model: a valuable paradigm for phylogenomics. Molecular Phylogenetics and Evolution.
http://blogs.scientificamerican.com/tetrapod-zoology/refined-fine-tuned-placental-mammal-family-tree/
https://www.mun.ca/biology/scarr/Adaptation_in_Darwins_Finches.html
