A Bayesian method for DNA barcoding Kasper Munch, Wouter Boomsma, Eske Willerslev, Rasmus Nielsen, University of Copenhagen.

Varieties of barcoding Assignment to existing species. Identification of new species. Assignment to taxonomic levels in general

Motivation 1. Environmental aDNA samples. 2. Putative Neandertal DNA. Often short query sequences. –Little information. Permissive PCR conditions. –Not always from the intended locus.

Given a set of database reference sequences from different species, according to which criteria should we assign new query sequences to taxonomic levels?

True species assignment Requires proper population genetic analyses quantifying variability within species. Often not possible... –Small database sample size for each species. –Short query PCR products.

Phylogenetic alternative -Purely phylogenetic criteria which ignore population genetic problems. -Taxonomic annotation of database sequences is used to map phylogenetic groups to taxonomic levels. -The simpler approach has its own advantages: less data required, fewer assumptions.

Figure: a query sequence placed relative to a monophyletic taxonomic group. Ingroup or outgroup?

Estimating trees Estimation of a single tree is not sufficient because of the uncertainty regarding the phylogeny. We suggest instead to use a Bayesian approach which quantifies this uncertainty

Bayesian approach Let Q be the query sequence, X the database data, G a gene tree, and F a desired taxonomic group. The posterior probability that Q belongs to F is then estimated from the sampled gene trees, where G_i is the ith gene tree sampled from p(G | X).
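
A sketch of the estimator this slide describes, consistent with the definitions of Q, X, G, F and G_i above and with the "fraction of sampled trees" statistic used later. The number of sampled trees N and the indicator notation are assumptions introduced here, not symbols from the slides:

```latex
p(Q \in F \mid X)
  = \sum_{G} p(Q \in F \mid G)\, p(G \mid X)
  \approx \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\{\, Q \in F \text{ in } G_i \,\}
```

Under this reading, the posterior probability of assignment to F is approximated by the share of sampled trees in which the query falls inside F.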

Assignment pipeline Query sequence → NCBI BLAST against the database (GenBank) → retrieval of sequences and taxonomy annotation → homology set → ClustalW → alignment → MrBayes → sampled trees → summary statistics → taxonomy summary.

Summary statistics For each tree: –Find the sister clades to the query. –Find the consensus taxonomy for each clade. –Pick sister clade with most specific consensus taxonomy. For each taxonomic rank: –Find the fraction of consensus taxonomies that include taxonomic names of that rank.
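
The per-tree step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the rank list `RANKS`, the dictionary representation of a lineage, and the function names are all hypothetical, and the sister clades are assumed to have already been extracted from a sampled tree.

```python
from typing import Dict, Sequence

# Hypothetical rank order, most general first (an assumption, not from the slides).
RANKS = ["order", "family", "genus", "species"]

Lineage = Dict[str, str]  # rank -> taxon name for one database sequence


def consensus_taxonomy(clade: Sequence[Lineage]) -> Lineage:
    """Return the rank -> name pairs shared by every sequence in the clade."""
    consensus: Lineage = {}
    for rank in RANKS:
        names = {seq.get(rank) for seq in clade}
        # Stop at the first rank where sequences disagree or lack annotation.
        if len(names) != 1 or None in names:
            break
        consensus[rank] = names.pop()
    return consensus


def most_specific_sister(sister_clades: Sequence[Sequence[Lineage]]) -> Lineage:
    """Among candidate sister clades, keep the deepest consensus taxonomy."""
    return max((consensus_taxonomy(c) for c in sister_clades), key=len)


# Toy data: three annotated database sequences.
pinus_a = {"order": "Coniferales", "family": "Pinaceae",
           "genus": "Pinus", "species": "Pinus sylvestris"}
pinus_b = {"order": "Coniferales", "family": "Pinaceae",
           "genus": "Pinus", "species": "Pinus mugo"}
picea = {"order": "Coniferales", "family": "Pinaceae",
         "genus": "Picea", "species": "Picea abies"}

# Two candidate sister clades of the query in one sampled tree.
best = most_specific_sister([[pinus_a, picea], [pinus_a, pinus_b]])
```

Here the second clade wins because its members agree down to genus, while the first agrees only down to family.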

Summary statistics For each tree: –Find the sister group to the query. –Find the list of taxonomic levels shared by the sequences in the sister group (consensus taxonomy). Figure labels: Sister group, Query.

Summary statistics For each tree: –Find the sister group to the query. –Find the list of taxonomic levels shared by the sequences in the sister group (consensus taxonomy). For each name of each taxonomic level: –Find the fraction of sampled trees where the consensus taxonomy includes that name.
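
The second step, turning per-tree consensus taxonomies into assignment support, might look like this. It is a sketch under stated assumptions: each sampled tree is reduced to the set of taxon names in its consensus taxonomy, and the function name `assignment_support` and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def assignment_support(consensus_per_tree: List[Set[str]]) -> Dict[str, float]:
    """Fraction of sampled trees whose consensus taxonomy includes each name."""
    counts = Counter(name for names in consensus_per_tree for name in names)
    n_trees = len(consensus_per_tree)
    return {name: count / n_trees for name, count in counts.items()}


# Toy input: consensus taxonomy of the query's sister group in four sampled trees.
sampled = [
    {"Coniferales", "Pinaceae", "Pinus"},
    {"Coniferales", "Pinaceae", "Pinus"},
    {"Coniferales", "Pinaceae"},
    {"Coniferales"},
]
support = assignment_support(sampled)
```

With this toy input the order is supported by every tree, the family by three of four, and the genus by two of four, so support naturally decreases toward more specific ranks.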

Example taxonomy summary

Environmental Samples 379 environmental samples (aDNA). rbcL and trnL markers. Aim is the identification of environmental flora.

Orders >90%: Asterales, Brassicales, Caryophyllales, Coniferales, Dipsacales, Ericales, Fabales, Fagales, Lamiales, Lepidoptera, Malpighiales, Poales, Pottiales, Ranunculales, Rosales, Sapindales, Saxifragales, Solanales, Zingiberales

Families >90%: Amaranthaceae, Asteraceae, Betulaceae, Brassicaceae, Caprifoliaceae, Caryophyllaceae, Ericaceae, Fabaceae, Fagaceae, Juncaceae, Musaceae, Papaveraceae, Pinaceae, Plantaginaceae, Poaceae, Rosaceae, Rutaceae, Salicaceae, Saxifragaceae, Solanaceae, Taxaceae, Theaceae

Genera >90%: Achillea, Alnus, Aruncus, Cerastium, Fagus, Musa, Picea, Pinus, Plantago, Poa, Saxifraga, Symphoricarpos, Taxus

Botanical evaluation Temperate climate similar to central Sweden.

Testing putative Neandertal DNA Needless to say we have had several negative examples... One positive example: –Posterior probability of 91%. Croatian sequence with Neandertal-characteristic point mutations. –Homo sapiens sapiens with posterior probability 67%.

Problems No population genetic modelling: –Outgroup problem. –Species issues are not addressed. –Lineage sorting: not reciprocal monophyly. Incomplete database.

Advantages Phylogenetic uncertainty and statistical uncertainty of assignment are addressed. Posterior probability of assignment. Alternative to single tree assignment. Can be used on any database.

Conclusions Phylogenetic barcoding does not model the coalescent process. It is the appropriate method for assignment with little data, or when assigning to higher taxonomic levels. The Bayesian approach offers a measure of confidence in assignment.