DNA Barcoding Statistics Rasmus Nielsen University of Copenhagen.


1 DNA Barcoding Statistics Rasmus Nielsen University of Copenhagen

2 Statistical Approaches Hypothesis testing problem. Test membership of specific species. Decision theoretic/Bayesian problem. Choose assignment by weighing how desirable/undesirable false positives and false negatives are. Species assignment and higher taxonomic assignment without population genetics.

3 Approach 1: Hypothesis testing Test H_0: X ∈ S_i. In the divergence model, X ∈ S_i corresponds to T = 0. Likelihood ratio test based on T.
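As a rough illustration of the test on slide 3, the sketch below turns a likelihood ratio statistic into a p-value. It assumes the standard boundary result: because T = 0 lies on the edge of the parameter space, the null distribution of the statistic is taken to be a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom. The slides do not state which null distribution was used, and the function and argument names are hypothetical.

```python
import math

def lrt_pvalue_boundary(loglik_null, loglik_alt):
    """p-value for H0: T = 0 against T > 0 (assumed boundary-mixture null)."""
    # Likelihood ratio statistic; clamp small negative values from optimizer noise.
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    if stat == 0.0:
        return 1.0
    # Survival function of chi-square(1): P(chi2_1 > x) = erfc(sqrt(x / 2)).
    chi2_1_tail = math.erfc(math.sqrt(stat / 2.0))
    # Half of the assumed null mass sits at zero, so only half the tail counts.
    return 0.5 * chi2_1_tail
```

The two log-likelihoods would come from fitting the divergence model with T fixed at 0 and with T free; a small p-value argues against membership of the query in species i.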

4 Distribution of LR

5 Statistical Approaches Hypothesis testing problem. Test membership of specific species. Decision theoretic/Bayesian problem. Choose assignment by weighing how desirable/undesirable false positives and false negatives are. Species assignment and higher taxonomic assignment without population genetics.

6 Approach 2: Classical (decision theoretic) assignment approach Base assignment on Pr(X ∈ S_i | D, X), where X: query sequence; S_i: set of (mostly unobserved) sequences from species i; D: all the available DNA sequence data.
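One way to read "weighing how desirable/undesirable false positives and false negatives are" is as an expected-cost comparison. The sketch below assumes the posterior Pr(X ∈ S_i | D, X) is already available and that the two error costs are supplied by the user; the function name and the cost parameters are hypothetical, not taken from the slides.

```python
def assign_to_species(posterior, cost_false_positive, cost_false_negative):
    """Return True if assigning X to S_i has the lower expected cost."""
    if not 0.0 <= posterior <= 1.0:
        raise ValueError("posterior must be a probability")
    # Expected cost of assigning: wrong with probability 1 - posterior.
    cost_assign = (1.0 - posterior) * cost_false_positive
    # Expected cost of not assigning: wrong with probability posterior.
    cost_reject = posterior * cost_false_negative
    return cost_assign < cost_reject
```

With equal costs this reduces to assigning whenever the posterior exceeds 0.5; making false positives more expensive raises the threshold to cost_false_positive / (cost_false_positive + cost_false_negative).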

7 Computation Use MCMC under coalescence model with divergence between species and other parameters. Calculate Pr(X ∈ S_i | D, X) from MCMC output. Currently only implemented for two species.
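A minimal sketch of the last step, assuming the MCMC output has been reduced to one species label per retained sample (the label the query sequence is assigned to in that sample). That output format and the names below are assumptions for illustration; the slides do not describe the sampler's actual output.

```python
def posterior_assignment_probability(samples, species):
    """Fraction of retained MCMC samples that place the query in `species`."""
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one MCMC sample")
    hits = sum(1 for label in samples if label == species)
    return hits / len(samples)

# Hypothetical two-species output: one label per retained sample.
draws = ["A", "A", "B", "A", "A", "B", "A", "A", "A", "A"]
print(posterior_assignment_probability(draws, "A"))  # 0.8
```

The estimate is only as good as the chain: burn-in samples should be discarded first, and autocorrelated samples carry less information than their count suggests.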

8 Skipper butterfly Astraptes fulgerator

9

10 Why not use assignment based on marginal probabilities? What if we used i.e. we can calculate posterior probabilities by assuming independence, i.e. ignoring phylogeny.

11 Assignment error

12 Approach 3: Coalescence-Shmoalescence Assign based on monophyly with other members of species (phylogenetic criterion). Do not estimate phylogeny but only placement of query sequence in the phylogeny. Calculate posterior probability of assignment.
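The phylogenetic criterion can be illustrated on a single rooted tree. The sketch below assumes a toy representation (leaves are strings, clades are tuples of subtrees) and checks whether the query plus the known members of a species form exactly one clade. The representation, function names, and sequence labels are all hypothetical.

```python
def leaves(tree):
    """Set of leaf names under a node; a leaf is a string, a clade is a tuple."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def is_monophyletic(tree, group):
    """True if some clade in `tree` contains exactly the leaves in `group`."""
    group = frozenset(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)

# Toy rooted tree: the query groups with the two species-1 sequences.
tree = ((("query", "sp1_a"), "sp1_b"), ("sp2_a", "sp2_b"))
print(is_monophyletic(tree, {"query", "sp1_a", "sp1_b"}))
print(is_monophyletic(tree, {"query", "sp2_a", "sp2_b"}))
```

Applied to each tree sampled from a posterior, the fraction of trees passing this check for a given species would serve as the posterior probability of assignment.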

13 Algorithms BLAST to identify candidate set of species. Possible iteration to ensure a phylogenetically diverse sample. Align and pipe to special version of MrBayes (by J. Huelsenbeck) which maintains phylogenetic constraints. Calculate assignment probability based on MrBayes output.
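The final step can be sketched as a tally over sampled placements, rolled up to whichever taxonomic rank is of interest. The sketch assumes the sampler's output has already been parsed into a list naming, for each sampled tree, the species the query groups with. It does not read real MrBayes files, and the taxonomy table, rank names, and labels are hypothetical.

```python
from collections import Counter

def assignment_probabilities(placements, taxonomy, rank):
    """Posterior probability of each taxon at `rank`, from sampled placements."""
    if not placements:
        raise ValueError("need at least one sampled placement")
    # Map each sampled species to its taxon at the requested rank, then count.
    counts = Counter(taxonomy[species][rank] for species in placements)
    total = len(placements)
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical taxonomy and placements for illustration only.
taxonomy = {
    "sp1": {"species": "sp1", "genus": "G1"},
    "sp2": {"species": "sp2", "genus": "G1"},
    "sp3": {"species": "sp3", "genus": "G2"},
}
placements = ["sp1", "sp1", "sp2", "sp3"]
print(assignment_probabilities(placements, taxonomy, "species"))
print(assignment_probabilities(placements, taxonomy, "genus"))
```

In this toy case no species reaches more than 0.5, yet the genus G1 reaches 0.75, which shows how a query can be assigned with confidence at a higher taxonomic level even when the species-level assignment is uncertain.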

14 Example taxonomy summary

15 fig2

16

17

18 Greenland Ice Cores Example

19

20 Neanderthal Example

21 Acknowledgments Misha Matz (Coalescence based methods). Wouter Boomsma and Kasper Munch (Phylogenetic methods). John Huelsenbeck (MrBayes). Eske Willerslev (Ice and DNA examples). Jody Hey (discussion and inspiration).

