Pinpointing Uncertainty

Comparing competing phylogenetic hypotheses - tests of two (or more) trees
- Particularly useful techniques are those designed to allow evaluation of alternative phylogenetic hypotheses
- Several such tests allow us to determine if one tree is statistically significantly worse than another:
  - Winning sites, Templeton, Kishino-Hasegawa, parametric bootstrapping (SOWH)
  - Shimodaira-Hasegawa, Approximately Unbiased

Tests of two trees
- Tests are of the null hypothesis that the differences between two trees (A and B) are no greater than expected from sampling error
- The simplest ‘winning sites’ test counts the sites supporting tree A over tree B and vice versa (those having fewer steps on, and so a better fit to, one of the trees)
- Under the null hypothesis characters are equally likely to support tree A or tree B, and a binomial distribution gives the probability of the observed difference in numbers of winning sites
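The binomial calculation behind the winning sites test can be sketched in a few lines of Python. This is a minimal illustration, not a published implementation: the function name `winning_sites_test` and the example counts (15 sites favouring tree A, 5 favouring tree B) are assumptions made up for the example, and sites that fit both trees equally well are assumed to have been dropped already.

```python
from math import comb

def winning_sites_test(sites_for_a: int, sites_for_b: int) -> float:
    """Two-sided exact binomial p-value for the winning-sites counts.

    Hypothetical helper: under the null hypothesis each informative site
    favours tree A or tree B with probability 0.5.
    """
    n = sites_for_a + sites_for_b          # sites with no preference are ignored
    if n == 0:
        return 1.0
    k = max(sites_for_a, sites_for_b)
    # Probability of a split at least this uneven in one direction ...
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # ... doubled for a two-sided test (the null distribution is symmetric).
    return min(1.0, 2 * upper_tail)

# Made-up counts for illustration only.
p_value = winning_sites_test(sites_for_a=15, sites_for_b=5)
print(f"p = {p_value:.4f}")
```

Because only the direction of each site's preference is used, a site that favours tree A by one step counts the same as one that favours it by five.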

The Templeton test
- Templeton’s test is a non-parametric Wilcoxon signed ranks test of the differences in fits of characters to two trees
- It is like the ‘winning sites’ test but also takes into account the magnitudes of differences in the support of characters for the two trees
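As a rough sketch of how the signed ranks calculation works, the Python below ranks the absolute per-character step differences and compares the sum of positive ranks with its null expectation. The function name `templeton_test` and the step counts are hypothetical, and the p-value uses a normal approximation without continuity correction, which is only a crude guide when few characters differ between the trees.

```python
from math import erfc, sqrt

def templeton_test(steps_on_a, steps_on_b):
    """Wilcoxon signed-rank test on per-character step counts for two trees.

    Returns (sum of positive ranks, two-sided p-value from a normal
    approximation). Characters with equal steps on both trees are dropped.
    """
    diffs = [a - b for a, b in zip(steps_on_a, steps_on_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under the null each rank is positive with probability 0.5.
    mean = sum(ranks) / 2
    variance = sum(r * r for r in ranks) / 4
    z = (w_plus - mean) / sqrt(variance)
    return w_plus, erfc(abs(z) / sqrt(2))

# Made-up step counts for eight characters on trees A and B.
w_plus, p_value = templeton_test([1, 2, 1, 3, 2, 1, 2, 1],
                                 [2, 2, 3, 3, 4, 2, 1, 2])
print(f"W+ = {w_plus}, p = {p_value:.3f}")
```

Unlike the winning sites count, a character that differs by two steps receives a higher rank here than one that differs by a single step, so it contributes more to the test statistic.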

Templeton’s test - an example
[Figure: tree of Seymouriidae, Diadectomorpha, Synapsida, Parareptilia, Captorhinidae, Paleothyris, Claudiosaurus, Younginiformes, Archosauromorpha, Lepidosauriformes, Placodus, Eosauropterygia and Araeoscelidia, with two positions on the tree labelled 1 and 2]
- Recent studies of the relationships of turtles using morphological data have produced very different results, with turtles grouping either within the parareptiles (H1) or within the diapsids (H2), the result depending on the morphologist
- This suggests there may be:
  - problems with the data
  - special problems with turtles
  - weak support for turtle relationships
- Parsimony analysis of the most recent data favoured H2
- However, analyses constrained by H1 produced trees that required only 3 extra steps (<1% tree length)
- The Templeton test was used to evaluate the trees and showed that the slightly longer H1 tree found in the constrained analyses was not significantly worse than the unconstrained H2 tree
- The morphological data do not allow choice between H1 and H2

Kishino-Hasegawa test
- The Kishino-Hasegawa test is similar in using differences in the support provided by individual sites for two trees to determine if the overall differences between the trees are significantly greater than expected from random sampling error
- It is a parametric test that depends on assumptions that the characters are independent and identically distributed (the same assumptions underlying the statistical interpretation of bootstrapping)
- It can be used with parsimony and maximum likelihood - implemented in PHYLIP and PAUP*

Kishino-Hasegawa test
- If the difference between trees (tree lengths or likelihoods) is attributable to sampling error, then characters will randomly support tree A or B and the total difference will be close to zero
- Under the null hypothesis the mean of the differences in parsimony steps or likelihoods for each site is expected to be zero, and the distribution normal
- From observed differences we calculate a standard deviation
- The observed difference is significantly greater than zero if it is greater than 1.96 standard deviations
- This allows us to reject the null hypothesis and declare the sub-optimal tree significantly worse than the optimal tree (p < 0.05)
[Figure: distribution of step/likelihood differences at each site, centred on an expected mean of 0, with sites favouring tree A on one side and sites favouring tree B on the other]
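The normal-approximation form of this test can be sketched as follows. This is an illustrative outline, not the PHYLIP or PAUP* implementation: the function name `kh_test` and the per-site log-likelihoods are invented for the example, and the standard deviation of the total is taken as the per-site standard deviation multiplied by the square root of the number of sites, which follows from the i.i.d. assumption.

```python
from math import erfc, sqrt
from statistics import stdev

def kh_test(site_scores_a, site_scores_b):
    """Return (total difference, SD of the total, two-sided p-value).

    Scores are per-site log-likelihoods (or parsimony steps) for trees A and B.
    """
    diffs = [a - b for a, b in zip(site_scores_a, site_scores_b)]
    n = len(diffs)
    total = sum(diffs)
    sd_total = stdev(diffs) * sqrt(n)    # SD of a sum of n i.i.d. differences
    if sd_total == 0:
        return total, 0.0, (1.0 if total == 0 else 0.0)
    z = total / sd_total                 # difference in units of its SD
    p_value = erfc(abs(z) / sqrt(2))     # two-sided tail of the normal
    return total, sd_total, p_value

# Made-up per-site log-likelihoods for six sites.
tree_a = [-2.0, -1.5, -3.0, -2.2, -1.8, -2.5]
tree_b = [-2.1, -1.9, -3.2, -2.0, -2.4, -2.9]
total, sd_total, p_value = kh_test(tree_a, tree_b)
print(f"difference = {total:.2f} +/- {sd_total:.2f}, p = {p_value:.3f}")
```

The pair printed here, a total difference and its standard deviation, is the quantity usually reported as "difference and SD" in summaries of KH tests; a difference larger than about 1.96 SD corresponds to p < 0.05.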

Kishino-Hasegawa test
[Figure: Ciliate SSUrDNA maximum likelihood tree of Ochromonas, Symbiodinium, Prorocentrum, Sarcocystis, Theileria, Plagiopyla n, Plagiopyla f, Trimyema c, Trimyema s, Cyclidium p, Cyclidium g, Cyclidium l, Glaucoma, Colpodinium, Tetrahymena, Paramecium, Discophrya, Trithigmostoma, Opisthonecta, Colpoda, Dasytrichia, Entodinium, Spathidium, Loxophylum, Homalozoon, Metopus c, Metopus p, Stylonychia, Onychodromous, Oxytrichia, Loxodes, Tracheloraphis, Spirostomum, Gruberia and Blepharisma, with the anaerobic ciliates with hydrogenosomes marked]
- Parsimonious character optimization of the presence and absence of hydrogenosomes suggests four separate origins of hydrogenosomes within the ciliates
- Questions:
  - how reliable is this result?
  - in particular, how well supported is the idea of multiple origins?
  - how many origins can we confidently infer?

Kishino-Hasegawa test
[Figure: two topological constraint trees for the same ciliate SSUrDNA taxa]
- Parsimony analyses with topological constraints found the shortest trees forcing hydrogenosomal ciliate lineages together, thereby reducing the number of separate origins of hydrogenosomes
- Each of the constrained parsimony trees was compared to the ML tree and the Kishino-Hasegawa test used to determine which of these trees were significantly worse than the ML tree

Kishino-Hasegawa test
Test summary and results (simplified)

No. origins  Constraint tree   Extra steps  Difference and SD  Significantly worse?
4            ML
             MP                -            -13 ± 18           No
3            (cp,pt)                            ± 22           No
3            (cp,rc)                            ± 40           Yes
3            (cp,m)                             ± 36           Yes
3            (pt,rc)                            ± 38           Yes
3            (pt,m)                             ± 29           Yes
3            (rc,m)                             ± 34           Yes
2            (pt,cp,rc)                         ± 40           Yes
2            (pt,rc,m)                          ± 43           Yes
2            (pt,cp,m)                          ± 37           Yes
2            (cp,rc,m)                          ± 49           Yes
2            (pt,cp)(rc,m)                      ± 39           Yes
2            (pt,m)(rc,cp)                      ± 48           Yes
2            (pt,rc)(cp,m)                      ± 50           Yes
1            (pt,cp,m,rc)                       ± 49           Yes

- Constrained analyses were used to find the most parsimonious trees with fewer than four separate origins of hydrogenosomes
- Each was tested against the ML tree
- Trees with 2 or 1 origin are all significantly worse than the ML tree
- We can confidently conclude that there have been at least three separate origins of hydrogenosomes within the sampled ciliates

Problems with tests of trees
- To be statistically valid, the Kishino-Hasegawa test should be of trees that are selected a priori
- However, most applications have used trees selected a posteriori on the basis of the phylogenetic analysis
- Where we test the ‘best’ tree against some other tree the KH test will be biased towards rejection of the null hypothesis
- Only if the null hypothesis is not rejected will the result be safe from some unknown level of bias

Problems with tests of trees
- The Shimodaira-Hasegawa test is a more statistically correct technique for testing trees selected a posteriori and is implemented in PAUP*
- However it requires selection of a set of plausible topologies - hard to give practical advice
- Parametric bootstrapping (SOWH test) is an alternative - but it is harder to implement and may suffer from an opposite bias due to model mis-specification
- The Approximately Unbiased test (implemented in CONSEL) may be the best option currently

Taxonomic Congruence
- Trees inferred from different data sets (different genes, morphology) should agree if they are accurate
- Congruence between trees is best explained by their accuracy
- Congruence can be investigated using consensus (and supertree) methods
- Incongruence requires further work to explain or resolve disagreements
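One simple way to look at congruence is to list the groups that every tree agrees on, which are the groups a strict consensus would retain. The sketch below assumes rooted trees written as nested Python tuples; that representation, the function names `clades` and `shared_clades`, and the two five-taxon trees are all hypothetical choices for illustration.

```python
def clades(tree):
    """Return every clade in a rooted nested-tuple tree as a frozenset of taxa."""
    found = set()

    def leaves(node):
        if isinstance(node, str):               # a tip
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)                      # record this internal node's clade
        return members

    leaves(tree)
    return found

def shared_clades(trees):
    """Clades present in every tree: the groups a strict consensus keeps."""
    return set.intersection(*(clades(tree) for tree in trees))

# Made-up trees from two data sets for the same five taxa.
gene_tree = ((("A", "B"), "C"), ("D", "E"))
morphology_tree = ((("A", "B"), "D"), ("C", "E"))

for clade in sorted(shared_clades([gene_tree, morphology_tree]), key=len):
    print(sorted(clade))
```

Here the two trees agree only on the grouping of A with B (plus the trivial group of all taxa), so the positions of C, D and E are where further work would be needed.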

Reliability of Phylogenetic Methods
- Phylogenetic methods (e.g. parsimony, distance, ML) can also be evaluated in terms of their general performance, particularly their:
  - consistency - approach the truth with more data
  - efficiency - how quickly (how much data)
  - robustness - sensitivity to violations of assumptions
- Studies of these properties can be analytical or by simulation

Reliability of Phylogenetic Methods
- There have been many arguments that ML methods are best because they have desirable statistical properties, such as consistency
- However, ML does not always have these properties:
  - if the model is wrong/inadequate (fortunately this is testable to some extent)
  - properties not yet demonstrated for complex inference problems such as phylogenetic trees

Reliability of Phylogenetic Methods
- “Simulations show that ML methods generally outperform distance and parsimony methods over a broad range of realistic conditions”
  Whelan et al Trends in Genetics 17:
- But…
- Most simulations cover a narrow range of very (unrealistically) simple conditions:
  - few taxa (typically just four!)
  - few parameters (standard models - JC, K2P etc)

Reliability of Phylogenetic Methods
- Simulations with four taxa have shown:
  - Model-based methods (distance and maximum likelihood) perform well when the model is accurate (not surprising!)
  - Violations of assumptions can lead to inconsistency for all methods (a Felsenstein zone) when branch lengths or rates are highly unequal
  - Maximum likelihood methods are quite robust to violations of model assumptions
  - Weighting can improve the performance of parsimony (reduce the size of the Felsenstein zone)
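A four-taxon simulation of the kind described above can be sketched briefly. The example below is a toy under stated assumptions, not a reproduction of any published study: it uses a two-state symmetric model, a true tree ((A,B),(C,D)) in which A and C sit on long branches, and change probabilities per branch (`p_long`, `p_short`) chosen arbitrarily for illustration. It counts the parsimony-informative site patterns that support each of the three possible topologies; unweighted parsimony prefers the topology with the most supporting sites.

```python
import random

def simulate_pattern_counts(n_sites, p_long, p_short, seed=1):
    """Simulate binary characters on the true tree ((A,B),(C,D)).

    A and C evolve along branches with change probability p_long; B, D and
    the internal branch use p_short. Returns counts of informative site
    patterns supporting each of the three four-taxon topologies.
    """
    rng = random.Random(seed)

    def evolve(state, p_change):
        # Flip the binary state with probability p_change.
        return 1 - state if rng.random() < p_change else state

    counts = {"AB|CD": 0, "AC|BD": 0, "AD|BC": 0}
    for _ in range(n_sites):
        x = rng.randint(0, 1)            # node joining A and B
        y = evolve(x, p_short)           # node joining C and D
        a, b = evolve(x, p_long), evolve(x, p_short)
        c, d = evolve(y, p_long), evolve(y, p_short)
        if a == b and c == d and a != c:
            counts["AB|CD"] += 1         # supports the true tree
        elif a == c and b == d and a != b:
            counts["AC|BD"] += 1         # groups the two long branches
        elif a == d and b == c and a != b:
            counts["AD|BC"] += 1
    return counts

unequal = simulate_pattern_counts(5000, p_long=0.4, p_short=0.05)
equal = simulate_pattern_counts(5000, p_long=0.05, p_short=0.05)
print("unequal branches:", unequal)
print("equal branches:  ", equal)
```

With highly unequal branches most informative sites group the two long branches together, so adding more sites only strengthens parsimony's support for the wrong tree; with equal branches the true grouping dominates.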

Reliability of Phylogenetic Methods
- However:
  - Generalising from four taxon simulations may be dangerous as conclusions may not hold for more complex cases
  - A few large scale simulations (many taxa) have suggested that parsimony can be very accurate and efficient
  - Most methods are accurate in correctly recovering known phylogenies produced in laboratory studies
- More realistic simulations are needed if they are to help in choosing/understanding methods
- You can try your own…