Lecture 15 - Hypothesis Testing

A. Competing a priori hypotheses

Null hypothesis (H0): There is no difference in support for one tree over the other.

Test statistic:
  In parsimony (or minimum evolution): d = LengthA - LengthB
  In likelihood: d = lnLA - lnLB
Early methods looked at the difference in optimality between the two trees on a site-by-site basis. Thus, these are called paired-sites methods.

Null distribution: normal (mean = 0)

Example. KH test: parsimony, normal distribution

  Tree     1     2
  Length   354   357

[So the test statistic is 3 steps.]

Kishino-Hasegawa test:

  Tree   Length   Length diff   s.d.(diff)   t        P*
  ---------------------------------------------------------
  1      354      (best)
  2      357      3             9.65019      0.3109   0.7560

  * Probability of getting a more extreme t-value under the null hypothesis of no difference between the two trees (two-tailed test).

So we fail to reject H0: the data show no significant difference in support for either tree.
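The arithmetic behind the table can be sketched in a few lines. This is a minimal illustration, assuming per-site scores (steps or lnL) for both trees are already available as plain lists; the function names `kh_test` and `two_tailed_normal_p` are hypothetical, and the program that produced the table may use a slightly different reference distribution.

```python
import math


def two_tailed_normal_p(t):
    """Two-tailed P for a standard-normal statistic: 2 * (1 - Phi(|t|))."""
    return math.erfc(abs(t) / math.sqrt(2.0))


def kh_test(site_scores_a, site_scores_b):
    """Paired-sites KH test under a normal approximation.

    Returns (d, sd_of_d, t, p), where d is the summed site-wise difference.
    Assumes the site-wise differences are not all identical (sd > 0).
    """
    n = len(site_scores_a)
    diffs = [a - b for a, b in zip(site_scores_a, site_scores_b)]
    d = sum(diffs)
    mean = d / n
    # Sample variance of the per-site differences.
    var = sum((x - mean) ** 2 for x in diffs) / (n - 1)
    # Standard deviation of the sum over n sites.
    sd_of_d = math.sqrt(n * var)
    t = d / sd_of_d
    return d, sd_of_d, t, two_tailed_normal_p(t)


# Summary values taken from the parsimony example: diff = 3, s.d. = 9.65019.
t_example = 3 / 9.65019
p_example = two_tailed_normal_p(t_example)
```

With the summary values from the example, `t_example` is about 0.311 and `p_example` is about 0.756, in line with the reported t and P.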
Same data, but using ML as the criterion.

KH test: likelihood, normal distribution

  Tree     1            2
  -ln L    2558.60898   2529.81009

[So the test statistic is 28.8 lnL units.]

Kishino-Hasegawa test:
KH test using normal approximation, two-tailed test

  Tree   -ln L        Diff -ln L   P
  ---------------------------------------------
  1      2558.60898   28.79889     0.000*
  2      2529.81009   (best)

  * P < 0.05

Here, the same data and test lead to rejection of H0 under a different optimality criterion.

KH test: likelihood, RELL bootstrapping

  Tree     1            2
  -ln L    2558.60898   2529.81009

Kishino-Hasegawa test:
KH test using RELL bootstrap, two-tailed test
Number of bootstrap replicates = 1000

  Tree   -ln L        Diff -ln L   P
  ---------------------------------------------
  1      2558.60898   28.79889     0.396
  2      2529.81009   (best)
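RELL (resampling estimated log-likelihoods) avoids re-optimizing each bootstrap replicate: the per-site log-likelihoods are resampled with replacement and summed. A minimal sketch of a RELL-based KH test follows, assuming the site log-likelihoods for both trees have already been computed; the function name `kh_rell` and its parameters are hypothetical, and the centering step reflects the usual description of the test, not a particular program's implementation.

```python
import random


def kh_rell(site_lnl_a, site_lnl_b, n_reps=1000, seed=0):
    """KH test with a RELL bootstrap null distribution (two-tailed).

    Returns (observed_d, p), where observed_d = lnL_A - lnL_B.
    """
    rng = random.Random(seed)
    n = len(site_lnl_a)
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    observed_d = sum(diffs)

    # Each replicate resamples n sites with replacement and sums their differences.
    boot = [
        sum(diffs[rng.randrange(n)] for _ in range(n))
        for _ in range(n_reps)
    ]

    # Center the bootstrap distribution so that it has mean 0, as H0 requires.
    mean_boot = sum(boot) / n_reps
    centered = [x - mean_boot for x in boot]

    # Two-tailed: count replicates at least as extreme as the observed difference.
    extreme = sum(1 for x in centered if abs(x) >= abs(observed_d))
    return observed_d, extreme / n_reps
```

Because the null distribution is built from the data instead of being assumed normal, the resulting P can differ substantially from the normal-approximation value, as in the two likelihood tables.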
Same data, same two trees, different inferences.

Note that we've been using these tests as intended: we have two a priori trees that we're examining.
Most applications of these tests, however, have involved comparing a suboptimal tree to the best tree, which was itself selected using the same data. This results in a mismatched null distribution, and the tests should not be used in this way.

SH test: corrects for the a priori requirement.
  Uses a collection of trees (rather than just ML vs. hypothesis).
  If there are only two trees in the collection, the ML tree and the tree we're interested in testing, the SH test is identical to the KH test.
  Uses RELL bootstrap to generate an average lnL for the collection.
  Uses this average to center the null distribution.

The test is very conservative, though, and very sensitive to the collection of trees used in centering the null distribution.

Shimodaira has applied multiscale bootstrapping to correct for this bias in the AU (approximately unbiased) test.
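The centering idea can be sketched as follows. This is a minimal illustration of the SH procedure as it is usually described, assuming per-site log-likelihoods are already available for every tree in the collection; the function name `sh_test` and its parameters are hypothetical, and it is not meant to reproduce any particular program's output.

```python
import random


def sh_test(site_lnl, n_reps=1000, seed=0):
    """SH test with RELL resampling.

    site_lnl: one list of per-site log-likelihoods per tree (all same length).
    Returns one P-value per tree, in input order.
    """
    rng = random.Random(seed)
    n_trees = len(site_lnl)
    n_sites = len(site_lnl[0])

    # Observed statistic for each tree: how far it falls below the best tree.
    totals = [sum(tree) for tree in site_lnl]
    best = max(totals)
    delta = [best - x for x in totals]

    # RELL: the same resampled sites are used for every tree in a replicate.
    boot = [[0.0] * n_reps for _ in range(n_trees)]
    for r in range(n_reps):
        idx = [rng.randrange(n_sites) for _ in range(n_sites)]
        for t in range(n_trees):
            boot[t][r] = sum(site_lnl[t][i] for i in idx)

    # Center each tree's bootstrap scores on that tree's bootstrap mean.
    means = [sum(b) / n_reps for b in boot]

    counts = [0] * n_trees
    for r in range(n_reps):
        centered = [boot[t][r] - means[t] for t in range(n_trees)]
        top = max(centered)
        for t in range(n_trees):
            # Null statistic: distance from the best centered score in this replicate.
            if top - centered[t] >= delta[t]:
                counts[t] += 1
    return [c / n_reps for c in counts]
```

Because the maximum is taken over the whole collection in every replicate, adding more trees to the collection can only raise the P-values, which is one way to see why the test is conservative and sensitive to the tree set.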
Parametric Bootstrap Tests

Generate the null distribution via simulation under the hypothesis being tested. Assuming some hypothesis is true, what is the probability of the observed test statistic?

Example: PNW biogeography. Three competing hypotheses (tree figures not reproduced; tip labels as they appear on the slide):
  Ancient Fragmentation Hypothesis
  Northern Dispersal Hypothesis: N. Rockies, N. Cascades, S. Cascades
  Southern Dispersal Hypothesis: N. Rockies, S. Cascades, N. Cascades
The ML tree from an unconstrained search has a likelihood score of lnL = -1593.01499.
Can we reject an ancient fragmentation?

If we force the tree to fit the predictions of that hypothesis, we get a much worse tree: lnL = -1612.50229.

  d = lnL(unconstrained) - lnL(constrained) = -1593.01499 - (-1612.50229) = 19.487 lnL units

If the true tree actually were the constrained tree (i.e., if the hypothesis being examined were actually true), what is the probability that we would see a tree that is 19.487 lnL units better, simply due to stochasticity?

So we simulate many replicate data sets under the hypothesis. For each replicate, we first run an unconstrained search to find the ML tree, and then run a search constrained to find the best tree consistent with the hypothesis. The difference between these two scores in each replicate gives one draw from the null distribution of d.
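Once the replicate differences are in hand, the P-value is just the fraction of simulated differences at least as large as the observed one. The sketch below assumes the simulation and tree searches have been done elsewhere; the function names are hypothetical, and the `simulated_d` values are made up purely for illustration (they are not results from this lecture).

```python
def likelihood_difference(lnl_unconstrained, lnl_constrained):
    """Test statistic d = lnL(unconstrained) - lnL(constrained)."""
    return lnl_unconstrained - lnl_constrained


def parametric_bootstrap_p(observed_d, simulated_d):
    """Fraction of simulated differences at least as large as the observed one."""
    hits = sum(1 for d in simulated_d if d >= observed_d)
    return hits / len(simulated_d)


# Observed scores from the ancient fragmentation example.
observed_d = likelihood_difference(-1593.01499, -1612.50229)

# Hypothetical null differences, for illustration only.
simulated_d = [0.0, 0.4, 1.1, 2.3, 0.0, 3.8, 0.9, 5.2, 1.7, 0.2]

p_value = parametric_bootstrap_p(observed_d, simulated_d)
```

Some authors prefer (hits + 1) / (replicates + 1) so that the estimated P is never exactly zero; the simple fraction is used here to keep the sketch short.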
Using the same approach, we can reject a southern dispersal hypothesis as well, and are left with the northern dispersal hypothesis that the tree suggests.
Bayesian Hypothesis Testing

If we run a typical MCMC analysis, we have a sample of topologies that represents the posterior distribution of trees. The proportion of trees in that sample that are consistent with the topological predictions of a hypothesis provides the posterior probability of that hypothesis.

Remember the difference between the Bayesian and frequentist perspectives: a posterior probability is the probability of the hypothesis given the data, whereas a p-value is the probability of data at least as extreme as those observed, given that the null hypothesis is true.
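Counting consistent trees is straightforward once each sampled topology is reduced to its set of clades. The sketch below is a toy illustration, assuming each sampled tree is represented as a set of clades and the hypothesis predicts one particular clade; the taxon names A-D, the helper `tree`, and the function `posterior_probability` are all hypothetical.

```python
def tree(*clades):
    """Represent a sampled topology as a frozenset of clades (sets of taxa)."""
    return frozenset(frozenset(c) for c in clades)


def posterior_probability(sampled_trees, required_clade):
    """Proportion of sampled trees that contain the clade the hypothesis predicts."""
    required = frozenset(required_clade)
    hits = sum(1 for clades in sampled_trees if required in clades)
    return hits / len(sampled_trees)


# Toy posterior sample of four topologies (illustration only).
sample = [
    tree(("A", "B"), ("A", "B", "C")),
    tree(("A", "B"), ("C", "D")),
    tree(("B", "C"), ("A", "B", "C")),
    tree(("A", "B"), ("A", "B", "D")),
]

pp = posterior_probability(sample, ("A", "B"))
```

Here three of the four sampled trees contain the clade (A, B), so `pp` is 0.75. A hypothesis that predicts several clades at once would need every predicted clade to be present in a tree for that tree to count.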
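An Approximate LRT of Hypotheses

Anisimova and Gascuel (2006. Syst. Biol., 55:539) produced a fast approximate approach.

Most phylogenetic hypotheses rely on the presence or absence of a particular internal branch.

The test statistic would be 2(lnL1 - lnL0), where lnL1 is the log-likelihood of the tree containing the branch and lnL0 is the log-likelihood with that branch collapsed to zero length.

A more conservative test statistic is 2(lnL1 - lnL2), where lnL2 is the log-likelihood of the second-best nearest-neighbor-interchange (NNI) arrangement around the branch.

The paper also reports power and accuracy analyses and robustness to model violations.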