Comparing Classical and Bayesian Approaches to Hypothesis Testing James O. Berger Institute of Statistics and Decision Sciences Duke University
Outline
The apparent overuse of hypothesis testing
When is point null testing needed?
The misleading nature of P-values
Bayesian and conditional frequentist testing of plausible hypotheses
Advantages of Bayesian testing
Conclusions
I. The apparent overuse of hypothesis testing Tests are often performed when they are irrelevant. Rejection by an irrelevant test is sometimes viewed as license to forget statistics in further analysis.
Prototypical example
Statistical mistakes in the example The hypothesis is not plausible; testing serves no purpose. The observed usage levels are given without confidence sets. The rankings are based only on observed means, and are given without uncertainties. (For instance, perhaps Pr(A > B) is only 0.6.)
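A ranking uncertainty such as Pr(A > B) can be approximated from the reported means and standard errors. The sketch below is illustrative only: it assumes independent, approximately normal posteriors for the two means (for example, under flat priors), and the function name and the numbers are hypothetical, not taken from the example.

```python
import math


def prob_a_exceeds_b(mean_a, se_a, mean_b, se_b):
    """Approximate Pr(A > B), assuming independent normal posteriors for the two means."""
    z = (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical summaries, invented for illustration only.
p_a_gt_b = prob_a_exceeds_b(mean_a=10.0, se_a=2.0, mean_b=9.0, se_b=3.0)
print(f"Pr(A > B) is approximately {p_a_gt_b:.2f}")
```

With these made-up inputs, A has the larger observed mean, yet the probability that A truly exceeds B is only about 0.6, which is the kind of uncertainty a bare ranking hides.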
II. When is testing of a point null hypothesis needed? Answer: When the hypothesis is plausible, to some degree.
Examples of hypotheses that are not realistically plausible:
H0: Small mammals are as abundant on livestock grazing land as on non-grazing land.
H0: Survival rates of brood mates are independent.
H0: Bird abundance does not depend on the type of forest habitat they occupy.
H0: Cottontail choice of habitat does not depend on the season.
Examples of hypotheses that may be plausible, to at least some degree:
H0: Males and females of a species are the same in terms of characteristic A.
H0: Proximity to logging roads does not affect ground nest predation.
H0: Pollutant A does not affect Species B.
III. For plausible hypotheses, P-values are misleading as measures of evidence
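One way to see the gap between a P-value and the evidence against H0 is the calibration of Sellke, Bayarri, and Berger: for p < 1/e, the Bayes factor in favor of H0 is at least -e p ln(p) over a broad class of alternatives. This bound is brought in here as an illustration and is not taken from the slide; the function name and the default prior probability of 1/2 are assumptions.

```python
import math


def posterior_lower_bound(p_value, prior_h0=0.5):
    """Lower bound on Pr(H0 | data) implied by the -e * p * ln(p) calibration.

    Valid only for 0 < p < 1/e. prior_h0 is the prior probability of H0
    (assumed to be 1/2 unless stated otherwise).
    """
    if not 0.0 < p_value < 1.0 / math.e:
        raise ValueError("the calibration requires 0 < p < 1/e")
    bayes_factor_bound = -math.e * p_value * math.log(p_value)
    posterior_odds = prior_h0 / (1.0 - prior_h0) * bayes_factor_bound
    return posterior_odds / (1.0 + posterior_odds)


for p in (0.05, 0.01, 0.001):
    print(f"p = {p}: Pr(H0 | data) is at least {posterior_lower_bound(p):.3f}")
```

Under these assumptions, p = 0.05 corresponds to a posterior probability of H0 of at least about 0.29, far from the 1-in-20 reading that the P-value invites.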
IV. Bayesian testing of point hypotheses
The prior distribution
Posterior probability that H0 is true, given the data (from Bayes' theorem):
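The slide's own formula did not survive extraction. A standard form for a point null H0: θ = θ0 is sketched below. The notation is assumed rather than taken from the slides: π0 is the prior probability of H0, f(x | θ) is the likelihood, π1(θ) is the prior density under the alternative, and B01 is the Bayes factor in favor of H0.

```latex
\[
\Pr(H_0 \mid x)
  = \frac{\pi_0\, f(x \mid \theta_0)}
         {\pi_0\, f(x \mid \theta_0) + (1 - \pi_0)\, m_1(x)}
  = \left[\, 1 + \frac{1 - \pi_0}{\pi_0} \cdot \frac{1}{B_{01}} \,\right]^{-1},
\]
\[
m_1(x) = \int f(x \mid \theta)\, \pi_1(\theta)\, d\theta ,
\qquad
B_{01} = \frac{f(x \mid \theta_0)}{m_1(x)} .
\]
```

With π0 = 1/2 this reduces to Pr(H0 | x) = B01 / (1 + B01), so the posterior probability depends on the data only through the Bayes factor.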
Conditional frequentist interpretation of the posterior probability of H0
V. Advantages of Bayesian testing Pr(H0 | data x) reflects real expected error rates; P-values do not. A default formula exists for all situations:
Posterior probabilities allow for incorporation of personal opinion, if desired. Indeed, if the published default posterior probability of H0 is P*, and your prior probability of H0 is P0, then your posterior probability of H0 is
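The conversion from P* and P0 can be sketched as follows, assuming the published default posterior was computed with prior probability 1/2 on H0 (an assumption, since the slide's formula is not preserved). In that case the Bayes factor is P* / (1 - P*), and the personal posterior is [1 + ((1 - P0) / P0) ((1 - P*) / P*)]^(-1). The function name is hypothetical.

```python
def personal_posterior(default_posterior, personal_prior):
    """Turn a published default Pr(H0 | data) into a personal posterior probability.

    Assumes the default analysis gave H0 prior probability 1/2, so that the
    Bayes factor in favor of H0 equals default_posterior / (1 - default_posterior).
    """
    if not (0.0 < default_posterior < 1.0 and 0.0 < personal_prior < 1.0):
        raise ValueError("both probabilities must lie strictly between 0 and 1")
    bayes_factor = default_posterior / (1.0 - default_posterior)
    posterior_odds = personal_prior / (1.0 - personal_prior) * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


# A reader who gives H0 prior probability 0.8, facing a published P* of 0.3.
print(round(personal_posterior(default_posterior=0.3, personal_prior=0.8), 3))
```

A reader whose prior probability is 1/2 simply recovers P*; a reader who favors H0 more strongly ends with a correspondingly higher posterior probability.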
Posterior probabilities are not affected by the reason for stopping experimentation, and hence do not require rigid experimental designs (as do classical testing measures). Posterior probabilities can be used for multiple models or hypotheses.
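The stopping-rule point can be illustrated with a classic textbook setup, which is not taken from the slides: 9 successes and 3 failures are observed, either with 12 trials fixed in advance (binomial) or by sampling until the third failure (negative binomial). The sketch assumes H0: θ = 0.5 against a uniform prior on θ under the alternative; all names and numbers are illustrative.

```python
import math


def beta_function(a, b):
    """Beta function B(a, b), computed through log-gamma for numerical stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def bayes_factor_h0(successes, failures, design_constant):
    """Bayes factor for H0: theta = 0.5 against H1: theta ~ Uniform(0, 1).

    design_constant is the combinatorial factor of the sampling design;
    it multiplies both marginal likelihoods and therefore cancels.
    """
    n = successes + failures
    marginal_h0 = design_constant * 0.5 ** n
    marginal_h1 = design_constant * beta_function(successes + 1, failures + 1)
    return marginal_h0 / marginal_h1


s, f = 9, 3
n = s + f
bf_binomial = bayes_factor_h0(s, f, math.comb(n, s))          # n fixed in advance
bf_neg_binomial = bayes_factor_h0(s, f, math.comb(n - 1, s))  # stop at the f-th failure

# One-sided P-values for the same data depend on the stopping rule.
p_binomial = sum(math.comb(n, k) for k in range(s, n + 1)) / 2 ** n
# At least s successes before the f-th failure means fewer than f failures in n - 1 trials.
p_neg_binomial = sum(math.comb(n - 1, k) for k in range(f)) / 2 ** (n - 1)

print(f"Bayes factors: {bf_binomial:.4f} and {bf_neg_binomial:.4f}")
print(f"P-values: {p_binomial:.4f} and {p_neg_binomial:.4f}")
```

The two Bayes factors coincide because the likelihoods are proportional, so the posterior probability of H0 is the same under either design, while the P-values fall on opposite sides of 0.05.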
An aside: integrating science and statistics via the Bayesian paradigm Any scientific question can be asked (e.g., What is the probability that switching to management plan A will increase species abundance by 20% more than will plan B?) Models can be built that simultaneously incorporate known science and statistics. If desired, expert opinion can be built into the analysis.
Conclusions Hypothesis testing is overutilized while (Bayesian) statistics is underutilized. Hypothesis testing is needed only when testing a plausible hypothesis (and this may be a rare occurrence in wildlife studies). The Bayesian approach to hypothesis testing has considerable advantages in terms of interpretability (actual error rates), general applicability, and flexible experimentation.