Comparing Classical and Bayesian Approaches to Hypothesis Testing James O. Berger Institute of Statistics and Decision Sciences Duke University www.stat.duke.edu.

Presentation transcript:

Comparing Classical and Bayesian Approaches to Hypothesis Testing James O. Berger Institute of Statistics and Decision Sciences Duke University

Outline
- The apparent overuse of hypothesis testing
- When is point null testing needed?
- The misleading nature of P-values
- Bayesian and conditional frequentist testing of plausible hypotheses
- Advantages of Bayesian testing
- Conclusions

I. The apparent overuse of hypothesis testing
- Tests are often performed when they are irrelevant.
- Rejection by an irrelevant test is sometimes viewed as license to forget statistics in further analysis.

Prototypical example

Statistical mistakes in the example
- The hypothesis is not plausible; testing serves no purpose.
- The observed usage levels are given without confidence sets.
- The rankings are based only on observed means and are given without uncertainties. (For instance, perhaps Pr(A > B) is only 0.6.)
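
The third point can be illustrated with a small calculation. This is a minimal sketch, not the analysis from the talk: it assumes the true means of groups A and B have approximately normal, independent posterior distributions centered at the observed means with the reported standard errors. The function name and all numbers are hypothetical.

```python
from math import erf, sqrt


def prob_a_exceeds_b(mean_a, se_a, mean_b, se_b):
    """Approximate Pr(A > B) for two independent group means.

    Assumption (not from the slides): each true mean has an approximately
    normal posterior centered at its observed mean with the given
    standard error.
    """
    # The difference of two independent normals is normal.
    z = (mean_a - mean_b) / sqrt(se_a ** 2 + se_b ** 2)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Hypothetical usage levels: A ranks above B on observed means alone.
p = prob_a_exceeds_b(mean_a=10.5, se_a=1.4, mean_b=10.0, se_b=1.4)
print(f"Pr(A > B) is roughly {p:.2f}")
```

With standard errors this large relative to the difference in means, the ranking "A above B" holds with probability only a little better than a coin flip, which is the kind of uncertainty the slide says should accompany a ranking.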

II. When is testing of a point null hypothesis needed?
Answer: When the hypothesis is plausible, to some degree.

Examples of hypotheses that are not realistically plausible:
- H0: Small mammals are as abundant on livestock grazing land as on non-grazing land.
- H0: Survival rates of brood mates are independent.
- H0: Bird abundance does not depend on the type of forest habitat they occupy.
- H0: Cottontail choice of habitat does not depend on the season.

Examples of hypotheses that may be plausible, to at least some degree:
- H0: Males and females of a species are the same in terms of characteristic A.
- H0: Proximity to logging roads does not affect ground nest predation.
- H0: Pollutant A does not affect Species B.

III. For plausible hypotheses, P-values are misleading as measures of evidence
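
The supporting material for this slide is not in the transcript. One well-known way to see the point is the calibration of Sellke, Bayarri and Berger, which bounds the Bayes factor in favor of H0 below by -e p ln(p) when p < 1/e. Whether the talk used this particular calibration is an assumption. The sketch below converts a P-value into the resulting lower bound on Pr(H0 | data); the function name is hypothetical.

```python
from math import e, log


def posterior_lower_bound(p_value, prior_h0=0.5):
    """Lower bound on Pr(H0 | data) implied by a P-value.

    Uses the bound B >= -e * p * ln(p) on the Bayes factor in favor
    of H0, which is valid only for p < 1/e.
    """
    if not 0.0 < p_value < 1.0 / e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    if not 0.0 < prior_h0 < 1.0:
        raise ValueError("prior_h0 must lie strictly between 0 and 1")
    bayes_factor_bound = -e * p_value * log(p_value)
    prior_odds = prior_h0 / (1.0 - prior_h0)
    posterior_odds = prior_odds * bayes_factor_bound
    return posterior_odds / (1.0 + posterior_odds)


for p in (0.05, 0.01, 0.001):
    print(f"p = {p}: Pr(H0 | data) >= {posterior_lower_bound(p):.3f}")
```

With equal prior odds, p = 0.05 corresponds to a posterior probability of H0 of at least about 0.29, far from the "1 in 20" reading that a P-value of 0.05 is often given.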

IV. Bayesian testing of point hypotheses

The prior distribution

Posterior probability that H0 is true, given the data (from Bayes' theorem):
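
The formula itself did not survive in the transcript. A standard statement of Bayes' theorem for a point null is sketched below. The notation is assumed rather than taken from the slides: \pi_0 is the prior probability of H0, f(x | \theta_0) is the likelihood of the data under H0, g is the prior density of \theta under the alternative, and B is the Bayes factor in favor of H0.

```latex
\Pr(H_0 \mid x)
  = \frac{\pi_0 \, f(x \mid \theta_0)}
         {\pi_0 \, f(x \mid \theta_0) + (1 - \pi_0) \, m_1(x)}
  = \left[ 1 + \frac{1 - \pi_0}{\pi_0} \cdot \frac{1}{B} \right]^{-1},
\qquad
m_1(x) = \int f(x \mid \theta) \, g(\theta) \, d\theta,
\qquad
B = \frac{f(x \mid \theta_0)}{m_1(x)} .
```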

Conditional frequentist interpretation of the posterior probability of H0

V. Advantages of Bayesian testing
- Pr(H0 | data x) reflects real expected error rates; P-values do not.
- A default formula exists for all situations:

- Posterior probabilities allow for incorporation of personal opinion, if desired. Indeed, if the published default posterior probability of H0 is P*, and your prior probability of H0 is P0, then your posterior probability of H0 is
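
The closing formula is missing from the transcript. The following is a minimal sketch of the update, assuming the published default P* was computed with prior probability 1/2 for H0, so that P*/(1 - P*) equals the Bayes factor in favor of H0. Your posterior odds are then your prior odds P0/(1 - P0) multiplied by that Bayes factor. The function name `personal_posterior` is hypothetical.

```python
def personal_posterior(p_star, p0):
    """Convert a published default posterior P* into a personal posterior.

    Assumption (not confirmed by the transcript): P* was computed with
    prior probability 1/2 for H0, so its odds equal the Bayes factor.
    """
    for name, value in (("p_star", p_star), ("p0", p0)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    bayes_factor = p_star / (1.0 - p_star)  # evidence for H0 from the data
    prior_odds = p0 / (1.0 - p0)            # reader's own prior odds on H0
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


# A reader who starts out fairly confident in H0 (P0 = 0.8) and reads a
# published default posterior of P* = 0.3.
print(round(personal_posterior(p_star=0.3, p0=0.8), 3))
```

A reader whose prior is exactly 1/2 recovers the published value unchanged, which is a quick sanity check on the update.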

- Posterior probabilities are not affected by the reason for stopping experimentation, and hence do not require rigid experimental designs (as do classical testing measures).
- Posterior probabilities can be used for multiple models or hypotheses.

An aside: integrating science and statistics via the Bayesian paradigm
- Any scientific question can be asked (e.g., What is the probability that switching to management plan A will increase species abundance by 20% more than will plan B?).
- Models can be built that simultaneously incorporate known science and statistics.
- If desired, expert opinion can be built into the analysis.

Conclusions
- Hypothesis testing is overutilized while (Bayesian) statistics is underutilized.
- Hypothesis testing is needed only when testing a plausible hypothesis (and this may be a rare occurrence in wildlife studies).
- The Bayesian approach to hypothesis testing has considerable advantages in terms of interpretability (actual error rates), general applicability, and flexible experimentation.