Critical review of significance testing. F. D'Ancona, from an Alain Moren lecture, 2006.

Botulism outbreak in Italy
The relative risk of illness was higher among diners who ate home-preserved green olives (RR = 2.9).
Is it statistically significant?

Tests of statistical significance
Many of these tests concern differences between means or proportions.
They help to establish whether the observed difference is real, i.e. whether it is unlikely to be due to chance alone.

The two hypotheses!
- Alternative hypothesis (H1): there is a difference between people who ate olives and people who did not eat them.
- Null hypothesis (H0): there is NO difference between the two groups (for example RR = 1, OR = 1).
When you perform a test of statistical significance, you either reject or do not reject the null hypothesis (H0).

Hypotheses, testing and the null hypothesis
If the data provide evidence against the null hypothesis, it can be rejected in favour of some alternative hypothesis H1 (the objective of our study).
If you do not reject the null hypothesis, you can never say that the null hypothesis is true. You can only reject it or fail to reject it.

p = the probability of observing a result (for example a difference between proportions, or an RR) as extreme as, or more extreme than, the one observed, by chance alone.
Significance testing: H0 is rejected (or not) using the reported p value.
- Small p values = low compatibility between H0 and the observed data: you reject H0 and the test is significant.
- Large p values = high compatibility between H0 and the observed data: you do not reject H0 and the test is not significant.
We can never reduce to zero the probability that our result was observed by chance alone.
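To make this concrete, here is a minimal sketch in Python (not part of the original lecture) showing how a p value for a 2x2 table can be obtained with a chi-square test; the cell counts are hypothetical and scipy is assumed to be available.

```python
# Minimal sketch (hypothetical counts): p value for a 2x2 table via chi-square.
from scipy.stats import chi2_contingency

# Rows: ate olives / did not; columns: ill / not ill (assumed numbers)
table = [[20, 30],
         [10, 40]]

chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
# A small p means low compatibility between H0 (no difference) and the data;
# it never proves that chance played no role at all.
```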

Levels of significance
We need a cut-off!
- p value > 0.05: H0 not rejected (not significant)
- p value <= 0.05: H0 rejected (significant)
Authors tend to avoid submitting for publication if p > 0.05, and referees have commonly relied on tests of significance.

p = 0.05 and its errors
The level of significance is usually set at p = 0.05.
The p value is used for decision making, but two types of error remain possible:
- H0 should not have been rejected, but it was rejected (Type I, or alpha error, or false positive).
- H0 should have been rejected, but it was not rejected (Type II, or beta error, or false negative).

Types of errors (test result vs truth)
- H0 is true but rejected: Type I or alpha error.
- H0 is false but not rejected: Type II or beta error.
The p value threshold is the level of Type I error that we are willing to accept (usually 5%).
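As an illustration (not from the original slides), the sketch below simulates many studies in which H0 is true by construction; roughly 5% of them are still declared significant at the 0.05 level. Group sizes and the common risk are arbitrary assumptions.

```python
# Sketch: Type I (alpha) error under a true H0 (both groups share the same risk).
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
alpha, n, risk, trials = 0.05, 200, 0.3, 2000
false_positives = 0

for _ in range(trials):
    a = rng.binomial(n, risk)          # cases in group 1
    c = rng.binomial(n, risk)          # cases in group 2 (same true risk)
    p = chi2_contingency([[a, n - a], [c, n - c]], correction=False)[1]
    false_positives += p < alpha

print(f"Observed Type I error rate ~ {false_positives / trials:.3f}")  # near 0.05
```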

Hypothetical data from a clinical trial of a new treatment (2x2 table: treatment by successful/unsuccessful outcome)
- Treatment B: success = 64%
- Treatment A: success = 35%
- Chi-square = 3.44
p = NS, p > 0.05, p = 0.06: different ways to write the same result, but with increasing information.

The epidemiologist needs measurements rather than probabilities.
- The chi-square is a test of association.
- The OR and RR are measures of association on a continuous scale (an infinite number of possible values).
- The best estimate is the point estimate.
- The range of values allowing for random variability is the confidence interval (the precision of the point estimate).

The width of a confidence interval depends on:
- the amount of variability in the data;
- the size of the sample;
- the (arbitrary) level of confidence (usually 90%, 95% or 99%).
One way to use a confidence interval: if 1 is included in the CI, the result is NOT SIGNIFICANT; if 1 is not included, it is SIGNIFICANT.
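A minimal sketch (with assumed 2x2 counts, not the lecture's data) of how a relative risk and its 95% confidence interval can be computed with the standard log(RR) +/- z*SE approximation:

```python
# Sketch: RR point estimate and 95% CI from an assumed 2x2 table.
import math

a, b = 30, 70    # exposed:   cases, non-cases (assumed)
c, d = 15, 85    # unexposed: cases, non-cases (assumed)

rr = (a / (a + b)) / (c / (c + d))
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
z = 1.96  # 95% confidence; 1.645 for 90%, 2.576 for 99%
low = math.exp(math.log(rr) - z * se_log_rr)
high = math.exp(math.log(rr) + z * se_log_rr)

print(f"RR = {rr:.2f}, 95% CI ({low:.2f} to {high:.2f})")
# If the CI excludes 1 the result is "significant"; its width reflects the
# variability in the data, the sample size and the chosen confidence level.
```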

A confidence interval provides more information than a p value:
- the magnitude of the effect (strength of association);
- the direction of the effect (RR > 1 or < 1);
- the precision around the point estimate of the effect (variability).
A p value cannot provide these!

The 95% confidence level
If the data collection and analysis could be replicated many times, the CI should include the TRUE value of the measure 95% of the time.
The only source of variability assumed here is chance!
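This coverage interpretation can be checked by simulation. The sketch below (true risks, sample size and seed are assumptions for illustration) draws many samples and counts how often the 95% CI for the RR contains the true RR; the proportion comes out close to 95%.

```python
# Sketch: repeated-sampling coverage of the 95% CI for a relative risk.
import math
import numpy as np

rng = np.random.default_rng(1)
n, p_exposed, p_unexposed = 150, 0.4, 0.2   # assumed true risks per group
true_rr = p_exposed / p_unexposed
trials, covered = 2000, 0

for _ in range(trials):
    a = rng.binomial(n, p_exposed)          # cases among exposed
    c = rng.binomial(n, p_unexposed)        # cases among unexposed
    rr = (a / n) / (c / n)
    se = math.sqrt(1 / a - 1 / n + 1 / c - 1 / n)
    low = math.exp(math.log(rr) - 1.96 * se)
    high = math.exp(math.log(rr) + 1.96 * se)
    covered += low <= true_rr <= high

print(f"CIs containing the true RR: {covered / trials:.1%}")  # roughly 95%
```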

Hypothetical data from a clinical trial of a new treatment (same data as before)
- Treatment B: success = 64%
- Treatment A: success = 35%
- p = NS, p > 0.05, p = 0.06
- RR = … , 95% CI ( … )
Different ways to write the same result, but with progressively more information.

Are more studies better or worse?
Decisions based on the results of a collection of studies are not made easier when each study is reduced to a YES/NO verdict.
You have to look at the CI and the point estimate, and also consider the clinical or biological significance.
[Figure: 20 studies with different results, plotted against an RR axis with the reference line RR = 1]

Looking at the CI
- Study A: large sample, precise results, narrow CI - SIGNIFICANT, but the effect is close to NO EFFECT.
- Study B: small sample, wide CI - NOT SIGNIFICANT, but it gives no information about the absence of a large effect.
[Figure: CIs of studies A and B plotted on an axis running from RR = 1 to a large RR]

What we have to evaluate in a study
- Chi-square: a test of association; it depends on sample size.
- p value: the probability that equal (or more extreme) results could be observed by chance alone.
- OR, RR: direction and strength of the association (> 1 risk factor, < 1 protective factor), independently of sample size.
- CI: magnitude and precision of the effect.
Remember that these values do not provide any information on the possibility that the observed association is due to bias or confounding. This possibility should be investigated.

Chi-square and relative risk: three hypothetical 2x2 tables (exposed vs non-exposed, cases vs non-cases)
- Table 1: chi-square = 1.3, p = 0.13, RR = 1.8, 95% CI [ … ]
- Table 2: chi-square = 12, p = … , RR = 1.8, 95% CI [ … ]
- Table 3: chi-square = 12, p = … , RR = 1.2, 95% CI [ … ]
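To illustrate the point of these tables with made-up counts (the slide's own cell values are not reproduced here): the chi-square statistic and p value grow with sample size, while the relative risk does not.

```python
# Sketch: same RR, different sample sizes -> different chi-square and p values.
from scipy.stats import chi2_contingency

def relative_risk(table):
    (a, b), (c, d) = table
    return (a / (a + b)) / (c / (c + d))

small = [[18, 82], [10, 90]]          # assumed counts, n = 200
large = [[180, 820], [100, 900]]      # same proportions, n = 2000

for name, table in (("small study", small), ("large study", large)):
    chi2, p, _, _ = chi2_contingency(table, correction=False)
    print(f"{name}: chi2 = {chi2:.1f}, p = {p:.4f}, RR = {relative_risk(table):.1f}")
# Both studies give RR = 1.8, but only the large one reaches a small p value.
```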

Common source outbreak suspected (exposure by case status, with attack rates)
- Exposed: … cases, … non-cases, AR = … %
- Not exposed: … cases, … non-cases, AR = … %
- Total: 65 cases, 220 non-cases
Chi-square = 9.1, p = … , RR = … , 95% CI = …
Remember that these values do not provide any information on the possibility that the observed association is due to bias or confounding.
HOW COULD YOU EXPLAIN THAT ONLY 23% OF CASES WERE EXPOSED?
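A rough sketch of the attack-rate arithmetic behind this slide. The totals (65 cases, 220 non-cases) and the 23% figure come from the slide; the split of non-cases between exposed and unexposed is an assumption, so the resulting attack rates and RR are illustrative only.

```python
# Sketch: attack rates by exposure and the share of cases that were exposed.
exposed_cases, exposed_noncases = 15, 25        # 15/65 cases exposed = 23%
unexposed_cases, unexposed_noncases = 50, 195   # split of non-cases is assumed

ar_exposed = exposed_cases / (exposed_cases + exposed_noncases)
ar_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)

print(f"AR exposed   = {ar_exposed:.0%}")
print(f"AR unexposed = {ar_unexposed:.0%}")
print(f"RR = {ar_exposed / ar_unexposed:.1f}")
print(f"Exposed cases / all cases = {exposed_cases / (exposed_cases + unexposed_cases):.0%}")
# A raised RR with only a minority of cases exposed suggests the suspected
# exposure cannot account for most cases (other vehicles, misclassification, ...).
```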

Recommendations
- Hypothesis testing and CIs evaluate only the role of chance as an alternative explanation for an association.
- Interpret with caution every association that achieves statistical significance.
- Be doubly cautious if that statistical significance was not expected.

"P < 0.05" (Rothman): it is not a good description of the information in the data.