Presentation transcript:

If you are viewing this slideshow within a browser window, select File/Save as… from the toolbar and save the slideshow to your computer, then open it directly in PowerPoint. When you open the file, use the full-screen view to see the information on each slide build sequentially. For full-screen view, click on the slideshow icon at the lower left of your screen. To go forwards, left-click or hit the space bar or the PgDn key. To go backwards, hit the PgUp key. To exit from full-screen view, hit the Esc (escape) key.

Clinical, Practical or Mechanistic Significance vs Statistical Significance for POPULATION Effects. Will G Hopkins, Auckland University of Technology, Auckland, NZ. [Figure: probability plotted against the value of the effect statistic, with harmful, trivial and beneficial regions.]

Overview: Background: Making Inferences; Hypothesis Testing, P Values, Statistical Significance; Clinical Significance via Confidence Limits; Clinical Significance via Clinical Chances; Precision of estimation; Smallest worthwhile effect; Interpreting Probabilities; How to Publish Clinical Chances; Probabilities of benefit and harm; How to use possible, likely, very likely, almost certain; Examples.

Background: Making Inferences The main aim of research is to make an inference about an effect in a population based on study of a sample. Alan will deal with inferences about the effect on an individual. Hypothesis testing via the P value and statistical significance is the traditional but flawed approach to making an inference. Precision of estimation via confidence limits is an improvement. But what's missing is some way to make inferences about the clinical, practical or mechanistic significance of an effect. I will explain how to do it via confidence limits using values for the smallest beneficial and harmful effect. I will also explain how to do it by calculating and interpreting chances that an effect is beneficial, trivial, and harmful.

Hypothesis Testing, P Values and Statistical Significance Based on the notion that we can disprove, but not prove, things. Therefore, we need a thing to disprove. Let's try the null hypothesis: the population or true effect is zero. If the value of the observed effect is unlikely under this assumption, we reject (disprove) the null hypothesis. Unlikely is related to (but not equal to) the P value. P < 0.05 is regarded as unlikely enough to reject the null hypothesis (that is, to conclude the effect is not zero or null). We say the effect is statistically significant at the 0.05 or 5% level. Some folks also say there is a real effect. P > 0.05 means there is not enough evidence to reject the null. We say the effect is statistically non-significant. Some folks also accept the null and say there is no effect.
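
The traditional procedure described on this slide can be sketched in a few lines. This is a minimal illustration, not the author's method: it assumes the sampling distribution of the effect is approximately normal (a large-sample shortcut), and the function name `two_sided_p` and the numbers are hypothetical.

```python
from statistics import NormalDist


def two_sided_p(effect, se):
    """Two-sided P value for the null hypothesis that the true effect is zero.

    Assumption: the observed effect divided by its standard error is
    approximately standard normal (large-sample approximation).
    """
    z = effect / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical observed effect and standard error, for illustration only.
p = two_sided_p(effect=1.5, se=0.7)
if p < 0.05:
    verdict = "statistically significant at the 5% level"
else:
    verdict = "statistically non-significant"
print(f"P = {p:.3f}: {verdict}")
```

Note that the verdict depends only on which side of 0.05 the P value falls, which is the arbitrariness the following slides criticise.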

Problems with this philosophy… We can disprove things only in pure mathematics, not in real life. Failure to reject the null doesn't mean we have to accept the null. In any case, true effects are always "real", never zero. So… THE NULL HYPOTHESIS IS ALWAYS FALSE! Therefore, to assume that effects are zero until disproved is illogical and sometimes impractical or unethical. The 0.05 threshold is arbitrary. The P value is not a probability of anything in reality. Some useful effects aren't statistically significant. Some statistically significant effects aren't useful. Non-significant is usually misinterpreted as unpublishable. So good data are lost to meta-analysis and publication bias is rife. Two solutions: clinical significance via confidence limits or via clinical chances.

Clinical Significance via Confidence Limits Confidence limits define a range within which we infer the true or population value is likely to fall. Likely is usually a probability of 0.95 (for 95% limits). [Figure: probability distribution of the true value, given the observed value, plotted against the value of the effect statistic (negative, 0, positive); the area between the lower likely limit and the upper likely limit is 0.95.] Representation of the limits as a confidence interval: [Figure: a horizontal bar on the same axis showing the likely range of the true value.]

Problem: 95% is arbitrary. And we need something other than 95% to stop folks seeing if the effect is significant at the 5% level. The effect is significant if the 95% confidence interval does not overlap the null. 99% would give an impression of too much imprecision, although even higher confidence could be justified sometimes. 90% is a good default, because… It is very unlikely (5% chance) that the true value is below the lower limit, and… it is very unlikely (5% chance) that the true value is above the upper limit.
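
To make the choice of confidence level concrete, here is a small sketch that computes limits at 90%, 95% and 99% for the same result. It assumes a normal sampling distribution (for small samples the t distribution would give slightly wider limits); `confidence_limits` and the input numbers are hypothetical.

```python
from statistics import NormalDist


def confidence_limits(effect, se, level=0.90):
    """Lower and upper confidence limits for the true effect.

    Assumption: normal sampling distribution. `level` is the confidence
    level as a fraction, e.g. 0.90 for 90% limits.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return effect - z * se, effect + z * se


# Hypothetical observed effect and standard error.
effect, se = 1.5, 0.7
for level in (0.90, 0.95, 0.99):
    lower, upper = confidence_limits(effect, se, level)
    tail = (1 - level) / 2  # chance the true value lies beyond either limit
    print(f"{level:.0%} limits: {lower:.2f} to {upper:.2f} "
          f"(chance beyond each limit: {tail:.1%})")
```

With 90% limits, each tail holds 5%, which matches the "very unlikely" wording used on the slide.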

Now, for clinical significance, we need to interpret confidence limits in relation to the smallest clinically beneficial and harmful effects. These are usually equal and opposite in sign. They define regions of beneficial, trivial, and harmful values. [Figure: axis showing the value of the effect statistic (negative, 0, positive), divided by the smallest clinically harmful effect and the smallest clinically beneficial effect into harmful, trivial and beneficial regions.]

Putting the confidence interval and these regions together, we can make a decision about clinical significance. Clinically decisive or clear is preferable to clinically significant.

[Figure: nine bars plotted against the value of the effect statistic (negative, 0, positive) across the harmful, trivial and beneficial regions. Bars are 95% confidence intervals. The two columns below give the verdict for each bar, in the order listed on the slide.]

Clinically decisive?        Statistically significant?
Yes: use it.                Yes
Yes: use it.                Yes
Yes: use it.                No
Yes: don't use it.          Yes
Yes: don't use it.          No
Yes: don't use it.          No
Yes: don't use it.          Yes
Yes: don't use it.          Yes
No: need more research.     No

Why statistical significance is impractical or unethical!
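
The decision logic on this slide can be sketched as below. The rule coded here is one reading of the slide, not a quotation from it: an effect is unclear only when the confidence interval reaches into both the beneficial and the harmful region. The function names and example intervals are hypothetical, and positive values are assumed to be beneficial.

```python
def clinical_decision(lower, upper, smallest_beneficial, smallest_harmful):
    """Classify a confidence interval against the smallest important effects.

    Assumptions: positive effects are beneficial, and an interval is
    "decisive" unless it overlaps both the beneficial and harmful regions.
    """
    if smallest_harmful >= smallest_beneficial:
        raise ValueError("smallest_harmful must be below smallest_beneficial")
    could_be_beneficial = upper > smallest_beneficial
    could_be_harmful = lower < smallest_harmful
    if could_be_beneficial and could_be_harmful:
        return "No: need more research."
    if could_be_beneficial:
        return "Yes: use it."
    return "Yes: don't use it."


def statistically_significant(lower, upper):
    """True when the confidence interval does not overlap the null (zero)."""
    return lower > 0 or upper < 0


# Hypothetical confidence intervals, with smallest important effects of +1 and -1.
for lower, upper in [(1.2, 2.8), (-0.3, 1.6), (-0.5, 0.5), (-1.5, 1.5)]:
    decision = clinical_decision(lower, upper, 1.0, -1.0)
    significant = "Yes" if statistically_significant(lower, upper) else "No"
    print(f"{lower:5.1f} to {upper:4.1f} | {decision:<24} | significant: {significant}")
```

The second interval shows the slide's point: an effect can be clinically decisive ("use it") while being statistically non-significant.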

Problem: what's the smallest clinically important effect? If you can't answer this question, quit the field. Example: in many solo sports, a ~0.5% change in power output substantially changes a top athlete's chances of winning. The default for most other populations and effects is Cohen's set of smallest values. These values apply to clinical, practical and/or mechanistic importance… Correlations: 0.10. Relative frequencies, relative risks, or odds ratios: 1.1, depending on prevalence of the disease or other condition. Standardized changes or differences in the mean: 0.20 between-subject standard deviations. In a controlled trial, it's the SD of all subjects in the pre-test, not the SD of the change scores.
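
As a rough illustration of the standardized-difference default, the sketch below divides a difference in mean change by the between-subject SD of the pre-test scores and compares the result with 0.20. The data and the function name `standardized_difference` are invented for the example.

```python
from statistics import stdev

SMALLEST_STANDARDIZED_EFFECT = 0.20  # Cohen's default, as stated on the slide


def standardized_difference(difference_in_means, pretest_scores):
    """Express a difference in means in units of the pre-test SD.

    The SD is that of all subjects in the pre-test (between-subject SD),
    not the SD of the change scores.
    """
    return difference_in_means / stdev(pretest_scores)


# Hypothetical pre-test scores for all subjects and a hypothetical effect.
pretest = [52.0, 48.5, 55.1, 50.3, 47.9, 53.6, 49.8, 51.2]
effect = 0.6  # difference in mean change between groups, in raw units

d = standardized_difference(effect, pretest)
smallest_raw = SMALLEST_STANDARDIZED_EFFECT * stdev(pretest)
print(f"standardized effect: {d:.2f}")
print(f"smallest worthwhile effect in raw units: {smallest_raw:.2f}")
print("observed effect exceeds the smallest worthwhile effect:", abs(d) > 0.20)
```

The same threshold in raw units can then serve as the smallest beneficial and harmful value when interpreting confidence limits.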

Clinical Significance via Clinical Chances We calculate probabilities that the true effect could be clinically beneficial, trivial, or harmful (P beneficial, P trivial, P harmful). These Ps are NOT the proportions of positive, non- and negative responders in the population. Alan will deal with these. Calculating the Ps is easy. Put the observed value, smallest beneficial/harmful value, and P value into a spreadsheet at newstats.org. More challenging: interpreting the probabilities, and publishing the work. [Figure: probability distribution of the true value around the observed value, plotted against the value of the effect statistic (negative to positive); the smallest harmful value and smallest beneficial value divide the area into P harmful = 0.05, P trivial = 0.15 and P beneficial = 0.80.]
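
The calculation can be sketched as follows. This is not the newstats.org spreadsheet itself; it is an illustration under stated assumptions: the true value is taken to follow a t distribution centred on the observed value, the standard error is recovered from the two-sided P value, positive effects are beneficial, and the t distribution is evaluated numerically with the standard library only. All function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Cumulative probability via Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Inverse of t_cdf for 0.5 <= p < 1, by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clinical_chances(observed, p_value, df, smallest_beneficial, smallest_harmful):
    """Return (P beneficial, P trivial, P harmful) for the true effect."""
    if not 0 < p_value < 1 or observed == 0:
        raise ValueError("need 0 < p_value < 1 and a non-zero observed effect")
    t_obs = t_quantile(1 - p_value / 2, df)  # |t| giving this two-sided P value
    se = abs(observed) / t_obs               # implied standard error
    p_beneficial = 1 - t_cdf((smallest_beneficial - observed) / se, df)
    p_harmful = t_cdf((smallest_harmful - observed) / se, df)
    return p_beneficial, 1 - p_beneficial - p_harmful, p_harmful


# Inputs from the spreadsheet example later in the slideshow; the harmful
# threshold of -1 is an assumption (equal and opposite to the beneficial one).
chances = clinical_chances(1.5, 0.03, 18, 1.0, -1.0)
print("beneficial / trivial / harmful (%):",
      " / ".join(f"{100 * p:.1f}" for p in chances))
```

With these inputs the result should land close to the chances shown on the spreadsheet slide (about 78% positive, 22% trivial and almost 0% negative). The pure-Python integration is only meant for moderate degrees of freedom; `math.gamma` overflows for very large df.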

Interpreting the Probabilities You should describe outcomes in plain language in your paper. Therefore you need to describe the probabilities that the effect is beneficial, trivial, and/or harmful. Suggested scheme:

The effect… beneficial/trivial/harmful     Probability   Chances   Odds
is almost certainly not…                   <0.01         <1%       <1:99
is very unlikely to be…                    0.01–0.05     1–5%      1:99–1:19
is unlikely to be…, is probably not…       0.05–0.25     5–25%     1:19–1:3
is possibly (not)…, may (not) be…          0.25–0.75     25–75%    1:3–3:1
is likely to be…, is probably…             0.75–0.95     75–95%    3:1–19:1
is very likely to be…                      0.95–0.99     95–99%    19:1–99:1
is almost certainly…                       >0.99         >99%      >99:1
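
The suggested scheme maps directly onto a lookup. The sketch below is a hypothetical helper, `describe_chance`; how a probability exactly on a boundary is classified is an assumption here (it goes to the higher category), since the slide does not say.

```python
def describe_chance(p):
    """Return the qualitative term for a probability between 0 and 1.

    Assumption: a value exactly on a boundary falls in the higher category.
    """
    if not 0 <= p <= 1:
        raise ValueError("probability must be between 0 and 1")
    scale = [
        (0.01, "almost certainly not"),
        (0.05, "very unlikely"),
        (0.25, "unlikely, probably not"),
        (0.75, "possibly, may (not) be"),
        (0.95, "likely, probably"),
        (0.99, "very likely"),
    ]
    for upper_bound, phrase in scale:
        if p < upper_bound:
            return phrase
    return "almost certainly"


# Chances of benefit / trivial / harm, taken from the in-text example
# given later in the slideshow.
for label, p in [("beneficial", 0.74), ("trivial", 0.23), ("harmful", 0.03)]:
    print(f"{label}: {p:.0%} ({describe_chance(p)})")
```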

How to Publish Clinical Chances Example of a table from a randomized controlled trial:

TABLE 1–Differences in improvements in kayaking sprint speed between slow, explosive and control training groups.

Compared groups        Mean improvement (%) and 90% confidence limits    Chances (% and qualitative) of substantial improvement (a)
Slow - control         3.1; ± …                                          99.6; almost certain
Explosive - control    …; ±1.2                                           98; very likely
Slow - explosive       1.1; ±1.4                                         74; possible

(a) Increase in speed of >0.5%.
Chances of a substantial impairment were all <5% (very unlikely).

Example in body of the text: Chances (%) that the true effect was beneficial / trivial / harmful were 74 / 23 / 3 (possible / unlikely / very unlikely). In discussing an effect, use clear-cut or clinically significant or decisive when… Chances of benefit or harm are either at least very likely (>95%) or at most very unlikely (<5%), because… The true value of some effects is near the smallest clinically beneficial value, so for these effects… You would need a huge sample size to distinguish confidently between trivial and beneficial. And anyway… What matters clinically is that the effect is very unlikely to be harmful, for which you need only a modest sample size. And vice versa for effects near the threshold for harm. Otherwise, state more research is needed to clarify the effect.

Two examples of use of the spreadsheet for clinical chances:

Inputs: P value 0.03; value of statistic 1.5; Conf. level (%) 90; deg. of freedom 18; threshold values for clinical chances: positive 1, negative …; Confidence limits: lower …, upper …

Chances (% or odds) that the true value of the statistic is:

clinically positive                 clinically trivial                        clinically negative
prob (%)  odds                      prob (%)  odds                            prob (%)  odds
78        3:1  likely, probable     22        1:3  unlikely, probably not     0         1:2071  almost certainly not
78        3:1  likely, probable     19        1:4  unlikely, probably not     3         1:30    very unlikely

Both these effects are clinically decisive, clear, or significant.

Limitations of this approach to clinical decisions It deals with uncertainty about the magnitude of an effect in a population. Which is OK for effects like correlations or simple mean differences between groups, which don't apply to individuals. But effects like risk of injury or changes in physiology or performance can apply to individuals. Alas, this approach does NOT provide the uncertainty of the effect or chances of benefit and harm for an individual. Neither does statistical significance. More information and analyses are needed to make clinical decisions for individuals.

Summary Show the observed magnitude of the effect. Attend to precision of estimation by showing 90% confidence limits of the true value. Do NOT show p values, do NOT test a hypothesis and do NOT mention statistical significance. Attend to clinical, practical or mechanistic significance by… stating, with justification, the smallest worthwhile effect, then… interpreting the confidence limits in relation to this effect, or… estimating probabilities that the true effect is beneficial, trivial, and/or harmful (or substantially positive, trivial, and/or negative). Make a qualitative statement about the clinical or practical significance of the effect, using unlikely, very likely, and so on. Remember, it applies to populations, not individuals.

For related articles and resources: A New View of Statistics, newstats.org