Statistical Power And Sample Size Calculations


Statistical Power And Sample Size Calculations (Minitab calculations and manual calculations) Mike Cox, Newcastle University, 11/03/2016

When Do You Need Statistical Power Calculations, And Why? A prospective power analysis is used before collecting data, to consider design sensitivity.

When Do You Need Statistical Power Calculations, And Why? A retrospective power analysis is used after the data have been collected, to judge whether the studies you are interpreting were well enough designed.

When Do You Need Statistical Power Calculations, And Why? In Cohen's (1962) seminal power analysis of the Journal of Abnormal and Social Psychology he concluded that over half of the published studies were insufficiently powered to result in statistical significance for the main hypothesis. Cohen, J. (1962) "The statistical power of abnormal-social psychological research: A review", Journal of Abnormal and Social Psychology, 65, 145-153.

What Is Statistical Power? Essential concepts:
- the null hypothesis, Ho
- the significance level, α
- Type I error
- Type II error
Information point: Type I and Type II errors. Crichton, N. (2000) Journal of Clinical Nursing, 9(2), 207.

What Is Statistical Power? Essential concepts Recall that a null hypothesis (Ho) states that the findings of the experiment are no different from those that would have been expected to occur by chance. Statistical hypothesis testing involves calculating the probability of achieving the observed results if the null hypothesis were true. If this probability is low (conventionally p < 0.05), the null hypothesis is rejected and the findings are said to be "statistically significant" (i.e., unlikely under Ho) at that accepted level.
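As an added illustration (not part of the original slides), a minimal sketch of this decision rule in Python, assuming scipy is available and using invented data:

```python
# A minimal sketch of the hypothesis-testing decision rule (invented data).
from scipy import stats

group_a = [5.1, 4.9, 6.2, 5.8, 5.5, 6.0, 5.3, 5.9]
group_b = [4.2, 4.8, 5.0, 4.5, 4.9, 4.4, 5.1, 4.6]

# p is the probability of results at least this extreme if Ho
# (no difference between the populations) were true.
t_stat, p_value = stats.ttest_ind(group_a, group_b)

alpha = 0.05
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject Ho, result is statistically significant")
else:
    print(f"p = {p_value:.4f} >= {alpha}: retain Ho")
```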

Statistical Hypothesis Testing When you perform a statistical hypothesis test, there are four possible outcomes

Statistical Hypothesis Testing The outcome depends on two things: whether the null hypothesis (Ho) is true or false, and whether you decide to reject, or else to retain, provisional belief in Ho.

Statistical Hypothesis Testing

Decision    | Ho is really true (there is really no effect) | Ho is really false (there really is an effect)
Retain Ho   | correct decision: prob = 1 - α                | Type II error: prob = β
Reject Ho   | Type I error: prob = α                        | correct decision: prob = 1 - β

When Ho Is True And You Reject It, You Make A Type I Error When there really is no effect, but the statistical test comes out significant by chance, you make a Type I error. When Ho is true, the probability of making a Type I error is called alpha (α). This probability is the significance level associated with your statistical test.

When Ho Is False And You Fail To Reject It, You Make A Type II Error When, in the population, there really is an effect, but your statistical test comes out non-significant due to inadequate power and/or bad luck with sampling error, you make a Type II error. When Ho is false (so that there really is an effect there waiting to be found), the probability of making a Type II error is called beta (β).

The Definition Of Statistical Power Statistical power is the probability of not missing an effect, due to sampling error, when there really is an effect there to be found. Power is the probability (prob = 1 - β) of correctly rejecting Ho when it really is false.
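A small Monte Carlo sketch (an addition, assuming numpy and scipy) makes this concrete: simulate many experiments in which Ho really is false, test each one, and the proportion of correct rejections estimates the power, 1 - β.

```python
# Estimating power by simulation when a real effect of d = 0.8 exists.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, true_diff, sd, alpha = 13, 0.8, 1.0, 0.05   # d = 0.8, n per group
reps = 10_000

rejections = 0
for _ in range(reps):
    a = rng.normal(0.0, sd, n)           # group with no shift
    b = rng.normal(true_diff, sd, n)     # group with a real effect
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        rejections += 1                  # Ho correctly rejected

print(f"Estimated power = {rejections / reps:.3f}")   # roughly 0.5 for this design
```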

Calculating Statistical Power Depends On:
- the sample size
- the level of statistical significance required
- the minimum size of effect that it is reasonable to expect

How Do We Measure Effect Size? Cohen's d Defined as the difference between the means for the two groups, divided by an estimate of the standard deviation in the population. Often we use the average of the standard deviations of the samples as a rough guide for the latter.
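A minimal sketch of that calculation, with invented data and assuming numpy:

```python
# Cohen's d: the difference between the group means divided by an
# estimate of the population standard deviation (invented data).
import numpy as np

group_1 = np.array([6.1, 5.8, 6.5, 7.0, 6.3, 5.9])
group_2 = np.array([5.2, 5.5, 4.9, 5.7, 5.1, 5.4])

mean_diff = group_1.mean() - group_2.mean()

# Rough guide from the slide: the average of the two sample SDs.
sd_average = (group_1.std(ddof=1) + group_2.std(ddof=1)) / 2

# A common alternative: the pooled standard deviation.
n1, n2 = len(group_1), len(group_2)
sd_pooled = np.sqrt(((n1 - 1) * group_1.var(ddof=1) +
                     (n2 - 1) * group_2.var(ddof=1)) / (n1 + n2 - 2))

print(f"d (average of SDs): {mean_diff / sd_average:.2f}")
print(f"d (pooled SD):      {mean_diff / sd_pooled:.2f}")
```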

Cohen's Rules Of Thumb For Effect Size

                  | Correlation coefficient | Difference between means
"Small effect"    | r = 0.1                 | d = 0.2 standard deviations
"Medium effect"   | r = 0.3                 | d = 0.5 standard deviations
"Large effect"    | r = 0.5                 | d = 0.8 standard deviations

Calculating Cohen's d Cohen, J. (1977) Statistical Power Analysis for the Behavioral Sciences. San Diego, CA: Academic Press. Cohen, J. (1992) "A power primer", Psychological Bulletin, 112, 155-159.


Calculating Cohen's d from a t test Interpreting Cohen's d effect size: an interactive visualization.
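Where only a t statistic and group sizes are reported, d can be recovered; the sketch below (an addition, assuming a pooled-variance independent-samples t test) uses the standard relationship t = d / sqrt(1/n1 + 1/n2).

```python
# Recovering Cohen's d from an independent-samples t statistic,
# assuming a pooled-variance (equal-variance) t test.
import math

def d_from_t(t: float, n1: int, n2: int) -> float:
    """Cohen's d implied by a two-sample t statistic and group sizes."""
    return t * math.sqrt(1 / n1 + 1 / n2)

# Example: t = 2.04 with 13 cases per group implies d of about 0.8.
print(round(d_from_t(2.04, 13, 13), 2))
```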

Conventions And Decisions About Statistical Power The acceptable risk of a Type II error is often set at 1 in 5, i.e., a probability of 0.2 (β). The conventional value for "adequate" statistical power is therefore 1 - 0.2 = 0.8. People often regard the minimum acceptable statistical power for a proposed study as an 80% chance that an effect which really exists will show up as a significant finding. Understanding Statistical Power and Significance Testing: an Interactive Visualization

6 Steps To Determine An Appropriate Sample Size For My Study 1. Formulate the study. Here you detail your study design, choose the outcome summary, and specify the analysis method. 2. Specify analysis parameters. These include, for instance, the test significance level, whether the test is one- or two-sided, and exactly what you are looking for from your analysis.

6 Steps To Determine An Appropriate Sample Size For My Study 3. Specify the effect size for the test. This could be the expected effect size (often a best estimate), or the effect size that is deemed to be clinically meaningful. 4. Compute the sample size or power. Once you have completed steps one through three you are in a position to compute the sample size or the power for your study.

6 Steps To Determine An Appropriate Sample Size For My Study 5. Sensitivity analysis. Here you compute your sample size or power under multiple scenarios to examine how the study parameters affect the power or the required sample size. Essentially this is a what-if analysis of how sensitive the power or required sample size is to the other factors (a short sketch of such an analysis follows step 6 below).

6 Steps To Determine An Appropriate Sample Size For My Study 6. Choose an appropriate power or sample size, and document this in your study design protocol. However, other authors suggest 5 steps (a, b, c or d)! Other options are also available!
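As a sketch of steps 2 to 5 in practice (an addition, assuming the statsmodels package rather than Minitab), the following what-if analysis tabulates the per-group sample size a two-sample t test would need over a range of plausible effect sizes and target powers:

```python
# Sensitivity ("what-if") analysis: required n per group for a two-sided,
# two-sample t test at alpha = 0.05, across illustrative scenarios.
import math
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

print("effect size d   target power   n per group")
for d in (0.2, 0.5, 0.8):
    for target in (0.8, 0.9):
        n = analysis.solve_power(effect_size=d, alpha=0.05,
                                 power=target, alternative='two-sided')
        print(f"{d:>13}   {target:>12}   {math.ceil(n):>11}")
```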

A Couple Of Useful Links For an article casting doubt on scientific precision and power, see The Economist, 19 Oct 2013: "I see a train wreck looming," warned Daniel Kahneman. Also an interesting read, The Economist, 19 Oct 2013, on the reviewing process. A collection of online power calculator web pages for specific kinds of tests. Java applets for power and sample size: select the analysis.

Statistical Power Analysis In Minitab (Next Week) Note that G*Power 3.1 is installed on University machines. It is more complex to use than Minitab, but provides a wider range of tests. For further information see the link, and look down the page for the desired software.

Statistical Power Analysis In Minitab Minitab is available via RAS Stat > Power and Sample Size >

Statistical Power Analysis In Minitab Recall that a comparison of two proportions equates to analysing a 2×2 contingency table. Note that you might find web tools for other models. The alternative normally involves solving some very complex equations.
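For the two-proportions case the calculation can also be done outside Minitab; a sketch assuming statsmodels (the proportions 0.60 and 0.45 are purely illustrative) converts the two proportions to Cohen's effect size h and uses a normal-approximation power calculation:

```python
# Sample size for comparing two proportions (a 2x2 contingency table).
import math
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

h = proportion_effectsize(0.60, 0.45)        # Cohen's h for the two proportions
n = NormalIndPower().solve_power(effect_size=h, alpha=0.05,
                                 power=0.8, alternative='two-sided')
print(f"h = {h:.3f}, about {math.ceil(n)} cases per group")
```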

Statistical Power Analysis In Minitab Simple statistical correlation analysis online: see Test 28 in the Handbook of Parametric and Nonparametric Statistical Procedures, Third Edition, by David J. Sheskin.

Factors That Influence Power:
- sample size
- alpha (α)
- the standard deviation

Using Minitab To Calculate Power And Minimum Sample Size Suppose we have two samples, each with n = 13, and we propose to use the 0.05 significance level. The difference between means is 0.8 standard deviations (i.e., Cohen's d = 0.8), so a two-sample t test is appropriate. All key strokes are in the printed notes.

Using Minitab To Calculate Power And Minimum Sample Size Note that all parameters bar one are required. Leave one field blank; that is the quantity that will be estimated.

Using Minitab To Calculate Power And Minimum Sample Size

Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05  Assumed standard deviation = 1

Difference  Sample Size  Power
0.8         13           0.499157

The sample size is for each group. Power will be 0.4992.

Using Minitab To Calculate Power And Minimum Sample Size If, in the population, there really is a difference of 0.8 standard deviations between the two categories being sampled, then using sample sizes of 13 in each group gives a 49.92% chance of obtaining a result that is significant at the 0.05 level.
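The same figure can be cross-checked outside Minitab; for example, a sketch using statsmodels (assuming it is installed) should give a power close to 0.4992:

```python
# Cross-check of the Minitab result: two-sided, two-sample t test,
# d = 0.8, 13 cases per group, alpha = 0.05.
from statsmodels.stats.power import TTestIndPower

power = TTestIndPower().power(effect_size=0.8, nobs1=13, alpha=0.05,
                              alternative='two-sided')
print(f"Power = {power:.4f}")   # approximately 0.499
```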

Using Minitab To Calculate Power And Minimum Sample Size Suppose the difference between the means is 0.8 standard deviations (i.e., Cohen's d = 0.8). Suppose that we require a power of 0.8 (the conventional value). Suppose we intend doing a one-tailed t test, with significance level 0.05. All key strokes are in the printed notes.

Using Minitab To Calculate Power And Minimum Sample Size Select “Options” to set a one-tailed test


Using Minitab To Calculate Power And Minimum Sample Size

Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus >)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05  Assumed standard deviation = 1

Difference  Sample Size  Target Power  Actual Power
0.8         21           0.8           0.816788

The sample size is for each group. To achieve a target power of at least 0.8 we need at least 21 cases in each group, giving an actual power of 0.8168.
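Again, a cross-check is possible outside Minitab; a sketch using statsmodels solves for the per-group sample size with a one-sided alternative and rounds up to whole cases:

```python
# Minimum n per group for a one-tailed, two-sample t test:
# d = 0.8, alpha = 0.05, target power 0.8.
import math
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n = analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.8,
                         alternative='larger')
n_per_group = math.ceil(n)            # round up to whole cases
achieved = analysis.power(effect_size=0.8, nobs1=n_per_group,
                          alpha=0.05, alternative='larger')
print(n_per_group, round(achieved, 4))   # should be close to 21 and 0.8168
```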

Using Minitab To Calculate Power And Minimum Sample Size Suppose you are about to undertake an investigation to determine whether or not 4 treatments affect the yield of a product, using 5 observations per treatment. You know that the mean of the control group should be around 8, and you would like to detect differences of +4; thus, the maximum difference you are considering is 4 units. Previous research suggests the population σ is 1.64. So a one-way ANOVA is appropriate.


Using Minitab To Calculate Power And Minimum Sample Size

Power and Sample Size
One-way ANOVA
Alpha = 0.05  Assumed standard deviation = 1.64  Number of Levels = 4

SS Means  Sample Size  Power     Maximum Difference
8         5            0.826860  4

The sample size is for each level.

Using Minitab To Calculate Power And Minimum Sample Size To interpret the results: if you assign five observations to each treatment level, you have a power of 0.83 to detect a difference of 4 units or more between the treatment means. Minitab can also display a power curve for the one-way ANOVA with 5 samples per treatment, showing the power value for every possible maximum difference between means.
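This one-way ANOVA power can also be cross-checked; the sketch below (assuming statsmodels) converts Minitab's maximum-difference specification into Cohen's f. With two level means at the extremes and the others at the centre, the sum of squared deviations of the level means is (maximum difference)^2 / 2 = 8, which matches the "SS Means" value in the output above.

```python
# Power for a one-way ANOVA: 4 levels, 5 observations per level,
# sigma = 1.64, maximum difference between level means = 4 units.
import math
from statsmodels.stats.power import FTestAnovaPower

k, n_per_level, sigma, max_diff = 4, 5, 1.64, 4.0

ss_means = max_diff ** 2 / 2           # two means at the extremes, rest central
f = math.sqrt(ss_means / k) / sigma    # Cohen's f

power = FTestAnovaPower().power(effect_size=f, nobs=k * n_per_level,
                                alpha=0.05, k_groups=k)
print(f"f = {f:.3f}, power = {power:.4f}")   # roughly 0.83, as in Minitab
```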