Sample Size Determination In the Context of Hypothesis Testing
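Recall: in the context of estimation, sample size is based on:
  the width $w$ of the confidence interval
  the confidence level $(1 - \alpha)$ and its confidence coefficient, $z_{1-\alpha/2}$
  the population standard error, $\sigma/\sqrt{n}$
[Figure: confidence interval of width $w$, running from $\bar{x} - z_{1-\alpha/2}(\sigma/\sqrt{n})$ through $\bar{x}$ to $\bar{x} + z_{1-\alpha/2}(\sigma/\sqrt{n})$]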
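The link between interval width and sample size recalled here can be sketched in Python. Because the interval is $\bar{x} \pm z_{1-\alpha/2}(\sigma/\sqrt{n})$, its width is $w = 2 z_{1-\alpha/2}(\sigma/\sqrt{n})$, which gives $n = (2 z_{1-\alpha/2}\,\sigma / w)^2$. The function name `n_for_ci_width` and the target width of 5 are illustrative assumptions, not values from the slides.

```python
import math
from statistics import NormalDist

def n_for_ci_width(sigma, width, conf_level=0.95):
    """Smallest n giving a z-based confidence interval no wider than `width`.

    Assumes sigma is known, so the interval is xbar +/- z * sigma / sqrt(n).
    """
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}, about 1.96 for 95%
    return math.ceil((2 * z * sigma / width) ** 2)

# Hypothetical inputs for illustration: sigma = 10, desired total width = 5
print(n_for_ci_width(sigma=10, width=5))
```

Halving the desired width roughly quadruples the required n, since n grows with the square of $1/w$.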
Sample size in the context of hypothesis testing: we need to consider POWER as well as the confidence level.
Example: suppose we have a hypothesis on one mean:
  $H_0: \mu = \mu_0 = 100$ vs. $H_a: \mu \neq 100$
  $\sigma = 10$, $\alpha = .05$
If the true mean is in fact $\mu_a = 105$, what sample size is required so that the power of the test is $(1-\beta) = .80$?
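For our hypothesis test, we will reject $H_0$ if $\bar{x}$ is greater than $C_1$ or less than $C_2$, where
  $C_2 = \mu_0 - z_{1-\alpha/2}(\sigma/\sqrt{n})$
  $C_1 = \mu_0 + z_{1-\alpha/2}(\sigma/\sqrt{n})$
[Figure: sampling distribution of $\bar{x}$ under $H_0$, centered at $\mu_0 = 100$, with rejection regions of area $\alpha/2 = .025$ below $C_2$ and above $C_1$]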
Let's look at these decision points ($C_1$ and $C_2$) relative to a specific alternative. Suppose, in fact, that $\mu_a = 105$. We will reject $H_0$ if $\bar{x}$ is greater than $C_1$ or $\bar{x}$ is less than $C_2$.
[Figure: distribution based on $H_a$, centered at $\mu_a = 105$, with $C_2$ and $C_1$ marked]
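We want $\beta = .20$ for power $= .80$, so $z_\beta = -.842$. That is, $C_1$ must sit .842 standard errors below $\mu_a$, so that 80% of the $H_a$ distribution lies above it.
[Figure: distribution based on $H_a$, centered at $\mu_a$, with $C_2$ and $C_1$ marked]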
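Note for sample size determination:
  $\alpha$ and $\beta$ are set by the investigator
  both a specific null ($\mu_0$) and a specific alternative ($\mu_a$) must be specified
  we assume that the same variance $\sigma^2$ holds for both the null and alternative distributions
[Figure: null and alternative distributions centered at $\mu_0$ and $\mu_a$, with tail areas $\alpha/2$ under the null and $\beta$ under the alternative]
  $C_1 = \mu_0 + z_{1-\alpha/2}(\sigma/\sqrt{n})$
  $C_1 = \mu_a - z_{1-\beta}(\sigma/\sqrt{n})$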
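We now have $C_1$ in terms of both the $H_0$ and $H_a$ distributions:
  $C_1 = \mu_0 + z_{1-\alpha/2}(\sigma/\sqrt{n})$ and $C_1 = \mu_a - z_{1-\beta}(\sigma/\sqrt{n})$
Setting these equal:
  $\mu_0 + z_{1-\alpha/2}(\sigma/\sqrt{n}) = \mu_a - z_{1-\beta}(\sigma/\sqrt{n})$
Then solve for $n$.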
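Sample size is then:
  $n = \left[ \dfrac{(z_{1-\alpha/2} + z_{1-\beta})\,\sigma}{\mu_a - \mu_0} \right]^2$
Note: always use positive values for $z_{1-\alpha/2}$ and $z_{1-\beta}$ (we defined the CI using positive $z$).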
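In our example: $\sigma = 10$, $\alpha = .05$, $z_{1-\alpha/2} = 1.96$, $\beta = .20$, $z_{1-\beta} = .842$, $\mu_0 = 100$, $\mu_a = 105$
  $n = \left[ \dfrac{(1.96 + .842)(10)}{105 - 100} \right]^2 = (5.604)^2 = 31.4$
Rounding up, a sample size of n = 32 is needed.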
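If we change the desired power to $1-\beta = .90$, then $z_{1-\beta} = 1.282$:
  $n = \left[ \dfrac{(1.96 + 1.282)(10)}{105 - 100} \right]^2 = (6.484)^2 = 42.04$
Rounding up, a sample size of n = 43 is needed.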
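The two hand calculations can be checked with a short Python sketch of the same formula. It uses only the standard library; the function name `n_one_sample_z` is an illustrative choice, and the result is rounded up because a fractional observation is not possible.

```python
import math
from statistics import NormalDist

def n_one_sample_z(mu0, mua, sigma, alpha=0.05, power=0.80):
    """Sample size for a two-sided one-sample z test with known sigma.

    Implements n = [(z_{1-alpha/2} + z_{1-beta}) * sigma / (mua - mu0)]^2,
    rounded up to the next whole observation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # positive z for alpha
    z_beta = NormalDist().inv_cdf(power)           # positive z for 1 - beta
    n_exact = ((z_alpha + z_beta) * sigma / (mua - mu0)) ** 2
    return math.ceil(n_exact)

# The slide example: mu0 = 100, mua = 105, sigma = 10, alpha = .05
for target_power in (0.80, 0.90):
    print(target_power, n_one_sample_z(100, 105, 10, power=target_power))
```

Because the difference enters as a square, the sign of `mua - mu0` does not matter.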
In the context of hypothesis testing, sample size is a function of:
  $\sigma^2$, the population variance
  $\alpha$ (here .05), the Type I error rate, through $z_{1-\alpha/2}$
  $\beta$ (here .20), the Type II error rate, through $z_{1-\beta}$
  the distance between $\mu_0$, the hypothesized mean, and $\mu_a$, a specific alternative
Using Minitab to estimate sample size:
Stat > Power and Sample Size > 1-Sample Z
  Enter the difference between $\mu_0$ and $\mu_a$
  Enter the desired power (separate by spaces if entering several)
  Select a 2-sided test
  Enter $\sigma$
Power and Sample Size
1-Sample Z Test
Testing mean = null (versus not = null)
Calculating power for mean = null + difference
Alpha = 0.05  Sigma = 10

            Sample  Target  Actual
Difference    Size   Power   Power
         5      32  0.8000  0.8074
         5      43  0.9000  0.9064
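The "Actual Power" column can be reproduced approximately with a few lines of Python. This is a sketch of the standard two-sided z-test power calculation, not Minitab's own code; the function name `power_one_sample_z` is an assumption made for illustration.

```python
import math
from statistics import NormalDist

def power_one_sample_z(diff, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z test when the true mean is mu0 + diff."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    shift = diff * math.sqrt(n) / sigma              # true difference in standard-error units
    upper = 1 - std_normal.cdf(z_alpha - shift)      # P(xbar > C1 | Ha)
    lower = std_normal.cdf(-z_alpha - shift)         # P(xbar < C2 | Ha), usually tiny
    return upper + lower

for n in (32, 43):
    print(n, round(power_one_sample_z(diff=5, sigma=10, n=n), 4))
```

Actual power exceeds the target because n is rounded up to a whole number.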
Sample size and power for comparing means of 2 independent groups
In the example comparing LOS for elective vs. emergency patients, we observed a difference between sample means of 3.3 days, but found that this was NOT statistically significantly different from zero. However, 3.3 days is a large, expensive difference in length of stay. Our data had relatively large observed variance and small n.
What was the power of our study to detect a difference of 3.3 days?
What sample size would be needed per group to find a difference of 3 days or more to be significantly different from zero?
In Minitab: Stat > Power and Sample Size > 2-Sample t
To evaluate power:
  Enter sample sizes
  Enter observed difference in means
  Enter standard deviation $\sigma$
  Set $\alpha$ and 1- or 2-sided test, using the Options menu
In Minitab: Stat > Power and Sample Size > 2-Sample t

2-Sample t Test
Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + 3.3
Alpha = 0.05  Sigma = 10

Sample
  Size   Power
    14  0.1342
    11  0.1142

Note: Minitab assumes equal n's for the 2 groups, and only gives space for one value of $\sigma$.
Clearly, our power to detect a difference as large as 3.3 days was only about 12%: not very good!
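For a rough check without Minitab, the power can be approximated in Python by treating the test statistic as normal rather than t. This is a swapped-in simplification, not Minitab's method: Minitab's 2-sample t calculation is based on the t distribution, so with groups this small the normal approximation should come out somewhat higher than 0.1342 and 0.1142. The function name `approx_power_two_sample` is hypothetical.

```python
import math
from statistics import NormalDist

def approx_power_two_sample(diff, sigma, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, equal-n, two-sample comparison.

    Assumes a common sigma in both groups; ignores the t correction,
    so it overstates power slightly when n_per_group is small.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    se = sigma * math.sqrt(2 / n_per_group)   # SE of the difference in means
    shift = diff / se
    return (1 - std_normal.cdf(z_alpha - shift)) + std_normal.cdf(-z_alpha - shift)

for n in (14, 11):
    print(n, round(approx_power_two_sample(diff=3.3, sigma=10, n_per_group=n), 3))
```

Either way the conclusion is the same: the study had well under a 20% chance of detecting a 3.3-day difference.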
In Minitab: Stat > Power and Sample Size > 2-Sample t
To estimate sample size:
  Enter desired power
  Enter desired significant difference in means
  Enter standard deviation $\sigma$
Power and Sample Size
2-Sample t Test
Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + 3
Alpha = 0.05  Sigma = 10

Sample  Target  Actual
  Size   Power   Power
   176  0.8000  0.8014
   235  0.9000  0.9007

Note: sample sizes are per group. If this seems excessive, is your estimate of the standard deviation reasonable? I used the larger of the 2 observed SDs here. You might want to compute a pooled SD, and try that.
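The per-group sample size can also be approximated in Python, along with the pooled SD suggested in the note. The sketch uses the normal-approximation formula $n = 2\left[(z_{1-\alpha/2} + z_{1-\beta})\,\sigma / \delta\right]^2$ per group, which is an assumption swapped in for Minitab's t-based calculation, so its answers should land slightly below Minitab's 176 and 235. The function names `approx_n_per_group` and `pooled_sd` are hypothetical.

```python
import math
from statistics import NormalDist

def approx_n_per_group(diff, sigma, alpha=0.05, power=0.80):
    """Normal-approximation n per group for a two-sided two-sample comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / diff) ** 2)

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups, weighting by degrees of freedom."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

for target_power in (0.80, 0.90):
    print(target_power, approx_n_per_group(diff=3, sigma=10, power=target_power))
```

To try the pooled estimate, pass the result of `pooled_sd` as `sigma`; a smaller SD reduces the required n in proportion to its square.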