How Many Subjects Will I Need? Jane C. Johnson Office of Research Support A.T. Still University of Health Sciences Kirksville, MO USA
Objective To familiarize primary care researchers with sample size calculations and power analysis for confidence intervals and hypothesis tests.
Outline 1. Initial Case Study Discussion 2. Sample Size Calculations for Hypothesis Testing a. Review of Statistical Concepts b. Two-sample t-test c. Paired t-test d. Analysis of Variance (ANOVA) e. Chi-square test 3. Sample Size Calculations for Confidence Intervals a. Review of Statistical Concepts b. CI for a Mean c. CI for a Proportion 4. Follow-up Case Study Discussion
Case Study Overview Subclinical hypothyroidism is associated with elevated total cholesterol. (Hueston & Pearson, 2004) We want to study whether treatment of subclinical hypothyroidism affects total cholesterol.
Case Study #1 Research Question: In people with subclinical hypothyroidism, is total cholesterol lower in those treated for hypothyroidism compared to those who do not receive treatment?
Case Study #2 Research Question: Does total cholesterol decrease in people treated for subclinical hypothyroidism compared to pretreatment levels?
Case Study #3 Research Question: In people with subclinical hypothyroidism, do total cholesterol levels differ between those treated solely for hypothyroidism, those treated solely for hypercholesterolemia, and those who are not treated for either?
Case Study #4 Research Question: In people with subclinical hypothyroidism, is the prevalence of hypercholesterolemia lower in those treated for hypothyroidism compared to those who do not receive treatment?
Case Study #5 Research Question: What is the average total cholesterol in people with subclinical hypothyroidism following treatment for hypothyroidism?
Case Study #6 Research Question: What is the prevalence of hypercholesterolemia in people with subclinical hypothyroidism following treatment for hypothyroidism?
Hypothesis Testing: Review of Statistical Concepts Statistical Hypotheses Significance Level Power Effect Size
Hypothesis Tests Statistical tests or hypothesis tests are used to make an educated guess about which statistical hypothesis (null or alternative) is true.
Statistical Hypotheses Alternative Hypothesis (HA) The research hypothesis stated in terms of statistical parameters. Examples: HA: μ₁ < μ₂ (one-sided) HA: μ₁ ≠ μ₂ (two-sided) HA: p₁ ≠ p₂
Statistical Hypotheses Null Hypothesis (Ho) The “opposite” of the alternative hypothesis. Examples: Ho: μ₁ ≥ μ₂ Ho: μ₁ = μ₂ Ho: p₁ = p₂
Errors Decision based on hypothesis test vs. truth in the universe: Decide HA when HA is true: Correct. Decide HA when Ho is true: Type I Error. Decide Ho when Ho is true: Correct. Decide Ho when HA is true: Type II Error.
Errors, Significance Level, and Power P(Type I Error) = α = Significance Level Typical values for α = 0.05, 0.01 P(Type II Error) = β Power = 1 − β Typical values for Power = 0.80, 0.90
Effect Size Example: Triglycerides Hypothyroid μ₁ = 180 Normal μ₂ = 155, shown for σ = 21 and for σ = 7
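As a rough illustration of why the spread matters, the sketch below computes a standardized effect size for the triglyceride example. It assumes the two unlabeled values on the slide (21 and 7) are standard deviations and that effect size means the difference in means divided by the standard deviation; the function name `standardized_effect_size` is hypothetical.

```python
def standardized_effect_size(mean_1, mean_2, sd):
    """Absolute difference in means expressed in standard deviation units.

    Assumption: both groups share the same standard deviation `sd`.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    return abs(mean_1 - mean_2) / sd


if __name__ == "__main__":
    # Same 25-point difference in means, two different amounts of spread.
    for sd in (21, 7):
        es = standardized_effect_size(180, 155, sd)
        print(f"sd = {sd}: effect size = {es:.2f}")
```

The same 25-point difference gives an effect size three times larger when the standard deviation is 7 rather than 21, which is why less variable measurements need fewer subjects.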
Effects on Sample Size Standard deviation ↑: n ↑. Effect size ↑: sample size (n) ↓. Power ↑: n ↑. α ↓: n ↑. 2-sided HA: n ↑ compared to 1-sided.
Effects on Sample Size: Example (Two-sample t-test)
Hypothesis Testing: Steps for Selecting Sample Size 1. State Ho and HA 2. Set values for α and Power 3. Select statistical test 4. Determine expected/meaningful effect size 5. Estimate sample size
Sample Size: Two-sample t-test Hypotheses – Ho: μ₁ = μ₂ vs. HA: μ₁ ≠ μ₂ Effect Size – ES = |μ₁ − μ₂| / σ See Handout Table 1
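For readers without the handout, here is a minimal sketch of a per-group sample size calculation for comparing two means. It assumes ES is the standardized difference |μ₁ − μ₂| / σ and uses the common normal approximation n = 2(z_α + z_power)² / ES², so results can run slightly below exact t-based tables such as the handout's. The function name `n_per_group_two_sample` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_sample(effect_size, alpha=0.05, power=0.80, two_sided=True):
    """Approximate subjects per group for a two-sample comparison of means.

    Normal approximation: n = 2 * (z_alpha + z_power)^2 / ES^2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    # A two-sided test splits alpha between the two tails.
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_power = std_normal.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    print("baseline (ES=0.5):", n_per_group_two_sample(0.5))
    print("larger effect (ES=1.0):", n_per_group_two_sample(1.0))
    print("higher power (0.90):", n_per_group_two_sample(0.5, power=0.90))
    print("one-sided test:", n_per_group_two_sample(0.5, two_sided=False))
```

Under this approximation, ES = 0.5 with α = 0.05 and power 0.80 needs 63 per group; raising power to 0.90 raises that to 85, and a one-sided test lowers it to 50.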
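Sample Size: Paired t-test Hypotheses – Ho: μ₁ = μ₂ vs. HA: μ₁ ≠ μ₂, or equivalently Ho: δ = 0 vs. HA: δ ≠ 0, where δ = μ₁ − μ₂ is the mean of the paired differences Effect Size – ES = |δ| / σd, where σd is the standard deviation of the paired differences See Handout Table 2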
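The paired design can be sketched in the same way. This example assumes the effect size is the mean paired difference divided by the standard deviation of the differences, and again uses a normal approximation, n = (z_α + z_power)² / ES², so exact t-based tables may give slightly larger values. The function name `n_pairs_paired` is hypothetical.

```python
import math
from statistics import NormalDist


def n_pairs_paired(effect_size, alpha=0.05, power=0.80, two_sided=True):
    """Approximate number of pairs (subjects measured twice) for a paired comparison.

    Normal approximation: n = (z_alpha + z_power)^2 / ES^2,
    where ES = |mean difference| / SD of the differences.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_power = std_normal.inv_cdf(power)
    return math.ceil((z_alpha + z_power) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    print("pairs needed for ES=0.5:", n_pairs_paired(0.5))
    print("pairs needed for ES=0.5, power 0.90:", n_pairs_paired(0.5, power=0.90))
```

Note that the effect sizes of the two designs are not directly comparable: the paired ES is scaled by the variability of within-subject differences, which is often smaller than between-subject variability.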
Sample Size: Analysis of Variance (ANOVA) Hypotheses – Ho: μ₁ = μ₂ = … = μT vs. HA: μi ≠ μj for at least 1 pair of groups Effect Size – (one possibility) ES= See Handout Table 4
Sample Size: Chi-square Test Hypotheses – Ho: p₁ = p₂ vs. HA: p₁ ≠ p₂ “Effect Size” – Depends on the values of p₁ and p₂, not just p₁ – p₂
Sample Size: Chi-square Test See Handout Table 5
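To see why the sample size depends on p₁ and p₂ themselves and not only on their difference, the sketch below uses a widely used normal-approximation formula for comparing two independent proportions with equal group sizes and no continuity correction. It is not necessarily the method behind the handout table, and the function name `n_per_group_two_proportions` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-sided test of p1 vs. p2.

    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))^2 / (p1 - p2)^2
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("p1 and p2 must be strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled proportion under Ho
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil((null_term + alt_term) ** 2 / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Both pairs differ by 0.20, but the required n is not the same.
    print("0.30 vs 0.10:", n_per_group_two_proportions(0.30, 0.10))
    print("0.60 vs 0.40:", n_per_group_two_proportions(0.60, 0.40))
```

Both comparisons involve a difference of 0.20, yet proportions near 0.5 have more binomial variability, so the second comparison needs more subjects per group than the first.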
Confidence Intervals: Review of Statistical Concepts Parameters and Estimates Standard Error Confidence Level Margin of Error
Confidence Intervals Confidence intervals are used to estimate parameters with a certain amount of confidence.
Parameters Quantitative (Continuous or Discrete) Interested in the mean (μ) Qualitative (Categorical) Interested in the proportion (p)
Point Estimates of Parameters The estimate of μ is the sample mean, x̄. The estimate of p is the sample proportion, p̂.
Interval Estimates of Parameters Typical form of a confidence interval is: estimate ± constant × standard error
Standard Error Standard error is a measure of how much variability there is between estimates of the parameter from one sample to another. In CI for μ, se = s/√n. In CI for p, se = √(p̂(1 − p̂)/n).
Confidence Level The confidence level indicates how certain you are about your interval estimate of your parameter. Typical values for confidence level [(1 − α) × 100%] = 95%, 99%
Constant in Confidence Interval The constant in the confidence interval formula depends on: 1. Confidence Level 2. Distribution of the parameter point estimate
Constant in Confidence Interval Confidence Level Constant For CI for μ and for CI for p, constant comes from the Normal distribution.
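As a small sketch of where the constant comes from, the two-sided Normal constant for a given confidence level can be computed with Python's standard library. The function name `normal_constant` is hypothetical.

```python
from statistics import NormalDist


def normal_constant(confidence_level):
    """Two-sided standard Normal constant for a confidence level such as 0.95."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    # Leave (1 - confidence_level) / 2 in each tail of the distribution.
    return NormalDist().inv_cdf((1 + confidence_level) / 2)


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.99):
        print(f"{level:.0%} confidence: constant = {normal_constant(level):.3f}")
```

This gives approximately 1.645, 1.960, and 2.576 for 90%, 95%, and 99% confidence, so a higher confidence level widens the interval for the same data.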
Margin of Error Margin of Error (MOE) = constant × standard error
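The sketch below puts the pieces together for a single worked number. It assumes the usual standard errors (standard deviation divided by √n for a mean, and √(p(1 − p)/n) for a proportion); the function names and the input values are illustrative only and are not taken from the case study.

```python
import math


def se_mean(sd, n):
    """Standard error of a sample mean."""
    return sd / math.sqrt(n)


def se_proportion(p, n):
    """Standard error of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)


def margin_of_error(constant, standard_error):
    """Margin of error = constant x standard error."""
    return constant * standard_error


if __name__ == "__main__":
    constant_95 = 1.96  # Normal constant for 95% confidence
    print("MOE for a mean:", margin_of_error(constant_95, se_mean(sd=21, n=100)))
    print("MOE for a proportion:", margin_of_error(constant_95, se_proportion(p=0.5, n=100)))
```

Because n sits under a square root, halving the margin of error requires roughly four times as many subjects.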
Effects on Sample Size Standard deviation ↑: n ↑ -or- p close to 0.5: n ↑. Confidence level ↑: n ↑. Margin of error ↑: sample size (n) ↓.
Effects on Sample Size: Example (Confidence Interval for μ)
Effects on Sample Size: Example (Confidence Interval for p)
Confidence Intervals: Steps for Selecting Sample Size 1. Set value for confidence level 2. Select appropriate confidence interval 3. Determine desired margin of error 4. Estimate sample size
Sample Size: Confidence Interval for μ See Handout Table 3
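Solving MOE = constant × σ/√n for n gives n = (constant × σ / MOE)². The sketch below applies that rearrangement, assuming a Normal constant and a planning value for the standard deviation; the function name `n_for_mean_ci` and the example inputs are hypothetical, and the handout table may be built differently.

```python
import math
from statistics import NormalDist


def n_for_mean_ci(sd, margin_of_error, confidence_level=0.95):
    """Approximate n so that a CI for a mean has the desired margin of error."""
    if sd <= 0 or margin_of_error <= 0:
        raise ValueError("sd and margin_of_error must be positive")
    constant = NormalDist().inv_cdf((1 + confidence_level) / 2)
    return math.ceil((constant * sd / margin_of_error) ** 2)


if __name__ == "__main__":
    # Illustrative planning values only.
    print("MOE of 5:", n_for_mean_ci(sd=21, margin_of_error=5))
    print("MOE of 2.5:", n_for_mean_ci(sd=21, margin_of_error=2.5))
```

With a planning standard deviation of 21, a margin of error of 5 at 95% confidence needs 68 subjects, and tightening the margin to 2.5 needs 272.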
Sample Size: Confidence Interval for p See Handout Table 6
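The same rearrangement for a proportion gives n = constant² × p(1 − p) / MOE². The sketch below assumes a Normal constant and a planning guess for p; using p = 0.5 is the conservative choice because it maximizes p(1 − p). The function name `n_for_proportion_ci` is hypothetical, and the handout table may use a different method.

```python
import math
from statistics import NormalDist


def n_for_proportion_ci(p, margin_of_error, confidence_level=0.95):
    """Approximate n so that a CI for a proportion has the desired margin of error."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if margin_of_error <= 0:
        raise ValueError("margin_of_error must be positive")
    constant = NormalDist().inv_cdf((1 + confidence_level) / 2)
    return math.ceil(constant ** 2 * p * (1 - p) / margin_of_error ** 2)


if __name__ == "__main__":
    print("p = 0.5, MOE = 0.05:", n_for_proportion_ci(0.5, 0.05))
    print("p = 0.1, MOE = 0.05:", n_for_proportion_ci(0.1, 0.05))
```

For a margin of error of 0.05 at 95% confidence, a planning value of p = 0.5 needs 385 subjects, while p = 0.1 needs 139.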