**************** GCRC Research-Skills Workshop October 26, 2007 William D. Dupont Department of Biostatistics **************** How to do Power & Sample Size Calculations Part 2
Survival Analysis
Data: follow-up time and fate at exit
Statistic: log-rank test
Power: Schoenfeld & Richter, Biometrics 1982
[Figure: sample size against time, from time 0 through accrual time A and additional follow-up F]
Hemorrhage-free survival in patients with previous lobar intracerebral hemorrhage subdivided by apolipoprotein E genotype (O’Donnell et al. 2000).
Example: Hemorrhage-free survival and genotype
Control group = patients with an ε2 or ε4 allele
Pr[Type I error] = 0.05
Power = 0.8
Relative Risk (control/experiment) R = 2
Median survival time of controls $m_1$ = ?
Hemorrhage-free survival in patients with previous lobar intracerebral hemorrhage subdivided by apolipoprotein E genotype (O’Donnell et al. 2000).
Example: Hemorrhage-free survival and genotype
Control group = patients with an ε2 or ε4 allele
Pr[Type I error] = 0.05
Power = 0.8
Relative Risk (control/experiment) R = 2
Median survival time of controls $m_1$ = 38 months
Accrual A = 12 months
Additional follow-up F = 24 months
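The inputs above (R = 2, median control survival of 38 months, A = 12, F = 24, type I error 0.05, power 0.8) can be turned into a rough sample size with a Schoenfeld-type events formula. This is a minimal sketch, not necessarily the calculation behind the workshop's own numbers. It assumes exponential survival, uniform accrual, equal group sizes, and no loss to follow-up, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.8):
    """Approximate number of events for a two-sided log-rank test
    with equal allocation (Schoenfeld-type approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return 4 * (z_alpha + z_power) ** 2 / math.log(hazard_ratio) ** 2


def prob_event(median, accrual, follow_up):
    """Probability of an event before the study ends, assuming exponential
    survival and entry times uniform over the accrual period."""
    lam = math.log(2) / median
    surviving = (math.exp(-lam * follow_up)
                 - math.exp(-lam * (accrual + follow_up))) / (lam * accrual)
    return 1 - surviving


R, m1, A, F = 2.0, 38.0, 12.0, 24.0   # values from the example
m2 = R * m1                            # median survival, experimental group

events = required_events(R)
p_bar = (prob_event(m1, A, F) + prob_event(m2, A, F)) / 2
n_per_group = math.ceil(events / p_bar / 2)

print(f"events needed: {math.ceil(events)}, subjects per group: {n_per_group}")
```

Under these assumptions the required number of events depends only on R, the type I error, and the power. The accrual and follow-up times matter only through the probability that a subject's event is observed.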
What if we wanted to study a treatment of homozygous ε3/ε3 patients?
$t$ = some time
$S(t)$ = probability of survival at time $t$
$\lambda$ = hazard rate
If survival has an exponential distribution, $S(t) = e^{-\lambda t}$
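Median Survival
[Figure: survival curves for the experimental treatment and the control treatment]
$R = \lambda_1/\lambda_2$ = relative risk (hazard ratio) for controls relative to experimental subjects, where $\lambda_1$ and $\lambda_2$ are the hazards of control and experimental subjects
$m_2$ = median survival for experimental subjects
$m_1$ = median survival for controls
Under exponential survival $m = \log 2/\lambda$, so $R = \lambda_1/\lambda_2 = m_2/m_1$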
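The relation between median survival and the hazard ratio can be checked numerically. The sketch below assumes exponential survival and reuses the example values of a 38-month control median and R = 2. The helper names are illustrative.

```python
import math


def hazard_from_median(median):
    """Hazard rate of an exponential distribution with the given median."""
    return math.log(2) / median


def survival(t, hazard):
    """Exponential survival function S(t) = exp(-hazard * t)."""
    return math.exp(-hazard * t)


m1 = 38.0                      # median survival of controls (months)
R = 2.0                        # hazard ratio, controls relative to experimental

lam1 = hazard_from_median(m1)  # control hazard
lam2 = lam1 / R                # experimental hazard
m2 = math.log(2) / lam2        # median survival of experimental subjects

print(f"S(m1) for controls = {survival(m1, lam1):.3f}")
print(f"experimental median m2 = {m2:.1f} months")
```

Halving the hazard doubles the median, which is why R can be specified either as a ratio of hazards or as a ratio of medians.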
For t tests, power calculations for increased or decreased response relative to the control response are symmetric, i.e., the power to detect a true difference of $\delta$ is the same as the power to detect a true difference of $-\delta$. This is not true for survival analysis: the power to detect a two-fold increase in hazard does not equal the power to detect a 50% decrease in hazard.
[Figure: survival curves for Treatment 1, Treatment 2, and Control, annotated with the relative risks for Control vs. Treatment 1 and Control vs. Treatment 2]
The power to detect one treatment vs. control is greater than the power to detect the other treatment vs. control.
It is important to read the parameter definitions carefully.
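The asymmetry between a two-fold increase and a 50% decrease in hazard can be illustrated with an approximate power calculation for a fixed sample size. This sketch rests on the same simplifying assumptions as a Schoenfeld-type approximation (exponential survival, uniform accrual, equal groups, no loss to follow-up). The figure of 100 subjects per group is a hypothetical choice for illustration, and the small contribution of the opposite rejection tail is ignored.

```python
import math
from statistics import NormalDist


def prob_event(hazard, accrual, follow_up):
    """Probability of an observed event under exponential survival
    with entry times uniform over the accrual period."""
    surviving = (math.exp(-hazard * follow_up)
                 - math.exp(-hazard * (accrual + follow_up))) / (hazard * accrual)
    return 1 - surviving


def logrank_power(hazard_ratio, n_per_group, median_control,
                  accrual, follow_up, alpha=0.05):
    """Approximate power of a two-sided log-rank test; hazard_ratio is the
    control hazard divided by the experimental hazard."""
    lam_control = math.log(2) / median_control
    lam_exper = lam_control / hazard_ratio
    p_bar = (prob_event(lam_control, accrual, follow_up)
             + prob_event(lam_exper, accrual, follow_up)) / 2
    expected_events = 2 * n_per_group * p_bar
    z = NormalDist()
    shift = math.sqrt(expected_events / 4) * abs(math.log(hazard_ratio))
    return z.cdf(shift - z.inv_cdf(1 - alpha / 2))


n = 100  # hypothetical number of subjects per group
power_two = logrank_power(2.0, n, 38.0, 12.0, 24.0)   # treatment halves the hazard
power_half = logrank_power(0.5, n, 38.0, 12.0, 24.0)  # treatment doubles the hazard

print(f"R = 2:   power = {power_two:.2f}")
print(f"R = 0.5: power = {power_half:.2f}")
```

Both scenarios have the same value of $|\log R|$, but the treatment that doubles the hazard produces more events during the study, and so more power for the same number of subjects.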
Power Calculations for Linear Regression
[Figure: birthweight (g/100) against estriol (mg/24 hr); Rosner Table 11.1, Am J Obs Gyn 1963;85:1-9]
$s_y$ = standard deviation of the y variable
$s_x$ = standard deviation of the x variable; here $s_x = 4.75$
$\sigma$ = standard deviation of the regression errors, estimated by $s = \sqrt{\mathrm{MSE}}$, the root mean squared error
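[Figure: scatter plots illustrating $r = -1$, $r = 0$, $0 < r < 1$, and $r = 1$]
c) The correlation coefficient $\rho$ is estimated by the sample correlation coefficient $r$ {1.2}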
Measures of association between continuous variables Correlation vs. linear regression
[Figure: response variable y against independent variable x for Treatments A and B. Both treatments have identical values of one parameter; the two sets of plotted values are "x = 6, y = 20" and "x = 2, y = 5".]
[Figure: patient response against treatment dose level, showing the regression line, an observation $(x_j, y_j)$, the $j$th regression error, and the change in response per 1 unit of dosage]
c) The slope parameter $\beta$ is estimated by $b = r\,s_y/s_x$
[Figure: scatter plots illustrating $r = -1$, $r = 0$, $0 < r < 1$, and $r = 1$]
The slope $\beta$ is estimated by $b = r\,s_y/s_x$
Studies can be either experimental or observational
The independent variable x is assumed to be normally distributed unless the investigator chooses its level. In an experimental study, the treatment level is chosen by the investigator.
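Estimating $\sigma$ directly is often difficult. If we can estimate $r$, then $s = s_y\sqrt{1 - r^2}$, or equivalently $s = \sqrt{s_y^2 - b^2 s_x^2}$, since $b = r\,s_y/s_x$.
Warning: If the anticipated value of $\sigma_x$ in your experiment is different from that found in the literature, then your value of $r$ will also be different.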
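The standard relations among the slope, the correlation, and the error standard deviation in simple linear regression ($b = r\,s_y/s_x$ and $s_y^2 = b^2 s_x^2 + s^2$) can be sketched as follows. Only $s_x = 4.75$ comes from the slides. The values of `s_y` and `r` below are hypothetical, chosen purely for illustration, and the function names are not from any particular package.

```python
import math


def slope_from_r(r, s_x, s_y):
    """Slope estimate b = r * s_y / s_x."""
    return r * s_y / s_x


def error_sd_from_r(r, s_y):
    """Error standard deviation s = s_y * sqrt(1 - r^2)."""
    return s_y * math.sqrt(1 - r ** 2)


def error_sd_from_slope(b, s_x, s_y):
    """Equivalent form s = sqrt(s_y^2 - b^2 * s_x^2)."""
    return math.sqrt(s_y ** 2 - (b * s_x) ** 2)


def r_from_slope(b, s_x, s):
    """Correlation implied by a slope b, spread s_x of x, and error SD s."""
    return b * s_x / math.sqrt((b * s_x) ** 2 + s ** 2)


s_x = 4.75   # from the slides
s_y = 4.6    # hypothetical value, for illustration only
r = 0.6      # hypothetical value, for illustration only

b = slope_from_r(r, s_x, s_y)
s = error_sd_from_r(r, s_y)

print(f"b = {b:.3f}, s = {s:.3f}")
print(f"r with the same s_x:  {r_from_slope(b, s_x, s):.3f}")
print(f"r with twice the s_x: {r_from_slope(b, 2 * s_x, s):.3f}")
```

The last two lines illustrate the warning: with the slope and error standard deviation held fixed, a wider spread of x values gives a larger correlation, so a literature value of $r$ only transfers to a new study if $\sigma_x$ is similar.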
[Figure: response variable y against independent variable x]
Relationship between BMI and exercise time
n = 100 women willing to follow a diet and exercise program for six months
Interquartile range (IQR) of exercise time = 15 minutes (pilot data)
$\sigma_x = (\mathrm{IQR}/2)/z_{0.25} = 7.5/0.6745 = 11.1$ minutes
$\sigma_y = 4.0$ kg/m² = standard deviation of BMI for women obtained by Kuskowska-Wolk et al. Int J Obes 1992;16:1-9
Would like to detect a true drop in BMI of $\beta = -2/30 = -0.0667$ kg/m² per minute of exercise (1/2 hour of exercise per day induces a drop of 2 kg/m² over 6 months)
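An approximate power for this design can be obtained by converting the slope to a correlation, $\rho = \beta\,\sigma_x/\sigma_y$, and applying Fisher's z approximation. This is a swapped-in large-sample approximation, not necessarily the method used to produce the workshop's results. It takes $\sigma_x \approx 11.12$ minutes (7.5 divided by $z_{0.25} \approx 0.6745$), $\sigma_y = 4.0$ kg/m², a slope of $-2/30$ kg/m² per minute, n = 100, and assumes a two-sided type I error of 0.05.

```python
import math
from statistics import NormalDist


def slope_power(slope, sigma_x, sigma_y, n, alpha=0.05):
    """Approximate power to detect a nonzero regression slope, using
    Fisher's z transformation of the implied correlation."""
    rho = slope * sigma_x / sigma_y
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(math.atanh(rho)) * math.sqrt(n - 3)
    # Probability of rejecting in either tail of the two-sided test.
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


slope = -2 / 30    # kg/m^2 per minute of daily exercise
sigma_x = 11.12    # SD of exercise time (minutes), from IQR = 15
sigma_y = 4.0      # SD of BMI (kg/m^2)

power = slope_power(slope, sigma_x, sigma_y, n=100)
print(f"approximate power with n = 100: {power:.2f}")
```

Because power depends on the slope only through $\rho$, a study with a wider spread of exercise times would have more power to detect the same slope.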
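In general, when x is normally distributed with interquartile range IQR, $\sigma_x = (\mathrm{IQR}/2)/z_{0.25}$, where $z_{0.25}$ is the upper quartile of the standard normal distribution.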
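The conversion from an interquartile range to a standard deviation can be sketched with the standard library alone. It assumes the variable is normally distributed, and the function name is illustrative.

```python
from statistics import NormalDist


def sd_from_iqr(iqr):
    """Standard deviation of a normal variable with the given
    interquartile range: (IQR / 2) / z_0.25."""
    z_upper_quartile = NormalDist().inv_cdf(0.75)  # about 0.6745
    return (iqr / 2) / z_upper_quartile


sigma_x = sd_from_iqr(15.0)  # IQR of exercise time in minutes (pilot data)
print(f"sigma_x = {sigma_x:.2f} minutes")
```

The estimate is only as good as the normality assumption. For a skewed variable the IQR-based value can differ noticeably from the sample standard deviation.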
Impaired Antibody Response to Pneumococcal Vaccine after Treatment for Hodgkin’s Disease
Siber et al. N Engl J Med 1978;299:
n = 17 patients treated with subtotal radiation and vaccinated 8 to 51 months later
A linear regression of log antibody response against time from radiation to vaccination was fitted to these data.
Suppose we wanted to determine the sample size for a new study with patients randomized to vaccination at 10, 30, or 50 weeks.
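When the investigator chooses the treatment levels, the standard deviation of x follows from the design rather than from pilot data. The sketch below assumes patients are allocated in equal proportions to the three vaccination times and treats those proportions as the distribution of x, so the variance is divided by the number of levels rather than by one less. The function name and the equal-allocation assumption are illustrative.

```python
import math


def design_sd(levels, proportions):
    """Standard deviation of x when each level is assigned to the
    given proportion of subjects."""
    mean = sum(p * x for x, p in zip(levels, proportions))
    variance = sum(p * (x - mean) ** 2 for x, p in zip(levels, proportions))
    return math.sqrt(variance)


weeks = [10, 30, 50]
equal = [1 / 3, 1 / 3, 1 / 3]   # assumed equal allocation

sigma_x = design_sd(weeks, equal)
print(f"sigma_x for the planned design = {sigma_x:.2f} weeks")
```

Putting more patients at the extreme levels increases $\sigma_x$ and therefore the power to detect a given slope, at the cost of less information about departures from linearity.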
Comparing Slopes of Two Linear Regression Lines
Armitage and Berry (1994) gave age and pulmonary vital capacity for 28 cadmium industry workers with > 10 years of exposure and 44 workers with no exposure.
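Separate regressions were fitted for the unexposed and the exposed workers, with a pooled error variance.
How many workers do we need to detect a difference in slopes, with a ratio of unexposed to exposed workers of m = 44/28 = 1.57?
We need 167 exposed workers and 167 × 1.57 = 262 unexposed workers.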