Exercise 1
You have a clinical study in which 10 patients will get either the standard treatment or a new treatment.
Randomize which 5 of the 10 get the new treatment so that all possible combinations can result. Use Excel or R or another formal randomization method.
Instead, randomize so that in each pair of patients entered by date, one has the standard and one the new treatment (blocked randomization).
What are the advantages of each method?
Why is randomization important?
April 2, 2013 SPH 247 Statistical Analysis of Laboratory Data
Exercise 1 Solutions
For the first situation, using Excel, you can (for example) put the numbers 1–10 in column A, put five “Treatment” and five “Standard” in column B, and put =RAND() in column C. Then fix the numbers in column C (for example, by pasting them back as values so they no longer recalculate) and sort columns B and C by the random numbers.
For the second situation, you can put random numbers in column B and fix them, then in cell C2 (assuming a header row) put =IF(B2<B3,"A","B") and in cell C3 put =IF(B3<B2,"A","B"). Then copy the pair of cells to C4, C6, C8, and C10. Or use “Treatment” and “Standard” instead of “A” and “B”.
There are many other ways to do this.
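The same two randomizations can be sketched in R with `sample()`. This is only one possible approach: the seed, the data frame names `complete` and `blocked`, and the labels "New" and "Standard" are illustrative assumptions, not part of the exercise.

```r
set.seed(247)  # hypothetical seed, used only so the sketch is reproducible

patients <- 1:10

# Complete randomization: shuffle five "New" and five "Standard" labels,
# so every choice of 5 patients out of 10 is equally likely.
complete <- data.frame(
  Patient   = patients,
  Treatment = sample(rep(c("New", "Standard"), each = 5))
)

# Blocked randomization: within each consecutive pair of patients,
# the two treatments are placed in random order.
blocked <- data.frame(
  Patient   = patients,
  Block     = rep(1:5, each = 2),
  Treatment = as.vector(replicate(5, sample(c("New", "Standard"))))
)

print(complete)
print(blocked)
```

In the blocked version every pair contains exactly one patient on each treatment, which is what the pair of IF() formulas achieves in Excel.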
Columns: Patient, Random, Treatment
Treatment column, Before Sorting: A A A A A B B B B B
Treatment column, After Sorting:  A B B A A B A B A B
Columns: Patient, Random, Treatment
Treatment   Formula
A           IF(C2<C3,"A","B")
B           IF(C3<C2,"A","B")
B           IF(C4<C5,"A","B")
A           IF(C5<C4,"A","B")
A           IF(C6<C7,"A","B")
B           IF(C7<C6,"A","B")
A           IF(C8<C9,"A","B")
B           IF(C9<C8,"A","B")
B           IF(C10<C11,"A","B")
A           IF(C11<C10,"A","B")
Comments
Randomization is important to avoid bias, approximately balance covariates, and provide a basis for analysis.
Blocked randomization can better control for time effects, but this particular version risks unblinding the next patient: with blocks of two, once the treatment of the first patient in a pair is known, the treatment of the second is determined.
Exercise 2
Analyze the testosterone levels from Rosner’s endocrin data set in the same way as we did for the estradiol levels, using anova(lm()).
Reanalyze estradiol and testosterone using lme() and verify that the results are the same. Here is the specification:
lme(Testosterone ~ 1, random = ~1 | Subject,data=endocrin)
Repeat the analysis of the coop data for specimens 2 and 5 separately. Do the analysis both with the traditional ANOVA tables using lm() and with lme() and compare the results.
> anova(lm(Testosterone ~ Subject,data=endocrin))
Analysis of Variance Table

Response: Testosterone
          Df Sum Sq Mean Sq F value Pr(>F)
Subject                                 **
Residuals
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Replication error variance is 6.484, so the standard deviation of replicates is 2.55 pg/mL.
This compares to average levels across subjects from … to ….
Estimated variance across subjects is (MS_Subject − 6.484)/2, where MS_Subject is the Subject mean square from the table and 2 is the number of replicates per subject.
Standard deviation across subjects is 8.26 pg/mL.
If we average the replicates, we get five values, the standard deviation of which is …. This contains both the subject and replication variance.
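The mean-square arithmetic for a one-way layout with replicates can be sketched in R on simulated data. This is a minimal illustration under stated assumptions: the data frame `sim`, its column `Level`, the seed, and the simulation parameters are hypothetical stand-ins and are not Rosner's endocrin data.

```r
set.seed(1)  # hypothetical seed, for reproducibility only

# Simulated stand-in: 5 subjects, 2 replicates each (made-up values).
n_subj <- 5
n_rep  <- 2
subject_mean <- rnorm(n_subj, mean = 25, sd = 8)
sim <- data.frame(
  Subject = factor(rep(1:n_subj, each = n_rep)),
  Level   = rep(subject_mean, each = n_rep) + rnorm(n_subj * n_rep, sd = 2.5)
)

tab <- anova(lm(Level ~ Subject, data = sim))
ms_subject  <- tab["Subject", "Mean Sq"]
ms_residual <- tab["Residuals", "Mean Sq"]

# Replication variance is the residual mean square.
var_rep <- ms_residual
# Between-subject variance: (MS_Subject - MS_Residual) / replicates per subject.
var_subj <- (ms_subject - ms_residual) / n_rep

# The variance of the per-subject averages equals MS_Subject / n_rep,
# i.e. var_subj + var_rep / n_rep, so it mixes both sources of variation.
subj_means <- tapply(sim$Level, sim$Subject, mean)

cat("sd of replicates:     ", sqrt(var_rep), "\n")
cat("sd across subjects:   ", sqrt(max(var_subj, 0)), "\n")
cat("sd of subject averages:", sd(subj_means), "\n")
```

The last quantity is always at least as large as the estimated standard deviation across subjects, because averaging two replicates halves the replication variance but does not remove it.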
> lme(Estradiol ~ 1, random = ~1 | Subject,data=endocrin)
Linear mixed-effects model fit by REML
  Data: endocrin
  Log-restricted-likelihood:
  Fixed: Estradiol ~ 1
(Intercept)

Random effects:
 Formula: ~1 | Subject
        (Intercept) Residual
StdDev:

Number of Observations: 10
Number of Groups: 5

> lme(Testosterone ~ 1, random = ~1 | Subject,data=endocrin)
Linear mixed-effects model fit by REML
  Data: endocrin
  Log-restricted-likelihood:
  Fixed: Testosterone ~ 1
(Intercept)
       25.3

Random effects:
 Formula: ~1 | Subject
        (Intercept) Residual
StdDev:

Number of Observations: 10
Number of Groups: 5
Specimen 2
> anova(lm(Conc ~ Lab + Lab:Bat,data=coop,sub=Spc=="S2"))
Analysis of Variance Table

Response: Conc
          Df Sum Sq Mean Sq F value Pr(>F)
Lab                                   e-14 ***
Lab:Bat                               e-08 ***
Residuals

The test for batch-in-lab is correct, but the test for lab is not: the denominator should be the Lab:Bat MS, so F(5,12) = MS_Lab / MS_Lab:Bat, which is still significant.
The residual variance is the Residuals mean square (sd = 0.0787).
The estimated batch variance within labs is (MS_Lab:Bat − MS_Residuals)/2.
The estimated lab variance is (MS_Lab − MS_Lab:Bat)/(3×2).
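The corrected test for Lab and the variance-component formulas can be sketched in R as follows. The mean squares below are hypothetical placeholders chosen for illustration, not the values from the coop analysis; only the degrees of freedom (5 and 12) and the divisors (2 replicates per batch, 3 batches per lab) come from the slide.

```r
# Hypothetical mean squares, for illustration only (not the coop results).
ms_lab <- 2.0    # Lab mean square
ms_bat <- 0.25   # Lab:Bat mean square
ms_res <- 0.01   # Residuals mean square

df_lab <- 5      # numerator df for the Lab test, as in F(5,12)
df_bat <- 12     # denominator df: Lab:Bat, not Residuals
n_rep  <- 2      # replicates per batch
n_bat  <- 3      # batches per lab

# Correct F test for Lab: Lab:Bat mean square in the denominator.
f_lab <- ms_lab / ms_bat
p_lab <- pf(f_lab, df_lab, df_bat, lower.tail = FALSE)

# Method-of-moments variance components.
var_res <- ms_res
var_bat <- (ms_bat - ms_res) / n_rep
var_lab <- (ms_lab - ms_bat) / (n_bat * n_rep)

cat("F(5,12) for Lab:", f_lab, " p =", p_lab, "\n")
cat("sd residual:", sqrt(var_res),
    " sd batch:", sqrt(var_bat),
    " sd lab:", sqrt(var_lab), "\n")
```

Substituting the mean squares from the actual anova() table for the three placeholders reproduces the calculation described on the slide.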
> lme(Conc ~1, random = ~1 | Lab/Bat,sub=Spc=="S2",data=coop)
Linear mixed-effects model fit by REML
  Data: coop
  Subset: Spc == "S2"
  Log-restricted-likelihood:
  Fixed: Conc ~ 1
(Intercept)

Random effects:
 Formula: ~1 | Lab
        (Intercept)
StdDev:

 Formula: ~1 | Bat %in% Lab
        (Intercept) Residual
StdDev:

Number of Observations: 36
Number of Groups:
         Lab Bat %in% Lab
           6           18

The standard deviations for Lab, Batch, and Residual exactly match those from the lm() analysis.
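The agreement between the ANOVA-based and lme() standard deviations can be checked on simulated data with the same balanced layout (6 labs, 3 batches per lab, 2 replicates per batch). This is a sketch under stated assumptions: the data frame `design`, the seed, and the simulation parameters are hypothetical and are not the coop data, and the extraction from `VarCorr()` assumes the usual printed layout with a "StdDev" column. The match holds in a balanced design when the ANOVA estimates are positive.

```r
library(nlme)  # recommended package shipped with standard R installations

set.seed(2013)  # hypothetical seed, for reproducibility only

# Simulated balanced nested design (made-up values, not the coop data).
n_lab <- 6
n_bat <- 3
n_rep <- 2
design <- expand.grid(Rep = 1:n_rep,
                      Bat = factor(1:n_bat),
                      Lab = factor(1:n_lab))
lab_eff <- rnorm(n_lab, sd = 3)
bat_eff <- rnorm(n_lab * n_bat, sd = 1)
cell    <- as.integer(interaction(design$Bat, design$Lab))  # one index per batch-in-lab
design$Conc <- 10 + lab_eff[as.integer(design$Lab)] + bat_eff[cell] +
  rnorm(nrow(design), sd = 0.5)

# Traditional ANOVA: rows are Lab, Lab:Bat, Residuals in that order.
ms <- anova(lm(Conc ~ Lab + Lab:Bat, data = design))[["Mean Sq"]]
anova_sd <- sqrt(c(Lab      = (ms[1] - ms[2]) / (n_bat * n_rep),
                   Batch    = (ms[2] - ms[3]) / n_rep,
                   Residual = ms[3]))

# REML fit with batches nested in labs.
fit <- lme(Conc ~ 1, random = ~ 1 | Lab/Bat, data = design)
vc  <- VarCorr(fit)
lme_sd <- suppressWarnings(as.numeric(vc[, "StdDev"]))
lme_sd <- lme_sd[!is.na(lme_sd)]  # drop the header rows; leaves Lab, Batch, Residual

print(rbind(anova = anova_sd, lme = lme_sd))
```

If one of the ANOVA differences were negative, the moment estimate would be undefined while REML would report a standard deviation near zero, so the two methods would no longer agree.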
Specimen 5
> anova(lm(Conc ~ Lab + Lab:Bat,data=coop,sub=Spc=="S5"))
Analysis of Variance Table

Response: Conc
          Df Sum Sq Mean Sq F value Pr(>F)
Lab                                   e-08 ***
Lab:Bat                                    ***
Residuals
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The test for batch-in-lab is correct, but the test for lab is not: the denominator should be the Lab:Bat MS, so F(5,12) = MS_Lab / MS_Lab:Bat, which is still significant.
The residual variance is 0.0967 (sd = 0.3110).
The estimated batch variance within labs is (MS_Lab:Bat − 0.0967)/2.
The estimated lab variance is (MS_Lab − MS_Lab:Bat)/(3×2).
> lme(Conc ~1, random = ~1 | Lab/Bat,sub=Spc=="S5",data=coop)
Linear mixed-effects model fit by REML
  Data: coop
  Subset: Spc == "S5"
  Log-restricted-likelihood:
  Fixed: Conc ~ 1
(Intercept)

Random effects:
 Formula: ~1 | Lab
        (Intercept)
StdDev:

 Formula: ~1 | Bat %in% Lab
        (Intercept) Residual
StdDev:

Number of Observations: 36
Number of Groups:
         Lab Bat %in% Lab
           6           18

The standard deviations for Lab, Batch, and Residual exactly match those from the lm() analysis.