Fitting Univariate Models to Continuous and Categorical Data September 9, 2014 Elizabeth Prom-Wormley & Hermine Maes ecpromwormle@vcu.edu 804-828-8154
Patterns of Twin Correlation ??? rMZ = 2rDZ Additive DZ twins on average share 50% of additive effects rMZ = rDZ Shared Environment rMZ > 2rDZ Additive & Dominance DZ twins on average share 25% of dominance effects rDZ > ½ rMZ Additive & “Shared Environment” A = 2(rMZ-rDZ) C = 2rDZ – rMZ E = 1- rMZ
BMI Twin Correlations A = 2(rMZ-rDZ) C = 2rDZ - rMZ E = 1- rMZ 0.30 0.78 A = 2(rMZ-rDZ) C = 2rDZ - rMZ E = 1- rMZ A = 0.96 C = -0.20 E = 0.22 ADE or ACE?
Univariate Analysis A Roadmap 1- Use the data to test basic assumptions inherent to standard ACE (ADE) models Saturated Model 2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype ADE or ACE Models 3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions AE or E Only Models
ADE Model- A Few Considerations Dominance (D) refers to non-additive genetic effects resulting from interactions between alleles at the same locus ADE models also include effects of interactions between different loci (epistasis) In addition to sharing on average 50% of their DNA, DZ twins also share about 25% of effects due to dominance
ADE Model Deconstructed Path Coefficients covA <- mxAlgebra( expression=a %*% t(a), name="A" ) covD <- mxAlgebra( expression=d %*% t(d), name=“D" ) covE <- mxAlgebra( expression=e %*% t(e), name="E" ) a 1 x 1 matrix d 1 x 1 matrix e 1 x 1 matrix
ADE Model Deconstructed Variance Components covA <- mxAlgebra( expression=a %*% t(a), name="A" ) covD <- mxAlgebra( expression=d %*% t(d), name=“D" ) covE <- mxAlgebra( expression=e %*% t(e), name="E" ) a aT * 1 x 1 matrix d dT * 1 x 1 matrix e eT * 1 x 1 matrix
ADE Model Deconstructed Means & Covariances meanG <- mxMatrix( type="Full", nrow=1, ncol=ntv, free=TRUE, values= 20, label="mean", name="expMean" ) covMZ <- mxAlgebra( expression= rbind( cbind (A+D+E , A+D), cbind(A+D , A+D+E)), name="expCovMZ" ) covDZ <- mxAlgebra( expression= rbind( cbind(A+D+E, 0.5%x%A+0.25%x%D), cbind(0.5%x%A+0.25%x%D , A+D+E)), name="expCovDZ" ) expMean expMean= 1 x 2 matrix T1 T2 T1 A+D+E T2 covMZ & covDZ= 2 x 2 matrix
ADE Model Deconstructed Means & Covariances covMZ <- mxAlgebra( expression= rbind( cbind(A+D+E , A+D), cbind(A+D , A+D+E)), name="expCovMZ" ) covDZ <- mxAlgebra( expression= rbind( cbind(A+D+E, 0.5%x%A+0.25%x%D), cbind(0.5%x%A+0.25%x%D, A+0.25%x%D+E)), name="expCovDZ" ) T2 T1 T2 T1 A+D+E 0.5A T2
ADE Model Deconstructed Means & Covariances
ADE Model Deconstructed 4 Parameters Estimated ExpMean Variance due to A Variance due to D Variance due to E
Univariate Analysis A Roadmap 1- Use the data to test basic assumptions inherent to standard ACE (ADE) models Saturated Model 2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype ADE or ACE Models 3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions AE or E Only Models
Testing Specific Questions Please Open twinADEcon.R ‘Full’ ADE Model Comparison against Saturated Nested Models AE Model Test significance of D E Model vs AE Model Test significance of A E Model vs ADE Model Test combined significance of A & D
AE Model Deconstructed New Object Containing AE Model Object containing Full Model AeModel <- mxModel( AdeFit, name="AE" ) The d path will not be estimated AeModel <- omxSetParameters( AeModel, labels=“d11", free=FALSE, values=0) AeFit <- mxRun(AeModel) Fixing the value of the c path to zero
Goodness of Fit Stats ep -2ll df AIC diff -2ll diff p Saturated ADE AE no! E
Goodness-of-Fit Statistics ep -2ll df AIC diff -2ll diff p Saturated 10 4055.93 1767 521.93 - ADE 4 4063.45 1773 517.45 7.52 6 0.28 AE 3 4067.66 1774 519.66 4.21 1 0.06 E 2 4591.79 1775 1041.79 528.3 0.00
What about the magnitudes of genetic and environmental contributions to BMI?
Univariate Analysis A Roadmap 1- Use the data to test basic assumptions inherent to standard ACE (ADE) models Saturated Model 2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype ADE or ACE Models 3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions AE or E Only Models
Estimated Values a d e c a2 d2 e2 c2 ADE - AE E
Estimated Values a d e c a2 d2 e2 ADE 0.57 0.54 0.41 - 0.37 0.22 AE 0.78 0.42 E 0.88 1.00
Potential Complications Assortative Mating Gene-Environment Correlation Gene-Environment Interaction Sex Limitation Gene-Age Interaction
Trying Again with Ordinal Data Open twinAceOrdS.R
Okay, so we have a sense of which model best explains the data Okay, so we have a sense of which model best explains the data...or do we?
BMI Twin Correlations A = 2(rMZ-rDZ) C = 2rDZ - rMZ E = 1- rMZ 0.30 0.78 A = 2(rMZ-rDZ) C = 2rDZ - rMZ E = 1- rMZ ADE or ACE?
ACE/ADE Conclusions- Problem Set 1 Due Tuesday, 9/16 at 5pm Modify twinAdeCon.R and modify to reflect an ACE model BMI (zygosity groups 1 and 3) Are additive genetic factors significant? Are dominance factors significant? Are specific environmental factors significant? Are shared environmental factors significant? Provide results (as tables), provide brief explanations and guide me through your tables Can also send script used to produce result
Goodness-of-Fit Statistics Model ep -2ll df AIC diff -2ll diff p Saturated ADE ACE
Estimated Values a d e c a2 d2 e2 c2 ADE AE ACE E
Thank You!