Elizabeth Prom-Wormley & Hermine Maes

Slides:



Advertisements
Similar presentations
15 The Genetic Basis of Complex Inheritance
Advertisements

Sex-limitation models Karri Silventoinen. Testing sex-specific genetic effect It is possible to test whether the size of genetic and environmental variations.
Bivariate analysis HGEN619 class 2007.
Fitting Bivariate Models October 21, 2014 Elizabeth Prom-Wormley & Hermine Maes
The use of Cholesky decomposition in multivariate models of sex-limited genetic and environmental effects Michael C. Neale Virginia Institute for Psychiatric.
ACE estimates under different sample sizes Benjamin Neale Michael Neale Boulder 2006.
Univariate Model Fitting
Sex-limitation Models Brad Verhulst, Elizabeth Prom-Wormley (Sarah, Hermine, and most of the rest of the faculty that has contributed bits and pieces to.
The Inheritance of Complex Traits
Estimating “Heritability” using Genetic Data David Evans University of Queensland.
Multivariate Genetic Analysis: Introduction(II) Frühling Rijsdijk & Shaun Purcell Wednesday March 6, 2002.
Extended sibships Danielle Posthuma Kate Morley Files: \\danielle\ExtSibs.
Heterogeneity Boulder, Heterogeneity Questions Are the contributions of genetic and environmental factors equal for different groups, such as sex,
Chapter 5 Human Heredity by Michael Cummings ©2006 Brooks/Cole-Thomson Learning Chapter 5 Complex Patterns of Inheritance.
Summarizing Data Nick Martin, Hermine Maes TC21 March 2008.
Heterogeneity Hermine Maes TC19 March Files to Copy to your Computer Faculty/hmaes/tc19/maes/heterogeneity  ozbmi.rec  ozbmi.dat  ozbmiysat(4)(5).mx.
Heterogeneity II Danielle Dick, Tim York, Marleen de Moor, Dorret Boomsma Boulder Twin Workshop March 2010.
Univariate Analysis Hermine Maes TC19 March 2006.
Gene x Environment Interactions Brad Verhulst (With lots of help from slides written by Hermine and Liz) September 30, 2014.
Mx Practical TC18, 2005 Dorret Boomsma, Nick Martin, Hermine H. Maes.
Raw data analysis S. Purcell & M. C. Neale Twin Workshop, IBG Colorado, March 2002.
Quantitative Genetics
Karri Silventoinen University of Helsinki Osaka University.
A B C D (30) (44) (28) (10) (9) What is a gene? A functional group of DNA molecules (nucleotides) that is responsible for the production of a protein.
“Beyond Twins” Lindon Eaves, VIPBG, Richmond Boulder CO, March 2012.
Multifactorial Traits
Karri Silventoinen University of Helsinki Osaka University.
Continuously moderated effects of A,C, and E in the twin design Conor V Dolan & Sanja Franić (BioPsy VU) Boulder Twin Workshop March 4, 2014 Based on PPTs.
Introduction to Multivariate Genetic Analysis (2) Marleen de Moor, Kees-Jan Kan & Nick Martin March 7, 20121M. de Moor, Twin Workshop Boulder.
1 Phenotypic Variation Variation of a trait can be separated into genetic and environmental components Genotypic variance  g 2 = variation in phenotype.
Spousal Associations for Alcohol Dependence and Educational Attainment Andrew Williams University of North Carolina Support from NIH Grants: AA07728, AA11998,
Chapter 4 Modern Genetics Thursday, December 10, 2009 Pages
Heterogeneity Hermine Maes TC21 March Files to Copy to your Computer Faculty/hmaes/tc19/maes/heterogeneity  ozbmi.rec  ozbmi.dat  ozbmiysat(4)(5).mx.
Univariate modeling Sarah Medland. Starting at the beginning… Data preparation – The algebra style used in Mx expects 1 line per case/family – (Almost)
Variation in Human Mate Choice: Simultaneously Investigating Heritability, Parental Influence, Sexual Imprinting, and Assortative Mating By: Phillip Skaliy.
Genetic Analysis of Human Diseases Chapter Overview Due to thousands of human diseases having an underlying genetic basis, human genetic analysis.
Presented by Alicia Naegle Twin Studies. Important Vocabulary Monozygotic Twins (MZ)- who are identical twins Dizygotic Twins (DZ)- who are twins that.
Gene-Environment Interaction & Correlation Danielle Dick & Danielle Posthuma Leuven 2008.
Univariate Analysis Hermine Maes TC21 March 2008.
Genetic Epidemiology of Cancer and its Risk Factors Hermine Maes Cancer Control March 2006.
Mx modeling of methylation data: twin correlations [means, SD, correlation] ACE / ADE latent factor model regression [sex and age] genetic association.
Mx Practical TC20, 2007 Hermine H. Maes Nick Martin, Dorret Boomsma.
Model building & assumptions Matt Keller, Sarah Medland, Hermine Maes TC21 March 2008.
Welcome  Log on using the username and password you received at registration  Copy the folder: F:/sarah/mon-morning To your H drive.
Biological Approach Methods. Other METHODS of studying biological traits??? How else can you examine biological links to behaviour? Brain storm.
Methodology of the Biological approach
March 7, 2012M. de Moor, Twin Workshop Boulder1 Copy files Go to Faculty\marleen\Boulder2012\Multivariate Copy all files to your own directory Go to Faculty\kees\Boulder2012\Multivariate.
Categorical Data HGEN
Extended Pedigrees HGEN619 class 2007.
Genetic Explanation Continued…..
Bivariate analysis HGEN619 class 2006.
Univariate Twin Analysis
Introduction to Multivariate Genetic Analysis
Fitting Univariate Models to Continuous and Categorical Data
Gene-Environment Interaction & Correlation
Heterogeneity HGEN619 class 2007.
Univariate Analysis HGEN619 class 2006.
Describe one research method used to investigate schizophrenia (6)
METHODS of studying biological traits???
METHODS of studying biological traits???
Structural Equation Modeling
Genetics of qualitative and quantitative phenotypes
Sarah Medland & Nathan Gillespie
Lucía Colodro Conde Sarah Medland & Matt Keller
Bivariate Genetic Analysis Practical
ACE estimates under different sample sizes
63.1 – Discuss the evidence for a genetic influence on intelligence and explain what is meant by heritability. Nature vs. Nurture and Intelligence Early.
Heritability of Subjective Well-Being in a Representative Sample
Presentation transcript:

Elizabeth Prom-Wormley & Hermine Maes Univariate Twin Analysis- Saturated Models for Continuous and Categorical Data September 2, 2014 Elizabeth Prom-Wormley & Hermine Maes ecpromwormle@vcu.edu 804-828-8154

Overall Questions to be Answered Does the data satisfy the assumptions of the classical twin study? Does a trait of interest cluster among related individuals?

Family & Twin Study Designs Family Studies Classical Twin Studies Adoption Studies Extended Twin Studies

The Data Please open twinSatConECPW Fall2014.R 569 total MZ pairs 351 total DZ pairs Please open twinSatConECPW Fall2014.R Australian Twin Register 18-30 years old, males and females Work from this session will focus on Body Mass Index (weight/height2) in females only Sample size MZF = 534 complete pairs (zyg = 1) DZF = 328 complete pairs (zyg = 3)

A Quick Look at the Data

Classical Twin Studies Basic Background The Classical Twin Study (CTS) uses MZ and DZ twins reared together MZ twins share 100% of their genes DZ twins share on average 50% of their genes Expectation- Genetic factors are assumed to contribute to a phenotype when MZ twins are more similar than DZ twins

Classical Twin Study Assumptions MZ twins are genetically identical Equal Environments of MZ and DZ pairs

Basic Data Assumptions MZ and DZ twins are sampled from the same population, therefore we expect :- Equal means/variances in Twin 1 and Twin 2 Equal means/variances in MZ and DZ twins Further assumptions would need to be tested if we introduce male twins and opposite sex twin pairs

“Old Fashioned” Data Checking MZ DZ T1 T2 mean 21.35 21.34 21.45 21.46 variance 0.73 0.79 0.77 0.82 covariance(T1-T2) 0.59 0.25 Nice, but how can we actually be sure that these means and variances are truly the same?

Univariate Analysis A Roadmap 1- Use the data to test basic assumptions (equal means & variances for twin 1/twin 2 and MZ/DZ pairs) Saturated Model 2- Estimate contributions of genetic and environmental effects on the total variance of a phenotype ACE or ADE Models 3- Test ACE (ADE) submodels to identify and report significant genetic and environmental contributions AE or CE or E Only Models 10

Saturated Twin Model

Saturated Code Deconstructed meanMZ <- mxMatrix( type="Full", nrow=1, ncol=ntv, free=TRUE, values=meVals, labels=c("mMZ1","mMZ2"), name=”meanMZ" ) meanDZ <- mxMatrix( type="Full", nrow=1, ncol=ntv, free=TRUE, values=meVals, labels=c("mDZ1","mDZ2"), name=”meanDZ" ) mMZ1 mMZ2 mDZ1 mDZ2 mean MZ = 1 x 2 matrix mean DZ = 1 x 2 matrix

Saturated Code Deconstructed covMZ <- mxMatrix( type="Symm", nrow=ntv, ncol=ntv, free=TRUE, values=cvVals, lbound=lbVals, labels=c("vMZ1","cMZ21","vMZ2"), name=”covMZ" ) covDZ <- mxMatrix( type="Symm", nrow=ntv, ncol=ntv, free=TRUE, values=cvVals, lbound=lbVals, labels=c("vDZ1","cDZ21","vDZ2"), name=”covDZ" ) vMZ1 cMZ21 vMZ2 covMZ = 2 x 2 matrix T1 T2 vDZ1 cDZ21 vDZ2 covDZ = 2 x 2 matrix

Time to Play... Continue with the File twinSatConECPW Fall2014.R

Estimated Values T1 T2 Saturated Model mean MZ DZ cov 10 Total Parameters Estimated mMZ1, mMZ2, vMZ1,vMZ2,cMZ21 mDZ1, mDZ2, vDZ1,vDZ2,cDZ21 Standardize covariance matrices for twin pair correlations (covMZ & covDZ)

Estimated Values T1 T2 Saturated Model mean MZ 21.34 21.35 DZ 21.45 21.46 cov 0.73 0.77 0.59 0.79 0.24 0.82 10 Total Parameters Estimated mMZ1, mMZ2, vMZ1,vMZ2,cMZ21 mDZ1, mDZ2, vDZ1,vDZ2,cDZ21 Standardize covariance matrices for twin pair correlations (covMZ & covDZ)

Fitting Nested Models Saturated Model likelihood of data without any constraints fitting as many means and (co)variances as possible Equality of means & variances by twin order test if mean of twin 1 = mean of twin 2 test if variance of twin 1 = variance of twin 2 Equality of means & variances by zygosity test if mean of MZ = mean of DZ test if variance of MZ = variance of DZ

Estimated Values T1 T2 Equate Means & Variances across Twin Order mean MZ DZ cov Equate Means Variances across Twin Order & Zygosity

Estimates T1 T2 Equate Means & Variances across Twin Order mean MZ 21.35 DZ 21.45 cov 0.76 0.79 0.59 0.24 Equate Means Variances across Twin Order & Zygosity 21.39 0.78 0.61 0.23

Stats Model ep -2ll df AIC diff -2ll diff p Saturated 10 4055.93 1767 521.93 mT1=mT2 8 4056 1769 518 0.07 2 0.97 mT1=mT2 & varT1=VarT2 6 4058.94 1771 516.94 3.01 4 0.56 Zyg MZ=DZ 4063.45 1773 517.45 7.52 0.28 No significant differences between saturated model and models where means/variances/covariances are equal by zygosity and between twins

Working with Binary and Ordinal Data Elizabeth Prom-Wormley and Hermine Maes Special Thanks to Sarah Medland

Transitioning from Continuous Logic to Categorical Logic Ordinal data has 1 less degree of freedom compared to continuous data MZcov, DZcov, Prevalence No information on the variance Thinking about our ACE/ADE model 4 parameters being estimated A/ C/ E/ mean ACE/ADE model is unidentified without adding a constraint

Two Approaches to the Liability Threshold Model Traditional Maps data to a standard normal distribution Total variance constrained to be 1 Alternate Fixes an alternate parameter (usually E) Estimates the remaining parameters

Please open BinaryWarmUp.R Time to Look at the Data! Please open BinaryWarmUp.R

Observed Binary BMI is Imperfect Measure of Underlying Continuous Distribution Mean (bmiB2) = 0.39 SD (bmiB2) = 0.49 Prevalence “low” BMI = 60.6% We are interested in the liability of risk for being in the “high” BMI category

It’s Helpful to Rescale Raw Data (Unstandardized) mean=0.49, SD=0.39 -Data not mapped to a standard normal -No easy conversion to % -Difficult to compare between groups Since the scaling is now arbitrary Standard Normal (Standardized) mean=0, SD=1 Area under the curve between two z-values is interpreted as a probability or percentage

Binary Review Threshold calculated using the cumulative normal distribution (CND) -We used frequencies and inverse CND to do our own estimation of the threshold qnorm(0.816) = 0.90 - Threshold is the Z Value that corresponds with the proportion of the population having “low BMI”

Moving to Ordinal Data!

Getting a Feel for the Data Open twinSatOrd.R Calculate the frequencies of the 5 BMI categories for the second twins of the MZ pairs CrossTable(mzDataOrdF$bmi2)

Estimating MZ Twin 2 Thresholds by Hand T1 = qnorm(0.124) T1 = -1.155 T2 = qnorm(0.124 + 0.236) T2 = -0.358 T3 = qnorm(0.124 + 0.236 + 0.291) T3 =0.388 T4 = qnorm(0.124 + 0.236 + 0.291 + 0.175) T4 = 0.939 Estimate Twin Pair Correlations for the Liabilities Too!

Translating Back to the SEM Approach in OpenMx

Handling Ordinal Data in OpenMx 1- Determine the 1st threshold 2- Determine displacements between 1st threshold and subsequent thresholds 3- Add the 1st threshold and the displacement to obtain the subsequent thresholds 32

Ordinal Saturated Code Deconstructed Defining Threshold Matrices covMZ covDZ μ MZT1 μ MZT2 μ DZT1 μ DZT2 1 1 LT1 LT2 LT1 LT2 Variance Constraint 1 1 1 1 Threshold Model t1MZ1 t1MZ2 t2MZ1 t2MZ2 t3MZ1 t3MZ2 t4MZ1 t4MZ2 t1DZ1 t1DZ2 t2DZ1 t2DZ2 t3DZ1 t3DZ2 t4DZ1 t4DZ2 threM <- mxMatrix( type="Full", nrow=nth, ncol=ntv, free=TRUE, values=thVal, lbound=thLB, labels=thLabMZ, name="ThreMZ" ) threD <- mxMatrix( type="Full", nrow=nth, ncol=ntv, free=TRUE, values=thVal, lbound=thLB, labels=thLabDZ, name="ThreDZ" )

Ordinal Saturated Code Deconstructed Defining Threshold Matrices- ThreMZ Tw1 Tw2 -1.89 -1.16 0.81 0.79 0.73 0.76 0.55 1- Determine the 1st threshold 2- Determine displacements between1st thresholds and subsequent thresholds

Double Check- Moving from Frequencies to Displacements Frequency BMI T2 Cumulative Frequency Z Value Displacement 0.124 -1.16 0.236 0.360 -0.37 0.79 0.291 0.651 0.39 0.76 0.175 0.826 0.94 0.55 1 -

Ordinal Saturated Code Deconstructed Estimating Expected Threshold Matrices threMZ <- mxAlgebra( expression= Inc %*% ThreMZ, name="expThreMZ" ) Inc <- mxMatrix( type="Lower", nrow=nth, ncol=nth, free=FALSE, values=1, name="Inc" ) -1.19 -1.16 0.81 0.79 0.73 0.76 0.55 -1.19 -1.16 -0.38 -0.37 0.34 0.39 0.89 0.93 1 % * % = 3- Add the 1st threshold and the displacement to obtain the subsequent thresholds

Ordinal Saturated Code Deconstructed Estimating Correlations & Fixing Variance corMZ <- mxMatrix( type="Stand", nrow=ntv, ncol=ntv, free=TRUE, values=corVals, lbound=lbrVal, ubound=ubrVal, labels="rMZ", name="expCorMZ" ) corDZ <- mxMatrix( type="Stand", nrow=ntv, ncol=ntv, free=TRUE, values=corVals, lbound=lbrVal, ubound=ubrVal, labels="rDZ", name="expCorDZ" )

How Many Parameters in this Ordinal Model? MZ correlation- rMZ DZ correlation- rDZ Thresholds t1MZ1,t2MZ1,t3MZ1,t4MZ1 t1MZ2,t2MZ2,t3MZ2,t4MZ2 t1DZ1,t2DZ1,t3DZ1,t4DZ1 t1DZ2,t2DZ2,t3DZ2,t4DZ2

Problem Set 1 Open twinSatOrdA.R and twinSatOrd.R What do these scripts do? Looking at the scripts only: How are they similar? How are they different? What do these differences in the scripts reflect regarding conceptual differences in the two models? Run either script and double check against your previously hand-calculated values of thresholds. Report your results. If you can’t get it to match up, don’t panic…do e-mail. Run twinSatOrd.R Is testing an ACE model with the usual model assumptions justified? Why or why not?

Univariate Analysis with Ordinal Data A Roadmap 1- Use the data to test basic assumptions inherent to standard ACE (ADE) models Saturated Model 2- Estimate contributions of genetic and environmental effects on the liability of a trait ADE or ACE Models 3- Test ADE (ACE) submodels to identify and report significant genetic and environmental contributions AE or E Only Models

Questions?