Raw data analysis S. Purcell & M. C. Neale Twin Workshop, IBG Colorado, March 2002.


1 Raw data analysis S. Purcell & M. C. Neale Twin Workshop, IBG Colorado, March 2002

2 Raw data vs. summary statistics
Raw data (one row per twin pair):
Zyg   T1    T2
1     1.2   0.8
1    -1.3  -2.2
2     0.7   1.9
2     0.2  -0.8
...   ...   ...
Summary statistics (lower triangle per zygosity group):
MZ   1.03
     0.87  0.98
DZ   0.95
     0.57  1.08

3 Modelling raw data in Mx
Pros: missing data; measures of individual fit; finite mixture distributions; continuous moderator variables.
Cons: computationally more intensive; sensitivity to starting values.

4 Likelihood analysis of raw data
What is the probability of observing a given twin pair, assuming a certain trait model?
1. e.g. genetic influences very important → dissimilar MZ pairs less likely
2. e.g. no familial influences → dissimilar pairs as likely as similar pairs
How do we relate, statistically, sample-based observed statistics to model-based expectations (parameters)?

5 The Probability Model [Figure: P(X) plotted against X; labels: Data, Mean, Variance]

6 Observed data [Figure: P(X) against X]

7 Probability of the data given the model [Figure: P(X) against X]

8 Maximum Likelihood: estimate the 2 parameters, mean and variance [Figure: P(X) against X]

9 Twin model
Means vector: $(M_1, M_2)$
Variance-covariance matrix: $\begin{pmatrix} V_1 & C_{12} \\ C_{21} & V_2 \end{pmatrix}$
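As a rough illustration of how a model fills in this matrix, the sketch below builds the expected covariance matrix under an ACE model. It assumes the standard twin-model expectations (MZ pairs share all additive genetic variance, DZ pairs half), which this slide does not spell out. The function name `ace_expected_cov` is hypothetical, and the parameter values are borrowed from the ACE estimates on slide 28.

```python
def ace_expected_cov(a2, c2, e2, zygosity):
    """Expected 2x2 twin covariance matrix under an ACE model.

    Assumption (standard twin model, not stated on this slide):
    MZ pairs share all of A, DZ pairs share half of A; C is fully shared.
    """
    genetic_sharing = 1.0 if zygosity == "MZ" else 0.5
    variance = a2 + c2 + e2                   # V1 = V2
    covariance = genetic_sharing * a2 + c2    # C12 = C21
    return [[variance, covariance], [covariance, variance]]


# Variance components taken from the ACE fit reported on slide 28.
mz = ace_expected_cov(0.29, 0.00, 0.25, "MZ")
dz = ace_expected_cov(0.29, 0.00, 0.25, "DZ")
print("MZ expected:", mz)
print("DZ expected:", dz)
```

With these values the expected MZ and DZ covariances come out close to the observed 0.28 and 0.15 shown on slide 27.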

10 Bivariate normal

11 Bivariate normal : MZ pairs

12 High positive correlation

13 Bivariate normal : DZ pairs

14 Low correlation

15 Likelihood MZ pair DZ pair

16 Likelihood MZ pair DZ pair ACE/AE model

17 Likelihood MZ pair DZ pair CE model

18 Likelihood MZ pair DZ pair E model

19 Summary statistics
Originally, model-fitting was done only on summary statistics: variances, covariances, means.
Maximum likelihood covariance matrix fit function, defined in terms of:
Σ = expected covariance matrix
S = observed covariance matrix
p = dimension of S and Σ
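The formula itself did not survive in the transcript. For reference, the usual textbook form of the maximum likelihood fit function for covariance matrices, written with the symbols the slide defines, is given below; treat it as the standard expression rather than a verbatim copy of the slide.

```latex
F_{ML} = \ln\lvert\Sigma\rvert - \ln\lvert S\rvert + \operatorname{tr}\!\left(S\,\Sigma^{-1}\right) - p
```

The function is zero when Σ = S and grows as the expected matrix departs from the observed one.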

20 Raw data
Individual likelihood: probability of the observation conditional on some model.
x = vector of scores (e.g. a twin pair)
Σ = expected covariance matrix
μ = expected mean vector
Sample log-likelihood = sum of individual log-likelihoods
Sum of log-likelihoods ⇔ product of likelihoods; assumes independence of observations.
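A minimal sketch of this calculation for twin pairs, assuming a bivariate normal model. The function names, the use of `None` for a missing score, and the parameter values are all illustrative assumptions; this is not Mx code. A pair with one missing score contributes the univariate likelihood of the observed twin, which is how raw-data likelihood accommodates missing data.

```python
import math


def neg2ll_pair(x1, x2, m1, m2, v1, v2, c):
    """-2 log-likelihood of one twin pair under a bivariate normal model.

    Uses -2 ln L = k ln(2*pi) + ln|Sigma| + (x - mu)' Sigma^-1 (x - mu),
    where k is the number of observed scores. None marks a missing score
    (an illustrative convention, not Mx syntax).
    """
    if x1 is None and x2 is None:
        return 0.0                       # nothing observed, no contribution
    if x2 is None:                       # only twin 1 observed: univariate
        return math.log(2 * math.pi * v1) + (x1 - m1) ** 2 / v1
    if x1 is None:                       # only twin 2 observed: univariate
        return math.log(2 * math.pi * v2) + (x2 - m2) ** 2 / v2
    det = v1 * v2 - c * c                # determinant of the 2x2 matrix
    d1, d2 = x1 - m1, x2 - m2
    quad = (v2 * d1 * d1 - 2 * c * d1 * d2 + v1 * d2 * d2) / det
    return 2 * math.log(2 * math.pi) + math.log(det) + quad


def neg2ll_sample(pairs, **params):
    """Sample -2LL: sum over pairs, which assumes independent observations."""
    return sum(neg2ll_pair(x1, x2, **params) for x1, x2 in pairs)


# Two pairs from slide 2 plus one pair with a missing second twin.
pairs = [(1.2, 0.8), (-1.3, -2.2), (0.7, None)]
params = dict(m1=0.0, m2=0.0, v1=1.0, v2=1.0, c=0.5)  # hypothetical values
print(neg2ll_sample(pairs, **params))
```

In a real analysis an optimiser would vary the parameters to minimise this quantity; here they are simply fixed to show the arithmetic.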

21 Option MX%P=
Outputs individual fit statistics to a file, to identify outliers and possible heterogeneity.
For each observation, 8 values are written, including:
-2 log likelihood
Mahalanobis distance
estimated z-score: good for detection of outliers with missing data; half-normal plot

22 Missing data
Zyg  A1  B1  C1  A2  B2  C2
MZ   12   9  23  13   7  29
MZ    6   5  22   7   9  19
MZ   10  11  26  10  10  30
MZ    9   8  29  11   9  24
DZ    5  10  21  12   9  28
DZ   10   7  24   7   8  29
DZ    9   6  23   5  12  25
DZ   12   8  25  10   7  21

23 Missing data
Zyg  A1  B1  C1  A2  B2  C2
MZ   12   9  23  13   7  29
MZ    6   .  22   7   9  19
MZ   10  11  26   .  10  30
MZ    9   8   .  11   9  24
DZ    5  10  21  12   9  28
DZ   10   7  24   .   8  29
DZ    9   6  23   5  12  25
DZ   12   8  25   .   7  21

24 Missing data
Zyg  A1  B1  C1  A2  B2  C2
MZ   12   9  23  13   7  29
MZ    6  -9  22   7   9  19
MZ   10  11  26  -9  10  30
MZ    9   8  -9  11   9  24
DZ    5  10  21  12   9  28
DZ   10   7  24  -9   8  29
DZ    9   6  23   5  12  25
DZ   12   8  25  -9   7  21

25 Mx implementation
Rectangular datatype: RE file=data.raw
A Means model is needed as well as a Covariance model.
Missing keyword: Missing=-999
The missing code is treated as a string: -999 does not equal -999.00
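The string-matching point is easy to trip over, so here is a small sketch that mimics the behaviour described on the slide. It is an analogy written in Python, not Mx code; `parse_field` and its `missing` argument are hypothetical names.

```python
def parse_field(token, missing="-999"):
    """Return None for the missing code, otherwise the numeric value.

    The comparison is on the raw text, so only an exact string match
    counts as missing (mirroring the behaviour the slide describes).
    """
    return None if token == missing else float(token)


print(parse_field("-999"))     # matches the code exactly: treated as missing
print(parse_field("-999.00"))  # different string: read as the number -999.0
print(parse_field("0.36"))
```

The practical consequence is that the missing code in the script has to be written exactly as it appears in the data file.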

26 Example dataset
1  1  0.361769  -0.35641
2  1  0.888986   1.46342
3  1  0.535161   0.636073
4  1  1.46187    0.663174
5  1  1.01716    0.346681
…  …  …          …

27 Example dataset
MZ covariance matrix
0.55
0.28  0.51
DZ covariance matrix
0.56
0.15  0.54
Correlations
MZ  0.53  (= 0.28 / √(0.55 × 0.51))
DZ  0.27  (= 0.15 / √(0.56 × 0.54))
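The correlations on this slide can be reproduced directly from the covariance matrices. The short sketch below does the arithmetic; the helper name `correlation` is just for illustration.

```python
import math


def correlation(var1, cov, var2):
    """Correlation from two variances and their covariance."""
    return cov / math.sqrt(var1 * var2)


r_mz = correlation(0.55, 0.28, 0.51)
r_dz = correlation(0.56, 0.15, 0.54)
print(round(r_mz, 2), round(r_dz, 2))  # 0.53 0.27
```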

28 Example dataset
ACE  -2LL 2547.71  df 1197  a² = 0.29  c² = 0.00  e² = 0.25
CE   -2LL 2566.33  df 1198  c² = 0.21  e² = 0.32
Model comparison
A test that the A component is significantly nonzero is the deterioration of fit from the ACE to the CE model:
Δ-2LL = 2566.33 - 2547.71 = 18.62
Δdf = 1198 - 1197 = 1
p-value < 0.0001
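The p-value can be checked with a likelihood-ratio test: the difference in -2LL between nested models is referred to a chi-square distribution with degrees of freedom equal to the difference in df. The sketch below covers only the 1-df case, where the chi-square tail probability reduces to a complementary error function; `lrt_1df` is a hypothetical helper name.

```python
import math


def lrt_1df(neg2ll_restricted, neg2ll_full):
    """Likelihood-ratio test for nested models differing by one parameter.

    For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    """
    chi_sq = neg2ll_restricted - neg2ll_full
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p_value


chi_sq, p = lrt_1df(2566.33, 2547.71)   # CE versus ACE
print(round(chi_sq, 2), p)
```

The resulting p-value is well below 0.0001, in line with the slide.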

29 Testing differences in means
Do MZ and DZ twins have similar mean values?
Equating MZ and DZ means: joint zygosity mean -0.0014; model -2LL 2547.707, df 1196
Separate MZ and DZ means: MZ mean 0.0161, DZ mean -0.0159; model -2LL 2547.304, df 1195
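The slide gives the two fits but not the comparison. Assuming the same likelihood-ratio logic used for the ACE versus CE test, the arithmetic would be:

```latex
\Delta(-2LL) = 2547.707 - 2547.304 = 0.403, \qquad
\Delta df = 1196 - 1195 = 1, \qquad
0.403 < \chi^2_{1,\,0.05} = 3.84
```

On these numbers, equating the MZ and DZ means does not significantly worsen the fit.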

30 Saturated model
Expected covariance matrix = observed exactly: "perfect fit".
No constraints at all on the model, e.g. variance separately estimated for each twin.
-2LL 2545.425, df 1190 (10 parameters: 4 variances, 2 covariances, 4 means)
ACE model: -2LL 2547.71, df 1197
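Placing the two fits side by side suggests an overall test of the ACE model against the saturated model. Taking the figures exactly as printed on the slide, a sketch of that comparison is:

```latex
\Delta(-2LL) = 2547.71 - 2545.425 = 2.285, \qquad
\Delta df = 1197 - 1190 = 7, \qquad
2.285 < \chi^2_{7,\,0.05} = 14.07
```

Read this way, the ACE model does not fit significantly worse than the saturated model.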

