Download presentation
Presentation is loading. Please wait.
1
Continuous heterogeneity Shaun Purcell Boulder Twin Workshop March 2004
2
Raw data VS summary statistics ZygT1 T2 11.20.8 1-1.3-2.2 20.71.9 20.2-0.8........ MZ 1.03 0.870.98 DZ 0.95 0.571.08
3
Raw data VS summary statistics ZygT1 T2 11.20.8 1-1.3-2.2 20.71.9 20.2-0.8........
4
Raw data VS summary statistics ZygT1 T2age 1 1.20.812.3 1-1.3-2.210.3 20.71.98.7 20.2-0.814.5...........
5
Variance Bivariate normal distribution DataMean
6
Introducing Definition variables Zygosity as a definition variable “Rectangular” file data.raw 1 1 0.361769 -0.35641 2 1 0.888986 1.46342 3 1 0.535161 0.636073... 1 2 0.234099 0.0848318 2 2 -0.547252 -0.22976 3 2 -0.307926 -0.253692...
7
!Using definition variables Group1: Defines Matrices Calc NGroups=2 Begin Matrices; X Lower 1 1 free Y Lower 1 1 free Z Lower 1 1 free M full 1 1 free H Full 1 1 End Matrices; Begin Algebra; A= X*X'; C = Y*Y'; E = Z*Z'; End Algebra; Ma X 0 Ma Y 0 Ma Z 1 Ma M 0 Options MX%P=rawfit.txt End Group2: MZ & DZ twin pairs Data NInput_vars=4 NObservations=0 RE file=data.raw Labels id zyg t1 t2 Select t1 t2 zyg / Definition zyg / Matrices = Group 1 Means M | M / Covariances A + C + E | (H~)@A + C _ (H~)@A + C | A + C + E / Specify H -1 End H will be specified as a definition variable M, necessary for the means model Optional: request individual fit statistics for each pair A single group for both MZ & DZ twins Points to a “REctangular” data file No need to specify number of pairs Zygosity is a “Definition” variable Multiply A component by 1/H 1 x 1 matrix H represents each pair’s zygosity A model for the means [ twin1 | twin 2]
8
Output from zyg.mx RE FILE=DATA.RAW Rectangular continuous data read initiated NOTE: Rectangular file contained 500 records with data that contained a total of 2000 observations LABELS ID ZYG T1 T2 SELECT T1 T2 ZYG / DEFINITION ZYG / NOTE: Selection yields 500 data vectors for analysis NOTE: Vectors contain a total of 1500 observations NOTE: Definition yields 500 data vectors for analysis NOTE: Vectors contain a total of 1000 observations
9
Output from zyg.mx Summary of VL file data for group 2 ZYG T1 T2 Code -1.0000 1.0000 2.0000 Number 500.0000 500.0000 500.0000 Mean 1.5000 -0.0140 0.0240 Variance 0.2500 0.5601 0.5211 Minimum 1.0000 -2.1941 -1.9823 Maximum 2.0000 2.1218 2.7670
10
Output from zyg.mx MATRIX H This is a FULL matrix of order 1 by 1 1 1 -1 MATRIX M This is a FULL matrix of order 1 by 1 1 1 4 MATRIX X This is a LOWER TRIANGULAR matrix of order 1 by 1 1 1 1 MATRIX Y This is a LOWER TRIANGULAR matrix of order 1 by 1 1 1 2 MATRIX Z This is a LOWER TRIANGULAR matrix of order 1 by 1 1 1 3 Specify H -1
11
Output from zyg.mx Your model has 4 estimated parameters and 1000 Observed statistics -2 times log-likelihood of data >>> 2134.998 Degrees of freedom >>>>>>>>>>>>>>>> 996 Fixing X to zero Your model has 3 estimated parameters and 1000 Observed statistics -2 times log-likelihood of data >>> 2154.626 Degrees of freedom >>>>>>>>>>>>>>>> 997
12
Continuous moderators Traits often best defined continuously Many environmental moderators also likely to be continuous in nature –Age –Gestational age –Socio-economic status –Educational level –Consumption of food / alcohol / drugs How to test for G x E interaction in this case?
13
Continuous moderators Problems? –Stratification of sample reduced sample size –Modelling proportions of variance implicitly assumes equality of variance w.r.t moderator –Logical to assume a linear G E interaction linearity at the level of effect, not variance –No obvious statistical test for heterogeneity Heritability 468 10 Age (yrs) 0% 100%
14
Biometrical G E model At a hypothetical single locus –additive genetic valuea –allele frequency p –QTL variance2p(1-p)a 2 Assuming a linear interaction –additive genetic valuea + M –allele frequency p –QTL variance2p(1-p)(a + M) 2
15
Biometrical G E model M No interaction AaAAaa a 0 -a M Interaction 1 -- 1 M Equivalently… 22 1 1
16
Model-fitting approach to GxE Twin 1 ACE Twin 2 ACE a c e ce a
17
Model-fitting approach to GxE Twin 1 ACE Twin 2 ACE a+ X M c e ce a+XMa+XM Continuous moderator variable M Can be coded 0 / 1 in the dichotomous case
18
Individual specific moderators Twin 1 ACE Twin 2 ACE a+ X M 1 c e ce a+XM2a+XM2
19
E x E interactions Twin 1 ACE Twin 2 ACE a+ X M 1 c+ Y M 1 e+ Z M 1 a+ X M 2 c+ Y M 2 e+ Z M 2
20
ACE - XYZ - M Twin 1 ACE Twin 2 ACE a+XM1a+XM1 c+YM1c+YM1 e+ZM1e+ZM1 a+XM2a+XM2 c+YM2c+YM2 e+ZM2e+ZM2 M m+MM1m+MM1 M m+MM2m+MM2 Main effects and moderating effects statistically and conceptually distinct
21
Model-fitting approach to GxE C A E Component of variance Moderator variable
22
Turkheimer et al (2003) 320 twin pairs recruited at birth from urban hospitals G : additive genetic variance E : SES –parental education, occupation, income X : IQ –Wechsler; Verbal, Performance, Full
23
A CE Full scale IQ Verbal IQ Non-Verbal IQ
24
Standard model Means vector Covariance matrix
25
Allowing for a main effect of X Means vector Covariance matrix
26
! Basic model + main effect of a definition variable G1: Define Matrices Data Calc NGroups=3 Begin Matrices; A full 1 1 free C full 1 1 free E full 1 1 free M full 1 1 free! grand mean B full 1 1 free ! moderator-linked means model H full 1 1 R full 1 1! twin 1 moderator (definition variable) S full 1 1! twin 2 moderator (definition variable) End Matrices; Ma M 0 Ma B 0 Ma A 1 Ma C 1 Ma E 1 Matrix H.5 Options NO_Output End
27
G2: MZ Data NInput_vars=6 NObservations=0 Missing =-999 RE File=f1.dat Labels id zyg p1 p2 m1 m2 Select if zyg = 1 / Select p1 p2 m1 m2 / Definition m1 m2 / Matrices = Group 1 Means M + B*R | M + B*S / Covariance A*A' + C*C' + E*E' | A*A' + C*C' _ A*A' + C*C' | A*A' + C*C' + E*E' / !twin 1 moderator variable Specify R -1 !twin 2 moderator variable Specify S -2 Options NO_Output End
28
G3: DZ Data NInput_vars=6 NObservations=0 Missing =-999 RE File=f1.dat Labels id zyg p1 p2 m1 m2 Select if zyg = 2 / Select p1 p2 m1 m2 / Definition m1 m2 / Matrices = Group 1 Means M + B*R | M + B*S / Covariance A*A' + C*C' + E*E' | H@A*A' + C*C' _ H@A*A' + C*C' | A*A' + C*C' + E*E' / !twin 1 moderator variable Specify R -1 !twin 2 moderator variable Specify S -2 End
29
MATRIX A This is a FULL matrix of order 1 by 1 1 1 1.3228 MATRIX B This is a FULL matrix of order 1 by 1 1 1 0.3381 MATRIX C This is a FULL matrix of order 1 by 1 1 1 1.1051 MATRIX E This is a FULL matrix of order 1 by 1 1 1 0.9728 MATRIX M This is a FULL matrix of order 1 by 1 1 1 0.1035 Your model has 5 estimated parameters and 800 Observed statistics -2 times log-likelihood of data >>> 3123.925 Degrees of freedom >>>>>>>>>>>>>>>> 795
30
MATRIX A This is a FULL matrix of order 1 by 1 1 1 1.3078 MATRIX B This is a FULL matrix of order 1 by 1 1 1 0.0000 MATRIX C This is a FULL matrix of order 1 by 1 1 1 1.1733 MATRIX E This is a FULL matrix of order 1 by 1 1 1 0.9749 MATRIX M This is a FULL matrix of order 1 by 1 1 1 0.1069 Your model has 4 estimated parameters and 800 Observed statistics -2 times log-likelihood of data >>> 3138.157 Degrees of freedom >>>>>>>>>>>>>>>> 796
31
Continuous heterogeneity model Means vector Covariance matrix
32
! GxE - Basic model G1: Define Matrices Data Calc NGroups=3 Begin Matrices; A full 1 1 free C full 1 1 free E full 1 1 free T full 1 1 free! moderator-linked A component U full 1 1 free! moderator-linked C component V full 1 1 free! moderator-linked E component M full 1 1 free! grand mean B full 1 1 free ! moderator-linked means model H full 1 1 R full 1 1! twin 1 moderator (definition variable) S full 1 1! twin 2 moderator (definition variable) End Matrices; Ma T 0 Ma U 0 Ma V 0 Ma M 0 Ma B 0 Ma A 1 Ma C 1 Ma E 1 Matrix H.5 Options NO_Output End
33
G2: MZ Data NInput_vars=6 NObservations=0 Missing =-999 RE File=f1.dat Labels id zyg p1 p2 m1 m2 Select if zyg = 1 / Select p1 p2 m1 m2 / Definition m1 m2 / Matrices = Group 1 Means M + B*R | M + B*S / Covariance (A+T*R)*(A+T*R) + (C+U*R)*(C+U*R) + (E+V*R)*(E+V*R) | (A+T*R)*(A+T*S) + (C+U*R)*(C+U*S) _ (A+T*S)*(A+T*R) + (C+U*S)*(C+U*R) | (A+T*S)*(A+T*S) + (C+U*S)*(C+U*S) + (E+V*S)*(E+V*S) / !twin 1 moderator variable Specify R -1 !twin 2 moderator variable Specify S -2 Options NO_Output End
34
G3: DZ Data NInput_vars=6 NObservations=0 Missing =-999 RE File=f1.dat Labels id zyg p1 p2 m1 m2 Select if zyg = 2 / Select p1 p2 m1 m2 / Definition m1 m2 / Matrices = Group 1 Means M + B*R | M + B*S / Covariance (A+T*R)*(A+T*R) + (C+U*R)*(C+U*R) + (E+V*R)*(E+V*R) | H@(A+T*R)*(A+T*S) + (C+U*R)*(C+U*S) _ H@(A+T*S)*(A+T*R) + (C+U*S)*(C+U*R) | (A+T*S)*(A+T*S) + (C+U*S)*(C+U*S) + (E+V*S)*(E+V*S) / !twin 1 moderator variable Specify R -1 !twin 2 moderator variable Specify S -2 End
35
Practical 1 The script: mod.mx The data: f1.dat IDzygositytrait_twin_1trait_twin_2mod_twin_1mod_twin_2 1.Any evidence for G × E for this trait ? i.e. does the A latent variable show heterogeneity with respect to the moderator variable 2.If so, in what way? i.e. how would you interpret/describe the effect?
36
Practical 1 : f1.dat Moderator distributionMZ pairs (trait) DZ pairs (trait) All twin 1’s v.s. moderator
37
nomod.mx a1.3078a 2 ~ 1.7 c1.1733c 2 ~ 1.4 e 0.9749 e 2 ~ 0.95 a 2 +c 2 +e 2 = 4.05 i.e. % variance is 42%, 35% and 23%
38
Parameter estimates: mod.mx ACE-XYZ-MACE-YZ-M A1.22881.4455 C0.98740.6837 E0.92360.9484 T-0.6007 U0.1763-0.6817 V0.38250.4663 M0.07370.0724 B0.3670.3625
39
Plotting VCs For the additive genetic VC, for example –Given a, and a range of values for the moderator variable For example, a = 0.5, = -0.2 and M ranges from -2 to +2 M (a+M)2(a+M)2 (a+M)2(a+M)2 -2(0.5+(-0.2×-2)) 2 0.81 -1.5(0.5+(-0.2×-1.5)) 2 0.73 … +2(0.5+(-0.2×2)) 2 0.01
41
Specific test of G×E -2LLDf Full model ACE-XYZ-M 3024.689792 Sub model ACE-YZ-M 3034.898793 Difference10.2091 p-value = 0.00139
42
Other tests TestSubmodel-2LLΔdfp-value Y ACE-XZ-M3025.782 10.29 Z ACE-XY-M3110.429 1< 1×10 -19 M ACE-XYZ3039.370 10.00013 C & Y AE-XZ-M3026.228 20.46 All made against the full model ACE-XYZ-M, -2LL = 3024.689
43
Confidence intervals Easy to get CIs for individual parameters Additionally, CIs on the moderated VCs are useful for interpretation e.g. a 95% CI for (a+ M) 2, for a specific M
44
Define two extra vectors in Group 1 P full 1 13 O Unit 1 13 Matrix P -3 -2.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 2.5 3 Add a 4 th group to calculate the CIs CIs Calc Matrices = Group 1 Begin Algebra; F= ( A@O + T@P ). ( A@O + T@P ) / G= ( C@O + U@P ). ( C@O + U@P ) / I= ( E@O + V@P ). ( E@O + V@P ) / End Algebra; Interval @ 95 F 1 1 to F 1 13 Interval @ 95 G 1 1 to G 1 13 Interval @ 95 I 1 1 to I 1 13 End;
45
Calculation of CIs F= ( A@O + T@P ). ( A@O + T@P ) / E.g. if P were then ( A@O + T@P ) equals or Finally, the dot-product squares all elements to give or
46
Confidence intervals on VCs A C E
47
Other considerations Simple approach to test for heterogeneity –easily adapted, e.g. for ordinal data models Extensions / things to watch for… –scalar v.s. qualitative heterogeneity v. low power –the environment may show shared genetic influence with the trait –nonlinear effects in both mediation and moderation
48
X E G Main effect Moderating G E r GE
50
Turkheimer et al, 2003 SES IQ SES V(IQ)
51
Simulated twin data Moderator Standard Quadratic E(Trait) A 3 df test of any moderating effect Standard analysis : linear means model (in H A and H 0 ) Quadratic analysis : linear and quadratic means model (in H A and H 0 ) 18/50 replicates significant i.e. type I error 36% for nominal 5% level
52
More complex G E interaction E-risk Trait P(disease)
53
Include E-risk in means model E-risk Residual Trait P(disease | E-risk)
54
Biometrical model E-risk Additive genetic effect Quadratic form AaAAaa
55
ACE - XYZ - X 2 Y 2 Z 2 - M Twin 1 ACE Twin 2 ACE a + X M 1 + X M 2 1 c e ce a+XM2+XM22a+XM2+XM22
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.