Multivariate Genetic Analysis: Introduction

Presentation transcript:

Multivariate Genetic Analysis: Introduction
Frühling Rijsdijk & Shaun Purcell, Twin Workshop, Boulder, Wednesday March 3, 2004
In this second part we will talk about the matrix specification of bivariate genetic models. Bivariate models are the simplest case of multivariate models: we analyse two traits at a time.

Multivariate Twin Analyses
Goal: to understand what factors make sets of variables correlate or co-vary.
Two or more traits can be correlated because they share common genes or common environmental influences.
With twin data on multiple traits it is possible to partition the covariation into its genetic and environmental components.

Univariate ACE Model for a Twin Pair
[Path diagram: the latent factors A, C and E load on the phenotypes P1 and P2 of a twin pair through the paths x, y and z. Across twins the A factors correlate 1 (MZ) or .5 (DZ) and the C factors correlate 1.]
Matrices: X LOW 1 × 1, Y LOW 1 × 1, Z LOW 1 × 1. Expected covariance matrices: 2 × 2.
Recap: before we move on to a two-variable (bivariate) AE twin model, we quickly review the univariate ACE model. The effects of the latent genetic and environmental factors on the observed phenotypes are represented by the path coefficients a, c and e (the paths x, y and z in the diagram). These path coefficients are 1 × 1 matrices, so they are just single values. The expected MZ and DZ matrices (the variances and covariances) formulated from these path coefficients are therefore 2 × 2 matrices.

X, Y and Z are LOWER 2 × 2 matrices holding the paths from (A1, A2), (C1, C2) and (E1, E2) to (P1, P2):
$$X = \begin{pmatrix} x_{11} & 0 \\ x_{21} & x_{22} \end{pmatrix}, \quad Y = \begin{pmatrix} y_{11} & 0 \\ y_{21} & y_{22} \end{pmatrix}, \quad Z = \begin{pmatrix} z_{11} & 0 \\ z_{21} & z_{22} \end{pmatrix}$$
[Path diagram: A1, C1 and E1 load on both traits (paths x11, x21, y11, y21, z11, z21); A2, C2 and E2 load on the second trait only (x22, y22, z22). P11 and P21 are traits 1 and 2 of Twin 1, P12 and P22 those of Twin 2. Across twins the A factors correlate 1 (MZ) or .5 (DZ).]
When we have measured two variables for each twin, the A and E matrices are 2 × 2 matrices with a triangular, or Cholesky, decomposition. In the Cholesky there are as many factors as variables, i.e. A1 and A2, where A1 influences both variables and A2 influences only the second variable. This allows for a correlation between the variables due to shared genetic effects. The same holds for E.
The complete variance-covariance matrix is a 4 × 4 matrix that can be divided into four quadrants. The first quadrant holds the variances and covariances within an individual. On the diagonal are the variances of trait 1 and trait 2, which can be obtained by the path tracing rules: for P1 this is $x_{11}^2 + z_{11}^2$ and for P2 it is $x_{22}^2 + x_{21}^2 + z_{22}^2 + z_{21}^2$. On the off-diagonal is the covariance between traits 1 and 2, which is the sum of all chains of paths (via A and E) linking the two: $x_{11} x_{21} + z_{11} z_{21}$. Since the within-twin variances and covariances are expected to be the same for Twin 1 and Twin 2, the same expected 2 × 2 matrix fills the fourth quadrant.
The second quadrant holds the covariances between Twin 1 and Twin 2, which are either within trait (diagonal) or cross trait (off-diagonal). The cross-twin within-trait covariances are what we have already come across in the univariate analyses: the MZ/DZ ratio indicates the heritability of a trait. What is new here is the correlation between traits 1 and 2. Whether this correlation is due to shared genetic or shared environmental effects is indicated by the cross-twin cross-trait covariance. In an AE model the expected MZ:DZ ratio of the cross-twin cross-trait covariances is 2:1, because MZ pairs share twice as many genetic effects as DZ pairs.

Layout of the 4 × 4 variance-covariance matrix:
             Twin 1 p1        Twin 1 p2        Twin 2 p1    Twin 2 p2
Twin 1 p1    Var P1
Twin 1 p2    Cov P1-P2        Var P2
Twin 2 p1    Within Trait 1   Cross Traits     Var P1
Twin 2 p2    Cross Traits     Within Trait 2   Cov P1-P2    Var P2
The Twin 1 by Twin 1 and Twin 2 by Twin 2 blocks hold the within-twin covariances; the Twin 2 by Twin 1 block holds the cross-twin covariances.

Expected 4 × 4 covariance matrix for the bivariate ACE Cholesky model. The coefficient written "1 or .5" is 1 for MZ pairs and .5 for DZ pairs; the C paths carry a coefficient of 1 in both groups.
Within-twin covariances:
$$\mathrm{Var}(P_1) = x_{11}^2 + y_{11}^2 + z_{11}^2$$
$$\mathrm{Cov}(P_1, P_2) = x_{21} x_{11} + y_{21} y_{11} + z_{21} z_{11}$$
$$\mathrm{Var}(P_2) = x_{22}^2 + x_{21}^2 + y_{22}^2 + y_{21}^2 + z_{22}^2 + z_{21}^2$$
Cross-twin covariances:
$$\text{within trait 1:} \quad (1 \text{ or } .5)\, x_{11}^2 + y_{11}^2$$
$$\text{cross traits:} \quad (1 \text{ or } .5)\, x_{21} x_{11} + y_{21} y_{11}$$
$$\text{within trait 2:} \quad (1 \text{ or } .5)\,(x_{22}^2 + x_{21}^2) + y_{22}^2 + y_{21}^2$$
The ratio Rmz:Rdz of the cross-twin cross-trait correlations indicates whether A, C or E determines the phenotypic correlation between p1 and p2 (Rp1-p2).

MZ pairs: within-twin and cross-twin covariances
             Twin 1 p1   Twin 1 p2   Twin 2 p1   Twin 2 p2
Twin 1 p1    1
Twin 1 p2    .30         1
Twin 2 p1    .79         .49         1
Twin 2 p2    .50         .59         .29         1
DZ pairs: within-twin and cross-twin covariances
             Twin 1 p1   Twin 1 p2   Twin 2 p1   Twin 2 p2
Twin 1 p1    1
Twin 1 p2    .30         1
Twin 2 p1    .39         .25         1
Twin 2 p2    .24         .43         .31         1
In both tables the upper-left and lower-right 2 × 2 blocks are the within-twin covariances and the lower-left block holds the cross-twin covariances.

Summary: Cross-trait covariances
Within-individual cross-trait covariances imply common etiological influences.
Cross-twin cross-trait covariances imply that these common etiological influences are familial.
Whether these common familial etiological influences are genetic or environmental is reflected in the MZ/DZ ratio of the cross-twin cross-trait covariances.
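As a rough illustration of the last point, the cross-twin cross-trait values in the example MZ and DZ matrices from the earlier slides (.49 and .50 for MZ, .25 and .24 for DZ, as far as they can be read from the transcript) can be compared directly. This is a sketch in Python rather than Mx, and the variable names are illustrative only.

```python
# Cross-twin cross-trait values as read from the example MZ and DZ matrices.
# These readings are an assumption: the slide tables are garbled in the transcript.
mz_cross = [0.49, 0.50]
dz_cross = [0.25, 0.24]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# A ratio near 2 is what the notes expect when the shared influences are
# additive genetic; a ratio near 1 would point to shared environment.
ratio = mean(mz_cross) / mean(dz_cross)
print(f"MZ/DZ ratio of cross-twin cross-trait covariances: {ratio:.2f}")
```

A ratio this close to 2:1 is the pattern the notes describe for an AE model; it is a descriptive check, not a substitute for fitting the model.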

Specification in Mx
We have seen how the within-twin variance-covariance matrix for the additive genetic effects of the Cholesky decomposition can be obtained by path tracing. The Cholesky decomposition of the A matrix is specified in Mx as a Lower matrix (Lower matrices are always square); in the bivariate case its dimension is 2 × 2. The within-twin variance-covariance matrix of A is given by the product of the path matrix and its transpose. In the transposed matrix the rows become the columns and vice versa. We use *, ordinary matrix multiplication: rows of the path matrix are multiplied by columns of its transpose to form the elements of ΣA. If we work out the multiplication we get exactly the same result as by path tracing.

Within-Twin Covariances: A
[Path diagram: A1 loads on P1 (x11) and on P2 (x21); A2 loads on P2 only (x22).]
Two routes give the same matrix: path tracing and 'star' matrix multiplication (*).
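The worked multiplication did not survive in the transcript. A reconstruction from the path labels x11, x21 and x22, writing X for the lower-triangular path matrix and ΣA for the resulting additive genetic covariance matrix (both symbols are assumptions here), would be:

```latex
\Sigma_A = X X' =
\begin{pmatrix} x_{11} & 0 \\ x_{21} & x_{22} \end{pmatrix}
\begin{pmatrix} x_{11} & x_{21} \\ 0 & x_{22} \end{pmatrix}
=
\begin{pmatrix} x_{11}^2 & x_{11} x_{21} \\ x_{21} x_{11} & x_{21}^2 + x_{22}^2 \end{pmatrix}
```

The diagonal entries match the path-traced genetic variances of P1 and P2, and the off-diagonal entry matches the single chain P1 ← A1 → P2.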

Specification of C and E follows the same principles
    Begin Matrices;
     X LOW 2 2 FREE   ! Additive Genetic PATHS
     Y LOW 2 2 FREE   ! Common Env PATHS
     Z LOW 2 2 FREE   ! Unique Env PATHS
    End Matrices;
    Begin Algebra;
     A=X*X';          ! Additive Genetic Cov matrix
     C=Y*Y';          ! Common Env Cov matrix
     E=Z*Z';          ! Unique Env Cov matrix
     P=A+C+E;
    End Algebra;
For the simplest two-variable multivariate AE model, the total phenotypic variance is P = A + E, where A and E are 2 × 2 matrices rather than single values: P is the sum of the additive genetic and unique environmental covariance matrices. An extension to an ACE model is straightforward.
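For readers without Mx, the same algebra can be checked numerically. The following is a minimal sketch in plain Python, not the workshop's own script; the path coefficients in X, Y and Z are made-up illustrative values, and the helper names `matmul`, `transpose` and `add` are hypothetical.

```python
def matmul(a, b):
    """Ordinary matrix product of two list-of-lists matrices (Mx '*')."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Rows become columns and vice versa (Mx ')."""
    return [list(col) for col in zip(*m)]


def add(*matrices):
    """Element-wise sum of same-shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*matrices)]


# Hypothetical lower-triangular path matrices, mirroring X/Y/Z LOW 2 2.
X = [[0.7, 0.0], [0.4, 0.5]]
Y = [[0.3, 0.0], [0.2, 0.3]]
Z = [[0.6, 0.0], [0.1, 0.6]]

A = matmul(X, transpose(X))  # additive genetic covariance matrix
C = matmul(Y, transpose(Y))  # common environmental covariance matrix
E = matmul(Z, transpose(Z))  # unique environmental covariance matrix
P = add(A, C, E)             # phenotypic covariance matrix

for row in P:
    print([round(v, 2) for v in row])
```

The off-diagonal of P equals x11*x21 + y11*y21 + z11*z21, which is the path-traced covariance between the two traits.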

By the rule of matrix addition: ΣP = ΣA + ΣC + ΣE
$$P = \begin{pmatrix} x_{11}^2 + y_{11}^2 + z_{11}^2 & x_{11} x_{21} + y_{11} y_{21} + z_{11} z_{21} \\ x_{21} x_{11} + y_{21} y_{11} + z_{21} z_{11} & x_{21}^2 + x_{22}^2 + y_{21}^2 + y_{22}^2 + z_{21}^2 + z_{22}^2 \end{pmatrix}$$

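Cross-Twin Covariances (DZ): A
[Path diagram: A1 and A2 of Twin 1 correlate .5 with A1 and A2 of Twin 2; paths x11, x21 and x22 lead to P11 and P21 (Twin 1) and to P12 and P22 (Twin 2).]
Path tracing:
Within-traits (diagonals): P11-P12 = x11 × .5 × x11; P21-P22 = (x22 × .5 × x22) + (x21 × .5 × x21)
Cross-traits: P11-P22 = x11 × .5 × x21; P21-P12 = x21 × .5 × x11
In matrix form this is written with the Kronecker product (⊗).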

Kronecker Product: Specification in Mx
⊗ = @
H FULL 1 1
MATRIX H .5
(m × n) ⊗ (p × q) = (mp × nq), so (1 × 1) ⊗ (2 × 2) = (2 × 2)
The right Kronecker product of two matrices is formed by multiplying each element of the first matrix by the second matrix. So the matrix in which all elements of A*A' are multiplied by .5 is the Kronecker product of H, a 1 × 1 matrix to which we assign the value .5, and A*A'. There are no conformability criteria for this type of product; the matrices can be of different order. In Mx, the Kronecker product is denoted with the @ symbol.
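The definition above can be sketched in a few lines of plain Python. This is an illustration of the operation, not Mx's implementation; the function name `kron` and the numeric values of A are assumptions.

```python
def kron(a, b):
    """Right Kronecker product: each element of a multiplies the whole of b."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    # The result has (rows_a * rows_b) rows and (cols_a * cols_b) columns.
    out = [[0.0] * (cols_a * cols_b) for _ in range(rows_a * rows_b)]
    for i in range(rows_a):
        for j in range(cols_a):
            for k in range(rows_b):
                for m in range(cols_b):
                    out[i * rows_b + k][j * cols_b + m] = a[i][j] * b[k][m]
    return out


H = [[0.5]]                       # 1 x 1 matrix, as in "MATRIX H .5"
A = [[0.49, 0.28], [0.28, 0.41]]  # hypothetical 2 x 2 genetic covariance matrix

half_A = kron(H, A)               # plays the role of H@A: every element halved
for row in half_A:
    print([round(v, 3) for v in row])
```

With a 1 × 1 first matrix the product reduces to scalar multiplication, which is why H@A gives the DZ genetic cross-twin covariances.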

Cross-Twin Covariances (DZ): C
[Path diagram: C1 and C2 of Twin 1 correlate 1 with C1 and C2 of Twin 2; paths y11, y21 and y22 lead to P11 and P21 (Twin 1) and to P12 and P22 (Twin 2).]
Path tracing:
Within-traits (diagonals): P11-P12 = y11 × y11; P21-P22 = (y22 × y22) + (y21 × y21)
Cross-traits: P11-P22 = y11 × y21; P21-P12 = y21 × y11

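Cross-Twin Covariances (DZ): (1/2 A + C)
$$\tfrac{1}{2} A + C = \begin{pmatrix} .5\,x_{11}^2 + y_{11}^2 & .5\,x_{11} x_{21} + y_{11} y_{21} \\ .5\,x_{21} x_{11} + y_{21} y_{11} & .5\,(x_{22}^2 + x_{21}^2) + y_{22}^2 + y_{21}^2 \end{pmatrix}$$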

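Cross-Twin Covariances (MZ): (A + C)
$$A + C = \begin{pmatrix} x_{11}^2 + y_{11}^2 & x_{11} x_{21} + y_{11} y_{21} \\ x_{21} x_{11} + y_{21} y_{11} & x_{22}^2 + x_{21}^2 + y_{22}^2 + y_{21}^2 \end{pmatrix}$$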

Covariance matrix (MZ): specification in Mx
Within-Twin Cov = A+C+E (2 × 2)
Cross-Twin Cov = A+C (2 × 2)
COV (4 × 4)
    A+C+E | A+C _
    A+C   | A+C+E
We now have all the relevant pieces to specify the predicted variance-covariance matrix of the MZ twin pairs. The usefulness of the horizontal and vertical adhesion operators can now be shown. The predicted matrix is a 4 × 4 matrix made up of four 2 × 2 blocks, but some of the blocks are the same according to the model: the within-twin block of Twin 1 equals that of Twin 2 (diagonal blocks), and the covariance between Twin 1 and Twin 2 equals the covariance between Twin 2 and Twin 1 (off-diagonal blocks). We use | (horizontal adhesion) and _ (vertical adhesion) to combine these blocks into the predicted 4 × 4 variance-covariance matrix.

Covariance matrix (DZ): specification in Mx
Within-Twin Cov = A+C+E (2 × 2)
Cross-Twin Cov = 1/2 A+C (2 × 2)
COV (4 × 4)
    A+C+E | H@A+C _
    H@A+C | A+C+E
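The block structure of the MZ and DZ expectations can be assembled outside Mx as well. The sketch below uses plain Python; `hstack` and `vstack` stand in for the Mx adhesion operators | and _, and the A, C and E values are hypothetical numbers chosen only for illustration.

```python
def add(a, b):
    """Element-wise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, m):
    """Multiply every element of m by the scalar c (the effect of H@A)."""
    return [[c * v for v in row] for row in m]


def hstack(a, b):
    """Horizontal adhesion, like the Mx '|' operator."""
    return [ra + rb for ra, rb in zip(a, b)]


def vstack(a, b):
    """Vertical adhesion, like the Mx '_' operator."""
    return [list(r) for r in a] + [list(r) for r in b]


# Hypothetical 2 x 2 covariance components.
A = [[0.49, 0.28], [0.28, 0.41]]
C = [[0.09, 0.06], [0.06, 0.13]]
E = [[0.36, 0.06], [0.06, 0.37]]

within = add(add(A, C), E)        # A+C+E
cross_mz = add(A, C)              # A+C
cross_dz = add(scale(0.5, A), C)  # H@A+C with H = .5

cov_mz = vstack(hstack(within, cross_mz), hstack(cross_mz, within))
cov_dz = vstack(hstack(within, cross_dz), hstack(cross_dz, within))

for name, cov in (("MZ", cov_mz), ("DZ", cov_dz)):
    print(name)
    for row in cov:
        print([round(v, 3) for v in row])
```

Only the off-diagonal blocks differ between the two groups, which is what lets the model separate A from C.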

The within-twin covariance matrix describes how much of the phenotypic covariance between P1 and P2 is due to common A, common C and common E effects:
$$P = \begin{pmatrix} x_{11}^2 + y_{11}^2 + z_{11}^2 & x_{11} x_{21} + y_{11} y_{21} + z_{11} z_{21} \\ x_{21} x_{11} + y_{21} y_{11} + z_{21} z_{11} & x_{21}^2 + x_{22}^2 + y_{21}^2 + y_{22}^2 + z_{21}^2 + z_{22}^2 \end{pmatrix}$$
To get the predicted phenotypic correlation, we convert this covariance matrix to a correlation matrix. To get the genetic, shared-environmental and unique-environmental correlations (rg, rc, re), we convert the A, C and E covariance matrices to correlation matrices.

Correlations
A correlation coefficient is a standardized covariance that lies between -1 and 1, which makes it easier to interpret. It is calculated by dividing the covariance by the square root of the product of the variances of the two variables.

Covariances to Correlations In matrix form:

Correlations to covariances In matrix form:
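The matrix formulas on these two slides did not survive in the transcript. A standard form that agrees with the verbal definition of the correlation and with the Mx expression \sqrt(I.A)~*A*\sqrt(I.A)~ is sketched here; the symbols Σ (covariance matrix), R (correlation matrix), D (diagonal matrix of standard deviations) and σ are assumptions, not the slides' own notation.

```latex
r_{12} = \frac{\sigma_{12}}{\sqrt{\sigma_{11}\,\sigma_{22}}}, \qquad
R = D^{-1}\,\Sigma\,D^{-1}, \qquad
\Sigma = D\,R\,D, \qquad
D = \begin{pmatrix} \sqrt{\sigma_{11}} & 0 \\ 0 & \sqrt{\sigma_{22}} \end{pmatrix}
```

Pre- and post-multiplying by the inverse of D divides each covariance by the two standard deviations involved; multiplying by D reverses the step.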

How do we derive the Genetic Correlation?
The Standardize operation converts a covariance matrix into a correlation matrix by dividing the covariance between variables 1 and 2 by the square root of the product of the variances of variable 1 and variable 2.
Matrix function in Mx: \stnd(A)

Or: R=\sqrt(I.A)~*A*\sqrt(I.A)~ ;
Here I is a 2 × 2 identity matrix, the dot (.) is the element-by-element product and ~ is the inverse. I.A keeps only the diagonal of A:
$$I\,.\,A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} \end{pmatrix}$$
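The standardization can be mirrored in plain Python as a check on the algebra. This sketch is not the Mx \stnd function itself; the function name `stnd` is borrowed for readability and the values in A are hypothetical.

```python
import math


def stnd(cov):
    """Convert a covariance matrix to a correlation matrix.

    Each covariance is divided by the product of the two standard
    deviations, i.e. the square roots of the matching diagonal elements.
    """
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


# Hypothetical additive genetic covariance matrix for two traits.
A = [[0.49, 0.28], [0.28, 0.41]]

R = stnd(A)
rg = R[0][1]  # genetic correlation between trait 1 and trait 2
print(f"rg = {rg:.3f}")
```

Applying the same function to the C and E covariance matrices would give rc and re, and applying it to P would give the predicted phenotypic correlation.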

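Cholesky ACE for 3 variables
[Path diagram: the first genetic factor loads on P1, P2 and P3 (x11, x21, x31), the second on P2 and P3 (x22, x32) and the third on P3 only (x33).]
    Begin Matrices;
     X LOW 3 3 FREE   ! Additive Genetic PATHS
     Y LOW 3 3 FREE   ! Common Env PATHS
     Z LOW 3 3 FREE   ! Unique Env PATHS
    End Matrices;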

[Diagram: the phenotypic correlation (rph) between traits X and Y, split into the parts due to A, due to C and due to E, in terms of h2x, h2y, rg, c2x, c2y and rc.]
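The formula behind this diagram is not recoverable from the transcript. The usual decomposition, written with the labels that do survive (h2x, h2y, rg, c2x, c2y, rc) and with e2x, e2y and re added by analogy as an assumption, is:

```latex
r_{ph} = \sqrt{h^2_x}\; r_g \,\sqrt{h^2_y}
       + \sqrt{c^2_x}\; r_c \,\sqrt{c^2_y}
       + \sqrt{e^2_x}\; r_e \,\sqrt{e^2_y}
```

Each term is the part of the phenotypic correlation that runs through one source: the standardized path from the first trait, the correlation between the latent factors, and the standardized path to the second trait.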