MRC SGDP Centre, Institute of Psychiatry, Psychology & Neuroscience

Slides:



Advertisements
Similar presentations
SEM PURPOSE Model phenomena from observed or theoretical stances
Advertisements

Structural Equation Modeling Using Mplus Chongming Yang Research Support Center FHSS College.
General Structural Equation (LISREL) Models
Structural Equation Modeling
Practical H:\ferreira\biometric\sgene.exe. Practical Aim Visualize graphically how allele frequencies, genetic effects, dominance, etc, influence trait.
Linear Regression and Binary Variables The independent variable does not necessarily need to be continuous. If the independent variable is binary (e.g.,
Regression Regression: Mathematical method for determining the best equation that reproduces a data set Linear Regression: Regression method applied with.
Structural Equation Modeling
Causal Modelling and Path Analysis. Some Notes on Causal Modelling and Path Analysis. “Path analysis is... superior to ordinary regression analysis since.
Biometrical genetics Manuel Ferreira Shaun Purcell Pak Sham Boulder Introductory Course 2006.
Biometrical genetics Manuel Ferreira Shaun Purcell Pak Sham Boulder Introductory Course 2006.
Variance and covariance M contains the mean Sums of squares General additive models.
Estimating “Heritability” using Genetic Data David Evans University of Queensland.
Genetic Theory Manuel AR Ferreira Egmond, 2007 Massachusetts General Hospital Harvard Medical School Boston.
Lecture 3: Resemblance Between Relatives. Heritability Central concept in quantitative genetics Proportion of variation due to additive genetic values.
Path Analysis Danielle Dick Boulder Path Analysis Allows us to represent linear models for the relationships between variables in diagrammatic form.
Multivariate Genetic Analysis: Introduction(II) Frühling Rijsdijk & Shaun Purcell Wednesday March 6, 2002.
Summarizing Variation Michael C Neale PhD Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.
ACDE model and estimability Why can’t we estimate (co)variances due to A, C, D and E simultaneously in a standard twin design?
“Ghost Chasing”: Demystifying Latent Variables and SEM
Structural Equation Modeling
Biometrical Genetics Pak Sham & Shaun Purcell Twin Workshop, March 2002.
David M. Evans Sarah E. Medland Developmental Models in Genetic Research Wellcome Trust Centre for Human Genetics Oxford United Kingdom Twin Workshop Boulder.
Path Analysis Frühling Rijsdijk SGDP Centre Institute of Psychiatry King’s College London, UK.
Introduction to Multivariate Genetic Analysis Kate Morley and Frühling Rijsdijk 21st Twin and Family Methodology Workshop, March 2008.
Path Analysis Frühling Rijsdijk. Biometrical Genetic Theory Aims of session:  Derivation of Predicted Var/Cov matrices Using: (1)Path Tracing Rules (2)Covariance.
LECTURE 16 STRUCTURAL EQUATION MODELING.
Structural Equation Modeling Intro to SEM Psy 524 Ainsworth.
Structural Equation Modeling Continued: Lecture 2 Psy 524 Ainsworth.
Structural Equation Models – Path Analysis
Correlation and Regression
Path Analysis HGEN619 class Method of Path Analysis allows us to represent linear models for the relationship between variables in diagrammatic.
Institute of Psychiatry King’s College London, UK
Univariate modeling Sarah Medland. Starting at the beginning… Data preparation – The algebra style used in Mx expects 1 line per case/family – (Almost)
Path Analysis. Path Diagram Single headed arrowruns from cause to effect Double headed bent arrow: correlation The model above assumes that all 5 variables.
Lecture 21: Quantitative Traits I Date: 11/05/02  Review: covariance, regression, etc  Introduction to quantitative genetics.
Genetic Theory Pak Sham SGDP, IoP, London, UK. Theory Model Data Inference Experiment Formulation Interpretation.
Module III Multivariate Analysis Techniques- Framework, Factor Analysis, Cluster Analysis and Conjoint Analysis Research Report.
SIMPLE LINEAR REGRESSION AND CORRELLATION
Lecture 1: Basic Statistical Tools. A random variable (RV) = outcome (realization) not a set value, but rather drawn from some probability distribution.
Mx modeling of methylation data: twin correlations [means, SD, correlation] ACE / ADE latent factor model regression [sex and age] genetic association.
Introduction to Genetic Theory
Introduction to Multivariate Genetic Analysis Danielle Posthuma & Meike Bartels.
March 7, 2012M. de Moor, Twin Workshop Boulder1 Copy files Go to Faculty\marleen\Boulder2012\Multivariate Copy all files to your own directory Go to Faculty\kees\Boulder2012\Multivariate.
Chapter 16 PATH ANALYSIS. Chapter 16 PATH ANALYSIS.
Multivariate Genetic Analysis (Introduction) Frühling Rijsdijk Wednesday March 8, 2006.
Extended Pedigrees HGEN619 class 2007.
Regression Models for Linkage: Merlin Regress
NORMAL DISTRIBUTIONS OF PHENOTYPES
Introduction to Multivariate Genetic Analysis
Re-introduction to openMx
NORMAL DISTRIBUTIONS OF PHENOTYPES
Path Analysis Danielle Dick Boulder 2008
Structural Equation Modeling
Univariate modeling Sarah Medland.
ACE Problems Greg Carey BGA, 2009 Minneapolis, MN.
CHAPTER- 17 CORRELATION AND REGRESSION
Correlation and Regression
Cause (Part II) - Causal Systems
Pak Sham & Shaun Purcell Twin Workshop, March 2002
Structural Equation Modeling
Sarah Medland faculty/sarah/2018/Tuesday
Product moment correlation
Lecture 3: Resemblance Between Relatives
Power Calculation for QTL Association
BOULDER WORKSHOP STATISTICS REVIEWED: LIKELIHOOD MODELS
Multivariate Genetic Analysis: Introduction
Structural Equation Modeling
Rater Bias & Sibling Interaction Meike Bartels Boulder 2004
Presentation transcript:

MRC SGDP Centre, Institute of Psychiatry, Psychology & Neuroscience Path Analysis Frühling Rijsdijk MRC SGDP Centre, Institute of Psychiatry, Psychology & Neuroscience King’s College London

Derivation of Predicted Var/Cov matrices Using: Twin Model Biometrical Genetic Theory Aim of session: Derivation of Predicted Var/Cov matrices Using: Path Tracing Rules Covariance Algebra Model Equations Path Diagrams In the classical Twin model, we make certain predictions about the extend to which individual differences in a trait are determined by genetic and environmental latent factors. The effects of these latent factors is based on biometrical genetic theory. The model can be expressed as a system of linear equations or represented by a path diagram. The ultimate goal of model-fitting is to see how well the predicted variance-covariance matrix of the model fits the observed variance-covariance matrix of the data. We will deal with that tomorrow. To derive the predicted var-cov matrix of the model, we can use : (i) either Covariance Algebra on the model equations or (ii) Path tracing on the path diagrams. The aim of this session is to show how the classical twin model is represented by path diagrams and how, by using both path tracing rules and covariance Algebra, we can derive predicted variances and covariances implied by the model. Covariance Algebra Path Tracing Rules Predicted Var/Cov from Model Observed Var/Cov from Data

Path Analysis Causal relationships <-> observed correlations Present linear relationships between variables in diagrams The relationships can also be represented as structural equations and covariance matrices All three forms are mathematically complete, it is possible to translate from one to the other Structural equation modelling (SEM) represents a unified platform for path analytic and variance components models Path analysis was developed around 1918 by geneticist Sewall Wright. Wright, S. (1921). "Correlation and causation". J. Agricultural Research 20: 557–585 This method is now widely applied to problems in genetics and the behavioural sciences Combines knowledge we have with regard to causal relations with degree of observed correlations This technique allows us to present linear relationships between variables in diagrams and to derive predictions for the variances and covariances of the variables under the specified model

In SEM expected relationships between observed variables are expressed by: A system of linear model equations or Path diagrams which allow the model to be represented in schematic form Both allow derivation of predicted variances and covariances of the variables under the specified model by using: (1) Path Tracing & (2) Covariance Algebra

Path Diagram Conventions Observed Variables Latent Variables In path diagrams we use square boxes to represent observed or measured variables and circles to represent unobserved or latent variables. The relationship among variables is represented by two kinds of arrows: a straight, single-headed arrow represents a causal relationship, and a curved two-headed arrow represents a covariance or correlational path. Causal Paths Covariance Paths

Path Diagrams for the Classical Twin Model

Model for an MZ PAIR e c a a c e 1 1 E C A A C E 1 1 1 1 1 1 e c a a c e Twin 1 Twin 2 This path diagram represents an ACE twin model for MZ pairs. The model represents specific hypotheses about (causal) relationships of latent genetic and environmental factors (single-headed arrows) on the observed variables and also about the correlations (curved arrows) between these latent factors between individuals in a pair. The meaning of ‘causal’ is the assumption that change in the variable at the tail of the arrow will result in change in the variable at the head of the arrow, with all other variables in the diagram held constant. The causal relationships represented by straight arrows are assumed to be linear. Variables that do not receive causal input from any one variable in the diagram are referred to as independent, source or predictor variables. Variables that do, are referred to as dependent variables. In general, only independent variables are connected by double-headed arrows. Because the latent variables have no scale, the double-headed arrow from the factor to itself is fixed to 1 in order to give the explained variance of A, C and E the same scale as the observed variable (e.g. height). The path coefficients a, c and e are the regression coefficient which tells us to what extent a change in the variable at the tail of the arrow is transmitted to the variable at the head of the arrow. Model for an MZ PAIR Note: a, c and e are the same cross twins

Model for a DZ PAIR e c a a c e 1 .5 E C A A C E 1 1 1 1 1 1 e c a a c e Twin 1 Twin 2 Model for a DZ PAIR Note: a, c and e are also the same cross groups

(1) Path Tracing The expected covariance between any two variables is the sum of all legitimate chains connecting the variables Since the variance of a variable is the covariance of the variable with itself, the expected variance will be the sum of all legitimate chains from the variable to itself The numerical value of a chain is the product of all traced path coefficients within the chain A legitimate chain is a path along arrows that follow 3 rules: Wright’s Rules The numerical value of each chain is the product of the path coefficients of all constituent arrows: both causal effects and correlations.  

(I) Trace backward, then forward, or simply forward from variable to variable, but NEVER forward then backward! Include double-headed arrows from the independent variables to itself (the variance) These variances are 1 for latent variables X A 1 a X Va A a

(II) You can pass through the same variable only once in a given chain of paths

(III) There is a maximum of one bi-directional path per chain. The double-headed arrow from the independent variable to itself is included, unless the chain includes another correlation path. X a Y b r Va Vb A B

Variance of Twin 1 AND Twin 2 (for MZ and DZ pairs) E C A 1 1 1 e c a Twin 1

Variance of Twin 1 AND Twin 2 (for MZ and DZ pairs) E C A 1 1 1 e c a Twin 1

Variance of Twin 1 AND Twin 2 (for MZ and DZ pairs) E C A 1 1 1 e c a Twin 1

Variance of Twin 1 AND Twin 2 (for MZ and DZ pairs) a*1*a = a2 E C A 1 1 1 + e c a Twin 1

Variance of Twin 1 AND Twin 2 (for MZ and DZ pairs) a*1*a = a2 E C A 1 1 1 + c*1*c = c2 e c a + e*1*e = e2 Twin 1 Total Variance = a2 + c2 + e2

Covariance Twin 1-2: MZ pairs

Covariance Twin 1-2: MZ pairs Total Covariance = a2 +

Covariance Twin 1-2: MZ pairs Total Covariance = a2 + c2

Predicted Var-Cov Matrices Tw1 Tw2 a2+c2+e2 a2+c2 Cov MZ Tw1 Tw2 a2+c2+e2 ½a2+c2 Cov DZ

ADE Model e d a a d e E D A A D E Twin 1 Twin 2 1(MZ) / .25 (DZ) 1/.5 Why ¼? This will hopefully have been explained in previous session. Given the 4 possible offspring genotypes Amam, Afaf, Afam, afAm (m=mother/f=father) two siblings (or DZ twin pairs) will have 16 possible genotype combinations. If we tabulate these 16, you can see that we can identify 3 distinct combinations: When sibs share 2 alleles IBD (from same parent), left diagonal, 4 cells; P2= 4/16 = 1/4 When sibs share 0 alleles IBD, right diagonal, 4 cells; P0 = 4/16 = ¼ When sibs share 1 alleles IBD, rest of the cells, 8 cells; P1 = 8/16 = 1/2 But why is P2 used for the Dominance Genetic Variance contribution to correlations across individuals? Since D is the interaction effect between alleles at the same locus, these interaction effects can only be correlated across relatives if the genotypes are exactly the same, i.e. IBD 2. This is ¼ in sibs and DZ twins and 1 in MZ twins. Since the prob of IBD=2 is 0 in all other relations, Dominance does not contribute to the genetic covariance (Covg = Ra * Va + P2*Vd). Twin 1 Twin 2

Predicted Var-Cov Matrices Tw1 Tw2 a2+d2+e2 a2+d2 Cov MZ Tw1 Tw2 a2+d2+e2 ½a2+¼d2 Cov DZ ½a2+¼d2

Three Fundamental Covariance Algebra Rules Var (X) = Cov(X,X) Cov (aX,bY) = ab Cov(X,Y) Cov (X,Y+Z) = Cov (X,Y) + Cov (X,Z)

Example 1 A Y Var(Y) = Var(aA) = Cov(aA,aA) = a2 Cov(A,A) = a2 Var(A) Y = aA The variance of a dependent variable (Y) caused by independent variable A, is the squared regression coefficient multiplied by the variance of the independent variable

Example 2 Y A Z Cov(Y,Z) = Cov(aA,aA) = a2 Cov(A,A) = a2 *.5 = .5a2 a Z = aA Y a Y = aA A Z 1 .5 Cov(Y,Z) = Cov(aA,aA) = a2 Cov(A,A) = a2 *.5 = .5a2