F:\sarah\fri_MV. Multivariate Linkage and Association Sarah Medland and Manuel Ferreira -with heavy borrowing from Kate Morley and Frühling Rijsdijk.

Slides:



Advertisements
Similar presentations
Bivariate analysis HGEN619 class 2007.
Advertisements

Fitting Bivariate Models October 21, 2014 Elizabeth Prom-Wormley & Hermine Maes
Multivariate Mx Exercise D Posthuma Files: \\danielle\Multivariate.
Linkage Analysis: An Introduction Pak Sham Twin Workshop 2001.
Path Analysis Danielle Dick Boulder Path Analysis Allows us to represent linear models for the relationships between variables in diagrammatic form.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem, random variables, pdfs 2Functions.
Multivariate Analysis Nick Martin, Hermine Maes TC21 March 2008 HGEN619 10/20/03.
Multivariate Genetic Analysis: Introduction(II) Frühling Rijsdijk & Shaun Purcell Wednesday March 6, 2002.
Association analysis Shaun Purcell Boulder Twin Workshop 2004.
Introduction to Linkage
Multivariate Analysis Hermine Maes TC19 March 2006 HGEN619 10/20/03.
Univariate Analysis Hermine Maes TC19 March 2006.
Gene x Environment Interactions Brad Verhulst (With lots of help from slides written by Hermine and Liz) September 30, 2014.
Mx Practical TC18, 2005 Dorret Boomsma, Nick Martin, Hermine H. Maes.
Introduction to Multivariate Genetic Analysis Kate Morley and Frühling Rijsdijk 21st Twin and Family Methodology Workshop, March 2008.
Raw data analysis S. Purcell & M. C. Neale Twin Workshop, IBG Colorado, March 2002.
Linkage Analysis in Merlin
Statistical Power Calculations Boulder, 2007 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.
Copy the folder… Faculty/Sarah/Tues_merlin to the C Drive C:/Tues_merlin.
Introduction to Multivariate Genetic Analysis (2) Marleen de Moor, Kees-Jan Kan & Nick Martin March 7, 20121M. de Moor, Twin Workshop Boulder.
Univariate modeling Sarah Medland. Starting at the beginning… Data preparation – The algebra style used in Mx expects 1 line per case/family – (Almost)
Gene Mapping Quantitative Traits using IBD sharing References: Introduction to Quantitative Genetics, by D.S. Falconer and T. F.C. Mackay (1996) Longman.
Linkage in selected samples Manuel Ferreira QIMR Boulder Advanced Course 2005.
Longitudinal Modeling Nathan Gillespie & Dorret Boomsma \\nathan\2008\Longitudinal neuro_f_chol.mx neuro_f_simplex.mx jepq6.dat.
Power and Sample Size Boulder 2004 Benjamin Neale Shaun Purcell.
Power of linkage analysis Egmond, 2006 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.
The importance of the “Means Model” in Mx for modeling regression and association Dorret Boomsma, Nick Martin Boulder 2008.
Regression-Based Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
Lecture 3: Statistics Review I Date: 9/3/02  Distributions  Likelihood  Hypothesis tests.
Univariate Analysis Hermine Maes TC21 March 2008.
Linkage and association Sarah Medland. Genotypic similarity between relatives IBS Alleles shared Identical By State “look the same”, may have the same.
Errors in Genetic Data Gonçalo Abecasis. Errors in Genetic Data Pedigree Errors Genotyping Errors Phenotyping Errors.
Epistasis / Multi-locus Modelling Shaun Purcell, Pak Sham SGDP, IoP, London, UK.
Mx modeling of methylation data: twin correlations [means, SD, correlation] ACE / ADE latent factor model regression [sex and age] genetic association.
Lecture 23: Quantitative Traits III Date: 11/12/02  Single locus backcross regression  Single locus backcross likelihood  F2 – regression, likelihood,
Means, Thresholds and Moderation Sarah Medland – Boulder 2008 Corrected Version Thanks to Hongyan Du for pointing out the error on the regression examples.
Powerful Regression-based Quantitative Trait Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
Mx Practical TC20, 2007 Hermine H. Maes Nick Martin, Dorret Boomsma.
David M. Evans Multivariate QTL Linkage Analysis Queensland Institute of Medical Research Brisbane Australia Twin Workshop Boulder 2003.
Frühling Rijsdijk & Kate Morley
Categorical Data Frühling Rijsdijk 1 & Caroline van Baal 2 1 IoP, London 2 Vrije Universiteit, A’dam Twin Workshop, Boulder Tuesday March 2, 2004.
A simple method to localise pleiotropic QTL using univariate linkage analyses of correlated traits Manuel Ferreira Peter Visscher Nick Martin David Duffy.
Welcome  Log on using the username and password you received at registration  Copy the folder: F:/sarah/mon-morning To your H drive.
Linkage in Mx & Merlin Meike Bartels Kate Morley Hermine Maes Based on Posthuma et al., Boulder & Egmond.
Copy folder (and subfolders) F:\sarah\linkage2. Linkage in Mx Sarah Medland.
Efficient calculation of empirical p- values for genome wide linkage through weighted mixtures Sarah E Medland, Eric J Schmitt, Bradley T Webb, Po-Hsiu.
More on thresholds Sarah Medland. A plug for OpenMx? Very few packages can handle ordinal data adequately… OpenMx can also be used for more than just.
Introduction to Multivariate Genetic Analysis Danielle Posthuma & Meike Bartels.
Developmental Models/ Longitudinal Data Analysis Danielle Dick & Nathan Gillespie Boulder, March 2006.
QTL Mapping Using Mx Michael C Neale Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.
March 7, 2012M. de Moor, Twin Workshop Boulder1 Copy files Go to Faculty\marleen\Boulder2012\Multivariate Copy all files to your own directory Go to Faculty\kees\Boulder2012\Multivariate.
Multivariate Genetic Analysis (Introduction) Frühling Rijsdijk Wednesday March 8, 2006.
Regression Models for Linkage: Merlin Regress
Ordinal Data Sarah Medland.
Bivariate analysis HGEN619 class 2006.
Introduction to Multivariate Genetic Analysis
Re-introduction to openMx
Genome Wide Association Studies using SNP
Ordinal data, matrix algebra & factor analysis
More on thresholds Sarah Medland.
Univariate modeling Sarah Medland.
Linkage in Selected Samples
GxE and GxG interactions
Why general modeling framework?
(Re)introduction to Mx Sarah Medland
Sarah Medland faculty/sarah/2018/Tuesday
Univariate Linkage in Mx
Power Calculation for QTL Association
Multivariate Genetic Analysis: Introduction
Presentation transcript:

F:\sarah\fri_MV

Multivariate Linkage and Association Sarah Medland and Manuel Ferreira -with heavy borrowing from Kate Morley and Frühling Rijsdijk

MV analyses can address Questions of common aetiology Same gene (snp) Co-incidental covariation due to LD between two different genes Co-variation due to shared social/environmental risk factors Aust. Albatross

MV analyses can address Questions of common aetiology Same gene (snp) Co-incidental covariation due to LD between two different genes Co-variation due to shared social/environmental risk factors Pleiotropy occurs when a single gene influences multiple phenotypic traits. Bilby

Studying multiple phenotypes… Run multiple univariate analyses on correlated traits Bandicoot

Studying multiple phenotypes… Run multiple univariate analyses Correct for multiple testing… Bonferroni  Correction for equivalent number of independent variables Doesn’t really address the idea of common aetiology Blue ringed octopus

Studying multiple phenotypes… Run multiple univariate analyses Try and determine if the coincident linkage/association is statistically unlikely Box jellyfish

Studying multiple phenotypes… Run multiple univariate analyses Try and determine if the coincident linkage/association is statistically unlikely Simulate/Permute data and assess how often this group of traits reaches this pattern of sig. by chance Brown Snake

Studying multiple phenotypes… Study a number of proxies  Cockatoo

Studying multiple phenotypes… Make a composite phenotype  Cuscus

Studying multiple phenotypes… Make a factor score combine both factor level and trait- specific effects latent factor effects are inherently pleiotropic residual effects are not assumes factor loadings are constant across genome Cuscus (again)

Alternative… Explicitly model the covariation between traits Traits may be correlated due to shared genetic factors (A) or shared environmental factors (C or E) ADHD E1E1 E2E2 A1A1 A2A2 a 11 2 IQ rGrG Q1Q1 Q2Q rErE q 11 2 a 22 2 q 22 2 e 11 2 e 22 2 Dingo

Explicitly model the covariation between traits… Directly assess pleiotropy Increased power Esp. when the pattern of QTL effects is different from the background covariation ie positive r but gene effects are in opposite directions Reduced multiple testing Drop Bear

Multivariate analysis… In the context of linkage analysis when we have family data two traits measured in twin pairs Interested in: Cross-trait covariance within individuals Cross-trait covariance between twins MZ:DZ ratio of cross-trait covariance between twins Echidna

Observed Covariance Matrix Phenotype 1Phenotype 2Phenotype 1Phenotype 2 Phenotype 1 Variance P1 Phenotype 2 Covariance P1-P2 Variance P2 Phenotype 1 Within-trait P1 Cross-traitVariance P1 Phenotype 2 Cross-traitWithin-trait P2 Covariance P1-P2 Variance P2 Twin 1 Twin 2

Observed Covariance Matrix Phenotype 1Phenotype 2Phenotype 1Phenotype 2 Phenotype 1 Variance P1 Phenotype 2 Covariance P1-P2 Variance P2 Phenotype 1 Within-trait P1 Cross-traitVariance P1 Phenotype 2 Cross-traitWithin-trait P2 Covariance P1-P2 Variance P2 Twin 1 Twin 2 Within-twin covariance

Within-Twin Covariances (A) A1A1 A2A2 a 11 a 21 a Phenotype 1Phenotype 2 Phenotype 1 a c e 11 2 Phenotype 2 a 11 a 21 +c 11 c 21 +e 11 e 21 a a c c e e 21 2 Twin 1 Phenotype 1 Twin 1 Phenotype 2

Within-Twin Covariances (A) A1A1 A2A2 a 11 a 21 a Phenotype 1Phenotype 2 Phenotype 1 a c e 11 2 Phenotype 2 a 11 a 21 +c 11 c 21 +e 11 e 21 a a c c e e 21 2 Twin 1 Phenotype 1 Twin 1 Phenotype 2

Within-Twin Covariances (A) A1A1 A2A2 a 11 a 21 a Phenotype 1Phenotype 2 Phenotype 1 a c e 11 2 Phenotype 2 a 11 a 21 +c 11 c 21 +e 11 e 21 a a c c e e 21 2 Twin 1 Phenotype 1 Twin 1 Phenotype 2

Within-Twin Covariances (A) A1A1 A2A2 a 11 a 21 a Phenotype 1Phenotype 2 Phenotype 1 a c e 11 2 Phenotype 2 a 11 a 21 +c 11 c 21 +e 11 e 21 a a c c e e 21 2 Twin 1 Phenotype 1 Twin 1 Phenotype 2

Within-Twin Covariances (Q) Q1Q1 q 11 q 21 1 Phenotype 1Phenotype 2 Phenotype 1 a q e 11 2 Phenotype 2 a 11 a 21 +q 11 q 21 +e 11 e 21 a a q e e 21 2 Twin 1 Phenotype 1 Twin 1 Phenotype 2

Within-Twin Covariances (E) Phenotype 1Phenotype 2 Phenotype 1 a q e 11 2 Phenotype 2 a 11 a 21 +q 11 q 21 +e 11 e 21 a a q e e 21 2 Twin 1 Phenotype 1 E1E1 E2E2 e 11 e 21 e Twin 1 Phenotype 2

Observed Covariance Matrix Phenotype 1Phenotype 2Phenotype 1Phenotype 2 Phenotype 1 Variance P1 Phenotype 2 Covariance P1-P2 Variance P2 Phenotype 1 Within-trait P1 Cross-traitVariance P1 Phenotype 2 Cross-traitWithin-trait P2 Covariance P1-P2 Variance P2 Twin 1 Twin 2 Within-twin covariance Cross-twin covariance

Cross-Twin Covariances (A) Twin 1 Phenotype 1 A1A1 A2A2 a 11 a 21 a Twin 1 Phenotype 2 Twin 2 Phenotype 1 A1A1 A2A2 a 11 a 21 a Twin 2 Phenotype 2 1/0.5 Phenotype 1Phenotype 2 Phenotype 1 Phenotype 2 +c 11 c 21 +c c 21 2 Twin 1 Twin 2

Cross-Twin Covariances (A) Twin 1 Phenotype 1 A1A1 A2A2 a 11 a 21 a Twin 1 Phenotype 2 Twin 2 Phenotype 1 A1A1 A2A2 a 11 a 21 a Twin 2 Phenotype 2 1/0.5 Phenotype 1Phenotype 2 Phenotype 1 1/0.5a Phenotype 2 +c 11 c 21 +c c 21 2 Twin 1 Twin 2

Cross-Twin Covariances (A) Twin 1 Phenotype 1 A1A1 A2A2 a 11 a 21 a Twin 1 Phenotype 2 Twin 2 Phenotype 1 A1A1 A2A2 a 11 a 21 a Twin 2 Phenotype 2 1/0.5 Phenotype 1Phenotype 2 Phenotype 1 1/0.5a c 11 2 Phenotype 2 1/0.5a 11 a 21 +c 11 c 21 +c c 21 2 Twin 1 Twin 2

Cross-Twin Covariances (A) Twin 1 Phenotype 1 A1A1 A2A2 a 11 a 21 a Twin 1 Phenotype 2 Twin 2 Phenotype 1 A1A1 A2A2 a 11 a 21 a Twin 2 Phenotype 2 1/0.5 Phenotype 1Phenotype 2 Phenotype 1 1/0.5a c 11 2 Phenotype 2 1/0.5a 11 a 21 +c 11 c 21 1/0.5(a a 21 2 )+c c 21 2 Twin 1 Twin 2

Cross-Twin Covariances (Q) Twin 1 Phenotype 1 Twin 1 Phenotype 2 Q1Q1 q 11 q 21 1 Twin 2 Phenotype 1 Twin 2 Phenotype 2 Q1Q1 q 11 q 21 1 π Phenotype 1Phenotype 2 Phenotype 1 1/0.5a πq 11 2 Phenotype 2 1/0.5a 11 a 21 + π q 11 q 21 1/0.5(a a 21 2 )+ π q 21 2 Twin 1 Twin 2

Predicted Model Phenotype 1Phenotype 2Phenotype 1Phenotype 2 Phenotype 1 a q e 11 2 Phenotype 2 a 11 a 21 +q 11 q 21 +e 11 e 21 a a q e e 21 2 Phenotype 1 1/.5a π q 11 2 a q e 11 2 Phenotype 2 1/.5a 11 a 21 + πq 11 q 21 1/.5(a a 21 2 )+ π q 21 2 a 11 a 21 +q 11 q 21 +e 11 e 21 a a q e e 21 2 Twin 1 Twin 2 Within-twin covariance Cross-twin covariance

Running MV linkage analysis… Numerous programs Mx Solar Loki Merlin Repeated measures Computationally intensive Multiple boundary issues p-values difficult to obtain Emu (not e-moo)

MV association… Maximum likelihood – factor based approach Mx Canonical Correlation approach Plink Principal components approach F-bat Funnel Web Spider

Maximum likelihood approach Galah Unrelated individuals Shared variance due to a common factor Residual non-shared variance Family based data ACE type models

Common factor model Great white Phenotype 1 F1F1 E1E1 E2E2 f 11 f 21 e 11 e Phenotype 2 Phenotype 3Phenotype 4 1 f 31 f 41 E3E3 e 33 1 E4E4 e 44 1

Factor level association Estimate a factor level beta Use the factor loadings as weights Add the uncorrected or grand mean 1 df Green Tree Frog

Factor level association Green Tree Frog Dot product see page 61 of the Mx manual

Factor level association Green Tree Frog Kronecker product see page 61 of the Mx manual

Variable specific association Estimate a separate beta for each trait Add the uncorrected or grand mean n df Kookaburra

Simulated data set 10 traits, Moderately correlated ~.4 1 snp, MAF individuals Kingfisher

factorlevel.mx… dataset.ped Example Factor level association script - Boulder 2009 Sarah Medland Data NI=15 NGroups=1 Rec file=dataset.ped Labels fid iid a1 a2 genotype V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 Select genotype V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ; Definition genotype ; Begin Matrices; F full 10 1 free ! factor R diag free! residuals M full 10 1 free! grand means B full 1 1 free! association beta G full 1 1 ! genotype End Matrices; Numbat

factorlevel.mx… dataset.ped st.6 F 1 1 to F 10 1 st.4 R 1 1 to R st 0 M 1 1 to M 10 1 sp G genotype Covariance (F*F')+(R*R') ; Means M + ; Options multiple issat jiggle end drop B end Perentie Monitor

variablespecific.mx… dataset.ped Example Factor level association script - Boulder 2009 Sarah Medland Data NI=15 NGroups=1 Rec file=dataset.ped Labels fid iid a1 a2 genotype V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 Select genotype V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ; Definition genotype ; Begin Matrices; F full 10 1 free ! factor R diag free! residuals M full 10 1 free! grand means B full 10 1 free! association beta G full 1 1 ! genotype End Matrices; Quoll

variablespecific.mx… dataset.ped st.6 F 1 1 to F 10 1 st.4 R 1 1 to R st 0 M 1 1 to M 10 1 sp G genotype Covariance (F*F')+(R*R') ; Means M + ; Options multiple issat jiggle end drop B to B end Red back spider

Your task Run both FL and VS tests in Mx for the first data set Edit the data file name Calculate the p-value for the VS test using excel… Which variables are associated? What is the mean for variable 1 by genotype?

Results – factor level Difference Chi-squared >> Difference d.f. >>>>>>>>> 1 Probability >>>>>>>>>>>> Pademelon FBM

Association with a factor score… Data set 1 Pigmy Possum

Results – variable specific Difference Chi-squared Difference d.f. >>>> 10 Probability >>>>>>>> 1.2*10 -6 Rainbow Lorikeet FBM

Mean under different genotypes Salt water croc AA 0 AB 1 BB

Advantages to the ML approach Completely flexible Can be applied to any model Longitudinal models Simplex/Autoregressive processed Easy to add dominance etc Covariates Extends to family data Stone fish

Disadvantages Correction for multiple testing? FL & VS tests provide complementary information Inflation of type 1 error use a Bonferroni correction if you use both Taipan

Tassie Devil Tassie Tiger Tiger Snake Thorny Devil Tree Kangaroo Wombat Zebra finch