QTL Mapping Using Mx Michael C Neale Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.

Presentation transcript:

Overview
- Alternative approach
- Linkage as Mixture
- Univariate/Multivariate
- One/more loci
- Practical considerations
- Power
  - Pihat vs covs
  - Larger Sibships

Schematic of Genome
[Figure: Marker 1, Marker 2, Marker 3 and Marker 4 along the genome with a QTL among them; labels d1, d2, d3, d4.]

Genetic Heterogeneity
Sib pairs IBD at a locus, parents AB and CD

      AC  AD  BC  BD
AC     2   1   1   0
AD     1   2   0   1
BC     1   0   2   1
BD     0   1   1   2
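The IBD table for the AB x CD mating can be reproduced by counting shared allele labels. This is a minimal sketch, assuming all four parental alleles are distinguishable (so a shared label means an allele shared IBD); the names `ibd_sharing`, `sib_genotypes` and `table` are illustrative and not from the slides.

```python
from itertools import product


def ibd_sharing(geno_a, geno_b):
    """Number of alleles two sibs share IBD (0, 1 or 2).

    Assumes all four parental alleles carry distinct labels, as in AB x CD.
    """
    return len(set(geno_a) & set(geno_b))


# Each child receives one allele from parent AB and one from parent CD.
sib_genotypes = ["".join(pair) for pair in product("AB", "CD")]

# Row g1 holds the IBD sharing of genotype g1 with AC, AD, BC, BD in turn.
table = {g1: [ibd_sharing(g1, g2) for g2 in sib_genotypes] for g1 in sib_genotypes}

for genotype, row in table.items():
    print(genotype, row)
```

Each of the 16 cells is equally likely, which is where the 4/16, 8/16, 4/16 split for IBD 2, 1 and 0 comes from.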

Pi hat approach
1. Pick a putative QTL location
2. Compute p(IBD0), p(IBD1), p(IBD2) given marker data [Mapmaker/sibs]
3. Compute $\hat{\pi} = p(\mathrm{IBD2}) + 0.5\,p(\mathrm{IBD1})$
4. Fit model
Repeat 1-4 as necessary for different locations
Elston & Stewart
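Step 3 is a one-line calculation. The sketch below shows it together with the sib covariance it implies. The function names are hypothetical, and the covariance form F + pihat * Q is an assumption taken from the variance components (F familial, Q QTL) used in the Mx script later in the talk.

```python
def pi_hat(p_ibd0, p_ibd1, p_ibd2):
    """Estimated proportion of alleles shared IBD at the putative QTL location."""
    total = p_ibd0 + p_ibd1 + p_ibd2
    if abs(total - 1.0) > 1e-6:
        raise ValueError("IBD probabilities must sum to 1")
    return p_ibd2 + 0.5 * p_ibd1


def expected_sib_covariance(f_var, q_var, pihat):
    """Assumed model: familial variance plus pihat times the QTL variance."""
    return f_var + pihat * q_var


# An uninformative marker gives the prior sharing of one half.
print(pi_hat(0.25, 0.5, 0.25))
```

In step 4 each sib pair would then get its own expected covariance, with `pihat` supplied as a per-family definition variable.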

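Major QTL effects
DZ twins
[Path diagram: latent factors A, C, D, E and Q for each twin (A1 C1 D1 E1 Q1; A2 C2 D2 E2 Q2) with paths to phenotypes P1 and P2. Cross-twin correlations: .5, 1 and .25 among the A, C and D factors, and $\hat{\pi}$ between Q1 and Q2.]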

Normal Theory Likelihood Function
For raw data in Mx

$$\ln L_i = f_i \ln\left[\sum_{j=1}^{m} w_j\, g(x_i, \mu_{ij}, \Sigma_{ij})\right]$$

- $x_i$: vector of observed scores on n subjects
- $\mu_{ij}$: vector of predicted means
- $\Sigma_{ij}$: matrix of predicted covariances (functions of parameters)

General Likelihood Function
Things that may differ over subjects, i = 1....n subjects (families):
- Model for Means can differ
- Model for Covariances can differ
- Weights can differ
- Frequencies can differ

$$\ln L_i = f_i \ln\left[\sum_{j=1}^{m} w_{ij}\, g(x_i, \mu_{ij}, \Sigma_{ij})\right]$$

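Normal distribution $N(\mu_{ij}, \Sigma_{ij})$
Likelihood is height of the curve
[Figure: normal curve N with mean $\mu$ and covariance $\Sigma$; the likelihood is the height of the curve at $x_i$.]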

Weighted mixture of models
Finite mixture distribution
- j = 1....m models
- $w_{ij}$: weight for subject i, model j
- e.g., Segregation analysis

$$\ln L_i = f_i \ln\left[\sum_{j=1}^{m} w_{ij}\, g(x_i, \mu_{ij}, \Sigma_{ij})\right]$$

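Mixture of Normal Distributions
Two normals, proportions w1 & w2, different means
But Likelihood Ratio not Chi-Squared - what is it?
[Figure: two normal curves g with means $\mu_1$ and $\mu_2$; at $x_i$ the two components contribute $w_1 \times l_1$ and $w_2 \times l_2$.]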

Weighted Likelihood Method
1. Pick a putative QTL location
2. Compute p(IBD0), p(IBD1), p(IBD2) given marker data; these are "WEIGHTS"
3. Compute likelihood of phenotype data under each of 3 IBD conditions
4. Maximize weighted likelihood of 3
Repeat 1-4 as necessary for different locations
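Steps 2 and 3 can be sketched for a single trait measured on sib pairs. This is an illustration under stated assumptions, not the Mx implementation: both sibs share one mean and one total variance F + Q + E, and the sib covariance is F, F + 0.5 Q or F + Q under IBD 0, 1 or 2. The function names and the data layout (`x1`, `x2`, and a weight triple per pair) are hypothetical.

```python
import math


def bivariate_normal_pdf(x1, x2, mean, var, cov):
    """Density of a sib pair with common mean, common variance and covariance cov."""
    det = var * var - cov * cov
    d1, d2 = x1 - mean, x2 - mean
    # Quadratic form d' Sigma^{-1} d for the 2 x 2 matrix [[var, cov], [cov, var]].
    quad = (var * d1 * d1 - 2.0 * cov * d1 * d2 + var * d2 * d2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def mixture_loglik(pairs, mean, f_var, q_var, e_var):
    """Sum over families of ln[sum_j w_ij g(x_i, mu, Sigma_j)], with f_i = 1.

    pairs: iterable of (x1, x2, (p_ibd0, p_ibd1, p_ibd2)).
    """
    total_var = f_var + q_var + e_var
    sib_cov = (f_var, f_var + 0.5 * q_var, f_var + q_var)  # IBD 0, 1, 2
    loglik = 0.0
    for x1, x2, weights in pairs:
        lik = sum(
            w * bivariate_normal_pdf(x1, x2, mean, total_var, c)
            for w, c in zip(weights, sib_cov)
        )
        loglik += math.log(lik)
    return loglik


# Two made-up sib pairs with made-up IBD probabilities.
example_pairs = [(0.4, 0.1, (0.1, 0.3, 0.6)), (-1.2, 0.9, (0.7, 0.2, 0.1))]
print(mixture_loglik(example_pairs, mean=0.0, f_var=0.3, q_var=0.2, e_var=0.5))
```

Step 4 would pass `mixture_loglik` to a numerical optimizer over the mean and the three variance components; the sketch only evaluates the function at given values.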

Mixture method
Add them up
[Figure: three copies of the sib-pair path diagram (A1 C1 D1 E1 Q1 to P1; A2 C2 D2 E2 Q2 to P2), one per IBD state, weighted by p(IBD0), p(IBD1) and p(IBD2).]

Dataset structure
Rectangular format:
  Id sex age P1 P2 IBD0 IBD1 IBD2 IBD0 IBD1 IBD2
(one IBD0 IBD1 IBD2 triple per locus, starting with Locus 1)
Missing data:
- Phenotypes: ML
- Markers: Listwise

Mx Script Mixture method
  !QTL analysis via Mixture Distribution method
  !Using marker1
  !Using DZ twins only
  !Analysis of LDL
  !Dutch Adults
  #define nvar 1 !different for multivariate
  #define nsib 2 !number of siblings
  #NGroups=2

Mx Script Mixture part 2
  G1: Parameter Estimates
  Calculation
  Begin Matrices;
    X Lower nvar nvar Free !familial background
    Z Lower nvar nvar Free !unique environment
    L Full 1 1 Free !QTL effect
    M Full 1 nvar Free !means
    H Full 1 1
  End Matrices;
  Matrix H .5
  Begin Algebra;
    F= X*X'; !familial variance
    E= Z*Z'; !unique environmental variance
    Q= L*L'; !variance due to QTL
    V= F+Q+E; !total variance
    T= F|Q|E; !parameters in one matrix for standardizing
    S= !standardized variance component estimates
  End Algebra;
  Labels Row S standest
  Labels Col S f^2 q^2 e^2
  Labels Row T unstandest
  Labels Col T f^2 q^2 e^2
  End

Mx Script
  G2: Dizygotic twins
  #include lipiddzmix.dat
  Select ibd0m1 ibd1m1 ibd2m1 ldl1 ldl2;
  Definition ibd0m1 ibd1m1 ibd2m1;
  Begin Matrices = Group 1;
    K Full 3 1 !IBD probabilities (from Merlin)
    U Unit 3 2
  End Matrices;
  Specify K ibd0m1 ibd1m1 ibd2m1
  Means
  Covariance
    F+Q+E | F _
    F | F+Q+E _ ! IBD 0 Covariance matrix
    F+Q+E | F+H@Q _
    F+H@Q | F+Q+E _ ! IBD 1 Covariance matrix
    F+Q+E | F+Q _
    F+Q | F+Q+E; ! IBD 2 Covariance matrix
  Weights K; ! IBD probabilities
  Start 1 All
  Start 2.8 M
  Option NDecimals=3
  Option Multiple Issat
  End

Mx Script Mixture part 4
  ! Test significance of QTL effect
  Drop L
  End

Output Pihat Method
  Summary of VL file data for group 1
  Code Number Mean Variance
  MATRIX F
  This is a LOWER TRIANGULAR matrix of order 1 by
  MATRIX Q
  This is a FULL matrix of order 1 by
[The numeric values of this output did not survive in the transcript.]

Output
QTL Effect Present
  Your model has 4 estimated parameters and 950 Observed statistics
  -2 times log-likelihood of data >>>
  Degrees of freedom >>>>>>>>>>>>>>>> 946
QTL Effect Absent
  Your model has 3 estimated parameters and 950 Observed statistics
  -2 times log-likelihood of data >>>
  Degrees of freedom >>>>>>>>>>>>>>>> 947
Difference chi-squared = (1 df)

Output Pihat Method
QTL Effect Present
  Your model has 4 estimated parameters and 950 Observed statistics
  -2 times log-likelihood of data >>>
  Degrees of freedom >>>>>>>>>>>>>>>> 946
QTL Effect Absent
  Your model has 3 estimated parameters and 950 Observed statistics
  -2 times log-likelihood of data >>>
  Degrees of freedom >>>>>>>>>>>>>>>> 947
Difference chi-squared = (1 df)
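The comparison in these outputs is a likelihood-ratio test between the model with and without the QTL effect. The sketch below shows the arithmetic only. The -2 log-likelihood values in the example are invented for illustration (the transcript does not preserve the real ones), the function names are hypothetical, and the difference is treated as a 1 df chi-squared because that is how the output labels it.

```python
import math


def degrees_of_freedom(n_observed_statistics, n_parameters):
    """Degrees of freedom as reported in the output: statistics minus parameters."""
    return n_observed_statistics - n_parameters


def lrt_one_df(minus2ll_without_qtl, minus2ll_with_qtl):
    """Return (chi-squared difference, 1 df p-value, equivalent LOD score)."""
    chi_sq = minus2ll_without_qtl - minus2ll_with_qtl
    # Upper tail of a 1 df chi-squared: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(chi_sq, 0.0) / 2.0))
    # LOD uses base-10 logs: LOD = chi-squared / (2 ln 10).
    lod = chi_sq / (2.0 * math.log(10.0))
    return chi_sq, p_value, lod


print(degrees_of_freedom(950, 4), degrees_of_freedom(950, 3))

# Hypothetical fit values, NOT taken from the slides.
chi_sq, p_value, lod = lrt_one_df(1003.84, 1000.00)
print(round(chi_sq, 2), round(p_value, 3), round(lod, 2))
```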

Summary
- SEM - QTL direct relationship
- Mx graphical/script approaches
- Mixture vs Pihat
- Multivariate treatment
- Multilocus
- Missing Data
- Ascertainment

How much more power?
- Large sibships much more powerful
- Dolan et al 1999
- Pihat simple with large sibships
  - Solar, Genehunter etc
- Pihat shows substantial bias with missing data

Expected IBD Frequencies
Sibships of size 2

Type  Configuration  Frequency
1     2              4/16
2     1              8/16
3     0              4/16

Expected IBD Frequencies
Sibships of size 3

Type  Configuration  Frequency
1     222            4/64
[Rows 2 to 10 of this table did not survive in the transcript; only the /64 denominators remain.]
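The expected frequencies for sibships of size 2 and 3 can be regenerated by brute-force enumeration. This is a minimal sketch, assuming fully informative parents (four distinguishable alleles) and independent, equally likely transmissions. The function name `ibd_configurations` is hypothetical, and a configuration is written as the tuple of pairwise IBD counts in the order given by `itertools.combinations`.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product


def ibd_configurations(n_sibs):
    """Map each pairwise-IBD configuration to its expected frequency."""
    # A transmission is (paternal allele index, maternal allele index).
    transmissions = list(product((0, 1), repeat=2))
    pairs = list(combinations(range(n_sibs), 2))
    counts = Counter()
    for family in product(transmissions, repeat=n_sibs):
        # IBD count for a pair = matching paternal allele + matching maternal allele.
        config = tuple(
            int(family[i][0] == family[j][0]) + int(family[i][1] == family[j][1])
            for i, j in pairs
        )
        counts[config] += 1
    total = 4 ** n_sibs
    return {config: Fraction(n, total) for config, n in counts.items()}


for size in (2, 3):
    freqs = ibd_configurations(size)
    print(size, len(freqs))
    for config, freq in sorted(freqs.items(), reverse=True):
        print("  ", "".join(map(str, config)), freq)
```

`Fraction` prints reduced values, so 4/16 appears as 1/4 and 4/64 as 1/16. The number of distinct configurations is `len(freqs)`, which grows quickly with sibship size because the loop visits 4 ** n_sibs families.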

More power in large sibships
Dolan, Neale & Boomsma (2000)
[Figure: power plot; legend: + Size 2, o Size 3, * Size 4.]

Number of IBD Combinations
As a function of number of sibs in family

Sibship Size  Number of combinations
[The rows of this table did not survive in the transcript.]

Mixture Approach for Pedigrees
Some ideas
- Iterate configurations within families
- Only use non-zero IBD probabilities
- Set threshold?
- Improves with genotype data
- Allows moderated genotypes

Strategy 2
- Families within combinations
- Limited # of IBD configurations
- Depends on max sibship size
- Usually Faster
  - Can do missing data
  - Cannot do moderator variables

Multivariate QTL
Vectors of variables, Matrices of paths
Three component mixture
[Path diagram: latent factors A1 C1 D1 E1 Q1 to P1 and A2 C2 D2 E2 Q2 to P2, with $\hat{\pi}$ between Q1 and Q2.]

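Two locus model
[Path diagram: latent factors R1 C1 A1 E1 Q1 to P1 and R2 C2 A2 E2 Q2 to P2, with $\hat{\pi}_1$ and $\hat{\pi}_2$ linking the two pairs of QTL factors across sibs.]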

Two locus model mixture
[Figure: a 3 x 3 grid of nine copies of the two-locus path diagram, one for each combination of p(ibd0 R), p(ibd1 R), p(ibd2 R) with p(ibd0 Q), p(ibd1 Q), p(ibd2 Q).]
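The nine-component mixture can be sketched as a 3 x 3 grid of weights and covariances. Two assumptions are made here that the slide does not state: IBD sharing at loci Q and R is treated as independent, so each joint weight is a product of the two marginal probabilities, and the sib covariance is taken as F plus the shared fraction of each QTL variance. All names are hypothetical.

```python
IBD_SHARE = (0.0, 0.5, 1.0)  # proportion of alleles shared under IBD 0, 1, 2


def two_locus_weights(p_q, p_r):
    """Joint weights w[i][j] = p(ibd_i at Q) * p(ibd_j at R), assuming independence."""
    return [[pq * pr for pr in p_r] for pq in p_q]


def two_locus_sib_covariances(f_var, q_var, r_var):
    """Assumed sib covariance for each of the nine IBD combinations."""
    return [
        [f_var + share_q * q_var + share_r * r_var for share_r in IBD_SHARE]
        for share_q in IBD_SHARE
    ]


weights = two_locus_weights((0.25, 0.5, 0.25), (0.1, 0.3, 0.6))
covariances = two_locus_sib_covariances(f_var=0.3, q_var=0.2, r_var=0.1)
for weight_row, cov_row in zip(weights, covariances):
    print(weight_row, cov_row)
```

For linked loci the independence assumption does not hold, and the joint IBD probabilities would have to come from a multipoint analysis instead.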

Multivariate multilocus multipoint
- Eaves, Neale & Maes 1996
- 10 minutes for 5 phenotypes
- Restart at previous solution
- Only fit null model (q=0) once

Not dead yet
- Latent variable QTLs
- Multiple rater
- Comorbidity
- Repeated measures