Linkage and Association in Mx

Slides:

Advertisements

Similar presentations

Bivariate analysis HGEN619 class 2007.

Advertisements

Basics of Linkage Analysis

. Parametric and Non-Parametric analysis of complex diseases Lecture #6 Based on: Chapter 25 & 26 in Terwilliger and Ott’s Handbook of Human Genetic Linkage.

Linkage Analysis: An Introduction Pak Sham Twin Workshop 2001.

Estimating “Heritability” using Genetic Data David Evans University of Queensland.

(Re)introduction to Mx Sarah Medland. KiwiChinese Gooseberry.

Extended sibships Danielle Posthuma Kate Morley Files: \\danielle\ExtSibs.

Quantitative Genetics

Association analysis Shaun Purcell Boulder Twin Workshop 2004.

(Re)introduction to Mx. Starting at the beginning Data preparation Mx expects 1 line per case/family Almost limitless number of families and variables.

Introduction to Linkage

Univariate Analysis in Mx Boulder, Group Structure Title Type: Data/ Calculation/ Constraint Reading Data Matrices Declaration Assigning Specifications/

Univariate Analysis Hermine Maes TC19 March 2006.

Mx Practical TC18, 2005 Dorret Boomsma, Nick Martin, Hermine H. Maes.

Linkage Analysis in Merlin

Statistical Power Calculations Boulder, 2007 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.

Copy the folder… Faculty/Sarah/Tues_merlin to the C Drive C:/Tues_merlin.

Linkage and LOD score Egmond, 2006 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.

Univariate modeling Sarah Medland. Starting at the beginning… Data preparation – The algebra style used in Mx expects 1 line per case/family – (Almost)

Whole genome approaches to quantitative genetics Leuven 2008.

Power and Sample Size Boulder 2004 Benjamin Neale Shaun Purcell.

Power of linkage analysis Egmond, 2006 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.

The importance of the “Means Model” in Mx for modeling regression and association Dorret Boomsma, Nick Martin Boulder 2008.

Regression-Based Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.

Gene-Environment Interaction & Correlation Danielle Dick & Danielle Posthuma Leuven 2008.

Combined Linkage and Association in Mx Hermine Maes Kate Morley Dorret Boomsma Nick Martin Meike Bartels Boulder 2009.

Univariate Analysis Hermine Maes TC21 March 2008.

Linkage and association Sarah Medland. Genotypic similarity between relatives IBS Alleles shared Identical By State “look the same”, may have the same.

Lecture 21: Quantitative Traits I Date: 11/05/02  Review: covariance, regression, etc  Introduction to quantitative genetics.

Errors in Genetic Data Gonçalo Abecasis. Errors in Genetic Data Pedigree Errors Genotyping Errors Phenotyping Errors.

Epistasis / Multi-locus Modelling Shaun Purcell, Pak Sham SGDP, IoP, London, UK.

Practical With Merlin Gonçalo Abecasis. MERLIN Website Reference FAQ Source.

Mx modeling of methylation data: twin correlations [means, SD, correlation] ACE / ADE latent factor model regression [sex and age] genetic association.

Family Based Association Danielle Posthuma Stacey Cherny TC18-Boulder 2005.

Means, Thresholds and Moderation Sarah Medland – Boulder 2008 Corrected Version Thanks to Hongyan Du for pointing out the error on the regression examples.

Powerful Regression-based Quantitative Trait Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.

Mx Practical TC20, 2007 Hermine H. Maes Nick Martin, Dorret Boomsma.

David M. Evans Multivariate QTL Linkage Analysis Queensland Institute of Medical Research Brisbane Australia Twin Workshop Boulder 2003.

Introduction to Genetic Theory

Welcome  Log on using the username and password you received at registration  Copy the folder: F:/sarah/mon-morning To your H drive.

Linkage in Mx & Merlin Meike Bartels Kate Morley Hermine Maes Based on Posthuma et al., Boulder & Egmond.

Copy folder (and subfolders) F:\sarah\linkage2. Linkage in Mx Sarah Medland.

QTL Mapping Using Mx Michael C Neale Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.

Association Mapping in Families Gonçalo Abecasis University of Oxford.

Lecture 17: Model-Free Linkage Analysis Date: 10/17/02  IBD and IBS  IBD and linkage  Fully Informative Sib Pair Analysis  Sib Pair Analysis with Missing.

Power in QTL linkage analysis

Categorical Data HGEN

Regression Models for Linkage: Merlin Regress

Univariate Twin Analysis

Re-introduction to openMx

Genome Wide Association Studies using SNP

Path Analysis Danielle Dick Boulder 2008

Can resemblance (e.g. correlations) between sib pairs, or DZ twins, be modeled as a function of DNA marker sharing at a particular chromosomal location?

I Have the Power in QTL linkage: single and multilocus analysis

Univariate modeling Sarah Medland.

Regression-based linkage analysis

Linkage in Selected Samples

Correlation for a pair of relatives

Why general modeling framework?

Heterogeneity Danielle Dick, Hermine Maes,

(Re)introduction to Mx Sarah Medland

Sarah Medland faculty/sarah/2018/Tuesday

Pi = Gi + Ei Pi = pi - p Gi = gi - g Ei = ei - e _ _ _ Phenotype

Multivariate Linkage Continued

Lecture 9: QTL Mapping II: Outbred Populations

Univariate Linkage in Mx

Power Calculation for QTL Association

BOULDER WORKSHOP STATISTICS REVIEWED: LIKELIHOOD MODELS

Multivariate Genetic Analysis: Introduction

Presentation transcript:

Linkage and Association in Mx Sarah Medland

Linkage

In biometrical modeling A is correlated at 1 for MZ twins and In biometrical modeling A is correlated at 1 for MZ twins and .5 for DZ twins .5 is the average genome-wide sharing of genes between full siblings (DZ twin relationship)

In linkage analysis we will be estimating an additional variance component Q For each locus under analysis the coefficient of sharing for this parameter will vary for each pair of siblings The coefficient will be the probability that the pair of siblings have both inherited the same alleles from a common ancestor

Genotypic similarity between relatives IBD Alleles shared Identical By Descent are a copy of the same ancestor allele Pairs of siblings may share 0, 1 or 2 alleles IBD The probability of a pair of relatives being IBD is known as pi-hat M1 M2 M3 M3 Q1 Q2 Q3 Q4 M1 M2 M3 M3 Q1 Q2 Q3 Q4 IBS IBD M1 M3 M1 M3 2 1 Q1 Q3 Q1 Q4

PTwin1 PTwin2 MZ=1.0 DZ=0.5 MZ & DZ = 1.0 Q A C E E C A Q q a c e e c

Distribution of pi-hat Adult Dutch DZ pairs: distribution of pi-hat at 65 cM on chromosome 19 Model resemblance (e.g. correlations, covariances) between sib pairs, or DZ twins, as a function of DNA marker sharing at a particular chromosomal location

DZ by IBD status Variance = Q + F + E Covariance = πQ + F + E

Covariance Statements G2: DZ IBD2 twins Matrix K 1 Covariance F+Q+E | F+K@Q _ F+K@Q | F+Q+E; G3: DZ IBD1 twins Matrix K .5 G4: DZ IBD0 twins F+Q+E | F_ F | F+Q+E;

DZ by IBD status + MZ

Covariance Statements +MZ G2: DZ IBD2 twins Matrix K 1 Covariance A+C+Q+E | H@A+C+K@Q _ H@A+C+K@Q | A+C+Q+E; G3: DZ IBD1 twins Matrix K .5 G4: DZ IBD0 twins A+C+Q+E | H@A+C_ H@A+C | A+C+Q+E; G5: MZ twins A+C+Q+E | A+C+Q _ A+C+Q | A+C+Q+E;

Using the full distribution More power if we use all the available information So instead of dividing the sample we will use as a continuous coefficient that will vary between sib-pair across loci

DZ twins using full distribution of pihat Covariance F+E+Q | F+P@Q_ F+P@Q | F+E+Q ; DZ=1 1 1 1 1 1 1 Q F E E F Q q f e e f q PTwin1 PTwin2

The example data Lipid data: apoB

Pihat.mx !script for univariate linkage - pihat approach !DZ/SIB #loop $i 1 4 1 #define nvar 1 #NGroups 1 DZ / sib TWINS genotyped Data NInput=324 Missing =-1.0000 Rectangular File=lipidall.dat Labels sample fam ldl1 apob1 ldl2 apob2 … Select apob1 apob2 ibd0m$i ibd1m$i ibd2m$i ; Definition_variables This use of the loop command allows you to run the same script over and over moving along the chromosome The format of the command is: #loop variable start end increment So…#loop $i 1 4 1 Starts at marker 1 goes to marker 4 and runs each locus in turn Each occurrence of $i within the script will be replaced by the current number ie on the second run $i will become 2 With the loop command the last end statement becomes an exit statement and the script ends with #end loop

Pihat.mx !script for univariate linkage - pihat approach !DZ/SIB #loop $i 1 4 1 #define nvar 1 #NGroups 1 DZ / sib TWINS genotyped Data NInput=324 Missing =-1.0000 Rectangular File=lipidall.dat Labels sample fam ldl1 apob1 ldl2 apob2 … Select apob1 apob2 ibd0m$i ibd1m$i ibd2m$i ; Definition_variables This use of the ‘definition variables’ command allows you to specify which of the selected variables will be used as covariates The value of the covariate displayed in the mxo will be the values for the last case read

Pihat.mx Begin Matrices; X Lower nvar nvar free ! residual familial F Z Lower nvar nvar free ! unshared environment E L Full nvar 1 free ! qtl effect Q G Full 1 nvar free ! grand means H Full 1 1 ! scalar, .5 K Full 3 1 ! IBD probabilities (from Merlin) J Full 1 3 ! coefficients 0.5,1 for pihat End Matrices; Specify K ibd0m$i ibd1m$i ibd2m$i Matrix H .5 Matrix J 0 .5 1 Start .1 X 1 1 1 Start .1 L 1 1 1 Start .1 Z 1 1 1 Start .5 G 1 1 1 !script for univariate linkage - pihat approach !DZ/SIB #loop $i 1 2 1 #define nvar 1 #NGroups 1 DZ / sib TWINS genotyped Data NInput=324 Missing =-1.0000 Rectangular File=lipidall.dat Labels sample fam ldl1 apob1 ldl2 apob2 … Select apob1 apob2 ibd0m$i ibd1m$i ibd2m$i ; Definition_variables

You need to fix this before you run the script Pihat.mx Means G| G ; Covariance F+E+Q | F+P@Q_ F+P@Q | F+E+Q ; Option NDecimals=4 Option RSiduals Option Multiple Issat !End !test significance of QTL effect ! Drop L 1 1 1 Exit #end loop Begin Algebra; F= X*X'; ! residual familial variance E= Z*Z'; ! unique environmental variance Q= L*L'; ! variance due to QTL V= F+Q+E; ! total variance T= F|Q|E; ! parameters in one matrix S= F%V| Q%V| E%V; ! standardized variance component estimates P= ???? ; ! estimate of pihat End Algebra; Labels Row S standest Labels Col S f^2 q^2 e^2 Labels Row T unstandest Labels Col T f^2 q^2 e^2 You need to fix this before you run the script

Run the script across your region and add the -2LL to the xls file

Converting chi-squares to LOD scores For univariate linkage analysis (where you have 1 QTL estimate) Χ2/4.6 = LOD

Converting chi-squares to p values Complicated Distribution of genotypes and phenotypes Boundary problems For univariate linkage analysis (where you have 1 QTL estimate) p(linkage)=

What does it mean?

Alternative approach is a summary statistic Use all the information Convenient Loss of information .5 can mean p.ibd0=0 p.ibd1=1 p.ibd2=0 or p.ibd0=.2 p.ibd1=.6 p.ibd2=.2 Use all the information

Alternative approach Model each of the possible outcomes IBD0 IBD1 IBD2 Weight each of the models by the probability that it is the correct model The pairwise likelihood is equal to the sum of likelihood for each model multiplied by the probability it is the correct model The combined likelihood is equal to the sum of all the pairwise likelihoods

DZ pairs * pIBD2 + * pIBD1 * pIBD0

Tells Mx we will be using 3 different means and variance models How to do this in mx? Script mixture.mx Tells Mx we will be using 3 different means and variance models DZ / sib TWINS genotyped Data NInput=324 NModel=3 Missing =-1.0000 Rectangular File=lipidall.dat Labels …. Select apob1 apob2 ibd0m$i ibd1m$i ibd2m$i ; Definition_variables

How to do this in mx? This script runs an FQE model L is the QTL VC path coefficent Begin Matrices; X Lower nvar nvar free ! residual familial F Z Lower nvar nvar free ! unshared environment E L Full nvar 1 free ! qtl effect Q G Full 1 nvar free ! grand means H Full 1 1 ! scalar, .5 K Full 3 1 ! IBD probabilities (from Merlin) U Unit 3 1 … Matrix H 0.5 Specify K ibd0m$i ibd1m$i ibd2m$i We are placing the prob. Of being IBD 0 1 & 2 in the K matrix

How to do this in mx? Begin Algebra; F= X*X'; ! residual familial variance E= Z*Z'; ! unique environmental variance Q= L*L'; ! variance due to QTL V= F+Q+E; ! total variance

How to do this in mx? Means U@G| U@G; Covariance F+Q+E | F _ F | F+Q+E _ ! IBD 0 Covariance matrix F+Q+E | F+H@Q _ F+H@Q | F+Q+E _ ! IBD 1 Covariance matrix F+Q+E | F+Q _ F+Q | F+Q+E; ! IBD 2 Covariance matrix Weights B ; The means matrix contains corrections for age and sex – it is repeated 3 times and vertically stacked Tells Mx to weight each of the means and var/cov matrices by the IBD prob. which we placed in the B matrix

Your job Pick a number location that you ran with the pihat script Change the script to run linkage at the location you picked Test for linkage

Summary Weighted likelihood approach more powerful than pi-hat Quickly becomes unfeasible 3 models for sibship size 2 27 models for sibship size 4 Q: How many models for sibship size 3? For larger sib-ships/arbitrary pedigrees pi-hat approach is method of choice

Break – and then association

Association Introduction

Association Simplest design possible Correlate phenotype with genotype

The equivalent for a quantitative trait - run a regression Yi = a + bXi + ei where Yi = trait value for individual i Xi = 1 if allele individual i has allele ‘A’ 0 otherwise i.e., test of mean differences between ‘A’ and ‘not-A’ individuals Play with Association.xls 1

Biometrical model Genotype Genetic Value BB Bb bb a d -a midpoint Bb BB Genotype Genetic Value BB Bb bb a d -a Va (QTL) = 2pqa2 (no dominance)

How can we do this in Mx?

Where this goes wrong… Chose a phenotype – chopstick use Collect 2 groups - cases and controls Unrelated individuals – university students Matched for relevant covariates – age sex 1

Where this goes wrong… Pick your candidate gene and type it – SUSHI (Successful use of selected hand instruments gene) Count the number of cases and controls with each genotype – highly significant and replicatable 1

Obviously this is a spurious example The cases were of Asian decent, the controls were of Caucasian decent SUSHI – was a histocompatability antigen gene with different allele frequencies in different racial groups This confound is known as population stratification (Yes this is a fairy tale but it is based on a real asthma study)

Practical – Find a gene for sensation seeking: Two populations (A & B) of 100 individuals in which sensation seeking was measured In population A, gene X (alleles 1 & 2) does not influence sensation seeking In population B, gene X (alleles 1 & 2) does not influence sensation seeking Mean sensation seeking score of population A is 90 Mean sensation seeking score of population B is 110 Frequencies of allele 1 & 2 in population A are .1 & .9 Frequencies of allele 1 & 2 in population B are .5 & .5

Population B scores higher than population A .01 .18 .81 .25 .50 .25 Genotypic freq. Sensation seeking score is the same across genotypes, within each population. Population B scores higher than population A Differences in genotypic frequencies

Suppose we are unaware of these two populations and have measured 200 individuals and typed gene X The mean sensation seeking score of this mixed population is 100 What are our observed genotypic frequencies and means?

Calculating genotypic frequencies in the mixed population Genotype 11: 1 individual from population A, 25 individuals from population B on a total of 200 individuals: (1+25)/200=.13 Genotype 12: (18+50)/200=.34 Genotype 22: (81+25)/200=.53

Calculating genotypic means in the mixed population Genotype 11: 1 individual from population A with a mean of 90, 25 individuals from population B with a mean of 110 = ((1*90) + (25*110))/26 =109.2 Genotype 12: ((18*90) + (50*110))/68 = 104.7 Genotype 22: ((81*90) + (25*110))/106 = 94.7

Gene X is the gene for sensation seeking! .13 .34 .53 Genotypic freq. Now, allele 1 is associated with higher sensation seeking scores, while in both populations A and B, the gene was not associated with sensation seeking scores… FALSE ASSOCIATION

False positives and false negatives Reversal effects Overestimation Underestimation -3 -2 -1 1 2 3 4 5 0.49 0.42 0.35 0.28 0.21 0.14 0.07 0.00 -0.07 -0.14 -0.21 -0.28 -0.35 -0.42 -0.49 Estimated value of allelic effect =-10 =-5 =5 =10 Genuine allelic effect=+2 Difference in gene frequency in subpopulations Dm = Difference in subpopulation mean Posthuma et al., Behav Genet, 2004

How to avoid spurious association? True association is detected in people coming from the same genetic stratum Can check that individuals come from the same population using a large set of highly polymorphic genes – genomic control Can use family members as controls – family based association

Biometrical model Genotype Genetic Value BB Bb bb a d -a 2a d bb Bb BB midpoint Bb BB Genotype Genetic Value BB Bb bb a d -a

Fulker (1999) between/within association model Fulker et al, AJHG, 1999

BTW – this is on top of a linkage model

So… 4 tests for the price of one Test for pop-stratification aw=ab Robust test for association aw=0 Test so see if linked loci is the functional variant QTL≠ 0 in the presence of aw it is in LD with the variant but is not the casual variant Test for dominance effects if dominance is also modeled

Combined Linkage & association Implemented in QTDT (Abecasis et al Combined Linkage & association Implemented in QTDT (Abecasis et al., 2000) and Mx (Posthuma et al., 2004) Association and Linkage modeled simultaneously: Association is modeled in the means Linkage is modeled in the (co)variances QTDT: simple, quick, straightforward, but not so flexible in terms of models Mx: less simple, but highly flexible

Implementation in Mx link_assoc.mx #define n 3 ! number of alleles is 3, coded 1, 2, 3 … G1: calculation group between and within effects Data Calc Begin matrices; A Full 1 n free ! additive allelic effects within C Full 1 n free ! additive allelic effects between D Sdiag n n free ! dominance deviations within F Sdiag n n free ! dominance deviations between I Unit 1 n ! one's End matrices; Specify A 100 101 102 Specify C 200 201 202 Specify D 800 801 802 Specify F 900 901 902 The locus has 3 alleles These 1*3 vector matrices contain the b/n & w/n additive effects of each of the 3 alleles These 3*3 off-diagonal matrices contain the b/n & w/n dominance effects of each of the 3 alleles

Sticking it together… Begin algebra; K = (A'@I) + (A@I') ; ! Within effects, additive L = D + D' ; ! Within effects, dominance W = K+L ; ! Within effects - additive and dominance in one matrix M = (C'@I) + (C@I') ; ! Between effects, additive N = F + F' ; ! Between effects, dominance B = M+N ; ! Between effects - additive and dominance in one matrix End algebra ;

K = (A'@I) + (A@I') ; A matrix contains additive w/n effects Allele 1 Allele 2 Allele 3 a1 100 a2 101 a3 102 Parameter numbers I matrix is unit 1*3 so … 1|1|1

K = (A'@I) + (A@I') ; A’@I = a1 a2 a3

K = (A'@I) + (A@I') ; (A’@I)+(A@I’) = 2*a1 a1+a2 a1+a3 2*a2 a2+a3 2*a3

L = D + D' ; This is the deviation of the 1/2 genotype mean from the simple additive effects of allele 1 + allele 2 D+D’ = d12 d13 d23

W = K+L ; K+L = 2*a1 a1+a2+ d12 a1+a3+ d13 2*a2 a2+a3+ d23 2*a3 This is the 1/2 genotype mean composed of the simple additive effects of allele 1 + allele 2 and any deviation from these simple additive effects (dominance effects) K+L = 2*a1 a1+a2+ d12 a1+a3+ d13 2*a2 a2+a3+ d23 2*a3

W = K+L ; K+L (parameter numbers) = 100+100 101+100+800 102+100+801 100+101+800 101+101 102+101+802 100+102+801 101+102+802 102+102 Between effects stick together in the same way

M = (C'@I) + (C@I') ; ! Between effects, additive K = (A'@I) + (A@I') ; ! Within effects, additive L = D + D' ; ! Within effects, dominance W = K+L ; ! Within effects total I = [ 1 1 1], A = [a1 a2 a3] D = 0 0 0 d21 0 0 d31 d32 0 K = (A'@I) + (A@I') = a1 1 a2 @ [1 1 1] + [a1 a2 a3] @ 1 = a3 1 a1 a1 a1 a1 a2 a3 a1a1 a1a2 a1a3 a2 a2 a2 + a1 a2 a3 = a2a1 a2a2 a2a3 a3 a3 a3 a1 a2 a3 a3a1 a3a2 a3a3 L = D + D' = 0 0 0 0 d21 d31 0 d21 d31 d21 0 0 + 0 0 d32 = d21 0 d32 d31 d32 0 0 0 0 d31 d32 0 W = K+L = a1a1 a1a2 a1a3 0 d21 d31 a1a1 a1a2d21 a1a3d31 a2a1 a2a2 a2a3 + d21 0 d32 = a2a1d21 a2a2 a2a3d32 a3a1 a3a2 a3a3 d31 d32 0 a3a1d31 a3a2d32 a3a3 M = (C'@I) + (C@I') ; ! Between effects, additive N = F + F' ; ! Between effects, dominance B = M+N ; ! Between effects - total

We have a sibpair with genotypes 1,1 and 1,2. B = c1c1 c1c2f21 c1c3f31 c2c1f21 c2c2 c2c3f32 c3c1f31 c3c2f32 c3c3 W = a1a1 a1a2d21 a1a3d31 a2a1d21 a2a2 a2a3d32 a3a1d31 a3a2d32 a3a3 We have a sibpair with genotypes 1,1 and 1,2. μ1 = Grand mean + pair mean + ½ pair difference μ2 = Grand mean + pair mean - ½ pair difference To calculate the between-pairs effect, or the mean genotypic effect of this pair, we need matrix B: ((c1c1) + (c1c2f21)) / 2 To calculate the within-pair effect we need matrix W and the between pairs effect: For sib1: (a1a1) + ((c1c1) + (c1c2f21)) / 2 For sib2: (a1a2d21) - ((c1c1) + (c1c2f21)) / 2

G3: datagroup: sibship size two, DZ … Definition_variables tw1a1 tw1a2 Begin Matrices; G Full 1 nvar = G2 ! grand mean B Computed n n = B1 ! spurious and genuine genotypic effects (between) W Computed n n = W1 ! genuine genotypic effects (within) K Full 1 4 Fix ! Will contain first and second allele of twin1 L Full 1 4 Fix ! Will contain first and second allele of twin2 S Full 1 1 Fix ! Will contain 2 (for two individuals per family) End Matrices; Matrix K 1 1 1 1 Matrix L 1 1 1 1 Specify K tw1a1 tw1a2 tw1a1 tw1a2 Specify L tw2a1 tw2a2 tw2a1 tw2a2 Alleles 1 and 2 for twins 1 and 2 respectively We are going to put the allele numbers here These matrices must be initialized –given default values Alleles 1 and 2 for twins 1 and 2 respectively

Sticking it together using part For sibpairs 1,1 and 1,2 To calculate the between-pairs effect, or the mean genotypic effect of this pair, we need matrix B: ((c1c1) + (c1c2f21)) / 2 We can use the part function to draw out a specified element of a matrix B = c1c1 c1c2f21 c1c3f31 c2c1f21 c2c2 c2c3f32 c3c1f31 c3c2f32 c3c3

Sticking it together using part We can use the part function to draw out a specified element of a matrix So to draw out c3c1f31 we could say: \part(B, K) where K is a matrix containing 1 3 1 3 B = c1c1 c1c2f21 c1c3f31 c2c1f21 c2c2 c2c3f32 c3c1f31 c3c2f32 c3c3

Sibpair with genotypes: 1,1 and 1,2 W = a1a1 a1a2d21 a1a3d31 a2a1d21 a2a2 a2a3d32 a3a1d31 a3a2d32 a3a3 B = c1c1 c1c2f21 c1c3f31 c2c1f21 c2c2 c2c3f32 c3c1f31 c3c2f32 c3c3 Sibpair with genotypes: 1,1 and 1,2 Specify K tw1a1 tw1a2 tw1a1 tw1a2 = 1 1 1 1 Specify L tw2a1 tw2a2 tw2a1 tw2a2 = 1 2 1 2 V = (\part(B,K) + \part(B,L) ) %S ; (c1c1 + c1c2f21)/2 C = (\part(W,K) + \part(W,L) ) %S ; (a1a1 + a1a2d21)/2 Means G + F*R '+ V + (\part(W,K)-C) | G + I*R' + V +(\part(W,L)-C); = G + F*R’ + (c1c1 + c1c1f21)/2 + (a1a1 - (a1a1 + a1a2d21)/2) | G + I*R' + (c1c1 + c1c1f21)/2 + (a2a1 - (a1a1 + a1a2d21)/2)

Constrain sum additive allelic within effects = 0 Constraint ni=1 Begin Matrices; A full 1 n = A1 O zero 1 1 End Matrices; Begin algebra; B = \sum(A) ; End Algebra; Constraint O = B ; end Constrain sum additive allelic between effects = 0 C full 1 n = C1 ! B = \sum(C) ;

Variance/covariance matrix AEQ model using pi-hat formulation So 4 tests for the price of one Test for pop-stratification aw=ab Robust test for association aw=0 Test so see if linked loci is the functional variant QTL≠ 0 in the presence of aw it is in LD with the variant but is not the casual variant Test for dominance effects if dominance is also modeled

!1.test for linkage in presence of full association Drop D 2 1 1 end !2.Test for population stratification: !between effects = within effects. Specify 1 A 100 101 102 Specify 1 C 100 101 202 Specify 1 D 800 801 802 Specify 1 F 800 801 802 !3.Test for presence of dominance Drop @0 800 801 802 !4.Test for presence of full association Drop @0 800 801 802 100 101 !5.Test for linkage in absence of association Free D 2 1 1

Model Test -2ll df Vs model Chi^2 Df-diff P-value - 1 Linkage in presence of association 2 B=W 3 Dominance 4 Full association 5 Linkage in absence of association