Linkage Disequilibrium Mapping of Complex Binary Diseases


1 Linkage Disequilibrium Mapping of Complex Binary Diseases
Two types of complex traits:
- Quantitative traits: continuous variation
- Dichotomous traits: discontinuous variation
  o Binary, e.g., presence (1) or absence (0) of a disease
  o Multiple outcomes, e.g., none, moderate, or severe disease
Special topic for Rebecca and Amy's project

2 Consider a natural population
One marker with two alleles M and m: Prob(M) = $p$, Prob(m) = $1-p$
One QTL (affecting a binary trait) with two alleles A and a: Prob(A) = $q$, Prob(a) = $1-q$
Four haplotypes:
Prob(MA) = $p_{11} = pq + D$
Prob(Ma) = $p_{10} = p(1-q) - D$
Prob(mA) = $p_{01} = (1-p)q - D$
Prob(ma) = $p_{00} = (1-p)(1-q) + D$
so that $p = p_{11} + p_{10}$, $q = p_{11} + p_{01}$, and $D = p_{11}p_{00} - p_{10}p_{01}$.
$D$ is the linkage disequilibrium between the marker and the underlying QTL.
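The mapping between $(p, q, D)$ and the four haplotype frequencies can be checked numerically. The following is a minimal sketch; the function names and the numerical values of $p$, $q$ and $D$ are illustrative assumptions, not taken from the slides.

```python
def haplotype_freqs(p, q, D):
    """Return (p11, p10, p01, p00): M with A, M with a, m with A, m with a."""
    p11 = p * q + D
    p10 = p * (1 - q) - D
    p01 = (1 - p) * q - D
    p00 = (1 - p) * (1 - q) + D
    return p11, p10, p01, p00


def recover_p_q_D(p11, p10, p01, p00):
    """Invert the mapping: allele frequencies and LD from haplotype frequencies."""
    p = p11 + p10
    q = p11 + p01
    D = p11 * p00 - p10 * p01
    return p, q, D


# Hypothetical values chosen only for illustration.
freqs = haplotype_freqs(0.6, 0.3, 0.05)
p_back, q_back, D_back = recover_p_q_D(*freqs)
```

The round trip returns the starting values of $p$, $q$ and $D$ up to floating-point error, and the four frequencies sum to one.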

3 Data structure
Sample   Binary ($y_i$)   Marker ($j$)
1        1                MM (2)
2        1                Mm (1)
3        1                Mm (1)
4        1                mm (0)
5        0                MM (2)
6        0                Mm (1)
7        0                Mm (1)
8        0                mm (0)

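4 Arrange the data in a 2 x 3 contingency table
Observed counts, by marker genotype:
               2          1          0          Total
Affected (1)   $n_{12}$   $n_{11}$   $n_{10}$   $n_{1.}$
Normal (0)     $n_{02}$   $n_{01}$   $n_{00}$   $n_{0.}$
Total          $n_{.2}$   $n_{.1}$   $n_{.0}$   $n$
Corresponding frequencies ($g_{lj} = n_{lj}/n$), by marker genotype:
               2          1          0          Total
Affected (1)   $g_{12}$   $g_{11}$   $g_{10}$   $g_{1.}$
Normal (0)     $g_{02}$   $g_{01}$   $g_{00}$   $g_{0.}$
Total          $g_{.2}$   $g_{.1}$   $g_{.0}$   1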

5 Independence test
$$\chi^2_{df=2} = \sum_{l=0}^{1}\sum_{j=0}^{2}\frac{(n_{lj}-m_{lj})^2}{m_{lj}} = n\sum_{l=0}^{1}\sum_{j=0}^{2}\frac{(g_{lj}-g_{l.}g_{.j})^2}{g_{l.}g_{.j}}$$
where $m_{lj}$ is the expected value of $n_{lj}$, $m_{lj} = n\,g_{l.}\,g_{.j}$.
H0: $g_{lj} = g_{l.}g_{.j}$
H1: $g_{lj} \neq g_{l.}g_{.j}$
Under H0, $\chi^2_{df=2}$ is central chi-square distributed for a large sample size $n$, with df = (2-1) x (3-1) = 2.
If H0 is rejected, there is a significant $D$.
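The independence test can be sketched in a few lines of Python. This is an illustrative sketch, not the presenters' program: the function name and the counts are hypothetical, and every row and column total is assumed to be non-zero. For df = 2 the chi-square upper-tail probability has the closed form $\exp(-\chi^2/2)$, so no statistics library is needed.

```python
import math


def independence_test(counts):
    """Chi-square test on a 2 x 3 table [[n12, n11, n10], [n02, n01, n00]]."""
    n = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]        # n_l.
    col_tot = [sum(col) for col in zip(*counts)]  # n_.j
    chi2 = 0.0
    for l, row in enumerate(counts):
        for j, n_lj in enumerate(row):
            m_lj = row_tot[l] * col_tot[j] / n    # expected count n * g_l. * g_.j
            chi2 += (n_lj - m_lj) ** 2 / m_lj
    p_value = math.exp(-chi2 / 2.0)               # upper tail of chi-square, df = 2
    return chi2, p_value


# Hypothetical counts, for illustration only.
counts = [[30, 50, 20],   # affected: MM, Mm, mm
          [20, 50, 30]]   # normal:   MM, Mm, mm
chi2, p_value = independence_test(counts)
```

With these made-up counts every expected cell is 25, 50 or 25, so the statistic is 4 and the p-value is $e^{-2} \approx 0.135$.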

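6 Regression analysis
Sample   Binary ($y_{ij}$)   Marker ($j$)   Marker model: #M ($T_{ij}$)   QTL model: probability of two A's
1        1                   MM (2)         2                             $\omega_{2|2} = p_{11}^2/p^2$
2        1                   Mm (1)         1                             $\omega_{2|1} = 2p_{11}p_{01}/[2p(1-p)]$
3        1                   Mm (1)         1                             $\omega_{2|1} = 2p_{11}p_{01}/[2p(1-p)]$
4        1                   mm (0)         0                             $\omega_{2|0} = p_{01}^2/(1-p)^2$
5        0                   MM (2)         2                             $\omega_{2|2} = p_{11}^2/p^2$
6        0                   Mm (1)         1                             $\omega_{2|1} = 2p_{11}p_{01}/[2p(1-p)]$
7        0                   Mm (1)         1                             $\omega_{2|1} = 2p_{11}p_{01}/[2p(1-p)]$
8        0                   mm (0)         0                             $\omega_{2|0} = p_{01}^2/(1-p)^2$
where $p_{11} = pq + D$ and $p_{01} = (1-p)q - D$.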

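7 Joint and conditional ($\omega_{k|ij}$) genotype probabilities between marker and QTL
Joint probabilities:
      AA (2)               Aa (1)                               aa (0)               Obs
MM    $p_{11}^2$           $2p_{11}p_{10}$                      $p_{10}^2$           $n_2$
Mm    $2p_{11}p_{01}$      $2(p_{11}p_{00}+p_{10}p_{01})$       $2p_{10}p_{00}$      $n_1$
mm    $p_{01}^2$           $2p_{01}p_{00}$                      $p_{00}^2$           $n_0$
Conditional probabilities (each joint probability divided by the marker genotype frequency):
      AA (2)                        Aa (1)                                        aa (0)                        Obs
MM    $p_{11}^2/p^2$                $2p_{11}p_{10}/p^2$                           $p_{10}^2/p^2$                $n_2$
Mm    $2p_{11}p_{01}/[2p(1-p)]$     $2(p_{11}p_{00}+p_{10}p_{01})/[2p(1-p)]$      $2p_{10}p_{00}/[2p(1-p)]$     $n_1$
mm    $p_{01}^2/(1-p)^2$            $2p_{01}p_{00}/(1-p)^2$                       $p_{00}^2/(1-p)^2$            $n_0$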

8 Statistical models
Marker model:
$$y_{ij} = a + bT_{ij} + \varepsilon_{ij}$$
The least squares approach can be used to estimate $a$ and $b$. The size of $b$ reflects the marker effect, which is confounded by the QTL effect and the marker-QTL LD.
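As a minimal sketch of the least squares step, the closed-form estimates for simple linear regression can be applied to the eight samples listed in the data-structure slide. The function name is an assumption made for illustration.

```python
def fit_marker_model(y, T):
    """Least squares estimates (a, b) for y = a + b*T + error."""
    n = len(y)
    y_bar = sum(y) / n
    T_bar = sum(T) / n
    s_TT = sum((t - T_bar) ** 2 for t in T)
    s_Ty = sum((t - T_bar) * (yi - y_bar) for t, yi in zip(T, y))
    b = s_Ty / s_TT          # slope: marker effect
    a = y_bar - b * T_bar    # intercept
    return a, b


# The eight samples from the data-structure slide:
# y is disease status, T is the number of M alleles.
y = [1, 1, 1, 1, 0, 0, 0, 0]
T = [2, 1, 1, 0, 2, 1, 1, 0]
a, b = fit_marker_model(y, T)
```

Because that layout has the same marker genotypes among affected and normal samples, the fitted slope is 0 and the intercept is 0.5; real data with marker-QTL LD would give a non-zero slope.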

9 The phenotype of sample $i$ within marker genotype group $j$ is modeled by
$$y_{ij} = \begin{cases} 1 & \text{if } z_{ij} \ge \gamma \\ 0 & \text{if } z_{ij} < \gamma \end{cases}$$
where $\gamma$ is the threshold for the underlying liability of the trait, $z$, which is formulated as
$$z_{ij} = \sum_{k=0}^{2} \xi_{ik}\mu_k + e_{ij}$$
$\mu_k$ = the genotypic value of QTL genotype $k$
$\xi_{ik}$ = the (1/0) indicator variable for whether sample $i$ carries QTL genotype $k$
$e_{ij}$ = normally distributed residual variable with mean 0 and variance 1

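10 The conditional probability of $y_{ij} = 1$ given sample $i$'s QTL genotype (say $G_{ij} = k$) is obtained by
$$\begin{aligned} f_k &= \Pr(y_{ij}=1 \mid G_{ij}=k, \Theta) \\ &= \Pr(z_{ij} \ge \gamma \mid G_{ij}=k, \Theta) \\ &= 1 - \Pr(z_{ij} < \gamma \mid G_{ij}=k, \Theta) \\ &= 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\gamma} \exp\left[-\frac{(z-\mu_k)^2}{2}\right] dz \end{aligned}$$
$f_k$ is called the penetrance of QTL genotype $k$.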
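The penetrance is one minus a standard normal distribution function evaluated at the threshold minus the genotypic value. A minimal sketch using only the standard library follows; the names `penetrance`, `mu_k` and `gamma` are assumptions made for illustration, with `gamma` standing for the liability threshold.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def penetrance(mu_k, gamma):
    """Pr(z >= gamma) for a liability z ~ N(mu_k, 1)."""
    return 1.0 - normal_cdf(gamma - mu_k)


# Illustrative values: a genotype whose mean liability sits exactly on the
# threshold is affected half of the time.
f_at_threshold = penetrance(0.0, 0.0)
f_above = penetrance(1.0, 0.0)
```

Raising the genotypic value relative to the threshold raises the penetrance; only the difference between the two matters.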

11 F-values as a function of q and D
[Figure: landscape of F over q and D]

12 Maximum likelihood analysis: Mixture model
$$\begin{aligned} \log L(\Theta \mid y) &= \sum_{j=0}^{2}\sum_{i=1}^{n_j} \log\Big[ \sum_{k=0}^{2} \omega_{k|ij} \Pr\{y_{ij}=1 \mid G_{ij}=k, \Theta\}^{y_{ij}} \Pr\{y_{ij}=0 \mid G_{ij}=k, \Theta\}^{1-y_{ij}} \Big] \\ &= \sum_{j=0}^{2}\sum_{i=1}^{n_j} \log\Big[ \omega_{2|ij} f_2^{y_{ij}}(1-f_2)^{1-y_{ij}} + \omega_{1|ij} f_1^{y_{ij}}(1-f_1)^{1-y_{ij}} + \omega_{0|ij} f_0^{y_{ij}}(1-f_0)^{1-y_{ij}} \Big] \end{aligned}$$
$\Theta = (p_{11}, p_{10}, p_{01}, p_{00}, f_2, f_1, f_0)$ (6 free parameters, because the four haplotype frequencies sum to 1)

13 EM algorithm
Define
$$\Pi_{2|ij} = \frac{\omega_{2|ij} f_2^{y_{ij}}(1-f_2)^{1-y_{ij}}}{\omega_{2|ij} f_2^{y_{ij}}(1-f_2)^{1-y_{ij}} + \omega_{1|ij} f_1^{y_{ij}}(1-f_1)^{1-y_{ij}} + \omega_{0|ij} f_0^{y_{ij}}(1-f_0)^{1-y_{ij}}} \qquad (1)$$
$$\Pi_{1|ij} = \frac{\omega_{1|ij} f_1^{y_{ij}}(1-f_1)^{1-y_{ij}}}{\omega_{2|ij} f_2^{y_{ij}}(1-f_2)^{1-y_{ij}} + \omega_{1|ij} f_1^{y_{ij}}(1-f_1)^{1-y_{ij}} + \omega_{0|ij} f_0^{y_{ij}}(1-f_0)^{1-y_{ij}}} \qquad (2)$$
$$\Pi_{0|ij} = \frac{\omega_{0|ij} f_0^{y_{ij}}(1-f_0)^{1-y_{ij}}}{\omega_{2|ij} f_2^{y_{ij}}(1-f_2)^{1-y_{ij}} + \omega_{1|ij} f_1^{y_{ij}}(1-f_1)^{1-y_{ij}} + \omega_{0|ij} f_0^{y_{ij}}(1-f_0)^{1-y_{ij}}} \qquad (3)$$
as the posterior probabilities of QTL genotypes given the marker genotype and phenotype of sample $i$.

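14 Population genetic parameters
Posterior probabilities:
      AA               Aa               aa               Obs
MM    $\Pi_{2|2i}$     $\Pi_{1|2i}$     $\Pi_{0|2i}$     $n_{.2}$
Mm    $\Pi_{2|1i}$     $\Pi_{1|1i}$     $\Pi_{0|1i}$     $n_{.1}$
mm    $\Pi_{2|0i}$     $\Pi_{1|0i}$     $\Pi_{0|0i}$     $n_{.0}$
$$p_{11} = \frac{1}{2n}\Big\{ \sum_{i=1}^{n_{.2}} [2\Pi_{2|2i} + \Pi_{1|2i}] + \sum_{i=1}^{n_{.1}} [\Pi_{2|1i} + \phi\,\Pi_{1|1i}] \Big\} \qquad (4)$$
$$p_{10} = \frac{1}{2n}\Big\{ \sum_{i=1}^{n_{.2}} [2\Pi_{0|2i} + \Pi_{1|2i}] + \sum_{i=1}^{n_{.1}} [\Pi_{0|1i} + (1-\phi)\,\Pi_{1|1i}] \Big\} \qquad (5)$$
$$p_{01} = \frac{1}{2n}\Big\{ \sum_{i=1}^{n_{.0}} [2\Pi_{2|0i} + \Pi_{1|0i}] + \sum_{i=1}^{n_{.1}} [\Pi_{2|1i} + (1-\phi)\,\Pi_{1|1i}] \Big\} \qquad (6)$$
$$p_{00} = \frac{1}{2n}\Big\{ \sum_{i=1}^{n_{.0}} [2\Pi_{0|0i} + \Pi_{1|0i}] + \sum_{i=1}^{n_{.1}} [\Pi_{0|1i} + \phi\,\Pi_{1|1i}] \Big\} \qquad (7)$$
where $\phi = p_{11}p_{00}/(p_{11}p_{00} + p_{10}p_{01})$ is the probability that a double heterozygote (Mm, Aa) is formed from the haplotype pair with frequencies $p_{11}$ and $p_{00}$ rather than the pair with frequencies $p_{10}$ and $p_{01}$.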

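15 Quantitative genetic parameters
$$f_2 = \frac{\sum_{j=0}^{2}\sum_{i=1}^{n_j} \Pi_{2|ij}\, y_{ij}}{\sum_{j=0}^{2}\sum_{i=1}^{n_j} \Pi_{2|ij}} \qquad (8)$$
$$f_1 = \frac{\sum_{j=0}^{2}\sum_{i=1}^{n_j} \Pi_{1|ij}\, y_{ij}}{\sum_{j=0}^{2}\sum_{i=1}^{n_j} \Pi_{1|ij}} \qquad (9)$$
$$f_0 = \frac{\sum_{j=0}^{2}\sum_{i=1}^{n_j} \Pi_{0|ij}\, y_{ij}}{\sum_{j=0}^{2}\sum_{i=1}^{n_j} \Pi_{0|ij}} \qquad (10)$$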

16 EM algorithm
(1) Give initial values $\Theta^{(0)} = (p_{11}, p_{10}, p_{01}, p_{00}, f_2, f_1, f_0)^{(0)}$,
(2) Calculate $\Pi_{2|ij}^{(1)}$, $\Pi_{1|ij}^{(1)}$ and $\Pi_{0|ij}^{(1)}$ using Eqs. 1-3,
(3) Calculate $\Theta^{(1)}$ using $\Pi_{2|ij}^{(1)}$, $\Pi_{1|ij}^{(1)}$ and $\Pi_{0|ij}^{(1)}$ based on Eqs. 4-10,
(4) Repeat (2) and (3) until convergence.
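The four steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the presenters' program: all function names, the starting values, the convergence rule (largest parameter change below `tol`) and the weighting of double heterozygotes by the phase probability `phi` (taken from the current haplotype frequency estimates) are choices made here. Starting values are assumed to lie strictly inside (0, 1).

```python
def conditional_probs(hap):
    """omega[j][k] = Pr(QTL genotype k | marker genotype j); k = 0 (aa), 1 (Aa), 2 (AA)."""
    p11, p10, p01, p00 = hap
    joint = {
        2: [p10 ** 2, 2 * p11 * p10, p11 ** 2],                           # MM
        1: [2 * p10 * p00, 2 * (p11 * p00 + p10 * p01), 2 * p11 * p01],   # Mm
        0: [p00 ** 2, 2 * p01 * p00, p01 ** 2],                           # mm
    }
    return {j: [v / sum(row) for v in row] for j, row in joint.items()}


def em_step(data, hap, f):
    """One EM iteration. data: list of (y, j) pairs; f = [f0, f1, f2]."""
    p11, p10, p01, p00 = hap
    omega = conditional_probs(hap)
    # Probability that a double heterozygote has the phase built from the
    # haplotypes with frequencies p11 and p00 (current estimates).
    phi = p11 * p00 / (p11 * p00 + p10 * p01)
    counts = [0.0, 0.0, 0.0, 0.0]   # expected haplotype counts for p11, p10, p01, p00
    num = [0.0, 0.0, 0.0]
    den = [0.0, 0.0, 0.0]
    for y, j in data:
        # E-step: posterior QTL genotype probabilities (Eqs. 1-3)
        w = [omega[j][k] * (f[k] if y == 1 else 1.0 - f[k]) for k in range(3)]
        total = sum(w)
        post = [v / total for v in w]
        # M-step accumulators for haplotype frequencies (Eqs. 4-7)
        if j == 2:
            counts[0] += 2 * post[2] + post[1]
            counts[1] += 2 * post[0] + post[1]
        elif j == 0:
            counts[2] += 2 * post[2] + post[1]
            counts[3] += 2 * post[0] + post[1]
        else:
            counts[0] += post[2] + phi * post[1]
            counts[1] += post[0] + (1 - phi) * post[1]
            counts[2] += post[2] + (1 - phi) * post[1]
            counts[3] += post[0] + phi * post[1]
        # M-step accumulators for penetrances (Eqs. 8-10)
        for k in range(3):
            num[k] += post[k] * y
            den[k] += post[k]
    n = len(data)
    new_hap = tuple(c / (2 * n) for c in counts)
    new_f = [num[k] / den[k] for k in range(3)]
    return new_hap, new_f


def run_em(data, hap, f, tol=1e-8, max_iter=1000):
    """Iterate em_step until the largest parameter change falls below tol."""
    hap, f = tuple(hap), list(f)
    for _ in range(max_iter):
        new_hap, new_f = em_step(data, hap, f)
        change = max(abs(a - b) for a, b in zip(new_hap + tuple(new_f), hap + tuple(f)))
        hap, f = new_hap, new_f
        if change < tol:
            break
    return hap, f


# The eight (y, j) samples from the data-structure slide; starting values are arbitrary.
data = [(1, 2), (1, 1), (1, 1), (1, 0), (0, 2), (0, 1), (0, 1), (0, 0)]
hap_hat, f_hat = run_em(data, (0.3, 0.2, 0.2, 0.3), [0.3, 0.5, 0.7])
```

A useful sanity check on any implementation is that the updated haplotype frequencies sum to one and that the sum of the first two equals the observed frequency of marker allele M, since the marker genotypes are fully observed. Different starting values may lead to different solutions, so trying several is prudent.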

17 Three genotypic values
$\mu_2 = \mu + a$ for AA
$\mu_1 = \mu + d$ for Aa
$\mu_0 = \mu - a$ for aa
With the MLEs of $\mu_k$, we can estimate $\mu$, $a$ and $d$.
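The three equations above can be solved directly for the overall mean, the additive effect and the dominance effect. A short worked step, with hats denoting estimates (a notation assumed here):

$$\hat{\mu} = \frac{\hat{\mu}_2 + \hat{\mu}_0}{2}, \qquad \hat{a} = \frac{\hat{\mu}_2 - \hat{\mu}_0}{2}, \qquad \hat{d} = \hat{\mu}_1 - \frac{\hat{\mu}_2 + \hat{\mu}_0}{2}$$

Adding the AA and aa equations cancels $a$ and gives $\mu$, subtracting them cancels $\mu$ and gives $a$, and $d$ is then the deviation of the heterozygote from the midpoint.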

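18 How to estimate $\mu_k$?
$$f_2 = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\gamma} \exp\left[-\frac{(z-\mu_2)^2}{2}\right] dz$$
$$f_1 = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\gamma} \exp\left[-\frac{(z-\mu_1)^2}{2}\right] dz$$
$$f_0 = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\gamma} \exp\left[-\frac{(z-\mu_0)^2}{2}\right] dz$$
We can use numerical approaches to estimate $\mu_2$, $\mu_1$ and $\mu_0$.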
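One possible numerical approach is to invert the standard normal distribution function. Since each penetrance equals the normal distribution function evaluated at the genotypic value minus the threshold, only that difference is determined by a penetrance; the sketch below therefore fixes the threshold at 0 by default, which is an assumption made here rather than something stated in the slides. The function name is also hypothetical, and each penetrance must lie strictly between 0 and 1.

```python
from statistics import NormalDist


def genotypic_value(f_k, gamma=0.0):
    """Solve f_k = Pr(z >= gamma), z ~ N(mu_k, 1), for mu_k."""
    # Pr(z >= gamma) = Phi(mu_k - gamma), so mu_k = gamma + Phi^{-1}(f_k).
    return gamma + NormalDist().inv_cdf(f_k)


# Illustrative penetrances (made-up values), ordered aa, Aa, AA.
penetrances = [0.2, 0.5, 0.8]
mu_values = [genotypic_value(f_k) for f_k in penetrances]
```

A penetrance of 0.5 maps to a genotypic value equal to the threshold, and larger penetrances map to larger genotypic values.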

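19 Hypothesis test
H0: $f_2 = f_1 = f_0$
H1: at least one equality does not hold
$$LR = -2[\log L(\hat{\Theta}_0 \mid y, M, D) - \log L(\hat{\Theta}_1 \mid y, M, D)]$$
for $D$ in the interval $[\max\{-pq, -(1-p)(1-q)\}, \min\{p(1-q), (1-p)q\}]$, which keeps all four haplotype frequencies non-negative.
$\hat{\Theta}_0$ = MLE under H0
$\hat{\Theta}_1$ = MLE under H1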
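The admissible range of $D$ follows from requiring that the four haplotype frequencies $pq + D$, $p(1-q) - D$, $(1-p)q - D$ and $(1-p)(1-q) + D$ all be non-negative. A minimal sketch of that range, plus an evenly spaced grid over which an LR profile could be evaluated, is given below; the function names and the values of $p$ and $q$ are illustrative assumptions.

```python
def d_bounds(p, q):
    """Lower and upper limits of D that keep all haplotype frequencies >= 0."""
    lower = max(-p * q, -(1 - p) * (1 - q))
    upper = min(p * (1 - q), (1 - p) * q)
    return lower, upper


def d_grid(p, q, steps=11):
    """Evenly spaced values of D across the admissible interval."""
    lower, upper = d_bounds(p, q)
    width = (upper - lower) / (steps - 1)
    return [lower + i * width for i in range(steps)]


# Hypothetical allele frequencies, for illustration only.
lower, upper = d_bounds(0.6, 0.3)
grid = d_grid(0.6, 0.3)
```

For these made-up frequencies the interval runs from -0.18 to 0.12.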

20 LR as a function of D
[Figure: profile of LR against D, with D ranging from $\max\{-pq, -(1-p)(1-q)\}$ to $\min\{p(1-q), (1-p)q\}$]

21 Dr Ma will write the program.

