Regression-based linkage analysis Shaun Purcell, Pak Sham.
Penrose (1938): quantitative trait locus linkage for sib-pair data.
Simple regression-based method: squared pair trait difference regressed on the proportion of alleles shared identical by descent ($\hat{\pi}$).
(HE-SD) $(X - Y)^2 = 2(1 - r) - 2Q(\hat{\pi} - 0.5) + \varepsilon$
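A short derivation may help show where the HE-SD intercept and slope come from. It rests on assumptions not spelled out on the slide: that X and Y are standardized (mean 0, variance 1), that r is the sibling correlation, that Q is the additive QTL variance, and that the sib covariance given the IBD proportion $\pi$ is $r + Q(\pi - 0.5)$.

$$
\begin{aligned}
E\left[(X - Y)^2 \mid \pi\right] &= \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y \mid \pi) \\
&= 2 - 2\left[r + Q(\pi - 0.5)\right] \\
&= 2(1 - r) - 2Q(\pi - 0.5)
\end{aligned}
$$

Under these assumptions the regression of the squared difference on $\pi$ has intercept term $2(1 - r)$ and slope $-2Q$.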
Haseman-Elston regression
[Figure: $(X - Y)^2$ plotted against IBD sharing; regression slope $= -2Q$]
Expected sib-pair allele sharing
[Figure: Sib 1 trait versus Sib 2 trait, with regions marked + or -]
Squared differences (SD)
[Figure: Sib 1 trait versus Sib 2 trait, with regions marked + or -]
Sums versus differences
Wright (1997), Drigalenko (1998): the phenotypic difference discards sib-pair QTL linkage information.
The squared pair trait sum provides extra information for linkage, independent of the information from HE-SD.
(HE-SS) $(X + Y)^2 = 2(1 + r) + 2Q(\hat{\pi} - 0.5) + \varepsilon$
Squared sums (SS)
[Figure: Sib 1 trait versus Sib 2 trait, with regions marked + or -]
SD and SS
[Figure: Sib 1 trait versus Sib 2 trait, with regions marked + or -]
New dependent variable to increase power: mean-corrected cross-product (HE-CP).
Other extensions: more than 2 sibs in a sibship; multiple trait loci and epistasis; multivariate; multiple markers; binary traits; other relative classes.
SD + SS (= CP)
[Figure: Sib 1 trait versus Sib 2 trait, with regions marked + or -]
Xu et al
With residual sibling correlation, HE-CP decreases in power and HE-SD increases in power.
Propose a weighting scheme.
Variance of SD
Variance of SS
Low sibling correlation
Increased sibling correlation
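How the two variances move with the sibling correlation can be sketched with a short calculation. It assumes, beyond what the slides state, that X and Y are standardized and bivariate normal with correlation r, and it ignores the small contribution of the QTL. Then $X - Y \sim N(0, 2(1 - r))$ and $X + Y \sim N(0, 2(1 + r))$, and for $Z \sim N(0, \sigma^2)$ we have $\mathrm{Var}(Z^2) = 2\sigma^4$, so:

$$
\mathrm{Var}\left[(X - Y)^2\right] = 2\left[2(1 - r)\right]^2 = 8(1 - r)^2, \qquad
\mathrm{Var}\left[(X + Y)^2\right] = 2\left[2(1 + r)\right]^2 = 8(1 + r)^2
$$

Under these assumptions, a higher sibling correlation shrinks the variance of SD and inflates the variance of SS, which is consistent with SD becoming relatively more informative as r increases.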
Clarify the relative efficiencies of existing HE methods.
Demonstrate equivalence between a new HE method and variance components methods.
Show application to the selection and analysis of extreme, selected samples.
Haseman-Elston regressions
HE-SD: $(X - Y)^2 = 2(1 - r) - 2Q(\hat{\pi} - 0.5) + \varepsilon$
HE-SS: $(X + Y)^2 = 2(1 + r) + 2Q(\hat{\pi} - 0.5) + \varepsilon$
HE-CP: $XY = r + Q(\hat{\pi} - 0.5) + \varepsilon$
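The three regressions can be illustrated with a minimal NumPy sketch. The function names `simulate_pairs` and `he_regressions` are hypothetical, and the data generator is a simplifying assumption (traits are bivariate normal given the IBD proportion, with covariance r + Q(pi - 0.5)), not the authors' simulation model. Each slope is converted to an estimate of Q using the coefficients in the equations above.

```python
import numpy as np

def simulate_pairs(n, r, q, seed=0):
    """Hypothetical helper: standardized sib-pair traits whose covariance,
    given the IBD proportion pi, is r + q * (pi - 0.5) (an assumption)."""
    rng = np.random.default_rng(seed)
    # IBD proportion is 0, 0.5 or 1 with probabilities 1/4, 1/2, 1/4
    pi = rng.choice([0.0, 0.5, 1.0], size=n, p=[0.25, 0.5, 0.25])
    rho = r + q * (pi - 0.5)
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    x = z1
    y = rho * z1 + np.sqrt(1.0 - rho**2) * z2
    return x, y, pi

def he_regressions(x, y, pi):
    """Fit HE-SD, HE-SS and HE-CP by least squares; return Q estimates."""
    slope_sd = np.polyfit(pi, (x - y) ** 2, 1)[0]  # expected slope: -2Q
    slope_ss = np.polyfit(pi, (x + y) ** 2, 1)[0]  # expected slope: +2Q
    slope_cp = np.polyfit(pi, x * y, 1)[0]         # expected slope: +Q
    return {
        "HE-SD": -slope_sd / 2.0,
        "HE-SS": slope_ss / 2.0,
        "HE-CP": slope_cp,
    }
```

For example, `he_regressions(*simulate_pairs(200_000, r=0.3, q=0.2))` should return three estimates near 0.2, with differing sampling variability.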
NCPs for H-E regressions
Table columns: Dependent; Variance of dependent; NCP per sib pair.
Table rows: $(X - Y)^2$; $(X + Y)^2$; $XY$.
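The per-pair NCPs can be approximated in a few lines. This is a sketch under stated assumptions rather than the slide's own table: standardized bivariate normal traits, a small QTL effect, and NCP taken as slope squared times Var(pi-hat) divided by the variance of the dependent variable. The function name `ncp_per_pair` and the default `var_pi=0.125` (the variance of the IBD proportion for sib pairs with complete information, 1/8) are choices made here for illustration.

```python
def ncp_per_pair(q, r, var_pi=0.125):
    """Approximate NCP per sib pair for each H-E regression (assumed formulas).

    Dependent variances under bivariate normality (assumption):
      Var[(X - Y)^2] = 8 (1 - r)^2, Var[(X + Y)^2] = 8 (1 + r)^2,
      Var[XY] = 1 + r^2.
    Slopes on pi-hat: -2Q, +2Q and +Q respectively.
    """
    return {
        "HE-SD": q**2 * var_pi / (2.0 * (1.0 - r) ** 2),
        "HE-SS": q**2 * var_pi / (2.0 * (1.0 + r) ** 2),
        "HE-CP": q**2 * var_pi / (1.0 + r**2),
    }
```

Under these approximations HE-CP matches the sum of HE-SD and HE-SS at r = 0, while at higher r the HE-SD value overtakes HE-CP.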
Weighted H-E
Squared sums and squared differences: orthogonal components in the population.
Optimal weighting: inverse of their variances.
Weighted H-E
The NCP is a function of: the square of the QTL variance; marker informativeness ($\mathrm{Var}(\hat{\pi}) = 0.125$ under complete information); the sibling correlation.
Equivalent to variance components to second-order approximation, Rijsdijk et al (2000).
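One way to see how these three ingredients combine is to add the per-pair NCPs of the two orthogonal components. The calculation below is a sketch that assumes standardized bivariate normal traits, so that the SD and SS regressions have approximate NCPs $Q^2\,\mathrm{Var}(\hat{\pi}) / [2(1 - r)^2]$ and $Q^2\,\mathrm{Var}(\hat{\pi}) / [2(1 + r)^2]$, and that the two components are independent:

$$
\mathrm{NCP}_{\text{weighted}} \approx \frac{Q^2\,\mathrm{Var}(\hat{\pi})}{2}\left[\frac{1}{(1 - r)^2} + \frac{1}{(1 + r)^2}\right]
= Q^2\,\mathrm{Var}(\hat{\pi})\,\frac{1 + r^2}{(1 - r^2)^2}
$$

Compared with the corresponding HE-CP approximation, $Q^2\,\mathrm{Var}(\hat{\pi}) / (1 + r^2)$, the ratio is $(1 + r^2)^2 / (1 - r^2)^2 \ge 1$, with equality only at $r = 0$.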
Combining into one regression
New dependent variable: a linear combination of squared sum and squared difference, weighted by the population sibling correlation.
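A combined dependent variable of this kind can be sketched as follows. The exact scaling used on the slide is not recoverable, so the form below is an assumption: the squared sum and squared difference are each divided by the square of (1 + r) or (1 - r), which weights them by the inverse of their approximate variances. The function name `he_com` is hypothetical, and `r` is treated as a known population sibling correlation.

```python
import numpy as np

def he_com(x, y, pi, r):
    """Regress an inverse-variance-weighted combination of SS and SD on pi.

    Assumed dependent variable:
        (x + y)^2 / (1 + r)^2 - (x - y)^2 / (1 - r)^2
    Its expected slope on pi is 2Q * [1/(1 + r)^2 + 1/(1 - r)^2],
    so the slope is rescaled to return an estimate of Q.
    """
    dv = (x + y) ** 2 / (1.0 + r) ** 2 - (x - y) ** 2 / (1.0 - r) ** 2
    slope = np.polyfit(pi, dv, 1)[0]
    scale = 2.0 * (1.0 / (1.0 + r) ** 2 + 1.0 / (1.0 - r) ** 2)
    return slope / scale
```

The design choice here is to fold the weights into the dependent variable so that a single ordinary least-squares fit is enough, which keeps the simple regression framework.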
HE-COM
[Figure: Sib 1 trait versus Sib 2 trait, with regions marked + or -]
Simulation
Single QTL simulated: accounts for 10% of trait variance; 2 equifrequent alleles; additive gene action; assume complete IBD information at QTL.
Residual variance: shared and nonshared components; residual sibling correlation: 0 to 0.5.
10,000 sibling pairs: 100 replicates; 1000 under the null.
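The simulation design above might be sketched along the following lines. This is an illustrative reconstruction, not the authors' code: the function name `simulate_sib_pairs`, its parameters, and the interpretation of the residual sibling correlation as the shared fraction of the residual variance are all assumptions. Parental alleles are tracked explicitly so that IBD sharing at the QTL is known without error.

```python
import numpy as np

def simulate_sib_pairs(n_pairs=10_000, qtl_var=0.1, resid_corr=0.25, seed=0):
    """Simulate sib-pair traits and exact IBD sharing at a biallelic additive QTL."""
    rng = np.random.default_rng(seed)
    rows = np.arange(n_pairs)
    # Parental alleles coded 0/1 with equal frequency; two alleles per parent
    father = rng.integers(0, 2, size=(n_pairs, 2))
    mother = rng.integers(0, 2, size=(n_pairs, 2))
    # Index (0 or 1) of the parental allele each sib inherits; columns: sib 1, sib 2
    pat = rng.integers(0, 2, size=(n_pairs, 2))
    mat = rng.integers(0, 2, size=(n_pairs, 2))
    g1 = father[rows, pat[:, 0]] + mother[rows, mat[:, 0]]
    g2 = father[rows, pat[:, 1]] + mother[rows, mat[:, 1]]
    # Proportion of alleles shared IBD: 0, 0.5 or 1
    shared_pat = (pat[:, 0] == pat[:, 1]).astype(float)
    shared_mat = (mat[:, 0] == mat[:, 1]).astype(float)
    pi = (shared_pat + shared_mat) / 2.0
    # Genotype count has variance 2pq = 0.5, so scale the additive effect
    a = np.sqrt(qtl_var / 0.5)
    resid_var = 1.0 - qtl_var
    shared = rng.normal(0.0, np.sqrt(resid_corr * resid_var), n_pairs)
    sd_unique = np.sqrt((1.0 - resid_corr) * resid_var)
    x = a * (g1 - 1) + shared + rng.normal(0.0, sd_unique, n_pairs)
    y = a * (g2 - 1) + shared + rng.normal(0.0, sd_unique, n_pairs)
    return x, y, pi
```

Replicates under the alternative could be generated by varying `seed`, and null replicates by setting `qtl_var=0.0`.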
Unselected samples
Sample selection
A sib pair's squared mean-corrected DV is proportional to its expected NCP.
Equivalent to variance-components based selection scheme, Purcell et al (2000).
Sample selection
[Figure: sibship NCP as a function of Sib 1 trait and Sib 2 trait]
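Selecting pairs by their squared mean-corrected dependent variable could look roughly like this. The sketch makes several assumptions: the DV is taken to be an inverse-variance-weighted combination of squared sum and squared difference (the slide's exact formula is not available), the mean correction uses the sample mean, and the name `select_informative_pairs` is hypothetical.

```python
import numpy as np

def select_informative_pairs(x, y, r, fraction=0.05):
    """Return indices of the most informative pairs and every pair's score.

    Score = squared mean-corrected DV, where the assumed DV is
        (x + y)^2 / (1 + r)^2 - (x - y)^2 / (1 - r)^2.
    """
    dv = (x + y) ** 2 / (1.0 + r) ** 2 - (x - y) ** 2 / (1.0 - r) ** 2
    score = (dv - dv.mean()) ** 2
    n_keep = int(round(fraction * len(score)))
    # Sort from highest to lowest score and keep the top fraction
    order = np.argsort(score)[::-1]
    return order[:n_keep], score
```

With 10,000 pairs and `fraction=0.05`, this keeps the 500 highest-scoring pairs.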
Analysis of selected samples
500 (5%) most informative pairs selected; r = 0.05 and r = 0.60.
Selected samples : H0
Selected samples : HA
Variance-based weighting scheme
SD and SS weighted in proportion to the inverse of their variances.
Implemented as an iterative estimation procedure: loses the simple regression-based framework.
Product of pair values corrected for the family mean, for sibs 1 and 2 from the jth family.
Adjustment for high shared residual variance.
For pairs, reduces to HE-SD.
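Why the family-mean-corrected product reduces to HE-SD for pairs can be checked directly. The notation is assumed here, since the slide's own formula did not survive: $X_{1j}$ and $X_{2j}$ are the trait values of sibs 1 and 2 in family $j$, and $\bar{X}_j = (X_{1j} + X_{2j})/2$ is their mean.

$$
(X_{1j} - \bar{X}_j)(X_{2j} - \bar{X}_j)
= \left(\frac{X_{1j} - X_{2j}}{2}\right)\left(-\frac{X_{1j} - X_{2j}}{2}\right)
= -\frac{(X_{1j} - X_{2j})^2}{4}
$$

For a sibship of size two, the corrected product is therefore a fixed negative multiple of the squared difference, so regressing it on $\hat{\pi}$ carries the same information as HE-SD.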
Conclusions
Advantages: efficient; robust; easy to implement.
Future directions: weight by marker informativeness; extension to general pedigrees.
The End