Epistatic QTL for gene expression in mice; potential for BXD expression data
Dirk-Jan de Koning*, Örjan Carlborg*, Robert Williams†, Lu Lu†, Chris Haley*
*Roslin Institute, UK
†University of Tennessee Health Science Center, USA
Introduction
Genetical genomics: exciting new tool
Analysis tools for experimental crosses widely available
More complex models have been proposed
Scale-up from 10 to 10K traits NOT trivial
Data
29 BXD RI lines
587 markers spanning all chromosomes
Array data for 12,242 genes
77 arrays
Normalized: µ = 8, σ² = 2
1-4 replicates/line
Research questions
Proportion of variation in gene expression due to epistasis?
Epistasis more prevalent for certain types of genes?
For epistatic pairs of genes: both trans or 1 cis?
Magnitude of epistasis in relation to differences between founder lines and deviation of F1?
Data and analysis issues
What is the repeatability?
What to do with outliers?
Means or single observations?
If means: weighted or un-weighted?
If weighted: what weights?
Single marker mapping or interval mapping?
Repeatability
Upper limit of heritability
Mixed linear model in Genstat
No consistent effect of sex and age
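As a rough illustration of what a per-trait repeatability estimate looks like, the sketch below uses a one-way ANOVA (intraclass correlation) estimator with line as the grouping factor. This is a swapped-in, simpler technique, not the Genstat mixed model named on the slide; the function name `repeatability` and its arguments are hypothetical.

```python
import numpy as np

def repeatability(values, lines):
    """ANOVA (intraclass correlation) estimate of repeatability for one trait.

    values: expression measurements, one per array
    lines:  line label for each array (same length as values)
    Assumption: lines are treated as a random factor; no sex or age terms.
    """
    values = np.asarray(values, dtype=float)
    labels, idx = np.unique(np.asarray(lines), return_inverse=True)
    k = len(labels)                       # number of lines
    n_i = np.bincount(idx)                # replicates per line
    total = n_i.sum()                     # number of arrays
    line_means = np.bincount(idx, weights=values) / n_i

    ss_between = np.sum(n_i * (line_means - values.mean()) ** 2)
    ss_within = np.sum((values - line_means[idx]) ** 2)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total - k)

    # Effective number of replicates for unbalanced data (1 to 4 per line here)
    n0 = (total - np.sum(n_i ** 2) / total) / (k - 1)
    var_between = max((ms_between - ms_within) / n0, 0.0)
    return var_between / (var_between + ms_within)
```

Negative between-line variance estimates are truncated at zero, so the returned value always lies between 0 and 1.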
Outliers
Outliers identified as individual expression measures more than 3 s.d. above or below the mean
3 treatments of outliers: ignore, remove, or shrink to 3 s.d.
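The three outlier treatments can be sketched as follows. The slide does not say which mean and s.d. are used; this sketch assumes the per-trait mean and sample s.d. across all arrays. The function name `treat_outliers` and the `mode` strings are hypothetical.

```python
import numpy as np

def treat_outliers(x, mode="ignore", k=3.0):
    """Apply one of three outlier treatments to the measurements of one trait.

    Assumption: an outlier lies more than k sample s.d. from the trait mean.
    """
    x = np.asarray(x, dtype=float)
    mu, sd = x.mean(), x.std(ddof=1)
    lo, hi = mu - k * sd, mu + k * sd
    if mode == "ignore":
        return x.copy()                    # keep every measurement as is
    if mode == "remove":
        return x[(x >= lo) & (x <= hi)]    # drop measurements outside the band
    if mode == "shrink":
        return np.clip(x, lo, hi)          # pull outliers back to mean +/- k s.d.
    raise ValueError(f"unknown mode: {mode!r}")
```

Note that "remove" changes the number of replicates for the affected line, which matters for any replicate-based weighting of line means.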
(Weighted) analysis of means
Weighted analyses should reflect difference in number of replicates
3 types of weighting: (1) no weighting; (2) inverse of variance (very crude estimate; strong effect of small SE!); (3) expected reduction in variance, n/[1+r(n-1)]
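A minimal sketch of the third weighting, assuming n is the number of replicates for a line and r is the repeatability of the trait (the slide does not define the symbols). Under that reading, the variance of a line mean is proportional to [1+r(n-1)]/n, so n/[1+r(n-1)] is its inverse. The names `mean_weight` and `weighted_ls` are hypothetical.

```python
import numpy as np

def mean_weight(n, r):
    """Weight for a line mean based on n replicates, given repeatability r."""
    return n / (1.0 + r * (n - 1))

def weighted_ls(y, X, w):
    """Weighted least squares: scale rows by sqrt(w), then solve ordinary LS."""
    sw = np.sqrt(np.asarray(w, dtype=float))
    Xw = np.asarray(X, dtype=float) * sw[:, None]
    yw = np.asarray(y, dtype=float) * sw
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta
```

With r = 0 the weight equals n (replicates are independent information); with r = 1 it equals 1 (extra replicates add nothing), so the weight of a 4-replicate line relative to a 1-replicate line shrinks as repeatability rises.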
QTL analysis*
Single QTL genome scan using least squares
2-dimensional scan fitting all pair-wise combinations of interacting QTL: exhaustive search
Only additive x additive interaction
Permutation test: analyses ‘approximated’ using GA
* Carlborg and Andersson, Genetical Research, 2002
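The two scans can be illustrated with a small least-squares sketch at marker positions only. It assumes RI genotypes coded -1/+1 for the two founder alleles and fits y = µ + a1·x1 + a2·x2 + aa·x1·x2 for each marker pair. It is not the authors' software, and the GA-based approximation used for the permutation tests is not shown; all function names are hypothetical.

```python
import itertools
import numpy as np

def rss(y, X):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def scan_1d(y, G):
    """Single-QTL scan. G is (n_lines, n_markers), coded -1/+1.

    Returns one F statistic per marker (additive effect vs. mean only).
    """
    n, m = G.shape
    ones = np.ones(n)
    rss0 = rss(y, ones[:, None])
    F = np.empty(m)
    for j in range(m):
        rss1 = rss(y, np.column_stack([ones, G[:, j]]))
        F[j] = (rss0 - rss1) / (rss1 / (n - 2))
    return F

def scan_2d(y, G):
    """Exhaustive pairwise scan with additive x additive interaction.

    Returns (rss, i, j) for the marker pair with the smallest residual SS.
    """
    n, m = G.shape
    ones = np.ones(n)
    best = (np.inf, -1, -1)
    for i, j in itertools.combinations(range(m), 2):
        X = np.column_stack([ones, G[:, i], G[:, j], G[:, i] * G[:, j]])
        r = rss(y, X)
        if r < best[0]:
            best = (r, i, j)
    return best
```

The exhaustive loop visits every unordered marker pair, which is why the cost of the 2D scan grows quadratically with the number of positions tested.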
“Training” data
96 traits pseudo-randomly selected: proportional representation of r
Individual phenotypes: 3 treatments of outliers
Mean phenotypes: 3 types of weighting
IM vs. single marker
Many scenarios to be evaluated
Computational considerations
Means (29) vs. ind. measurements (77)
Single marker vs. IM:
587 vs. 2100 tests for 1D scan
343,982 vs. 4,410,000 tests for 2D scan
1,000 genome-wide randomisations for 12,442 traits…
100,000 CPU hours on 512-processor Origin 3800 at CSAR, Manchester (£50K)
A flavour of the results
Acknowledgements