Optimal designs for one and two-colour microarrays using mixed models: A comparative evaluation of their efficiencies
Lima Passos, Winkens, Tan and Berger
DEMA 2008
Maastricht University, Department of Methodology and Statistics
Current situation One versus two colour comparisons Background
Woo et al., 2004: We observed good concordance in both estimated expression levels and statistical significance of common genes.
Smyth, 2005: All four platforms (cDNA, oligo, Agilent, Affymetrix) are reasonably precise and broadly agree; disagreement is due to sequence differences, not to noise.
Johns Hopkins press release, 2005: Different microarray systems are more alike than previously thought.
Patterson et al., 2006: The quality of the data stemming from one and two-colour arrays is equivalent in terms of reproducibility, sensitivity, specificity and accuracy; results regarding detection of differentially expressed genes are highly concordant.
Current opinions One or Two? Background
Hardiman, 2004: The choice of platform … should be guided by the content on that platform and the amount of RNA available for experimentation.
Agilent Technologies: Both one and two colour have their places in scientific research. One colour provides much quicker analysis and a more efficient method for analysing a large number of samples or those that span long time frames. Two colour provides the most accurate results, helping identify small incremental changes in samples to further specific investigations.
Patterson et al., 2006: The decision to use one or two colours will be determined by cost, experimental design considerations and personal preference; platform type should not be considered a primary factor ‘in decisions regarding experimental microarray design’.
Optimal designs One versus two? Objective
The majority of papers addressing microarray design questions use fixed effects models;
They are all specifically directed to two-colour microarrays;
Design papers with mixed models (also two-colour) are less abundant (Cui and Churchill, 2003; Landgrebe et al., 2004; Tempelman, 2005; Bueno Filho et al., 2006 and Tsai et al., 2006);
Is the choice of platform an important design issue?
Main question: What exactly is the impact of the choice of platform on the precision of the model parameter estimates? If there is an impact, what are the financial implications?
Design issues at stake
Two colour: which sample pairs (the design points) to distribute across the slides, and with which label assignment?
One colour: the design points consist of the groups themselves, not their pairwise combinations;
Mixed models Premises One colour: Two colour:
Covariance structure Premises
Block diagonal, compound symmetric structure of V: the dye swap is made at the level of technical replication, with identical sample pairs.
If not, i.e. lj with lk’, with k ≠ k’, the block-diagonal structure of the final covariance matrix V will be lost.
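A minimal sketch of such a covariance matrix, assuming a random intercept model in which each unit (a subject for one colour, a dye-swapped sample pair for two colour) contributes a fixed number of technical replicates. The function names and the variance values are hypothetical and chosen only for illustration.

```python
import numpy as np

def compound_symmetric_block(n_reps, var_between, var_error):
    """Covariance of the technical replicates of one unit.

    Diagonal entries are var_between + var_error; off-diagonal entries
    are var_between (the shared random intercept).
    """
    return var_error * np.eye(n_reps) + var_between * np.ones((n_reps, n_reps))

def block_diagonal_v(n_units, n_reps, var_between, var_error):
    """Stack one identical compound symmetric block per unit on the diagonal."""
    block = compound_symmetric_block(n_reps, var_between, var_error)
    return np.kron(np.eye(n_units), block)

# Hypothetical values: 3 units, 2 technical replicates each.
V = block_diagonal_v(n_units=3, n_reps=2, var_between=1.0, var_error=0.5)
```

Replicates of the same unit are correlated, replicates of different units are not; that zero pattern is what the block-diagonal premise refers to.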
Further premises Premises
Contrasts: Θ* = CΘ (first order interactions or main effects)
Optimality criteria:
Sequential search yields an approximate design
Exact designs: rounding up/down to the closest integer:
Relative efficiency one versus two:
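The two steps named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the rounding rule (floor, then hand the remaining units to the largest fractional parts) is one common choice, and the relative efficiency uses the usual D-criterion ratio of information-matrix determinants, since the slide's own formula is not available here. All function names are hypothetical.

```python
import numpy as np

def round_design(weights, n_total):
    """Turn approximate design weights into integer replications summing to n_total."""
    weights = np.asarray(weights, dtype=float)
    raw = np.round(weights * n_total, 9)   # guard against floating-point noise
    counts = np.floor(raw).astype(int)
    shortfall = n_total - counts.sum()
    # Assumed rule: leftover units go to the largest fractional parts.
    order = np.argsort(-(raw - counts), kind="stable")
    counts[order[:shortfall]] += 1
    return counts

def d_relative_efficiency(info_a, info_b):
    """(det(info_a) / det(info_b)) ** (1 / p) for p x p information matrices."""
    p = info_a.shape[0]
    _, logdet_a = np.linalg.slogdet(info_a)
    _, logdet_b = np.linalg.slogdet(info_b)
    return float(np.exp((logdet_a - logdet_b) / p))

# Hypothetical weights for three design points and 7 experimental units.
exact = round_design([0.5, 0.3, 0.2], n_total=7)
ratio = d_relative_efficiency(np.diag([2.0, 2.0]), np.diag([1.0, 1.0]))
```

A ratio above 1 means the first design carries more information per parameter than the second under the D-criterion.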
The cost function Premises
Given the prohibitive costs, it is advisable to estimate the costs of different microarray designs for comparative purposes:
cost = njc1 + nkSc2
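A small sketch of how such a cost comparison could be computed. Reading the first term as the subject cost and the second as the array cost is an assumption, because the slide does not define its symbols. The array counts assume two technical replicates per sample, one sample per one-colour array, and two samples per two-colour array with each subject used in a single dye-swapped pair. All names and prices are hypothetical.

```python
def design_cost(n_subjects, n_arrays, cost_per_subject, cost_per_array):
    """Total cost as subject cost plus array cost (assumed reading of the slide)."""
    return n_subjects * cost_per_subject + n_arrays * cost_per_array

def arrays_needed(n_subjects, n_tech_reps, samples_per_array):
    """Arrays required when every sample is hybridised n_tech_reps times."""
    return n_subjects * n_tech_reps // samples_per_array

# Hypothetical prices: 100 per subject, 500 per array, 12 subjects.
n_subjects = 12
cost_one = design_cost(n_subjects, arrays_needed(n_subjects, 2, 1), 100.0, 500.0)
cost_two = design_cost(n_subjects, arrays_needed(n_subjects, 2, 2), 100.0, 500.0)
difference = cost_one - cost_two
```

With these illustrative numbers the one-colour design needs twice as many arrays, so the gap between the two totals grows with the array price relative to the subject price.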
Ceteris paribus Assumptions/limitations Premises
To warrant comparability and a fair assessment of the two platforms:
model parameters and contrasts (common research questions) for the one and two-colour arrays are given on the same scale;
the number of technical replicates was held constant (2), and the search for optimal designs focused on the distribution of biological replicates;
homogeneity of the biological variances of the experimental groups, as well as independence and homogeneity of the residual error variances, were assumed to hold;
variance components were restricted to a random intercept model with a compound symmetric, block-diagonal covariance matrix (dye-swap with identical sample pairs!);
the price per subject was constant over all biological groups, and the one- and two-colour arrays cost the same.
3 x 3 factorial experiment Results
ξ* and ξI* - Two colour Results
The design measure ξ* Results
D-optimal design, main effects only
[Figure: the design measure shown as a pmf (design points xd with weights wd) and as a directed graph over the nodes 11, 12, 13, 21, 22, 23, 31, 32, 33]
One versus two?? Subjects to groups allocation Results
How many subjects? 11 8 5 12
One versus two?? Subjects to groups allocation Results ~
Efficiency comparison Results
[Figure panels: = N, ≠ I; ≠ N, = I]
Cost comparison Results
[Figure panels: Cost 1 – Cost 2 for = N, ≠ I; Cost 1 – Cost 2!!! for ≠ N, = I]
Results Cost comparison “adjusted for efficiency”
Final remarks Conclusion
Optimal allocation of subjects to experimental groups is highly concordant between the two platforms; hence the choice of platform will not affect the optimal allocation of subjects to groups;
When the numbers of subjects and arrays are varied while the statistical precision of the parameter estimates is held comparable, the choice of the one-colour over the two-colour platform, or vice versa, will be determined by the subject-to-array cost ratio;
On the grounds of statistical efficiency, and provided that acquiring arrays is more expensive than acquiring subjects, two-colour arrays should be considered an efficient alternative to one-colour arrays, specifically for studies involving class comparisons.