Topics: composite interval mapping; significance thresholds; confidence intervals; experimental design
Association between genotype and phenotype
  [Table: genotypes at Markers 1 to 6 and phenotype for individuals A to F; genotype entries not recoverable]
  Individual  Phenotype
  A           0.07
  B           0.35
  C           0.46
  D           0.67
  E           0.41
  F           0.30
Interval mapping vs. composite interval mapping
  Interval mapping
    Uses flanking marker genotypes to infer the probability of each genotype at positions in the interval between the markers
    Associates probability of genotype with phenotype
  Composite interval mapping
    Uses markers in addition to flanking markers to control for QTL located elsewhere
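The inference step in interval mapping can be sketched for the simplest case. The example below assumes a backcross (two genotype classes, coded 0 and 1), no crossover interference, and the Haldane map function; none of these choices is stated on the slide, and the function names and the 10 cM interval are hypothetical.

```python
import math

def haldane_r(d_cm):
    """Recombination fraction for a map distance in cM (Haldane, no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))

def qtl_prob_backcross(left, right, d_left_cm, d_total_cm):
    """P(QTL genotype == 1 | flanking marker genotypes) in a backcross.

    left, right: flanking marker genotypes, coded 0 or 1
    d_left_cm:   distance from the left marker to the putative QTL position
    d_total_cm:  distance between the two flanking markers
    """
    r1 = haldane_r(d_left_cm)
    r2 = haldane_r(d_total_cm - d_left_cm)

    def pair_likelihood(q):
        # Probability of the observed flanking pair given QTL genotype q.
        p_left = (1.0 - r1) if left == q else r1
        p_right = (1.0 - r2) if right == q else r2
        return p_left * p_right

    p1 = pair_likelihood(1)
    p0 = pair_likelihood(0)
    # The prior of 1/2 for each QTL genotype cancels.
    return p1 / (p1 + p0)

# Hypothetical 10 cM interval with left marker = 1 and right marker = 0.
profile = [(d, qtl_prob_backcross(1, 0, d, 10.0)) for d in (0.0, 2.5, 5.0, 7.5, 10.0)]
```

With discordant flanking markers the probability moves from 1 at the left marker to 0 at the right marker and passes through 0.5 at the midpoint; these probabilities are what is then associated with the phenotype.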
Composite interval mapping
  Uses markers in addition to flanking markers to control for QTL located elsewhere
    Including linked markers accounts for linked QTL: improved localisation of QTL
    Including unlinked markers reduces variation (noise) due to other QTL, and so increases power
Composite interval mapping
  Zeng 1994; Genetics 136:1457-1468
  There is a trade-off between estimation of QTL location (especially with linked QTL) and power to detect QTL with small effects
  QTL Cartographer
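The idea of conditioning on other markers can be illustrated with plain marker regression. This is a simplified sketch, not the likelihood method of Zeng (1994) or the QTL Cartographer implementation: it tests one marker with and without a cofactor marker in the model and uses LOD = (n/2) log10(RSS0/RSS1). The simulated data, effect sizes, and function names are all assumptions made for illustration.

```python
import math
import random

def residualize(v, basis):
    """Remove from v its projection onto each (mutually orthogonal) basis vector."""
    out = [float(x) for x in v]
    for b in basis:
        bb = sum(x * x for x in b)
        coef = sum(o * x for o, x in zip(out, b)) / bb
        out = [o - coef * x for o, x in zip(out, b)]
    return out

def orthogonal_basis(columns):
    """Gram-Schmidt: orthogonalize the columns, dropping any that are redundant."""
    basis = []
    for col in columns:
        r = residualize(col, basis)
        if sum(x * x for x in r) > 1e-9:
            basis.append(r)
    return basis

def lod_with_cofactors(y, test_marker, cofactors=()):
    """LOD for adding test_marker to a model that already holds the cofactors."""
    n = len(y)
    basis = orthogonal_basis([[1.0] * n] + [list(c) for c in cofactors])
    ry = residualize(y, basis)            # phenotype adjusted for cofactors
    rx = residualize(test_marker, basis)  # test marker adjusted for cofactors
    rss0 = sum(v * v for v in ry)         # residual SS without the test marker
    sxx = sum(v * v for v in rx)
    if sxx < 1e-9:
        return 0.0                        # test marker fully explained by cofactors
    sxy = sum(a * b for a, b in zip(rx, ry))
    rss1 = rss0 - sxy * sxy / sxx         # residual SS with the test marker
    return (n / 2.0) * math.log10(rss0 / rss1)

# Hypothetical data: a small-effect QTL at m_test, a large-effect QTL at an
# unlinked marker m_other (genotypes coded 0/1, as in a backcross).
rng = random.Random(1)
n = 300
m_test = [rng.randint(0, 1) for _ in range(n)]
m_other = [rng.randint(0, 1) for _ in range(n)]
y = [0.5 * a + 2.0 * b + rng.gauss(0.0, 1.0) for a, b in zip(m_test, m_other)]

lod_plain = lod_with_cofactors(y, m_test)
lod_cofactor = lod_with_cofactors(y, m_test, [m_other])
```

Including the unlinked cofactor removes the variance due to the other QTL, which will usually raise the LOD at the tested marker. At the other extreme, a cofactor that duplicates the tested marker leaves nothing to test, which gives a sense of why cofactors very close to the test position cost power.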
Significance thresholds
  How do you determine whether a QTL is statistically significant?
  Problem with multiple tests
  Arbitrary threshold
  OR
  Obtain an empirical distribution for the test statistic under the null hypothesis
    Permutation tests
Permutation test
  1. Permute genotypes/phenotypes (removes any real association)
  2. Rerun the genome-wide scan analysis, and calculate the highest test statistic across the genome
  3. Repeat many times
  [Table: genotypes at Markers 1 to 6 for individuals A to F, shown with the observed phenotypes and with two random permutations of them; genotype entries not recoverable]
  Individual  Observed  Permutation 1  Permutation 2
  A           0.07      0.67           0.41
  B           0.35      0.35           0.67
  C           0.46      0.30           0.46
  D           0.67      0.07           0.35
  E           0.41      0.46           0.07
  F           0.30      0.41           0.30
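The permutation procedure can be sketched as follows. The sketch makes several assumptions that are not on the slides: the "genome-wide scan" is reduced to a single-marker test (squared correlation between marker and phenotype) standing in for interval mapping, the genotype data are simulated, and the threshold is taken as the 95th percentile of the permutation maxima.

```python
import math
import random

def marker_stat(genotypes, phenotypes):
    """Squared correlation between one marker and the phenotype (stand-in statistic)."""
    n = len(phenotypes)
    mg = sum(genotypes) / n
    mp = sum(phenotypes) / n
    sgg = sum((g - mg) ** 2 for g in genotypes)
    spp = sum((p - mp) ** 2 for p in phenotypes)
    if sgg == 0 or spp == 0:
        return 0.0
    sgp = sum((g - mg) * (p - mp) for g, p in zip(genotypes, phenotypes))
    return sgp * sgp / (sgg * spp)

def max_stat(markers, phenotypes):
    """Highest test statistic across all markers (the 'genome-wide' maximum)."""
    return max(marker_stat(m, phenotypes) for m in markers)

def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical genome-wide threshold from permuted phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)                 # break any real association
        null_max.append(max_stat(markers, shuffled))
    null_max.sort()
    idx = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return null_max[idx], null_max

# Hypothetical data: 6 unlinked markers, 100 individuals, one QTL at marker index 2.
rng = random.Random(42)
markers = [[rng.randint(0, 1) for _ in range(100)] for _ in range(6)]
phenotypes = [0.8 * g + rng.gauss(0.0, 1.0) for g in markers[2]]

observed = max_stat(markers, phenotypes)
threshold, null_max = permutation_threshold(markers, phenotypes, n_perm=200)
significant = observed > threshold
```

Because each permutation records the maximum over the whole scan, the resulting threshold accounts for the multiple tests without assuming a particular distribution for the test statistic.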
Example
Permuted data
Distribution of test statistic by permutation
  [Figure panels: permutation results; traditional statistical analysis of real data]
Confidence intervals
  How do you assess uncertainty in the location of a QTL?
  1-LOD support interval
    LOD-based intervals are often too narrow
  Bootstrapping
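A 1-LOD support interval can be read off a LOD profile as the region around the peak where the LOD score stays within 1 unit of the maximum. The sketch below works on a grid of scan positions; the positions and LOD values are made up for illustration, and the function name is hypothetical.

```python
def lod_support_interval(positions, lods, drop=1.0):
    """Return (left, peak, right) positions of the LOD-drop support interval.

    The interval is the contiguous run of grid positions around the peak
    whose LOD score is at least (peak LOD - drop).
    """
    peak = max(range(len(lods)), key=lambda i: lods[i])
    cutoff = lods[peak] - drop
    lo = peak
    while lo > 0 and lods[lo - 1] >= cutoff:
        lo -= 1
    hi = peak
    while hi < len(lods) - 1 and lods[hi + 1] >= cutoff:
        hi += 1
    return positions[lo], positions[peak], positions[hi]

# Hypothetical LOD profile on a 5 cM grid.
positions = [0, 5, 10, 15, 20, 25, 30]
lods = [0.5, 1.8, 3.2, 4.1, 3.5, 2.0, 0.7]
interval = lod_support_interval(positions, lods)  # (10, 15, 20)
```

Some implementations extend the interval outward to the first grid positions that fall below the cutoff, which gives a slightly wider interval than this sketch.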
Bootstrapping
  Want to know what would happen if you repeated the experiment many times
  Use the existing data set to create new bootstrap data sets by random sampling with replacement
    A given observation may appear more than once
    Bootstrap data sets have the same sample size as the real data set
  Repeat the QTL analysis with each bootstrapped data set
  Bootstrapping is more robust/conservative
  [Tables: an original data set with columns Marker 1, Marker 2, and Pheno, followed by two bootstrap data sets resampled from it; entries not recoverable]
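The bootstrap steps above can be sketched as follows. As assumptions for illustration only: the QTL analysis is reduced to picking the marker with the highest squared correlation with the phenotype, the data are a simulated backcross chromosome with markers every 10 cM, and the interval is summarized as a simple percentile interval of the bootstrap peak positions.

```python
import math
import random

def marker_stat(genotypes, phenotypes):
    """Squared correlation between one marker and the phenotype (stand-in statistic)."""
    n = len(phenotypes)
    mg = sum(genotypes) / n
    mp = sum(phenotypes) / n
    sgg = sum((g - mg) ** 2 for g in genotypes)
    spp = sum((p - mp) ** 2 for p in phenotypes)
    if sgg == 0 or spp == 0:
        return 0.0
    sgp = sum((g - mg) * (p - mp) for g, p in zip(genotypes, phenotypes))
    return sgp * sgp / (sgg * spp)

def simulate_backcross(n, n_markers, r, qtl_index, effect, rng):
    """One chromosome; adjacent markers recombine with probability r."""
    markers = [[0] * n for _ in range(n_markers)]
    phenotypes = []
    for i in range(n):
        g = rng.randint(0, 1)
        for j in range(n_markers):
            if j > 0 and rng.random() < r:
                g = 1 - g
            markers[j][i] = g
        phenotypes.append(effect * markers[qtl_index][i] + rng.gauss(0.0, 1.0))
    return markers, phenotypes

def bootstrap_peak_positions(positions, markers, phenotypes, n_boot=500, seed=0):
    """Resample individuals with replacement and record the peak position each time."""
    rng = random.Random(seed)
    n = len(phenotypes)
    peaks = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # same sample size as the real data
        boot_pheno = [phenotypes[i] for i in idx]
        stats = [marker_stat([m[i] for i in idx], boot_pheno) for m in markers]
        best = max(range(len(stats)), key=lambda j: stats[j])
        peaks.append(positions[best])
    return peaks

def percentile_interval(values, level=0.95):
    """Simple percentile interval of the bootstrap estimates."""
    s = sorted(values)
    k = len(s)
    lo = s[math.floor((1 - level) / 2 * (k - 1))]
    hi = s[math.ceil((1 + level) / 2 * (k - 1))]
    return lo, hi

# Hypothetical data: markers at 0..50 cM, QTL at the 30 cM marker.
positions = [0, 10, 20, 30, 40, 50]
rng = random.Random(7)
markers, phenotypes = simulate_backcross(150, len(positions), 0.09, 3, 0.8, rng)
peaks = bootstrap_peak_positions(positions, markers, phenotypes, n_boot=200)
ci = percentile_interval(peaks)
```

Each bootstrap data set keeps an individual's genotypes and phenotype together, so only the composition of the sample changes between replicates.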
Experimental design
  Phenotyping – what phenotype to measure?
    Endophenotypes (Schmidt et al. 2003; Journal of Bone and Mineral Research 18:1486-1496)
  Type of cross
    Pedigree vs. cross
    Inbred vs. outbred
    F2 vs. backcross
  Sample size and power
    Beavis effect
  Marker density