High resolution QTL mapping in genotypically selected samples from experimental crosses


Zongli Xu (1), Fei Zou (2), Todd J. Vision (1)
Departments of (1) Biology and (2) Biostatistics, University of North Carolina at Chapel Hill
correspondence:

Introduction

Selective mapping (Fig. 1) is an experimental design strategy for genome-wide, high-density linkage mapping of molecular markers in experimental crosses that optimizes the map resolution obtained for a given amount of genotyping effort [1]. It is especially suited to permanent mapping populations (e.g. recombinant inbred lines).

In principle, a similar selection strategy could be used in a quantitative trait locus (QTL) mapping population, with the aim of maximizing the resolution obtained when only a limited number of permanent lines can be propagated or phenotyped. Though large mapping populations are invariably desirable for QTL mapping [2], practical constraints on population size are commonplace.

We refer to the choice of individuals for phenotyping on the basis of their genotypes, namely the inferred positions of crossovers, as selective sampling. Here we describe our results on the statistical consequences of using such a selected sample for QTL mapping, particularly with regard to detection power and resolution.

Figure 1. Samples are selected from larger mapping populations to optimize the distribution of bin lengths, where a bin is the shortest interval between two crossovers in a sample. Of the four possible selected samples of size 3 from the pool of gametes shown on the left, the one shown on the right would be chosen because it minimizes the sum of the squares of the bin lengths.

Simulation

A base population of diploid F2 recombinant inbred lines (RIL) was simulated. From this population, n individuals were selected either at random or using selective sampling to minimize the sum of squares of the bin lengths (as implemented in the software MapPop [1]). We varied
- the sample fraction (the proportion of the base population in the selected sample)
- the spacing of markers on the map
- genome length

QTLs were added at either fixed or random positions. For simulations where additive effects were considered to be random variables, they were sampled from a gamma (1,2) distribution. QTL analysis was performed using either marker regression or simple interval mapping with QTL Cartographer [3].

Crossover Enrichment

The total number of crossovers in the selected sample relative to that expected in a random sample of the same size is referred to as the crossover enrichment (CE). We found that CE was inversely related to the sample fraction, marker spacing and map length (Figure 2).

We found that CE could be very closely predicted by an empirical formula (not preserved in this transcript) in which L is the map length in cM, f is the sample fraction, and A is a constant that is determined by the type of base population. For an F2 RIL population, A=500. For backcross RIL and doubled haploid (DH) populations, A=750 and 1200, respectively. The fit of this equation to the observed values of CE was quite good. Within the realistic parameter range that we explored, the fit was assessed by R² for the F2 RIL, backcross RIL, and DH populations (values not preserved in this transcript).

Figure 2. CE after selection of RILs with varied framework marker intervals (A) and genome lengths (B). Base population size is 500. Genome length is 1000 cM in A. Marker interval is 10 cM in B. Each point is obtained from 10,000 individuals.

QTL Detection Power

Detection power was calculated by simulation. For the single-QTL simulations, a QTL was counted as detected if any marker on the map exceeded the significance threshold. Power was inversely related to CE, but the relationship was nearly flat when the marker-QTL distance was less than 2.5 cM (Figure 3A).

In order to determine whether CE alone is responsible for the differences in QTL detection power between selected and random samples, we compared selected samples to CE-adjusted random samples in which crossover sites were independently assigned but at the same frequency as in a given selected sample. Interestingly, the power in the selected samples was still less than that in CE-adjusted random samples (Figure 3B).

Figure 3. Detection power for a single QTL (additive effect 0.5, heritability 0.2) located equidistant from the two centermost markers. Samples of size 100. Series correspond to marker intervals of 1 (◊), 5, 10 and 20 cM. QTL analysis was done using marker regression. (A) Populations differing by CE alone. Map length was 1000 cM. (B) Selected (dotted) and CE-adjusted random (solid) samples. Map length was 100 cM.

Pseudointerference

Pseudointerference refers to the non-independence of crossover sites within a selected sample. We hypothesized that pseudointerference could be contributing to the reduced power and increased resolution of selected samples relative to CE-adjusted random samples.

We found that selected samples had proportionally fewer short within-individual intercrossover intervals. Interestingly, the mode of the distribution of interval lengths was at the same cM distance as the marker spacing used for selection (Figure 6A). This altered distribution led to fewer close double-crossovers, and thus more observable recombinations in selected samples relative to CE-adjusted random samples (Figure 6B).

Figure 6. (A) The distribution of intercrossover interval lengths within individuals for selected samples at three marker intervals versus a random sample. Map length was 100 cM, base population size was 500, and sample fraction was 0.1. Each point represents the average of 10,000 individuals. The spike at 100 cM in the random sample consists of chromosomes without crossovers. (B) The number of recombinations per individual in selected (dashed line) and CE-adjusted random (solid line) samples along a 100 cM chromosome. Marker intervals were 5 cM, 10 cM or 20 cM. Each value was obtained from 1,000 replicates with a sample size of 100.

Sensitivity and Specificity

Sensitivity (Sn) and specificity (Sp) were calculated by formulas (not preserved in this transcript) from the numbers of true positives (TP), false positives (FP) and false negatives (FN), which were determined based upon the overlap between the positions of true QTLs and the likelihood ratio peaks surpassing a predetermined significance threshold. Specificity was greater, and sensitivity lower, in selected samples in both single and multiple-QTL simulations. The difference was most pronounced when markers were sparse (Figure 4) and heritability was low (not shown).

Figure 4. Sensitivity and specificity in random and selected samples. (A) One QTL at a random position with heritability of 0.2. (B) Five QTL with random positions and effect sizes having an overall heritability of 0.5. Map length was 1000 cM, base population size was 100, and sample fraction was 0.2. QTL positions were always over 100 cM from each other in the five QTL simulations. Analysis was performed using marker regression.

QTL Mapping Resolution

Selection increased QTL mapping resolution in both single and multiple QTL simulations, as determined by the width of the 1-LOD confidence intervals (Figure 5). The effect was greatest with dense markers and a short map. The confidence intervals of CE-adjusted random samples were slightly larger than those of selected samples.

Figure 5. 1-LOD confidence intervals for QTLs in random and selected samples. (A) One QTL at a fixed position with h² = 0.2. Key to legend: sampling strategy/map length/marker interval. Sampling strategy (color) was either random, selected (Sel) or CE-adjusted (CE). Symbols denote map length. (B) Five QTL with random positions (at least 100 cM apart) and effects; combined h² = 0.5.

Conclusions

Selective sampling increases the recombination frequency between a marker and a QTL due to both
- crossover enrichment and
- pseudointerference.

As a result
- QTL detection power is reduced when markers are sparse
- QTL map resolution is substantially increased with little loss of detection power if markers are sufficiently dense.

Selective sampling can help maximize map resolution when
- there are logistical constraints on mapping population size
- marker density is at least 1 per 5-10 cM.

References

[1] Vision TJ, Brown DG, Shmoys DB (2000) Genetics 15,
[2] Beavis WD (1998) pp in Paterson AH (ed) Molecular Analysis of Complex Traits. CRC Press.
[3] Basten CJ, Weir BS, Zeng ZB (2002) QTL Cartographer v 1.16

This work has been supported by NSF grant DBI to TJV.
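The selection criterion in Figure 1 (choose the sample that minimizes the sum of the squares of the bin lengths) can be illustrated with a small sketch. This is not the MapPop algorithm: it is an exhaustive search that is only practical for tiny pools. The function names, the example pool of four single-crossover gametes, and the choice to treat the chromosome ends as bin boundaries are all assumptions made for illustration.

```python
from itertools import combinations

def bin_lengths(sample, map_length):
    """Lengths of the bins delimited by every crossover position in the sample.

    Assumption: the chromosome ends (0 and map_length) also act as bin boundaries.
    """
    breakpoints = sorted({0.0, map_length, *(x for ind in sample for x in ind)})
    return [b - a for a, b in zip(breakpoints, breakpoints[1:])]

def sum_sq_bin_lengths(sample, map_length):
    """Selection criterion: sum of the squared bin lengths of a sample."""
    return sum(length ** 2 for length in bin_lengths(sample, map_length))

def select_sample(pool, n, map_length):
    """Exhaustively pick the n individuals that minimize the criterion.

    Illustrative only; the number of candidate samples grows combinatorially.
    """
    return min(combinations(pool, n),
               key=lambda s: sum_sq_bin_lengths(s, map_length))

# Hypothetical pool: four gametes on a 100 cM chromosome, each given as a
# tuple of crossover positions in cM.
pool = [(20.0,), (25.0,), (50.0,), (80.0,)]
best = select_sample(pool, 3, 100.0)
print(best, sum_sq_bin_lengths(best, 100.0))
```

In this toy pool, the gametes with crossovers at 20 and 25 cM are nearly redundant, so the criterion favours a sample whose breakpoints are spread more evenly along the chromosome.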
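The transcript does not preserve the formulas used for sensitivity and specificity. Because only TP, FP and FN are named, the sketch below assumes the conventional definitions Sn = TP / (TP + FN) and Sp = TP / (TP + FP); this is an assumption, not a formula confirmed by the poster. The overlap rule (a peak within a fixed tolerance of a true QTL), the function name, and the example positions are likewise hypothetical.

```python
import math

def sensitivity_specificity(true_positions, peak_positions, tolerance_cm):
    """Return (Sn, Sp) under the assumed definitions
    Sn = TP / (TP + FN) and Sp = TP / (TP + FP)."""
    # A true QTL counts as detected (TP) if any significant peak lies within
    # tolerance_cm of it; the remaining true QTLs are false negatives.
    tp = sum(any(abs(q - p) <= tolerance_cm for p in peak_positions)
             for q in true_positions)
    fn = len(true_positions) - tp
    # A peak with no true QTL within tolerance_cm is a false positive.
    fp = sum(all(abs(q - p) > tolerance_cm for q in true_positions)
             for p in peak_positions)
    sn = tp / (tp + fn) if tp + fn else math.nan
    sp = tp / (tp + fp) if tp + fp else math.nan
    return sn, sp

# Hypothetical example: five true QTLs and four significant peaks (cM).
true_qtl = [100.0, 300.0, 500.0, 700.0, 900.0]
peaks = [105.0, 310.0, 620.0, 905.0]
sn, sp = sensitivity_specificity(true_qtl, peaks, tolerance_cm=15.0)
print(sn, sp)
```

Here three of the five QTLs are matched by a peak and one peak matches no QTL, so the example shows how a sample can have high specificity while missing some true QTLs.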