QTL Analysis: Concept


QTL Analysis: Concept
Generation procedure: Parents A × B → F1 → F2 → F2:3 (alternatives: BC1, RIL, DHL).
Field (plant height, PHT) and laboratory (marker genotypes scored A, H, or B) data for N lines and M markers:

Line   PHT [cm]   Marker:  1  2  3  4  5  ..  M
1      210                 B  B  H  H  A  ..  A
2      190                 H  A  H  A  A  ..  H
3      203                 B  B  H  H  H  ..  A
4      159                 H  H  B  B  B  ..  H
5      206                 H  B  H  H  A  ..  B
.      .                   .  .  .  .  .  ..  .
N      171                 A  H  H  H  A  ..  A

Office: [Figure: LOD score profile for PHT along chromosome 1.]

QTL mapping is a multi-step procedure that involves field and lab work as well as an elaborate statistical analysis. In general, two homozygous lines that differ significantly for the trait under study are crossed. The F1 hybrid is selfed to produce a segregating F2 population. F2 individuals are genotyped using molecular markers and selfed to produce F2:3 lines, which provide enough seed for repeated field trials.

QTL mapping idea (Lynch and Walsh): By crossing two lines, linkage disequilibrium is created between loci that differ between the parental lines. This creates associations between marker loci and linked segregating QTL.

Experimental designs:
F2:3: in contrast to all other populations listed here, three marker classes can be observed, so dominance can be evaluated.
AIL: advanced intercross lines; randomly mated populations; higher resolution, but decreased power of QTL detection.
RIL: homozygous genetic background; field trials can be repeated in multiple locations and years.
DHL
BC

Marker information and phenotypic data are combined, and statistical tools are used to map and characterize QTL.

QTL Analysis: Single Marker Analysis
A basic method to identify markers linked to QTL is the single marker analysis.
[Figure: plant height (cm, axis 160 to 240) by marker genotype class; total mean 196 cm.]
umc157 (classes AA, Aa, aa): class means of 195 and 197 cm shown, F = 0.48 (ns)
umc130: AA = 201 cm, Aa = 196 cm, aa = 191 cm, F = 6.47**
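The F values on this slide come from comparing phenotype means across marker classes. A minimal sketch of such a one-way ANOVA F statistic follows. The slide gives only class means, so the individual plant heights are invented to match the umc130 means, and `marker_f_statistic` is a hypothetical helper name.

```python
from statistics import mean


def marker_f_statistic(groups):
    """One-way ANOVA F statistic for phenotypes grouped by marker genotype.

    groups: dict mapping a marker class (e.g. "AA") to a list of phenotypes.
    """
    values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(values)
    k = len(groups)   # number of marker classes
    n = len(values)   # number of individuals
    # Variation of class means around the grand mean.
    ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2
                     for ys in groups.values())
    # Variation of individuals around their own class mean.
    ss_within = sum((y - mean(ys)) ** 2
                    for ys in groups.values() for y in ys)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical plant heights (cm); class means are 201, 196 and 191.
heights = {
    "AA": [203, 199, 201],
    "Aa": [196, 198, 194],
    "aa": [191, 189, 193],
}
f_value = marker_f_statistic(heights)
```

A large F for a marker suggests that its classes differ in mean phenotype, which is the evidence for linkage to a QTL used in single marker analysis.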

QTL Analysis: Single Marker Model (F2)
[Figure: marker locus M/m linked to QTL Q/q at recombination frequency r, with expressions for the additive effect and the dominance effect.]
Expected QTL genotypic frequencies conditional on marker genotype:

Marker genotype   QQ            Qq                    qq            Marker mean
MM                $(1-r)^2$     $2r(1-r)$             $r^2$         $\mu_{MM}$
Mm                $r(1-r)$      $(1-r)^2 + r^2$       $r(1-r)$      $\mu_{Mm}$
mm                $r^2$         $2r(1-r)$             $(1-r)^2$     $\mu_{mm}$
QTL value         $\mu_1$       $\mu_2$               $\mu_3$

The mean for each marker genotype is equal to the frequency of each QTL genotype times the value of that QTL genotype, given the marker genotype. Example:
$\mu_{MM} = \mu_1 (1-r)^2 + \mu_2 \, 2r(1-r) + \mu_3 r^2$
We have three equations but four parameters ($\mu_1$ to $\mu_3$, and $r$). QTL effects and the position of the QTL are confounded. We can only solve for the QTL effects if r is fixed.
F tests on the contrasts of marker classes test the following hypotheses: a > 0, d > 0, r < 0.5.
Schön, 2002
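The conditional frequency table can be turned into a small calculation of the expected marker-class means. This is a sketch only: the function names are hypothetical, and the QTL genotypic values passed in are illustrative numbers, not data from the slides.

```python
def qtl_freqs_given_marker(r):
    """Frequencies of QTL genotypes (QQ, Qq, qq) given each F2 marker genotype."""
    s = 1.0 - r  # probability of no recombination between marker and QTL
    return {
        "MM": (s * s, 2 * r * s, r * r),
        "Mm": (r * s, s * s + r * r, r * s),
        "mm": (r * r, 2 * r * s, s * s),
    }


def marker_class_means(mu, r):
    """Expected phenotype mean of each marker class.

    mu: tuple (mu1, mu2, mu3) of genotypic values for QQ, Qq, qq.
    """
    return {
        marker: sum(freq * value for freq, value in zip(freqs, mu))
        for marker, freqs in qtl_freqs_given_marker(r).items()
    }


# Illustrative values: an additive QTL (a = 25/3 cm) at r = 0.2 from the marker.
mu = (196 + 25 / 3, 196.0, 196 - 25 / 3)
means = marker_class_means(mu, r=0.2)
```

With these inputs the marker classes differ by less than the QTL genotypes do, because recombination mixes QTL genotypes within each marker class.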

QTL Analysis: Single Marker Model (F2)
Example: Plant height, umc130
X(MM) = 201 cm, X(Mm) = 196 cm, X(mm) = 191 cm
[Figure: Case 1, marker M/m and QTL Q/q at r = 0; Case 2, marker M/m and QTL Q/q at r = 0.2.]
In single marker analysis, the only information we have is the mean of each marker class. Based on this information it is possible to determine whether a marker is linked with a QTL. However, it is not possible to determine the effect of a QTL, because effect and QTL position are confounded.

PHT (cm)       r = 0     r = 0.2   r = 0.4
Add. effect      5.0       8.3      25.0
X(QQ)          201.0     204.3     221.0
X(Qq)          196.0     196.0     196.0
X(qq)          191.0     187.7     171.0
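The table can be reproduced by fixing r and solving the marker-class equations for the QTL values. The sketch below assumes the usual F2 parameterization (midpoint plus additive effect a and dominance effect d), under which the marker contrasts shrink by (1 - 2r) and (1 - 2r)^2. The function name is hypothetical.

```python
def qtl_values_from_marker_means(x_MM, x_Mm, x_mm, r):
    """Solve for QTL genotypic values from marker-class means at a fixed r."""
    shrink = 1.0 - 2.0 * r
    if shrink <= 0.0:
        raise ValueError("r must be below 0.5 for the marker to be informative")
    # The additive contrast at the marker equals a * (1 - 2r).
    a = (x_MM - x_mm) / 2.0 / shrink
    # The dominance contrast at the marker equals d * (1 - 2r)^2.
    d = (x_Mm - (x_MM + x_mm) / 2.0) / shrink ** 2
    midpoint = (x_MM + x_mm) / 2.0 - 2.0 * r * (1.0 - r) * d
    return {"a": a, "d": d,
            "QQ": midpoint + a, "Qq": midpoint + d, "qq": midpoint - a}


# Marker-class means for umc130 from the slide (cm).
for r in (0.0, 0.2, 0.4):
    est = qtl_values_from_marker_means(201.0, 196.0, 191.0, r)
    print(f"r={r}: a={est['a']:.1f} QQ={est['QQ']:.1f} "
          f"Qq={est['Qq']:.1f} qq={est['qq']:.1f}")
```

The same three marker means are compatible with a small QTL at the marker or a large QTL further away, which is the confounding described above.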

4. Association Analysis Concepts

Dissecting A Quantitative Trait: Time Versus Resolution
[Figure: research time in years (1 to 5) versus resolution in bp (1 to 1x10^7) for associations, F2 QTL mapping, RI QTL mapping, NILs, and positional cloning.]

Resolution Versus Allelic Range
[Figure: alleles evaluated (1 to >40) versus resolution in bp (1 to 1x10^7) for associations in diverse germplasm, associations in narrow germplasm, pedigree, F2 or RIL mapping, NIL, and positional cloning.]

Association Tests
Evaluate whether nucleotide polymorphisms associate with phenotype.
Natural populations
Exploit extensive recombination
[Figure: nucleotide polymorphisms (T, A, G, C) shown against phenotype values of 1.3 m, 1.5 m, 1.4 m, 1.8 m, and 2.0 m.]
Spend a lot of time on this slide.

Association mapping
Mainstay of human genetics
One of a few possible approaches
Reproducibility was an issue
Cystic fibrosis: Kerem, et al. (1989). Science 245, 1073-1080.
Alzheimer's disease: Corder et al. (1994). Nature Genet. 7, 180-184.

Associations may result from at least three causes
1. The locus is the cause of the phenotype
2. The locus is in linkage disequilibrium with the cause of the phenotype (linked and highly correlated)

Complete Linkage Disequilibrium
$D' = 1$, $r^2 = 1$
[Figure: haplotypes at Locus 1 and Locus 2.]
Same mutational history and no recombination. No resolution.
Adapted from Rafalski (2002) Curr Opin Plant Biol 5:94-100.

Linkage Disequilibrium
$D' = 1$, $r^2 = 0.33$
[Figure: haplotypes at Locus 1 and Locus 2.]
Different mutational history and no recombination. Some resolution.

Linkage Equilibrium
$D' = 0$, $r^2 = 0$
[Figure: haplotypes at Locus 1 and Locus 2.]
Same mutational history with recombination. Resolution.

3. Population structure can produce associations
[Figure: G and T alleles in samples from the Andes and the U.S., with association tests giving P << 0.001 and P = 0.04.]
These non-functional associations can be accounted for by estimating the population structure using random markers.

5. QTL mapping analysis

QTL Analysis: Interval Mapping
Simple Interval Mapping
Composite Interval Mapping
[Figure: PlabQTL LOD profile along a linkage group from 0 to 150 cM, with marker and cofactor positions marked; peak at 96 cM, LOD = 4.7.]
The confounding of QTL effect and position in single marker analysis is solved by using interval mapping approaches.
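The LOD score plotted in such a profile compares a model with a QTL against a model without one. As a rough illustration, the sketch below computes a regression-based LOD at a marker position only, using LOD = (n/2) log10(RSS_null / RSS_marker) under a normal model. It is not interval mapping and not the PlabQTL algorithm. The plant heights are invented and `lod_at_marker` is a hypothetical name.

```python
import math
from statistics import mean


def lod_at_marker(groups):
    """Regression-based LOD score at a marker, from phenotypes by marker class.

    Assumes normally distributed residuals and a non-zero within-class
    residual sum of squares.
    """
    values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(values)
    # Residual sum of squares without a QTL (one common mean).
    rss_null = sum((y - grand_mean) ** 2 for y in values)
    # Residual sum of squares with one mean per marker class.
    rss_marker = sum((y - mean(ys)) ** 2
                     for ys in groups.values() for y in ys)
    return (len(values) / 2.0) * math.log10(rss_null / rss_marker)


# Hypothetical plant heights (cm) by marker genotype.
heights = {
    "AA": [203, 199, 201],
    "Aa": [196, 198, 194],
    "aa": [191, 189, 193],
}
lod = lod_at_marker(heights)
```

Interval mapping extends this idea to positions between markers by weighting the possible QTL genotypes by their probabilities given the flanking markers.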

QTL Analysis: Power of QTL detection
[Figure: power (10 to 100) versus heritability (0.4 to 1.0) for N = 600, N = 300, and N = 100.]
Power: probability of finding a QTL.
Simulation model: additive genetic model; F2 or F3 lines; maize genetic model, marker interval 20 cM; 16500 F2 and F3 individuals that were partitioned into the different populations.
Utz and Melchinger, 1994
Utz, H.F., and A.E. Melchinger. 1994. Comparison of different approaches to interval mapping of quantitative trait loci. pp. 195-204. ed. J.W. van Ooijen, and J. Jansen. Biometrics in Plant Breeding: Applications of Molecular Markers. Proceedings of the Ninth Meeting of the EUCARPIA section Biometrics in Plant Breeding. Wageningen. CPRO-DLO, Wageningen, the Netherlands.

QTL Analysis: Conclusions
A trait is usually affected by a number of QTL. In the analysis, the largest ones are the easiest to detect, but they make detection of the others difficult. Models can adjust for the large QTL so that the others can be detected.

QTL Analysis: Conclusions
QTL mapping combines qualitative linkage analysis with quantitative genetic analysis: it tests for association between marker genotypes and phenotypic trait values.
Single marker analysis is easy to perform, but QTL effect and position are confounded. This results in low power of QTL detection.
Interval mapping approaches increase power of QTL detection and allow the estimation of QTL effects and position.

QTL Analysis: Conclusions
Estimates of QTL effects and the proportion of the genotypic variance explained by QTL are biased due to genotypic and environmental sampling.
Estimates of QTL position show low precision.
With large populations a large number of QTL is found for complex traits.
When conducting a QTL study you may wish to use a large population size.

6. Candidate Genes

Functional Genomics Using Diversity
[Figure: forward genetics and reverse genetics paths linking trait, QTL, positional cloning of the gene, candidate gene, and candidate polymorphism.]

Association Analysis
Identification of More Favorable Alleles
Enhanced Marker Assisted Breeding

7. Linkage Disequilibrium Analysis

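Properties of LD
The basic measure of LD is:
$D_{AB} = P_{AB} - p_A p_B$, with $D_{AB} = -D_{Ab} = -D_{aB} = D_{ab}$
Haplotype frequencies:

        A                               a                               Total
B       $P_{AB} = p_A p_B + D_{AB}$     $P_{aB} = p_a p_B - D_{AB}$     $p_B$
b       $P_{Ab} = p_A p_b - D_{AB}$     $P_{ab} = p_a p_b + D_{AB}$     $p_b$
Total   $p_A$                           $p_a$                           1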

Linkage Disequilibrium versus Generations Since its Creation
$r_{AB} \propto (1-c)^g$
[Figure: disequilibrium $r_{AB}$ (0.1 to 1) versus generation $g$ (100 to 500) for recombination rates c = 0.1, 0.02, 0.01, 0.005, and 0.001.]
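The decay curves on this slide follow directly from the (1 - c)^g relationship. A minimal sketch, assuming disequilibrium starts at an initial value r0 and that allele frequencies stay constant; the function names are hypothetical.

```python
import math


def ld_after_generations(g, c, r0=1.0):
    """Disequilibrium remaining after g generations at recombination rate c."""
    return r0 * (1.0 - c) ** g


def ld_half_life(c):
    """Generations needed for disequilibrium to fall to half its initial value."""
    return math.log(0.5) / math.log(1.0 - c)


# Recombination rates shown on the slide.
for c in (0.1, 0.02, 0.01, 0.005, 0.001):
    print(f"c={c}: r after 100 generations = {ld_after_generations(100, c):.3f}, "
          f"half-life = {ld_half_life(c):.0f} generations")
```

Tightly linked loci (small c) keep their disequilibrium for hundreds of generations, while loosely linked loci lose it within a few dozen.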

Other Measures of LD
$D_{AB}$ can be divided by the maximum value it can obtain:
$D'_{AB} = D_{AB} / \max(-p_A p_B, -p_a p_b)$ if $D_{AB} < 0$
$D'_{AB} = D_{AB} / \min(p_A p_b, p_a p_B)$ if $D_{AB} > 0$
The sampling properties of $D'_{AB}$ are not well understood.
$r^2_{AB} = D_{AB}^2 / (p_A p_B p_a p_b)$
$E(r^2) = 1 / (1 + 4Nc)$
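These measures are straightforward to compute from haplotype counts. The sketch below assumes both loci are polymorphic. The counts are hypothetical, chosen so that the results match the D' and r2 values quoted on the earlier LD slides. `ld_measures` is an illustrative name.

```python
def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) from counts of the four two-locus haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    p_a, p_b = 1.0 - p_A, 1.0 - p_B
    d = n_AB / n - p_A * p_B
    # Scale D by the most extreme value it can take given the allele frequencies.
    if d < 0:
        d_max = max(-p_A * p_B, -p_a * p_b)
    else:
        d_max = min(p_A * p_b, p_a * p_B)
    d_prime = d / d_max
    r2 = d * d / (p_A * p_a * p_B * p_b)
    return d, d_prime, r2


complete = ld_measures(6, 0, 0, 6)      # two haplotypes only
partial = ld_measures(3, 3, 0, 6)       # three haplotypes, no recombinant
equilibrium = ld_measures(3, 3, 3, 3)   # all four haplotypes, equal counts
```

The middle case shows why D' and r2 can disagree: D' stays at 1 as long as one of the four haplotypes is missing, while r2 drops when the allele frequencies at the two loci differ.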

LD generally decays rapidly with distance
[Figure: $r^2$ versus distance.]
Remington, D. L., et al. 2001. PNAS-USA 98:11479-11484, and unpublished data.

Population Effect on Linkage Disequilibrium in Maize

Investigator   Population Studied   Extent of LD
Gaut           Landraces            <1000 bp
Buckler        Diverse Inbreds      2000 bp
Rafalski       Elite Lines          100 kb? (6 kb euchromatin?)

Reviewed in Flint-Garcia, S. A. et al. 2003. Annual Review of Plant Biology 54:357-374.

8. Association Analysis

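Allele Case-Control Test
Marker allele counts:

             allele 1          allele 2          Total
Affected     $n_{1|aff}$       $n_{2|aff}$       $2 n_{aff}$
Unaffected   $n_{1|unaff}$     $n_{2|unaff}$     $2 n_{unaff}$
Total        $n_1$             $n_2$             $2N$ (N individuals)

If $n_{aff} = n_{unaff}$, then for $k$ alleles:
$X^2 = \sum_i \frac{(n_{i|aff} - n_{i|unaff})^2}{n_{i|aff} + n_{i|unaff}} \sim \chi^2_{k-1}$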
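As a worked illustration of this statistic, the sketch below computes X2 from allele counts in equally sized affected and unaffected samples. The counts are invented and the function name is hypothetical. A P-value is returned only for the two-allele case, where the chi-square tail with one degree of freedom can be written with `math.erfc`.

```python
import math


def allele_case_control(aff_counts, unaff_counts):
    """Allele-based case-control test for equally sized samples.

    aff_counts, unaff_counts: allele counts per allele, in the same order.
    Returns (X2, degrees of freedom, P-value or None when df != 1).
    """
    if sum(aff_counts) != sum(unaff_counts):
        raise ValueError("this form of the test assumes n_aff == n_unaff")
    x2 = sum((a - u) ** 2 / (a + u)
             for a, u in zip(aff_counts, unaff_counts) if a + u > 0)
    df = len(aff_counts) - 1
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(x2 / 2.0)) if df == 1 else None
    return x2, df, p_value


# Hypothetical counts: 50 affected and 50 unaffected individuals (100 alleles each).
x2, df, p_value = allele_case_control([60, 40], [40, 60])
```

For more than two alleles the statistic is compared with a chi-square distribution on k - 1 degrees of freedom, which would need a statistics library or a table.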

Population Stratification: American Indian and Diabetes Knowler 1988 Am J Hum Genet 43, 520-526.

Use SSR Markers to Estimate Population Structure
Method: Pritchard, J. K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945-59.
Example: Remington, D. L., et al. 2001. Proc Natl Acad Sci U S A 98:11479-11484.
[Figure: 8 Stiff Stalk, 38 Non-Stiff Stalk, 30 Sub-Tropical.]

Logistic Regression Ratio Test For Association
Adapted from the Pritchard case-control approach.
[Equation: ratio test statistic, where:]
C = candidate polymorphism distribution
T = trait value
Q = matrix of population membership
Evaluated by logistic regression.
Significance evaluated by permutation based on haplotype distribution in populations.
Pritchard, J. K., M. Stephens, N. A. Rosenberg, and P. Donnelly. 2000. Am J Hum Genet 67:170-181.

Population Structure Estimates Greatly Reduce Estimated Type I Error Rates
[Figure panels: Fields, Flowering Time, Height.]

Su1
Sugary1 is an isoamylase, a starch debranching enzyme.
Sequenced fully from 32 diverse lines.
Sampled 2 small parts of the gene from 102 lines.
[Figure: gene diagram, 11100 bp.]
Whitt, S. R., et al. 2002. PNAS-USA 99:12959-12962.

su1 Promoter & 1st Exon
Two distinct alleles
Sweet phenotype not associated

su1 Coding Region
Two distinct alleles
Sweet phenotype associated with W578R

Su1
578: W → R
Based on a survey of 12 kbp from 32 to 102 lines.

Dwarf8 functional variation
[Figure: Dwarf8 features (2 amino acid deletion, MITE, indel, SH2 domain) and days to silking relative to B73.]
When controlling for population structure, Dwarf8 associates with flowering time and plant height across 12 environments.
Thornsberry et al. 2001 Nat. Genet.

9. Type I and Type II Error

Statistics - Hypothesis Test

                                 Null Hypoth True       Null Hypoth False
Reject Null Hypothesis           Type I Error (α)       Correct
Fail to Reject Null Hypothesis   Correct                Type II Error (β)

Significance level (P-value threshold) = α
Power = 1 - β

Experimentwise P value
Each statistical test has a Type I error rate.
Test 20 independent SNPs and, on average, one will be significant at P < 0.05 by chance alone.
The Bonferroni correction essentially divides the significance threshold by the number of tests.
It is often too conservative (little power), as markers are correlated.
The Churchill and Doerge permutation test helps estimate the experimentwise P. It permutes the entire genotype relative to the phenotypes, which preserves the correlation among markers.
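The points on this slide can be illustrated numerically. The sketch below covers the expected number of false positives, the Bonferroni threshold, and a simplified permutation threshold in the spirit of Churchill and Doerge. The test statistic (squared marker-phenotype correlation), the function names, and the default of 200 permutations are assumptions made for illustration, not details from the slides.

```python
import random


def expected_false_positives(alpha, n_tests):
    """Expected number of significant tests when every null hypothesis is true."""
    return alpha * n_tests


def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the experimentwise error near alpha."""
    return alpha / n_tests


def r_squared(x, y):
    """Squared Pearson correlation; 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)


def permutation_threshold(genotypes, phenotypes, n_perm=200, alpha=0.05, seed=1):
    """Experimentwise threshold for the maximum statistic over all markers.

    genotypes: one list of genotype codes per marker. The phenotypes are
    shuffled as a whole, so correlations among markers are left intact.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(r_squared(marker, shuffled) for marker in genotypes))
    maxima.sort()
    return maxima[min(n_perm - 1, int((1.0 - alpha) * n_perm))]
```

An observed statistic is then declared significant at the experimentwise level only if it exceeds the permutation threshold, which adapts to the correlation among markers instead of assuming independent tests.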

Power of approaches
Sample size: 100 to 1000 are typical
Heritability of trait: H2 = 10% to 90%
  Depends on ability to measure trait
  Interactions with environment
Depends on statistical properties of test

Association Approaches Complement QTL Linkage Mapping

                               Linkage (RILs)    Association
Resolution                     10,000,000 bp     2000 bp
Genome Scan                    High Power        Little Power
Allelic Range                  Low (1 or 2)      High (10s)
Statistical Power per Allele   High              Low