Association Mapping as a Breeding Strategy

Presentation transcript:

Association Mapping as a Breeding Strategy
Mark E. Sorrells and Flavio Breseghello
Department of Plant Breeding & Genetics, Cornell University

Presentation Overview
- A Genetic Model for Association Mapping in Plant Breeding Populations
- Comparison of Different Plant Breeding Materials for Association Mapping
- Association Mapping of Kernel Size and Milling Quality in Soft Winter Wheat Cultivars

A Genetic Model for AM in Plant Breeding Populations: Association as Conditional Probabilities
Based on population genetics theory (Hedrick 2005).
- Breeding pool: Gene={a}, Marker={m,M}
- New parent: (A,M)
- Over t generations: recombination (c) and selection on A or M (w)
- Initial haplotype frequencies: Pr(A,M)=φ, Pr(a,M)=θ, Pr(a,m)=1-φ-θ, Pr(A,m)=0
- Quantity of interest: Pr(A|M,c,t,φ,θ,w), the "probability of a plant with marker allele M to have gene allele A, t generations after the introduction of A"
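As a worked illustration of the conditional probability above, the no-selection case (w=1) can be written in closed form. This is a sketch that assumes random mating and the standard two-locus decay of linkage disequilibrium; the symbol D_t (disequilibrium after t generations) is introduced here for illustration and the slides may formulate the model differently.

```latex
% Assumptions: random mating, no selection (w = 1), recombination fraction c.
% Allele frequencies stay constant: Pr(A) = \varphi, \; Pr(M) = \varphi + \theta.
\begin{align*}
D_0 &= \Pr(A,M)\Pr(a,m) - \Pr(A,m)\Pr(a,M) = \varphi\,(1-\varphi-\theta) \\
D_t &= (1-c)^t\, D_0 \\
\Pr(A \mid M)_t &= \frac{\Pr(A)\Pr(M) + D_t}{\Pr(M)}
                 = \varphi + \frac{\varphi\,(1-\varphi-\theta)}{\varphi+\theta}\,(1-c)^t
\end{align*}
% Check at t = 0: \Pr(A \mid M)_0 = \varphi / (\varphi + \theta).
% Example with \varphi = 0.05:
%   novel marker,        \theta = 0,    c = 0.10: \Pr(A \mid M)_t = 0.05 + 0.95\,(0.90)^t
%   pre-existing marker, \theta = 0.05, c = 0.01: \Pr(A \mid M)_t = 0.05 + 0.45\,(0.99)^t
% The two curves cross at t = \ln(0.45/0.95) / \ln(0.90/0.99) \approx 7.8 generations.
```

Under these assumptions Pr(A|M) starts at φ/(φ+θ) and decays toward Pr(A)=φ at rate (1-c) per generation, so a pre-existing marker allele (θ>0) starts lower but can decay more slowly if it is more tightly linked.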

Recombination x initial frequency of M in the breeding pool
Freq. new parent: φ=0.05. Relative fitness: w=1. Freq. of M in original population: θ = 0, 0.05, 0.25. Freq. recombination: c.
[Figure: Pr(A|M) plotted against t generations for each θ, with marks at ~8 and ~18 generations.]
A novel marker allele at 10 cM distance can be more predictive of the QTL allele than an allele 1 cM away if the latter was already present in the original population at a frequency of 0.05.

Recombination x selection for M
Freq. new parent: φ=0.05. Relative fitness: w = 4 (red), 2 (green), 1.25 (blue). Freq. of M in original population: 0. Freq. recombination: c = 0.01, 0.05, 0.10.
[Figure: Pr(A|M) and Pr(A) plotted against generations.]
- The generation at which the marker is depleted [Pr(A|M)=Pr(A)] depends on the selection intensity applied.
- The final frequency of A depends on selection and on the tightness of linkage between marker and gene.
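The behaviour described on the two slides above can be explored numerically. The following is a minimal sketch, not the authors' implementation: it assumes random mating, selection acting on haplotypes (relative fitness w for carriers of the selected allele), and the standard two-locus recombination update. The function name `simulate_marker_gene` and its parameters are hypothetical.

```python
def simulate_marker_gene(c, t, phi, theta, w=1.0, select_on="M"):
    """Return [(Pr(A|M), Pr(A)), ...] for generations 0..t.

    c: recombination fraction between gene and marker
    phi: initial Pr(A,M), the new parent's frequency (must be > 0)
    theta: initial Pr(a,M), marker allele M already in the breeding pool
    w: relative fitness of haplotypes carrying the selected allele (assumption)
    """
    if select_on not in ("A", "M"):
        raise ValueError("select_on must be 'A' or 'M'")
    locus = 0 if select_on == "A" else 1  # position of the allele in the key
    x = {"AM": phi, "Am": 0.0, "aM": theta, "am": 1.0 - phi - theta}
    history = []
    for _ in range(t + 1):
        pr_m = x["AM"] + x["aM"]
        pr_a = x["AM"] + x["Am"]
        history.append((x["AM"] / pr_m, pr_a))
        # Selection step: reweight haplotypes and renormalise.
        fitness = {h: (w if h[locus] == select_on else 1.0) for h in x}
        mean_w = sum(fitness[h] * x[h] for h in x)
        x = {h: fitness[h] * x[h] / mean_w for h in x}
        # Recombination step: disequilibrium D shrinks by a factor (1 - c).
        d = x["AM"] * x["am"] - x["Am"] * x["aM"]
        x["AM"] -= c * d
        x["am"] -= c * d
        x["Am"] += c * d
        x["aM"] += c * d
    return history


if __name__ == "__main__":
    # Selection for M at three intensities, novel marker (theta = 0), c = 0.05.
    for w in (4.0, 2.0, 1.25):
        pr_a_given_m, pr_a = simulate_marker_gene(0.05, 50, 0.05, 0.0, w=w)[-1]
        print(f"w={w}: Pr(A|M)={pr_a_given_m:.3f}, Pr(A)={pr_a:.3f}")
```

With w=1 the sketch reduces to pure recombination, which allows a novel marker at c=0.10 to be compared against a pre-existing marker at c=0.01; with w>1 and selection on M, the marker allele moves toward fixation and Pr(A|M) converges to Pr(A).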

Summary
- In plant breeding populations, the locus most associated with the trait is not necessarily the closest locus.
- Loosely linked markers can still be useful for MAS if a high intensity of selection is applied.

MAS for Complex Traits: Issues
- Accurate detection and estimation of QTL effects
- Pre-existing marker alleles in a breeding population can be linked to non-target QTL alleles
- Multiple QTL alleles can have different relative values
- Gene x gene and gene x environment interactions

Association Analysis as a Breeding Strategy
- Most association studies have focused on estimating linkage disequilibrium and fine mapping.
- Breeding programs are dynamic, complex genetic entities that require frequent evaluation of marker / phenotype relationships.
References:
Breseghello, F., and M.E. Sorrells. 2006. Association mapping of kernel size and milling quality in wheat (Triticum aestivum L.) cultivars. Genetics 172:1165-1177.
Breseghello, F., and M.E. Sorrells. 2006. Association analysis as a strategy for improvement of quantitative traits in plants. Crop Sci. In press.

Association Mapping versus QTL Mapping
Association mapping can be conducted directly on the breeding material, therefore:
- Direct inference from data analysis to breeding is possible
- Phenotypic variation is observed for most traits of interest
- Marker polymorphism is higher than in biparental populations
- Routine variety trial evaluations provide phenotypic data
Association mapping also provides other useful information:
- The organization of genetic variation in relevant breeding populations can be described
- Novel alleles can be identified and their relative value can be assessed as often as necessary

Association Mapping versus QTL Mapping
Type I error (false positives) can be higher because of:
- Unaccounted population structure
- Simultaneous selection of combinations of alleles at different loci
- High sampling variance of rare alleles
Type II error can be higher (low power) because of:
- Lower LD than in biparental mapping populations
- Unbalanced design due to differences in allele frequencies
- A larger multiple-testing problem because of lower LD

Integration of Association Analysis in a Breeding Program
[Flow diagram; elements in the order they appear:]
- Germplasm
- Parental Selection
- Hybridization (elite germplasm feeds back into the hybridization nursery)
- New Populations
- Marker Assisted Selection; Selection (Intermating)
- Novel & Validated QTL/Marker Associations
- New Synthetics, Lines, Varieties
- Evaluation Trials
- Elite Synthetics, Lines, Varieties
- Genotypic & Phenotypic data

Types of Populations
- Germplasm Bank Collection: a collection of genetic resources including landraces, exotic material and wild relatives.
- Synthetic Populations: outcrossing populations (either male-sterile or manually crossed) synthesized from inbred lines. May be used for recurrent selection.
- Elite Lines: inbred lines (and checks) manipulated with the objective of releasing new varieties in the short term.

Characteristics Related to Association Mapping: Practical Aspects
Aspects of AM | Germplasm bank | Synthetic populations | Elite germplasm
Sample | Core collection | Segregating progenies | Elite lines and checks
Sample turnover | Static | Ephemeral | Gradually substituted
Source of phenotypic data | Screenings | Progeny tests | Yield trials
Type of traits | High heritability traits; domestication traits | Depends on the evaluation scheme | Low heritability traits: yield, resistance to abiotic stresses

Characteristics Related to Association Mapping: Genetic Expectations
Aspects of AM | Germplasm bank | Synthetic populations | Elite germplasm
Linkage disequilibrium | Low | Intermediate and fast-decaying | High
Allele diversity within samples | Variable | 1 or 2 alleles (diploid species) | 1 allele (inbred lines)
Population structure: Medium [other cells missing]
Allele diversity among samples: Intermediate [other cells missing]

Characteristics Related to Association Mapping: Potential Applications
Aspects | Germplasm bank | Synthetic populations | Elite germplasm
Power | Low | Intermediate and decreasing | High; could allow genome scan
Resolution | High; could allow fine mapping | Intermediate and increasing |
Use of significant markers | Transfer of new alleles by marker-assisted backcross | Incorporation in selection index | Forward breeding: MAS in progenies (requires validation)

Previous QTL Information
- Doubled-haploid population AC Reed x Grandin: QTL for kernel size (width) near Xwmc18-2D [figure: Width, chromosome 2D]
- Recombinant inbred population Synthetic W7984 x Opata (ITMI population): QTL for kernel size (length) on 5A and 5B [figure: Length, chromosome 5B]

Association Analysis
Materials
- 95 of 149 soft winter wheat cultivars from the Northeastern US: mostly recent releases, representing 35 seed companies / institutions
- 93 SSR loci: 33 on 2D, 20 on 5A, 9 on 5B, 31 on 16 other chromosomes
- Rare alleles (freq < 5%): treated as missing for LD and population structure analysis; treated as alleles for AM analysis
Methods
- Population structure: 36 "unlinked" SSR markers; Structure without admixture; SPAGeDi program (Hardy & Vekemans) for kinship
- Visualization: Factorial (Multiple) Correspondence Analysis (Benzecri, 1973. L'Analyse des correspondances. Dunod)
- Linkage disequilibrium: Tassel (maizegenetics.net) used to compute r², with p-values from 1000 permutations
- Association analysis: linear mixed-effects model fitted with the lme function in R (nlme package), with markers as fixed effects (selected from previously identified QTL regions) and subpopulations or kinship as random effects (subpopulations showed no obvious differentiating characteristics)
- Two-marker models: tested by likelihood ratio test
Reference: Yu, J., G. Pressoir, et al. 2006. A unified mixed-model method for association mapping accounting for multiple levels of relatedness. Nature Genetics 38:203-208.
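The linkage disequilibrium step listed above (r² with permutation p-values) can be illustrated with a small sketch. This is not Tassel's implementation: it assumes inbred lines scored at two biallelic loci coded 0/1, whereas SSR loci are multiallelic. The function names and the toy genotype vectors are hypothetical.

```python
import random


def r_squared(locus_a, locus_b):
    """r^2 between two biallelic loci scored 0/1 on the same inbred lines."""
    n = len(locus_a)
    p_a = sum(locus_a) / n
    p_b = sum(locus_b) / n
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0  # monomorphic locus: no LD


def permutation_p_value(locus_a, locus_b, n_perm=1000, seed=0):
    """Share of permutations whose r^2 is at least the observed r^2."""
    rng = random.Random(seed)
    observed = r_squared(locus_a, locus_b)
    shuffled = list(locus_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any association between the loci
        if r_squared(locus_a, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0


if __name__ == "__main__":
    # Toy data: 20 lines, two loci in complete LD.
    a = [0] * 10 + [1] * 10
    b = [0] * 10 + [1] * 10
    print(r_squared(a, b), permutation_p_value(a, b))
```

Shuffling one locus across lines keeps allele frequencies fixed while destroying the association, so the permutation distribution reflects the r² expected by chance for that sample size and those frequencies.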

Estimating Relatedness: The K Matrix
Fij = (Qij - Qm) / (1 - Qm) (Ritland, Loiselle)
If Fij is negative, it is set to zero.
Relatedness (K) is the n x n matrix with elements Fij (rows i, columns j, from F11 to Fnn), where Θij ≅ Fij.
In cattle studies the analogous matrix is estimated from pedigrees, and it controls for the polygene effect.
Reference: Yu, J., G. Pressoir, et al. 2006. A unified mixed-model method for association mapping accounting for multiple levels of relatedness. Nature Genetics 38:203-208.
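The formula above can be sketched in a few lines. The slide does not define Qij and Qm; this example assumes Qij is the proportion of loci at which inbred lines i and j carry the same allele (identity in state) and Qm is the mean of Qij over all distinct pairs. It is a simplification for illustration and not the SPAGeDi estimator; the function name is hypothetical.

```python
def kinship_matrix(genotypes):
    """K matrix with F_ij = (Q_ij - Q_m) / (1 - Q_m), negatives set to zero.

    genotypes: one list of allele calls per inbred line, same loci order.
    Q_ij (assumption): share of loci where lines i and j carry the same allele.
    Q_m (assumption): mean Q_ij over all pairs of distinct lines.
    """
    n = len(genotypes)
    n_loci = len(genotypes[0])
    q = [
        [
            sum(1 for a, b in zip(genotypes[i], genotypes[j]) if a == b) / n_loci
            for j in range(n)
        ]
        for i in range(n)
    ]
    pairs = [q[i][j] for i in range(n) for j in range(i + 1, n)]
    q_m = sum(pairs) / len(pairs)
    if q_m >= 1.0:
        raise ValueError("all lines are identical; kinship is undefined")
    return [
        [max(0.0, (q[i][j] - q_m) / (1.0 - q_m)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy data: lines 0 and 1 are identical, line 2 shares no alleles with them.
    lines = [
        ["a", "b", "c", "d"],
        ["a", "b", "c", "d"],
        ["x", "y", "z", "w"],
    ]
    for row in kinship_matrix(lines):
        print(row)
```

Subtracting Qm expresses similarity relative to a random pair from the sample, so pairs less similar than average receive negative values that are truncated to zero, as stated on the slide.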

Population Structure: Sample Subdivisions
Subpopulation | No. of Varieties | Fst
… | 19 | 0.337
… | 32 | 0.111
… | 13 | 0.295
… | 31 | 0.064
Total | 95 | 0.188
Moderate population subdivision

Population Structure: Factorial Correspondence Analysis
[Figure: orthogonal views of the 4 soft winter wheat subpopulations, S1 to S4]

Linkage Disequilibrium: Germplasm Sample Selection
- 149 lines genotyped with 18 unlinked SSR markers
- Most similar lines were excluded
- "Normalizing" the sample drastically reduced LD among unlinked markers
[Figure: r² probability for unlinked SSR markers; panels: 149 lines, 95 lines]

Definition of a baseline LD specific for our sample
- Defined as the 95th percentile of the distribution of r² among unlinked loci
- r² estimates above this value are probably due to genetic linkage
- Baseline LD for this sample: r² = 0.0654
[Figure: distribution of r² among unlinked loci with fitted normal curve; the 95th percentile of the normal distribution is marked as the LD baseline]
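The baseline definition above amounts to taking a percentile of the r² values observed between unlinked loci. The sketch below uses an empirical percentile with linear interpolation; that is an assumption, since the figure suggests the percentile may have been taken from a fitted normal curve instead. The function names and example values are hypothetical and are not the study's data.

```python
def ld_baseline(unlinked_r2, percentile=95.0):
    """Empirical percentile of r^2 among unlinked loci (linear interpolation)."""
    values = sorted(unlinked_r2)
    if not values:
        raise ValueError("need at least one r^2 value")
    rank = (percentile / 100.0) * (len(values) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(values) - 1)
    return values[lower] + (rank - lower) * (values[upper] - values[lower])


def above_baseline(linked_r2, baseline):
    """Keep locus pairs whose r^2 exceeds the baseline (probably true linkage)."""
    return {pair: r2 for pair, r2 in linked_r2.items() if r2 > baseline}


if __name__ == "__main__":
    # Hypothetical r^2 values between unlinked loci.
    unlinked = [0.001, 0.004, 0.010, 0.015, 0.020, 0.030, 0.045, 0.060, 0.070]
    baseline = ld_baseline(unlinked)
    # Hypothetical r^2 values between pairs of loci on the same chromosome.
    linked = {("locus1", "locus2"): 0.21, ("locus1", "locus3"): 0.02}
    print(baseline, above_baseline(linked, baseline))
```

A sample-specific baseline of this kind accounts for background LD caused by population structure and relatedness, which a fixed threshold would ignore.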

Linkage Disequilibrium: Chromosome 2D
Consistent LD extended for less than 1 cM; localized LD was found at 1-5 cM.

Linkage Disequilibrium: Chromosome 5A
Significant LD extended for ~5 cM in the pericentromeric region.

Loci Associated with Kernel Size (p-values): Chromosome 2D
Agreed with QTL in Reed x Grandin. Likelihood ratio test: **
Kernel size columns: Weight, Area, Length, Width, each scored in NY and OH. Values are listed in source order; rows with fewer than eight values have cells missing.
cM | Locus | p-values
7 | Xcfd56 | 0.069 0.160 0.012 0.119 0.076 0.031 0.000* 0.252
11 | Xwmc111 | 0.005 0.020 0.108 0.003’ 0.107 0.000**
23 | Xgwm261 | 0.145 0.016 0.019 0.009 0.027 0.058 0.001*
28 | Xwmc112 | 0.057 0.047 0.120 0.480 0.367 0.024
64 | Xgwm30 | 0.081 0.862 0.053 0.848 0.312 0.820 0.212
91 | Xgwm539 | 0.042 0.038 0.030 0.039 0.290 0.334
Milling quality: none of the loci on 2D were significant after multiple testing correction.

Loci Associated with Kernel Size (p-values): Chromosome 5A
Agreed with QTL in M6 x Opata. Likelihood ratio test: n.s., **
Kernel size columns: Weight, Area, Length, Width, each scored in NY and OH. Values are listed in source order; rows with fewer than eight values have cells missing.
cM | Locus | p-values
55 | Xcfa2250 | 0.021 0.007 0.044 0.014 0.002* 0.637 0.649
… | Xwmc150b | 0.003 0.005 0.009 0.093 0.429
56 | Xbarc117 | 0.118 0.022 0.039
60 | Xbarc141 | 0.631 0.037 0.232 0.024 0.038 0.852 0.863
Milling quality
cM | Locus | Milling Score | Flour Yield | ESI | Friability | Break-Flour Yield
55 | Xcfa2250 | 0.010 | 0.029 | 0.047 | 0.002* | 0.081

B.L.U.E. (best linear unbiased estimates) of allele effects: Kernel Length
[Figure; N. of cultivars: 9 5 18 37 9 9 41 45 43 49]

B.L.U.E. of allele effects: Kernel Width
[Figure; N. of cultivars: 41 14 8 15 18 24 5 10 19]

B.L.U.E. of allele effects: Kernel Weight
[Figure; N. of cultivars: 41 45 43 49]

Conclusions
Linkage Disequilibrium
- Variation in LD across the genome can be characterized in relevant germplasm
- Markers closely linked to QTL of interest can be identified and allelic effects quantified
Association Mapping as a Breeding Strategy
- For recurrent selection, markers could be used to carry information from a "good year" to a "bad year"
- In pedigree breeding, markers could carry information about traits of interest from replicated field trials to single row or single plant selection
- Allelic values of previously identified alleles can be updated annually based on advanced trial data combined with genotypic data
- New alleles can be identified and characterized to determine their relative value
- A selection index can be used to incorporate both phenotypic and molecular data

Acknowledgements
- USDA Soft Wheat Quality Lab, Wooster, OH
- Embrapa
- Technical support: David Benscher, James Tanaka, Gretchen Salm

Kangaroo Island Wayne Powell