Quantitative Genetics and Genetic Diversity Bruce Walsh Depts of Ecology & Evol. Biology, Animal Science, Biostatistics, Plant Science Footprints of Diversity.


Quantitative Genetics and Genetic Diversity Bruce Walsh Depts of Ecology & Evol. Biology, Animal Science, Biostatistics, Plant Science Footprints of Diversity in the Agricultural Landscape: Understanding and Creating Spatial Patterns of Diversity

Overview
Introductory comments
– Processes generating spatial genetic variation
– Molecular vs. genetic variation
– Importance of different types of variability
Finding genomic locations under selection
– Patterns selection leaves in the genome
– Finding genes involved in local adaptation
G x E tools for localizing interesting populations
– Use of factorial regressions to dissect contributions to G x E

Divergence of populations over time
The patterning of genetic variation within and between populations is a dynamic process. Loss/fixation of variation via drift and creation of new genetic variation via mutation (and perhaps migration) is a constant background process. Populations can also evolve via natural selection to become locally adapted.

Populations show both within-population variation and between-population variation (divergence). In this example, there is a lot of within-population variation but little between-population variation (little divergence).

Over time, loss/fixation of variation (via drift) increases the between-population variation unless overpowered by sufficient levels of migration.
[Figure legend: symbols mark private alleles (found in only one population) and shared alleles.]
Here, there is plenty of within-population variation, but also significant between-population variation.

Variation can also be lost (and hence between-population variation increased) in founder populations. Note the reduction of within-population variation relative to the founding (source) population.

Quantifying levels of variation
In an ANOVA-like framework, we can ask how much of the total variation over a series of populations is in common (within-population variation) and how much is distinct (between-population variation), such as differences in allele frequencies.
$F_{ST}$ = fraction of all genetic variation due to between-population divergence ($R_{ST}$ when using SSRs/STRs).
– Range is 0 to 1
– The larger the value of $F_{ST}$, the more molecular divergence
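The ANOVA-like partition above can be sketched for a single biallelic locus. This is a minimal illustration, assuming equally weighted populations and the heterozygosity-based definition F_ST = (H_T - H_S) / H_T; the function name `fst_biallelic` is hypothetical, and real analyses typically use sample-size-corrected estimators.

```python
import numpy as np

def fst_biallelic(freqs):
    """F_ST at one biallelic locus from per-population allele frequencies.

    Assumption: every population gets equal weight, and no correction is
    made for finite sample size.
    """
    p = np.asarray(freqs, dtype=float)
    p_bar = p.mean()
    h_total = 2.0 * p_bar * (1.0 - p_bar)      # expected heterozygosity, pooled
    h_within = np.mean(2.0 * p * (1.0 - p))    # mean within-population value
    if h_total == 0.0:
        return 0.0                             # locus is monomorphic overall
    return (h_total - h_within) / h_total

if __name__ == "__main__":
    for freqs in ([0.5, 0.5], [0.2, 0.8], [0.0, 1.0]):
        print(freqs, round(fst_biallelic(freqs), 3))
```

Identical frequencies give 0 (all variation is within populations), while populations fixed for alternative alleles give 1 (all variation is between populations).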

Molecular diversity
SNPs, SSRs (STRs), and other molecular markers are widely used to examine genetic variation within populations and divergence between them (such as estimating levels of polymorphism and $F_{ST}$). Much of this pattern of variation is shaped by the genetic drift of effectively neutral alleles (the marker alleles). Hence, molecular variation is a snapshot of the neutral variation.
– All loci are equally influenced by demography

Genetic divergence
Drift and mutation cause allele frequencies to diverge between populations. However, the breeder is usually interested in those genetic changes that result from selection:
– Adaptation to the local environment
The breeder is interested both in the traits that provide adaptation and in the genes that underlie these adaptive traits.

Types of divergence
Population divergence provides three sources of usable genetic variation for breeding:
– Accumulation of new QTL alleles for subsequent selection response
– Divergence in allele frequencies at loci involved in heterosis
– Fixation of locally adaptive mutations
How good a predictor is divergence at neutral sites (e.g., SNP, STR data) likely to be for these three classes?

Accumulation of new variation
For random quantitative traits, new variance accumulates at roughly 2t Var(M) after t generations.
– The trait mutational variance per generation, Var(M), is typically on the order of 1/1000 of the environmental variance (slow, but steady)
Accumulation of variation in a neutral trait tracks the accumulation of divergence at random molecular markers. As a rough approximation, molecular divergence can provide a guide to potentially usable quantitative trait variation.
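To give a feel for the time scale, here is a rough worked calculation. It assumes only the slide's order-of-magnitude figure, Var(M) of about 1/1000 of the environmental variance Var(E), and asks how many generations t are needed for the accumulated variance to match Var(E).

```latex
\[
2t\,\mathrm{Var}(M) \approx 2t \times 10^{-3}\,\mathrm{Var}(E)
\quad\Longrightarrow\quad
2t\,\mathrm{Var}(M) = \mathrm{Var}(E)
\;\text{ when }\;
t \approx \frac{1}{2 \times 10^{-3}} = 500 \text{ generations.}
\]
```

Under this assumption, a few hundred generations of isolation are enough to accumulate new variance comparable to the environmental variance.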

The rough correspondence between molecular divergence and usable trait variation predicts that usable variance can be generated in the cross between two molecularly divergent lines.
– Stronger prediction: the larger the $F_{ST}$, the larger the $F_2$ variation in the cross
Transgressive segregation is a potential example of this: many QTL mapping studies find that favorable alleles for a trait are often present in populations with lower trait values (and vice versa).

Accumulation of heterotic variation
Recall that the expected heterosis in a cross between two populations is a function of their difference ($\delta p$) in allele frequencies at loci showing dominance ($d$):
$$H_{F_1} = \sum_i (\delta p_i)^2 \, d_i$$
The variance of $\delta p$ under drift is
$$\sigma^2_{\delta p} = 2p(1-p)\left[1 - \exp(-t/N_e)\right]$$
Hence, $\sigma^2_{\delta p}$ is expected to increase with divergence time (t), which can be predicted by levels of molecular divergence (larger $F_{ST}$, greater divergence time).
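The heterosis expression above can be evaluated directly. The sketch below is illustrative only: the function name `expected_heterosis` and the example frequencies and dominance values are made up for the demonstration, not taken from any data set.

```python
import numpy as np

def expected_heterosis(p_pop1, p_pop2, d):
    """Expected F1 heterosis: sum over loci of (delta p_i)^2 * d_i.

    p_pop1, p_pop2: allele frequencies at each locus in the two parents.
    d: dominance effect at each locus (may be positive or negative).
    """
    delta_p = np.asarray(p_pop1, dtype=float) - np.asarray(p_pop2, dtype=float)
    return float(np.sum(delta_p ** 2 * np.asarray(d, dtype=float)))

if __name__ == "__main__":
    p1, p2 = [0.9, 0.2], [0.1, 0.6]
    # Directional dominance (all d > 0): contributions add up.
    print(expected_heterosis(p1, p2, d=[1.0, 0.5]))
    # Mixed signs of d: the same frequency differences can cancel out.
    print(expected_heterosis(p1, p2, d=[1.0, -4.0]))
```

The second call uses the same allele-frequency differences as the first, yet the contributions cancel because the dominance effects differ in sign.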

Predicting heterosis
While expected allele frequency differences increase with time of divergence, this does not guarantee that heterosis will increase with divergence time between populations. The key is that strong directional dominance (d > 0 consistently) is required, and drift also increases the frequency differences at loci with d < 0. Hence, the level of marker divergence is a poor predictor of cross heterosis.
– $F_{ST}$ is a poor predictor of $H_{F_1}$

Finding genes under selection
The overall amount of genomic molecular divergence is not a predictor of potential adaptation.
– Populations can have large amounts of neutral divergence (large $F_{ST}$), but little to no adaptation
– Likewise, populations with small $F_{ST}$ can have undergone considerable adaptation, especially when strong selection has occurred
However, we can use molecular markers to look for recent signatures of selection in genomic regions. This, in turn, allows us to localize potential adaptation genes.

Search for genes that experienced artificial (and natural) selection
Akin in spirit to testing candidate genes for association or using genome scans to find QTLs.
In linkage studies: use molecular markers to look for marker-trait associations (phenotypes).
In tests for selection: use molecular markers to look for patterns of selection (patterns of within- and between-species variation).
Tests for selection make NO assumptions as to the traits under selection.

Logic behind polymorphism-based tests
Key: time to the most recent common ancestor (MRCA) relative to drift
– If a locus is under positive selection, more recent MRCA (shorter coalescent)
– If a locus is under balancing selection, older MRCA relative to drift (deeper coalescent)
– Shorter coalescent = lower levels of variation, longer blocks of disequilibrium
– Deeper coalescent = higher levels of variation, shorter blocks of disequilibrium

Selection changes coalescent times
[Figure: genealogies of the sampled individuals under three scenarios, with time running from past to present. Neutral: reference time to MRCA. Selective sweep: shorter time back to MRCA. Balancing selection: longer time back to MRCA.]

A scan of levels of polymorphism can thus suggest sites under selection.
[Figure: variation plotted against map location. Dips in variation may reflect directional selection (a selective sweep) or a local region with reduced mutation rate; peaks may reflect balancing selection or a local region with elevated mutation rate.]

Example: maize domestication gene tb1
Wang et al. (1999) observed a significant decrease in genetic variation in the 5’ NTR region of tb1, suggesting a selective sweep influenced this region (Wang et al. 1999, Nature 398: 236).

Polymorphism-based tests
Given a sample of n sequences at a candidate gene, there are several different ways to measure diversity, which are related under the strict neutral model:
– number of segregating sites, $E(S) = a_n \theta$
– number of singletons, $E(\eta) = \theta \, n/(n-1)$
– average number of pairwise differences, $E(k) = \theta$
A number of polymorphism-based tests (e.g., Tajima’s D) are based on detecting departures from these expectations, e.g., $E(S)$ differing from $a_n E(k)$.
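A minimal sketch of how these summaries can be compared in practice, assuming haplotypes are coded as a 0/1 matrix (rows are sequences, columns are sites). Here a_n is the harmonic sum 1 + 1/2 + ... + 1/(n-1), S/a_n and k are both estimates of theta under neutrality, and Tajima's D scales their difference using the variance constants from Tajima (1989). The function name `neutrality_summaries` and the input coding are assumptions made for illustration.

```python
import math
import numpy as np

def neutrality_summaries(haplotypes):
    """Return S, k, S/a_n and Tajima's D for a 0/1 haplotype matrix."""
    h = np.asarray(haplotypes, dtype=int)
    n = h.shape[0]
    if n < 4:
        raise ValueError("need at least 4 sequences for Tajima's D")
    counts = h.sum(axis=0)
    counts = counts[(counts > 0) & (counts < n)]   # keep segregating sites only
    s = int(counts.size)
    a1 = sum(1.0 / i for i in range(1, n))         # a_n in the text
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    # Average pairwise differences: each site adds 2 j (n - j) / (n (n - 1)).
    k = float(np.sum(2.0 * counts * (n - counts)) / (n * (n - 1)))
    theta_w = s / a1                               # estimate of theta from S
    if s == 0:
        return {"S": 0, "k": 0.0, "theta_w": 0.0, "D": float("nan")}
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (k - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return {"S": s, "k": k, "theta_w": theta_w, "D": d}

if __name__ == "__main__":
    singletons_only = np.eye(4, dtype=int)         # every variant seen once
    print(neutrality_summaries(singletons_only))
```

An excess of rare variants (as in the singleton-only example) pushes D below zero, while an excess of intermediate-frequency variants pushes it above zero.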

Major complication with polymorphism-based tests
Demographic factors can also cause these departures from neutral expectations!
– Too many young alleles -> recent population expansion
– Too many old alleles -> population substructure
Thus, there is a composite alternative hypothesis, so that rejection of the null does not imply selection. Rather, selection is just one option.

Can we overcome this problem?
It is an important one, as only polymorphism-based tests can indicate ongoing selection.
Solution: demographic events should leave a constant signature across the genome. Essentially, all loci experience common demographic factors.
Genome scan approach: look at a large number of markers. These generate the null distribution (most are not under selection); outliers = potentially selected loci (genome-wide polymorphism tests).
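The genome scan idea can be sketched as an empirical-outlier rule: compute the same statistic at many loci, treat the genome-wide distribution as the null, and flag the tails. The function name `flag_outlier_loci` and the 2.5% tail cutoff are illustrative assumptions, not a recommended threshold.

```python
import numpy as np

def flag_outlier_loci(stat, tail=0.025):
    """Indices of loci whose statistic falls in either empirical tail.

    stat: one value per locus (e.g. a diversity or divergence measure).
    tail: fraction of the genome-wide distribution treated as extreme
          on each side (an arbitrary choice for this sketch).
    """
    stat = np.asarray(stat, dtype=float)
    low, high = np.quantile(stat, [tail, 1.0 - tail])
    return np.flatnonzero((stat < low) | (stat > high))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    values = rng.normal(size=1000)   # stand-in for a per-locus statistic
    values[10] = -8.0                # one locus with sharply reduced diversity
    print(10 in flag_outlier_loci(values))
```

Flagged loci are only candidates: by construction a fixed fraction of loci is flagged even when no selection has occurred.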

Linkage Disequilibrium Decay
One feature of a selective sweep is the presence of derived alleles at high frequency. Under neutrality, only older alleles are at higher frequencies. Such a feature is NOT influenced by past demography.
Older = more time for recombination to reduce the size of LD (haplotype) blocks.
Sabeti et al. (2002) note that under a sweep such high-frequency young alleles should (because of their recent age) have much longer regions of LD than expected.
Wang et al. (2006) proposed a Linkage Disequilibrium Decay (LDD) test, which looks for excessive LD around high-frequency alleles.

[Figure: allele frequency plotted against time, beginning from the starting haplotype.]
Under pure drift, high-frequency alleles should have short haplotypes.
Under directional selection, allele frequency changes very fast, so little time has elapsed. This results in high-frequency alleles with long haplotypes.

Optimal conditions for detecting selection
– High levels of polymorphism at the start of selection
– High effective levels of recombination, which give a shorter window around the selected site
– Low selfing, as high levels of selfing reduce the effective recombination rate
– Recent selection, as signatures of sweeps persist for roughly $N_e$ generations

Summary: Detecting Adaptive Genes
Linkage mapping (QTL mapping, association analysis) vs. detection of selected loci
– Linkage: know the target phenotype(s)
– Selection: don’t know the target phenotype
Both can suffer from low power and confounding from demographic effects.
Both can significantly benefit from high-density genomic scans, but these are also not without problems.

G x E
The flip side of molecular divergence is the direct assessment of trait values in a set of populations/lines over a series of environments. Lines that show strong positive G x E (genotype-environment interactions) in a particular environment (or set of environments) are sources of improvement genes for the target environment(s).

Basic G x E model
The mean value of line i in environment j is $\mu + G_i + E_j + GE_{ij}$
– $G_i$ is the line average over all environments
– $E_j$ is the environmental effect over all lines
– $GE_{ij}$ is the G x E interaction
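A minimal sketch of this decomposition for a complete table of line-by-environment means, assuming balanced data and taking G_i and E_j as deviations of the row and column averages from the grand mean. The function name `gxe_decompose` is hypothetical.

```python
import numpy as np

def gxe_decompose(means):
    """Split a (lines x environments) table of means into mu, G, E and GE.

    Assumes a complete, balanced table. G and E are deviations from the
    grand mean; GE is what remains after removing mu, G_i and E_j.
    """
    y = np.asarray(means, dtype=float)
    mu = y.mean()
    g = y.mean(axis=1) - mu            # line effects (rows)
    e = y.mean(axis=0) - mu            # environment effects (columns)
    ge = y - mu - g[:, None] - e[None, :]
    return mu, g, e, ge

if __name__ == "__main__":
    table = [[1.0, 2.0],
             [3.0, 6.0]]               # made-up means: 2 lines, 2 environments
    mu, g, e, ge = gxe_decompose(table)
    print(mu, g, e)
    print(ge)
```

A purely additive table (for instance rows [1, 2] and [3, 4]) returns an all-zero GE matrix, so nonzero entries are a direct readout of the interaction.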

Looking for structure in G x E
Often there is considerable structure in G x E, so that the ij-th term can be estimated as a simple product
– $GE_{ij} = a_i b_j$
– General bilinear models can be used (more terms)
– Key: $a_i$ can be thought of as a genotypic environmental-specificity factor
A modification is to use factorial regression
– Here one uses measured environmental factors (temperature, rainfall, etc.) to try to predict GE
– One can also incorporate measured genes (candidate genotypes entered as cofactors)
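One common way to obtain a product form for the interaction is the leading term of a singular value decomposition of the GE matrix, as in AMMI-style analyses. The sketch below takes that route as an assumption; the source does not say which fitting method was used, and the function name `rank1_bilinear` is hypothetical.

```python
import numpy as np

def rank1_bilinear(ge):
    """Best rank-1 approximation GE_ij ~ a_i * b_j via the SVD.

    Returns (a, b, share), where share is the fraction of the interaction
    sum of squares captured by the single product term.
    """
    ge = np.asarray(ge, dtype=float)
    u, s, vt = np.linalg.svd(ge, full_matrices=False)
    total = float(np.sum(s ** 2))
    if total == 0.0:
        return np.zeros(ge.shape[0]), np.zeros(ge.shape[1]), 0.0
    scale = np.sqrt(s[0])              # split the singular value evenly
    a = u[:, 0] * scale                # line scores
    b = vt[0, :] * scale               # environment scores
    return a, b, float(s[0] ** 2 / total)

if __name__ == "__main__":
    a_true = np.array([1.0, -1.0, 0.0])
    b_true = np.array([2.0, -1.0, -1.0, 0.0])
    a, b, share = rank1_bilinear(np.outer(a_true, b_true))
    print(np.allclose(np.outer(a, b), np.outer(a_true, b_true)), share)
```

The individual scores are identified only up to sign and scale; it is the product a_i b_j that is meaningful. Adding further SVD terms gives the more general bilinear models mentioned above.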

Suppose that $y_1, \ldots, y_p$ are p environmental factors measured by the breeder (e.g., degree days, rainfall, etc.), with $y_{kj}$ the value of factor k in environment j. The idea is to predict GE by looking at how different lines react to each environmental factor:
$$GE_{ij} = \beta_{1i} \, y_{1j} + \cdots + \beta_{pi} \, y_{pj} + \varepsilon_{ij}$$
Here $\beta_{1i}$ measures the sensitivity of line i to environmental factor 1, $\beta_{2i}$ to factor 2, etc., and $\varepsilon_{ij}$ is the remaining (unexplained) interaction.
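A minimal sketch of fitting the factorial regression by ordinary least squares, one regression per line. It assumes the interaction terms have already been estimated, that the environmental factors are centered across environments, and it calls the line-specific sensitivities `beta`; the function name `factorial_regression` and the example numbers are hypothetical.

```python
import numpy as np

def factorial_regression(ge, env_factors):
    """Regress each line's GE terms on measured environmental factors.

    ge:          (lines x environments) matrix of interaction terms.
    env_factors: (p x environments) matrix, one row per measured factor.
    Returns (beta, resid): beta[k, i] is the sensitivity of line i to
    factor k; resid holds the part of GE left unexplained.
    """
    ge = np.asarray(ge, dtype=float)
    y = np.asarray(env_factors, dtype=float)
    y_c = y - y.mean(axis=1, keepdims=True)     # center each factor
    beta, *_ = np.linalg.lstsq(y_c.T, ge.T, rcond=None)
    resid = ge - (y_c.T @ beta).T
    return beta, resid

if __name__ == "__main__":
    factors = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],    # e.g. a rainfall index
                        [2.0, 1.0, 0.0, 1.0, 2.0]])   # e.g. a temperature index
    centered = factors - factors.mean(axis=1, keepdims=True)
    true_beta = np.array([[0.5, -0.5, 0.0],
                          [1.0, 0.0, -1.0]])          # made-up sensitivities
    ge = true_beta.T @ centered                       # noise-free GE terms
    beta, resid = factorial_regression(ge, factors)
    print(np.round(beta, 3))
```

Lines with large sensitivities of opposite sign to the same factor are the ones that respond most differently as that factor changes across environments.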

Factorial regressions allow the breeder to examine how each line reacts to a variety of environmental factors, potentially offering differential targets of selection.
Maize example (Epinat-Le Signor et al. 2001)
– A major contributor to G x E was the interaction between a line's date of flowering and water supply, with early varieties becoming more favorable as the water supply decreases

Summary
Level of genome-wide divergence using molecular markers
– A weak signal for usable QTL variation
– A very poor (at best!) signal for heterosis
– No signal for presence of locally adaptive genes
Signals of adaptive genes
– Changes in polymorphism levels around the target
Use of factorial regressions to tease out components of G x E
– Environmental factors within E
– Traits, genes within lines