PBG 650 Advanced Plant Breeding Module 1: Introduction Population Genetics – Hardy Weinberg Equilibrium – Linkage Disequilibrium.


Plant Breeding
“The science, art, and business of improving plants for human benefit”
Considerations:
– Crop(s)
– Production practices
– End-use(s)
– Target environments
– Type of cultivar(s)
– Traits to improve
– Breeding methods
– Source germplasm
– Time frame
– Varietal release and intellectual property rights
Bernardo, Chapter 1

Plant Breeding
A common mistake that breeders make is to improve productivity without sufficient regard for other characteristics that are important to producers, processors, and consumers.
✓ Well-defined Objectives
✓ Good Parents
✓ Genetic Variation
✓ Good Breeding Methods
✓ Functional Seed System
✓ Adoption of Cultivars by Farmers

Quantitative Traits
Continuum of phenotypes (metric traits)
Often many genes with small effects
Environmental influence is greater than for qualitative traits
Specific genes and their mode of inheritance may be unknown
Analysis of quantitative traits
– population parameters: means, variances
– molecular markers linked to QTL

Populations
In the genetic sense, a population is a breeding group
– individuals with different genetic constitutions
– sharing time and space
In animals, mating occurs between individuals
– ‘Mendelian population’
– genes are transmitted from one generation to the next
In plants, there are additional ways for a population to survive
– self-fertilization
– vegetative propagation
Definition of ‘population’ may be slightly broader for plants
– e.g., lines from a germplasm collection
Falconer, Chapt. 1; Lynch and Walsh, Chapt. 4

What do population geneticists do?
Study genes in populations
– Frequency and interaction of alleles
– Mating patterns, genotype frequencies
– Gene flow
– Selection and adaptation vs random genetic drift
– Genetic diversity and relationship
– Population structure
Related Fields
– Evolutionary Biology, e.g., crop domestication
– Landscape Genetics

Gene and genotype frequencies

                 Alleles         Genotypes
                 A1      A2      A1A1    A1A2    A2A2
Frequencies      p       q       P11     P12     P22
# Individuals
Proportions

For a population of diploid organisms:
p + q = 1
P11 + P12 + P22 = 1
Bernardo, Chapter 2

Gene frequencies (another way)

                 Alleles         Genotypes
                 A1      A2      A1A1    A1A2    A2A2
Frequencies      p       q       P11     P12     P22
# Individuals
Proportions

Number of individuals = N = N11 + N12 + N22 = 100
Number of alleles = 2N = N1 + N2 = 200
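The counting approach above can be sketched in a few lines of Python. This is a minimal illustration, not the slide's own worked example: the function name `allele_frequencies` and the genotype counts (30, 50, 20) are hypothetical, chosen only so that N = 100 and 2N = 200.

```python
def allele_frequencies(n11, n12, n22):
    """Allele frequencies by gene counting for one biallelic locus.

    Each A1A1 individual carries two copies of A1 and each A1A2
    individual carries one, so N1 = 2*N11 + N12 out of 2N alleles.
    """
    n = n11 + n12 + n22              # number of diploid individuals
    p = (2 * n11 + n12) / (2 * n)    # frequency of A1
    q = (2 * n22 + n12) / (2 * n)    # frequency of A2
    return p, q


# Hypothetical counts (not from the slide), totaling N = 100
p, q = allele_frequencies(30, 50, 20)
print(p, q)
```

The same result follows from the genotype proportions: p = P11 + ½P12 and q = P22 + ½P12.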

Allele frequencies in crosses
Inbred x inbred
Alleles are unknown, but allele frequencies at segregating loci are known
– F1 and F2: p = q = 0.5
– Backcross (BC) generations: value of q is reduced by ½ in each backcross generation
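The halving rule for backcrosses can be tabulated with a short sketch. It assumes that q is the frequency of the donor-parent allele, that the recurrent parent is fixed for the other allele, and that there is no selection; the function name `backcross_q` is hypothetical.

```python
def backcross_q(generations, q_f1=0.5):
    """Donor-allele frequency q after each backcross to the recurrent parent.

    Starts from the F1 value (q = 0.5) and halves q once per backcross.
    """
    return [q_f1 * 0.5 ** g for g in range(1, generations + 1)]


for g, q in enumerate(backcross_q(4), start=1):
    print(f"BC{g}: p = {1 - q}, q = {q}")
```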

Factors that may change gene frequencies
Population size
– changes may occur due to sampling
→ assume ‘large’ population
Differences in fertility and viability
– parents may differ in fertility
– gametes may differ in viability
– progeny may differ in survival rate
→ assume no selection
Migration and mutation
→ assume no migration and no mutation

Factors that may change genotype frequencies
Changes in genotype frequency (not gene frequency)
Mating system
– assortative or disassortative mating
– selfing
– geographic isolation
→ assume that mating occurs at random (panmixia)

Hardy-Weinberg Equilibrium
Assumptions
– large, random-mating population
– no selection, mutation, migration
– normal segregation
– equal gene frequencies in males and females
– no overlap of generations (no age structure)
Note that assumptions only need to be true for the locus in question
→ Gene and genotype frequencies remain constant from one generation to the next
→ Genotype frequencies in progeny can be predicted from gene frequencies of the parents
→ Equilibrium attained after one generation of random mating

Hardy-Weinberg Equilibrium

               Genes in parents    Genotypes in progeny
               A1       A2         A1A1          A1A2          A2A2
Frequencies    p        q          P11 = $p^2$   P12 = $2pq$   P22 = $q^2$

Expected genotype frequencies are obtained by expanding the binomial
$(p + q)^2 = p^2 + 2pq + q^2 = 1$

Example

                 A1 (p = 0.4)    A2 (q = 0.6)
A1 (p = 0.4)     $p^2$ = .16     $pq$ = .24
A2 (q = 0.6)     $pq$ = .24      $q^2$ = .36
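As a quick check of the example, the binomial expansion can be evaluated directly. This is a minimal sketch; the function name `hwe_expected` is hypothetical. Note that the heterozygote class is the sum of the two pq cells of the square, so 2pq = 0.48 when p = 0.4.

```python
def hwe_expected(p):
    """Expected genotype frequencies (A1A1, A1A2, A2A2) under
    Hardy-Weinberg equilibrium for a biallelic locus."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


p11, p12, p22 = hwe_expected(0.4)
print(round(p11, 4), round(p12, 4), round(p22, 4))
```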

Equilibrium with multiple alleles
For multiple alleles, expected genotype frequencies can be found by expanding the multinomial $(p_1 + p_2 + \dots + p_n)^2$
For example, for three alleles:
$(p_1 + p_2 + p_3)^2 = p_1^2 + 2p_1p_2 + 2p_1p_3 + p_2^2 + 2p_2p_3 + p_3^2$
Corresponding genotypes: A1A1, A1A2, A1A3, A2A2, A2A3, A3A3
Lynch and Walsh (pg 57) describe equilibrium for autopolyploids
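The multinomial expansion generalizes to any number of alleles. The sketch below enumerates the unordered genotypes and assigns $p_i^2$ to homozygotes and $2p_ip_j$ to heterozygotes; the function name `hwe_multiallelic` and the allele frequencies (0.5, 0.3, 0.2) are hypothetical.

```python
from itertools import combinations_with_replacement


def hwe_multiallelic(freqs):
    """Expected HWE genotype frequencies for n alleles at one locus.

    Returns a dict keyed by (i, j) allele numbers with i <= j,
    numbered from 1 to match the A1, A2, ... labels.
    """
    expected = {}
    for i, j in combinations_with_replacement(range(len(freqs)), 2):
        if i == j:
            expected[(i + 1, j + 1)] = freqs[i] ** 2            # homozygote
        else:
            expected[(i + 1, j + 1)] = 2 * freqs[i] * freqs[j]  # heterozygote
    return expected


# Hypothetical allele frequencies for three alleles
for genotype, freq in hwe_multiallelic([0.5, 0.3, 0.2]).items():
    print(f"A{genotype[0]}A{genotype[1]}: {freq:.2f}")
```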

Relationship between gene and genotype frequencies
[Figure: curves for the A2A2, A1A1, and A1A2 genotypes]
f(A1A2) has a maximum of 0.5, which occurs when p = q = 0.5
Most rare alleles occur in heterozygotes
Implications for
– F1?
– F2?
– Any BC?

Applications of the Hardy-Weinberg Law
Predict genotype frequencies in random-mating populations
Use frequency of recessive genotypes to estimate the frequency of a recessive allele in a population
– Example: assume that the incidence of individuals homozygous for a recessive allele is about 1/11,000.
  $q^2 = 1/11{,}000$, so $q \approx 0.0095$
Estimate frequency of individuals that are carriers for a recessive allele
  $p = 1 - q \approx 0.9905$
  $2pq \approx 0.019 \approx 2\%$
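The carrier calculation can be reproduced with a short sketch that assumes the population is in Hardy-Weinberg equilibrium at this locus. The function name `carrier_frequency` is hypothetical; the incidence 1/11,000 is the one given in the example.

```python
from math import sqrt


def carrier_frequency(recessive_incidence):
    """Estimate q and the carrier (heterozygote) frequency 2pq from the
    incidence of recessive homozygotes, assuming HWE so that q**2 equals
    the incidence."""
    q = sqrt(recessive_incidence)
    p = 1.0 - q
    return q, 2 * p * q


q, carriers = carrier_frequency(1 / 11000)
print(f"q = {q:.4f}, carriers = {carriers:.4f}")
```

Although affected homozygotes are rare, carriers are roughly 200 times more frequent, which is the sense in which most rare alleles occur in heterozygotes.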

Testing for Hardy-Weinberg Equilibrium
All genotypes must be distinguishable

             Genotypes                   Gene frequencies
             A1A1     A1A2     A2A2      A1       A2
Observed
Expected

N = N11 + N12 + N22 = 747

Chi-square test for Hardy-Weinberg Equilibrium
Example in Excel
Only 1 df because gene frequencies are estimated from the progeny data
Accept H0: no reason to think that assumptions for Hardy-Weinberg equilibrium have been violated
– does not tell you anything about the fertility of the parents
When you reject H0, there is an indication that one or more of the assumptions is not valid
– does not tell you which assumption is not valid
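The Excel example is not part of this transcript, so the following is only a sketch of a standard Pearson chi-square test for HWE at a biallelic locus. The function name `hwe_chi_square` is hypothetical, and the genotype counts (233, 385, 129) are illustrative values assumed here because they total N = 747; they are not taken from the slide.

```python
from math import erfc, sqrt


def hwe_chi_square(n11, n12, n22):
    """Pearson chi-square test for HWE with 1 df.

    Allele frequencies are estimated from the same data, which costs
    one degree of freedom (3 classes - 1 - 1 estimated parameter).
    """
    n = n11 + n12 + n22
    p = (2 * n11 + n12) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n11, n12, n22)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variate with 1 df
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value, expected


# Illustrative counts (assumed, not from the slide)
chi2, p_value, expected = hwe_chi_square(233, 385, 129)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

With these counts the statistic falls below the 5% critical value of 3.84 for 1 df, so H0 would not be rejected.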

Exact Test for Hardy-Weinberg Equilibrium
Chi-square is only appropriate for large sample sizes
If sample sizes are small or some alleles are rare, Fisher’s Exact test is a better alternative
– Calculate the probability of all possible arrays of genotypes for the observed numbers of alleles
– Rank outcomes in order of increasing probability
– Reject those that constitute a cumulative probability of <5%
Example in Excel
Weir (1996) Chapt. 3
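The three steps above can be sketched for a biallelic locus as follows. This uses the standard conditional probability of the heterozygote count given the observed allele counts; whether the slide's Excel example is laid out the same way is not known from the transcript. The function name `hwe_exact_test` and the counts (3, 2, 5) are hypothetical.

```python
from math import exp, lgamma, log


def _log_factorial(n):
    return lgamma(n + 1)


def hwe_exact_test(n11, n12, n22):
    """Exact test for HWE at a biallelic locus.

    Enumerates every heterozygote count compatible with the observed
    allele counts, and sums the probabilities of all outcomes that are
    no more probable than the observed one.
    """
    n = n11 + n12 + n22
    n1 = 2 * n11 + n12          # copies of allele A1
    n2 = 2 * n22 + n12          # copies of allele A2

    def prob(het):
        hom1 = (n1 - het) // 2
        hom2 = (n2 - het) // 2
        log_p = (_log_factorial(n) - _log_factorial(hom1)
                 - _log_factorial(het) - _log_factorial(hom2)
                 + het * log(2)
                 + _log_factorial(n1) + _log_factorial(n2)
                 - _log_factorial(2 * n))
        return exp(log_p)

    # Heterozygote counts must have the same parity as n1
    probs = {het: prob(het) for het in range(n1 % 2, min(n1, n2) + 1, 2)}
    p_obs = probs[n12]
    # Small tolerance guards against floating-point ties
    p_value = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return p_value, probs


# Hypothetical small sample
p_value, probs = hwe_exact_test(3, 2, 5)
print(f"P = {p_value:.4f}")
```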

Likelihood Ratio Test
$LR = -2 \ln \left[ \dfrac{L(\hat{\theta}_r \mid z)}{L(\hat{\theta} \mid z)} \right]$
– Numerator: maximum of the likelihood function given the data (z) when some parameters are assigned hypothesized values
– Denominator: maximum of the likelihood function given the data (z) when there are no restrictions
When the hypothesis is true: $LR \sim \chi^2$ with df = number of parameters assigned values
Likelihood ratio tests for multinomial proportions are often called G-tests (for goodness of fit)
Lynch and Walsh Appendix 4

Likelihood Ratio Test for HWE
$G = 2 \sum_{ij} N_{ij} \ln \left( \dfrac{N_{ij}}{\hat{N}_{ij}} \right)$
where $\hat{N}_{ij}$ is the expected number and $N_{ij}$ is the observed number of the ij-th genotype
Calculations in Excel
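A G-test for HWE can be sketched in the same way as the chi-square test. The Excel calculations are not in this transcript; the function name `hwe_g_test` is hypothetical, and the counts (233, 385, 129) are illustrative values assumed only because they total N = 747.

```python
from math import erfc, log, sqrt


def hwe_g_test(n11, n12, n22):
    """Likelihood ratio (G) test for HWE at a biallelic locus, 1 df.

    G = 2 * sum(observed * ln(observed / expected)); classes with an
    observed count of zero contribute nothing to the sum.
    """
    n = n11 + n12 + n22
    p = (2 * n11 + n12) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n11, n12, n22)
    g = 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)
    p_value = erfc(sqrt(max(g, 0.0) / 2))   # chi-square upper tail, 1 df
    return g, p_value


# Illustrative counts (assumed, not from the slide)
g, p_value = hwe_g_test(233, 385, 129)
print(f"G = {g:.3f}, P = {p_value:.3f}")
```

For large samples G and the Pearson chi-square are usually close, as they are for these counts.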

Gametic phase equilibrium
Lynch and Walsh, pg ; Falconer, pg

         B        b
A        P_AB     P_Ab     p_A
a        P_aB     P_ab     p_a
         p_B      p_b

Random association of alleles at different loci (independence)
$P_{AB} = p_A p_B$
Disequilibrium
$D_{AB} = P_{AB} - p_A p_B$
$D_{AB} = P_{AB} P_{ab} - P_{Ab} P_{aB}$
Example
$D_{AB} = 0.40 - 0.5 \times 0.5 = 0.15$
$D_{AB} = 0.4 \times 0.4 - 0.1 \times 0.1 = 0.15$
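The two expressions for D can be checked against each other numerically. In this sketch the function name `disequilibrium` is hypothetical, and the gamete frequencies (0.4, 0.1, 0.1, 0.4) are the values implied by the two example calculations.

```python
def disequilibrium(p_AB, p_Ab, p_aB, p_ab):
    """Return D computed two ways from the four gamete frequencies:
    D = P_AB - pA*pB and D = P_AB*P_ab - P_Ab*P_aB."""
    pA = p_AB + p_Ab                     # marginal frequency of allele A
    pB = p_AB + p_aB                     # marginal frequency of allele B
    d_from_margins = p_AB - pA * pB
    d_from_cross_product = p_AB * p_ab - p_Ab * p_aB
    return d_from_margins, d_from_cross_product


d1, d2 = disequilibrium(0.4, 0.1, 0.1, 0.4)
print(round(d1, 4), round(d2, 4))
```

The two forms agree whenever the four gamete frequencies sum to 1.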

Linkage Disequilibrium
Nonrandom association of alleles at different loci
– the covariance in frequencies of alleles between the loci
Refers to frequencies of alleles in gametes (haplotypes)
May be due to various causes in addition to linkage
– ‘gametic phase disequilibrium’ is a more accurate term
– ‘linkage disequilibrium’ (LD) is widely used to describe associations of alleles in the same or in different linkage groups

Linkage Disequilibrium

Gametic types      AB         Ab         aB         ab
Observed           P_AB       P_Ab       P_aB       P_ab
Expected           pA pB      pA pb      pa pB      pa pb
Disequilibrium     +D         -D         -D         +D

Excess of coupling phase gametes → +D
Excess of repulsion phase gametes → -D

Sources of linkage disequilibrium
– Linkage
– Multilocus selection (particularly with epistasis)
– Assortative mating
– Random drift in small populations
– Bottlenecks in population size
– Migration or admixtures of different populations
– Founder effects
– Mutation

Two locus equilibrium
For two loci, it may take many generations to reach equilibrium even when there is independent assortment and all other conditions for equilibrium are met
– New gamete types can only be produced when the parent is a double heterozygote
AABb parent: gametes 0.5 AB, 0.5 Ab
AaBb parent: gametes 0.25 AB, 0.25 aB, 0.25 Ab, 0.25 ab

Decay of linkage disequilibrium
$D_t = D_0 (1 - c)^t$
c = recombination frequency
t = number of generations of random mating
In the absence of linkage (c = 0.5), LD decays by one-half with each generation of random mating
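The decay of LD under random mating can be sketched with the standard relationship $D_t = D_0 (1 - c)^t$, assuming a large random-mating population with no selection. The function name `ld_after` and the starting value D0 = 0.15 are hypothetical choices for illustration.

```python
def ld_after(d0, c, generations):
    """Disequilibrium remaining after t generations of random mating,
    where c is the recombination frequency between the two loci."""
    return d0 * (1.0 - c) ** generations


# Unlinked loci (c = 0.5) versus tightly linked loci (c = 0.05)
for t in range(0, 6):
    print(t, round(ld_after(0.15, 0.5, t), 5), round(ld_after(0.15, 0.05, t), 5))
```

With c = 0.5 the value halves every generation, whereas with c = 0.05 more than three-quarters of the initial D remains after five generations.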

Factors that delay approach to equilibrium
– Linkage
– Selfing, because it decreases the frequency of double heterozygotes
– Small population size, because it reduces the likelihood of obtaining rare recombinants

Implications for breeding
Gametic phase disequilibrium that is not due to linkage is eliminated by making the F1 cross
Recombination occurs during selfing
There would be greater recombination with additional random mating, but it may not be worth the time and resources

P1 x P2:  A1A1B1B1 x A2A2B2B2
F1:       A1A2B1B2

gamete    frequency
A1B1      0.5*(1-c)
A1B2      0.5*c
A2B1      0.5*c
A2B2      0.5*(1-c)
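The gamete table can be turned into a small sketch that shows why only linkage maintains disequilibrium in the gametes of an F1. The function names are hypothetical; D is computed from the cross-product form $P_{AB}P_{ab} - P_{Ab}P_{aB}$, which for these frequencies simplifies to 0.25(1 - 2c).

```python
def f1_gamete_frequencies(c):
    """Gamete frequencies produced by an A1A2B1B2 F1 from the cross
    A1A1B1B1 x A2A2B2B2, where c is the recombination frequency."""
    return {
        "A1B1": 0.5 * (1 - c),   # parental
        "A1B2": 0.5 * c,         # recombinant
        "A2B1": 0.5 * c,         # recombinant
        "A2B2": 0.5 * (1 - c),   # parental
    }


def f1_gamete_disequilibrium(c):
    g = f1_gamete_frequencies(c)
    return g["A1B1"] * g["A2B2"] - g["A1B2"] * g["A2B1"]


print(f1_gamete_disequilibrium(0.5))   # unlinked loci
print(f1_gamete_disequilibrium(0.1))   # linked loci
```

For unlinked loci (c = 0.5) the four gametes are equally frequent and D is zero; for linked loci D stays positive in proportion to 1 - 2c.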

Effect of mating system on LD decay
c = effective recombination rate
s = the fraction of selfing
[Figure: LD decay curves labeled no linkage, 99% selfing, and outcrossing]

Alternative measures of LD
D is the covariance between alleles at different loci
Maximum values of D depend on allele frequencies
It is convenient to consider $r^2$ to be the square of the correlation coefficient, but it can only attain a value of 1 when allele frequencies at the two loci are the same
$r^2$ indicates the degree of association between alleles at different loci due to various causes (linkage, mutation, migration)

D – minimum and maximum values (fyi)

         B                          b
A        $P_{AB} = p_A p_B + D$     $P_{Ab} = p_A p_b - D$     $p_A$
a        $P_{aB} = p_a p_B - D$     $P_{ab} = p_a p_b + D$     $p_a$
         $p_B$                      $p_b$

If D > 0, look for the maximum value D can have
$P_{Ab} = p_A p_b - D \ge 0 \Rightarrow D \le p_A p_b$
$P_{aB} = p_a p_B - D \ge 0 \Rightarrow D \le p_a p_B$
$D \le \min(p_A p_b,\ p_a p_B)$
If D < 0, look for the minimum value D can have
$P_{AB} = p_A p_B + D \ge 0 \Rightarrow D \ge -p_A p_B$
$P_{ab} = p_a p_b + D \ge 0 \Rightarrow D \ge -p_a p_b$
$D \ge \max(-p_A p_B,\ -p_a p_b)$

Alternative measures of LD (fyi)
D’ is scaled to have a minimum of 0 and a maximum of 1
When $D_{AB} > 0$: $D' = \dfrac{D_{AB}}{\min(p_A p_b,\ p_a p_B)}$
When $D_{AB} < 0$: $D' = \dfrac{D_{AB}}{\max(-p_A p_B,\ -p_a p_b)}$
D’ indicates the degree to which gametes exhibit the maximum potential disequilibrium for a given array of allele frequencies
D’ = 1 indicates that one of the haplotypes is missing
D’ is very unstable for small sample sizes, so $r^2$ is more widely utilized to measure LD
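Both scaled measures can be sketched from the four gamete frequencies. D’ divides D by the bound that applies to its sign, as derived for the minimum and maximum values of D. The $r^2$ formula used here, $D^2 / (p_A p_a p_B p_b)$, is the standard definition and is an assumption in the sense that it does not appear in the transcript. The function name `ld_measures` is hypothetical, and both loci are assumed to be polymorphic.

```python
def ld_measures(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, D', r^2) for two biallelic loci.

    Assumes both loci are polymorphic, so no allele frequency is 0 or 1.
    """
    pA = p_AB + p_Ab
    pa = 1.0 - pA
    pB = p_AB + p_aB
    pb = 1.0 - pB
    d = p_AB - pA * pB
    if d >= 0:
        bound = min(pA * pb, pa * pB)      # largest D the margins allow
    else:
        bound = max(-pA * pB, -pa * pb)    # most negative D allowed
    d_prime = d / bound                    # same sign cancels, so D' >= 0
    r2 = d * d / (pA * pa * pB * pb)
    return d, d_prime, r2


print(ld_measures(0.4, 0.1, 0.1, 0.4))
print(ld_measures(0.5, 0.0, 0.25, 0.25))   # the Ab haplotype is missing
```

In the second call one haplotype is absent, so D’ reaches 1 even though $r^2$ stays well below 1 because the allele frequencies at the two loci differ.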

Testing for gametic phase disequilibrium
Best when you can determine haplotypes
– inbred lines or doubled haploids
– haplotypes of double heterozygotes inferred from progeny tests
Use a Goodness of Fit test if the sample size is large
– Chi-square
– G-test (likelihood ratio)
Use Fisher’s exact test for smaller sample sizes
Use a permutation test for multiple alleles
Need a fairly large sample to have reasonable power for LD (~200 individuals or more)
See Weir (1996) pg for more information

Depiction of Linkage Disequilibrium
Disequilibrium matrix for polymorphic sites within sh1 in maize
[Figure: matrix showing $r^2$ and the probability value from Fisher’s Exact Test]
Flint-Garcia et al., Annual Review of Plant Biology 54:

Extent of LD in Maize
Linkage disequilibrium across the 10 maize chromosomes measured with 914 SNPs in a global collection of 632 maize inbred lines.
[Figure: $r^2$ across the maize chromosomes]
Average LD decay distance is 5–10 kb
Yan et al., PLoS ONE 4(12): e8451

Extent of LD in Barley
Elite North American Barley
[Figure: $r^2$ with no adjustment for population structure and adjusted for population structure]
Average LD decay distance is ~5 cM
Waugh et al., 2009, Current Opinion in Plant Biology 12:
Other studies
– Wild barley: LD decays within a gene
– Landraces: ~90 kb
– European germplasm: significant LD, mean 3.9 cM, median 1.16 cM

References on linkage disequilibrium
Flint-Garcia et al., Structure of linkage disequilibrium in plants. Annual Review of Plant Biology 54: 357–374.
Gupta et al., Linkage disequilibrium and association studies in higher plants: present status and future prospects. Plant Molecular Biology 57: 461–485.
Mangin et al., Novel measures of linkage disequilibrium that correct the bias due to population structure and relatedness. Heredity 108: 285–291.
Slatkin, M. Linkage disequilibrium – understanding the evolutionary past and mapping the medical future. Nature Reviews Genetics 9: 477–485.
Waugh, R., J.-L. Jannink, G.J. Muehlbauer, L. Ramsay. 2009. The emergence of whole genome association scans in barley. Current Opinion in Plant Biology 12(2): 218–222.
Yan, J., T. Shah, M.L. Warburton, E.S. Buckler, M.D. McMullen, et al. Genetic characterization and linkage disequilibrium estimation of a global maize collection using SNP markers. PLoS ONE 4(12): e8451.
Zhu et al., Status and prospects of association mapping in plants. The Plant Genome 1: 5–20.