Population genetics

Population genetics concerns the study of genetic variation and change within a population. For evolving species there is no model of the branching process (speciation), but in population genetics there is. This allows detailed modelling of the interplay between mutation, selection, and stochastic effects (genetic drift).

Simplifying assumptions that are initially made include:
- No selection
- No recombination
- No fluctuations in population size
- No population structure (subdivision; migration)
- No assortative mating (individuals mate randomly)
- No interaction between loci (no epistasis; no linkage)
- No environmental effects (e.g. climate or habitat change)

[Photos: R.A. Fisher, Sewall Wright, J.B.S. Haldane, Motoo Kimura]

Kimura’s Neutral Theory

Darwin(ism):
* Something causes minute (phenotypic) variations in a population. Ideas at the time: perhaps over-use during a lifetime causes variations (Lamarckism; think giraffes); perhaps traits are transmitted through blood and blend.
* Natural selection causes adaptive variants to rise in frequency, while non-adaptive ones die out.

Neo-Darwinism:
* The “something” is replaced by Mendelian genetics + random mutations.
* Panselectionism; adaptationism: most traits are optimal; selection is the main driving force of evolution (R.A. Fisher; Richard Dawkins; John Maynard Smith).

Population genetics / neutral theory:
* Most mutations are neutral; genetic drift underlies most of evolution (Fisher; Haldane; Wright; Kimura).

Modern evolutionary synthesis:
* Takes on board (parts of) all of the above.
* Neutral theory is relevant for DNA data in populations; it is considered less relevant for phenotypes.

Wright-Fisher Model
- Constant population size: N diploid individuals = 2N alleles
- Each descendant chooses a parent randomly
- Everyone reproduces simultaneously (no overlapping generations)

Wright-Fisher Model

Suppose i(t) of the 2N allele copies carry a particular mutation A in generation t. The probability that any given copy in generation t+1 is of type A is x = i(t) / 2N. The number of copies of type A in generation t+1 is binomially distributed:

$$P\big(i(t+1) = j\big) = \binom{2N}{j}\, x^{j} (1-x)^{2N-j}$$

This distribution has mean and variance

E(i(t+1)) = 2N x = i(t)
Var(i(t+1)) = 2N x (1-x)

The expected number of copies of A does not change, but because the sampling variance accumulates from generation to generation, eventually the mutation will either be lost (i=0) or reach fixation (i=2N).
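As a rough check of the mean and variance quoted for one Wright-Fisher generation, the binomial distribution can be written out and summed directly. This is a minimal sketch, not part of the original slides; the function names and the values 2N = 20, i = 5 are illustrative assumptions.

```python
from math import comb


def wf_transition(i, two_n):
    """Return P(i(t+1) = j) for j = 0..2N, given i copies of A among 2N."""
    x = i / two_n
    return [comb(two_n, j) * x**j * (1 - x) ** (two_n - j)
            for j in range(two_n + 1)]


def mean_and_variance(probs):
    """Mean and variance of a distribution over the counts 0..len(probs)-1."""
    mean = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
    return mean, var


# Illustrative (assumed) values: N = 10 diploids, 5 copies of A.
two_n, i = 20, 5
probs = wf_transition(i, two_n)
mean, var = mean_and_variance(probs)
x = i / two_n
print("mean:", mean, "expected:", i)
print("variance:", var, "expected:", two_n * x * (1 - x))
```

The mean stays at i while the variance is 2N x (1-x), which is the one-generation behaviour the slide describes.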

Wright-Fisher Model

Suppose the initial number of copies of the mutant A is i. Since E( i(t+1) ) = i(t), the expected count (and hence the expected frequency) remains constant throughout. However, eventually A will either be lost or go to fixation. If the probability of eventual fixation is p, we have

i = E( i(0) ) = E( i(∞) ) = 2N p + 0 (1-p) = 2N p

The probability p that A will go to fixation is therefore p = i / 2N.

A simpler argument is this: without selection all alleles are equivalent; the one that gets fixed is chosen uniformly from the present-day population; the probability that this is an A mutant is i / 2N.

This also means that for neutral sites, the rate ρ of substitution equals the rate u of mutation: each generation 2N u new mutations arise, each starting as a single copy (i = 1) that fixes with probability 1/2N, so ρ = 2N u × 1/2N = u.
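The fixation probability i / 2N can also be checked numerically by iterating the Wright-Fisher transition probabilities until essentially all probability mass sits on loss or fixation. The sketch below is an illustration under assumed settings (a small population with 2N = 10 and 500 generations of iteration); the function names are hypothetical.

```python
from math import comb


def wf_matrix(two_n):
    """Row i is the binomial distribution of next-generation counts of A."""
    rows = []
    for i in range(two_n + 1):
        x = i / two_n
        rows.append([comb(two_n, j) * x**j * (1 - x) ** (two_n - j)
                     for j in range(two_n + 1)])
    return rows


def fixation_probability(i, two_n, generations=500):
    """Probability mass on the state i = 2N after many generations."""
    matrix = wf_matrix(two_n)
    dist = [0.0] * (two_n + 1)
    dist[i] = 1.0  # start with exactly i copies of A
    for _ in range(generations):
        dist = [sum(dist[a] * matrix[a][b] for a in range(two_n + 1))
                for b in range(two_n + 1)]
    return dist[two_n]


# Assumed example: 3 copies of A among 2N = 10 allele copies.
print(fixation_probability(3, 10), "vs i / 2N =", 3 / 10)
```

For larger populations this brute-force iteration becomes slow; it is only meant to confirm the argument on a toy case.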

Wright-Fisher Model

Since x = i / 2N and Var(i(t+1)) = 2N x (1-x), we get Var(x) = x (1-x) / 2N; in other words, the sampling variance in the allele frequency x is inversely proportional to the population size. This effect is called (random) genetic drift.

The Wright-Fisher model is highly idealized: real populations vary in size, have structure, and do not mate randomly. Therefore N does not directly correspond to the actual population size. More accurately, N is the Wright-Fisher population size that generates the same amount of genetic drift as there is in the actual population. To emphasize this, the parameter N is often called the effective population size (written N_e).

The coalescent model

[Figure panels: whole population (Wright-Fisher); ancestry of current population; ancestry of a random sample (coalescent)]

Kingman’s coalescent (J.F.C. Kingman)

Probability that two given lineages coalesce in one generation:
P(coalescence) = 1/2N

Expected number of generations before coalescence, i.e. the time to the most recent common ancestor (MRCA):
E(T_MRCA) = 2N

Probability of a coalescence (of 2 lineages) when k lineages are present = 1 - P(no coalescence):

$$P(\text{coalescence}) = 1 - \prod_{j=1}^{k-1}\left(1 - \frac{j}{2N}\right) \approx \binom{k}{2}\frac{1}{2N} \qquad (k \ll 2N)$$

Other argument: the coalescence rate per pair is 1/2N, and there are k-choose-2 pairs.
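The two arguments for the coalescence probability with k lineages can be compared numerically. The sketch below assumes the exact probability is one minus the chance that all k lineages pick distinct parents among 2N copies, and compares it with the pair-counting approximation. The expected time to the MRCA of k lineages is a standard consequence of summing the waiting times 2N / (k-choose-2), and is included here as an assumption beyond what the slide states (the slide gives only the k = 2 case).

```python
from math import comb


def p_coalescence_exact(k, n):
    """1 - P(all k lineages choose distinct parents among 2N copies)."""
    p_none = 1.0
    for j in range(1, k):
        p_none *= 1 - j / (2 * n)
    return 1 - p_none


def p_coalescence_approx(k, n):
    """Pair-counting argument: k-choose-2 pairs, each at rate 1/2N."""
    return comb(k, 2) / (2 * n)


def expected_tmrca(k, n):
    """Sum of expected waiting times 2N / C(j, 2) while j lineages remain."""
    return sum(2 * n / comb(j, 2) for j in range(2, k + 1))


n = 10_000  # assumed population size, for illustration only
for k in (2, 5, 10):
    print(k, p_coalescence_exact(k, n), p_coalescence_approx(k, n),
          expected_tmrca(k, n))
```

For k = 2 the exact and approximate probabilities coincide at 1/2N and the expected time is 2N; for larger k the approximation is good as long as k is much smaller than 2N.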

Variation in the population

Suppose the mutation rate is u (per generation, and per locus or site). The expected number of differences between two individuals (diversity) is

π = 2 u E(T_MRCA) = 4 N u

(assuming all mutations are unique). The quantity 4 N u often appears in population genetics, and is usually treated as an independent parameter, θ.

Real-life populations do not, of course, follow the Wright-Fisher model. The parameter N that makes the W-F diversity π equal to the observed diversity is called the effective population size, N_e. Other definitions (based on other aspects of the model) are used as well.
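The diversity-based definition of effective population size amounts to inverting the relation diversity = 4 N u. A minimal sketch follows; the numbers used are purely hypothetical placeholders and do not describe any real species.

```python
def expected_diversity(n, u):
    """Expected pairwise diversity under Wright-Fisher: 4 N u."""
    return 4 * n * u


def effective_size_from_diversity(pi_observed, u):
    """The N that makes the Wright-Fisher diversity equal the observed one."""
    return pi_observed / (4 * u)


# Hypothetical inputs, chosen only to illustrate the arithmetic.
pi_observed = 0.004   # assumed observed per-site diversity
u = 1e-8              # assumed per-site mutation rate per generation
n_e = effective_size_from_diversity(pi_observed, u)
print("N_e =", n_e)
print("round trip:", expected_diversity(n_e, u))
```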

Allele frequency spectrum

By going to the continuous (diffusion) limit, the equilibrium distribution of allele frequencies can be derived. This is called the “allele frequency spectrum”. Assuming that mutations and back-mutations occur at the same rate u, the allele frequency spectrum P(x) dx is (apart from normalization)

$$P(x)\,dx = x^{\theta-1}(1-x)^{\theta-1}\,dx$$

Here θ = 4 N u.

Suppose a mutation occurs at frequency x. The probability of sampling two individuals that differ at that locus is 2 x (1-x). Multiplying by P(x) dx gives the contribution to the heterozygosity π (= probability that two random alleles differ) per unit of frequency, again apart from normalization:

$$H(x)\,dx = x^{\theta}(1-x)^{\theta}\,dx$$

Since θ is small, every frequency contributes nearly equally to the total heterozygosity π.

Under the influence of selection, the allele frequency spectrum becomes skewed towards the advantageous allele, and depleted of intermediate-frequency alleles. This is one way to test for selection.
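To see why a small value of 4 N u makes every frequency contribute almost equally to heterozygosity, the two unnormalised densities can be evaluated at a few frequencies. This is an illustrative sketch; the value theta = 0.01 and the function names are assumptions, not values from the slides.

```python
def spectrum_density(x, theta):
    """Unnormalised allele frequency spectrum: x^(theta-1) (1-x)^(theta-1)."""
    return (x * (1 - x)) ** (theta - 1)


def heterozygosity_density(x, theta):
    """Unnormalised contribution to heterozygosity: x^theta (1-x)^theta."""
    return (x * (1 - x)) ** theta


theta = 0.01  # assumed small value of 4 N u
for x in (0.01, 0.1, 0.5):
    print(x, spectrum_density(x, theta), heterozygosity_density(x, theta))
```

The spectrum itself is strongly U-shaped (most alleles are rare or nearly fixed), while the heterozygosity contribution is almost flat across frequencies.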

Linkage disequilibrium (LD)

Relates to 2 polymorphic sites:

$$D_{AB} = f_{AB} - f_A f_B = f_{AB} f_{ab} - f_{Ab} f_{aB}$$

$$D_{AB} = -D_{aB} = -D_{Ab} = D_{ab}$$

Correlation coefficient (Hill & Robertson 1968):

$$r^2_{AB} = \frac{D_{AB}^2}{f_a f_A f_b f_B}$$

[Photo: Richard Lewontin (1929-2021)]
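The LD coefficient and the squared correlation can be computed directly from the four haplotype frequencies. The sketch below is a minimal illustration; the function name `ld_stats` and the example frequencies are assumptions.

```python
def ld_stats(f_AB, f_Ab, f_aB, f_ab):
    """Return (D, r^2) for two biallelic sites from haplotype frequencies."""
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    f_A = f_AB + f_Ab          # marginal frequency of allele A
    f_B = f_AB + f_aB          # marginal frequency of allele B
    d = f_AB - f_A * f_B       # equals f_AB * f_ab - f_Ab * f_aB
    denom = f_A * (1 - f_A) * f_B * (1 - f_B)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d, d * d / denom


print(ld_stats(0.5, 0.0, 0.0, 0.5))      # complete association
print(ld_stats(0.25, 0.25, 0.25, 0.25))  # independent sites
print(ld_stats(0.4, 0.1, 0.2, 0.3))      # intermediate case
```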

Dynamics of LD
- Genetic drift causes a reduction in diversity, so that (expected) LD → 0 at equilibrium.
- Recombination decreases LD.

Effect of a selective sweep (rapid increase in frequency of an advantageous allele) on LD:
- Diversity is reduced
- Polymorphisms on the selected haplotype are carried along: hitchhiking
- More correlations between sites: many share ancestry
- Result: LD increases

[Figure: sweep]

Prior observations “Extent of enzyme polymorphism is surprisingly constant between species. So constant, in fact, that the effective sizes of most species must be within 1 order of magnitude of each other.” (Lewontin 1974; Maynard Smith & Haigh 1974) Variation is reduced in regions with low recombination (Aguade 1989; Begun & Aquadro 1992, etc.)

Assumptions:
- Rate of neutral mutations = u
- Rate of advantageous mutations = v
- Selective advantage of adv. mutations = σ

Without linkage to selected locus: mean sum-of-site heterozygosities (ssh; diversity) = 4 N u (= mean time to coalescence × 2 lineages × neutral mutation rate)

[Figure labels: neutral locus; selected locus]

Assumptions:
- Rate of neutral mutations = u
- Rate of advantageous mutations = v
- Selective advantage of adv. mutations = σ
- Times of fixation at selected locus: Poisson process, rate ρ
- Fixations are fast compared to drift, and can be regarded as instantaneous

With linkage to selected locus:
- Rate of coalescence due to drift = 1/2N
- Rate of fixation of adv. muts. at selected site = ρ
- Total coalescence rate: ρ + 1/2N
- Average time to coalescence: 1 / (ρ + 1/2N)

ssh = 2 u / (ρ + 1/2N) = 4 N u / (1 + 2 N ρ)

Limit for N → ∞: ssh = 2 u / ρ

[Figure labels: neutral locus; selected locus]

ssh = 2 u / (ρ + 1/2N) = 4 N u / (1 + 2 N ρ)

Rate of fixation: ρ ≈ v × 2 N σ (provided 1/2N < σ < 1)
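The way diversity saturates as N grows can be tabulated from ssh = 2 u / (ρ + 1/2N), which is algebraically the same as 4 N u / (1 + 2 N ρ). The sketch below treats ρ as a fixed parameter and uses assumed illustrative rates (u = 10^-9 and ρ = 10^-7); the function names are hypothetical.

```python
def ssh(n, u, rho):
    """Sum-of-site heterozygosity with linked sweeps: 2u / (rho + 1/2N)."""
    return 2 * u / (rho + 1 / (2 * n))


def ssh_alternative(n, u, rho):
    """Same quantity written as 4Nu / (1 + 2 N rho)."""
    return 4 * n * u / (1 + 2 * n * rho)


u, rho = 1e-9, 1e-7  # assumed illustrative rates
for n in (10**3, 10**5, 10**7, 10**9):
    print(n, ssh(n, u, rho), 4 * n * u)  # with sweeps vs. neutral 4Nu

print("large-N limit 2u/rho:", 2 * u / rho)
```

For small N the result tracks the neutral value 4 N u; for large N it flattens out at 2 u / ρ.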

Fixation due to hitchhiking:

Current frequency of allele A = x
New frequency of allele A = z

z = 1  with probability ρ x      (hitchhiking; allele A)
z = 0  with probability ρ (1-x)  (hitchhiking; allele a)
z = x  with probability (1-ρ)    (no hitchhiking)

Δfreq = z - x
E(Δfreq) = 0
Var(Δfreq) = ρ x (1-x)            (infinite population)
Var(Δfreq) = (1/2N) x (1-x)       (finite population; no hitchhiking)
Var(Δfreq) = (ρ + 1/2N) x (1-x)   (finite population + hitchhiking)

Same form as standard W-F model, but with N_e = N / (1 + 2 N ρ)

Now assume some recombination between neutral & selected loci (instead of total linkage). Suppose the allele linked to the advantageous mutation rises to frequency y (rather than frequency 1).

z = y + (1-y) x  with probability ρ x      (hitchhiking; allele A)
z = (1-y) x      with probability ρ (1-x)  (hitchhiking; allele a)
z = x            with probability (1-ρ)    (no hitchhiking)

Δfreq = z - x
E(Δfreq) = 0
Var(Δfreq) = ρ y² x (1-x)            (infinite population)
Var(Δfreq) = (1/2N) x (1-x)          (finite population; no hitchhiking)
Var(Δfreq) = (ρ y² + 1/2N) x (1-x)   (finite population + hitchhiking)

Same form as standard W-F model, but with N_e = N / (1 + 2 N ρ y²)
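The mean and variance of the frequency change under hitchhiking can be verified by enumerating the three outcomes and their probabilities. This is a minimal sketch with hypothetical function names; setting y = 1 recovers the fully linked case, and the example values of x, ρ and y are arbitrary.

```python
def hitchhiking_moments(x, rho, y=1.0):
    """Mean and variance of z - x over the three hitchhiking outcomes."""
    outcomes = [
        (y + (1 - y) * x, rho * x),        # sweep on the allele-A background
        ((1 - y) * x,     rho * (1 - x)),  # sweep on the allele-a background
        (x,               1 - rho),        # no sweep this generation
    ]
    mean_change = sum((z - x) * p for z, p in outcomes)
    var_change = sum((z - x) ** 2 * p for z, p in outcomes) - mean_change ** 2
    return mean_change, var_change


def effective_size(n, rho, y=1.0):
    """N_e that gives a Wright-Fisher variance of (rho y^2 + 1/2N) x (1-x)."""
    return n / (1 + 2 * n * rho * y ** 2)


x, rho, y = 0.3, 0.01, 0.5  # arbitrary illustrative values
mean_change, var_change = hitchhiking_moments(x, rho, y)
print(mean_change, var_change, rho * y ** 2 * x * (1 - x))
print(effective_size(10_000, rho, y))
```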

Coalescence rate due to drift = 1/2N
Coalescence rate due to hitchhiking = ρ E(y²)

If 2 ρ y² > 1/N, “draft” (due to hitchhiking, sweeps) is more important than “drift” (population size effect). In the “draft” regime, nucleotide diversity is independent of population size.

Limit for N → ∞: π = ssh = 2 u / (ρ y²)

Numerical example: fruit fly
- Neutral mutation rate u = 10^-9 per generation, per site
- Site heterozygosity π =
- Assume y = 1
- Rate of advantageous substitutions ρ ~ 10^-7, “typical of rate of amino acid substitutions in coding regions”
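Plugging the stated rates into the large-N limit gives a feel for the magnitudes involved. The sketch below only evaluates the limit formula 2 u / (ρ y²) and the population size at which 2 ρ y² equals 1/N, using u = 10^-9, ρ = 10^-7 and y = 1 from the example; it is not a reconstruction of the heterozygosity value that is missing from the transcript, and the function names are hypothetical.

```python
def draft_limit_diversity(u, rho, y=1.0):
    """Large-N limit of diversity under genetic draft: 2u / (rho y^2)."""
    return 2 * u / (rho * y ** 2)


def draft_threshold_size(rho, y=1.0):
    """Population size N at which 2 rho y^2 equals 1/N."""
    return 1 / (2 * rho * y ** 2)


u, rho, y = 1e-9, 1e-7, 1.0
print("limiting diversity:", draft_limit_diversity(u, rho, y))
print("draft dominates drift for N above:", draft_threshold_size(rho, y))
```

With these inputs the limiting diversity is of order 10^-2 and draft outweighs drift once N exceeds a few million.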

Questions in (population) genetics
- Effective population size of the human population ~ Why the huge discrepancy with actual population size?
- The amount of genetic diversity is “surprisingly constant between species” (Lewontin 1974). Is this (i) not a problem / not true, (ii) caused by Gillespie’s “genetic draft”, or (iii) caused by something else?
- What is the cause of the variation in recombination rate (including hotspots) across the human genome? Are the latest measurements accurate?
- Roughly the same 3-5% of the mammalian genome is conserved within the mammalian clade. Does this represent most/all of the functional genome, or is a large fraction functional and fast evolving? What can population genetics (rather than species comparisons) bring to this question?
- Common (high-frequency) genetic variants associated with common disease are hard to find and usually explain only a small fraction (~1%) of the variation in susceptibility. Are common diseases often caused by rare genetic variants instead? If so, how can these be found? (Not by association studies; and linkage studies are expensive and have low resolution.)