Neutrality Test First suggested by Kimura (1968) and King and Jukes (1969) Shift to using neutrality as a null hypothesis in positive selection and selection.


Neutrality Test
The neutral theory was first suggested by Kimura (1968) and King and Jukes (1969). The field has since shifted to using neutrality as a null hypothesis in tests for positive selection and selective sweeps. Positive selection: a new advantageous variant is segregating in a population. Selective sweep: diversity at neutral alleles linked to a selected locus is reduced as the selected allele goes to fixation.

Neutrality Test
Neutrality tests allow us to: identify causes of species-specific phenotype differences; identify regions currently under selection; form hypotheses on function from genome data.
Challenges in neutrality tests: extracting the data; identifying the loci under selection.

Neutrality Test
Two main classes of neutrality test:
1. Tests based on the allelic distribution and/or level of variability.
2. Comparisons of divergence/variability between different mutation classes within a locus.
The former relies on strong assumptions about population demography.

Single-Locus Test
Ewens sampling formula: gives the sampling probability of an allelic configuration under the infinite-alleles model.
Ewens-Watterson test: compare the expected homozygosity with the observed homozygosity. If the difference is larger than a threshold value, reject the null hypothesis of neutrality.

Single-Locus Test
Tajima's D-test (nucleotide data):
$D = \dfrac{\theta_\pi - \theta_\omega}{S_{\theta_\pi - \theta_\omega}}$
D is the scaled difference between two estimates of $\theta = 4N_e\mu$. $\theta_\pi$ is an estimator of $\theta$ based on the average number of pairwise differences. $\theta_\omega$ is an estimator of $\theta$ based on the number of segregating sites. $S_{\theta_\pi - \theta_\omega}$ is an estimate of the standard error of the difference between the two estimates.
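As a minimal sketch of how D can be computed from an alignment, the function below takes a list of aligned sequences and uses the variance constants from Tajima (1989) for the standard error. The function name `tajimas_d` and the plain-string input format are assumptions made for illustration; they do not come from the slides.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    # For n < 4 the variance constants below evaluate to zero.
    if n < 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least 4 aligned sequences of equal length")

    # S: number of segregating (polymorphic) sites in the alignment.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("D is undefined without segregating sites")

    # theta_pi: average number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    theta_pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # theta_omega: segregating sites divided by a1 = sum_{i=1}^{n-1} 1/i.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_omega = seg_sites / a1

    # Constants for the variance of (theta_pi - theta_omega), Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    std_err = sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))

    return (theta_pi - theta_omega) / std_err
```

An alignment in which every variant is a singleton gives a negative D (excess of rare variants), while variants at intermediate frequency give a positive D.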

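Single-Locus Test
D-test: significant results are difficult to interpret, because the test is useful for detecting bottlenecks and population subdivision as well as selective sweeps. A significant D therefore does not by itself distinguish demography from selection.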

Multiple-Loci Test
Lewontin-Krakauer test: uses data from diallelic loci sampled in multiple populations.
$F = \dfrac{\sigma_p^2}{\bar{p}(1 - \bar{p})}$
$\bar{p}$ and $\sigma_p^2$ are the mean and variance of allele frequencies across populations. If F is too large, the neutral hypothesis is rejected.
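The F statistic for one locus can be sketched as follows. The function name `lewontin_krakauer_f` is hypothetical, and the use of the population variance (dividing by the number of populations rather than that number minus one) is an assumption; the slides do not say which variance estimator is meant.

```python
def lewontin_krakauer_f(freqs):
    """F = var(p) / (mean(p) * (1 - mean(p))) for one diallelic locus.

    freqs: frequency of the same allele in each sampled population.
    Assumption: population variance (divide by the number of populations).
    """
    k = len(freqs)
    if k < 2:
        raise ValueError("need allele frequencies from at least 2 populations")
    p_bar = sum(freqs) / k
    if not 0 < p_bar < 1:
        raise ValueError("mean allele frequency must lie strictly between 0 and 1")
    var_p = sum((p - p_bar) ** 2 for p in freqs) / k
    return var_p / (p_bar * (1 - p_bar))
```

In the test itself, F is computed for many loci, and loci whose F is unusually large relative to the rest are candidates for rejecting neutrality.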

Multiple-Loci Test
HKA test: variability between and within species is compared for two or more loci.
Assumes that under neutrality: the expected number of segregating sites within species and the expected number of fixed differences between species are both proportional to the mutation rate; the ratio of the two expectations is therefore constant among loci.
If the divergence:polymorphism ratio at a locus is too high relative to the other loci, neutrality is rejected and selection is inferred.

Multiple-Loci Test
HKA test, challenge: the variance in the number of segregating sites depends strongly on demography.
Example: immigration from an unknown population. M = immigration rate; CV = coefficient of variation (standard deviation divided by the mean) of the number of segregating sites.

Multiple-Loci Test
Assumptions and challenges.
Assumptions: selection sets the target alleles/loci apart from other loci; selection can be detected if loci differ significantly in their adherence to the neutral model.
Challenges: the expected value and variance of D depend heavily on the demographic model.

Multiple-Loci Test Example

Comparing Variability in Different Classes of Mutations
McDonald-Kreitman (MK) type tests: traditionally used to detect and measure adaptive evolution within a species, by determining whether adaptive evolution has occurred and what proportion of substitutions resulted from positive selection. In general, the MK test compares the amount of polymorphism within a species with the divergence (substitutions) between species, at neutral and non-neutral (advantageous or deleterious) sites.

McDonald-Kreitman cont.
Setting up an MK test: set up a two-way contingency table as shown below.

                 Fixed   Polymorphic
Synonymous       Ds      Ps
Nonsynonymous    Dn      Pn

Term clarification:
Synonymous: a point mutation that does not change the encoded amino acid (a silent mutation); often used as the neutral control.
Nonsynonymous: a point mutation that changes the encoded amino acid.
Ds: the number of synonymous substitutions per gene
Dn: the number of nonsynonymous substitutions per gene
Ps: the number of synonymous polymorphisms per gene
Pn: the number of nonsynonymous polymorphisms per gene

McDonald-Kreitman cont.
First applied in 1991 to the Adh gene in Drosophila. The test proposed a method to estimate the proportion of substitutions that are fixed by positive selection rather than by genetic drift. Under neutrality, the ratio of nonsynonymous (ns) to synonymous (s) variation within a species is expected to equal the ratio of ns to s variation between species: Dn/Ds = Pn/Ps

McDonald-Kreitman cont.
When positive or negative selection influences ns variation, the two ratios will no longer be equal.
When negative selection is strong and deleterious alleles strongly affect polymorphism, the ratio of ns to s between species is lower than the ratio within species: Dn/Ds < Pn/Ps
When positive selection is strong, the ratio of ns to s within species is lower than the ratio between species: Dn/Ds > Pn/Ps. Advantageous mutations do not necessarily contribute to polymorphism, but they do have an effect on divergence.
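A minimal sketch of an MK-type comparison: the 2x2 table of Dn, Ds, Pn, Ps is tested for independence with a two-sided Fisher's exact test written in pure Python, and the direction of the departure is read from the two ratios. Fisher's exact test is one common choice for this table, not necessarily the one used in the lecture; the function names and the counts in the usage note are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(total - col1, row1 - x) / comb(total, row1)

    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


def mk_test(dn, ds, pn, ps):
    """Return (p_value, direction) for counts Dn, Ds, Pn, Ps."""
    p_value = fisher_exact_two_sided(dn, ds, pn, ps)
    # Compare Dn/Ds with Pn/Ps by cross-multiplying to avoid dividing by zero.
    if dn * ps > pn * ds:
        direction = "Dn/Ds > Pn/Ps"
    elif dn * ps < pn * ds:
        direction = "Dn/Ds < Pn/Ps"
    else:
        direction = "Dn/Ds = Pn/Ps"
    return p_value, direction
```

For example, `mk_test(7, 17, 2, 42)` on these illustrative counts reports the direction `Dn/Ds > Pn/Ps` with a small p-value, a pattern consistent with an excess of nonsynonymous substitutions.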

McDonald-Kreitman cont.
Possible shortcoming of the MK-type tests: it is not always clear what type of selection is acting on a gene. For example, changes in population size combined with weak selection against slightly deleterious mutations may either increase or decrease the number of ns polymorphisms; an increase in population size will lead to excessive ns polymorphisms. Significant results from an MK test therefore cannot be interpreted directly as evidence for positive selection.

The Genomic Rate of Adaptive Evolution (Smith and Eyre-Walker)
Additional work with MK tests by Smith and Eyre-Walker:
$\alpha = 1 - \dfrac{D_s P_n}{D_n P_s}$
In the above equation, $\alpha$ is the proportion of substitutions driven by positive selection. See research handouts.
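The estimator is simple enough to sketch directly. The function name `alpha_mk` and the counts in the usage note are illustrative assumptions, not values from the slides.

```python
def alpha_mk(dn, ds, pn, ps):
    """alpha = 1 - (Ds * Pn) / (Dn * Ps): proportion of substitutions
    attributed to positive selection, from MK table counts."""
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1 - (ds * pn) / (dn * ps)
```

With the illustrative counts Dn = 7, Ds = 17, Pn = 2, Ps = 42, `alpha_mk(7, 17, 2, 42)` evaluates to 1 - 34/294, roughly 0.88. When Dn/Ds = Pn/Ps the estimate is 0, and it becomes negative when Dn/Ds < Pn/Ps.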

Test Based on Allelic Distribution in ns and s Sites
Some tests examine different types of sites (including non-protein-coding sites). They look for differences in the allelic distributions (frequency spectra) between s and ns polymorphisms. Used for genomic data sets in which a large number of polymorphisms can be obtained. Microsatellite data? Nielsen and Weinreich (1999) performed frequency spectra analysis in the human genome: differences in the average age of ns and s mutations provided evidence for selection.

Tests Based on the dN/dS Ratio or ω
The most direct method for showing the presence of positive selection is to demonstrate that the number of ns substitutions per ns site (dN) is significantly larger than the number of s substitutions per s site (dS), that is, ω = dN/dS > 1.

Definitions: dN
dN (alternatively designated Ka) is a measure of the degree to which two homologous coding sequences differ with respect to amino-acid content. Specifically, it indicates the degree to which two sequences differ at ns sites (sites where a substitution changes the amino acid). dN is the average number of nucleotide differences between the sequences per ns site.

More Definitions: dS
dS (alternatively designated Ks) is a measure of the degree to which two homologous coding sequences differ with respect to silent nucleotide substitutions (substitutions that do not cause an amino-acid substitution). It indicates the degree to which two sequences differ at s sites (sites where a substitution does not change the amino acid). dS is the average number of nucleotide differences between the sequences per synonymous site.
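Given these definitions, ω = dN/dS can be sketched from four numbers: the counts of ns and s differences between two sequences and the numbers of ns and s sites. The Jukes-Cantor correction for multiple hits applied below is an assumption borrowed from the Nei-Gojobori style of estimate; the slides define dN and dS as per-site differences and do not specify a correction. Counting the sites and differences codon by codon is outside this sketch, and all function and parameter names are hypothetical.

```python
from math import log


def jukes_cantor(p):
    """Convert a proportion of differing sites into substitutions per site."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, omega) from difference and site counts."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("omega is undefined when dS is zero")
    return dn, ds, dn / ds
```

For instance, 10 differences at 300 ns sites against 20 differences at 100 s sites gives ω well below 1, the pattern expected under purifying selection.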

Tests Based on the dN/dS Ratio or ω
A value of dN > dS implies that ns mutations are fixed with a higher probability than neutral ones, due to positive selection. Testing the null hypothesis dN ≤ dS (ω ≤ 1) for an entire gene is a very conservative test of neutrality: purifying selection must occur frequently in functional genes to preserve function, so the average dN is expected to be much less than the average dS even if positive selection is occurring at some sites.

Differences in MK and H0: ω ≤ 1
Rejecting H0: ω ≤ 1 (that is, showing ω > 1) is to date the only direct method available for detecting positive selection from DNA sequence data. Limitations: these methods assume no recombination, and the effect of strong codon bias on them had not been systematically explored (as of 2001). Have the above limitations been investigated yet?