INSERM Workshop – La Londe Les Maures – May 2004

DETECTING SELECTION FROM DNA SEQUENCE POLYMORPHISM DATA
N. GALTIER
CNRS UMR 5171 – Génome, Populations, Interactions, Adaptation, Université Montpellier 2, France
galtier@univ-montp2.fr

SEQUENCE POLYMORPHISM DATA
[Figure: a population (species)]

SEQUENCE POLYMORPHISM DATA
A sample of 5 genes at one DNA fragment (locus), drawn from the population (species):
....ACGGATAGTTAGTGACGATA...
....ACGTATAGCTAGTGACGATA...
....ACGTATAGCTAGTGACGATA...
....ACGGATAGCTAGTGACGATA...
....ACGGATAGCTAGTGACGATC...
       *    *          *   (polymorphic sites)
3 polymorphic (segregating) sites
4 distinct sequences (haplotypes)

SEQUENCE POLYMORPHISM DATA
A sample of 5 genes at one DNA fragment (locus), drawn from the population (species), plus an outgroup sequence:
....ACGGATAGTTAGTGACGATA...
....ACGTATAGCTAGTGACGATA...
....ACGTATAGCTAGTGACGATA...
....ACGGATAGCTAGTGACGATA...
....ACGGATAGCTAGTGACGATC...
....CCAGCTAGCTACTGAAGTTG... (outgroup)
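The two summary counts used throughout the talk (segregating sites and haplotypes) can be read directly from the alignment. A minimal sketch in Python, using the five sampled sequences copied from the slide; the function names are illustrative only.

```python
# The five sampled genes from the slide (flanking dots removed, outgroup left out).
SAMPLE = [
    "ACGGATAGTTAGTGACGATA",
    "ACGTATAGCTAGTGACGATA",
    "ACGTATAGCTAGTGACGATA",
    "ACGGATAGCTAGTGACGATA",
    "ACGGATAGCTAGTGACGATC",
]


def segregating_sites(seqs):
    """Return the 0-based positions at which the sample shows more than one nucleotide."""
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def haplotype_count(seqs):
    """Return the number of distinct sequences in the sample."""
    return len(set(seqs))


print(segregating_sites(SAMPLE))  # [3, 8, 19]
print(haplotype_count(SAMPLE))    # 4
```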

MUTATIONS SEGREGATING IN A POPULATION (1)
[Figure: mutant allele frequency (up to 1) over time, with the sample marked; NEUTRAL case]
Mutations (black dots) arise at rate 2N.μ
Under neutrality, a new mutation reaches fixation with probability 1/(2N)
This results in a neutral substitution rate of 2N.μ / 2N = μ (red dots)
N: effective population size
μ: mutation rate
The amount of polymorphism in the population at mutation-drift equilibrium is determined by the N.μ product, usually measured as θ = 4N.μ
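The 1/(2N) fixation probability can be checked by simulation. The sketch below assumes a Wright-Fisher model (binomial resampling of 2N gene copies each generation), which the slide does not name explicitly; the population size, number of replicates and seed are arbitrary illustrative values.

```python
import random


def fixes(two_n, rng):
    """Follow one new neutral mutation (1 copy among two_n) until loss or fixation."""
    copies = 1
    while 0 < copies < two_n:
        freq = copies / two_n
        # Binomial resampling of the next generation: pure genetic drift.
        copies = sum(rng.random() < freq for _ in range(two_n))
    return copies == two_n


def fixation_probability(two_n, replicates, seed=1):
    """Estimate the fixation probability as the fraction of replicates that fix."""
    rng = random.Random(seed)
    return sum(fixes(two_n, rng) for _ in range(replicates)) / replicates


TWO_N = 20  # hypothetical, deliberately tiny population: 2N = 20 gene copies
estimate = fixation_probability(TWO_N, replicates=20000)
print(estimate, 1 / TWO_N)  # simulated value next to the expected 1/(2N)
```

Multiplying this probability by the 2N.μ new mutations entering the population each generation gives the substitution rate μ stated on the slide.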

MUTATIONS SEGREGATING IN A POPULATION (2)
[Figure: mutant allele frequency (up to 1) over time, NEUTRAL case vs PURIFYING SELECTION]
Purifying (= negative) selection results in:
- a decreased substitution rate
- a decreased amount of polymorphism
- lower allele frequencies

MUTATIONS SEGREGATING IN A POPULATION (3)
[Figure: mutant allele frequency (up to 1) over time, NEUTRAL case vs ADAPTIVE SELECTION]
Adaptive (= positive) selection results in:
- an increased substitution rate
- a decreased amount of polymorphism
- higher allele frequencies
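The effect of selection on the substitution rate can be made quantitative. The slides give no formula, so the sketch below brings in an outside result as an assumption: Kimura's diffusion approximation for the fixation probability of a new mutation with selection coefficient s, u = (1 - exp(-2s)) / (1 - exp(-4Ns)), taking the effective size equal to N. The values of N and s are hypothetical.

```python
import math


def fixation_probability(n, s):
    """Kimura's diffusion approximation (an assumption, not taken from the slides)
    for a new mutation with selection coefficient s in a diploid population of size n."""
    if s == 0:
        return 1 / (2 * n)  # neutral case, as on the first slide of this series
    exponent = -4 * n * s
    if exponent > 700:  # strongly deleterious: exp() would overflow, probability is ~0
        return 0.0
    return math.expm1(-2 * s) / math.expm1(exponent)


def relative_substitution_rate(n, s):
    """Substitution rate divided by the neutral rate mu, i.e. 2N * u(s)."""
    return 2 * n * fixation_probability(n, s)


N = 1000  # hypothetical population size
for s in (-0.001, 0.0, 0.01):
    print(s, relative_substitution_rate(N, s))
```

Values below 1 correspond to the decreased substitution rate under purifying selection, values above 1 to the increased rate under adaptive selection.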

LINKAGE AND HITCH-HIKING
[Figure: SELECTIVE SWEEP at a selected locus linked to the sampled neutral locus]
Directional selection decreases polymorphism at linked (neighbouring) neutral sites by increasing the apparent drift.

LINKAGE AND HITCH-HIKING
[Figure: SELECTIVE SWEEP at a selected locus linked to the sampled neutral locus]
Recombination reduces the effect of selection at neighbouring loci.

DETECTING SELECTION BY SEEKING REGIONS OF "LOW" POLYMORPHISM
Selection reduces polymorphism, but the level of polymorphism is also determined by other factors, including population size and mutation rate. To conclude that selection is acting, one must control for these nuisance factors.
Example: the sliding window strategy
[Figure: polymorphism (π) plotted along the DNA fragment, with the annotation "selection or reduced mutation bias?"]
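A sliding-window scan of this kind can be sketched as follows. The code computes nucleotide diversity π (average pairwise differences per site) in successive windows along the fragment. It reuses the five-gene sample shown earlier in the talk; the window and step sizes are arbitrary choices for illustration.

```python
from itertools import combinations

# The five sampled genes shown earlier in the talk.
SAMPLE = [
    "ACGGATAGTTAGTGACGATA",
    "ACGTATAGCTAGTGACGATA",
    "ACGTATAGCTAGTGACGATA",
    "ACGGATAGCTAGTGACGATA",
    "ACGGATAGCTAGTGACGATC",
]


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * len(seqs[0]))


def sliding_pi(seqs, window, step):
    """Return (start, pi) for each window of `window` sites, moved by `step` sites."""
    length = len(seqs[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]


for start, pi in sliding_pi(SAMPLE, window=5, step=5):
    print(start, pi)
```

A window with low π is only a candidate: as the slide stresses, a locally reduced mutation rate produces the same signal.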

HITCH-HIKING MAPPING
[Table: polymorphism levels at LOCI A to F (distinct μ's) in POPULATIONS 1 to 5 (distinct N's). Values surviving in the transcript: 0.05, 0.07, 0.20, 0.13, 0.05, 0.06, 0.10, 0.11, 0.03; their cell positions are not recoverable.]
The low amount of polymorphism at locus D, pop 3 cannot be explained by:
- reduced population size (other loci show high polymorphism in pop 3)
- low mutation rate (other pops show high polymorphism at locus D)
A selective sweep occurred at locus D in population 3
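The reasoning on this slide (a cell that is low relative to both its locus and its population) can be sketched as a two-way fit. This is one simple way to formalise the idea, not necessarily the procedure used by the author. It assumes diversity is roughly the product of a locus effect and a population effect, removes both on the log scale, and reports the most negative residual. All numbers below are hypothetical and are not the values of the slide's table.

```python
import math

LOCI = ["A", "B", "C", "D", "E", "F"]
POPS = [1, 2, 3, 4, 5]

# Hypothetical data: each locus has its own mutation level, each population its
# own size factor, and diversity is their product.
LOCUS_LEVEL = [0.04, 0.08, 0.20, 0.12, 0.10, 0.03]
POP_FACTOR = [1.0, 1.2, 0.8, 1.1, 0.9]
PI = [[level * factor for factor in POP_FACTOR] for level in LOCUS_LEVEL]
PI[3][2] /= 10  # planted sweep: locus D, population 3


def sweep_candidate(table):
    """Return (row, column, residual) for the most negative residual left after
    removing row (locus) and column (population) effects on the log scale."""
    logs = [[math.log(x) for x in row] for row in table]
    n_rows, n_cols = len(logs), len(logs[0])
    row_mean = [sum(row) / n_cols for row in logs]
    col_mean = [sum(logs[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    grand = sum(row_mean) / n_rows
    residuals = [
        (logs[i][j] - row_mean[i] - col_mean[j] + grand, i, j)
        for i in range(n_rows)
        for j in range(n_cols)
    ]
    residual, i, j = min(residuals)
    return i, j, residual


i, j, residual = sweep_candidate(PI)
print(LOCI[i], POPS[j], round(residual, 2))
```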

THE HKA TEST
[Figure: genealogies of the focal species and the outgroup at Locus A and at Locus B]
The reduced amount of polymorphism at locus B cannot be explained by:
- reduced population size (locus A shows high polymorphism)
- low mutation rate (the distance to outgroup is not reduced)
Selection has influenced polymorphism at one of the two loci.

THE McDONALD-KREITMAN TEST
[Figure: genealogy of the focal species and the outgroup]
[Table: counts of synonymous and non-synonymous changes, polymorphic vs fixed; cell values 5, 4, 2, 8]
The ratio of nonsynonymous to synonymous changes is higher between species (divergence) than within species (polymorphism), whereas the two ratios should be equal under neutrality: positive selection has promoted the fixation of nonsynonymous changes.
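The comparison of the two ratios can be sketched as a 2 x 2 contingency test. The cell assignment below is an assumption, since the transcript does not preserve the table layout: synonymous 5 polymorphic and 4 fixed, non-synonymous 2 polymorphic and 8 fixed. Fisher's exact test is used here as one common choice for small counts; the slide does not say which test was applied.

```python
from math import comb

# ASSUMED layout of the slide's four counts (the original cell positions are lost).
POLY_SYN, FIXED_SYN = 5, 4
POLY_NONSYN, FIXED_NONSYN = 2, 8


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, col1, col2 = a + b, a + c, b + d

    def weight(k):
        # Number of tables with k in the top-left cell, all margins fixed.
        return comb(col1, k) * comb(col2, row1 - k)

    lo, hi = max(0, row1 - col2), min(row1, col1)
    observed = weight(a)
    # Sum every table that is no more probable than the observed one.
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return extreme / comb(col1 + col2, row1)


def neutrality_index(poly_syn, poly_nonsyn, fixed_syn, fixed_nonsyn):
    """(Pn/Ps) / (Dn/Ds): expected to be 1 under neutrality, below 1 with adaptive fixations."""
    return (poly_nonsyn / poly_syn) / (fixed_nonsyn / fixed_syn)


p_value = fisher_exact_two_sided(POLY_SYN, POLY_NONSYN, FIXED_SYN, FIXED_NONSYN)
ni = neutrality_index(POLY_SYN, POLY_NONSYN, FIXED_SYN, FIXED_NONSYN)
print(ni, p_value)
```

With counts this small the exact test has little power, so the direction of the ratio difference is more informative here than the p-value.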

COALESCENCE THEORY : FOCUSING ON SAMPLE GENEALOGY
[Figure: genealogy of the sample traced back in time through a population of 2N chromosomes; generations 1, 2, 3, ..., k.N]

COALESCENCE THEORY : THE STANDARD COALESCENT
The genealogy of a sample of size n at a neutral locus in a panmictic population of constant size 2N should be like:
[Figure: genealogy with coalescence intervals T5, T4, T3, T2; labelled depths: 2N (on average) and 4N (on average)]
where
- all topologies are equiprobable
- coalescence times Ti's are exponential random variables of expectation E(Ti) = 4N/(i.(i-1))
- mutations are superimposed onto the genealogy according to a Poisson process
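The expectation E(Ti) = 4N/(i.(i-1)) is enough to recover the two depths labelled on the figure. A minimal sketch, with hypothetical values for N and the per-locus mutation rate; because mutations fall on the branches as a Poisson process, the expected number of segregating sites is the mutation rate times the expected total branch length.

```python
def expected_times(n_sample, n_pop):
    """E(Ti) = 4N / (i (i - 1)) in generations, for i = 2..n lineages."""
    return {i: 4 * n_pop / (i * (i - 1)) for i in range(2, n_sample + 1)}


N = 1000    # hypothetical population size
MU = 1e-4   # hypothetical mutation rate per locus per generation

times = expected_times(5, N)
tmrca = sum(times.values())                          # expected depth of the whole tree
total_length = sum(i * t for i, t in times.items())  # i lineages coexist during Ti
expected_s = MU * total_length                       # expected number of segregating sites

print(times[2], tmrca, total_length, expected_s)
```

For a sample of 5, E(T2) is 2N and the expected depth is 4N(1 - 1/5), which approaches 4N as the sample grows.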

THE COALESCENCE PROCESS HAS A HIGH VARIANCE
[Figure: T2 distribution]
Two realisations of the coalescent with equal Tn, Tn-1, …, T3, but distinct T2
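How much of this variance comes from T2 alone can be computed from the exponential assumption of the standard coalescent. The sketch relies on two standard facts not spelled out on the slide: the variance of an exponential variable is its squared mean, and the Ti's are independent. Times are expressed in units of 4N generations.

```python
def variance_share_of_t2(n_sample):
    """Fraction of the variance of the tree depth contributed by T2."""
    # Var(Ti) = E(Ti)^2 for an exponential variable; E(Ti) = 1 / (i (i - 1)) in 4N units.
    variances = {i: (1 / (i * (i - 1))) ** 2 for i in range(2, n_sample + 1)}
    return variances[2] / sum(variances.values())


for n in (2, 5, 50):
    print(n, variance_share_of_t2(n))
```

Most of the variance in tree depth is carried by T2, and adding more sampled genes barely changes this, which is why two realisations can differ so much even when all earlier coalescence times are equal.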

DEPARTURE FROM NEUTRALITY : THE SELECTIVE SWEEP EXAMPLE
[Figure: SELECTIVE SWEEP at a selected locus linked to the sampled neutral locus; neutral genealogy compared with the genealogy after the sweep]
"complete" selective sweep : star-like genealogy

DEPARTURE FROM NEUTRALITY : THE SELECTIVE SWEEP EXAMPLE
[Figure: SELECTIVE SWEEP at a selected locus linked to the sampled neutral locus; neutral genealogy compared with the genealogy after the sweep]
"partial" selective sweep : partly star-like genealogy

DEPAULIS' HAPLOTYPE TEST
[Figure: neutral genealogy: 9 polymorphic sites, 8 haplotypes; "partial" selective sweep, partly star-like genealogy: 9 polymorphic sites, 3 haplotypes]
A partially star-like genealogy results in a number of haplotypes lower than expected given the number of polymorphic sites.
Other test statistics aiming at detecting non-neutral shapes of genealogy were proposed: Tajima's D, Fu and Li's F, Fay and Wu's H, ...
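Of the statistics named above, Tajima's D is the simplest to compute by hand. It contrasts two estimates of θ: one from the average number of pairwise differences, one from the number of segregating sites. The sketch below follows the standard published formula (the slides do not give it) and applies it to the five-gene sample shown earlier in the talk. It assumes at least one segregating site.

```python
from itertools import combinations
from math import sqrt

# The five sampled genes shown earlier in the talk.
SAMPLE = [
    "ACGGATAGTTAGTGACGATA",
    "ACGTATAGCTAGTGACGATA",
    "ACGTATAGCTAGTGACGATA",
    "ACGGATAGCTAGTGACGATA",
    "ACGGATAGCTAGTGACGATC",
]


def tajimas_d(seqs):
    """Return (pi, S, D): mean pairwise differences, segregating sites, Tajima's D."""
    n = len(seqs)
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants of the standard formula, depending only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
    return pi, s, d


pi, s, d = tajimas_d(SAMPLE)
print(pi, s, d)
```

A star-like genealogy produces an excess of rare variants, which lowers the pairwise estimate relative to the one based on segregating sites and pushes D towards negative values.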

DEMOGRAPHY vs SELECTION
Detecting a departure from the standard coalescent means that at least one of its assumptions is wrong. Neutrality, unfortunately, is only one of them.
Demographic effects (departures from the constant-population-size assumption) can distort genealogies in a way very similar to selection. A bottleneck (a sudden decrease in population size, followed by a restoration of the former size), for example, has consequences highly similar to those of a selective sweep.
To distinguish the two: multi-locus analysis. Demography impacts the whole genome, while selection is locus-specific.

A LIKELIHOOD-BASED APPROACH
M1: neutral, constant size; p parameters (θ1, ..., θp)
M2: bottleneck; p+2 parameters (T, S, θ1, ..., θp)
M3: selective sweep; 3p parameters (T1, S1, θ1, ..., Tp, Sp, θp)
[Figure: genealogies under the three models, annotated with T (M2) and T1, T2, T3 (M3)]
Calculate and compare the likelihood (probability of the data) under the three models using a likelihood ratio test.
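The comparison step can be sketched as follows. The log-likelihood values are hypothetical placeholders, since computing them under each model is the hard part and is not shown here. The sketch also assumes that twice the log-likelihood difference is compared with a chi-square distribution whose degrees of freedom equal the number of extra parameters (2 for M2 vs M1, 2p for M3 vs M1). Whether that approximation holds for these models is not stated on the slide. Both values are even, so the chi-square tail has a closed form.

```python
import math


def chi2_sf_even(x, df):
    """Upper tail of a chi-square distribution; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


def likelihood_ratio_test(lnl_null, lnl_alt, extra_params):
    """Return (statistic, p-value) for a null model against a richer alternative."""
    stat = 2 * (lnl_alt - lnl_null)
    return stat, chi2_sf_even(max(stat, 0.0), extra_params)


P_LOCI = 3                                            # hypothetical number of loci
LNL = {"M1": -1250.0, "M2": -1246.5, "M3": -1238.0}   # hypothetical log-likelihoods

print(likelihood_ratio_test(LNL["M1"], LNL["M2"], extra_params=2))
print(likelihood_ratio_test(LNL["M1"], LNL["M3"], extra_params=2 * P_LOCI))
```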

WHAT I DID NOT TALK ABOUT
- subdivided populations, migration, isolation by distance, hybrid zones, clines
- other forms of selection (e.g. balancing selection)
- weak selection applying at many loci (e.g. codon usage)
- (biased) gene conversion
- patterns of linkage disequilibrium, coalescent with recombination
- microsatellites and other non-sequence genetic markers