The influence of population size on patterns of natural selection in mammals. Carolin Kosiol, Cornell University. 21st December 2007, Isaac Newton Institute.

Presentation transcript:

The influence of population size on patterns of natural selection in mammals
Carolin Kosiol, Cornell University
21st December 2007, Isaac Newton Institute

Six Mammalian Genomes
[Figure: phylogenetic tree of human, chimp, macaque, mouse, rat, and dog; scale bar 0.05]
Human / chimp / macaque / mouse / rat / dog orthologous genes.
- Based on multiple alignments (RefSeq, Vega, and UCSC Known Genes).
- Rigorous filters (spurious annotations, paralogous alignments, pseudogenized genes).
- Includes genes where the sequences of up to three species are missing, as well as truncated genes.

Measuring selective pressures
[Figure: codon example ATT (Ile, I), CTT (Leu, L), CCT (Pro, P), CCG (Pro, P), marking nonsynonymous and synonymous substitutions; label: aggregated models]
$\omega$ is defined as the nonsynonymous-synonymous rate ratio.
- $\omega < 1$: purifying selection
- $\omega = 1$: neutral evolution
- $\omega > 1$: positive selection
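The codon example and the three regimes of the rate ratio can be sketched in a few lines of Python. This is only an illustration: the function names `classify_substitution` and `interpret_omega` are hypothetical, and the codon table is deliberately restricted to the four codons shown on the slide (their amino acids follow the standard genetic code).

```python
# Partial codon table: only the four codons from the slide's example
# (amino acids per the standard genetic code).
CODON_TO_AA = {"ATT": "Ile", "CTT": "Leu", "CCT": "Pro", "CCG": "Pro"}


def classify_substitution(codon_from, codon_to):
    """Label a codon change as synonymous or nonsynonymous (hypothetical helper)."""
    same_aa = CODON_TO_AA[codon_from] == CODON_TO_AA[codon_to]
    return "synonymous" if same_aa else "nonsynonymous"


def interpret_omega(omega, tol=1e-9):
    """Map a nonsynonymous-synonymous rate ratio to its selective regime."""
    if abs(omega - 1.0) <= tol:
        return "neutral evolution"
    return "purifying selection" if omega < 1.0 else "positive selection"


# Walk along the slide's codon path ATT -> CTT -> CCT -> CCG.
path = ["ATT", "CTT", "CCT", "CCG"]
steps = [classify_substitution(a, b) for a, b in zip(path, path[1:])]
print(steps)
print(interpret_omega(0.2), "/", interpret_omega(1.0), "/", interpret_omega(1.8))
```

The first two steps change the amino acid (Ile to Leu, Leu to Pro), while the last step (CCT to CCG) leaves proline unchanged.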

Average rates of positive selection
[Figure: mammalian tree (human, chimp, macaque, mouse, rat, dog; scale bar 0.05) annotated with branch-specific nonsynonymous-synonymous rate ratios $\omega$]

Per gene analysis
[Figure: histogram; x-axis: nonsynonymous-synonymous rate ratio $\omega$; y-axis: frequency]
P-value: $2.2 \times 10^{-16}$

Likelihood Ratio Tests (LRTs) for positive selection
Branch-site models (Yang & Nielsen, 2001, 2005) are used to find positively selected genes (PSGs) on:
- Any branch
- Specific internal and external branches
- Specific clades
We identify 544 putatively positively selected genes (PSGs) across all tests.
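As a rough illustration of how a likelihood ratio test turns two fitted models into a p-value, the sketch below compares a null and an alternative log-likelihood with one extra parameter. The function name and the log-likelihood values are hypothetical, and using a plain chi-square distribution with one degree of freedom is an assumption made here for simplicity; the slides do not state which null distribution was used.

```python
import math


def lrt_pvalue_df1(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (test statistic, p-value). Assumes the statistic is compared
    against a chi-square distribution with 1 degree of freedom.
    """
    # Test statistic: twice the log-likelihood gain of the alternative model.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for one gene (not taken from the talk).
stat, p = lrt_pvalue_df1(lnl_null=-1000.0, lnl_alt=-998.08)
print(f"2*delta(lnL) = {stat:.2f}, p = {p:.4f}")
```

When many genes and branches are tested, as in a genome-wide scan, the resulting p-values would normally be corrected for multiple testing before calling a gene a PSG.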

[Figure: mammalian tree (human, chimp, macaque, mouse, rat, dog; scale bar 0.05) with branch-specific nonsynonymous-synonymous rate ratios $\omega$. Labeled branches and clades: human, chimp, macaque, hominid, primate branch, rodent branch, primate clade, rodent clade. PSG counts shown on the figure (PSGs / genes tested): 400/16529, 24/14425, 21/9566, 10/14558, 18/14558, 16/12499, 61/10762, 56/8991, 7/10980.]

Bayesian Model selection

Previous scans on mammalian genomes
[Figure: tree of human, chimp, and macaque; scale bar 0.05; label: Clint]
“[…] These observations are explainable by the reduced efficacy of natural selection in humans because of their smaller long-term effective population size …” (PNAS, 2007)
“[…] The diversity in West African chimpanzees is similar to that seen for human populations. The observed Clint is broadly consistent with West African origin …” (Nature, 2005)

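The effective population size N and probability of fixation
Let $2N$ be the number of chromosomes. Then the probability that a new mutation with selection coefficient $s$ becomes fixed is
$$P_{\mathrm{fix}}(s) = \frac{1 - e^{-2s}}{1 - e^{-4Ns}} \approx \frac{2s}{1 - e^{-4Ns}} \quad \text{for small } s,$$
where
- $s > 0$: the mutation is selectively favoured
- $s < 0$: the mutation is selectively disfavoured
For the neutral case ($s = 0$) this is simply the initial frequency of the mutation, $1/(2N)$. (Kimura 1969)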
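A minimal numerical sketch of the fixation probability follows, assuming the standard diffusion-approximation form $P_{\mathrm{fix}}(s) = (1 - e^{-2s})/(1 - e^{-4Ns})$ for a new mutation at initial frequency $1/(2N)$. The function name and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def fixation_probability(N, s):
    """Fixation probability of a new mutation among 2N chromosomes.

    Assumes the diffusion-approximation form (1 - e^{-2s}) / (1 - e^{-4Ns});
    the neutral case s = 0 is handled as the limit 1 / (2N).
    """
    if s == 0:
        return 1.0 / (2.0 * N)
    # expm1(x) = e^x - 1 keeps precision for small |s|; the two minus signs cancel.
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * N * s)


N = 1000  # illustrative population size, not a value from the talk
p_favoured = fixation_probability(N, 0.01)
p_neutral = fixation_probability(N, 0.0)
p_disfavoured = fixation_probability(N, -0.01)
print(p_favoured, p_neutral, p_disfavoured)
```

For very large negative values of $4Ns$ the exponential overflows in floating point, so a production implementation would need a guard for that regime.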

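Rates of mutations
The fixation rate of new mutations is the product of the mutation rate $\mu$ per site, the chromosomal population size $2N$, and the probability of fixation $P_{\mathrm{fix}}(s)$:
$$\mu_s = \mu \cdot 2N \cdot P_{\mathrm{fix}}(s)$$
The rate of neutral mutations is
$$\mu_0 = \mu \cdot 2N \cdot \frac{1}{2N} = \mu$$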

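‘Popgen Omega’
Rate for selected mutations: $\mu_s = \mu \cdot \dfrac{4Ns}{1 - e^{-4Ns}}$
Rate for neutral mutations: $\mu_0 = \mu$
$$\omega = \frac{\mu_s}{\mu_0} = \frac{4Ns}{1 - e^{-4Ns}} = \frac{2\gamma}{1 - e^{-2\gamma}}, \quad \text{where } \gamma = 2Ns$$
(Bruno & Halpern, 1998; Nielsen & Yang, 2003; Thorne et al., 2007)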
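The ratio of the selected rate to the neutral rate can be evaluated directly to see how population size changes the rate ratio for a fixed selection coefficient. The sketch below assumes the selected rate $\mu \cdot 4Ns/(1 - e^{-4Ns})$ and the neutral rate $\mu$ given on this slide, so that $\mu$ cancels; the function name and the numerical values are hypothetical.

```python
import math


def popgen_omega(N, s):
    """Ratio of selected to neutral substitution rates: 4Ns / (1 - e^{-4Ns}).

    The mutation rate cancels in the ratio; s = 0 is handled as the limit 1.
    """
    x = 4.0 * N * s
    if x == 0:
        return 1.0
    # 1 - e^{-x} written as -expm1(-x) for accuracy when |x| is small.
    return x / -math.expm1(-x)


s_deleterious = -0.001  # illustrative selection coefficient
omega_small_N = popgen_omega(1000, s_deleterious)
omega_large_N = popgen_omega(2000, s_deleterious)
omega_advantageous = popgen_omega(1000, 0.001)
print(omega_small_N, omega_large_N, omega_advantageous)
```

With the same slightly deleterious $s$, the smaller population gives a rate ratio closer to 1, which is the sense in which selection is less effective when $N$ is small.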

Population genetic interpretation of $\omega$
Advantages:
- Accounts for multiple substitutions per site
- $\omega$ can be calculated for each lineage (branch model)
- Improves understanding of the effects of population size $N$ on the nonsynonymous-synonymous rate ratio $\omega$
Disadvantages:
- Assumes that sites are independent
- Assumes an instantaneous change of population size at speciation
- $N$ and $s$ always enter as the product $Ns$ (through $\gamma$) and cannot be estimated separately by ML techniques

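Comparison to estimates from polymorphism data
[Figure: mammalian tree (human, chimp, macaque, mouse, rat, dog) with population sizes labeled $N_h$ (human), $N_c = N_h$ (chimp), and $N_m$ (macaque)]
Estimation from genes: $N_m/N_h$, with confidence interval (1.15, 1.64)
Population genetics: $N_m$ (Hernandez et al., 2007) and $N_h$ = 40,000-70,000 (Wall, 2003) give an independent estimate of $N_m/N_h$.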

LRT for population size ratios
Model 1: $\omega_h = \omega_c$, $\omega_m$
Model 2: $\omega_h = \omega_c$, $\omega_m = c(N_m/N_h) \times \omega_h$
In 95.6% of cases, no significant deviation was observed. The differences in selection pressures are well described by differences in population size.

Summary
- The population genetic interpretation of $\omega$ helps to explain differences between selection pressures.
- For human-chimp-macaque trios, our estimates of population size ratios agree with estimates from population genetics.
- An LRT shows that differences in selection pressure are well explained by the differences in population size.

Mammalian (Xmas) Tree
[Figure: two copies of the mammalian tree (human, chimp, macaque, mouse, rat, dog; scale bar 0.05)]
Reinterpret the mammalian tree!

Thanks
Siepel Lab (Cornell): Adam Siepel, Tomas Vinar, Brona Brejova, Adam Diehl, Alex Denby
Bustamante Lab (Cornell): Carlos Bustamante, Adam Boyko, Adam Auton, Badri Padhukasahasram, Abra Brisbin, Kasia Bryc, Jeremiah Degenhardt, Ryan Hernandez, Emilia Huerta-Sanchez, Lin Li, Kirk Lohmueller, Hong Gao, Amit Indap, Dara Torgeson
Rasmus Nielsen (Copenhagen)
Tanja Gesell (Vienna)
NIH and NSF for funding