1
Detecting selection using genome scans
Roger Butlin University of Sheffield
2
Nielsen R (2005) Molecular signatures of natural selection. Annu. Rev. Genet. 39, 197–218.
What signatures does selection leave in the genome?
Population differentiation – today’s focus!
Frequency spectrum, e.g. Tajima’s D
Selective sweeps
Haplotype structure (linkage disequilibrium)
McDonald-Kreitman tests (or PAML over long time-scales)
3
Frequency distribution:
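From Nielsen (2005): frequency of derived allele in a sample of 20 alleles.
Tajima’s $D = (\pi - S/a_1)/\mathrm{sd}$, with $a_1 = \sum_{i=1}^{n-1} 1/i$ for a sample of $n$ alleles; negative values summarise an excess of rare variants.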
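A minimal Python sketch of Tajima's D in its usual form, D = (π − S/a1)/sd, may help make the statistic concrete. The function name `tajimas_d`, the input format (one derived-allele count per site) and the two toy frequency spectra are assumptions for illustration, not taken from Nielsen (2005).

```python
from math import sqrt

def tajimas_d(derived_counts, n):
    """Tajima's D from per-site derived-allele counts in a sample of n alleles."""
    # Keep only segregating sites (derived count strictly between 0 and n).
    seg = [k for k in derived_counts if 0 < k < n]
    S = len(seg)
    if S == 0:
        raise ValueError("no segregating sites")
    # pi: mean number of pairwise differences, summed over sites.
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in seg)
    # Standard constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

# Hypothetical spectra for a sample of 20 alleles:
rare = tajimas_d([1] * 10, 20)     # ten singletons: excess of rare variants
common = tajimas_d([10] * 10, 20)  # ten intermediate-frequency variants
```

With these toy inputs `rare` is negative and `common` is positive, which is the direction of the signal the slide describes.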
4
Selective sweep:
5
Extended haplotype homozygosity (Sabeti et al. 2002)
6
McDonald-Kreitman and related tests
dN = replacement changes per replacement site
dS = silent changes per silent site
dN/dS = 1: neutral
dN/dS < 1: conserved (purifying selection)
dN/dS > 1: adaptive evolution (positive selection)
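The dN/dS comparison can be sketched in a few lines of Python. The function names and the counts below are hypothetical, and the sketch only classifies a point estimate against 1; real analyses (e.g. in PAML) test the ratio statistically and correct for multiple hits.

```python
def dn_ds(replacement_changes, replacement_sites, silent_changes, silent_sites):
    """Ratio of replacement changes per replacement site to silent changes per silent site."""
    dn = replacement_changes / replacement_sites
    ds = silent_changes / silent_sites
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds

def interpret(omega):
    """Label a point estimate of dN/dS relative to the neutral expectation of 1."""
    if omega < 1:
        return "purifying"
    if omega > 1:
        return "positive"
    return "neutral"

# Hypothetical counts for two genes.
conserved_gene = dn_ds(5, 300, 20, 100)   # few replacement changes per site
adaptive_gene = dn_ds(30, 300, 5, 100)    # many replacement changes per site
labels = (interpret(conserved_gene), interpret(adaptive_gene))
```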
7
Selection on phenotypic traits:
QTL
Association analysis
Candidate genes
8
Genome scans (aka ‘Outlier analysis’)
9
Littorina saxatilis – locally adapted morphs
What signatures of selection might we look for?
NB divergent selection (sort of equivalent to RI)
[Figure: ‘H’ and ‘M’ morphs, Thornwick Bay]
10
Signatures of selection:
Departure from HWE
Low diversity (selective sweep)
Frequency spectrum tests
High divergence
Elevated proportion of non-synonymous substitutions
LD
11
Neutral loci
12
Stabilizing selection
13
Local adaptation
14
Charlesworth et al. 1997 (from Nosil et al. 2009)
15
A concrete example: adaptation to altitude in Rana temporaria (Bonin et al. 2006)
High – 2000 m
Intermediate – 1000 m
Low – 400 m
190 individuals
392 AFLP bands
16
Generating the expected distribution
DetSel – Vitalis et al. 2001
Parameters: N0, N1, N2, t, μ, t0
F1, F2 – measure of divergence of population 1, 2 from population 2, 1
Dfdist – Beaumont & Nichols 1996
Parameters: N, m
FST – symmetrical population differentiation, as a function of heterozygosity
Separation of timescales argument…
Does the structure/history matter?
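The observed side of an FST-versus-heterozygosity scan can be sketched as below. This is a simplified illustration under stated assumptions: loci are treated as biallelic with known allele frequencies, FST is computed as (HT − HS)/HT, and a single fixed `threshold` stands in for the heterozygosity-dependent quantiles that Dfdist obtains by simulation. The locus names, frequencies and threshold are hypothetical.

```python
def fst_and_het(p1, p2):
    """Return (FST, HT) for one biallelic locus from allele frequencies in two populations."""
    # Mean within-population expected heterozygosity.
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        return None, 0.0  # monomorphic overall: FST undefined
    return (ht - hs) / ht, ht

# Hypothetical allele frequencies in two populations.
loci = {"locus_a": (0.50, 0.55), "locus_b": (0.10, 0.90), "locus_c": (0.30, 0.35)}
stats = {name: fst_and_het(*freqs) for name, freqs in loci.items()}

threshold = 0.2  # hypothetical stand-in for a simulated upper quantile
outliers = [name for name, (fst, ht) in stats.items()
            if fst is not None and fst > threshold]
```

In a real scan each locus would be compared with the simulated quantile at its own heterozygosity rather than with one global cut-off.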
17
[Figure: DetSel (95% CI) and Dfdist (95%, 50% and 5% quantiles) plots for the ‘Low 1’ vs ‘High 1’ comparison]
18
392 AFLPs, 12 pairwise comparisons across altitude or 3 altitude categories, 95% cut-off
Columns: DetSel, Dfdist, Both, Interpretation
Monomorphic in one population: 35, N/A – unreliable outliers
Significant in one comparison: 14, 29 – false positives
Significant in comparisons involving one population: 3, 11 – local effects
Significant in at least 2 comparisons: 2, 1 – adaptation to altitude
Significant in global comparison across altitudes: 6 (2 at 99%)
19
343 loci 8 loci
20
Outliers and selected traits
Rogers and Bernatchez (2007): Coregonus clupeaformis (lake whitefish)
Dwarf x Normal cross, both backcrosses
Measure ‘adaptive’ traits (9)
QTL map (>400 AFLP plus microsatellites)
Homologous AFLP in 4 natural sympatric population pairs
Outlier analysis (forward simulation based on Winkle)

Cross            Homologous AFLP   Outlier AFLP in homologous set*   Outlier within QTL (based on 1.5 LOD support)
Hybrid x Dwarf   180               19                                9 (3.6 expected, P=0.0015)
Hybrid x Normal  131               8                                 4 (0.5 expected, P=0.0002)

Expectation based on overall proportion of AFLP associated with QTL.
*Only 3 outliers shared between lakes
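One way to think about the "expected" counts is as the number of outliers multiplied by the overall proportion of AFLPs that fall within QTL. The sketch below applies a one-sided binomial test to the Hybrid x Dwarf row. Two assumptions should be flagged: the QTL proportion is back-calculated from the slide's expected value (3.6 of 19), and the slide does not say which test produced its P values, so this calculation is not expected to reproduce them.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_outliers = 19        # outlier AFLP in the homologous set (Hybrid x Dwarf)
observed_in_qtl = 9    # outliers falling within QTL
p_qtl = 3.6 / n_outliers  # assumption: proportion back-calculated from "3.6 expected"

expected = n_outliers * p_qtl
p_value = binom_upper_tail(observed_in_qtl, n_outliers, p_qtl)
```

A small `p_value` here indicates more outliers inside QTL than the overall proportion would predict, which is the qualitative point of the table.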
22
Nosil et al. 2009 review of 14 studies:
0.5–26% outliers, most studies 5–10%
1–5% of outliers replicated in pair-wise comparisons
% of outliers specific to habitat comparisons
No consistent pattern for EST-associated loci
LD among outliers typically low
But many methodological differences between studies:
Population sampling
Marker type
Analysis type and options
Statistical cut-offs
23
Environmental correlations
SAM – Joost et al. 2007
IBA – Nosil et al. 2007
FST for each locus correlated with ‘adaptive distance’, controlling for geographic distance (partial Mantel test)
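A partial Mantel test of this kind can be sketched with the standard library alone. The version below computes the partial correlation between FST and adaptive distance while controlling for geographic distance, and gets a one-sided P value by permuting the rows and columns of the FST matrix. It is a generic illustration, not the IBA implementation of Nosil et al.; the function names and the five-population toy matrices are hypothetical.

```python
import random
from math import sqrt

def upper_triangle(m):
    """Flatten the above-diagonal entries of a square distance matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(y, x, z):
    """Correlation of y with x after removing the linear effect of z from both."""
    ryx, ryz, rxz = pearson(y, x), pearson(y, z), pearson(x, z)
    return (ryx - ryz * rxz) / sqrt((1 - ryz ** 2) * (1 - rxz ** 2))

def partial_mantel(fst, adaptive, geo, n_perm=999, seed=1):
    rng = random.Random(seed)
    x, z = upper_triangle(adaptive), upper_triangle(geo)
    r_obs = partial_r(upper_triangle(fst), x, z)
    n = len(fst)
    hits = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        # Permute population labels of the FST matrix (rows and columns together).
        permuted = [[fst[a][b] for b in order] for a in order]
        if partial_r(upper_triangle(permuted), x, z) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: five populations along a transect, two habitat types.
site = [0, 1, 2, 3, 4]
habitat = [0, 1, 0, 1, 1]
geo = [[abs(a - b) for b in site] for a in site]
adaptive = [[abs(a - b) for b in habitat] for a in habitat]
fst = [[0.02 * geo[i][j] + 0.1 * adaptive[i][j] for j in range(5)] for i in range(5)]

r_obs, p_value = partial_mantel(fst, adaptive, geo)
```

In a genome scan this would be repeated per locus, with loci showing a strong partial correlation treated as candidates.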
24
Methodological improvements – Bayesian approaches
BayesFst – Beaumont & Balding 2004
Bayescan – Foll & Gaggiotti 2008
For each locus $i$ and population $j$ we have an FST measure, $F_{ij}$, relative to the ‘ancestral’ population.
Then decompose into locus and population components: $\log\left(\frac{F_{ij}}{1 - F_{ij}}\right) = \alpha_i + \beta_j$
$\alpha_i$ is the locus effect: 0 neutral, positive for divergent selection, negative for balancing selection
$\beta_j$ is the population effect
Assuming a Dirichlet distribution of allele frequencies among subpopulations, can estimate $\alpha_i + \beta_j$ by MCMC
In Bayescan, also explicitly test $\alpha_i = 0$
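The logistic decomposition can be inverted to see how a locus effect shifts FST for a given population effect. The sketch below only evaluates that inverse-logit relationship; it does not perform the MCMC estimation, and the function name and the parameter values are hypothetical.

```python
from math import exp, log

def fst_from_effects(alpha_i, beta_j):
    """Invert log(F / (1 - F)) = alpha_i + beta_j to recover F for locus i, population j."""
    return 1 / (1 + exp(-(alpha_i + beta_j)))

beta = -2.0  # hypothetical population effect shared by all loci
neutral = fst_from_effects(0.0, beta)     # alpha = 0: neutral locus
divergent = fst_from_effects(1.5, beta)   # alpha > 0: FST raised above the neutral level
balancing = fst_from_effects(-1.5, beta)  # alpha < 0: FST lowered below the neutral level

# Round trip: the logit of the result recovers alpha + beta.
recovered = log(divergent / (1 - divergent))
```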
25
Apparently much greater power to detect balancing selection than FDIST
Lower false positive rate Wider applicability
26
Methodological improvements – hierarchical structure
Arlequin – Excoffier et al. 2009
27
Circles – simulated STR data, grey – null distribution
29
Multiple analyses? Candidate vs control? E.g. Shimada et al. 2010
Bayenv – Coop et al. 2010
Estimates variance-covariance matrix of allele frequencies, then tests for correlations with environmental variables (or categories).
Software available at:
31
Hohenlohe et al. 2010
32
Mäkinen et al. 2008
7 populations: 3 marine, 4 freshwater
103 STR loci
Analysed by BayesFst (and LnRH)
5 under directional selection (3 in Eda locus)
15 under balancing selection
Used as a test case by Excoffier et al.: 2 directional, 3 balancing
33
Can we replicate these results?
Bayescan
Stickleback_allele.txt – input file
Output_fst.txt – view with R routine plot_Bayescan
Arlequin
Stickleback_data_standard.arp – IAM
Stickleback_data_repeat.arp – SMM
Run using Arlequin3.5
Try hierarchical and island models, maybe different hierarchies
35
Sympatric speciation?
FST distribution as evidence of speciation with gene flow
Savolainen et al. (2006): Howea palms
Cf. Gavrilets and Vose (2007):
few loci underlying key traits
intermediate selection
initial environmental effect on phenology