The Neolithic transition in Europe: different views from population genetics (a tentative discussion around some methodological questions) Lounès Chikhi.


The Neolithic transition in Europe: different views from population genetics (a tentative discussion around some methodological questions). Lounès Chikhi, Evolution et Diversité Biologique, CNRS, Université Paul Sabatier, Toulouse

Inference in population genetics
– Data collection
– Genetic typing
– Description of patterns of genetic variability
– Analysis and interpretation
– Test (simulations)

Inference in population genetics
Sampling of “populations”
“Choice” of the markers (genome sampling):
– mitochondrial DNA: female demography
– Y chromosome: male demography
– nuclear genes (markers: allozymes, microsatellites, RFLP, AFLP, SNPs, etc.)
Description of the patterns:
– Diversity within samples
– Diversity between samples
– Are there spatial patterns?

A similar pattern with Y chromosome data (Semino et al., 2000, Science). What should we do with these patterns? How should we interpret them?

Inference in population genetics
Are the patterns, if any, compatible with hypotheses or demographic scenarios from other areas (archaeology, linguistics, etc.)?

A possible scheme of population movements since the Paleolithic (map labels: 45,000 BP, 18,000 BP, 10,000 BP)

Inference in population genetics
Is there a link between these images (archaeological, genetic, linguistic)?
Can we estimate demographic parameters?
– Population: stable? growing? bottleneck?
– Admixture between populations?
Can we date these events?
Can we detect selection?

Effect of population size changes on some measures of genetic diversity
– n_A (number of alleles) drops more quickly than H_e (expected heterozygosity) because rare alleles are eliminated first and contribute little to H_e = 1 − Σ p_i^2.
– Allelic size distributions become gappy, while the range varies little (r = range).
Figure: two histograms of allele frequency against allele size (number of repeats).
– Panel labelled “Bottleneck”: n_A = 4, r = 7, n_A/r = 0.57, H_e = 0.71 (several gaps)
– Other panel: n_A = 7, r = 8, n_A/r = 0.88, H_e = 0.74 (one gap)
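The summary statistics on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the figure: the function name `diversity_summary` and the example frequencies are hypothetical, and the range is taken as r = (largest size − smallest size + 1), an assumption chosen because it is consistent with ratios such as 7/8 = 0.88 on the slide.

```python
def diversity_summary(allele_freqs):
    """Return (n_A, r, n_A / r, H_e) for one microsatellite locus.

    allele_freqs maps allele size (number of repeats) to its frequency.
    Assumption: r counts all size classes between the smallest and the
    largest allele observed, inclusive.
    """
    # Keep only alleles that are actually present in the sample.
    present = {size: p for size, p in allele_freqs.items() if p > 0}
    total = sum(present.values())
    n_alleles = len(present)
    size_range = max(present) - min(present) + 1
    # Expected heterozygosity: H_e = 1 - sum of squared allele frequencies.
    h_e = 1.0 - sum((p / total) ** 2 for p in present.values())
    return n_alleles, size_range, n_alleles / size_range, h_e


# Hypothetical distribution with a single gap (no allele of size 14).
example = {10: 0.10, 11: 0.20, 12: 0.30, 13: 0.20, 15: 0.10, 16: 0.05, 17: 0.05}
n_a, r, ratio, h_e = diversity_summary(example)
print(n_a, r, round(ratio, 3), round(h_e, 3))
```

Removing the rare alleles from such a dictionary lowers n_A and n_A/r sharply while H_e moves comparatively little, which is the contrast the slide draws.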

Inference in population genetics
Thus there is some information in genetic data about ancient demographic events. However, this information may be qualitative rather than quantitative, and it does not allow us to determine whether other scenarios, or selection, could have played a role.

Recent data from the Y chromosome have been interpreted as indicating a Neolithic contribution of 22% (Semino et al., 2000). This figure is the sum of the frequencies of four haplotypes, called Eu4, Eu9, Eu10 and Eu11.
Question: why should the proportion of haplotypes exhibiting a clinal distribution today represent the so-called “Neolithic” contribution?

There are two problems with this “estimation”:
1. Clines are only expected for alleles that were present at different frequencies in the populations when they mixed (dilution problem). Moreover, drift in the intervening years may have blurred clines that were visible at the time. Many haplotypes are observed only 1, 2 or 3 times in each sample (i.e. no cline is going to be as visible by eye as those observed for the 4 selected haplotypes).
2. Even if it were estimated properly, it would be meaningless for understanding the processes of European colonization. A single number cannot summarise a cline.

Geometric decrease of the Neolithic contribution, from P_N to P_N^n
P_N = proportion of farmers in any admixed population; n = number of admixture events.
Example: P_N = 0.9 and n = 25 give P_N^n = 0.07.
Average = 100 × (P_N + P_N^2 + … + P_N^n)/n.
The same value P_N = 0.9 (90% farmers + 10% hunter-gatherers) is used for all curves. Symbols: + for n = 10, Δ for n = 25, O for n = 50. Horizontal lines are averages:
– n = 10: average = 62%
– n = 25: average = 36%
– n = 50: average = 21%
Thus, a lack of pattern or a low average can correspond to a high P_N value.
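The geometric dilution on this slide can be checked with a short sketch. The function names are hypothetical, and the average is a literal reading of the formula as it appears in the transcript; the averages quoted on the slide may rest on a slightly different indexing convention, so this sketch is not expected to reproduce them to the percent.

```python
def neolithic_contribution(p_n, k):
    """Farmer ancestry remaining after k successive admixture events,
    each retaining a proportion p_n of incoming farmer genes."""
    return p_n ** k


def average_contribution(p_n, n):
    """Mean contribution over events 1..n: (P_N + P_N^2 + ... + P_N^n) / n."""
    return sum(neolithic_contribution(p_n, k) for k in range(1, n + 1)) / n


# Even with 90% farmers at every step, the signal fades quickly.
for n in (10, 25, 50):
    last = 100 * neolithic_contribution(0.9, n)
    mean = 100 * average_contribution(0.9, n)
    print(f"n={n}: last population {last:.1f}%, average {mean:.1f}%")
```

The point survives any choice of convention: a high per-event P_N is compatible with a low continent-wide average and with a small contribution at the far end of the expansion.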

Two major models have been proposed (or at least structure the current debate):
– The demic diffusion model: significant correlations between archaeological and genetic maps are explained by a movement of people entering Europe from the Levant and Anatolia during the Neolithic. We would expect a significant genetic contribution.
– The cultural diffusion model: the spread of agriculture in Europe involved the movement of ideas, not of people. The genetic contribution of Near Eastern farmers to the European gene pool should be limited.
Figure: the average contribution placed on an axis running from demic diffusion (large genetic contribution) to cultural diffusion (small genetic contribution).

Inference in population genetics
What kind of inference?
– Qualitative versus quantitative?
– Detection versus estimation.
– Models and underlying assumptions.

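Admixture model: separates the effects of drift and admixture
Figure: at time T in the past, Parent 1 (hunter-gatherers) and Parent 2 (farmers) contribute proportions p_1 and 1 − p_1 to a hybrid population. The three populations then drift independently until the present, with the amount of drift measured by T/N_1, T/N_h and T/N_2. Present-day samples: Parent 1 (Basques), Hybrid (Europe), Parent 2 (Near East).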

The effect of drift is that the “hybrid” population may not even be intermediate after a limited number of generations. In other words:
(i) The information on admixture decreases with time.
(ii) It is risky to analyse single-locus data when demographic events are ancient.
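The claim that drift can push the hybrid population outside the range of its parents can be explored with a small simulation. This is a minimal sketch under stated assumptions, not the program used for the talk: one biallelic locus, Wright-Fisher resampling, and hypothetical function and parameter names throughout.

```python
import random


def drift(freq, n_copies, generations, rng):
    """Wright-Fisher drift at one biallelic locus: each generation,
    n_copies gene copies are resampled from the current frequency."""
    for _ in range(generations):
        count = sum(1 for _ in range(n_copies) if rng.random() < freq)
        freq = count / n_copies
    return freq


def admixture_then_drift(f1, f2, p1, n1, nh, n2, generations, seed=0):
    """Mix parents 1 and 2 in proportions p1 and 1 - p1, then let the
    three populations drift independently. Returns their final frequencies."""
    rng = random.Random(seed)
    fh = p1 * f1 + (1 - p1) * f2  # hybrid frequency at the admixture event
    return (drift(f1, n1, generations, rng),
            drift(fh, nh, generations, rng),
            drift(f2, n2, generations, rng))


def is_intermediate(a, h, b):
    """True if the hybrid frequency lies between the two parental ones."""
    return min(a, b) <= h <= max(a, b)


# Fraction of replicate loci where the hybrid is still intermediate.
runs = [admixture_then_drift(0.6, 0.4, 0.3, 100, 100, 100, 50, seed=s)
        for s in range(200)]
print(sum(is_intermediate(*r) for r in runs) / len(runs))
```

Raising `generations` or lowering the population sizes increases T/N, and the fraction of loci at which the hybrid still looks intermediate should fall accordingly.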

Simulations with p_1 = 0.3 (panels: little drift; more drift)
1) We simulate data according to the model (figure above), varying some parameters (here, drift).
2) The outputs are given to the program implementing the method.
3) One distribution is obtained for each simulation.

Simulations with p_1 = 0.3
1) For the VERY SAME scenario, inference can be extremely different!
2) The inference varies from one locus to the next.
3) When two loci produce different estimates, we cannot conclude that they had different demographic histories.
4) Worse: this is an optimal situation, since we “know” the real p_1 and the data were simulated according to the model. This NEVER happens in real life.

1) One solution: multi-locus data.
2) Increasing sample sizes is NOT very useful.
3) It is better to have multi-locus data long after the admixture event than single-locus data just after it. Example: 5 loci after 100 generations versus 1 locus after 1 generation (for N = 1000).
4) Don’t throw your allozyme data away.
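One way to see why several loci help is to pool them in a single estimate of p_1. The sketch below uses a naive least-squares estimator, which is a stand-in chosen for brevity and not the likelihood-based method behind these slides; the function name and the example frequencies are hypothetical.

```python
def least_squares_p1(parent1, hybrid, parent2):
    """Naive pooled estimate of p_1 from allele frequencies at several loci.

    Under the admixture model without drift, at every locus
    hybrid = p_1 * parent1 + (1 - p_1) * parent2, so p_1 is the slope of
    (hybrid - parent2) regressed on (parent1 - parent2) through the origin.
    """
    num = sum((h - b) * (a - b) for a, h, b in zip(parent1, hybrid, parent2))
    den = sum((a - b) ** 2 for a, b in zip(parent1, parent2))
    if den == 0:
        raise ValueError("parental populations do not differ at any locus")
    return num / den


# Hypothetical frequencies at three loci, constructed with p_1 = 0.3
# and no drift, so the estimator recovers the true value.
p1_true = 0.3
parent1 = [0.8, 0.1, 0.5]
parent2 = [0.2, 0.6, 0.9]
hybrid = [p1_true * a + (1 - p1_true) * b for a, b in zip(parent1, parent2)]
print(round(least_squares_p1(parent1, hybrid, parent2), 3))
```

With drift, each locus adds its own independent noise to the three frequencies, and pooling loci averages part of that noise out. Larger samples only reduce sampling error around frequencies that drift has already moved, which is consistent with point 2 above.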

What if we re-analyse Semino et al.’s data?

Y chromosome data (Semino et al., 2000). p_1 represents the hunter-gatherer contribution (descendants = Basques). Each curve corresponds to the analysis of one European population. A significant cline is observed for the (1 − p_1) values (i.e. the Near Eastern contribution) against geographic distance from the Near East.

After Semino et al., 2000 Science

As a test, we can analyse the same data considering Sardinians as the descendants of the hunter-gatherers. We find an extremely similar result. The “Neolithic” contribution is even slightly higher: on the order of 65% instead of 50%.

Model-based results (i) (mostly on Y chromosome data)
1) There are significant clines across Europe for the parameter representing the Neolithic contribution.
2) This “trend” is significantly different from that “obtained” by Semino et al. (2000).
3) The Neolithic contribution appears to be around 50% rather than 22%.
4) Re-analysis of all European populations using the Sardinian population as P_1 shows very similar results, with a higher Neolithic contribution (average of 65%).
Conclusion: the cultural diffusion model is unlikely to explain the patterns observed in the Y chromosome data.

The tests performed are partial and the model is simplistic, but this is a first step towards quantifying clearly identified demographic parameters.
Qualitative approach:
– Easy and useful, BUT imprecise or misleadingly precise
Quantitative approach:
– Assumptions are explicit
– Results can be precise (or not), BUT are often complicated to interpret and (maybe) model-dependent

Inference in population genetics
– Data collection
– Genetic typing
– Description of patterns of genetic variability
– Analysis and interpretation
– Test (simulations)

Inference in population genetics
In case I was not specific enough: beware of using any method whose assumptions you do not understand or which has not been extensively tested on simulations:
– Nested Clade Analysis
– Median networks
Thank you, AND MANY THANKS TO Mark Beaumont, Mike Bruford, Guido Barbujani, Richard Nichols, etc.