2/6/01 Historical inference from linguistic and genetic data Potentially “…the best evidence of the derivation of … the human race” (Thomas Jefferson)

Presentation transcript:

Historical inference from linguistic and genetic data
Potentially “…the best evidence of the derivation of … the human race” (Thomas Jefferson)
BUT
Inferences are complex
– methods and results from several disciplines
Intellectual stakes are high
Work has often been careless
– sometimes spectacularly so
– dangers of overinterpretation and “scientism”

General methodological problems
Not all graphs are trees
– “treeness” tests often left out
– “treeness” hypothesis can often be rejected
Tree inference may be underdetermined
– branching structure
– root choice
Rates of change may not be constant
– for different markers
– across time
Gene trees (and language trees) may not be population trees
Biology and language are complicated
– simplifying assumptions are sometimes perniciously mistaken
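One way to make a “treeness” test concrete is the four-point condition: distances fit a tree exactly only if, for every set of four items, the two largest of the three pairwise sums are equal. The sketch below is an illustration under that assumption, not the test used in any study cited here. Both distance matrices are hypothetical, and the function names are invented for this example.

```python
import math
from itertools import combinations

def four_point_gap(d, quartet):
    """Gap between the two largest of the three pairwise sums for one quartet.

    For distances that fit a tree exactly, the two largest sums are equal,
    so the gap is 0.
    """
    a, b, c, e = quartet
    sums = sorted([
        d[a][b] + d[c][e],
        d[a][c] + d[b][e],
        d[a][e] + d[b][c],
    ])
    return sums[2] - sums[1]

def max_four_point_gap(d):
    """Largest gap over all quartets of items in a square distance matrix."""
    return max(four_point_gap(d, q) for q in combinations(range(len(d)), 4))

# Hypothetical tree ((A,B),(C,D)): leaf branches 1, 2, 3, 4, internal branch 5.
tree_d = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]

# Hypothetical non-tree case: four sites at the corners of a unit square,
# with distance equal to straight-line geographic distance.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_d = [[math.dist(p, q) for q in corners] for p in corners]

print(max_four_point_gap(tree_d))    # 0
print(max_four_point_gap(square_d))  # about 0.83
```

With real data the gap is never exactly zero, so in practice it would have to be compared with the gap expected from sampling noise alone.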

Trees vs. Clines (etc.)
A tree structure represents the results of a sequence of splits in population (or language)
– no further influences among separate branches
– if rates of change are constant, distances should be quantized
Within an interbreeding (intercommunicating) population, distances reflect the amount of gene flow (transmission of linguistic traits)
– should correlate strongly with accessibility
– e.g. geographical distance in the simplest case

“The… procedures outlined here provide a rigorous method for inferring whether the geographical pattern of variation is consistent with an historical split (fragmentation) or no split (recurrent gene flow) using criteria that are completely explicit. For example, in analyzing the mtDNA of tiger salamanders, a clear split into eastern and western lineages was detected for mtDNA. Using the same explicit criteria, there was no split among any human populations. Quite the contrary, the present analysis documents recurrent and continual genetic interchange among all Old World human populations throughout the entire time period marked by mtDNA. Accordingly, estimating a date for a 'split' of Africans from non-Africans based on evidence from mtDNA is certainly allowed by many computer programs, but the results are meaningless because a date is being assigned to an 'event' that never occurred.”
Templeton (1997)

Methods for tree inference (“phylogeny”)
Two general approaches
– clustering (easier but cruder)
– generate and evaluate alternative trees
Distance-based methods
– based on matrix of distances/similarities
Parsimony
– based on set of partly-shared characters or traits
documents 193 different phylogeny packages
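As an illustration of the “clustering” approach on a distance matrix, here is a minimal sketch of average-linkage (UPGMA-style) clustering. It is one common clustering scheme, chosen here for brevity, and not necessarily the one the lecture had in mind. The four labels and their distances are hypothetical.

```python
from itertools import combinations

def upgma(names, dist):
    """Average-linkage clustering; returns the tree as nested tuples.

    names: list of leaf labels.
    dist:  dict mapping a pair of labels (a, b) to their distance.
    """
    size = {name: 1 for name in names}             # cluster -> number of leaves
    d = {frozenset(k): v for k, v in dist.items()}
    while len(size) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = size.pop(a), size.pop(b)
        # Distance from the new cluster is the leaf-weighted average.
        for c in size:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        size[merged] = na + nb
    return next(iter(size))

# Hypothetical distances among four items.
example = {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
print(upgma(["A", "B", "C", "D"], example))  # ('D', ('C', ('A', 'B')))
```

This always returns a tree, whether or not the data are tree-like, and it implicitly assumes equal rates of change on all branches. That is the sense in which clustering is the cruder option.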

Cognate percentages for 8 Vanuatu languages
[Table: pairwise cognate percentages among Toga, Mosina, Peterara, Nduindui, Sakao, Malo, Fortsenal and Raga; e.g. Toga/Mosina 64]
Data from Guy (1994)

Reconstruction Algorithm (Guy 1994)
“A message is input at the root of a tree-shaped transmission network, whence it is transmitted to the terminal nodes. As they travel, copies of the original message are affected by errors consisting in randomly selected segments of the message being replaced by other segments randomly drawn from a pool of possible segments (the "alphabet" of the message). The problem is: from the garbled versions of the original message collected at the terminal nodes, reconstruct the network and the history of the transmission of the message.”
“Additive-distance” tree with weights on branches rather than on nodes; doesn’t assume constant rate of change…

Explanatory force of the model
Set of distances grows as n(n−1)/2
Set of binary-tree branch labels grows as 2n−2
For 8 languages: we predict 28 numbers (the inter-language cognate proportions) with 14 numbers (the binary tree branch proportions)
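The two counts quoted for 8 languages (28 and 14) match n(n−1)/2 pairwise values and 2n−2 branches of a rooted binary tree. Assuming those are the intended growth formulas, this short sketch shows how quickly the number of predicted values outruns the number of fitted branch weights. The function names are invented for this example.

```python
def n_pairwise(n):
    """Number of distinct language pairs among n languages."""
    return n * (n - 1) // 2

def n_branches(n):
    """Number of branches in a rooted binary tree with n leaves."""
    return 2 * n - 2

for n in (4, 8, 16, 32):
    print(n, n_pairwise(n), n_branches(n))
```

For n = 8 this gives 28 pairwise values against 14 branch weights.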

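Inferred tree
[Tree diagram from Guy (1994): leaves Toga, Mosina, Peterara, Nduindui, Raga, Sakao, Fortsenal and Malo, with a retention proportion on each branch]
The predicted cognate proportion for a pair of languages is the product of the branch proportions along the path between them:
Mosina/Toga: .77 × .83 = .6391 (really 64%)
Peterara/Mosina: .829 × .919 × .77 = .5866 (really 58%)
Peterara/Toga: .829 × .919 × .830 = .6323 (really 64%)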
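The three worked products can be reproduced by multiplying branch proportions along the path between two leaves. The sketch below encodes only the corner of the tree that those three products involve. The branch values are the ones quoted on the slide, while the internal node names `n1` and `n2` and the function names are invented for this example.

```python
# child -> (parent, retention proportion on the connecting branch)
PARENT = {
    "Toga": ("n1", 0.830),
    "Mosina": ("n1", 0.770),
    "n1": ("n2", 0.919),
    "Peterara": ("n2", 0.829),
}

def products_upward(node):
    """Map node and each of its ancestors to the branch product up to it."""
    out = {node: 1.0}
    prod = 1.0
    while node in PARENT:
        node, weight = PARENT[node]
        prod *= weight
        out[node] = prod
    return out

def predicted(a, b):
    """Predicted cognate proportion: product of branches on the a-b path."""
    up_a, up_b = products_upward(a), products_upward(b)
    for node in up_a:            # dict order is bottom-up from a
        if node in up_b:         # first shared node is where the paths meet
            return up_a[node] * up_b[node]
    raise ValueError("nodes are not connected")

for pair in [("Mosina", "Toga"), ("Peterara", "Mosina"), ("Peterara", "Toga")]:
    print(pair, round(predicted(*pair), 4))
```

The printed values round to .6391, .5866 and .6323, matching the slide.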

True − predicted cognate percentages
[Table: residuals for each pair among Toga, Mosina, Peterara, Nduindui, Sakao, Malo, Fortsenal and Raga; e.g. Toga/Mosina 0, Toga/Peterara 1, Mosina/Peterara −1]
The model fits very well!

Where’s the root?
[Tree diagram: the same eight-language tree, with the protolanguage attached at the far end of Toga’s line]
Isn’t it obvious?

Oops: other options
[Tree diagram: the same eight-language tree, with the protolanguage attached at a different point]

And some more…
[Tree diagram: the same eight-language tree redrawn with the protolanguage attached at yet another point; branch labels (retention × 1000) legible in the diagram include Toga 830, Mosina 770, Sakao 567 and Fortsenal 759]
In the absence of other constraints, the root can be placed anywhere in the tree without changing the model’s fit!
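The root-placement point can be checked numerically. Placing the root somewhere on a branch splits that branch's proportion into two factors, and as long as their product is unchanged, every pairwise prediction that crosses the branch is unchanged too. This is a minimal sketch using the Toga and Mosina branch values (.83 and .77). The function `split_branch` is invented for this example.

```python
def split_branch(weight, fraction):
    """Split a branch proportion into two factors whose product is `weight`.

    `fraction` (between 0 and 1) says how far along the branch, on a log
    scale, the root is placed.
    """
    return weight ** fraction, weight ** (1.0 - fraction)

toga_branch, mosina_branch = 0.83, 0.77

for fraction in (0.0, 0.25, 0.5, 1.0):
    near, far = split_branch(toga_branch, fraction)
    # Mosina/Toga prediction with the root sitting on Toga's branch
    prediction = mosina_branch * near * far
    print(fraction, round(prediction, 4))
```

Every row prints the same prediction, which is why fit alone cannot locate the root.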

Possible “other constraints”
Historical evidence
– about earlier forms
– about structure of relationships among contemporary forms
“outgroup”
Constraints on rate of change
– linguistic (or genetic) “clock”

A universal constant for glottochronology?
“Thirteen sets of data, presented in partial justification of these assumptions, serve as a basis for calculating a universal constant to express the average rate of retention k of the basic-root morphemes: k = ± per millennium, with a confidence limit of 90%.”
Lees (1953)

Some of Lees’ data:
[Table with columns Language, Years, Words, Cognates, Rate (per millennium); rows for English, Latin/Spanish, Latin/French, German, Middle Egyptian/Coptic, Greek, Chinese and Swedish]

Some more retentive languages (rates per 1000 years)

Language            100-word list   200-word list
Icelandic (rural)   99%             97.6%
Icelandic (urban)   98%             96.2%
Georgian            96.5%           89.9%
Armenian            97.8%           94%

Bergsland & Vogt (1962)

Some less retentive ones
“David Lithgow (pers. comm. circa 1970) has observed a replacement of some 20% of the basic vocabulary in Muyuw (Woodlark Island) in one generation. Raise 0.8 to the 33rd power, and that gives you the retention rate of Muyuw per 1000 years should it continue to evolve at that rate: 0.06%.”
Jacques Guy (1994)
Bergsland & Vogt estimate of vocabulary retention in East Greenlandic as .722 in 600 years, or .34 per millennium.
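Guy's Muyuw arithmetic is plain compounding: a retention proportion observed over one period is raised to the number of such periods in a millennium. The sketch below assumes the same proportional loss in every period. The function name is invented for this example, and the 33 generations per millennium is Guy's figure.

```python
def retention_per_millennium(retention, years):
    """Rescale a retention proportion observed over `years` to 1000 years,
    assuming the same proportional loss in every equal time step."""
    return retention ** (1000.0 / years)

# Guy's Muyuw figure: 80% retained per generation, 33 generations per millennium.
muyuw = 0.8 ** 33
print(f"{muyuw:.5f} = {muyuw:.2%}")   # 0.00063 = 0.06%

# The same result through the general function, with a generation of 1000/33 years.
print(retention_per_millennium(0.8, 1000 / 33))
```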

“Language chains”
[Table: pairwise cognate proportions for languages A, B and C; the A/B value is .77]
Configurations like this are taken as prima facie evidence of “non-treeness”, to be attributed to borrowing/mixing/cline types of situations.
But in fact they can also easily be generated by variable rates of change:
[Tree diagram: A, B and C descending from a protolanguage, with a retention percentage on each branch]
Note that the required difference in mean rate of change is only (.9 − .9 × .8)/.9 = .2, or 20%
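A small numerical sketch of the same point: on a strict tree with no borrowing, making one branch 20% less retentive is enough to produce a chain-like pattern in which A is close to B, B is close to C, and A is farthest from C. All branch values below are hypothetical, chosen only to echo the .9 and .9 × .8 in the note. They are not the values from the original diagram.

```python
# Hypothetical retention proportions (not the values from the original diagram).
# Rooted tree: A hangs off the root; B and C share an internal node.
a = 0.90        # root -> A
x = 0.90        # root -> internal node above B and C
b = 0.90        # internal node -> B
c = 0.90 * 0.8  # internal node -> C, 20% lower retention (0.72)

# Pairwise similarity is the product of branch proportions along the path.
sim = {
    ("A", "B"): a * x * b,
    ("B", "C"): b * c,
    ("A", "C"): a * x * c,
}
for pair, value in sim.items():
    print(pair, round(value, 3))
```

Here A/B and B/C both come out higher than A/C, so the three languages look like a chain even though the model that generated them is a tree.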

Mitochondrial Genome

Mitochondrial family tree

Mitochondrial phylogeny

Three fascinating “results”
Mitochondrial Eve
Mitochondrial Clans
The three-wave theory: converging linguistic and genetic evidence

Mitochondrial Eve
Cann, Stoneking, and Wilson (1987): mtDNA comparisons of 147 people from Europe, Africa, Asia, Australia, and New Guinea show that all present human mtDNA is descended from a single African woman who lived about 200,000 years ago.

First problem
Computer program was used to find a tree consistent with the mtDNA data
But so were many other (unreported) trees!
– order of answers depended on order of data
– root could be effectively anywhere in the dataset
e.g. Melanesian Eve, Asian Eve, European Eve…

Other problems
mtDNA may not change at a constant rate
mtDNA changes may be adaptive
Gene trees may not be population trees
– DNA (including mtDNA) can spread by gradual flow or by range expansion
– spread can be influenced by other factors

Early results: Native Americans come from four genetic lineages, labeled A through D. Amerinds have all four lineages, NaDene only A, and Eskaleuts A and D.
Current results: The four mtDNA lineages divide into nine distinct genetic subtypes. All four lineages are in all three language groups. Many local populations have all four lineages and a number even have all the subtypes. All subtypes can be found in North, Central and South America.
“It isn't realistic to believe that the same lineages ended up in all these populations across two continents by separate migrations.”

Oxford Ancestors
We put the Genes in Genealogy
Oxford Ancestors is the World's first organization to harness the power and precision of modern DNA-based genetics in the service of genealogy.
MatriLine™ interprets your deep maternal ancestry, linking you - if your roots are in Europe - to one of seven women: Ursula, Tara, Helena, Katrine, Velda, Xenia or Jasmine.


And mtDNA inheritance may not even be entirely clonal!
Mice
– demonstration of “paternal leakage”
Hagelberg
– rare mtDNA mutation in Vanuatu
Eyre-Walker
– statistics of mtDNA “homoplasies”

Island evidence
Erika Hagelberg (Proc. R. Soc. 1999)
– Island of Nguna (Vanuatu, Melanesia)
– 3 main mtDNA population groups, as expected for the region
– In all three groups, the same mutation is sometimes found, previously known only from one Northern European
– Repeated chance mutation is unlikely; local spread by recombination seems more probable

Statistics of mtDNA “homoplasies”
Mutations that occur in different mtDNA haplogroups around the world
Assuming purely maternal inheritance, these were thought to represent chance recurrence of mutations in “hypervariable” regions
Eyre-Walker et al. (Proc. R. Soc. 1999):
– regions are not statistically more variable than others
– mutations cluster geographically
Macaulay (1999) counters
– much of the result comes from a dataset that may contain errors
– “no need to panic”

Reaction of another mtDNA aficionado:
“…I am reminded of a comment by a bishop’s wife in Victorian England, also concerning human origins: ‘Let us hope that it isn’t true, and if it is, that it will not become generally known.’”