Gene tree analyses of Aboriginal Australians Rosalind Harding University of Oxford.



Aim
To investigate gene genealogies of two data sets:
–Human mitochondrial coding genomes from Aboriginal Australians
–Hepatitis B virus from the Pacific region (collaboration with Rory Bowden)
Why?
–To evaluate time depth of polymorphism
–To use coalescent models rather than molecular clocks in phylogenies
–To examine the implications of demographic assumptions

Aim for the ongoing study of HBV. For HBV, mutations at fast sites have to be removed to resolve networks. But mutations at fast sites also contribute to high estimates of mutation rates. If we remove the fast sites, how do we recalibrate the mutation rates? Can we match patterns of HBV diversity in the Pacific region to human dispersals that have been dated by archaeology and genetics, to suggest appropriate time scales?

Theoretical background

Coalescence times in a gene genealogy. Notice that T(2) is longer than T(3): with fewer lineages remaining, the expected wait for the next coalescence is longer. Here N is assumed constant. This time scaling shows what we expect for a standard coalescent model (Rosenberg and Nordborg, 2002).
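The relationship between T(2) and T(3) can be made concrete with a small sketch. It assumes the textbook result for the standard coalescent with constant N: while k lineages remain, the waiting time T(k) has expectation 2/(k(k-1)) in coalescent time units. The function names are illustrative and are not taken from the talk or from any software it mentions.

```python
from fractions import Fraction


def expected_coalescence_time(k: int) -> Fraction:
    """Expected time during which exactly k lineages exist.

    Standard coalescent, constant population size, time measured in
    coalescent units (an assumption of this sketch).
    """
    if k < 2:
        raise ValueError("need at least two lineages")
    return Fraction(2, k * (k - 1))


def expected_tmrca(n: int) -> Fraction:
    """Expected time to the most recent common ancestor of n sampled genes."""
    return sum(
        (expected_coalescence_time(k) for k in range(2, n + 1)),
        Fraction(0),
    )


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(f"E[T({k})] = {expected_coalescence_time(k)}")
    print(f"E[TMRCA] for n = 3: {expected_tmrca(3)}")
```

Under these assumptions the final interval, T(2), accounts for more than half of the expected tree depth for any sample size, which is why the deepest branches dominate time-depth estimates.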

Introducing Mutation. [Figure labels: G → T; sequences 1, 2, 3; MRCA of 1, 2 & 3.] The MRCA of 1, 2 & 3 is usually a more recent (younger) common ancestor than the common ancestor in whom a shared mutation, G → T, first arose for the whole tree (Rosenberg and Nordborg, 2002).

Constant N vs Expansion. [Figure panels: gene genealogy simulated assuming constant Ne (alleles A, B, C); frequencies of 3 alleles; frequencies of 11 alleles; gene genealogy simulated under population expansion.]

Computational analyses. Software based on Genetree, written by Prof Bob Griffiths. Input data: an infinite-sites compatible gene tree. Unpublished upgrades use importance sampling, following algorithms developed by Paul Fearnhead.
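What "infinite-sites compatible" means for input data can be illustrated with a short check. This is a minimal sketch, not Genetree's own code. It assumes haplotypes are coded as strings of "0" (ancestral) and "1" (derived) with the ancestral state known at every site. Under that assumption the data fit a rooted gene tree with one mutation per site exactly when, for every pair of sites, the sets of sequences carrying the derived allele are either disjoint or nested.

```python
from itertools import combinations


def derived_carriers(haplotypes: list[str], site: int) -> frozenset[int]:
    """Indices of the haplotypes carrying the derived allele ('1') at a site."""
    return frozenset(i for i, h in enumerate(haplotypes) if h[site] == "1")


def is_infinite_sites_compatible(haplotypes: list[str]) -> bool:
    """Return True if the 0/1 haplotypes fit a rooted tree with one mutation per site."""
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")
    carriers = [derived_carriers(haplotypes, s) for s in range(n_sites)]
    for a, b in combinations(carriers, 2):
        overlapping = bool(a & b)
        nested = a <= b or b <= a
        if overlapping and not nested:
            # Two sites whose derived sets overlap without nesting need a
            # recurrent or back mutation (or recombination) to be explained.
            return False
    return True


if __name__ == "__main__":
    print(is_infinite_sites_compatible(["000", "100", "110", "001"]))
    print(is_infinite_sites_compatible(["10", "01", "11"]))
```

A pair of sites that fails this check is the kind of conflict that recurrent or back mutation produces, so it has to be resolved before the data can be passed to a gene-tree based analysis.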

Polymorphism data for gene genealogies

Resolving the gene trees
MtDNA coding genomes:
–Minor problem: recurrent or back mutation events
–Solution: re-instate inferred mutation events following standard mtDNA phylogeny reconstructions
HBV data:
–Major problem: a subset of fast sites
–Solution: determine fast sites using Parat software from Meyer & Von Haeseler, 2003, Mol. Biol. Evol. 20(2), and proceed as above

Background to mtDNA study. Van Holst Pellekaan et al. (2006) Mitochondrial genomics identifies major haplogroups in aboriginal Australians. Am J Phys Anthrop 131. That study estimated a time-scaled genealogy for 8 mtDNA coding regions from individual samples sequenced by Van Holst Pellekaan. The number of genomes in public databases and available to study has since increased; now n=34.

mtDNA genealogy
[Figure: gene tree of mtDNA coding-region mutations, each labelled by position and base (e.g. 10873T, 10400T, 12705C). Tips are grouped as haplogroup N (AuA, AuE, AuD, AuC) and haplogroup M (AuB). The time axis (labelled KYA) runs from the present back to 74,200, with further labels at 17,400, 26,700, 36,300, 36,350, 43,000 and 53,200.]
Both of the major non-African haplogroups are represented. The time scale estimated from the gene tree suggests that these lineages evolved from original founders 40,000 – 50,000 years ago.

M: AuB; N: haplogroups O (AuD), S (AuA), P: P3, P4 (AuC), P5, P6, P7, P8 (AuE) Network of 34 mtDNA genomes

Time scale for Australian mtDNA
Estimated mutation rate:
–… per coding region per generation
Data suggest population expansion
Find model parameters with relatively high likelihood:
–ML(θ) = 350 = 2Nu
–Population expansion rate since TMRCA: e^5
–TMRCA: time to most recent common ancestor
Population size:
–N present: 33,000
–from N ancestral: 220 at TMRCA
TMRCA: 66,000 yrs
Note: P3 is the only haplogroup with branches represented in both Australia and PNG.
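The population-size figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes that "expansion rate e^5" means the population grew by a factor of e^5 between the TMRCA and the present, and that N in 2Nu is the present-day size. Both are readings of the slide rather than statements from it. The implied per-generation rate is a derived illustration only and is not the slide's own estimate.

```python
import math

# Values taken from the slide.
N_ANCESTRAL = 220        # population size at the TMRCA
GROWTH_EXPONENT = 5      # expansion since the TMRCA, as a power of e
THETA_ML = 350           # maximum-likelihood value of 2Nu
N_PRESENT_SLIDE = 33_000

# Assumption: total growth factor since the TMRCA is e**5.
n_present = N_ANCESTRAL * math.exp(GROWTH_EXPONENT)

# Assumption: N in 2Nu is the present-day size, so u = theta / (2N).
u_implied = THETA_ML / (2 * N_PRESENT_SLIDE)

if __name__ == "__main__":
    print(f"N present from 220 * e^5: {n_present:,.0f}")
    print(f"u implied by 2Nu = 350 and N = 33,000: {u_implied:.4f}")
```

Under these readings, 220 × e^5 comes to roughly 32,650, which agrees with the rounded figure of 33,000 given for the present population size.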

New analysis confirms: Aboriginal Australian diversity has been evolving in isolation for ~40,000 years. mtDNA genealogy

Background to HBV study. Analysis by Rory Bowden. The focus is on HBV variability in Australia and the Pacific, judging the time scale of the genealogy by comparison with hypotheses for HBV dispersal. Within Genotype C are two very distinct sequences from Aboriginal Australians.

HBV Genotypes Worldwide, 7 HBV genotypes each with distinct geographic distribution. Sampling in East Asia and Pacific region finds mainly genotypes C and D

Australia and Pacific region First occupation of Australia: 50,000 yrs BP; PNG and Solomons: 30,000 yrs BP; Austronesian expansion: Vanuatu: 5,000 yrs BP; Fiji and Tonga: 3,000 yrs BP.

HBV C Genotype Network. [Figure labels: Various, mainly Melanesian; Vanuatu; Tonga, Fiji; China/Japan; AUSTRALIA.]

HBV C: Starting again … Network after removal of 10 fastest sites. S antigen sequences, relative rate cut-off of 15.
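The filtering step can be sketched as follows. This is a hypothetical illustration, not the Parat workflow itself. It assumes that a relative rate has already been estimated for every alignment column, and that a site is dropped when its rate is strictly above the cut-off. The function name and the toy data are invented for the example.

```python
def remove_fast_sites(
    sequences: list[str],
    relative_rates: list[float],
    cutoff: float,
) -> tuple[list[str], list[int]]:
    """Drop alignment columns whose relative rate exceeds the cut-off.

    Returns the filtered sequences and the indices of the columns kept.
    """
    if any(len(seq) != len(relative_rates) for seq in sequences):
        raise ValueError("need one relative rate per alignment column")
    kept = [i for i, rate in enumerate(relative_rates) if rate <= cutoff]
    filtered = ["".join(seq[i] for i in kept) for seq in sequences]
    return filtered, kept


if __name__ == "__main__":
    # Toy alignment and made-up rates, for illustration only.
    toy_alignment = ["ACGT", "ACGA"]
    toy_rates = [1.0, 20.0, 0.5, 15.0]
    filtered, kept = remove_fast_sites(toy_alignment, toy_rates, cutoff=15)
    print(filtered, kept)
```

Returning the retained column indices alongside the sequences keeps a record of which sites were removed, which matters when mutation rates have to be recalibrated for the reduced alignment.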

HBV C: more resolution Network of S antigen sequences

HBV Genotype C in the Pacific

Time scales 5000 or 50,000 years? 3000 or 30,000 years? 2000 or 20,000 years?

Conclusions
Gene trees can be constructed for mtDNA and HBV data to represent polymorphism data.
Coalescent analyses are feasible.
Contemporary mtDNA diversity in Aboriginal Australians represents founding lineages.
Contemporary HBV diversity in Australia and the Pacific could be explained by two alternative time scales (more work to do!):
–Over 50,000 years
–Over 5,000 years

Abstract: Gene tree analyses of Aboriginal Australians. Gene trees from mitochondrial DNA sequences have been widely used for phylogeographic analyses of modern human dispersal but are not so often used in combination with coalescent models for demographic inferences. Given the lack of recombination in mtDNA, such data should be ideal for gene tree based coalescent analyses. However, the same mutability that makes them so informative for studies of geographic variation also generates difficulties for analyses assuming an infinite-sites mutation process. The main aim of this talk is to present some gene tree based coalescent analyses applied to hypervariable sequence data from mtDNA and also other genomes, and to discuss solutions to the problems that arise. The primary data set comprises 34 mtDNA coding genomes from Aboriginal Australians and extends work presented by van Holst Pellekaan et al. (Am J Phys Anthrop 131, 2006). Mitochondrial DNA is not the only haploid genome that has value for anthropological genetics. Vertically transmitted bacteria can also be informative, as has been previously shown using data from Helicobacter pylori. In collaboration with Rory Bowden, we hope to show that Hepatitis B virus strains may also provide insights into anthropological questions about Aboriginal Australian prehistory. Hopefully, I will have results on some gene tree analyses as well as methodological issues to discuss.

The Pacific

Pacific and Indian D Genotype

A more complicated network …

HBV S antigen gene sequences