Sequencing Neanderthal DNA


Sequencing Neanderthal DNA Lauren Edelson Image sources: http://en.wikipedia.org/wiki/Neanderthal

Road Map: Sequencing ancient DNA: methods & outcomes (Review article); DNA sequencing to infer Neanderthal ancestry in modern humans (Sankararaman et al); Examining the Altai Neanderthal population in depth (Prüfer et al). More complicated than this diagram!!

Learning About Human Population History From Ancient And Modern Genomes (Review, 2011). Before genome-wide data were available, human population studies relied on single genetic loci (e.g. mitochondrial DNA or the non-recombining region of the Y chromosome). These methods provide much more limited information. IN CONTRAST, high-throughput DNA sequencing lets us examine nearly the entire genome instead of single loci, including the full genomes of extinct species and populations. These technologies have allowed us to sequence the genomes of two archaic human groups: Neanderthals and Denisovans. Modern DNA sequencing techniques allow us to arrive at conclusions like the chart above. Let's see how! Meeting Notes (5/20/14 11:56): Mark Stoneking and Johannes Krause

High-throughput sequencing of ancient DNA. Even though DNA degrades with time, under the right conditions it can be preserved for up to 100,000 years! High-throughput sequencing allows us to look at the whole genome. Less than 50 mg of fossil material is needed to obtain a full ancient genome. You don’t need to fully understand this diagram; just follow bone -> DNA repair -> analysis. Other modern methods include targeted approaches using SNP arrays and targeted DNA hybridization capture, as well as unsupervised analyses (focusing on the individual instead of the population).

Challenges of sequencing ancient DNA: repairing damaged DNA and avoiding contaminant DNA. DAMAGED DNA: Short fragment length (usually <70 base pairs) and low quantities of DNA. Short fragments can create mapping bias because they are hard to map to the genome. CONTAMINATED DNA: The percentage of endogenous DNA can range from 100% to <0.1%. Possible sources of DNA include the organism itself, microbial/environmental DNA introduced after fossil deposition, and DNA contamination AFTER sample collection.

Challenges of sequencing ancient DNA (continued): post-mortem chemical damage. Chemical modifications can cause nucleotides to be misread during amplification and sequencing. The most common form of this damage is cytosine deamination, in which cytosine is converted into uracil and is therefore read as thymine during sequencing. 5’ ends show high rates of C->T changes (cytosine deamination). 3’ ends show high rates of G->A changes (due to a reaction during the blunt-end repair process). However, in some studies this chemical damage is actually useful! In high-throughput sequencing, contaminant human DNA shows about 8x less cytosine deamination at strand ends than Neanderthal DNA, which can help us identify contamination. However, we don’t know how fast this process occurs, so it doesn’t exclude deaminated human contaminant DNA in fossils collected during the 19th century, for example. With that caveat, we can use cytosine deamination to test the authenticity of ancient DNA sequences!
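The strand-end damage pattern described on this slide can be illustrated with a small Python sketch. It assumes reads are already aligned to the reference without gaps and are given as (reference, read) string pairs. The function name `damage_profile`, the `window` parameter, and the toy sequences are hypothetical, made up for illustration, and not taken from any of the papers.

```python
from collections import Counter

def damage_profile(pairs, window=5):
    """Count C->T mismatches near 5' ends and G->A mismatches near 3' ends.

    pairs: iterable of (reference, read) strings of equal length,
    aligned without gaps (a simplifying assumption for this sketch).
    Returns two Counters keyed by distance from the respective end.
    """
    c_to_t = Counter()  # distance from 5' end -> number of C->T mismatches
    g_to_a = Counter()  # distance from 3' end -> number of G->A mismatches
    for ref, read in pairs:
        if len(ref) != len(read):
            raise ValueError("reference and read must be the same length")
        n = len(read)
        for i in range(min(window, n)):
            if ref[i] == "C" and read[i] == "T":
                c_to_t[i] += 1
            j = n - 1 - i  # same distance, measured from the 3' end
            if ref[j] == "G" and read[j] == "A":
                g_to_a[i] += 1
    return c_to_t, g_to_a

# Toy data (invented): two damaged reads and one undamaged read.
pairs = [
    ("CCGATTACGG", "TCGATTACGA"),  # C->T at the 5' end, G->A at the 3' end
    ("CAGTCCGTAG", "TAGTCCGTAG"),  # C->T at the 5' end only
    ("ACGTACGTAC", "ACGTACGTAC"),  # no mismatches
]
c_to_t, g_to_a = damage_profile(pairs)
print(c_to_t[0], g_to_a[0])  # prints "2 1"
```

A read set with many end mismatches of this kind looks more like authentic ancient DNA; a set with few looks more like modern contamination, subject to the caveat on the slide.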

Okay, so we sequenced some Neanderthal DNA. What now?

How much can you actually tell from ancient DNA? Genetic composition of organisms Phylogenetic relationships Divergence times Population structure Population hybridization Population bottlenecks Phylogeographic patterns

Emerging technologies allow for more complex population divergence models. Analysis of only one or a few loci can give us only a simple picture. Genome-wide data allow for more specific models that include migrations, bottlenecks, and population expansions. We can also directly test hypotheses about population history. Stay tuned to see what results we find from the actual data! Models can get even more complex than this, and they are used to measure admixture between populations. ADMIXTURE = gene flow between two or more groups that have been separated long enough to be genetically distinct. It introduces new genetic lineages into a population and can prevent speciation. Our best models as of 2009 for population divergence.

SNP data used to compare possible dispersal route scenarios. METHODS: Recent study of ~1 million SNPs from populations from Borneo, New Guinea, Fiji, and Polynesia. Accounts for ascertainment bias. (Ascertainment bias arises from how SNPs are chosen for inclusion on SNP arrays: SNPs known to be polymorphic in one population will cause overestimation of genetic variation in that population compared to others.) Compare with data from Yoruba, Chinese, and European-Americans collected for the International HapMap Project. Evaluate three possible scenarios with approximate Bayesian computation. mtDNA and Y-chromosome evidence support the early southern route dispersal hypothesis (Southern route dispersal hypothesis??). In contrast, genome-wide SNP data support a modified version of the early southern route dispersal hypothesis, with a single out-of-Africa migration followed by separate dispersals from a non-African population (middle box). Supported by evidence in modern-day human genomes! All non-African modern humans have a signal of Neanderthal gene flow, and New Guineans have a signal of Denisovan gene flow not (as of yet) found in East Asian populations.
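To give a feel for how approximate Bayesian computation (ABC) compares scenarios, here is a toy rejection-sampling sketch in Python. Everything in it is an assumption made for illustration: the three scenarios are stand-ins named `scenario_A` to `scenario_C`, each "simulator" just draws one made-up summary statistic from a normal distribution, and the observed value is invented. A real analysis would simulate genome-wide SNP data under each demographic model.

```python
import random

def abc_model_choice(observed, simulators, n_sims=2000, tolerance=0.01, seed=0):
    """Rejection ABC with equal prior weight on each scenario.

    A simulation is accepted when its summary statistic falls within
    `tolerance` of the observed value. The share of accepted simulations
    contributed by each scenario approximates its posterior probability.
    """
    rng = random.Random(seed)
    accepted = {name: 0 for name in simulators}
    for name, simulate in simulators.items():
        for _ in range(n_sims):
            if abs(simulate(rng) - observed) <= tolerance:
                accepted[name] += 1
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulations accepted; raise tolerance or n_sims")
    return {name: count / total for name, count in accepted.items()}

# Hypothetical scenarios: each predicts a different typical summary statistic.
simulators = {
    "scenario_A": lambda rng: rng.gauss(0.10, 0.02),
    "scenario_B": lambda rng: rng.gauss(0.20, 0.02),
    "scenario_C": lambda rng: rng.gauss(0.30, 0.02),
}
posterior = abc_model_choice(observed=0.20, simulators=simulators)
print(max(posterior, key=posterior.get))  # prints "scenario_B"
```

The point is only the logic: scenarios whose simulations rarely resemble the observed data end up with low support.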

Key Dates. Neanderthal/Denisovan split with humans: 820,000 ya (350,000 ya?). Neanderthal and Denisovan split: 680,000 ya. Out of Africa: 50,000 ya. Further human population divergence: 35,000-50,000 ya.

The genomic landscape of Neanderthal ancestry in present-day humans (March 2014) Sriram Sankararaman et al VERY new paper!! Image source http://ahmadiyyatimes.blogspot.com/2011/07/faith-and-science-all-non-africans-part.html

Subjects: 1,004 modern humans; 176 West African (Yoruba from Ibadan, Nigeria), 758 European, 572 East Asian. This study examined the genotypes of 1,004 modern-day humans and compared them to Neanderthal DNA. It compared 176 West African Yoruba sequences from Ibadan, Nigeria with Neanderthal and non-African genomes (758 European and 572 East Asian). Note that 176 + 758 + 572 exceeds 1,004: these three figures appear to be haplotype counts (two per person) rather than numbers of individuals.

Three features of genetic information: allelic pattern at SNPs, high sequence divergence, haplotype length. CRFs take into account three features of genetic information: 1) If a non-African has an allele seen in Neanderthals but absent in West Africans, then the allele is likely to originate from Neanderthals. 2) High sequence divergence between non-African and African haplotypes but NOT between non-African and Neanderthal haplotypes. 3) Haplotype length consistent with Neanderthal-human interbreeding occurring between 37,000 and 86,000 years ago. This length is about 0.05 centimorgans = (100 cM per Morgan)/(2,000 generations). Feature functions: predicting ancestor alleles based on observed. These features are then used to build a Conditional Random Field (CRF)!
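The haplotype-length arithmetic on this slide can be written out as a short Python sketch. Only the 100 cM / 2,000 generations calculation comes from the slide. The generation time of 29 years is an assumption added here to convert the 37,000-86,000 year range into generations, and the function name is hypothetical.

```python
def expected_tract_length_cm(generations):
    """Rough expected length (in cM) of an introgressed segment.

    Recombination breaks segments down over time, so after g generations
    the expected length is about 1/g Morgans, i.e. 100/g centimorgans.
    """
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / generations

print(expected_tract_length_cm(2000))  # prints 0.05, the figure on the slide

YEARS_PER_GENERATION = 29  # assumption, not stated on the slide

for years in (37_000, 86_000):
    generations = years / YEARS_PER_GENERATION
    length = expected_tract_length_cm(generations)
    print(f"{years} years ~ {generations:.0f} generations -> {length:.3f} cM")
```

Under that assumed generation time, older admixture dates predict shorter segments, which is why segment length carries information about when interbreeding happened.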

Conditional Random Fields (CRFs). Based on the “3 features”. Used to predict the likelihood of Neanderthal ancestry in a given DNA sequence. Feature functions: Single SNP in Africans vs Europeans vs Test; Multiple SNPs used to capture ancestry (divergence). Two classes of feature functions: 1) Captures information from joint patterns observed at a single SNP in Africans, the test haplotype, and Neanderthals. Main patterns they are looking for: Africans have the ancestral allele, while the test haplotype and Neanderthals have at least one derived allele = INCREASED Neanderthal ancestry. Test haplotype allele is absent from Neanderthals and polymorphic (present) in Africans = DECREASED Neanderthal ancestry. 2) Uses multiple SNPs to capture the signal of Neanderthal ancestry, comparing the divergence between the test haplotype and the Neanderthal sequence to the divergence between the test haplotype and African haplotypes. Basically you expect a given sequence to be either mostly Neanderthal or mostly modern (obviously with large amounts of variation) -> In a region of the genome in which the test haplotype carries Neanderthal ancestry, we expect the test haplotype to be closer to the Neanderthal sequence than to most modern human sequences; this pattern is reversed outside these regions.
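The two single-SNP patterns listed on this slide can be sketched as a toy scoring function in Python. This is not the paper's CRF, which combines feature functions with learned weights. The function name, the 0/1 allele coding, and the +1/-1/0 return values are all illustrative assumptions.

```python
def snp_feature(test_allele, neanderthal_alleles, african_alleles):
    """Toy single-SNP evidence score for Neanderthal ancestry.

    Alleles are coded 0 = ancestral, 1 = derived.
    Returns +1 (evidence for), -1 (evidence against), or 0 (uninformative).
    """
    africans_all_ancestral = all(a == 0 for a in african_alleles)
    neanderthal_has_derived = any(a == 1 for a in neanderthal_alleles)
    # Pattern 1: derived allele shared by the test haplotype and Neanderthals,
    # absent from Africans.
    if test_allele == 1 and africans_all_ancestral and neanderthal_has_derived:
        return 1
    # Pattern 2: test allele absent from Neanderthals but segregating in Africans.
    absent_from_neanderthals = test_allele not in neanderthal_alleles
    polymorphic_in_africans = len(set(african_alleles)) > 1
    if absent_from_neanderthals and polymorphic_in_africans and test_allele in african_alleles:
        return -1
    return 0

print(snp_feature(1, [1, 1], [0, 0, 0]))  # prints 1
print(snp_feature(1, [0, 0], [0, 1, 0]))  # prints -1
print(snp_feature(0, [0, 0], [0, 0, 0]))  # prints 0
```

In a real CRF these per-SNP signals are combined along the chromosome with the multi-SNP divergence features and the haplotype-length expectation, so that neighbouring SNPs inform each other.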

Tiling path from inferred Neanderthal haplotypes. a) A specific locus on Chromosome 9 in several European individuals. Red: inferred Neanderthal haplotypes. Blue: resulting tiling path (contigs). How do you infer Neanderthal haplotypes? Runs of consecutive SNPs with high marginal probability, keeping haplotypes at least 0.02 centimorgans long. CONTIG = set of overlapping DNA segments that together represent a consensus region of DNA. b) Distribution of contig length: 4,437 Neanderthal contigs, median length = 129 kb.
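The haplotype-calling rule on this slide (runs of consecutive high-probability SNPs, at least 0.02 cM long) can be sketched in Python as follows. The slide only says "high" marginal probability, so the 0.9 cutoff is an assumption, as are the function name and the toy input values.

```python
def call_haplotypes(positions_cm, marginals, min_prob=0.9, min_len_cm=0.02):
    """Return (start_cm, end_cm) for runs of consecutive SNPs whose marginal
    probability of Neanderthal ancestry is at least `min_prob`, keeping only
    runs spanning at least `min_len_cm` centimorgans."""
    runs = []
    start = None
    for i, p in enumerate(marginals):
        if p >= min_prob:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            runs.append((start, i - 1))  # the run ended at the previous SNP
            start = None
    if start is not None:
        runs.append((start, len(marginals) - 1))
    return [
        (positions_cm[s], positions_cm[e])
        for s, e in runs
        if positions_cm[e] - positions_cm[s] >= min_len_cm
    ]

# Toy input (invented): genetic positions in cM and per-SNP marginals.
positions_cm = [0.00, 0.01, 0.02, 0.04, 0.05, 0.06, 0.07, 0.08]
marginals    = [0.10, 0.95, 0.97, 0.99, 0.20, 0.95, 0.30, 0.10]
print(call_haplotypes(positions_cm, marginals))  # prints [(0.01, 0.04)]
```

The single high-probability SNP at 0.06 cM is dropped because a run of one SNP has zero length, which is the kind of isolated signal the length filter is meant to exclude.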

Maps of Neanderthal Ancestry (European, East Asian, African). Individual maps: on Chromosome 9, the marginal probability of Neanderthal ancestry at a given position in INDIVIDUALS: one European American (resident of Utah), red; one East Asian (Han Chinese in Beijing), green; one sub-Saharan African (Luhya in Kenya), blue. Population maps: AVERAGE data across all European individuals (red) and all East Asian individuals (green) in the study, showing estimates of the proportion of Neanderthal ancestry in non-overlapping 100-kb windows on Chromosome 9. At certain locations Neanderthal ancestry is inferred to be as high as 62% (East Asian) and 64% (European).

Neanderthal Ancestry in 1000 modern genomes (European, East Asian). For each chromosome, this graph shows the fraction of alleles confidently inferred to be of Neanderthal origin in Europeans (red) and East Asians (green) in non-overlapping 1-Mb windows. Black bars = centromeres. Note that they label the 10-Mb-sized windows. Note the “deserts”: spots on chromosomes with extremely low Neanderthal ancestry. These can probably be explained partially by small population sizes directly after interbreeding and partially by selection. The largest deserts occur on the X chromosome, where there is an approximately 5-fold reduction in Neanderthal ancestry. The X chromosome is known to be dense in male hybrid sterility genes across many species… High selection against Neanderthal DNA on the X chromosome.

Male hybrid sterility. Male donkey x female horse = mule (infertile). After crossing two species, offspring are often infertile, keeping the two species distinct. Evidence for this in humans and Neanderthals? Yes! But it’s not the only factor causing selection against the Neanderthal genome. If male hybrid sterility occurred between humans and Neanderthals, we would expect the responsible genes to be disproportionately expressed in the testes. They test this by defining tissue-specific genes as those expressed more highly in one tissue than in others. Results: ONLY genes specific to testes were enriched in regions of low Neanderthal ancestry. THEREFORE: interbreeding of humans and Neanderthals introduced alleles into humans that weren’t tolerated, probably in part because they contributed to male hybrid sterility.

B statistic. Background selection (B) value “indicates the expected fraction of neutral diversity that is present at a site”. Ranges from 0 to 1. Relatively obscure. Values near 0: almost all diversity removed by selection. Values near 1: little effect of selection. “Widespread Genomic Signatures of Natural Selection in Hominid Evolution” (McVicker et al, 2009)

Functionally important regions are deficient in Neanderthal ancestry. OVERVIEW: Here they use the B statistic: low B implies a high density of functionally important elements. The B values are then binned into 5 categories (quintiles). BINNING: Essentially the data are broken into 5 buckets and each bucket is averaged. RESULTS: If selection had not acted on Neanderthal DNA in our genome, we would see a flat line. Instead, regions with reduced Neanderthal ancestry are enriched in genes, implying that selection has acted to remove genetic material derived from Neanderthals from the modern genome. The quintile with the highest B values (last quintile) has the highest Neanderthal ancestry of all the quintiles. Also, there has been especially strong selection on the X chromosome, as we already saw in the previous graph. Neanderthal ancestry is positively correlated with the B statistic, SO we infer that at least some of the reduction in Neanderthal ancestry on the X chromosome is due to selection. (Selection removing Neanderthal DNA from our genome… Less Neanderthal ancestry in the more functional regions)
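The binning step on this slide (sort regions by B value, split into five equal buckets, average Neanderthal ancestry within each) can be sketched in Python. The function name is hypothetical and the ten data points are invented purely to show the mechanics; they are not values from the paper.

```python
def mean_ancestry_by_quintile(b_values, ancestry):
    """Sort regions by B value, split them into five equal-sized bins,
    and return the mean Neanderthal ancestry in each bin (lowest B first)."""
    if len(b_values) != len(ancestry):
        raise ValueError("b_values and ancestry must have the same length")
    if len(b_values) < 5:
        raise ValueError("need at least five regions to form quintiles")
    pairs = sorted(zip(b_values, ancestry))
    n = len(pairs)
    means = []
    for q in range(5):
        chunk = pairs[q * n // 5:(q + 1) * n // 5]
        means.append(sum(a for _, a in chunk) / len(chunk))
    return means

# Invented example: ancestry rises with B, as the slide describes.
b_values = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
ancestry = [0.004, 0.006, 0.008, 0.010, 0.012, 0.014, 0.016, 0.018, 0.020, 0.022]
means = mean_ancestry_by_quintile(b_values, ancestry)
print([round(m, 3) for m in means])
```

A flat list of means would indicate no relationship between functional constraint and Neanderthal ancestry; an increasing list matches the pattern reported on the slide.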

Neanderthal GWAS. Genome-Wide Association Studies (GWAS) can be used to associate Neanderthal-derived alleles with modern phenotypes. Examination of the 5% of genes with the highest Neanderthal ancestry (remember the red/green graphs?!) shows that these genes are involved in keratin filament formation, suggesting that Neanderthal alleles that affect skin and hair may have helped humans adapt to non-African environments.

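Genome-wide estimates of Neanderthal ancestry. Neanderthal ancestry: East Asian > European. One proposed explanation: the ancestral European population was larger, so selection was more effective there at removing deleterious Neanderthal alleles than in the smaller ancestral East Asian population.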

The complete genome sequence of a Neanderthal from the Altai Mountains (January 2014), Kay Prüfer et al. ALSO a very recent paper!! A phalanx (bone segment) from the 4th or 5th toe of an adult Neanderthal woman was found in the Altai Mountains and sequenced! This bone was found in Denisova Cave, where bones of a “sister group” to Neanderthals have also been found. DNA sequencing revealed that this toe is NOT Denisovan and is instead Neanderthal. To sequence it they had to remove uracil residues resulting from cytosine deamination. Denisovans = archaic hominins represented by fossil remains found in Denisova Cave in Siberia. According to DNA, a “sister group” to Neanderthals.

Phylogenetic relationships of the Altai Neanderthal. They used different mathematical models for making trees. Each tree takes into account a different aspect of the data… EVEN SO, both trees display similar patterns. In both trees, human groups are clustered together (blue), Neanderthal groups are clustered together (red), and Denisovans are by themselves. Left tree is Bayesian/phylogenetic: humans and Neanderthals are closer and Denisovans are separate. Right tree uses the neighbor-joining algorithm on differences between purines and pyrimidines (transversions): Denisovans and Neanderthals are closer and humans are separate. Overall, another way to visualize phylogeny. Disclaimer: this figure is based on dates of divergence of human/chimp DNA sequences, which in turn rely on the human mutation rate (currently a controversial statistic). Figure labels: Modern humans, Neanderthals.
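The "differences between purines and pyrimidines" used for the right-hand tree are transversions. Counting only transversions is attractive for ancient DNA because deamination damage produces transitions (C->T, G->A), so transversions are less affected by it. A minimal Python sketch of the pairwise count, with a hypothetical function name and invented sequences, looks like this:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def transversion_differences(seq_a, seq_b):
    """Count aligned positions where one sequence has a purine and the
    other a pyrimidine. Transitions (A<->G, C<->T) are ignored."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    count = 0
    for x, y in zip(seq_a, seq_b):
        purine_to_pyrimidine = x in PURINES and y in PYRIMIDINES
        pyrimidine_to_purine = x in PYRIMIDINES and y in PURINES
        if purine_to_pyrimidine or pyrimidine_to_purine:
            count += 1
    return count

# A->C and C->G are transversions; G->A is a transition and is not counted.
print(transversion_differences("ACGTAC", "CCATAG"))  # prints 2
```

A matrix of such pairwise counts is the kind of distance input a neighbor-joining algorithm takes to build a tree.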

Homozygosity: inheriting identical copies of the same gene from each parent. E.g. Tay-Sachs (from population bottleneck). Long runs of homozygosity indicate inbreeding. Tay-Sachs is especially prominent in Ashkenazi Jewish populations, which have undergone population bottlenecks. These bottlenecks reduce genetic diversity and therefore increase homozygosity. This diagram shows a single mutation, but you can imagine that a similar concept applies to the entire genome.
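Extending the single-mutation picture to the whole genome, runs of homozygosity can be found by scanning for long stretches with no heterozygous sites. The Python sketch below assumes each site is already labelled heterozygous or not and has a genetic-map position in cM. The function name, the default minimum length, and the toy data are illustrative assumptions.

```python
from itertools import groupby

def runs_of_homozygosity(positions_cm, is_het, min_len_cm=2.5):
    """Return (start_cm, end_cm) for stretches of consecutive homozygous
    sites that span at least `min_len_cm` centimorgans."""
    runs = []
    index = 0
    for het, group in groupby(is_het):
        size = len(list(group))
        if not het:
            start = positions_cm[index]
            end = positions_cm[index + size - 1]
            if end - start >= min_len_cm:
                runs.append((start, end))
        index += size  # move past this block of identical labels
    return runs

# Toy data (invented): one long homozygous stretch and one short one.
positions_cm = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
is_het = [True, False, False, False, False, True, False, False]
print(runs_of_homozygosity(positions_cm, is_het))  # prints [(1.0, 4.0)]
```

Summing the lengths of such runs and dividing by the genome length gives the fraction of the genome in runs of homozygosity, a common summary of recent inbreeding.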

IBD = Identity by descent. Important to know that these regions of the genome could also be the same by convergence, but identity usually implies shared ancestry. The length of a segment reflects how many generations have passed since the most recent common ancestor. The more meioses have occurred, the more recombination has happened and the shorter the segment typically is.

Indications of inbreeding in the Altai Neanderthal individual. a) Time since the most recent common ancestor for the two alleles of the French, Denisovan, and Altai individuals, as determined by homozygosity over 40 Mb of chromosome 21. b) Four possible scenarios of parental relatedness for the Altai Neanderthal. We could derive additional scenarios from those marked with an asterisk by switching sexes. As she is female and the X chromosome also displays homozygosity, scenarios with two successive males in the pedigree have been omitted. c) Fraction of the genome in runs of homozygosity (of length 2.5-10 cM) for various populations. The Altai Neanderthal is MUCH higher than other groups, indicating higher rates of inbreeding and perhaps a population bottleneck. Any of the scenarios above could account for this amount. They are all a little bit less because????

Heterozygosity estimates by group. Heterozygosity = opposite of homozygosity. This graph shows the range of heterozygosity observed in 15 non-African and 10 African individuals. Red bar: taking out the recent inbreeding still shows low heterozygosity. This shows that the individual genome sequenced is not an anomaly; rather, we can infer much more about the population.
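As a concrete reading of "heterozygosity", here is a minimal Python sketch that computes the fraction of called sites at which an individual carries two different alleles. The function name, the genotype representation (a pair of alleles, or `None` for an uncalled site), and the toy genotypes are assumptions made for illustration.

```python
def heterozygosity(genotypes):
    """Fraction of called sites where the two alleles differ.

    genotypes: list of (allele1, allele2) tuples; None marks a site with
    no confident genotype call, which is excluded from the denominator.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called sites")
    het_sites = sum(1 for a, b in called if a != b)
    return het_sites / len(called)

genotypes = [("A", "A"), ("A", "G"), None, ("C", "C"), ("T", "T")]
print(heterozygosity(genotypes))  # prints 0.25
```

Removing sites inside long runs of homozygosity before computing this value corresponds to the "taking out the recent inbreeding" comparison mentioned on the slide.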

Inference of population size change over time. X axis: time, going backwards. Note that Neanderthal/Denisovan populations decrease while human populations eventually explode.

Possible model of gene flow. Shows the direction and magnitude of inferred gene flow events. Evidence for 3-5 cases of interbreeding among 4 distinct hominin populations; real history is probably more complex than this. Dashed line represents uncertainty. Note the “potential unknown hominin”. This could be a single entity such as Homo erectus, or even a whole subtree.

Remember this guy? Look at how far our models have come!

Possible model of gene flow Pretty cool how much we can tell just by a single bone!!

Takeaways. Although major challenges exist, it is possible to accurately sequence ancient DNA with modern technology. We can use these methods to compare ancient and modern DNA. All non-African modern-day humans have a small genetic presence of Neanderthal ancestry. We can learn much about ancient populations (e.g. bottlenecks & inbreeding of Altai Neanderthals) by looking solely at one bone fragment!

Thank you. Questions?