1
Sequencing Neanderthal DNA
Lauren Edelson
2
Road Map
Sequencing ancient DNA: methods & outcomes (Review article)
DNA sequencing to infer Neanderthal ancestry in modern humans (Sankararaman et al)
Examining the Altai Neanderthal population in depth (Prüfer et al)
More complicated than this diagram!!
3
Learning About Human Population History From Ancient And Modern Genomes (Review, 2011)
Before genome-wide data were available, human population studies relied on single genetic loci (e.g. mitochondrial DNA or the non-recombining region of the Y chromosome). These methods provide much more limited information.
IN CONTRAST, high-throughput DNA sequencing lets us look at nearly the entire genome instead of just single loci, including the full genomes of extinct species and populations. These technologies have allowed us to sequence the genomes of two archaic human groups: Neanderthals and Denisovans.
Modern DNA sequencing techniques allow us to arrive at conclusions like the chart above. Let's see how!
Meeting Notes (5/20/14 11:56): Mark Stoneking and Johannes Krause
4
High-throughput sequencing of ancient DNA
Even though DNA degrades with time, under the right conditions it can be preserved for up to 100,000 years! High-throughput sequencing allows us to look at the whole genome. Less than 50 mg of fossil material is needed to obtain a full ancient genome.
You don't need to fully understand this diagram; just follow bone -> DNA repair -> analysis.
Other modern methods include targeted approaches using SNP arrays and targeted DNA hybridization capture, as well as unsupervised analyses (focusing on the individual instead of the population).
5
Challenges of sequencing ancient DNA
Two challenges: repairing damaged DNA and avoiding contaminant DNA.
DAMAGED DNA: Short fragment length (usually <70 base pairs) and low quantities of DNA. Short fragments can create mapping bias because they are hard to map to the genome.
CONTAMINATED DNA: The percentage of endogenous DNA can range from 100% to <0.1%. Possible sources of DNA include the organism itself, microbial/environmental DNA introduced after fossil deposition, and DNA contamination AFTER sample collection.
6
Challenges of sequencing ancient DNA
(continued) Post-mortem chemical damage
Chemical modifications can cause nucleotides to be misread during amplification and sequencing. The most common form of this damage is cytosine deamination, in which cytosine is converted into uracil and is therefore read as thymine during sequencing.
5' ends show high rates of C->T changes (cytosine deamination). 3' ends show high rates of G->A changes (due to a reaction during the blunt-end repair process).
However, in some studies this chemical damage is actually useful! In high-throughput sequencing, contaminant human DNA shows about 8x less cytosine deamination at strand ends than Neanderthal DNA, which can help us identify contamination. However, we don't know how fast this process occurs, so it doesn't exclude the possibility of deaminated human contaminant DNA in fossils collected during the 19th century, etc.
We can use cytosine deamination to test the authenticity of ancient DNA sequences!
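To make the authenticity check concrete, here is a minimal sketch of how C->T mismatch rates near the 5' end could be tallied. It assumes ungapped (reference, read) string pairs that both start at the read's 5' end; the function name and the toy alignments are hypothetical and not taken from any of the papers.

```python
def ct_mismatch_rate(alignments, n_positions=5):
    """Rate of C->T mismatches at each of the first n_positions from the 5' end.

    alignments: iterable of (reference, read) strings of equal length,
    both written 5'->3' and aligned without gaps (a simplifying assumption).
    Returns a list with one rate per position, or None where no reference C was seen.
    """
    ref_c = [0] * n_positions    # how often the reference has a C at this offset
    c_to_t = [0] * n_positions   # how often the read shows T where the reference has C
    for ref, read in alignments:
        for i in range(min(n_positions, len(ref), len(read))):
            if ref[i] == "C":
                ref_c[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / ref_c[i] if ref_c[i] else None for i in range(n_positions)]


# Made-up toy alignments: two of three reads show C->T at the first base.
toy = [("CACG", "TACG"), ("CCGT", "TCGT"), ("CAGT", "CAGT"), ("GCAT", "GCAT")]
rates = ct_mismatch_rate(toy, n_positions=4)
print(rates)
```

A high rate at the first few positions that drops off further into the read is the kind of pattern the slide describes for authentic ancient DNA; a flat, low rate would be more consistent with modern contamination.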
7
Okay, so we sequenced some Neanderthal DNA. What now?
8
How much can you actually tell from ancient DNA?
Genetic composition of organisms
Phylogenetic relationships
Divergence times
Population structure
Population hybridization
Population bottlenecks
Phylogeographic patterns
9
Emerging technologies allow for more complex population divergence models
Analysis of only one or a few loci can give us only a simple picture. Genome-wide data allow for more specific models including migrations, bottlenecks, and population expansions. We can also directly test hypotheses about population history. Stay tuned to see what results we find from the actual data!
Models get even more complex than this! They are used to measure admixture between populations.
ADMIXTURE = gene flow between two or more groups that have been separated for long enough to be genetically distinct. It introduces new genetic lineages into a population and can prevent speciation.
Our best models as of 2009 for population divergence
10
SNP data used to compare possible dispersal route scenarios
METHODS: A recent study of ~1 million SNPs from populations in Borneo, New Guinea, Fiji, and Polynesia, accounting for ascertainment bias. (Ascertainment bias = bias arising from how SNPs are chosen for inclusion on SNP arrays. SNPs known to be polymorphic in one population will cause overestimation of genetic variation in that population compared to others.) The data are compared with data from Yoruba, Chinese, and European-Americans collected for the International HapMap Project. Three possible scenarios are evaluated with approximate Bayesian computation.
mtDNA and Y-chromosome evidence support the early southern route dispersal hypothesis (Southern route dispersal hypothesis??). In contrast, genome-wide SNP data support a modified version of the early southern route dispersal hypothesis, with a single out-of-Africa migration followed by separate dispersals from a non-African population (middle box).
This is supported by evidence in modern-day human genomes! All non-African modern humans carry a signal of Neanderthal gene flow, and New Guineans carry a signal of Denisovan gene flow not (as of yet) found in other East Asian populations.
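For readers unfamiliar with approximate Bayesian computation, the following toy rejection sampler sketches the idea of comparing scenarios: simulate a summary statistic under each scenario, keep simulations that land close to the observed value, and compare acceptance counts. Everything here is an illustrative assumption (the scenario names, the Gaussian "simulators", the observed value, and the tolerance); the real study simulated SNP data under demographic models.

```python
import random


def abc_model_choice(observed, simulators, n_sims=2000, tolerance=0.05, seed=0):
    """Toy ABC rejection sampler with an equal prior over scenarios.

    simulators: dict mapping a scenario name to a function rng -> summary statistic.
    Returns the share of accepted simulations contributed by each scenario.
    """
    rng = random.Random(seed)
    accepted = {name: 0 for name in simulators}
    for name, simulate in simulators.items():
        for _ in range(n_sims):
            # Accept a simulation when its statistic is close to the observed one.
            if abs(simulate(rng) - observed) <= tolerance:
                accepted[name] += 1
    total = sum(accepted.values())
    if total == 0:
        return {name: 0.0 for name in simulators}
    return {name: count / total for name, count in accepted.items()}


# Hypothetical scenarios: each just draws a made-up summary statistic.
toy_scenarios = {
    "scenario_A": lambda rng: rng.gauss(0.10, 0.03),
    "scenario_B": lambda rng: rng.gauss(0.20, 0.03),
    "scenario_C": lambda rng: rng.gauss(0.30, 0.03),
}
posterior = abc_model_choice(observed=0.29, simulators=toy_scenarios)
print(max(posterior, key=posterior.get))
```

The scenario with the largest share of accepted simulations is the one best supported by the toy "data"; in practice many summary statistics and far more careful simulators would be used.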
11
Key Dates
Neanderthal/Denisovan split with humans: 820,000 ya (350,000 ya?)
Neanderthal and Denisovan split: 680,000 ya
Out of Africa: 50,000 ya
Further human population divergence: 35,000-50,000 ya
12
The genomic landscape of Neanderthal ancestry in present-day humans (March 2014)
Sriram Sankararaman et al
VERY new paper!!
13
Subjects 1004 modern humans
176 West African (Yoruba: Ibadan, Nigeria)
758 Europeans
572 East Asians
This study examined the genotypes of 1,004 modern-day humans and compared them to Neanderthal DNA. It compared the genomes of 176 West African Yoruba from Ibadan, Nigeria with Neanderthal and non-African genomes (758 Europeans and 572 East Asians).
14
Three features of genetic information
Allelic pattern at SNPs
High sequence divergence
Haplotype length
CRFs take into account three features of genetic information:
1) Allelic pattern: if a non-African has an allele seen in Neanderthals but absent in West Africans, then the allele likely originates from Neanderthals.
2) High sequence divergence between the non-African and African haplotypes but NOT between the non-African and Neanderthal.
3) Haplotype length consistent with Neanderthal-human interbreeding occurring between 37,000 and 86,000 years ago. This length is about 0.05 centimorgans = (100 cM per Morgan)/(2,000 generations).
Feature functions: predicting ancestor alleles based on the observed data.
These features are then used to build a Conditional Random Field (CRF)!
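The haplotype-length arithmetic can be checked in a few lines. This sketch uses the rule of thumb that an introgressed segment is expected to be about 1/g Morgans (100/g cM) long after g generations of recombination. The generation time of 29 years is an assumption for illustration only and is not stated on the slide.

```python
def expected_segment_cm(generations):
    """Expected segment length in centimorgans after `generations` of recombination."""
    return 100.0 / generations  # 100 cM per Morgan, about 1/g Morgans expected


YEARS_PER_GENERATION = 29  # assumed value, for illustration only

# Translate the 37,000-86,000 year window into generations and segment lengths.
for years in (37_000, 86_000):
    g = years / YEARS_PER_GENERATION
    print(f"{years} years ~ {g:.0f} generations -> {expected_segment_cm(g):.3f} cM")

print(expected_segment_cm(2000))  # 0.05, the value quoted on the slide
```

Older admixture means more generations of recombination and therefore shorter expected segments, which is why haplotype length carries information about timing.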
15
Conditional Random Fields (CRFs)
Based on the “3 features”
Used to predict the likelihood of Neanderthal ancestry in a given DNA sequence
Feature functions: single SNP in Africans vs Europeans vs test haplotype; multiple SNPs used to capture ancestry (divergence)
Two classes of feature functions:
1) Captures information from joint patterns observed at a single SNP in Africans, the test haplotype, and Neanderthals. Main patterns they are looking for:
Africans have the ancestral allele, while the test haplotype and Neanderthals have at least one derived allele = INCREASED Neanderthal ancestry
Test haplotype allele is absent from Neanderthals and polymorphic (present) in Africans = DECREASED Neanderthal ancestry
2) Uses multiple SNPs to capture the signal of Neanderthal ancestry, comparing the divergence between the test haplotype and the Neanderthal sequence to the divergence between the test haplotype and African haplotypes. Basically you expect a given sequence to be either mostly Neanderthal or mostly modern (obviously with large amounts of variation).
-> In a region of the genome in which the test haplotype carries Neanderthal ancestry, we expect the test haplotype to be closer to the Neanderthal sequence than to most modern human sequences; this pattern is reversed outside these regions.
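As a rough illustration of the first class of feature function, the sketch below scores a single biallelic SNP using the two patterns listed above. The function name and the +1/-1/0 return values are hypothetical; an actual CRF learns weights for such features rather than using fixed scores.

```python
def single_snp_feature(test_allele, african_alleles, neanderthal_alleles, ancestral_allele):
    """Score one biallelic SNP: +1 supports Neanderthal ancestry, -1 argues against, 0 is uninformative."""
    test_is_derived = test_allele != ancestral_allele
    africans_all_ancestral = all(a == ancestral_allele for a in african_alleles)
    neanderthal_has_derived = any(a != ancestral_allele for a in neanderthal_alleles)

    # Pattern 1: Africans ancestral; test haplotype and Neanderthals carry a derived allele.
    if test_is_derived and africans_all_ancestral and neanderthal_has_derived:
        return 1

    # Pattern 2: test allele absent from Neanderthals but segregating in Africans.
    african_polymorphic = len(set(african_alleles)) > 1
    if (test_allele not in neanderthal_alleles
            and african_polymorphic
            and test_allele in african_alleles):
        return -1

    return 0


print(single_snp_feature("T", ["C", "C", "C"], ["T", "T"], "C"))  # pattern 1
print(single_snp_feature("T", ["C", "T", "C"], ["C", "C"], "C"))  # pattern 2
```

In a full model, scores like these at neighbouring SNPs would be combined with the multi-SNP divergence features and a transition model so that ancestry is inferred for whole stretches rather than isolated sites.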
16
Tiling path from inferred Neanderthal Haplotypes
Looking at a specific locus on chromosome 9 in European individuals. Red: inferred Neanderthal haplotypes. Blue: resulting tiling path (contigs).
How do you infer Neanderthal haplotypes? Runs of consecutive SNPs with high marginal probability, forming haplotypes at least 0.02 centimorgans long.
CONTIG = set of overlapping DNA segments that together represent a consensus region of DNA.
b) Distribution of contig length: 4,437 Neanderthal contigs, median length = 129 kb.
Figure labels: Chromosome 9 in several Europeans; Average contig length
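The two steps on this slide (calling haplotypes from runs of high-probability SNPs, then merging overlapping haplotypes into a tiling path) can be sketched as follows. The probability cutoff of 0.9 is an assumption chosen for illustration; only the 0.02 cM minimum length comes from the slide, and the function names are hypothetical.

```python
def call_haplotypes(positions_cm, marginals, min_prob=0.9, min_len_cm=0.02):
    """Return (start, end) segments covering runs of consecutive high-probability SNPs."""
    segments = []
    start = prev = None
    for pos, p in zip(positions_cm, marginals):
        if p >= min_prob:
            if start is None:
                start = pos          # a new run begins here
            prev = pos               # extend the current run
        else:
            if start is not None and prev - start >= min_len_cm:
                segments.append((start, prev))
            start = None             # the run is broken by a low-probability SNP
    if start is not None and prev - start >= min_len_cm:
        segments.append((start, prev))
    return segments


def tiling_path(segments):
    """Merge overlapping (start, end) segments from many individuals into contigs."""
    merged = []
    for s, e in sorted(segments):
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


# Made-up marginal probabilities along one haplotype (positions in cM).
print(call_haplotypes([0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06],
                      [0.95, 0.97, 0.99, 0.92, 0.10, 0.95, 0.20]))
# Made-up haplotypes from several individuals (coordinates in kb).
print(tiling_path([(10, 50), (40, 90), (120, 150)]))
```

The merged intervals play the role of the blue contigs in the figure: each one is the union of overlapping red haplotypes found in different individuals.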
17
Maps of Neanderthal Ancestry
Figure legend: European, East Asian, African.
On chromosome 9, the marginal probability of Neanderthal ancestry at a given position in INDIVIDUALS:
one European American (resident of Utah) - red
one East Asian (Han Chinese in Beijing) - green
one sub-Saharan African individual (Luhya in Kenya) - blue
Population maps: AVERAGE data across all European individuals (red) and all East Asian individuals (green) in the study, showing estimates of the proportion of Neanderthal ancestry in non-overlapping 100-kb windows on chromosome 9. At certain locations Neanderthal ancestry is inferred to be as high as 62% (East Asian) and 64% (European).
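Averaging per-site values into non-overlapping windows, as in the population maps, might look like the sketch below. The function name and the input layout (parallel lists of base-pair positions and per-site ancestry estimates) are assumptions for illustration.

```python
from collections import defaultdict


def window_means(positions_bp, values, window_bp=100_000):
    """Average per-site values within non-overlapping windows.

    Returns a dict mapping each window's start coordinate to the mean value in it.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pos, value in zip(positions_bp, values):
        window = pos // window_bp      # index of the window this site falls in
        sums[window] += value
        counts[window] += 1
    return {w * window_bp: sums[w] / counts[w] for w in sorted(sums)}


# Made-up sites: two in the first 100-kb window, one in the second.
print(window_means([10_000, 60_000, 150_000], [0.0, 1.0, 0.5]))
```

Changing `window_bp` to 1,000,000 gives the 1-Mb windows used in the genome-wide plot on the next slide.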
18
Neanderthal Ancestry in 1000 modern genomes
Figure legend: European, East Asian.
For each chromosome, this graph shows the fraction of alleles confidently inferred to be of Neanderthal origin in Europeans (red) and East Asians (green) in non-overlapping 1-Mb windows. Black bars = centromeres. Note that they label the 10-Mb sized windows.
Note the “deserts”: spots on chromosomes with extremely low Neanderthal ancestry. These can probably be explained partially by small population sizes directly after interbreeding and partially by selection.
The largest deserts occur on the X chromosome, which shows an approximately 5-fold reduction in Neanderthal ancestry. This is a region known to be dense in male hybrid sterility genes across many species... High selection against Neanderthal DNA on the X chromosome.
19
Male hybrid sterility
Male donkey + female horse -> mule (infertile)
After crossing two species, offspring are often infertile, keeping the two species distinct.
Evidence for this in humans and Neanderthals? Yes! But it's not the only factor causing selection against the Neanderthal genome.
If male hybrid sterility occurred between humans and Neanderthals, we would expect the responsible genes to be disproportionately expressed in the testes. They test this by defining tissue-specific genes as those expressed at higher levels in one tissue than in others.
Results: ONLY testes-specific genes were enriched in regions of low Neanderthal ancestry.
THEREFORE: Interbreeding of humans and Neanderthals introduced alleles into humans that weren't tolerated, probably in part because they contributed to male hybrid sterility.
20
B statistic
Background selection (B) value “indicates the expected fraction of neutral diversity that is present at a site”
Ranges from 0 to 1
Relatively obscure
Values near 0: almost all diversity removed by selection
Values near 1: little effect of selection
Source: “Widespread Genomic Signatures of Natural Selection in Hominid Evolution” (McVicker et al, 2009)
21
Functionally important regions are deficient in Neanderthal ancestry
OVERVIEW: Here they use the B statistic: low B implies a high density of functionally important elements. The B values are then binned into 5 categories (quintiles).
BINNING: Essentially the data are broken into 5 buckets and each bucket is averaged.
RESULTS: If Neanderthal DNA had no effect on our genome, we would see a flat line. Instead, regions with reduced Neanderthal ancestry are enriched in genes, implying that selection has acted to remove Neanderthal-derived genetic material from the modern genome. The quintile with the highest B value (last quintile) has the highest Neanderthal ancestry of all the quintiles.
Also, selection has been especially strong on the X chromosome, as we already saw in the previous graph. Because Neanderthal ancestry is positively correlated with the B statistic, some of the reduction in Neanderthal ancestry on the X chromosome must be due to selection. (Selection removing Neanderthal DNA from our genome... less Neanderthal ancestry in the more functional regions.)
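The binning step can be sketched as below: sort sites by B value, split them into five equal-sized buckets, and average the Neanderthal ancestry in each. The function name and the toy numbers are made up for illustration and are not values from the paper.

```python
def quintile_means(b_values, ancestry):
    """Mean ancestry within each quintile of B (lowest-B quintile first).

    Assumes at least five sites so that every bucket is non-empty.
    """
    pairs = sorted(zip(b_values, ancestry))   # order sites by B value
    n = len(pairs)
    means = []
    for q in range(5):
        lo = q * n // 5
        hi = (q + 1) * n // 5
        bucket = [a for _, a in pairs[lo:hi]]
        means.append(sum(bucket) / len(bucket))
    return means


# Toy data in which ancestry rises with B, mimicking the trend described above.
toy_b = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
toy_ancestry = [0.00, 0.00, 0.01, 0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.04]
print(quintile_means(toy_b, toy_ancestry))
```

A flat list of means would correspond to the "no effect" expectation; means that increase from the first to the last quintile correspond to the pattern reported on this slide.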
22
Neanderthal GWAS
Genome-Wide Association Studies (GWAS) can be used to associate Neanderthal-derived alleles with modern phenotypes.
Examination of the 5% of genes with the highest Neanderthal ancestry (remember the red/green graphs?!) shows that these genes are involved in keratin filament formation, suggesting that Neanderthal alleles affecting skin and hair may have helped humans adapt to non-African environments.
23
Genome-wide estimates of Neanderthal ancestry
Neanderthal ancestry: East Asian > European
Possible explanation: with a larger European population, selection removed deleterious Neanderthal alleles more effectively.
24
The complete genome sequence of a Neanderthal from the Altai Mountains (January 2014)
Kay Prüfer et al
ALSO a very recent paper!!
A phalanx (bone segment) from the 4th or 5th toe of an adult Neanderthal woman was found in the Altai Mountains and sequenced! The bone was found in Denisova Cave, where bones of a “sister group” to Neanderthals have also been found. DNA sequencing revealed that this toe is NOT Denisovan but Neanderthal.
To sequence it, they had to remove uracil residues resulting from cytosine deamination.
Denisovans = archaic hominins represented by fossil remains found in Denisova Cave in Siberia. According to DNA, a “sister group” to Neanderthals.
25
Phylogenetic relationships of the Altai Neanderthal
They used different mathematical models for making trees. Each tree takes into account a different aspect of the data... EVEN SO, both trees display similar patterns. In both trees, human groups cluster together (blue), Neanderthal groups cluster together (red), and Denisovans are by themselves.
Left tree is Bayesian/phylogenetic: humans and Neanderthals are closer and Denisovans are separate.
Right tree uses the neighbor-joining algorithm, showing differences between purines and pyrimidines: Denisovans and Neanderthals are closer and humans are separate.
Overall, another way to visualize phylogeny.
Disclaimer: this figure is based on dates of divergence of human/chimp DNA sequences, which in turn rely on the human mutation rate (currently a controversial statistic).
Figure labels: Modern humans; Neanderthals
26
Homozygosity
Inheriting identical copies of the same gene from each parent
E.g. Tay-Sachs (from population bottleneck)
Long runs of homozygosity indicate inbreeding
Tay-Sachs is especially prominent in Ashkenazi Jewish populations, which have undergone population bottlenecks. These bottlenecks reduce genetic diversity and therefore increase homozygosity.
This diagram shows a single mutation, but you can imagine that a similar concept applies to the entire genome.
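To extend the single-mutation picture to many sites, here is a small sketch that measures what fraction of genotyped sites fall inside long runs of homozygosity. Measuring run length in number of sites (`min_run`) is a simplifying assumption; real analyses measure runs in genetic distance (cM). The function name and genotypes are made up.

```python
from itertools import groupby


def roh_fraction(genotypes, min_run=3):
    """Fraction of sites lying in homozygous runs of at least `min_run` consecutive sites.

    genotypes: list of (allele1, allele2) tuples at consecutive sites.
    """
    total = len(genotypes)
    in_roh = 0
    # groupby splits the sequence into maximal runs of homozygous / heterozygous sites.
    for is_hom, run in groupby(genotypes, key=lambda g: g[0] == g[1]):
        length = len(list(run))
        if is_hom and length >= min_run:
            in_roh += length
    return in_roh / total


toy_genotypes = [("A", "A"), ("C", "C"), ("G", "G"), ("A", "T"),
                 ("C", "C"), ("G", "G"), ("A", "G"),
                 ("T", "T"), ("T", "T"), ("C", "C"), ("A", "A")]
print(roh_fraction(toy_genotypes, min_run=3))
```

An individual whose parents are closely related would show a much larger fraction than an outbred individual, because long stretches of both chromosome copies trace back to the same recent ancestor.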
27
IBD = Identity by descent
It is important to know that these regions of the genome could also be the same by convergence, but identity usually implies shared ancestry. The length of a segment reflects how many generations have passed since the most recent common ancestor: the more meioses have occurred, the more recombination has happened and the shorter the segment typically is.
28
Indications of Inbreeding in Altai Neanderthal individual
a) Time since the most recent common ancestor for the two alleles of a French, a Denisovan, and the Altai individual, as determined by homozygosity on 40 Mb of chromosome 21.
b) Four possible scenarios of parental relatedness for the Altai Neanderthal. Additional scenarios can be derived from those marked with an asterisk by switching sexes. Because she is female and the X chromosome also displays homozygosity, scenarios with two successive males in the pedigree have been omitted.
c) Fraction of the genome in runs of homozygosity (of length cM) for various populations. The Altai Neanderthal is MUCH higher than other groups, representing higher rates of inbreeding and perhaps a population bottleneck. Any of the scenarios above could account for this amount. They are all a little bit less because????
29
Heterozygosity estimates by group
Heterozygosity = opposite of homozygosity. This graph shows the range of heterozygosity observed in 15 non-African and 10 African individuals. Red bar: even after taking out the recent inbreeding, heterozygosity is still low. This shows that the individual genome sequenced is not an anomaly; rather, we can infer much more about the population.
30
Inference of population size change over time
X axis: time, going backwards. Note that Neanderthal/Denisovan populations decrease while human populations eventually explode.
31
Possible model of gene flow
Shows the direction and magnitude of inferred gene flow events. There is evidence for 3-5 cases of interbreeding among 4 distinct hominin populations, and the real history is probably more complex than this. Dashed lines represent uncertainty. Note the “potential unknown hominin”. This could be a single entity such as Homo erectus, or even a whole subtree.
32
Remember this guy? Look at how far our models have come!
33
Possible model of gene flow
Pretty cool how much we can tell just by a single bone!!
34
Takeaways
Although major challenges exist, it is possible to accurately sequence ancient DNA with modern technology
We can use these methods to compare ancient and modern DNA
All non-African modern-day humans have a small genetic presence of Neanderthal ancestry
We can learn much about ancient populations (e.g. bottlenecks & inbreeding of Altai Neanderthals) by looking solely at one bone fragment!
35
Thank you! Questions?