1
Human Evolution: Searching for Selection Andrew Shah Algorithms in Biology 374 Spring 2008
2
Overview Given a DNA sequence, how do we know when natural selection has occurred? Different methods of answering this question How does having the entire genome available change this?
3
Natural Selection Introduction
4
Natural Selection Introduction
5
Natural Selection Introduction
6
Natural Selection What sort of artifacts would this leave within the genome? Introduction
7
Natural Selection Introduction The frequency of the long gene increases from one generation to the next. It eventually reaches 100%, or fixation.
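The rise to fixation can be sketched with a simple recursion. This is a minimal illustration, not part of the original slides: it assumes a haploid population, a fixed selective advantage `s`, and no drift, and the starting frequency and `s` value are arbitrary choices.

```python
def selection_trajectory(p0, s, generations):
    """Deterministic haploid selection (assumed model).

    Carriers of the favoured allele have relative fitness 1 + s,
    everyone else has fitness 1. Returns the frequency in each generation.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Frequency after one round of selection: weight carriers by (1 + s)
        p = p * (1 + s) / (1 + p * s)
        freqs.append(p)
    return freqs


traj = selection_trajectory(p0=0.125, s=0.1, generations=200)
print(f"start={traj[0]:.3f}  gen 50={traj[50]:.3f}  gen 200={traj[200]:.6f}")
```

Under these assumptions the frequency climbs every generation and approaches 1 without ever decreasing, which is the pattern the slide describes.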
8
Natural Selection Gene Perspective Introduction Same process at the gene level Let the yellow dot represent the advantageous allele It begins at a small frequency (.125 in this case)
9
Natural Selection Gene Perspective Introduction During selection The allele has risen in frequency! Because of linkage, the nearby alleles have also risen in frequency
10
Natural Selection Gene Perspective Introduction The allele has reached fixation! As time goes on the nearby genes will slowly begin to reach fixation as well Diversity has been lost
11
Natural Selection Gene Perspective Introduction Effect of Selection on the Genome Next Challenge: How does this effect differ from what we would see without selection?
12
Neutral Theory (N.T.) Problem: Need to distinguish natural selection Therefore: Need a null hypothesis Solution: Create model that approximates neutral evolution Introduction Kimura, 1960s
13
N.T. & Genetic Drift Most variation is neutral with respect to selection Therefore most changes in frequency are due to genetic drift Introduction
14
N.T. & Genetic Drift A neutral gene has an equal probability of increasing or decreasing in frequency in the next generation Introduction
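A small simulation can illustrate neutral drift. This is a sketch under assumed conditions (a Wright-Fisher style population where every generation resamples a fixed number of gene copies); the function names, population size, and replicate count are illustrative choices, not from the slides.

```python
import random


def drift_until_absorbed(p0, copies, rng):
    """Resample `copies` gene copies per generation until the allele
    is lost (returns 0.0) or fixed (returns 1.0)."""
    p = p0
    while 0.0 < p < 1.0:
        # Each copy in the next generation carries the allele with probability p
        count = sum(rng.random() < p for _ in range(copies))
        p = count / copies
    return p


def fixation_fraction(p0, copies, replicates, seed=0):
    """Fraction of independent replicate populations in which the allele fixes."""
    rng = random.Random(seed)
    fixed = sum(drift_until_absorbed(p0, copies, rng) == 1.0
                for _ in range(replicates))
    return fixed / replicates


print(fixation_fraction(p0=0.5, copies=20, replicates=500))
```

Because the allele has no fitness effect in this model, the expected change in frequency per generation is zero, and the fraction of runs ending in fixation should sit near the starting frequency.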
15
N.T. & Mutation New alleles are introduced at a constant rate (at a particular point) To think about: How will this help us search for selection? Introduction
16
N.T. & Mutation Introduction
17
N.T. & Mutation Introduction
18
N.T. & Mutation Introduction
19
N.T. & Recombination Recombination occurs at a near-constant rate at a given position Introduction
20
Testing the N. T. How would natural selection differ from these assumptions? Introduction
21
“Positive Natural Selection in the Human Lineage” P. C. Sabeti, S. F. Schaffner, B. Fry, J. Lohmueller, P. Varilly, O. Shamovsky, A. Palma, T. S. Mikkelsen, D. Altshuler, E. S. Lander
22
Testing for Selection Sabeti et al. Review of the current state of genomic tests for selection Five statistical tests which use divergence from neutral theory to test for selection Ideas? Functional Alteration, Decreased Diversity, High Derived Alleles, Population Differences, Long Haplotypes
23
Sabeti et al. I. Functional Alteration Get a section of genome, and compare synonymous vs. non-synonymous mutations between two species Definition of synonymous mutation
24
I. Functional Alteration Sabeti et al. Silent/ Synonymous Non-Synonymous
25
I. Functional Alteration Sabeti et al. Long time scale, because it is an interspecies metric Limited value--only finds ongoing or recurrent selection Use a Ka/Ks statistical test, or McDonald-Kreitman
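The synonymous versus non-synonymous comparison can be sketched as a codon-by-codon count. This is a simplified illustration, not a full Ka/Ks estimator: it assumes two already aligned, in-frame coding sequences with no gaps, uses the standard genetic code, and counts each differing codon once without normalising by the number of synonymous and non-synonymous sites. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, non_synonymous) counts of differing codons."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and in frame")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1      # same amino acid: silent change
        else:
            nonsyn += 1   # amino acid changed
    return syn, nonsyn


# CTT -> CTC keeps leucine; AAA -> AGA changes lysine to arginine
print(count_codon_differences("ATGCTTAAA", "ATGCTCAGA"))
```

A real Ka/Ks analysis would divide these counts by the number of possible synonymous and non-synonymous sites and correct for multiple substitutions at the same position.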
26
II. Decreased Diversity Sabeti et al. Way of detecting a selective sweep Requires you know ancestral gene, derived genes A derived gene is one that is a descendant of the ancestral one; it can be inferred using comparison to other species
27
II. Decreased Diversity Sabeti et al. The two small bars represent mutations. They are derived genes of the blue ancestor gene.
28
II. Decreased Diversity Sabeti et al. After the selective sweep the frequency of the derived alleles has jumped vis-a-vis the ancestral gene
29
II. Decreased Diversity Sabeti et al. A real example: derived alleles in red
30
II. Decreased Diversity Sabeti et al. Key idea: need to have ancestral genes present The genes must not have reached fixation! The pattern will be that of normal diversity of alleles but with skewed distribution of variation Statistical Tests: Tajima’s D, Fu and Li’s D*
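Tajima's D compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites. The sketch below uses the standard published formula; the input format (a list of equal-length haplotype strings) and the function name are assumptions for illustration.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length, aligned haplotype strings."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences for a non-zero variance")
    # S: number of sites where more than one allele is present
    seg_sites = sum(len(set(column)) > 1 for column in zip(*haplotypes))
    if seg_sites == 0:
        raise ValueError("no segregating sites")
    # pi: mean number of differences over all pairs of sequences
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


rare = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA", "AAAA"]   # only singletons
balanced = ["AAAA"] * 3 + ["TTTT"] * 3                    # intermediate frequencies
print(tajimas_d(rare), tajimas_d(balanced))
```

An excess of rare variants pulls D below zero and an excess of intermediate-frequency variants pushes it above zero; values near zero are what the neutral model predicts.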
31
III. New Alleles (AKA High Frequency of Derived Alleles) Another technique for detecting selective sweep Gene ‘hitch-hiking’ Limited diversity because of fixation Key idea: low frequency of new genes, but high diversity of rare alleles Sabeti et al.
32
III. New Alleles (AKA High Frequency of Derived Alleles) Sabeti et al. Gene has reached fixation Low diversity in this region compared to other regions
33
III. New Alleles (AKA High Frequency of Derived Alleles) Sabeti et al. Next mutations slowly increase the diversity Because they are all new the frequency remains low
34
III. New Alleles (AKA High Frequency of Derived Alleles) Sabeti et al. As more time progresses, any pre-selective sweep alleles die out, and diversity is replaced by many derived alleles
35
III. New Alleles (AKA High Frequency of Derived Alleles) Sabeti et al. Real world example: Red dots indicate rare alleles
36
III. New Alleles (AKA High Frequency of Derived Alleles) Sabeti et al. Key Idea: The genes will have reached fixation and decreased diversity The diversity will all be in the form of rare alleles (because they are new) Statistical Test: Fay and Wu’s H
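Fay and Wu's H can be sketched from the counts of derived alleles at each segregating site. This is the unnormalised form of the statistic; the input format (one derived-allele count per site, with sample size `n`) and the function name are assumptions for illustration.

```python
def fay_wu_h(derived_counts, n):
    """Unnormalised Fay and Wu's H.

    derived_counts: number of sampled chromosomes carrying the derived
                    allele at each segregating site (1 .. n-1).
    n:              number of sampled chromosomes.
    """
    if n < 2:
        raise ValueError("need at least 2 chromosomes")
    if any(k < 1 or k >= n for k in derived_counts):
        raise ValueError("each count must be between 1 and n - 1")
    scale = 2 / (n * (n - 1))
    # theta_pi: mean pairwise differences, weights sites by k * (n - k)
    theta_pi = scale * sum(k * (n - k) for k in derived_counts)
    # theta_H: weights sites by k^2, so high-frequency derived alleles dominate
    theta_h = scale * sum(k * k for k in derived_counts)
    return theta_pi - theta_h


print(fay_wu_h([9, 9, 8], n=10))   # derived alleles at high frequency
print(fay_wu_h([1, 1, 2], n=10))   # derived alleles at low frequency
```

The statistic is driven negative when many derived alleles sit at high frequency, which is why ancestral states inferred from an outgroup species are required to compute it.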
37
Comparing Methods The difference between decreased diversity and increased frequency of new alleles? Sabeti et al. Vs.
38
IV. Population Differences Requires population split Disproportionate shift in gene frequencies Limited utility Sabeti et al.
39
IV. Population Differences Sabeti et al.
40
IV. Population Differences Sabeti et al. Tall Tree Island
41
IV. Population Differences Sabeti et al.
42
IV. Population Differences Sabeti et al. Two separated populations--specific gene will show disproportionate shift in frequency with respect to the other genes Limited to cases where there are two populations Statistical Test: F(st), P(excess)
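The F(st) idea can be sketched for a single two-allele locus in two populations. This is a minimal illustration that assumes equal population sizes and known allele frequencies; real estimators correct for sample size, and the function name is hypothetical.

```python
def fst_two_populations(p1, p2):
    """Wright's F_ST for one biallelic locus, equal-sized populations assumed.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    # Mean expected heterozygosity within each population
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity if the two populations were pooled
    p_mean = (p1 + p2) / 2
    h_total = 2 * p_mean * (1 - p_mean)
    if h_total == 0:
        return 0.0  # same allele fixed in both populations: no differentiation
    return (h_total - h_within) / h_total


print(fst_two_populations(0.5, 0.5))
print(fst_two_populations(0.9, 0.1))
```

A locus under selection in only one population would show a value far above the genome-wide background, which is the "disproportionate shift" the slide refers to.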
43
V. Long Haplotypes Based on Linkage Disequilibria (LD) Long Haploblock and high frequency Sabeti et al.
44
V. Long Haplotypes Under neutral conditions, a new allele has low frequency and high linkage disequilibrium Sabeti et al.
45
V. Long Haplotypes As time goes on and the neutral allele increases in frequency recombination erodes the L.D. Sabeti et al.
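The erosion of L.D. by recombination can be sketched with the textbook relation D_t = D_0 (1 - r)^t. This is an illustration under simple assumptions (two loci, random mating, constant recombination fraction `r` per generation); the function names are hypothetical.

```python
def ld_coefficient(p_ab, p_a, p_b):
    """Linkage disequilibrium D: observed haplotype frequency minus
    the frequency expected if the two alleles were independent."""
    return p_ab - p_a * p_b


def ld_after(d0, recomb_rate, generations):
    """Expected D after a number of generations of recombination."""
    return d0 * (1 - recomb_rate) ** generations


d0 = ld_coefficient(p_ab=0.5, p_a=0.5, p_b=0.5)   # alleles always found together
for t in (0, 10, 100, 1000):
    print(t, ld_after(d0, recomb_rate=0.01, generations=t))
```

A neutral allele needs many generations to reach high frequency, so by then its L.D. has decayed; an allele that is both common and still surrounded by a long haplotype is the unusual combination this test looks for.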
46
V. Long Haplotypes Sabeti et al.
47
Genome-Wide Scanning Better estimation of background rate Helps to confirm previous studies Suggests future areas of research MORE POWER Sabeti et al.
48
Genome-Wide Scanning SNP: Single Nucleotide Polymorphisms (excludes other types of mutations) that occur at > 1% frequency SNPs are the basis of many genome wide analyses Sabeti et al.
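The 1% frequency threshold can be sketched as a simple filter. The data layout (a dictionary mapping a site name to an alternate-allele count and a total chromosome count) and the site names are made up for illustration.

```python
def minor_allele_frequency(alt_count, total):
    """Frequency of whichever allele is rarer at a two-allele site."""
    freq = alt_count / total
    return min(freq, 1 - freq)


def common_snps(variants, threshold=0.01):
    """Keep sites whose minor allele frequency exceeds the threshold."""
    return [name for name, (alt_count, total) in variants.items()
            if minor_allele_frequency(alt_count, total) > threshold]


# Hypothetical counts: (alternate allele count, chromosomes sampled)
variants = {"site_a": (5, 1000), "site_b": (50, 1000), "site_c": (995, 1000)}
print(common_snps(variants))
```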
49
“Forces Shaping the Fastest Evolving Regions in the Human Genome” K. S. Pollard, S. R. Salama, B. King, A. D. Kern, T. Dreszer, S. Katzman, A. Siepel, J. S. Pedersen, G. Bejerano, R. Baertsch, K. R. Rosenbloom, J. Kent, D. Haussler
50
Background Exploits the very recent sequencing of the chimp and human genomes Uses the rate of allele replacement as test for selection Assumption is that highly changing parts of the genome have been under selective pressure Pollard et al.
51
Idea Take chimp and mouse genome, find common regions Compare these regions to human genome Pollard et al.
52
Method Part I First half: Find conserved regions. Use sequence tests to look for regions of 100bp with 96% similarity Pollard et al.
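The window search can be sketched as a sliding-window identity scan. This is a simplified stand-in for the paper's pipeline: it assumes two pre-aligned sequences of equal length with no gaps, and the function name is hypothetical. The defaults mirror the 100bp and 96% figures on the slide.

```python
def conserved_windows(seq_a, seq_b, window=100, min_identity=0.96):
    """Return start positions of windows whose fraction of matching
    positions is at least `min_identity`."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    starts = []
    for start in range(len(seq_a) - window + 1):
        matches = sum(a == b for a, b in
                      zip(seq_a[start:start + window], seq_b[start:start + window]))
        if matches / window >= min_identity:
            starts.append(start)
    return starts


# Toy example with a short window so the result is easy to check by eye
print(conserved_windows("AAAAAAAAAA", "AAAAATTTTT", window=5, min_identity=0.8))
```

Overlapping hits would normally be merged into longer conserved blocks before the next step; that merging is left out here to keep the sketch short.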
53
Results Part I
54
Conclusion: These areas represent genes with deep functionality
55
Method Part II Pollard et al. Search human genome for conserved regions
56
Method Part II Pollard et al. For every region that doesn’t match up, label Human Accelerated Region
57
Formal Description Pollard et al.
58
Results Part II Found 202 Human Accelerated Regions in total These were regions where there had been rapid evolution in the past 5 million years But evolution doesn’t mean selection Pollard et al.
59
Possible Explanations Relaxation of negative selection -- ruled out because the neutral rate is slower than the observed substitution rate for 201/202 HARs Natural selection Sudden change in mutation rate Pollard et al.
60
But was it Selection? Pollard et al.
61
A Digression Biased Gene Conversion: Tendency to replace mismatched nucleotides with GC In all but two of the HARs there was no evidence of a selective sweep but significant evidence of GC favored replacement Pollard et al.
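A GC-favoured replacement pattern can be checked by counting substitution directions. This sketch assumes an inferred ancestral sequence aligned to the human sequence with no gaps; "weak" means A or T and "strong" means G or C. The function name and example sequences are illustrative.

```python
WEAK = set("AT")
STRONG = set("GC")


def weak_strong_counts(ancestral, derived):
    """Return (weak_to_strong, strong_to_weak) substitution counts."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    w_to_s = s_to_w = 0
    for old, new in zip(ancestral, derived):
        if old in WEAK and new in STRONG:
            w_to_s += 1   # A/T replaced by G/C
        elif old in STRONG and new in WEAK:
            s_to_w += 1   # G/C replaced by A/T
    return w_to_s, s_to_w


print(weak_strong_counts("AATTGC", "GCTTAT"))
```

A strong excess of weak-to-strong changes in a region is the signature expected from biased gene conversion rather than from a selective sweep.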
62
A Digression New Paper suggests BGC hotspots change between species Conserved areas may suddenly become a BGC hotspot, explaining the HAR’s high BGC rates Adaptation or biased gene conversion: Extending the null hypothesis of molecular evolution, Galtier & Duret 2007 Pollard et al.
63
General Implications Illustrates utility of genome wide approaches-- by using the full genome to establish a background rate, signals stand out from noise Weaknesses: approach did not take into account failure to meet the assumptions of neutral theory (mutation rate) Pollard et al.
64
“Global Landscape of Recent Inferred Darwinian Selection for Homo sapiens” E. Wang, G. Kodama, P. Baldi, and R. K. Moyzis
65
Background Ever growing catalog of SNPs for human populations SNP data can be used to construct haplotype maps Can screen whole genome for haplotype outliers Wang et al.
66
Idea Take only homozygotes Bin the alleles together Calculate the L.D. for each allele Wang et al.
67
Idea Wang et al.
68
Formalized Description Wang et al.
69
Description of the Formalized Description Wang et al. Expected decay of LD for an allele of a specific frequency
70
Description of the Formalized Description Wang et al.
71
Description of the Formalized Description Wang et al. Selective sweep will be more resistant to decay
72
Description of the Formalized Description Wang et al. Normalize with respect to the sigmoidal curve
73
Advantages of Method By using the whole genome can track not only L.D. but also the exponential decay of L.D. over distance. This helps to distinguish selective sweeps from other demographic shifts such as bottlenecks Wang et al.
74
Results Wang et al.
75
Results Wang et al. “Darwin’s Fingerprint”: Using different datasets from different populations, certain areas show consistent evidence of selection
76
Discussion Wang et al. Compare regions to known gene functions Six groups predominate Test was well designed Limited detection: Genes can't be at fixation
77
Overall Conclusions It all comes down to statistics. What are the null assumptions? What are the alternate assumptions? Genome-wide scans improve by allowing us to exploit this elegant statistical method in new ways Improved data for null hypothesis Increased volume of potential candidates Wang et al.
78
Thank You!