Presentation transcript: Polymorphism

1 Polymorphism
Polymorphism: when two or more alleles at a locus exist in a population at the same time.
Nucleotide diversity: $\Pi = \sum_{i<j} x_i x_j \pi_{ij}$ considers both the number of differences ($\pi_{ij}$, the proportion of sites that differ between sequences i and j) and allele frequency ($x_i$).

         Sequence               Freq (x)
Seq 1    G A G G T G C A A C    0.4
Seq 2    G A G G A C C A A C    0.5
Seq 3    G A G C T G G A A G    0.1

$\pi_{12} = 0.2$, $\pi_{13} = 0.3$, $\pi_{23} = 0.5$
$\Pi = (0.4)(0.5)(0.2) + (0.4)(0.1)(0.3) + (0.5)(0.1)(0.5) = 0.077$
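The worked example on this slide can be checked with a few lines of Python. This is a minimal sketch: the function names `prop_diff` and `nucleotide_diversity` are illustrative, and the frequencies 0.4, 0.5 and 0.1 are read off the slide's own arithmetic. It sums over each unordered pair of distinct sequences once, which is how the slide arrives at 0.077.

```python
from itertools import combinations

# Sequences and allele frequencies taken from the slide's worked example.
seqs = {
    "Seq 1": ("GAGGTGCAAC", 0.4),
    "Seq 2": ("GAGGACCAAC", 0.5),
    "Seq 3": ("GAGCTGGAAG", 0.1),
}

def prop_diff(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def nucleotide_diversity(seqs):
    """Sum x_i * x_j * pi_ij over each unordered pair of distinct sequences."""
    total = 0.0
    for (s1, x1), (s2, x2) in combinations(seqs.values(), 2):
        total += x1 * x2 * prop_diff(s1, s2)
    return total

diversity = nucleotide_diversity(seqs)
print(round(diversity, 3))  # 0.077
```

The pairwise proportions come out as 0.2, 0.3 and 0.5, matching the three terms in the slide's sum.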

2 In Theory
Under the infinite-sites model: $E(\Pi) = \theta = 4N_e\mu$ = frequency of heterozygotes per nucleotide site

3 Nucleotide diversity is low in humans

4 Polymorphism is also estimated by K
K = number of segregating (variable) sites in a sample of alleles.
K = 3 for:
  ATCCGGCTTTCGA
  ATCCGAATTTCGA
  ATTCGCCTTTCGA
In theory: $E(K) = a\theta = 4N_e\mu a$, where $a = 1 + 1/2 + 1/3 + \dots + 1/(n-1)$ and n is the number of sequences in the sample.
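As a quick check of the slide's example, the sketch below counts segregating sites in the three sequences and computes the harmonic coefficient a. The helper names are illustrative, and dividing K by a to estimate 4Nμ simply rearranges the expectation given above, on a per-sequence rather than per-site scale.

```python
# The three aligned sequences shown on the slide.
seqs = ["ATCCGGCTTTCGA", "ATCCGAATTTCGA", "ATTCGCCTTTCGA"]

def segregating_sites(seqs):
    """Count alignment columns that contain more than one nucleotide."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))

def harmonic_a(n):
    """a = 1 + 1/2 + ... + 1/(n-1) for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))

K = segregating_sites(seqs)
a = harmonic_a(len(seqs))
theta_hat = K / a  # estimate of 4*N*mu for the whole 13-bp sequence

print(K, a, theta_hat)  # 3 1.5 2.0
```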

5 Coalescent Process
[Figure: gene tree with coalescence intervals t2, t3, t4, t5]
$t_m$ is the time for coalescence from m to m-1 sequences.

6 Coalescent Process
[Figure: gene tree with branches labeled a through h]
The genealogy of n sequences has 2(n-1) branches: n are external branches and n-2 are internal.

7 How long will the coalescence process take?
Simplest case: in a diploid population of size N there are 2N gene copies. If we pick two gene copies at random, the probability that the second derives from the same parental copy as the first is 1/(2N). This is the probability that two alleles coalesce in the previous generation. It follows that 1 - 1/(2N) is the probability that two sequences were derived from different sequences in the preceding generation. Therefore, the probability that 2 sequences derived from the same ancestor 2 generations ago (grandparent) is [1 - 1/(2N)] x 1/(2N). It can be shown that the probability that two sequences were derived from the same ancestor t generations ago is:
$\left[1 - \frac{1}{2N}\right]^{t-1} \times \frac{1}{2N} \approx \frac{1}{2N} e^{-t/(2N)}$ (for large N)

8 Consider probability of common ancestry for:
$\left[1 - \frac{1}{2N}\right]^{g-1} \times \frac{1}{2N}$
Because N is in the denominator, the probability will depend on population size.
[Table: probability of common ancestry by generations ago, for N = 5 and N = 10]
It can be shown that the average time back to common ancestry of a pair of genes in a diploid population is 2N generations, and the average time back to common ancestry of all gene copies is 4N generations.
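The probabilities for N = 5 and N = 10 can be tabulated directly from the formula on this slide. The sketch below does that for a handful of generation counts and also sums the distribution numerically to check the stated pairwise average of 2N generations. The choice of generations and the truncation horizon are assumptions made for illustration.

```python
def p_coalesce(g, N):
    """Probability that two gene copies first share an ancestor exactly g generations ago."""
    p = 1.0 / (2 * N)
    return (1 - p) ** (g - 1) * p

def mean_coalescence_time(N):
    """Numerical mean of the distribution, truncated where the tail is negligible."""
    horizon = 200 * N  # assumption: far enough back that the remaining mass is ~0
    return sum(g * p_coalesce(g, N) for g in range(1, horizon + 1))

rows = [(g, p_coalesce(g, 5), p_coalesce(g, 10)) for g in (1, 2, 5, 10, 20)]

print("Generations ago  Prob(N=5)  Prob(N=10)")
for g, p5, p10 in rows:
    print(f"{g:>15}  {p5:9.4f}  {p10:10.4f}")

# Both means should be close to 2N: about 10 and 20 generations.
print(mean_coalescence_time(5), mean_coalescence_time(10))
```

In the smaller population, recent common ancestry is more likely and the probability decays faster with each generation back.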

9 [Figure: time back to common ancestor in a large population vs. a small population]

10 Coalescence with no mutation
The average degree of relationship increases with time. All of the gene copies in a population can be traced back to a single ancestral gene. A population will eventually become monomorphic for one allele or another, with this probability determined by initial allele frequencies.

11 Coalescence with mutation
If each lineage experiences $\mu$ mutations per generation, then the number of base pair differences between two gene copies will be #dif = $2\mu t_{CA}$, where $t_{CA}$ is the time back to their common ancestor. If the average time to coalescence is 2N for two randomly chosen gene copies, then #dif = $2\mu(2N) = 4N\mu$. Therefore, expect the average number of base pair differences between gene copies to be greater in a larger population.
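The arithmetic on this slide is small enough to sketch directly. The mutation rate and population sizes below are hypothetical values chosen only to show the scaling. They do not come from the presentation.

```python
def expected_differences(mu, N):
    """Expected pairwise differences: 2 * mu * t_CA with t_CA = 2N generations."""
    t_ca = 2 * N  # average time back to the common ancestor of two gene copies
    return 2 * mu * t_ca

mu = 1e-3  # hypothetical mutations per lineage (whole sequence) per generation

small = expected_differences(mu, N=1_000)
large = expected_differences(mu, N=10_000)
print(small, large)  # roughly 4 and 40 differences
```

A tenfold larger population gives a tenfold larger expected number of differences, because the two lineages have ten times as long, on average, to accumulate mutations.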

12 Total length of branches of gene tree
I + L = J (internal branches + external branches = total time length)
Now consider mutation among branches during the coalescent process.
$\eta_i + \eta_e = \eta$ (mutations on internal branches + mutations on external branches = total number of mutations in gene tree)
In theory: total number of mutations equals the number of segregating sites (K)
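The link between total branch length and the number of segregating sites can be sketched numerically. The example below relies on a standard coalescent result that is not stated in the slides: for a diploid population of size N, the expected time during which there are m lineages is 4N / (m(m-1)) generations. The values of n, N and mu are hypothetical.

```python
def expected_interval(m, N):
    """Expected generations with m lineages (standard coalescent result, assumed here)."""
    return 4.0 * N / (m * (m - 1))

def expected_tree_height(n, N):
    """Expected time back to the common ancestor of all n sampled copies."""
    return sum(expected_interval(m, N) for m in range(2, n + 1))

def expected_total_length(n, N):
    """Expected total branch length J: each interval is shared by m lineages."""
    return sum(m * expected_interval(m, N) for m in range(2, n + 1))

def harmonic_a(n):
    """a = 1 + 1/2 + ... + 1/(n-1)."""
    return sum(1.0 / i for i in range(1, n))

n, N, mu = 10, 1_000, 1e-4  # hypothetical sample size, population size, mutation rate

J = expected_total_length(n, N)
expected_K = mu * J  # expected mutations on the tree = expected segregating sites

print(expected_tree_height(n, N))    # close to 4N * (1 - 1/n) = 3600
print(J, 4 * N * harmonic_a(n))      # the two agree: J = 4N * a
print(expected_K)                    # equals 4 * N * mu * a
```

Under these assumptions the expected tree height approaches 4N as n grows, and the expected number of mutations on the whole tree is 4Nμa, which matches the expectation for K given earlier.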

13 Testing for Selective Neutrality
Using the difference in estimates of polymorphism to detect deviation from neutrality.
Tajima's Test (1989): $D = \dfrac{\Pi - K/a}{\sqrt{V(\Pi - K/a)}}$, where the denominator is a normalizing factor.
Rationale: $\Pi$ and K are differentially influenced by the frequency of alleles.
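A minimal sketch of the statistic follows. The slide does not give the variance term, so the constants below follow the usual presentation of Tajima (1989) as best recalled and should be checked against the paper before any real use. The function name and the sample values (n = 10, 16 segregating sites) are hypothetical. Both inputs are on a per-sequence scale: the mean number of pairwise differences and the count of segregating sites.

```python
import math

def tajimas_d(n, num_seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, segregating sites K, and mean pairwise differences."""
    if n < 4:
        raise ValueError("the variance term is degenerate for fewer than 4 sequences")
    if num_seg_sites < 1:
        raise ValueError("D is undefined when there are no segregating sites")
    K = num_seg_sites
    a1 = sum(1.0 / i for i in range(1, n))       # the slide's coefficient a
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * K + e2 * K * (K - 1)          # estimated V(Pi - K/a)
    return (mean_pairwise_diff - K / a1) / math.sqrt(variance)

# Hypothetical samples: same K, different mean pairwise difference.
d_low = tajimas_d(n=10, num_seg_sites=16, mean_pairwise_diff=3.888889)
d_high = tajimas_d(n=10, num_seg_sites=16, mean_pairwise_diff=8.0)
print(d_low, d_high)  # negative (about -1.45) and positive, respectively
```

With K fixed, a low mean pairwise difference (many rare variants) pushes D below zero, and a high one (variants at intermediate frequency) pushes it above zero.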

14 $\Pi$ vs. K/a
$\Pi$ > K/a: few alleles at intermediate frequency
$\Pi$ < K/a: many low-frequency, variable alleles
D = 0: neutral prediction
D > 0: balancing selection
D < 0: directional selection

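15 Fu and Li's Test (1993)
Using the difference in the number of mutations on internal and external branches of the gene tree to detect deviation from neutrality.
$D = \dfrac{\eta_i - (a-1)\,\eta_e}{\sqrt{V[\eta_i - (a-1)\,\eta_e]}}$
Rationale: in a neutral gene tree, the expected numbers of mutations on interior versus exterior branches stand in a fixed ratio, $E(\eta_i) = (a-1)\theta$ and $E(\eta_e) = \theta$ with $\theta = 4N_e\mu$, so $\eta_i$ and $(a-1)\eta_e$ are expected to be equal.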

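16 $\eta_i$ vs. $\eta_e$
Excess of internal mutations ($\eta_i$ larger than expected from $\eta_e$): few alleles at intermediate frequency
Excess of external mutations ($\eta_i$ smaller than expected from $\eta_e$): many low-frequency, variable alleles
D = 0: neutral prediction
D > 0: balancing selection
D < 0: directional selection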

17 Gene genealogies under no selection, positive selection, balancing selection, and background selection
No selection: 7 neutral mutations accumulate since the time of the last common ancestor. D = 0

18 Consider the Effects of Selection on Neutral Sites Linked to a Selected Site
Positive selection: neutral variation at linked sites will be eliminated (swept away) as the advantageous allele is quickly fixed in the population. This process is also called hitchhiking.
[Figure: gene genealogy over time]
D < 0

19 Consider the Effects of Selection on Neutral Sites Linked to a Selected Site
Balancing selection: neutral variation at linked sites accumulates during the long period of time that both allele lineages are maintained.
[Figure: gene genealogy over time]
D > 0

20 Consider the Effects of Selection on Neutral Sites Linked to a Selected Site
Background selection: gene lineages become extinct not only by chance, but also because they are linked to deleterious mutations, whose removal eliminates some gene copies.
[Figure: gene genealogy over time]
D < 0

21 Problem: background selection and hitchhiking lead to the same pattern
Background selection and hitchhiking are contrasting processes that lead to the same pattern. How to differentiate? Dramatic reductions in polymorphism suggest hitchhiking; less dramatic reductions suggest background selection.

