2: Population genetics break.


1 2: Population genetics break

2 Population genetics: Introduction and motivation

3 Reference.

4 The big questions of population genetics:
What are the evolutionary forces shaping diversity among individuals within the same species?

5 The language of population genetics

6 Locus: a place on the chromosome where an allele resides.
Allele: A part of DNA at a specific location on the chromosome. A locus is thus a template for the alleles. An allele is an instantiation of a locus. A diploid organism has two alleles at a particular autosomal locus, one from its mother and the other from its father.

7 Allele frequencies

8 Genotype and allele frequencies
Say we have 3 AA, 2 Aa and 1 aa. What is the frequency of the allele A? Answer: there are 6 copies of A in the 3 AA individuals, 2 copies of A in the 2 Aa individuals, and 0 copies of A in the 1 aa individual. Altogether there are 8 copies of A out of 12 alleles in total, so the frequency of A is 8/12 ≈ 0.67.
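The counting argument above can be sketched in a few lines of Python. The function name `allele_frequency` and its parameters are illustrative choices, not part of the slides; the only assumption is that every individual is diploid.

```python
from fractions import Fraction

def allele_frequency(n_AA, n_Aa, n_aa):
    """Frequency of allele A computed from diploid genotype counts."""
    copies_of_A = 2 * n_AA + n_Aa            # each AA carries two A copies, each Aa carries one
    total_alleles = 2 * (n_AA + n_Aa + n_aa)  # every diploid individual carries two alleles
    return Fraction(copies_of_A, total_alleles)

freq_A = allele_frequency(3, 2, 1)
print(freq_A, float(freq_A))
```

Using `Fraction` keeps the result exact (8/12 reduces to 2/3) before converting to a decimal.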

9 Assume there are 11 alleles, 6 “A” and 5 “a”.
The frequency of the “A” allele is 6/11 ≈ 0.55. This observed frequency of “A” in the sample is also the estimate of the allele frequency in the entire population.

10 If n is the sample size, the 95% confidence interval can be approximated by $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$, where $\hat{p}$ is the observed allele frequency.

11 Thus, the probability that the population allele frequency falls within the interval (0.26, 0.84) is approximately 0.95. When n increases, the confidence interval becomes smaller.
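As a rough check of the interval quoted above, here is a minimal sketch assuming the usual normal approximation $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$. The function name `wald_interval` is a label chosen here for illustration.

```python
import math

def wald_interval(p_hat, n, z=1.96):
    """Approximate confidence interval for a proportion (normal approximation)."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Slide example: observed frequency 0.55 in a sample of n = 11 alleles.
low, high = wald_interval(0.55, 11)
print(round(low, 2), round(high, 2))

# A sample 100 times larger gives an interval 10 times narrower.
low_big, high_big = wald_interval(0.55, 1100)
print(round(low_big, 3), round(high_big, 3))
```

With so few alleles the normal approximation is crude, so the numbers are best read as indicative.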

12 2: Population genetics break

13 General homozygosity Genetic drift

14 The general homozygosity for k alleles is defined as $G = \sum_{i=1}^{k} p_i^2$, where $p_i$ is the frequency of allele i.
The general heterozygosity is defined as H = 1-G. G is the probability that two alleles sampled at random from the population, with replacement, are the same allele.

15 Note that the definition of heterozygosity uses only allele frequencies.
Thus, it can be used even for populations that are not in Hardy-Weinberg (HW) equilibrium, or even for non-diploid populations.
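Because G and H depend only on allele frequencies, they can be computed directly from a list of frequencies. The sketch below assumes the standard definition $G = \sum_i p_i^2$; the function names are illustrative.

```python
def homozygosity(freqs):
    """G: probability that two alleles drawn with replacement are the same allele."""
    return sum(p * p for p in freqs)

def heterozygosity(freqs):
    """H = 1 - G."""
    return 1.0 - homozygosity(freqs)

# Two alleles with frequencies 2/3 and 1/3 (the 8/12 example above).
freqs = [2 / 3, 1 / 3]
print(homozygosity(freqs), heterozygosity(freqs))

# k equally frequent alleles give G = 1/k, the smallest homozygosity possible for k alleles.
k = 4
print(homozygosity([1 / k] * k))
```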

16 Population genetics: finite populations.
In real life all populations are finite. For some populations (bacteria), the assumption of infinite size is a good approximation. For others, it is completely unrealistic. HW assumes that the population is infinite. When a population is finite, random genetic drift can take place. Random genetic drift is the random change of allele frequencies. The sources of the random changes are random variation in the number of offspring between individuals and, for diploid sexual organisms, Mendel’s law of segregation.

17 Two mathematical approaches to studying genetic changes in populations:
Deterministic models: HW, selection, mutation, migration.
Stochastic models: drift.

18 Mutation Some “green genes” randomly mutate to “brown genes” (although since any particular mutation is rare, this process alone cannot account for a big change in allele frequency over one generation).

19 Migration Migration (or gene flow): Some beetles with brown genes immigrated from another population, or some beetles carrying green genes emigrated

20 Selection Natural selection: Beetles with brown genes escaped predation and survived to reproduce more frequently than beetles with green genes, so that more brown genes got into the next generation.

21 Genetic drift Genetic drift: When the beetles reproduced, just by chance, more brown genes than green genes ended up in the offspring. In the diagram on the right, brown genes occur slightly more frequently in the offspring (29%) than in the parent generation (25%).

22 Drift An important factor in producing changes in allele frequencies is the random sampling of gametes during reproduction.

23 Niche capacity = 10 plants

24 Random genetic drift has one effect:
Removal of variation by fixation. If no mutations are introduced – after enough time we will all be Cohen (or any other one allele). If mutations are introduced, due to genetic drift, they have a good chance of getting lost. The probabilistic model for the random genetic drift is the random walk…

25 We will see that the rate of removal is inversely proportional to the population size. The effect of random genetic drift is the biggest when small populations are considered. Drift has a small effect Drift has a large effect

26 Mutations introduce variation. Random genetic drift removes variation.
What is the equilibrium between these two? (Diagram labels: Mutations, Variation, Drift.)

27 The neutral theory states that much of the molecular variation in nature is due to mutation and random genetic drift. Selection has a very minor role in shaping allele variability.

28 The neutral theory was tremendously controversial.
Partly because it was difficult to test, and mostly, because it seemed an outrageous claim that most of evolution is due to genetic drift rather than natural selection, as Darwin proposed.

29 2: Population genetics break

30 Modeling genetic drift

31 A computer simulation approach to study genetic drift (Simulating the Wright-Fisher model):
Let N be the number of diploid individuals (Say we have N=20 individuals, with 40 alleles.) Let the two alleles be A and a where the frequency of A is p. (Say p=0.2, so we start with 8 “A” alleles and 32 “a” alleles).

32 To go to the next generation, we randomly choose 2N alleles from the previous generation with replacement (the same allele can be chosen more than once). We repeat this process for many generations.
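The sampling scheme just described can be sketched as a short simulation. This is a minimal illustration using only Python's standard library, not the code used for the slides; the function name `wright_fisher` and its parameters are assumptions made here.

```python
import random

def wright_fisher(n_diploid, p0, generations, seed=None):
    """Return the frequency of allele A in each generation under pure drift."""
    rng = random.Random(seed)
    two_n = 2 * n_diploid
    count_A = round(p0 * two_n)          # e.g. 8 copies of A when N = 20 and p = 0.2
    trajectory = [count_A / two_n]
    for _generation in range(generations):
        p = count_A / two_n
        # Drawing 2N alleles with replacement: each draw is an A with probability p.
        count_A = sum(1 for _draw in range(two_n) if rng.random() < p)
        trajectory.append(count_A / two_n)
    return trajectory

trajectory = wright_fisher(n_diploid=20, p0=0.2, generations=50, seed=1)
print(trajectory[:5])
```

Once the frequency reaches 0 or 1 it stays there, which is the fixation (or loss) behaviour discussed in the surrounding slides. Running the function with several seeds and larger `n_diploid` shows the smaller fluctuations of large populations.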

33 Assumptions
All the individuals in the population have the same fitness (selection does not operate).
The generations are nonoverlapping.
Adult population size is finite and does not change from generation to generation.
Gamete population size is infinite.
The population is diploid (N individuals, 2N alleles).
One locus with two alleles, A1 and A2, with frequencies p and q = 1 - p, respectively.

34 Life cycle in the model (diagram)
n adults: A1 in proportion p, A2 in proportion q.
Random mating produces zygotes: A1A1 ($p^2$), A1A2 ($2pq$), A2A2 ($q^2$).
But only n survive to the next adult stage: A1 in proportion p’, A2 in proportion q’.

35

36 Magnitude of fluctuations depends on population size.
Large population = Small fluctuations. Small population = Large fluctuations. Mean time to fixation or loss depends on population size. Large population = Long time. Small population = Short time.

37

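38 Ex. What is the probability that a particular allele gets a copy into the next generation? Assume N is the population size of diploids. Solution: the allele is missed in each of the 2N draws with probability $1-\frac{1}{2N}$, so it leaves at least one copy with probability $1-\left(1-\frac{1}{2N}\right)^{2N} \approx 1-e^{-1} \approx 0.63$ for large N.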
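A small numerical sketch of this exercise, assuming the sampling scheme of slide 32 (2N independent draws with replacement, each hitting a given allele copy with probability 1/2N). The function name is illustrative.

```python
import math

def prob_leaves_copy(n_diploid):
    """Probability that one particular allele copy is drawn at least once in 2N draws."""
    two_n = 2 * n_diploid
    prob_never_drawn = (1 - 1 / two_n) ** two_n
    return 1 - prob_never_drawn

for n in (1, 10, 1000, 10**6):
    print(n, prob_leaves_copy(n))

# Large-N limit of the expression above.
print(1 - math.exp(-1))
```

The value settles quickly as N grows, so the answer is nearly independent of population size.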

39 Ex. Consider a single hermaphroditic diploid individual that is heterozygous, with genotype Aa. Say this population mates randomly (a strange notion for a single individual, but still). What is the chance that the population is fixed after 1 generation? After 2 generations? After n? (Assume the size of the population remains the same, i.e., 1.) On average, how much time will it take for the population to become fixed? Solution: After 1 generation there is a 0.25 chance of AA, a 0.25 chance of aa, and a 0.5 chance of Aa. Thus, the probability of fixation after 1 generation is 0.5. The probability that the population is still not fixed after n generations is $(1/2)^n$, so the probability that it is fixed by generation n is $1-(1/2)^n$ (0.75 after 2 generations). The time to fixation follows a geometric distribution with success probability 1/2, whose expectation is 2 generations.

40 This exercise shows that the time till fixation is a random variable, and has a specific distribution.
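The distribution of the fixation time in the single-individual exercise can be explored with a short simulation. This is a sketch under the exercise's assumptions (one selfing Aa individual, each offspring allele drawn at random from the parent's two alleles); the names are illustrative.

```python
import random

def generations_until_fixed(rng):
    """Number of generations until a population of one Aa individual becomes AA or aa."""
    genotype = ("A", "a")
    t = 0
    while genotype[0] != genotype[1]:
        # The single offspring receives two gametes, each a random allele of the parent.
        genotype = (rng.choice(genotype), rng.choice(genotype))
        t += 1
    return t

rng = random.Random(0)
times = [generations_until_fixed(rng) for _ in range(20000)]
mean_time = sum(times) / len(times)
fixed_after_one = sum(1 for t in times if t == 1) / len(times)
print(mean_time, fixed_after_one)
```

The sample mean should land near 2 and the fraction fixed after one generation near 0.5, up to sampling noise.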

41 The general homozygosity for k alleles is defined as $G = \sum_{i=1}^{k} p_i^2$
The general heterozygosity is H = 1-G. Our goal is to formulate G(t) and H(t) = 1-G(t), where t is the number of generations. These functions should depend on N, the population size. G(t) should increase with t, and H(t) should decrease.

42 The general homozygosity for k alleles is defined as $G = \sum_{i=1}^{k} p_i^2$
Let G’ be the probability that two alleles drawn at random from the population without replacement are identical by state. G’ also measures homozygosity, but is not the same as G.

43 G’ is an approximation to G, in the following sense.
G is the probability that sampling two gametes with replacement results in gametes with identical states. G’ is the same but without replacement. G can be computed from G’, because the with-replacement draw decomposes into two events: either the second draw picks exactly the same allele copy as the first (probability 1/2N), or it picks a different copy (probability 1-1/2N). In the first event, the two alleles are the same with probability 1. In the second event, the two alleles are the same with probability G’. Thus, $G = \frac{1}{2N} + \left(1-\frac{1}{2N}\right)G'$.

44 For big N, these values are almost the same.
From the math point of view, it is easier to work with G’
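The relation between G and G’ can be checked exactly on a concrete population. The sketch below assumes a population described by the number of copies of each allele (the list `counts`, summing to 2N); exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction

def g_with_replacement(counts):
    """G: two draws with replacement give the same allelic state."""
    total = sum(counts)
    return sum(Fraction(c, total) ** 2 for c in counts)

def g_without_replacement(counts):
    """G': two draws without replacement give the same allelic state."""
    total = sum(counts)
    return sum(Fraction(c * (c - 1), total * (total - 1)) for c in counts)

counts = [8, 32]                 # 8 copies of A and 32 copies of a, so 2N = 40
two_n = sum(counts)
G = g_with_replacement(counts)
G_prime = g_without_replacement(counts)
print(G, G_prime)

# The decomposition G = 1/2N + (1 - 1/2N) G' holds exactly.
print(G == Fraction(1, two_n) + (1 - Fraction(1, two_n)) * G_prime)
```

For large 2N the two quantities differ by roughly 1/2N times the heterozygosity, which is why they are nearly interchangeable.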

45 A recursion formula for G’ (without replacements).
Let G’(t) be G’ in generation t. G’(t+1) is the chance that two alleles drawn without replacement in generation t+1 are the same by state. Take two alleles in generation t+1. Each is a copy of an allele that existed in generation t, and there are two possibilities: either both are copies of the same allele in generation t (i.e., that allele was sampled at least twice in the change from t to t+1), or they are copies of different alleles (i.e., the two alleles are either identical by descent or not). The probability that both are copies of the same allele is 1/2N, and in that case they have the same state with probability 1. If they are copies of different alleles, the probability that they have the same state is G’(t). Putting it all together, we get $G'(t+1) = \frac{1}{2N} + \left(1-\frac{1}{2N}\right)G'(t)$.

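46 Exercise: show that when t approaches infinity, G’(t) approaches 1.
Answer: Let $G^{*} = \lim_{t\to\infty} G'(t)$. Then, taking the limit on both sides of the equation above, we obtain $G^{*} = \frac{1}{2N} + \left(1-\frac{1}{2N}\right)G^{*}$, so $\frac{1}{2N}G^{*} = \frac{1}{2N}$ and $G^{*} = 1$. Conclusion: the homozygosity approaches 1 after many generations.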

47 Solving the equation for G’
Define H’(t) to be 1-G’(t) (H’ is similar to heterozygosity, but without replacement). We get: $H'(t+1) = \left(1-\frac{1}{2N}\right)H'(t)$.

48 Solving the equation for G’
This is a geometric series: $H'(t) = \left(1-\frac{1}{2N}\right)^{t}H'(0)$. The conclusion is that the heterozygosity decreases at an exponential rate that depends on the population size. Since we are talking about discrete organisms, H(t) will eventually become 0.
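To see the exponential decay concretely, the sketch below iterates the drift recursion generation by generation and compares it with the closed form. It assumes the recursion $H'(t+1) = (1-\frac{1}{2N})H'(t)$; the function names are illustrative.

```python
import math

def heterozygosity_by_recursion(h0, n_diploid, generations):
    """Apply H'(t+1) = (1 - 1/2N) H'(t) one generation at a time."""
    h = h0
    for _generation in range(generations):
        h = (1 - 1 / (2 * n_diploid)) * h
    return h

def heterozygosity_closed_form(h0, n_diploid, t):
    """H'(t) = (1 - 1/2N)^t H'(0)."""
    return h0 * (1 - 1 / (2 * n_diploid)) ** t

h_iter = heterozygosity_by_recursion(0.5, 20, 100)
h_formula = heterozygosity_closed_form(0.5, 20, 100)
print(h_iter, h_formula)

# The same number of generations removes far less variation from a larger population.
print(heterozygosity_closed_form(0.5, 2000, 100))
```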

49 Half life (not the computer game)
The half-life is the t that solves $H'(t) = \frac{1}{2}H'(0)$, that is, $\left(1-\frac{1}{2N}\right)^{t} = \frac{1}{2}$. We denote this t by $t_{1/2}$.

50 Solving: $t_{1/2} = \frac{-\ln 2}{\ln\left(1-\frac{1}{2N}\right)}$

51 Half life (not the computer game)
Taylor series of ln(1+x): $\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \dots$ Hence, for very small x we can approximate ln(1+x) by x.

52 Half life (not the computer game)
In other words: for big enough populations, the time it takes for genetic drift to reduce H’ by one-half is proportional to the population size, $t_{1/2} \approx 2N\ln 2 \approx 1.39N$. Example: for a population of one million it takes about 1.39 million generations to reduce H’ by one half. If each generation lasts 20 years, it takes about 28 million years to reduce the genetic variation by half.
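The numbers in this example can be reproduced with a few lines. The sketch assumes the half-life solves $(1-\frac{1}{2N})^{t} = \frac{1}{2}$ and compares the exact solution with the approximation $2N\ln 2$; the function names are illustrative.

```python
import math

def drift_half_life_exact(n_diploid):
    """Solve (1 - 1/2N)^t = 1/2 for t."""
    return -math.log(2) / math.log1p(-1 / (2 * n_diploid))

def drift_half_life_approx(n_diploid):
    """Large-N approximation using ln(1 + x) ~ x."""
    return 2 * n_diploid * math.log(2)

n = 10**6
t_exact = drift_half_life_exact(n)
t_approx = drift_half_life_approx(n)
print(t_exact, t_approx)

# With a 20-year generation time, expressed in millions of years.
print(20 * t_approx / 1e6)
```

`math.log1p` is used for the exact form because it stays accurate when 1/2N is tiny.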

53 Fixation probability Say we have 2N different alleles. Eventually, one of these will be fixed. The probability that it will be allele i is 1/2N. If m alleles are the same, the probability that one of them will be fixed is m/2N. If the initial frequency is p – the fixation probability would also be p.
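The claim that the fixation probability equals the initial frequency can be checked by simulation. This is a rough sketch assuming pure drift with 2N draws with replacement per generation; `fixation_fraction` and its parameters are names chosen here for illustration.

```python
import random

def fixation_fraction(two_n, initial_copies, trials, seed=0):
    """Fraction of simulated populations in which the tracked allele reaches frequency 1."""
    rng = random.Random(seed)
    fixed = 0
    for _trial in range(trials):
        copies = initial_copies
        while 0 < copies < two_n:
            p = copies / two_n
            # Next generation: 2N draws with replacement from the current one.
            copies = sum(1 for _draw in range(two_n) if rng.random() < p)
        if copies == two_n:
            fixed += 1
    return fixed / trials

estimate = fixation_fraction(two_n=20, initial_copies=4, trials=2000)
print(estimate)  # should be close to 4/20 = 0.2, up to sampling noise
```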

54 HW, drift, etc.
Random mating acts on a time scale of 1-2 generations (HW). Genetic drift acts on a time scale of 2N generations. In the short term, random mating changes genotype frequencies much more than drift does. How important genetic drift is in large populations is still debated.

55 2: Population genetics break

56 Modeling mutations

57 As before, to go to the next generation, we choose 2N alleles from the previous generation, but this time without replacement. In each generation there is a probability u for any allele to mutate. We assume a mutation always results in a new allele that was never found in the population. This model is called the infinite-allele model. u is the mutation probability, but is sometimes also called the mutation rate.

58 A recursion formula for H’ (without replacements).
Let H’(t) be H’ in generation t (H’ = 1-G’). H’(t+1) is the chance that two alleles drawn without replacement differ by state. We neglect the chance that both are copies of the same parental allele (that is, we neglect drift). Take two alleles in generation t+1. Both have ancestors in generation t, and there are two possibilities: either the ancestors already differed by state in generation t (probability H’(t)), or they did not (probability 1-H’(t)). If they differed, they still differ, because under the infinite-allele model a mutation always creates a new allele. If they were the same by state in generation t, they can still come to differ through mutation in at least one of them; the chance of this is $1-(1-u)^2$. Hence $H'(t+1) = H'(t) + (1-H'(t))\left[1-(1-u)^2\right]$, or equivalently $G'(t+1) = (1-u)^2\,G'(t)$.

59 Solving the equation for G’
Neglecting $u^2$: $G'(t+1) = (1-u)^2\,G'(t) \approx (1-2u)\,G'(t)$, so (geometric series) $G'(t) \approx (1-2u)^{t}\,G'(0)$.

60 Solving the equation for G’
The conclusion is that the homozygosity decreases at an exponential rate that depends on the mutation rate. Since we are talking about discrete organisms, G(t) will eventually become 0.

61 Half life The half-life is the t that solves $G'(t) = \frac{1}{2}G'(0)$, that is, $(1-2u)^{t} = \frac{1}{2}$. We denote this t by $t_{1/2}$.

62 Half life (not the computer game)
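The time scale of mutation is proportional to 1/u. If $u = 10^{-5}$, it takes on the order of 1/u = 100,000 generations for mutation to reduce the homozygosity by a factor of 2 (more precisely, $t_{1/2} \approx \ln 2/(2u) \approx 35{,}000$ generations).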
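A quick numerical look at the mutation time scale, assuming the mutation-only recursion $G'(t+1) = (1-u)^2 G'(t)$. Under that assumption the halving time works out to about $\ln 2/(2u)$, which is of the same order of magnitude as 1/u; the function name is illustrative.

```python
import math

def mutation_half_life(u):
    """Solve (1 - u)^(2t) = 1/2 for t."""
    return -math.log(2) / (2 * math.log1p(-u))

u = 1e-5
t_half = mutation_half_life(u)
print(t_half)        # halving time in generations
print(1 / u)         # the order-of-magnitude time scale 1/u
print(t_half * u)    # ratio of the two, close to ln(2)/2
```

Halving u doubles the half-life, which is the sense in which the time scale is proportional to 1/u.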

63 2: Population genetics break

64 Modeling genetic drift + mutation

65 If genetic drift removes variation, why does genetic variation exist?
Mutations introduce new variation into the population. What is the relationship between drift and mutation?

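66 A model for mutation. G’ is the probability of getting two identical alleles when drawing without replacement. After one generation, the computation is as before, but there is a chance u that any given allele mutates, so 1-u is the probability that it does not. Two alleles are identical by state only if neither mutated, which has probability $(1-u)^2$, giving $G'(t+1) = (1-u)^2\left[\frac{1}{2N} + \left(1-\frac{1}{2N}\right)G'(t)\right]$.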

67 We are interested in the equilibrium between mutation and drift.
When t approaches infinity, G’(t) approaches a constant between 0 and 1. We want to compute this constant: the probability that two alleles different by origin are identical by state once equilibrium is reached.

68 Let $\hat{G} = \lim_{t\to\infty} G'(t)$ denote the equilibrium value.

69 Simplifying G

70 Simplifying G

71 An approximate solution for G
Assumptions: N is much bigger than u; u is smaller than 1; $u^2 N$ is also very small.

72 A classic formula: $\hat{G} \approx \frac{1}{1+4Nu}$
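A sketch of the algebra leading to the equilibrium homozygosity, assuming the combined recursion is the drift step multiplied by the probability $(1-u)^2$ that neither allele mutates, and writing $\hat{G}$ for the equilibrium value of G’ (this symbol is a notational choice made here). Terms of order $u^2$ and $u/N$ are dropped, in line with the stated assumptions.

```latex
\begin{align*}
\hat{G} &= (1-u)^2\left[\frac{1}{2N} + \left(1-\frac{1}{2N}\right)\hat{G}\right] \\
        &\approx (1-2u)\left[\frac{1}{2N} + \left(1-\frac{1}{2N}\right)\hat{G}\right]
          && \text{neglecting } u^2 \\
\hat{G}\left(2u + \frac{1}{2N} - \frac{u}{N}\right) &\approx \frac{1-2u}{2N} \\
\hat{G}\,(1 + 4Nu - 2u) &\approx 1 - 2u
          && \text{multiplying by } 2N \\
\hat{G} &\approx \frac{1}{1+4Nu}
          && \text{dropping } 2u \text{ next to } 1
\end{align*}
```

The corresponding equilibrium heterozygosity is $1-\hat{G} \approx \frac{4Nu}{1+4Nu}$.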

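73 Intermediate summary
Drift only: $G'(t+1) = \frac{1}{2N} + \left(1-\frac{1}{2N}\right)G'(t)$
Mutation only: $G'(t+1) = (1-u)^2\,G'(t)$
Both: $G'(t+1) = (1-u)^2\left[\frac{1}{2N} + \left(1-\frac{1}{2N}\right)G'(t)\right]$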

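74 Computing delta H Drift only: $\Delta H' = H'(t+1) - H'(t) = -\frac{1}{2N}H'(t)$

75 Computing delta H Mutation only: $\Delta H' = \left[1-(1-u)^2\right](1-H'(t)) \approx 2u\,(1-H'(t))$

76 Computing delta H Mutation + Drift: $H'(t+1) = 1 - (1-u)^2\left[1 - \left(1-\frac{1}{2N}\right)H'(t)\right]$

77 Computing delta H Mutation + Drift: neglecting $u^2$ and $u/N$, $\Delta H' \approx -\frac{1}{2N}H'(t) + 2u\,(1-H'(t))$

78 Summary with delta H’ Drift only: $\Delta H' = -\frac{1}{2N}H'$. Mutation only: $\Delta H' \approx 2u\,(1-H')$. Both: $\Delta H' \approx -\frac{1}{2N}H' + 2u\,(1-H')$.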

79 Delta H for the model of drift+mutation
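$\Delta H' \approx -\frac{1}{2N}H'$ (drift term, always negative) $+\; 2u\,(1-H')$ (mutation term, always positive). Equilibrium is reached when $\Delta H' = 0$, that is, when $H' = \frac{4Nu}{1+4Nu}$.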
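The balance between the two terms can be illustrated by iterating the combined recursion until it stops changing. This is a minimal sketch assuming $G'(t+1) = (1-u)^2\left[\frac{1}{2N} + (1-\frac{1}{2N})G'(t)\right]$ and comparing the result with the approximation $4Nu/(1+4Nu)$; function and parameter names are illustrative.

```python
def equilibrium_heterozygosity(n_diploid, u, h0=0.0, tol=1e-12, max_generations=10_000_000):
    """Iterate the drift + mutation recursion for H' = 1 - G' until it converges."""
    h = h0
    for _generation in range(max_generations):
        g = 1 - h
        g_next = (1 - u) ** 2 * (1 / (2 * n_diploid) + (1 - 1 / (2 * n_diploid)) * g)
        h_next = 1 - g_next
        if abs(h_next - h) < tol:
            return h_next
        h = h_next
    return h

n_diploid, u = 1000, 1e-4
h_equilibrium = equilibrium_heterozygosity(n_diploid, u)
h_formula = 4 * n_diploid * u / (1 + 4 * n_diploid * u)
print(h_equilibrium, h_formula)

# Starting from a fully heterozygous population leads to the same equilibrium.
print(equilibrium_heterozygosity(n_diploid, u, h0=1.0))
```

The small gap between the iterated value and the formula reflects the neglected $u^2$ and $u/N$ terms.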

