Effective Population Size


1 Effective Population Size
Real populations don’t satisfy the Wright-Fisher model. In particular:
– Real populations exhibit reproductive structure, due to either geography or societal constraints.
– The number of descendants an individual leaves in a generation depends on many factors (health, disease, etc.), rather than following the model’s implicit Poisson distribution.
– Population size isn’t fixed; it changes over time.
6/13/2015 Comp 790 – Continuous-Time Coalescence

2 Sanity Check
When the Wright-Fisher model, or the basic coalescent, is used to model a real population, the population size (2N) cannot be taken literally.
For example, many human genes have an MRCA less than 200,000 years ago. At one generation per 20 years, that is 200,000/20 = 10,000 generations.
Recall that the average time to the MRCA is about 2 in population-scaled time, where one unit is 2N generations. Equating 2 × 2N generations with 10,000 generations gives an effective population size of 2N = 10,000/2 = 5,000, so N = 2,500.
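
The arithmetic on this slide can be checked with a short sketch. The function name and its parameters are illustrative and not part of the original slides; it assumes, as the slide does, that the mean time to the MRCA is 2 in units of 2N generations.

```python
def effective_size_from_mrca(mrca_years, years_per_generation, mean_tmrca_scaled=2.0):
    """Return (2N, N) implied by an MRCA age.

    Assumes the mean time to the MRCA equals `mean_tmrca_scaled`
    in units of 2N generations (the slide uses 2).
    """
    generations = mrca_years / years_per_generation
    two_n = generations / mean_tmrca_scaled
    return two_n, two_n / 2


two_n, n = effective_size_from_mrca(200_000, 20)
print(two_n, n)  # 5000.0 2500.0
```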

3 Effective Population Size
Without an estimate of an MRCA, one can still use coalescence to find the effective population size.
Recall that for the discrete coalescent, the expected time for two genes to find an MRCA is E(T_2) = 2N. Thus, the effective size is N = E(T_2)/2.
This equation would be applied after tracing the paths of many gene pairs. Here E(T_2) is measured in actual generations rather than in the normalized time of the continuous coalescent (where t = 1.0 corresponds to 2N generations).
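
As a rough illustration of estimating N from pairwise coalescence times, the sketch below simulates gene pairs in a haploid Wright-Fisher population of 2N genes and averages their coalescence times. The function names, the seed, and the number of pairs are assumptions made for this example; in practice the times would come from data rather than simulation.

```python
import random


def pair_coalescence_time(two_n, rng):
    """Generations back until two genes pick the same parent.

    Each generation both genes choose a parent uniformly at random
    among the `two_n` genes of the previous generation.
    """
    t = 1
    while rng.randrange(two_n) != rng.randrange(two_n):
        t += 1
    return t


def estimate_n(two_n, num_pairs=10_000, seed=0):
    """Estimate N as mean(T_2) / 2 over many simulated gene pairs."""
    rng = random.Random(seed)
    total = sum(pair_coalescence_time(two_n, rng) for _ in range(num_pairs))
    return (total / num_pairs) / 2


# With 2N = 50 the estimate should land near N = 25.
print(estimate_n(50))
```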

4 Moran Model
In 1958 Moran proposed an alternative to the Wright-Fisher model in which reproductive generations overlap.
The central idea is that each epoch comprises two events: the loss of one gene and its replacement by a copy of another.
This rules out multiple coalescent events between consecutive epochs.

5 Moran Formulation
The probability that 2 genes share a common ancestor in the previous epoch, P(T_2 = 1), is:
P(T_2 = 1) = 1 / C(2N, 2) = 1 / (N(2N - 1))
because only one of the C(2N, 2) = N(2N - 1) possible pairs has a common ancestor.
This gives a geometric distribution with parameter 1/(N(2N - 1)) and a natural time scale of N(2N - 1) epochs (compared to 2N generations for the Wright-Fisher model).
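
The N(2N - 1) time scale can be checked with a small simulation. This is only a sketch under assumptions: it models one epoch as a uniformly chosen (parent, child) pair of distinct genes, and the function names, seed, and trial count are invented for the example.

```python
import random


def moran_pair_coalescence_time(num_genes, rng):
    """Count epochs back until two tracked lineages meet in one parent."""
    a, b = 0, 1  # slots currently holding the two sampled lineages
    epochs = 0
    while a != b:
        epochs += 1
        # One epoch: the gene in slot `child` is a fresh copy of `parent`.
        parent, child = rng.sample(range(num_genes), 2)
        if a == child:
            a = parent
        elif b == child:
            b = parent
    return epochs


def mean_coalescence_time(n, trials=20_000, seed=1):
    """Average pairwise coalescence time, in epochs, for 2n genes."""
    rng = random.Random(seed)
    total = sum(moran_pair_coalescence_time(2 * n, rng) for _ in range(trials))
    return total / trials


n = 3
print(mean_coalescence_time(n), "vs. expected", n * (2 * n - 1))
```

The two lineages merge only when one is the child and the other is the parent, which happens for 2 of the 2N(2N - 1) ordered pairs, matching the probability 1/(N(2N - 1)).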

6 Moran Use
When adjusted for the difference in time scale, the basic “continuous” coalescent holds for the Moran model as well.
The Moran model often leads to more tractable computations than the Wright-Fisher model.
The basic “continuous coalescent” is robust to the actual population model, whether haploid or diploid, Wright-Fisher or Moran. It is therefore commonly used as a first-order approximation for making estimates about population structure, such as how many variations one should expect in a sample of a given size and how long such divergences have existed.

7 Dirty Details
Thus far, we’ve considered very simple, admittedly oversimplified, models of biological and genetic processes.
Next we’ll discuss many of the biological realities that the coalescent model either crudely approximates or entirely ignores.
We also want to move from our simple gene-centric view to one of a more complete organism.

8 Terminology
Gene: A unit of information transferred from one generation to the next.
Allele: An alternative form of a gene; information that comes in two or more forms.
SNP (Single Nucleotide Polymorphism): A position in a DNA sequence at which more than one of the 4 nucleotides (A, C, G, T) is found. SNPs are one type of allele.
Haplotype: A subsequence of DNA that includes only positions known to vary (SNPs).
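
To make the SNP and haplotype definitions concrete, here is a minimal sketch that finds the varying positions in a set of aligned sequences and keeps only those positions. The sample sequences and function names are made up for illustration, and the sequences are assumed to be aligned and of equal length.

```python
def snp_positions(sequences):
    """Indices at which aligned, equal-length sequences are not all identical."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


def haplotypes(sequences):
    """Reduce each sequence to its nucleotides at the SNP positions."""
    sites = snp_positions(sequences)
    return ["".join(seq[i] for i in sites) for seq in sequences]


samples = ["ACGTAC", "ACCTAC", "ACGTTC"]  # hypothetical aligned sequences
print(snp_positions(samples))  # [2, 4]
print(haplotypes(samples))     # ['GA', 'CA', 'GT']
```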

9 Causes of Genetic Variation
Mutation: Changes in the genetic material of an organism; events that actually modify genes, potentially generating new alleles.
Recombination: A process in which new gene combinations are introduced.
– Crossovers, gene conversion, lateral gene transfer
Structural Rearrangement: Modifications that affect the number of existing gene copies and their relative ordering.
– Insertions, deletions, inversions

10 Mutations
There are many ways of altering a gene, some common and some rare:
– Environmental exposure (radiation, chemicals, etc.)
– Random events (faulty DNA replication, other malfunctions of the biochemical machinery)
Many mutations affect the cells of higher organisms without genetic ramifications (mutations of the so-called somatic cells), but they may still be important to the organism (e.g., by leading to cancer).
Mutations of the germline (gamete) cells are those of genetic interest because they impact the life of genes, as opposed to that of their protective organism.

11 Sequence Organization
The DNA sequence is broken into several independent segments organized into structures called chromosomes.
Chromosomes vary between different organisms. The DNA molecule may be circular or linear, and can contain from 10,000 to 1,000,000,000 nucleotides.
Simple single-cell organisms (prokaryotes, cells without nuclei, such as bacteria) generally have smaller circular chromosomes, although there are many exceptions.
More complicated cells (eukaryotes, with nuclei) have linear DNA molecules that are broken into segments and wound around special proteins. The aggregates are called chromosomes.

12 Monoploid Number
The number of fragments that the DNA is broken into determines the number of distinct chromosomes. This number is called the monoploid number.

Organism      Unique Chromosomes
Human         23
Chimpanzee    24
Mouse         20
Dog           39
Horse         32
Donkey        31
Hare          23

13 Diploidy and Polyploidy
Having only one copy of the DNA is a risky proposition, since the loss of a single functional gene could lead to a bad outcome.
Evolution has addressed this obvious shortcoming by incorporating a mostly redundant copy of the entire sequence in most cells.
The haploid number is the number of chromosomes in a gamete of an individual.
Nearly all mammals are diploid and receive a homologous sequence from each parent.
Many plants carry more than 2 copies of their sequence; 4 and 8 are typical, and the number can vary between subspecies.

14 Crossover Recombination
In the formation of gametes (sperm and ova), homologous DNA strands are combined in a process called crossover.
This effectively combines the prefix of one sequence with the suffix of another.
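
The prefix-plus-suffix view of a single crossover can be sketched on strings. The function name, the `cut` parameter, and the toy sequences are illustrative assumptions; real crossovers act on homologous chromosomes rather than short strings.

```python
def crossover(seq_a, seq_b, cut):
    """Join the prefix of seq_a (before `cut`) to the suffix of seq_b (from `cut`)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("homologous sequences are assumed to have equal length")
    if not 0 <= cut <= len(seq_a):
        raise ValueError("cut point must lie within the sequence")
    return seq_a[:cut] + seq_b[cut:]


print(crossover("AAAAAA", "CCCCCC", 2))  # AACCCC
```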

15 Gene Conversion Recombination
A DNA sequence is transferred from one copy (which remains unchanged) to another, whose sequence is altered.
It results from the repair of damaged DNA, as described by the Double Strand Break Repair Model.
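
In contrast to crossover, gene conversion is a one-way copy of a tract. The sketch below is a toy string model under assumed names (`gene_conversion`, `start`, `end`); it does not model the underlying repair mechanism.

```python
def gene_conversion(donor, recipient, start, end):
    """Copy donor[start:end] over the same tract of the recipient.

    Returns (donor, converted_recipient); the donor is unchanged.
    """
    if len(donor) != len(recipient):
        raise ValueError("sequences are assumed to be aligned and equal length")
    if not 0 <= start <= end <= len(donor):
        raise ValueError("tract must lie within the sequence")
    converted = recipient[:start] + donor[start:end] + recipient[end:]
    return donor, converted


print(gene_conversion("AAAAAA", "CCCCCC", 2, 4))  # ('AAAAAA', 'CCAACC')
```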

16 Lateral Gene Transfer
Any process in which an organism incorporates genetic material from another organism without being the offspring of that organism.
Lateral (horizontal) gene transfer is a confounding factor in inferring phylogenetic trees from sequences.
It was one of the most prevalent forms of recombination in “early” evolution.

17 Structural Rearrangements
Large-scale structural changes (deletions, insertions, inversions) may occur in a population.
Wi’07 Vineet Bafna

