Forward Genealogical Simulations

Assumptions:
1) Fixed population size
2) Fixed mating time

Step #1: The mating process. For a fixed population size N, there is a random distribution of progeny from each parent, with a mean equaling 1.
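The mating step can be sketched in a few lines of Python. This is only an illustration, not the simulation used in the slides: it assumes that every offspring picks its parent uniformly at random (the Wright-Fisher scheme), and the function names `next_generation` and `progeny_counts` are invented for this example.

```python
import random
from collections import Counter

def next_generation(n_individuals, rng):
    """Return, for each of the N offspring, the index of its parent.

    Assumption: each offspring chooses a parent uniformly at random,
    so the population size stays fixed at N.
    """
    return [rng.randrange(n_individuals) for _ in range(n_individuals)]

def progeny_counts(parents, n_individuals):
    """Count how many offspring each parent left (0 for childless parents)."""
    counts = Counter(parents)
    return [counts.get(i, 0) for i in range(n_individuals)]

rng = random.Random(1)
N = 10
parents = next_generation(N, rng)
counts = progeny_counts(parents, N)
mean_progeny = sum(counts) / N  # always 1, because N offspring share N parents
print(counts, mean_progeny)
```

Individual parents may leave 0, 1, 2 or more progeny, but the mean is 1 because the population size is fixed.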

Step #2: The mutation process is then overlaid on the genealogy. Mutations are assumed to be neutral: they do not affect the fitness of progeny (i.e. they are not under selection). This is a reasonable assumption, as very few polymorphisms have any function that affects selection. Mutation is a random event that occurs at a rate equal to µ.

BUT, there are 2 major problems with the forward simulation model:
1. It is computationally expensive to produce such a model.
2. It is difficult to know how to start the process (need to know initial conditions for the species in question).

The Coalescence Model bypasses the need to simulate every individual in the population. Because mating is random and mutations are neutral, the lineages of a sample of individuals can be traced back to a most recent common ancestor (MRCA) through statistical calculations. Individuals of the sample are said to “coalesce” at the point of their MRCA.

The main assumption of this theory is that each individual of the previous generation is equally likely to be the parent of any individual of the current generation (Wright-Fisher model). Therefore, the probability that a sample of 2 individuals share a common ancestor in the preceding generation is 1/N, where N = the size of the preceding generation. Stated another way, the probability that a sample of 2 individuals do NOT share a common ancestor in the preceding generation is 1 - 1/N.

[Figure: two generations of a population with N = 10, labelled t (current) and t + 1 (preceding); 2 individuals in generation t are shaded]

In this example, the probability that the 2 shaded individuals in generation t do not have a common ancestor in generation t + 1 is 1 - 1/10 = 0.9. Thus, there is a 90% chance that the 2 sampled individuals do not have their MRCA in the previous generation, t + 1.
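The 1 - 1/N calculation can be checked numerically. The sketch below assumes, as the Wright-Fisher model does, that each of the 2 sampled individuals picks its parent independently and uniformly from the N individuals of the preceding generation. The function names are invented for this example.

```python
import random

def prob_distinct_parents(N):
    """Exact probability that 2 sampled individuals have different parents."""
    return 1 - 1 / N

def estimate_distinct_parents(N, trials, rng):
    """Monte Carlo estimate: draw a parent for each individual and compare."""
    distinct = 0
    for _ in range(trials):
        if rng.randrange(N) != rng.randrange(N):
            distinct += 1
    return distinct / trials

rng = random.Random(0)
exact = prob_distinct_parents(10)  # 1 - 1/10 = 0.9
estimate = estimate_distinct_parents(10, 100_000, rng)
print(exact, estimate)
```

The simulated fraction should land close to 0.9, the value worked out for N = 10.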

This basic calculation, P(2) = 1 - 1/N, gives the probability that a sample of 2 individuals does NOT reach its MRCA in the previous generation. It can be extended to a sample of n individuals traced back over t generations:

$$P(n)^t = \left[\prod_{i=1}^{n-1}\left(1 - \frac{i}{N}\right)\right]^t$$

This equation gives the probability that n sampled individuals have n distinct ancestors in each of the preceding t generations. Essentially, it can be used to generate random genealogies by statistically tracing back to the MRCA from a sample of individuals in the current generation.
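As a rough sketch of how such a probability turns into a random genealogy, the Python code below computes the chance that n lineages stay distinct and then draws the number of generations spent with k = n, n-1, ..., 2 lineages. Two simplifications are assumed here and are not taken from the slides: each coalescence event merges exactly one pair of lineages (reasonable when n is much smaller than N), and only the waiting times are drawn, not which pair merges. The function names are invented for this example.

```python
import random

def prob_all_distinct(n, N, t=1):
    """Probability that n lineages have n distinct ancestors in each of t generations."""
    per_generation = 1.0
    for i in range(1, n):
        per_generation *= 1 - i / N
    return per_generation ** t

def sample_coalescence_times(n, N, rng):
    """Draw the number of generations spent with k = n, ..., 2 lineages.

    Assumption: whenever the k lineages fail to stay distinct, exactly
    one pair coalesces, leaving k - 1 lineages.
    """
    times = {}
    for k in range(n, 1, -1):
        p_coalesce = 1 - prob_all_distinct(k, N)
        generations = 1
        while rng.random() >= p_coalesce:  # no coalescence this generation
            generations += 1
        times[k] = generations
    return times

rng = random.Random(3)
print(prob_all_distinct(2, 10))  # the two-individual case, 1 - 1/10
print(sample_coalescence_times(5, 100, rng))
```

Summing the sampled waiting times gives the time back to the MRCA for that random genealogy.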

Now that a genealogy can be randomly generated, the effect of mutations can be overlaid on the process, as in forward genealogical simulations.

Assumptions:
1) Constant-rate neutral mutation process
2) Infinite-sites model: the locus examined is composed of many sites, and no more than one mutation occurs at any site within the genealogy of the sample

[Figure: example genealogies labelled E( ) = 4, E( ) = 1 and E( ) = 3]

As one would expect, the expected number of accumulated mutations (S) is directly proportional to the total length of the genealogy: E(S) = µ E(Ttot).
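The relation E(S) = µ E(Ttot) can be illustrated with a short Python sketch. Several things here are assumptions made for the example rather than statements from the slides: the expected time with k lineages is approximated as N / C(k, 2) generations, the number of mutations on a genealogy is drawn from a Poisson distribution with mean µ × Ttot, and the example interval times are hypothetical values, not data.

```python
import math
import random

def expected_total_length(n, N):
    """E(Ttot): sum over k of k lineages times the expected N / C(k, 2) generations."""
    return sum(k * N / math.comb(k, 2) for k in range(2, n + 1))

def expected_mutations(n, N, mu):
    """E(S) = mu * E(Ttot)."""
    return mu * expected_total_length(n, N)

def sample_poisson(lam, rng):
    """Draw a Poisson variate with Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def mutations_on_genealogy(interval_times, mu, rng):
    """Overlay neutral mutations on a genealogy.

    interval_times maps k (number of lineages) to the generations spent
    with k lineages; the total branch length is the sum of k * time.
    """
    total_length = sum(k * t for k, t in interval_times.items())
    return sample_poisson(mu * total_length, rng)

rng = random.Random(7)
example_times = {3: 40, 2: 110}  # hypothetical genealogy for a sample of 3
print(expected_mutations(3, 100, 0.001))
print(mutations_on_genealogy(example_times, 0.001, rng))
```

Under the infinite-sites assumption every mutation drawn this way falls on a new site, so the count equals the number of polymorphic sites in the sample.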

So now a simple coalescent model has been developed in which a neutral mutation process, governed by the mutation rate µ, is overlaid on random genealogies generated from a sample of alleles from the current generation. So what is the practical use of such a model? Model fitting: manipulating the parameters of the model to generate plausible theories based on collected data.

[Figure: genealogy with labels T(8), T(7) and T(3); axis: time (in generations)]

One Simple Example: The Effects of Population Size (N) on the Coalescent Model

[Figure: genealogies for Large N and Small N]

According to the Coalescent Model, individuals sampled from a large population accumulate more polymorphisms because it takes longer to reach the MRCA.

Now apply these models to what is seen in the field to see which best fits the data: sequence data from individuals in African and European populations show that African populations have a significantly higher level of polymorphism. Thus, a plausible theory is that the current African population is derived from a larger ancestral population than the current European population.
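The effect of N on the time to the MRCA can be seen in a small simulation. This is a sketch under stated assumptions, not a model of the African and European data: the per-generation coalescence probability for k lineages is approximated as C(k, 2) / N, waiting times are drawn from a geometric distribution by inverse transform, and the sample size, population sizes and replicate count are arbitrary illustrative values.

```python
import math
import random

def sample_tmrca(n, N, rng):
    """Draw the time (in generations) for n lineages to reach their MRCA."""
    total = 0
    for k in range(n, 1, -1):
        p_coalesce = math.comb(k, 2) / N  # approximation, valid when n << N
        u = 1.0 - rng.random()            # uniform on (0, 1], avoids log(0)
        # Inverse-transform draw of a geometric waiting time (1, 2, 3, ...).
        total += int(math.log(u) / math.log(1.0 - p_coalesce)) + 1
    return total

def mean_tmrca(n, N, replicates, rng):
    """Average time to the MRCA over many random genealogies."""
    return sum(sample_tmrca(n, N, rng) for _ in range(replicates)) / replicates

rng = random.Random(42)
small = mean_tmrca(5, 1_000, 5_000, rng)   # small population
large = mean_tmrca(5, 10_000, 5_000, rng)  # population 10 times larger
print(small, large)
```

With a constant mutation rate, the longer genealogies of the larger population leave more time for mutations to accumulate, which is the pattern the large N versus small N comparison describes.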