

Population genetics

coalesce: 1. To grow together; fuse. 2. To come together so as to form one whole; unite: The rebel units coalesced into one army to fight the invaders.

The coalescent
The quantity $\theta = 4Nu$, where $N$ is the population size and $u$ is the mutation rate, determines the level of variation under the neutral model.

The coalescent
Any two alleles have a common ancestor, so the ancestry of a sample can be represented by a tree.

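The coalescent
The genealogy of the sample. The alleles may or may not be identical in state.
[Figure: genealogy of four sampled alleles s1, s2, s3, s4, with the present at T = 0 and coalescence events at T = t1, T = t2 and T = t3.]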

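The coalescent
[Figure: genealogy of alleles s1 to s4, with the present at T = 0 and coalescence events at T = t1, T = t2 and T = t3.]
Define $T_i$ to be the time needed to reduce a coalescent with $i$ alleles to one with $i-1$ alleles. Thus $T_4 = t_1$, $T_3 = t_2 - t_1$, and $T_2 = t_3 - t_2$. The total time in the coalescent (the sum of all branch lengths) is:
$$T_c = 4t_1 + 3(t_2 - t_1) + 2(t_3 - t_2)$$
Joining these equations we obtain:
$$T_c = 4T_4 + 3T_3 + 2T_2$$
Or in general for $n$ alleles:
$$T_c = \sum_{i=2}^{n} i\,T_i$$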
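As a minimal sketch of this bookkeeping, the total branch length can be accumulated interval by interval: while k lineages remain, each interval contributes k times its duration. The function name `total_tree_length` and the example times are illustrative assumptions, not taken from the slides.

```python
def total_tree_length(coal_times):
    """Total branch length T_c of a coalescent tree.

    coal_times: increasing times t_1 < t_2 < ... < t_{n-1}, measured back
    from the present (T = 0), at which the successive coalescence events
    occur in a sample of n alleles.
    """
    lineages = len(coal_times) + 1   # sample size n
    total = 0.0
    previous = 0.0
    for t in coal_times:
        # `lineages` branches span the interval (previous, t]
        total += lineages * (t - previous)
        previous = t
        lineages -= 1
    return total


# Hypothetical times for four alleles: t1 = 1, t2 = 3, t3 = 7
print(total_tree_length([1.0, 3.0, 7.0]))  # 4*1 + 3*2 + 2*4 = 18.0
```

With four alleles this reproduces 4T_4 + 3T_3 + 2T_2, here with T_4 = 1, T_3 = 2 and T_2 = 4.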

The coalescent
$T_c$ is a function of $N$, the population size, and $n$, the number of alleles in the sample. We can compute $T_c$ assuming the infinite allele model.
[Figure: a coalescence event reduces n alleles to n-1 alleles.]
Focusing on the last generation: for 2 alleles, what is the probability that they have different ancestors in the previous generation? With $2N$ gene copies in the population, it is:
$$1 - \frac{1}{2N}$$

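The coalescent
We have $n$ alleles. What is the probability that they all have different ancestors in the previous generation?
$$\left(1 - \frac{1}{2N}\right)\left(1 - \frac{2}{2N}\right)\cdots\left(1 - \frac{n-1}{2N}\right)$$
Assuming $N$ is very big, and thus ignoring terms in which $N^2$ appears in the denominator, we obtain:
$$\approx 1 - \frac{1 + 2 + \cdots + (n-1)}{2N} = 1 - \frac{n(n-1)}{4N}$$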
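A quick numerical check can show how good this large-N approximation is. The sketch below assumes a diploid population with 2N gene copies (consistent with the 4N in the slides); the function names are hypothetical.

```python
def prob_all_distinct(n, N):
    """Exact probability that n alleles have n distinct parents among 2N gene copies."""
    prob = 1.0
    for k in range(1, n):
        prob *= 1.0 - k / (2.0 * N)
    return prob


def prob_all_distinct_approx(n, N):
    """Approximation that drops every term with N**2 or higher in the denominator."""
    return 1.0 - n * (n - 1) / (4.0 * N)


# Compare exact and approximate values for a sample of 10 alleles
for N in (100, 10_000):
    print(N, prob_all_distinct(10, N), prob_all_distinct_approx(10, N))
```

The approximation never exceeds the exact product, and the gap shrinks as N grows.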

The coalescent
The probability that $n$ alleles have different ancestors in the previous generation:
$$\approx 1 - \frac{n(n-1)}{4N}$$
The probability that at least 2 alleles out of $n$ have a common ancestor in the previous generation:
$$\approx \frac{n(n-1)}{4N}$$
This is the probability of a coalescent in each generation.

The coalescent
The probability of a coalescent in a single generation is:
$$p = \frac{n(n-1)}{4N}$$
The number of generations until a coalescent is geometrically distributed with this $p$. Thus, the expected time until a coalescent event is $1/p = 4N/(n(n-1))$. In other words, while $i$ alleles remain:
$$E(T_i) = \frac{4N}{i(i-1)}$$
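The geometric waiting time can be illustrated with a small simulation. This is a sketch under the stated assumption that each generation independently produces a coalescence with probability p = n(n-1)/4N; the function names, the seed and the parameter values are arbitrary choices for illustration.

```python
import math
import random


def waiting_time(n, N, rng):
    """Draw the number of generations until the first coalescence among n lineages."""
    p = n * (n - 1) / (4.0 * N)
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    # Inverse-transform sampling of a geometric variable on {1, 2, 3, ...}
    return int(math.log(u) / math.log(1.0 - p)) + 1


def mean_waiting_time(n, N, replicates=20000, seed=1):
    """Average waiting time over many independent replicates."""
    rng = random.Random(seed)
    return sum(waiting_time(n, N, rng) for _ in range(replicates)) / replicates


n, N = 5, 1000
print(mean_waiting_time(n, N))   # simulated mean
print(4 * N / (n * (n - 1)))     # expected value 1/p = 200.0
```

The simulated mean should fall close to 4N/(n(n-1)), with the spread set by the number of replicates.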

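The coalescent
From the following two equations, we can obtain $E(T_c)$:
$$T_c = \sum_{i=2}^{n} i\,T_i, \qquad E(T_i) = \frac{4N}{i(i-1)}$$
$$E(T_c) = \sum_{i=2}^{n} i\,\frac{4N}{i(i-1)} = 4N\sum_{i=2}^{n}\frac{1}{i-1} = 4N\sum_{i=1}^{n-1}\frac{1}{i}$$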
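The two ingredients are the total time as a sum of i times T_i over i = 2..n, and the expected waiting time E(T_i) = 4N/(i(i-1)). As a hedged check that combining them gives 4N times the harmonic sum 1 + 1/2 + ... + 1/(n-1), the sketch below evaluates both sides with exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction


def expected_tree_length(n, N):
    """E(T_c) as the sum over i = 2..n of i * E(T_i), with E(T_i) = 4N / (i(i-1))."""
    return sum(i * Fraction(4 * N, i * (i - 1)) for i in range(2, n + 1))


def closed_form(n, N):
    """4N times the harmonic sum 1 + 1/2 + ... + 1/(n-1)."""
    return 4 * N * sum(Fraction(1, i) for i in range(1, n))


# Both sides agree exactly for a sample of 11 alleles and N = 1000
print(expected_tree_length(11, 1000) == closed_form(11, 1000))
print(float(closed_form(11, 1000)))
```

Using `Fraction` avoids floating-point rounding, so the equality is checked exactly.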

The coalescent: adding mutation.
[Figure: genealogy of alleles s1 to s4, with the present at T = 0 and coalescence events at T = t1, T = t2 and T = t3.]
The n alleles may or may not be identical in state. Each mutation in the history of these alleles results in a segregating site. If there was one mutation, there is one segregating site. If there were 2 mutations, there are 2 segregating sites (the infinite sites model). In general: k mutations -> k segregating sites.

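The coalescent
Let $u$ be the mutation rate per generation. Thus, the total number of mutations in a coalescent is, on average, $u\,E(T_c)$, which is:
$$u\,E(T_c) = 4Nu\sum_{i=1}^{n-1}\frac{1}{i} = \theta\sum_{i=1}^{n-1}\frac{1}{i}, \qquad \theta = 4Nu$$
But this is exactly the expectation of the number of segregating sites, $S$:
$$E(S) = \theta\sum_{i=1}^{n-1}\frac{1}{i}$$
Since $E(S)$ can be estimated from the sample (i.e., by the number of segregating sites observed), we can get an estimate of $\theta$:
$$\hat{\theta} = \frac{S}{\sum_{i=1}^{n-1} 1/i}$$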

The coalescent
Example: Assume 11 sequences, each 768 nucleotides long, were sampled and 14 segregating sites were found. Estimate $\theta$ for each allele (sequence) and for each nucleotide site.
Here, $n = 11$ and the sum equals $\sum_{i=1}^{10} 1/i = 2.929$. $E(S)$ is estimated to be 14, and hence the estimate of $\theta$ is $14/2.929 = 4.78$. Hence $4Nu$ is estimated to be 4.78, where $u$ is the allele mutation rate. $4Nu$ in which $u$ denotes the nucleotide mutation rate is $4.78/768 \approx 0.0062$.
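The arithmetic of this example can be reproduced in a few lines. This is a sketch only: the estimator S divided by the harmonic sum is commonly known as Watterson's estimator, a name the slides do not use, and the function names are hypothetical.

```python
def harmonic_sum(n):
    """Sum of 1/i for i = 1 .. n-1, for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))


def theta_from_segregating_sites(S, n):
    """Estimate theta = 4Nu from S segregating sites in a sample of n sequences."""
    return S / harmonic_sum(n)


# Values from the example: 11 sequences, 768 nucleotides each, 14 segregating sites
theta_per_sequence = theta_from_segregating_sites(14, 11)
theta_per_site = theta_per_sequence / 768

print(round(harmonic_sum(11), 3))     # 2.929
print(round(theta_per_sequence, 2))   # 4.78
print(round(theta_per_site, 5))       # 0.00622
```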

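The coalescent
A few words about the harmonic series $\sum_{i=1}^{\infty} 1/i$:
1. The sum is infinite. Proof: group the terms into blocks that each sum to at least 1/2:
$$1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \cdots + \frac{1}{8}\right) + \cdots \ge 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots$$
2. The partial sum converges in the sense that
$$\lim_{n\to\infty}\left(\sum_{i=1}^{n}\frac{1}{i} - \ln n\right) = \gamma \approx 0.577$$
So the rate of growth of the series is the same as that of $\ln(n)$. For the sum to reach 3, one needs about 10 samples. For the sum to reach 4, one already needs about 30 samples (the partial sum first exceeds 3 at 11 terms and 4 at 31 terms).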
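The slow, logarithmic growth of the harmonic sum is easy to check numerically. The sketch below uses hypothetical helper names. It counts how many terms are needed before the partial sum exceeds a target, and compares the partial sum with ln(n).

```python
import math


def terms_to_exceed(target):
    """Smallest k such that 1 + 1/2 + ... + 1/k is greater than target."""
    total, k = 0.0, 0
    while total <= target:
        k += 1
        total += 1.0 / k
    return k


def harmonic_minus_log(n):
    """Partial sum of the first n terms minus ln(n)."""
    return sum(1.0 / i for i in range(1, n + 1)) - math.log(n)


print(terms_to_exceed(3.0))          # 11
print(terms_to_exceed(4.0))          # 31
print(harmonic_minus_log(100_000))   # close to 0.5772
```

Since the estimator sums up to n-1, k terms correspond to a sample of n = k + 1 sequences.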

The coalescent
We thus have 2 methods for estimating $\theta$:
1. Based on the general heterozygosity.
2. Based on the number of segregating sites:
$$\hat{\theta} = \frac{S}{\sum_{i=1}^{n-1} 1/i}$$

The coalescent
The estimate based on general heterozygosity does not use the information from each site. The contrast between the two estimates can be used to test the neutral theory (Tajima's D test).
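The slides only name the test, so the following is a hedged sketch of the statistic as it is commonly defined, not a reconstruction of the lecture's own formula. It assumes the diversity-based estimate is the mean number of pairwise differences, here called `pi`, and compares it with the segregating-sites estimate S/a1, scaled by the standard variance constants. The function name and the input values in the usage lines are illustrative.

```python
import math


def tajimas_d(n, S, pi):
    """Tajima's D for n sequences, S segregating sites and mean pairwise
    difference pi (both counted per sequence, not per site)."""
    if n < 4 or S < 1:
        # For n < 4 the variance constants vanish and D is undefined
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two theta estimates, scaled by its estimated standard deviation
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Hypothetical pairwise-difference values for the 11-sequence, 14-site example
print(tajimas_d(11, 14, 3.0))   # pi below S/a1 gives a negative D
print(tajimas_d(11, 14, 6.0))   # pi above S/a1 gives a positive D
```

When the two estimates agree, D is zero. Strongly negative or positive values suggest a departure from the neutral model.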