1
Eigen’s paradox and other stories
Matt Roberts, University of Bath
Probability meets Biology, 29th April 2019
2
A mathematical model for life
Represent an organism as a sequence of 0s and 1s, for example
organism = (0,0,1,0,1,1,1,1,0,0,1,0,1,0,0,1).
Could alternatively use a sequence of As, Cs, Gs and Us.
3
Life = reproduction MRS FREG
Every organism (i.e. sequence of 0s and 1s) produces copies of itself. It does this at some rate, proportional to its fitness. However, mistakes happen (at random).
4
An example
Organisms are represented by sequences of three 0s and 1s.
(0,0,0) reproduces at rate 1.
All other sequences reproduce at rate 0.9.
[Diagram of the eight sequences: (0,0,0), (1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (0,1,1), (1,1,1)]
5
Initial observations
If there are no errors, 100% of the population is at one site (the one you started at).
If errors happen with high probability, the population will be much more spread out.
It could be that more of the population is at the “fittest” site than at any other single site, but still only a small proportion of the population is at the “fittest” site.
All this can be made precise and rigorous using eigenvector analysis.
6
How do errors happen?
Biology => fixed probability p of error on each digit of the sequence.
The probability that we make no errors at all when reproducing is $Q = (1-p)^L$, where L is the length of the sequence.
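The eigenvector analysis can be sketched for the three-digit example. This is a minimal sketch, not the speaker's own computation: it assumes the standard quasispecies setup, in which a parent of type j produces offspring of type i at rate (fitness of j) × (probability of copying j into i), with independent errors on each digit. The function name, the default values and the use of power iteration are illustrative choices.

```python
from itertools import product


def stationary_proportions(L=3, p=0.05, master_rate=1.0, other_rate=0.9,
                           iterations=2000):
    """Long-run population proportions for the one-fit-site model.

    Assumption: entry W[i][j] is the rate at which type j produces type i,
    i.e. fitness(j) * P(copy of j comes out as i). The long-run proportions
    are the leading eigenvector of W, found here by power iteration.
    """
    seqs = list(product((0, 1), repeat=L))
    n = len(seqs)
    master = (0,) * L
    fitness = [master_rate if s == master else other_rate for s in seqs]

    def copy_prob(parent, child):
        # Each digit is miscopied independently with probability p.
        d = sum(a != b for a, b in zip(parent, child))
        return p ** d * (1 - p) ** (L - d)

    W = [[fitness[j] * copy_prob(seqs[j], seqs[i]) for j in range(n)]
         for i in range(n)]

    x = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        y = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]  # renormalise to proportions
    return dict(zip(seqs, x))


if __name__ == "__main__":
    for p in (0.01, 0.1, 0.3):
        share = stationary_proportions(p=p)[(0, 0, 0)]
        print(f"p = {p}: share at (0,0,0) = {share:.3f}")
```

Running it for several values of p shows how the share at (0,0,0) changes as errors become more likely; at p = 0.5 every copy is a uniformly random sequence, so all eight sites carry equal weight.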
7
Longer sequences / more base pairs
To make our model even slightly more realistic, we need to consider sequences of length much bigger than 3.
Let’s take L=100 (that is, sequences of length 100).
(0,0,0,…,0,0) reproduces at rate 1.
All other sequences reproduce at rate 0.95.
[Figure from Wikipedia]
8
Some numbers
Errors happen with probability p ≈ 0.001 (???)
For L=100, this gives $Q = (1-p)^{100} = 0.999^{100} \approx 0.9$.
This is not good enough for the fittest site to “win”.
Error-correction enzymes can reduce p drastically!
However, to encode error-correction enzymes, we need L >> 100.
This is Eigen’s paradox. How did life begin?
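The arithmetic can be checked directly. The sketch below computes Q for the slide's values and compares it with a simple threshold. The threshold is an assumption, not something stated on the slide: it is the usual heuristic that the fittest sequence keeps its lead only if its error-free output, rate × Q, exceeds the rate of the other sequences (back-mutation ignored). All function names are illustrative.

```python
def no_error_probability(p, L):
    """Q = (1 - p)^L: chance that all L digits are copied correctly."""
    return (1 - p) ** L


def master_can_win(p, L, master_rate=1.0, other_rate=0.95):
    """Heuristic threshold (assumption): error-free copies of the fittest
    sequence must be produced faster than the other sequences reproduce."""
    return master_rate * no_error_probability(p, L) > other_rate


def max_viable_length(p, master_rate=1.0, other_rate=0.95):
    """Largest L for which the heuristic threshold above still holds."""
    if not (0 < p < 1 and 0 < other_rate < master_rate):
        raise ValueError("need 0 < p < 1 and 0 < other_rate < master_rate")
    L = 0
    while master_rate * (1 - p) ** (L + 1) > other_rate:
        L += 1
    return L


if __name__ == "__main__":
    p, L = 0.001, 100
    print(f"Q = {no_error_probability(p, L):.4f}")
    print("fittest site wins:", master_can_win(p, L))
    print("longest viable sequence:", max_viable_length(p))
```

Under this heuristic, with p = 0.001 and the other sequences at rate 0.95, the fittest site keeps its lead only for sequences of length about 50, well short of L = 100.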
9
Questions
(Beware: answers may already be known!)
Is this really a paradox? If my goal is to self-replicate, who cares if I produce loads of junk too?
Unrealistic assumption (one site good, all others slightly worse): is there a better model? (“Better” = biologically more realistic, mathematically still tractable.)
How do organisms get more complex (i.e. longer sequences)?
10
And now for something completely different
Kingman’s coalescent is the simplest mathematical model of coalescence. Start with n blocks. Each pair of blocks coalesces at rate 1.
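The definition translates directly into a simulation. In this minimal sketch, with k blocks present there are k(k-1)/2 pairs, each coalescing at rate 1, so the waiting time to the next merger is exponential with rate k(k-1)/2 and the merging pair is chosen uniformly at random. The function name and return format are illustrative.

```python
import random


def kingman_coalescent(n, rng=None):
    """Simulate Kingman's coalescent started from n singleton blocks.

    Returns a list of (time, merged_block) pairs, one per merger.
    """
    rng = rng or random.Random()
    blocks = [frozenset([i]) for i in range(n)]
    t = 0.0
    history = []
    while len(blocks) > 1:
        k = len(blocks)
        # k(k-1)/2 pairs, each coalescing at rate 1.
        t += rng.expovariate(k * (k - 1) / 2)
        i, j = rng.sample(range(k), 2)  # uniformly chosen pair
        merged = blocks[i] | blocks[j]
        blocks = [b for idx, b in enumerate(blocks) if idx not in (i, j)]
        blocks.append(merged)
        history.append((t, merged))
    return history


if __name__ == "__main__":
    for time, block in kingman_coalescent(5, random.Random(0)):
        print(f"t = {time:.3f}: merged into {sorted(block)}")
```

The time of the last merger has mean 2(1 - 1/n), which gives a quick sanity check on the simulation.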
11
Universality for Kingman’s coalescent
Several “natural” population models give rise to Kingman’s coalescent.
Kingman’s coalescent is seen as universal, and relies on only 3 things:
No selection
Well-mixed population
Constant population size
Unfortunately real data does not usually match Kingman, because in reality population sizes are not constant.
There is a fudge-factor called the “effective population size” that aims to match real data to the Kingman model. (This is what I have been told, anyway!)
12
An alternative to Kingman
Problem: life happens forwards in time. We observe backwards in time.
Kingman approach: take simplest backwards-in-time model (coalescent), and try to match to data.
Alternative approach: take simplest forwards-in-time model, and try to analyse coalescent times. Then match these to data.
Forwards-in-time model: the (critical) Galton-Watson tree. Everyone lives for an Exp(1) distributed amount of time, and then gives birth to a random number of children with mean 1 (and finite variance).
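The forwards-in-time model is easy to simulate if only the population size is tracked. The sketch below assumes one particular offspring law, which the slide does not specify: 0 or 2 children with probability 1/2 each (mean 1, variance 1). With n particles alive, each with an independent Exp(1) lifetime, the next death happens after an Exp(n) time. The function name is illustrative.

```python
import random


def population_at_time(T, rng=None):
    """Population size at time T of a critical continuous-time
    Galton-Watson process started from one particle.

    Assumption: a dying particle leaves 0 or 2 children, each with
    probability 1/2, so the population changes by -1 or +1.
    """
    rng = rng or random.Random()
    n, t = 1, 0.0
    while n > 0:
        t += rng.expovariate(n)  # time of the next death among n particles
        if t > T:
            break
        n += 1 if rng.random() < 0.5 else -1
    return n


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = [population_at_time(5.0, rng) for _ in range(10000)]
    print("mean size:", sum(sizes) / len(sizes))
    print("fraction extinct:", sizes.count(0) / len(sizes))
```

Because the process is critical, the mean population size stays at 1 for all T even though most runs die out. Studying split times would additionally require recording the genealogy, which this sketch leaves out.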
13
Sampling from a Galton-Watson process
Pick k=4 particles uniformly at random.
[Figure: the tree of the sampled particles, with split times $S_1$, $S_2$, $S_3$ marked]
14
Split time distributions
We know the joint distribution of all the split times.
The formulas get a bit complicated, but they are explicit. For example, when k=2,
$$P\left(\frac{S_1}{T} > s\right) \to \frac{2(1-s)}{s^2}\left(\log\frac{1}{1-s} - s\right).$$
Can also do near-critical version.
And there is a probabilistic construction that is mathematically prettier but less explicit.
“The coalescent structure of continuous-time Galton-Watson trees”, Harris, Johnston, R.
Can we compare this to real data? Does it fit better than Kingman with effective population size?
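The k=2 limit can be evaluated numerically. The sketch below reads the formula on the slide as 2(1-s)/s² · (log(1/(1-s)) - s) for 0 < s < 1; that reading and the function name are assumptions made for illustration.

```python
import math


def split_time_tail(s):
    """Limiting value of P(S_1 / T > s) for k = 2, read from the slide as
    2(1-s)/s^2 * (log(1/(1-s)) - s), for 0 < s < 1."""
    if not 0 < s < 1:
        raise ValueError("s must lie strictly between 0 and 1")
    # -log1p(-s) equals log(1/(1-s)) and is more accurate for small s.
    return 2 * (1 - s) / s ** 2 * (-math.log1p(-s) - s)


if __name__ == "__main__":
    for s in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(f"s = {s}: {split_time_tail(s):.4f}")
```

The values decrease from near 1 as s approaches 0 to near 0 as s approaches 1, as a tail probability should.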