Models for DNA substitution

Presentation on theme: "Models for DNA substitution"— Presentation transcript:

1 Models for DNA substitution

2 http://www.stat.rice.edu/~mathbio/Polanski/stat655/

3 Plan
Basics
Models in discrete time
Models in continuous time
Parameter estimation

4 Nucleotides
Purines: Adenine (A) or (a), Guanine (G) or (g)
Pyrimidines: Cytosine (C) or (c), Thymine (T) or (t)

5 Substitution
Transitions (purine → purine, pyrimidine → pyrimidine): A → G, G → A, C → T, T → C
Transversions (purine → pyrimidine, pyrimidine → purine): A → T, T → A, A → C, C → A, G → T, T → G, G → C, C → G
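The classification on this slide can be sketched in Python. The function name `classify_substitution` and the set constants are illustrative choices, not part of the original slides.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(old, new):
    """Return 'transition' or 'transversion' for the substitution old -> new."""
    old, new = old.upper(), new.upper()
    for base in (old, new):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unknown nucleotide: {base!r}")
    if old == new:
        raise ValueError("not a substitution: both nucleotides are identical")
    # A transition stays within one chemical class; a transversion changes class.
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


# Tally all 12 possible substitutions by type.
counts = {"transition": 0, "transversion": 0}
for x in "AGCT":
    for y in "AGCT":
        if x != y:
            counts[classify_substitution(x, y)] += 1
```

Of the 12 possible substitutions, the tally gives 4 transitions and 8 transversions, matching the lists on the slide.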

6 Other Deletions, insertions Insertions in reverse order

7 Hypothesis Substitution of nucleotides in the evolution of DNA sequences can be modeled by a Markov chain or Markov process

8 Other assumptions Stationarity Reversibility

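9 Transition matrix
Rows and columns are ordered a, g, c, t:
$$
P = \begin{pmatrix}
p_{aa} & p_{ag} & p_{ac} & p_{at} \\
p_{ga} & p_{gg} & p_{gc} & p_{gt} \\
p_{ca} & p_{cg} & p_{cc} & p_{ct} \\
p_{ta} & p_{tg} & p_{tc} & p_{tt}
\end{pmatrix}
$$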

10 Models – discrete time

11 Jukes – Cantor model All substitutions are equally probable

12 Stationary distribution
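As a minimal sketch of the Jukes-Cantor chain, assume each of the three possible substitutions occurs with probability `alpha` per time step, so each diagonal entry is `1 - 3*alpha`. The symbol `alpha` and the helper names are assumptions, since the slide formulas are not reproduced in this transcript. The code checks that every row sums to one and that the uniform distribution is stationary.

```python
def jukes_cantor_matrix(alpha):
    """Discrete-time Jukes-Cantor matrix.

    alpha is an assumed per-step probability of each specific substitution
    (it must satisfy 0 <= alpha <= 1/3 so that the diagonal is non-negative).
    """
    if not 0.0 <= alpha <= 1.0 / 3.0:
        raise ValueError("alpha must lie in [0, 1/3]")
    return [[1.0 - 3.0 * alpha if i == j else alpha for j in range(4)]
            for i in range(4)]


def left_multiply(pi, P):
    """Return the row vector pi * P."""
    n = len(pi)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]


P = jukes_cantor_matrix(0.01)
uniform = [0.25] * 4
row_sums = [sum(row) for row in P]            # each should equal 1
next_distribution = left_multiply(uniform, P)  # should equal uniform again
```

Because the matrix is symmetric, its columns also sum to one, which is why the uniform distribution is left unchanged by one step of the chain.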

13 Spectral decomposition of Pn
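For the Jukes-Cantor chain, the n-step matrix has a simple closed form. The sketch below assumes the parametrization $P = (1 - 4\alpha) I + \alpha J$, where $\alpha$ is the per-step probability of each specific substitution and $J$ is the 4 × 4 matrix of ones; these symbols are assumptions, not taken from the slide. The eigenvalues are 1 (once) and $1 - 4\alpha$ (three times), and since $J^2 = 4J$:

```latex
P^n = \tfrac{1}{4} J + (1 - 4\alpha)^n \left( I - \tfrac{1}{4} J \right),
\qquad
p^{(n)}_{ii} = \tfrac{1}{4} + \tfrac{3}{4} (1 - 4\alpha)^n,
\qquad
p^{(n)}_{ij} = \tfrac{1}{4} - \tfrac{1}{4} (1 - 4\alpha)^n \quad (i \neq j).
```

As n grows, every entry tends to 1/4, the uniform stationary probability.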

14 Remark When learning and researching Markov models for nucleotide substitution, it helps greatly to use software for symbolic computation, such as Mathematica, Maple, or Scientific Workplace.

15 Kimura models
α - probability of a transition
β - probability of a specific transversion
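A minimal sketch of the two-parameter Kimura matrix follows, writing `alpha` for the probability of a transition and `beta` for the probability of each specific transversion. The symbol names, the base order A, G, C, T, and the function name are assumptions made for illustration.

```python
BASES = "AGCT"
PURINES = {"A", "G"}


def kimura_2p_matrix(alpha, beta):
    """Discrete-time Kimura two-parameter matrix in the base order A, G, C, T."""
    if alpha < 0 or beta < 0 or alpha + 2 * beta > 1:
        raise ValueError("need alpha, beta >= 0 and alpha + 2*beta <= 1")
    P = []
    for x in BASES:
        row = []
        for y in BASES:
            if x == y:
                # One transition and two transversions leave each nucleotide.
                row.append(1.0 - alpha - 2.0 * beta)
            elif (x in PURINES) == (y in PURINES):
                row.append(alpha)   # transition: same chemical class
            else:
                row.append(beta)    # transversion: class changes
        P.append(row)
    return P


P = kimura_2p_matrix(alpha=0.02, beta=0.005)
row_sums = [sum(row) for row in P]
```

Setting `alpha == beta` recovers the Jukes-Cantor matrix, in which all substitutions are equally probable.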

16 Kimura 3ST model
α - probability of: A ↔ G, C ↔ T
β - probability of: A ↔ C, G ↔ T
γ - probability of: A ↔ T, C ↔ G
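Written out with rows and columns in the order a, g, c, t, the three substitution classes give the matrix below. The symbols $\alpha$ (A ↔ G, C ↔ T), $\beta$ (A ↔ C, G ↔ T) and $\gamma$ (A ↔ T, C ↔ G) are assumed names for the three probabilities listed on this slide.

```latex
P =
\begin{pmatrix}
1-\alpha-\beta-\gamma & \alpha & \beta & \gamma \\
\alpha & 1-\alpha-\beta-\gamma & \gamma & \beta \\
\beta & \gamma & 1-\alpha-\beta-\gamma & \alpha \\
\gamma & \beta & \alpha & 1-\alpha-\beta-\gamma
\end{pmatrix}
```

The matrix is symmetric, so its columns sum to one as well as its rows, and the uniform distribution (1/4, 1/4, 1/4, 1/4) is stationary.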

17 Stationary distribution

18 Generalizations of Kimura models
By Ewens:
α - probability of: A ↔ G, C ↔ T
β - probability of: A → C, A → T, G → C, G → T
γ - probability of: C → A, T → A, C → G, T → G

19 Stationary distribution

20 Spectral decomposition

21 By Blaisdell:
α - probability of: A → G, C → T
β - probability of: G → A, T → C
γ - probability of: A → C, A → T, G → C, G → T
δ - probability of: C → A, T → A, C → G, T → G

22 Stationary distribution
Remark: this model is not reversible.

23 Felsenstein model Probability of substitution of any nucleotide by another is proportional to the stationary probability of the substituting nucleotide
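The rule on this slide can be sketched as follows, assuming a proportionality constant `u` and stationary probabilities `pi` in the order A, G, C, T. Both names and the example values are assumptions made for illustration. The code builds the matrix with off-diagonal entries `u * pi[j]` and checks numerically that `pi` is stationary.

```python
def felsenstein_matrix(pi, u):
    """Matrix with p_ij = u * pi[j] for i != j; diagonals make rows sum to one."""
    if abs(sum(pi) - 1.0) > 1e-12 or min(pi) < 0:
        raise ValueError("pi must be a probability distribution")
    n = len(pi)
    P = [[u * pi[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        P[i][i] = 1.0 - sum(P[i])  # equals 1 - u * (1 - pi[i])
        if P[i][i] < 0:
            raise ValueError("u is too large for these frequencies")
    return P


def left_multiply(pi, P):
    """Return the row vector pi * P."""
    n = len(pi)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]


pi = [0.1, 0.2, 0.3, 0.4]   # hypothetical frequencies of A, G, C, T
P = felsenstein_matrix(pi, u=0.1)
pi_after_one_step = left_multiply(pi, P)
```

With equal frequencies of 1/4 each, this construction reduces to the Jukes-Cantor matrix with substitution probability `u / 4`.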

24 Stationary distribution

25 HKY model Hasegawa, Kishino, Yano
Different rates for transitions and transversions

26 Eigenvalues of P

27 Left (row) eigenvectors

28 Right (column) eigenvectors

29 General 12-parameter model
Tavaré, 1986

30 Stationary distribution

31 Reversibility A=D, B=G, C=J, E=H, F=K, I=L
Conclusion – the most general reversible model has 12 – 6 = 6 free parameters
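Reversibility of a chain with transition matrix P and stationary distribution pi is equivalent to the detailed balance condition pi_i * p_ij = pi_j * p_ji for every pair i, j. The check below is a minimal sketch of that condition. The function name and both example matrices are illustrative assumptions, and the code does not use the slide's A to L parametrization.

```python
def is_reversible(P, pi, tol=1e-12):
    """Check detailed balance: pi[i] * P[i][j] == pi[j] * P[j][i] for all i < j."""
    n = len(pi)
    return all(abs(pi[i] * P[i][j] - pi[j] * P[j][i]) <= tol
               for i in range(n) for j in range(i + 1, n))


uniform = [0.25] * 4

# Symmetric (Jukes-Cantor-like) matrix: detailed balance holds.
symmetric = [[0.97 if i == j else 0.01 for j in range(4)] for i in range(4)]

# Cyclic matrix: the uniform distribution is stationary (columns sum to one),
# but probability flows in one direction only, so detailed balance fails.
cyclic = [
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.7, 0.3, 0.0],
    [0.0, 0.0, 0.7, 0.3],
    [0.3, 0.0, 0.0, 0.7],
]

symmetric_ok = is_reversible(symmetric, uniform)
cyclic_ok = is_reversible(cyclic, uniform)
```

The cyclic example shows that stationarity alone does not imply reversibility, which is why the two are listed as separate assumptions.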

32 Continuous – time models

33 Matrix of transition probabilities
Q – intensity matrix
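In standard continuous-time Markov chain theory, the transition probabilities are tied to the intensity matrix through the matrix exponential. The notation $P(t)$ and $q_{ij}$ below is an assumption, as the slide's own formula is not reproduced in this transcript.

```latex
P(t) = e^{Qt} = \sum_{k=0}^{\infty} \frac{(Qt)^k}{k!},
\qquad
\frac{d}{dt} P(t) = P(t)\, Q,
\qquad
P(0) = I,
\qquad
\sum_{j} q_{ij} = 0 \ \text{for every } i .
```

The off-diagonal entries of $Q$ are non-negative substitution rates, and each diagonal entry is minus the sum of the other entries in its row.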

34 Jukes – Cantor model

35 Spectral decomposition of P(t)
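For the continuous-time Jukes-Cantor model, the spectral decomposition yields a closed form for P(t). The sketch below assumes an intensity matrix with every off-diagonal rate equal to `alpha` and diagonal `-3*alpha`, whose eigenvalues are 0 (once) and `-4*alpha` (three times). The parametrization and function names are assumptions. The closed form is compared with a truncated power series for exp(Qt).

```python
import math


def jc_rate_matrix(alpha):
    """Jukes-Cantor intensity matrix: rate alpha for each specific substitution."""
    return [[-3.0 * alpha if i == j else alpha for j in range(4)]
            for i in range(4)]


def jc_transition_closed_form(alpha, t):
    """P(t) = J/4 + exp(-4*alpha*t) * (I - J/4), with J the matrix of ones."""
    decay = math.exp(-4.0 * alpha * t)
    same = 0.25 + 0.75 * decay
    diff = 0.25 - 0.25 * decay
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def expm_series(Q, t, terms=40):
    """Truncated series sum_k (Q t)^k / k!, adequate for small ||Q t||."""
    n = len(Q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (Q t)^k / k!, built from the previous term.
        term = [[sum(term[i][m] * Q[m][j] for m in range(n)) * t / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


alpha, t = 0.1, 2.0
closed = jc_transition_closed_form(alpha, t)
series = expm_series(jc_rate_matrix(alpha), t)
max_gap = max(abs(closed[i][j] - series[i][j])
              for i in range(4) for j in range(4))
```

The plain power series is used only to keep the example dependency-free. For large `||Q t||` a dedicated matrix exponential routine would be the safer choice.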

36 Kimura model

37 Spectral decomposition of P(t)

38 Parameter estimation

39 Jukes – Cantor model
Three configurations are equivalent due to reversibility:
[Diagram: ancestor (A) with descendants D1 and D2; path D2, A, D1; path D1, A, D2]

40 Probability that the nucleotides are different in two descendants
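Under the continuous-time Jukes-Cantor model this probability has a closed form. The sketch below assumes that each specific substitution occurs at rate $\alpha$ and that each descendant is separated from the common ancestor by time $t$; both symbols are assumptions. By stationarity and reversibility the two branches act as a single path of length $2t$, so the probability of the same nucleotide in both descendants is $\tfrac{1}{4} + \tfrac{3}{4} e^{-8\alpha t}$, and

```latex
p = \frac{3}{4} \left( 1 - e^{-8 \alpha t} \right),
\qquad
\alpha t = -\frac{1}{8} \ln\!\left( 1 - \frac{4}{3}\, p \right).
```

The second expression inverts the first, so an estimate of $p$ from data gives an estimate of the product $\alpha t$; the two factors cannot be separated from a single pair of sequences.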

41 Estimating p
We have two DNA sequences of length N:
D1: ACAATACAGGGCAGATAGATACAGATAGACACAGACAGAGCAGAGACAG
D2: ACAATACAGGACAGTTAGATACAGATAGACACAGACAGAGCAGAGACAG
p = (number of differences) / N
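The estimate can be computed directly from the two sequences on this slide. The sketch below also applies the standard Jukes-Cantor correction, -(3/4) ln(1 - 4p/3), which converts the observed proportion of differences into an expected number of substitutions per site. The function names are illustrative, and the correction step is an addition that the slide itself does not show.

```python
import math

D1 = "ACAATACAGGGCAGATAGATACAGATAGACACAGACAGAGCAGAGACAG"
D2 = "ACAATACAGGACAGTTAGATACAGATAGACACAGACAGAGCAGAGACAG"


def proportion_different(s1, s2):
    """Fraction of aligned sites at which the two sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have the same length")
    differences = sum(a != b for a, b in zip(s1, s2))
    return differences / len(s1)


def jukes_cantor_distance(p):
    """Expected substitutions per site under Jukes-Cantor, given proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


N = len(D1)
p_hat = proportion_different(D1, D2)
distance = jukes_cantor_distance(p_hat)
```

The corrected distance is slightly larger than `p_hat` because it accounts for multiple substitutions at the same site, which a direct count of differences cannot see.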

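42 Kimura model
p – probability that a site shows two different purines or two different pyrimidines (a transition-type difference)
q – probability that a site shows a purine and a pyrimidine (a transversion-type difference)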
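Both proportions can be estimated by counting the two kinds of differences between aligned sequences. The sketch below uses the two sequences D1 and D2 from the Jukes-Cantor example and then applies Kimura's two-parameter distance, -(1/2) ln(1 - 2p - q) - (1/4) ln(1 - 2q). The function names are illustrative, and using this distance formula is an assumption, since the slide's own estimator is not reproduced in this transcript.

```python
import math

PURINES = {"A", "G"}

D1 = "ACAATACAGGGCAGATAGATACAGATAGACACAGACAGAGCAGAGACAG"
D2 = "ACAATACAGGACAGTTAGATACAGATAGACACAGACAGAGCAGAGACAG"


def difference_proportions(s1, s2):
    """Return (p, q): proportions of transition- and transversion-type sites."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have the same length")
    transitions = transversions = 0
    for a, b in zip(s1.upper(), s2.upper()):
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1      # two purines or two pyrimidines
        else:
            transversions += 1    # one purine and one pyrimidine
    n = len(s1)
    return transitions / n, transversions / n


def kimura_distance(p, q):
    """Kimura two-parameter distance in expected substitutions per site."""
    if 1.0 - 2.0 * p - q <= 0 or 1.0 - 2.0 * q <= 0:
        raise ValueError("p and q are too large for the distance to be defined")
    return -0.5 * math.log(1.0 - 2.0 * p - q) - 0.25 * math.log(1.0 - 2.0 * q)


p_hat, q_hat = difference_proportions(D1, D2)
distance = kimura_distance(p_hat, q_hat)
```

Counting the two classes separately lets the model assign different rates to transitions and transversions.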

43

