Odds and Ends Tutorial #13 © Ilan Gronau


1 Odds and Ends Tutorial #13 © Ilan Gronau

2 The Noisy Transmission Model

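3

States: 0, 1, I0, I1

Emission probabilities:

State   0      1      -
0       0.75   0.2    0.05
1       0.2    0.75   0.05
I0      0.5    0.5
I1      0.5    0.5

Transitions (row = from, column = to):

From    0      1      I0     I1
0       0.63   0.27   0.1    --
1       0.09   0.81   --     0.1
I0      0.56   0.24   0.2    --
I1      0.08   0.72   --     0.2

Stationary distribution (in the order 0, 1, I0, I1): (8, 24, 1, 3)/36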
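The stationary distribution quoted on this slide can be checked by multiplying it by the transition matrix once. The sketch below assumes the state order (0, 1, I0, I1) for both the rows and columns of the matrix and for the distribution vector; the names `T`, `PI` and `step` are illustrative and not taken from the slides.

```python
from fractions import Fraction as F

# Transition matrix as read from the slide; rows and columns are assumed to be
# ordered (0, 1, I0, I1). Missing entries ("--") are treated as probability 0.
T = [
    [F(63, 100), F(27, 100), F(10, 100), F(0)],
    [F(9, 100), F(81, 100), F(0), F(10, 100)],
    [F(56, 100), F(24, 100), F(20, 100), F(0)],
    [F(8, 100), F(72, 100), F(0), F(20, 100)],
]
PI = [F(8, 36), F(24, 36), F(1, 36), F(3, 36)]


def step(dist, trans):
    """Return the distribution after one transition: dist * trans."""
    size = len(dist)
    return [sum(dist[r] * trans[r][c] for r in range(size)) for c in range(size)]


rows_ok = all(sum(row) == 1 for row in T)   # every row is a distribution
is_stationary = step(PI, T) == PI           # PI * T == PI, exactly
print(rows_ok, is_stationary)               # prints: True True
```

Exact fractions are used so that the equality test does not depend on floating-point rounding.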

4 Questions

• Given an output sequence (including blanks), what is the most probable path which yields this sequence? -> (1.c): Viterbi algorithm
• Given an output sequence, what is the most probable path to yield it which passes through M non-noise states (0/1)? -> (1.d)
• Given an output sequence, what is the most probable path to yield it? -> (bonus)
• Given an output sequence, what is the most probable transmission? -> Problem: each transmission corresponds to multiple paths!

5 Answer to 1d

Given an output sequence $X_1,\dots,X_n$ and $M$, we calculate the following values for all states $S$, $i=1..n$ and $j=1..M$:

$v_S(i,j)$: the log-probability of the most probable path that yields output $X_1,\dots,X_i$, passes through $j$ non-noise states, and ends in state $S$.

Initialize: $v_S(0,0)$ is the initial log-probability of $S$ (stationary distribution).

For $i,j>0$ and $a=0/1$: [update formulae not preserved in this transcript]

• Hold update-pointers.
• Most values are $-\infty$.
• $t(\cdot,\cdot)$ and $e(\cdot,\cdot)$ are log-probabilities.

6 Answer to 1d (continued)

As on the previous slide, $v_S(i,j)$ is the log-probability of the most probable path that yields output $X_1,\dots,X_i$, passes through $j$ non-noise states, and ends in state $S$.

Recursion formulae (for $i,j>0$ and $a=0/1$): [formulae not preserved in this transcript]

Hold update-pointers. At the end, choose the state $S$ that maximizes $v_S(n,M)$ and follow the pointers to recover the path.
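One possible reading of the recursion for $v_S(i,j)$, sketched in Python. This is a reconstruction under stated assumptions, not the slide's own formulae: a non-noise state either emits $X_i$ (coming from cell $(i-1,j-1)$) or emits a blank (coming from cell $(i,j-1)$), a noise state always emits $X_i$ (coming from cell $(i-1,j)$), and the state at $(0,0)$ is treated as a silent start state drawn from the stationary distribution. The probability tables are the ones from slide 3, with the assignment of rows to states assumed; `best_path`, `lg` and the table names are hypothetical.

```python
import math

NEG_INF = float("-inf")
STATES = ("0", "1", "I0", "I1")
NON_NOISE = ("0", "1")

# Probabilities as read from slide 3 (row/state assignment is an assumption).
T = {"0": {"0": 0.63, "1": 0.27, "I0": 0.10},
     "1": {"0": 0.09, "1": 0.81, "I1": 0.10},
     "I0": {"0": 0.56, "1": 0.24, "I0": 0.20},
     "I1": {"0": 0.08, "1": 0.72, "I1": 0.20}}
E = {"0": {"0": 0.75, "1": 0.20, "-": 0.05},
     "1": {"0": 0.20, "1": 0.75, "-": 0.05},
     "I0": {"0": 0.5, "1": 0.5},
     "I1": {"0": 0.5, "1": 0.5}}
PI = {"0": 8 / 36, "1": 24 / 36, "I0": 1 / 36, "I1": 3 / 36}


def lg(p):
    """Log-probability, with log(0) mapped to -inf."""
    return math.log(p) if p > 0 else NEG_INF


def best_path(x, m):
    """Most probable path yielding x (no blanks) with exactly m non-noise states."""
    n = len(x)
    # Assumption: the state at (0, 0) is a silent start state.
    v = {(s, 0, 0): lg(PI[s]) for s in STATES}
    back = {}
    for j in range(m + 1):
        for i in range(n + 1):
            if i == 0 and j == 0:
                continue
            for s in STATES:
                # Each move is (previous cell, symbol emitted by s).
                if s in NON_NOISE:
                    moves = [((i, j - 1), "-")]                   # s emits a blank
                    if i > 0:
                        moves.append(((i - 1, j - 1), x[i - 1]))  # s emits X_i
                else:
                    moves = [((i - 1, j), x[i - 1])] if i > 0 else []
                best, arg = NEG_INF, None
                for (pi_, pj), sym in moves:
                    for r in STATES:
                        score = (v.get((r, pi_, pj), NEG_INF)
                                 + lg(T[r].get(s, 0.0)) + lg(E[s].get(sym, 0.0)))
                        if score > best:
                            best, arg = score, (r, pi_, pj)
                if arg is not None:          # unreachable cells stay at -inf
                    v[(s, i, j)] = best
                    back[(s, i, j)] = arg    # update-pointer
    end = max(STATES, key=lambda s: v.get((s, n, m), NEG_INF))
    path, cur = [], (end, n, m)
    while cur in back:                       # follow pointers back to the start
        path.append(cur[0])
        cur = back[cur]
    return v.get((end, n, m), NEG_INF), path[::-1]
```

A call such as `best_path("0110", 5)` returns the log-probability and the state sequence; the silent start state is not included in the returned path. Storing only reachable cells in a dictionary reflects the slide's remark that most values are $-\infty$.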

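7 Bonus

Given an output sequence, what is the most probable path to yield it?

Approach 1: If we don't know M, we can fill in the tables column by column. Eventually the probabilities in the columns start to deteriorate.

Approach 2: a-priori bound. Note that an optimal path doesn't have 2 consecutive deletions (-), because dropping the first of the two deleted states gives a more probable path:

$\Pr[S_i \to S_{i+1}(-) \to S_{i+2}(-)] < \Pr[S_i \to S_{i+2}(-)]$

Every state that is not a deletion emits one of the $n$ output symbols, so at most $n+1$ deletions fit around them.

Conclusion: $M < 2n+2$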
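The claim that two consecutive deletions are never optimal can be checked numerically against the model parameters. The sketch below assumes the transition table from slide 3 and a deletion (blank emission) probability of 0.05 for both non-noise states; it compares the extra factor paid for the detour through a deleted state, $t(a,b)\,e(b,-)\,t(b,c)$, with the direct transition $t(a,c)$. The function names are hypothetical.

```python
# Transition probabilities as read from slide 3 (state assignment assumed).
T = {"0": {"0": 0.63, "1": 0.27, "I0": 0.10},
     "1": {"0": 0.09, "1": 0.81, "I1": 0.10},
     "I0": {"0": 0.56, "1": 0.24, "I0": 0.20},
     "I1": {"0": 0.08, "1": 0.72, "I1": 0.20}}
P_DELETE = 0.05          # probability that a non-noise state emits a blank
NON_NOISE = ("0", "1")


def worst_ratio():
    """Largest value of Pr[a -> b(-) -> c] / Pr[a -> c] over all state triples.

    b and c are non-noise states (only they can emit a blank); a is any state.
    A value below 1 means that removing the deleted state b always helps.
    """
    return max(
        T[a][b] * P_DELETE * T[b][c] / T[a][c]
        for a in T for b in NON_NOISE for c in NON_NOISE
    )


def max_non_noise_states(n):
    """Upper bound on M: n emitting states plus at most n + 1 isolated deletions."""
    return 2 * n + 1


print(worst_ratio() < 1, max_non_noise_states(3))
```

With this bound, the tables for problem 1.d only need to be filled for $j \le 2n+1$ before taking the best value over all $j$.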

8 2-species Evolution

Observe the following evolution model for binary-character vectors:

• Each species corresponds to a binary vector in $\{0,1\}^n$.
• Two species Y, Z evolve from a common ancestor X.
• Each bit in X is chosen uniformly at random.
• Each bit in X is flipped w.p. θ, independently, during evolution towards each of Y and Z.

Given binary vectors for Y and Z, calculate the most probable value for θ:

1. Define the sufficient statistics of the problem
2. Give a formula for L(θ)
3. Formulate an EM algorithm for the problem
4. Give an analytic solution (if one exists) for the MLE

[Figure: X (hidden) is the parent of Y and Z (observed); each edge is labeled θ]

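9 2-species Evolution

Define the sufficient statistics of the problem:

Given $Y = y_1,\dots,y_n$ and $Z = z_1,\dots,z_n$, define $n_0 = |\{i \mid y_i = z_i\}|$ and $n_1 = |\{i \mid y_i \ne z_i\}|$.

Give a formula for L(θ):

$L(\theta) = \Pr[Y,Z \mid \theta] = \prod_{i=1}^{n} \Pr[Y_i,Z_i \mid \theta] = \left(\tfrac{1}{2}\left((1-\theta)^2+\theta^2\right)\right)^{n_0} \left(\theta(1-\theta)\right)^{n_1}$

Y_i   Z_i   X_i   Pr[X_i,Y_i,Z_i]
0     0     0     ½(1-θ)²
0     0     1     ½θ²
0     1     0     ½θ(1-θ)
0     1     1     ½θ(1-θ)

Similarly if $Y_i = 1$.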
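As a small illustration, the sufficient statistics and the log-likelihood can be computed directly. The sketch assumes the per-position probabilities obtained by summing the table over $X_i$: $\tfrac12((1-\theta)^2+\theta^2)$ when $y_i = z_i$ and $\theta(1-\theta)$ when $y_i \ne z_i$. The function names are hypothetical, and θ is assumed to lie strictly between 0 and 1.

```python
import math


def sufficient_stats(y, z):
    """Return (n0, n1): the number of positions where y and z agree / differ."""
    if len(y) != len(z):
        raise ValueError("y and z must have the same length")
    n1 = sum(a != b for a, b in zip(y, z))
    return len(y) - n1, n1


def log_likelihood(theta, n0, n1):
    """log L(theta) for 0 < theta < 1, using only the counts n0 and n1."""
    p_same = 0.5 * ((1 - theta) ** 2 + theta ** 2)   # Pr[y_i == z_i]
    p_diff = theta * (1 - theta)                     # Pr[y_i != z_i]
    return n0 * math.log(p_same) + n1 * math.log(p_diff)


n0, n1 = sufficient_stats("0011", "0101")
print(n0, n1, log_likelihood(0.2, n0, n1))
```

Because the likelihood depends on θ only through $\theta(1-\theta)$, `log_likelihood(theta, ...)` and `log_likelihood(1 - theta, ...)` agree.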

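10 2-species Evolution

Formulate an EM algorithm for the problem:

E: Given θ, calculate the expected number of flips from X to Y and Z (#flips is a sum of indicator variables):

$E(\#\text{flips}) = \sum_{i=1}^{n} \left( \Pr[x_i \ne y_i \mid y_i, z_i, \theta] + \Pr[x_i \ne z_i \mid y_i, z_i, \theta] \right) = n_0 \cdot \frac{2\theta^2}{(1-\theta)^2+\theta^2} + n_1$

Y   Z   X   Pr[X,Y,Z]   Pr[X|Y,Z]
0   0   0   ½(1-θ)²     (1-θ)²/((1-θ)²+θ²)
0   0   1   ½θ²         θ²/((1-θ)²+θ²)
0   1   0   ½θ(1-θ)     ½
0   1   1   ½θ(1-θ)     ½

M: Given the expected number of flips from X to Y and Z, calculate θ':

$\theta' = E(\#\text{flips}) / 2n$

E+M: $\theta' = \frac{1}{2n} \left( n_0 \cdot \frac{2\theta^2}{(1-\theta)^2+\theta^2} + n_1 \right)$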
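The E and M steps can be folded into a single update of θ and iterated. The sketch below is one way to do this, assuming the posterior derived from the joint table: when $y_i = z_i$ the expected number of flips at position $i$ is $2\theta^2/((1-\theta)^2+\theta^2)$, and when $y_i \ne z_i$ exactly one flip occurred. The names `em_step` and `run_em`, the starting value and the stopping rule are illustrative choices, not part of the slides.

```python
def em_step(theta, n0, n1):
    """One combined E+M update of theta from the counts n0 (agree), n1 (differ)."""
    n = n0 + n1
    # E-step: posterior probability that X differs from both Y and Z,
    # given that Y and Z agree at this position.
    p_both_flipped = theta ** 2 / ((1 - theta) ** 2 + theta ** 2)
    expected_flips = n0 * 2 * p_both_flipped + n1
    # M-step: there are 2n flip opportunities (n bits on each of two edges).
    return expected_flips / (2 * n)


def run_em(n0, n1, theta=0.1, max_iters=1000, tol=1e-12):
    """Iterate em_step until theta changes by less than tol."""
    for _ in range(max_iters):
        new_theta = em_step(theta, n0, n1)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


print(run_em(80, 20))
```

The value θ = ½ is a fixed point of the update, so the iteration should be started away from it; by symmetry, a start below ½ leads to an estimate below ½.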

11 2-species Evolution

Give an analytic solution (if one exists) for the MLE:

Find the extreme points of the log-likelihood. [Formula not preserved in this transcript; the slide labels its extreme points as a minimum and maxima.]
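A worked derivation of the extreme points, offered as a reconstruction rather than the slide's own formula. It assumes the likelihood $L(\theta) = \left(\tfrac12((1-\theta)^2+\theta^2)\right)^{n_0}(\theta(1-\theta))^{n_1}$ with $n = n_0 + n_1$, and introduces the auxiliary symbol $q = \theta(1-\theta)$, which is not in the original.

```latex
\begin{align*}
\log L(\theta) &= n_0 \log \tfrac{1}{2}\bigl((1-\theta)^2 + \theta^2\bigr)
                 + n_1 \log \bigl(\theta(1-\theta)\bigr) \\
(1-\theta)^2 + \theta^2 &= 1 - 2q, \qquad q = \theta(1-\theta) \\
\frac{d \log L}{d\theta} &= (1 - 2\theta)\left(\frac{n_1}{q} - \frac{2 n_0}{1 - 2q}\right) = 0 \\
\Rightarrow\quad \theta &= \tfrac{1}{2}
  \quad\text{or}\quad q = \frac{n_1}{2n} \\
\theta(1-\theta) = \frac{n_1}{2n}
  \;&\Rightarrow\; \theta = \frac{1 \pm \sqrt{1 - 2 n_1 / n}}{2}
  \qquad (n_1 \le n/2)
\end{align*}
```

When $0 < n_1 < n/2$, the point $\theta = \tfrac12$ is a local minimum and the two symmetric roots are maxima, which matches the slide's labels. When $n_1 \ge n/2$, the roots are not distinct real values and $\theta = \tfrac12$ is the maximizer.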

12 Generalizing The Model

• Alphabet of size k:
  - Uniform transition model [formula not preserved in this transcript]
  - More complex transition models
• Evolution of n species (given the phylogenetic topology):
  - θ_i correlates to evolutionary distance along the edge
  - solves 'small' likelihood problem

[Figure: phylogenetic tree with hidden internal nodes X_1,…,X_5, edge parameters θ_1,…,θ_4, and observed leaves Y_1,…,Y_n]

