
1. Alignment HMMs – Tutorial #10 © Ilan Gronau

2. Global Alignment HMM
States: START, M, I_S, I_T, END.
Emissions: M emits aligned pairs (a,a), (a,b), …, (z,z); I_S emits (a,-), …, (z,-); I_T emits (-,a), …, (-,z).
Probability distributions: the HMM defines a probability distribution over all possible alignments.

3. Global Alignment HMM
- The output sequence is an alignment; an alignment defines a (hidden) Markov chain.
- The HMM defines a distribution over all alignments.
- Given a sequence-pair S,T: find the most probable alignment of S,T.
Example alignment and its state path:
S: A C C - - T
T: T - C C G -
state path: start → M → I_S → M → I_T → I_T → I_S → end

4. Global Alignment HMM – Parameter settings
Transitions (rows = from-state, columns = to-state):
        start   M        I_S   I_T   end
start   -       1-2δ-τ   δ     δ     τ
M       -       1-2δ-τ   δ     δ     τ
I_S     -       1-ε-τ    ε     -     τ
I_T     -       1-ε-τ    -     ε     τ
end     -       -        -     -     1
Emissions:
- M:   Pr[(a,b) | M]   = p_ab   (estimated e.g. by PAM)
- I_S: Pr[(a,-) | I_S] = q_a
- I_T: Pr[(-,a) | I_T] = q_a
Parameters: τ – termination probability; δ – gap introduction; ε – gap elongation.
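These tables translate directly into data structures. Below is a minimal Python sketch (not part of the tutorial) of one way to encode them; the DNA alphabet, the uniform q_a values, the placeholder p_ab values, and the numbers chosen for δ, ε, τ are illustrative assumptions, not values from the slides.

```python
# Illustrative parameter values -- NOT from the slides.
DELTA, EPSILON, TAU = 0.1, 0.3, 0.05   # gap introduction, gap elongation, termination

ALPHABET = "ACGT"                      # assumed DNA alphabet
Q = {a: 0.25 for a in ALPHABET}        # background frequencies q_a (uniform placeholder)

# Match-state emission probabilities p_ab; a real model would estimate these
# (e.g. PAM estimation, as the slide notes). Placeholder favouring identical pairs.
P = {(a, b): (0.2 if a == b else 0.05 / 3) for a in ALPHABET for b in ALPHABET}

# Transition probabilities: row = from-state, column = to-state (matches the table above).
TRANS = {
    "start": {"M": 1 - 2*DELTA - TAU, "IS": DELTA,   "IT": DELTA,   "end": TAU},
    "M":     {"M": 1 - 2*DELTA - TAU, "IS": DELTA,   "IT": DELTA,   "end": TAU},
    "IS":    {"M": 1 - EPSILON - TAU, "IS": EPSILON,                "end": TAU},
    "IT":    {"M": 1 - EPSILON - TAU,                "IT": EPSILON, "end": TAU},
}
```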

5. Global Alignment HMM
- Given a sequence-pair S,T: find the most probable alignment of S,T.
- A sequence-pair S,T together with a path in the HMM defines an alignment.
- Use a Viterbi-like algorithm: V_X(i,j) – probability of the most probable alignment of S_1…S_i and T_1…T_j ending in state X.
Example: S = ACCGT, T = TCCG
S: A C C - G T
T: T - C C G -
state path: start → M → I_S → M → I_T → M → I_S → end

6. Global Alignment HMM – Viterbi algorithm
V_X(i,j) – probability of the most probable alignment of S_1…S_i and T_1…T_j ending in state X.
Initialize: V_M(0,0) = τ ; V_S(0,0) = V_T(0,0) = 0
(START behaves like M; the initial τ is a pre-payment for terminating the alignment.)
Recursion formulae:
- V_M(i,j) = p_{S_i,T_j} ∙ max{ (1-2δ-τ)∙V_M(i-1,j-1) , (1-ε-τ)∙V_S(i-1,j-1) , (1-ε-τ)∙V_T(i-1,j-1) }
- V_S(i,j) = q_{S_i} ∙ max{ δ∙V_M(i-1,j) , ε∙V_S(i-1,j) }
- V_T(i,j) = q_{T_j} ∙ max{ δ∙V_M(i,j-1) , ε∙V_T(i,j-1) }
Hold update-pointers to recover the alignment by traceback. At the end, the probability of the most probable alignment is max{ V_M(n,m) , V_S(n,m) , V_T(n,m) } (τ was already pre-paid at initialization).
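To make the recursion concrete, here is a minimal Python sketch of the Viterbi-like algorithm above (not the tutorial's own code). It assumes parameter containers such as the P, Q, DELTA, EPSILON, TAU of the previous sketch; function and variable names are illustrative. Probabilities are multiplied directly, so for long sequences the log-space version of the following slides should be used instead.

```python
def viterbi_global(S, T, P, Q, delta, epsilon, tau):
    """Most probable global alignment of S and T under the pair HMM.

    V_X(i,j) = probability of the most probable alignment of S[:i] and T[:j]
    ending in state X. START is treated like M, and tau is pre-paid at (0,0).
    Returns (probability, alignment columns)."""
    n, m = len(S), len(T)
    VM = [[0.0] * (m + 1) for _ in range(n + 1)]
    VS = [[0.0] * (m + 1) for _ in range(n + 1)]
    VT = [[0.0] * (m + 1) for _ in range(n + 1)]
    ptr = {}                       # update-pointers for the traceback
    VM[0][0] = tau                 # pre-payment for terminating the alignment

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:    # state M: emit the pair (S_i, T_j)
                best, prev = max(((1 - 2*delta - tau) * VM[i-1][j-1], "M"),
                                 ((1 - epsilon - tau) * VS[i-1][j-1], "IS"),
                                 ((1 - epsilon - tau) * VT[i-1][j-1], "IT"))
                VM[i][j] = P[(S[i-1], T[j-1])] * best
                ptr[("M", i, j)] = prev
            if i > 0:              # state I_S: emit (S_i, -)
                best, prev = max((delta * VM[i-1][j], "M"),
                                 (epsilon * VS[i-1][j], "IS"))
                VS[i][j] = Q[S[i-1]] * best
                ptr[("IS", i, j)] = prev
            if j > 0:              # state I_T: emit (-, T_j)
                best, prev = max((delta * VM[i][j-1], "M"),
                                 (epsilon * VT[i][j-1], "IT"))
                VT[i][j] = Q[T[j-1]] * best
                ptr[("IT", i, j)] = prev

    # Termination: tau was pre-paid, so simply take the best end state.
    prob, state = max((VM[n][m], "M"), (VS[n][m], "IS"), (VT[n][m], "IT"))

    # Traceback: recover the alignment columns from the update-pointers.
    cols, i, j = [], n, m
    while (i, j) != (0, 0):
        if state == "M":
            cols.append((S[i-1], T[j-1]))
            state, i, j = ptr[("M", i, j)], i - 1, j - 1
        elif state == "IS":
            cols.append((S[i-1], "-"))
            state, i = ptr[("IS", i, j)], i - 1
        else:
            cols.append(("-", T[j-1]))
            state, j = ptr[("IT", i, j)], j - 1
    return prob, cols[::-1]
```

For example, viterbi_global("ACCGT", "TCCG", P, Q, DELTA, EPSILON, TAU) returns the probability of the best state path together with its alignment columns, such as ('A','T') for A aligned to T or ('C','-') for a gap in T.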

7. Log-Odds Score For Alignment
Problem: we cannot compare alignments of different lengths (elongating an alignment always reduces its probability), so we cannot perform local alignment!
Solution: compare to a neutral random model for generating sequences, i.e. calculate a background probability.
Random sequence generator (background model): a single emitting state that outputs a character a with probability q_a, continues with probability 1-η, and terminates with probability η.
Pairwise alignment generator: the global alignment HMM above (START, M, I_S, I_T, END).

8. Log-Odds Score For Alignment – Eliminate the background probability
Compare the alignment probability with the probability of observing S,T at random:
- Pr[S] = Pr[S | Random] = Π_{i=1..n} q_{S_i} ∙ (1-η)^(n-1) ∙ η
- Pr[S,T | Random] = Pr[S] ∙ Pr[T]
Log-odds score: log( Pr*[S,T | Aligned] / Pr[S,T | Random] ) = log(Pr[A*]) – ( log(Pr[S]) + log(Pr[T]) )
- This enables comparison of alignments of different sequences and of different lengths.
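A small Python sketch (illustrative names, not from the tutorial) of the background probability and of the resulting log-odds score; it assumes a dictionary Q of character frequencies q_a and the termination probability eta of the random model, and it takes the log-probability of the best alignment (e.g. the log of the value returned by the Viterbi sketch above) as input.

```python
import math

def log_prob_random(S, Q, eta):
    """log Pr[S | Random] for the single-state background generator:
    emit character a with probability q_a, continue with 1-eta, stop with eta.
    Implements the slide's formula: prod(q_{S_i}) * (1-eta)^(n-1) * eta."""
    n = len(S)
    return sum(math.log(Q[a]) for a in S) + (n - 1) * math.log(1 - eta) + math.log(eta)

def log_odds_score(log_pr_best_alignment, S, T, Q, eta):
    """log( Pr*[S,T | Aligned] / Pr[S,T | Random] )
    = log Pr[A*] - ( log Pr[S] + log Pr[T] )."""
    return log_pr_best_alignment - (log_prob_random(S, Q, eta)
                                    + log_prob_random(T, Q, eta))
```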

9. Log-Odds Score For Alignment – Modified Viterbi algorithm
Log-odds score: log(Pr[A*]) – ( log(Pr[S]) + log(Pr[T]) )
Change the Viterbi algorithm accordingly: v_X(i,j) – the log of the ratio between V_X(i,j) and the background probability of S_1…S_i and T_1…T_j.
Initialize: v_M(0,0) = log(τ) – 2∙log(η) ; v_S(0,0) = v_T(0,0) = -∞
(The initialization is a pre-payment for terminating the two sequences and the alignment.)
Recursion formulae: see the next slide.

10. Log-Odds Score For Alignment – Modified Viterbi algorithm
Initialize: v_M(0,0) = log(τ) – 2∙log(η) ; v_S(0,0) = v_T(0,0) = -∞
Recursion formulae:
- v_M(i,j) = log(p_{S_i,T_j}) - log(q_{S_i}) - log(q_{T_j}) - 2∙log(1-η) + max{ log(1-2δ-τ) + v_M(i-1,j-1) , log(1-ε-τ) + v_S(i-1,j-1) , log(1-ε-τ) + v_T(i-1,j-1) }
- v_S(i,j) = - log(1-η) + max{ log(δ) + v_M(i-1,j) , log(ε) + v_S(i-1,j) }
- v_T(i,j) = - log(1-η) + max{ log(δ) + v_M(i,j-1) , log(ε) + v_T(i,j-1) }

11. Log-Odds Score For Alignment – Local Alignment
Initialize: v_M(0,0) = w = log(τ) – 2∙log(η) ; v_S(0,0) = v_T(0,0) = -∞
Recursion formulae: as on the previous slide, with two modifications (see the sketch below):
- We reset the computation if Pr*[S_1..i,T_1..j | Aligned] < Pr[S_1..i,T_1..j | Random], i.e. a prefix alignment that scores worse than the random model is discarded and the alignment is restarted.
- We can terminate anywhere in the middle.
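The slides leave the exact reset rule to a figure, so the following Python sketch is one plausible reading rather than the tutorial's own formulation: the fresh-start value w = log(τ) – 2∙log(η) is offered as an extra option in the maximization for v_M (so the score never falls below a restart), and the best score seen anywhere in the table is reported, so the local alignment may start and end anywhere. Parameter names follow the earlier sketches.

```python
import math

def local_log_odds_viterbi(S, T, P, Q, delta, epsilon, tau, eta):
    """Log-odds Viterbi with local-alignment resets (sketch of one reading).

    v_X(i,j) = log( V_X(i,j) / background probability of S[:i] and T[:j] ).
    w is both the initial value and the restart option inside the max for v_M.
    Returns the best local log-odds score found anywhere in the table."""
    n, m = len(S), len(T)
    NEG = float("-inf")
    lg = math.log
    w = lg(tau) - 2 * lg(eta)
    vM = [[NEG] * (m + 1) for _ in range(n + 1)]
    vS = [[NEG] * (m + 1) for _ in range(n + 1)]
    vT = [[NEG] * (m + 1) for _ in range(n + 1)]
    vM[0][0] = w
    best = w                        # may terminate anywhere in the middle

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                vM[i][j] = (lg(P[(S[i-1], T[j-1])]) - lg(Q[S[i-1]]) - lg(Q[T[j-1]])
                            - 2 * lg(1 - eta)
                            + max(lg(1 - 2*delta - tau) + vM[i-1][j-1],
                                  lg(1 - epsilon - tau) + vS[i-1][j-1],
                                  lg(1 - epsilon - tau) + vT[i-1][j-1],
                                  w))          # reset: restart the alignment here
            if i > 0:
                vS[i][j] = -lg(1 - eta) + max(lg(delta) + vM[i-1][j],
                                              lg(epsilon) + vS[i-1][j])
            if j > 0:
                vT[i][j] = -lg(1 - eta) + max(lg(delta) + vM[i][j-1],
                                              lg(epsilon) + vT[i][j-1])
            best = max(best, vM[i][j], vS[i][j], vT[i][j])
    return best
```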

12. Transforming into The Familiar Form
Idea: switch to a constant payment when entering state M, i.e. pre-pay the cost of closing a gap when the gap is opened. This turns the modified Viterbi algorithm into the DP algorithm for affine gap scores (tutorial #3). Define:
- d = -( log(δ) - log(1-η) + log(1-ε-τ) - log(1-2δ-τ) )   (gap-opening cost)
- e = -( log(ε) - log(1-η) )   (gap-extension cost)
- σ(a,b) = log(p_ab) - log(q_a) - log(q_b) - 2∙log(1-η) + log(1-2δ-τ)   (substitution score)
At the end pick max{ v_M(n,m) ; v_S(n,m)+c ; v_T(n,m)+c }, where c is a constant correcting for the pre-payment when the alignment ends in a gap state.
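These three definitions translate directly into code. A short Python sketch (illustrative function name, parameter names as in the earlier sketches) that computes d, e, and the substitution scores σ(a,b) from the HMM parameters:

```python
import math

def affine_gap_scores(P, Q, delta, epsilon, tau, eta):
    """Translate the pair-HMM parameters into the familiar affine gap scores:
    gap-opening cost d, gap-extension cost e, substitution scores sigma(a,b),
    following the formulas on the slide."""
    lg = math.log
    d = -(lg(delta) - lg(1 - eta) + lg(1 - epsilon - tau) - lg(1 - 2*delta - tau))
    e = -(lg(epsilon) - lg(1 - eta))
    sigma = {(a, b): lg(P[(a, b)]) - lg(Q[a]) - lg(Q[b])
                     - 2 * lg(1 - eta) + lg(1 - 2*delta - tau)
             for (a, b) in P}
    return d, e, sigma
```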

13. Probability of Association
What is the probability that S and T are associated?
- One answer: use the log-odds score of their most probable alignment. Problem: maybe S,T have many alignments with similar scores.
- Better solution: sum up the probabilities of all possible alignments, using a 'forward' (or 'backward') type algorithm.
Initialize: F_M(0,0) = τ ; F_S(0,0) = F_T(0,0) = 0
Recursion formulae (the Viterbi recursion with max replaced by summation):
- F_M(i,j) = p_{S_i,T_j} ∙ [ (1-2δ-τ)∙F_M(i-1,j-1) + (1-ε-τ)∙( F_S(i-1,j-1) + F_T(i-1,j-1) ) ]
- F_S(i,j) = q_{S_i} ∙ [ δ∙F_M(i-1,j) + ε∙F_S(i-1,j) ]
- F_T(i,j) = q_{T_j} ∙ [ δ∙F_M(i,j-1) + ε∙F_T(i,j-1) ]
Pr[S,T | Aligned] = F_M(n,m) + F_S(n,m) + F_T(n,m)
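A minimal Python sketch of this forward computation (illustrative names, same parameter containers as in the earlier sketches). It works with raw probabilities, so for long sequences a log-space or scaled implementation would be needed.

```python
def forward_prob_aligned(S, T, P, Q, delta, epsilon, tau):
    """Pr[S,T | Aligned]: sum over all alignments (forward algorithm).
    Same recursion as the Viterbi sketch with max replaced by a sum;
    F_M(0,0) = tau pre-pays the transition into END."""
    n, m = len(S), len(T)
    FM = [[0.0] * (m + 1) for _ in range(n + 1)]
    FS = [[0.0] * (m + 1) for _ in range(n + 1)]
    FT = [[0.0] * (m + 1) for _ in range(n + 1)]
    FM[0][0] = tau

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                FM[i][j] = P[(S[i-1], T[j-1])] * (
                    (1 - 2*delta - tau) * FM[i-1][j-1]
                    + (1 - epsilon - tau) * (FS[i-1][j-1] + FT[i-1][j-1]))
            if i > 0:
                FS[i][j] = Q[S[i-1]] * (delta * FM[i-1][j] + epsilon * FS[i-1][j])
            if j > 0:
                FT[i][j] = Q[T[j-1]] * (delta * FM[i][j-1] + epsilon * FT[i][j-1])

    return FM[n][m] + FS[n][m] + FT[n][m]
```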

14. Probability of Association
Pr[S,T | Aligned] = F_M(n,m) + F_S(n,m) + F_T(n,m)
Background distribution (random sequence generator): Pr[S,T | Random] = Π_{i=1..n} q_{S_i} ∙ Π_{j=1..m} q_{T_j} ∙ (1-η)^(n+m-2) ∙ η^2
Log-odds: compare Pr[S,T | Aligned] with Pr[S,T | Random].
Local association:
- Divide by the background distribution in the recursive calculation.
- Reset when Pr[S_1..i,T_1..j | Aligned] < Pr[S_1..i,T_1..j | Random].
- Terminate anywhere in the middle.

