HMM: Hidden Markov Model

CpG islands. In the human genome, CG dinucleotides are relatively rare. CpG pairs undergo a process called methylation that modifies the C nucleotide. A methylated C mutates (with relatively high chance) to a T. Promoter regions are CG-rich; these regions are not methylated, and thus mutate less often. These are called CG (aka CpG) islands.

Initiation of Transcription from a Promoter

Properties of CpG Islands

Finding CpG islands. Simple approach: pick a window of size N (N = 100, for example); compute the log-ratio for the sequence in the window, and classify based on that. Problems: How do we select N? What do we do when the window intersects the boundary of a CpG island?

Slot machine

Fair Bet Casino. The game is to flip coins, which results in only two possible outcomes: Head or Tail. The Fair coin gives Heads and Tails with the same probability ½. The Biased coin gives Heads with probability ¾. Fair coin: P(H) = ½, P(T) = ½; Biased coin: P(H) = ¾, P(T) = ¼.

The “Fair Bet Casino” (cont’d). Thus, we define the probabilities: P(H|F) = P(T|F) = ½; P(H|B) = ¾, P(T|B) = ¼. The crooked dealer changes between the Fair and Biased coins with probability 10%. Game start:
Tosses: T H H H H T T T T H T
Coins:  F F F F F F F B B B B

The Fair Bet Casino Problem. Input: A sequence x = x_1 x_2 x_3 … x_n of coin tosses made by two possible coins (F or B). Output: A sequence π = π_1 π_2 π_3 … π_n, with each π_i being either F or B, indicating that x_i is the result of tossing the Fair or Biased coin, respectively.

P(x|fair coin) vs. P(x|biased coin). Suppose first that the dealer never changes coins. Some definitions: P(x|fair coin) is the probability of the dealer using the F coin and generating the outcome x; P(x|biased coin) is the probability of the dealer using the B coin and generating the outcome x.

P(x|fair coin) vs. P(x|biased coin).
P(x|fair coin) = P(x_1 … x_n | fair coin) = Π_{i=1..n} p(x_i | fair coin) = (1/2)^n
P(x|biased coin) = P(x_1 … x_n | biased coin) = Π_{i=1..n} p(x_i | biased coin) = (3/4)^k (1/4)^(n−k) = 3^k / 4^n
where k is the number of Heads in x.
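
As a quick check of these two formulas, here is a minimal Python sketch (assuming the toss sequence from the "Game start" slide, written with H and T):

```python
from math import prod

def likelihood(x, p_heads):
    """P(x | coin) when every toss uses the same coin with P(Heads) = p_heads."""
    return prod(p_heads if toss == "H" else 1 - p_heads for toss in x)

x = "THHHHTTTTHT"              # example toss sequence (5 Heads, 6 Tails)
print(likelihood(x, 0.5))      # P(x | fair coin)   = (1/2)^11
print(likelihood(x, 0.75))     # P(x | biased coin) = (3/4)^5 * (1/4)^6
```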

P(x|fair coin) vs. P(x|biased coin).
P(x|fair coin) = P(x|biased coin) when
(1/2)^n = 3^k / 4^n
2^n = 3^k
n = k log2 3
i.e. when k = n / log2 3 (k ≈ 0.67 n).

Log-odds Ratio. We define the log-odds ratio as follows:
log2( P(x|fair coin) / P(x|biased coin) ) = Σ_{i=1..n} log2( p+(x_i) / p−(x_i) ) = n − k log2 3,
where p+ and p− are the per-toss probabilities under the fair and biased coins, respectively.

Log-odds Ratio in Sliding Windows. Consider a sliding window over the outcome sequence x_1 x_2 x_3 … x_n and compute the log-odds ratio for the short sequence inside each window. A log-odds value above 0 means the Fair coin was most likely used; a value below 0 means the Biased coin was most likely used. Disadvantages: the length of the Fair-coin stretch is not known in advance, and different windows may classify the same position differently.
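
A small sliding-window sketch in Python, using the closed form N − k·log2(3) derived above (the window size and toss sequence are just illustrative):

```python
import math

LOG2_3 = math.log2(3)

def window_log_odds(x, N):
    """Log-odds log2(P(w|fair)/P(w|biased)) = N - k*log2(3) for each length-N window w."""
    scores = []
    for start in range(len(x) - N + 1):
        k = x[start:start + N].count("H")   # number of Heads in this window
        scores.append(N - k * LOG2_3)
    return scores

tosses = "THHHHTTTTHT"
print(window_log_odds(tosses, 5))   # > 0 suggests the Fair coin, < 0 the Biased coin
```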

HMM

Markov Process & Transition Matrix. A stochastic process for which the probability of entering a certain state depends only on the last state occupied is called a Markov process. The process is governed by a transition matrix. Example: suppose that the 1995 state of land use in a city of 50 square miles of area is I (residential) 30%, II (commercial) 20%, III (industrial) 50%. Assuming that the transition matrix for 5-year intervals is given (rows: from state I, II, III; columns: to state I, II, III), the residential share in 2000 is
I (residential) = 0.8 · 30% + 0.1 · 20% + 0 · 50% = 26%.
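
The 2000 figure is just a dot product of the 1995 distribution with one column of the transition matrix. A small sketch (only the "to I" column, 0.8 / 0.1 / 0, appears in the computation above; the other columns would be needed to update states II and III):

```python
# Distribution over land-use states I, II, III in 1995 (fractions of the area).
state_1995 = [0.30, 0.20, 0.50]

# P(to I | from I), P(to I | from II), P(to I | from III) -- the column of the
# 5-year transition matrix used in the 26% computation on the slide.
to_residential = [0.8, 0.1, 0.0]

share_I_2000 = sum(p * q for p, q in zip(state_1995, to_residential))
print(share_I_2000)   # 0.26, i.e. 26% residential land use in 2000
```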

Markov Model. A stochastic process for which the probability of entering a certain state depends only on the last state occupied is called a Markov process. A sequence of states such as R I I C C R R R C C I I (over Residentially used, Commercially used, Industrially used) is a Markov chain.

Basic Mathematics. Markov chains: probability of a sequence S = a_1 a_2 … a_n:
P(S) = P(a_1) P(a_2|a_1) P(a_3|a_1 a_2) … P(a_n|a_1 … a_{n−1})
     = P(a_1) P(a_2|a_1) P(a_3|a_2) … P(a_n|a_{n−1})   (by the Markov property)
     = P(a_1) Π_{i=2..n} P(a_i|a_{i−1})
Exercise: using Bayes' theorem and a transition matrix over the states A, T, G, C, compute the probability of the sequence CATG.
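
A minimal sketch of the chain-probability formula above. The initial and transition probabilities here are uniform placeholders (0.25 everywhere), not the matrix from the original slide:

```python
# P(S) = P(a_1) * product over i of P(a_i | a_{i-1}) for a first-order Markov chain.
initial = {b: 0.25 for b in "ACGT"}                              # placeholder P(a_1)
transition = {b1: {b2: 0.25 for b2 in "ACGT"} for b1 in "ACGT"}  # placeholder P(a_i | a_{i-1})

def chain_prob(seq):
    p = initial[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= transition[prev][cur]
    return p

print(chain_prob("CATG"))   # 0.25^4 with the uniform placeholder model
```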

Hidden Markov Model: which die was used is hidden.

Protein/DNA sequence patterns. Example: the DNA sequence AACATGGTACATGTTAG…, where each nucleotide (A, G, C, T) is emitted by either an exon state or an intron state. The states exon and intron are hidden.

HMM. The state sequence is hidden. The process is governed by a transition matrix
a_kl = P(π_i = l | π_{i−1} = k)
and emission probabilities
e_k(b) = P(x_i = b | π_i = k).
In an HMM the probability depends on the states passed through. If the state path is known (states = s_1 s_2 … s_n):
P(Seq) = P(a_1|s_1) P(s_2|s_1) P(a_2|s_2) P(s_3|s_2) … P(s_n|s_{n−1}) P(a_n|s_n)
If it is unknown, sum over all possible paths. How do we find the state sequence that maximizes P(S)?

Hidden Markov Model (HMM). Can be viewed as an abstract machine with k hidden states that emits symbols from an alphabet Σ. Each state has its own probability distribution, and the machine switches between states according to this probability distribution. While in a certain state, the machine makes two decisions: What state should I move to next? What symbol (from the alphabet Σ) should I emit?

Why “Hidden”? Observers can see the emitted symbols of an HMM but cannot tell which state the HMM is currently in. Thus, the goal is to infer the most likely sequence of hidden states of an HMM from the given sequence of emitted symbols.

Fair Bet Casino Problem. Any observed outcome of coin tosses could have been generated by any sequence of states! We need a way to grade different state sequences differently. This is the Decoding Problem.

HMM Parameters. Σ: set of emission characters. Examples: Σ = {H, T} for coin tossing; Σ = {1, 2, 3, 4, 5, 6} for dice tossing; Σ = {A, C, G, T} for a DNA sequence. Q: set of hidden states, each emitting symbols from Σ. Examples: Q = {F, B} for coin tossing; Q = {intron, exon} for a gene.

HMM Parameters (cont’d). A = (a_kl): a |Q| × |Q| matrix of the probability of changing from state k to state l: a_FF = 0.9, a_FB = 0.1, a_BF = 0.1, a_BB = 0.9. E = (e_k(b)): a |Q| × |Σ| matrix of the probability of emitting symbol b while in state k: e_F(0) = ½, e_F(1) = ½, e_B(0) = ¼, e_B(1) = ¾.

HMM for Fair Bet Casino. The Fair Bet Casino in HMM terms: Σ = {0, 1} (0 for Tails, 1 for Heads); Q = {F, B}, F for the Fair and B for the Biased coin.

Transition probabilities A:
           Fair          Biased
  Fair     a_FF = 0.9    a_FB = 0.1
  Biased   a_BF = 0.1    a_BB = 0.9

Emission probabilities E:
           Tails (0)     Heads (1)
  Fair     e_F(0) = ½    e_F(1) = ½
  Biased   e_B(0) = ¼    e_B(1) = ¾
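
One straightforward way to encode these parameters in code (a sketch; the symbols 0/1 are written as "T"/"H" for readability, and the later snippets reuse the same layout):

```python
# The Fair Bet Casino HMM parameters from the tables above.
states = ["F", "B"]                                   # Q
alphabet = ["T", "H"]                                 # Sigma (0 = Tails, 1 = Heads)
transition = {"F": {"F": 0.9, "B": 0.1},              # A = (a_kl)
              "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"T": 0.5,  "H": 0.5},               # E = (e_k(b))
            "B": {"T": 0.25, "H": 0.75}}
```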

HMM for Fair Bet Casino (cont’d): HMM model for the Fair Bet Casino Problem.

Hidden Paths. A path π = π_1 … π_n in the HMM is defined as a sequence of states. Consider a path π = FFFBBBBBFFF and a sequence x of tosses (0 = T, 1 = H):

  π:                  F    F    F    B    B    B    B    B    F    F    F
  P(x_i | π_i):       ½    ½    ½    ¾    ¾    ¾    ¼    ¾    ½    ½    ½
  P(π_{i−1} → π_i):   ½   9/10 9/10 1/10 9/10 9/10 9/10 9/10 1/10 9/10 9/10

Here P(x_i | π_i) is the probability that x_i was emitted from state π_i, and P(π_{i−1} → π_i) is the transition probability from state π_{i−1} to state π_i.

Decoding Problem. Goal: Find an optimal (most probable) hidden path of states given the observations. Input: Sequence of observations x = x_1 … x_n generated by an HMM M(Σ, Q, A, E). Output: A path that maximizes P(x, π) over all possible paths π.

Decoding Problem: trellis of the states F and B over the observed tosses T H H H H T T T T.

P(x, π) Calculation. P(x, π) is the probability that sequence x was generated by the path π:

P(x, π) = P(π_0 → π_1) · Π_{i=1..n} P(x_i | π_i) · P(π_i → π_{i+1})
        = a_{π_0, π_1} · Π_{i=1..n} e_{π_i}(x_i) · a_{π_i, π_{i+1}}
        = Π_{i=0..n−1} e_{π_{i+1}}(x_{i+1}) · a_{π_i, π_{i+1}}   (if we count from i = 0 instead of i = 1 to i = n)

Number of possible paths?
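
A small sketch of this product for the toss sequence and coin path shown on the "Game start" slide (the transition from the begin state to the first coin is taken as ½, as in the Hidden Paths slide):

```python
transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}

def joint_prob(x, path, p_start=0.5):
    """P(x, pi) = product over i of e_{pi_i}(x_i) * a_{pi_{i-1}, pi_i}."""
    p = p_start * emission[path[0]][x[0]]
    for i in range(1, len(x)):
        p *= transition[path[i - 1]][path[i]] * emission[path[i]][x[i]]
    return p

print(joint_prob("THHHHTTTTHT", "FFFFFFFBBBB"))
```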

Building Manhattan for the Decoding Problem. Andrew Viterbi used the Manhattan grid model to solve the Decoding Problem. Every choice of π = π_1 … π_n corresponds to a path in the graph. The only valid direction in the graph is eastward. This graph has |Q|²(n−1) edges.

Graph for Decoding Problem

Decoding Problem vs. Alignment Problem: valid directions in the alignment problem vs. valid directions in the decoding problem.

Decoding Problem as Finding a Longest Path in a DAG. The Decoding Problem is reduced to finding a longest path in the directed acyclic graph (DAG) above. Note: the length of a path is defined as the product of its edge weights, not the sum.

Decoding Problem: weights of edges. The weight of the edge from (k, i) to (l, i+1) is w = e_l(x_{i+1}) · a_kl.

Decoding Problem. Maximize P(x, π) = Π_{i=0..n−1} e_{π_{i+1}}(x_{i+1}) · a_{π_i, π_{i+1}}.

Decoding Problem (cont’d). Every path in the graph has probability P(x, π), which equals the length of the path. The Viterbi algorithm finds the path that maximizes P(x, π) among all possible paths. The Viterbi algorithm runs in O(n |Q|²) time.

Decoding Problem and Dynamic Programming.
s_{l,i+1} = max_{k∈Q} { s_{k,i} · weight of edge (k, i) → (l, i+1) }
          = max_{k∈Q} { s_{k,i} · a_kl · e_l(x_{i+1}) }
          = e_l(x_{i+1}) · max_{k∈Q} { s_{k,i} · a_kl }

Decoding Problem (cont’d). Initialization: s_{begin,0} = 1 and s_{k,0} = 0 for k ≠ begin. Let π* be the optimal path. Then
P(x, π*) = max_{k∈Q} { s_{k,n} · a_{k,end} }

Most probable path: Viterbi algorithm. Dynamic programming: define p_{i,j} as the probability of the most probable path ending in state j after emitting element i. Define the solution recursively: suppose we know p_{i−1,j} for the states up to the previous character; then update
p_{i,k} = P(a_i | s_k) · max_j ( p_{i−1,j} · P(s_k | s_j) )
and trace back. Keep a table of state probabilities: start with the first character, assign a probability to each state, and iterate the updates.
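
A compact Viterbi sketch for the Fair Bet Casino HMM, with traceback. The parameters are the ones from the earlier slides; the ½ start probability for each coin is an assumption standing in for the begin state:

```python
states = ["F", "B"]
start = {"F": 0.5, "B": 0.5}                     # assumed begin-state transitions
transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}

def viterbi(x):
    # s[i][k]: probability of the most probable path that ends in state k
    # after emitting x[0..i]; back[i][k]: the best predecessor of k at step i.
    s = [{k: start[k] * emission[k][x[0]] for k in states}]
    back = [{}]
    for i in range(1, len(x)):
        s.append({})
        back.append({})
        for l in states:
            best_k = max(states, key=lambda k: s[i - 1][k] * transition[k][l])
            back[i][l] = best_k
            s[i][l] = emission[l][x[i]] * s[i - 1][best_k] * transition[best_k][l]
    # Traceback from the most probable final state.
    last = max(states, key=lambda k: s[-1][k])
    path = [last]
    for i in range(len(x) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return "".join(reversed(path))

print(viterbi("THHHHTTTTHT"))   # most probable coin (F or B) for each toss
```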

Most probable path: Viterbi algorithm (worked dynamic-programming table of the state probabilities over the sequence a_1 a_2 a_3 a_4).

Problem with Viterbi Algorithm. The value of the product can become extremely small, which leads to underflow. To avoid underflow, use log values instead:
s_{l,i+1} = log e_l(x_{i+1}) + max_{k∈Q} { s_{k,i} + log(a_kl) }
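
The same update written in log space, sketched with the Fair Bet Casino parameters (sums of logs replace products, so the values stay in a numerically safe range):

```python
import math

states = ["F", "B"]
transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}

def log_viterbi_update(s_prev, symbol):
    """s[l, i+1] = log e_l(x_{i+1}) + max_k ( s[k, i] + log a_kl )."""
    return {l: math.log(emission[l][symbol])
               + max(s_prev[k] + math.log(transition[k][l]) for k in states)
            for l in states}

# One step: log-scores after observing 'T', then updated after observing 'H'.
s0 = {k: math.log(0.5 * emission[k]["T"]) for k in states}
print(log_viterbi_update(s0, "H"))
```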

Exercise. Consider a hidden Markov model with a transition matrix over the states exon and intron and an emission matrix over the symbols purine and pyrimidine. What is the most probable sequence of states for the given DNA sequence ACGG?

Forward-Backward Problem. Given: a sequence of coin tosses generated by an HMM. Goal: find the probability that the dealer was using a biased coin at a particular time i. (Tosses: T H H H H T T T T H T)

Forward-Backward Problem: trellis of the states F and B over the tosses T H H H H T T T T, with position i marked.

Forward Algorithm. Define f_{k,i} (the forward probability) as the probability of emitting the prefix x_1 … x_i and reaching the state π_i = k. The recurrence for the forward algorithm:
f_{k,i} = e_k(x_i) · Σ_{l∈Q} f_{l,i−1} · a_{lk}
Similar to Viterbi, except ‘max’ is replaced with a probabilistic ‘sum’.
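
A short forward-algorithm sketch for the Fair Bet Casino HMM (same illustrative parameters and assumed ½ start probabilities as in the Viterbi sketch):

```python
states = ["F", "B"]
start = {"F": 0.5, "B": 0.5}
transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}

def forward(x):
    # f[i][k] = P(x[0..i], pi_i = k); same recurrence as Viterbi with max -> sum.
    f = [{k: start[k] * emission[k][x[0]] for k in states}]
    for i in range(1, len(x)):
        f.append({k: emission[k][x[i]]
                     * sum(f[i - 1][l] * transition[l][k] for l in states)
                  for k in states})
    return f

f = forward("THHHHTTTTHT")
print(sum(f[-1].values()))   # P(x): total probability of the observed tosses
```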

Dishonest Casino. Computing posterior probabilities for “fair” at each point in a long sequence:

Backward Algorithm. However, the forward probability is not the only quantity that provides information about P(π_i = k | x). The sequence of transitions and emissions that the HMM undergoes between π_{i+1} and π_n also affects P(π_i = k | x): the forward part covers the sequence up to x_i, the backward part covers the sequence after x_i.

Backward Algorithm (cont’d). Define the backward probability b_{k,i} as the probability of being in state π_i = k and emitting the suffix x_{i+1} … x_n. The recurrence for the backward algorithm:
b_{k,i} = Σ_{l∈Q} a_kl · e_l(x_{i+1}) · b_{l,i+1}
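
A matching backward-algorithm sketch (Fair Bet Casino parameters again; b at the last position is initialized to 1, assuming no explicit end state):

```python
states = ["F", "B"]
transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}

def backward(x):
    n = len(x)
    b = [dict() for _ in range(n)]
    b[n - 1] = {k: 1.0 for k in states}       # no suffix left to emit
    for i in range(n - 2, -1, -1):
        b[i] = {k: sum(transition[k][l] * emission[l][x[i + 1]] * b[i + 1][l]
                       for l in states)
                for k in states}
    return b                                  # b[i][k] = P(x[i+1..n-1] | pi_i = k)

print(backward("THHHHTTTTHT")[0])
```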

Backward-Forward Algorithm. The probability that the dealer used a biased coin at any moment i:
P(π_i = k | x) = P(x, π_i = k) / P(x) = f_k(i) · b_k(i) / P(x)
where P(x) is the sum of P(x, π_i = k) over all k.
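
Putting the two together for posterior decoding: a sketch that computes P(π_i = B | x) for every toss (the forward and backward functions are repeated so the snippet runs on its own, with the same assumed parameters and ½ start probabilities):

```python
states = ["F", "B"]
start = {"F": 0.5, "B": 0.5}
transition = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
emission = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.75, "T": 0.25}}

def forward(x):
    f = [{k: start[k] * emission[k][x[0]] for k in states}]
    for i in range(1, len(x)):
        f.append({k: emission[k][x[i]]
                     * sum(f[i - 1][l] * transition[l][k] for l in states)
                  for k in states})
    return f

def backward(x):
    n = len(x)
    b = [dict() for _ in range(n)]
    b[n - 1] = {k: 1.0 for k in states}
    for i in range(n - 2, -1, -1):
        b[i] = {k: sum(transition[k][l] * emission[l][x[i + 1]] * b[i + 1][l]
                       for l in states)
                for k in states}
    return b

def posterior_biased(x):
    f, b = forward(x), backward(x)
    px = sum(f[-1].values())                  # P(x) = sum over k of f_k(n)
    return [f[i]["B"] * b[i]["B"] / px for i in range(len(x))]

print(posterior_biased("THHHHTTTTHT"))        # P(biased coin at position i | x)
```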

Markov chain for CpG Islands. Construct one Markov chain for CpG-rich regions and another for CpG-poor regions. Using maximum likelihood estimates from roughly 60,000 nucleotides, we obtain the two models:

Ratio Test for CpG Islands. Given a sequence X_1, …, X_n, compute the log-likelihood ratio of the two chains,
S(X) = log2 ( P(X | CpG-rich model) / P(X | CpG-poor model) ) = Σ_i log2 ( a⁺_{X_{i−1} X_i} / a⁻_{X_{i−1} X_i} )

HMM Approach. Build one model that includes both “+” states and “−” states. A state “remembers” the last nucleotide and the type of region. A transition from a “−” state to a “+” state corresponds to the start of a CpG island.
