Computational Biology Lecture #8: Hidden Markov Models, Motifs & Patterns
Bud Mishra, Professor of Computer Science and Mathematics
11/12/2001, ©Bud Mishra, 2001

Hidden Markov Models (HMM)
An HMM is defined by an alphabet Σ, a set of (hidden) states Q, a matrix A of state transition probabilities, and a matrix E of emission probabilities.

States
Σ = an alphabet of symbols.
Q = a set of states that emit symbols from the alphabet Σ.
A = (a_kl) = a |Q| × |Q| matrix of state transition probabilities.
E = (e_k(b)) = a |Q| × |Σ| matrix of emission probabilities.

A Simple Example
Fair Bet Casino:
Σ = {H, T} → a two-sided coin.
Q = {F, B} → the casino may use a fair coin, Pr(H) = Pr(T) = 1/2, or a biased coin, Pr(H) = 3/4, Pr(T) = 1/4.

Fair Bet Casino Problem
[State diagram: fair state F with Pr[H] = Pr[T] = 1/2 and biased state B with Pr[H] = 3/4, Pr[T] = 1/4; each state loops to itself with probability 0.9 and switches to the other state with probability 0.1.]

Transition and Emission
a_FF = a_BB = 0.9, a_FB = a_BF = 0.1: the casino switches coins with probability 0.1.
e_F(H) = e_F(T) = 1/2; e_B(H) = 3/4, e_B(T) = 1/4.
Given a sequence of coin tosses, find out when the dealer used the fair coin and when the dealer used the biased coin.
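
The following is a minimal sketch in Python (my own illustration, not part of the lecture) encoding the Fair Bet Casino parameters as dictionaries; the variable names are assumptions and are reused by the later sketches.

```python
# Fair Bet Casino HMM parameters (illustrative; variable names are not from the slides).
states = ["F", "B"]          # F = fair coin, B = biased coin
alphabet = ["H", "T"]

# Transition probabilities a_kl: stay with the same coin w.p. 0.9, switch w.p. 0.1.
transition = {
    "F": {"F": 0.9, "B": 0.1},
    "B": {"F": 0.1, "B": 0.9},
}

# Emission probabilities e_k(b).
emission = {
    "F": {"H": 0.5,  "T": 0.5},
    "B": {"H": 0.75, "T": 0.25},
}
```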

HMM in Genomics
CG-islands: subsequences around genes where the dinucleotide CG appears relatively frequently. Elsewhere in the genome, the C within a CG dinucleotide is typically methylated, and methyl-C has a tendency to mutate into T; methylation is suppressed within CG-islands.

A Path in the HMM
π = π_1 π_2 ⋯ π_n = a sequence of states in Q*, in the hidden Markov model M.
x ∈ Σ* = the sequence generated by the path π, determined by the model M.
P(x | π) = P(π_1) ∏_{i=1}^{n} P(x_i | π_i) P(π_{i+1} | π_i)

A Path in the HMM
P(x | π) = P(π_1) ∏_{i=1}^{n} P(x_i | π_i) P(π_{i+1} | π_i), where
P(x_i | π_i) = e_{π_i}(x_i) and P(π_{i+1} | π_i) = a_{π_i, π_{i+1}}.
With π_0 = the initial state "begin" and π_{n+1} = the final state "end":
P(x | π) = a_{π_0, π_1} e_{π_1}(x_1) a_{π_1, π_2} e_{π_2}(x_2) ⋯ e_{π_n}(x_n) a_{π_n, π_{n+1}} = a_{π_0, π_1} ∏_{i=1}^{n} e_{π_i}(x_i) a_{π_i, π_{i+1}}
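
As a small illustration (mine, not the lecture's code), the product above can be evaluated directly for a given path; the uniform begin distribution and the omission of the final a_{π_n, end} factor are simplifying assumptions of this sketch.

```python
def path_probability(x, path, transition, emission, start=None):
    """Probability of emitting x along the given state path.

    `start` maps each state to its begin-transition probability a_{begin,k};
    if omitted, a uniform start is assumed.  The end transition a_{pi_n,end}
    is left out (no explicit end state in this sketch).
    """
    states = list(transition)
    prob = start[path[0]] if start else 1.0 / len(states)    # a_{begin, pi_1}
    for i, (symbol, state) in enumerate(zip(x, path)):
        prob *= emission[state][symbol]                       # e_{pi_i}(x_i)
        if i + 1 < len(path):
            prob *= transition[state][path[i + 1]]            # a_{pi_i, pi_{i+1}}
    return prob

# Example: three tosses, dealer keeps the biased coin throughout.
# path_probability("HHH", "BBB", transition, emission)
```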

Decoding Problem
For a given sequence x and a given path π, the (Markovian) model defines the probability P(x | π). The dealer knows π and x; the player knows x but not π: "the path of x is hidden."
Decoding problem: find an optimal path π* for x such that P(x | π) is maximized, i.e., π* = argmax_π P(x | π).

Dynamic Programming Approach
Principle of optimality: an optimal path for the (i+1)-prefix x_1 ⋯ x_{i+1} of x uses a path for the i-prefix of x that is optimal among the paths ending in an (unknown) state π_i = k ∈ Q.

Dynamic Programming Approach
s_k(i) = the probability of the most probable path for the i-prefix ending in state k, for all k ∈ Q and 1 ≤ i ≤ n.
s_l(i+1) = e_l(x_{i+1}) · max_{k ∈ Q} [ s_k(i) · a_{kl} ]

Dynamic Programming
i = 0: s_begin(0) = 1, and s_k(0) = 0 for all k ≠ begin.
0 ≤ i < n: s_l(i+1) = e_l(x_{i+1}) · max_{k ∈ Q} [ s_k(i) · a_{kl} ].
i = n+1: P(x | π*) = max_{k ∈ Q} [ s_k(n) · a_{k,end} ].

Viterbi Algorithm
Dynamic programming with the log-score function S_l(i) = log s_l(i):
S_l(i+1) = log e_l(x_{i+1}) + max_{k ∈ Q} [ S_k(i) + log a_{kl} ]
Space complexity = O(n |Q|); time complexity = O(n |Q|^2).
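
Below is a compact log-space Viterbi sketch in Python (my own, reusing the transition/emission dictionaries defined earlier); the uniform begin distribution and the absence of an explicit end state are assumptions of this sketch.

```python
import math

def viterbi(x, transition, emission, start=None):
    """Most probable state path for x via log-space Viterbi (illustrative sketch)."""
    states = list(transition)
    if start is None:                        # assume a uniform begin distribution
        start = {k: 1.0 / len(states) for k in states}
    # S[k] = best log-score of a path for the current prefix ending in state k.
    S = {k: math.log(start[k]) + math.log(emission[k][x[0]]) for k in states}
    back = []                                # back[i][l] = best predecessor of state l at step i+1
    for symbol in x[1:]:
        ptr, S_new = {}, {}
        for l in states:
            k_best = max(states, key=lambda k: S[k] + math.log(transition[k][l]))
            ptr[l] = k_best
            S_new[l] = math.log(emission[l][symbol]) + S[k_best] + math.log(transition[k_best][l])
        back.append(ptr)
        S = S_new
    # Trace back from the best final state (state names are single characters here).
    last = max(states, key=S.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return "".join(reversed(path)), S[last]

# Example: viterbi("HHHHTHTHTT", transition, emission) -> (state string, best log-score)
```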

Estimating the ith State
P(π_i = k | x) = for a given sequence x ∈ Σ*, the probability that the HMM was in state k at instant i.

Forward Estimate
f_k(i) = P(x_1 ⋯ x_i, π_i = k) = the probability of emitting the prefix x_1 ⋯ x_i and reaching the state π_i = k.
f_k(i) = e_k(x_i) · ∑_{l ∈ Q} f_l(i-1) a_{lk}

Backward Estimate
b_k(i) = P(x_{i+1} ⋯ x_n | π_i = k) = the probability of emitting the suffix x_{i+1} ⋯ x_n given that the state at instant i is π_i = k.
b_k(i) = ∑_{l ∈ Q} a_{kl} e_l(x_{i+1}) b_l(i+1)

Applying Bayes' Rule
P(π_i = k | x) = (1/P(x)) × P(x_1 ⋯ x_i, π_i = k) × P(x_{i+1} ⋯ x_n | π_i = k) = [f_k(i) b_k(i)] / P(x)
P(x) = ∑_π P(x | π)
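
A sketch (mine) of the forward/backward recursions and the resulting posterior P(π_i = k | x), using the casino dictionaries above; the uniform begin distribution and the lack of an explicit end state (so b_k(n) = 1) are assumptions of this sketch.

```python
def posterior_states(x, transition, emission, start=None):
    """Posterior probabilities P(pi_i = k | x) via forward/backward (illustrative sketch)."""
    states = list(transition)
    if start is None:
        start = {k: 1.0 / len(states) for k in states}   # assumed uniform begin distribution
    n = len(x)

    # Forward: f[i][k] = P(x_1 .. x_{i+1}, pi_{i+1} = k), 0-based i.
    f = [{k: start[k] * emission[k][x[0]] for k in states}]
    for i in range(1, n):
        f.append({k: emission[k][x[i]] * sum(f[i - 1][l] * transition[l][k] for l in states)
                  for k in states})

    # Backward: b[i][k] = P(x_{i+2} .. x_n | pi_{i+1} = k), with b[n-1][k] = 1.
    b = [dict() for _ in range(n)]
    b[n - 1] = {k: 1.0 for k in states}
    for i in range(n - 2, -1, -1):
        b[i] = {k: sum(transition[k][l] * emission[l][x[i + 1]] * b[i + 1][l] for l in states)
                for k in states}

    p_x = sum(f[n - 1][k] for k in states)               # P(x)
    return [{k: f[i][k] * b[i][k] / p_x for k in states} for i in range(n)]

# Example: posterior_states("HHHHT", transition, emission)[2]["B"] -> P(biased at position 3 | x)
```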

Sequence Motifs
Sequence patterns of biological significance. Examples:
DNA: protein binding sites (e.g., promoters, regulatory sequences).
Protein: sequences corresponding to conserved pieces of structure (local features, at various scales: blocks, domains & families).

MEME Algorithm
Uses the EM (Expectation Maximization) algorithm to find multiple motifs in a set of sequences.
Description of a motif:
W = the (fixed) width of the motif.
P = (p_{lc}), l ∈ Σ, c ∈ 1..W = a |Σ| × W matrix of the probabilities that letter l occurs at position c.

Example
A DNA motif of width 3: Σ = {A, T, C, G}, ρ = the motif ⇒ W_ρ = 3 and P_ρ = a 4 × 3 stochastic matrix.
[P_ρ is shown on the slide as a 4 × 3 table of probabilities; the matrix itself is not reproduced in this transcript.]

Computational Problem
Given: a set of sequences Γ and a width parameter W.
Find: motifs of width W common to the sequences in Γ, and present their probabilistic descriptions.
Note that the motif start sites in each sequence are unknown (hidden).

Basic EM Approach
Γ = the training sequences; |Γ| = the total number of sequences = m; the minimum length of a sequence in Γ = l.
Z = an m × l matrix of probabilities, where z_ij = the probability that the motif starts in position j of sequence i.

EM Algorithm
Set initial values for P
do
  Re-estimate Z from P
  Re-estimate P from Z
until change in P < ε
return P

EM Algorithm
Maximize the likelihood of a motif in the training sequences. Let S_i ∈ Γ be the ith sequence, and
I_ij = 1 if the motif starts at position j in sequence i, and 0 otherwise;
l_k = the character at position j+k-1 in sequence S_i.
Pr(S_i | I_ij = 1, ρ) = ∏_{k=1}^{W} ρ_{l_k, k}

Example
S_i = AGGCTGTAGACAC
Pr(S_i | I_i5 = 1, ρ) = ρ_{T,1} × ρ_{G,2} × ρ_{T,3} = 0.2 × 0.1 × 0.1 = 2 × 10^-3
(Positions 5..7 of S_i spell TGT; the values are taken from the P_ρ matrix on the slide, which is not reproduced in this transcript.)
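
A sketch (mine) of this per-start-position likelihood, with P_ρ represented as a dictionary mapping (letter, motif column) to a probability; the example values in the comment are hypothetical, since the slide's actual matrix is not in the transcript.

```python
def motif_likelihood(seq, j, rho, W):
    """Pr(S_i | I_ij = 1, rho): product of motif-column probabilities for the W
    characters starting at 1-based position j (a sketch of the slide's formula)."""
    prob = 1.0
    for k in range(1, W + 1):
        letter = seq[j + k - 2]          # the character at position j + k - 1 (1-based)
        prob *= rho[(letter, k)]         # rho_{l_k, k}
    return prob

# Hypothetical 4 x 3 motif matrix (values are illustrative, not the slide's):
# rho = {("A", 1): 0.3, ("C", 1): 0.3, ("G", 1): 0.2, ("T", 1): 0.2,
#        ("A", 2): 0.3, ("C", 2): 0.3, ("G", 2): 0.1, ("T", 2): 0.3,
#        ("A", 3): 0.2, ("C", 3): 0.3, ("G", 3): 0.4, ("T", 3): 0.1}
# motif_likelihood("AGGCTGTAGACAC", 5, rho, 3)   # uses positions 5..7 = "TGT"
```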

Estimating Z
z_ij = Pr(I_ij = 1 | ρ, S_i) estimates the starting position of the motif in S_i ∈ Γ.
z_ij = Pr(I_ij = 1 | ρ, S_i)
     = Pr(S_i, I_ij = 1 | ρ) / Pr(S_i | ρ)
     = Pr(S_i | I_ij = 1, ρ) Pr(I_ij = 1) / ∑_k Pr(S_i | I_ik = 1, ρ) Pr(I_ik = 1)
     = Pr(S_i | I_ij = 1, ρ) / ∑_k Pr(S_i | I_ik = 1, ρ)
This follows from an application of Bayes' rule and the assumption that the motif is equally likely to start in any position: Pr(I_ij = 1) = Pr(I_ik = 1) for all j, k.

Example
S_i = AGGCTGTAGACAC
Unnormalized values from the slide's P_ρ matrix: z'_i1 = 0.1 × 0.1 × 0.6 = 6 × 10^-3, z'_i2 = 0.3 × 0.1 × 0.1 = 3 × 10^-3, z'_i3 = 0.3 × 0.2 × 0.1 = 6 × 10^-3, ...; the z'_ij are then normalized so that they sum to 1.
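
A sketch (mine) of this E-step: compute the unnormalized likelihood for every candidate start position and normalize, exactly as in the example above; `motif_likelihood` is the helper sketched earlier.

```python
def estimate_Z(sequences, rho, W):
    """E-step: z_ij = Pr(S_i | I_ij = 1, rho) / sum_k Pr(S_i | I_ik = 1, rho).
    Returns one list of normalized start-position probabilities per sequence."""
    Z = []
    for seq in sequences:
        raw = [motif_likelihood(seq, j, rho, W) for j in range(1, len(seq) - W + 2)]
        total = sum(raw)
        Z.append([v / total for v in raw])
    return Z

# Example: Z = estimate_Z(["AGGCTGTAGACAC"], rho, 3); Z[0][0] is the normalized z_{i,1}.
```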

Estimating P_ρ
Given Z, estimate the probability that the character c occurs at the kth position of the motif.
n_ck = ∑_{S_i ∈ Γ} ∑_{j : S_i(j+k-1) = c} z_ij = the expected number of occurrences of the character c at the kth position of the motif ρ, accumulated over the possible start positions.
p_ck = (n_ck + 1) / ∑_d (n_dk + 1)   (with add-one pseudocounts)

Example
s1: A C A G C A
s2: A G G C A G
s3: T C A G T C
The character A appears at motif position 1 for the start positions corresponding to z_{1,1}, z_{1,3}, z_{2,1}, and z_{3,3}, so
p_{A,1} = (z_{1,1} + z_{1,3} + z_{2,1} + z_{3,3} + 1) / (z_{1,1} + z_{1,2} + ⋯ + z_{3,3} + z_{3,4} + 4)
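
A matching M-step sketch (mine), using the add-one pseudocounts from the p_ck formula above, followed by the outer loop from the earlier EM pseudocode; the max-change convergence test on P is an implementation choice, not something specified in the slides.

```python
def estimate_P(sequences, Z, W, alphabet="ACGT"):
    """M-step: p_ck = (n_ck + 1) / sum_d (n_dk + 1), where n_ck is the expected
    count of character c at motif position k under Z (illustrative sketch)."""
    n = {(c, k): 0.0 for c in alphabet for k in range(1, W + 1)}
    for seq, z_row in zip(sequences, Z):
        for j, z in enumerate(z_row, start=1):            # j = candidate start position (1-based)
            for k in range(1, W + 1):
                n[(seq[j + k - 2], k)] += z
    rho = {}
    for k in range(1, W + 1):
        denom = sum(n[(d, k)] + 1 for d in alphabet)
        for c in alphabet:
            rho[(c, k)] = (n[(c, k)] + 1) / denom
    return rho

def em(sequences, rho, W, eps=1e-4):
    """Alternate E- and M-steps until P changes by less than eps (outer-loop sketch)."""
    while True:
        Z = estimate_Z(sequences, rho, W)
        new_rho = estimate_P(sequences, Z, W)
        if max(abs(new_rho[key] - rho[key]) for key in rho) < eps:
            return new_rho
        rho = new_rho
```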

MEME
Uses the basic EM approach, but: tries many starting points, allows multiple occurrences of a motif per sequence, and allows multiple motifs to be learned simultaneously.

MEME
Initial set of possible motifs: take every distinct subsequence of length W in the training set and derive an initial matrix P from it:
p_ck = a, if c occurs at position k in the subsequence;
p_ck = (1-a) / (|Σ| - 1), otherwise.

Example
W = 3, ρ = TAT, a = 0.5.
[The resulting 4 × 3 matrix P_ρ is shown on the slide; it is not reproduced in this transcript.]
Run EM to convergence from each such starting point, and choose the motif model with the highest likelihood.
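
A sketch (mine) of this seeding rule, producing the dictionary-style matrix used by the EM sketches above; for instance, init_rho("TAT", 0.5) assigns 0.5 to the seed's letter in each column and (1 - 0.5)/3 to the other three letters.

```python
def init_rho(seed, a, alphabet="ACGT"):
    """Initial motif matrix from a length-W seed subsequence:
    p_ck = a if c is the seed's character at position k, else (1 - a) / (|alphabet| - 1)."""
    W = len(seed)
    rho = {}
    for k in range(1, W + 1):
        for c in alphabet:
            rho[(c, k)] = a if c == seed[k - 1] else (1 - a) / (len(alphabet) - 1)
    return rho

# Seed an EM run from each distinct length-W subsequence and keep the best-scoring model,
# e.g. with the three training sequences from the previous example:
# rho0 = init_rho("TAT", 0.5)
# best = em(["ACAGCA", "AGGCAG", "TCAGTC"], rho0, 3)
```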