Lecture 6, Thursday April 17, 2003

Hidden Markov Models

Review of Last Lecture

1. When the true underlying states are known

Given x = x_1…x_N for which the true path π = π_1…π_N is known, define:
  A_kl   = # times a k→l transition occurs in π
  E_k(b) = # times state k emits b in x along π

We can show that the maximum likelihood parameters θ are:
  a_kl = A_kl / Σ_i A_ki        e_k(b) = E_k(b) / Σ_c E_k(c)
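The counting step translates directly into code. Below is a minimal Python sketch (my illustration, not code from the lecture) that tallies A_kl and E_k(b) from a labeled state path and normalizes them into a_kl and e_k(b); the `pseudocount` argument and the toy two-state example are assumptions added for demonstration.

```python
# A minimal sketch of maximum-likelihood estimation when the true state path is
# known: count transitions and emissions along pi, then normalize each row.
def ml_estimate(x, pi, states, symbols, pseudocount=0.0):
    """x: observed symbols, pi: true state path (same length as x)."""
    A = {k: {l: pseudocount for l in states} for k in states}   # transition counts A_kl
    E = {k: {b: pseudocount for b in symbols} for k in states}  # emission counts E_k(b)
    for i in range(len(x)):
        E[pi[i]][x[i]] += 1
        if i + 1 < len(x):
            A[pi[i]][pi[i + 1]] += 1
    # a_kl = A_kl / sum_i A_ki ;  e_k(b) = E_k(b) / sum_c E_k(c)
    a = {k: {l: A[k][l] / sum(A[k].values()) for l in states} for k in states}
    e = {k: {b: E[k][b] / sum(E[k].values()) for b in symbols} for k in states}
    return a, e

# Example with a hypothetical 2-state model over a DNA alphabet:
a, e = ml_estimate("ACGTGGCA", "11222211", states="12", symbols="ACGT", pseudocount=1.0)
```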

2. When not – the Baum-Welch algorithm

Initialization: pick a best guess for the model parameters (or arbitrary values)
Iteration:
  Forward
  Backward
  Calculate the expected counts A_kl, E_k(b)
  Calculate new model parameters a_kl, e_k(b)
  Calculate the new log-likelihood P(x | θ) – guaranteed to be higher, by expectation-maximization
Until P(x | θ) does not change much
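The loop above can be stated concretely. Below is a compact sketch under my own assumptions (a fully discrete HMM, observations given as integer symbol indices, random initialization, per-position scaling, an explicit initial-state distribution p0, and a tiny smoothing constant, none of which come from the lecture); it illustrates the EM updates rather than being the course's reference implementation. The printed log-likelihood log P(x | θ) should increase at every iteration, as the slide claims.

```python
# Baum-Welch for a discrete HMM with K states and M symbols: scaled
# forward/backward passes, expected counts A_kl and E_k(b), re-estimation.
import numpy as np

def baum_welch(x, K, M, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    a = rng.random((K, K)); a /= a.sum(axis=1, keepdims=True)   # transitions a_kl
    e = rng.random((K, M)); e /= e.sum(axis=1, keepdims=True)   # emissions e_k(b)
    p0 = np.full(K, 1.0 / K)                                    # initial distribution (assumed)
    N = len(x)
    for _ in range(n_iter):
        # Forward pass with per-position scaling (keeps values in range).
        f = np.zeros((N, K)); c = np.zeros(N)
        f[0] = p0 * e[:, x[0]]; c[0] = f[0].sum(); f[0] /= c[0]
        for i in range(1, N):
            f[i] = (f[i - 1] @ a) * e[:, x[i]]
            c[i] = f[i].sum(); f[i] /= c[i]
        # Backward pass, reusing the same scaling factors.
        b = np.zeros((N, K)); b[N - 1] = 1.0
        for i in range(N - 2, -1, -1):
            b[i] = (a @ (e[:, x[i + 1]] * b[i + 1])) / c[i + 1]
        # Expected counts A_kl and E_k(b).
        A = np.zeros((K, K)); E = np.zeros((K, M))
        for i in range(N - 1):
            A += np.outer(f[i], e[:, x[i + 1]] * b[i + 1]) * a / c[i + 1]
        gamma = f * b                                            # P(pi_i = k | x)
        for i in range(N):
            E[:, x[i]] += gamma[i]
        # Re-estimate parameters (tiny constant guards against empty rows).
        a = A + 1e-12; a /= a.sum(axis=1, keepdims=True)
        e = E + 1e-12; e /= e.sum(axis=1, keepdims=True)
        p0 = gamma[0] / gamma[0].sum()
        print("log P(x | theta) =", np.log(c).sum())             # increases each iteration
    return a, e, p0
```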

Alternative: Viterbi Training

Initialization: same as Baum-Welch
Iteration:
  Perform Viterbi to find π*
  Calculate A_kl, E_k(b) according to π*, plus pseudocounts
  Calculate the new parameters a_kl, e_k(b)
Until convergence

Notes:
  Convergence is guaranteed – why?
  Does not maximize P(x | θ)
  In general, worse performance than Baum-Welch
  Convenient – when we only care about Viterbi parsing, there is no need to implement the additional Forward and Backward procedures
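A minimal sketch of this alternative, under the same assumptions as the Baum-Welch sketch above (integer symbol indices, numpy, and initial parameters supplied by the caller; none of this is the course's code). It reuses the counting idea from the labeled-path case, with the Viterbi path standing in for the true path; zero probabilities simply become −∞ in log space.

```python
# Viterbi training: decode the single best path pi*, re-count transitions and
# emissions along pi* with pseudocounts, re-normalize, and iterate.
import numpy as np

def viterbi(x, a, e, p0):
    """Most probable state path for symbol indices x (log-space DP)."""
    N, K = len(x), a.shape[0]
    V = np.full((N, K), -np.inf); ptr = np.zeros((N, K), dtype=int)
    V[0] = np.log(p0) + np.log(e[:, x[0]])
    for i in range(1, N):
        scores = V[i - 1][:, None] + np.log(a)      # scores[k, l] = V(k) + log a_kl
        ptr[i] = scores.argmax(axis=0)
        V[i] = scores.max(axis=0) + np.log(e[:, x[i]])
    path = [int(V[-1].argmax())]
    for i in range(N - 1, 0, -1):
        path.append(int(ptr[i][path[-1]]))
    return path[::-1]

def viterbi_training(x, a, e, p0, n_iter=20, pseudo=1.0):
    K, M = e.shape
    for _ in range(n_iter):
        pi = viterbi(x, a, e, p0)
        A = np.full((K, K), pseudo); E = np.full((K, M), pseudo)
        for i in range(len(x)):
            E[pi[i], x[i]] += 1
            if i + 1 < len(x):
                A[pi[i], pi[i + 1]] += 1
        a = A / A.sum(axis=1, keepdims=True)
        e = E / E.sum(axis=1, keepdims=True)
    return a, e
```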

Variants of HMMs

Higher-order HMMs

The genetic code: 3 nucleotides make 1 amino acid, so there are statistical dependencies within triplets.
Question: recognize protein-coding segments with an HMM.

One way to model protein-coding regions

Use one state per codon: AAA, AAC, AAT, …, TTT. Every state of the HMM emits 3 nucleotides, so each emitted triplet can depend on the previous triplet, P(x_i x_{i+1} x_{i+2} | x_{i−3} x_{i−2} x_{i−1}).
Transition probabilities: probability of one triplet, given the previous triplet, P(π_i | π_{i−1}).
Emission probabilities are deterministic (1 or 0): each state emits exactly its own triplet.

A more elegant way

Use the four states A, C, G, T; every state of the HMM emits 1 nucleotide.
Transition probabilities: probability of one nucleotide, given the previous 3, P(π_i | π_{i−1}, π_{i−2}, π_{i−3}) – a third-order Markov chain.
Emission probabilities: P(x_i | π_i).
The standard algorithms extend with small modifications.
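One way those "small modifications" are often realized is to expand the state space so that the higher-order chain becomes an ordinary first-order one. The sketch below is my own illustration of that trick (the function name, the uniform example distribution, and the order-3 choice are all assumptions): each expanded state records the last three nucleotides of history.

```python
# Expand an order-3 chain over {A,C,G,T} into a first-order chain whose states
# are 3-mers of recent history (64 states), so standard HMM algorithms apply.
from itertools import product

ALPHABET = "ACGT"
contexts = ["".join(p) for p in product(ALPHABET, repeat=3)]   # 64 history states

def expand_to_first_order(p_next):
    """p_next[ctx][c] = P(next nucleotide c | previous 3 nucleotides ctx).
    Returns first-order transition probabilities between 3-mer states."""
    a = {}
    for ctx in contexts:
        for c in ALPHABET:
            new_ctx = ctx[1:] + c            # slide the history window by one
            a[(ctx, new_ctx)] = p_next[ctx][c]
    return a

# Example with a hypothetical uniform conditional distribution:
uniform = {ctx: {c: 0.25 for c in ALPHABET} for ctx in contexts}
a = expand_to_first_order(uniform)
assert abs(sum(v for (k, _), v in a.items() if k == "AAA") - 1.0) < 1e-9
```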

Modeling the Duration of States

Consider two states X and Y, with self-transition probabilities p and q and cross transitions 1−p and 1−q.
Length distribution of a stay in region X: P(l_X = n) = (1−p) p^{n−1}, a geometric (discrete exponential) distribution with mean E[l_X] = 1/(1−p).
This is a significant disadvantage of HMMs; several solutions exist for modeling different length distributions.

Solution 1: Chain several states

Chain C copies of X in a row before Y (self-transition p, exit 1−p).
Disadvantage: still very inflexible – l_X is a constant C plus a geometric (exponential-like) tail with mean 1/(1−p).

Solution 2: Negative binomial distribution

Chain l copies of X, each with self-transition p and transition 1−p to the next copy.
The total duration then follows a negative binomial distribution:
  P(l_X = n) = C(n−1, l−1) p^{n−l} (1−p)^l,   for n ≥ l, with mean l/(1−p).
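A quick numeric check of that claim, as a sketch of my own (the values l = 3, p = 0.8, and the simulation itself are not from the lecture): simulate the chained-state construction and compare the empirical length distribution to the negative binomial formula above.

```python
# Simulate l chained copies of X (self-loop p each) and compare the empirical
# duration distribution to C(n-1, l-1) p^(n-l) (1-p)^l.
import random
from math import comb

def simulate_duration(l, p, rng):
    n = 0
    for _ in range(l):                 # pass through l chained copies of X
        n += 1                         # one emission on entering each copy
        while rng.random() < p:        # self-loop with probability p
            n += 1
    return n

def negbin_pmf(n, l, p):
    return comb(n - 1, l - 1) * p ** (n - l) * (1 - p) ** l

rng = random.Random(0)
l, p, trials = 3, 0.8, 200_000
samples = [simulate_duration(l, p, rng) for _ in range(trials)]
for n in range(l, l + 5):
    emp = sum(s == n for s in samples) / trials
    print(n, round(emp, 4), round(negbin_pmf(n, l, p), 4))
print("empirical mean:", sum(samples) / trials, " expected:", l / (1 - p))
```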

Solution 3: Duration modeling

Upon entering a state:
  1. Choose a duration d, according to a probability distribution
  2. Generate d letters according to the emission probabilities
  3. Take a transition to the next state according to the transition probabilities

Disadvantage: increase in complexity – time: O(D^2), space: O(D), where D = maximum duration of a state.
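The generative procedure in the three steps above is easy to write down. The sketch below is an illustration with an assumed toy two-state model (the duration, emission, and transition tables are invented values, not part of the lecture); it simply draws an explicit duration, emits that many letters, then transitions.

```python
# Generative sampler for a duration-explicit HMM: choose d, emit d letters, move on.
import random

rng = random.Random(1)
duration = {"X": [0.1, 0.2, 0.4, 0.3],     # P(d = 1..4) for state X (assumed values)
            "Y": [0.6, 0.3, 0.1, 0.0]}
emission = {"X": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
            "Y": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}}
transition = {"X": {"Y": 1.0}, "Y": {"X": 1.0}}

def sample(n_segments, state="X"):
    seq, path = [], []
    for _ in range(n_segments):
        d = rng.choices(range(1, 5), weights=duration[state])[0]     # 1. choose duration d
        letters = rng.choices(list(emission[state]),                  # 2. emit d letters
                              weights=emission[state].values(), k=d)
        seq += letters; path += [state] * d
        state = rng.choices(list(transition[state]),                  # 3. transition
                            weights=transition[state].values())[0]
    return "".join(seq), "".join(path)

print(sample(5))
```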

Connection Between Alignment and HMMs

A state model for alignment

Three states: M (+1, +1) advances in both sequences; I (+1, 0) advances in x only; J (0, +1) advances in y only.
Alignments correspond 1-to-1 with sequences of states M, I, J:

-AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
TAG-CTATCAC--GACCGC-GGTCGATTTGCCCGACC
IMMJMMMMMMMJJMMMMMMJMMMMMMMIIMMMMMIII

Let's score the transitions

Assign score s(x_i, y_j) to every step in M, −d to opening a gap (a transition from M into I or J), and −e to extending a gap (staying in I or J). Alignments still correspond 1-to-1 with sequences of states M, I, J, so the score of an alignment is the sum of the scores along its state path.

How do we find the optimal alignment according to this model?

Dynamic programming:
  M(i, j): optimal alignment of x_1…x_i to y_1…y_j ending in M
  I(i, j): optimal alignment of x_1…x_i to y_1…y_j ending in I
  J(i, j): optimal alignment of x_1…x_i to y_1…y_j ending in J
The score is additive, therefore we can apply DP recurrence formulas.

Needleman-Wunsch with affine gaps – state version

Initialization:
  M(0, 0) = 0;  M(i, 0) = M(0, j) = −∞, for i, j > 0
  I(i, 0) = −d − (i−1)·e;  J(0, j) = −d − (j−1)·e

Iteration:
  M(i, j) = s(x_i, y_j) + max { M(i−1, j−1), I(i−1, j−1), J(i−1, j−1) }
  I(i, j) = max { −d + M(i−1, j), −e + I(i−1, j) }
  J(i, j) = max { −d + M(i, j−1), −e + J(i, j−1) }

Termination:
  Optimal alignment score given by max { M(m, n), I(m, n), J(m, n) }
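A self-contained, score-only sketch of the three-matrix recurrence above (my illustration, not the course's reference code; the match/mismatch/gap values are arbitrary since the slide does not specify numbers):

```python
# Global alignment with affine gaps via the M / I / J state matrices.
NEG = float("-inf")

def affine_align_score(x, y, match=1.0, mismatch=-1.0, d=3.0, e=1.0):
    def s(a, b):
        return match if a == b else mismatch
    m, n = len(x), len(y)
    M = [[NEG] * (n + 1) for _ in range(m + 1)]
    I = [[NEG] * (n + 1) for _ in range(m + 1)]   # gap in y (consumes x)
    J = [[NEG] * (n + 1) for _ in range(m + 1)]   # gap in x (consumes y)
    M[0][0] = 0.0
    for i in range(1, m + 1):
        I[i][0] = -d - (i - 1) * e
    for j in range(1, n + 1):
        J[0][j] = -d - (j - 1) * e
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            M[i][j] = s(x[i - 1], y[j - 1]) + max(M[i-1][j-1], I[i-1][j-1], J[i-1][j-1])
            I[i][j] = max(-d + M[i-1][j], -e + I[i-1][j])
            J[i][j] = max(-d + M[i][j-1], -e + J[i][j-1])
    return max(M[m][n], I[m][n], J[m][n])

print(affine_align_score("AGGCTATCAC", "TAGCTATCAC"))
```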

Probabilistic interpretation of an alignment

An alignment is a hypothesis that the two sequences are related by evolution.
Goals: produce the most likely alignment, and assess the likelihood that the sequences are indeed related.

A Pair HMM for alignments

States: M emits an aligned pair (x_i, y_j) with probability P(x_i, y_j); I emits x_i with probability P(x_i); J emits y_j with probability P(y_j).
Transitions (BEGIN behaves like M): M→I = M→J = δ; I→I = J→J = ε; M→M = 1 − 2δ − τ; I→M = J→M = 1 − ε − τ; every state goes to END with probability τ.

A Pair HMM for non-aligned sequences: the random model R

Two independent chains: BEGIN → I (emitting with P(x_i), self-transition 1 − η) → END, and BEGIN → J (emitting with P(y_j), self-transition 1 − η) → END, where each step ends the chain with probability η.

P(x, y | R) = η (1 − η)^m P(x_1)…P(x_m) · η (1 − η)^n P(y_1)…P(y_n)
            = η^2 (1 − η)^{m+n} Π_i P(x_i) Π_j P(y_j)

To compare the ALIGNMENT vs. RANDOM hypotheses

Every pair of letters contributes:
  (1 − 2δ − τ) P(x_i, y_j) when matched
  δ P(x_i) P(y_j) when gapped
  (1 − η)^2 P(x_i) P(y_j) in the random model
Focus on the comparison of P(x_i, y_j) vs. P(x_i) P(y_j).

To compare the ALIGNMENT vs. RANDOM hypotheses

Idea: divide the alignment probability by the random-model probability, and take logarithms. Let

  s(x_i, y_j) = log [ P(x_i, y_j) / (P(x_i) P(y_j)) ] + log [ (1 − 2δ − τ) / (1 − η)^2 ]

  d = − log [ δ (1 − ε − τ) P(x_i) / ((1 − η)(1 − 2δ − τ) P(x_i)) ]

  e = − log [ ε P(x_i) / ((1 − η) P(x_i)) ]

Every letter b in the random model contributes a factor (1 − η) P(b).
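To make these formulas concrete, here is a small numeric sketch of my own (the parameter values δ, ε, τ, η and the use of base-2 logarithms, chosen to match the later 50%-conservation example, are all assumptions; the P(x_i) factors cancel and are omitted):

```python
# Log-odds scores s(a, b), gap-open d, and gap-extend e from pair-HMM parameters.
from math import log2

delta, epsilon, tau, eta = 0.05, 0.2, 1e-3, 1e-3   # assumed pair-HMM parameters
p_match, p_bg = 1 / 8, 1 / 4                        # P(a, a) and P(a), as in the later example

s_match = log2(p_match / (p_bg * p_bg)) + log2((1 - 2 * delta - tau) / (1 - eta) ** 2)
d = -log2(delta * (1 - epsilon - tau) / ((1 - eta) * (1 - 2 * delta - tau)))
e = -log2(epsilon / (1 - eta))

print(f"s(match) = {s_match:.3f}, d = {d:.3f}, e = {e:.3f}")
print(f"approximations: -log2(delta) = {-log2(delta):.3f}, -log2(epsilon) = {-log2(epsilon):.3f}")
```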

The meaning of alignment scores

Because δ and ε are small, and η and τ are very small:

  s(x_i, y_j) = log [ P(x_i, y_j) / (P(x_i) P(y_j)) ] + log [ (1 − 2δ − τ) / (1 − η)^2 ]  ≈  log [ P(x_i, y_j) / (P(x_i) P(y_j)) ]

  d = − log [ δ (1 − ε − τ) / ((1 − η)(1 − 2δ − τ)) ]  ≈  − log δ

  e = − log [ ε / (1 − η) ]  ≈  − log ε

The meaning of alignment scores

The Viterbi algorithm for pair HMMs corresponds exactly to the Needleman-Wunsch algorithm with affine gaps. However, now we score alignments with parameters that add up to probability distributions:
  δ: 1/mean arrival time of the next gap
  ε: 1/mean length of the next gap (affine gaps decouple gap arrival time from gap length)
  τ: 1/mean length of conserved segments (set to ~0)
  η: 1/mean length of the sequences of interest (set to ~0)

The meaning of alignment scores

Match/mismatch scores: s(a, b) ≈ log [ P(a, b) / (P(a) P(b)) ]

Example: say DNA regions between human and mouse have an average conservation of 50%. Then
  P(A,A) = P(C,C) = P(G,G) = P(T,T) = 1/8        (the 4 match pairs sum to 1/2)
  P(A,C) = P(A,G) = …… = P(T,G) = 1/24           (the 12 mismatch pairs sum to 1/2)
Say P(A) = P(C) = P(G) = P(T) = 1/4. Then
  match:    s(a, a) = log2 [ (1/8) / (1/4 · 1/4) ]  = log2 2     = 1
  mismatch: s(a, b) = log2 [ (1/24) / (1/4 · 1/4) ] = log2 (2/3) ≈ −0.585

Note: 0.585 / 1.585 ≈ 0.37. According to this model, a ~37%-conserved sequence with no gaps would score about zero on average: 0.37 · 1 − 0.63 · 0.585 ≈ 0.
Why? ~37% conservation lies between the 50%-conservation model and the random 25%-conservation model!
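A quick numeric check of this example, as a small sketch of my own (only the numbers stated on the slide are used):

```python
# Match/mismatch log-odds scores and the break-even conservation level.
from math import log2

p_match_pair = 1 / 8      # P(a, a): four match pairs, summing to 1/2
p_mismatch_pair = 1 / 24  # P(a, b): twelve ordered mismatch pairs, summing to 1/2
p_bg = 1 / 4              # background P(a)

s_match = log2(p_match_pair / (p_bg * p_bg))        # = 1.0
s_mismatch = log2(p_mismatch_pair / (p_bg * p_bg))  # ≈ -0.585

# Conservation level f at which an ungapped alignment scores zero on average:
f = -s_mismatch / (s_match - s_mismatch)
print(s_match, round(s_mismatch, 3), round(f, 3))   # 1.0 -0.585 0.369
```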

Substitution matrices

A more meaningful way to assign match/mismatch scores: for protein sequences, different substitutions have dramatically different frequencies!

BLOSUM matrices:
  Start from the BLOCKS database (curated, gap-free alignments)
  Cluster sequences according to % identity
  For a given L% identity cutoff, calculate A_ab = # of aligned positions a-b
  Estimate P(a) = (Σ_b A_ab) / (Σ_{c,d} A_cd);  P(a, b) = A_ab / (Σ_{c,d} A_cd)
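The estimation step maps directly to code. Below is a toy sketch of my own (the three-letter alphabet and the count table are invented, not real BLOCKS data; real BLOSUM construction also involves clustering and scaling) turning aligned-pair counts A_ab into log-odds scores:

```python
# BLOSUM-style scores: s(a, b) = log2[ P(a, b) / (P(a) P(b)) ] from pair counts.
from math import log2

# Hypothetical symmetric counts over a 3-letter "protein" alphabet:
A = {("A", "A"): 40, ("A", "B"): 10, ("A", "C"): 5,
     ("B", "A"): 10, ("B", "B"): 30, ("B", "C"): 15,
     ("C", "A"): 5,  ("C", "B"): 15, ("C", "C"): 20}

total = sum(A.values())
letters = sorted({a for a, _ in A})
P_pair = {ab: n / total for ab, n in A.items()}                 # P(a, b)
P = {a: sum(P_pair[(a, b)] for b in letters) for a in letters}  # P(a) = sum_b P(a, b)

score = {(a, b): log2(P_pair[(a, b)] / (P[a] * P[b]))
         for a in letters for b in letters}
for a in letters:
    print(a, [round(score[(a, b)], 2) for b in letters])
```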

BLOSUM matrices: BLOSUM 50 and BLOSUM 62 [shown as figures in the slides]. (The two are scaled differently.)
