Chap. 4 Pairwise alignment using HMMs
Biointelligence Laboratory, School of Computer Sci. & Eng., Seoul National University, Seoul, Korea
Copyright (c) 2002 by SNU CSE Biointelligence Lab
This slide file is available online at

Contents
- FSA → HMM
- Pair HMMs
- The full probability of x and y
- Suboptimal alignment
- The posterior probability that x_i is aligned to y_j
- Pair HMMs vs FSAs for searching

Figure 4.1 A finite state machine diagram for affine gap alignment on the left, and the corresponding probabilistic model on the right.
[Diagram: states M (+1,+1), X (+1,+0), Y (+0,+1); the FSA arcs carry scores s(x_i, y_j), -d, -e; the probabilistic model's states emit p_{x_i y_j} (M), q_{x_i} (X), q_{y_j} (Y) with transition probabilities δ, ε, 1-δ, 1-ε.]

Recurrence Relation
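A sketch of the standard affine-gap recurrences behind this slide, assuming score matrices V^M, V^X, V^Y named after the M/X/Y states of Figure 4.1 (the matrix names are an assumption; s, d, e are the match score and gap-open/gap-extension penalties from the figure):

V^M(i, j) = s(x_i, y_j) + max{ V^M(i-1, j-1), V^X(i-1, j-1), V^Y(i-1, j-1) }
V^X(i, j) = max{ V^M(i-1, j) - d, V^X(i-1, j) - e }
V^Y(i, j) = max{ V^M(i, j-1) - d, V^Y(i, j-1) - e }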

Pair HMMs (1)
FSA → HMM: how?
- Specification of emission & transition probabilities
[Diagram repeats Figure 4.1: the FSA with scores s(x_i, y_j), -d, -e beside the HMM with emissions p_{x_i y_j}, q_{x_i}, q_{y_j} and transition probabilities δ, ε, 1-δ, 1-ε.]

Pair HMMs (2)
Definition of begin state & end state
- Provides a probability distribution over all possible sequences
Pair HMM
- Structurally identical to an ordinary HMM
- Emits a pairwise alignment rather than a single sequence

Figure 4.2 The full probabilistic version of Figure 4.1.
[Diagram: Begin and End states are added around M, X, Y; emissions p_{x_i y_j} (M), q_{x_i} (X), q_{y_j} (Y); transitions M→M 1-2δ-τ, M→X and M→Y δ, X→X and Y→Y ε, X→M and Y→M 1-ε-τ, and each state → End with τ.]

Pair HMMs (3)
Algorithm: Viterbi algorithm for pair HMMs
- Initialization: v^M(0, 0) = 1; v^X(0, 0) = v^Y(0, 0) = 0; all v^*(i, -1) and v^*(-1, j) are set to 0.
- Recurrence: i = 0,…,n, j = 0,…,m, except (0, 0).
- Termination.
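A sketch of the recurrence and termination, reconstructed from the transition probabilities of Figure 4.2 (the exact layout on the original slide may differ):

v^M(i, j) = p_{x_i y_j} · max{ (1-2δ-τ) v^M(i-1, j-1), (1-ε-τ) v^X(i-1, j-1), (1-ε-τ) v^Y(i-1, j-1) }
v^X(i, j) = q_{x_i} · max{ δ v^M(i-1, j), ε v^X(i-1, j) }
v^Y(i, j) = q_{y_j} · max{ δ v^M(i, j-1), ε v^Y(i, j-1) }
Termination: v^E = τ · max{ v^M(n, m), v^X(n, m), v^Y(n, m) }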

Pair HMMs (4)
Random model
- Probability of a pair of sequences x and y emitted independently
[Diagram: Begin → X (emits q_{x_i}, self-loop 1-η) → Y (emits q_{y_j}, self-loop 1-η) → End, with transition probabilities η and 1-η between them.]
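Written out under the diagram's transition probabilities (a reconstruction; n and m denote the lengths of x and y):

P(x, y | R) = η² (1-η)^{n+m} ∏_{i=1..n} q_{x_i} ∏_{j=1..m} q_{y_j}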

Pair HMMs (5)
Correspondence with the FSA
- Convert probability terms to log-odds terms
- Compare the Viterbi (match model) score against the random model
Tricks
- A compensating term is needed at termination so that the per-step log-odds scores remain exact
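A sketch of the resulting log-odds substitution and gap scores; this assumes the usual allocation of the transition factors, and the grouping of constants may differ slightly from the original slide:

s(a, b) = log( p_{ab} / (q_a q_b) ) + log( (1-2δ-τ) / (1-η)² )
d = -log( δ (1-ε-τ) / ( (1-η)(1-2δ-τ) ) )
e = -log( ε / (1-η) )

With these scores, the log-odds Viterbi computation takes the same form as the affine-gap FSA recurrence, which is the correspondence the slide refers to.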

Pair HMMs (6)
Algorithm: Optimal log-odds alignment
- Initialization: V^M(0, 0) = -2 log η; V^X(0, 0) = V^Y(0, 0) = -∞; all V^*(i, -1) and V^*(-1, j) are set to -∞.
- Recurrence: i = 0,…,n, j = 0,…,m, except (0, 0).
- Termination: the last step adds the compensating term.
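The recurrence has the same shape as the affine-gap recurrence shown earlier, now over the log-odds scores s, d, e. A sketch of the termination with its compensating term, under the same assumptions as the log-odds scores above:

V = max{ V^M(n, m), V^X(n, m) + c, V^Y(n, m) + c },  with c = log(1-2δ-τ) - log(1-ε-τ)

The constant c undoes the return-to-match factor that was folded into the gap-open score d, for alignments that end in a gap state.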

Pair HMMs (7)
Pair HMM for local alignment (Figure 4.3)

The full probability of x and y (1)
Summation over all alignments
- The forward algorithm computes it: P(x, y) = f^E(n, m)
- The posterior distribution over alignments, P(π | x, y) = P(x, y, π) / P(x, y), can then be obtained

The full probability of x and y (2)
Algorithm: Forward calculation for pair HMMs
- Initialization: f^M(0, 0) = 1; f^X(0, 0) = f^Y(0, 0) = 0; all f^*(i, -1) and f^*(-1, j) are set to 0.
- Recurrence: i = 0,…,n, j = 0,…,m, except (0, 0).
- Termination.
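A sketch of the forward recurrence and termination, obtained from the Viterbi recurrence above by replacing max with a sum (a reconstruction):

f^M(i, j) = p_{x_i y_j} [ (1-2δ-τ) f^M(i-1, j-1) + (1-ε-τ) ( f^X(i-1, j-1) + f^Y(i-1, j-1) ) ]
f^X(i, j) = q_{x_i} [ δ f^M(i-1, j) + ε f^X(i-1, j) ]
f^Y(i, j) = q_{y_j} [ δ f^M(i, j-1) + ε f^Y(i, j-1) ]
Termination: P(x, y) = f^E(n, m) = τ [ f^M(n, m) + f^X(n, m) + f^Y(n, m) ]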

Suboptimal alignment (1)
Types of suboptimal alignment
- Slightly different from the optimal alignment in a few positions
- Substantially or completely different, e.g. when there are repeats in one or both of the sequences

Suboptimal alignment (2)
Probabilistic sampling of alignments
- Sample alignments from the posterior distribution P(π | x, y)
- Trace back through the forward variables f^k(i, j), choosing each step in proportion to its contribution
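As an illustration of one traceback step (a sketch, using the forward variables reconstructed above): from the match state at cell (i, j), the predecessor is chosen with probability proportional to its contribution to f^M(i, j), e.g.

P(previous state M at (i-1, j-1)) = (1-2δ-τ) f^M(i-1, j-1) p_{x_i y_j} / f^M(i, j)

and similarly for the X and Y predecessors, whose terms carry the factor (1-ε-τ).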

Suboptimal alignment (3)
Finding distinct suboptimal alignments
- Waterman & Eggert [1987]
- Find the next best alignment that has no aligned residue pairs in common with any previously determined alignment

Figure 4.5

The posterior probability that x_i is aligned to y_j (1)
A reliability measure for each part of an alignment
- The quantity of interest is obtained by combining the forward algorithm and the backward algorithm
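Combining the two gives the posterior match probability (a sketch, writing x_i ◇ y_j for "x_i is aligned to y_j"):

P(x_i ◇ y_j | x, y) = f^M(i, j) b^M(i, j) / P(x, y)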

The posterior probability that x_i is aligned to y_j (2)
Algorithm: Backward calculation for pair HMMs
- Initialization: b^M(n, m) = b^X(n, m) = b^Y(n, m) = τ; all b^*(i, m+1) and b^*(n+1, j) are set to 0.
- Recurrence: i = 1,…,n, j = 1,…,m, except (n, m).
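A sketch of the backward recurrence, reconstructed from the same transition probabilities as the forward recurrence above:

b^M(i, j) = (1-2δ-τ) p_{x_{i+1} y_{j+1}} b^M(i+1, j+1) + δ [ q_{x_{i+1}} b^X(i+1, j) + q_{y_{j+1}} b^Y(i, j+1) ]
b^X(i, j) = (1-ε-τ) p_{x_{i+1} y_{j+1}} b^M(i+1, j+1) + ε q_{x_{i+1}} b^X(i+1, j)
b^Y(i, j) = (1-ε-τ) p_{x_{i+1} y_{j+1}} b^M(i+1, j+1) + ε q_{y_{j+1}} b^Y(i, j+1)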

The posterior probability that x_i is aligned to y_j (3)
The expected accuracy of an alignment
- The expected overlap between π and paths sampled from the posterior distribution
- The alignment maximizing it can be found by dynamic programming
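A sketch of that dynamic programming step, writing A(i, j) for the highest expected accuracy over alignments of x_1..i and y_1..j (the name A is an assumption):

A(i, j) = max{ A(i-1, j-1) + P(x_i ◇ y_j | x, y), A(i-1, j), A(i, j-1) }

with A(i, 0) = A(0, j) = 0; the alignment itself is recovered by the usual traceback.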

Pair HMMs vs FSAs for searching (1)
Two difficulties of conventional methods in searching
- They are not probabilistic models, so their scores are not directly suited to searching
- The full probability P(x, y | M) is not computable from them
- Model comparison is done using the best match rather than the total probability
[Example diagram: a toy model with states S and B and transition probabilities α and 1-α, giving P_S(abac) = α⁴ q_a q_b q_a q_c and P_B(abac) = 1-α.]

Pair HMMs vs FSAs for searching (2)
Converting an FSA into a probabilistic model
- Probabilistic models may underperform standard alignment methods if the Viterbi algorithm is used for database searching.
- But if the forward algorithm is used, they can be expected to do better than standard methods.