Bioinformatics Course, December 2013: Hidden Markov Models and Their Generalizations. In the Name of God.

Presentation transcript:

1 Bioinformatics Course, December 2013. Hidden Markov Models and Their Generalizations. In the Name of God.

2 Sharif University of Technology. HMM Concept. Observable Markov model (Markov chain): state = weather condition. [Diagram: a Markov chain over states S1 to S5, with a state sequence q1 ... q6 and the corresponding observation sequence o1 ... o6.] In a regular Markov model, the state is directly visible to the observer, so the state transition probabilities are the only parameters.
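As a minimal sketch of the observable case, the probability of a fully visible state sequence is the initial probability times the product of the transition probabilities along the path. The three weather states and all numbers below are hypothetical and are not taken from the slides.

```python
# Hypothetical weather chain; every probability here is illustrative only.
INITIAL = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2}
TRANSITION = {
    "sunny":  {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5},
}


def sequence_probability(states):
    """Return P(q1, ..., qT) = pi(q1) * product over t of a(q_{t-1}, q_t)."""
    prob = INITIAL[states[0]]
    for prev, curr in zip(states, states[1:]):
        prob *= TRANSITION[prev][curr]
    return prob


# sunny -> sunny -> rainy: 0.5 * 0.7 * 0.1
print(sequence_probability(["sunny", "sunny", "rainy"]))
```

Because the states are observed directly, no emission distribution is needed; the transition table is the whole model.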

3 HMM Concept. Hidden Markov model: state = atmospheric pressure. [Diagram: a Markov chain over hidden states S1 to S4, with a state sequence q1 ... q6 and an observation sequence o1 ... o6.] In a hidden Markov model, the state is not directly visible, but variables influenced by the state are visible. Each state has a probability distribution over the possible output tokens, so the sequence of tokens generated by an HMM gives some information about the sequence of states.
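The generative view described on this slide can be sketched as follows: a hidden pressure state evolves as a Markov chain, and each state emits a visible weather token from its own distribution. The two states, the two tokens, and all probabilities are assumptions made for illustration.

```python
import random

# Hypothetical two-state model: hidden state = pressure, observation = weather.
START = {"high": 0.6, "low": 0.4}
TRANS = {"high": {"high": 0.7, "low": 0.3}, "low": {"high": 0.4, "low": 0.6}}
EMIT = {"high": {"sunny": 0.8, "rainy": 0.2}, "low": {"sunny": 0.3, "rainy": 0.7}}


def draw(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes])[0]


def sample_hmm(length, rng):
    """Generate a hidden state sequence and the observations it emits."""
    states, observations = [], []
    state = draw(START, rng)
    for _ in range(length):
        states.append(state)
        observations.append(draw(EMIT[state], rng))  # emission depends only on the state
        state = draw(TRANS[state], rng)              # next state depends only on the current one
    return states, observations


hidden, visible = sample_hmm(6, random.Random(0))
print(hidden)
print(visible)
```

An observer sees only the second list; the three problems on the following slides are about reasoning back from it to the model and the hidden states.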

4 HMM Model. [Diagram: hidden states S1 to S4; state sequence q1 ... q6; emitted symbols v1 to v5; observation sequence o1 ... o6.]

5 HMM Evaluation. Problem 1: Given an observation sequence and a model, compute the probability of the observation sequence. [Diagram: state sequence q1 ... q6 over states S1 to S4, emitting symbols v1 to v5 as observations o1 ... o6.]

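6 HMM Forward & Backward. [Diagram: the forward step goes from state Sm at time t-1 to state Sn at time t, which emits Ot; the backward step goes from state Sm at time t to state Sn at time t+1, which emits Ot+1.]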
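A minimal sketch of both recursions for the evaluation problem, using a hypothetical two-state model (hidden pressure, visible weather) whose numbers are illustrative only. The forward pass accumulates alpha_t(n) = P(o_1..o_t, q_t = n), the backward pass accumulates beta_t(m) = P(o_{t+1}..o_T | q_t = m), and either one yields the probability of the whole observation sequence.

```python
# Hypothetical model parameters (not from the slides).
START = {"high": 0.6, "low": 0.4}
TRANS = {"high": {"high": 0.7, "low": 0.3}, "low": {"high": 0.4, "low": 0.6}}
EMIT = {"high": {"sunny": 0.8, "rainy": 0.2}, "low": {"sunny": 0.3, "rainy": 0.7}}
STATES = list(START)


def forward(obs):
    """Return the list of alpha_t dictionaries, one per time step."""
    alpha = [{s: START[s] * EMIT[s][obs[0]] for s in STATES}]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append({n: EMIT[n][o] * sum(prev[m] * TRANS[m][n] for m in STATES)
                      for n in STATES})
    return alpha


def backward(obs):
    """Return the list of beta_t dictionaries, one per time step."""
    beta = [{s: 1.0 for s in STATES}]  # beta_T(s) = 1
    for o in reversed(obs[1:]):        # o plays the role of o_{t+1}
        nxt = beta[0]
        beta.insert(0, {m: sum(TRANS[m][n] * EMIT[n][o] * nxt[n] for n in STATES)
                        for m in STATES})
    return beta


def likelihood(obs):
    """P(O | model) from the forward variables at the final time step."""
    return sum(forward(obs)[-1].values())


def backward_likelihood(obs):
    """The same quantity computed from the backward variables at t = 1."""
    beta1 = backward(obs)[0]
    return sum(START[s] * EMIT[s][obs[0]] * beta1[s] for s in STATES)


print(likelihood(["sunny", "rainy"]))           # 0.0768 + 0.1512
print(backward_likelihood(["sunny", "rainy"]))
```

Both functions cost time proportional to T times the square of the number of states, instead of summing over every possible state path. For long sequences the products underflow, so practical implementations rescale or work in log space; that is omitted here.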

7 HMM Decoding / Classification / Inference. Problem 2: Given an observation sequence and a model, compute the optimal state sequence that produces the given observations. Method: Viterbi, which consists of a recursion step and a backtracking step. [Diagram: state sequence q1 ... q6 emitting observations o1 ... o6; the recursion goes from state Sm at time t-1 to state Sn at time t, which emits Ot.]
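The recursion and backtracking steps can be sketched as below. The two-state model is hypothetical (hidden pressure, visible weather) and its numbers are illustrative only; plain probabilities are used for readability, whereas real implementations usually work with log probabilities.

```python
# Hypothetical model parameters (not from the slides).
START = {"high": 0.6, "low": 0.4}
TRANS = {"high": {"high": 0.7, "low": 0.3}, "low": {"high": 0.4, "low": 0.6}}
EMIT = {"high": {"sunny": 0.8, "rainy": 0.2}, "low": {"sunny": 0.3, "rainy": 0.7}}
STATES = list(START)


def viterbi(obs):
    """Return the most probable state path and its joint probability."""
    # delta[s]: probability of the best path that ends in state s at the current time
    delta = {s: START[s] * EMIT[s][obs[0]] for s in STATES}
    backpointers = []

    # Recursion: extend the best path into each state by one time step.
    for o in obs[1:]:
        pointers, new_delta = {}, {}
        for n in STATES:
            best_prev = max(STATES, key=lambda m: delta[m] * TRANS[m][n])
            pointers[n] = best_prev
            new_delta[n] = delta[best_prev] * TRANS[best_prev][n] * EMIT[n][o]
        backpointers.append(pointers)
        delta = new_delta

    # Backtracking: follow the stored pointers from the best final state.
    last = max(STATES, key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


print(viterbi(["sunny", "rainy", "rainy"]))
```

The only change from the forward recursion is that the sum over previous states becomes a maximum, with the maximizing state remembered for backtracking.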

8 HMM Learning. Problem 3: Given an observation sequence, estimate the parameters of the model, whether or not the state sequence is known. Method: Expectation/Maximization. [Diagram: state sequence q1 ... q6 emitting observations o1 ... o6; transition from state Sm at time t to state Sn at time t+1.]
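For the case where the state sequence is known, maximum-likelihood estimation reduces to counting and normalizing, which the sketch below illustrates. The function name and the training data are hypothetical. When the states are unknown, Expectation/Maximization replaces these hard counts with expected counts obtained from the forward and backward variables; that case is not shown here.

```python
from collections import Counter, defaultdict


def estimate_from_labeled(sequences):
    """Estimate transition and emission probabilities from labeled data.

    Each training sequence is a list of (state, observation) pairs.
    No pseudocounts are added, so unseen events get no entry at all.
    """
    trans_counts = defaultdict(Counter)
    emit_counts = defaultdict(Counter)
    for seq in sequences:
        for state, obs in seq:
            emit_counts[state][obs] += 1
        for (prev, _), (curr, _) in zip(seq, seq[1:]):
            trans_counts[prev][curr] += 1

    def normalize(counts):
        # Turn each row of counts into relative frequencies.
        return {s: {k: c / sum(row.values()) for k, c in row.items()}
                for s, row in counts.items()}

    return normalize(trans_counts), normalize(emit_counts)


# Hypothetical labeled sequence: (hidden pressure, observed weather).
data = [[("high", "sunny"), ("high", "sunny"), ("low", "rainy"), ("low", "sunny")]]
transitions, emissions = estimate_from_labeled(data)
print(transitions)
print(emissions)
```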

9 Application: Protein Structure

10 HMM Variants: Profile HMM. Constructing a profile HMM: each consensus column can exist in three states (match, insert, and delete), so the number of states depends on the length of the alignment. A typical profile HMM architecture: squares represent match states, diamonds represent insert states, circles represent delete states, and arrows represent transitions. [Figure legend: transition between match states; transition from match state to insert state; transition within insert state; transition from match state to delete state; transition within delete state; emission of a symbol at a state.]
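To make the dependence on alignment length concrete, the sketch below builds the state list and the allowed transitions for a given number of consensus columns. It assumes one common textbook layout (a Begin state, an initial insert state I0, one match/insert/delete triple per column, and an End state) with every forward transition allowed; the slide's figure may differ, and some tools omit the insert-to-delete and delete-to-insert transitions. The function name is hypothetical.

```python
def profile_hmm_topology(n_match):
    """Return (states, transitions) for a profile HMM with n_match consensus columns."""
    states = ["Begin", "I0"]
    for k in range(1, n_match + 1):
        states += [f"M{k}", f"I{k}", f"D{k}"]
    states.append("End")

    def match(k):
        # Begin and End act as the match states of columns 0 and n_match + 1.
        if k == 0:
            return "Begin"
        if k == n_match + 1:
            return "End"
        return f"M{k}"

    transitions = []
    for k in range(n_match + 1):
        sources = [match(k), f"I{k}"] + ([f"D{k}"] if k >= 1 else [])
        # From column k: stay in the insert state, or advance to the next match or delete state.
        targets = [f"I{k}", match(k + 1)] + ([f"D{k + 1}"] if k < n_match else [])
        transitions += [(s, t) for s in sources for t in targets]
    return states, transitions


states, transitions = profile_hmm_topology(3)
print(len(states), len(transitions))  # 3 * 3 + 3 states, 9 * 3 + 3 transitions
```

Under this layout the model has 3L + 3 states for L consensus columns, which is why longer alignments produce larger models.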

11 HMM Variants. There exist a large number of HMM variants that modify and extend the basic model to meet the needs of various applications. Examples: adding silent states to the model to represent the absence of certain symbols that are expected to be present at specific locations; making the states emit two aligned symbols, instead of a single symbol, so that the resulting HMM simultaneously generates two related symbol sequences; making the probabilities at certain states dependent on part of the previous emissions, to describe more complex symbol correlations.

12 Profile HMM. Example: CpG islands

13 Profile HMM: Concept and Model. Profile HMMs are stochastic methods for modeling multiple sequence alignments of proteins and DNA sequences. Potential application domains, each with its corresponding task: protein families could be modeled as an HMM or a group of HMMs (constructing a profile HMM); new protein sequences could be aligned with stored models to detect remote homology (aligning a sequence with a stored profile HMM); two or more protein family profile HMMs could be aligned to detect homology (finding statistical similarities between two profile HMM models). Other uses: protein classification; motif detection; finding multiple sequence alignments; scoring a sequence against a profile HMM; comparing two profile HMMs.

14 Profile HMM. Example: Problem 2

15 Profile HMM. Example: Problem 1

16 Profile HMM: Applications. Comparing two multiple sequence alignments or sequence profiles, instead of comparing a single sequence against a multiple alignment or a profile, can be beneficial for detecting remote homologues. For example: COACH allows us to compare sequence alignments by building a profile-HMM from one alignment and aligning the other alignment to the constructed profile-HMM. HHsearch generalizes the traditional pairwise sequence alignment algorithm to find the alignment of two profile-HMMs. PRC (profile comparer) provides a tool for scoring and aligning profile-HMMs produced by popular software tools. Profile-HMMs can also model sequences of protein secondary structure symbols: helix (H), strand (E), and coil (C). A feature-based profile-HMM was proposed to improve the performance of remote protein homology detection: instead of emitting amino acids, emissions of these HMMs are based on 'features' that capture the biochemical properties of the protein family of interest.

17 Pair HMM: Concept and Model. The optimal state sequence y* can be found using dynamic programming, by a simple modification of the Viterbi algorithm. The computational complexity of the resulting alignment algorithm is only O(L_x L_z), where L_x and L_z are the lengths of the two sequences being aligned.
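A minimal sketch of such a modified Viterbi algorithm, assuming a common three-state pair-HMM layout that the slide does not spell out: state M emits an aligned pair, state X emits a symbol of x against a gap, and state Y emits a symbol of z against a gap. The gap-open probability delta, the gap-extend probability epsilon, the emission values, and the missing end state are all simplifying assumptions. The table has (L_x + 1)(L_z + 1) cells with constant work per cell, which gives the O(L_x L_z) cost.

```python
import math

NEG_INF = float("-inf")


def pair_hmm_viterbi(x, z, delta=0.1, epsilon=0.3,
                     p_same=0.225, p_diff=0.1 / 12, q=0.25):
    """Return (best log-probability, aligned x, aligned z) under a toy pair-HMM."""
    lx, lz = len(x), len(z)
    # score[s][i][j]: best log-probability of emitting x[:i] and z[:j], ending in state s
    score = {s: [[NEG_INF] * (lz + 1) for _ in range(lx + 1)] for s in "MXY"}
    back = {s: [[None] * (lz + 1) for _ in range(lx + 1)] for s in "MXY"}
    score["M"][0][0] = 0.0  # assumption: the path starts in M before any emission

    # Log transition scores into each state, keyed by the previous state.
    into = {
        "M": {"M": math.log(1 - 2 * delta), "X": math.log(1 - epsilon),
              "Y": math.log(1 - epsilon)},
        "X": {"M": math.log(delta), "X": math.log(epsilon)},
        "Y": {"M": math.log(delta), "Y": math.log(epsilon)},
    }
    step = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}  # symbols consumed from x and z

    for i in range(lx + 1):
        for j in range(lz + 1):
            for s in "MXY":
                di, dj = step[s]
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                if s == "M":
                    emit = math.log(p_same if x[i - 1] == z[j - 1] else p_diff)
                else:
                    emit = math.log(q)
                prev, best = max(((r, score[r][pi][pj] + t) for r, t in into[s].items()),
                                 key=lambda c: c[1])
                if best > NEG_INF:
                    score[s][i][j] = best + emit
                    back[s][i][j] = prev

    # Backtracking from the best state in the final cell.
    state = max("MXY", key=lambda s: score[s][lx][lz])
    best_score = score[state][lx][lz]
    i, j, top, bottom = lx, lz, [], []
    while i > 0 or j > 0:
        di, dj = step[state]
        top.append(x[i - 1] if di else "-")
        bottom.append(z[j - 1] if dj else "-")
        state, i, j = back[state][i][j], i - di, j - dj
    return best_score, "".join(reversed(top)), "".join(reversed(bottom))


print(pair_hmm_viterbi("ACGT", "AGT"))
```

The state path itself is the alignment: each M step is an aligned column and each X or Y step is a gap in one of the two sequences.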

18 Pair HMM: Applications. Finding pairwise alignments of protein and DNA sequences: finding the optimal sequence alignment, computing the overall alignment probability, and estimating the reliability of the individual alignment regions. Multiple sequence alignment (MSA): many MSA algorithms also make use of pair-HMMs. The most widely adopted strategy for constructing a multiple alignment is the progressive alignment approach, where sequences are assembled into one large multiple alignment through consecutive pairwise alignment steps according to a guide tree. Gene prediction: for example, in a method called Pairagon+N-SCAN_EST, a pair-HMM is first used to find accurate alignments of cDNA sequences to a given genome, and these alignments are combined with a gene prediction algorithm for accurate genome annotation. Comparing two DNA sequences and jointly analyzing their gene structures. Aligning more complex structures, such as trees.

19 HsMM (hidden semi-Markov model) Model. [Diagram: a chain over states S1 to S4 in which each visited state has an explicit duration, d1 ... d6.]
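The role of the durations d1 ... d6 can be sketched generatively: on entering a state, a hidden semi-Markov model draws how long to stay there from an explicit duration distribution, emits that many observations, and only then moves to a different state. The states, symbols, and all distributions below are hypothetical.

```python
import random

# Hypothetical model. There are no self-transitions, because the time spent in a
# state is governed by the explicit duration distribution instead.
TRANS = {"S1": {"S2": 1.0}, "S2": {"S1": 0.5, "S3": 0.5}, "S3": {"S1": 1.0}}
DURATION = {"S1": {1: 0.5, 2: 0.5}, "S2": {2: 0.3, 3: 0.7}, "S3": {1: 1.0}}
EMIT = {"S1": {"a": 0.9, "b": 0.1}, "S2": {"a": 0.2, "b": 0.8}, "S3": {"a": 0.5, "b": 0.5}}


def draw(dist, rng):
    """Draw one outcome from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes])[0]


def sample_hsmm(n_segments, rng, start="S1"):
    """Generate n_segments (state, duration) segments and their observations."""
    segments, observations = [], []
    state = start
    for _ in range(n_segments):
        d = draw(DURATION[state], rng)                   # how long to stay in this state
        segments.append((state, d))
        observations.extend(draw(EMIT[state], rng) for _ in range(d))
        state = draw(TRANS[state], rng)                  # always moves to a different state
    return segments, observations


segments, observations = sample_hsmm(5, random.Random(1))
print(segments)
print(observations)
```

In a standard HMM the time spent in a state is implicitly geometric, set by the self-transition probability; the explicit duration table removes that restriction.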

20 Coupling: Concept & Model. [Diagram: coupled Markov chains, one over states S1 to S5 and one over states S1 to S4; after Brand [3].]

21 Ancestors Diagram. [Diagram: works leading to the CHSMM: Baum 1966; Ferguson 1980; Brand 1996; Natarajan 2007.]

