Ab Initio Profile HMM Generation


Sam Gross

Profile HMMs

[Profile HMM architecture diagram, adapted from a Batzoglou lecture: BEGIN and END states flank a chain of match states M1…Mm, with insert states I0…Im and delete states D1…Dm]

Each M state has a position-specific, pre-computed substitution table. Each I and D state has position-specific gap penalties. A protein profile H is a generative model: the sequence X that is aligned to H is thought of as "generated by" H. Therefore, H parametrizes a conditional distribution P(X | H).
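
To make the state layout concrete, here is a minimal Python sketch; the class and the uniform initializer are illustrative placeholders, not code from the talk:

```python
from dataclasses import dataclass, field

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 amino acids

@dataclass
class ProfileHMM:
    """Hypothetical container for a length-m profile HMM."""
    m: int                                                # number of match states
    match_emissions: list = field(default_factory=list)   # m tables P(residue | M_k)
    insert_emissions: list = field(default_factory=list)  # m+1 tables P(residue | I_k)
    transitions: dict = field(default_factory=dict)       # e.g. {("M", 1, "D", 2): p, ...}

def uniform_profile(m):
    """Initialize a profile of length m with uniform emission tables."""
    p = 1.0 / len(ALPHABET)
    hmm = ProfileHMM(m=m)
    hmm.match_emissions = [{a: p for a in ALPHABET} for _ in range(m)]
    hmm.insert_emissions = [{a: p for a in ALPHABET} for _ in range(m + 1)]
    return hmm
```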

Ab Initio Profile Generation

Given N related protein sequences x1…xN, construct a profile HMM H such that

∏i P(xi | H)

is maximized.
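
In practice the product is maximized in log space to avoid numerical underflow. A minimal sketch, assuming a hypothetical profile_ll(x) that returns log P(x | H), e.g. computed by the forward algorithm:

```python
def total_log_likelihood(sequences, profile_ll):
    """Sum of per-sequence log-likelihoods. Maximizing this sum is
    equivalent to maximizing the product of the P(x_i | H) terms,
    but avoids multiplying many numbers close to zero."""
    return sum(profile_ll(x) for x in sequences)
```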

Easier Said Than Done

- Profile HMM length is unknown (use the average sequence length)
- Alignment is unknown
- HMM parameters are unknown

Not A New Problem

- An instance of the general problem of HMM parameter estimation from unlabeled outputs
- An instance of the even more general problem of maximum likelihood estimation with partially missing data

We want: argmaxθ P(Dobs | θ)
We know: P(Dobs, Dhid | θ)

The Expectation Maximization (EM) Algorithm

Start with an initial guess for the parameters, then iterate until convergence:
- E-step: calculate expectations for the missing data
- M-step: treating the expectations as observations, calculate the MLE for the parameters
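
The same loop structure applies to any model. A generic skeleton, where e_step and m_step are placeholders for model-specific code:

```python
def em(params, e_step, m_step, max_iters=100, tol=1e-6):
    """Generic EM loop (an illustrative skeleton, not code from the talk).
    e_step(params) -> (expected sufficient statistics, log-likelihood)
    m_step(stats)  -> parameters maximizing the expected complete-data
                      log-likelihood (expectations treated as observations)"""
    prev_ll = float("-inf")
    for _ in range(max_iters):
        stats, ll = e_step(params)   # E-step
        params = m_step(stats)       # M-step
        if ll - prev_ll < tol:       # EM never decreases the likelihood
            break
        prev_ll = ll
    return params
```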

Baum-Welch: EM For HMMs

Start with an initial guess of the HMM parameters, then iterate until convergence:
- E-step: run the forward-backward algorithm
- M-step: compute the MLE using the forward-backward posterior probabilities
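
For concreteness, a textbook Baum-Welch sketch for a generic discrete-output HMM (not the profile-HMM-specific version from the talk):

```python
import numpy as np

def baum_welch(obs, A, B, pi, n_iters=50):
    """obs: observation symbol indices; A: (S, S) transitions;
    B: (S, V) emissions; pi: (S,) initial distribution."""
    obs = np.asarray(obs)
    T, S = len(obs), A.shape[0]
    for _ in range(n_iters):
        # E-step: forward-backward with per-step scaling for stability
        alpha, scale = np.zeros((T, S)), np.zeros(T)
        alpha[0] = pi * B[:, obs[0]]
        scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
            scale[t] = alpha[t].sum(); alpha[t] /= scale[t]
        beta = np.zeros((T, S)); beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
        gamma = alpha * beta                     # P(state_t = s | obs)
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = np.zeros((S, S))                    # expected transition counts
        for t in range(T - 1):
            x = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
            xi += x / x.sum()
        # M-step: treat the expected counts as observed counts
        pi = gamma[0]
        A = xi / gamma[:-1].sum(axis=0)[:, None]
        B = np.zeros_like(B)
        for v in range(B.shape[1]):
            B[:, v] = gamma[obs == v].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]
    return A, B, pi
```

The per-step scaling keeps the forward and backward quantities in floating-point range; an equivalent alternative is to work entirely in log space.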

Incorporating Prior Knowledge

We know in advance that certain types of residues tend to align together. Use a Dirichlet mixture prior over the outputs of the match states; each distribution in the mixture corresponds to a different "alignment environment".
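
One standard way to apply such a prior, sketched below: weight each mixture component by its posterior responsibility for a match state's observed counts, then mix the component-wise posterior-mean estimates. The mixture components themselves (weights and alphas) would come from a published Dirichlet mixture; none are specified here:

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_mixture_mean(counts, weights, alphas):
    """Posterior-mean emission estimate for one match state.
    counts:  (V,) observed residue counts at the match state
    weights: (K,) mixture weights
    alphas:  (K, V) Dirichlet parameters, one row per component"""
    counts = np.asarray(counts, dtype=float)

    def log_marginal(a):
        # log P(counts | component) under the Dirichlet-multinomial;
        # the multinomial coefficient is omitted because it is the
        # same for every component and cancels in the responsibilities.
        return (gammaln(a.sum()) - gammaln(counts.sum() + a.sum())
                + gammaln(counts + a).sum() - gammaln(a).sum())

    logp = np.log(weights) + np.array([log_marginal(a) for a in alphas])
    resp = np.exp(logp - logp.max())
    resp /= resp.sum()                   # P(component | counts)
    # Per-component posterior means, mixed by responsibility
    means = (counts + alphas) / (counts.sum() + alphas.sum(axis=1))[:, None]
    return resp @ means
```

With a single-component mixture this reduces to the familiar pseudocount estimate (nj + αj) / (N + |α|).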

Coin Flips Example

Two trick coins (Coin A and Coin B) are used to generate a sequence of heads and tails. You see only the sequence, and must determine the probability of heads for each coin.

10,000 Coin Flips

                 PA(heads)   PB(heads)
Real coins       0.4         0.8
Initial guess    0.51        0.49
Learned model    0.801       0.413

(The learned labels are swapped relative to the real coins: learned PA ≈ real PB. This is expected, since nothing in the data distinguishes which coin is "A".)
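
The talk does not spell out the generative setup, so the sketch below assumes one common variant: a hidden coin is drawn once per block of 10 flips. EM then recovers the two biases from the head counts alone, reproducing the behavior in the table, including the possible label swap:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 1,000 blocks of 10 flips = 10,000 flips; each block comes
# entirely from one hidden coin (an assumed setup, not from the talk).
p_true = {"A": 0.8, "B": 0.4}
heads = np.array([
    (rng.random(10) < p_true[rng.choice(["A", "B"])]).sum()
    for _ in range(1000)
])
n = 10

# EM for the two biases, starting near the symmetric point. Mixing
# weights are fixed at 1/2 for simplicity, so they cancel in the E-step.
pA, pB = 0.51, 0.49
for _ in range(200):
    # E-step: responsibility of coin A for each block; the binomial
    # coefficient is the same for both coins and cancels.
    lA = pA**heads * (1 - pA)**(n - heads)
    lB = pB**heads * (1 - pB)**(n - heads)
    rA = lA / (lA + lB)
    # M-step: responsibility-weighted MLE of each coin's heads probability
    pA = (rA * heads).sum() / (rA * n).sum()
    pB = ((1 - rA) * heads).sum() / ((1 - rA) * n).sum()

print(pA, pB)  # ends up near 0.8 and 0.4 (possibly with labels swapped)
```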

Toy Profile Example

Create a profile for the following sequences, then use the profile to align them:

ADACGIH
ADAGIH
ADACGH
AACQH
ADAYGIH

Results

The alignment produced by the profile:

ADACGIH
ADA-GIH
ADACG-H
A-ACQ-H
ADAYGIH

Learned match-state emissions:

Match 1: A 100%
Match 2: D 100%
Match 3: A 100%
Match 4: C 75%, Y 25%
Match 5: G 80%, Q 20%
Match 6: I 62%, H 38%
Match 7: H 100%

Clustering With A Mixture Of Profiles

Given N protein sequences x1…xN, construct M profile HMMs H1…HM and a mapping F: x → H such that

∏i P(xi | F(xi))

is maximized. F is a natural clustering of the protein sequences into M groups.
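
One simple way to attack this joint optimization is hard-assignment alternation, analogous to k-means: refit each profile on its current cluster, then reassign each sequence to the profile under which it is most likely. A sketch under that assumption; train_profile and log_likelihood are assumed helpers, since the talk does not specify its training procedure:

```python
import random

def cluster_profiles(seqs, M, train_profile, log_likelihood, n_iters=20):
    """train_profile(seqs) fits one profile HMM to a set of sequences;
    log_likelihood(hmm, x) returns log P(x | hmm)."""
    assign = [random.randrange(M) for _ in seqs]  # random initial F
    for _ in range(n_iters):
        # Refit each profile on the sequences currently assigned to it
        # (an empty cluster crudely falls back to all sequences).
        hmms = [
            train_profile([x for x, a in zip(seqs, assign) if a == k] or seqs)
            for k in range(M)
        ]
        # Reassign each sequence to its best-scoring profile
        assign = [
            max(range(M), key=lambda k: log_likelihood(hmms[k], x))
            for x in seqs
        ]
    return hmms, assign
```

A soft variant would instead weight each sequence by P(Hk | x) during training rather than committing to a single cluster, which matches the EM framing of the rest of the talk more closely.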