Hidden Markov Models
A Hidden Markov Model consists of:
1. A sequence of states $\{X_t \mid t = 1, \dots, T\} = \{X_1, X_2, \dots, X_T\}$, and
2. A sequence of observations $\{Y_t \mid t = 1, \dots, T\} = \{Y_1, Y_2, \dots, Y_T\}$.
The states form a Markov chain and are not observed directly; each observation $Y_t$ depends only on the current state $X_t$.
Some basic problems: from the observations $\{Y_1, Y_2, \dots, Y_T\}$,
1. Determine the sequence of states $\{X_1, X_2, \dots, X_T\}$ (assuming the model is known):
- the Viterbi path
- the state probabilities given the observations $\{Y_1, Y_2, \dots, Y_T\}$
2. Determine (or estimate) the parameters of the stochastic process that is generating the states and the observations.
Computing Likelihood
Let $a_{ij} = P[X_{t+1} = j \mid X_t = i]$ and $A = (a_{ij})$ = the $M \times M$ transition matrix.
Let $\pi_i = P[X_1 = i]$ and $\pi = (\pi_1, \dots, \pi_M)$ = the initial distribution over the states.
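As a concrete illustration of these parameters, here is a minimal sketch (Python/NumPy, with made-up values for $A$, $\pi$, and a discrete emission matrix $B$, all of which are assumptions for illustration) that simulates one realization of the state and observation sequences:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters for an M = 2 state HMM with K = 3 observation symbols.
A  = np.array([[0.7, 0.3],        # a_ij = P[X_{t+1} = j | X_t = i]
               [0.4, 0.6]])
pi = np.array([0.6, 0.4])         # pi_i = P[X_1 = i]
B  = np.array([[0.5, 0.4, 0.1],   # assumed discrete emissions: B[i, k] = P[Y_t = k | X_t = i]
               [0.1, 0.3, 0.6]])

def simulate(T):
    """Draw one state sequence X_1..X_T and observation sequence Y_1..Y_T."""
    states, obs = [], []
    x = rng.choice(len(pi), p=pi)
    for _ in range(T):
        states.append(x)
        obs.append(rng.choice(B.shape[1], p=B[x]))
        x = rng.choice(len(pi), p=A[x])
    return np.array(states), np.array(obs)

states, obs = simulate(T=10)
```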
$P[X_1 = i_1, X_2 = i_2, \dots, X_T = i_T, Y_1 = y_1, Y_2 = y_2, \dots, Y_T = y_T] = P[\mathbf{X} = \mathbf{i}, \mathbf{Y} = \mathbf{y}]$
$= \pi_{i_1}\, P[Y_1 = y_1 \mid X_1 = i_1] \prod_{t=2}^{T} a_{i_{t-1} i_t}\, P[Y_t = y_t \mid X_t = i_t]$
Therefore
$P[Y_1 = y_1, Y_2 = y_2, \dots, Y_T = y_T] = P[\mathbf{Y} = \mathbf{y}] = \sum_{i_1=1}^{M} \cdots \sum_{i_T=1}^{M} \pi_{i_1}\, P[Y_1 = y_1 \mid X_1 = i_1] \prod_{t=2}^{T} a_{i_{t-1} i_t}\, P[Y_t = y_t \mid X_t = i_t]$,
the sum of the joint probability over all $M^T$ possible state sequences.
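Read literally, this formula says: enumerate every possible state sequence and add up its joint probability with the data. A minimal sketch of that brute-force computation (reusing the illustrative $A$, $B$, $\pi$ from above):

```python
import itertools
import numpy as np

def likelihood_bruteforce(obs, A, B, pi):
    """P[Y = y] by summing the joint probability over all M**T state paths."""
    M, T = len(pi), len(obs)
    total = 0.0
    for path in itertools.product(range(M), repeat=T):
        p = pi[path[0]] * B[path[0], obs[0]]
        for t in range(1, T):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
        total += p
    return total
```

The cost grows as $M^T$, which is why the forward and backward recursions below are used in practice.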
In the case when $Y_1, Y_2, \dots, Y_T$ are continuous random variables or continuous random vectors, let $f(y \mid \theta_i)$ denote the conditional density of $Y_t$ given $X_t = i$. Then the joint density of $Y_1, Y_2, \dots, Y_T$ is given by
$f(y_1, y_2, \dots, y_T) = f(\mathbf{y}) = \sum_{i_1=1}^{M} \cdots \sum_{i_T=1}^{M} \pi_{i_1}\, f(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} a_{i_{t-1} i_t}\, f(y_t \mid \theta_{i_t})$
Efficient Methods for Computing Likelihood: The Forward Method
Define $\alpha_t(i) = P[Y_1 = y_1, \dots, Y_t = y_t, X_t = i]$. Then
$\alpha_1(i) = \pi_i\, P[Y_1 = y_1 \mid X_1 = i]$,
$\alpha_{t+1}(j) = \Big(\sum_{i=1}^{M} \alpha_t(i)\, a_{ij}\Big) P[Y_{t+1} = y_{t+1} \mid X_{t+1} = j]$,
and $P[\mathbf{Y} = \mathbf{y}] = \sum_{i=1}^{M} \alpha_T(i)$, so the likelihood is obtained in $O(M^2 T)$ operations.
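A minimal sketch of this recursion for the discrete-emission case (same illustrative $A$, $B$, $\pi$ as before):

```python
import numpy as np

def forward(obs, A, B, pi):
    """Forward method: alpha[t, i] = P[y_1..y_{t+1}, X_{t+1} = i] (0-based t).
    Returns alpha and the likelihood P[Y = y] = sum_i alpha[T-1, i]."""
    T, M = len(obs), len(pi)
    alpha = np.zeros((T, M))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha, alpha[-1].sum()
```

For long sequences the unscaled $\alpha_t(i)$ underflow, so in practice each step is rescaled (or the computation is done in log space).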
The Backward Procedure
Define $\beta_t(i) = P[Y_{t+1} = y_{t+1}, \dots, Y_T = y_T \mid X_t = i]$. Then
$\beta_T(i) = 1$,
$\beta_t(i) = \sum_{j=1}^{M} a_{ij}\, P[Y_{t+1} = y_{t+1} \mid X_{t+1} = j]\, \beta_{t+1}(j)$,
and $P[\mathbf{Y} = \mathbf{y}] = \sum_{i=1}^{M} \pi_i\, P[Y_1 = y_1 \mid X_1 = i]\, \beta_1(i)$.
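A matching sketch of the backward pass, under the same discrete-emission assumptions:

```python
import numpy as np

def backward(obs, A, B, pi):
    """Backward procedure: beta[t, i] = P[y_{t+2}..y_T | X_{t+1} = i] (0-based t)."""
    T, M = len(obs), len(pi)
    beta = np.zeros((T, M))
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta
```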
Prediction of states from the observations and the model:
$\gamma_t(i) = P[X_t = i \mid Y_1 = y_1, \dots, Y_T = y_T] = \dfrac{\alpha_t(i)\,\beta_t(i)}{\sum_{j=1}^{M} \alpha_t(j)\,\beta_t(j)}$
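A short sketch combining the two passes above to compute these state probabilities (it reuses the forward and backward functions sketched earlier):

```python
import numpy as np

def state_posteriors(obs, A, B, pi):
    """gamma[t, i] = P[X_{t+1} = i | Y = y] from the forward and backward passes."""
    alpha, _ = forward(obs, A, B, pi)
    beta = backward(obs, A, B, pi)
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)
```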
The Viterbi Algorithm (Viterbi Paths)
The Viterbi path is the sequence of states $X_1 = i_1, X_2 = i_2, \dots, X_T = i_T$ that maximizes
$P[X_1 = i_1, \dots, X_T = i_T, Y_1 = y_1, \dots, Y_T = y_T]$
for a given set of observations $Y_1 = y_1, Y_2 = y_2, \dots, Y_T = y_T$.
Summary of calculations of the Viterbi path:
1. $\delta_1(i) = \pi_i\, P[Y_1 = y_1 \mid X_1 = i]$, $i = 1, 2, \dots, M$
2. $\delta_{t+1}(j) = \max_i \big[\delta_t(i)\, a_{ij}\big]\, P[Y_{t+1} = y_{t+1} \mid X_{t+1} = j]$ and $\psi_{t+1}(j) = \arg\max_i \big[\delta_t(i)\, a_{ij}\big]$, $j = 1, 2, \dots, M$; $t = 1, \dots, T-1$
3. $i_T = \arg\max_i \delta_T(i)$, then trace back $i_t = \psi_{t+1}(i_{t+1})$ for $t = T-1, \dots, 1$.
Worked example: HMM generator (normal).xls
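A minimal sketch of this recursion and traceback for the discrete-emission case (assumed $A$, $B$, $\pi$ as above):

```python
import numpy as np

def viterbi(obs, A, B, pi):
    """Most probable state path arg max P[X = i, Y = y], with back-pointers psi."""
    T, M = len(obs), len(pi)
    delta = np.zeros((T, M))            # delta[t, i]: best joint probability ending in state i
    psi = np.zeros((T, M), dtype=int)   # psi[t, j]: best predecessor state
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        trans = delta[t - 1, :, None] * A          # trans[i, j] = delta_{t-1}(i) * a_ij
        psi[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) * B[:, obs[t]]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):                 # trace back
        path[t] = psi[t + 1, path[t + 1]]
    return path
```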
Estimation of Parameters of a Hidden Markov Model
If both the sequence of observations $Y_1, Y_2, \dots, Y_T$ and the sequence of states $X_1, X_2, \dots, X_T$ are observed, say $Y_1 = y_1, Y_2 = y_2, \dots, Y_T = y_T$ and $X_1 = i_1, X_2 = i_2, \dots, X_T = i_T$, then the likelihood is given by:
$L = \pi_{i_1}\, f(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} a_{i_{t-1} i_t}\, f(y_t \mid \theta_{i_t})$
and the log-likelihood is given by:
$\log L = \log \pi_{i_1} + \sum_{t=2}^{T} \log a_{i_{t-1} i_t} + \sum_{t=1}^{T} \log f(y_t \mid \theta_{i_t})$
In this case the maximum likelihood estimates are:
$\hat{\pi}_i$ = the proportion of observed realizations starting in state $i$ (with a single realization, the indicator of $i_1 = i$),
$\hat{a}_{ij} = \dfrac{n_{ij}}{\sum_{k=1}^{M} n_{ik}}$, where $n_{ij}$ = the number of observed transitions from state $i$ to state $j$, and
$\hat{\theta}_i$ = the MLE of $\theta_i$ computed from the observations $y_t$ where $X_t = i$.
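A small sketch of these complete-data estimates, assuming for concreteness that the emissions are univariate normal, so the MLE of $\theta_i$ is just the sample mean and variance of the $y_t$ assigned to state $i$:

```python
import numpy as np

def complete_data_mle(states, y, M):
    """MLEs of the transition matrix and per-state normal parameters when states are observed."""
    counts = np.zeros((M, M))
    for t in range(len(states) - 1):
        counts[states[t], states[t + 1]] += 1
    A_hat = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)
    means = np.array([y[states == i].mean() for i in range(M)])       # MLE of theta_i ...
    variances = np.array([y[states == i].var() for i in range(M)])    # ... from y_t with X_t = i
    return A_hat, means, variances
```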
MLE (states unknown)
If only the sequence of observations $Y_1 = y_1, Y_2 = y_2, \dots, Y_T = y_T$ is observed, then the likelihood is given by:
$L = \sum_{i_1=1}^{M} \cdots \sum_{i_T=1}^{M} \pi_{i_1}\, f(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} a_{i_{t-1} i_t}\, f(y_t \mid \theta_{i_t})$
It is difficult to find the maximum likelihood estimates directly from this likelihood function. The techniques that are used are:
1. The Segmental K-means Algorithm
2. The Baum-Welch (E-M) Algorithm
The Segmental K-means Algorithm
In this method the parameters are adjusted to maximize $P[\mathbf{X} = \hat{\mathbf{i}}, \mathbf{Y} = \mathbf{y}]$, where $\hat{\mathbf{i}} = (\hat{i}_1, \dots, \hat{i}_T)$ is the Viterbi path.
Consider this with the special case where the observations $\{Y_1, Y_2, \dots, Y_T\}$ are continuous multivariate normal with mean vector $\boldsymbol{\mu}_i$ and covariance matrix $\Sigma_i$ when $X_t = i$, i.e.
$f(\mathbf{y} \mid \theta_i) = (2\pi)^{-p/2} |\Sigma_i|^{-1/2} \exp\!\big[-\tfrac{1}{2}(\mathbf{y} - \boldsymbol{\mu}_i)' \Sigma_i^{-1} (\mathbf{y} - \boldsymbol{\mu}_i)\big]$
1. Pick arbitrarily $M$ centroids $a_1, a_2, \dots, a_M$. Assign each of the $T$ observations $y_t$ ($kT$ if $k$ multiple realizations are observed) to a state $i_t$ by finding the nearest centroid: $i_t = \arg\min_i \lVert y_t - a_i \rVert$.
2. Then estimate the initial distribution and the transition matrix from the assigned state sequence: $\hat{\pi}_i$ = the proportion of realizations starting in state $i$ and $\hat{a}_{ij} = n_{ij} / \sum_k n_{ik}$, where $n_{ij}$ = the number of assigned transitions from state $i$ to state $j$.
3. And estimate the state-conditional normal parameters: $\hat{\boldsymbol{\mu}}_i$ = the sample mean and $\hat{\Sigma}_i$ = the sample covariance matrix of the observations assigned to state $i$.
4. Calculate the Viterbi path $(i_1, i_2, \dots, i_T)$ based on the parameters of steps 2 and 3.
5. If there is a change in the sequence $(i_1, i_2, \dots, i_T)$, repeat steps 2 to 4 (see the sketch after this list).
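A compact sketch of this loop, assuming univariate normal emissions for brevity (the multivariate case only changes the density and the sample covariance) and assuming every state keeps at least one observation; the centroid initialization and the small variance floor are illustrative choices:

```python
import numpy as np
from scipy.stats import norm

def viterbi_gauss(y, A, pi, mu, sigma):
    """Viterbi path for univariate normal emissions N(mu_i, sigma_i^2)."""
    T, M = len(y), len(pi)
    dens = norm.pdf(y[:, None], loc=mu, scale=sigma)   # dens[t, i] = f(y_t | theta_i)
    delta = np.zeros((T, M))
    psi = np.zeros((T, M), dtype=int)
    delta[0] = pi * dens[0]
    for t in range(1, T):
        trans = delta[t - 1, :, None] * A
        psi[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) * dens[t]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path

def segmental_kmeans(y, M, n_iter=50):
    """Steps 1-5: nearest-centroid assignment, re-estimate parameters, re-segment with Viterbi."""
    centroids = np.quantile(y, np.linspace(0.1, 0.9, M))            # step 1: arbitrary centroids
    states = np.abs(y[:, None] - centroids).argmin(axis=1)
    for _ in range(n_iter):
        counts = np.zeros((M, M))                                   # step 2: pi-hat and A-hat
        for t in range(len(y) - 1):
            counts[states[t], states[t + 1]] += 1
        A = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)
        pi = np.bincount(states[:1], minlength=M).astype(float)
        pi /= pi.sum()
        mu = np.array([y[states == i].mean() for i in range(M)])    # step 3: normal parameters
        sigma = np.array([y[states == i].std() + 1e-6 for i in range(M)])
        new_states = viterbi_gauss(y, A, pi, mu, sigma)             # step 4: new Viterbi path
        if np.array_equal(new_states, states):                      # step 5: stop if unchanged
            break
        states = new_states
    return states, A, pi, mu, sigma
```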
The Baum-Welch (E-M) Algorithm The E-M algorithm was designed originally to handle “Missing observations”. In this case the missing observations are the states {X 1, X 2,..., X T }. Assuming a model, the states are estimated by finding their expected values under this model. (The E part of the E-M algorithm).
With these values the model is estimated by Maximum Likelihood Estimation (The M part of the E-M algorithm). The process is repeated until the estimated model converges.
The E-M Algorithm
Let $f(\mathbf{y}, \mathbf{x} \mid \theta)$ denote the joint distribution of $\mathbf{Y}, \mathbf{X}$. Consider the function:
$Q(\theta \mid \theta^{(m)}) = E\big[\log f(\mathbf{Y}, \mathbf{X} \mid \theta) \,\big|\, \mathbf{Y} = \mathbf{y}, \theta^{(m)}\big]$
Starting with an initial estimate $\theta^{(0)}$, a sequence of estimates $\theta^{(m)}$ is formed by finding $\theta^{(m+1)}$ to maximize $Q(\theta \mid \theta^{(m)})$ with respect to $\theta$.
The sequence of estimates $\theta^{(m)}$ converges to a local maximum of the likelihood.
In the case of an HMM the complete-data log-likelihood is given by:
$\log f(\mathbf{y}, \mathbf{x} \mid \theta) = \log \pi_{x_1} + \sum_{t=2}^{T} \log a_{x_{t-1} x_t} + \sum_{t=1}^{T} \log f(y_t \mid \theta_{x_t})$
Recall $\gamma_t(i) = P[X_t = i \mid \mathbf{Y} = \mathbf{y}] = \dfrac{\alpha_t(i)\,\beta_t(i)}{\sum_{j=1}^{M} \alpha_t(j)\,\beta_t(j)}$, and $\sum_{t=1}^{T-1} \gamma_t(i)$ = expected no. of transitions from state $i$.
Let $\xi_t(i, j) = P[X_t = i, X_{t+1} = j \mid \mathbf{Y} = \mathbf{y}] = \dfrac{\alpha_t(i)\, a_{ij}\, f(y_{t+1} \mid \theta_j)\, \beta_{t+1}(j)}{\sum_{k=1}^{M}\sum_{l=1}^{M} \alpha_t(k)\, a_{kl}\, f(y_{t+1} \mid \theta_l)\, \beta_{t+1}(l)}$. Then $\sum_{t=1}^{T-1} \xi_t(i, j)$ = expected no. of transitions from state $i$ to state $j$.
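A sketch of this E step for the discrete-emission case, reusing the forward and backward functions sketched earlier:

```python
import numpy as np

def e_step(obs, A, B, pi):
    """Posterior state probabilities gamma[t, i] and joint transition probabilities xi[t, i, j]."""
    alpha, likelihood = forward(obs, A, B, pi)
    beta = backward(obs, A, B, pi)
    gamma = alpha * beta / likelihood
    T, M = alpha.shape
    xi = np.zeros((T - 1, M, M))
    for t in range(T - 1):
        xi[t] = alpha[t, :, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
        xi[t] /= xi[t].sum()
    return gamma, xi
```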
The E-M Re-estimation Formulae
Case 1: The observations $\{Y_1, Y_2, \dots, Y_T\}$ are discrete with $K$ possible values and $b_i(k) = P[Y_t = k \mid X_t = i]$. Then
$\hat{\pi}_i = \gamma_1(i)$, $\hat{a}_{ij} = \dfrac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$, and $\hat{b}_i(k) = \dfrac{\sum_{t:\, y_t = k} \gamma_t(i)}{\sum_{t=1}^{T} \gamma_t(i)}$.
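A sketch of one full Baum-Welch iteration for this discrete case, combining the E step above with these re-estimation formulae:

```python
import numpy as np

def baum_welch_step(obs, A, B, pi):
    """One E-M iteration: returns re-estimated (A, B, pi) for discrete emissions."""
    obs = np.asarray(obs)
    gamma, xi = e_step(obs, A, B, pi)
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros_like(B)
    for k in range(B.shape[1]):
        B_new[:, k] = gamma[obs == k].sum(axis=0)
    B_new /= gamma.sum(axis=0)[:, None]
    return A_new, B_new, pi_new
```

Iterating this step until the likelihood stops increasing gives the Baum-Welch estimates.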
Case 2: The observations $\{Y_1, Y_2, \dots, Y_T\}$ are continuous multivariate normal with mean vector $\boldsymbol{\mu}_i$ and covariance matrix $\Sigma_i$ when $X_t = i$, i.e.
$f(\mathbf{y} \mid \theta_i) = (2\pi)^{-p/2} |\Sigma_i|^{-1/2} \exp\!\big[-\tfrac{1}{2}(\mathbf{y} - \boldsymbol{\mu}_i)' \Sigma_i^{-1} (\mathbf{y} - \boldsymbol{\mu}_i)\big]$. Then
$\hat{\pi}_i = \gamma_1(i)$, $\hat{a}_{ij} = \dfrac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$, $\hat{\boldsymbol{\mu}}_i = \dfrac{\sum_{t=1}^{T} \gamma_t(i)\, \mathbf{y}_t}{\sum_{t=1}^{T} \gamma_t(i)}$, and $\hat{\Sigma}_i = \dfrac{\sum_{t=1}^{T} \gamma_t(i)\, (\mathbf{y}_t - \hat{\boldsymbol{\mu}}_i)(\mathbf{y}_t - \hat{\boldsymbol{\mu}}_i)'}{\sum_{t=1}^{T} \gamma_t(i)}$.
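A sketch of the corresponding M step for this multivariate normal case, given gamma and xi from an E step and the observations as a T-by-p array y:

```python
import numpy as np

def m_step_gaussian(y, gamma, xi):
    """Re-estimate pi, A, and the per-state mean vectors and covariance matrices."""
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    weights = gamma.sum(axis=0)                         # total posterior weight per state
    mu_new = (gamma.T @ y) / weights[:, None]
    covs = []
    for i in range(gamma.shape[1]):
        centered = y - mu_new[i]
        covs.append((gamma[:, i, None] * centered).T @ centered / weights[i])
    return pi_new, A_new, mu_new, np.array(covs)
```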
Measuring distance between two HMMs
Let $\lambda_1$ and $\lambda_2$ denote the parameters of two different HMM models. We now consider defining a distance between these two models.
The Kullback-Leibler distance
Consider two discrete distributions $p = (p_1, \dots, p_K)$ and $q = (q_1, \dots, q_K)$ (densities $f$ and $g$ in the continuous case); then define
$I(p, q) = \sum_{k=1}^{K} p_k \log \dfrac{p_k}{q_k}$
and in the continuous case:
$I(f, g) = \int f(y) \log \dfrac{f(y)}{g(y)}\, dy$
These measures of distance between the two distributions are not symmetric, but can be made symmetric by the following:
$J(p, q) = \tfrac{1}{2}\big[I(p, q) + I(q, p)\big]$
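A minimal sketch of these two quantities for discrete distributions (assuming strictly positive probabilities so the logarithms are defined):

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler distance I(p, q) between two discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

def symmetric_kl(p, q):
    """Symmetrized distance J(p, q) = (I(p, q) + I(q, p)) / 2."""
    return 0.5 * (kl(p, q) + kl(q, p))
```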
In the case of a Hidden Markov model,
$I(\lambda_1, \lambda_2) = \int f(\mathbf{y} \mid \lambda_1) \log \dfrac{f(\mathbf{y} \mid \lambda_1)}{f(\mathbf{y} \mid \lambda_2)}\, d\mathbf{y}$,
where $f(\mathbf{y} \mid \lambda)$ is the likelihood of the whole observation sequence under model $\lambda$. The computation of $I(\lambda_1, \lambda_2)$ in this case is formidable.
Juang and Rabiner distance
Let $\mathbf{y}^{(2)} = (y_1^{(2)}, \dots, y_T^{(2)})$ denote a sequence of observations generated from the HMM with parameters $\lambda_2$. Let $\hat{\mathbf{x}}^{(2)}$ denote the optimal (Viterbi) sequence of states for $\mathbf{y}^{(2)}$, assuming HMM model $\lambda_1$, which can be used to score the likelihood of $\mathbf{y}^{(2)}$ under $\lambda_1$.
Then define
$D(\lambda_1, \lambda_2) = \dfrac{1}{T}\big[\log P(\mathbf{y}^{(2)} \mid \lambda_2) - \log P(\mathbf{y}^{(2)} \mid \lambda_1)\big]$
and the symmetrized distance
$D_s(\lambda_1, \lambda_2) = \dfrac{1}{2}\big[D(\lambda_1, \lambda_2) + D(\lambda_2, \lambda_1)\big]$.
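A sketch of this distance for the discrete-emission case, reusing the forward sketch above for the log-likelihoods (a Viterbi-scored variant would swap in the viterbi sketch instead); the sequence length and the generating loop are illustrative:

```python
import numpy as np

def log_likelihood(obs, model):
    """log P[Y = y | model], with model = (A, B, pi), via the forward sketch above."""
    A, B, pi = model
    _, lik = forward(obs, A, B, pi)
    return np.log(lik)          # for long sequences use a scaled or log-space forward pass

def juang_rabiner_distance(model1, model2, T=100, seed=0):
    """D(lambda_1, lambda_2) = (1/T)[log P(y2 | lambda_2) - log P(y2 | lambda_1)],
    with y2 generated from model2, symmetrized by averaging the two directions."""
    rng = np.random.default_rng(seed)
    def sample(model):
        A, B, pi = model
        x, obs = rng.choice(len(pi), p=pi), []
        for _ in range(T):
            obs.append(rng.choice(B.shape[1], p=B[x]))
            x = rng.choice(len(pi), p=A[x])
        return np.array(obs)
    y2, y1 = sample(model2), sample(model1)
    d12 = (log_likelihood(y2, model2) - log_likelihood(y2, model1)) / T
    d21 = (log_likelihood(y1, model1) - log_likelihood(y1, model2)) / T
    return 0.5 * (d12 + d21)
```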