1
Hidden Markov Model. Lecturer: 虞台文, Intelligent Multimedia Lab, Graduate Institute of Computer Science and Engineering, Tatung University
2
Contents
- Introduction: Markov Chain; Hidden Markov Model (HMM)
- Formal Definition of HMM & Problems
- Estimating HMMs by the EM Algorithm
- HMM with GMM
3
Hidden Markov Model: Introduction
4
Introduction. A signal generation source emits an observation sequence O = O_1 O_2 O_3 … O_T; the modeling task is to approximate the source with a model.
5
Example (Traffic Light). Signal generation source; observation sequence: red, red/amber, green, amber, red. This is a deterministic pattern: each state depends solely on the previous state. Deterministic systems are relatively easy to understand and analyze.
6
Example (Weather). Signal generation source; the observation sequence is a sequence of weather states. This is a nondeterministic pattern, the so-called Markov model: the observation is the state sequence itself, and the state transition probability depends solely on the previous state. Which probabilities have to be estimated?
7
Markov Chain. A set of states S = {S_1, S_2, …, S_N}; a transition probability matrix A; an initial state probability vector π. The observation is the state transition sequence itself. A Markov model is specified by (A, π).
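Written out in the usual notation (a sketch, with q_t denoting the state at time t):

```latex
S = \{S_1, S_2, \dots, S_N\}, \qquad
A = [a_{ij}], \quad a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \qquad
\pi = (\pi_1, \dots, \pi_N), \quad \pi_i = P(q_1 = S_i),
```

with the stochastic constraints \(\sum_{j=1}^{N} a_{ij} = 1\) for each i and \(\sum_{i=1}^{N} \pi_i = 1\).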
8
Example. [State-transition diagram over three states, S_1: Sunny, S_2: Rainy, S_3: Cloudy, with edge probabilities 0.8, 0.1, 0.4, 0.3, 0.2, 0.3, 0.6, 0.2.]
9
Properties of Markov Model. Define the state probability vector p_t, with p_t(i) = P(q_t = S_i). Given p_1, then p_{t+1} = A^T p_t, or equivalently p_t = (A^T)^(t-1) p_1.
10
Properties of Markov Model. Steady state: a distribution q with A^T q = q, i.e., q is an eigenvector of the matrix A^T with eigenvalue 1.
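A minimal numerical sketch of the steady-state computation; the 3×3 matrix here is hypothetical (the exact values from the weather example are not all recoverable), and numpy is assumed:

```python
import numpy as np

# Hypothetical 3-state transition matrix (rows sum to 1); not the exact
# values from the weather example slide.
A = np.array([[0.8, 0.1, 0.1],
              [0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2]])

# The steady state q satisfies q = A^T q, i.e., q is an eigenvector of
# A^T with eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(A.T)
q = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
q /= q.sum()   # normalize into a probability vector
print(q)       # long-run state distribution
```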
11
Properties of Markov Model. Consider an observation sequence in which state S_i persists for a run of D_i consecutive steps (self-transition probability a_ii, out-transitions a_ij). How can a_ii be estimated?
12
Example (Coin Tossing). Signal generation source; observation sequence: HHTTTHTTH…H. What will the model be? There may be several different coins, and the coins may be biased.
13
One-Coin Model. Observation sequence: HHTTTHTTH…H. [Two-state diagram with transition probabilities P(H) and 1 − P(H).] The observation sequence is the same as the state sequence. How do we estimate the parameter of the model?
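Because the states are fully observed here, the single parameter has a closed-form maximum-likelihood estimate, obtained by counting heads among the T tosses (a one-line derivation, added for completeness):

```latex
\hat{P}(H) = \frac{\#\{\, t : O_t = H \,\}}{T}.
```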
14
Two-Coin Model. Observation sequence: HHTTTHTTH…H. [Two-state diagram with self-transitions a_11, a_22 and cross transitions 1 − a_11, 1 − a_22.] State 1: P(H) = P_1, P(T) = 1 − P_1. State 2: P(H) = P_2, P(T) = 1 − P_2. State sequence: 1 2 2 1 1 1 2 2 1 … 2.
15
Two-Coin Model (continued). The observation sequence HHTTTHTTH…H is observable; the state sequence 1 2 2 1 1 1 2 2 1 … 2 is unobservable.
16
Two-Coin Model (continued). Given the observation sequence, how do we estimate the model? Because the state sequence is hidden, this is called a Hidden Markov Model (HMM).
17
Example: The Urn and Ball. [Diagram: three urns (urn 1, urn 2, urn 3) with transition probabilities a_11, a_22, a_33, a_12, a_23, a_21, a_32, a_31, a_13.] The observation sequence is the sequence of colors of the balls drawn.
18
Example: The Urn and Ball (continued). Given the observation sequence, how do we estimate the model? It is a Hidden Markov Model (HMM).
19
What is an HMM? An HMM is a Markov chain whose states generate observations. You see only the observations, and the goal is to infer the hidden state sequence. HMMs are very useful for time-series modeling, since the discrete state space can be used to approximate many nonlinear, non-Gaussian systems.
20
HMM Applications
- Pattern recognition: speech recognition, face recognition, gesture recognition, handwritten character recognition
- Molecular biology, biochemistry, and genetics
- Sequence data analysis & alignment
21
Hidden Markov Model: Formal Definition of HMM & Problems
22
Elements of an HMM
- A set of N states
- A set of M observation symbols
- The state transition probability distribution
- The observation symbol probability distribution for each state
- The initial state distribution
23
The Definition. We represent an HMM by a tuple λ = (A, B, π), where A is the state transition probability distribution, B is the observation symbol probability distribution, and π is the initial state distribution.
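Spelled out, with observation alphabet {v_1, …, v_M} (the symbol names follow Rabiner's convention):

```latex
\lambda = (A, B, \pi), \qquad
a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \qquad
b_j(k) = P(O_t = v_k \mid q_t = S_j), \qquad
\pi_i = P(q_1 = S_i).
```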
24
The Three Basic Problems for HMMs. Given an observation sequence O = O_1 O_2 … O_T and a model λ = (A, B, π):
- Problem I: how to compute P(O | λ), the probability of the observation sequence, efficiently?
- Problem II: how to find the state sequence Q = q_1 q_2 … q_T that best explains the observations?
- Problem III: how to adjust the model parameters λ = (A, B, π) to maximize P(O | λ)?
25
Problem I. A straightforward solution: let Q = q_1 q_2 … q_T range over every state sequence that could generate the observation sequence O = O_1 O_2 … O_T, and sum the joint probabilities. The time complexity is prohibitive.
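In symbols, the brute-force evaluation is:

```latex
P(O \mid \lambda)
  = \sum_{\text{all } Q} P(O \mid Q, \lambda)\, P(Q \mid \lambda)
  = \sum_{q_1, \dots, q_T} \pi_{q_1} b_{q_1}(O_1)\, a_{q_1 q_2} b_{q_2}(O_2) \cdots a_{q_{T-1} q_T} b_{q_T}(O_T).
```

There are N^T paths, each costing about 2T multiplications, hence on the order of 2T·N^T operations.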
26
Problem I. Solution by a forward induction procedure: define the forward variable α_t(i) = P(O_1 O_2 … O_t, q_t = S_i | λ).
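The induction, written out (a standard reconstruction of the forward recursion):

```latex
\alpha_1(i) = \pi_i\, b_i(O_1), \qquad
\alpha_{t+1}(j) = \Big[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \Big] b_j(O_{t+1}), \qquad
P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i).
```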
27
Problem I. Solution by a forward induction procedure. [Trellis diagram: states 1 … N on the vertical axis, times 1 … T on the horizontal axis; each α_{t+1}(j) is computed from all N values α_t(i).] Time complexity: O(N²T).
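A runnable sketch of the forward procedure in Python, assuming a discrete observation alphabet and hypothetical numpy arrays A (N×N), B (N×M), and pi (length N):

```python
import numpy as np

def forward(obs, A, B, pi):
    """Forward procedure: returns P(O | lambda) and the alpha trellis.

    obs: observation symbol indices, length T
    A:   (N, N) transition matrix, A[i, j] = a_ij
    B:   (N, M) emission matrix,   B[j, k] = b_j(k)
    pi:  (N,)  initial state distribution
    """
    T, N = len(obs), A.shape[0]
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                   # initialization
    for t in range(T - 1):                         # induction: O(N^2 T)
        alpha[t + 1] = (alpha[t] @ A) * B[:, obs[t + 1]]
    return alpha[-1].sum(), alpha                  # termination
```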
28
Problem I. Solution by a backward induction procedure: define the backward variable β_t(i) = P(O_{t+1} O_{t+2} … O_T | q_t = S_i, λ). Time complexity: O(N²T).
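The backward recursion, written out:

```latex
\beta_T(i) = 1, \qquad
\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j),
\qquad t = T-1, T-2, \dots, 1.
```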
29
Problem I. A side product of the forward-backward procedure: define γ_t(i), the probability of being in state S_i at time t given the observations. Fact: the most likely state at time t is the one maximizing γ_t(i).
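In symbols, γ_t(i) combines the two variables:

```latex
\gamma_t(i) = P(q_t = S_i \mid O, \lambda)
            = \frac{\alpha_t(i)\, \beta_t(i)}{P(O \mid \lambda)}
            = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\, \beta_t(j)},
\qquad
q_t^* = \arg\max_{1 \le i \le N} \gamma_t(i).
```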
30
Problem II: how to find the state sequence Q = q_1 q_2 … q_T that best explains the observations? Unlike Problem I, the solution depends on the optimality criterion. For example: 1. maximize the expected number of correct states, i.e., choose q_t* = argmax_i γ_t(i) at each t; 2. the Viterbi algorithm, which finds the single best state path.
31
Viterbi Algorithm. Define δ_t(i), the highest probability of a single path that ends in state S_i at time t and accounts for the first t observations. To retrieve the path backward, we also define ψ_t(j), the best predecessor state of S_j at time t.
32
Viterbi Algorithm: initialization, recursion, termination, and backtracking.
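A runnable sketch of the four steps, reusing the hypothetical A, B, pi arrays from the forward sketch:

```python
import numpy as np

def viterbi(obs, A, B, pi):
    """Viterbi algorithm: most likely state path for a discrete HMM."""
    T, N = len(obs), A.shape[0]
    delta = np.zeros((T, N))           # delta[t, i]: best path prob ending in i
    psi = np.zeros((T, N), dtype=int)  # psi[t, j]: best predecessor of j
    delta[0] = pi * B[:, obs[0]]                  # initialization
    for t in range(1, T):                         # recursion
        trans = delta[t - 1][:, None] * A         # trans[i, j] = delta[t-1, i] * a_ij
        psi[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) * B[:, obs[t]]
    path = np.zeros(T, dtype=int)                 # termination
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):                # backtracking
        path[t] = psi[t + 1, path[t + 1]]
    return path, delta[-1].max()
```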
33
Problem III: how to adjust the model parameters λ = (A, B, π) to maximize P(O | λ)? This is the most difficult of the three problems. Approaches: 1. the Baum-Welch method; 2. the EM algorithm. These two methods are in fact the same, only formulated differently.
34
Baum-Welch Method. Which parameters must be estimated, and how?
35
Baum-Welch Method. Review: the forward and backward variables α_t(i) and β_t(i) defined for Problem I.
36
Baum-Welch Method. Define ξ_t(i, j), the probability of being in state S_i at time t and in state S_j at time t + 1, given the observations and the model. [Diagram: transition from S_i at time t to S_j at time t + 1.]
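In terms of the forward and backward variables:

```latex
\xi_t(i, j) = P(q_t = S_i,\, q_{t+1} = S_j \mid O, \lambda)
            = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}.
```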
37
Baum-Welch Method. Define γ_t(i), the probability that q_t = S_i, as stated before. Facts: γ_t(i) = Σ_j ξ_t(i, j); summed over t, γ_t(i) gives the expected number of transitions out of S_i, and ξ_t(i, j) gives the expected number of transitions from S_i to S_j.
38
Baum-Welch Method
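The re-estimation formulas themselves, in terms of γ_t(i) and ξ_t(i, j), as given in Rabiner's tutorial:

```latex
\bar{\pi}_i = \gamma_1(i), \qquad
\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad
\bar{b}_j(k) = \frac{\sum_{t:\, O_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}.
```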
39
Baum-Welch Method: 1. Provide an initial model λ. 2. Re-estimate the model as λ' using the formulas above. 3. Let λ = λ'. 4. If not converged, go to step 2.
40
Hidden Markov Model: Estimating HMMs by the EM Algorithm
41
Complete-Data Likelihood for HMM. Observation sequence: o = o_1 o_2 o_3 … o_T. Hidden state sequence: q = q_1 q_2 q_3 … q_T. The complete data is the pair (o, q).
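By the Markov and emission assumptions, the complete-data likelihood factorizes as:

```latex
P(\mathbf{o}, \mathbf{q} \mid \lambda)
  = \pi_{q_1}\, b_{q_1}(o_1) \prod_{t=2}^{T} a_{q_{t-1} q_t}\, b_{q_t}(o_t).
```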
42
Q-function (E-Step). Take the expectation of the complete-data log-likelihood over the hidden state sequences; any factor independent of q can be dropped from the maximization.
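One common form (the weights P(q | o, λ) and P(o, q | λ) differ only by the factor 1/P(o | λ), which is independent of q and so irrelevant to the maximization):

```latex
Q(\lambda, \bar{\lambda})
  = \sum_{\mathbf{q}} P(\mathbf{q} \mid \mathbf{o}, \lambda)\, \log P(\mathbf{o}, \mathbf{q} \mid \bar{\lambda})
  = \sum_{\mathbf{q}} P(\mathbf{q} \mid \mathbf{o}, \lambda)
    \Big[ \log \bar{\pi}_{q_1}
        + \sum_{t=2}^{T} \log \bar{a}_{q_{t-1} q_t}
        + \sum_{t=1}^{T} \log \bar{b}_{q_t}(o_t) \Big].
```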
43
Maximization Step. Maximize the Q-function subject to the stochastic constraints: Σ_i π_i = 1, Σ_j a_ij = 1 for each i, and Σ_k b_j(k) = 1 for each j.
44
Maximization Step. The Q-function decomposes into three independent terms, so we solve three constrained subproblems separately: for π_i, for a_ij, and for b_j(k).
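Each subproblem has the same generic form; a Lagrange multiplier gives the normalized-weight solution:

```latex
\max_{x_1, \dots, x_n} \sum_{j=1}^{n} w_j \log x_j
\quad \text{s.t.} \quad \sum_{j=1}^{n} x_j = 1
\;\;\Longrightarrow\;\;
x_j^* = \frac{w_j}{\sum_{l=1}^{n} w_l}.
```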
45
Maximization Step. Solve for π_i.
46
Maximization Step. The result, π_i = γ_1(i), is the same as in the Baum-Welch method.
47
Maximization Step. Solve for a_ij.
48
Maximization Step. Solve for a_ij (continued).
49
Maximization Step. Again, the result is the same as in the Baum-Welch method.
50
Maximization Step. Solve for b_j(k).
51
Maximization Step. Solve for b_j(k) (continued).
52
Maximization Step. This concludes that the EM algorithm for learning an HMM is equivalent to the Baum-Welch method.
53
Summary
54
Hidden Markov Model: HMM with GMM
55
Continuous Observation Densities in HMMs. In many cases the observation densities are continuous, e.g., in speech signal processing. Approaches: quantize the signal into discrete symbols, e.g., by vector quantization; or use a tractable continuous statistical model, e.g., a GMM.
56
GMM (Gaussian Mixture Model). Each state's observation density is a mixture of Gaussian densities, with a mixture weight, a mean vector, and a covariance matrix for the k-th mixture in the j-th state.
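In symbols, writing K for the number of mixture components per state, c_jk for the mixture weights, and μ_jk, Σ_jk for the mean and covariance of the k-th mixture in the j-th state:

```latex
b_j(\mathbf{o}) = \sum_{k=1}^{K} c_{jk}\,
  \mathcal{N}(\mathbf{o};\, \boldsymbol{\mu}_{jk}, \boldsymbol{\Sigma}_{jk}),
\qquad \sum_{k=1}^{K} c_{jk} = 1, \quad c_{jk} \ge 0.
```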
57
GMM (Gaussian Mixture Model) How to estimate the parameters?
58
Baum-Welch Method: start from an initial model and iterate to obtain the re-estimated model.
59
Definitions. γ_t(j): the probability of being in state S_j at time t. γ_t(j, k): the probability of being in state S_j at time t with the k-th mixture component selected.
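In terms of the forward-backward variables and the mixture densities:

```latex
\gamma_t(j) = \frac{\alpha_t(j)\, \beta_t(j)}{\sum_{i=1}^{N} \alpha_t(i)\, \beta_t(i)},
\qquad
\gamma_t(j, k) = \gamma_t(j)\,
  \frac{c_{jk}\, \mathcal{N}(\mathbf{o}_t;\, \boldsymbol{\mu}_{jk}, \boldsymbol{\Sigma}_{jk})}
       {\sum_{m=1}^{K} c_{jm}\, \mathcal{N}(\mathbf{o}_t;\, \boldsymbol{\mu}_{jm}, \boldsymbol{\Sigma}_{jm})}.
```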
60
Definitions (continued).
61
Method: re-estimation formulas for the GMM parameters.
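The standard updates (each is a γ_t(j, k)-weighted average, matching the Baum-Welch pattern above):

```latex
\bar{c}_{jk} = \frac{\sum_{t=1}^{T} \gamma_t(j, k)}{\sum_{t=1}^{T} \gamma_t(j)},
\qquad
\bar{\boldsymbol{\mu}}_{jk} = \frac{\sum_{t=1}^{T} \gamma_t(j, k)\, \mathbf{o}_t}{\sum_{t=1}^{T} \gamma_t(j, k)},
\qquad
\bar{\boldsymbol{\Sigma}}_{jk} = \frac{\sum_{t=1}^{T} \gamma_t(j, k)\,
  (\mathbf{o}_t - \bar{\boldsymbol{\mu}}_{jk})(\mathbf{o}_t - \bar{\boldsymbol{\mu}}_{jk})^{\top}}
  {\sum_{t=1}^{T} \gamma_t(j, k)}.
```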