1
Definition of the Hidden Markov Model
A Seminar Speech Recognition presentation, October 24th, 2002
Pieter Bas Donkersteeg, 98.32.769, pieterbas@zonnet.nl
Source: Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
2
Presentation Content
- The model explained
- Problems to address
- Dynamic Programming and DTW
- How to evaluate an HMM – the Forward Algorithm
- How to decode an HMM – the Viterbi Algorithm
- How to estimate HMM parameters – the Baum-Welch Algorithm
- Questions
3
The model explained: The Markov Chain (1/3)
- Based on the Markov chain of section 8.1: P(X_i | X_{i-1})
- Minimum memory usage: "the probability of the random variable at a given time depends only on the value at the preceding time" (the Markov assumption)
- The chain is characterized by a state transition probability matrix (illustrated in the sketch below)
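To make the Markov assumption concrete, here is a minimal Python sketch: a toy 3-state chain whose transition matrix A is invented for illustration, sampled one step at a time using only the preceding state.

```python
import numpy as np

# Illustrative 3-state chain; the transition matrix A is invented for this
# sketch. A[i, j] = P(X_t = j | X_{t-1} = i), so each row sums to 1.
A = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.2, 0.7]])

rng = np.random.default_rng(0)

def sample_chain(A, start_state, length):
    """Draw a state sequence; each step depends only on the preceding state."""
    states = [start_state]
    for _ in range(length - 1):
        states.append(int(rng.choice(len(A), p=A[states[-1]])))
    return states

print(sample_chain(A, start_state=0, length=10))
```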
4
The model explained: Extension to the Hidden Markov Model (2/3)
- Moves from deterministically to non-deterministically observable events
- The observation is a probabilistic function of the state
- The relation between the observation sequence and the state sequence is hidden
- Used to characterize the spectral properties of pattern frames (phonemes)
5
The model explained: The Hidden Markov Model (3/3)
Formal definition of an HMM:
- An output observation alphabet O
- A set of states Ω
- A transition probability matrix A
- An output probability matrix B
- An initial state distribution π
Assumptions:
- The Markov assumption
- The output independence assumption
- Both ease the use of the model with no significant effect on performance
Formal notation for the whole parameter set: Φ = (A, B, π) (a minimal data-structure sketch follows)
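As a concrete reference point, here is a minimal Python sketch of the parameter set Φ = (A, B, π) as plain numpy arrays. The class name HMM and the toy 2-state, 3-symbol values are illustrative, not from the book; the sketches on the following slides reuse this `model`.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class HMM:
    """Whole parameter set Phi = (A, B, pi)."""
    A: np.ndarray   # N x N transition matrix, A[i, j] = P(s_t = j | s_{t-1} = i)
    B: np.ndarray   # N x M output matrix, B[j, k] = P(o_t = k | s_t = j)
    pi: np.ndarray  # length-N initial state distribution

# Toy 2-state, 3-symbol model reused by the later sketches.
model = HMM(
    A=np.array([[0.7, 0.3],
                [0.4, 0.6]]),
    B=np.array([[0.5, 0.4, 0.1],
                [0.1, 0.3, 0.6]]),
    pi=np.array([0.6, 0.4]),
)
```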
6
Problems to address
Given a model Φ and a sequence of observations O:
- The Evaluation Problem: given the observation sequence O and the model Φ, how do we efficiently compute P(O | Φ), the probability of the observation sequence given the model?
- The Decoding Problem: find the optimal state sequence associated with a given observation sequence. The Viterbi algorithm finds the single best sequence q for the given observation sequence O.
- The Learning Problem: how can we adjust the model parameters to maximize the joint probability (likelihood)? This is the toughest of the three problems.
(The three problems are restated formally below.)
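Restated in the notation above (O the observation sequence, S a state sequence, Φ the whole parameter set), the three problems in their standard form are:

```latex
\begin{align*}
\text{Evaluation:} \quad & P(O \mid \Phi) = \sum_{\text{all } S} P(S \mid \Phi)\, P(O \mid S, \Phi)\\
\text{Decoding:}   \quad & S^{*} = \arg\max_{S} P(S, O \mid \Phi)\\
\text{Learning:}   \quad & \Phi^{*} = \arg\max_{\Phi} P(O \mid \Phi)
\end{align*}
```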
7
Dynamic Programming and DTW
- Solves the time alignment problem
- Derives the distortion between two speech templates
- Calculates the minimal distance between two sequences of speech vectors
- The algorithm is effective when the differences between the speech vectors are small
- An HMM is better at finding an 'averaged' template for each pattern (a DTW sketch follows)
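A minimal DTW sketch, assuming a per-frame Euclidean distance as the local distortion measure; the function name dtw_distance is mine, not from the book.

```python
import numpy as np

def dtw_distance(x, y):
    """Minimal DTW: x and y are (frames x features) arrays; returns the
    minimal accumulated distortion over all monotonic alignments."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(x[i - 1] - y[j - 1])  # local distortion
            # Extend the cheapest of the three predecessor alignments
            # (match, insertion, deletion): the DP optimal-substructure step.
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]

# e.g. dtw_distance(np.random.rand(40, 13), np.random.rand(55, 13))
```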
8
How to evaluate an HMM: The Forward Algorithm
- Calculates the probability of the observed sequence
- Naive approach: enumerate all possible state sequences S of length T and sum their probabilities
- The probability of a path S is its state sequence probability times its joint output probability, calculated for every path
- The forward algorithm computes the same quantity efficiently, with complexity O(N²T)
- It makes full use of partially computed probabilities (see the sketch below)
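A sketch of the forward recursion on the toy `model` defined earlier, assuming discrete observations given as symbol indices; a production implementation would scale or work in log space to avoid underflow.

```python
import numpy as np

def forward(model, obs):
    """alpha[t, j] = P(o_1 .. o_t, s_t = j | Phi). Each step reuses the
    previous time slice, giving O(N^2 T) instead of enumerating N^T paths."""
    obs = np.asarray(obs)
    T, N = len(obs), len(model.pi)
    alpha = np.zeros((T, N))
    alpha[0] = model.pi * model.B[:, obs[0]]           # initialization
    for t in range(1, T):                              # induction
        alpha[t] = (alpha[t - 1] @ model.A) * model.B[:, obs[t]]
    return alpha[-1].sum()                             # termination: P(O | Phi)

print(forward(model, [0, 2, 1]))   # toy model from the earlier sketch
```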
9
How to decode an HMM: The Viterbi Algorithm
- The forward algorithm does not find the best state sequence (the 'best path')
- The Viterbi algorithm is based on the optimal-path formulation of dynamic programming and finds the best state sequence
- It can also be used to evaluate an HMM
- Finding the best path is the cornerstone of continuous speech recognition
- It looks for the state sequence S that maximizes P(S, X | Φ)
- Instead of summing over paths, it dynamically searches for the single best state sequence
- Steps: initialization, induction (max, argmax), termination, backtracking
- Complexity: O(N²T) (a sketch follows)
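The same recursion with the sum replaced by a max, plus backtracking, gives the Viterbi sketch below (again on the toy discrete model, without log-space arithmetic).

```python
import numpy as np

def viterbi(model, obs):
    """Forward recursion with the sum replaced by a max; backtracking the
    argmaxes recovers the single best state sequence. Also O(N^2 T)."""
    obs = np.asarray(obs)
    T, N = len(obs), len(model.pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = model.pi * model.B[:, obs[0]]           # initialization
    for t in range(1, T):                              # induction
        scores = delta[t - 1][:, None] * model.A       # scores[i, j]
        psi[t] = scores.argmax(axis=0)                 # best predecessor of j
        delta[t] = scores.max(axis=0) * model.B[:, obs[t]]
    best = [int(delta[-1].argmax())]                   # termination
    for t in range(T - 1, 0, -1):                      # backtracking
        best.append(int(psi[t][best[-1]]))
    return best[::-1], float(delta[-1].max())

print(viterbi(model, [0, 2, 1]))   # toy model from the earlier sketch
```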
10
How to estimate HMM parameters: The Baum-Welch Algorithm
- Estimates the model parameters Φ = (A, B, π)
- No analytical method is known to maximize the joint probability in closed form
- The iterative Baum-Welch algorithm, also known as the forward-backward algorithm, is a form of unsupervised learning
- A forward probability term and a backward probability term refine the HMM parameter set Φ by maximizing the likelihood P(X | Φ) at each iteration
- It is an instance of the Expectation-Maximization (EM) algorithm for incomplete data
- Steps: initialization, E-step, M-step, iteration (a single-iteration sketch follows)
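A compact sketch of one Baum-Welch iteration on the toy discrete model from earlier, assuming a single observation sequence. The helper names backward and baum_welch_step are mine, and a production version would use scaling or log-space arithmetic to avoid underflow.

```python
import numpy as np

def backward(model, obs):
    """beta[t, i] = P(o_{t+1} .. o_T | s_t = i, Phi)."""
    T, N = len(obs), len(model.pi)
    beta = np.ones((T, N))
    for t in range(T - 2, -1, -1):
        beta[t] = model.A @ (model.B[:, obs[t + 1]] * beta[t + 1])
    return beta

def baum_welch_step(model, obs):
    """One EM iteration: the E-step turns forward/backward terms into state
    and transition posteriors; the M-step re-estimates pi, A and B."""
    obs = np.asarray(obs)
    T, N = len(obs), len(model.pi)
    alpha = np.zeros((T, N))
    alpha[0] = model.pi * model.B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ model.A) * model.B[:, obs[t]]
    beta = backward(model, obs)
    likelihood = alpha[-1].sum()             # P(O | Phi) before the update

    gamma = alpha * beta / likelihood        # gamma[t, i] = P(s_t = i | O, Phi)
    # xi[t, i, j] = P(s_t = i, s_{t+1} = j | O, Phi)
    xi = (alpha[:-1, :, None] * model.A[None, :, :] *
          (model.B[:, obs[1:]].T * beta[1:])[:, None, :]) / likelihood

    model.pi = gamma[0]
    model.A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    for k in range(model.B.shape[1]):
        model.B[:, k] = gamma[obs == k].sum(axis=0) / gamma.sum(axis=0)
    return likelihood
```

Iterating baum_welch_step until the returned likelihood stops improving is the usual stopping criterion for this kind of EM loop.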
11
Questions
Any questions?