MACHINE LEARNING 16. HMM
Introduction
(Lecture Notes for E. Alpaydın, 2004, Introduction to Machine Learning, © The MIT Press, V1.1)
Modeling dependencies in the input; samples are no longer iid.
Sequences:
Temporal: in speech, phonemes in a word (dictionary) and words in a sentence (syntax, semantics of the language); in handwriting, pen movements.
Spatial: in a DNA sequence, base pairs.
Discrete Markov Process
N states: S_1, S_2, ..., S_N; the state at "time" t is q_t = S_i.
First-order Markov property: P(q_{t+1} = S_j | q_t = S_i, q_{t-1} = S_k, ...) = P(q_{t+1} = S_j | q_t = S_i)
Transition probabilities: a_ij ≡ P(q_{t+1} = S_j | q_t = S_i), with a_ij ≥ 0 and Σ_{j=1}^N a_ij = 1
Initial probabilities: π_i ≡ P(q_1 = S_i), with Σ_{i=1}^N π_i = 1
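A minimal sketch of sampling a state sequence from such a first-order Markov chain; the values of π and A below are assumed examples, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

pi = np.array([0.5, 0.3, 0.2])            # initial state probabilities, sums to 1
A = np.array([[0.7, 0.2, 0.1],            # A[i, j] = P(q_{t+1} = S_j | q_t = S_i), rows sum to 1
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])

def sample_chain(pi, A, T):
    """Draw a length-T state sequence q_1, ..., q_T (states indexed from 0)."""
    q = [rng.choice(len(pi), p=pi)]                  # draw q_1 from pi
    for _ in range(T - 1):
        q.append(rng.choice(A.shape[1], p=A[q[-1]]))  # draw q_{t+1} from row q_t of A
    return np.array(q)

print(sample_chain(pi, A, T=10))
```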
Stochastic Automaton
The probability of an observed state sequence Q = {q_1, q_2, ..., q_T} is
P(Q | A, Π) = P(q_1) Π_{t=2}^T P(q_t | q_{t-1}) = π_{q_1} a_{q_1 q_2} ··· a_{q_{T-1} q_T}
Example: Balls and Urns
Three urns, each full of balls of a single color: S_1: red, S_2: blue, S_3: green.
Balls and Urns: Learning
Given K observed example sequences of length T, the maximum likelihood estimates are relative frequencies of counts:
π̂_i = #{sequences starting in S_i} / K
â_ij = #{transitions from S_i to S_j} / #{transitions out of S_i}
A sketch of these estimates in code follows below.
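A minimal sketch of the count-based estimates above; the observed state sequences are made-up examples:

```python
import numpy as np

N = 3                                            # number of states
sequences = [np.array([0, 0, 1, 2, 2, 1]),       # hypothetical fully observed state paths
             np.array([1, 1, 1, 0, 2, 2])]

pi_hat = np.zeros(N)
counts = np.zeros((N, N))
for q in sequences:
    pi_hat[q[0]] += 1                            # count the starting state
    for t in range(len(q) - 1):
        counts[q[t], q[t + 1]] += 1              # count transition S_i -> S_j

pi_hat /= pi_hat.sum()                                   # pi_i = #(q_1 = S_i) / K
A_hat = counts / counts.sum(axis=1, keepdims=True)       # a_ij = #(S_i -> S_j) / #(S_i -> any)
print(pi_hat)
print(A_hat)
```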
Hidden Markov Models
States are not observable; discrete observations {v_1, v_2, ..., v_M} are recorded, a probabilistic function of the state.
Emission probabilities: b_j(m) ≡ P(O_t = v_m | q_t = S_j)
Example: each urn contains balls of different colors, with different probabilities per urn.
For a given observation sequence, there are multiple possible state sequences.
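A minimal sketch of generating observations from an HMM (all parameter values below are assumed examples); the state path is sampled internally but stays hidden, only the emitted symbols are returned:

```python
import numpy as np

rng = np.random.default_rng(1)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1],   # row j: emission distribution of state S_j over {v_1, v_2, v_3}
              [0.1, 0.2, 0.7]])

def sample_hmm(pi, A, B, T):
    q = rng.choice(len(pi), p=pi)                    # hidden initial state
    obs = []
    for _ in range(T):
        obs.append(rng.choice(B.shape[1], p=B[q]))   # emit a symbol from the current state
        q = rng.choice(A.shape[1], p=A[q])           # move to the next (hidden) state
    return np.array(obs)

print(sample_hmm(pi, A, B, T=8))
```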
HMM Unfolded in Time (figure: hidden states q_1, ..., q_T with their observations O_1, ..., O_T)
Elements of an HMM
N: number of states
M: number of observation symbols
A = [a_ij]: N×N state transition probability matrix
B = [b_j(m)]: N×M observation (emission) probability matrix
Π = [π_i]: N×1 initial state probability vector
λ = (A, B, Π): the parameter set of the HMM
Three Basic Problems of HMMs
1. Evaluation: given λ and O, calculate P(O | λ).
2. State sequence: given λ and O, find Q* such that P(Q* | O, λ) = max_Q P(Q | O, λ).
3. Learning: given X = {O^k}_k, find λ* such that P(X | λ*) = max_λ P(X | λ).
(Rabiner, 1989)
Evaluation
Forward variable: α_t(i) ≡ P(O_1 ··· O_t, q_t = S_i | λ)
Initialization: α_1(i) = π_i b_i(O_1)
Recursion: α_{t+1}(j) = [ Σ_{i=1}^N α_t(i) a_ij ] b_j(O_{t+1})
Then P(O | λ) = Σ_{i=1}^N α_T(i)
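A minimal sketch of the forward pass; the parameters and the observation sequence O (a list of symbol indices) are assumed example values:

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
O = [0, 2, 1, 0]                                   # observed symbol indices

def forward(pi, A, B, O):
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                     # initialization: alpha_1(i) = pi_i b_i(O_1)
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]] # recursion: sum over predecessors, then emit
    return alpha

alpha = forward(pi, A, B, O)
print("P(O | lambda) =", alpha[-1].sum())          # evaluation: sum over final states
```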
Backward variable: β_t(i) ≡ P(O_{t+1} ··· O_T | q_t = S_i, λ)
Initialization: β_T(i) = 1
Recursion: β_t(i) = Σ_{j=1}^N a_ij b_j(O_{t+1}) β_{t+1}(j)
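A minimal sketch of the backward pass with the same assumed example parameters; it recovers the same P(O | λ) as the forward pass:

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
O = [0, 2, 1, 0]

def backward(A, B, O):
    T, N = len(O), A.shape[0]
    beta = np.ones((T, N))                              # initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])    # recursion over successor states
    return beta

beta = backward(A, B, O)
print("P(O | lambda) =", (pi * B[:, O[0]] * beta[0]).sum())  # same value as the forward pass
```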
Finding the State Sequence
Define γ_t(i) ≡ P(q_t = S_i | O, λ) = α_t(i) β_t(i) / Σ_j α_t(j) β_t(j).
Choose the state that has the highest probability at each time step: q_t* = argmax_i γ_t(i).
No! The states chosen independently in this way may not form a feasible sequence, e.g., when some a_ij = 0.
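A minimal sketch computing γ_t(i) from the forward and backward variables and taking the per-step argmax (same assumed example parameters as above):

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
O = [0, 2, 1, 0]
T, N = len(O), len(pi)

alpha = np.zeros((T, N)); beta = np.ones((T, N))
alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]          # forward pass
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])        # backward pass

gamma = alpha * beta / alpha[-1].sum()                  # gamma_t(i) = P(q_t = S_i | O, lambda)
q_star = gamma.argmax(axis=1)                           # most likely state at each step, chosen independently
print(gamma)
print(q_star)   # note: this need not be a feasible path if some a_ij = 0
```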
Viterbi's Algorithm
δ_t(i) ≡ max_{q_1 q_2 ··· q_{t-1}} P(q_1 q_2 ··· q_{t-1}, q_t = S_i, O_1 ··· O_t | λ)
Initialization: δ_1(i) = π_i b_i(O_1), ψ_1(i) = 0
Recursion: δ_t(j) = [max_i δ_{t-1}(i) a_ij] b_j(O_t), ψ_t(j) = argmax_i δ_{t-1}(i) a_ij
Termination: p* = max_i δ_T(i), q_T* = argmax_i δ_T(i)
Path backtracking: q_t* = ψ_{t+1}(q_{t+1}*), t = T-1, T-2, ..., 1
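A minimal sketch of Viterbi's algorithm following the recursion above, again with assumed example parameters:

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
O = [0, 2, 1, 0]

def viterbi(pi, A, B, O):
    T, N = len(O), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, O[0]]                             # initialization
    for t in range(1, T):
        trans = delta[t - 1][:, None] * A                  # trans[i, j] = delta_{t-1}(i) a_ij
        psi[t] = trans.argmax(axis=0)                      # best predecessor for each state j
        delta[t] = trans.max(axis=0) * B[:, O[t]]          # recursion
    q = np.zeros(T, dtype=int)
    q[-1] = delta[-1].argmax()                             # termination
    for t in range(T - 2, -1, -1):
        q[t] = psi[t + 1][q[t + 1]]                        # path backtracking
    return q, delta[-1].max()

path, p_star = viterbi(pi, A, B, O)
print(path, p_star)
```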
Learning
Define ξ_t(i, j) ≡ P(q_t = S_i, q_{t+1} = S_j | O, λ) = α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j) / P(O | λ), with γ_t(i) = Σ_j ξ_t(i, j).
Baum-Welch (EM)
E-step: compute γ_t(i) and ξ_t(i, j) from the forward and backward variables under the current λ.
M-step: re-estimate the parameters from expected counts, averaged over the K sequences:
π̂_i = expected frequency of being in S_i at t = 1
â_ij = Σ_t ξ_t(i, j) / Σ_t γ_t(i)
b̂_j(m) = Σ_t γ_t(j) 1(O_t = v_m) / Σ_t γ_t(j)
A sketch of one such iteration in code follows below.
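A minimal sketch of one Baum-Welch iteration on a single observation sequence; parameters and observations are assumed examples, and a full implementation would iterate to convergence and average the counts over K sequences:

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
pi = np.array([0.6, 0.4])
O = np.array([0, 2, 1, 0, 0, 2])
T, N, M = len(O), A.shape[0], B.shape[1]

# E-step: forward and backward passes under the current parameters
alpha = np.zeros((T, N)); beta = np.ones((T, N))
alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])
P_O = alpha[-1].sum()

gamma = alpha * beta / P_O                       # gamma_t(i) = P(q_t = S_i | O, lambda)
xi = np.zeros((T - 1, N, N))                     # xi_t(i, j) = P(q_t = S_i, q_{t+1} = S_j | O, lambda)
for t in range(T - 1):
    xi[t] = alpha[t][:, None] * A * B[:, O[t + 1]] * beta[t + 1] / P_O

# M-step: re-estimate the parameters from expected counts
pi_new = gamma[0]
A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
B_new = np.zeros((N, M))
for m in range(M):
    B_new[:, m] = gamma[O == m].sum(axis=0) / gamma.sum(axis=0)
print(pi_new, A_new, B_new, sep="\n")
```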
Continuous Observations
Discrete: keep a table of emission probabilities b_j(m) over M symbols.
Gaussian mixture: discretize the continuous values (e.g., with k-means) and model each state's emission as a mixture.
Continuous: model the emission density directly and use EM to learn its parameters, e.g., P(O_t | q_t = S_j, λ) ~ N(μ_j, σ_j²).
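A minimal sketch of the continuous case with a single Gaussian per state (assumed example means and standard deviations): the emission term b_j(O_t) in the forward recursion is replaced by the density value N(O_t; μ_j, σ_j²):

```python
import numpy as np

pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
mu = np.array([0.0, 3.0])        # state-wise means
sigma = np.array([1.0, 0.5])     # state-wise standard deviations
O = np.array([0.1, -0.4, 2.8, 3.2])

def gauss_pdf(x, mu, sigma):
    """Univariate Gaussian density, evaluated element-wise per state."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

alpha = np.zeros((len(O), len(pi)))
alpha[0] = pi * gauss_pdf(O[0], mu, sigma)
for t in range(1, len(O)):
    alpha[t] = (alpha[t - 1] @ A) * gauss_pdf(O[t], mu, sigma)
print("p(O | lambda) =", alpha[-1].sum())   # a density value, not a probability
```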
HMM with Input
Input-dependent observations
Input-dependent transitions (Meila and Jordan, 1996; Bengio and Frasconi, 1996)
Time-delay input
Model Selection in HMM
Left-to-right HMMs: the topology allows transitions only to the same or later states, as in speech, where the signal moves forward in time.
In classification, for each class C_i, estimate P(O | λ_i) with a separate HMM and use Bayes' rule: P(C_i | O) ∝ P(O | λ_i) P(C_i).
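A minimal sketch of this classification scheme: one HMM per class, scored with the forward algorithm and combined with the class prior via Bayes' rule. The per-class parameters below are assumed placeholders, not trained models:

```python
import numpy as np

def forward_likelihood(pi, A, B, O):
    """Forward algorithm returning P(O | lambda) for a discrete-observation HMM."""
    alpha = pi * B[:, O[0]]
    for o in O[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

models = {                                            # one (pi, A, B) triple per class
    "C1": (np.array([0.7, 0.3]),
           np.array([[0.8, 0.2], [0.3, 0.7]]),
           np.array([[0.9, 0.1], [0.2, 0.8]])),
    "C2": (np.array([0.4, 0.6]),
           np.array([[0.5, 0.5], [0.5, 0.5]]),
           np.array([[0.3, 0.7], [0.6, 0.4]])),
}
priors = {"C1": 0.5, "C2": 0.5}
O = [0, 0, 1, 0]

scores = {c: forward_likelihood(*m, O) * priors[c] for c, m in models.items()}
print(max(scores, key=scores.get))                    # class maximizing P(O | lambda_i) P(C_i)
```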