MACHINE LEARNING 16. HMM
Introduction
(Lecture Notes for E. Alpaydın, 2004, Introduction to Machine Learning, © The MIT Press, V1.1)
Modeling dependencies in the input; samples are no longer iid.
Sequences:
Temporal: in speech, phonemes in a word (dictionary) and words in a sentence (syntax, semantics of the language); in handwriting, pen movements.
Spatial: in a DNA sequence, base pairs.
Discrete Markov Process
N states: S_1, S_2, ..., S_N; the state at "time" t is q_t = S_i.
First-order Markov property: P(q_{t+1} = S_j | q_t = S_i, q_{t-1} = S_k, ...) = P(q_{t+1} = S_j | q_t = S_i)
Transition probabilities: a_ij ≡ P(q_{t+1} = S_j | q_t = S_i), with a_ij ≥ 0 and Σ_{j=1}^N a_ij = 1
Initial probabilities: π_i ≡ P(q_1 = S_i), with Σ_{i=1}^N π_i = 1
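A minimal sketch of sampling a state sequence from such a first-order Markov chain; the values of π and A below are assumed examples, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

pi = np.array([0.5, 0.3, 0.2])            # initial state probabilities, sums to 1
A = np.array([[0.7, 0.2, 0.1],            # A[i, j] = P(q_{t+1} = S_j | q_t = S_i), rows sum to 1
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])

def sample_chain(pi, A, T):
    """Draw a length-T state sequence q_1, ..., q_T (states indexed from 0)."""
    q = [rng.choice(len(pi), p=pi)]                  # draw q_1 from pi
    for _ in range(T - 1):
        q.append(rng.choice(A.shape[1], p=A[q[-1]]))  # draw q_{t+1} from row q_t of A
    return np.array(q)

print(sample_chain(pi, A, T=10))
```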
Stochastic Automaton
The probability of an observed state sequence Q = {q_1, q_2, ..., q_T} is
P(Q | A, Π) = P(q_1) Π_{t=2}^T P(q_t | q_{t-1}) = π_{q_1} a_{q_1 q_2} ··· a_{q_{T-1} q_T}
Example: Balls and Urns
Three urns, each full of balls of a single color: S_1: red, S_2: blue, S_3: green.
Balls and Urns: Learning
Given K observed example sequences of length T, the maximum likelihood estimates are relative frequencies of counts:
π̂_i = #{sequences starting in S_i} / K
â_ij = #{transitions from S_i to S_j} / #{transitions out of S_i}
A sketch of these estimates in code follows below.
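A minimal sketch of the count-based estimates above; the observed state sequences are made-up examples:

```python
import numpy as np

N = 3                                            # number of states
sequences = [np.array([0, 0, 1, 2, 2, 1]),       # hypothetical fully observed state paths
             np.array([1, 1, 1, 0, 2, 2])]

pi_hat = np.zeros(N)
counts = np.zeros((N, N))
for q in sequences:
    pi_hat[q[0]] += 1                            # count the starting state
    for t in range(len(q) - 1):
        counts[q[t], q[t + 1]] += 1              # count transition S_i -> S_j

pi_hat /= pi_hat.sum()                                   # pi_i = #(q_1 = S_i) / K
A_hat = counts / counts.sum(axis=1, keepdims=True)       # a_ij = #(S_i -> S_j) / #(S_i -> any)
print(pi_hat)
print(A_hat)
```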
Hidden Markov Models
States are not observable; discrete observations {v_1, v_2, ..., v_M} are recorded, a probabilistic function of the state.
Emission probabilities: b_j(m) ≡ P(O_t = v_m | q_t = S_j)
Example: each urn contains balls of different colors, with different probabilities per urn.
For a given observation sequence, there are multiple possible state sequences.
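A minimal sketch of generating observations from an HMM (all parameter values below are assumed examples); the state path is sampled internally but stays hidden, only the emitted symbols are returned:

```python
import numpy as np

rng = np.random.default_rng(1)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1],   # row j: emission distribution of state S_j over {v_1, v_2, v_3}
              [0.1, 0.2, 0.7]])

def sample_hmm(pi, A, B, T):
    q = rng.choice(len(pi), p=pi)                    # hidden initial state
    obs = []
    for _ in range(T):
        obs.append(rng.choice(B.shape[1], p=B[q]))   # emit a symbol from the current state
        q = rng.choice(A.shape[1], p=A[q])           # move to the next (hidden) state
    return np.array(obs)

print(sample_hmm(pi, A, B, T=8))
```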
HMM Unfolded in Time (figure: hidden states q_1, ..., q_T with their observations O_1, ..., O_T)
Elements of an HMM
N: number of states
M: number of observation symbols
A = [a_ij]: N×N state transition probability matrix
B = [b_j(m)]: N×M observation (emission) probability matrix
Π = [π_i]: N×1 initial state probability vector
λ = (A, B, Π): the parameter set of the HMM
Three Basic Problems of HMMs
1. Evaluation: given λ and O, calculate P(O | λ).
2. State sequence: given λ and O, find Q* such that P(Q* | O, λ) = max_Q P(Q | O, λ).
3. Learning: given X = {O^k}_k, find λ* such that P(X | λ*) = max_λ P(X | λ).
(Rabiner, 1989)
Evaluation
Forward variable: α_t(i) ≡ P(O_1 ··· O_t, q_t = S_i | λ)
Initialization: α_1(i) = π_i b_i(O_1)
Recursion: α_{t+1}(j) = [ Σ_{i=1}^N α_t(i) a_ij ] b_j(O_{t+1})
Then P(O | λ) = Σ_{i=1}^N α_T(i)
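A minimal sketch of the forward pass; the parameters and the observation sequence O (a list of symbol indices) are assumed example values:

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
O = [0, 2, 1, 0]                                   # observed symbol indices

def forward(pi, A, B, O):
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                     # initialization: alpha_1(i) = pi_i b_i(O_1)
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]] # recursion: sum over predecessors, then emit
    return alpha

alpha = forward(pi, A, B, O)
print("P(O | lambda) =", alpha[-1].sum())          # evaluation: sum over final states
```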
Backward variable: β_t(i) ≡ P(O_{t+1} ··· O_T | q_t = S_i, λ)
Initialization: β_T(i) = 1
Recursion: β_t(i) = Σ_{j=1}^N a_ij b_j(O_{t+1}) β_{t+1}(j)
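A minimal sketch of the backward pass with the same assumed example parameters; it recovers the same P(O | λ) as the forward pass:

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
O = [0, 2, 1, 0]

def backward(A, B, O):
    T, N = len(O), A.shape[0]
    beta = np.ones((T, N))                              # initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])    # recursion over successor states
    return beta

beta = backward(A, B, O)
print("P(O | lambda) =", (pi * B[:, O[0]] * beta[0]).sum())  # same value as the forward pass
```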
Finding the State Sequence
Define γ_t(i) ≡ P(q_t = S_i | O, λ) = α_t(i) β_t(i) / Σ_j α_t(j) β_t(j).
Choose the state that has the highest probability at each time step: q_t* = argmax_i γ_t(i).
No! The states chosen independently in this way may not form a feasible sequence, e.g., when some a_ij = 0.
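A minimal sketch computing γ_t(i) from the forward and backward variables and taking the per-step argmax (same assumed example parameters as above):

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
O = [0, 2, 1, 0]
T, N = len(O), len(pi)

alpha = np.zeros((T, N)); beta = np.ones((T, N))
alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]          # forward pass
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])        # backward pass

gamma = alpha * beta / alpha[-1].sum()                  # gamma_t(i) = P(q_t = S_i | O, lambda)
q_star = gamma.argmax(axis=1)                           # most likely state at each step, chosen independently
print(gamma)
print(q_star)   # note: this need not be a feasible path if some a_ij = 0
```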
Viterbi's Algorithm
δ_t(i) ≡ max_{q_1 q_2 ··· q_{t-1}} P(q_1 q_2 ··· q_{t-1}, q_t = S_i, O_1 ··· O_t | λ)
Initialization: δ_1(i) = π_i b_i(O_1), ψ_1(i) = 0
Recursion: δ_t(j) = [max_i δ_{t-1}(i) a_ij] b_j(O_t), ψ_t(j) = argmax_i δ_{t-1}(i) a_ij
Termination: p* = max_i δ_T(i), q_T* = argmax_i δ_T(i)
Path backtracking: q_t* = ψ_{t+1}(q_{t+1}*), t = T-1, T-2, ..., 1
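A minimal sketch of Viterbi's algorithm following the recursion above, again with assumed example parameters:

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
O = [0, 2, 1, 0]

def viterbi(pi, A, B, O):
    T, N = len(O), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, O[0]]                             # initialization
    for t in range(1, T):
        trans = delta[t - 1][:, None] * A                  # trans[i, j] = delta_{t-1}(i) a_ij
        psi[t] = trans.argmax(axis=0)                      # best predecessor for each state j
        delta[t] = trans.max(axis=0) * B[:, O[t]]          # recursion
    q = np.zeros(T, dtype=int)
    q[-1] = delta[-1].argmax()                             # termination
    for t in range(T - 2, -1, -1):
        q[t] = psi[t + 1][q[t + 1]]                        # path backtracking
    return q, delta[-1].max()

path, p_star = viterbi(pi, A, B, O)
print(path, p_star)
```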
Learning
Define ξ_t(i, j) ≡ P(q_t = S_i, q_{t+1} = S_j | O, λ) = α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j) / P(O | λ), with γ_t(i) = Σ_j ξ_t(i, j).
Baum-Welch (EM)
E-step: compute γ_t(i) and ξ_t(i, j) from the forward and backward variables under the current λ.
M-step: re-estimate the parameters from expected counts, averaged over the K sequences:
π̂_i = expected frequency of being in S_i at t = 1
â_ij = Σ_t ξ_t(i, j) / Σ_t γ_t(i)
b̂_j(m) = Σ_t γ_t(j) 1(O_t = v_m) / Σ_t γ_t(j)
A sketch of one such iteration in code follows below.
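A minimal sketch of one Baum-Welch iteration on a single observation sequence; parameters and observations are assumed examples, and a full implementation would iterate to convergence and average the counts over K sequences:

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]])
pi = np.array([0.6, 0.4])
O = np.array([0, 2, 1, 0, 0, 2])
T, N, M = len(O), A.shape[0], B.shape[1]

# E-step: forward and backward passes under the current parameters
alpha = np.zeros((T, N)); beta = np.ones((T, N))
alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])
P_O = alpha[-1].sum()

gamma = alpha * beta / P_O                       # gamma_t(i) = P(q_t = S_i | O, lambda)
xi = np.zeros((T - 1, N, N))                     # xi_t(i, j) = P(q_t = S_i, q_{t+1} = S_j | O, lambda)
for t in range(T - 1):
    xi[t] = alpha[t][:, None] * A * B[:, O[t + 1]] * beta[t + 1] / P_O

# M-step: re-estimate the parameters from expected counts
pi_new = gamma[0]
A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
B_new = np.zeros((N, M))
for m in range(M):
    B_new[:, m] = gamma[O == m].sum(axis=0) / gamma.sum(axis=0)
print(pi_new, A_new, B_new, sep="\n")
```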
Continuous Observations
Discrete: keep a table of emission probabilities b_j(m) over M symbols.
Gaussian mixture: discretize the continuous values (e.g., with k-means) and model each state's emission as a mixture.
Continuous: model the emission density directly and use EM to learn its parameters, e.g., P(O_t | q_t = S_j, λ) ~ N(μ_j, σ_j²).
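A minimal sketch of the continuous case with a single Gaussian per state (assumed example means and standard deviations): the emission term b_j(O_t) in the forward recursion is replaced by the density value N(O_t; μ_j, σ_j²):

```python
import numpy as np

pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
mu = np.array([0.0, 3.0])        # state-wise means
sigma = np.array([1.0, 0.5])     # state-wise standard deviations
O = np.array([0.1, -0.4, 2.8, 3.2])

def gauss_pdf(x, mu, sigma):
    """Univariate Gaussian density, evaluated element-wise per state."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

alpha = np.zeros((len(O), len(pi)))
alpha[0] = pi * gauss_pdf(O[0], mu, sigma)
for t in range(1, len(O)):
    alpha[t] = (alpha[t - 1] @ A) * gauss_pdf(O[t], mu, sigma)
print("p(O | lambda) =", alpha[-1].sum())   # a density value, not a probability
```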
HMM with Input
Input-dependent observations
Input-dependent transitions (Meila and Jordan, 1996; Bengio and Frasconi, 1996)
Time-delay input
Model Selection in HMM
Left-to-right HMMs: the topology allows transitions only to the same or later states, as in speech, where the signal moves forward in time.
In classification, for each class C_i, estimate P(O | λ_i) with a separate HMM and use Bayes' rule: P(C_i | O) ∝ P(O | λ_i) P(C_i).
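A minimal sketch of this classification scheme: one HMM per class, scored with the forward algorithm and combined with the class prior via Bayes' rule. The per-class parameters below are assumed placeholders, not trained models:

```python
import numpy as np

def forward_likelihood(pi, A, B, O):
    """Forward algorithm returning P(O | lambda) for a discrete-observation HMM."""
    alpha = pi * B[:, O[0]]
    for o in O[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

models = {                                            # one (pi, A, B) triple per class
    "C1": (np.array([0.7, 0.3]),
           np.array([[0.8, 0.2], [0.3, 0.7]]),
           np.array([[0.9, 0.1], [0.2, 0.8]])),
    "C2": (np.array([0.4, 0.6]),
           np.array([[0.5, 0.5], [0.5, 0.5]]),
           np.array([[0.3, 0.7], [0.6, 0.4]])),
}
priors = {"C1": 0.5, "C2": 0.5}
O = [0, 0, 1, 0]

scores = {c: forward_likelihood(*m, O) * priors[c] for c, m in models.items()}
print(max(scores, key=scores.get))                    # class maximizing P(O | lambda_i) P(C_i)
```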