Slide 1: Ch 13. Sequential Data (1/2)
Pattern Recognition and Machine Learning, C. M. Bishop, 2006.
Summarized by Kim Jin-young, Biointelligence Laboratory, Seoul National University
http://bi.snu.ac.kr/
Slide 2: Contents
13.1 Markov Models
13.2 Hidden Markov Models
  13.2.1 Maximum likelihood for the HMM
  13.2.2 The forward-backward algorithm
  13.2.3 The sum-product algorithm for the HMM
  13.2.4 Scaling factors
  13.2.5 The Viterbi algorithm
  13.2.6 Extensions of the HMM
Slide 3: Sequential Data
- Data dependencies exist along a sequence (weather data, DNA, characters in a sentence), so the i.i.d. assumption does not hold.
- Sequential distributions: stationary vs. nonstationary.
- Markov models: no latent variables.
- State space models: the hidden Markov model (discrete latent variables) and linear dynamical systems.
Slide 4: Markov Models
- Markov chain.
- State space model (HMM): free of a Markov assumption of any order, with a reasonable number of extra parameters.
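The two factorizations this slide contrasts appeared as images in the original deck; in the standard PRML notation they are:

```latex
% First-order Markov chain (no latent variables):
p(x_1, \ldots, x_N) = p(x_1) \prod_{n=2}^{N} p(x_n \mid x_{n-1})

% State space model: a latent first-order chain z_1, ..., z_N,
% with each observation x_n emitted from z_n:
p(x_1, \ldots, x_N, z_1, \ldots, z_N)
  = p(z_1) \Bigl[ \prod_{n=2}^{N} p(z_n \mid z_{n-1}) \Bigr] \prod_{n=1}^{N} p(x_n \mid z_n)
```

Marginalizing out the latent chain couples each x_n to all earlier observations, which is what the parenthetical above means: the observed sequence satisfies no Markov assumption of any finite order, yet the number of parameters stays modest.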
Slide 5: Hidden Markov Model (overview)
- Overview: introduces discrete latent variables (based on prior knowledge).
- Examples: coin toss; urn and ball.
- Three issues, given an observation sequence:
  1. Parameter estimation
  2. Probability of the observation sequence
  3. Most likely sequence of latent variables
Slide 6: Hidden Markov Model (example)
- Lattice representation
- Left-to-right HMM
Slide 7: Hidden Markov Model
Given the observations X, latent variables Z, and model parameters θ, the joint probability distribution for the HMM factorizes into three kinds of terms: an initial latent-node term (initial state), conditional distributions among successive latent variables (state transitions), and emission probabilities.
Notation: K is the number of states and N is the total number of time steps; the product z_{n-1,j} z_{nk} picks out a transition from state j at time n-1 to state k at time n.
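The equations on this slide were images in the original; the standard PRML forms for the joint distribution and its three kinds of factors are:

```latex
% Joint distribution over observations X and latent variables Z:
p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol\theta)
  = p(\mathbf{z}_1 \mid \boldsymbol\pi)
    \Bigl[ \prod_{n=2}^{N} p(\mathbf{z}_n \mid \mathbf{z}_{n-1}, \mathbf{A}) \Bigr]
    \prod_{m=1}^{N} p(\mathbf{x}_m \mid \mathbf{z}_m, \boldsymbol\phi)

% Initial state, state transition, and emission factors:
p(\mathbf{z}_1 \mid \boldsymbol\pi) = \prod_{k=1}^{K} \pi_k^{z_{1k}}, \qquad
p(\mathbf{z}_n \mid \mathbf{z}_{n-1}, \mathbf{A})
  = \prod_{k=1}^{K} \prod_{j=1}^{K} A_{jk}^{\,z_{n-1,j}\, z_{nk}}, \qquad
p(\mathbf{x}_n \mid \mathbf{z}_n, \boldsymbol\phi)
  = \prod_{k=1}^{K} p(\mathbf{x}_n \mid \boldsymbol\phi_k)^{z_{nk}}
```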
Slide 8: EM Revisited (slide by Seok Ho-sik)
General EM maximizes the log-likelihood function, given a joint distribution p(X, Z | Θ) over observed variables X and latent variables Z, governed by parameters Θ:
1. Choose an initial setting for the parameters Θ_old.
2. E step: evaluate p(Z | X, Θ_old), the posterior distribution of the latent variables.
3. M step: evaluate Θ_new = argmax_Θ Q(Θ, Θ_old), where Q(Θ, Θ_old) = Σ_Z p(Z | X, Θ_old) ln p(X, Z | Θ).
4. If the convergence criterion is not satisfied, set Θ_old ← Θ_new and return to step 2.
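As a concrete instance of steps 1-4, here is a minimal NumPy sketch of EM for a one-dimensional two-component Gaussian mixture; the mixture model and all names (em_gmm, resp, w, mu, var) are illustrative choices, not from the slides. The responsibilities resp play the role of p(Z | X, Θ_old).

```python
import numpy as np

def em_gmm(x, n_iter=50):
    # 1. Initial setting for theta_old
    w = np.array([0.5, 0.5])            # mixing coefficients
    mu = np.array([x.min(), x.max()])   # component means
    var = np.array([x.var(), x.var()])  # component variances

    for _ in range(n_iter):
        # 2. E step: posterior p(Z | X, theta_old), i.e. responsibilities
        dens = (w / np.sqrt(2 * np.pi * var)
                * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var))
        resp = dens / dens.sum(axis=1, keepdims=True)

        # 3. M step: theta_new = argmax Q(theta, theta_old)
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        # 4. (A convergence check on the log-likelihood would go here.)
    return w, mu, var

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 300)])
print(em_gmm(x))
```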
Slide 9: Estimation of HMM Parameters
- The likelihood function: marginalize the joint distribution over the latent variables Z.
- Using the EM algorithm, E step: evaluate the posterior over the latent variables under the current parameters.
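Spelled out (these formulas were images in the original slides; the forms below follow PRML §13.2.1):

```latex
% Likelihood: marginalize the joint over the latent variables
p(\mathbf{X} \mid \boldsymbol\theta) = \sum_{\mathbf{Z}} p(\mathbf{X}, \mathbf{Z} \mid \boldsymbol\theta)

% E step: posterior marginals of the latent variables
\gamma(\mathbf{z}_n) = p(\mathbf{z}_n \mid \mathbf{X}, \boldsymbol\theta^{\text{old}}), \qquad
\xi(\mathbf{z}_{n-1}, \mathbf{z}_n) = p(\mathbf{z}_{n-1}, \mathbf{z}_n \mid \mathbf{X}, \boldsymbol\theta^{\text{old}})

% Expected complete-data log likelihood
Q(\boldsymbol\theta, \boldsymbol\theta^{\text{old}})
 = \sum_{k=1}^{K} \gamma(z_{1k}) \ln \pi_k
 + \sum_{n=2}^{N} \sum_{j=1}^{K} \sum_{k=1}^{K} \xi(z_{n-1,j}, z_{nk}) \ln A_{jk}
 + \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma(z_{nk}) \ln p(\mathbf{x}_n \mid \boldsymbol\phi_k)
```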
Slide 10: Estimation of HMM Parameters (continued)
M step: re-estimate the initial state distribution, the transition probabilities, and the emission densities (here, given Gaussian emission densities).
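The corresponding update equations, in their standard PRML form:

```latex
\pi_k = \frac{\gamma(z_{1k})}{\sum_{j=1}^{K} \gamma(z_{1j})}, \qquad
A_{jk} = \frac{\sum_{n=2}^{N} \xi(z_{n-1,j}, z_{nk})}
              {\sum_{l=1}^{K} \sum_{n=2}^{N} \xi(z_{n-1,j}, z_{nl})}

% For Gaussian emission densities p(x | phi_k) = N(x | mu_k, Sigma_k):
\boldsymbol\mu_k = \frac{\sum_{n=1}^{N} \gamma(z_{nk})\, \mathbf{x}_n}
                        {\sum_{n=1}^{N} \gamma(z_{nk})}, \qquad
\boldsymbol\Sigma_k = \frac{\sum_{n=1}^{N} \gamma(z_{nk})
                            (\mathbf{x}_n - \boldsymbol\mu_k)(\mathbf{x}_n - \boldsymbol\mu_k)^{\mathsf T}}
                           {\sum_{n=1}^{N} \gamma(z_{nk})}
```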
Slide 11: Forward-Backward Algorithm (parameter estimation)
- Posterior probability γ(z_n) of a single latent variable.
- Posterior probability ξ(z_{n-1}, z_n) of two successive latent variables.
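In terms of the forward and backward variables defined on the next slide, these posteriors take the standard forms:

```latex
\gamma(\mathbf{z}_n) = p(\mathbf{z}_n \mid \mathbf{X})
  = \frac{\alpha(\mathbf{z}_n)\,\beta(\mathbf{z}_n)}{p(\mathbf{X})}, \qquad
\xi(\mathbf{z}_{n-1}, \mathbf{z}_n)
  = \frac{\alpha(\mathbf{z}_{n-1})\, p(\mathbf{x}_n \mid \mathbf{z}_n)\,
          p(\mathbf{z}_n \mid \mathbf{z}_{n-1})\, \beta(\mathbf{z}_n)}{p(\mathbf{X})}
```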
Slide 12: Forward and Backward Variables
- Defining α and β recursively.
- Probability of the observation sequence.
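The definitions and recursions, which appeared as images in the original:

```latex
% Definitions
\alpha(\mathbf{z}_n) \equiv p(\mathbf{x}_1, \ldots, \mathbf{x}_n, \mathbf{z}_n), \qquad
\beta(\mathbf{z}_n) \equiv p(\mathbf{x}_{n+1}, \ldots, \mathbf{x}_N \mid \mathbf{z}_n)

% Recursions
\alpha(\mathbf{z}_n) = p(\mathbf{x}_n \mid \mathbf{z}_n)
  \sum_{\mathbf{z}_{n-1}} \alpha(\mathbf{z}_{n-1})\, p(\mathbf{z}_n \mid \mathbf{z}_{n-1})

\beta(\mathbf{z}_n) = \sum_{\mathbf{z}_{n+1}} \beta(\mathbf{z}_{n+1})\,
  p(\mathbf{x}_{n+1} \mid \mathbf{z}_{n+1})\, p(\mathbf{z}_{n+1} \mid \mathbf{z}_n)

% Initialization and the probability of the observations
\alpha(\mathbf{z}_1) = p(\mathbf{z}_1)\, p(\mathbf{x}_1 \mid \mathbf{z}_1), \qquad
\beta(\mathbf{z}_N) = 1, \qquad
p(\mathbf{X}) = \sum_{\mathbf{z}_N} \alpha(\mathbf{z}_N)
```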
Slide 13: Sum-Product Algorithm (alternative to the forward-backward algorithm)
- Factor graph representation of the HMM, conditioned on the observations x_1, ..., x_N.
- Message passing on this graph gives the same result as the forward-backward algorithm.
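A sketch of the factor graph pieces (standard PRML notation; the original slide showed these as images). Absorbing the conditioned-on emission terms into the chain factors gives

```latex
h(\mathbf{z}_1) = p(\mathbf{z}_1)\, p(\mathbf{x}_1 \mid \mathbf{z}_1), \qquad
f_n(\mathbf{z}_{n-1}, \mathbf{z}_n) = p(\mathbf{z}_n \mid \mathbf{z}_{n-1})\, p(\mathbf{x}_n \mid \mathbf{z}_n)
```

and the sum-product messages along the chain reproduce the forward and backward variables: μ_{f_n → z_n}(z_n) = α(z_n) and μ_{f_{n+1} → z_n}(z_n) = β(z_n).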
Slide 14: Scaling Factors (implementation issue)
- The α and β values can go to zero exponentially quickly as the sequence grows, underflowing floating-point arithmetic.
- The remedy is to rescale α and β at each step so that their values remain of order unity.
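A minimal NumPy sketch of the scaled recursions, assuming a discrete chain with precomputed emission likelihoods; the array names (pi, A, B) and shapes are my own choices, not from the slides:

```python
import numpy as np

def forward_backward(pi, A, B):
    """Scaled forward-backward pass.

    pi : (K,)   initial state distribution
    A  : (K, K) transition matrix, A[j, k] = p(z_n = k | z_{n-1} = j)
    B  : (N, K) emission likelihoods, B[n, k] = p(x_n | z_n = k)

    Returns scaled alpha-hat and beta-hat, the per-step scaling
    factors c, and the log-likelihood ln p(X) = sum_n ln c_n.
    """
    N, K = B.shape
    alpha = np.zeros((N, K))
    beta = np.zeros((N, K))
    c = np.zeros(N)

    # Forward pass: alpha-hat(z_n) = p(z_n | x_1..x_n),
    # c_n = p(x_n | x_1..x_{n-1})
    a = pi * B[0]
    c[0] = a.sum()
    alpha[0] = a / c[0]
    for n in range(1, N):
        a = (alpha[n - 1] @ A) * B[n]
        c[n] = a.sum()
        alpha[n] = a / c[n]

    # Backward pass, rescaled by the same factors; beta-hat(z_N) = 1
    beta[N - 1] = 1.0
    for n in range(N - 2, -1, -1):
        beta[n] = (A @ (B[n + 1] * beta[n + 1])) / c[n + 1]

    return alpha, beta, c, np.log(c).sum()
```

Under this scaling both variables stay of order unity, γ(z_n) is simply alpha[n] * beta[n] elementwise, and the log-likelihood is recovered as Σ_n ln c_n, avoiding the underflow described above.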
Slide 15: The Viterbi Algorithm (most likely state sequence)
- Derived from the max-sum algorithm.
- Joint distribution evaluated along the most probable path (Eq. 13.68 revised).
- Backtracking recovers the most probable path.
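A minimal log-space sketch of the max-sum recursion with backtracking, under the same (pi, A, B) conventions as the forward-backward sketch above; the names are illustrative:

```python
import numpy as np

def viterbi(pi, A, B):
    """Most probable state path via the max-sum (Viterbi) algorithm.

    pi : (K,)   initial state distribution
    A  : (K, K) transition matrix, A[j, k] = p(z_n = k | z_{n-1} = j)
    B  : (N, K) emission likelihoods, B[n, k] = p(x_n | z_n = k)
    """
    N, K = B.shape
    log_A = np.log(A)
    omega = np.log(pi) + np.log(B[0])    # omega(z_1) = ln p(z_1) + ln p(x_1 | z_1)
    back = np.zeros((N, K), dtype=int)   # backpointers

    for n in range(1, N):
        scores = omega[:, None] + log_A  # scores[j, k]: best path into j, then j -> k
        back[n] = scores.argmax(axis=0)
        omega = scores.max(axis=0) + np.log(B[n])

    # Backtrack from the best final state
    path = np.empty(N, dtype=int)
    path[-1] = int(omega.argmax())
    for n in range(N - 2, -1, -1):
        path[n] = back[n + 1, path[n + 1]]
    return path, omega.max()             # path and its log joint probability
```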
Slide 16: Extensions of the HMM
- Autoregressive HMM: captures longer-range time dependencies among the observations.
- Input-output HMM: for supervised learning.
- Factorial HMM: for decoding multiple bits of information.
Slide 17: References
- L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proceedings of the IEEE, 1989.
- http://en.wikipedia.org/wiki/Expectation-maximization_algorithm
- http://en.wikipedia.org/wiki/Lagrange_multipliers