
1 Ch 13. Sequential Data (1/2)
Pattern Recognition and Machine Learning, C. M. Bishop, 2006.
Summarized by Kim Jin-young, Biointelligence Laboratory, Seoul National University. http://bi.snu.ac.kr/

2 Contents
13.1 Markov Models
13.2 Hidden Markov Models
 13.2.1 Maximum likelihood for the HMM
 13.2.2 The forward-backward algorithm
 13.2.3 The sum-product algorithm for the HMM
 13.2.4 Scaling factors
 13.2.5 The Viterbi algorithm
 13.2.6 Extensions of the HMM

3 Sequential Data
Data dependencies exist along the sequence
 Weather data, DNA, characters in a sentence
 The i.i.d. assumption does not hold
Sequential distributions
 Stationary vs. nonstationary
Markov model
 No latent variables
State space models
 Hidden Markov model (discrete latent variables)
 Linear dynamical systems

4 Markov Models
Markov chain
State space model (HMM)
 Not limited to a Markov assumption of any fixed order, yet requires only a reasonable number of extra parameters
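A first-order Markov chain factorizes the joint distribution as p(x_1, ..., x_N) = p(x_1) ∏_{n=2}^{N} p(x_n | x_{n-1}). A minimal sketch of this factorization, using a hypothetical 3-state transition matrix (the values are made up, not from the slides):

```python
import numpy as np

# Hypothetical 3-state chain: pi[k] = p(x_1 = k),
# A[j, k] = p(x_n = k | x_{n-1} = j).
pi = np.array([0.6, 0.3, 0.1])
A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])

def chain_log_prob(states, pi, A):
    """Log joint probability of a state sequence under a first-order chain:
    ln p(x_1) + sum_n ln p(x_n | x_{n-1})."""
    logp = np.log(pi[states[0]])
    for prev, cur in zip(states[:-1], states[1:]):
        logp += np.log(A[prev, cur])
    return logp

print(chain_log_prob([0, 0, 1, 2], pi, A))  # ln(0.6 * 0.7 * 0.2 * 0.2)
```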

5 Hidden Markov Model (overview)
Overview
 Introduces discrete latent variables (based on prior knowledge)
Examples
 Coin toss
 Urn and ball
Three issues (given an observation sequence)
 Parameter estimation
 Probability of the observation sequence
 Most likely sequence of latent variables

6 Hidden Markov Model (example)
Lattice representation
Left-to-right HMM

7 Hidden Markov Model
Given observations X = {x_1, ..., x_N}, latent variables Z = {z_1, ..., z_N}, and model parameters θ = {π, A, φ}, the joint probability distribution for the HMM is

p(X, Z | θ) = p(z_1 | π) [∏_{n=2}^{N} p(z_n | z_{n-1}, A)] ∏_{n=1}^{N} p(x_n | z_n, φ)

whose factors are the initial latent node p(z_1 | π), the conditional distribution among latent variables p(z_n | z_{n-1}, A), and the emission probability p(x_n | z_n, φ) (initial state, state transition, emission).
K: number of states / N: total number of time steps.
z_{n-1,j}, z_{nk}: in state j at time n-1, then transitioning to state k at time n.
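A minimal sketch of this factorization, assuming discrete emissions with B[k, v] = p(x_n = v | z_n = k) (the slides leave the emission density generic):

```python
import numpy as np

def hmm_joint_log_prob(states, obs, pi, A, B):
    """Log p(X, Z | theta): initial state, state transitions, emissions.

    pi[k]   : p(z_1 = k)               (initial state distribution)
    A[j, k] : p(z_n = k | z_{n-1} = j) (transition matrix)
    B[k, v] : p(x_n = v | z_n = k)     (emission matrix, discrete case)
    """
    logp = np.log(pi[states[0]]) + np.log(B[states[0], obs[0]])
    for n in range(1, len(states)):
        logp += np.log(A[states[n - 1], states[n]])  # p(z_n | z_{n-1}, A)
        logp += np.log(B[states[n], obs[n]])         # p(x_n | z_n, phi)
    return logp
```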

8 EM Revisited (slide by Seok Ho-sik)
General EM
 Maximizes the log likelihood function
 Given a joint distribution p(X, Z | Θ) over observed variables X and latent variables Z, governed by parameters Θ:
1. Choose an initial setting for the parameters Θ_old.
2. E step: evaluate p(Z | X, Θ_old), the posterior distribution of the latent variables.
3. M step: evaluate Θ_new given by Θ_new = argmax_Θ Q(Θ, Θ_old), where Q(Θ, Θ_old) = Σ_Z p(Z | X, Θ_old) ln p(X, Z | Θ).
4. If the convergence criterion is not satisfied, let Θ_old ← Θ_new and return to step 2.
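A schematic of this loop; e_step and m_step are placeholders to be instantiated for a concrete model (HMM versions are sketched under the slides that follow):

```python
def em(params, obs, e_step, m_step, tol=1e-6, max_iter=100):
    """Generic EM: alternate E and M steps until the log likelihood
    improves by less than tol (the convergence criterion)."""
    prev_ll = float("-inf")
    for _ in range(max_iter):
        posterior, ll = e_step(params, obs)  # evaluate p(Z | X, theta_old)
        params = m_step(posterior, obs)      # theta_new = argmax Q(theta, theta_old)
        if ll - prev_ll < tol:
            break
        prev_ll = ll                         # theta_old <- theta_new
    return params
```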

9 Estimation of HMM Parameters
The likelihood function (marginalization over the latent variables Z):

p(X | θ) = Σ_Z p(X, Z | θ)

Using the EM algorithm:
 E step: evaluate the posterior distribution of the latent variables under the current parameters, i.e. the quantities γ(z_n) = p(z_n | X, θ_old) and ξ(z_{n-1}, z_n) = p(z_{n-1}, z_n | X, θ_old), which are all that Q(θ, θ_old) requires.

10 Estimation of HMM Parameters
 M step: maximize Q(θ, θ_old) with respect to the parameters.
 Initial: π_k = γ(z_{1k}) / Σ_{j=1}^{K} γ(z_{1j})
 Transition: A_{jk} = Σ_{n=2}^{N} ξ(z_{n-1,j}, z_{nk}) / Σ_{l=1}^{K} Σ_{n=2}^{N} ξ(z_{n-1,j}, z_{nl})
 Emission (given a Gaussian emission density):
μ_k = Σ_{n=1}^{N} γ(z_{nk}) x_n / Σ_{n=1}^{N} γ(z_{nk})
Σ_k = Σ_{n=1}^{N} γ(z_{nk}) (x_n − μ_k)(x_n − μ_k)^T / Σ_{n=1}^{N} γ(z_{nk})
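A minimal sketch of these updates for the Gaussian-emission case, assuming the E-step statistics gamma (N × K) and xi ((N-1) × K × K) have already been computed by a forward-backward pass (sketched further below):

```python
import numpy as np

def m_step_gaussian(gamma, xi, X):
    """Re-estimate pi, A, and the Gaussian emission parameters.

    gamma[n, k] : p(z_n = k | X)
    xi[n, j, k] : p(z_n = j, z_{n+1} = k | X)
    X           : (N, D) observation sequence
    """
    pi = gamma[0] / gamma[0].sum()              # initial-state update
    A = xi.sum(axis=0)                          # expected transition counts
    A /= A.sum(axis=1, keepdims=True)
    Nk = gamma.sum(axis=0)                      # responsibility mass per state
    mu = (gamma.T @ X) / Nk[:, None]            # per-state means
    K, D = mu.shape
    cov = np.empty((K, D, D))
    for k in range(K):
        d = X - mu[k]
        cov[k] = (gamma[:, k, None] * d).T @ d / Nk[k]
    return pi, A, mu, cov
```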

11 Forward-Backward Algorithm (parameter estimation)
Probability of a single latent variable:
γ(z_n) = p(z_n | X) = α(z_n) β(z_n) / p(X)
Probability of two successive latent variables:
ξ(z_{n-1}, z_n) = α(z_{n-1}) p(x_n | z_n) p(z_n | z_{n-1}) β(z_n) / p(X)

12 Forward and Backward Variables
Defining α and β recursively:
α(z_n) ≡ p(x_1, ..., x_n, z_n) = p(x_n | z_n) Σ_{z_{n-1}} α(z_{n-1}) p(z_n | z_{n-1}), with α(z_1) = p(z_1) p(x_1 | z_1)
β(z_n) ≡ p(x_{n+1}, ..., x_N | z_n) = Σ_{z_{n+1}} β(z_{n+1}) p(x_{n+1} | z_{n+1}) p(z_{n+1} | z_n), with β(z_N) = 1
Probability of the observation sequence:
p(X) = Σ_{z_N} α(z_N)
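A minimal sketch of these recursions, again assuming a discrete emission matrix B. This unscaled version underflows on long sequences; see the scaled variant under "Scaling Factors":

```python
import numpy as np

def forward_backward(obs, pi, A, B):
    """Unscaled alpha/beta recursions; returns alpha, beta, and p(X)."""
    N, K = len(obs), len(pi)
    alpha = np.zeros((N, K))
    beta = np.ones((N, K))                              # beta(z_N) = 1
    alpha[0] = pi * B[:, obs[0]]                        # alpha(z_1) = p(z_1) p(x_1 | z_1)
    for n in range(1, N):
        alpha[n] = B[:, obs[n]] * (alpha[n - 1] @ A)    # forward recursion
    for n in range(N - 2, -1, -1):
        beta[n] = A @ (B[:, obs[n + 1]] * beta[n + 1])  # backward recursion
    return alpha, beta, alpha[-1].sum()                 # p(X) = sum_{z_N} alpha(z_N)
```

γ(z_n) then follows as alpha[n] * beta[n] / p(X), and ξ(z_{n-1}, z_n) from the expression on the previous slide.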

13 Sum-Product Algorithm (an alternative to the forward-backward algorithm)
Factor graph representation of the HMM, conditioning on x_1, x_2, ..., x_N.
Running the sum-product algorithm on this factor graph yields the same result as the forward-backward algorithm.

14 Scaling Factors (implementation issue)
The α and β variables can go to zero exponentially quickly as the sequence length grows.
The remedy is to rescale α and β at each step so that their values remain of order unity; the per-step normalizers c_n = p(x_n | x_1, ..., x_{n-1}) then give p(X) = ∏_{n=1}^{N} c_n.
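A minimal sketch of the rescaled forward pass (same conventions as the sketch above):

```python
import numpy as np

def forward_scaled(obs, pi, A, B):
    """Forward pass with per-step normalization: alpha_hat(z_n) = p(z_n | x_1..x_n)
    stays of order unity, and log p(X) = sum_n log c_n avoids underflow."""
    N, K = len(obs), len(pi)
    alpha_hat = np.zeros((N, K))
    c = np.zeros(N)                          # c[n] = p(x_n | x_1, ..., x_{n-1})
    a = pi * B[:, obs[0]]
    c[0] = a.sum()
    alpha_hat[0] = a / c[0]
    for n in range(1, N):
        a = B[:, obs[n]] * (alpha_hat[n - 1] @ A)
        c[n] = a.sum()
        alpha_hat[n] = a / c[n]
    return alpha_hat, c, np.log(c).sum()     # log p(X)
```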

15 The Viterbi Algorithm (most likely state sequence)
From the max-sum algorithm:
ω(z_n) = max_{z_1,...,z_{n-1}} ln p(x_1, ..., x_n, z_1, ..., z_n)
ω(z_n) = ln p(x_n | z_n) + max_{z_{n-1}} { ln p(z_n | z_{n-1}) + ω(z_{n-1}) } (Eq. 13.68 revised)
The log probability of the most probable path is max_{z_N} ω(z_N); backtracking then recovers the path itself.
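A minimal sketch of the recursion and the backtracking step, again assuming a discrete emission matrix B:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most probable state sequence, computed in log space."""
    N, K = len(obs), len(pi)
    omega = np.zeros((N, K))            # omega[n, k]: best log prob ending in state k
    back = np.zeros((N, K), dtype=int)  # argmax predecessors for backtracking
    omega[0] = np.log(pi) + np.log(B[:, obs[0]])
    for n in range(1, N):
        scores = omega[n - 1][:, None] + np.log(A)   # scores[j, k]: from j to k
        back[n] = scores.argmax(axis=0)
        omega[n] = scores.max(axis=0) + np.log(B[:, obs[n]])
    path = [int(omega[-1].argmax())]                 # best final state
    for n in range(N - 1, 0, -1):
        path.append(int(back[n, path[-1]]))          # follow stored predecessors
    return path[::-1], omega[-1].max()
```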

16 Extensions of the HMM
Autoregressive HMM
 Captures longer-range dependencies among the observations
Input-output HMM
 For supervised learning
Factorial HMM
 For decoding multiple bits of information

17 References
HMM
 L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition"
Etc.
 http://en.wikipedia.org/wiki/Expectation-maximization_algorithm
 http://en.wikipedia.org/wiki/Lagrange_multipliers

