CPSC 7373: Artificial Intelligence
Lecture 12: Hidden Markov Models and Filters
Jiang Bian, Fall 2012
University of Arkansas at Little Rock
Hidden Markov Models
Hidden Markov Models (HMMs) are used to analyze or to predict time series.
Applications:
– Robotics
– Medical
– Finance
– Speech and language technologies
– etc.
Bayes Network of HMMs
HMMs:
– A sequence of states that evolves over time;
– Each state depends only on the previous state; and
– Each state emits a measurement.
Filters:
– Kalman Filters
– Particle Filters
[Figure: Bayes network with a chain of hidden states S1, S2, S3, ..., Sn, where each state St emits a measurement Zt]
Markov Chain
[Figure: two-state Markov chain over the states R and S]
The initial state: P(R0) = 1, P(S0) = 0
– P(R1) = ???
– P(R2) = ???
– P(R3) = ???
Markov Chain
[Figure: two-state Markov chain over the states R and S]
The initial state: P(R0) = 1, P(S0) = 0
– P(R1) = P(R1|R0) * P(R0) = 0.6 * 1 = 0.6
– P(R2) = P(R2|R1) * P(R1) + P(R2|S1) * P(S1) = 0.6 * 0.6 + 0.2 * 0.4 = 0.44
– P(R3) = P(R3|R2) * P(R2) + P(R3|S2) * P(S2) = 0.6 * 0.44 + 0.2 * 0.56 = 0.376
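The recursion on this slide can be checked numerically. The following is a minimal sketch, assuming the transition probabilities P(R|R) = 0.6 and P(R|S) = 0.2, which are the values implied by the results 0.6, 0.44, and 0.376; the function name `propagate` is illustrative only.

```python
def propagate(p_rain, p_rain_given_rain=0.6, p_rain_given_sun=0.2):
    """One step of the two-state chain: return P(R_{t+1}) given P(R_t).

    The default transition probabilities are assumptions inferred from
    the slide's results, not values stated in the text.
    """
    p_sun = 1.0 - p_rain
    return p_rain_given_rain * p_rain + p_rain_given_sun * p_sun


p = 1.0          # initial state: P(R0) = 1
history = []
for t in range(1, 4):
    p = propagate(p)
    history.append(p)
    print(f"P(R{t}) = {p:.3f}")
# prints P(R1) = 0.600, P(R2) = 0.440, P(R3) = 0.376
```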
Markov Chain
[Figure: two-state Markov chain over the states A and B, with P(A|A) = 0.5 and P(A|B) = 1]
P(A0) = 1
– P(A1) = ???
– P(A2) = ???
– P(A3) = ???
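Markov Chain
[Figure: two-state Markov chain over the states A and B, with P(A|A) = 0.5 and P(A|B) = 1]
P(A0) = 1
– P(A1) = 0.5
– P(A2) = 0.5 * 0.5 + 1 * 0.5 = 0.75
– P(A3) = 0.5 * 0.75 + 1 * 0.25 = 0.625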
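Stationary Distribution
[Figure: two-state Markov chain over the states A and B, with P(A|A) = 0.5 and P(A|B) = 1]
P(A1000) = ???
Stationary distribution:
– P(A_t) = P(A_{t-1})
– P(A_t) = P(A_t|A_{t-1}) * P(A_{t-1}) + P(A_t|B_{t-1}) * P(B_{t-1})
– Let P(A_t) = X. Then X = 0.5 * X + 1 * (1 - X), so X = 2/3 = P(A)
– P(B) = 1 - P(A) = 1/3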
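As a quick check on the derivation above, here is a small sketch that finds the stationary probability of a two-state chain both by iterating the update until it stops changing and by solving X = a * X + b * (1 - X) in closed form. The function names are hypothetical.

```python
def stationary_by_iteration(p_a_given_a, p_a_given_b, tol=1e-12, max_iter=10_000):
    """Iterate P(A_t) = P(A|A) P(A_{t-1}) + P(A|B) P(B_{t-1}) to a fixed point."""
    x = 1.0  # start from P(A0) = 1, as on the slide
    for _ in range(max_iter):
        nxt = p_a_given_a * x + p_a_given_b * (1.0 - x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    return x


def stationary_closed_form(p_a_given_a, p_a_given_b):
    """Solve X = a X + b (1 - X) for X, giving X = b / (1 - a + b)."""
    return p_a_given_b / (1.0 - p_a_given_a + p_a_given_b)


p_a_iter = stationary_by_iteration(0.5, 1.0)
p_a_exact = stationary_closed_form(0.5, 1.0)
print(f"P(A) by iteration:   {p_a_iter:.6f}")   # 0.666667
print(f"P(A) in closed form: {p_a_exact:.6f}")  # 0.666667
```

The same two functions apply to any two-state chain once its two transition probabilities into the first state are known.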
Stationary Distribution
[Figure: two-state Markov chain over the states R and S]
Transition Probabilities
Finding the transition probabilities from observation.
– e.g., R, S, S, S, R, S, R (7 days)
Maximum likelihood:
– P(R0) = 1
– P(S|R) = 1, P(R|R) = 0
– P(S|S) = 0.5, P(R|S) = 0.5
[Figure: two-state Markov chain over R and S with unknown transition probabilities]
Transition Probabilities - Quiz
Finding the transition probabilities from observation.
– e.g., S, S, S, S, S, R, S, S, S, R, R
Maximum likelihood:
– P(R0) = ??; P(S0) = ??
– P(S|R) = ??, P(R|R) = ??
– P(S|S) = ??, P(R|S) = ??
[Figure: two-state Markov chain over R and S with unknown transition probabilities]
Transition Probabilities - Quiz
Finding the transition probabilities from observation.
– e.g., S, S, S, S, S, R, S, S, S, R, R
Maximum likelihood:
– P(R0) = 0; P(S0) = 1
– P(S|R) = 0.5, P(R|R) = 0.5
– P(S|S) = 0.75, P(R|S) = 0.25
[Figure: two-state Markov chain over R and S with unknown transition probabilities]
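The maximum likelihood estimate is just a ratio of counts: the number of times state `prev` is followed by state `nxt`, divided by the number of times `prev` is followed by anything. A minimal sketch, with the function name `ml_transitions` chosen for illustration:

```python
from collections import Counter


def ml_transitions(seq):
    """Return probs[prev][nxt] = P(nxt | prev), estimated by counting transitions."""
    pair_counts = Counter(zip(seq, seq[1:]))   # counts of (prev, nxt) pairs
    states = sorted(set(seq))
    probs = {}
    for prev in states:
        total = sum(pair_counts[(prev, nxt)] for nxt in states)
        if total == 0:
            continue  # state never has a successor, so nothing can be estimated
        probs[prev] = {nxt: pair_counts[(prev, nxt)] / total for nxt in states}
    return probs


quiz = ml_transitions("SSSSSRSSSRR")
print(quiz["S"])  # {'R': 0.25, 'S': 0.75}
print(quiz["R"])  # {'R': 0.5, 'S': 0.5}
```

Note that the last observation contributes no transition, so a sequence of 11 days yields 10 counted pairs.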
Laplacian Smoothing
Laplacian smoothing; k = 1
– P(R0) = ??; P(S0) = ??
– P(S|R) = ??, P(R|R) = ??
– P(S|S) = ??, P(R|S) = ??
[Figure: two-state Markov chain over R and S with unknown transition probabilities]
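Laplacian Smoothing
Observed sequence: R, S, S, S, S
Laplacian smoothing; k = 1 (add k to every count and k times the number of states to every total)
– P(R0) = (1 + 1) / (1 + 2) = 2/3; P(S0) = (0 + 1) / (1 + 2) = 1/3
– P(S|R) = (1 + 1) / (1 + 2) = 2/3, P(R|R) = (0 + 1) / (1 + 2) = 1/3
– P(S|S) = (3 + 1) / (3 + 2) = 4/5, P(R|S) = (0 + 1) / (3 + 2) = 1/5
[Figure: two-state Markov chain over R and S with unknown transition probabilities]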
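To make the smoothing rule concrete, the sketch below computes (count + k) / (total + k * number of states) for the sequence R, S, S, S, S, using exact fractions. The function names and the fixed state set ("R", "S") are assumptions made for this illustration.

```python
from collections import Counter
from fractions import Fraction


def smoothed_initial(seq, states=("R", "S"), k=1):
    """Smoothed P(first state), based on the single observed first day."""
    return {s: Fraction(int(seq[0] == s) + k, 1 + k * len(states)) for s in states}


def smoothed_transitions(seq, states=("R", "S"), k=1):
    """Return probs[prev][nxt] = smoothed P(nxt | prev)."""
    pair_counts = Counter(zip(seq, seq[1:]))
    probs = {}
    for prev in states:
        total = sum(pair_counts[(prev, nxt)] for nxt in states)
        probs[prev] = {
            nxt: Fraction(pair_counts[(prev, nxt)] + k, total + k * len(states))
            for nxt in states
        }
    return probs


init = smoothed_initial("RSSSS")
trans = smoothed_transitions("RSSSS")
print(init["R"], init["S"])              # 2/3 1/3
print(trans["R"]["S"], trans["R"]["R"])  # 2/3 1/3
print(trans["S"]["S"], trans["S"]["R"])  # 4/5 1/5
```

There is one observed transition out of R (to S) and three out of S (all to S), which is why the R row has denominator 3 and the S row has denominator 5.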
Hidden Markov Models
Suppose that I cannot observe the weather (e.g., I am grounded in a room with no windows), but I can make a guess based on whether my wife carries an umbrella (U):
– P(U|R) = 0.9; P(-U|R) = 0.1
– P(U|S) = 0.2; P(-U|S) = 0.8
Prior: P(R0) = 1/2, P(S0) = 1/2
– P(R1) = P(R1|R0) * P(R0) + P(R1|S0) * P(S0) = 0.4
– P(R1|U1) = P(U1|R1) * P(R1) / P(U1) = 0.9 * 0.4 / (0.9 * 0.4 + 0.2 * 0.6) = 0.75
[Figure: two-state Markov chain over R and S, where each state emits an umbrella observation U]
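The umbrella calculation is a prediction step followed by a Bayes-rule measurement update. The sketch below reproduces it, assuming the transition probabilities P(R|R) = 0.6 and P(R|S) = 0.2 (these are consistent with P(R1) = 0.4 but are not stated on this slide); the function names are illustrative.

```python
def predict(p_rain_prev, p_rain_given_rain=0.6, p_rain_given_sun=0.2):
    """Prediction: P(R_t) from P(R_{t-1}); transition values are assumed."""
    return p_rain_given_rain * p_rain_prev + p_rain_given_sun * (1.0 - p_rain_prev)


def update(p_rain_prior, saw_umbrella, p_u_given_rain=0.9, p_u_given_sun=0.2):
    """Measurement update: P(R_t | observation) by Bayes' rule."""
    like_rain = p_u_given_rain if saw_umbrella else 1.0 - p_u_given_rain
    like_sun = p_u_given_sun if saw_umbrella else 1.0 - p_u_given_sun
    numerator = like_rain * p_rain_prior
    evidence = numerator + like_sun * (1.0 - p_rain_prior)  # P(observation)
    return numerator / evidence


prior_r1 = predict(0.5)                # P(R1), about 0.4
posterior_r1 = update(prior_r1, True)  # P(R1 | U1), about 0.75
print(f"P(R1) = {prior_r1:.2f}, P(R1|U1) = {posterior_r1:.2f}")
```

Calling `predict` and `update` alternately for each new day gives a simple filter over the hidden weather state.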
Specification of an HMM
N - the number of states
– Q = {q_1, q_2, ..., q_N} - set of states
M - the number of symbols (observables)
– O = {o_1, o_2, ..., o_M} - set of symbols
A - the state transition probability matrix
– a_ij = P(q_{t+1} = j | q_t = i)
B - the observation probability distribution
– b_j(k) = P(o_t = k | q_t = j), 1 ≤ k ≤ M
π - the initial state distribution
Specification of an HMM
A full HMM is thus specified as a triplet:
– λ = (A, B, π)
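One way to hold the triplet in code is a small container that also checks the shapes and that every row is a probability distribution. This is only a sketch: the class name `HMM`, its `validate` method, and the umbrella example values (transition probabilities 0.6 and 0.2, state order R then S, symbol order U then no-U) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HMM:
    A: List[List[float]]   # A[i][j] = P(q_{t+1} = j | q_t = i), size N x N
    B: List[List[float]]   # B[j][k] = P(o_t = k | q_t = j), size N x M
    pi: List[float]        # pi[i] = initial probability of state i, size N

    def validate(self, tol: float = 1e-9) -> None:
        """Raise ValueError if shapes are inconsistent or a row does not sum to 1."""
        n = len(self.pi)
        if len(self.A) != n or any(len(row) != n for row in self.A):
            raise ValueError("A must be N x N")
        if len(self.B) != n or len({len(row) for row in self.B}) != 1:
            raise ValueError("B must be N x M")
        for row in self.A + self.B + [self.pi]:
            if abs(sum(row) - 1.0) > tol:
                raise ValueError("each distribution must sum to 1")


# Assumed umbrella model: states (R, S), symbols (U, no U).
umbrella = HMM(
    A=[[0.6, 0.4], [0.2, 0.8]],
    B=[[0.9, 0.1], [0.2, 0.8]],
    pi=[0.5, 0.5],
)
umbrella.validate()
```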
Central problems in HMM modelling
Problem 1: Evaluation
– Probability of occurrence of a particular observation sequence, O = {o_1, ..., o_k}, given the model
– P(O|λ)
– Complicated by the hidden states
– Useful in sequence classification
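The standard way to compute P(O|λ) without enumerating every hidden state sequence is the forward algorithm, which the slide does not name; the following is a minimal sketch of it. It assumes `pi` is the state distribution at the time of the first observation, and the example matrices are the assumed umbrella model (states R, S; symbols 0 = umbrella, 1 = no umbrella).

```python
def forward(A, B, pi, obs):
    """Return P(O | lambda) for a list of observation indices `obs`.

    alpha[j] holds P(o_1..o_t, q_t = j) and is updated one observation at a time.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(alpha)


# Assumed umbrella model; pi is P(R1), P(S1) from the prediction step.
A = [[0.6, 0.4], [0.2, 0.8]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.4, 0.6]
p_u1 = forward(A, B, pi, [0])
print(f"P(U1) = {p_u1:.2f}")  # P(U1) = 0.48
```

The cost is proportional to N * N * k for k observations, compared with N to the power k for brute-force enumeration.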
Central problems in HMM modelling
Problem 2: Decoding
– Optimal state sequence to produce the given observations, O = {o_1, ..., o_k}, given the model
– Requires an optimality criterion
– Useful in recognition problems
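The slide leaves the optimality criterion open. A common choice is the single most probable state sequence, which the Viterbi algorithm finds by dynamic programming. The sketch below assumes that criterion and the assumed umbrella model (states 0 = R, 1 = S; symbols 0 = umbrella, 1 = no umbrella).

```python
def viterbi(A, B, pi, obs):
    """Return (most probable state path, its joint probability with obs)."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]  # best score ending in state i
    backpointers = []
    for o in obs[1:]:
        new_delta, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] * A[i][j])
            pointers.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][o])
        delta = new_delta
        backpointers.append(pointers)
    # Trace back from the best final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


A = [[0.6, 0.4], [0.2, 0.8]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
path, prob = viterbi(A, B, pi, [0, 0, 1])  # umbrella, umbrella, no umbrella
print(path)  # [0, 0, 1], i.e. rain, rain, sun
```

For long sequences the products underflow, so practical implementations usually work with log probabilities instead.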
Central problems in HMM modelling
Problem 3: Learning
– Determine the optimum model, given a training set of observations
– Find λ such that P(O|λ) is maximal
Particle Filter
Particle Filters
Sensor Information: Importance Sampling
Robot Motion
Sensor Information: Importance Sampling
Robot Motion
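Particle Filter Algorithm
1. Algorithm particle_filter(S_{t-1}, u_{t-1}, z_t):
2.   S_t = ∅, η = 0
3.   For i = 1, ..., n (generate new samples)
4.     Sample index j(i) from the discrete distribution given by w_{t-1}
5.     Sample x_t^i from p(x_t | x_{t-1}, u_{t-1}) using x_{t-1}^{j(i)} and u_{t-1}
6.     Compute importance weight: w_t^i = p(z_t | x_t^i)
7.     Update normalization factor: η = η + w_t^i
8.     Insert: S_t = S_t ∪ {<x_t^i, w_t^i>}
9.   For i = 1, ..., n
10.    Normalize weights: w_t^i = w_t^i / η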
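The steps of the algorithm can be sketched in a few lines of Python. Everything specific to the example is an assumption made for illustration: the 1-D robot, the Gaussian motion noise (standard deviation 0.5), the Gaussian measurement model (standard deviation 1), and the helper names `sample_motion` and `likelihood`.

```python
import math
import random


def particle_filter(particles, weights, u, z, sample_motion, likelihood, rng):
    """One update: resample by old weights, move, reweight, normalize."""
    n = len(particles)
    # Step 4: sample indices j(i) from the discrete distribution given by w_{t-1}.
    indices = rng.choices(range(n), weights=weights, k=n)
    new_particles, new_weights = [], []
    eta = 0.0
    for j in indices:
        x = sample_motion(particles[j], u, rng)  # step 5: sample from motion model
        w = likelihood(z, x)                     # step 6: importance weight
        eta += w                                 # step 7: normalization factor
        new_particles.append(x)                  # step 8: insert
        new_weights.append(w)
    if eta == 0.0:
        raise ValueError("all importance weights are zero")
    return new_particles, [w / eta for w in new_weights]  # steps 9-10


def sample_motion(x, u, rng):
    """Hypothetical motion model: move by u plus Gaussian noise."""
    return x + u + rng.gauss(0.0, 0.5)


def likelihood(z, x):
    """Hypothetical sensor model: unnormalized Gaussian around the particle."""
    return math.exp(-0.5 * (z - x) ** 2)


rng = random.Random(0)
n = 500
particles = [rng.uniform(0.0, 20.0) for _ in range(n)]
weights = [1.0 / n] * n
true_x = 5.0
for _ in range(10):
    true_x += 1.0                          # the robot moves one unit per step
    z = true_x + rng.gauss(0.0, 1.0)       # noisy position measurement
    particles, weights = particle_filter(
        particles, weights, 1.0, z, sample_motion, likelihood, rng
    )
estimate = sum(w * x for w, x in zip(weights, particles))
print(f"true position {true_x:.2f}, estimate {estimate:.2f}")
```

Drawing all n indices in one `choices` call is equivalent to sampling j(i) inside the loop and avoids rebuilding the cumulative weights for every particle.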
Particle Filters
Pros:
– Easy to implement
– Work well in many applications
Cons:
– Do not work well in high-dimensional spaces
– Problems with degenerate conditions:
  – very few particles, and
  – not much noise in either measurements or controls.