1
CS626-449: NLP, Speech and Web-Topics-in-AI. Pushpak Bhattacharyya, CSE Dept., IIT Bombay. Lecture 38-39: Baum-Welch Algorithm; HMM training
2
Training a Hidden Markov Model (not structure learning, i.e., the structure of the HMM is pre-given). This involves:
- Learning the probability values ONLY
- Correspondence with PCFG: not learning the production rules but the probabilities associated with them
- The training algorithm for PCFG is called the Inside-Outside algorithm; for HMMs it is the Baum-Welch algorithm
3
Key Intuition
Given: a training sequence.
Initialization: probability values.
Compute: Pr(state seq | training seq), then get the expected counts of transitions, then compute the rule probabilities.
Approach: initialize the probabilities and recompute them iteratively; an EM-like approach.
[Figure: a two-state HMM with states q and r; edges labeled with output symbols a and b]
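This is the Expectation-Maximization scheme. In standard EM notation (mine, not the slide's), with $\theta$ the transition/emission probabilities, $W$ the training sequence, and $S$ a hidden state sequence, each iteration computes

$$\theta^{(t+1)} = \arg\max_{\theta} \sum_{S} P(S \mid W; \theta^{(t)}) \, \log P(W, S; \theta)$$

The E-step evaluates $P(S \mid W; \theta^{(t)})$ to get expected transition counts; the M-step re-normalizes those counts into new probabilities.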
4
Building blocks: Probabilities to be used
1. [Figure: trellis of states S_1, S_2, ..., S_n, S_{n+1} emitting output symbols W_1, W_2, ..., W_{n-1}, W_n]
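A plausible reconstruction of the probability this trellis defines, assuming the standard HMM factorization used in the rest of the lecture (the slide's own formula was not captured):

$$P(W_{1,n}, S_{1,n+1}) = \prod_{t=1}^{n} P(S_{t+1}, W_t \mid S_t), \qquad P(W_{1,n}) = \sum_{S_{1,n+1}} P(W_{1,n}, S_{1,n+1})$$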
5
Probabilities to be used, contd.
2. Exercise 1: Prove the following:
6
Start of the Baum-Welch algorithm
String = aab aaa aab aaa
[Figure: the output sequence (symbols a, b) aligned with the corresponding sequence of states q, r]
7
Calculating probabilities from a table of counts (T = #states, A = #alphabet symbols):

Src   Dest   O/P   Count
q     r      a     5
q     q      b     3
r     q      a     3
r     q      b     2

If the transitions are non-deterministic, then multiple state sequences are possible for a given output sequence (cf. the previous slide's figure). Our aim is to find the expected counts in that case.
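From such a table, the probabilities are just the counts normalized per source state, e.g. P(q emits a and goes to r) = 5 / (5 + 3). A minimal Python sketch of this normalization (the dictionary layout is my own, not from the lecture):

```python
from collections import defaultdict

# (src, dest, output) -> count, taken from the table above
counts = {
    ('q', 'r', 'a'): 5,
    ('q', 'q', 'b'): 3,
    ('r', 'q', 'a'): 3,
    ('r', 'q', 'b'): 2,
}

# Normalize per source state: P(src -> dest, out) = count / total out of src
totals = defaultdict(float)
for (src, _, _), c in counts.items():
    totals[src] += c

probs = {k: c / totals[k[0]] for k, c in counts.items()}
print(probs)  # e.g. P(q -> r, a) = 5 / (5 + 3) = 0.625
```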
8
Interplay Between Two Equations

$$P(s_i \xrightarrow{w_k} s_j) = \frac{C(s_i \xrightarrow{w_k} s_j)}{\sum_{j', k'} C(s_i \xrightarrow{w_{k'}} s_{j'})}$$

$$C(s_i \xrightarrow{w_k} s_j) = \sum_{S} P(S \mid W) \; n(s_i \xrightarrow{w_k} s_j; S)$$

where $n(\cdot)$ is the number of times the transition $s_i \to s_j$ emitting $w_k$ occurs in the state sequence $S$ for the string $W$.
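A brute-force sketch of this interplay in Python: enumerate every state sequence for the observed string, weight each sequence's transition counts by its path probability, then re-normalize. This is a deliberately naive E-step for illustration; the function and variable names are mine:

```python
from collections import defaultdict
from itertools import product

def expected_counts(obs, states, probs, start):
    """E-step: expected count of each (src, dst, out) transition.

    probs maps (src, dst, out) -> probability. Enumerates all state
    sequences; exponential in len(obs), for illustration only.
    """
    counts = defaultdict(float)
    # A path visits len(obs) + 1 states, beginning in the start state.
    for path in product(states, repeat=len(obs)):
        seq = (start,) + path
        p = 1.0
        for t, out in enumerate(obs):
            p *= probs.get((seq[t], seq[t + 1], out), 0.0)
        for t, out in enumerate(obs):
            counts[(seq[t], seq[t + 1], out)] += p
    return counts

def reestimate(counts):
    """M-step: normalize expected counts per source state."""
    totals = defaultdict(float)
    for (src, _, _), c in counts.items():
        totals[src] += c
    return {k: c / totals[k[0]] for k, c in counts.items() if totals[k[0]] > 0}
```

Note that weighting by the joint P(S, W) instead of the posterior P(S | W) is harmless here: the missing 1/P(W) factor is constant across sequences and cancels in the per-state normalization.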
9
Learning probabilities
[Figure: two two-state HMMs over states q and r. Actual (desired) HMM: edges labeled a:0.67, b:0.17, a:0.16, b:1.0. Initial guess: edges labeled a:0.4, a:0.48, b:0.48, b:1.0]
10
One run of the Baum-Welch algorithm: string ababa
(* is considered as the starting and ending symbol of the input sequence string)

State seq             P(path)    Expected counts of the individual transitions (weighted by P(path))
qrqrqq                0.00077    0.00154, 0, 0.00077
qrqqqq                0.00442    0.00884
qqqrqq                0.00442    0.00884
qqqqqq                0.02548    0.0, 0.000, 0.05096, 0.07644
Rounded Total         0.035      0.01, 0.06, 0.095
New Probabilities (P)            0.06 (= 0.01 / (0.01 + 0.06 + 0.095)), 1.0, 0.36, 0.581

This way, through multiple iterations, the probability values will converge.
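Assuming the expected_counts/reestimate sketch from the earlier slide, a hypothetical driver that reproduces this kind of run; the initial values here are arbitrary, not the slide's:

```python
# Hypothetical driver: iterate E- and M-steps until the values stop moving.
states = ('q', 'r')
start = 'q'
obs = "ababa"
probs = {  # an arbitrary non-symmetric initial guess
    ('q', 'q', 'a'): 0.3, ('q', 'q', 'b'): 0.4,
    ('q', 'r', 'a'): 0.2, ('q', 'r', 'b'): 0.1,
    ('r', 'q', 'a'): 0.5, ('r', 'q', 'b'): 0.5,
}
for it in range(50):
    new = reestimate(expected_counts(obs, states, probs, start))
    if all(abs(new.get(k, 0.0) - p) < 1e-9 for k, p in probs.items()):
        break  # converged
    probs = new
print(it, probs)
```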
11
Applying Naïve Bayes. Hence multiplying the transition probabilities is valid.
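The derivation behind this claim, written out under the standard assumptions (a reconstruction, since the slide's equations were not captured): by the chain rule, with $S_1$ a fixed start state,

$$P(S_{1,n+1}, W_{1,n}) = \prod_{t=1}^{n} P(S_{t+1}, W_t \mid S_{1,t}, W_{1,t-1}) \approx \prod_{t=1}^{n} P(S_{t+1}, W_t \mid S_t)$$

where the approximation conditions each factor only on the current state, exactly the Naïve-Bayes-style independence that licenses multiplying the transition probabilities.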
12
Discussions
1. Symmetry breaking: a symmetric initialization leads to no change in the initial values under re-estimation (see the example figure below).
2. Getting stuck in local maxima.
3. Label bias problem: probabilities have to sum to 1, so values can rise only at the cost of a fall in the values of others.
[Figure: two three-state HMMs. Desired: edges labeled b:1.0, b:0.5, a:0.5, a:1.0. Initialized (symmetric): edges labeled a:0.5, b:0.5, a:0.25, a:0.5, b:0.5, a:0.25, b:0.25, b:0.5]
13
Computational part
Exercise 2: What is the complexity of calculating the above expression?
Hint: to find this, first solve Exercise 1, i.e., understand how the probability of a given string can be represented.
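For contrast with the exponential enumeration used earlier, the standard dynamic-programming route (the forward algorithm) computes the same string probability in O(n * T^2) time for n symbols and T states. A minimal sketch consistent with the (src, dst, out) probability layout used above; this is the textbook construction the exercise points toward, not something shown on the slide:

```python
def forward_prob(obs, states, probs, start):
    """P(obs) via the forward algorithm: O(len(obs) * |states|^2)."""
    # alpha[s] = probability of emitting the prefix so far and being in s
    alpha = {s: (1.0 if s == start else 0.0) for s in states}
    for out in obs:
        alpha = {
            dst: sum(alpha[src] * probs.get((src, dst, out), 0.0)
                     for src in states)
            for dst in states
        }
    return sum(alpha.values())
```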