Expectation-Maximization Algorithm


1 Expectation-Maximization Algorithm
M.B.Chandak

2 Principle-EM Algorithm
Principle: maximum likelihood estimation (MLE) from data. The algorithm operates on a parallel corpus, for example an English-Hindi sentence-aligned corpus. It aims to find the maximum likelihood estimate of the translation probability t for each pair of words (for example, t(Green|हरा)), to be used for machine translation. In the following example, English and Hindi are used as the source and target languages. Let Es represent the English corpus and Hs the Hindi corpus.

3 Implementation: It is an iterative algorithm.
The two steps are: computing the probability of each word alignment and generating the expected counts of these alignments [E-step], then re-estimating the translation probabilities from those counts [M-step]. Initially, a uniform probability is assigned to all alignments.

4 Example: English-Hindi sentence pairs
Green House = हरा घर
The House = यह घर

Uniform probability table:
t(Green|हरा)=1/3  t(house|हरा)=1/3  t(the|हरा)=1/3
t(Green|घर)=1/3  t(house|घर)=1/3  t(the|घर)=1/3
t(Green|यह)=1/3  t(house|यह)=1/3  t(the|यह)=1/3
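The uniform starting table can be sketched in a few lines of Python. This is only an illustration: the names `corpus` and `uniform_table` are hypothetical, and lowercasing the English words is an assumption made so that "House" and "house" count as the same word.

```python
from fractions import Fraction

# The two sentence pairs from the example: (English words, Hindi words).
corpus = [
    (["Green", "House"], ["हरा", "घर"]),
    (["The", "House"], ["यह", "घर"]),
]

def uniform_table(corpus):
    """Return t[(e, h)] = 1 / (number of distinct English words) for every pair."""
    english = sorted({e.lower() for es, _ in corpus for e in es})
    hindi = sorted({h for _, hs in corpus for h in hs})
    p = Fraction(1, len(english))
    return {(e, h): p for e in english for h in hindi}

t = uniform_table(corpus)
print(t[("green", "हरा")])  # 1/3
```

With three English words, each of the nine entries is 1/3, so the entries for each Hindi word sum to 1.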

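5 Example: compute P(a, e|h) by multiplying all "t" probabilities
Green House / हरा घर, alignment Green-हरा, House-घर: 1/3 * 1/3 = 1/9
Green House / हरा घर, alignment Green-घर, House-हरा: 1/3 * 1/3 = 1/9
The House / यह घर, alignment The-यह, House-घर: 1/3 * 1/3 = 1/9
The House / यह घर, alignment The-घर, House-यह: 1/3 * 1/3 = 1/9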
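As a rough sketch of this step, the code below enumerates the alignments of one sentence pair and multiplies the "t" values along each. It assumes one-to-one alignments only, which is what the two products per sentence pair suggest. The function name `alignment_probs` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def alignment_probs(english, hindi, t):
    """Map each one-to-one alignment to P(a, e|h), the product of its t values."""
    probs = {}
    for perm in permutations(hindi):
        pairs = tuple(zip(english, perm))
        p = Fraction(1)
        for e, h in pairs:
            p *= t[(e, h)]
        probs[pairs] = p
    return probs

# Uniform table from the example: every t value is 1/3.
third = Fraction(1, 3)
t = {(e, h): third for e in ("green", "house", "the") for h in ("हरा", "घर", "यह")}

probs = alignment_probs(["green", "house"], ["हरा", "घर"], t)
for pairs, p in probs.items():
    print(pairs, p)
```

Both alignments of "Green House" come out as 1/9, and the same call on "The House" gives the other two products.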

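6 Re-calculating values
[Figure: alignment diagrams for Green House / हरा घर and The House / यह घर]
Normalize P(a, e|h) over the alignments of each sentence pair: (1/9) / (1/9 + 1/9) = 1/2 for each of the four alignments.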

7 Calculate "tcounts" = tc
tc(Green|हरा)=1/2  tc(house|हरा)=1/2  tc(the|हरा)=0  TOTAL(हरा)=1
tc(Green|घर)=1/2  tc(house|घर)=[1/2+1/2]=1  tc(the|घर)=1/2  TOTAL(घर)=2
tc(Green|यह)=0  tc(house|यह)=1/2  tc(the|यह)=1/2  TOTAL(यह)=1
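The counts above can be reproduced with a short sketch. It assumes one-to-one alignments, and that each alignment contributes its normalized probability (here 1/2) to the count of every word pair it links. The name `expected_counts` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations

corpus = [(["green", "house"], ["हरा", "घर"]), (["the", "house"], ["यह", "घर"])]

def expected_counts(corpus, t):
    tc = defaultdict(Fraction)     # tc[(e, h)]: expected count of e aligned to h
    total = defaultdict(Fraction)  # total[h]: sum of tc over all English words
    for english, hindi in corpus:
        alignments = [tuple(zip(english, perm)) for perm in permutations(hindi)]
        joint = []
        for a in alignments:
            p = Fraction(1)
            for pair in a:
                p *= t[pair]
            joint.append(p)
        z = sum(joint)  # normalizer over this sentence pair's alignments
        for a, p in zip(alignments, joint):
            for e, h in a:
                tc[(e, h)] += p / z
                total[h] += p / z
    return tc, total

third = Fraction(1, 3)
t = {(e, h): third for e in ("green", "house", "the") for h in ("हरा", "घर", "यह")}
tc, total = expected_counts(corpus, t)
print(tc[("house", "घर")], total["घर"])  # 1 2
```

Pairs that never occur together in a sentence, such as (the, हरा), are never touched and read back as 0.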

8 M-Step
Divide each tcount by the TOTAL for its Hindi word:
t(Green|हरा)=(1/2)/1  t(house|हरा)=(1/2)/1  t(the|हरा)=0/1  TOTAL(हरा)=1
t(Green|घर)=(1/2)/2  t(house|घर)=[1/2+1/2]/2  t(the|घर)=(1/2)/2  TOTAL(घर)=2
t(Green|यह)=0/1  t(house|यह)=(1/2)/1  t(the|यह)=(1/2)/1  TOTAL(यह)=1

Revised probability table:
t(Green|हरा)=1/2  t(house|हरा)=1/2  t(the|हरा)=0
t(Green|घर)=1/4  t(house|घर)=1/2  t(the|घर)=1/4
t(Green|यह)=0  t(house|यह)=1/2  t(the|यह)=1/2
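The normalization in this step can be sketched as below. The `tc` dictionary simply hard-codes the tcounts from the previous slide, and `m_step` is a hypothetical helper name.

```python
from fractions import Fraction

half = Fraction(1, 2)
# Expected counts from the previous slide, keyed as (English word, Hindi word).
tc = {
    ("green", "हरा"): half, ("house", "हरा"): half, ("the", "हरा"): Fraction(0),
    ("green", "घर"): half, ("house", "घर"): half + half, ("the", "घर"): half,
    ("green", "यह"): Fraction(0), ("house", "यह"): half, ("the", "यह"): half,
}

def m_step(tc):
    """Divide every count by the total count for its Hindi word."""
    total = {}
    for (e, h), c in tc.items():
        total[h] = total.get(h, 0) + c
    return {(e, h): c / total[h] for (e, h), c in tc.items()}

t = m_step(tc)
print(t[("green", "घर")], t[("house", "घर")], t[("green", "यह")])  # 1/4 1/2 0
```

After normalization the three entries for each Hindi word sum to 1 again.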

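9 E-step: Part 2: Identifying higher probability phrase
Compute P(a, e|h) by multiplying all "t" probabilities:
Green House / हरा घर, alignment Green-हरा, House-घर: 1/2 * 1/2 = 1/4
The House / यह घर, alignment The-यह, House-घर: 1/2 * 1/2 = 1/4
Green House / हरा घर, alignment Green-घर, House-हरा: 1/4 * 1/2 = 1/8
The House / यह घर, alignment The-घर, House-यह: 1/4 * 1/2 = 1/8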

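10 Further: The process continues to iterate, with each E-step followed by an M-step. After the first iteration, the alignment probabilities have changed from 1/9 to 1/4 for the alignments that pair House with घर, and from 1/9 to 1/8 for the other two alignments.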
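The whole iteration can be put together in a minimal sketch, assuming one-to-one alignments and the two sentence pairs of the example. The names `joint` and `em_iteration` are hypothetical, and the choice of five iterations is arbitrary.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations

corpus = [(["green", "house"], ["हरा", "घर"]), (["the", "house"], ["यह", "घर"])]

def joint(alignment, t):
    """P(a, e|h): product of the t values along one alignment."""
    p = Fraction(1)
    for pair in alignment:
        p *= t.get(pair, Fraction(0))
    return p

def em_iteration(corpus, t):
    tc = defaultdict(Fraction)
    total = defaultdict(Fraction)
    # E-step: weight each alignment by its normalized probability.
    for english, hindi in corpus:
        alignments = [tuple(zip(english, perm)) for perm in permutations(hindi)]
        z = sum(joint(a, t) for a in alignments)
        for a in alignments:
            weight = joint(a, t) / z
            for e, h in a:
                tc[(e, h)] += weight
                total[h] += weight
    # M-step: renormalize the expected counts per Hindi word.
    return {(e, h): c / total[h] for (e, h), c in tc.items()}

english_vocab = {e for es, _ in corpus for e in es}
hindi_vocab = {h for _, hs in corpus for h in hs}
t = {(e, h): Fraction(1, len(english_vocab)) for e in english_vocab for h in hindi_vocab}

history = []
for _ in range(5):
    t = em_iteration(corpus, t)
    history.append(t[("house", "घर")])
print([float(x) for x in history])
```

In this sketch t(house|घर) starts at 1/2 after the first iteration and moves toward 1 as the iterations continue, which is how the preferred alignment becomes dominant.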

