CS224N Section 2: PA2 & EM. Shrey Gupta, January 21, 2011.


1 CS224N Section 2: PA2 & EM
Shrey Gupta
January 21, 2011

2 Outline for today
Interactive session!
Brief review of MT
Examples
Brief EM review

3 Statistical Machine Translation
P(e|f) = P(f|e) * P(e) / P(f)
max_e P(e|f) = max_e P(f|e) * P(e), since P(f) does not depend on e
Language models, P(e), help alleviate shortcomings of the translation model P(f|e)
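
The maximization above can be illustrated with a toy sketch. The candidate translations and all probability values below are made-up numbers chosen only for illustration, and the names `candidates` and `noisy_channel_score` are hypothetical, not part of PA2.

```python
# Toy noisy-channel scoring: choose the English candidate e that maximizes
# P(f|e) * P(e). P(f) is the same for every candidate, so it is dropped.
# ASSUMPTION: every number below is invented for illustration.
candidates = {
    "the house": {"tm": 0.30, "lm": 0.020},   # tm = P(f|e), lm = P(e)
    "house the": {"tm": 0.30, "lm": 0.001},   # same words, poor English order
    "the home":  {"tm": 0.10, "lm": 0.015},
}


def noisy_channel_score(entry):
    """Return P(f|e) * P(e) for one candidate."""
    return entry["tm"] * entry["lm"]


best = max(candidates, key=lambda e: noisy_channel_score(candidates[e]))
print(best)
```

Here "the house" and "house the" get the same translation-model score, and only the language model separates them, which is the sense in which P(e) compensates for weaknesses of P(f|e).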

4 Concepts
Translation probabilities (t)
Distortion probabilities (d)
Fertility (φ)
NULL

5 PA2 Requirements
Naïve model
IBM Model 1
IBM Model 2
Integration with the decoder

6 IBM Model 1
Simplest of the IBM models
Does not consider word order (bag-of-words approach)
Does not model one-to-many alignments
Computationally inexpensive
Useful for producing parameter estimates that are passed on to more elaborate models

7 IBM Model 1 We only learn the translation probabilities.

8 IBM Model 1 Steps
Initialize the probabilities uniformly.
E-step
M-step
Calculate
Repeat until convergence.
Let’s do an example.
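
The steps above can be sketched roughly as follows. This is a minimal sketch of the standard Model 1 EM loop, not the PA2 starter code: the function name `train_model1`, the `<NULL>` token string, the fixed iteration count (used here in place of a convergence test), and the three-sentence toy corpus are all assumptions made for illustration.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """Estimate t(f|e) with EM. pairs: list of (english_tokens, foreign_tokens).

    Returns a dict-like t with t[(f, e)] = t(f|e).
    """
    null = "<NULL>"  # hypothetical NULL token, prepended to every English side
    pairs = [([null] + list(es), list(fs)) for es, fs in pairs]
    f_vocab = {f for _, fs in pairs for f in fs}

    # Initialize the translation probabilities uniformly.
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)   # fractional counts for (f, e)
        total = defaultdict(float)   # fractional counts for e
        # E-step: spread each foreign word over the English words in
        # its sentence, in proportion to the current t(f|e).
        for es, fs in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: renormalize the fractional counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


toy_corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
t = train_model1(toy_corpus)
print(round(t[("das", "the")], 3), round(t[("Haus", "the")], 3))
```

On this toy corpus, "das" co-occurs with "the" in two sentences while "Haus" co-occurs with it in only one, so t(das|the) ends up larger than t(Haus|the) even though every pairing started out equally likely.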

9 IBM Model 2 In Model 2 we learn both translation probabilities and distortion probabilities.

10 IBM Model 2
IBM Model 2 tries to learn the alignment probabilities in addition to the translation probabilities.
The alignment probabilities are handled at an abstract level, by grouping alignment pairs into buckets.
Let the number of buckets be N (indexed from 0 to N-1).
For each alignment pair, let n be the integer computed from that pair; the pair is placed in bucket n if n < N-1, and in the last bucket (index N-1) otherwise.
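
A minimal sketch of the bucketing rule follows. The transcript does not preserve how n is computed from a pair, so the absolute displacement |i - j| used below is purely an assumption for illustration, as is the function name `bucket_index`.

```python
def bucket_index(i, j, num_buckets):
    """Map an alignment pair (i, j) to a bucket index in 0..num_buckets-1.

    ASSUMPTION: n = |i - j| (displacement between the English position i
    and the foreign position j). The original definition of n is not
    available, so substitute whatever your model uses.
    """
    n = abs(i - j)
    # Small values get their own bucket; everything else shares the last one.
    return n if n < num_buckets - 1 else num_buckets - 1


print([bucket_index(0, j, 5) for j in range(8)])
```

Capping at the last bucket keeps the number of distortion parameters fixed at N regardless of sentence length.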

11 IBM Model 2
In Model 2, during each EM iteration we also collect fractional counts for each bucket and then normalize them so that the bucket weights form a true probability distribution.
Many implementations are possible:
Variable number of buckets
Signed buckets
Hand-fixed weights
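
One way to picture the bucket update is the sketch below, which does a single pass over the data: it collects fractional bucket counts from the alignment posteriors and then normalizes them. The names `bucket_of` and `reestimate_buckets`, the |i - j| bucketing, the omission of NULL, and the toy probabilities are all assumptions for illustration rather than the required PA2 design.

```python
from collections import defaultdict


def bucket_of(i, j, num_buckets):
    # ASSUMPTION: bucket by displacement |i - j|, capped at the last bucket.
    return min(abs(i - j), num_buckets - 1)


def reestimate_buckets(pairs, t, d, num_buckets):
    """One EM pass over the bucket probabilities d (a list of length num_buckets).

    pairs: list of (english_tokens, foreign_tokens); t[(f, e)] = t(f|e).
    """
    counts = [0.0] * num_buckets
    for es, fs in pairs:
        for j, f in enumerate(fs):
            # Posterior over which English position i generated f,
            # proportional to t(f|e_i) * d[bucket(i, j)].
            weights = [t[(f, e)] * d[bucket_of(i, j, num_buckets)]
                       for i, e in enumerate(es)]
            z = sum(weights)
            if z == 0.0:
                continue  # no English word can explain f under the current t
            for i, w in enumerate(weights):
                counts[bucket_of(i, j, num_buckets)] += w / z
    total = sum(counts)
    if total == 0.0:
        return list(d)  # nothing collected; keep the old distribution
    # Normalize the fractional counts into a probability distribution.
    return [c / total for c in counts]


# Toy inputs (made-up numbers).
toy_t = defaultdict(float, {
    ("das", "the"): 0.9, ("das", "house"): 0.1,
    ("Haus", "house"): 0.9, ("Haus", "the"): 0.1,
})
new_d = reestimate_buckets([(["the", "house"], ["das", "Haus"])], toy_t, [0.5, 0.5], 2)
print(new_d)
```

In a full implementation the same posteriors would also feed the fractional counts for t, so that both parameter sets are updated in the same iteration.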

12 EM Revisited
Similar to k-means
Soft counts vs. hard counts
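
The contrast between soft and hard counts can be shown with a small sketch. The posterior values and the function names `soft_counts` and `hard_counts` are invented for illustration.

```python
def soft_counts(posterior):
    """EM-style: each candidate receives a fractional count equal to its posterior."""
    return dict(posterior)


def hard_counts(posterior):
    """k-means-style: the single most probable candidate receives a count of 1."""
    best = max(posterior, key=posterior.get)
    return {k: (1.0 if k == best else 0.0) for k in posterior}


# ASSUMPTION: made-up posterior over which English word generated one foreign word.
posterior = {"the": 0.7, "house": 0.2, "<NULL>": 0.1}
print(soft_counts(posterior))
print(hard_counts(posterior))
```

With hard counts the 0.2 and 0.1 of evidence for the other candidates is discarded, just as k-means assigns each point entirely to its nearest cluster; EM keeps that evidence as fractional counts.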

13 Tips
Start early
Read Knight’s tutorial
Plan your approach before you start

14 Questions?

