Bayesian Word Alignment for Statistical Machine Translation. Authors: Coskun Mermer, Murat Saraclar. Presented by Jun Lang, 2011-10-13, I2R SMT-Reading Group.


1 Bayesian Word Alignment for Statistical Machine Translation
Authors: Coskun Mermer, Murat Saraclar
Presented by Jun Lang, 2011-10-13, I2R SMT-Reading Group

2 Paper info
Bayesian Word Alignment for Statistical Machine Translation
ACL 2011 short paper
Source code available in Perl (379 lines)
Authors
–Coskun Mermer
–Murat Saraclar

3 Core Idea
Propose a Gibbs sampler for fully Bayesian inference in IBM Model 1
Results
–Outperforms classical EM by up to 2.99 BLEU points
–Effectively addresses the rare word problem
–Yields a much smaller phrase table than EM

4 Mathematics
–$(E, F)$: parallel corpus
–$e_i$, $f_j$: the $i$-th source word of sentence $e$ and the $j$-th target word of sentence $f$; $e$ contains $I$ words and $f$ contains $J$ words
–$e_0$: the "null" word, present in every sentence of $E$
–$V_E$, $V_F$: sizes of the source and target vocabularies
–$a$, $A$: alignment for a sentence and for the whole corpus
–$a_j$: index of the source word $e_{a_j}$ to which $f_j$ is aligned
–$T$: parameter table of size $V_E \times V_F$
–$t_{e,f} = P(f \mid e)$: word translation probability

5 IBM Model 1 T as a random variable
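The equations on this slide did not survive in the transcript. As a reminder, and only as a sketch of the standard IBM Model 1 form rather than the slide's exact content, the joint probability of a target sentence and its alignment can be written with the notation of slide 4. The constant $\epsilon$ (a sentence-length normalization term) is an assumption borrowed from the usual textbook formulation and is not named in the slides.

```latex
% Standard IBM Model 1 for one sentence pair (e, f) with |e| = I, |f| = J.
% \epsilon is an assumed length-normalization constant, not taken from the slides.
P(f, a \mid e, T) = \frac{\epsilon}{(I+1)^{J}} \prod_{j=1}^{J} t_{e_{a_j},\, f_j},
\qquad a_j \in \{0, 1, \dots, I\}
```

In classical EM, $T$ is a fixed but unknown parameter to be estimated; treating $T$ as a random variable means placing a prior on it and inferring a posterior instead.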

6 Dirichlet Distribution
Each row $t_e$ of $T = \{t_{e,f}\}$ is a multinomial distribution, which belongs to the exponential family
We choose its conjugate prior, the Dirichlet distribution, for computational convenience

7 Dirichlet Distribution
For each source word type $e$, $t_e$ is a distribution over the target vocabulary and is given a Dirichlet prior
The prior keeps rare words from acting as "garbage collectors"
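The two Dirichlet slides can be summarized in a short generative sketch. The hyperparameter symbol $\theta_e$ and the count symbol $N_{e,f}$ (the number of times target word $f$ is aligned to source word $e$ in $A$) are names assumed here for illustration; the transcript does not give them.

```latex
% Assumed notation: \theta_e = Dirichlet hyperparameters for source word e,
% N_{e,f} = number of alignment links between e and f in A.
t_e \sim \mathrm{Dirichlet}(\theta_e), \qquad
f_j \mid a_j, e, T \sim \mathrm{Multinomial}(t_{e_{a_j}})

% Conjugacy: the posterior of each row is again a Dirichlet.
P(t_e \mid E, F, A) \;\propto\; \prod_{f=1}^{V_F} t_{e,f}^{\,N_{e,f} + \theta_{e,f} - 1}
```

Conjugacy is what makes the choice computationally convenient: updating the prior with observed alignments only requires adding counts to the hyperparameters.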

8 Dirichlet Distribution
Sample the unknowns $A$ and $T$ in turn
The notation $\neg j$ denotes the exclusion of the current value of $a_j$
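The sampling formula itself is missing from the transcript. The $\neg j$ notation suggests a collapsed update in which $T$ is integrated out and each $a_j$ is resampled given all other alignments. Under that assumption, and with assumed symbols $\theta_{e,f}$ for the Dirichlet hyperparameters and $N^{\neg j}_{e,f}$ for link counts that exclude the current $a_j$, the standard Dirichlet-multinomial predictive distribution gives:

```latex
% Assumed collapsed Gibbs update; \theta and N^{\neg j} are illustrative names.
P(a_j = i \mid E, F, A^{\neg j}; \theta) \;\propto\;
\frac{N^{\neg j}_{e_i, f_j} + \theta_{e_i, f_j}}
     {\sum_{f=1}^{V_F} \left( N^{\neg j}_{e_i, f} + \theta_{e_i, f} \right)}
```

Each candidate source position $i \in \{0, \dots, I\}$ is scored by how often $e_i$ is already linked to $f_j$ elsewhere in the corpus, smoothed by the prior.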

9 Algorithm
The initial alignment $A$ can be arbitrary, but initializing from standard EM output works better
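The algorithm listing is not preserved in the transcript. What follows is a minimal Python sketch of a collapsed Gibbs sampler for IBM Model 1, not the authors' `bayesalign.pl`. It assumes a symmetric Dirichlet prior with a single hyperparameter `theta`, a random initial alignment (the "arbitrary" option), and it simply returns the last sample rather than averaging over samples. The names `gibbs_align`, `NULL`, `pair_count`, and `src_count` are hypothetical.

```python
import random
from collections import defaultdict

NULL = "<null>"  # hypothetical token standing in for the null word e_0


def gibbs_align(corpus, theta=0.01, iterations=50, seed=0):
    """Sketch of collapsed Gibbs sampling for IBM Model 1.

    corpus: list of (source_words, target_words) pairs.
    Returns one alignment per sentence pair; a[j] indexes [NULL] + source_words.
    """
    rng = random.Random(seed)
    v_f = len({f for _, fs in corpus for f in fs})  # target vocabulary size
    sources = [[NULL] + list(es) for es, _ in corpus]

    pair_count = defaultdict(int)  # N[e, f]: links between e and f
    src_count = defaultdict(int)   # sum over f of N[e, f]

    # Arbitrary (random) initial alignment; EM output could be used instead.
    alignments = []
    for es, (_, fs) in zip(sources, corpus):
        a = [rng.randrange(len(es)) for _ in fs]
        for j, f in enumerate(fs):
            pair_count[es[a[j]], f] += 1
            src_count[es[a[j]]] += 1
        alignments.append(a)

    for _ in range(iterations):
        for es, (_, fs), a in zip(sources, corpus, alignments):
            for j, f in enumerate(fs):
                # Remove the current link so the counts exclude a_j.
                old = es[a[j]]
                pair_count[old, f] -= 1
                src_count[old] -= 1
                # Score every source position, smoothed by the prior.
                weights = [
                    (pair_count[e, f] + theta) / (src_count[e] + theta * v_f)
                    for e in es
                ]
                a[j] = rng.choices(range(len(es)), weights=weights)[0]
                # Add the newly sampled link back into the counts.
                new = es[a[j]]
                pair_count[new, f] += 1
                src_count[new] += 1
    return alignments
```

A call such as `gibbs_align([(["the", "house"], ["das", "haus"])])` returns a list with one alignment per sentence pair, where index 0 means "aligned to the null word". The nested loop over every target word in every sentence on every iteration also illustrates why the approach is slow on larger corpora.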

10 Results


13 Code View bayesalign.pl

14 Conclusions
Outperforms classical EM by up to 2.99 BLEU points
Effectively addresses the rare word problem
Yields a much smaller phrase table than EM
Shortcomings
–Too slow: 100 sentence pairs take 18 minutes
–Could perhaps be sped up with parallel computing


