Published by Birgit Elsa Lind. Modified over 5 years ago.
1
Improving IBM Word-Alignment Model 1 (Robert C. Moore)
Nonstructural problems with IBM Model 1 cause alignment errors: rare words in the source language tend to act as “garbage collectors”, and the null source word is aligned with too few target words. Changing the parameter estimation addresses these problems: smoothing translation counts with “add-n” smoothing, adding extra null words to the source sentence, and initializing Model 1 with log-likelihood-ratio (LLR) statistics.
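To make the add-n idea concrete, here is a minimal sketch of Model 1 EM training in which the collected translation counts are smoothed before normalisation; a NULL token is prepended to every source sentence. The default n = 0.01 and the use of the observed target vocabulary as V are illustrative assumptions, not Moore's tuned settings.

```python
from collections import defaultdict

def train_model1_add_n(bitext, iterations=5, n=0.01):
    """IBM Model 1 EM with add-n smoothed translation counts.

    bitext: list of (source_words, target_words) pairs. A NULL token is
    prepended to each source sentence. n is an illustrative smoothing
    value; V is taken to be the observed target vocabulary size.
    """
    v_size = len({tw for _, tgt in bitext for tw in tgt})
    t = defaultdict(lambda: 1.0 / v_size)  # uniform initialisation
    for _ in range(iterations):
        count = defaultdict(float)  # expected co-occurrence counts
        total = defaultdict(float)  # expected source-word totals
        for src, tgt in bitext:
            src = ["NULL"] + src
            for tw in tgt:
                z = sum(t[(sw, tw)] for sw in src)  # normaliser over src
                for sw in src:
                    c = t[(sw, tw)] / z
                    count[(sw, tw)] += c
                    total[sw] += c
        # M-step with add-n smoothing: add n to every count and n * V to
        # every total, reserving probability mass for unseen pairs
        t = defaultdict(lambda: 1.0 / v_size)  # unseen pairs fall back to uniform
        for (sw, tw), c in count.items():
            t[(sw, tw)] = (c + n) / (total[sw] + n * v_size)
    return t
```

On a toy bitext this behaves as expected: frequently co-occurring pairs receive most of the probability mass, while the smoothing keeps rare pairs from being driven to extreme estimates.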
2
Improving IBM Word-Alignment Model 1 (Robert C. Moore)
Evaluation results show that the modified parameter estimation reduces alignment error rate (AER; Och and Ney, 2003). Each method reduces AER on its own, and the combined model reduces AER by 30%. Conclusion: a 30% reduction in AER is achieved simply by changing the parameter estimation, and LLR, compared with the Dice coefficient as the initializing statistic, addresses the over-fitting problem for rare words.
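The LLR association score used for initialisation can be computed from a 2x2 contingency table of sentence-pair co-occurrence counts. The sketch below is a generic Dunning-style LLR; Moore's exact bookkeeping and any rescaling of the scores are omitted.

```python
import math

def llr(c_st, c_s, c_t, n):
    """Log-likelihood-ratio association for a source/target word pair.

    c_st: sentence pairs containing both words; c_s, c_t: pairs containing
    the source / target word; n: total sentence pairs.
    """
    def ll(k, m, p):
        # log binomial likelihood of k successes in m trials
        # (the binomial coefficient cancels in the ratio)
        if p <= 0.0 or p >= 1.0:
            return 0.0 if k in (0, m) else float("-inf")
        return k * math.log(p) + (m - k) * math.log(1.0 - p)

    p0 = c_t / n                                        # null: independence
    p1 = c_st / c_s if c_s else 0.0                     # P(t | s present)
    p2 = (c_t - c_st) / (n - c_s) if n > c_s else 0.0   # P(t | s absent)
    return 2.0 * (ll(c_st, c_s, p1) + ll(c_t - c_st, n - c_s, p2)
                  - ll(c_st, c_s, p0) - ll(c_t - c_st, n - c_s, p0))
```

A perfectly associated pair scores far higher than a mildly associated one, and a pair that co-occurs exactly as often as independence predicts scores zero, which is what makes LLR a safer initializer than Dice for rare words.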
3
Multi-Engine Machine Translation with Voted Language Model (Tadashi Nomoto)
Describes a particular approach that takes into account the reliability of each component model in multi-engine MT (MEMT) and uses a voting scheme to pick a language model (LM). Nomoto (2003) used support vector regression (SVR) to exploit biases in the performance of MT systems; here the systems vote for an LM. Experiments show that the choice of LM does influence performance, and in the V-by-M scheme the perplexity of the LM is a good predictor.
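As a simplified illustration of perplexity acting as a predictor, the sketch below scores each MT system's output under an LM and keeps the hypothesis with the lowest perplexity. This is a deliberate stand-in: the unigram LM, the add-alpha smoothing, and the direct minimisation are all assumptions for illustration, not Nomoto's SVR-based V-by-M machinery.

```python
import math
from collections import Counter

def perplexity(sentence, unigram_counts, total, v_size, alpha=1.0):
    """Per-word perplexity under an add-alpha smoothed unigram LM
    (a tiny stand-in; the paper's LMs are of course richer)."""
    logp = 0.0
    for w in sentence:
        p = (unigram_counts.get(w, 0) + alpha) / (total + alpha * v_size)
        logp += math.log(p)
    return math.exp(-logp / len(sentence))

def pick_output(hypotheses, lm_corpus):
    """Select the MT hypothesis the LM finds most fluent (lowest perplexity)."""
    counts = Counter(w for sent in lm_corpus for w in sent)
    total = sum(counts.values())
    v = len(counts) + 1  # +1 slot for unseen words
    return min(hypotheses, key=lambda h: perplexity(h, counts, total, v))
```

Given two candidate translations, the one whose words the LM has seen before gets the lower perplexity and wins.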
4
Multi-Engine Machine Translation with Voted Language Model (Tadashi Nomoto)
Experimental results: the V-by-M scheme significantly improves the performance of MEMT. V-by-M does not influence regressive MEMT systems as much as it influences plain MEMT systems. Both MEMT and regressive MEMT outperform the single MT systems.
5
Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora (Chris Callison-Burch, David Talbot, Miles Osborne) Significant improvement can be achieved by including word-alignment information during training. The modified parameter estimation approach: some sentence pairs carry explicit word-level alignments, and the mixed likelihood function combines the expected information in sentence-aligned pairs with the complete information in word-aligned pairs.
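The mixing can be pictured at the level of count collection in one Model 1 E-step: sentence-aligned pairs contribute fractional expected counts, while word-aligned pairs contribute whole counts read directly off their links, with a weight trading the two off. A minimal sketch, assuming a Model 1-style translation table `t` and a hypothetical mixing weight `lam`:

```python
from collections import defaultdict

def collect_counts(sent_pairs, word_aligned_pairs, t, lam=0.9):
    """One E-step over a mixed corpus.

    sent_pairs: (src, tgt) with unknown alignment -> expected counts.
    word_aligned_pairs: (src, tgt, links) where links is a list of (i, j)
    index pairs -> complete counts taken directly from the links.
    t: dict mapping (src_word, tgt_word) to a translation probability.
    lam is an illustrative weight on the word-aligned evidence.
    """
    count = defaultdict(float)
    total = defaultdict(float)
    # expected counts from sentence-aligned data, weight (1 - lam)
    for src, tgt in sent_pairs:
        for tw in tgt:
            z = sum(t[(sw, tw)] for sw in src)
            for sw in src:
                c = (1.0 - lam) * t[(sw, tw)] / z
                count[(sw, tw)] += c
                total[sw] += c
    # complete counts from word-aligned data, weight lam
    for src, tgt, links in word_aligned_pairs:
        for i, j in links:
            count[(src[i], tgt[j])] += lam
            total[src[i]] += lam
    return count, total
```

Normalising `count` by `total` then gives the M-step, exactly as in plain Model 1, but with the word-aligned links pulling the estimates toward the human-annotated alignments.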
6
Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora (Chris Callison-Burch, David Talbot, Miles Osborne) Adding word-aligned data reduces AER: for IBM Models 1, 3, 4 and the HMM model, adding word-aligned sentence pairs reduces AER. Between the best model with word-aligned information (IBM Model 4) and the best without (the HMM model), the reduction in AER is 38%. Using word-aligned data also improves translation quality, and increasing either the weight or the proportion of word-aligned data decreases AER further. Discussion and future work: word-aligning sentence pairs is a much cheaper and more accurate way to build parallel corpora than using professional translators. Which sentences in the training corpus should be word-aligned?
7
Align using matrix factorization (Cyril Goutte, Kenji Yamada and Eric Gaussier)
The paper: Views aligning the words of a sentence pair as Orthogonal Non-negative Matrix Factorization (ONMF). Develops an algorithm that performs ONMF. Improves in several ways over state-of-the-art results. The algorithm performs ONMF in two steps: factorize the translation matrix M using Probabilistic Latent Semantic Analysis (PLSA), then orthogonalise the factors using a Maximum A Posteriori (MAP) assignment of words to cepts. The number of cepts is estimated by maximising AIC or BIC over the range 1 to min(I, J).
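The two steps can be sketched as follows: PLSA EM run on the entries of M, followed by a hard MAP assignment of every word to one cept. This is a toy version under stated assumptions: a deterministic asymmetric initialisation, a fixed number of cepts (rather than AIC/BIC selection), and a hand-built M standing in for, e.g., Model 1 translation probabilities.

```python
def onmf_align(M, n_cepts, iters=50):
    """ONMF in two steps: (1) PLSA factorisation of the translation matrix
    M (I x J, nonnegative), (2) MAP orthogonalisation assigning each source
    word i and each target word j to its most probable cept."""
    I, J = len(M), len(M[0])

    def norm(v):
        s = sum(v)
        return [x / s for x in v] if s else v

    # deterministic, slightly asymmetric initialisation of P(c), P(i|c), P(j|c)
    pc = [1.0 / n_cepts] * n_cepts
    pic = [norm([i + c + 1.0 for i in range(I)]) for c in range(n_cepts)]
    pjc = [norm([j + c + 1.0 for j in range(J)]) for c in range(n_cepts)]

    for _ in range(iters):  # PLSA EM over the cells of M
        nc = [0.0] * n_cepts
        ni = [[0.0] * I for _ in range(n_cepts)]
        nj = [[0.0] * J for _ in range(n_cepts)]
        for i in range(I):
            for j in range(J):
                if M[i][j] == 0.0:
                    continue
                post = [pc[c] * pic[c][i] * pjc[c][j] for c in range(n_cepts)]
                z = sum(post) or 1.0
                for c in range(n_cepts):
                    w = M[i][j] * post[c] / z  # expected count for cept c
                    nc[c] += w
                    ni[c][i] += w
                    nj[c][j] += w
        pc, pic, pjc = norm(nc), [norm(r) for r in ni], [norm(r) for r in nj]

    # MAP orthogonalisation: every word belongs to exactly one cept, so the
    # factors become orthogonal and words align iff they share a cept
    src_cept = [max(range(n_cepts), key=lambda c: pc[c] * pic[c][i])
                for i in range(I)]
    tgt_cept = [max(range(n_cepts), key=lambda c: pc[c] * pjc[c][j])
                for j in range(J)]
    return src_cept, tgt_cept
```

On a block-diagonal M the factorisation recovers the blocks: source and target words in the same block end up sharing a cept, which also makes the 100% coverage property visible, since every word receives some cept.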
8
Align using matrix factorization (Cyril Goutte, Kenji Yamada and Eric Gaussier)
9
Align using matrix factorization (Cyril Goutte, Kenji Yamada and Eric Gaussier)
Results show: on the HLT-NAACL French-English task, better recall and F-score are achieved, though at the cost of lower precision. On the Romanian-English task, the matrix factorization approach increases recall, but at the cost of precision and AER. For both tasks, the approach provides 100% coverage, aligning all words. Discussion and conclusion: open problems include local optima in PLSA and other ways to obtain the original translation matrix M. The matrix factorization does not improve AER, but it guarantees both proper alignments and good coverage.