
1 Statistical Machine Translation Papers from COLING 2004
Marine CARPUAT, 15 September 2004

2 Improving alignment models
"Fix" IBM-style models:
- Improved word alignment using a symmetric lexicon model (Richard Zens, Evgeny Matusov and Hermann Ney)
- Symmetric word alignments for SMT (Evgeny Matusov, Richard Zens and Hermann Ney)
- Improving word alignment quality using morpho-syntactic information (Maja Popovic and Hermann Ney)
- Improving statistical word alignment with a rule-based machine translation system (Hua Wu and Haifeng Wang)
Use tree-based models:
- Syntax-based alignment: supervised or unsupervised? (Hao Zhang and Daniel Gildea)

4 Decoding
- Reordering constraints for phrase-based statistical machine translation (Richard Zens, Hermann Ney, Taro Watanabe and Eiichiro Sumita)
- Improving a statistical MT system with automatically learned rewrite patterns (Fei Xia and Michael McCord)

6 Improved word alignment using a symmetric lexicon model [Richard ZENS, Evgeny MATUSOV and Hermann NEY]
Hypothesis: we lose useful information by training the lexicons of the IBM and HMM alignment models in one translation direction only.
Solution: train the lexicon models symmetrically. At each iteration of EM training, instead of estimating a separate source-to-target lexicon, combine the source->target and target->source lexicons into a single symmetric lexicon by either:
- linear interpolation of the counts (union of the two lexicons), or
- log-linear interpolation of the counts (intersection of the two lexicons).
(A sketch of both combination schemes follows this slide.)
Experiments:
- German-English Verbmobil (~340K English words)
- French-English Canadian Hansards (~1.9M English words)
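
As an illustration only (not code from the paper), here is a minimal sketch of how the two combination schemes behave, assuming both directional lexicons are stored as dense probability matrices over a shared vocabulary indexing; the function name and the interpolation weight alpha are hypothetical.

```python
import numpy as np

def combine_lexicons(p_s2t, p_t2s_transposed, alpha=0.5, mode="linear"):
    """Combine two directional lexicon tables into one symmetric table.

    p_s2t:            p(target | source), shape [V_src, V_tgt]
    p_t2s_transposed: p(source | target), transposed to [V_src, V_tgt]
    """
    if mode == "linear":
        # Union-like: a word pair survives if either direction supports it.
        combined = alpha * p_s2t + (1.0 - alpha) * p_t2s_transposed
    elif mode == "loglinear":
        # Intersection-like: a pair needs non-zero support in both directions.
        combined = (p_s2t ** alpha) * (p_t2s_transposed ** (1.0 - alpha))
    else:
        raise ValueError(mode)
    # Renormalize each source word's distribution over target words.
    totals = combined.sum(axis=1, keepdims=True)
    return combined / np.where(totals == 0.0, 1.0, totals)
```

The union/intersection intuition falls out of the arithmetic: a zero probability in one direction is harmless under linear interpolation, but zeroes the pair out under log-linear interpolation.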

7 Improved word alignment using a symmetric lexicon model [Richard ZENS, Evgeny MATUSOV and Hermann NEY]
Results: symmetric lexicon training always yields a statistically significant improvement in precision and recall over the Och & Ney baseline.
- Canadian Hansards: AER = 8.6% (baseline = 12.6%)
- Verbmobil: AER = 4.3% (baseline = 5.7%)
However, the improvement becomes much smaller as the training corpus size increases. Linear interpolation performs best on Verbmobil, log-linear on the Hansards.
Open question: does the improved alignment model yield significantly better translations?
The paper also proposes a lexicon smoothing method that reduces alignment error rate (on the German-English task only): a back-off lexicon based on base-form (rather than full-form) words.
(AER, the metric used throughout these slides, is computed as in the sketch below.)
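
For reference, a minimal sketch of the standard alignment error rate computation (Och & Ney's definition over sure links S and possible links P, with S a subset of P); this is background for reading the numbers above, not code from the paper.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    hypothesis: set of (src_pos, tgt_pos) links produced by the aligner (A)
    sure:       gold links annotators marked as certain (S)
    possible:   gold links marked sure or ambiguous (P, with S a subset of P)
    """
    a, s, p = set(hypothesis), set(sure), set(possible)
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))

# Toy usage: predicting exactly the sure links gives AER = 0.
sure = {(0, 0), (1, 1)}
possible = sure | {(1, 2)}
print(alignment_error_rate({(0, 0), (1, 1)}, sure, possible))  # 0.0
```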

8 Symmetric word alignments for SMT [Evgeny MATUSOV, Richard ZENS, and Hermann NEY]
Problem: the IBM and HMM alignment models constrain each source word to be aligned to at most one target word.
Solution: symmetric word alignment models that allow both many-to-one and one-to-many alignments. Word alignment is viewed as finding a mapping between the source and target words under three constraints:
- each source position is covered,
- each target position is covered,
- the total cost of the alignment is minimal.
The cost of a link is the negated log of the state occupation probability derived from the IBM and HMM models. Polynomial-time algorithms exist for minimum weighted edge cover (MWEC) in a bipartite graph. (A simplified one-sided sketch follows this slide.)
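
A minimal sketch of the "one-sided" covering idea under my reading of the setup: given a cost matrix of negated log state-occupation probabilities, covering every source position independently reduces to picking each row's cheapest target position (and symmetrically for target positions). The names and matrix layout are illustrative; the paper's full MWEC solution is more general.

```python
import numpy as np

def one_sided_cover(cost, axis):
    """Cover every position on one side with its cheapest link.

    cost: [len_src, len_tgt] matrix of -log state occupation probabilities.
    axis=0 covers every source position (row-wise argmin over targets);
    axis=1 covers every target position (column-wise argmin over sources).
    Returns a set of (src_pos, tgt_pos) links.
    """
    if axis == 0:
        return {(i, int(j)) for i, j in enumerate(cost.argmin(axis=1))}
    return {(int(i), j) for j, i in enumerate(cost.argmin(axis=0))}
```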

9 Symmetric word alignments for SMT [Evgeny MATUSOV, Richard ZENS, and Hermann NEY]
Och & Ney (2003) proposed a set of heuristics to generalize the intersection of bidirectional alignments. This paper proposes a systematic and theoretically well-founded approach instead.
Results: the new method yields a significant reduction in AER compared to the Och & Ney baseline.
- Canadian Hansards: best AER = 6.0% (baseline = 6.9%)
- Verbmobil: best AER = 3.7% (baseline = 4.7%)
However, the best results are obtained by intersecting the one-sided MWEC alignments (sketched below).
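
Continuing the earlier sketch, the intersection variant that this slide reports as strongest simply keeps the links both directions agree on; this toy example mirrors the previous function inline so it runs standalone (again an illustration, not the paper's exact procedure).

```python
import numpy as np

# Toy 3x3 cost matrix (-log probabilities); lower cost = better link.
cost = np.array([[0.1, 2.0, 3.0],
                 [2.5, 0.2, 1.8],
                 [3.0, 1.9, 0.3]])

src_cover = {(i, int(j)) for i, j in enumerate(cost.argmin(axis=1))}  # every source word linked
tgt_cover = {(int(i), j) for j, i in enumerate(cost.argmin(axis=0))}  # every target word linked

# High-precision symmetric alignment: links supported by both directions.
print(sorted(src_cover & tgt_cover))  # [(0, 0), (1, 1), (2, 2)]
```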

10 Syntax-based alignment: supervised or unsupervised? [Hao ZHANG and Daniel GILDEA]
Verdict: unsupervised! The paper directly compares two alignment models:
- syntactically unsupervised BTGs (bracketing transduction grammars): trees are learned directly from parallel corpora, vs.
- the syntactically supervised tree-to-string model (Yamada and Knight 2001): trees are provided by a parser trained on hand-annotated treebanks.
Experimental set-up: two tasks (Chinese-English and French-English), small training corpora (~20K sentence pairs, sentence length < 25).
Both tree-based models outperform IBM Model 4, and BTG outperforms the tree-to-string model on Chinese-English (the two are comparable on French-English):
- Chinese-English AER: BTG = 0.40 vs. tree-to-string = 0.50
- French-English AER: BTG = 0.16 vs. tree-to-string = 0.15

11 Syntax-based alignment: supervised or unsupervised? [Hao ZHANG and Daniel GILDEA]
The tree-to-string model suffers from the performance of the automatic parser: the newswire text in the parallel corpus is very different from the Penn Treebank. Moreover, even when the output of the automatic parser is correct, the syntactic structures of the two languages may not correspond.
Adding the cloning operation (Gildea 2003) improves tree-to-string results by 2% precision/recall/AER, but BTGs still yield a better AER (on Chinese-English).

12 Reordering constraints for phrase-based SMT [Richard ZENS, Hermann NEY, Taro WATANABE and Eiichiro SUMITA]
IBM vs. BTG reordering constraints for phrase-based models.
IBM reordering constraints:
- the sentence is produced phrase by phrase;
- skipping of up to k phrases is allowed.
BTG constraints:
- each alignment template is a block;
- select two consecutive blocks and merge them, either keeping the target phrases monotone or inverting their order.
This paper proposes a controlled comparison of the two reordering constraints, with efficient dynamic programming algorithms for each. (A sketch of permutation checks for both constraint types follows this slide.)
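
To make the two constraint families concrete, here is a hedged sketch of membership tests over phrase-order permutations: whether a given output order of source phrases is reachable under IBM-style skipping of up to k phrases, and under BTG-style binary block merging. These checks are my own illustration of the definitions above, not the paper's DP algorithms (which search over such permutations rather than test them).

```python
def obeys_ibm_constraints(order, k):
    """IBM-style check: phrases are consumed left to right, but up to k
    not-yet-used phrases may be skipped, so each chosen phrase must be
    among the k+1 leftmost uncovered source positions."""
    uncovered = sorted(order)
    for pos in order:
        if uncovered.index(pos) > k:
            return False
        uncovered.remove(pos)
    return True

def obeys_btg_constraints(order):
    """BTG-style check: the permutation must be buildable by repeatedly
    merging two adjacent blocks, monotone or inverted. Greedy shift-reduce
    over value spans: push each position, merge neighbours whose spans are
    contiguous, and succeed if everything collapses to one block."""
    stack = []
    for pos in order:
        stack.append((pos, pos))
        while len(stack) >= 2:
            lo2, hi2 = stack[-1]
            lo1, hi1 = stack[-2]
            if lo2 == hi1 + 1 or lo1 == hi2 + 1:  # monotone or inverted merge
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return len(stack) == 1

print(obeys_btg_constraints([1, 0, 3, 2]))  # True: two inverted merges, then one monotone
print(obeys_btg_constraints([1, 3, 0, 2]))  # False: the classic non-binarizable (2,4,1,3) pattern
```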

13 Reordering constraints for phrase-based SMT [Richard ZENS, Hermann NEY, Taro WATANABE and Eiichiro SUMITA]
For the BTG constraints, a new decoding algorithm that generates the sentence phrase by phrase is proposed:
- use the beam search decoder for unconstrained search;
- modify it so that it never produces a reordering that violates the BTG constraints.
(A sketch of how such a check can gate hypothesis extension appears below.)
Experimental set-up: two relatively small corpora with short sentences:
- BTEC (Basic Travel Expression Corpus), ~6 words per sentence
- SLDB (Spoken Language DataBase), ~12 words per sentence
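
Purely as an illustration of the "filter during search" idea, the BTG check from the previous sketch can gate each hypothesis extension: an extension is discarded as soon as its prefix can no longer grow into a BTG-reachable permutation. The helper below assumes obeys_btg_constraints from the previous sketch is in scope and uses a brute-force completion test, so it is only practical for toy lengths; the paper's decoder enforces the constraint with dynamic programming instead.

```python
from itertools import permutations

# Assumes obeys_btg_constraints from the previous sketch is in scope.
def prefix_can_complete(prefix, n):
    """Can the block order chosen so far still grow into a full
    BTG-reachable permutation of n blocks? Brute force over the
    remaining blocks, hence exponential; for toy n only."""
    rest = [i for i in range(n) if i not in prefix]
    return any(obeys_btg_constraints(list(prefix) + list(tail))
               for tail in permutations(rest))

# During phrase-by-phrase generation, an extension would be pruned early:
# if not prefix_can_complete(hypothesis_order + [next_block], n_blocks):
#     continue  # this reordering can never satisfy the BTG constraints
```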

14 Reordering constraints for phrase-based SMT [Richard ZENS, Hermann NEY, Taro WATANABE and Eiichiro SUMITA]
Results:
- both the IBM and the BTG constraints improve on monotone search;
- the IBM constraints give translation quality similar to unconstrained search;
- the BTG constraints significantly outperform both the IBM constraints and unconstrained search.

Results on the BTEC corpus:

Constraint type   WER (%)   PER (%)   BLEU (%)   NIST
Unconstrained     11.5      10.0      88.0       14.19
IBM (k=2)         11.4      10.1      88.1       14.20
BTG               11.0       9.9      88.2       14.25

15 Conclusion
- Unsupervised tree-based approaches are promising for both alignment and decoding.
- Unlike the IBM models, BTGs are intrinsically symmetric.
- Controlled comparisons show that BTG-based alignment models and decoders outperform other approaches.
- We really need to get our system up and running... and to train on all the available data before trying to improve the models.

