1
Phrase-Based Machine Translation based on Simulated Annealing
Caroline Lavecchia, Kamel Smaïli and David Langlois
LORIA / Groupe Parole, Vandoeuvre-lès-Nancy, France
LREC 2008, Marrakech, 29 May 2008
2
Outline
– Statistical Machine Translation (SMT)
– Concept of inter-lingual triggers
– Our SMT system based on inter-lingual triggers: word-based approach; phrase-based approach using the Simulated Annealing (SA) algorithm
– Experiments
– Conclusion
3
Statistical Machine Translation: Introduction
Given a source sentence S, find the best target sentence T* which maximizes the probability P(T|S).
Noisy channel approach: T* = argmax_T P(T|S) = argmax_T P(T) · P(S|T), where P(T) is the language model and P(S|T) is the translation model.
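To make the decomposition concrete, here is a minimal toy sketch of this argmax in Python; the candidate list and both toy models are made-up placeholders, not the models from the talk:

```python
# Tiny illustration of noisy-channel decoding: T* = argmax_T P(T) * P(S|T).
# The toy scores below are hand-set for the example only.

def lm_logprob(target):
    """Toy language model log P(T): favours well-ordered English."""
    toy_lm = {"eiffel tower": -1.0, "tower eiffel": -3.0, "the eiffel tower": -1.5}
    return toy_lm.get(target, -10.0)

def tm_logprob(source, target):
    """Toy translation model log P(S|T) for a single source phrase."""
    toy_tm = {("tour eiffel", "eiffel tower"): -0.5,
              ("tour eiffel", "tower eiffel"): -0.4,
              ("tour eiffel", "the eiffel tower"): -0.7}
    return toy_tm.get((source, target), -10.0)

def decode(source, candidates):
    # T* = argmax_T [ log P(T) + log P(S|T) ]
    return max(candidates, key=lambda t: lm_logprob(t) + tm_logprob(source, t))

print(decode("tour eiffel", ["eiffel tower", "tower eiffel", "the eiffel tower"]))
# -> 'eiffel tower': the language model penalises the bad word order even though
#    the translation model slightly prefers it.
```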
4
Statistical Machine Translation: Approaches
Word-based approach:
– The translation process is done word by word
– IBM models (Brown et al., 1993)
Phrase-based approach (Och et al., 1999; Yamada and Knight, 2001; Marcu and Wong, 2002):
– Better MT system quality
– Advantages: explicitly models lexical units (e.g. rat de bibliothèque → bookworm); easily captures local reordering (e.g. Tour Eiffel → Eiffel Tower)
5
A new translation model based on inter-lingual triggers
Current translation models are complex, and their estimation needs a lot of time and memory. We propose a new translation model based on a simple concept: triggers.
6
Concept of inter-lingual triggers: triggers in statistical language modeling
A trigger is a set composed of a word and its best correlated words. Triggers are determined by computing the Mutual Information (MI) between words on a monolingual corpus. In statistical language modeling, triggers are used to boost the probability of the triggered words given a triggering word.
Example: Gary Kasparov is a chess champion.
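The slide does not spell out the MI formula. A standard definition used in trigger-based language modeling, given here as a reference point rather than as the authors' exact formula, is the mutual information between the occurrence indicators of the two words x and y:

MI(x, y) = \sum_{a \in \{x, \bar{x}\}} \sum_{b \in \{y, \bar{y}\}} P(a, b) \log \frac{P(a, b)}{P(a)\,P(b)}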
7
Concept of inter-lingual triggers: inter-lingual triggers
An inter-lingual trigger is a set composed of a source unit s and its best correlated target units t1, …, tn. Inter-lingual triggers are determined by computing the Mutual Information (MI) between units on a bilingual aligned corpus. We hope to find possible translations of s among its triggered target units t1, …, tn.
Example: Gary Kasparov est un champion d'échecs → Gary Kasparov is a chess champion.
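As a concrete illustration, here is a minimal sketch of computing inter-lingual triggers from a sentence-aligned corpus. The tokenisation, the exact MI variant and the toy corpus are assumptions made for the sketch, not the talk's setup:

```python
import math
from collections import Counter
from itertools import product

def interlingual_triggers(parallel, k=3):
    """For each source word, return the k target words with the highest
    MI-style co-occurrence score over a sentence-aligned corpus.
    Illustrative sketch: naive whitespace tokenisation and a pointwise-MI-like
    score, not the exact formula or thresholds used in the talk."""
    n = len(parallel)
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_sent, tgt_sent in parallel:
        src_words, tgt_words = set(src_sent.split()), set(tgt_sent.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))

    triggers = {}
    for s in src_count:
        scores = {}
        for t in tgt_count:
            joint = pair_count[(s, t)]
            if joint == 0:
                continue
            p_st = joint / n
            p_s, p_t = src_count[s] / n, tgt_count[t] / n
            # Joint probability times pointwise MI (one term of the average MI).
            scores[t] = p_st * math.log(p_st / (p_s * p_t))
        triggers[s] = sorted(scores, key=scores.get, reverse=True)[:k]
    return triggers

corpus = [
    ("Gary Kasparov est un champion d' échecs", "Gary Kasparov is a chess champion"),
    ("il est un champion", "he is a champion"),
    ("un jeu d' échecs", "a game of chess"),
]
print(interlingual_triggers(corpus)["échecs"])   # e.g. ['chess', ...]
```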
8
Concept of inter-lingual triggers: 1-To-1 and n-To-m triggers
Source: Gary Kasparov est un champion d'échecs. Target: Gary Kasparov is a chess champion.
1-To-1 triggers (one source word triggers one target word), e.g.: Gary → Gary, Kasparov → Kasparov, échecs → chess, champion → champion.
n-To-m triggers (n source words trigger m target words), e.g.: Gary Kasparov → Gary Kasparov, champion d'échecs → chess champion, un champion → a champion, Kasparov est un → is a, échecs → chess champion, champion → is a chess.
9
SMT based on inter-lingual triggers
How can inter-lingual triggers be used to estimate a translation model?
– A word-based translation model using 1-To-1 triggers
– A phrase-based translation model using n-To-m triggers
10
SMT based on inter-lingual triggers: word-based translation model using 1-To-1 triggers
For each source word, we keep its k best 1-To-1 triggers; we expect these to be its potential translations.
Translation model: a probability is assigned to each inter-lingual trigger.
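The slide's probability formula is not reproduced in this transcript. A plausible sketch, assuming the trigger probability is simply the pair's MI normalized over the k best triggered target words of s (an assumption, not necessarily the authors' formula):

P(t \mid s) = \frac{MI(s, t)}{\sum_{t' \in Trig_k(s)} MI(s, t')}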
11
SMT based on inter-lingual triggers: phrase-based translation model using n-To-m triggers
Motivations:
– Most methods for learning phrase translations require word alignments
– All phrase pairs consistent with the word alignment are collected, which yields phrases with no linguistic motivation and noisy phrases
12
SMT based on inter-lingual triggers: phrase-based translation model
Method for learning phrase translations:
1. Extract phrases from the source corpus
2. Determine potential translations of the source phrases using n-To-m triggers
3. Start with 1-To-1 triggers to set up a baseline MT system
4. Select an optimal subset of n-To-m triggers with the Simulated Annealing algorithm
13
Method for learning phrase translations, step 1: source phrase extraction
An iterative process selects phrases by grouping words with high Mutual Information (Zitouni et al., 2003). Only the groupings that improve the perplexity on the source corpus are kept, yielding pertinent source phrases; a sketch of this loop is given below.
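A minimal greedy sketch of such a loop, assuming a simple unigram-perplexity criterion and whitespace tokenisation; the actual procedure of Zitouni et al. (2003) differs in its details:

```python
import math
from collections import Counter

def unigram_perplexity(sentences):
    """Perplexity of the corpus under its own unigram distribution."""
    counts = Counter(w for s in sentences for w in s)
    total = sum(counts.values())
    logprob = sum(c * math.log(c / total) for c in counts.values())
    return math.exp(-logprob / total)

def merge(sentences, pair):
    """Replace every adjacent occurrence of `pair` by one fused token."""
    fused = "_".join(pair)
    out = []
    for s in sentences:
        new, i = [], 0
        while i < len(s):
            if i + 1 < len(s) and (s[i], s[i + 1]) == pair:
                new.append(fused)
                i += 2
            else:
                new.append(s[i])
                i += 1
        out.append(new)
    return out

def extract_phrases(sentences, iterations=10):
    """Greedy sketch: repeatedly fuse the adjacent word pair with the highest
    MI contribution, keeping the merge only if (unigram) perplexity improves."""
    sents = [s.split() for s in sentences]
    phrases = []
    for _ in range(iterations):
        unigrams = Counter(w for s in sents for w in s)
        bigrams = Counter((s[i], s[i + 1]) for s in sents for i in range(len(s) - 1))
        total = sum(unigrams.values())
        if not bigrams:
            break
        def mi_contribution(pair):
            p_xy = bigrams[pair] / total
            p_x, p_y = unigrams[pair[0]] / total, unigrams[pair[1]] / total
            return p_xy * math.log(p_xy / (p_x * p_y))
        best = max(bigrams, key=mi_contribution)
        candidate = merge(sents, best)
        if unigram_perplexity(candidate) < unigram_perplexity(sents):
            sents = candidate
            phrases.append(" ".join(best))
        else:
            break
    return phrases

corpus = ["je veux porter plainte", "il va porter plainte", "porter plainte est possible"]
print(extract_phrases(corpus, iterations=1))   # -> ['porter plainte']
```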
14
Method for learning phrase translations, step 2: determine potential phrase translations
A source phrase can be translated by target sequences of variable sizes. Assumption: each source phrase of l words can be translated by a sequence of j target words, where j ∈ [l-Δl, l+Δl]. For each source phrase of length l, the potential translations are therefore the sets of n-To-m triggers with n = l and m ∈ [l-Δl, l+Δl].
15
Method for learning phrase translations, step 2: example
Source phrase: porter plainte (l = 2). We assume that porter plainte can be translated by sequences of 1, 2 or 3 target words (Δl = 1).
Potential translations of porter plainte:
– 2-To-1: press, charges, easy
– 2-To-2: press charges, can press, not press
– 2-To-3: can press charges, not press charges, you can press
A code sketch of this candidate generation follows.
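A minimal sketch of generating such candidates from a sentence-aligned corpus, scoring target m-grams (m ∈ [l-Δl, l+Δl]) with an MI-style co-occurrence score and keeping the k best per length. The toy corpus, tokenisation and scoring are assumptions made for illustration, not the talk's exact setup:

```python
import math
from collections import Counter

def ngrams(tokens, m):
    return [" ".join(tokens[i:i + m]) for i in range(len(tokens) - m + 1)]

def potential_phrase_translations(parallel, src_phrase, delta=1, k=3):
    """For a source phrase of length l, score target sequences of length
    m in [l-delta, l+delta] and keep the k best per length (n-To-m triggers)."""
    l = len(src_phrase.split())
    n = len(parallel)
    src_hits = [src_phrase in src for src, _ in parallel]   # naive substring match
    p_src = sum(src_hits) / n
    out = {}
    for m in range(max(1, l - delta), l + delta + 1):
        tgt_count, joint_count = Counter(), Counter()
        for (src, tgt), hit in zip(parallel, src_hits):
            grams = set(ngrams(tgt.split(), m))
            tgt_count.update(grams)
            if hit:
                joint_count.update(grams)
        scores = {}
        for g, joint in joint_count.items():
            p_joint = joint / n
            p_tgt = tgt_count[g] / n
            # Joint probability times pointwise MI, as in the earlier sketch.
            scores[g] = p_joint * math.log(p_joint / (p_src * p_tgt))
        out[m] = sorted(scores, key=scores.get, reverse=True)[:k]
    return out

corpus = [
    ("je veux porter plainte", "i want to press charges"),
    ("tu peux porter plainte", "you can press charges"),
    ("ne pas porter plainte", "do not press charges"),
    ("c' est facile", "it is easy"),
]
print(potential_phrase_translations(corpus, "porter plainte"))
# e.g. {1: ['press', 'charges', ...], 2: ['press charges', ...], 3: [...]}
```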
16
Method for learning phrase translations: general case
All source phrases, each associated with its k potential translations, constitute the set of n-To-m triggers. We have to select the pertinent translations among the n-To-m triggers and discard the noisy ones. The problem: find an optimal subset of phrase translations that leads to the best MT performance. Trying all possibilities is unreasonable, so the proposed method uses a Simulated Annealing algorithm.
17
Method for learning phrase translations, step 3: selection by Simulated Annealing
Simulated Annealing is a technique for finding an optimal solution to a combinatorial problem. [Flowchart on the slide: start from an initial configuration at an initial temperature; perturb the configuration; accept or reject the new configuration; update the current configuration; adjust the temperature; repeat until the search terminates.]
18
The SA algorithm applied to SMT:
1. Start with a high temperature T and a baseline word-based MT system using 1-To-1 triggers.
2. Repeat until the performance of the SMT system stops increasing:
   a) Perturb the system from state i to state j by randomly adding a subset of n-To-m triggers to the current SMT system.
   b) Evaluate the performance E_j of the new system.
   c) If E_j > E_i, move from state i to state j; otherwise accept state j with probability e^((E_j - E_i)/T), i.e. draw a random P ∈ [0, 1] and accept if P < e^((E_j - E_i)/T).
3. Decrease the temperature and go back to step 2, until the performance of the system stops increasing.
A minimal code sketch of this loop is given below.
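A minimal, self-contained sketch of this selection loop. It assumes a scoring function `score` that maps a trigger set to a Bleu-like value; in the actual system this means re-decoding with Pharaoh and scoring on the development set, which is replaced here by a toy scorer:

```python
import math
import random

def simulated_annealing_selection(baseline, candidates, score,
                                  temperature=1e-4, cooling=0.5,
                                  batch=10, levels=5, tries_per_level=20):
    """Select a subset of n-To-m triggers by simulated annealing.
    `score` maps a trigger set to an MT quality value (e.g. Bleu on a dev set)."""
    current = set(baseline)                 # state i: the 1-To-1 baseline system
    current_score = score(current)
    pool = list(candidates)
    for _ in range(levels):                 # one pass per temperature level
        for _ in range(tries_per_level):
            if not pool:
                break
            # Perturbation: randomly add a small batch of candidate triggers.
            added = random.sample(pool, min(batch, len(pool)))
            proposal = current | set(added)
            proposal_score = score(proposal)
            delta = proposal_score - current_score
            # Metropolis criterion for maximisation: always accept improvements,
            # accept degradations with probability exp(delta / T).
            if delta > 0 or random.random() < math.exp(delta / temperature):
                current, current_score = proposal, proposal_score
                pool = [t for t in pool if t not in current]
        temperature *= cooling              # decrease the temperature
    return current, current_score

# Toy usage: the "system" is just a set of (source, target) pairs and the score
# is a made-up function of how many pairs from a fake reference set are included.
reference = {("porter plainte", "press charges"), ("tour eiffel", "eiffel tower")}
candidates = list(reference) + [("porter plainte", "easy"), ("tour eiffel", "you can")]
score = lambda triggers: len(triggers & reference) - 0.1 * len(triggers - reference)
best, best_score = simulated_annealing_selection(set(), candidates, score, batch=2)
print(best_score)   # typically close to 2.0: the noisy pairs are mostly rejected
```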
19
[Diagram on the slide: the initial system uses the 1-To-1 triggers as its translation model; the decoder turns the text input into text output using the translation model and the language model, giving an initial Bleu score. The current system is perturbed by adding a subset of n-To-m triggers and re-evaluated (Bleu_new); if Bleu_new > Bleu_current the new system is adopted, and if Bleu_new ≤ Bleu_current the current system is kept, subject to the SA acceptance rule.]
20
Experiments: corpora
Subtitle parallel corpora built using the Dynamic Time Warping algorithm (Lavecchia et al., 2007):

                        French    English
Train   Sentences            27523
        Words       191185    205785
        Singletons    7066      5400
        Vocabulary   14655     11718
Dev     Sentences             1959
        Words        13598     14739
Test    Sentences              756
        Words         5314         -
21
Experiments: tuning step, SA algorithm parameters
– 1-To-1 triggers: each source word associated with its 50 best target words
– n-To-m triggers: 15860 source phrases; each source phrase associated with its 30 best n-To-1, n-To-2 and n-To-3 inter-lingual triggers
– Initial temperature: 10^-4
– System perturbation: adding 10 potential translations of 10 source phrases
22
Experiments: tuning step, initial system
Pipeline: text input → Pharaoh decoder (word translation model + trigram language model) → text output.

Translation model              tm    lm    d    w    Bleu
1-To-1 triggers                0.6   0.3   -    0    12.49
IBM M3 (Brown et al., 1993)    0.8   0.6   -    0    12.39
23
Experiments: tuning step, final system
Pipeline: text input → Pharaoh decoder (phrase translation model + trigram language model) → text output.

Translation model              tm    lm    d    w    Bleu
Optimal n-To-m triggers        0.6   0.3   -    0    14.14
Reference (Och, 2002)          0.8   0.6   -    0    7.02
24
Experiments: evaluation of the final system

        Inter-lingual triggers        State of the art
        1-To-1      n-To-m            IBM3      Reference
Dev     12.49       14.14             12.39     7.02
Test    13.63       10.77             14.00     6.57

The lead of n-To-m triggers over 1-To-1 triggers is not corroborated on the test corpus. Possible explanations: over-fitting due to the small amount of data, and corpora of different movie styles. The impact of over-fitting is even stronger on the state-of-the-art systems.
25
Conclusion and future work
A new method for learning phrase translations:
1. Extract source phrases
2. Find phrase translations using inter-lingual triggers
3. Select the pertinent ones using the SA algorithm
Advantages: no word alignment is required, and the phrase translations are more pertinent.
Experiments on movie subtitle corpora: the approach is more robust on sparse data than a state-of-the-art approach and gives better translation quality in terms of Bleu score (+7 points on dev, +4 points on test).
Future improvements: classify movies; integrate linguistic knowledge into the translation process; consider inter-lingual triggers beyond word surface forms.