Automating Post-Editing To Improve MT Systems Ariadna Font Llitjós and Jaime Carbonell APE Workshop, AMTA – Boston August 12, 2006
Outline Introduction –Problem and Motivation –Goal and Solution –Related Work Theoretical Space Online Error Elicitation Rule Refinement Framework MT Evaluation Framework Conclusions
Problem and Motivation
MT output: Assassinated a diplomat Russian and kidnapped other four in Bagdad
We could hire post-editors to correct machine translations, but this is:
– Expensive and time consuming
– Not feasible for large amounts of data (Google, Yahoo, etc.)
– Unable to generalize to new sentences
Ultimate Goal Automatically Improve MT Quality
Two Alternatives
Automatic learning of post-editing rules (APE): + system independent; – several thousand sentences might need to be corrected for the same error
Automatic refinement of the MT system: + attacks the core of the problem; – system dependent
Our Approach Automatically Improve MT Quality by Recycling Post-Editing Information Back to MT Systems
Our Solution
Get non-expert bilingual speakers to provide correction feedback online
– Make correcting translations easy and fun
– 5-10 minutes a day → large amounts of correction data
Feed corrections back into the MT system, so that they can be generalized → the system will translate new sentences better
SL: Asesinado un diplomático ruso y secuestrados otros cuatro en Bagdad
TL: Assassinated a diplomat Russian and kidnapped other four in Bagdad
Bilingual Post-editing
In addition to the TL sentence and context information (when available), as in traditional post-editing, it also provides post-editors with:
– the Source Language sentence
– alignment information
Allegedly a harder task, but we can now get much more data for free
MT Approaches
[Vauquois triangle: Direct approaches (SMT, EBMT) at the base, Transfer Rules in the middle, Interlingua at the top; Syntactic Parsing and Semantic Analysis on the analysis side from the Source (e.g., English), Sentence Planning and Text Generation on the generation side toward the Target (e.g., Spanish)]
Related Work
Post-editing, rule adaptation, and fixing machine translation: [Nishida et al., 1988], [Brill, 1993], [Knight & Chander, 1994], [Su et al., 1995], [Allen & Hogan, 2000], [Gavaldà, 2000], [Menezes & Richardson, 2001], [Allen, 2003], [Corston-Oliver & Gamon, 2003], [Imamura et al., 2003], [Callison-Burch, 2004]
Our approach: non-expert user feedback; provides relevant reference translations; generalizes over unseen data
System Architecture
[Diagram: INPUT TEXT → Run-Time MT System → OUTPUT TEXT; the Online Translation Correction Tool collects corrections, which feed Rule Refinement and, in turn, the Rule Learning and other resources used by the run-time system]
Main Technical Challenge
Mapping simple user edits to MT output into improved translation rules: blame assignment, rule modifications, lexical expansions
Outline Introduction Theoretical Space –Error classification –Limitations Online Error Elicitation Rule Refinement Framework MT Evaluation Framework Conclusions
Error Typology for Automatic Rule Refinement (simplified)
– Missing word
– Extra word
– Wrong word order (local vs. long distance; word vs. phrase; ± word change)
– Incorrect word (sense, form, selectional restrictions, idiom)
– Missing constraint
– Extra constraint
Limitations of Approach
The SL sentence needs to be fully parsed by the translation grammar (the current approach needs rules to refine).
Depends on bilingual speakers' ability to detect MT errors.
Outline Introduction Theoretical Space Online Error Elicitation –Translation Correction Tool Rule Refinement Framework MT Evaluation Framework Conclusions
TCTool (Demo)
Interactive elicitation of error information. Actions:
– Add a word
– Delete a word
– Modify a word
– Change word order
Expanding the lexicon
1. OOV words/idioms: Hamas cabinet calls a truce to avoid Israeli retaliation → El gabinete de Hamas llama un *TRUCE* para evitar la venganza israelí → …acuerda una tregua…
2. New word form: The children fell → *los niños cayeron → los niños se cayeron
3. New word sense: The girl plays the guitar → *la muchacha juega la guitarra → la muchacha toca la guitarra
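As a rough illustration (ours, not from the talk), a minimal Python sketch of these three lexicon expansions; the dict layout and the add_entry helper are hypothetical stand-ins for the actual lexicon format:

def add_entry(lexicon, sl_word, tl_word, pos, feats=None):
    """Append a new SL-TL lexical entry with optional feature values."""
    lexicon.append({"sl": sl_word, "tl": tl_word, "pos": pos,
                    "feats": feats or {}})

lexicon = []
add_entry(lexicon, "truce", "tregua", "N", {"gender": "fem"})      # 1. OOV word
add_entry(lexicon, "fall", "caerse", "V", {"reflexive": "+"})      # 2. new word form
add_entry(lexicon, "play", "tocar", "V", {"sense": "instrument"})  # 3. new word sense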
Refining the grammar
4. Add agreement constraints: I see the red car → *veo el coche roja → veo el coche rojo
5. Add word or constituent: You saw the woman → viste la mujer → viste a la mujer
6. Move constituent order: I saw you → *yo vi te → yo te vi
SL: Gaudí was a great artist → TL: Gaudí era un artista grande
User actions: Edit Word (adding the new form to the lexicon) + Change Word Order (change constituent order)
CTL: Gaudí era un gran artista
Eng2Spa User Study [LREC 2004]
MT error classification: 9 linguistically motivated classes [Flanagan, 1994], [White et al., 1994]: word order, sense, agreement error (number, person, gender, tense), form, incorrect word and no translation
Interactive elicitation of error information:
– error detection: precision 90%, recall 89%
– error classification: precision 72%, recall 71%
Outline Introduction Theoretical Space Online Error Elicitation Rule Refinement Framework –RR operations –Formalizing Error information –Refinement Steps (Example) MT Evaluation Framework Conclusions
Types of Refinement Operations
1. Refine a translation rule: R0 → R1 (change R0 to make it more specific or more general)
R0: NP: [DET ADJ N] → [DET N ADJ] (a nice house → *una casa bonito)
R1: NP: [DET ADJ N] → [DET N ADJ] + (N gender = ADJ gender) (a nice house → una casa bonita)
Types of Refinement Operations (2)
2. Bifurcate a translation rule: R0 → R0 (same, general rule) + R1 (add a new, more specific rule)
R0: NP: [DET ADJ N] → [DET N ADJ] (a nice house → una casa bonita)
R1: NP: [DET ADJ N] → [DET ADJ N] + (ADJ type: pre-nominal) (a great artist → un gran artista)
Formalizing Error Information
Wi = error, Wi' = correction, Wc = clue word
For NP: [DET ADJ N] → [DET N ADJ], a nice house → *una casa bonito / una casa bonita:
Wi = bonito, Wi' = bonita, Wc = casa
Triggering Feature Detection
Comparison at the feature level to detect triggering feature(s). Delta function: Δ(Wi, Wi')
Examples: Δ(bonito, bonita) = {gender}; Δ(comíamos, comía) = {person, number}; Δ(mujer, guitarra) = { }
If the set is empty, we need to postulate a new binary feature (feat_i = {+,−})
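A minimal sketch of the delta function, assuming lexical entries carry simple feature dictionaries (an illustration, not the actual implementation):

def delta(feats_i, feats_i_prime):
    """Return the set of features whose values differ between Wi and Wi'."""
    keys = set(feats_i) | set(feats_i_prime)
    return {f for f in keys if feats_i.get(f) != feats_i_prime.get(f)}

print(delta({"gender": "masc", "number": "sg"},
            {"gender": "fem", "number": "sg"}))   # {'gender'} (bonito, bonita)
print(delta({"person": 1, "number": "pl"},
            {"person": 3, "number": "sg"}))       # {'person', 'number'}
print(delta({"pos": "N", "gender": "fem"},
            {"pos": "N", "gender": "fem"}))       # set() -> postulate feat_i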
Refinement Steps
1. Error Correction Elicitation → 2. Finding Triggering Features → 3. Blame Assignment → 4. Rule Refinement
1. Error Correction Elicitation
SL: Gaudí was a great artist → TL: Gaudí era un artista grande
Edit Word: Wi = grande → Wi' = gran; Change Word Order → CTL: Gaudí era un gran artista
Refinement Steps: Error Correction Elicitation → Finding Triggering Features → Blame Assignment → Rule Refinement
So far: 1. Edit: Wi = grande, Wi' = gran; 2. Change Word Order: artista gran → gran artista
2. Finding Triggering Features
ADJ [great] [grande] (agr num = sg, agr gen = masc) vs. ADJ [great] [gran] (agr num = sg, agr gen = masc)
Delta function: difference at the feature level? Δ(grande, gran) = { } → need to postulate a new binary feature: feat1
[feat1 = +] = [type = pre-nominal]; [feat1 = −] = [type = post-nominal]
REFINE the lexical entries:
ADJ [great] [grande] (agr num = sg, agr gen = masc) → add feat1 = −
ADJ [great] [gran] (agr num = sg, agr gen = masc) → add feat1 = +
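Sketched in the same hypothetical Python representation: since Δ(grande, gran) is empty, a new binary feature feat1 is postulated and the two ADJ entries are split:

grande = {"sl": "great", "tl": "grande", "pos": "ADJ",
          "feats": {"num": "sg", "gen": "masc"}}
gran = {"sl": "great", "tl": "gran", "pos": "ADJ",
        "feats": {"num": "sg", "gen": "masc"}}

def refine_with_new_feature(entry_err, entry_corr, feat="feat1"):
    """Mark the erroneous entry feat = '-' and the corrected one feat = '+'."""
    entry_err["feats"][feat] = "-"    # grande: post-nominal
    entry_corr["feats"][feat] = "+"   # gran: pre-nominal

refine_with_new_feature(grande, gran)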
Refinement Steps: Error Correction Elicitation → Finding Triggering Features → Blame Assignment → Rule Refinement
1. Edit: Wi = grande, Wi' = gran; 2. Change Word Order: artista gran → gran artista; grande [feat1 = −], gran [feat1 = +]
3. Blame Assignment (from the transfer and generation tree)
tree: <( S,1 (NP,2 (N,5:1 "GAUDI") ) (VP,3 (VB,2 (AUX,17:2 "ERA") ) (NP,8 (DET,0:3 "UN") (N,4:5 "ARTISTA") (ADJ,5:4 "GRANDE"))) )>
Grammar: S,1 … NP,1 … NP,8 → the rule to blame is NP,8
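A small sketch of blame assignment over the tree string above: walk the bracketing, track the rule labels on the path, and report the path down to the corrected word (the tokenization is a simplified illustration of the slide's tree format, not the actual module):

import re

tree = ('<( S,1 (NP,2 (N,5:1 "GAUDI") ) (VP,3 (VB,2 (AUX,17:2 "ERA") ) '
        '(NP,8 (DET,0:3 "UN") (N,4:5 "ARTISTA") (ADJ,5:4 "GRANDE"))) )>')

def blame(tree_str, word):
    """Return the rule labels on the path from the root down to `word`."""
    path, hit = [], []
    for label, close, leaf in re.findall(
            r'\(\s*([A-Z]+,\d+)[^\s)]*|(\))|"([^"]*)"', tree_str):
        if label:
            path.append(label)        # entering a rule node
        elif close:
            if path:
                path.pop()            # leaving it
        elif leaf == word:
            hit = list(path)          # path covering the corrected word
    return hit

print(blame(tree, "GRANDE"))  # ['S,1', 'VP,3', 'NP,8', 'ADJ,5'] -> blame NP,8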
Refinement Steps: Error Correction Elicitation → Finding Triggering Features → Blame Assignment → Rule Refinement
1. Edit: Wi = grande, Wi' = gran; 2. Change Word Order: artista gran → gran artista; grande [feat1 = −], gran [feat1 = +]; 3. Blamed rule: NP,8 (N ADJ → ADJ N)
4. Rule Refinement
BIFURCATE: keep the general rule NP,8: [DET ADJ N] → [DET N ADJ] (a great artist → *un artista gran)
REFINE the new copy NP,8': [DET ADJ N] → [DET ADJ N] + (ADJ feat1 =c +), i.e., ADJ type = pre-nominal (a great artist → un gran artista)
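A sketch of the BIFURCATE + REFINE operation in the same hypothetical Python rule representation (field names are ours, not the grammar formalism): keep the general rule and add a constrained copy with the TL order fixed:

import copy

R0 = {"id": "NP,8", "sl": ["DET", "ADJ", "N"], "tl": ["DET", "N", "ADJ"],
      "constraints": []}

def bifurcate(rule, new_id, new_tl, constraint):
    """Copy a rule, change its TL side, and attach a value constraint."""
    r1 = copy.deepcopy(rule)
    r1["id"], r1["tl"] = new_id, new_tl
    r1["constraints"].append(constraint)
    return r1

# NP,8 still yields "un artista grande"; NP,8' yields "un gran artista"
# whenever the ADJ entry carries feat1 = + (e.g. "gran").
R1 = bifurcate(R0, "NP,8'", ["DET", "ADJ", "N"], ("ADJ", "feat1", "=c", "+"))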
Refinement Steps: Error Correction Elicitation → Finding Triggering Features → Blame Assignment → Rule Refinement
1. Edit: Wi = grande, Wi' = gran; 2. Change Word Order: artista gran → gran artista; grande [feat1 = −], gran [feat1 = +]; 3. Blamed rule: NP,8 (N ADJ ← ADJ N); 4. NP,8: [ADJ N] → [N ADJ] kept; NP,8': [ADJ N] → [ADJ N] + [ADJ feat1 =c +] added
Correct Translation Output
NP,8 with ADJ(great–grande) [feat1 = −] → Gaudi era un artista grande
NP,8' with ADJ(great–gran) [feat1 = +] → Gaudi era un gran artista
*Gaudi era un grande artista is no longer produced (NP,8' requires feat1 =c +)
Done? Not yet
NP,8 (R0): ADJ(grande) [feat1 = −] · NP,8' (R1): ADJ(gran) [feat1 =c +] [feat1 = +]
Still generated: un artista grande, un artista gran, un gran artista, *un grande artista
Need to restrict application of the general rule (R0) to post-nominal ADJs only
Add Blocking Constraint
NP,8 (R0): ADJ(grande) [feat1 = −] + blocking constraint [feat1 = −] · NP,8' (R1): ADJ(gran) [feat1 =c +] [feat1 = +]
Now: un artista grande, *un artista gran, un gran artista, *un grande artista
Can we also eliminate incorrect translations automatically?
Making the grammar tighter
If Wc = artista:
– add [feat1 = +] to N(artista)
– add an agreement constraint between N and ADJ to NP,8 (R0): ((N feat1) = (ADJ feat1))
Now: *un artista grande, *un artista gran, un gran artista, *un grande artista
(A sketch of these two steps follows.)
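The two tightening steps, continued in the same hypothetical Python representation (a sketch, not the actual constraint syntax):

R0 = {"id": "NP,8", "sl": ["DET", "ADJ", "N"], "tl": ["DET", "N", "ADJ"],
      "constraints": []}

# Blocking constraint: the general rule now only fires for post-nominal ADJs.
R0["constraints"].append(("ADJ", "feat1", "=", "-"))

# Clue word Wc = artista: mark the noun and tie N and ADJ together, so that
# for "artista" only the pre-nominal reading "un gran artista" survives.
artista = {"tl": "artista", "pos": "N", "feats": {"feat1": "+"}}
R0["constraints"].append((("N", "feat1"), "=", ("ADJ", "feat1")))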
Generalization Power: abstract feature (feat_i)
NP,8': [DET ADJ N] → [DET ADJ N] + (ADJ feat1 = +) also covers: a great friend → una gran amiga; a great person → una gran persona
Irina is a great friend → Irina es una gran amiga (instead of *Irina es una amiga grande)
Juan is a great person → Juan es una gran persona (instead of *Juan es una persona grande)
Generalization Power++
When the triggering feature already exists in the grammar/lexicon (POS, gender, number, etc.), refinements generalize to all lexical entries that have that feature.
Example (gender, via ADJ gender = N gender): I see the red car → *veo un auto roja → veo un auto rojo
The yellow houses are his → las casas amarillas son suyas (before: *las casas amarillos son suyas)
We need to go to a dark cave → tenemos que ir a una cueva oscura (before: *una cueva oscuro)
Outline Introduction Theoretical Space Online Error Elicitation Rule Refinement Framework MT Evaluation Framework –Relevant (and free) human reference translations –Not punished by BLEU, NIST and METEOR. Conclusions
MT Evaluation Framework
Multiple human reference translations, relevant to specific MT system errors, produced by bilingual speakers
– MT systems are not punished by BLEU and METEOR for picking a different synonym or morpho-syntactic variation
– Similar to what [Snover et al., 2006] propose (HTER)
MT Evaluation (continued)
Did the refined MT system generate the translation as corrected by the user (CTL)? Simple "recall": {0 = CTL not in output, 1 = CTL in output}
Did the number of bad translations (implicitly identified by users) generated by the system decrease? Precision at rank k (k = 5 in TCTool)
Did the system successfully manage to reduce ambiguity (number of alternative translations)? Reduction ratio
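One plausible reading of these three measures as a minimal Python sketch (function names and the interpretation of "precision at rank k" are ours):

def ctl_recall(candidates, ctl):
    """1 if the user-corrected translation appears in the output, else 0."""
    return int(ctl in candidates)

def precision_at_k(candidates, known_bad, k=5):
    """Fraction of the top-k candidates that are not known-bad translations."""
    top = candidates[:k]
    return sum(1 for c in top if c not in known_bad) / len(top)

def reduction_ratio(n_before, n_after):
    """How much the refinement shrank the candidate list."""
    return 1 - n_after / n_before

before = ["Gaudi era un artista grande", "Gaudi era un grande artista",
          "Gaudi era un gran artista"]
after = ["Gaudi era un gran artista"]
print(ctl_recall(after, "Gaudi era un gran artista"))                # 1
print(precision_at_k(before, {"Gaudi era un grande artista"}, k=3))  # ~0.67
print(reduction_ratio(len(before), len(after)))                      # ~0.67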
Outline Introduction Theoretical Space Online Error Elicitation Rule Refinement Framework MT Evaluation Framework Conclusions –Impact on MT systems –Work in Progress and Contributions –Future Work
Impact on Transfer-Based MT
[Architecture diagram, shown in several build steps: INPUT TEXT → morphological analyzer → Run-Time Transfer System → Translation Candidate Lattice → OUTPUT TEXT; the Learning Module and handcrafted rules provide Transfer Rules and Lexical Resources; the Online Translation Correction Tool feeds Rule Refinement, which updates the transfer rules and lexical resources used at run time]
TCTool can help improve:
– Rule-based MT (grammar, lexicon, LM)
– EBMT (examples, lexicon, alignments)
– Statistical MT (lexicon and alignments)
Relevant annotated data can also help develop smarter training algorithms. (Panel this afternoon, 3:45-4:45pm)
Work in Progress
Finalizing regression and test sets to perform a rigorous evaluation of the approach
Handling incorrect correction instances: have multiple users correct the same set of sentences and filter out noise (threshold: 90% of users agree; sketched below)
User study with multiple users to evaluate improvement after refinements
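A minimal sketch of the 90%-agreement filter mentioned above (the bookkeeping is an assumption, not the actual implementation):

from collections import Counter

def filter_corrections(corrections_per_sentence, threshold=0.9):
    """Keep a correction only if >= threshold of the users agree on it."""
    kept = {}
    for sentence, corrections in corrections_per_sentence.items():
        best, count = Counter(corrections).most_common(1)[0]
        if count / len(corrections) >= threshold:
            kept[sentence] = best
    return kept

votes = {"Gaudi era un artista grande":
         ["Gaudi era un gran artista"] * 9 + ["Gaudi fue un gran artista"]}
print(filter_corrections(votes))   # 9/10 agree -> correction kept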
Contributions so far
An efficient online GUI to display translations and alignments and solicit pinpoint fixes from non-expert bilingual users
A new framework to improve MT quality: an expandable set of rule refinement operations
An MT evaluation framework, which provides relevant reference translations
Future work
Explore other ways to make the interaction with users more fun:
– Games with a purpose [von Ahn & Blum, 2004 & 2006]
– Second language learning
Questions?
Thanks!
Backup slides
TCTool next step · What if user corrections are different (user noise)? · More than one correction per sentence? · Wc example · Data set · Where is it appropriate to refine vs. bifurcate? · Lexical bifurcate · Is the refinement process invariable?
Backup slides (ii)
TCTool demo simulation · RR operation patterns · Automatic evaluation feasibility study · AMTA paper results · User studies map · Precision, recall, F1 · NIST, BLEU, METEOR
Backup slides (iii)
Done? Not yet… (blocking constraints++) · Minimal pair example · Batch mode implementation · Interactive mode implementation · User studies · Evaluation of refined output · Another generalization example · Avenue architecture · Technical challenges · Rules formalism
Player = The Teacher
Goal: teach the MT system to translate correctly. Different levels of expertise:
– Beginner: language learning and improving (labeled data)
– Intermediate-Advanced: provide labeled data to improve the system and for beginner levels
Two players: cooperation
In English-to-Spanish MT:
– Player 1: Spanish native speaker, learning English
– Player 2: English native speaker, learning Spanish
Translation rule example {NP,8} (rule ID; SL side English, TL side Spanish):
;; X0            Y0   X1  X2  X3      Y1  Y2 Y3
NP::NP : [DET ADJ N] -> [DET N ADJ]
(
 (X1::Y1) (X2::Y3) (X3::Y2)   ; alignments (transfer)
 ((x0 def) = (x1 def))        ; analysis: passing definiteness up
 (x0 = x3)                    ; X3 is the head of X0
 (y2 = x3)                    ; generation: pass features of head to Y2
 ((y1 agr) = (y2 agr))        ; det-noun agreement
 ((y3 agr) = (y2 agr))        ; adj-noun agreement
)
Example Requiring a Minimal Pair
1. Run the SL sentence through the transfer engine: I see them → *veo los (correct TL: los veo)
2. Wi = los, but no Wi' nor Wc. We need a minimal pair to determine the appropriate refinement: I see cars → veo autos
3. Triggering feature(s): (veo los, veo autos) → Δ(los, autos) = {pos}; PRON(los) [pos = pron] vs. N(autos) [pos = n]
Avenue Architecture
[Diagram: Elicitation (Elicitation Tool + Elicitation Corpus → word-aligned parallel corpus) → Rule Learning (Learning Module → learned transfer rules, plus handcrafted rules and lexical resources) → Run-Time System (morphology analyzer → Run-Time Transfer System → decoder: INPUT TEXT → OUTPUT TEXT); the Translation Correction Tool feeds the Rule Refinement Module, which updates the rules]
Technical Challenges
Elicit minimal MT information from non-expert users
Automatically refine and expand translation rules (manually written or automatically learned) with minimal changes
Automatic evaluation of the refinement process
Batch Mode Implementation
Given a set of user corrections, apply the refinement module. For refinement operations of errors that can be refined fully automatically using:
1. Correction information only
2. Correction and error information (error type, clue word)
Rule Refinement Operations
[Decision tree: Modify / Add / Delete / Change Word Order, branching on ±Wc (clue word), ±alignment, whether Wi → Wi' preserves POS (POSi = POSi' vs. POSi ≠ POSi'), and ±rule; some leaves are handed to the RuleLearner]
1. Correction info only:
– It is a nice house → *Es una casa bonito → Es una casa bonita
– Gaudi was a great artist → *Gaudi era un artista grande → Gaudi era un gran artista
2. Correction and error info:
– I am proud of you → *Estoy orgullosa de tu → Estoy orgullosa de ti (PP → PREP NP)
Interactive Mode Implementation
For refinement operations of errors that can be refined fully automatically but (3.) require further user interaction: extra error information is needed to determine the triggering context automatically, so other relevant sentences (minimal pairs) must be given to the user at run-time.
Rule Refinement Operations
[Same decision tree as above]
3. Further user interaction:
– I see them → *Veo los → Los veo
Refining and Adding Constraints
VP,3: [VP NP] → [VP NP] (veo los, veo autos)
VP,3': [VP NP] → [NP VP] + [NP pos =c pron] (los veo, *autos veo)
Percolate triggering features up to the constituent level: NP → PRON + [NP pos = PRON pos]
Block application of the general rule: VP,3: [VP NP] → [VP NP] + [NP pos = (*NOT* pron)] (*veo los blocked, veo autos kept)
User Studies
TCTool: new MT error classification (Eng2Spa)
Different language pair: Mapudungun or Quechua → Spanish
Batch vs. interactive mode
Amount of information elicited: just corrections vs. corrections + error information
Evaluation of Refined MT Output
1. Evaluate the best translation: automatic evaluation metrics (BLEU, NIST, METEOR)
2. Evaluate the translation candidate list: size, precision (includes parsimony)
1. Evaluate Best Translation
Hypothesis file (translations to be evaluated automatically):
– Raw MT output: best sentence (picked by the user as correct or requiring the least amount of correction)
– Refined MT output: use the METEOR score at sentence level to pick the best candidate from the list
Run all automatic metrics on the new hypothesis file, using user corrections as reference translations.
2. Evaluate Translation Candidate List
Precision = tp / (tp + fp), where tp is binary {0,1} (1 = the user correction appears in the list) and tp + fp = total number of TLs in the list
[Diagram: example SL sentences with candidate TL lists scoring 0/3, 1/5, 1/5, 1/3, and 1/2]
Generalization Power
When the triggering feature already exists in the feature language (pos, gender, number, etc.): I see them → *veo los → los veo also yields:
– I love him → lo amo (before: *amo lo)
– They called me yesterday → me llamaron ayer (before: *llamaron me ayer)
– Mary helps her with her homework → Maria le ayuda con sus tareas (before: *Maria ayuda le con sus tareas)
Precision, Recall and F1
Precision = tp / (tp + fp) (penalizes selected but incorrect items)
Recall = tp / (tp + fn) (penalizes correct but not selected items)
F1 = 2PR / (P + R)
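The same definitions as a tiny Python sketch (the counts below are made-up illustrations chosen to land near the error-detection figures, not the study's raw numbers):

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1(p, r):
    return 2 * p * r / (p + r)

p, r = precision(tp=90, fp=10), recall(tp=89, fn=11)
print(round(p, 2), round(r, 2), round(f1(p, r), 2))   # 0.9 0.89 0.89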
Automatic Evaluation Metrics
BLEU: averages the precision for unigrams, bigrams and up to 4-grams and applies a length penalty [Papineni, 2001]
NIST: instead of n-gram precision, the information gain from each n-gram is taken into account [NIST, 2002]
METEOR: assigns most of the weight to recall instead of precision, and uses stemming [Lavie, 2004]
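For intuition, a compact sentence-level BLEU sketch along the lines described above (simplified: uniform n-gram weights and a brevity penalty; not the official implementation):

import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hyp, ref, max_n=4):
    """Geometric mean of 1..4-gram precisions, times a brevity penalty."""
    hyp, ref = hyp.split(), ref.split()
    logp = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matches = sum(min(c, r[g]) for g, c in h.items())
        total = max(sum(h.values()), 1)
        logp += math.log(max(matches, 1e-9) / total) / max_n
    brevity = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity * math.exp(logp)

print(bleu("Gaudi era un gran artista", "Gaudi era un gran artista"))  # 1.0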
Data Set
Split the development set (~400 sentences) into:
– Dev set: run user studies, develop the refinement module, validate functionality
– Test set: evaluate the effect of refinement operations
Plus a wild test set (from naturally occurring text). Requirement: sentences need to be fully parsed by the grammar.
Refine vs. Bifurcate
Batch mode → bifurcate (no way to tell whether the original rule should never apply)
Interactive mode → refine (change the original rule), if we can get enough evidence that the original rule never applies
Corrections involving agreement constraints seem to hold for all cases → refine (open research question)
More than one correction per sentence
[Diagram: corrections A and B marked on the same TL; A is applied first, then B]
Tetris approach to automatic rule refinement. Assumption: different corrections to different words → different errors
Exception: structural divergences
He danced her out of the room → *La bailó fuera de la habitación (her he-danced out of the room) → La sacó de la habitación bailando (her he-took-out of the room dancing)
We have no way of knowing that these corrections are related. Do one error at a time; if TQ decreases over the test set, hypothesize that it is a divergence and feed it to the Rule Learner as a new (manually corrected) training example.
Constituent order change
I gave him the tools → *di a él las herramientas (I-gave to-him the tools) → le di las herramientas a él (him I-gave the tools to-him)
Desired refinement: VP → VP PP(a PRON) NP ⇒ VP → VP NP PP(a PRON)
We can extract constituent information from the MT output (parse tree) and treat it as one error/correction.
More than one correction per error
Example: edit and move the same word (like gran, bailó). Occam's razor assumption: both corrections are part of the same error.
Wc Example
I am proud of you → *Estoy orgullosa de tu (I-am proud of you-nom) → Estoy orgullosa de ti (I-am proud of you-oblique); Wc = de, tu → ti
Without the Wc information, we would need to increase the ambiguity of the grammar significantly by adding [you → ti]: I love you → *ti quiero (te quiero, …); you read → *ti lees (tu lees, …)
Lexical bifurcate
Should the system copy all the features to the new entry? A good starting point, but we might want to copy just a subset of features (possibly POS-dependent). Open research question.
[TCTool demo screenshots: SL + best TL picked by the user; changing word order; changing "grande" into "gran"; the three-step correction sequence]
Input to the RR module
– User correction log file
– Transfer engine output (+ parse tree):
sl: I see them / tl: VEO LOS / tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") ) (NP,0 (PRON,2:3 "LOS") ) ) ) )>
sl: I see cars / tl: VEO AUTOS / tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") ) (NP,2 (N,1:3 "AUTOS") ) ) ) )>
Types of RR Operations
Grammar:
– bifurcate: R0 → R0 + R1 [= R0' + constr]; Cov[R0] ⊆ Cov[R0, R1]
– bifurcate: R0 → R1 [= R0 + constr = −] + R2 [= R0' + constr =c +]; Cov[R0] = Cov[R1, R2]
– refine: R0 → R1 [= R0 + constr]; Cov[R1] ⊆ Cov[R0]
Lexicon:
– Lex0 → Lex0 + Lex1 [= Lex0 + constr]
– Lex0 → Lex1 [= Lex0 + constr]
– Lex0 → Lex0 + Lex1 [= Lex0 + new TL word]
– → Lex1 (adding a lexical item)
Manual vs. Learned Grammars [AMTA 2004]
Manual inspection and automatic MT evaluation (NIST, BLEU, METEOR) comparing the manual grammar against the learned grammar [table of scores not recoverable from the slide text]
Human Oracle Experiment
As a feasibility experiment, we compared raw MT output with manually corrected MT output: the difference is statistically significant (confidence interval test). This is an upper bound on how much difference we should expect any refinement approach to make.
Invariable process?
The rule refinement process is not invariable: the order of the corrected sentences input to the system matters. Example:
– 1st: gran artista → bifurcate (2 rules); 2nd: casa bonito → add agr constraint to only 1 rule (the original, general one); the specific rule is still incorrect (missing agr constraint)
– 1st: casa bonito → add agr constraint; 2nd: gran artista → bifurcate; both rules have the agreement constraint (optimal order)
User noise?
Solution: have several users evaluate and correct the same test set (threshold: 90% agreement) on corrections and on error information (type, clue word). Only modify the grammar given enough evidence of an incorrect rule.
User Studies Map
[Matrix: RR module × manual vs. learned grammars; Eng2Spa with corrections only and corrections + error info, Mapu2Spa; batch mode vs. interactive mode; active learning marked as proposed work]
Recycle corrections of Machine Translation output back into the system by refining and expanding existing translation rules
Rule Refinement Operations (examples)
[Same decision tree as above: Modify / Add / Delete / Change Word Order, branching on ±Wc, ±alignment, POS preservation, ±rule]
1. Correction info only:
– It is a nice house → Es una casa bonito → Es una casa bonita
– John and Mary fell → Juan y Maria cayeron → Juan y Maria se cayeron
– Gaudi was a great artist → Gaudi era un artista grande → Gaudi era un gran artista
– I will help him fix the car → Ayudaré a él a arreglar el auto → Le ayudaré a arreglar el auto
– I would like to go → Me gustaria que ir → Me gustaria ir
2. Correction and error info:
– I am proud of you → Estoy orgullosa tu → Estoy orgullosa de ti (PP → PREP NP)
Focus 3:
– Wally plays the guitar → Wally juega la guitarra → Wally toca la guitarra
– I saw the woman → Vi la mujer → Vi a la mujer
– I see them → Veo los → Los veo
Outside the scope of the thesis:
– John read the book → A Juan leyó el libro → Juan leyó el libro
– Where are you from? → Donde eres tu de? → De donde eres tu?