Automating Post-Editing To Improve MT Systems Ariadna Font Llitjós and Jaime Carbonell APE Workshop, AMTA – Boston August 12, 2006
Outline Introduction –Problem and Motivation –Goal and Solution –Related Work Theoretical Space Online Error Elicitation Rule Refinement Framework MT Evaluation Framework Conclusions
Problem and Motivation
MT output: Assassinated a diplomat Russian and kidnapped other four in Bagdad
We could hire post-editors to correct machine translations, but this is:
– Expensive and time consuming
– Not feasible for large amounts of data (Google, Yahoo, etc.)
– Unable to generalize to new sentences
Ultimate Goal Automatically Improve MT Quality
Two Alternatives
Automatic learning of post-editing rules (APE): + system independent; – several thousand sentences might need to be corrected for the same error
Automatic refinement of the MT system: + attacks the core of the problem; – system dependent
Our Approach Automatically Improve MT Quality by Recycling Post-Editing Information Back to MT Systems
Our Solution
Get non-expert bilingual speakers to provide correction feedback online
– Make correcting translations easy and fun
– 5-10 minutes a day → large amounts of correction data
Feed corrections back into the MT system, so that they can be generalized → the system will translate new sentences better
SL: Asesinado un diplomático ruso y secuestrados otros cuatro en Bagdad
TL: Assassinated a diplomat Russian and kidnapped other four in Bagdad
Bilingual Post-editing
In addition to the TL sentence and context information (when available), as in traditional post-editing, it also provides post-editors with:
– the Source Language sentence
– alignment information
Allegedly a harder task, but we can now get much more data for free
MT Approaches
[Vauquois triangle: Direct approaches (SMT, EBMT) at the base, Transfer Rules in the middle, Interlingua at the top; Syntactic Parsing and Semantic Analysis on the analysis side from the Source (e.g., English), Sentence Planning and Text Generation on the generation side toward the Target (e.g., Spanish)]
Related Work
Post-editing, rule adaptation, and fixing machine translation: [Nishida et al., 1988], [Brill, 1993], [Knight & Chander, 1994], [Su et al., 1995], [Allen & Hogan, 2000], [Gavaldà, 2000], [Menezes & Richardson, 2001], [Allen, 2003], [Corston-Oliver & Gamon, 2003], [Imamura et al., 2003], [Callison-Burch, 2004]
Our approach: non-expert user feedback; provides relevant reference translations; generalizes over unseen data
System Architecture
[Diagram: INPUT TEXT → Run-Time MT System → OUTPUT TEXT; the Online Translation Correction Tool collects corrections, which feed Rule Refinement and, in turn, the Rule Learning and other resources used by the run-time system]
Main Technical Challenge
Mapping simple user edits to MT output into improved translation rules: blame assignment, rule modifications, lexical expansions
Outline Introduction Theoretical Space –Error classification –Limitations Online Error Elicitation Rule Refinement Framework MT Evaluation Framework Conclusions
Error Typology for Automatic Rule Refinement (simplified)
– Missing word
– Extra word
– Wrong word order (local vs. long distance; word vs. phrase; ± word change)
– Incorrect word (sense, form, selectional restrictions, idiom)
– Missing constraint
– Extra constraint
Limitations of Approach
The SL sentence needs to be fully parsed by the translation grammar (the current approach needs rules to refine).
Depends on bilingual speakers' ability to detect MT errors.
Outline Introduction Theoretical Space Online Error Elicitation –Translation Correction Tool Rule Refinement Framework MT Evaluation Framework Conclusions
TCTool (Demo)
Interactive elicitation of error information. Actions:
– Add a word
– Delete a word
– Modify a word
– Change word order
Expanding the lexicon
1. OOV words/idioms: Hamas cabinet calls a truce to avoid Israeli retaliation → El gabinete de Hamas llama un *TRUCE* para evitar la venganza israelí → …acuerda una tregua…
2. New word form: The children fell → *los niños cayeron → los niños se cayeron
3. New word sense: The girl plays the guitar → *la muchacha juega la guitarra → la muchacha toca la guitarra
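As a rough illustration (ours, not from the talk), a minimal Python sketch of these three lexicon expansions; the dict layout and the add_entry helper are hypothetical stand-ins for the actual lexicon format:

def add_entry(lexicon, sl_word, tl_word, pos, feats=None):
    """Append a new SL-TL lexical entry with optional feature values."""
    lexicon.append({"sl": sl_word, "tl": tl_word, "pos": pos,
                    "feats": feats or {}})

lexicon = []
add_entry(lexicon, "truce", "tregua", "N", {"gender": "fem"})      # 1. OOV word
add_entry(lexicon, "fall", "caerse", "V", {"reflexive": "+"})      # 2. new word form
add_entry(lexicon, "play", "tocar", "V", {"sense": "instrument"})  # 3. new word sense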
Refining the grammar
4. Add agreement constraints: I see the red car → *veo el coche roja → veo el coche rojo
5. Add word or constituent: You saw the woman → viste la mujer → viste a la mujer
6. Move constituent order: I saw you → *yo vi te → yo te vi
SL: Gaudí was a great artist → TL: Gaudí era un artista grande
User actions: Edit Word (adding the new form to the lexicon) + Change Word Order (change constituent order)
CTL: Gaudí era un gran artista
Eng2Spa User Study [LREC 2004]
MT error classification: 9 linguistically motivated classes [Flanagan, 1994], [White et al., 1994]: word order, sense, agreement error (number, person, gender, tense), form, incorrect word and no translation
Interactive elicitation of error information:
– error detection: precision 90%, recall 89%
– error classification: precision 72%, recall 71%
Outline Introduction Theoretical Space Online Error Elicitation Rule Refinement Framework –RR operations –Formalizing Error information –Refinement Steps (Example) MT Evaluation Framework Conclusions
Types of Refinement Operations
1. Refine a translation rule: R0 → R1 (change R0 to make it more specific or more general)
R0: NP: [DET ADJ N] → [DET N ADJ] (a nice house → *una casa bonito)
R1: NP: [DET ADJ N] → [DET N ADJ] + (N gender = ADJ gender) (a nice house → una casa bonita)
Types of Refinement Operations (2)
2. Bifurcate a translation rule: R0 → R0 (same, general rule) + R1 (add a new, more specific rule)
R0: NP: [DET ADJ N] → [DET N ADJ] (a nice house → una casa bonita)
R1: NP: [DET ADJ N] → [DET ADJ N] + (ADJ type: pre-nominal) (a great artist → un gran artista)
Formalizing Error Information
Wi = error, Wi' = correction, Wc = clue word
For NP: [DET ADJ N] → [DET N ADJ], a nice house → *una casa bonito / una casa bonita:
Wi = bonito, Wi' = bonita, Wc = casa
Triggering Feature Detection
Comparison at the feature level to detect triggering feature(s). Delta function: Δ(Wi, Wi')
Examples: Δ(bonito, bonita) = {gender}; Δ(comíamos, comía) = {person, number}; Δ(mujer, guitarra) = { }
If the set is empty, we need to postulate a new binary feature (feat_i = {+,−})
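A minimal sketch of the delta function, assuming lexical entries carry simple feature dictionaries (an illustration, not the actual implementation):

def delta(feats_i, feats_i_prime):
    """Return the set of features whose values differ between Wi and Wi'."""
    keys = set(feats_i) | set(feats_i_prime)
    return {f for f in keys if feats_i.get(f) != feats_i_prime.get(f)}

print(delta({"gender": "masc", "number": "sg"},
            {"gender": "fem", "number": "sg"}))   # {'gender'} (bonito, bonita)
print(delta({"person": 1, "number": "pl"},
            {"person": 3, "number": "sg"}))       # {'person', 'number'}
print(delta({"pos": "N", "gender": "fem"},
            {"pos": "N", "gender": "fem"}))       # set() -> postulate feat_i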
Refinement Steps
1. Error Correction Elicitation → 2. Finding Triggering Features → 3. Blame Assignment → 4. Rule Refinement
1. Error Correction Elicitation
SL: Gaudí was a great artist → TL: Gaudí era un artista grande
Edit Word: Wi = grande → Wi' = gran; Change Word Order → CTL: Gaudí era un gran artista
Refinement Steps: Error Correction Elicitation → Finding Triggering Features → Blame Assignment → Rule Refinement
So far: 1. Edit: Wi = grande, Wi' = gran; 2. Change Word Order: artista gran → gran artista
2. Finding Triggering Features
ADJ [great] [grande] (agr num = sg, agr gen = masc) vs. ADJ [great] [gran] (agr num = sg, agr gen = masc)
Delta function: difference at the feature level? Δ(grande, gran) = { } → need to postulate a new binary feature: feat1
[feat1 = +] = [type = pre-nominal]; [feat1 = −] = [type = post-nominal]
REFINE the lexical entries:
ADJ [great] [grande] (agr num = sg, agr gen = masc) → add feat1 = −
ADJ [great] [gran] (agr num = sg, agr gen = masc) → add feat1 = +
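Sketched in the same hypothetical Python representation: since Δ(grande, gran) is empty, a new binary feature feat1 is postulated and the two ADJ entries are split:

grande = {"sl": "great", "tl": "grande", "pos": "ADJ",
          "feats": {"num": "sg", "gen": "masc"}}
gran = {"sl": "great", "tl": "gran", "pos": "ADJ",
        "feats": {"num": "sg", "gen": "masc"}}

def refine_with_new_feature(entry_err, entry_corr, feat="feat1"):
    """Mark the erroneous entry feat = '-' and the corrected one feat = '+'."""
    entry_err["feats"][feat] = "-"    # grande: post-nominal
    entry_corr["feats"][feat] = "+"   # gran: pre-nominal

refine_with_new_feature(grande, gran)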
Refinement Steps: Error Correction Elicitation → Finding Triggering Features → Blame Assignment → Rule Refinement
1. Edit: Wi = grande, Wi' = gran; 2. Change Word Order: artista gran → gran artista; grande [feat1 = −], gran [feat1 = +]
3. Blame Assignment (from the transfer and generation tree)
tree: <( S,1 (NP,2 (N,5:1 "GAUDI") ) (VP,3 (VB,2 (AUX,17:2 "ERA") ) (NP,8 (DET,0:3 "UN") (N,4:5 "ARTISTA") (ADJ,5:4 "GRANDE"))) )>
Grammar: S,1 … NP,1 … NP,8 → the rule to blame is NP,8
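A small sketch of blame assignment over the tree string above: walk the bracketing, track the rule labels on the path, and report the path down to the corrected word (the tokenization is a simplified illustration of the slide's tree format, not the actual module):

import re

tree = ('<( S,1 (NP,2 (N,5:1 "GAUDI") ) (VP,3 (VB,2 (AUX,17:2 "ERA") ) '
        '(NP,8 (DET,0:3 "UN") (N,4:5 "ARTISTA") (ADJ,5:4 "GRANDE"))) )>')

def blame(tree_str, word):
    """Return the rule labels on the path from the root down to `word`."""
    path, hit = [], []
    for label, close, leaf in re.findall(
            r'\(\s*([A-Z]+,\d+)[^\s)]*|(\))|"([^"]*)"', tree_str):
        if label:
            path.append(label)        # entering a rule node
        elif close:
            if path:
                path.pop()            # leaving it
        elif leaf == word:
            hit = list(path)          # path covering the corrected word
    return hit

print(blame(tree, "GRANDE"))  # ['S,1', 'VP,3', 'NP,8', 'ADJ,5'] -> blame NP,8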
Refinement Steps: Error Correction Elicitation → Finding Triggering Features → Blame Assignment → Rule Refinement
1. Edit: Wi = grande, Wi' = gran; 2. Change Word Order: artista gran → gran artista; grande [feat1 = −], gran [feat1 = +]; 3. Blamed rule: NP,8 (N ADJ → ADJ N)
4. Rule Refinement
BIFURCATE: keep the general rule NP,8: [DET ADJ N] → [DET N ADJ] (a great artist → *un artista gran)
REFINE the new copy NP,8': [DET ADJ N] → [DET ADJ N] + (ADJ feat1 =c +), i.e., ADJ type = pre-nominal (a great artist → un gran artista)
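A sketch of the BIFURCATE + REFINE operation in the same hypothetical Python rule representation (field names are ours, not the grammar formalism): keep the general rule and add a constrained copy with the TL order fixed:

import copy

R0 = {"id": "NP,8", "sl": ["DET", "ADJ", "N"], "tl": ["DET", "N", "ADJ"],
      "constraints": []}

def bifurcate(rule, new_id, new_tl, constraint):
    """Copy a rule, change its TL side, and attach a value constraint."""
    r1 = copy.deepcopy(rule)
    r1["id"], r1["tl"] = new_id, new_tl
    r1["constraints"].append(constraint)
    return r1

# NP,8 still yields "un artista grande"; NP,8' yields "un gran artista"
# whenever the ADJ entry carries feat1 = + (e.g. "gran").
R1 = bifurcate(R0, "NP,8'", ["DET", "ADJ", "N"], ("ADJ", "feat1", "=c", "+"))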
Refinement Steps: Error Correction Elicitation → Finding Triggering Features → Blame Assignment → Rule Refinement
1. Edit: Wi = grande, Wi' = gran; 2. Change Word Order: artista gran → gran artista; grande [feat1 = −], gran [feat1 = +]; 3. Blamed rule: NP,8 (N ADJ ← ADJ N); 4. NP,8: [ADJ N] → [N ADJ] kept; NP,8': [ADJ N] → [ADJ N] + [ADJ feat1 =c +] added
Correct Translation Output
NP,8 with ADJ(great–grande) [feat1 = −] → Gaudi era un artista grande
NP,8' with ADJ(great–gran) [feat1 = +] → Gaudi era un gran artista
*Gaudi era un grande artista is no longer produced (NP,8' requires feat1 =c +)
Done? Not yet
NP,8 (R0): ADJ(grande) [feat1 = −] · NP,8' (R1): ADJ(gran) [feat1 =c +] [feat1 = +]
Still generated: un artista grande, un artista gran, un gran artista, *un grande artista
Need to restrict application of the general rule (R0) to post-nominal ADJs only
Add Blocking Constraint
NP,8 (R0): ADJ(grande) [feat1 = −] + blocking constraint [feat1 = −] · NP,8' (R1): ADJ(gran) [feat1 =c +] [feat1 = +]
Now: un artista grande, *un artista gran, un gran artista, *un grande artista
Can we also eliminate incorrect translations automatically?
Making the grammar tighter
If Wc = artista:
– add [feat1 = +] to N(artista)
– add an agreement constraint between N and ADJ to NP,8 (R0): ((N feat1) = (ADJ feat1))
Now: *un artista grande, *un artista gran, un gran artista, *un grande artista
(A sketch of these two steps follows.)
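The two tightening steps, continued in the same hypothetical Python representation (a sketch, not the actual constraint syntax):

R0 = {"id": "NP,8", "sl": ["DET", "ADJ", "N"], "tl": ["DET", "N", "ADJ"],
      "constraints": []}

# Blocking constraint: the general rule now only fires for post-nominal ADJs.
R0["constraints"].append(("ADJ", "feat1", "=", "-"))

# Clue word Wc = artista: mark the noun and tie N and ADJ together, so that
# for "artista" only the pre-nominal reading "un gran artista" survives.
artista = {"tl": "artista", "pos": "N", "feats": {"feat1": "+"}}
R0["constraints"].append((("N", "feat1"), "=", ("ADJ", "feat1")))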
Generalization Power: abstract feature (feat_i)
NP,8': [DET ADJ N] → [DET ADJ N] + (ADJ feat1 = +) also covers: a great friend → una gran amiga; a great person → una gran persona
Irina is a great friend → Irina es una gran amiga (instead of *Irina es una amiga grande)
Juan is a great person → Juan es una gran persona (instead of *Juan es una persona grande)
Generalization Power++
When the triggering feature already exists in the grammar/lexicon (POS, gender, number, etc.), refinements generalize to all lexical entries that have that feature.
Example (gender, via ADJ gender = N gender): I see the red car → *veo un auto roja → veo un auto rojo
The yellow houses are his → las casas amarillas son suyas (before: *las casas amarillos son suyas)
We need to go to a dark cave → tenemos que ir a una cueva oscura (before: *una cueva oscuro)
Outline Introduction Theoretical Space Online Error Elicitation Rule Refinement Framework MT Evaluation Framework –Relevant (and free) human reference translations –Not punished by BLEU, NIST and METEOR. Conclusions
MT Evaluation Framework
Multiple human reference translations, relevant to specific MT system errors, produced by bilingual speakers
– MT systems are not punished by BLEU and METEOR for picking a different synonym or morpho-syntactic variation
– Similar to what [Snover et al., 2006] propose (HTER)
MT Evaluation (continued)
Did the refined MT system generate the translation as corrected by the user (CTL)? Simple "recall": {0 = CTL not in output, 1 = CTL in output}
Did the number of bad translations (implicitly identified by users) generated by the system decrease? Precision at rank k (k = 5 in TCTool)
Did the system successfully manage to reduce ambiguity (number of alternative translations)? Reduction ratio
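One plausible reading of these three measures as a minimal Python sketch (function names and the interpretation of "precision at rank k" are ours):

def ctl_recall(candidates, ctl):
    """1 if the user-corrected translation appears in the output, else 0."""
    return int(ctl in candidates)

def precision_at_k(candidates, known_bad, k=5):
    """Fraction of the top-k candidates that are not known-bad translations."""
    top = candidates[:k]
    return sum(1 for c in top if c not in known_bad) / len(top)

def reduction_ratio(n_before, n_after):
    """How much the refinement shrank the candidate list."""
    return 1 - n_after / n_before

before = ["Gaudi era un artista grande", "Gaudi era un grande artista",
          "Gaudi era un gran artista"]
after = ["Gaudi era un gran artista"]
print(ctl_recall(after, "Gaudi era un gran artista"))                # 1
print(precision_at_k(before, {"Gaudi era un grande artista"}, k=3))  # ~0.67
print(reduction_ratio(len(before), len(after)))                      # ~0.67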
Outline Introduction Theoretical Space Online Error Elicitation Rule Refinement Framework MT Evaluation Framework Conclusions –Impact on MT systems –Work in Progress and Contributions –Future Work
Impact on Transfer-Based MT
[Architecture diagram, shown in several build steps: INPUT TEXT → morphological analyzer → Run-Time Transfer System → Translation Candidate Lattice → OUTPUT TEXT; the Learning Module and handcrafted rules provide Transfer Rules and Lexical Resources; the Online Translation Correction Tool feeds Rule Refinement, which updates the transfer rules and lexical resources used at run time]
TCTool can help improve:
– Rule-based MT (grammar, lexicon, LM)
– EBMT (examples, lexicon, alignments)
– Statistical MT (lexicon and alignments)
Relevant annotated data can also help develop smarter training algorithms. (Panel this afternoon, 3:45-4:45pm)
Work in Progress
Finalizing regression and test sets to perform a rigorous evaluation of the approach
Handling incorrect correction instances: have multiple users correct the same set of sentences and filter out noise (threshold: 90% of users agree; sketched below)
User study with multiple users to evaluate improvement after refinements
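A minimal sketch of the 90%-agreement filter mentioned above (the bookkeeping is an assumption, not the actual implementation):

from collections import Counter

def filter_corrections(corrections_per_sentence, threshold=0.9):
    """Keep a correction only if >= threshold of the users agree on it."""
    kept = {}
    for sentence, corrections in corrections_per_sentence.items():
        best, count = Counter(corrections).most_common(1)[0]
        if count / len(corrections) >= threshold:
            kept[sentence] = best
    return kept

votes = {"Gaudi era un artista grande":
         ["Gaudi era un gran artista"] * 9 + ["Gaudi fue un gran artista"]}
print(filter_corrections(votes))   # 9/10 agree -> correction kept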
Contributions so far
An efficient online GUI to display translations and alignments and solicit pinpoint fixes from non-expert bilingual users
A new framework to improve MT quality: an expandable set of rule refinement operations
An MT evaluation framework, which provides relevant reference translations
Future work
Explore other ways to make the interaction with users more fun:
– Games with a purpose [von Ahn & Blum, 2004 & 2006]
– Second language learning
Questions?
Thanks!
Backup slides
TCTool next step · What if user corrections are different (user noise)? · More than one correction per sentence? · Wc example · Data set · Where is it appropriate to refine vs. bifurcate? · Lexical bifurcate · Is the refinement process invariable?
Backup slides (ii)
TCTool demo simulation · RR operation patterns · Automatic evaluation feasibility study · AMTA paper results · User studies map · Precision, recall, F1 · NIST, BLEU, METEOR
Backup slides (iii)
Done? Not yet… (blocking constraints++) · Minimal pair example · Batch mode implementation · Interactive mode implementation · User studies · Evaluation of refined output · Another generalization example · Avenue architecture · Technical challenges · Rules formalism
Player = The Teacher
Goal: teach the MT system to translate correctly. Different levels of expertise:
– Beginner: language learning and improving (labeled data)
– Intermediate-Advanced: provide labeled data to improve the system and for beginner levels
Two players: cooperation
In English-to-Spanish MT:
– Player 1: Spanish native speaker, learning English
– Player 2: English native speaker, learning Spanish
Translation rule example {NP,8} (rule ID; SL side English, TL side Spanish):
;; X0            Y0   X1  X2  X3      Y1  Y2 Y3
NP::NP : [DET ADJ N] -> [DET N ADJ]
(
 (X1::Y1) (X2::Y3) (X3::Y2)   ; alignments (transfer)
 ((x0 def) = (x1 def))        ; analysis: passing definiteness up
 (x0 = x3)                    ; X3 is the head of X0
 (y2 = x3)                    ; generation: pass features of head to Y2
 ((y1 agr) = (y2 agr))        ; det-noun agreement
 ((y3 agr) = (y2 agr))        ; adj-noun agreement
)
Example Requiring a Minimal Pair
1. Run the SL sentence through the transfer engine: I see them → *veo los (correct TL: los veo)
2. Wi = los, but no Wi' nor Wc. We need a minimal pair to determine the appropriate refinement: I see cars → veo autos
3. Triggering feature(s): (veo los, veo autos) → Δ(los, autos) = {pos}; PRON(los) [pos = pron] vs. N(autos) [pos = n]
Avenue Architecture
[Diagram: Elicitation (Elicitation Tool + Elicitation Corpus → word-aligned parallel corpus) → Rule Learning (Learning Module → learned transfer rules, plus handcrafted rules and lexical resources) → Run-Time System (morphology analyzer → Run-Time Transfer System → decoder: INPUT TEXT → OUTPUT TEXT); the Translation Correction Tool feeds the Rule Refinement Module, which updates the rules]
Technical Challenges
Elicit minimal MT information from non-expert users
Automatically refine and expand translation rules (manually written or automatically learned) with minimal changes
Automatic evaluation of the refinement process
Batch Mode Implementation
Given a set of user corrections, apply the refinement module. For refinement operations of errors that can be refined fully automatically using:
1. Correction information only
2. Correction and error information (error type, clue word)
Rule Refinement Operations
[Decision tree: Modify / Add / Delete / Change Word Order, branching on ±Wc (clue word), ±alignment, whether Wi → Wi' preserves POS (POSi = POSi' vs. POSi ≠ POSi'), and ±rule; some leaves are handed to the RuleLearner]
1. Correction info only:
– It is a nice house → *Es una casa bonito → Es una casa bonita
– Gaudi was a great artist → *Gaudi era un artista grande → Gaudi era un gran artista
2. Correction and error info:
– I am proud of you → *Estoy orgullosa de tu → Estoy orgullosa de ti (PP → PREP NP)
Interactive Mode Implementation
For refinement operations of errors that can be refined fully automatically but (3.) require further user interaction: extra error information is needed to determine the triggering context automatically, so other relevant sentences (minimal pairs) must be given to the user at run-time.
Rule Refinement Operations
[Same decision tree as above]
3. Further user interaction:
– I see them → *Veo los → Los veo
Refining and Adding Constraints
VP,3: [VP NP] → [VP NP] (veo los, veo autos)
VP,3': [VP NP] → [NP VP] + [NP pos =c pron] (los veo, *autos veo)
Percolate triggering features up to the constituent level: NP → PRON + [NP pos = PRON pos]
Block application of the general rule: VP,3: [VP NP] → [VP NP] + [NP pos = (*NOT* pron)] (*veo los blocked, veo autos kept)
User Studies
TCTool: new MT error classification (Eng2Spa)
Different language pair: Mapudungun or Quechua → Spanish
Batch vs. interactive mode
Amount of information elicited: just corrections vs. corrections + error information
Evaluation of Refined MT Output
1. Evaluate the best translation: automatic evaluation metrics (BLEU, NIST, METEOR)
2. Evaluate the translation candidate list: size, precision (includes parsimony)
1. Evaluate Best Translation
Hypothesis file (translations to be evaluated automatically):
– Raw MT output: best sentence (picked by the user as correct or requiring the least amount of correction)
– Refined MT output: use the METEOR score at sentence level to pick the best candidate from the list
Run all automatic metrics on the new hypothesis file, using user corrections as reference translations.
2. Evaluate Translation Candidate List
Precision = tp / (tp + fp), where tp is binary {0,1} (1 = the user correction appears in the list) and tp + fp = total number of TLs in the list
[Diagram: example SL sentences with candidate TL lists scoring 0/3, 1/5, 1/5, 1/3, and 1/2]
Generalization Power
When the triggering feature already exists in the feature language (pos, gender, number, etc.): I see them → *veo los → los veo also yields:
– I love him → lo amo (before: *amo lo)
– They called me yesterday → me llamaron ayer (before: *llamaron me ayer)
– Mary helps her with her homework → Maria le ayuda con sus tareas (before: *Maria ayuda le con sus tareas)
Precision, Recall and F1
Precision = tp / (tp + fp) (penalizes selected but incorrect items)
Recall = tp / (tp + fn) (penalizes correct but not selected items)
F1 = 2PR / (P + R)
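The same definitions as a tiny Python sketch (the counts below are made-up illustrations chosen to land near the error-detection figures, not the study's raw numbers):

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1(p, r):
    return 2 * p * r / (p + r)

p, r = precision(tp=90, fp=10), recall(tp=89, fn=11)
print(round(p, 2), round(r, 2), round(f1(p, r), 2))   # 0.9 0.89 0.89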
Automatic Evaluation Metrics
BLEU: averages the precision for unigrams, bigrams and up to 4-grams and applies a length penalty [Papineni, 2001]
NIST: instead of n-gram precision, the information gain from each n-gram is taken into account [NIST, 2002]
METEOR: assigns most of the weight to recall instead of precision, and uses stemming [Lavie, 2004]
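For intuition, a compact sentence-level BLEU sketch along the lines described above (simplified: uniform n-gram weights and a brevity penalty; not the official implementation):

import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hyp, ref, max_n=4):
    """Geometric mean of 1..4-gram precisions, times a brevity penalty."""
    hyp, ref = hyp.split(), ref.split()
    logp = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matches = sum(min(c, r[g]) for g, c in h.items())
        total = max(sum(h.values()), 1)
        logp += math.log(max(matches, 1e-9) / total) / max_n
    brevity = min(1.0, math.exp(1 - len(ref) / len(hyp)))
    return brevity * math.exp(logp)

print(bleu("Gaudi era un gran artista", "Gaudi era un gran artista"))  # 1.0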
Data Set
Split the development set (~400 sentences) into:
– Dev set: run user studies, develop the refinement module, validate functionality
– Test set: evaluate the effect of refinement operations
Plus a wild test set (from naturally occurring text). Requirement: sentences need to be fully parsed by the grammar.
Refine vs. Bifurcate
Batch mode → bifurcate (no way to tell whether the original rule should never apply)
Interactive mode → refine (change the original rule), if we can get enough evidence that the original rule never applies
Corrections involving agreement constraints seem to hold for all cases → refine (open research question)
More than one correction per sentence
[Diagram: corrections A and B marked on the same TL; A is applied first, then B]
Tetris approach to automatic rule refinement. Assumption: different corrections to different words → different errors
Exception: structural divergences
He danced her out of the room → *La bailó fuera de la habitación (her he-danced out of the room) → La sacó de la habitación bailando (her he-took-out of the room dancing)
We have no way of knowing that these corrections are related. Do one error at a time; if TQ decreases over the test set, hypothesize that it is a divergence and feed it to the Rule Learner as a new (manually corrected) training example.
Constituent order change
I gave him the tools → *di a él las herramientas (I-gave to-him the tools) → le di las herramientas a él (him I-gave the tools to-him)
Desired refinement: VP → VP PP(a PRON) NP ⇒ VP → VP NP PP(a PRON)
We can extract constituent information from the MT output (parse tree) and treat it as one error/correction.
More than one correction per error
Example: edit and move the same word (like gran, bailó). Occam's razor assumption: both corrections are part of the same error.
Wc Example
I am proud of you → *Estoy orgullosa de tu (I-am proud of you-nom) → Estoy orgullosa de ti (I-am proud of you-oblique); Wc = de, tu → ti
Without the Wc information, we would need to increase the ambiguity of the grammar significantly by adding [you → ti]: I love you → *ti quiero (te quiero, …); you read → *ti lees (tu lees, …)
Lexical bifurcate
Should the system copy all the features to the new entry? A good starting point, but we might want to copy just a subset of features (possibly POS-dependent). Open research question.
[TCTool demo screenshots: SL + best TL picked by the user; changing word order; changing "grande" into "gran"; the three-step correction sequence]
Input to the RR module
– User correction log file
– Transfer engine output (+ parse tree):
sl: I see them / tl: VEO LOS / tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") ) (NP,0 (PRON,2:3 "LOS") ) ) ) )>
sl: I see cars / tl: VEO AUTOS / tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") ) (NP,2 (N,1:3 "AUTOS") ) ) ) )>
Types of RR Operations
Grammar:
– bifurcate: R0 → R0 + R1 [= R0' + constr]; Cov[R0] ⊆ Cov[R0, R1]
– bifurcate: R0 → R1 [= R0 + constr = −] + R2 [= R0' + constr =c +]; Cov[R0] = Cov[R1, R2]
– refine: R0 → R1 [= R0 + constr]; Cov[R1] ⊆ Cov[R0]
Lexicon:
– Lex0 → Lex0 + Lex1 [= Lex0 + constr]
– Lex0 → Lex1 [= Lex0 + constr]
– Lex0 → Lex0 + Lex1 [= Lex0 + new TL word]
– → Lex1 (adding a lexical item)
Manual vs. Learned Grammars [AMTA 2004]
Manual inspection and automatic MT evaluation (NIST, BLEU, METEOR) comparing the manual grammar against the learned grammar [table of scores not recoverable from the slide text]
Human Oracle Experiment
As a feasibility experiment, we compared raw MT output with manually corrected MT output: the difference is statistically significant (confidence interval test). This is an upper bound on how much difference we should expect any refinement approach to make.
Invariable process?
The rule refinement process is not invariable: the order of the corrected sentences input to the system matters. Example:
– 1st: gran artista → bifurcate (2 rules); 2nd: casa bonito → add agr constraint to only 1 rule (the original, general one); the specific rule is still incorrect (missing agr constraint)
– 1st: casa bonito → add agr constraint; 2nd: gran artista → bifurcate; both rules have the agreement constraint (optimal order)
User noise?
Solution: have several users evaluate and correct the same test set (threshold: 90% agreement) on corrections and on error information (type, clue word). Only modify the grammar given enough evidence of an incorrect rule.
User Studies Map
[Matrix: RR module × manual vs. learned grammars; Eng2Spa with corrections only and corrections + error info, Mapu2Spa; batch mode vs. interactive mode; active learning marked as proposed work]
Recycle corrections of Machine Translation output back into the system by refining and expanding existing translation rules
Rule Refinement Operations (examples)
[Same decision tree as above: Modify / Add / Delete / Change Word Order, branching on ±Wc, ±alignment, POS preservation, ±rule]
1. Correction info only:
– It is a nice house → Es una casa bonito → Es una casa bonita
– John and Mary fell → Juan y Maria cayeron → Juan y Maria se cayeron
– Gaudi was a great artist → Gaudi era un artista grande → Gaudi era un gran artista
– I will help him fix the car → Ayudaré a él a arreglar el auto → Le ayudaré a arreglar el auto
– I would like to go → Me gustaria que ir → Me gustaria ir
2. Correction and error info:
– I am proud of you → Estoy orgullosa tu → Estoy orgullosa de ti (PP → PREP NP)
Focus 3:
– Wally plays the guitar → Wally juega la guitarra → Wally toca la guitarra
– I saw the woman → Vi la mujer → Vi a la mujer
– I see them → Veo los → Los veo
Outside the scope of the thesis:
– John read the book → A Juan leyó el libro → Juan leyó el libro
– Where are you from? → Donde eres tu de? → De donde eres tu?