Published by Franklin Silas Barton. Modified over 9 years ago.
1
Avenue Architecture
[Architecture diagram. Stages: Elicitation, Rule Learning, Run-Time System, Rule Refinement.]
[Components: Elicitation Tool, Elicitation Corpus, Word-Aligned Parallel Corpus, Learning Module, Learned Transfer Rules, Handcrafted rules, Lexical Resources, Morphology Analyzer, Run Time Transfer System, Decoder, Translation Correction Tool, Rule Refinement Module.]
[Data flow: INPUT TEXT to OUTPUT TEXT.]
2
Interactive and Automatic Refinement of Translation Rules
Problem: Improve machine translation quality.
Proposed Solution: Put bilingual speakers back into the loop; use their corrections to detect the source of the error and automatically improve the lexicon and the grammar.
Approach:
- Automate post-editing efforts by feeding them back into the MT system.
- Automatically refine the translation rules that caused an error, going beyond post-editing.
Goal: Improve MT coverage and overall quality.
3
Technical Challenges
- Elicit minimal MT information from non-expert users
- Automatically refine and expand translation rules minimally
  - Manually written
  - Automatically learned
- Automatic evaluation of the refinement process
4
Error Typology for Automatic Rule Refinement (simplified)
Interactive elicitation of error information
Error types:
- Missing word
- Extra word
- Wrong word order
- Incorrect word
- Wrong agreement
Finer-grained distinctions shown on the slide, in order: Local vs. long distance; Word vs. phrase; + Word change; Sense; Form; Selectional restrictions; Idiom; Missing constraint; Extra constraint
5
TCTool (Demo)
Interactive elicitation of error information
Actions:
- Add a word
- Delete a word
- Modify a word
- Change word order

                       precision   recall
error detection        90%         89%
error classification   72%         71%
6
Types of Refinement Operations (Automatic Rule Adaptation)
1. Refine a translation rule: R0 -> R1 (change R0 to make it more specific or more general)
R0: NP [DET ADJ N] -> NP [DET N ADJ]
    a nice house -> una casa bonito
R1: NP [DET ADJ N] -> NP [DET N ADJ], with the added constraint N gender = ADJ gender
    a nice house -> una casa bonita
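The refinement from R0 to R1 can be illustrated with a minimal Python sketch. It is not the Avenue transfer formalism: the `Rule` class, the `refine` and `licenses` helpers, and the toy `LEXICON` are all hypothetical names introduced here only to show how adding one agreement constraint makes a rule more specific.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon (illustrative only): Spanish word -> features.
LEXICON = {
    "una": {"pos": "DET", "gender": "fem"},
    "casa": {"pos": "N", "gender": "fem"},
    "bonito": {"pos": "ADJ", "gender": "masc"},
    "bonita": {"pos": "ADJ", "gender": "fem"},
}

@dataclass(frozen=True)
class Rule:
    name: str
    target_order: tuple      # target-side constituent order, e.g. ("DET", "N", "ADJ")
    agreements: tuple = ()   # (pos_a, pos_b, feature) triples that must match

def refine(rule, constraint, new_name):
    """Return a more specific copy of `rule` with one extra agreement constraint."""
    return Rule(new_name, rule.target_order, rule.agreements + (constraint,))

def licenses(rule, words):
    """True if `words` follows the rule's target order and satisfies its constraints."""
    if tuple(LEXICON[w]["pos"] for w in words) != rule.target_order:
        return False
    by_pos = {LEXICON[w]["pos"]: LEXICON[w] for w in words}
    return all(by_pos[a][f] == by_pos[b][f] for a, b, f in rule.agreements)

r0 = Rule("R0", ("DET", "N", "ADJ"))
r1 = refine(r0, ("N", "ADJ", "gender"), "R1")  # adds: N gender = ADJ gender
```

Under these assumptions, `licenses(r0, ["una", "casa", "bonito"])` holds because R0 has no agreement check, while R1 accepts only `["una", "casa", "bonita"]`.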
7
Types of Refinement Operations (Automatic Rule Adaptation)
2. Bifurcate a translation rule: R0 -> R0 (same, general rule) + R1 (add a new, more specific rule)
R0: NP [DET ADJ N] -> NP [DET N ADJ]
    a nice house -> una casa bonita
R1: NP [DET ADJ N] -> NP [DET ADJ N], with ADJ type: pre-nominal
    a great artist -> un gran artista
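Bifurcation can be sketched in the same spirit. The dictionary-based rule representation and the `bifurcate` and `apply_np` helpers below are hypothetical, and the sketch assumes the target words have already been chosen; it only shows that the general rule is kept while a more specific copy with a different word order is added.

```python
def bifurcate(grammar, rule_name, new_name, new_order, condition):
    """Keep the general rule and return a grammar with a more specific copy added."""
    original = next(r for r in grammar if r["name"] == rule_name)
    specific = dict(original, name=new_name, target_order=new_order, condition=condition)
    return grammar + [specific]

def apply_np(grammar, det, adj, noun, adj_features):
    """Apply the first applicable rule; rules with a condition are tried first."""
    words = {"DET": det, "ADJ": adj, "N": noun}
    for rule in sorted(grammar, key=lambda r: r["condition"] is None):
        cond = rule["condition"]
        if cond is None or adj_features.get(cond[0]) == cond[1]:
            return " ".join(words[pos] for pos in rule["target_order"])
    raise ValueError("no applicable rule")

# R0: the general rule (DET N ADJ on the target side), no condition.
grammar = [{"name": "R0", "target_order": ("DET", "N", "ADJ"), "condition": None}]

# R1: same rule, flipped ADJ-N order, restricted to pre-nominal adjectives.
refined = bifurcate(grammar, "R0", "R1", ("DET", "ADJ", "N"), ("type", "pre-nominal"))

house = apply_np(refined, "una", "bonita", "casa", {"type": "post-nominal"})
artist = apply_np(refined, "un", "gran", "artista", {"type": "pre-nominal"})
```

Trying conditioned rules first is a design choice of this sketch, so that the specific rule R1 takes precedence whenever its condition is met and R0 remains the fallback.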
8
A concrete example
Error Information Elicitation, Refinement Operation Typology, Automatic Rule Adaptation
Action: Change word order
SL: Gaudí was a great artist
TL (MT system output): Gaudí era un artista grande
User correction: *Gaudí era un artista grande -> Gaudí era un gran artista
Labels on the slide: clue word, error (grande), correction (gran)
9
Finding Triggering Feature(s) (Automatic Rule Adaptation)
(error word, corrected word) = (grande, gran): no existing feature distinguishes the two lexical entries, so a new binary feature needs to be postulated: feat1

Blame assignment (from MT system output)
tree:
<((S,1
    (NP,2 (N,5:1 "GAUDI") )
    (VP,3
      (VB,2 (AUX,17:2 "ERA") )
      (NP,8 (DET,0:3 "UN") (N,4:5 "ARTISTA") (ADJ,5:4 "GRANDE") ) ) ) )>

Grammar: S,1 … NP,1 … NP,8 …

ADJ::ADJ |: [great] -> [grande]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc))

ADJ::ADJ |: [great] -> [gran]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc))
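The search for a triggering feature can be sketched as a comparison of the two lexical entries. The function names `delta` and `find_triggering_features` are hypothetical, and the entries are flattened to plain dictionaries carrying the feature values listed on the slide; this is an illustration of the idea, not the system's implementation.

```python
def delta(error_entry, corrected_entry):
    """Return the names of features whose values differ between the two entries."""
    keys = set(error_entry) | set(corrected_entry)
    return {k for k in keys if error_entry.get(k) != corrected_entry.get(k)}

def find_triggering_features(error_entry, corrected_entry, new_feature="feat1"):
    """Return (features, refined_entries).

    If an existing feature already distinguishes the two words, report it and
    leave the entries alone. Otherwise postulate a new binary feature that is
    '-' on the error word and '+' on the corrected word.
    """
    diff = delta(error_entry, corrected_entry)
    if diff:
        return sorted(diff), None
    refined = (
        dict(error_entry, **{new_feature: "-"}),
        dict(corrected_entry, **{new_feature: "+"}),
    )
    return [new_feature], refined

# Feature values as listed on the slide for [great] -> [grande] and [great] -> [gran].
grande = {"form": "great", "num": "sg", "gen": "masc"}
gran = {"form": "great", "num": "sg", "gen": "masc"}

features, refined_entries = find_triggering_features(grande, gran)
```

Because the two entries are identical, `delta` returns an empty set and the sketch falls back to postulating `feat1`.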
10
Refining Rules (Automatic Rule Adaptation)
Bifurcate NP,8 -> NP,8 (R0) + NP,8’ (R1) (flip order of ADJ-N)

{NP,8’}
NP::NP : [DET ADJ N] -> [DET ADJ N]
(
 (X1::Y1) (X2::Y2) (X3::Y3)
 ((x0 def) = (x1 def))
 (x0 = x3)
 ((y1 agr) = (y3 agr)) ; det-noun agreement
 ((y2 agr) = (y3 agr)) ; adj-noun agreement
 (y2 = x3)
 ((y2 feat1) =c + ))
11
Refining Lexical Entries (Automatic Rule Adaptation)

ADJ::ADJ |: [great] -> [grande]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc)
 ((y0 feat1) = -))

ADJ::ADJ |: [great] -> [gran]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc)
 ((y0 feat1) = +))
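How the refined entries interact with a constraint such as ((y2 feat1) =c +) can be sketched as follows. The sketch assumes that =c behaves as a constraining check, meaning the feature must already be present with the required value; the helper names and the third entry, which lacks feat1, are hypothetical.

```python
def constraining_equal(entry, feature, value):
    """Constraining check: the feature must be present and equal to `value`."""
    return feature in entry and entry[feature] == value

def adjectives_licensed(entries, feature, value):
    """Return the target adjectives that pass the constraining check."""
    return [word for word, entry in entries.items()
            if constraining_equal(entry, feature, value)]

# Refined entries from the slide, plus one hypothetical entry without feat1.
entries = {
    "grande": {"form": "great", "num": "sg", "gen": "masc", "feat1": "-"},
    "gran": {"form": "great", "num": "sg", "gen": "masc", "feat1": "+"},
    "bonito": {"form": "nice", "num": "sg", "gen": "masc"},
}

pre_nominal = adjectives_licensed(entries, "feat1", "+")
```

Under this reading, only `gran` can fill the ADJ slot of the flipped rule; an adjective that was never assigned feat1 does not pass the check either.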
12
Evaluating Improvement (Automatic Rule Adaptation)
- Given the initial and final Translation Lattices, the Rule Refinement module needs to take into account whether the following are present:
  - Corrected Translation Sentence
  - Original Translation Sentence (labelled as incorrect by the user)
Lattice entries shown:
  un artista gran
  un gran artista
  un grande artista
  *un artista grande
13
Evaluating Improvement (Automatic Rule Adaptation)
- Given the initial and final Translation Lattices, the Rule Refinement module needs to take into account whether the following are present:
  - Corrected Translation Sentence
  - Original Translation Sentence (labelled as incorrect by the user)
Lattice entries shown:
  *un artista gran
  un gran artista
  *un grande artista
  *un artista grande
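The presence checks described above can be sketched by treating each translation lattice as a set of candidate strings. The function name, the two summary flags, and the lattice contents below are assumptions made for illustration; the slides do not state which candidates each lattice actually contains.

```python
def evaluate_refinement(initial_lattice, final_lattice, corrected, original):
    """Report whether the corrected and original translations appear in each lattice."""
    report = {
        "corrected_in_initial": corrected in initial_lattice,
        "corrected_in_final": corrected in final_lattice,
        "original_in_initial": original in initial_lattice,
        "original_in_final": original in final_lattice,
    }
    # Summary flags (an assumption of this sketch, not taken from the slides).
    report["gained_correction"] = (
        report["corrected_in_final"] and not report["corrected_in_initial"]
    )
    report["dropped_original"] = (
        report["original_in_initial"] and not report["original_in_final"]
    )
    return report

# Hypothetical lattices before and after refinement.
initial = {"un artista grande", "un artista gran"}
final = {"un gran artista", "un artista gran"}

report = evaluate_refinement(
    initial, final,
    corrected="un gran artista",
    original="un artista grande",
)
```

With these hypothetical lattices, the corrected sentence appears only after refinement and the sentence the user marked as incorrect no longer appears.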
14
Challenges and future work
- Credit and blame assignment from TCTool log files and the Xfer engine’s trace
- Order of corrections matters: explore rule interactions
- Explore the space between batch mode and a fully interactive system
- Online TCTool always running to collect corrections from bilingual speakers; make it into a game with rewards for the best users
15
Publications
- Font Llitjós, A., J.G. Carbonell and A. Lavie. "A Framework for Interactive and Automatic Refinement of Transfer-based Machine Translation". EAMT 10th Annual Conference, 30-31 May 2005, Budapest, Hungary.
- Font Llitjós, A., R. Aranovich and L. Levin. "Building Machine translation systems for indigenous languages". Second Conference on the Indigenous Languages of Latin America (CILLA II), 27-29 October 2005, Texas, USA.
- Font Llitjós, A., K. Probst and J.G. Carbonell. "Error Analysis of Two Types of Grammar for the Purpose of Automatic Rule Refinement". AMTA, 2004, Washington, USA.
- Font Llitjós, A. and J.G. Carbonell. "The Translation Correction Tool: English-Spanish user studies". LREC, 2004, Lisbon, Portugal.
16
Quechua-Spanish MT
V-Unit: funded summer project in Cusco (Peru), June-August 2005 [preparations and data collection started earlier]
- Intensive Quechua course at the Centro Bartolomé de las Casas (CBC)
- Worked with two native Quechua speakers and one non-native speaker on developing infrastructure (correcting elicited translations; segmenting and translating a list of the most frequent words)
17
Quechua-Spanish prototype MT system
- Stem lexicon (semi-automatically generated): 753 lexical entries
- Suffix lexicon: 21 suffixes (150 Cusihuaman)
- Quechua morphology analyzer
- 25 translation rules
- Spanish morphology generation module
- User studies: 10 sentences, 3 users (2 native, 1 non-native)