1
Towards Interactive and Automatic Refinement of Translation Rules
Ariadna Font Llitjós PhD Thesis Proposal Jaime Carbonell (advisor) Alon Lavie (co-advisor) Lori Levin Bonnie Dorr (Univ. Maryland) 5 November 2004 Thank the advisors and the committee and the audience for attending
2
Interactive and Automatic Rule Refinement
Outline Introduction Thesis statement and scope Preliminary Research Interactive elicitation of error information A framework for automatic rule adaptation Proposed Research Contributions and Thesis Timeline Motivation, related work, goals think if there is a better place to put the outline... before post-editing? Interactive and Automatic Rule Refinement
3
Machine Translation (MT)
Source Language (SL) sentence: Gaudi was a great artist Spanish translation: Gaudi era un gran artista MT System outputs: *Gaudi estaba un artista grande *Gaudi era un artista grande If we are given the sentence (SL): Gaudi is a great artist, we would like our MT system to translate it as: Gaudi es un gran artista. But instead our MT system outputs two incorrect Target Language (TL) sentences. Spanish has two forms of the verb to be, one that indicates a temporary state (estaba) and one that indicates a more permanent state (era). In this case, to be a great artist is permanent (unlike to be sick). On the other hand, "gran" is a pre-nominal adjective, which instantiates an exception to the general rule... [next slide] Interactive and Automatic Rule Refinement
4
Interactive and Automatic Rule Refinement
Completed Work Spanish Adjectives Automatic Rule Adaptation General order: grande (big in size): NP [DET ADJ N] → [DET N ADJ], e.g. a big house → una casa grande. Exception: gran (exceptional): NP [DET ADJ N] → [DET ADJ N], e.g. a great artist → un gran artista. This is an exception to the rule, since "gran" is a pre-nominal adjective. Interactive and Automatic Rule Refinement
5
Commercial and Online Systems
Correct Translation: Gaudi era un gran artista Systran, Babelfish (Altavista), WorldLingo, Translated.net: *Gaudi era gran artista ImTranslation: *El Gaudi era un gran artista 1-800-Translate: *Gaudi era un fenomenal artista Explain errors!! Even for a sentence as short and simple as this, the output of commercial MT systems is often incorrect. In this case, and unlike the output of our system, the meaning is pretty much preserved, but more often than not, state-of-the-art MT systems mistranslate sentences, giving them a totally incorrect meaning. [Show this is a big/general problem -> solution has great impact. Other MT systems also produce incorrect translations. MT output requires post-editing] ImTranslation: translation2.paralink.com/ Interactive and Automatic Rule Refinement
6
Interactive and Automatic Rule Refinement
Post-editing Current solutions: Post-editing [Allen, 2003] by human linguists or editors (experts) Automated Post-Editing prototype module (APE) [Allen & Hogan, 2000] to alleviate the tedious task of correcting the most frequent errors over and over No solution to fully automate the post-editing process MT output still requires post-editing!!! Post-editing: the correction of MT output by human linguists or editors. Human post-editors: some tools exist to assist in this tedious task, e.g. APE, a prototype module. Before my research, as far as I know, there was no solution to fully automate the post-editing process, in the sense of automatically learning the post-editing rules that would fix any MT output. Interactive and Automatic Rule Refinement
7
Drawbacks of Current Methods
Manual post-editing Corrections do not generalize: Gaudi era un artista grande; Juan es un amigo grande (Juan is a great friend); Era una oportunidad grande (It was a great opportunity) APE Humans need to predict all the errors ahead of time and write the post-editing rules; given a new error type, APE does nothing. Current systems do not recycle post-editing efforts back into the system, beyond adding them as new training data. Manual post-editing: the most common method; very tedious, since editors need to perform the same actions over and over for the same MT error, so it would be nice to automate it. APE: a prototype module to do precisely that, but errors still need to be detected and predicted by a human first, and then post-editing rules need to be written manually for each type of error. In the face of a new error type, the APE doesn't do anything; the human editor needs to fix it, over and over. Interactive and Automatic Rule Refinement
8
Interactive and Automatic Rule Refinement
My Solution Automate post-editing efforts by feeding them back into the MT system. Possible alternatives: Automatic learning of post-editing rules: (+) system independent; (-) several thousand sentences might need to be corrected for the same error. Automatic refinement of translation rules: (+) attacks the core of the problem for transfer-based MT systems (there need to be rules to fix!). Goal: take an existing MT system and improve it (with user feedback). ALPER: system independent, but the interaction of two incorrect rules can generate thousands of (very) incorrect sentences. ARTR: system dependent, goes to the core of the problem; fixing one rule can prevent thousands of incorrect sentences from being generated. (Both are language-pair dependent.) Interactive and Automatic Rule Refinement
9
Interactive and Automatic Rule Refinement
Related Work [Corston-Oliver & Gammon, 2003] [Imamura et al. 2003] [Menezes & Richardson, 2001] [Brill, 1993] [Gavaldà, 2000] [Callison-Burch, 2004] Fixing Machine Translation Rule Adaptation [Su et al. 1995] My Thesis Post-editing My research vs. post-editing: replacing the expert post-editor with non-expert bilingual speakers plus software; but it does more than merely correcting the current sentence: it corrects the problem at its core so that the same error does not occur twice. Callison-Burch, Fixing MT: experts fixing alignments (training data) for SMT (human intervention to create training data). Su et al.: adjusting the evaluation parameters of an SMT system to respect user style. [My research: transfer-based MT systems (there need to be rules to refine)] The idea of Rule Adaptation to correct language technology applications is not new. Researchers have applied RA to (among other NLP applications): parsing [Brill, 1993] (reduce error by learning a set of simple structural transformations); NLU [Gavaldà, 2000] (a mechanism to enable a non-expert user to dynamically extend the coverage of an NLU module, just by answering simple classification questions); and, more relevant to my thesis, MT, where researchers have applied techniques ranging from decision trees to fix automatically learned linguistic representations and reduce noise [Corston-Oliver & Gammon] (very similar to our goal!) to comparing MT translations with human reference translations to correct and eliminate redundant rules [Imamura; Menezes & Richardson]. In contrast to filtering out incorrect or redundant rules, though, I propose to actually refine and expand the translation rules themselves, by editing valid but inaccurate rules that might be lacking a constraint, for example, which will result in correctly translating sentences that were incorrectly translated before. Make sure this is clear: as far as I know, nobody has taken user corrections and incorporated them into the MT system (my niche!). Furthermore, and most importantly, my research does not rely on the existence of a parallel training corpus or a set of human reference translations, unlike all other existing methods. No pre-existing training data required No human reference translations required Non-expert user feedback [Allen & Hogan, 2000] Interactive and Automatic Rule Refinement
10
Resource-poor Scenarios (AVENUE)
Lack of electronic parallel data Lack of computational linguists Lack of manual grammar Why bother? Indigenous communities have difficult access to crucial information that directly affects their life (such as land laws, plagues, health warnings, etc.) Preservation of their language and culture Resource-poor Languages: Mapudungun Quechua Aymara My research approach can be applied to any language-pair (given a transfer MT system), but it’s specifically designed to be able to tackle languages with little or no electronic resources, such as Mapudungun and Quechua, which makes the problem of refining translation rules even harder. Since my work is embedded in the AVENUE project, I can’t assume that there is any pre-existing training corpus or even reference translations for my testing sets. 1. (often oral tradition) SMT and EBMT out of the question 2. lack of computational linguists with knowledge of the RP language Interactive and Automatic Rule Refinement
11
How is MT possible for resource-poor languages?
Bilingual speakers Mapuche people from Chile I took the picture Interactive and Automatic Rule Refinement
12
AVENUE Project Overview
[Diagram: AVENUE architecture. The Elicitation Tool and Elicitation Corpus produce a Word-Aligned Parallel Corpus; the Learning Module (Rule Learning) learns Transfer Rules from it; a Morphological analyzer provides Morphology; the Run-Time Transfer System combines Transfer Rules, Handcrafted rules, and Lexical Resources to produce a translation Lattice.] For those of you who are not familiar with the AVENUE project, there are 4 main modules. The first one elicits translations and word alignments from bilingual speakers with a user-friendly tool. The Rule Learner (3rd module) learns the translation rules from the word-aligned parallel corpus resulting from the Elicitation phase, and the last module is the actual transfer engine, which, given the translation rules and lexical entries and a sentence to be translated, produces a list of translation candidates, or lattice. But, until now, there was no way to validate the generalizations learned by the Automatic Rule Learner, nor to verify the quality of the translations produced by the system. Interactive and Automatic Rule Refinement
13
My Thesis Interactive and Automatic Rule Refinement Elicitation
[Diagram: the same AVENUE architecture (Elicitation, Morphology, Rule Learning, Run-Time System) extended with my thesis components: a Translation Correction Tool and a Rule Refinement Module that feed user corrections back into the Transfer Rules and Lexical Resources used by the Run-Time Transfer System.] So, my thesis work consists of extracting bilingual speaker feedback with an online GUI (the TCTool) and hypothesizing refinement operations to be performed both on the grammar (transfer rules) and the lexicon, so as to improve translation quality and coverage. Note that this approach is specifically well-suited for languages with scarce resources, but it can be applied to any language pair, as long as there are bilingual speakers available and there is an initial set of transfer rules for that pair. Interactive and Automatic Rule Refinement
14
Resource-poor languages
Related Work Post-editing Rule Adaptation Fixing Machine Translation My Thesis Resource-poor languages Interactive and Automatic Rule Refinement
15
Interactive and Automatic Rule Refinement
Thesis Statement Given a rule-based Transfer MT system, we can: - Extract useful information from non-expert bilingual speakers to correct MT output. Automatically refine and expand translation rules, given corrected and aligned translation pairs and some error information. So that the set of refined rules has better coverage and higher overall MT quality. Given a rule-based Transfer MT system, we can extract useful information from non-expert bilingual speakers about the corrections required to make MT output acceptable. Furthermore, We can automatically refine and expand translation rules, given corrected and aligned translation pairs and some error information, so that the resulting set of rules has better coverage and higher overall MT quality. Interactive and Automatic Rule Refinement
16
Interactive and Automatic Rule Refinement
Assumptions No pre-existing parallel training data No pre-existing human reference translations The SL sentence needs to be fully parsed by the translation grammar. Bilingual speakers can give enough information about the MT errors. Explain what these mean!! 1 & 2: No pre-existing data of this form is assumed imposed by the AVENUE project makes the problem more challenging 3. To be able to reconstruct the entire TL tree so that I can search thru that tree to find the source of the error. (some rule needs to have applied, so that we can attempt to refine it automatically.) 4. Which I’ll show you is true in the preliminary work section. Interactive and Automatic Rule Refinement
17
Interactive and Automatic Rule Refinement
Scope Evaluate automatic refinement for the following conditions: 1. User correction information only. 2. Correction and error information. 3. Extra information is required → user interaction. Both in manually written and automatically learned grammars [AMTA 2004]. I will evaluate how well I can do Automatic RR for each of these conditions: 1. only taking into account the correction of the MT output given by the user; 2. taking into account the correction plus any other error information elicited from the user, such as error type, triggering word, etc.; 3. a reasonable amount of user interaction is required. Falling outside the scope: types of errors for which users cannot give the required information, or for which further user interaction might take too long and might not provide the required information. In the AMTA paper we compare manual and automatically learned grammars for the purpose of automatic RR. We found that there is a difference (albeit small) between the translation quality produced by the manual grammar and the automatically learned grammar, but the important distinction is that I need to apply different types of RR for different types of grammar. Interactive and Automatic Rule Refinement
18
Automatic Evaluation of Refinement process
Technical Challenges Automatic Evaluation of the Refinement process Automatically Refine and Expand Translation Rules minimally Manually written Automatically Learned Base: crucial component, without which it makes no sense to try to automatically learn refinement operations. Minimal in this context refers to minimal post-editing, which usually means making the least amount of changes possible to produce an understandable working document, rather than producing a high-quality document. However, unlike Allen's (2003) definition of MPE, I do not mean MPE in the sense of producing MT output of lower quality, but rather of having bilingual users perform the least amount of changes to the translation so that the rule that originally translated the SL sentence can be refined. Because my goal is to elicit rather technical information from non-expert users, I need to find a way to classify errors that is not too hard for naive users to do accurately, and at the same time is useful for automatic RR. 2. (minimal description length) MDL = parsimony, so that grammar size doesn't increase unnecessarily (blow up) with lots of local refinements. Part of the challenge of this second aspect of my thesis is to refine both manual and automatically learned translation rules. In order to be able to improve translation quality and coverage automatically, we need to devise a method to evaluate the refined system output, also automatically, so that evaluation metric results can drive the refinement process. Elicit minimal MT information from non-expert users Interactive and Automatic Rule Refinement
19
Preliminary Work Interactive elicitation of error information
A framework for automatic rule adaptation
20
Error Typology for Automatic Rule Refinement (simplified)
Completed Work Interactive elicitation of error information Error Typology for Automatic Rule Refinement (simplified) Local vs Long distance Word vs. phrase + Word change Sense Form Selectional restrictions Idiom Missing constraint Extra constraint Missing word Extra word Wrong word order Incorrect word Wrong agreement Walk them thru it, explain!!! After looking at several hundred sentences (eng2spa) I organized the different types of MT errors into a typology, keeping in mind the ultimate goal of automatic rule refinement. Need to Find appropriate level of granularity for MT error classification Interactive and Automatic Rule Refinement
21
Interactive and Automatic Rule Refinement
TCTool (Demo) Interactive elicitation of error information Actions: Add a word Delete a word Modify a word Change word order To address the types of MT errors identified (described on the previous slide), we built an online GUI which allows non-expert bilingual users to reliably detect and minimally correct MT errors through one of the following correcting actions: missing word → add a word; extra word → delete a word; incorrect word or wrong agreement → modify a word; wrong word order → change word order. For the edit-word window, just say: it asks the user to choose among several non-technical statements (as opposed to asking them to classify between linguistically motivated classes). Given: SL sentence (e.g. I see them), TL sentence (e.g. Yo veo los), word-to-word alignments (I-yo, see-veo, them-los) (context). Interactive and Automatic Rule Refinement
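Purely as an illustration, the sketch below shows one hypothetical way such a correction event could be recorded as data. The field names and log structure are assumptions for this sketch, not the actual TCTool log format.

```python
# Hypothetical record of a TCTool correction session (assumed structure, not the real format).
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CorrectionAction:
    action: str                      # "add", "delete", "modify", or "reorder"
    wi: Optional[str] = None         # error word (deleted, modified, or moved)
    wi_prime: Optional[str] = None   # the user's replacement or the added word
    wc: Optional[str] = None         # clue word, if the user identified one
    position: Optional[int] = None   # target position for add/reorder

@dataclass
class CorrectionRecord:
    sl: str                                          # source-language sentence
    tl: str                                          # MT output shown to the user
    alignments: list = field(default_factory=list)   # (sl_index, tl_index) pairs
    actions: list = field(default_factory=list)      # list of CorrectionAction

# Example mirroring the slide: "I see them" -> "Yo veo los"; the user moves "los" before "veo".
record = CorrectionRecord(
    sl="I see them",
    tl="Yo veo los",
    alignments=[(0, 0), (1, 1), (2, 2)],
    actions=[CorrectionAction(action="reorder", wi="los", position=1)],
)
print(record.actions[0].action, record.actions[0].wi)
```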
22
Interactive and Automatic Rule Refinement
1st Eng2Spa User Study Completed Work Interactive elicitation of error information [LREC 2004] MT error classification: 9 linguistically-motivated classes [Flanagan, 1994], [White et al. 1994]: word order, sense, agreement error (number, person, gender, tense), form, incorrect word and no translation. Results: error detection, precision 90%, recall 89%; error classification, precision 72%, recall 71%. In the LREC paper, we show that users can reliably detect and classify MT errors, given an initial error classification with 9 linguistically-motivated classes. It turns out this is harder than it needs to be, since some of these distinctions are not important for automatic RR, whereas some other information which could be elicited is not currently being elicited with this classification. Manual grammar (12 rules) lexical entries Test set: 32 sentences from the AVENUE Elicitation Corpus (4 correct / 28 incorrect) Interested in high precision, even at the expense of lower recall Users did not always fix a translation in the same way When the final translation did not match the gold standard, it was still correct or even better (stylistically) ------ For more details, see the LREC paper or the thesis proposal document Interactive and Automatic Rule Refinement
23
Interactive and Automatic Rule Refinement
Translation Rules {NP,8} NP::NP : [DET ADJ N] -> [DET N ADJ] ( (X1::Y1) (X2::Y3) (X3::Y2) ;; English parsing: ((x0 def) = (x1 def)) NP definiteness = DET definiteness (x0 = x3) NP = N (N is the head of the NP) ;; Spanish generation: ((y1 agr) = (y2 agr)) DET agreement = N agreement ((y3 agr) = (y2 agr)) ADJ agreement = N agreement (y2 = x3) ) Pass the features of English N to Spanish N ADJ::ADJ |: [nice] -> [bonito] ((X1::Y1) ((x0 pos) = adj) ((x0 form) = nice) ((y0 agr num) = sg) Spanish ADJ is singular in number ((y0 agr gen) = masc)) Spanish ADJ is masculine in gender Rule formalism Showing the raw material my module works on (grammar + lexicon) X-side = English (SL) Y-side = Spanish (TL) Interactive and Automatic Rule Refinement
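The {NP,8} rule above is written in the transfer-rule formalism used by the transfer engine. Purely for the refinement sketches that follow, here is one possible in-memory representation of such a rule and a lexical entry; this is an assumed Python encoding for illustration, not the engine's actual data structures.

```python
# Assumed encoding of a transfer rule and a lexical entry (illustration only).
NP8 = {
    "name": "NP,8",
    "lhs": "NP",
    "sl_seq": ["DET", "ADJ", "N"],           # English (x-side) pattern
    "tl_seq": ["DET", "N", "ADJ"],           # Spanish (y-side) pattern
    "alignments": [(1, 1), (2, 3), (3, 2)],  # (x position, y position)
    "constraints": [
        ("(x0 def)", "=", "(x1 def)"),
        ("x0", "=", "x3"),
        ("(y1 agr)", "=", "(y2 agr)"),       # det-noun agreement
        ("(y3 agr)", "=", "(y2 agr)"),       # adj-noun agreement
        ("y2", "=", "x3"),
    ],
}

LEX_NICE = {
    "sl": "nice", "tl": "bonito", "pos": "ADJ",
    "tl_features": {"agr num": "sg", "agr gen": "masc"},
}
print(NP8["name"], NP8["sl_seq"], "->", NP8["tl_seq"])
```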
24
Automatic Rule Refinement Framework
Completed Work Automatic Rule Adaptation Find best RR operations given a: Grammar (G), Lexicon (L), (Set of) Source Language sentence(s) (SL), (Set of) Target Language sentence(s) (TL), Its Parse tree (P), and Minimal correction of TL (TL’) such that TQ2 > TQ1 Which can also be expressed as: max TQ(TL|TL’,P,SL,RR(G,L)) such that the translation quality of the refined grammar is higher than that of the original grammar. And/Or coverage increased And/Or ambiguity reduced Interactive and Automatic Rule Refinement
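Written out with only the symbols already introduced on this slide (G, L, SL, TL, TL', P, RR), the objective above could be typeset as follows; this is just a restatement of the slide's formula, where TQ1 is the translation quality of the original grammar and lexicon and TQ2 that of the refined pair.

```latex
% One possible typesetting of the slide's objective (a restatement, not an additional claim).
\[
\mathrm{RR}^{*} \;=\; \arg\max_{\mathrm{RR}} \; TQ\bigl(TL \mid TL',\, P,\, SL,\, \mathrm{RR}(G, L)\bigr)
\qquad \text{subject to} \qquad TQ_{2} > TQ_{1}
\]
```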
25
Types of Refinement Operations
Completed Work Types of Refinement Operations Automatic Rule Adaptation 1. Refine a translation rule: R0 → R1 (change R0 to make it more specific or more general) R0: NP [DET ADJ N] → [DET N ADJ]; a nice house → *una casa bonito. There are 2 main refinement operations that can be applied both to grammar rules and lexical entries, with some minor differences. Refine: change an existing rule, replacing the old (incorrect) rule with the new corrected rule. Sometimes a rule is mostly correct but is missing an agreement constraint: see the example on this slide. Automatically learned grammars sometimes over-generalize, and in such cases the rules have an extra agreement constraint, which makes the rule incorrect. Results in higher precision and a tighter grammar, reducing the size of the candidate list. R1: NP [DET ADJ N] → [DET N ADJ] + (N gender = ADJ gender); a nice house → una casa bonita Interactive and Automatic Rule Refinement
26
Types of Refinement Operations (2)
Completed Work Types of Refinement Operations (2) Automatic Rule Adaptation 2. Bifurcate a translation rule: R0 → R0 (same, general rule) + R1 (add a new, more specific rule) R0: NP [DET ADJ N] → [DET N ADJ]; a nice house → una casa bonita. The second type of operation I'd like to mention is bifurcate, which adds a copy of the general rule and makes the copy more specific without removing the original (more general) rule. Rop1: leave the original R0 rule as is, modify the duplicate, R1. This example shows the refinement operation required to also cover the pre-nominal order of ADJ in Spanish (an exception to the general rule). You can find a more comprehensive discussion of possible types of refinement operations in the proposal document; unfortunately I cannot cover them all in my talk. When users add or delete a word, the safest Rop type is bifurcate, until we have evidence that the original rule can never apply (→ interactive mode). High precision, but increases the size of the candidate list. Coverage in terms of TL patterns: every time R0 is modified → R0', the coverage on the TL side increases. R1: NP [DET ADJ N] → [DET ADJ N] + (ADJ type: pre-nominal); a great artist → un gran artista Interactive and Automatic Rule Refinement
27
Formalizing Error Information
Completed Work Formalizing Error Information Automatic Rule Adaptation Wi = error Wi' = correction Wc = clue word Example: NP [DET ADJ N] → [DET N ADJ]; a nice house → *una casa bonito, with Wi = bonito, Wc = casa, Wi' = bonita; the refined rule adds N gender = ADJ gender, giving a nice house → una casa bonita. Wi: namely the word that needs to be modified, deleted or dragged into a different position by the user in order for the sentence to be correct. Wi': namely the user's modification of Wi, or the word that needs to be added by the user in order for the sentence to be correct. Wc: represents the word that gives away the clue with respect to what triggered the correction, namely the cause of the error. Example: in the case of lack of agreement between a noun and the adjective that modifies it, as in *el auto roja (the red car), Wc is instantiated with auto, namely the word that gives away the clue about what the gender agreement feature value of Wi (roja) should be. Wc can also be a phrase or constituent, like a plural subject (e.g. *[Juan y Maria] cai). Wc is not always present, and it can be before or after Wi (i>c or i<c). They can be contiguous or separated by one or more words. Interactive and Automatic Rule Refinement
28
Triggering Feature Detection
Completed Work Triggering Feature Detection Automatic Rule Adaptation Comparison at the feature level to detect the triggering feature(s). Delta function: Δ(Wi, Wi') Examples: Δ(bonito, bonita) = {gender}; Δ(comiamos, comia) = {person, number}; Δ(mujer, guitarra) = {}. If the set is empty, we need to postulate a new binary feature. Once we have the user's correction (Wi'), we can compare it with Wi at the feature level and find which is the triggering feature. Triggering feature: namely, which feature attribute has a different value in Wi and Wi'. Empty delta set = the existing feature language is not expressive enough to distinguish between Wi and Wi'. Interactive and Automatic Rule Refinement
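A minimal sketch of the delta function, assuming each word form comes with a flat attribute-value feature set (as the lexicon or a morphological analyzer might supply); the feature dictionaries below are illustrative, not taken from the actual system.

```python
def delta(features_wi, features_wi_prime):
    """Return the set of feature attributes whose values differ between Wi and Wi'."""
    attrs = set(features_wi) | set(features_wi_prime)
    return {a for a in attrs if features_wi.get(a) != features_wi_prime.get(a)}

# Illustrative feature sets (assumed, not from the real lexicon):
bonito   = {"pos": "adj", "gen": "masc", "num": "sg"}
bonita   = {"pos": "adj", "gen": "fem",  "num": "sg"}
comiamos = {"pos": "v", "person": "1", "num": "pl", "tense": "impf"}
comia    = {"pos": "v", "person": "3", "num": "sg", "tense": "impf"}
mujer    = {"pos": "n", "gen": "fem", "num": "sg"}
guitarra = {"pos": "n", "gen": "fem", "num": "sg"}

print(delta(bonito, bonita))      # {'gen'}
print(delta(comiamos, comia))     # {'person', 'num'} (in some order)
print(delta(mujer, guitarra))     # set()  -> postulate a new binary feature (e.g. feat1)
```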
29
Deciding on the Refinement Op
Completed Work Deciding on the Refinement Op Automatic Rule Adaptation Given: - Action performed by the user (add, delete, modify, change word order) - Error information available (clue word, word alignments, etc.) Refinement Action Deciding on the appropriate rule refinement operation: Given (1 and 2), the method proposed can determine what Refinement action to take Interactive and Automatic Rule Refinement
30
Interactive and Automatic Rule Refinement
Rule Refinement Operations [Decision tree of rule refinement operations: top-level branches for the user action (Modify, Add, Delete, Change Word Order), with sub-branches on the information available: presence or absence of a clue word (+Wc / -Wc), of alignments (+al / -al), of an existing rule (+rule / -rule), whether Wi equals Wi', and whether POSi equals POSi'.] This tree represents the space of all possible rule refinement operations, given a user correcting action and the amount of information available at refinement time. And here is an example of how the proposed approach would work, through a simulation. Interactive and Automatic Rule Refinement
31
Proposed Work Rule Refinement Example Batch mode implementation
Interactive mode implementation User Studies Evaluation
32
Rule Refinement Example
Automatic Rule Adaptation Change word order SL: Gaudí was a great artist MT system output (TL): *Gaudí era un artista grande Goal (given by the user correction): *Gaudí era un artista grande → Gaudí era un gran artista Given the incorrect MT output and the translation grammar that produced it, an expert would know what to do to fix the error (editor: fix the sentence; computational linguist: fix the translation rule), but it is a very hard problem for a piece of software to do that. Here is a successful example of how my approach can solve this. Interactive and Automatic Rule Refinement
33
Interactive and Automatic Rule Refinement
Automatic Rule Adaptation 1. Error Information Elicitation Given the TL produced by the MT system for a specific SL sentence, Feed the translation pair thru the TCTool to elicit the correction and error information. Refinement Operation Typology Interactive and Automatic Rule Refinement
34
2. Variable Instantiation from Log File
Automatic Rule Adaptation Correcting Actions: 1. Word order change (artista grande → grande artista): Wi = grande 2. Edited grande into gran: Wi' = gran; identified artist as the clue word: Wc = artista In this case, even if the user had not identified Wc, the refinement process would have been the same. Input: the correction log file, together with the transfer engine output (parse tree), goes to the Refinement module for variable instantiation. Assumption: The reason I know "great" should be "gran" and not "grande" is that it modifies "artist". (In this case artista might not really be the triggering word (gran is always a pre-nominal adj), but I use it for the sake of the argument; I don't have time to introduce a new example and explain it well.) Interactive and Automatic Rule Refinement
35
3. Retrieve Relevant Lexical Entries
Automatic Rule Adaptation No lexical entry for [great → gran] Duplicate the lexical entry [great → grande] and change the TL side: ADJ::ADJ |: [great] -> [gran] ((X1::Y1) (…) ((y0 agr num) = sg) ((y0 agr gen) = masc)) (Morphological analyzer: grande = gran) ADJ::ADJ |: [great] -> [grande] ((X1::Y1) (…) ((y0 agr num) = sg) ((y0 agr gen) = masc)) The first step is to retrieve the relevant lexical entries, but since there is no lexical entry for great → gran, we need to add a lexical entry for "gran". Lexical bifurcation (adding a new sense to the lexicon): always? A simple starting position; we might want to copy only certain features to the new entry (maybe POS-dependent), but it's a research question which features. Lex0 → Lex0 + Lex1[Lex0 + TLword] Even if we had a morphological analyzer, it would find no difference between grande and gran: my research can do something we couldn't do even with a morphology module. Analyzer output: grande → grande AQ0CS0, grande NCCS000; gran → gran AQ0CS0. Parole tags: A=adjective Q=qualifying 0=no degree C=common gender S=number sg 0=no case Interactive and Automatic Rule Refinement
36
4. Finding Triggering Feature(s)
Automatic Rule Adaptation Feature function: Δ(Wi, Wi') = {} → need to postulate a new binary feature: feat1 5. Blame assignment (from MT system output) tree: <((S,1 (NP,2 (N,5:1 "GAUDI") ) (VP,3 (VB,2 (AUX,17:2 "ERA") ) (NP,8 (DET,0:3 "UN") (N,4:5 "ARTISTA") (ADJ,5:4 "GRANDE") ) ) ) )> Grammar Empty delta set = the feature language is not expressive enough to distinguish between "gran" and "grande"; we need to postulate a new feature. Blame assignment = what rules need to be refined (from the parse output by the transfer engine). This is the MT output, tracing what rules applied and in what order. Even if the user had not identified Wc, the same RR process could be done fully automatically, since "grande/gran" is moved around locally within the NP. S,1 … NP,1 NP,8 Interactive and Automatic Rule Refinement
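Blame assignment searches the parse tree output by the transfer engine for the rules that produced the corrected word. As a rough sketch only, assuming the bracketed tree-string format shown above, one could extract the constituent path (and hence the candidate rules to refine) covering Wi like this; the parsing is naive and purely illustrative, not the actual blame-assignment algorithm.

```python
import re

tree = ('<((S,1 (NP,2 (N,5:1 "GAUDI") ) (VP,3 (VB,2 (AUX,17:2 "ERA") ) '
        '(NP,8 (DET,0:3 "UN") (N,4:5 "ARTISTA") (ADJ,5:4 "GRANDE") ) ) ) )>')

def rules_covering(tree_str, word):
    """Return the stack of rule labels open at the point where `word` appears."""
    path, result = [], []
    # Tokens are an opening "(LABEL", a closing ")", or a quoted surface word.
    for label, close, leaf in re.findall(r'\(([A-Z]+,\d+(?::\d+)?)|(\))|"([^"]+)"', tree_str):
        if label:
            path.append(label.split(":")[0])   # keep the rule id, drop the word position
        elif close:
            if path:
                path.pop()
        elif leaf and leaf.lower() == word.lower():
            result = list(path)
    return result

print(rules_covering(tree, "grande"))   # e.g. ['S,1', 'VP,3', 'NP,8', 'ADJ,5']
```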
37
6. Variable Instantiation in the Rules
Automatic Rule Adaptation Wi = grande POSi = ADJ = Y3, y3 Wc = artista POSc = N = Y2, y2 {NP,8} ;; Y1 Y2 Y3 NP::NP : [DET ADJ N] -> [DET N ADJ] ( (X1::Y1) (X2::Y3) (X3::Y2) ((x0 def) = (x1 def)) (x0 = x3) ((y1 agr) = (y2 agr)) ; det-noun agreement ((y3 agr) = (y2 agr)) ; adj-noun agreement (y2 = x3) ) So that when the order gets flipped we know what variables need to be changed in the rule Interactive and Automatic Rule Refinement
38
Interactive and Automatic Rule Refinement
7. Refining Rules Automatic Rule Adaptation Bifurcate NP,8: NP,8 → NP,8 (R0) + NP,8' (R1) (flip the order of ADJ-N) {NP,8'} NP::NP : [DET ADJ N] -> [DET ADJ N] ( (X1::Y1) (X2::Y2) (X3::Y3) ((x0 def) = (x1 def)) (x0 = x3) ((y1 agr) = (y3 agr)) ; det-noun agreement ((y2 agr) = (y3 agr)) ; adj-noun agreement (y2 = x3) ((y2 feat1) =c + )) =c requires the feature to be specified in the ADJ lexical entry and to be +. If it's underspecified or -, it won't unify. Interactive and Automatic Rule Refinement
39
8. Refining Lexical Entries
Automatic Rule Adaptation ADJ::ADJ |: [great] -> [grande] ((X1::Y1) ((x0 form) = great) ((y0 agr num) = sg) ((y0 agr gen) = masc) ((y0 feat1) = -)) ADJ::ADJ |: [great] -> [gran] ((y0 feat1) = +)) 5. Modify grammar and lexicon by applying RR ops. Interactive and Automatic Rule Refinement
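Continuing the assumed Python encoding introduced earlier for the {NP,8} rule, a bifurcation for this example could be sketched as below: copy the rule, flip the ADJ/N order on the target side, add the value-required (=c) trigger, and tag the two lexical senses with the newly postulated feature. The constraint syntax and helper names are illustrative assumptions, not the engine's API.

```python
import copy

NP8 = {
    "name": "NP,8", "lhs": "NP",
    "sl_seq": ["DET", "ADJ", "N"], "tl_seq": ["DET", "N", "ADJ"],
    "constraints": [("(y1 agr)", "=", "(y2 agr)"), ("(y3 agr)", "=", "(y2 agr)")],
}

def bifurcate_prenominal(rule):
    """Return a copy of `rule` with pre-nominal ADJ order and a =c trigger on feat1."""
    new_rule = copy.deepcopy(rule)
    new_rule["name"] = rule["name"] + "'"
    new_rule["tl_seq"] = ["DET", "ADJ", "N"]              # flip N/ADJ on the Spanish side
    new_rule["constraints"] = [("(y1 agr)", "=", "(y3 agr)"),
                               ("(y2 agr)", "=", "(y3 agr)"),
                               ("(y2 feat1)", "=c", "+")]  # only fires when feat1 is + in the lexicon
    return new_rule

NP8_prime = bifurcate_prenominal(NP8)

# Tag the two lexical senses with the postulated binary feature.
lexicon = [
    {"sl": "great", "tl": "grande", "pos": "ADJ", "tl_features": {"feat1": "-"}},
    {"sl": "great", "tl": "gran",   "pos": "ADJ", "tl_features": {"feat1": "+"}},
]

print(NP8_prime["name"], NP8_prime["tl_seq"])   # NP,8' ['DET', 'ADJ', 'N']
```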
40
Interactive and Automatic Rule Refinement
Done? Not yet Automatic Rule Adaptation NP,8 (R0) ADJ(grande) [feat1 = -] NP,8' (R1) ADJ(gran) [feat1 =c +] [feat1 = +] Need to restrict the application of the general rule (R0) to just post-nominal ADJ: un artista grande, un artista gran, un gran artista, *un grande artista Now the refined grammar produces the correct translation, but it still produces an incorrect translation that we know is incorrect (by the minimal post-editing maxim), since the refinement operation has increased ambiguity in the grammar: the translation candidate list size has more than doubled, since both "grande" and "gran" can be unified with {NP,8} and "gran" now also unifies with {NP,8'}. Given another adjective underspecified with respect to feat1 (all new adjectives), it will unify with the general rule (NP,8), which is the desired default behavior. Interactive and Automatic Rule Refinement
41
Add Blocking Constraint
Automatic Rule Adaptation NP,8 (R0) ADJ(grande) [feat1 = -] [feat1 = -] NP,8’ (R1) ADJ(gran) [feat1 =c +] [feat1 = +] Can we also eliminate incorrect translations automatically? un artista grande *un artista gran un gran artista *un grande artista Now the refined grammar produces the correct translation, but it still produces incorrect translations that we know are incorrect. Since the Refinement operation has increased ambiguity in the grammar: translation candidate list size has increased by more than double, since both “grande” and “gran” can be unified with {NP,8} and “gran” now unifies with {NP,8’} Eliminate incorrect translations automatically? tighter grammar (minimal description length) Since great translates both as grande and gran, both rules will be applied, one with each type of adjective. Interactive and Automatic Rule Refinement
42
Making the grammar tighter
Automatic Rule Adaptation If Wc = artista: add [feat1 = +] to N(artista) and add an agreement constraint to NP,8 (R0) between N and ADJ: ((N feat1) = (ADJ feat1)) *un artista grande *un artista gran un gran artista *un grande artista If the user identified Wc, the RR module can also tag artista with ((feat1) = +) and add an agreement constraint between ADJ and N, ((y2 feat1) = (y3 feat1)), to {NP,8}. For each new adjective that appears in this context, the output produced by one of the two NP rules will be picked (as best) by users, and the RR module will be able to tag them as being pre-nominal (feat1 = +) or post-nominal (feat1 = -) in the lexicon. Note that without Wc, RR generates the correct translation but cannot eliminate all incorrect translations. I plan to make the grammar as tight as possible with the information available. If no triggering word is identified for a particular error, it might be impossible to do so automatically. Interactive and Automatic Rule Refinement
43
Batch Mode Implementation
Proposed Work Batch Mode Implementation Automatic Rule Adaptation Given a set of user corrections, apply the refinement module. For refinement operations of errors that can be refined fully automatically using: 1. Correction information only 2. Correction and error information (such as the type of error and whether there is a clue word) Main focus of my thesis. Interactive and Automatic Rule Refinement
44
Interactive and Automatic Rule Refinement
Rule Refinement Operations [Same decision tree of rule refinement operations as before, branching on the user action and the information available.] This is the refinement typology introduced at the end of the preliminary work. (This tree represents the space of all possible rule refinement operations, given a user correcting action and the amount of information available at refinement time.) [click] Interactive and Automatic Rule Refinement
45
Interactive and Automatic Rule Refinement
1. Correction info only Rule Refinement Operations [Decision tree as before, with the branches solvable from correction information alone highlighted.] Fully automatically, just by using correction information. There is no information about the clue word or error type, just the info from the user correcting with the TCTool. Default settings (look for words with the value fem for gender in the sentence and hypothesize constraints; validate with the test set). Add, -Wc, +al(ignment): John and Mary fell → *Juan y Maria cayeron → Juan y Maria se cayeron, with an alignment from "se" to "fell". Change word order, Wi≠Wc, Wi≠Wi', POSi≠POSi': I will help him fix the car → *ayudare a él a arreglar el auto → le ayudare… It is a nice house → *Es una casa bonito → Es una casa bonita Gaudi was a great artist → *Gaudi era un artista grande → Gaudi era un gran artista Interactive and Automatic Rule Refinement
46
Interactive and Automatic Rule Refinement
2. Correction and Error info Rule Refinement Operations [Decision tree as before, with the branches requiring correction plus error information highlighted.] Fully automatically, using correction and error information, such as clue words and error type. We need to know that orgullosa is the word triggering the correction (namely the addition of "de"), Wc = orgullosa, in order to be able to refine it automatically. The rule PP → PREP NP already exists (the level of generality appropriate for the constraint is set by default and can be confirmed through user interaction). I am proud of you → *Estoy orgullosa de tu → Estoy orgullosa de ti Interactive and Automatic Rule Refinement
47
Interactive Mode Implementation
Proposed Work Interactive Mode Implementation Automatic Rule Adaptation Extra error information is required to determine the triggering context automatically. Need to give other relevant sentences to the user at run-time (minimal pairs). For refinement operations of errors that can be refined fully automatically but: 3. require further user interaction In some cases, however, extra info is required, so I'll also need to implement an interactive mode of the rule refinement module, so that when it doesn't have enough information to determine the triggering context, it can present users with more sentences to evaluate. This mode of operation will be implemented for refinement operations of errors that require a reasonable amount of further user interaction and can be solved by the available correction and error information. It typically requires more effort from users and takes longer → smaller test set. I'd like to investigate Active Learning methods to maximize user time (by presenting them with the most informative minimal pairs first). AL = a method to minimize the number of examples a human annotator must label [Cohn et al. 1994], usually by processing examples in order of usefulness. [Lewis and Catlett 94] used uncertainty as a measure of usefulness. [Callison-Burch 2003] proposed AL to reduce the cost of creating a corpus of labeled training examples for Statistical MT. Interactive and Automatic Rule Refinement
48
Interactive and Automatic Rule Refinement
3. Further user interaction Rule Refinement Operations [Decision tree as before, with the branches requiring further user interaction highlighted.] This shows the refinements for error types that require a reasonable amount of further user interaction and can be solved by the available correction and error information. (-rule: suppose the rule required to generate the new POS sequence is not in the grammar, e.g. if the grammar didn't have a PP.) Object pronouns in Spanish need to precede the verb. Let's take a closer look at this last example and how the proposed approach would go about refining it. I see them → *Veo los → Los veo Interactive and Automatic Rule Refinement
49
Example Requiring Minimal Pair
Proposed Work Example Requiring Minimal Pair Automatic Rule Adaptation 1. Run the SL sentence through the transfer engine: I see them → *veo los; correct TL: los veo 2. Wi = los, but no Wi' nor Wc. Need a minimal pair to determine the appropriate refinement: I see cars → veo autos 3. Triggering feature(s): (veo los, veo autos) → Δ(los, autos) = {pos}; PRON(los)[pos=pron] vs. N(autos)[pos=n] In Spanish, if the direct object is a pronoun (clitic), it needs to be in pre-verbal position. (explain it!!!) Wc: not a specific word, but rather a combination of features in Wi (pron + acc). Interactive and Automatic Rule Refinement
50
Refining and Adding Constraints
Proposed Work Refining and Adding Constraints VP,3: [VP NP] → [VP NP] (veo los, veo autos) VP,3': [VP NP] → [NP VP] + [NP pos =c pron] (los veo, *autos veo) Percolate the triggering features up to the constituent level: NP: [PRON] → [PRON] + [NP pos = PRON pos] Block application of the general rule (VP,3): VP,3: [VP NP] → [VP NP] + [NP pos = (*NOT* pron)]: *veo los, veo autos (los veo, *autos veo) To the appropriate rules… but for that to have any effect, we need to make sure the pos feature is passed from the PRON to the constituent level, if applicable [we know whether it's applicable, and which rule, from the MT output tree]. This has the desired effect of not overgenerating for cases where the NP object is not a pronoun. If operating in batch mode, the refined rule would have the order of the VP and NP flipped and would generate the correct translation, but it would not be able to eliminate incorrect translations (such as autos veo, etc.). But we'd still need to block the application of the general rule… Interactive and Automatic Rule Refinement
51
Interactive and Automatic Rule Refinement
Generalization Power When the triggering feature already exists in the feature language (pos, gender, number, etc.): I see them → *veo los → los veo; I love him → lo amo (before: *amo lo); They called me yesterday → me llamaron ayer (before: *llamaron me ayer); Mary helps her with her homework → Maria le ayuda con sus tareas (before: *Maria ayuda le con sus tareas). Is the correction specific to a particular lexical entry, or can I generalize it across a group of lexical entries that have the same properties (and thus behave the same way)? Unlike when the feature language is not expressive enough and a new feature has to be postulated (feat1), where the refinement doesn't generalize to other lexical entries that exhibit the same behaviour unless they are tagged by the system with new (local) user corrections, these cases represent a step towards generalization (semantic annotation), which can then be used a posteriori to make generalizations automatically. (The feat1 case is the Gaudi example I've shown you.) We can use WordNet to find relations between words that required the same refinement. With the previous refinement, all these sentences will now be correctly translated by the system: now X instead of Y. Different rule (NP VP NP): John gave her the book → Juan le dio el libro Interactive and Automatic Rule Refinement
52
Interactive and Automatic Rule Refinement
Proposed Work User Studies TCTool: new MT error classification (Eng2Spa) Different language pair: Mapudungun or Quechua → Spanish Batch vs. Interactive mode Amount of information elicited: just corrections vs. corrections + error information 1. New way of eliciting MT error information: simple statements/questions about the error, possibly ordered from most informative to least informative. Eng2Spa II: test the new version of the TCTool (v0.2) and compare error correction and classification accuracy results. Need to find the tradeoff between safety and generality. In particular, how much does the "interactivity" add (in safety) to the most general batch settings? (Alon's suggestion) This will probably result in at least 4-8 more user studies. Interactive and Automatic Rule Refinement
53
Evaluation of Refined MT Output
1. Evaluate the best translation: automatic evaluation metrics (BLEU, NIST, METEOR) 2. Evaluate the translation candidate list size: precision (includes parsimony) Automatic evaluation of refined MT output is necessary to maximize translation quality and coverage while deciding on what RR operations to apply. (At least 2 users correcting the same set of sentences → 2 or more reference translations (sets of user corrections).) Interactive and Automatic Rule Refinement
54
1. Evaluate Best translation
Hypothesis file (translations to be evaluated automatically) Raw MT output: Best sentence (picked by user to be correct or requiring the least amount of correction) Refined MT output: Use METEOR score at sentence level to pick best candidate from the list Run all automatic metrics on the new hypothesis file using user corrections as reference translations. Assumption: user corrections = gold standard reference translations Method: compare raw MT output with MT output by the refined grammar and lexicon using automatic evaluation metrics, such as BLEU and METEOR Interactive and Automatic Rule Refinement
55
2. Evaluate Translation Candidate List
Precision = tp / (tp + fp), where tp is binary {0,1} per candidate (1 = matches a user correction) and tp + fp is the total number of TL candidates. [Diagram: five example SL sentences, each with its list of TL candidates; candidates matching a user correction are marked.] "tp" is binary, since a candidate either matches a user correction (reference translation), 1, or it doesn't, 0. Precision is also a measure of parsimony, since if the size of the candidate list grows, precision decreases. Say: 1: 3 incorrect → precision = 0; 2: 1 correct, 4 incorrect → precision = 0.2; 3: 2 correct and 3 incorrect, but precision is still 0.2, since the last correct translation is not like any user correction (reference translation), so the system doesn't know it's correct and it doesn't contribute to the final score. This demonstrates the problem of automatic evaluation metrics; however, there is no predefined set of all possible translations (fn), so if no user corrected the sentence in that way, it won't be picked up by the measures. That is why recall cannot be used in this context to measure improvement: recall (tp/(tp+fn)) makes no sense here, since we would need to know all the possible correct translations in order to calculate fn (correct translations not identified by users). 4: 1 correct, 2 incorrect → precision = 0.333; 5: 1 correct, 1 incorrect → precision = 0.5. Per user correction: 0/3, 1/5, 1/5, 1/3, 1/2. Interactive and Automatic Rule Refinement
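A minimal sketch of the candidate-list precision described above, assuming each candidate list is a plain list of strings and the user corrections serve as the reference set; the toy data mirror the counts in the slide's example, not real system output.

```python
def candidate_precision(candidates, user_corrections):
    """Binary tp per candidate (1 if it matches a user correction), divided by list size."""
    if not candidates:
        return 0.0
    tp = sum(1 for c in candidates if c in user_corrections)
    return tp / len(candidates)

# Illustrative lists echoing the slide's counts: 0/3, 1/5, 1/5, 1/3, 1/2
examples = [
    (["a", "b", "c"],            set()),    # 3 incorrect                         -> 0.0
    (["ok", "b", "c", "d", "e"], {"ok"}),   # 1 matches a correction, 4 don't     -> 0.2
    (["ok", "x", "b", "c", "d"], {"ok"}),   # 2nd correct TL not in references    -> still 0.2
    (["ok", "b", "c"],           {"ok"}),   #                                      -> 0.333...
    (["ok", "b"],                {"ok"}),   #                                      -> 0.5
]
for cands, refs in examples:
    print(round(candidate_precision(cands, refs), 3))
```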
56
Expected Contributions
An efficient online GUI to display translations and alignments and solicit pinpoint fixes from non-expert bilingual users. An expandable set of rule refinement operations triggered by user corrections, to automatically refine and expand different types of grammars. A mechanism to automatically evaluate rule refinements with user corrections as reference translations. 1. almost completed; 2. % done (laying out the theory and figuring out how to do it is an important contribution); 3. 70% (method identified, easy to implement)
57
Interactive and Automatic Rule Refinement
Thesis Timeline Research components Duration (months) Back-end implementation User Studies Resource-poor language (data + manual grammar) 2 Adapt system to new language pair 1 Evaluation Write and defend thesis Total Acknowledgements Committee Avenue team Friends and colleagues Expected graduation date: May 2006 Interactive and Automatic Rule Refinement
58
Interactive and Automatic Rule Refinement
Thanks! Questions? Interactive and Automatic Rule Refinement
59
Interactive and Automatic Rule Refinement
60
Interactive and Automatic Rule Refinement
61
Interactive and Automatic Rule Refinement
Some Questions What if users corrections are different (user noise)? More than one correction per sentence? Wc example Data set Where is appropriate to refine vs bifurcate? Lexical bifurcate Is the refinement process deterministic? Interactive and Automatic Rule Refinement
62
Interactive and Automatic Rule Refinement
Others TCTool Demo Simulation RR operation patterns Automatic Evaluation feasibility study AMTA paper results User studies map Precision, recall, F1 NIST, BLEU, METEOR Interactive and Automatic Rule Refinement
63
Interactive and Automatic Rule Refinement
Precision, Recall and F1 Precision = tp / (tp + fp) (fp: selected, incorrect) Recall = tp / (tp + fn) (fn: correct, not selected) F1 = 2PR / (P + R) F measures overall performance. F1 measures an even combination of precision and recall. It is defined as 2*p*r/(p+r). The F1 measure is often the only simple measure that is worth trying to maximize on its own; consider the fact that you can get a perfect precision score by always assigning zero categories, or a perfect recall score by always assigning every category. A truly smart system will assign the correct categories and only the correct categories, maximizing precision and recall at the same time, and therefore maximizing the F1 score. Percentage of things I got right = accuracy; percentage of things I got wrong = error (usually not good measures, since tn is huge!) back to main Interactive and Automatic Rule Refinement
64
Automatic Evaluation Metrics
BLEU: averages the precision for unigram, bigram and up to 4-grams and applies a length penalty [Papineni, 2001]. NIST: instead of n-gram precision the information gain from each n-gram is taken into account [NIST 2002]. METEOR: assigns most of the weight to recall, instead of precision and uses stemming [Lavie, 2004] BLEU: length penalty if the generated sentence is shorter than the best matching (in length) reference translation The NIST metric is derived from the BLEU evaluation criterion but differs in one fundamental aspect: Idea behind NIST score: gives more credit if a system gets an n-gram match that is difficult, but to give less credit for an n-gram match which is easy Recent research has shown that a balanced harmonic mean (F1 measure) of unigram precision and recall outperforms the widely used BLEU and NIST metrics for Machine Translation evaluation in terms of correlation with human judgments of translation quality. METEOR (Metric for Evaluation of Translation with Explicit word Ordering) [lavie04] performs a maximal-cardinality match between translations and references, and uses the match to compute a coherence-based penalty. This computation is done by assessing the extent to which the matched words between translation and reference constitute well ordered coherent ``chunks'' back to main Interactive and Automatic Rule Refinement
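For intuition only, here is a much simplified, single-reference BLEU-style score (clipped n-gram precision up to 4-grams with a brevity penalty). It omits many details of the official BLEU, NIST, and METEOR implementations and is not a substitute for them.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(hypothesis, reference, max_n=4):
    """Toy single-reference BLEU-like score: clipped n-gram precision + brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_ngrams.values())
        if total == 0:
            return 0.0
        clipped = sum(min(count, ref_ngrams[g]) for g, count in hyp_ngrams.items())
        if clipped == 0:
            return 0.0          # crude handling: any empty n-gram match zeroes the score
        log_precisions.append(math.log(clipped / total))
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(log_precisions) / max_n)

print(simple_bleu("Gaudi era un gran artista", "Gaudi era un gran artista"))              # 1.0
print(simple_bleu("Gaudi era un artista grande", "Gaudi era un gran artista", max_n=2))   # ~0.63
```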
65
Interactive and Automatic Rule Refinement
Proposed Work Data Set Split the development set (~400 sentences) into: Dev set: run user studies, develop the Refinement Module, validate functionality. Test set: evaluate the effect of the refinement operations. + Wild test set (from naturally occurring text). Requirement: sentences need to be fully parsed by the grammar, i.e. 1. they can be fully parsed by the original manual grammar ( rules) [right now ~40 rules (from 12 in the previous manual grammar)] from the Typological and Structural Elicitation Corpus, categorized by error. MT error correction → larger data set; MT error correction + error info → smaller data set. back to questions Interactive and Automatic Rule Refinement
66
Interactive and Automatic Rule Refinement
Refine vs Bifurcate Batch mode → bifurcate (no way to tell if the original rule should never apply) Interactive mode → refine (change the original rule) if we can get enough evidence that the original rule never applies. Corrections involving agreement constraints seem to hold for all cases → refine (open research question) Batch mode: need to be conservative, no way to tell... unless there are plenty of relevant minimal-pair examples in the test set that support the hypothesis that the general rule is incorrect for all cases. back to questions Interactive and Automatic Rule Refinement
67
More than one correction/sentence
[Diagram: two corrections, A and B, made to different words of the same TL sentence; they are treated as two separate errors and processed one at a time (A first, then B).] Tetris approach to Automatic Rule Refinement. Assumption: different corrections to different words → different errors. 2 different corrections to 2 different words → assume they are different corrections. Exception: structural divergences (I swam across the lake → nade a traves del lago → cruze el lago nadando) → next slide back to questions Interactive and Automatic Rule Refinement
68
Exception: structural divergences
He danced her out of the room → *La bailó fuera de la habitación (gloss: her he-danced out of the room) → La sacó de la habitación bailando (gloss: her he-take-out of the room dancing) We have no way of knowing that these corrections are related. Do one error at a time; if TQ decreases over the test set, hypothesize that it's a divergence. Feed it to the Rule Learner as a new (manually corrected) training example. How can we automatically distinguish between structural divergences and totally garbled sentences? It essentially does what Bonnie's divergence module (DUSTER) does. back to questions Interactive and Automatic Rule Refinement
69
Constituent order change
I gave him the tools → *di a él las herramientas (gloss: I-gave to him the tools) → le di las herramientas a él (gloss: him I-gave the tools to him) Desired refinement: VP → VP PP(a PRON) NP becomes VP → VP NP PP(a PRON) We can extract constituent information from the MT output (parse tree) and treat this as one error/correction. Even if the user moves a whole constituent to a different position, if it has more than one word, the RR module will see 2 different correcting actions done to 2 different words; we might want to check in the parse tree (MT output) whether this is the case and, if so, treat it as one correction/error. Interactive and Automatic Rule Refinement
70
More than one correction/error
A+B TL X Example: edit and move same word (like: gran, bailó) Occam’s razor Assumption: both corrections are part of the same error can always validate with automatic evaluation back to questions Interactive and Automatic Rule Refinement
71
Interactive and Automatic Rule Refinement
Wc Example I am proud of you → *Estoy orgullosa de tu, Wc = de, tu → ti (gloss: I-am proud of you-nom) → Estoy orgullosa de ti (gloss: I-am proud of you-oblique) Without the Wc information, we would need to increase the ambiguity of the grammar significantly! +[you → ti]: I love you → *ti quiero (te quiero, …); you read → *ti lees (tu lees, …) If we added a lexical entry for ti (supposing there isn't one already) without any restriction on the context it should appear in, the MT system would now output "ti" as a translation for "you" both in object and subject positions. back to questions Interactive and Automatic Rule Refinement
72
Interactive and Automatic Rule Refinement
Lexical bifurcate Should the system copy all the features to the new entry? Good starting point Might want to copy just a subset of features (possibly POS-dependent) Open Research Question But it more research needs to be done to see which features should get copied. back to questions Interactive and Automatic Rule Refinement
73
Interactive and Automatic Rule Refinement
Automatic Rule Adaptation Interactive and Automatic Rule Refinement
74
Interactive and Automatic Rule Refinement
Automatic Rule Adaptation SL + best TL picked by user Interactive and Automatic Rule Refinement
75
Interactive and Automatic Rule Refinement
Automatic Rule Adaptation Changing “grande” into “gran” Interactive and Automatic Rule Refinement
76
Interactive and Automatic Rule Refinement
Automatic Rule Adaptation back to main Interactive and Automatic Rule Refinement
77
Interactive and Automatic Rule Refinement
Automatic Rule Adaptation 1 2 3 Interactive and Automatic Rule Refinement
78
Interactive and Automatic Rule Refinement
Input to RR module Automatic Rule Adaptation User correction log file Transfer engine output (+ parse tree): sl: I see them tl: VEO LOS tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") ) (NP,0 (PRON,2:3 "LOS") ) ) ) )> sl: I see cars tl: VEO AUTOS tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") ) (NP,2 (N,1:3 “AUTOS") ) ) ) )> back to main Interactive and Automatic Rule Refinement
79
Interactive and Automatic Rule Refinement
Completed Work Types of RR Operations Automatic Rule Adaptation Grammar: R0 → R0 + R1 [= R0' + constr], Cov[R0] → Cov[R0, R1]; R0 → R1 [= R0 + constr = -] + R2 [= R0' + constr =c +], Cov[R0] → Cov[R1, R2]; R0 → R1 [= R0 + constr], Cov[R0] → Cov[R1]. Lexicon: Lex0 → Lex0 + Lex1 [= Lex0 + constr]; Lex0 → Lex1 [= Lex0 + constr]; Lex0 → Lex0 + Lex1 [Lex0 + TLword]; Lex1 (adding a lexical item). bifurcate refine Grammar: Bifurcate: leave the original as is, modify the other one (maybe change word order + make the other one more specific [VP: V NPpron in Spanish]); Make more specific [agreement constraint missing]; Bifurcate: general case (blocking constraint) + specific case (triggering constraint) [pre-nominal adjectives]. Cov in terms of TL patterns: every time R0 is modified → R0', the coverage on the TL side increases. Lexicon: 1. cayeron → se cayeron 2. woman + animate = + 4. sense missing 3. OVW back to main Interactive and Automatic Rule Refinement
80
Manual vs Learned Grammars
Automatic Rule Adaptation Manual inspection and automatic MT evaluation [AMTA 2004]: same test set as the user study (32 sentences); same manual grammar plus an automatically learned grammar using Avenue's Rule Learning module [Probst et al. 2002] (+ added feature constraints). Manual inspection looked at the first 5 translations; automatic evaluation looked at the best translation for each grammar and compared them. The automatic evaluation reflects the finding from the manual evaluation that even though the manual grammar's output is better, it is also the same in many cases; the scores are not that different. Different types of errors → different types of RR operations required. Learned grammars have a higher level of lexicalization and need the RR module to achieve the appropriate level of generalization. Conclusions: the manual grammar will need to be refined to encode exceptions, whereas the learned grammar will need to be refined to achieve the right level of generalization. We expect RR to give the most leverage when combined with the learned grammar. Scores (NIST / BLEU / METEOR): manual grammar 4.3 / 0.16 / 0.6; learned grammar 3.7 / 0.14 / 0.55. back to main Interactive and Automatic Rule Refinement
81
Human Oracle experiment
Completed Work Human Oracle experiment Automatic Rule Adaptation As a feasibility experiment, we compared raw MT output with manually corrected MT output: the difference is statistically significant (confidence interval test). This is an upper bound on how much difference we should expect any refinement approach to make. NIST is the least discriminative of the metrics, since for bigrams that appear only once (a very small set) the system doesn't get any credit. back to main Interactive and Automatic Rule Refinement
82
Interactive and Automatic Rule Refinement
Order deterministic? The application of RR operations is not deterministic; it depends on the order of the corrected sentences input to the system. Example: 1st: gran artista → bifurcate (2 rules); 2nd: casa bonito → add the agreement constraint to only 1 rule (the original, general rule); the specific rule is still incorrect (missing the agreement constraint). Versus: 1st: casa bonito → add the agreement constraint; 2nd: gran artista → bifurcate; both rules have the agreement constraint (optimal order). Given the same set of corrected sentences, the application of RR operations is not deterministic; it directly depends on the order of the corrected sentences input to the system. Establish a Refinement Operation Hierarchy: 1. Refine operations apply first (agreement constraints) 2. Bifurcate operations apply second (different word order, add a word, delete a word) back to questions Interactive and Automatic Rule Refinement
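A tiny sketch of the proposed ordering heuristic, assuming each pending refinement has already been classified as a refine (e.g. add an agreement constraint) or a bifurcate operation; the record format is made up for illustration.

```python
# Assumed record format: each pending operation is tagged with its type.
pending = [
    {"sentence": "gran artista", "op": "bifurcate"},
    {"sentence": "casa bonito",  "op": "refine"},      # add agreement constraint
]

# Refinement Operation Hierarchy: apply refine operations before bifurcate ones,
# so constraints added to a rule are also inherited by any later copies of it.
ORDER = {"refine": 0, "bifurcate": 1}
for op in sorted(pending, key=lambda o: ORDER[o["op"]]):
    print("apply", op["op"], "for", op["sentence"])
```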
83
Interactive and Automatic Rule Refinement
User noise? Solution: Have several users evaluate and correct the same test set threshold, 90% agreement - correction - error information (type, clue word) Only modify the grammar if enough evidence of incorrect rule. in general, we only want to refine the grammar/lexicon, if we have enough evidence that... back to questions Interactive and Automatic Rule Refinement
84
Interactive and Automatic Rule Refinement
Proposed Work User Studies Map [Table: planned user studies crossing grammar type (manual vs. learned), language pair (Eng2Spa, Mapu2Spa), mode (batch vs. interactive / active learning), and amount of information elicited (only corrections vs. corrections + error info), with an X marking each planned combination for the RR module.] Most of these dimensions interact with each other, and thus several sets of user studies need to be run. Interactive and Automatic Rule Refinement
85
Recycle corrections of Machine Translation output back into the system by refining and expanding existing translation rules. In other words, my thesis focuses on recycling non-expert user corrections back into the Machine Translation system by refining and expanding existing translation rules. [Diagram: Left: an initial grammar (small manual grammar or automatically learned grammar). Right: the refined grammar (codes for exceptions to general rules, has more (agreement) constraints, new rules, etc.)]
86
Interactive and Automatic Rule Refinement
1. Correction info only Rule Refinement Operations [Decision tree as before.] Focus 1: fully automatically, just by using correction information. There is no information about the clue word or error type, just the info from the user correcting with the TCTool. 2nd example: +al(ignment) from "se" to "fell". It is a nice house → Es una casa bonito → Es una casa bonita John and Mary fell → Juan y Maria cayeron → Juan y Maria se cayeron Gaudi was a great artist → Gaudi era un artista grande → Gaudi era un gran artista Interactive and Automatic Rule Refinement
87
Interactive and Automatic Rule Refinement
1. Correction info only Rule Refinement Operations [Decision tree as before.] Focus 1: fully automatically, just by using correction information. There is no information about the clue word or error type, just the info from the user correcting with the TCTool. Es una casa bonito → Es una casa bonita; J y M cayeron → J y M se cayeron; Gaudi was a great artist → Gaudi era un artista grande → Gaudi era un gran artista; I will help him fix the car → Ayudaré a él a arreglar el auto → Le ayudare a arreglar el auto Interactive and Automatic Rule Refinement
88
Interactive and Automatic Rule Refinement
1. Correction info only Rule Refinement Operations [Decision tree as before.] Focus 1: fully automatically, just by using correction information. There is no information about the clue word or error type, just the info from the user correcting with the TCTool. I would like to go → Me gustaria que ir → Me gustaria ir; I will help him fix the car → Ayudaré a él a arreglar el auto → Le ayudare a arreglar el auto Interactive and Automatic Rule Refinement
89
Interactive and Automatic Rule Refinement
2. Correction and Error info Rule Refinement Operations [Decision tree as before.] Focus 2: fully automatically, using correction and error information, such as clue words and error type. The rule PP → PREP NP already exists (the level of generality appropriate for the constraint is set by default and can be confirmed through user interaction). I am proud of you → Estoy orgullosa tu → Estoy orgullosa de ti Interactive and Automatic Rule Refinement
90
Interactive and Automatic Rule Refinement
Focus 3 Rule Refinement Operations [Decision tree as before.] This shows the refinements for error types that require a reasonable amount of further user interaction and can be solved by the available correction and error information (Focus 3). -rule: suppose the rule required to generate the new POS sequence is not in the grammar (e.g. if the grammar didn't have a PP). Let's take a closer look at this last example and how the proposed approach would go about refining it. Wally plays the guitar → Wally juega la guitarra → Wally toca la guitarra; I saw the woman → Vi la mujer → Vi a la mujer; I see them → Veo los → Los veo Interactive and Automatic Rule Refinement
91
Interactive and Automatic Rule Refinement
Outside Scope of Thesis Rule Refinement Operations [Decision tree as before.] By definition, there is no minimal pair that will provide the triggering context for this example: John would have to be in the object position (I gave John the book → le di el libro a Juan), and there are only two words in common. If a word is moved outside the local rule, the RR module will feed the corrected sentence pair back into the system (to the Rule Learner) as a new training example. John read the book → A Juan leyó el libro → Juan leyó el libro; Where are you from? → Donde eres tu de? → De donde eres tu? Interactive and Automatic Rule Refinement