Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement
Ariadna Font Llitjós, Katharina Probst, Jaime Carbonell
Language Technologies Institute, Carnegie Mellon University
AMTA 2004
Outline
- Automatic Rule Refinement
- AVENUE and resource-poor scenarios
- Experiment:
  –Data (eng2spa)
  –Two types of grammar
  –Evaluation results
  –Error analysis
  –RR required for each type
- Conclusions and Future Work
Motivation for Automatic RR
General:
- MT output still requires post-editing.
- Current systems do not recycle post-editing effort back into the system, beyond adding it as new training data.
Within AVENUE:
- Resource-poor scenarios: no manual grammar, or only a very small initial grammar.
- Need to validate the elicitation corpus and the automatically learned translation rules.
AVENUE and resource-poor scenarios
What do we usually have available in resource-poor scenarios? Bilingual users.
- No electronic data available (often a spoken tradition), which rules out SMT or EBMT.
- Lack of computational linguists to write a grammar.
So how can we even start to think about MT?
- That is what AVENUE is all about: Elicitation Corpus + Automatic Rule Learning + Rule Refinement.
AVENUE overview
[Architecture diagram: an Elicitation Tool turns the Elicitation Corpus into a word-aligned parallel corpus; the Rule Learning module, together with handcrafted rules and a morphological analyzer, produces transfer rules and lexical resources; the Run-Time Transfer System uses these to produce a translation lattice; the Translation Correction Tool feeds corrections to the Rule Refinement module.]
Automatic and Interactive RLR
[Diagram: 1st step: a rule R, automatically learned from elicited pairs (SLSentence1–TLSentence1, SLSentence2–TLSentence2), translates SLS3 into TLS3, which the user minimally corrects to TLS3'. 2nd step: the RR module turns R into R' (R refined), so that SLS3 now translates to TLS3'.]
Interactive Elicitation of MT Errors
Assumption: non-expert bilingual users can reliably detect and minimally correct MT errors, given:
–the SL sentence (I saw you)
–up to 5 TL sentences (Yo vi tú, ...)
–word-to-word alignments (I-yo, saw-vi, you-tú)
–(context)
using an online GUI: the Translation Correction Tool (TCTool).
Goal: simplify the MT correction task as much as possible.
User studies: 90% error detection accuracy and 73% error classification accuracy [LREC 2004].
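As a concrete illustration, one elicitation item of this kind can be represented as a small record. The Python sketch below is hypothetical (the class and field names are illustrative, not from the AVENUE codebase); it simply packages the pieces listed above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CorrectionInstance:
    """One TCTool elicitation item (field names are illustrative only)."""
    sl_sentence: str                    # source-language sentence shown to the user
    tl_candidates: list[str]            # up to 5 candidate translations
    alignments: list[tuple[str, str]]   # word-to-word alignments (SL word, TL word)
    corrected_tl: Optional[str] = None  # user's minimally corrected translation

item = CorrectionInstance(
    sl_sentence="I saw you",
    tl_candidates=["Yo vi tú"],
    alignments=[("I", "yo"), ("saw", "vi"), ("you", "tú")],
)
item.corrected_tl = "Yo te vi"  # minimal fix: clitic pronoun and its position
```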
1st Eng2Spa user study [LREC 2004]
Interactive elicitation of error information:
- Manual grammar: 12 rules + 442 lexical entries.
- MT error classification (v0.0): 9 linguistically motivated classes: word order, sense, agreement error (number, person, gender, tense), form, incorrect word, and no translation.
- Test set: 32 sentences from the AVENUE Elicitation Corpus (4 correct / 28 incorrect).
Data Analysis
Interactive elicitation of error information. For 10 (of the 29) users:
- all from Spain (to reduce geographical differences)
- 2 had a Linguistics background
- 2 had a Bachelor's degree, 5 a Master's, and 3 a PhD
We are interested in high precision, even at the expense of lower recall:
- ideally, no false positives (users correcting something that is not strictly necessary)
- we care less about false negatives (errors that were not corrected)
TCTool v0.1
Actions:
- Add a word
- Delete a word
- Modify a word
- Change word order
RR Framework
Find the best RR operations given:
- a grammar (G) and lexicon (L),
- a (set of) source-language sentence(s) (SL),
- a (set of) target-language sentence(s) (TL) and its parse tree (P),
- a minimal correction of TL (TL'),
such that translation quality improves (TQ2 > TQ1). This can also be expressed as:
max TQ(TL | TL', P, SL, RR(G, L))
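Read operationally, the objective says: among candidate refinements of the grammar and lexicon, keep the one whose retranslation of SL scores best against the user's correction TL'. The following Python sketch is a minimal illustration under that reading; candidate_refinements, translate, and tq_score are hypothetical stand-ins, not AVENUE's actual interfaces.

```python
def best_refinement(grammar, lexicon, sl, tl_corrected,
                    candidate_refinements, translate, tq_score):
    """Greedy sketch of max TQ(TL | TL', P, SL, RR(G, L)).

    candidate_refinements yields (grammar', lexicon') variants of (G, L);
    translate runs the transfer system on sl; tq_score compares output to TL'.
    """
    best_g, best_l = grammar, lexicon
    best_tq = tq_score(translate(grammar, lexicon, sl), tl_corrected)  # TQ1
    for g2, l2 in candidate_refinements(grammar, lexicon, sl, tl_corrected):
        tq2 = tq_score(translate(g2, l2, sl), tl_corrected)
        if tq2 > best_tq:  # accept only refinements with TQ2 > TQ1
            best_g, best_l, best_tq = g2, l2, tq2
    return best_g, best_l
```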
Types of RR operations
Grammar:
–R0 → R0 + R1 [= R0' + constr]; Cov[R0] → Cov[R0, R1] (bifurcate)
–R0 → R1 [= R0 + constr]; Cov[R0] → Cov[R1] (refine)
–R0 → R1 [= R0 + constr = -] + R2 [= R0' + constr = c+]; Cov[R0] → Cov[R1, R2] (bifurcate)
Lexicon:
–Lex0 → Lex0 + Lex1 [= Lex0 + constr]
–Lex0 → Lex1 [= Lex0 + constr]
–Lex0 → Lex1 [= Lex0 + TL word]
–∅ → Lex1 (adding a lexical item)
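A minimal sketch of the two grammar operations, assuming a toy rule representation (a name plus a set of feature constraints); the Rule class and helper names are illustrative, not AVENUE's rule formalism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """Toy transfer rule: a name plus a set of feature constraints."""
    name: str
    constraints: frozenset

def refine(r0: Rule, constr) -> list[Rule]:
    # R0 -> R1 [= R0 + constr]: R0 is replaced; coverage shrinks to Cov[R1]
    return [Rule(r0.name + "'", r0.constraints | {constr})]

def bifurcate(r0: Rule, constr) -> list[Rule]:
    # R0 -> R0 + R1 [= R0' + constr]: R0 is kept; an exception rule is added,
    # so coverage grows from Cov[R0] to Cov[R0, R1]
    return [r0, Rule(r0.name + "-exc", r0.constraints | {constr})]

# e.g. require subject-predicate number agreement in an S rule:
s0 = Rule("S,0", frozenset())
print(bifurcate(s0, ("subj num", "=", "pred num")))
```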
Experiment
- Data (eng2spa)
- Grammars: manual vs. learned
- Results
- Error analysis
- Types of RR operations required by each grammar
Data: English-Spanish
Training:
- first 200 sentences from the AVENUE Elicitation Corpus
- lexicon: extracted semi-automatically from the first 400 sentences (442 entries)
Test:
- 32 sentences manually selected from the next 200 sentences in the EC to showcase a variety of MT errors
Manual grammar
- 12 rules (2 S, 7 NP, 3 VP)
- produces 1.6 different translations on average
Learned grammar + feature constraints
- 316 rules (194 S, 43 NP, 78 VP, 1 PP)
- emulated a decoder by reordering 3 rules
- produces 18.6 different translations on average
Comparing Grammar Output: Results
Manually: [manual evaluation results not preserved in the transcript]
Automatic MT evaluation:

                  NIST   BLEU   METEOR
Manual grammar     4.3   0.16    0.60
Learned grammar    3.7   0.14    0.55
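For readers who want to reproduce this style of scoring, sentence-level BLEU can be computed with NLTK; this is only an illustration of the metric, not the evaluation script behind the table above.

```python
# pip install nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["yo", "te", "vi"]        # tokenized reference translation
hypothesis = ["yo", "vi", "tú"]       # tokenized MT output
smooth = SmoothingFunction().method1  # avoid zero scores on short sentences
score = sentence_bleu([reference], hypothesis, smoothing_function=smooth)
print(f"BLEU: {score:.3f}")
```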
Error Analysis
Most of the errors produced by the manual grammar can be classified into:
–lack of subject-predicate agreement
–wrong word order of object (clitic) pronouns
–wrong preposition
–wrong form (case)
–OOV words
On top of these, the learned grammar output exhibited errors of the following types:
–missing agreement constraints
–missing prepositions
–over-generalization
Examples
[Example translations, grouped into: same output (both good), manual grammar better, learned grammar better, different output (both bad); the examples themselves are not preserved in the transcript.]
Types of RR required
Manual grammar: bifurcate a rule to encode an exception:
–R0 → R0 + R1 [= R0' + constr]; Cov[R0] → Cov[R0, R1]
–R0 → R1 [= R0 + constr = -] + R2 [= R0' + constr = c+]; Cov[R0] → Cov[R1, R2]
Learned grammar: adjust feature constraints, such as agreement:
–R0 → R1 [= R0 +/- constr]; Cov[R0] → Cov[R1]
Conclusions
- TCTool + RR can improve both hand-crafted and automatically learned grammars.
- In the current experiment, MT errors differ almost 50% of the time, depending on the type of grammar.
- The manual grammar will need to be refined to encode exceptions, whereas the learned grammar will need to be refined to achieve the right level of generalization.
- We expect RR to give the most leverage when combined with the learned grammar.
Future Work
- An experiment in which user corrections are used both as new training examples for rule learning and to refine the existing grammar with the RR module.
- Investigate using reference translations to refine MT grammars automatically... much harder, since reference translations are not minimal post-edits.
Questions??? Thank you!
2 steps to ARR
1. Interactive elicitation of error information
2. Automatic rule adaptation
Error Correction by bilingual users
MT error typology for RR (simplified)
- missing word
- extra word
- word order (local vs. long-distance, word vs. phrase, word change)
- incorrect word (sense, form, selectional restrictions, idiom, ...)
- agreement (missing constraint, extra agreement constraint)
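If this typology were encoded for the RR module, a flat enumeration would suffice; the sketch below is a hypothetical Python encoding, not the project's actual schema.

```python
from enum import Enum

class MTError(Enum):
    """Simplified MT error typology for rule refinement (illustrative encoding)."""
    MISSING_WORD = "missing word"
    EXTRA_WORD = "extra word"
    WORD_ORDER = "word order"          # local vs. long-distance, word vs. phrase
    INCORRECT_WORD = "incorrect word"  # sense, form, selectional restrictions, idiom
    AGREEMENT = "agreement"            # missing or extra agreement constraint
```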
RR Framework
- types of operations: bifurcate, make more specific/general, add blocking constraints, etc.
- formalizing error information (clue word)
- finding triggering features