1 Joke Daems joke.daems@ugent.be www.lt3.ugent.be/en/projects/robot Supervised by: Lieve Macken, Sonia Vandepitte, Robert Hartsuiker Two sides of the same coin: assessing translation quality through adequacy and acceptability error analysis

2 What makes error analysis so complicated? “There are some errors for all types of distinctions, but the most problematic distinctions were for adequacy/fluency and seriousness.” – Stymne & Ahrenberg, 2012 → Does a problem concern adequacy, fluency, both, or neither? → How do we determine the seriousness of an error?

3 Two types of quality “Whereas adherence to source norms determines a translation's adequacy as compared to the source text, subscription to norms originating in the target culture determines its acceptability.” – Toury, 1995 → Why mix the two?

4 2-step TQA approach: Acceptability = target norms; Adequacy = target vs. source → Quality Assessment

5 Subcategories
Acceptability: Grammar & Syntax, Lexicon, Spelling & Typos, Style & Register, Coherence
Adequacy: Contradiction, Deletion, Addition, Word Sense, Meaning Shift

6 Acceptability: fine-grained
Grammar & Syntax: article, comparative/superlative, singular/plural, verb form, article-noun agreement, noun-adj agreement, subject-verb agreement, reference, missing, superfluous, word order, structure, grammar – other
Lexicon: wrong preposition, wrong collocation, word nonexistent
Spelling & Typos: capitalization, spelling mistake, compound, punctuation, typo
Style & Register: register, untranslated, repetition, disfluent, short sentences, long sentence, text type, style – other
Coherence: conjunction, missing info, logical problem, paragraph, inconsistency, coherence – other

7 Adequacy: fine-grained
Contradiction
Meaning shift: caused by misplaced word, caused by punctuation, other meaning shift
Word sense disambiguation: hyponymy, hyperonymy, terminology, quantity, time
Deletion
Addition
Explicitation
Coherence: inconsistent terminology
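
For tooling purposes (annotation configuration, aggregation scripts), the two-step scheme of slides 5–7 can be captured as plain data. A sketch in Python, with the subcategory lists abridged; the exact encoding is an assumption, not the project's configuration:

# The two-step error taxonomy as a nested dict (subcategories abridged).
TAXONOMY = {
    "acceptability": {
        "grammar & syntax": ["article", "verb form", "word order", "structure"],
        "lexicon": ["wrong preposition", "wrong collocation", "word nonexistent"],
        "spelling & typos": ["capitalization", "spelling mistake", "punctuation", "typo"],
        "style & register": ["register", "untranslated", "repetition", "disfluent"],
        "coherence": ["conjunction", "missing info", "logical problem", "inconsistency"],
    },
    "adequacy": {
        "contradiction": [],
        "meaning shift": ["misplaced word", "punctuation", "other"],
        "word sense disambiguation": ["hyponymy", "hyperonymy", "terminology", "quantity", "time"],
        "deletion": [], "addition": [], "explicitation": [],
        "coherence": ["inconsistent terminology"],
    },
}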

8 How serious is an error? “Different thresholds exist for major, minor and critical errors. These should be flexible, depending on the content type, end-user profile and perishability of the content.” – TAUS, error typology guidelines, 2013 → Give different weights to error categories depending on text type & translation brief
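
The weighting idea is easy to operationalise. Below is a minimal Python sketch of brief-dependent scoring; the category names come from the taxonomy above, but the weight values and the user-manual brief are hypothetical, not the project's actual settings.

# Minimal sketch of brief-dependent error weighting (illustrative only;
# the weights below are hypothetical values for a user-manual brief).
from collections import Counter

BRIEF_WEIGHTS = {
    "word sense disambiguation": 3.0,  # critical: wrong terms mislead users
    "deletion": 3.0,
    "grammar & syntax": 1.0,
    "spelling & typos": 0.5,
    "style & register": 0.25,          # minor for this text type
}

def weighted_error_score(error_labels):
    """Sum the brief-specific weights over a list of annotated errors."""
    counts = Counter(error_labels)
    return sum(BRIEF_WEIGHTS.get(label, 1.0) * n for label, n in counts.items())

# Two typos and one word-sense error:
print(weighted_error_score(
    ["spelling & typos", "spelling & typos", "word sense disambiguation"]))  # 4.0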

9 Reducing subjectivity
– flexible error weights
– more than one annotator
– consolidation phase

10 TQA annotation (brat): 1) Acceptability, 2) Adequacy
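
brat keeps annotations in a standoff file (.ann) next to the text, one text-bound annotation per line: an ID, a tab, the label plus character offsets, a tab, and the covered text. A small reading sketch (the file name and label are made up for illustration; discontinuous spans are ignored):

# Read text-bound annotations from a brat standoff (.ann) file.
# A line looks like: "T1\tWordSense 12 18\tvegen"
def read_brat_ann(path):
    errors = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.startswith("T"):  # skip relations, notes, events
                continue
            ann_id, label_span, surface = line.rstrip("\n").split("\t")
            label, start, end = label_span.split(" ")  # simple spans only
            errors.append((ann_id, label, int(start), int(end), surface))
    return errors

# e.g. read_brat_ann("pe1_adequacy.ann")
# -> [("T1", "WordSense", 12, 18, "vegen"), ...]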

11 Application example: comparative analysis

12 Next step: diagnostic & comparative evaluation
What makes a ST-passage problematic?
How problematic is this passage really (i.e., how many translators make errors)?
Which PE errors are caused by MT?
Which MT errors are hardest to solve?
→ Link all errors to the corresponding ST-passage
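
Linking every error to its ST-passage is mostly bookkeeping. A rough sketch of how the counts behind "how many translators make errors" could be kept (the data layout is an assumption):

# Group errors from several translations by the ST passage they trace back to.
from collections import defaultdict

def build_error_sets(annotations):
    """annotations: iterable of (st_passage_id, translation_id, error_label)."""
    by_passage = defaultdict(lambda: defaultdict(list))
    for passage, translation, label in annotations:
        by_passage[passage][translation].append(label)
    return by_passage

def problem_ratio(by_passage, passage, n_translations):
    """Share of translations with at least one error on this passage."""
    return len(by_passage[passage]) / n_translations

# With the "sweeping the planet" passage from the next slide annotated for
# MT, PE1 and PE2, problem_ratio(..., n_translations=3) would return 1.0.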

13 Source text-related error sets
ST: Changes in the environment that are sweeping the planet...
MT: Veranderingen in de omgeving die het vegen van de planeet tot stand brengen... (wrong word sense)
"Changes in the environment that bring about the brushing of the planet..."
PE1: Veranderingen in de omgeving die het evenwicht op de planeet verstoren... (other type of meaning shift)
"Changes in the environment that disturb the balance on the planet..."
PE2: Veranderingen in de omgeving die over de planeet rasen... (wrong collocation + spelling mistake)
"Changes in the environment that raige over the planet..."

14 Application example: impact of MT errors on PE

15 Summary
Improve error analysis by:
– judging acceptability and adequacy separately
– making error weights depend on the translation brief
– having more than one annotator
– introducing a consolidation phase
Improve diagnostic and comparative evaluation by:
– linking errors to ST-passages
– taking the number of translators into account

16 Open questions
How can we reduce annotation time?
– Ways of automating (part of) the process?
– Limit annotation to a subset of errors?
How can we better implement ST-related error sets?
– Ways of automatically aligning ST, MT, and various TTs at word level?

17 Thank you for listening. For more information, contact: joke.daems@ugent.be. Suggestions? Questions?

18 Quantification of ST-related error sets
ST
– MT (1)
  – MT1 (0.5): wrong word sense (0.5)
  – MT2 (0.5)
– PE (1)
  – PE1 (0.5): other meaning shift (0.5)
  – PE2 (0.5): wrong collocation (0.25) + spelling mistake (0.25)
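
Read as a tree: each error set has total weight 1, every version of the set shares that weight equally, and each error inside a version splits the version's share. A sketch that reproduces the PE numbers above (the division rule is inferred from the figures shown):

# Version weight = 1 / number of versions; error weight = version weight
# divided over the errors in that version.
def error_weights(versions):
    """versions: dict mapping version id -> list of error labels."""
    v_weight = 1.0 / len(versions)
    weights = {}
    for version, errors in versions.items():
        for label in errors:
            weights[(version, label)] = v_weight / len(errors)
    return weights

pe_set = {"PE1": ["other meaning shift"],
          "PE2": ["wrong collocation", "spelling mistake"]}
print(error_weights(pe_set))
# {('PE1', 'other meaning shift'): 0.5,
#  ('PE2', 'wrong collocation'): 0.25, ('PE2', 'spelling mistake'): 0.25}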

19 Inter-annotator agreement
Initial agreement:
– HT&PE acceptability: Exp1 39% (κ=0.32), Exp2 50% (κ=0.44)
– HT&PE adequacy: Exp1 42% (κ=0.31), Exp2 46% (κ=0.30)
– MT acceptability: Exp1 53% (κ=0.49), Exp2 79% (κ=0.77)
– MT adequacy: Exp1 57% (κ=0.46), Exp2 51% (κ=0.41)
Agreement after consolidation:
– HT&PE acceptability: Exp1 67% (κ=0.65), Exp2 81% (κ=0.80)
– HT&PE adequacy: Exp1 82% (κ=0.79), Exp2 94% (κ=0.92)
– MT acceptability: Exp1 84% (κ=0.83), Exp2 95% (κ=0.94)
– MT adequacy: Exp1 94% (κ=0.92), Exp2 86% (κ=0.83)
Correlation between annotators:
– HT&PE acceptability: Exp1 r=0.67 (n=38, p<0.001), Exp2 r=0.95 (n=34, p<0.001)
– HT&PE adequacy: Exp1 r=0.87 (n=38, p<0.001), Exp2 r=0.86 (n=34, p<0.001)
– MT acceptability & adequacy: n/a
Agreement on categories:
– HT&PE acceptability: Exp1 90% (κ=0.89), Exp2 89% (κ=0.88)
– HT&PE adequacy: Exp1 89% (κ=0.87), Exp2 88% (κ=0.83)
– MT acceptability: Exp1 83% (κ=0.81), Exp2 93% (κ=0.93)
– MT adequacy: Exp1 86% (κ=0.79), Exp2 86% (κ=0.82)
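
For reference, figures of this kind are standardly computed as raw agreement, Cohen's κ (chance-corrected agreement) and Pearson's r; a minimal sketch with scikit-learn and SciPy, on toy labels rather than the study's data:

# Toy agreement computation (not the study's data).
from sklearn.metrics import cohen_kappa_score
from scipy.stats import pearsonr

ann1 = ["error", "no error", "error", "error", "no error"]
ann2 = ["error", "no error", "no error", "error", "no error"]

raw = sum(a == b for a, b in zip(ann1, ann2)) / len(ann1)  # 0.8 -> "80%"
kappa = cohen_kappa_score(ann1, ann2)                      # chance-corrected

# Correlation between per-segment error counts of two annotators:
r, p = pearsonr([3, 1, 0, 2], [2, 1, 1, 2])
print(f"{raw:.0%}, kappa={kappa:.2f}, r={r:.2f} (p={p:.3f})")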

