Post-editing: how to future-proof your career in translation
Paulo Camargo, PhD
Owner, Terminologist
pcamargo@blc.com.br
BLC - Brazilian Localization Company
Web: www.blc.com.br
Purpose of this presentation
– Promote the adoption of Machine Translation (MT) and Post-editing (PE)
  - How we can work faster, produce better translations, and earn more money
– Target audience:
  - Novice, experienced, and advanced freelance translators
  - Small LSPs and in-house translators
Perspectives: novice translator
– Introduce PE as a new profession
  - Background information
  - Current adoption of PE
  - PE productivity and compensation
– Explore the availability of PE training
  - Why a translator needs PE training
  - What the required skills are
  - PE certifications available: TAUS and SDL
Perspectives: experienced/advanced translator
– Use MT output as a translation aid
  - Research shows MT increases productivity
  - Translators prefer post-editing MT output to translating unaided
  - Options: GT, SDL Cloud, MS Hub, Systran
– Advanced: combine MT with terminology management
  - Term extraction and customization of the MT engine
  - Generating and post-editing MT output increases productivity
  - Replace combined TM/on-line TM servers?
Perspectives: small LSP
– Large and medium LSPs have used MT for more than a decade
  - Small LSPs need to catch up
– How to get started on a low budget
  - MT developments have reduced the need for specialized IT
  - Key resource: the in-house translator
  - Terminology management
  - Customizable MT
  - Preliminary analysis and PE guidelines
Definition of post-editing (TAUS)
Post-editing: "the correction of machine-generated translation output to ensure it meets a level of quality negotiated in advance between client and post-editor".
"Post-editing seeks the minimum steps required for an acceptable text."
Background information
– The PE reality is driven by advances in MT
  - Hybrid MT: rule-based / statistics-based
  - Rule-based: dictionaries and rules; e.g. Systran
  - Statistics-based: training data (TM); e.g. GT
– Pre-editing
  - Customization: glossary, training data
  - Preliminary analysis: language rules, client rules, example card
Preliminary analysis (Rico, 2011)
– After engine customization
  - Select MT samples
  - Check term consistency and accuracy (see the sketch below)
  - Check for recurrent MT errors
– Draw up guidelines (quality acceptance)
  - Quality and errors to expect, and how to proceed
  - Language-independent and language-dependent rules
  - Feedback (glossary updates, error reports)
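The term-consistency check described above can be partly scripted. Below is a minimal sketch in Python, assuming a hypothetical glossary.csv (source term, approved target term) and a tab-separated mt_samples.tsv of source/MT segment pairs; the file names and formats are illustrative, not part of Rico's methodology.

```python
import csv

def load_glossary(path):
    """Read a two-column CSV: source term, approved target term (no header row assumed)."""
    with open(path, encoding="utf-8") as f:
        return [(row[0].lower(), row[1].lower()) for row in csv.reader(f) if len(row) >= 2]

def check_consistency(glossary, segments):
    """Flag MT segments where a source term appears but its approved translation does not."""
    issues = []
    for src_seg, mt_seg in segments:
        for src_term, tgt_term in glossary:
            if src_term in src_seg.lower() and tgt_term not in mt_seg.lower():
                issues.append((src_term, tgt_term, mt_seg))
    return issues

if __name__ == "__main__":
    glossary = load_glossary("glossary.csv")              # hypothetical glossary export
    with open("mt_samples.tsv", encoding="utf-8") as f:   # source<TAB>MT output, one pair per line
        segments = [line.rstrip("\n").split("\t", 1) for line in f if "\t" in line]
    for src_term, tgt_term, seg in check_consistency(glossary, segments):
        print(f"Expected '{tgt_term}' for '{src_term}' in: {seg}")
```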
Guidelines for PE (Rico, 2011)
– Language-independent rules
  - Fix terminology, syntax, and morphology
  - Fix misspellings, punctuation, and omissions
  - Edit offensive or inappropriate text
– Language-dependent rules
  - Language-specific examples
  - Example card: expected errors and how to fix them (see the sketch below)
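An example card can be kept as a simple lookup table that post-editors consult while working. The sketch below is an illustration only; the error categories and the Portuguese examples are hypothetical, not taken from the EDI-TA guidelines.

```python
# Hypothetical example card: expected MT error -> example and how to fix it.
# Entries are illustrative only (EN > PT-BR).
EXAMPLE_CARD = {
    "untranslated UI term": {
        "source": "Click the 'Save' button.",
        "mt_output": "Clique no botão 'Save'.",
        "fix": "Apply the approved glossary term: 'Salvar'.",
    },
    "wrong preposition": {
        "source": "compatible with the device",
        "mt_output": "compatível ao dispositivo",
        "fix": "Use 'compatível com o dispositivo'.",
    },
}

for error, card in EXAMPLE_CARD.items():
    print(f"{error}: {card['fix']}")
```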
Custom MT: what to expect (O'Brien, 2002)
– Custom MT produces high-level MT output
  - Most segments are comparable to an 85% TM fuzzy match
  - Some are better than a 100% TM match (review)
  - Some are bad translations: retranslate
– The translator is critical to MT success
– Human assessment is always needed
"Not only will MTPE not replace the translator but it also will not happen without the translator."
Full MT post-editing (Dillinger, 2004)
– Goal: human-quality output
– Most frequent use: higher-visibility texts
– Quality expectations: high (TEP)
  - Grammatically, syntactically, and semantically correct
  - Stylistically appropriate
– Expected productivity: 4K-10K words/day
Current adoption of post-editing
– Common Sense Advisory report (2012)
  - Freelance: 21.7% (15.4% plan to adopt)
  - Small LSP: 32.5% (22.6% plan to adopt)
  - Large LSP: 72.0% (28.0% plan to adopt)
– ALC report (2015)
  - Small LSP: 20.0% (USA), 25.0% (Europe)
– Lionbridge (Marciano, 2015)
  - MT applied to 30% of projects (goal: 50%); 60M (2014)
Post-editing productivity data
– Post-editing productivity (O'Brien, 2006)
  - Equal to or higher than editing high-fuzzy TM matches
  - Typical: 4K to 10K words/day
  - Proficiency: about 100K words (roughly one month of full-time PE)
– Other productivity data
  - Full PE: 5K-8K words/day (DePalma, 2011)
PE compensation: follows the TM fuzzy model
– TM fuzzy matches (Guerberof, 2013)
  - 60-66% of the full translation rate for a 75-94% match
– MT full post-editing
  - 50-70% of the full rate (Guerberof, 2013)
  - 65-68% of the full rate (Marciano, 2015)
  - Smaller companies prefer to pay per hour
(A worked example of how these rates interact with PE throughput follows below.)
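To see how a discounted PE rate can still pay off, here is a worked calculation. All figures are assumptions chosen from the ranges cited on these slides (65% of the full rate, 6K words/day of PE), not actual prices.

```python
# Illustrative comparison of daily income: unaided translation vs. full post-editing.
# All figures below are assumptions within the ranges cited above, not quoted rates.
full_rate_per_word = 0.10          # hypothetical full translation rate (USD/word)
pe_rate_fraction = 0.65            # PE paid at 65% of the full rate (Marciano, 2015 range)

translation_words_per_day = 2500   # assumed unaided daily output
pe_words_per_day = 6000            # within the 4K-10K words/day PE range cited above

translation_income = full_rate_per_word * translation_words_per_day
pe_income = full_rate_per_word * pe_rate_fraction * pe_words_per_day

print(f"Unaided translation: {translation_income:.2f} USD/day")  # 250.00
print(f"Full post-editing:   {pe_income:.2f} USD/day")           # 390.00
```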
Proposal for PE training (O'Brien, 2002)
– PE: what do translators think about it?
  - Dislike of correcting repetitive errors
  - Fear of losing proficiency (from poor MT output)
  - Dislike of limited freedom of expression
– Why do translators need PE training?
  - Different skills: two source texts (the original and the MT output)
  - Different quality requirements and error types
  - A qualified translator is not automatically a successful post-editor
What skills does a post-editor need?
– Same as the translator (O'Brien, 2002)
  - Expert in the subject area and target language
  - Excellent knowledge of the source language
  - Word-processing (WP) skills, tolerance
– Skills specific to the post-editor (Rico, 2011)
  - Advanced WP: regular expressions, search and replace, terminology management (see the sketch below)
  - Positive attitude towards MT
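Advanced search-and-replace with regular expressions is listed above as a post-editor-specific skill. Below is a minimal sketch of the idea, assuming hypothetical recurrent MT errors (space before punctuation, doubled spaces, one agreement error) purely as illustration.

```python
import re

# Hypothetical recurrent MT errors and their fixes: (pattern, replacement).
FIXES = [
    (re.compile(r"\s+([,.;:!?])"), r"\1"),                 # remove space before punctuation
    (re.compile(r" {2,}"), " "),                           # collapse multiple spaces
    (re.compile(r"\bO software são\b"), "O software é"),   # illustrative agreement error (PT-BR)
]

def post_edit(segment: str) -> str:
    """Apply the regular-expression fixes to one MT segment."""
    for pattern, replacement in FIXES:
        segment = pattern.sub(replacement, segment)
    return segment

print(post_edit("O software são  instalado , em seguida ."))
# -> "O software é instalado, em seguida."
```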
Proposal for a PE course (O'Brien, 2002)
– Theoretical component
  - Introduction to PE/MT technology and controlled language
  - Advanced terminology management and text linguistics
  - Basic programming skills
– Required background
  - Translation skills; basic linguistics and terminology management
  - IT skills; introduction to language technology; source/target language skills
Sources for PE certification
– TAUS (Translation Automation User Society)
  - English into 23 languages (European, Arabic, Asian)
  - Also Spanish into English
  - Cost: 60 euros (members), 80 euros (non-members)
– SDL MT PE Certification
  - Free with SDL Language Cloud MT
Perspectives: experienced/advanced translator
– What possibilities can MT offer other than post-editing?
– Is it worth using MT output as an aid to increase translation productivity?
– Can MT advantageously replace combined TM/on-line TM servers?
Efficiency of PE for language translation
– Rigorous, controlled analysis (Green, 2013)
  - Hypothesis 1: PE reduces translation time
  - Hypothesis 2: PE increases quality
  - Hypothesis 3: MT primes the translator
– Compared PE vs. unaided translation
  - Blind experiment: translators did not know GT was used
  - Pre-interview: translators showed a strong dislike of MT
  - 16 professional translators per pair: EN-AR, EN-FR, EN-DE
Results clarify the value of post-editing
– Which was faster? PE (69%)
– Were the suggestions useful? Yes 56%, No 29%, Unsure 15%
– The suggestions improved quality (for all language pairs)
– MT output primes the translator
  - Post-edited text (closer to MT) differs from both unaided translation and raw MT
  - The less experienced the translator, the closer the result stays to the MT
Does MT output increase productivity?
– Example 1: Google Translate
  - Now a paid service: $20 per million characters
  - Plug-in for SDL Trados and other CAT tools (a direct API call is sketched below)
  - General statistical MT engine
  - Not customizable
  - Confidentiality issues
  - See the appendix for the complete setup procedure
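For reference, the paid Google Translate service is a REST API that CAT-tool plug-ins call behind the scenes. Below is a minimal sketch of a direct call to the public v2 endpoint with the requests library, assuming you already have an API key; it is not the Trados plug-in setup procedure mentioned above.

```python
import requests

API_KEY = "YOUR_API_KEY"  # obtained from the Google Cloud console
URL = "https://translation.googleapis.com/language/translate/v2"

def translate(text, source="en", target="pt"):
    """Send one segment to the Google Translate v2 API and return the raw MT output."""
    response = requests.post(
        URL,
        params={"key": API_KEY},
        data={"q": text, "source": source, "target": target, "format": "text"},
    )
    response.raise_for_status()
    return response.json()["data"]["translations"][0]["translatedText"]

print(translate("Post-editing seeks the minimum steps required for an acceptable text."))
```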
Does MT output increase productivity?
– Example 2: SDL Cloud MT
  - Price range: $5-$75/month (Expert)
  - Plug-in for SDL Trados and other CAT tools
  - Complete confidentiality (nothing is stored)
  - Pre-trained engines: Travel, IT, Life Sciences, Automotive, Consumer Electronics
  - Customizable MT: you can add your own glossaries
  - Comprehensive analytics (quality analysis)
Does MT output increase productivity?
– Example 3: Microsoft Translator Hub
  - Plug-in for SDL Trados and other CAT tools; secure
  - First 2M characters free; $40 for 4M characters/month
  - Fully customizable MT engine
    - Train with previous translations (> 20K words); see the sketch below
    - Add glossaries
    - Request training and evaluate the results
    - Option to "Use Microsoft Models"
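Preparing "previous translations" as training material usually means exporting the TM to TMX and extracting aligned segment pairs. Below is a minimal sketch, assuming a hypothetical memory.tmx export with en-US and pt-BR variants; the Hub's own upload formats and word-count requirements are not covered here, and inline tags are ignored for simplicity.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def extract_pairs(tmx_path, src_lang="en-US", tgt_lang="pt-BR"):
    """Yield (source, target) segment pairs from a TMX export (inline tags ignored)."""
    tree = ET.parse(tmx_path)
    for tu in tree.getroot().iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            lang = tuv.get(XML_LANG) or tuv.get("lang")
            seg = tuv.find("seg")
            if lang and seg is not None and seg.text:
                segs[lang] = seg.text.strip()
        if src_lang in segs and tgt_lang in segs:
            yield segs[src_lang], segs[tgt_lang]

# Write two parallel plain-text files (one segment per line) for engine training.
pairs = list(extract_pairs("memory.tmx"))  # hypothetical TMX export from the TM
with open("train.en", "w", encoding="utf-8") as f_en, open("train.pt", "w", encoding="utf-8") as f_pt:
    for src, tgt in pairs:
        f_en.write(src + "\n")
        f_pt.write(tgt + "\n")
print(f"{len(pairs)} segment pairs extracted")  # aim for well over 20K words, as noted above
```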
How about confidentiality?
– Consider, e.g., Microsoft and Google
  - Among the largest providers of MT
  - Among the largest buyers of translation
  - They already control information flow around the globe
– Confidentiality should not be a problem
  - If Google is not an option, use MS Hub or SDL Cloud
  - If you are uncomfortable sending data to MS/SDL, use a desktop/server solution: Systran
Changes in the MT offering for translators
– Common scenario for a translator (4 years ago)
  - One affordable desktop product (Systran)
  - Macros, regular expressions, format conversion
  - No plug-in for CAT tools (only in high-end products)
– Current scenario
  - Software as a service (GT, SDL, MS Hub)
  - Plug-ins for CAT tools are standard
  - Much lower IT requirements
Can MT replace combined/on-line TMs?
– Experienced/advanced translators
  - Use combined/on-line TMs for productivity
  - Proud users: productivity up around 50%; see the TM as an asset
  - But TMs are error-prone (inconsistencies, mistranslations)
  - Term consistency still needs to be checked
– MT has improved a lot in the last 5 years
  - Translators trust TM fuzzy matches more than raw MT (Guerberof, 2008)
  - Yet they mistake MT output for TM output (i.e., for human translation)
MTPE can provide a better result
– Avoid the problems of combined/on-line TMs:
  - Terminology inconsistencies
  - Mistranslations
  - Time wasted correcting TUs that will never be reused
– New approach using MTPE
  - Extract terms (Systran, rule-based); a generic sketch follows below
  - Customizable MT (SDL Cloud or MS Hub)
  - Post-edit fresh MT output for higher productivity and quality
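The term-extraction step can also be approximated without a dedicated tool: counting recurring word sequences in the source text yields a candidate list for the glossary. The sketch below is a generic frequency-based approach, not Systran's rule-based extraction; the source.txt file name and the stopword list are assumptions.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "on", "with", "is", "are"}

def candidate_terms(text, max_len=3, min_freq=3):
    """Count recurring 1-3 word sequences that do not start or end with a stopword."""
    words = re.findall(r"[a-zA-Z][a-zA-Z-]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return [(term, freq) for term, freq in counts.most_common() if freq >= min_freq]

with open("source.txt", encoding="utf-8") as f:  # hypothetical source text to be translated
    for term, freq in candidate_terms(f.read())[:50]:
        print(f"{freq:4d}  {term}")
```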
How small LSPs can get started
– Scenario: 40% of translators use MT (TAUS)
– An actual post-editing offer (2015)
  - PE of "our MT engine output" (GT?)
  - Payment: 900 words/hour
  - Instruction: "as readable as possible"
  - No pre-editing: no customization, glossary, or guidelines
– Translators were really upset
Need more than just Google Translate
– Pre-editing: customization and preliminary analysis
  - Key resource: the in-house translator (O'Brien, 2002)
– Allocate a translator for MTPE activities
  - Use secure on-line customizable engines
  - Define suitable projects
  - Invest in terminology management
  - Develop PE guidelines
  - No rate discount initially (learning curve)
MT implementation at BLC (4 years)
– Smaller projects: 10-50K words
  - Terminology extraction (Systran, rule-based)
  - Normal translation + editing procedure
  - Semi-customized MT: SDL Cloud + MultiTerm
– Larger projects: > 50K words
  - Extract a bigger glossary (higher coverage)
  - Raw MT (Systran, SDL Cloud, MS Hub)
  - PE + editing (no pre-editing, no discount)
MT implementation at BLC
– What does MT do for BLC?
  - Leverages my knowledge of engineering and science
  - Increases productivity (more and larger projects)
  - Increases quality (terminology and TM updates)
– Future developments
  - MS Hub, SDL Cloud, Systran
  - Hire a new translator (2016)
  - Develop a PE team/service
Conclusion
The combination of machine translation (MT) and post-editing (PE) is a disruptive innovation that can improve translators' productivity and translation quality, no matter how you plan to use it. Can you afford to ignore it?
References
Guerberof, Ana (2008). Productivity and Quality in the Post-editing of Outputs from Translation Memories and Machine Translation. Master's Dissertation, Universitat Rovira i Virgili.
O'Brien, Sharon (2006). Eye-tracking and Translation Memory Matches. Perspectives: Studies in Translatology 14 (3), 185-205.
Green, Spence et al. (2013). The Efficacy of Human Post-Editing for Language Translation. ACM Human Factors in Computing Systems (CHI). Computer Science Department, Stanford University.
Rico, Celia et al. (2011). EDI-TA: Post-editing Methodology for Machine Translation. MultilingualWeb-LT.
O'Brien, Sharon (2002). Teaching Post-editing: A Proposal for Course Content. Proceedings of the 6th EAMT Workshop on Teaching Machine Translation. EAMT/BCS, UMIST, Manchester, UK. 99-106.
DePalma, Donald (2011). Trends in Machine Translation. Common Sense Advisory.
Dillinger, Mike et al. (2004). Implementing Machine Translation. LISA Best Practice Guides.
TAUS (2014). MT Post-editing Guidelines.
Marciano, Jay (2015). Personal communication.