1 Session 1 Advantages and Disadvantages of Translation Technology (TT) - Historical development of translation technology - Focus on TM and MT (Theory.

Presentation transcript:

1 Session 1: Advantages and Disadvantages of Translation Technology (TT)
- Historical development of translation technology
- Focus on TM and MT (Theory and Practice)
Objectives:
- To understand the historical background to TT developments
- To develop a balanced view of the pros and cons of TT
- To understand the basic principle behind TM and how it differs from MT

2 (Japanese) Translator’s Life in New Zealand c.a
Life without:
- word-processing
- electronic dictionary
- Google
- translation memory
- Amazon.com

3 Translation Paradigm (80s into early 90s)
- after-thought
- word-processing
- asynchronous
- text for paper-based circulation
- no engineering input
- work in isolation

4 Historical Developments of Translation Technology (TT): 1990s
ICT Development:
- penetration of PCs
- Desktop publishing (DTP)
- speech recognition
- PCs connected via modem
- telework
- Internet (Web)
- Sony PlayStation
- Google
- mobile phones (texting)
TT Development:
- software localisation services
- localisation tools
- data-driven MT
- online term banks
- free WebMT (1997: Babelfish)
- web localisation services
- Translation Memory

5 Translation Technology Continuum (from automation to human involvement)
- Automatic Translation: translation process automated by use of Machine Translation
- Computer-aided Translation (CAT): translation process aided by electronic tools such as Translation Memory
- Unaided Translation: translation process not aided by any electronic tools
Adapted from Hutchins & Somers (1992)

6 Machine Translation (MT)
Rationale for Technology Applications to Translation
“A computer is a device that can be used to magnify human productivity. Properly used, it does not dehumanize by imposing its own Orwellian stamp on the products of human spirit ... Translation is a fine and exacting art, but there is much about it that is mechanical and routine; if this were given over to a machine, the productivity of the translator would not only be magnified but this work would become more rewarding, more exciting, more human.” Martin Kay (1987)

7 Machine Translation (MT)
MT research began in the 1950s.
Warren Weaver’s 1949 memo: “When I look at an article in Russian, I say: This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.” (in Locke and Booth 1955:18)

8 Machine Translation
Initially based on some misconceptions about human translation:
- knowledge of two language systems suffices
- it is a matter of looking up dictionaries
- it is easy to define “a good translation”
- there is only one correct translation possible

9 Machine Translation (MT)
MT history milestones (pre-ALPAC)
- 1954: Georgetown system demo, successful translation of 49 Russian sentences into English
- $50m spent in 20 research centres in USA
- 1966: Automatic Language Processing Advisory Committee (ALPAC) Report concludes:
  - MT was slower, less accurate and twice as expensive as Human Translation
  - there was no prospect of useful MT either immediately or in the future

10 Machine Translation (MT)
MT history milestones (post-ALPAC)
- 1969: privately funded projects: Logos system (1969); Weidner-CAT (1977); ALPS (1980)
- 1975: Météo project in Canada
- 1976: European Commission acquires Systran
- 1979: Eurotra project in Europe for a multilingual system
- 1980: PC-based systems
- 1990: data-driven systems; WebMT

11 Machine Translation (MT)
1975: Météo project in Canada
- Automatic translation of weather forecasts (En -> Fr)
- Sublanguage approach (domain-specific MT)
Most successful MT application to date:
- public broadcasting since
- Fr -> En available since
- only 4% of output needs post-editing
- rapid translation staff turnover no longer a problem

12 Machine Translation (MT)
Renewed interest in MT in 80s and 90s
Technological factors:
- prevalence of PC with improved processing power
Translation market factors:
- official bilingualism/multilingualism create institutional needs
- globalisation creates huge commercial needs
Advances in computational linguistics
More realistic user expectations
Internet creates casual access to multilingual information

13 Machine Translation (MT)
MT Design
Rule-based vs Data-driven Systems (SMT & EBMT)
- Rule-based systems by far the more common
Architecture for Rule-based Systems
- Direct (1st generation MT systems)
- Transfer (2nd generation MT systems)
- Interlingua (2nd generation MT systems)

14 Machine Translation (MT)
MT Design: Transfer-based Systems
- based on systematic linguistic theory
- convenient way of categorising linguistic problems (monolingual or contrastive)
- modular
- need detailed coding of monolingual and bilingual dictionaries and grammars
- a dedicated transfer component is needed for each language pair, in each direction
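The last point has a simple arithmetic consequence that is worth making concrete. If every language pair needs its own transfer component in each direction, then n languages need n × (n - 1) components. The following is a minimal sketch of that count; the function name is illustrative and not part of the original slides.

```python
def transfer_components(n_languages: int) -> int:
    """Count directed language pairs, i.e. one transfer component per pair and direction."""
    if n_languages < 0:
        raise ValueError("number of languages cannot be negative")
    return n_languages * (n_languages - 1)


# The count grows quadratically as languages are added.
for n in (2, 3, 9):
    print(n, "languages ->", transfer_components(n), "transfer components")
```

This quadratic growth is one reason transfer systems become costly for multilingual settings.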

15 Machine Translation (MT)
[Figure: diagram of routes from Source Text to Target Text: direct translation, transfer, and interlingua, with analysis on the source side and generation on the target side]

16 Machine Translation (MT)
MT Design: Data-driven systems: Statistical MT (SMT)
- linguistic knowledge not encoded
- takes advantage of a bilingual parallel corpus to arrive at probable translations of each word
- corpus-dependent
- at run-time the system searches for the best translation (Carl & Way, 2003)
e.g. IBM’s experiment with the Canadian Hansard corpus: Candide (1988)

17 Machine Translation (MT)
MT Design: Statistical MT: Candide
In Canadian Hansard (parliamentary debates of 40K sentences in each of en and fr):
- the -> le   p=.610 (i.e. 610 times out of 1000)
- the -> la   p=.178
- the -> l’   p=.083
- the -> les  p=.023
- the -> ce   p=
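To show how word translation probabilities of this kind can be estimated from a parallel corpus, here is a minimal sketch using expectation-maximisation over co-occurring words, in the style of the simplest IBM word-alignment model. The slides do not say how the Hansard figures were computed, so this is an assumption about the general technique, not a reconstruction of the experiment. The toy corpus and all function names are invented for illustration.

```python
from collections import defaultdict

# Toy English-French sentence pairs, invented for illustration only.
TOY_CORPUS = [
    ("the house".split(), "la maison".split()),
    ("the book".split(), "le livre".split()),
    ("a book".split(), "un livre".split()),
    ("the dog".split(), "le chien".split()),
    ("a dog".split(), "un chien".split()),
]


def train_word_probs(pairs, iterations=10):
    """Estimate t(f | e) by EM, starting from a uniform distribution."""
    f_vocab = {f for _, fs in pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e aligned to anything
        for es, fs in pairs:
            for f in fs:
                # E-step: share each French word among the English words
                # of the same sentence pair, in proportion to current t.
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # M-step: renormalise expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def candidates(t, e_word):
    """Return (french_word, probability) pairs for e_word, most probable first."""
    cands = {f: p for (f, e), p in t.items() if e == e_word}
    return sorted(cands.items(), key=lambda kv: kv[1], reverse=True)


probs = train_word_probs(TOY_CORPUS)
for f, p in candidates(probs, "the"):
    print(f"the -> {f}  p={p:.3f}")
```

With a corpus this small the numbers are not meaningful in themselves; the point is that no linguistic knowledge is encoded and the ranking depends entirely on the corpus.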

18 Machine Translation (MT)
MT Design: Data-driven systems: Example-based MT (EBMT)
Inspired by Nagao (1984), who talked about translation by analogy:
“Man does not translate a simple sentence by doing deep linguistic analysis, rather, man does translation, first, by properly decomposing an input sentence into certain fragmental phrases, then by translating these phrases into other language phrases, and finally by properly composing these fragmental translations into one long sentence. The translation of each fragmental phrase will be done by the analogy translation principle with proper examples as its reference.”

19 Machine Translation (MT)
MT Design: Example-based MT (EBMT)
- It operates on a bilingual corpus with alignments of translation units on word, phrase, and sentence level
- During runtime, the system checks whether an adequate translation is stored in the corpus
- Best results are obtained if large coherent parts are found in the corpus

20 Machine Translation (MT)
MT Design: Example-based MT (EBMT)
1. He buys a book on international politics [ST].
2. a. (E) He buys a notebook.
      (J) Kare wa noto o kau.
          He [topic] notebook [obj] buy.
   b. (E) I read a book on international politics.
      (J) Watashi wa kokusaiseiji nitsuite kakareta hon o yomu.
          I [topic] international politics about concerned book [obj] read
3. Kare wa kokusaiseiji nitsuite kakareta hon o kau [TT].
(Sato & Nagao, 1990)
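The recombination step in this example can be sketched in a few lines. This is a deliberately simplified illustration, assuming the fragment alignments have already been extracted by hand from examples 2a and 2b. A real EBMT system would derive them from the aligned corpus. The data structures and the function name are hypothetical.

```python
# Sentence frames with one object slot, hand-derived from examples 2a and 2b.
# Each entry: (English prefix before the object, Japanese frame with a slot).
FRAMES = [
    ("He buys ", "Kare wa {} o kau"),
    ("I read ", "Watashi wa {} o yomu"),
]

# Object phrases, also hand-aligned from the same two examples.
FRAGMENTS = {
    "a notebook": "noto",
    "a book on international politics": "kokusaiseiji nitsuite kakareta hon",
}


def translate_by_example(sentence):
    """Match a stored frame and a stored fragment, then recombine them."""
    sentence = sentence.strip().rstrip(".")
    for prefix, target_frame in FRAMES:
        if sentence.startswith(prefix):
            filler = sentence[len(prefix):]
            if filler in FRAGMENTS:
                return target_frame.format(FRAGMENTS[filler])
    return None  # no adequate example stored


print(translate_by_example("He buys a book on international politics."))
```

The input sentence appears in neither stored example as a whole, but both of its parts do, which is what makes the recombined output possible.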

21 EBMT Principle
[Figure: the Source Text to Target Text diagram with EBMT labels alongside the rule-based ones: matching (analysis), alignment (transfer), recombination (generation), exact match (direct translation)] (Somers, 2003:8)

22 Translation Memory (TM)
A database of aligned SL and TL segments (translation units) to allow the translator to:
- propagate translations of internal repetitions in the source text through the target text
- recycle translations for previously encountered source text segments (exact matches or fuzzy matches with some edits)
- analyse new source texts for repetitions and matches with already translated texts stored in a translation memory

23 Translation Memory (TM)
How it works:
1. software segments the source language (SL) text
2. human translator translates an SL segment
3. software stores the SL and the corresponding TL segment as a translation unit
4. software checks an incoming SL segment against the stored SL segments and brings up a relevant translation unit in case of a match
5. translator determines whether or not to use or edit the previous translation called up by the software
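The store-and-lookup cycle described on this slide can be sketched as follows. This is a minimal illustration, not how any particular TM product works: the class and function names are hypothetical, the sentence splitter is a naive regular expression, and the character-based similarity from Python's standard `difflib` module, with an assumed threshold of 0.75, stands in for whatever fuzzy-match scoring a real tool uses.

```python
import difflib
import re


def segment_text(text):
    """Naively split text into sentence segments after ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


class TranslationMemory:
    def __init__(self):
        self.units = {}  # SL segment -> TL segment (one translation unit each)

    def add(self, source, target):
        """Store a translated segment as a translation unit."""
        self.units[source] = target

    def lookup(self, segment, threshold=0.75):
        """Return (score, stored_source, stored_target) for the best match, or None."""
        best = None
        for src, tgt in self.units.items():
            score = difflib.SequenceMatcher(None, segment, src).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, src, tgt)
        return best


tm = TranslationMemory()
tm.add("Click the Save button.", "Cliquez sur le bouton Enregistrer.")

for seg in segment_text("Click the Save button. Click the Open button."):
    print(seg, "->", tm.lookup(seg))
```

An exact match scores 1.0. A fuzzy match scores lower and is only offered as a suggestion, since the translator still decides whether to reuse or edit it. Note that the score measures similarity of form, not of meaning.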

24 Translation Memory (TM)
Advantages:
- the translator can find out the degree of internal repetitions within the SL text before translating
- sentence-level matches and similarities are automatically brought to the translator’s attention for re-use
- productivity is boosted when the text type is suitable (i.e. repetitive, frequent updates, sim-ship etc.)
- TM normally integrates concordance and terminology management units to assist consistency of use of words and terminology
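The first advantage, measuring internal repetition before translating, can be illustrated with a short sketch. It assumes a naive sentence splitter and counts only exact repeats. Real tools also report fuzzy repetitions. The function name is hypothetical.

```python
from collections import Counter
import re


def repetition_rate(text):
    """Fraction of segments that exactly repeat an earlier segment in the same text."""
    segments = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not segments:
        return 0.0
    counts = Counter(segments)
    # Every occurrence after the first is a repetition the TM could fill in.
    repeated = sum(n - 1 for n in counts.values())
    return repeated / len(segments)


sample = "Press OK. Select a file. Press OK. Press OK."
print(f"{repetition_rate(sample):.0%} of segments are internal repetitions")
```

A high rate suggests the text type is suitable for TM, in line with the productivity point above.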

25 Translation Memory (TM)
Disadvantages:
- Previous errors contained in the TM are propagated when:
  - the translator forgets to update the TM
  - the translator is asked not to change the existing translation in the TM
- A ‘sentence salad’ phenomenon (Bédard, 2000), which makes a text less coherent or readable, due to:
  - the translator being confined to working at sentence level
  - the translator trying to maximise recyclability
  - the TM consisting of varying texts translated by different translators (Bowker & Barlow, 2004)
- Similarities in form rather than semantic similarities are picked up
- Potential de-skilling of the translator (Kenny, 2004)

26 Exercises on MT and TM
Machine Translation:
- Reverse engineering with the Model Zero approach (Pérez-Ortiz & Forcada, 2001)
- This exercise is designed to allow you to make an intelligent guess about what goes on inside an MT system without looking inside (black box as opposed to glass box evaluation)
Translation Memory:
- Understanding how internal and external repetitions are processed by the system