LINGUA INGLESE 2A – a.a. 2018/2019 Computer-Aided Translation Technology LESSON 4 Prof.ssa Laura Liucci – laura.liucci@uniroma2.it

Translation-memory systems “One of the most important sources of information to which a translator can have access is a large body of previous translations.” (Kay and Röscheisen, 1993, in Bowker, 2002) “Given the staggering volume of translations produced year after year, it is quite obvious that existing translations contain more solutions to more translation problems than any other available resources.” (Isabelle, 1993, in Bowker, 2002)

Translation-memory systems The concept of Translation Memory has existed for some time: the idea originated in the 1970s. What is a Translation Memory? A TM is a type of linguistic database that is used to store source texts and their translations, explicitly aligned.

Translation-memory systems A TM is a type of linguistic database that is used to store source texts and their translations, explicitly aligned. The texts are broken down into short segments that often correspond to sentences (often, but not always!). Translation Unit: a source-text segment together with its translated equivalent. Most simply, a TM can be viewed as a list of source-text segments explicitly aligned with their target-text counterparts.

Translation-memory systems Translation Unit: a source-text segment aligned with its translated equivalent.

Translation-memory systems A TM can be viewed as a list of source-text segments explicitly aligned with their target text counterparts. Does it ring a bell? The resulting structure of a TM is sometimes referred to as a parallel corpus, or bitext.
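
The "list of aligned segments" view can be sketched in a few lines of Python. This is only an illustration: the class names TranslationUnit and TranslationMemory, the lookup method and the English-Italian sample sentences are assumptions made for the example, not part of any real TM product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TranslationUnit:
    """One source-text segment aligned with its translated equivalent."""
    source: str
    target: str


class TranslationMemory:
    """Hypothetical TM: most simply, a list of aligned segment pairs."""

    def __init__(self) -> None:
        self.units: List[TranslationUnit] = []

    def add(self, source: str, target: str) -> None:
        # Store a new source/target pair as one translation unit.
        self.units.append(TranslationUnit(source, target))

    def lookup(self, source: str) -> Optional[str]:
        # Return the stored translation of an identical source segment, if any.
        for unit in self.units:
            if unit.source == source:
                return unit.target
        return None


tm = TranslationMemory()
tm.add("Save the file.", "Salva il file.")
tm.add("Close the window.", "Chiudi la finestra.")
print(tm.lookup("Save the file."))
```

A real system would index the segments rather than scan the list, but the data model (pairs of source and target segments) stays the same.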

How does a TM system work? Using a TM system, the translator is able to “recycle” previously translated segments. These systems work by automatically comparing new source segments against a database of translations. If a matching segment is found, the system proposes the “old” translation to the user, who then decides whether to use or discard it. SEGMENT: the basic unit in a TM system. But deciding what constitutes a segment isn’t easy!

Segmentation SEGMENT: the basic unit in a TM system. In most instances, the basic unit of segmentation in a TM is the sentence, which is why TMs are sometimes called sentence memories. However, not all texts are written in sentence form (e.g. headings, table cells, etc.). Many TM systems allow the user to define other units of segmentation in addition to sentences, which can include sentence fragments or even entire paragraphs.

Segmentation Deciding what constitutes a segment is not a trivial task! It seems easy to decide that full sentences will qualify as segments, but how can a TM system identify sentences? Punctuation marks such as periods, exclamation points and question marks are typically used to indicate the end of a sentence…

Segmentation Punctuation marks such as periods, exclamation points and question marks are typically used to indicate the end of a sentence… …but what happens in the case of an abbreviation such as Mr. or Dr.? Or in the case of an ellipsis, which can appear in the middle of a sentence? Some of these problems can be solved by incorporating stop lists into the TM system.
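
The punctuation rule and the stop-list fix can be sketched as follows. This is a minimal illustration, not how any particular TM system segments text: the function name segment and the contents of STOP_LIST are assumptions chosen for the example.

```python
import re

# Hypothetical stop list: abbreviations whose final period does not end a sentence.
STOP_LIST = {"Mr.", "Mrs.", "Dr.", "Prof.", "e.g.", "etc."}


def segment(text, stop_list=STOP_LIST):
    """Split text into sentence-like segments at ., ! or ? followed by whitespace."""
    segments = []
    start = 0
    for match in re.finditer(r"[.!?]+\s+", text):
        candidate = text[start:match.end()].strip()
        last_word = candidate.split()[-1]
        # A stop-list abbreviation or an ellipsis is not treated as a sentence end.
        if last_word in stop_list or last_word.endswith("..."):
            continue
        segments.append(candidate)
        start = match.end()
    # Whatever follows the last boundary is the final segment.
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


print(segment("Dr. Rossi arrived late. He said... nothing at all! Why?"))
# → ['Dr. Rossi arrived late.', 'He said... nothing at all!', 'Why?']
```

The sketch also shows why stop lists solve only some of the problems: an ellipsis or an abbreviation such as "etc." that really does close a sentence would wrongly be merged with the sentence that follows.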

Segmentation Another issue related to segmentation is the fact that the segmentation units used in the ST may not correspond exactly to those used in the TT. This lack of one-to-one correspondence can create difficulties for automatic alignment programs.

TM systems: matches Most TM systems present the user with a number of different types of segment matches. What is a match? Matches are correspondences between a new SL segment and one or more “old” translations contained in the database. The most common types are: EXACT matches, FUZZY matches, TERM matches.

TM systems: matches EXACT (also called “perfect” matches): 100% identical, including spelling, punctuation, numbers, even formatting, etc. FUZZY: when a fuzzy match is found, it means that the database contains a segment that is similar to the new one (the similarity can range from 1% to 99%, and the user can set the sensitivity threshold, which is typically between 50% and 99%). TERM: when working in association with a term base (a terminological database), the TM system compares the single terms contained in the new segment with the ones in the term base.
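
The difference between exact and fuzzy matches, and the role of the threshold, can be sketched with Python's standard difflib module. The scoring below is a plain character-based ratio used as a stand-in: commercial TM systems use their own (usually undisclosed) similarity measures, and the function names, the 70% default threshold and the sample memory are assumptions for the example. Term matches are not covered.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-based similarity between two segments, as a percentage (0-100)."""
    return SequenceMatcher(None, a, b).ratio() * 100


def best_match(new_segment, memory, threshold=70.0):
    """Return (kind, score, source, target) for the best stored match, or None.

    memory is a list of (source, target) pairs; threshold is the minimum
    similarity (in percent) below which no match is proposed.
    """
    best = None
    for source, target in memory:
        score = similarity(new_segment, source)
        if best is None or score > best[1]:
            kind = "exact" if source == new_segment else "fuzzy"
            best = (kind, score, source, target)
    if best is None or best[1] < threshold:
        return None
    return best


memory = [
    ("Click the Save button.", "Fai clic sul pulsante Salva."),
    ("Close the window.", "Chiudi la finestra."),
]
print(best_match("Click the Save button.", memory))   # exact match
print(best_match("Click the Print button.", memory))  # fuzzy match
print(best_match("Hello world", memory))              # below threshold: None
```

Lowering the threshold makes the system propose more, but less useful, candidates; raising it keeps only near-identical segments.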

The Translation Memory (TM) A Translation Memory is essentially a type of database: a piece of software that allows a user to store and retrieve information. However, as with any database, the information must be provided by the user. Therefore, when the user first purchases a TM system, the database is empty. The system becomes useful when the translator begins to store some data (source and target texts) in the TM.

The Translation Memory (TM) How can we create a Translation Memory? Two main ways: Interactive translation: while we translate the text within the TM system, the new TL segments are fed to and stored in the TM. Post-translation alignment: if we have some source texts and their corresponding translations (translated “the old-fashioned way”), we can upload them to the TM system, ALIGN them and feed the translations to the TM. TMs can be exported and sent!
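
Both ways of filling an empty TM can be sketched as follows. The alignment here is deliberately naive (segments are paired by position), so it only works when source and target have a one-to-one correspondence; real alignment tools handle merged and split segments. The function names align and confirm and the sample sentences are assumptions for the example.

```python
def align(source_segments, target_segments):
    """Naively pair segments by position; valid only for one-to-one correspondence."""
    if len(source_segments) != len(target_segments):
        raise ValueError(
            "segment counts differ (%d vs %d): manual alignment needed"
            % (len(source_segments), len(target_segments))
        )
    return list(zip(source_segments, target_segments))


def confirm(memory, source, target):
    """Interactive translation: store a segment as soon as the translator confirms it."""
    memory.append((source, target))


memory = []  # a newly purchased TM starts empty

# Post-translation alignment: feed an existing translation into the TM.
memory.extend(align(
    ["Open the file.", "Save your changes."],
    ["Apri il file.", "Salva le modifiche."],
))

# Interactive translation: segments are added while translating.
confirm(memory, "Close the window.", "Chiudi la finestra.")

print(len(memory))  # → 3
```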

Examples of TM systems SDL TRADOS STUDIO (http://www.sdl.com/cxc/language/translation-productivity/trados-studio) WORDFAST PRO (http://www.wordfast.com/products_wordfast_pro_3) MEMOQ (https://www.memoq.com/) Freeware: OMEGA T (http://www.omegat.org/it/omegat.html) WORDFAST ANYWHERE (https://www.freetm.com/)

Suitability Given that a TM system allows the user to re-use previously translated work… …in your opinion, which kinds of texts are most suitable for inclusion in a TM?

Suitability The most suitable texts for a TM are repetitive and highly specialized texts, and texts that will be updated or revised: Texts with internal repetitions (the higher the percentage of repetitions, the more desirable it is to use a TMS) Revisions (amended versions of a previous text) Recycled texts (sometimes referred to as external repetitions) Updates (e.g. when the client makes changes to a text that you are already translating)

Pros & cons According to Bowker (2002), the first thing to take into consideration is that an empty TM is of NO use. The performance of the TM system is dependent on the scope and quality of the existing DB… …and the quality of the translations stored in the DB is dependent on the translator’s skills!

Pros & cons PROS: it saves you time …but… CONS: if you can’t use the software properly, it’s time-consuming! You will need a few weeks’ training before a TM system saves you time instead of making you lose it! PROS: it improves consistency (internal and external) CONS: rigidly keeping the ST’s segment order in the TT may affect the naturalness of the translation
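
The "percentage of repetitions" can be estimated before deciding whether a TM is worth using. The sketch below counts how many segments of a text repeat an earlier, identical segment; the function name and the sample segments are assumptions for the example, and real tools also count near-repetitions (fuzzy matches), which this does not.

```python
from collections import Counter


def internal_repetition_rate(segments):
    """Percentage of segments that repeat an earlier segment of the same text."""
    if not segments:
        return 0.0
    counts = Counter(segments)
    # Every occurrence after the first one counts as a repetition.
    repeated = sum(n - 1 for n in counts.values())
    return 100.0 * repeated / len(segments)


manual = ["Press OK.", "Select a file.", "Press OK.", "Press OK."]
print(internal_repetition_rate(manual))  # → 50.0
```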

File formats Different software applications store information in different formats, and Translation Memories are no exception. The format used by any given TM is not necessarily compatible with those of other TMs or TM systems. A standard data-exchange format for TMs, TMX (Translation Memory eXchange), was developed over the years to solve this problem. The purpose of TMX is to make it easier to import and export data between different TM systems without losing or distorting information.

Further considerations TM systems are often quite expensive (even though prices have been dropping and a few free systems are emerging). They tend to have “high” minimum requirements to work properly on a PC (a lot of RAM and a good CPU). Different systems work with different formats (even though some standards are emerging, e.g. .TMX for TMs). Some languages are easier to process than others (especially when it comes to handling segmentation). Using TM systems affects payments, as clients may want to pay less for exact and fuzzy matches (but isn’t it fair, in a way?). A “full” TM is an asset, and issues of ownership may arise.
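
As an illustration of what such an exchange file looks like, the sketch below writes a list of translation units to TMX using Python's standard xml.etree.ElementTree module. TMX is an XML format in which each translation unit (tu) holds one variant (tuv) per language, with the text in a seg element. The function name to_tmx and all header values are placeholders chosen for the example, and the header attributes follow a reading of the TMX 1.4 layout that should be checked against the specification before real use.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def to_tmx(units, src_lang="en", tgt_lang="it"):
    """Serialize (source, target) pairs as a TMX-style XML string."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",   # hypothetical tool name
        "creationtoolversion": "0.1",       # placeholder
        "segtype": "sentence",
        "o-tmf": "none",                    # placeholder for the original TM format
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in units:
        tu = ET.SubElement(body, "tu")
        # One <tuv> (translation unit variant) per language.
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


print(to_tmx([("Save the file.", "Salva il file.")]))
```

Because the result is plain XML, another TM system can read the segments back without needing to know which tool produced them.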

Bibliography BOWKER, L. (2002). Computer-Aided Translation Technology: A Practical Introduction. Ottawa: University of Ottawa Press.

THANKS FOR YOUR ATTENTION… and good luck!  Prof. Laura Liucci – laura.liucci@gmail.com