Published by Samuel Adenauer. Modified over 5 years ago.
1
LINGUA INGLESE 2A – a.a. 2018/2019 Computer-Aided Translation Technology LESSON 4 prof. ssa Laura Liucci –
2
Translation-memory systems
One of the most important sources of information to which a translator can have access is a large body of previous translations. (Kay and Röscheisen, 1993, in Bowker, 2002)
Given the staggering volume of translations produced year after year, it is quite obvious that existing translations contain more solutions to more translation problems than any other available resources. (Isabelle, 1993, in Bowker, 2002)
3
Translation-memory systems
The concept of Translation Memory has existed for some time: the idea originated in the 1970s. What is a Translation Memory? A TM is a type of linguistic database that is used to store source texts and their translations, explicitly aligned.
4
Translation-memory systems
A TM is a type of linguistic database that is used to store source texts and their translations, explicitly aligned. The texts are broken down into short segments that often correspond to sentences (often, but not always!).
Translation Unit: a unit made up of a source-text segment and its translated equivalent.
Most simply, a TM can be viewed as a list of source-text segments explicitly aligned with their target-text counterparts.
5
Translation-memory systems
Translation Unit: a source-text segment aligned with its translated equivalent.
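A translation unit can be pictured as a simple pair of strings. The sketch below is only an illustration: the class name `TranslationUnit`, the field names and the English-Italian sentences are assumptions made for this example, not part of any real TM system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationUnit:
    """One source-text segment aligned with its translated equivalent."""
    source: str
    target: str


# A TM viewed as a plain list of aligned segments (illustrative sentences).
tm = [
    TranslationUnit("Press the red button.", "Premere il pulsante rosso."),
    TranslationUnit("Close the cover.", "Chiudere il coperchio."),
]

for unit in tm:
    print(f"{unit.source} -> {unit.target}")
```

Real TM systems usually store more than the two strings (for example language codes and dates), but the source/target pair is the core of every unit.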
6
Translation-memory systems
A TM can be viewed as a list of source-text segments explicitly aligned with their target text counterparts. Does it ring a bell? The resulting structure of a TM is sometimes referred to as a parallel corpus, or bitext.
7
How does a TM system work?
Using a TM system, the translator will be able to “recycle” previously translated segments. These systems work by automatically comparing new source segments against a database of translations. If a matching segment is found, the system will propose the “old” translation to the user, who will then decide whether to use or discard it.
SEGMENT: the basic unit in a TM system. But deciding what constitutes a segment isn’t easy!
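The compare-and-propose cycle can be sketched in a few lines. This is a minimal illustration that only handles identical segments; the function name `propose_translation` and the sample sentences are assumptions made for this example.

```python
from typing import Dict, Optional


def propose_translation(memory: Dict[str, str], new_segment: str) -> Optional[str]:
    """Return the stored translation for an identical source segment, if any."""
    return memory.get(new_segment)


# The memory maps old source segments to their translations (illustrative data).
memory = {"Close the cover.": "Chiudere il coperchio."}

for segment in ["Close the cover.", "Open the cover."]:
    proposal = propose_translation(memory, segment)
    if proposal is None:
        print(f"No match for {segment!r}: translate from scratch")
    else:
        print(f"Proposal for {segment!r}: {proposal} (accept or discard)")
```

The system only proposes: the final decision to use or discard the old translation stays with the translator.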
8
Segmentation
SEGMENT: the basic unit in a TM system.
In most instances, the basic unit of segmentation in a TM is the sentence, and this is why TMs are sometimes called sentence memories. However, not all texts are written in sentence form (e.g. headings, table cells, etc.). Many TM systems allow the user to define other units of segmentation in addition to sentences, which can include sentence fragments or even entire paragraphs.
9
Segmentation
Deciding what constitutes a segment is not a trivial task! It seems easy to decide that full sentences will qualify as segments, but how can a TM system identify sentences? Punctuation marks such as periods, exclamation points and question marks are typically used to indicate the end of a sentence…
10
Segmentation
Punctuation marks such as periods, exclamation points and question marks are typically used to indicate the end of a sentence…
…but what happens in the case of an abbreviation such as Mr. or Dr.? Or in the case of an ellipsis, which can appear in the middle of a sentence? Some of these problems can be solved by incorporating stop lists into the TM system.
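A stop list can be sketched as a set of abbreviations that must not trigger a segment break. The example below is a simplified illustration, not the rule set of any real TM system: the contents of `STOP_LIST` and the function name `segment` are assumptions, and an ellipsis is never treated as a sentence end, even when it actually closes a sentence.

```python
import re

# Hypothetical stop list: abbreviations whose final period does not end a sentence.
STOP_LIST = {"Mr.", "Mrs.", "Dr.", "Prof."}


def segment(text):
    """Split text after ., ! or ? unless the token is in the stop list or ends in an ellipsis."""
    segments, current = [], []
    for token in text.split():
        current.append(token)
        ends_sentence = re.search(r"[.!?]$", token) is not None
        is_ellipsis = token.endswith("...") or token.endswith("…")
        if ends_sentence and token not in STOP_LIST and not is_ellipsis:
            segments.append(" ".join(current))
            current = []
    if current:
        # Keep any trailing text that has no closing punctuation.
        segments.append(" ".join(current))
    return segments


for s in segment("Dr. Rossi called Mr. Smith. He was... surprised! Why?"):
    print(s)
```

Without the stop list, the same text would be cut after "Dr." and "Mr.", producing fragments that are unlikely to match anything in the TM.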
11
Segmentation
Another issue related to segmentation is the fact that the segmentation units used in the ST may not correspond exactly to those used in the TT. This lack of one-to-one correspondence can create difficulties for automatic alignment programs.
12
TM systems: matches
Most TM systems present the user with a number of different types of segment matches. What is a match? Matches are correspondences between a new SL segment and one or more “old” translations contained in the database. The most common types are:
EXACT matches
FUZZY matches
TERM matches
13
TM systems: matches
EXACT (also called “perfect” matches): the segments are 100% identical, including spelling, punctuation, numbers and even formatting.
FUZZY: the database contains a segment that is similar, but not identical, to the new one. The similarity can range from 1% to 99%, and the user can set the sensitivity threshold (the standard is between 50% and 99%).
TERM: when the TM system works in association with a term base (a terminological database), it compares the individual terms contained in the new segment with those in the term base.
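The difference between exact and fuzzy matches, and the role of the threshold, can be sketched as follows. This is an assumption-laden illustration: it uses the character-based ratio from Python's standard `difflib` module as the similarity measure, whereas commercial TM systems use their own (usually undisclosed) measures. The function names and the default threshold of 70 are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-based similarity as a percentage (0-100)."""
    return SequenceMatcher(None, a, b).ratio() * 100


def find_matches(memory, new_segment, threshold=70.0):
    """Return (score, kind, source, target) tuples at or above the threshold, best first."""
    matches = []
    for source, target in memory.items():
        score = similarity(new_segment, source)
        if score >= threshold:
            kind = "EXACT" if source == new_segment else "FUZZY"
            matches.append((score, kind, source, target))
    return sorted(matches, reverse=True)


# Illustrative memory: old source segments mapped to their translations.
memory = {
    "Press the red button.": "Premere il pulsante rosso.",
    "Close the cover.": "Chiudere il coperchio.",
}

for score, kind, source, target in find_matches(memory, "Press the green button."):
    print(f"{kind} {score:.0f}%: {source} -> {target}")
```

Lowering the threshold returns more candidates but also more noise; raising it returns fewer, closer ones.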
14
The Translation Memory (TM)
A Translation Memory is essentially a type of database: software that allows a user to store and retrieve information. However, as with any database, the information must be provided by the user. Therefore, when the user first purchases a TM system, the database is empty. The system becomes useful when the translator begins to store some data (source and target texts) in the TM.
15
The Translation Memory (TM)
How can we create a Translation Memory? Two main ways:
Interactive translation: while we translate the text within the TM system, the new TL segments are fed to and stored in the TM.
Post-translation alignment: if we have some source texts and their corresponding translations (translated “the old-fashioned way”), we can upload them to the TM system, ALIGN them and feed the translations to the TM.
A TM can be exported and sent!
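Post-translation alignment can be sketched in its most naive form: pairing segments by position. This is only an illustration under a strong assumption, namely that source and target contain the same number of segments in the same order; real alignment tools must also handle cases where one source segment corresponds to two target segments or vice versa. The function name `align` and the sample data are hypothetical.

```python
def align(source_segments, target_segments):
    """Pair segments by position; refuse when the counts differ."""
    if len(source_segments) != len(target_segments):
        raise ValueError("Segment counts differ: manual alignment is needed")
    return list(zip(source_segments, target_segments))


# Texts translated earlier, already split into segments (illustrative data).
source_text = ["Press the red button.", "Close the cover."]
target_text = ["Premere il pulsante rosso.", "Chiudere il coperchio."]

# Feed the aligned pairs to the memory.
memory = {}
for source, target in align(source_text, target_text):
    memory[source] = target

print(memory)
```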
16
Examples of TM systems
SDL TRADOS STUDIO
WORDFAST PRO
MEMOQ
Freeware:
OMEGA T
WORDFAST ANYWHERE
17
Suitability
Given that a TM system allows the user to re-use previously translated work…
…in your opinion, which kinds of texts are more suitable for inclusion in a TM?
18
Suitability
The most suitable texts for a TM are repetitive and highly specialized texts, and texts that will be updated or revised:
Texts with internal repetitions (the higher the percentage of repetitions, the more desirable it is to use a TMS)
Revisions (amended versions of a previous text)
Recycled texts (sometimes referred to as external repetitions)
Updates (e.g. when the client makes changes to a text that you are already translating)
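The percentage of internal repetitions can be estimated with a simple count. The sketch below assumes one particular definition (the share of segments that repeat an earlier, identical segment in the same text); TM systems may calculate it differently, and the function name `repetition_rate` is hypothetical.

```python
from collections import Counter


def repetition_rate(segments):
    """Percentage of segments that repeat an earlier segment in the same text."""
    if not segments:
        return 0.0
    counts = Counter(segments)
    # Every occurrence after the first one counts as a repetition.
    repeated = sum(n - 1 for n in counts.values())
    return 100.0 * repeated / len(segments)


text_segments = [
    "Press the red button.",
    "Close the cover.",
    "Press the red button.",
    "Press the red button.",
]
print(f"Internal repetitions: {repetition_rate(text_segments):.0f}%")
```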
19
Pros & cons
According to Bowker (2002), the first thing to take into consideration is that an empty TM is of NO use. The performance of the TM system depends on the scope and quality of the existing database…
…and the quality of the translations stored in the database depends on the translator’s skills!
20
Pros & cons
PROS: it saves you time…
…but…
CONS: if you can’t use the software properly, it is time-consuming! You will need a few weeks’ training before a TM system saves you time instead of costing you time.
PROS: it improves consistency (internal and external).
CONS: rigidly keeping the order of the ST segments in the TT may affect the naturalness of the translation.
21
File formats
Different software applications store information in different formats, and Translation Memories are no exception. The format used by any given TM is not necessarily compatible with those of other TMs or TM systems. To solve this problem, a standard data-exchange format for TMs was developed over the years: TMX (Translation Memory eXchange). The purpose of TMX is to make it easier to import and export data between different TM systems without losing or distorting information.
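Since TMX is an XML-based format, a tiny TMX document can be produced with Python's standard library. The sketch below is a rough illustration only: the element and attribute names follow the TMX 1.4 layout as commonly described (`tmx`, `header`, `body`, `tu`, `tuv`, `seg`), while the header values, the function name `build_tmx` and the sample sentences are placeholders chosen for this example. It should be checked against the official specification before real use.

```python
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serializes as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(units, src_lang="en", tgt_lang="it"):
    """Build a minimal TMX tree from (source, target) pairs."""
    tmx = ET.Element("tmx", version="1.4")
    # Placeholder header values, chosen only for illustration.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in units:
        tu = ET.SubElement(body, "tu")  # one translation unit
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx


root = build_tmx([("Close the cover.", "Chiudere il coperchio.")])
print(ET.tostring(root, encoding="unicode"))
```

Each `tu` element holds one translation unit, with one `tuv` per language, which mirrors the source/target pairs described in the earlier slides.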
22
Further considerations
TM systems are often quite expensive (even though prices have been dropping and a few free systems are emerging).
They tend to have “high” minimum requirements to work properly on a PC (a lot of RAM and a good CPU).
Different systems work with different formats (even though some standards are emerging, e.g. .TMX for TMs).
Some languages are easier to process than others (especially when it comes to handling segmentation).
Using TM systems affects payments, as clients may want to pay less for exact and fuzzy matches (but isn’t it fair, in a way?).
A “full” TM is an asset, and issues of ownership may arise.
23
Bibliography
BOWKER, L. (2002). Computer-Aided Translation Technology: A Practical Introduction. Ottawa: University of Ottawa Press.
24
THANKS FOR YOUR ATTENTION… and good luck!
Prof. Laura Liucci –