LINGUA INGLESE 2A – a.a. 2018/2019 Computer-Aided Translation Technology LESSON 3 prof. ssa Laura Liucci – laura.liucci@uniroma2.it.



Terminology-management systems
A major part of any translation project is identifying equivalents for specialized terms. Subject fields such as computing, manufacturing, law and medicine all have significant amounts of field-specific terminology, and researching the specific terms needed to complete a translation is a time-consuming task. A terminology-management system (TMS) can help with various aspects of the translator’s terminology-related tasks, including the storage, retrieval, and updating of term records.

Terminology-management systems
Effective terminology management can:
- Help to cut costs
- Ensure greater consistency (internally and externally)
- Improve linguistic quality
- Reduce turnaround times for translation

Terminology-management systems: STORAGE
A TMS acts as a repository for consolidating and storing terminological information for use in future translation projects.
IN THE PAST:
- TMSs stored information using one-to-one correspondences.
- Users could only choose from a predefined set of fields.
TODAY:
- TMSs use a relational model, which permits mapping in multiple language directions.
- TMSs adopt a free entry structure, which allows users to define their own set of fields.

Terminology-management systems
Figure: TMS term record with free entry structure (which allows users to define their own set of fields).
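To make the storage model more concrete, here is a minimal sketch of a term record that maps in multiple language directions and carries user-defined fields. The class name `TermRecord`, its attributes, and the sample entry are illustrative assumptions, not the data model of any particular TMS.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TermRecord:
    """One entry in a hypothetical term base (all names are assumptions)."""
    concept_id: int
    # language code -> term, so any language can act as source or target
    terms: Dict[str, str]
    # free entry structure: the user decides which extra fields exist
    fields: Dict[str, str] = field(default_factory=dict)

    def equivalent(self, source_lang: str, target_lang: str, term: str) -> Optional[str]:
        """Return the target-language term if `term` is this record's source term."""
        if self.terms.get(source_lang) == term:
            return self.terms.get(target_lang)
        return None


# Sample entry, invented for illustration only.
record = TermRecord(
    concept_id=1,
    terms={"en": "antivirus software", "it": "software antivirus", "fr": "logiciel antivirus"},
    fields={"domain": "computing", "note": "sample user-defined field"},
)

print(record.equivalent("it", "fr", "software antivirus"))
```

Because the terms are keyed by language rather than stored as fixed source/target pairs, the same record can be queried from Italian to French as easily as from English to Italian.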

Terminology-management systems: RETRIEVAL
Once the terminology has been stored, translators need to be able to retrieve this information. A range of retrieval mechanisms is available:
- A simple look-up to retrieve an exact match.
- Wildcards for truncated search (EX: comput*).
- Fuzzy matching techniques. A fuzzy match retrieves terms that are similar to the requested search pattern, but that do not match it exactly.
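The three retrieval mechanisms listed above can be sketched with the Python standard library. This is only an illustration: the term list, the function names, and the similarity cutoff of 0.7 are assumptions, and real TMSs may use quite different matching algorithms.

```python
import difflib
import fnmatch

# Tiny invented term base, for illustration only.
TERM_BASE = ["computer", "computing", "computation", "compiler", "commuter"]


def exact_lookup(pattern, terms):
    """Simple look-up: return only terms identical to the search pattern."""
    return [t for t in terms if t == pattern]


def wildcard_lookup(pattern, terms):
    """Truncated search: '*' stands for any sequence of characters."""
    return fnmatch.filter(terms, pattern)


def fuzzy_lookup(pattern, terms, cutoff=0.7):
    """Fuzzy match: return terms similar to, but not necessarily equal to, the pattern."""
    return difflib.get_close_matches(pattern, terms, n=5, cutoff=cutoff)


print(exact_lookup("computor", TERM_BASE))     # misspelled pattern: no exact match
print(wildcard_lookup("comput*", TERM_BASE))
print(fuzzy_lookup("computor", TERM_BASE))
```

The misspelled pattern "computor" finds nothing with the exact look-up, while the fuzzy look-up still retrieves "computer" because the two strings are highly similar.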

Terminology-management systems
Figure: Term records retrieved using fuzzy matching.
Figure: Hit lists retrieved for different search patterns.

Terminology-management systems: ACTIVE TERMINOLOGY RECOGNITION AND PRE-TRANSLATION
- A feature of some TMSs, particularly those that operate as part of an integrated package with word processors and translation-memory systems.
- It is essentially a type of automatic dictionary look-up.
- Some systems also permit a more automated extension of this feature, in which a translator can ask the system to do a sort of pre-translation or batch processing of the text.

Terminology-management systems
Figure: Automatic replacement of source-text terms with translation equivalents found in a term base.
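The pre-translation step, in which source-text terms are replaced with equivalents from the term base, could look roughly like the sketch below. The function name `pretranslate`, the sample English-Italian term base, and the choice of case-insensitive whole-word matching are assumptions made for illustration.

```python
import re

# Invented English -> Italian term base; keys are lowercase source terms.
TERM_BASE = {
    "antivirus software": "software antivirus",
    "hard disk": "disco rigido",
}


def pretranslate(text, term_base):
    """Replace every source-text term found in the term base with its equivalent."""
    if not term_base:
        return text
    # Try longer terms first so multi-word terms win over any shorter sub-terms.
    terms = sorted(term_base, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: term_base[m.group(0).lower()], text)


print(pretranslate("Install antivirus software on the hard disk.", TERM_BASE))
# Install software antivirus on the disco rigido.
```

The output is a hybrid text: only the recognized terms are in the target language, and the translator still has to translate the rest of the sentence.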

Terminology-management systems: TERM EXTRACTION
- Also called “term recognition” or “term identification”.
- This process can help a translator build a term base more quickly.
- Even if the extraction is performed by a computer, the resulting list of candidates must be verified by a human. → Semi-automatic (computer-aided) process
- Unlike the word-frequency lists described earlier, term extraction tools attempt to identify multi-word units.
Approaches:
- Linguistic approach
- Statistical approach

Terminology-management systems
Linguistic approach to term extraction: the tool tries to identify word combinations that match particular part-of-speech patterns. EX: NOUN+NOUN or ADJECTIVE+NOUN (typical in English).
In order to implement such an approach, each word in the text must be tagged with its appropriate part of speech. Then, the tool identifies all the occurrences that match the search patterns.
Unfortunately, not all texts can be processed this neatly:
1. Not all the combinations that match the specific patterns can be qualified as terms.
2. Some legitimate terms may be formed according to patterns that have not been pre-programmed into the tool.

Terminology-management systems
Let’s take this text as an example. Adopting a linguistic approach to term extraction, the tool looked for the combinations NOUN+NOUN and ADJ+NOUN.

Terminology-management systems
1. Not all the combinations that match the specific patterns can be qualified as terms.
EX: NOUN+NOUN or ADJ+NOUN: “antivirus software”, “current status” → NOISE
2. Some legitimate terms may be formed according to patterns that have not been pre-programmed into the tool.
EX: “after-the-fact detection” → SILENCE (it is legitimate, but its pattern is PRE+ART+NOUN+NOUN)
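The linguistic approach and its two problems can be reproduced in a few lines. The sketch below assumes the text has already been tagged; the sentence and its hand-assigned tags are invented for illustration and are not the example text from the slide. The tag labels follow the ones used above (NOUN, ADJ, PRE, ART).

```python
# Part-of-speech patterns pre-programmed into the (hypothetical) tool.
PATTERNS = {("NOUN", "NOUN"), ("ADJ", "NOUN")}

# Hand-tagged sample sentence (an assumption; a real tool would use a POS tagger).
TAGGED = [
    ("Check", "VERB"), ("the", "ART"), ("current", "ADJ"), ("status", "NOUN"),
    ("of", "PRE"), ("the", "ART"), ("antivirus", "NOUN"), ("software", "NOUN"),
    ("before", "PRE"), ("relying", "VERB"), ("on", "PRE"),
    ("after", "PRE"), ("the", "ART"), ("fact", "NOUN"), ("detection", "NOUN"),
]


def extract_candidates(tagged):
    """Return every two-word sequence whose tags match one of the patterns."""
    candidates = []
    for (word1, tag1), (word2, tag2) in zip(tagged, tagged[1:]):
        if (tag1, tag2) in PATTERNS:
            candidates.append(f"{word1} {word2}")
    return candidates


print(extract_candidates(TAGGED))
# ['current status', 'antivirus software', 'fact detection']
```

In this sample, "current status" matches ADJ+NOUN although it is arguably a general-language phrase rather than a term (noise). The full unit "after the fact detection" is never proposed because PRE+ART+NOUN+NOUN is not among the patterns (silence); only the fragment "fact detection" is picked up. This is why a human must verify the candidate list.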

Terminology-management systems
Statistical approach to term extraction: the most straightforward statistical approach to term extraction is for a tool to look for repeated series of lexical items, specifying a threshold (the number of times a series must be repeated). If the threshold is two, a series of lexical items must appear at least twice to be recognized.
Unfortunately, not all repeated series qualify as terms (→ NOISE) and not all legitimate candidates are repeated (→ SILENCE).
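A minimal sketch of this frequency-threshold idea follows. It counts repeated two-word series only; the function name, the sample text, and the fixed series length are assumptions for illustration, and real tools typically handle series of varying length.

```python
from collections import Counter


def repeated_series(tokens, n=2, threshold=2):
    """Return every series of n lexical items that occurs at least `threshold` times."""
    counts = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {" ".join(series): count for series, count in counts.items() if count >= threshold}


# Invented sample text, for illustration only.
TEXT = (
    "the antivirus software scans the disk and the antivirus software "
    "reports the current status of the disk"
)

print(repeated_series(TEXT.split(), n=2, threshold=2))
# {'the antivirus': 2, 'antivirus software': 2, 'the disk': 2}
```

With a threshold of two, "antivirus software" is retrieved, but so are "the antivirus" and "the disk", which are noise. Any legitimate term that occurs only once in the text would be missed, which is silence. Note that nothing in the function depends on the language of the text.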

Terminology-management systems
Both the linguistic and the statistical approach have drawbacks, but there is a clear advantage in adopting the latter: the statistical approach is NOT language-dependent, while the linguistic approach is.

Bibliography BOWKER, L. (2002). Computer-Aided Translation Technology: A Practical Introduction, University of Ottawa Press, Ottawa

THANKS FOR YOUR ATTENTION… and good luck!  Prof. Laura Liucci – laura.liucci@gmail.com