Evaluating Translation Memory Software
Francie Gow
MA Translation, University of Ottawa
Translator, Translation Bureau, Government of Canada
www.chandos.ca/thesis.html



Motivation for Research
– Volume of translation work increasing
– Machine translation not yet ready to meet new demand in a significant way
– Translators increasingly turning to translation memory (TM) software to increase productivity
– Available tools are difficult to compare

What is Translation Memory?
– Translation support software
– Allows users to recycle repetitive translation material
– Compares segments of new text with source material in the database
– If matches are found, retrieves corresponding target text and inserts it into new document
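The compare-and-retrieve cycle described on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `MEMORY` dictionary, the `lookup` function, the 0.75 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure. It does not reflect how any commercial TM tool scores matches.

```python
from difflib import SequenceMatcher

# Hypothetical toy memory: stored source segment -> stored target segment.
MEMORY = {
    "Close the file.": "Fermez le fichier.",
    "Save the document before closing.": "Enregistrez le document avant de fermer.",
}


def lookup(segment, memory, threshold=0.75):
    """Return (stored_source, stored_target, score) for the best match.

    Returns None when no stored segment reaches the (assumed) threshold.
    """
    best = None
    for source, target in memory.items():
        # Similarity in [0, 1]; 1.0 means the two segments are identical.
        score = SequenceMatcher(None, segment, source).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


if __name__ == "__main__":
    # A near-identical new segment retrieves the stored target text.
    print(lookup("Close the files.", MEMORY))
    # An unrelated segment retrieves nothing.
    print(lookup("xyz", MEMORY))
```

In a real tool the retrieved target text would be inserted into the new document for the translator to accept or edit; here it is only returned.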

Automatic Search and Retrieval: Two Approaches
– Sentence-based approach
  – Example: TRADOS
– Character-string-within-a-bitext-based approach (CSB-based approach)
  – Example: MultiTrans
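The contrast between the two approaches can be roughed out in Python. This is only a sketch of the general idea, not a description of how TRADOS or MultiTrans work internally. The function names, the exact-match test standing in for sentence-level retrieval, and the use of `difflib` to find a shared character string are all assumptions.

```python
from difflib import SequenceMatcher


def sentence_match(segment, stored_sentences):
    """Sentence-level lookup (simplified): hit only if the whole segment is stored."""
    return segment in stored_sentences


def longest_shared_string(new_text, stored_text):
    """Character-string lookup (simplified): longest run of characters in both texts."""
    matcher = SequenceMatcher(None, new_text, stored_text, autojunk=False)
    match = matcher.find_longest_match(0, len(new_text), 0, len(stored_text))
    return new_text[match.a:match.a + match.size]


if __name__ == "__main__":
    stored = "Yesterday the committee approved the annual budget in March."
    new = "In June, the committee approved the annual budget again."
    # No whole-sentence hit, but a long shared string is still found.
    print(sentence_match(new, {stored}))
    print(repr(longest_shared_string(new, stored)))
```

The example shows why the two approaches can return different material for the same input: a sentence that is new as a whole may still contain long stretches that already exist in the stored text.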

Two Approaches to TM Evaluation
– Primarily objective approach
  – Edit distance
– Primarily subjective approach
  – Human rating systems

Edit Distance
– Definition: smallest number of insertions, deletions, and substitutions required to change one string […] into another
  (Source: National Institute of Standards and Technology)
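As an illustration of the definition above, here is a minimal Python sketch of the standard dynamic-programming computation (often called Levenshtein distance). It assumes every insertion, deletion, and substitution costs 1 and compares character by character. Other cost schemes and units, such as words, are possible, and the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(a, b):
    """Smallest number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b (unit costs assumed)."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; it starts as the row for i = 0.
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,                      # delete char_a
                curr[j - 1] + 1,                  # insert char_b
                prev[j - 1] + substitution_cost,  # substitute or keep
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))  # 3
```

Applied to TM evaluation, such a function would compare a tool's proposed translation with a model translation, a smaller distance suggesting less editing work for the translator.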

Edit Distance
– Advantages
  – Programmable
  – Once algorithm is developed, evaluation is fast and inexpensive
– Disadvantages
  – Loose approximation of usefulness
  – Definition of edit distance vague and variable
  – Assumes model translation

Human Rating Systems
– Example of a rating system (Sato, 1990)
  – (A) exact match
  – (B) “the example provides enough information about the translation of the whole input”
  – (C) “the example provides information about the translation of the whole input”
  – (F) “the example provides almost no information about the translation of the whole input”

Human Rating Systems
– Advantage
  – More valid than computer-generated results
– Disadvantages
  – Time consuming
  – Applicable to sentences, but not to mixed-language output of MultiTrans
  – Human is influenced by proposals

An evaluation system should be:
– reliable
– valid
– efficiently applicable
(Source: EAGLES Evaluation of Natural Language Processing Systems: Final Report)

Measuring Usefulness
– Usefulness is a function of
  – Validity
  – Time Gain
  – Time Loss

New Evaluation Methodology
– Construction of a corpus in both tools
– Analysis and mark-up of new texts
– Processing of new texts in both tools
– Application of scores
– Analysis of scores

Conclusions
– Resulting methodology produces valid and reliable results
– Not as efficiently applicable as an edit distance algorithm, but highly customizable to a variety of translation contexts
– Applicable to any combination of tools; determines which is best for a given job