Development of an Intelligent Translation Memory MorphoLogic SZAK Publishers Balázs Kis


Development of an Intelligent Translation Memory
MorphoLogic, SZAK Publishers
Balázs Kis
IKTA5-146/2002
Rome, 21 May 2003

Project Details
- Duration: 3 March 2003 – 25 February 2005
- Budget: total 96.8 M HUF [ €]; funding 57.1 M HUF [ €]
- Consortium: MorphoLogic Ltd. (84%), SZAK Publishers Ltd. (16%)
- Project leader: Dr. Gábor Prószéky

The Problem and Its Impact (1)
Current state-of-the-art translation memories
- store previously translated segments together with their translations
- offer look-up for similar source segments, backed by character-based fuzzy indexes
Advantage: this approach is language-independent and inexpensive to develop and support
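The character-based look-up described above can be sketched roughly as follows. This is a minimal illustration, not the implementation of any particular TM product: `difflib.SequenceMatcher` stands in for a real fuzzy index, and the names `fuzzy_score` and `lookup`, the sample memory entry, and its placeholder translation are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def fuzzy_score(a: str, b: str) -> float:
    """Character-based similarity in [0, 1]; a stand-in for a TM fuzzy score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def lookup(memory: dict, source: str, threshold: float = 0.7):
    """Return (score, stored_source, stored_target) for the best stored
    segment scoring at or above the threshold, or None if there is none."""
    best = None
    for stored_source, stored_target in memory.items():
        score = fuzzy_score(source, stored_source)
        if score >= threshold and (best is None or score > best[0]):
            best = (score, stored_source, stored_target)
    return best


# Hypothetical memory content; the target is a placeholder, not real data.
memory = {"Click the OK button.": "<stored translation>"}

hit = lookup(memory, "Click the OK button")          # near-identical wording
miss = lookup(memory, "Completely unrelated text")   # nothing similar stored
```

Because the score depends only on shared character runs, nothing in this look-up is tied to a particular language, which is the advantage named on the slide.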

The Problem and Its Impact (2)
Disadvantages of current TM technologies
- they ignore relationships between syntactic structures
- as a result, long segments, or segments with similar meaning or syntactic structure, often stay hidden
- so many segments stored in the translation memory are effectively lost

Before the project started...
MorphoLogic had at hand
- Human Language Technology (HLT) modules, from morphology to every level of syntactic parsing
- a localisation department with very specific technological needs (still pending)
SZAK Publishers had at hand
- many years' experience with translation and terminology
- a parallel corpus of technical texts of approx. 1.5 million words (being processed for project needs)

Main Objective
Development of a Translation Memory equipped with Linguistic Intelligence
- finding source segments based on their grammatical similarity
- adapting stored translations to the current source segment
Long-term objective: improved translation quality and reduced translation effort (time)

Project Constraints
An important remark: this will be a language-dependent translation memory (linguistic intelligence assumes language-specific HLT modules)
First phase: using English and Hungarian HLT modules

Project Contents
- The result is an integrated CAT tool (CAT = Computer Assisted Translation)
- The tool consists of
  - a terminology management module (already available)
  - a text alignment program
  - a translation memory

Project Phases
1. Planning and Specification (completed)
2. Corpus Building
3. Core Research Phase: Development of Grammatical Proximity Search and Translation Correction modules
4. Implementation of Database Engine
5. Integration and Test Translation

Grammatical Proximity Search
- Research on non-exact matching of phrases and sentences (this is not fuzzy matching!)
  - a procedure for matching grammatical structures normalized by means of syntactic and semantic features
  - critical evaluation of some "traditional" procedures
- Research on adapting stored translations to the current source segment
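The idea of matching normalized grammatical structures can be illustrated with a toy sketch. The slides do not describe the actual procedure, so everything here is an assumption: a small hand-written lexicon (`LEXICON`) stands in for the morphological and syntactic HLT modules, and two segments count as grammatically similar when they reduce to the same skeleton.

```python
import re

# Hypothetical toy lexicon; a real system would derive these categories
# from morphological and syntactic analysis, not from a fixed phrase list.
LEXICON = {
    "PRODUCT": ["FrontPage", "Word"],
    "VIEW": ["Print Layout view", "Page view"],
    "NP": ["the current page", "the second file"],
}


def skeleton(segment: str) -> str:
    """Replace every known phrase with its category label."""
    for category, phrases in LEXICON.items():
        # Try longer phrases first so shorter ones cannot pre-empt them.
        for phrase in sorted(phrases, key=len, reverse=True):
            pattern = r"\b" + re.escape(phrase) + r"\b"
            segment = re.sub(pattern, "<" + category + ">", segment)
    return segment


def grammatically_similar(a: str, b: str) -> bool:
    """Two segments match when their normalized skeletons are identical."""
    return skeleton(a) == skeleton(b)


first = skeleton("FrontPage opens the current page in Page view.")
second = skeleton("Word opens the second file in Print Layout view.")
```

Both sample segments reduce to `<PRODUCT> opens <NP> in <VIEW>.`, so they match even though they share few characters. The category labels double as slots that a later adaptation step could fill.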

A sample match
Stored source segment: FrontPage opens the current page in Page view.
Stored translation: A FrontPage az aktuális oldalt a Page nézetben nyitja meg.
Current source segment (recognized): Word opens the second file in Print Layout view.
Adapted translation: A Word a második fájlt a Print Layout nézetben nyitja meg.
Traditional TMs do not find a match with the default 70% threshold!
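The sample above can be checked with a short sketch. Two assumptions apply. First, `difflib.SequenceMatcher` is only a stand-in for the fuzzy score of a commercial TM, so the exact figure will differ from tool to tool. Second, the phrase pairs in `substitutions` were aligned by hand from the four sentences on the slide; the project's adaptation module would have to derive them from linguistic analysis.

```python
from difflib import SequenceMatcher

stored_source = "FrontPage opens the current page in Page view."
stored_target = "A FrontPage az aktuális oldalt a Page nézetben nyitja meg."
current_source = "Word opens the second file in Print Layout view."

# Character-based similarity between the stored and the current source segment.
char_score = SequenceMatcher(None, stored_source, current_source).ratio()

# Hand-aligned target-side slot fillers (an illustrative assumption):
# each pair maps a phrase in the stored translation to its replacement.
substitutions = [
    ("FrontPage", "Word"),
    ("az aktuális oldalt", "a második fájlt"),
    ("a Page nézetben", "a Print Layout nézetben"),
]


def adapt(translation: str, pairs) -> str:
    """Swap each stored phrase for the phrase required by the current segment."""
    for old, new in pairs:
        translation = translation.replace(old, new, 1)
    return translation


adapted = adapt(stored_target, substitutions)
print(f"character-based score: {char_score:.2f}")
print(adapted)
```

With this stand-in metric the character-based score stays below the 0.7 threshold, while the slot substitution reproduces the adapted translation shown on the slide.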

Expected Results...
- Experiments start: Autumn 2003
- First test version: end of 2003

Further Steps
- Making the tool known in Hungary and abroad
- Improvement of services based on user feedback
- Addition of further language pairs