Published by Scot Virgil Hubbard. Modified over 9 years ago.
1
Development of an Intelligent Translation Memory
MorphoLogic (http://www.morphologic.hu)
SZAK Publishers (http://www.szak.hu)
Balázs Kis (kis@morphologic.hu)
IKTA5-146/2002
Rome, 21 May 2003
2
Project Details
Duration: 3 March 2003 – 25 February 2005
Budget: total 96.8 M HUF [€387,200]; funding 57.1 M HUF [€228,400]
Consortium: MorphoLogic Ltd. (84%), SZAK Publishers Ltd. (16%)
Project leader: Dr. Gábor Prószéky
3
The Problem and Its Impact (1)
Current state-of-the-art translation memories
store previously translated segments together with their translations;
offer look-up for similar source segments, backed by character-based fuzzy indexes.
Advantage: this approach is language-independent and inexpensive to develop and support.
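A character-based fuzzy look-up of the kind described above can be sketched as follows. The slides do not say which similarity measure or index structure existing tools use, so this sketch assumes Python's `difflib.SequenceMatcher` ratio as a stand-in and scans the memory linearly. The names `similarity` and `fuzzy_lookup` and the 0.7 default threshold are illustrative choices, not part of any particular product.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-based similarity in [0, 1].

    Assumption: difflib's ratio stands in for whatever measure a real
    translation memory uses (often an edit-distance variant).
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_lookup(memory, query, threshold=0.7):
    """Return (score, source, translation) tuples at or above the threshold,
    best match first. `memory` is a list of (source, translation) pairs."""
    hits = [(similarity(src, query), src, tgt) for src, tgt in memory]
    return sorted((h for h in hits if h[0] >= threshold), reverse=True)


memory = [
    ("FrontPage opens the current page in Page view.",
     "A FrontPage az aktuális oldalt a Page nézetben nyitja meg."),
]

# A query that differs from the stored segment by only a few characters.
hits = fuzzy_lookup(memory, "FrontPage opens the current page in Preview view.")
for score, src, tgt in hits:
    print(f"{score:.2f}  {src}  ->  {tgt}")
```

Nothing in this look-up knows anything about either language, which is why the approach is cheap to build and also why it cannot see structural similarity between segments that share few characters.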
4
The Problem and Its Impact (2)
Disadvantages of current TM technologies
they ignore relationships between syntactic structures;
therefore long segments, or segments with similar meaning or syntactic structure, often stay hidden;
as a result, many segments stored in the translation memory are never retrieved and are effectively lost.
5
Before the project started...
MorphoLogic had at hand
Human Language Technology (HLT) modules, from morphology to every level of syntactic parsing;
a localisation department with very specific technological needs (still pending).
SZAK Publishers had at hand
many years of experience with translation and terminology;
a parallel corpus of technical texts of approx. 1.5 million words (being processed for project needs).
6
Main Objective
Development of a translation memory equipped with linguistic intelligence:
finding source segments based on their grammatical similarity;
adapting stored translations to the current source segment.
Long-term objective: higher translation quality and less translation effort (time).
7
Project Constraints
An important remark: this will be a language-dependent translation memory, because linguistic intelligence assumes language-specific HLT modules.
First phase: English and Hungarian HLT modules are used.
8
Project Contents
The result is an integrated CAT tool (CAT = Computer Assisted Translation).
The tool consists of
a terminology management module (already available);
a text alignment program;
a translation memory.
9
Project Phases
1. Planning and Specification (completed)
2. Corpus Building
3. Core Research Phase: Development of Grammatical Proximity Search and Translation Correction modules
4. Implementation of Database Engine
5. Integration and Test Translation
10
Grammatical Proximity Search
Research on non-exact matching of phrases and sentences (this is not fuzzy matching!)
A procedure for matching grammatical structures normalized by means of syntactic and semantic features
Critical evaluation of some "traditional" procedures
Research on adapting stored translations to the current source segment
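The idea of matching normalized structures rather than characters can be illustrated with a deliberately small sketch. The slides do not describe the project's actual procedure, so everything here is an assumption: a hand-written toy lexicon (`LEXICON`) stands in for real morphological and syntactic analysis, and `difflib.SequenceMatcher` over token sequences stands in for a real proximity measure.

```python
from difflib import SequenceMatcher

# Toy lexicon mapping surface forms to coarse categories. Hypothetical:
# a real system would derive such features from HLT modules, not a table.
LEXICON = {
    "frontpage": "APP", "word": "APP",
    "current": "ADJ", "second": "ADJ",
    "page": "NOUN", "file": "NOUN",
}


def tokens(segment: str) -> list:
    """Lower-cased word tokens, with the final full stop removed."""
    return segment.rstrip(".").lower().split()


def skeleton(segment: str) -> list:
    """Replace every word found in the lexicon with its category label."""
    return [LEXICON.get(t, t) for t in tokens(segment)]


def proximity(a: list, b: list) -> float:
    """Similarity of two token sequences in [0, 1] (difflib ratio)."""
    return SequenceMatcher(None, a, b).ratio()


stored = "FrontPage opens the current page in Page view."
current = "Word opens the second file in Print Layout view."

raw_score = proximity(tokens(stored), tokens(current))
structural_score = proximity(skeleton(stored), skeleton(current))
print(f"raw tokens: {raw_score:.2f}, normalized skeletons: {structural_score:.2f}")
```

Once the words are abstracted to categories, the two segments share most of their skeleton, so the structural score is higher than the score on the raw tokens. The sketch only shows why normalization helps; it says nothing about how well a given measure would perform on real data.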
11
A sample match
Stored source segment: FrontPage opens the current page in Page view.
Stored translation: A FrontPage az aktuális oldalt a Page nézetben nyitja meg.
Current source segment (recognized): Word opens the second file in Print Layout view.
Adapted translation: A Word a második fájlt a Print Layout nézetben nyitja meg.
Traditional TMs do not find a match with the default 70% threshold!
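The adaptation step in this sample can be mimicked with a hard-coded sketch. It is not the project's method: the source pattern, the target template and the small term table below are all hypothetical and written by hand for this one sentence shape, whereas the project aims to derive such correspondences from linguistic analysis. The sketch only shows what "recognize the structure, then fill the stored translation with new parts" means in practice.

```python
import re

# Hand-written pattern for one sentence shape (assumption, for illustration).
SOURCE_PATTERN = re.compile(
    r"^(?P<app>\w+) opens (?P<obj>the .+?) in (?P<view>.+) view\.$"
)

# Target-side template matching the stored Hungarian translation.
TARGET_TEMPLATE = "A {app} {obj} a {view} nézetben nyitja meg."

# Toy bilingual term table with the object already in the accusative case.
TERMS = {
    "the current page": "az aktuális oldalt",
    "the second file": "a második fájlt",
}


def adapt(current_source: str):
    """Return an adapted translation, or None if the segment does not fit
    the pattern or contains an object phrase missing from TERMS."""
    match = SOURCE_PATTERN.match(current_source)
    if match is None:
        return None
    obj = TERMS.get(match.group("obj"))
    if obj is None:
        return None
    return TARGET_TEMPLATE.format(
        app=match.group("app"), obj=obj, view=match.group("view")
    )


print(adapt("Word opens the second file in Print Layout view."))
```

Applied to the stored source segment, the same function reproduces the stored translation, which is a quick sanity check that the template really encodes that segment pair.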
12
Expected Results...
Experiments start: Autumn 2003
First test version: end of 2003
13
Further Steps
Making the tool known in Hungary and abroad
Improvement of services based on user feedback
Addition of further language pairs