Smart Computer-Aided Translation Environment
23 September 2014, Leuven

SCATE T6 Meeting with IAC
Objectives
- Increase translation efficiency through more sophisticated use of translation resources
- Increase interpreting efficiency by automatically providing domain knowledge
- Improve electronic language-learning packages through automatic correction of learner input
- Improve multilingual information retrieval through the use of comparable corpora
- Improve speech recognition by making domain-specific terminology available
- Improve workflows and personalised user interfaces for the translation industry

Consortium
KU Leuven
- Department of Linguistics, Centre for Computational Linguistics (CCL)
- Department of Electrical Engineering (ESAT-PSI)
- Department of Computer Science, Informatics Section (LIIR)
- Department of Language and Communication (T&C)
Universiteit Gent
- Department of Translation Studies, Language & Translation Technology Team (LT3)
Universiteit Hasselt
- Department of Computer Science, Expertise Centre for Digital Media (EDM)

Advisory committee
- Xplanation
- Nuance
- Yazzoom
- Mastervoice
- ITP Europe
- Oneliner
- CrossLang
- VRT Onderzoek & Innovatie
- AKTOR Knowledge Technology
- Commart
- Mentoring Systems

Agenda
14.00 Introduction (Frank Van Eynde, CCL)
14.05 Survey of where the project stands (Vincent Vandeghinste, CCL)
14.20 WP1. Improvements in translation technology (Tom Vanallemeersch, CCL)
14.35 WP2. Evaluation of computer-aided translation (Arda Tezcan, LT3)
14.50 WP3. Terminology extraction from comparable corpora (Geert Heyman, LIIR)
15.05 Why we participate (Jan Verhasselt, Yazzoom)

Agenda (continued)
15.50 WP4. Speech recognition accuracy (Patrick Wambacq, ESAT-PSI)
16.05 Why we participate (Eric Bauwelinck, Mastervoice)
16.20 WP5. Workflows and personalised user interfaces (Mieke Haesen, EDM)
16.35 Discussion about interaction between IAC and consortium
17.00 Closure

Smart Computer-Aided Translation Environment
Project overview and general progress
Vincent Vandeghinste

Project Overview
Sponsor: IWT (SBO-130047)
Amount: 3 million euro
Timeframe: 01/03/2014 – 28/02/2018
Consortium
KU Leuven
- Centre for Computational Linguistics (CCL)
- Centre for the Processing of Speech and Images (ESAT/PSI)
- Language Intelligence and Information Retrieval (LIIR)
- Language and Communication (L&C)
UGent
- Language and Translation Technology Team (LT3)
UHasselt
- Expertise Centre for Digital Media (EDM)

WP1 Translation Technology Improvements
Goal: Improve the efficiency of translation memories
1. Improving recall through the development of a syntactic fuzziness metric
2. Improving the integration between TM and MT by automatically correcting the mismatches
3. Improving MT itself

WP2 Evaluation of Computer-Aided Translation
Goals
1. Develop a taxonomy of typical MT errors and create a data set with manually annotated MT and TM errors
2. Study the current post-editing effort by observing and analysing how human translators actually post-edit
3. Develop confidence metrics to estimate a priori post-editing effort

WP3 Terminology Extraction from Comparable Corpora
Goal: Study how humans extract terminology, and automate terminology extraction from comparable corpora
1. Investigating translators' methods for acquiring domain terminology
2. Investigating methods to determine which texts contain comparable information
3. Actual extraction of terminology from these comparable corpora

WP4 Speech Recognition
Goals
1. Improve speech recognition accuracy for translators' respeaking by making more information available to the speech recogniser
2. Improve translation of spoken data by making more information, such as recognition alternatives, available to the translation engine

WP5 Workflows and Personalised User Interfaces
Goals
1. Obtain techniques to analyse translators' personal workflows while they use the translation system for a specific job
2. Use these techniques to organise and visualise the features in a way that optimises ease of use and efficiency for translators

WP6 Integration, Evaluation, Valorisation and Dissemination
Goals
1. Integration of the technology at regular time points in the project
2. Evaluation at a modular and at an integrated level
3. Adjustment of the plans for valorisation as the project evolves
4. Dissemination of the obtained results
   - to the IAC
   - to the scientific community
   - to other interested parties

Dissemination of the results
Industrial dissemination
- Dries Dewachter (2014). IWT legt 3 miljoen op tafel voor Smart Computer-Aided Translation Environment. De Taalsector. 20 January.
- Tom Vanallemeersch & Vincent Vandeghinste (accepted). Improving fuzzy matching through syntactic knowledge. Translating and the Computer. ASLING. London.
- Tom Vanallemeersch & Ken De Wachter (accepted). Introducing the Smart Computer Assisted Translation Environment. TEKOM. Stuttgart.
Academic dissemination
- Vincent Vandeghinste, Tom Vanallemeersch, Frank Van Eynde, Lieve Macken, Els Lefever, Véronique Hoste, Marie-Francine Moens, Joris Pelemans, Patrick Wambacq, Mieke Haesen, Karin Coninx & Ken De Wachter (2014). Smart Computer Aided Translation Environment. In Marko Tadić, Philipp Koehn, Johann Roturier & Andy Way (eds.), Proceedings of the 17th Annual Conference of the European Association for Machine Translation (EAMT). Dubrovnik, Croatia. p. 135.
- Vanopstal, K., Macken, L., Lefever, E., Van de Kauter, M., Buysschaert, J., & Hoste, V. (2014). Terminologie: op het snijvlak van ambacht en technologie. In S. Evenepoel, P. Goethals & L. Jooken (eds.), Beschouwingen uit een talenhuis, 179-189. Academia Press, Ghent, Belgium.
- Ivan Vulic & Marie-Francine Moens (2014). Probabilistic Models of Cross-Lingual Semantic Similarity in Context Based on Latent Cross-Lingual Concepts Induced from Comparable Data. In Proceedings of EMNLP 2014: Conference on Empirical Methods in Natural Language Processing.
- Ivan Vulic, Wim De Smet, Jie Tang & Marie-Francine Moens (to appear). Probabilistic Topic Modeling in Multilingual Settings: An Overview of Its Methodology and Applications. Information Processing & Management.

Data provided by the IAC so far
Translation memories
- Mastervoice: TMX file (Dutch-English)
- ITP: TMX file (English-French)
- OneLiner: TMX file (Dutch-English)
MT output
- XPlanation: SMT output (French)

Data on our wish list
Speech data
- Recordings of translators speaking their translations into Dutch, together with the final corrected translation and the source text (preferably in English)
- Broadcast data: audio plus transcripts plus subtitles (Dutch audio)
TM output data
- Source text + pre-translation by TM system(s) + TMX files
MT output data
- MT output together with post-edited data, preferably in Dutch (or English)
Observational data
- How do translators use their CAT tools now?
- How do translators do their terminology lookup now?
- How is post-editing done now?
- How is speech recognition used?