Published by Dylan Gibson. Modified over 9 years ago.
METIS II: Statistical Machine Translation using Monolingual Corpora (FP6-IST-003768)

Profile

METIS II, the continuation of the successful assessment project METIS I, is an IST Programme project with a 3-year duration (01/10/2004 – 30/09/2007). The METIS II consortium comprises the following partners:

Institute for Language & Speech Processing [ILSP] (co-ordinator)
Katholieke Universiteit Leuven [KUL]
Gesellschaft zur Förderung der Angewandten Informationsforschung [GFAI]
Universitat Pompeu Fabra [UPF]

The METIS Approach

METIS II is a hybrid system, combining various approaches to machine translation (rule-based, statistical and pattern-matching techniques). It makes use of readily available resources, such as bilingual dictionaries and basic NLP tools, and it can easily be customised to handle different source-language (SL) and target-language (TL) tags.

Most importantly, METIS II is innovative because it does not need bilingual corpora for the translation process; it relies exclusively on monolingual TL corpora.

METIS II handles sequences at both sentence and sub-sentential level, and thereby exploits the recursive nature of natural language.

METIS II employs a series of weights, i.e. system parameters, in various phases of the translation process. Weights are associated with system resources and used by the pattern-matching algorithm; they can be adjusted automatically to customise system performance.

Four language pairs have been developed so far: Greek, Dutch, German and Spanish into English.

METIS II Architecture

[Architecture diagram labels: Database Server, Lexicon, BNC Clauses, BNC Chunks, Token Generation Rules, Final Translation]

NLP Tools
NLP tools handle the SL input, yielding an SL sequence annotated with grammatical and syntactic information.
Lexicon Lookup
The SL sequence is enriched with translation equivalents and PoS information, so that it resembles a TL pattern.

Core Engine
The core engine of the METIS II system is fed a sequence of TL-like patterns, which are handled by the pattern-matching algorithm. It proceeds in two stages, involving wider and narrower contexts, and generates a TL sequence.

Web Interface
The end user selects the preferred SL and enters the text to be translated.

Token Generation
The token generation module receives as input a sequence of translated lemmas and their respective tags, and produces tokens from the lemmas.

Evaluation Setup

For the system evaluation, an experimental corpus extracted from real texts, mainly newspapers, was used. It consisted of 200 sentences, 50 per language pair. The test sentences were of moderate complexity, containing one to two clauses each, and covered various syntactic phenomena such as word-order variation, NP structure, negation and modification.

The reference translations were produced by humans and restricted to three. The BLEU and NIST metrics were used for the evaluation.

Evaluation Results

Result panels: Greek Results, Dutch Results, German Results, Spanish Results.

Fig. 1: Comparative analysis of the score ranges obtained for METIS II and SYSTRAN using the BLEU metric
Fig. 2: Comparative analysis of the score ranges obtained for METIS II and SYSTRAN using the NIST metric
Fig. 3: Comparative analysis of the scores obtained for METIS II and SYSTRAN using the BLEU metric
Fig. 4: Comparative analysis of the scores obtained for METIS II and SYSTRAN using the NIST metric
Fig. 5: Comparative analysis of the scores obtained for METIS II and SYSTRAN using the BLEU metric
Fig. 6: Comparative analysis of the scores obtained for METIS II and SYSTRAN using the NIST metric
Fig. 7: Comparative analysis of the scores obtained for different settings of METIS II and SYSTRAN using the BLEU metric
Fig. 8: Comparative analysis of the scores obtained for different settings of METIS II and SYSTRAN using the NIST metric

Future Work

Future work involves further investigation of the METIS II system architecture. More specifically, work towards system optimisation includes the following:

Further system testing with a large number of test suites that have more elaborate structures and cover a wider range of phenomena
Algorithm optimisation in terms of accuracy
Automatic fine-tuning of weights
Implementation of a post-editor module
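
For readers unfamiliar with the BLEU scores reported in the evaluation, the following is a minimal Python sketch of how a sentence-level BLEU score can be computed against several reference translations. It follows the commonly used formulation (clipped n-gram precision, geometric mean, brevity penalty) without smoothing. The poster does not say which BLEU implementation or settings were used, so the function names, the default of 4-gram order and the tokenisation by whitespace are all assumptions made for illustration. NIST scoring is not covered.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as often as it occurs in the reference where it is most frequent."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights.
    (Illustrative helper, not the evaluation code used for METIS II.)"""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # the geometric mean is zero if any n-gram order has no match
    # Brevity penalty: compare with the reference length closest to the candidate's
    # (ties are broken in favour of the shorter reference).
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1.0 - r / c)
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Hypothetical example with three references, as in the evaluation setup.
refs = [s.split() for s in ("the cat is on the mat",
                            "there is a cat on the mat",
                            "a cat sits on the mat")]
print(bleu("the cat is on the mat".split(), refs))
```

Because the score is unsmoothed, a single sentence with no matching 4-gram scores zero; reported BLEU figures are normally computed over a whole test set, where counts are pooled across sentences before the precisions are taken.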