Published by Darleen Walsh. Modified over 9 years ago.
1
TURKALATOR: A Suite of Tools for English-to-Turkish MT
Siddharth Jonathan, Gorkem Ozbek
CS224n Final Project
June 14, 2006
2
English-Turkish MT

The challenge
- Traditionally, statistical MT research has focused on language pairs with rich resources
- Ambitious goal: a complete English-to-Turkish MT system on par with those on the Web (Google, Systran, etc.)
- Realistic goal: outperform the general-purpose baseline

The focus
- Address scarcity issues stemming from rich Turkish inflectional morphology

The strategy
- Approximate a morphological analysis by exploiting certain aspects of Turkish morphology to get sub-lexical units
- Customize translation model building heuristics to deal correctly with these units
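The slides do not say how the sub-lexical units are obtained. As a minimal sketch of the general idea, assuming a greedy suffix stripper over a tiny hand-picked suffix list (the list, the minimum stem length, and the function name `segment` are all hypothetical and not the authors' method):

```python
# Hypothetical, tiny suffix inventory chosen for illustration only; a real
# system would need far broader coverage of Turkish suffixes and their
# vowel-harmony variants.
SUFFIXES = ["lar", "ler", "dü", "dı", "di", "du", "de", "da"]

MIN_STEM_LEN = 2  # assumption: never strip a stem below this many characters


def segment(word):
    """Split a word into a stem plus '+suffix' units by greedy suffix stripping."""
    suffixes = []
    stem = word
    stripped = True
    while stripped:
        stripped = False
        # Try longer suffixes first so 'ler' is preferred over shorter candidates.
        for suf in sorted(SUFFIXES, key=len, reverse=True):
            if stem.endswith(suf) and len(stem) - len(suf) >= MIN_STEM_LEN:
                suffixes.insert(0, "+" + suf)  # keep suffixes in surface order
                stem = stem[: -len(suf)]
                stripped = True
                break
    return [stem] + suffixes


print(segment("evlerde"))   # "in the houses": stem ev, plural +ler, locative +de
print(segment("kitaplar"))  # "books": stem kitap, plural +lar
```

A stripper like this will over-segment any stem that happens to end in a listed suffix, which is one reason the slides describe the approach as only approximating a morphological analysis.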
3
Baseline English to Turkish MT System

Sentence-aligned English-Turkish corpus -> GIZA++ (aligner) -> word-aligned English-Turkish corpus -> phrase building heuristics -> phrase translation table
Turkish corpus (training set) -> SRILM -> Turkish language model
English sentences -> Pharaoh (decoder, using the phrase translation table and the Turkish language model) -> Turkish translations

Corpus: approx. 22,000 aligned sentence pairs covering several genres
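The "phrase building heuristics" step turns word alignments into phrase pairs. The slides do not specify which heuristic the baseline uses; the sketch below shows one common approach, extracting every phrase pair that is consistent with the alignment. The function name `extract_phrases`, the `max_len` limit, and the toy sentence pair are assumptions for illustration, and the sketch omits the usual extension over unaligned target words.

```python
def extract_phrases(src, tgt, alignment, max_len=3):
    """Return the set of phrase pairs consistent with a word alignment.

    alignment is a set of (src_index, tgt_index) links. A pair of spans is
    consistent when it contains at least one link and no link connects a
    word inside one span to a word outside the other.
    """
    pairs = set()
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            # Target positions linked to any word in the source span.
            linked = [t for (s, t) in alignment if s_start <= s <= s_end]
            if not linked:
                continue
            t_start, t_end = min(linked), max(linked)
            if t_end - t_start + 1 > max_len:
                continue
            # Reject the span if a target word inside it links outside the source span.
            if any(t_start <= t <= t_end and not (s_start <= s <= s_end)
                   for (s, t) in alignment):
                continue
            pairs.add((" ".join(src[s_start:s_end + 1]),
                       " ".join(tgt[t_start:t_end + 1])))
    return pairs


# Toy example (not from the project corpus): "the" is left unaligned.
src = "the red house".split()
tgt = "kırmızı ev".split()
alignment = {(1, 0), (2, 1)}
for pair in sorted(extract_phrases(src, tgt, alignment)):
    print(pair)
```

In a full system the extracted pairs would then be scored, for example by relative frequency, to fill the phrase translation table that the decoder reads.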
4
The Turkalator Way…

Turkish text, English text -> segmentation -> stem alignment -> general word alignment -> phrase extraction and scoring -> phrase translation table
Phrase translation table + Turkish language model -> Pharaoh (decoder)
5
Evaluation

Quantitative results

              Baseline   Turkalator 1   Turkalator 2
BLEU score    9.12       16.80          17.00

Qualitative results
- Scarcity reduced greatly: many more Turkish words are now translated

An example:
- English input: “She thought it over.”
- Reference translation: “Julia bunu iyice düşündü.”
- Baseline translation: “Başvuran düşünce bu over.”
- Turkalator translation: “Julia onun üzerinde düşündü.”
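For readers unfamiliar with the metric, BLEU combines clipped n-gram precisions with a brevity penalty. The sketch below is a simplified single-sentence, single-reference version without smoothing, scaled to 0-100 to match the table. It is not the evaluation script used in the project (reported BLEU scores are normally computed over a whole test set), and the function names are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Single-reference, sentence-level BLEU on a 0-100 scale, without smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, any zero precision gives a zero score
        log_precisions.append(math.log(matched / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return 100.0 * bp * math.exp(sum(log_precisions) / max_n)


# The slide's example, with punctuation dropped.
reference = "Julia bunu iyice düşündü".split()
turkalator = "Julia onun üzerinde düşündü".split()
print(bleu(turkalator, reference, max_n=1))  # unigram precision only: 2 of 4 words match
print(bleu(turkalator, reference))           # no bigram matches, so the unsmoothed score is 0
```

The example also shows why sentence-level scores are harsh on short outputs: the Turkalator translation shares two of four words with the reference but no longer n-grams.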