Machine translation the Wiki way Bittlingmayer Adam Mathias 27 February 2007 University of Washington LING 575 – Machine Translation Машинен превод Strojový.

1 machine translation the Wiki way Bittlingmayer Adam Mathias 27 February 2007 University of Washington LING 575 – Machine Translation Машинен превод Strojový překlad Maskinoversættelse Maschinelle Übersetzung Maŝintradukado Traducción automática Itzulpengintza automatiko ترجمه ماشینی Konekäännin Traduction automatique תרגום מכונה Strojno prevođenje Gépi fordítás 機械翻訳 기계 번역 Terjemahan mesin Computervertaling Maskinoversettelse Tłumaczenie maszynowe Tradução automática Traducere automată Машинный перевод Maskinöversättning การแปลภาษาอัตโนมัติ 机器翻译

2 machine translation the Wiki way: introduction to Wikipedia; technical details and editing; low-density languages; parallelness of corpora; named entities; other entities; disambiguation; categorization; problems; papers

3 introduction to Wikipedia en.wikipedia.org

4 introduction to Wikipedia en.wikipedia.org Wikipedia (IPA: /ˌwiːkiːˈpiːdi.ə/ or /ˌwɪːkiːˈpiːdi.ə/) is a multilingual, Web-based, free content encyclopedia project. Wikipedia is written collaboratively by volunteers; its articles can be edited by anyone with access to the Web site.

5 introduction to Wikipedia en.wikipedia.org: the Wiki family; lots of languages, unevenly distributed; lots of topics, unevenly distributed; growing fast; respectability

6 technical details and editing technical details structure layout content rules tags and templates redirect and disambiguation markup

7 technical details and editing editing anyone locking and blocking disputes version control

8 technical details and editing Fei_Xia example

9 low-density languages predictably lacking X-English / English-X usually good using related languages

10 parallelness of corpora degrees determinants of parallelness mapping

11 named entities article titles abbreviations and acronyms place names company names personal names

12 other entities events dates titles technical terms

13 disambiguation

14 categorization

15 problems incompleteness inconsistency foreign words moving target

16 papers monolingual semantics errors and reliability WordNet using Wikipedia’s structure multilingual named entities parallel sentence generation

17 papers: parallel sentence generation
1. compare with Babelfished version
   - create aligned sentences with Babelfish
   - pair off with best scoring sentence from the Wiki article
2. bootstrap from article titles
   - create aligned sentences by replacing linked words with equivalent
   - translate the rest by throwing shrinking N-grams into Wiki search
   - pair off with best scoring sentence from the Wiki article
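
Both methods on this slide end with the same step: pair each roughly translated sentence with the best scoring sentence from the Wiki article in the other language. The slide does not say which score is used, so the following is only a minimal sketch that assumes a simple Dice word-overlap score. The function names (`tokens`, `dice`, `pair_sentences`) and the `threshold` parameter are hypothetical, and the rough translations are assumed to have been produced beforehand (for example by Babelfish, as in method 1).

```python
import re
from collections import Counter


def tokens(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def dice(a, b):
    """Dice overlap between two token lists, counting repeated words.

    This score is an assumption; the slide does not name one.
    """
    ca, cb = Counter(a), Counter(b)
    total = sum(ca.values()) + sum(cb.values())
    if total == 0:
        return 0.0
    overlap = sum((ca & cb).values())  # multiset intersection
    return 2.0 * overlap / total


def pair_sentences(source, rough_translations, target_sentences, threshold=0.5):
    """Pair each source sentence with its best scoring target sentence.

    source             -- sentences from the article in language A
    rough_translations -- the same sentences, roughly translated into
                          language B (same order as `source`)
    target_sentences   -- sentences from the article in language B
    threshold          -- hypothetical cutoff; weaker matches are dropped
    """
    pairs = []
    for src, rough in zip(source, rough_translations):
        rough_toks = tokens(rough)
        best, best_score = None, 0.0
        for candidate in target_sentences:
            score = dice(rough_toks, tokens(candidate))
            if score > best_score:
                best, best_score = candidate, score
        if best is not None and best_score >= threshold:
            pairs.append((src, best))
    return pairs
```

The threshold reflects the "degrees of parallelness" point from slide 10: articles on the same topic are often only loosely parallel, so a sentence with no good counterpart is better left unpaired than forced onto a weak match.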

18 conclusions seed or bootstrap with traditional methods fill holes with Wikipedia hybrid systems lots of research to be done

19 questions general Chinese company names cn/hk/tw issues abbreviations/acronyms many languages with one writing system using links to find word divisions

