25. sep. 2007 Dias 1 Center for Sprogteknologi Lene Offersgaard, Claus Povlsen Center for Sprogteknologi SDMT-SMV2 workshop 25. september 2007 Inter-set.


1 Lene Offersgaard, Claus Povlsen, Center for Sprogteknologi. SDMT-SMV2 workshop, 25 September 2007. Inter-set SMT - MOSES and POS

2 SMT: Statistical resources
Translation workflow: English text -> Preprocessing -> Translation engine -> Postprocessing -> Danish text -> Proofreading
Components shown: Language model (srilm 3), Phrase table, MOSES decoder
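The workflow on this slide can be pictured as three stages chained together. The sketch below is only an illustration under assumptions: the function names (`preprocess`, `translate`, `postprocess`, `run_pipeline`) and the tiny lookup table are hypothetical stand-ins, and the `translate` stub does not call the MOSES decoder or use a language model.

```python
from typing import Dict, List

# Hypothetical toy lexicon standing in for the phrase table (not project data).
TOY_TABLE: Dict[str, str] = {
    "the": "de",
    "active": "aktive",
    "ingredients": "bestanddele",
}


def preprocess(text: str) -> List[str]:
    """Lowercase and whitespace-tokenize the English input."""
    return text.lower().split()


def translate(tokens: List[str]) -> List[str]:
    """Stand-in for the translation engine: word-by-word lookup.

    Unknown words are passed through unchanged.
    """
    return [TOY_TABLE.get(tok, tok) for tok in tokens]


def postprocess(tokens: List[str]) -> str:
    """Join the tokens and capitalize the first letter of the sentence."""
    sentence = " ".join(tokens)
    return sentence[:1].upper() + sentence[1:]


def run_pipeline(text: str) -> str:
    """Chain the three stages: preprocessing, translation, postprocessing."""
    return postprocess(translate(preprocess(text)))


print(run_pipeline("The active ingredients"))  # prints "De aktive bestanddele"
```

Keeping each stage as a separate function mirrors the slide's boxes, so a stage can be swapped (for example, replacing the stub with a call to a real decoder) without touching the others.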

3 Adding linguistic information to SMT: MOSES
MOSES: open source system replacing Pharaoh (Koehn et al. 2007)
State-of-the-art phrase-based approach
Using factored translation models
Comparison of the Pharaoh and Moses decoders
Reuse of statistical resources possible

4 Adding linguistic information using MOSES
Using factored translation models
Makes it possible to build translation models based on surface forms, part-of-speech, morphology etc.
We use:
Translation model: word->word, pos->pos
Generation model: determines the output
Input: word, pos+morf. Output: word, pos+morf.
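To make the factored setup concrete, the sketch below represents each token as a word factor plus a pos+morf factor, maps the two factors with separate tables (word->word, pos->pos), and then lets a generation table decide which surface form is output. Everything here is an assumption for illustration: the tag names, the toy tables and the helper functions are hypothetical and are not taken from the authors' system. The `word|tag` notation follows the pipe-separated convention Moses uses for factored text. Unlike a real decoder, which scores competing hypotheses, this toy maps each source tag to one fixed target tag and picks the first compatible word.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, str]  # (word, pos+morf tag)

# Hypothetical toy tables; all tags and entries are invented for illustration.
WORD_TABLE: Dict[str, List[str]] = {
    "the": ["den", "det", "de"],  # candidate Danish determiners
    "full": ["fuld", "fulde"],
    "spectrum": ["spektrum"],
}
# pos->pos mapping, fixed per source tag (a strong simplification).
POS_TABLE: Dict[str, str] = {
    "DET": "DET.neut.sg",
    "ADJ": "ADJ.def",
    "NOUN": "NOUN.neut.sg",
}
# Generation table: (word, tag) pairs that are valid on the output side.
GENERATION: Set[Token] = {
    ("den", "DET.com.sg"),
    ("det", "DET.neut.sg"),
    ("de", "DET.pl"),
    ("fuld", "ADJ.indef"),
    ("fulde", "ADJ.def"),
    ("spektrum", "NOUN.neut.sg"),
}


def parse_factored(text: str) -> List[Token]:
    """Split 'word|tag word|tag ...' into (word, tag) pairs."""
    tokens: List[Token] = []
    for item in text.split():
        word, tag = item.split("|", 1)
        tokens.append((word, tag))
    return tokens


def translate_token(word: str, tag: str) -> Token:
    """Map both factors, then keep the first word the generation table allows."""
    target_tag = POS_TABLE[tag]
    for candidate in WORD_TABLE[word]:
        if (candidate, target_tag) in GENERATION:
            return (candidate, target_tag)
    raise KeyError(f"no output form for {word!r} with tag {target_tag!r}")


def translate_sentence(text: str) -> str:
    """Translate a factored sentence and return it in word|tag form."""
    output = [translate_token(w, t) for w, t in parse_factored(text)]
    return " ".join(f"{w}|{t}" for w, t in output)


print(translate_sentence("the|DET full|ADJ spectrum|NOUN"))
# prints "det|DET.neut.sg fulde|ADJ.def spektrum|NOUN.neut.sg"
```

In this toy, the generation table is what rules out forms such as "den" or "fuld" when the target tags call for a neuter singular determiner and a definite adjective.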

5 Results of adding pos-tags: by inspection
With inclusion of morpho-syntactic information:
(lit: ... control of the full spectrum) ... kontrol af det fulde spektrum (gender agreement)
(lit: ... the active ingredients) ... de aktive bestanddele (number agreement)
(lit: ... this constant erosion) ... denne konstante erosion (definiteness agreement)


