Machine Translation Prof. Alexandros Potamianos Dept. of Electrical & Computer Engineering Technical University of Crete, Greece May 2003.


1 Machine Translation Prof. Alexandros Potamianos Dept. of Electrical & Computer Engineering Technical University of Crete, Greece May 2003

2 Outline
Machine Translation
  - Definition
  - Applications
  - Algorithms
Language Types
Algorithms
  - Transfer method
  - Inter-lingua
  - Direct translation
Statistical Machine Translation

3 Definitions
Machine Translation
  - Automatically translate text from one language to another
Speech-to-Speech Translation
  - Speech Recognition, Machine Translation, Text to Speech

4 Machine Translation: Applications
Rough translation, e.g., Systran Babel Fish
Computer-aided human translation
  - Post-editing by a human
Multilingual systems in limited domains, e.g.,
  - Dialogue systems: high-quality speech-to-speech translation
  - Information kiosks
Cross-language information retrieval

5 Machine Translation: Algorithms
Transfer Model
  - Maps languages at the syntactic parse tree level
Interlingua
  - Maps languages via a common new virtual language
Direct Translation
  - Maps languages at the lexical, phrase and shallow syntactic level
Statistical Machine Translation
  - Statistical Direct Translation models
  - Models fluency of generated text and faithfulness of translation

6 Language Types
Universal features
  - Syntactic classes: nouns, verbs
  - Words have common meaning
Language typology
  - Morphological:
      isolating (one morpheme per word) to polysynthetic
      agglutinative (clean morpheme boundaries) to fusional languages
  - Syntactical:
      SVO (subject-verb-object)
      SOV (e.g., Hindi, Japanese)
      VSO (e.g., Irish)

7 Other Language Differences
Morphology richness:
  ENG The dog belongs to the tall child
  GR Ο σκύλος είναι του ψηλού παιδιού
Use of articles:
  ENG I am running
  GR Τρέχω
Lexical differences (aka lexical gap)
  ENG older, elder
  GR μεγαλύτερος
Format differences, e.g., dates, numbers

8 Transfer Models
Three stages
  - Analysis: syntactic parse (ambiguity might not be a problem)
  - Transfer: syntactic transformation rules
  - Generation: lexical and syntactic transfer
Lexical transfer
  - Function words are syntactically transferred, i.e., part of the rules
  - Content words are lexically transferred

9 Transfer Model: Example
Example (English to French):
  ENG bad road
  adjective: bad noun: road (analysis)
  noun: road adjective: bad (transfer)
  noun: route adjective: mauvaise (generation)
  FR route mauvaise
Altavista Babel Fish (Systran):
  - http://babelfish.altavista.com/
  - ENG: bad road FR: mauvaise route (wrong road)
  - ENG: wrong road FR: mauvaise route (wrong road)
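
The three stages of the "bad road" example can be sketched as a toy pipeline. This is only an illustration under stated assumptions: the two-word lexicon, the tag table, and the single ADJ NOUN reordering rule are hypothetical and are not taken from Systran or any real system.

```python
# Toy transfer-model pipeline for the "bad road" example.
# LEXICON, TAGS and the single reordering rule are illustrative assumptions.

LEXICON = {"bad": "mauvaise", "road": "route"}  # hypothetical EN->FR entries
TAGS = {"bad": "ADJ", "road": "NOUN"}           # hypothetical tag table


def analyze(sentence):
    """Analysis: label each English word with its syntactic class."""
    return [(TAGS[word], word) for word in sentence.split()]


def transfer(tagged):
    """Transfer: apply the syntactic rule ADJ NOUN -> NOUN ADJ."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][0] == "ADJ" and out[i + 1][0] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out


def generate(tagged):
    """Generation: replace each English word with its French entry."""
    return " ".join(LEXICON[word] for _, word in tagged)


def translate(sentence):
    return generate(transfer(analyze(sentence)))


print(translate("bad road"))
```

The sketch reproduces the slide's rule-based output "route mauvaise", which differs from the Babel Fish output quoted above.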

10 Inter-lingua
Motivation:
  - Translation between n languages needs n(n-1)/2 pairs of rule sets, one pair per language pair
  - With inter-lingua: 2n rule sets, one into and one out of the inter-lingua for each language
Assertion:
  - There is a common semantic representation across languages
Algorithm:
  - Perform syntactic analysis
  - Perform semantic analysis in inter-lingua ontology representation
  - Generate syntactic tree in new language
  - Generate surface form using lexical transfer
Problem: inter-lingua is often English!
Useful for small domains
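
The counting argument in the motivation can be checked with a few lines of arithmetic. The function names are hypothetical; the formulas are the slide's n(n-1)/2 and 2n.

```python
def direct_rule_set_pairs(n):
    """Language pairs that need their own rules without an inter-lingua."""
    return n * (n - 1) // 2


def interlingua_rule_sets(n):
    """Rule sets needed when every language maps to and from the inter-lingua."""
    return 2 * n


for n in (3, 5, 10, 20):
    print(n, direct_rule_set_pairs(n), interlingua_rule_sets(n))
```

The quadratic count only overtakes the linear one above five languages, so the saving matters mainly for systems covering many languages.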

11 Direct Translation
Motivation
  - Syntactic and (especially) semantic parsing often fails
  - Inter-lingua and transfer models are hard to build
Direct translation algorithm
  - Morphological analysis
  - Lexical transfer of content words
  - Various work relating to prepositions
  - SVO re-arrangements
  - Miscellany
  - Morphological generation
Example: Japanese to English

12 Direct Translation: Overall
A realistic approach
Uses syntax/semantics as needed
  - Robust (island) parsing
  - Shallow parsing
Works only for language pairs
  - Can be extended with (e.g.) English as inter-lingua

13 Statistical Machine Translation
Motivation
  - Why write rules?
  - Machine learning techniques can do the job for you
Requirement
  - Large bilingual (parallel) corpora
  - Typically alignment required at the sentence level
Bayesian Formulation (Brown et al., 1993)
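
The transcript does not preserve the formula for this slide. As a hedged sketch, the Bayesian (noisy-channel) formulation is commonly written as below, assuming translation from a Greek string G into an English string E; the symbol choice is an assumption made here for illustration.

```latex
\hat{E} \;=\; \arg\max_{E} \Pr(E \mid G)
        \;=\; \arg\max_{E} \frac{\Pr(E)\,\Pr(G \mid E)}{\Pr(G)}
        \;=\; \arg\max_{E} \Pr(E)\,\Pr(G \mid E)
```

Here Pr(E) is a language model that scores fluency and Pr(G | E) is a translation model that scores faithfulness; Pr(G) is dropped because it does not depend on E.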

14 Statistical Machine Translation
Step 1: Preprocessing (manual or semi-automatic)
  - Clean parallel corpus
  - Segment at the sentence level
Step 2: Alignment (automatic)
  ENG: And the program has been implemented
  GR: Το πρόγραμμα τέθηκε σε εφαρμογή
  The(1) program(2) has(3) been(3) implemented(3,4,5)
  The(1) program(2) has(3,4,5) been(3,4,5) implemented(3,4,5)
Step 3: Translation Models: Pr(G,A,E)
  - G and E are the Greek and English strings
  - A is a random alignment between them
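
One way to hold such an alignment in code is a list of Greek positions per English word. The sketch below encodes the first alignment line of the slide; the data layout and the helper name `aligned_pairs` are assumptions, and "And" is left unaligned because the slide gives it no position.

```python
english = "And the program has been implemented".split()
greek = "Το πρόγραμμα τέθηκε σε εφαρμογή".split()

# alignment[i] lists the 1-based Greek positions linked to english[i].
# An empty list means the English word has no Greek counterpart.
alignment = [[], [1], [2], [3], [3], [3, 4, 5]]


def aligned_pairs(english, greek, alignment):
    """Pair each English word with the Greek words it is aligned to."""
    return [
        (word, [greek[pos - 1] for pos in links])
        for word, links in zip(english, alignment)
    ]


for word, targets in aligned_pairs(english, greek, alignment):
    print(word, "->", " ".join(targets) if targets else "(none)")
```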

15 Statistical Machine Translation
Models (Brown Models 1, 2, 3, 4, 5):
  - Greek g and English e strings
  - Number of words in the Greek string is m
  - Alignment of the j-th word is a_j
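
As a rough illustration of how such a model is trained, the sketch below runs expectation-maximization for Model 1, the simplest of the five, which treats all alignments as equally likely and therefore only learns a word translation table t(g | e). The toy corpus, the function name `train_model1`, and the `NULL` token spelling are assumptions made for this example.

```python
from collections import defaultdict


def train_model1(corpus, iterations=10):
    """Estimate t(g | e) by EM from (greek_words, english_words) sentence pairs."""
    greek_vocab = {g for gs, _ in corpus for g in gs}
    # Uniform start: every Greek word is equally likely given any English word.
    t = defaultdict(lambda: 1.0 / len(greek_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected number of (g, e) links
        total = defaultdict(float)  # expected number of links from e

        # E-step: distribute each Greek word over the English words
        # of the same sentence pair, in proportion to the current t.
        for gs, es in corpus:
            es_null = ["NULL"] + list(es)  # NULL generates unaligned words
            for g in gs:
                norm = sum(t[(g, e)] for e in es_null)
                for e in es_null:
                    frac = t[(g, e)] / norm
                    count[(g, e)] += frac
                    total[e] += frac

        # M-step: renormalize the expected counts into probabilities.
        for (g, e), c in count.items():
            t[(g, e)] = c / total[e]
    return t


# Hypothetical two-sentence parallel corpus (Greek, English).
corpus = [
    (["το", "σπίτι"], ["the", "house"]),
    (["το", "βιβλίο"], ["the", "book"]),
]
t = train_model1(corpus)
print(round(t[("σπίτι", "house")], 3), round(t[("το", "house")], 3))
```

Even with two sentence pairs, the co-occurrence of "το" with "the" in both lets EM shift probability for "house" toward "σπίτι".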

16 Statistical Machine Translation
Best translation
  - Faithfulness
  - Fluency
Example-based machine translation
  - Ability to store phrases in a bilingual dictionary
  - Translation memory
  - Systran and most translation houses use this

17 Evaluation
Edit cost
  - Distance between standard (human-produced) and machine-generated translation
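
A minimal sketch of one such distance is word-level Levenshtein distance between the reference and the machine output. The slide does not specify the cost scheme, so the unit costs for insertion, deletion and substitution used here are an assumption.

```python
def edit_distance(reference, hypothesis):
    """Word-level Levenshtein distance with unit edit costs."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                  # delete ref_word
                curr[j - 1] + 1,              # insert hyp_word
                prev[j - 1] + substitution,   # match or substitute
            ))
        prev = curr
    return prev[-1]


print(edit_distance("the program has been implemented",
                    "the program was implemented"))
```

Dividing the result by the reference length gives a rate that can be compared across sentences of different lengths.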

