Challenges of Machine Translation

1 Challenges of Machine Translation
CSC Machine Translation Dr. Tom Way

2 Translation is hard
Novels
Word play, jokes, puns, hidden messages
Concept gaps: go Greek, bei fen
Other constraints: lyrics, dubbing, poems, …

3 Major challenges
Getting the right words:
  Choosing the correct root form
  Getting the correct inflected form
  Inserting “spontaneous” words
Putting the words in the correct order:
  Word order: SVO vs. SOV, …
  Unique constructions: Divergence

4 Lexical choice
Homonymy/Polysemy: bank, run
Concept gap: no corresponding concepts in another language: go Greek, go Dutch, fen sui, lame duck, …
Coding (Concept → lexeme mapping) differences:
  More distinction in one language: e.g., kinship vocabulary.
  Different division of conceptual space:
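
As a rough illustration of the homonymy problem (bank), the sketch below picks a sense by counting context words that overlap with a small set of cue words per sense. The sense labels, cue words, and the function name choose_sense are all hypothetical, made up for this example; real systems use far richer context models.

```python
# Toy sense inventory: the sense names and cue words are illustrative
# assumptions, not taken from any real lexicon.
SENSES = {
    "bank": {
        "financial_institution": {"money", "loan", "deposit", "account"},
        "river_side": {"river", "water", "shore", "fishing"},
    }
}


def choose_sense(word, context_words):
    """Pick the sense whose cue words overlap most with the context."""
    context = {w.lower() for w in context_words}
    candidates = SENSES[word]
    # On a tie, max() keeps the first sense listed for the word.
    return max(candidates, key=lambda sense: len(candidates[sense] & context))


sentence = "She opened an account at the bank to deposit money".split()
print(choose_sense("bank", sentence))
```

Once a sense is chosen, the target-language word can be looked up per sense rather than per source word, which is the point of treating lexical choice as its own step.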

5 Choosing the appropriate inflection
Inflection: gender, number, case, tense, …
Ex:
  Number: Ch-Eng: all the concrete nouns: ch_book → book, books
  Gender: Eng-Fr: all the adjectives
  Case: Eng-Korean: all the arguments
  Tense: Ch-Eng: all the verbs: ch_buy → buy, bought, will buy
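
The one-to-many mapping in the examples above can be sketched as a lookup table: one uninflected source lemma corresponds to several target forms, and the system needs extra information (number, tense) to select one. The table FORMS, its feature labels, and both function names are hypothetical and cover only the two words from the slide.

```python
# Hypothetical toy table: English surface forms indexed by (lemma, feature).
FORMS = {
    ("book", "sg"): "book",
    ("book", "pl"): "books",
    ("buy", "present"): "buy",
    ("buy", "past"): "bought",
    ("buy", "future"): "will buy",
}


def candidates(lemma):
    """All target forms the system must choose among for one lemma."""
    return sorted(form for (lem, _), form in FORMS.items() if lem == lemma)


def inflect(lemma, feature):
    """Select one form once the missing feature has been decided."""
    return FORMS[(lemma, feature)]


print(candidates("buy"))
print(inflect("book", "pl"))
```

The hard part in practice is not the lookup but deciding the feature, since the source sentence may not mark it at all.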

6 Inserting spontaneous words
Function words:
  Determiners: Ch-Eng: ch_book → a book, the book, the books, books
  Prepositions: Ch-Eng: … ch_November → … in November
  Relative pronouns: Ch-Eng: … ch_buy ch_book de ch_person → the person who bought /book/
  Possessive pronouns: Ch-Eng: ch_he ch_raise ch_hand → He raised his hand(s)
  Conjunctions: Eng-Ch: Although S1, S2 → ch_although S1, ch_but S2

7 Inserting spontaneous words (cont)
Content words:
  Dropped argument: Ch-Eng: ch_buy le ma → Has Subj bought Obj?
  Chinese first name: Eng-Ch: Jiang … → ch_Jiang ch_Zemin …
  Abbreviations, acronyms: Ch-Eng: ch_12 ch_big → the 12th National Congress of the CPC (Communist Party of China)

8 Major challenges
Getting the right words:
  Choosing the correct root form
  Getting the correct inflected form
  Inserting “spontaneous” words
Putting the words in the correct order:
  Word order: SVO vs. SOV, …
  Unique construction: Structural divergence

9 Word order
SVO, SOV, VSO, …
VP + PP → PP + VP
VP + AdvP → AdvP + VP
Adj + N → N + Adj
NP + PP → PP + NP
NP + S → S + NP
P + NP → NP + P
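
Two of the reordering patterns above (SVO → SOV and Adj + N → N + Adj) can be sketched as simple rewrite functions over already-analyzed input. This is a minimal sketch under strong assumptions: the clause is a flat three-part tuple and the tokens arrive pre-tagged. The tag names and function names are hypothetical.

```python
def svo_to_sov(clause):
    """Reorder a flat (subject, verb, object) clause into subject-object-verb."""
    subject, verb, obj = clause
    return (subject, obj, verb)


def swap_adj_noun(tagged):
    """Swap each adjacent (ADJ, N) pair of (word, tag) tokens to (N, ADJ)."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "N":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out


print(svo_to_sov(("she", "reads", "books")))
print(swap_adj_noun([("the", "DET"), ("red", "ADJ"), ("car", "N")]))
```

Real reordering operates on parse trees rather than flat sequences, so that whole phrases (a PP, a relative clause) move as a unit.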

10 “Unique” Constructions
Overt wh-movement: Eng-Ch:
  Eng: Why do you think that he came yesterday?
  Ch: you why think he yesterday come ASP?
  Ch: you think he yesterday why come?
Ba-construction: Ch-Eng:
  She ba homework finish ASP → She finished her homework.
  He ba wall dig ASP CL hole → He dug a hole in the wall.
  She ba orange peel ASP skin → She peeled the orange’s skin.

11 Translation divergences
Source and target parse trees (dependency trees) are not identical.
Example: Eng: I like Mary → Sp: María me gusta a mí (‘Mary pleases me’)
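
The like/gustar example can be sketched as a transfer rule over a toy dependency frame: the English subject becomes the Spanish indirect object, and the English object becomes the Spanish subject. The dictionary layout, role names, and the function transfer_like are assumptions made for this illustration only.

```python
def transfer_like(frame):
    """Map an English 'like' frame to a Spanish 'gustar' frame.

    The arguments swap grammatical roles, so the two dependency
    trees do not have the same shape.
    """
    if frame["verb"] != "like":
        raise ValueError("this toy rule only handles the verb 'like'")
    return {
        "verb": "gustar",
        "subject": frame["object"],           # the thing liked becomes the subject
        "indirect_object": frame["subject"],  # the liker becomes the indirect object
    }


english = {"verb": "like", "subject": "I", "object": "Mary"}
print(transfer_like(english))
```

A word-by-word translation that keeps the English roles in place would get the grammatical relations wrong, which is why divergences need structure-level rules or models.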

