1
Statistical Machine Translation
2
General Framework
Given sentences S and T, assume there is a “translator oracle” that can calculate P(T|S), the probability that an “ideal translator” will produce sentence T given sentence S.
Our statistical translator tries to “reverse engineer” the ideal translator. That is, given T, it finds the S with the highest probability P(S|T).
We have: P(T|S)
We want: the S that maximizes P(S|T)
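The step from what we have to what we want can be written out with Bayes' rule. This is a sketch of the usual noisy-channel reading, which is assumed here; the symbol \hat{S} for the chosen sentence is introduced only for illustration.

```latex
\hat{S} \;=\; \arg\max_{S} P(S \mid T)
        \;=\; \arg\max_{S} \frac{P(S)\,P(T \mid S)}{P(T)}
        \;=\; \arg\max_{S} P(S)\,P(T \mid S)
```

P(T) does not depend on S, so it drops out of the maximization. Under this reading, P(S) is supplied by a language model, P(T|S) by a translation model, and the arg max by a search method.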
4
Three components: language model, translation model, search method
5
Language model
Can use an n-gram model.
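As a rough illustration of an n-gram language model, here is a minimal bigram sketch (n = 2) using unsmoothed maximum-likelihood counts. The function names, the boundary markers `<s>` and `</s>`, and the two-sentence toy corpus are assumptions made for this example; a usable model would need smoothing and far more data.

```python
from collections import Counter

def train_bigram(sentences):
    """Count bigrams and their left contexts over whitespace-tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams

def sentence_prob(sentence, contexts, bigrams):
    """P(sentence) as a product of maximum-likelihood bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if contexts[prev] == 0:   # unseen context: no estimate, treat as impossible
            return 0.0
        prob *= bigrams[(prev, word)] / contexts[prev]
    return prob

# Toy corpus (illustrative only).
corpus = [
    "the proposal will be implemented",
    "the proposal will not be implemented",
]
contexts, bigrams = train_bigram(corpus)
p_seen = sentence_prob("the proposal will be implemented", contexts, bigrams)
p_unseen = sentence_prob("the will proposal", contexts, bigrams)
```

Without smoothing, any sentence containing an unseen bigram gets probability zero, which is why this is only a sketch.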
7
Translation model
Need an alignment model that will allow us to calculate the probabilities of alignments, e.g.,
P[The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8) | Les propositions ne seront pas mises en application maintenant]
Here the English sentence is the target sentence and the French sentence is the source sentence. The numbers after each target word are the positions of the source words it is aligned with.
Notation for alignment:
Les propositions ne seront pas mises en application maintenant | The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8)
8
Translation model
Alignment model consists of:
– fertility model (fertility = number of source words each target word is mapped to)
– term-translation model
– distortion model
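To make the alignment notation and the idea of fertility concrete, the example alignment can be stored as a list of (target word, source positions) pairs. This is only one possible representation; the names `alignment`, `fertilities`, and `aligned_source_words` are hypothetical, and treating "be" (which carries no position list in the notation) as fertility 0 is an assumption.

```python
# Source (French) sentence; positions in the alignment are 1-based.
source = "Les propositions ne seront pas mises en application maintenant".split()

# Each target (English) word with the source positions it is mapped to.
alignment = [
    ("The", [1]),
    ("proposal", [2]),
    ("will", [4]),
    ("not", [3, 5]),
    ("now", [9]),
    ("be", []),                 # assumption: no positions listed means fertility 0
    ("implemented", [6, 7, 8]),
]

def fertilities(alignment):
    """Fertility of each target word = number of source words it maps to."""
    return [(word, len(positions)) for word, positions in alignment]

def aligned_source_words(alignment, source):
    """Look up the source words behind each target word's position list."""
    return {word: [source[j - 1] for j in positions] for word, positions in alignment}

fert = fertilities(alignment)
words = aligned_source_words(alignment, source)
```

Here "not" has fertility 2 because it maps to both "ne" and "pas", and the fertilities sum to the source sentence length when every source word is covered exactly once.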
9
Translation model (from Brown et al. paper)
Need to calculate P(alignment), that is:
P[The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8) | Les propositions ne seront pas mises en application maintenant]
To calculate this, we need:
– Fertility model: P(fertility = n | term) for each n (up to a maximum value) and each target term
– Term-translation model: P(term S | term T), the probability that term S appears in the source given that term T appears in the target
– Distortion model: one simple version assumes that the position of a target term depends only on the position of the source term and the length of the target sentence, giving P(i | j, L) for each target position i, source position j, and target length L (limited to some maximum value for L)
10
Translation model (from Brown et al. paper)
Need to calculate P(alignment). Example:
P[The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8) | Les propositions ne seront pas mises en application maintenant] =
P(fertility=1 | the) × P(les | the) × P(1 | 1, 7) ×
P(fertility=1 | proposal) × P(propositions | proposal) × P(2 | 2, 7) ×
P(fertility=1 | will) × P(seront | will) × P(3 | 4, 7) ×
P(fertility=2 | not) × P(ne | not) × P(pas | not) × P(4 | 3, 7) × P(4 | 5, 7) ×
etc.
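The product in the example can be sketched in code as follows. The three probability tables are passed in as functions, and the demo uses a constant placeholder value of 0.5 for every factor purely so the code runs; these are not estimated probabilities. The function names and the handling of "be" as fertility 0 are assumptions for illustration.

```python
import math

def alignment_factors(alignment, source, fert_p, trans_p, dist_p):
    """Yield (label, value) for each factor in the alignment probability."""
    L = len(alignment)  # target sentence length
    for i, (t_word, positions) in enumerate(alignment, start=1):
        t = t_word.lower()
        n = len(positions)
        yield f"P(fertility={n} | {t})", fert_p(n, t)           # fertility model
        for j in positions:
            s = source[j - 1].lower()
            yield f"P({s} | {t})", trans_p(s, t)                # term-translation model
            yield f"P({i} | {j}, {L})", dist_p(i, j, L)         # distortion model

def alignment_prob(alignment, source, fert_p, trans_p, dist_p):
    """Multiply all fertility, term-translation, and distortion factors."""
    return math.prod(v for _, v in alignment_factors(alignment, source, fert_p, trans_p, dist_p))

source = "Les propositions ne seront pas mises en application maintenant".split()
alignment = [
    ("The", [1]), ("proposal", [2]), ("will", [4]), ("not", [3, 5]),
    ("now", [9]), ("be", []), ("implemented", [6, 7, 8]),
]

# Placeholder tables: every factor is 0.5 (NOT real model estimates).
fert_p = lambda n, t: 0.5
trans_p = lambda s, t: 0.5
dist_p = lambda i, j, L: 0.5

labels = [label for label, _ in alignment_factors(alignment, source, fert_p, trans_p, dist_p)]
prob = alignment_prob(alignment, source, fert_p, trans_p, dist_p)
```

The generated labels start with P(fertility=1 | the), P(les | the), P(1 | 1, 7), following the same factors as the worked example; within a word of fertility 2 the order of factors differs slightly, which does not affect the product.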
11
How does the statistical translator learn these various models?
From data, of course! E.g., a massive number of paired source/target sentences from UN translations.
How does the statistical translator search for the highest-probability source sentence?
See paper.
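As a hedged illustration of learning from paired sentences, the sketch below estimates only the term-translation table P(source term | target term) with a few rounds of expectation-maximization, in the style of the simplest word-alignment models. It ignores fertility and distortion entirely and is not the full procedure from the paper; the function name, the iteration count, and the two-pair toy corpus are assumptions.

```python
from collections import defaultdict

def train_term_translation(pairs, iterations=10):
    """EM estimate of P(source term | target term) from sentence pairs.

    pairs: list of (source_tokens, target_tokens). No word alignments are given;
    each source word is softly assigned to the target words in its sentence pair.
    """
    source_vocab = {s for src, _ in pairs for s in src}
    prob = defaultdict(lambda: 1.0 / len(source_vocab))  # uniform start
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of (source term, target term)
        total = defaultdict(float)   # expected count of target term
        for src, tgt in pairs:
            for s in src:
                norm = sum(prob[(s, t)] for t in tgt)
                for t in tgt:
                    frac = prob[(s, t)] / norm   # E-step: soft alignment weight
                    count[(s, t)] += frac
                    total[t] += frac
        # M-step: renormalize expected counts per target term
        prob = defaultdict(float, {(s, t): c / total[t] for (s, t), c in count.items()})
    return prob

# Toy parallel corpus (illustrative only).
pairs = [
    (["la", "maison"], ["the", "house"]),
    (["la", "fleur"], ["the", "flower"]),
]
p = train_term_translation(pairs)
```

Because "la" co-occurs with "the" in both pairs, the estimate of P(la | the) rises across iterations while P(la | house) falls, even though no word-level alignments were provided.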