1
Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau
2
Overview: Language Alignment System. Datasets: sentence-aligned sets for training (e.g. the Hansards Corpus, the European Parliament Proceedings Parallel Corpus); a word-aligned set for testing and evaluation, to measure accuracy and precision. Decoding.
3
Language Alignment Goal: Produce a word-aligned set from a sentence-aligned dataset First step on the road toward Statistical Machine Translation Example Problem: The motion to adjourn the House is now deemed to have been adopted. La motion portant que la Chambre s'ajourne maintenant est réputée adoptée.
4
IBM Models 1 and 2 (Kevin Knight, A Statistical MT Tutorial Workbook, 1999). Each can be used on its own to produce a word-aligned dataset. Both are trained with the EM algorithm. Model 1 produces T-values based on normalized fractional counting of corresponding words. Model 2 additionally uses A-values for "reverse distortion probabilities": probabilities based on the positions of the words.
5
Training Data: European Parliament Proceedings Parallel Corpus 1996-2003. Aligned languages: English-French, English-Dutch, English-Italian, English-Finnish, English-Portuguese, English-Spanish, English-Greek
6
Training Data cont. Eliminated: misaligned sentences; sentences with 50 or more words; XML tags; symbols and numerical characters other than commas and periods
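The clean-up rules on this slide could be sketched roughly as follows. This is only an illustration, not the presenters' code: the names `clean_sentence` and `clean_pairs` are hypothetical, and treating a pair with an empty side as "misaligned" is an assumption.

```python
import re

MAX_WORDS = 50  # sentences with 50 or more words are dropped, as on the slide

TAG_RE = re.compile(r"<[^>]+>")              # XML-style tags
SYMBOL_RE = re.compile(r"[^\w\s,.]|[\d_]")   # symbols and digits; commas and periods survive


def clean_sentence(text):
    """Strip tags, symbols and digits, then collapse whitespace."""
    text = TAG_RE.sub(" ", text)
    text = SYMBOL_RE.sub(" ", text)
    return " ".join(text.split())


def clean_pairs(pairs):
    """Filter a list of (source, target) sentence strings."""
    kept = []
    for src, tgt in pairs:
        src, tgt = clean_sentence(src), clean_sentence(tgt)
        if not src or not tgt:
            continue  # assumption: an empty side means the pair is misaligned
        if len(src.split()) >= MAX_WORDS or len(tgt.split()) >= MAX_WORDS:
            continue  # too long
        kept.append((src, tgt))
    return kept


example = clean_pairs([
    ("<b>Hello</b> world 42!", "Bonjour le monde"),
    (" ".join(["w"] * 50), "x"),
])
```

Note that this rule also strips apostrophes, so "s'ajourne" becomes two tokens.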
7
Ideally… http://www.cs.berkeley.edu/~klein/cs294-5
8
Bypassing Interlingua: Models I-III Variables contributing to the probability of a sentence: Correlation between words in the source/target languages Fertility of a word Correlation between order of words in source sentence and order of words in target
9
A Translation Matrix
         Rob   Cat   is    Dog
Rob      1     0     0     0
Gato     0     1     0     0
es       0     0     0.5   0
esta     0     0     0.5   0
Perro    0     0     0     1
10
Building the Translation Matrix: Starting from alignments Find the sentence alignment If a word in the source aligns with a word in the target, then increment the translation matrix. Normalize the translation matrix
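When word alignments are already known, the count-and-normalize procedure on this slide might look like the sketch below. The function name, the input layout (word lists plus index pairs) and the example sentences are all assumptions made for illustration.

```python
from collections import defaultdict


def build_translation_matrix(aligned_pairs):
    """aligned_pairs: iterable of (source_words, target_words, links),
    where links is a list of (source_index, target_index) pairs."""
    counts = defaultdict(lambda: defaultdict(float))
    for src_words, tgt_words, links in aligned_pairs:
        for i, j in links:
            # One aligned word pair -> increment the matrix cell.
            counts[src_words[i]][tgt_words[j]] += 1.0

    # Normalize so each source word's translations sum to 1.
    matrix = {}
    for src, row in counts.items():
        total = sum(row.values())
        matrix[src] = {tgt: c / total for tgt, c in row.items()}
    return matrix


# Hypothetical word-aligned data: "is" aligns once with "es", once with "esta".
data = [
    (["Rob", "is", "tall"], ["Rob", "es", "alto"], [(0, 0), (1, 1), (2, 2)]),
    (["Rob", "is", "here"], ["Rob", "esta", "aqui"], [(0, 0), (1, 1), (2, 2)]),
]
matrix = build_translation_matrix(data)
```

With this toy input, `matrix["is"]` comes out as `{"es": 0.5, "esta": 0.5}`, matching the split shown in the translation matrix slide.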
11
Can’t find alignments Most sentences in the Hansards corpus are 60 words long. Many are over 100 words long. 100^100 possible alignments
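To get a feel for the size of that number, here is a small calculation. It assumes the simplest setting, in which every target word links to exactly one source position; `count_alignments` is a hypothetical helper name.

```python
def count_alignments(source_len, target_len):
    """Each of the target_len words independently picks one of source_len positions."""
    return source_len ** target_len


n = count_alignments(100, 100)
digits = len(str(n))  # 100^100 = 10^200, so this is 201
```

Enumerating that many alignments is out of the question, which is why the counts are estimated iteratively instead.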
12
Counting: Rob is a boy. / Rob es nino. Rob is tall. / Rob es alto. Eric is tall. / Eric es alto. … Base counts on co-occurrence, weighting based on sentence length.
13
Iterative Convergence Use the Expectation Maximization (EM) algorithm. Creates translation matrix. Columns: Rob, Is, Tall, boy. Rows: Rob .66 .33 .25; es .30 .66 .25; alto .2 .05 .50; nino .2 .05 0.5
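A compact sketch of EM training for IBM Model 1 T-values, using the three sentence pairs from the Counting slide. It is a textbook-style illustration under simplifying assumptions (no NULL word, a fixed number of iterations), not the presenters' implementation; `train_model1` is a hypothetical name.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """pairs: list of (source_words, target_words).
    Returns t[(target_word, source_word)], an estimate of P(target | source)."""
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)  # start from uniform T-values

    for _ in range(iterations):
        count = defaultdict(float)   # fractional counts per word pair
        total = defaultdict(float)   # fractional counts per source word
        # E-step: split each target word's count over the source words.
        for src, tgt in pairs:
            for f in tgt:
                z = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / z
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: normalize the fractional counts.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


pairs = [
    ("Rob is a boy".split(), "Rob es nino".split()),
    ("Rob is tall".split(), "Rob es alto".split()),
    ("Eric is tall".split(), "Eric es alto".split()),
]
t = train_model1(pairs)
```

After a few iterations "es" is increasingly explained by "is", so `t[("alto", "tall")]` ends up larger than `t[("es", "tall")]` even though both co-occur with "tall" equally often.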
14
Distorting the Sentence Word order changes between languages How is a sentence with 2 words distorted? How is a sentence with 3 words distorted? How is a sentence with … To keep track of this information we use…
15
A tesseract! (A quadruply nested default dictionary) This could be a problem if there are more than 100 words in a sentence. 100x100x100x100 = too big for RAM and takes too much time
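A quadruply nested default dictionary can be built in Python along these lines. The key order used here (target length, source length, target position, source position) is an assumption; the slides do not say how the A-values were indexed.

```python
from collections import defaultdict


def nested_dict(depth, leaf=float):
    """Return a defaultdict nested `depth` levels deep with `leaf()` defaults."""
    if depth == 1:
        return defaultdict(leaf)
    return defaultdict(lambda: nested_dict(depth - 1, leaf))


# Hypothetical layout: a[target_len][source_len][target_pos][source_pos]
a = nested_dict(4)
a[3][3][1][2] += 0.5  # intermediate levels are created on first access

# Worst case for 100-word sentences if every cell is touched:
cells = 100 ** 4  # 100,000,000 entries
```

Because entries are only created when touched, the structure stays small for short sentences; the blow-up described on the slide comes from long sentences filling in most of the cells.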
16
Broad Look at MT “The translation process can be described simply as: 1. Decoding the meaning of the source text, and 2. Re-encoding this meaning in the target language.” - “Translation Process”, Wikipedia, May 2006
17
Decoding How to go from the T-matrix and A-matrix to a word alignment? There are several approaches…
18
Viterbi If only doing alignment, much smaller memory and time requirements. Returns optimal path. T-Matrix probabilities function as the “emission” matrix A-Matrix probabilities concerned with the positioning of words
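For alignment only, the most probable path can be sketched as below. The sketch assumes the Model 2 independence assumption, under which each target word's link can be chosen separately by maximizing T-value times A-value. The function name, the dictionary layouts, the uniform fallback for missing A-values and the toy probabilities are all hypothetical.

```python
def viterbi_alignment(src, tgt, t, a):
    """Return a list of (source_index, target_index) links.
    t[(target_word, source_word)] plays the role of the emission matrix;
    a[(source_pos, target_pos, source_len, target_len)] scores word positions."""
    l, m = len(src), len(tgt)
    alignment = []
    for j, f in enumerate(tgt):
        def link_score(i):
            # Assumption: fall back to a uniform A-value when none was trained.
            return t.get((f, src[i]), 0.0) * a.get((i, j, l, m), 1.0 / l)
        best_i = max(range(l), key=link_score)
        alignment.append((best_i, j))
    return alignment


# Toy T-values, made up for illustration.
t = {("Rob", "Rob"): 0.66, ("es", "is"): 0.66, ("alto", "tall"): 0.5}
links = viterbi_alignment(["Rob", "is", "tall"], ["alto", "es", "Rob"], t, {})
```

Only one row of scores is needed per target word, which is why the memory and time requirements stay small.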
19
Decoding as a Translator Without supplying a translated sentence to the program, it is capable of being a stand-alone translator instead of a word aligner. However, while the Viterbi algorithm runs quickly with pruning for decoding, for translating the run time skyrockets.
20
Greedy Hill Climbing Knight & Koehn, What’s New in Statistical Machine Translation, 2003 Best first search 2-step look ahead to avoid getting stuck in most probable local maxima
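The look-ahead idea can be illustrated on a toy search space. In a real decoder the states would be candidate translations and the score a model probability; here the states are just list indices and the scores are made up, so treat this purely as a sketch of the control flow.

```python
def hill_climb(start, neighbors, score, lookahead=2):
    """Greedy ascent that inspects states up to `lookahead` moves away."""
    current = start
    while True:
        best, best_score = current, score(current)
        frontier = [current]
        for _ in range(lookahead):
            # Expand one more move away from the current state.
            frontier = [n for state in frontier for n in neighbors(state)]
            for cand in frontier:
                s = score(cand)
                if s > best_score:
                    best, best_score = cand, s
        if best == current:
            return current  # nothing better within the look-ahead window
        current = best


scores = [0, 2, 1, 3, 5, 4]  # hypothetical landscape with a local maximum at index 1


def neighbors(i):
    return [k for k in (i - 1, i + 1) if 0 <= k < len(scores)]


def score(i):
    return scores[i]


one_step = hill_climb(0, neighbors, score, lookahead=1)
two_step = hill_climb(0, neighbors, score, lookahead=2)
```

With a one-step look-ahead the search stops at index 1; with two steps it sees past the dip at index 2 and reaches index 4.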
21
Beam Search Knight & Koehn, What’s New in Statistical Machine Translation, 2003 Optimization of Best First Search with heuristics and “beam” of choices Exponential tradeoff when increasing the “beam” width
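A minimal beam-search decoder might look like the following. It makes strong simplifying assumptions that are not from the slides: translation is word by word with no reordering, and hypotheses are scored by a translation probability times a bigram probability. All names and numbers are hypothetical.

```python
import heapq
import math


def beam_decode(src, candidates, lm, beam_width=2):
    """candidates: source word -> list of (target word, probability).
    lm(prev, word): bigram probability of `word` following `prev`."""
    beam = [(0.0, ["<s>"])]  # (log probability, words so far)
    for e in src:
        expanded = [
            (logp + math.log(p) + math.log(lm(words[-1], f)), words + [f])
            for logp, words in beam
            for f, p in candidates[e]
        ]
        # Keep only the `beam_width` best partial translations.
        beam = heapq.nlargest(beam_width, expanded, key=lambda h: h[0])
    return beam[0][1][1:]  # drop the start symbol


candidates = {
    "Rob": [("Rob", 1.0)],
    "is": [("es", 0.5), ("esta", 0.5)],
    "tall": [("alto", 0.5), ("alta", 0.5)],
}
bigram = {("Rob", "es"): 0.6, ("Rob", "esta"): 0.4, ("es", "alto"): 0.7}


def lm(prev, word):
    return bigram.get((prev, word), 0.1)  # made-up back-off value


best = beam_decode(["Rob", "is", "tall"], candidates, lm)
```

A wider beam keeps more partial hypotheses alive at each step, which improves the chance of finding the best translation at the cost of more work per word.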
22
Other Decoding Methods Knight & Koehn, What’s New in Statistical Machine Translation, 2003 Finite State Transducer Mapping between languages based on a finite automaton Parsing String to Tree Model
23
Problem: One to Many Necessary to take all alignments over a certain probability in order to capture the “probability that e has fertility at least a given value” Al-Onaizan, Curin, Jahr, et al., Statistical Machine Translation, 1999
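Keeping every link above a cutoff, rather than only the single best one, could be sketched like this. The threshold value, the function name and the T-values are assumptions chosen for illustration.

```python
def links_above_threshold(src, tgt, t, threshold=0.3):
    """Return all (source_index, target_index) links whose T-value exceeds threshold."""
    return [
        (i, j)
        for i, e in enumerate(src)
        for j, f in enumerate(tgt)
        if t.get((f, e), 0.0) > threshold
    ]


# Made-up T-values in which "not" produces both "ne" and "pas".
t = {
    ("Rob", "Rob"): 0.9,
    ("ne", "not"): 0.45,
    ("pas", "not"): 0.45,
    ("dort", "sleep"): 0.8,
    ("dort", "does"): 0.1,
}
links = links_above_threshold(
    ["Rob", "does", "not", "sleep"], ["Rob", "ne", "dort", "pas"], t
)
```

Here the source word at index 2 ("not") ends up with two links, which is the one-to-many case the slide describes.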
24
Results Study done in 2003 on word alignment error rates in Hansards corpus: Model 2 – 29.3% on 8K training sentence pairs 19.5% on 1.47M training sentence pairs Optimized Model 6 – 20.3% on 8K training sentence pairs 8.7% on 1.47M training sentence pairs Och and Ney, A Systematic Comparison of Various Statistical Alignment Models, 2003
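The alignment error rate used by Och and Ney is computed from a predicted alignment A and a reference annotated with sure links S and possible links P (with S contained in P): AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|). A small sketch follows; the function name and the example links are made up.

```python
def alignment_error_rate(predicted, sure, possible):
    """Links are (source_index, target_index) pairs.
    Assumes predicted and sure are not both empty."""
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s  # sure links also count as possible
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


aer = alignment_error_rate(
    predicted=[(0, 0), (1, 1), (2, 3)],
    sure=[(0, 0), (1, 1)],
    possible=[(2, 2)],
)
```

In this example two of the three predicted links are correct and both sure links are found, giving 1 - 4/5, an error rate of about 20%.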
25
Expected Accuracy 70% overall Language performance: Dutch French Italian, Spanish, Portuguese Greek Finnish
26
Possible Future Work Given more time, we would have implemented IBM Model 3, which additionally uses n, p, and d parameters for weighted alignments: n, fertility (the number of words produced by one word); d, distortion; p, a parameter for words that aren’t directly involved. Invokes Model 2 for scoring
27
Another Possible Translation Scheme Example-Based Machine Translation Translation-by-Analogy Can sometimes achieve better than the “gist” translations from other models
28
Why Is Improving Machine Translation Necessary?
29
A Chinese to English Translation
30
The End Are there any questions/comments?