
1 MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES - Nirdesh Chauhan

2 Outline
- Problem statement in SMT
- Translation models
- Using GIZA++ and Moses

3 Introduction to SMT
- Given a sentence in a foreign language F, find the most appropriate translation in English E
- P(F|E) - translation model
- P(E) - language model
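For completeness, the two models combine through the standard noisy-channel decomposition (Bayes' rule; this step is implicit on the slide):

E* = argmax_E P(E|F) = argmax_E P(F|E) · P(E) / P(F) = argmax_E P(F|E) · P(E),

since P(F) is constant for a given input sentence F.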

4 The Generation Process
- Partition: think of all possible partitions of the source sentence into phrases
- Lexicalization: for a given partition, translate each phrase into the foreign language
- Reordering: permute the set of all foreign words, with words possibly moving across phrase boundaries
- We need the notion of alignment to better explain the mathematics behind the generation process

5 Alignment

6 Word-based alignment
- For each word in the source language, align the words in the target language that this word possibly produces
- Based on IBM models 1-5
- Model 1 is the simplest
- As we go from model 1 to model 5, the models get more complex but more realistic
- This is all that GIZA++ does

7 Alignment
- An alignment is a function A from target position to source position
- Example: the alignment sequence 2,3,4,5,6,6,6 means A(1) = 2, A(2) = 3, and so on
- A different alignment function could give the sequence 1,2,1,2,3,4,3,4 for A(1), A(2), ...
- To allow spurious insertion, alignment to word 0 (NULL) is allowed
- Number of possible alignments: (I+1)^J, where I is the source length and J the target length
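A tiny Python illustration of this representation and of the (I+1)^J count (our own example, not from the slides):

# An alignment for a 7-word target sentence: a[j] is the source
# position aligned to target position j+1 (0 would mean NULL).
a = [2, 3, 4, 5, 6, 6, 6]

# Each of the J target words can align to any of the I source words
# or to NULL, so there are (I+1)**J possible alignments.
I, J = 6, len(a)
print((I + 1) ** J)  # 823543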

8 IBM Model 1: Generative Process
[figure illustrating the generative process]

9 IBM Model 1: Details
- No assumptions so far; the formula on the previous slide is exact
- Choosing length: P(J|E) = P(J|E,I) = P(J|I) = ε, a small constant
- Choosing alignment: all alignments equiprobable, so P(A|E,J) = 1/(I+1)^J
- Translation probability: P(F|A,E) = ∏_j t(f_j|e_a(j)), giving P(F,A|E) = ε/(I+1)^J · ∏_j t(f_j|e_a(j))

10 Training Alignment Models
- Given a parallel corpus, for each sentence pair (F,E) learn the best alignment A and the component probabilities:
- t(f|e) for Model 1
- lexicon probability P(f|e) and alignment probability P(a_i|a_{i-1},I)
- How do we compute these probabilities if all we have is a parallel corpus?

11 Intuition: Interdependence of Probabilities
- If you knew which words are probable translations of each other, you could guess which alignments are probable and which are improbable
- If you were given alignments with probabilities, you could compute the translation probabilities
- Looks like a chicken-and-egg problem
- The EM algorithm comes to the rescue

12 Expectation Maximization (EM) Algorithm
Used when we want a maximum-likelihood estimate of a model's parameters and the model depends on hidden variables. In the present case, the parameters are the translation probabilities, and the hidden variables are the alignments.
Init: start with an arbitrary estimate of the parameters
E-step: compute the expected values of the hidden variables
M-step: re-estimate the parameters to maximize the likelihood of the data, given the expected values of the hidden variables from the E-step
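A minimal runnable sketch of this procedure for IBM Model 1, in Python (function and variable names are ours; the NULL word from slide 7 is omitted for brevity):

from collections import defaultdict

def ibm1_em(corpus, iterations=10):
    # corpus: list of (F, E) sentence pairs, each a list of tokens.
    # Init: a constant t(f|e) stands in for the uniform estimate,
    # since the E-step normalization cancels the constant.
    t = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)  # expected count(f, e)
        total = defaultdict(float)  # expected count(e)
        for F, E in corpus:
            for f in F:
                # E-step: posterior that f was generated by each e
                z = sum(t[f, e] for e in E)
                for e in E:
                    p = t[f, e] / z
                    count[f, e] += p
                    total[e] += p
        # M-step: re-estimate t(f|e) = count(f, e) / total(e)
        t = defaultdict(float)
        for (f, e), c in count.items():
            t[f, e] = c / total[e]
    return t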

13 Example of EM Algorithm
Parallel corpus: "green house" ↔ "casa verde"; "the house" ↔ "la casa"
Init: assume that any word can generate any word with equal probability: P(la|house) = 1/3

14 E-Step
With the current t(f|e), compute for each sentence pair the posterior probability that foreign word f_j aligns to English word e_i: P(a_j = i | F, E) = t(f_j|e_i) / Σ_i' t(f_j|e_i'). Accumulate these posteriors as expected counts count(f,e).

15 M-Step
Re-estimate the translation probabilities from the expected counts: t(f|e) = count(f,e) / Σ_f' count(f',e).

16 E-Step again
[updated probability table: the uniform 1/3 estimates have shifted to values such as 1/3 and 2/3]
Repeat till convergence
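Running the ibm1_em sketch from slide 12 on this toy corpus (illustrative; assumes the function defined above):

corpus = [(["casa", "verde"], ["green", "house"]),
          (["la", "casa"], ["the", "house"])]
t = ibm1_em(corpus, iterations=20)
print(round(t["casa", "house"], 3))   # tends towards 1.0
print(round(t["verde", "green"], 3))  # tends towards 1.0
print(round(t["la", "the"], 3))       # tends towards 1.0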

17 Limitation: Only 1-to-Many Alignments Allowed

18 Phrase-based alignment
- More natural
- Many-to-one mappings allowed

19 Generating Bi-directional Alignments
- Existing models only generate uni-directional alignments
- Combine two uni-directional alignments to get many-to-many bi-directional alignments, as sketched below
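The two simplest combinations, in Python, with alignments represented as sets of (e, f) word-index pairs (toy data, our notation):

# Directional alignments as sets of (e, f) word-index pairs
e2f = {(0, 0), (1, 1), (2, 1)}
f2e = {(0, 0), (1, 1), (1, 2)}
inter = e2f & f2e  # intersection: high precision, low recall
union = e2f | f2e  # union: high recall, low precision
print(sorted(inter))  # [(0, 0), (1, 1)]
print(sorted(union))  # [(0, 0), (1, 1), (1, 2), (2, 1)]

The heuristics on the following slides interpolate between these two extremes.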

20 Hindi-Eng Alignment
[alignment grid: the Hindi sentence "छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है" aligned word-by-word to "Goa is a premier beach vacation destination"]

21 Eng-Hindi Alignment
[alignment grid: the same sentence pair aligned in the opposite, English-to-Hindi direction]

22 Combining Alignments
[grid overlaying the two directional alignments, with '+' and '|' marking alignment points]
Precision/recall of the candidate combinations: P=2/3=.67, R=2/7=.29; P=4/5=.80, R=4/7=.57; P=5/6=.83, R=5/7=.71; P=6/9=.67, R=6/7=.86

23 A Different Heuristic from Moses-Site

GROW-DIAG-FINAL(e2f,f2e):
  neighboring = ((-1,0),(0,-1),(1,0),(0,1),(-1,-1),(-1,1),(1,-1),(1,1))
  alignment = intersect(e2f,f2e);
  GROW-DIAG(); FINAL(e2f); FINAL(f2e);

GROW-DIAG():
  iterate until no new points added
    for english word e = 0 ... en
      for foreign word f = 0 ... fn
        if ( e aligned with f )
          for each neighboring point ( e-new, f-new ):
            if ( ( e-new, f-new ) in union( e2f, f2e ) and
                 ( e-new not aligned and f-new not aligned ) )
              add alignment point ( e-new, f-new )

FINAL(a):
  for english word e-new = 0 ... en
    for foreign word f-new = 0 ... fn
      if ( ( e-new, f-new ) in alignment a and
           ( e-new not aligned or f-new not aligned ) )
        add alignment point ( e-new, f-new )

Proposed changes: after growing the diagonal, align the shorter sentence first, and use alignments only from the corresponding directional alignment.
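The same heuristic as a runnable Python sketch (our transcription of the pseudocode above; e2f and f2e are sets of (e, f) index pairs):

NEIGHBORING = [(-1, 0), (0, -1), (1, 0), (0, 1),
               (-1, -1), (-1, 1), (1, -1), (1, 1)]

def aligned_e(alignment, e):
    # True if source position e already has an alignment point
    return any(pe == e for pe, pf in alignment)

def aligned_f(alignment, f):
    return any(pf == f for pe, pf in alignment)

def grow_diag_final(e2f, f2e):
    alignment = e2f & f2e  # start from the intersection
    union = e2f | f2e
    # GROW-DIAG: iterate until no new points are added
    added = True
    while added:
        added = False
        for e, f in sorted(alignment):
            for de, df in NEIGHBORING:
                pe, pf = e + de, f + df
                if ((pe, pf) in union
                        and not aligned_e(alignment, pe)
                        and not aligned_f(alignment, pf)):
                    alignment.add((pe, pf))
                    added = True
    # FINAL(e2f) then FINAL(f2e): add leftover directional points
    # whose row or column is still unaligned
    for a in (e2f, f2e):
        for pe, pf in sorted(a):
            if not aligned_e(alignment, pe) or not aligned_f(alignment, pf):
                alignment.add((pe, pf))
    return alignment

# Example: grows (1, 1) from the union, then FINAL adds (1, 2)
print(sorted(grow_diag_final({(0, 0), (1, 1)}, {(0, 0), (1, 2)})))
# [(0, 0), (1, 1), (1, 2)]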

24 Generating Phrase Alignments
[grid of the combined word alignment with consistent phrase pairs boxed]
Example phrase pairs read off the grid: "a premier beach vacation destination" ↔ "एक प्रमुख समुद्र-तटीय गंतव्य है"; "premier beach vacation" ↔ "प्रमुख समुद्र-तटीय"
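A simplified phrase-pair extraction sketch in Python (our illustration; alignment is a set of (e, f) index pairs, and the full Moses extractor additionally handles unaligned boundary words):

def extract_phrase_spans(alignment, en, fn, max_len=7):
    # Enumerate English spans [e1, e2]; a span pair is consistent if
    # no word inside either span aligns to a word outside the other.
    spans = []
    for e1 in range(en):
        for e2 in range(e1, min(e1 + max_len, en)):
            fs = [f for (e, f) in alignment if e1 <= e <= e2]
            if not fs:
                continue
            f1, f2 = min(fs), max(fs)
            if f2 - f1 + 1 > max_len:
                continue
            if all(e1 <= e <= e2 for (e, f) in alignment if f1 <= f <= f2):
                spans.append(((e1, e2), (f1, f2)))
    return spans

On the grid above, this is how a span pair like "premier beach vacation" ↔ "प्रमुख समुद्र-तटीय" is admitted: every alignment point inside one span stays inside the other.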

25 Using Moses and GIZA++
Refer to http://www.statmt.org/moses_steps.html

26 Steps
- Install all Moses packages
- Input: sentence-aligned parallel corpus
- Training
- Tuning
- Generate output on test corpus (decoding)

27 Example
train.en (one word per line, spelled as space-separated letters):
h e l l o
h e l l o
w o r l d
c o m p o u n d
w o r d
h y p h e n a t e d
o n e
b o o m
k w e e z l e b o t t e r
train.pr (the corresponding phoneme sequences):
hh eh l ow
hh ah l ow
w er l d
k aa m p aw n d
w er d
hh ay f ah n ey t ih d
ow eh n iy
b uw m
k w iy z l ah b aa t ah r

28 Sample from Phrase-table
Each line has the form "source ||| target ||| alignment ||| alignment ||| scores"; in the usual Moses layout the final constant 2.718 (= e) is the phrase penalty, and the other four scores are the phrase translation probabilities and lexical weights in the two directions.
b o ||| b aa ||| (0) (1) ||| (0) (1) ||| 1 0.666667 1 0.181818 2.718
b ||| b ||| (0) ||| (0) ||| 1 1 1 1 2.718
c o m p o ||| aa m p ||| (2) (0,1) (1) (0) (1) ||| (1,3) (1,2,4) (0) ||| 1 0.0486111 1 0.154959 2.718
c ||| p ||| (0) ||| (0) ||| 1 1 1 1 2.718
d w ||| d w ||| (0) (1) ||| (0) (1) ||| 1 0.75 1 1 2.718
d ||| d ||| (0) ||| (0) ||| 1 1 1 1 2.718
e b ||| ah b ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
e l l ||| ah l ||| (0) (1) (1) ||| (0) (1,2) ||| 1 1 0.5 0.5 2.718
e l l ||| eh l ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.111111 0.5 0.111111 2.718
e l ||| eh ||| (0) (0) ||| (0,1) ||| 1 0.111111 1 0.133333 2.718
e ||| ah ||| (0) ||| (0) ||| 1 1 0.666667 0.6 2.718
h e ||| hh ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
h ||| hh ||| (0) ||| (0) ||| 1 1 1 1 2.718
l e b ||| l ah b ||| (0) (1) (2) ||| (0) (1) (2) ||| 1 1 1 0.5 2.718
l e ||| l ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.5 2.718
l l o ||| l ow ||| (0) (0) (1) ||| (0,1) (2) ||| 0.5 1 1 0.227273 2.718
l l ||| l ||| (0) (0) ||| (0,1) ||| 0.25 1 1 0.833333 2.718
l o ||| l ow ||| (0) (1) ||| (0) (1) ||| 0.5 1 1 0.227273 2.718
l ||| l ||| (0) ||| (0) ||| 0.75 1 1 0.833333 2.718
m ||| m ||| (0) ||| (0) ||| 1 0.5 1 1 2.718
n d ||| n d ||| (0) (1) ||| (0) (1) ||| 1 1 1 1 2.718
n e ||| eh n iy ||| (1) (2) ||| () (0) (1) ||| 1 1 0.5 0.3 2.718
n e ||| n iy ||| (0) (1) ||| (0) (1) ||| 1 1 0.5 0.3 2.718
n ||| eh n ||| (1) ||| () (0) ||| 1 1 0.25 1 2.718
o o m ||| uw m ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.5 1 0.181818 2.718
o o ||| uw ||| (0) (0) ||| (0,1) ||| 1 1 1 0.181818 2.718
o ||| aa ||| (0) ||| (0) ||| 1 0.666667 0.2 0.181818 2.718
o ||| ow eh ||| (0) ||| (0) () ||| 1 1 0.2 0.272727 2.718
o ||| ow ||| (0) ||| (0) ||| 1 1 0.6 0.272727 2.718
w o r ||| w er ||| (0) (1) (1) ||| (0) (1,2) ||| 1 0.1875 1 0.424242 2.718
w ||| w ||| (0) ||| (0) ||| 1 0.75 1 1 2.718

29 Testing output
h o t → hh aa t
p h o n e → p|UNK hh ow eh n iy
b o o k → b uw k

