
1 MACHINE TRANSLATION AND MT TOOLS: GIZA++ AND MOSES - Nirdesh Chauhan

2 Outline
- Problem statement in SMT
- Translation models
- Using GIZA++ and Moses

3 Introduction to SMT
- Given a sentence in a foreign language F, find the most appropriate translation in English E
- P(F|E) - translation model
- P(E) - language model
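For completeness, the two models combine through the standard noisy-channel decomposition (Bayes' rule; this step is implicit on the slide):

E* = argmax_E P(E|F) = argmax_E P(F|E) · P(E) / P(F) = argmax_E P(F|E) · P(E),

since P(F) is constant for a given input sentence F.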

4 The Generation Process
- Partition: think of all possible partitions of the source sentence into phrases
- Lexicalization: for a given partition, translate each phrase into the foreign language
- Reordering: permute the set of all foreign words, with words possibly moving across phrase boundaries
- We need the notion of alignment to better explain the mathematics behind the generation process

5 Alignment

6 Word-based alignment
- For each word in the source language, align the words in the target language that this word possibly produces
- Based on IBM models 1-5
- Model 1 is the simplest
- As we go from model 1 to model 5, the models get more complex but more realistic
- This is all that GIZA++ does

7 Alignment
- An alignment is a function A from target position to source position
- Example: the alignment sequence 2,3,4,5,6,6,6 means A(1) = 2, A(2) = 3, and so on
- A different alignment function could give the sequence 1,2,1,2,3,4,3,4 for A(1), A(2), ...
- To allow spurious insertion, alignment to word 0 (NULL) is allowed
- Number of possible alignments: (I+1)^J, where I is the source length and J the target length
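A tiny Python illustration of this representation and of the (I+1)^J count (our own example, not from the slides):

# An alignment for a 7-word target sentence: a[j] is the source
# position aligned to target position j+1 (0 would mean NULL).
a = [2, 3, 4, 5, 6, 6, 6]

# Each of the J target words can align to any of the I source words
# or to NULL, so there are (I+1)**J possible alignments.
I, J = 6, len(a)
print((I + 1) ** J)  # 823543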

8 IBM Model 1: Generative Process
[figure illustrating the generative process]

9 IBM Model 1: Details
- No assumptions so far; the formula on the previous slide is exact
- Choosing length: P(J|E) = P(J|E,I) = P(J|I) = ε, a small constant
- Choosing alignment: all alignments equiprobable, so P(A|E,J) = 1/(I+1)^J
- Translation probability: P(F|A,E) = ∏_j t(f_j|e_a(j)), giving P(F,A|E) = ε/(I+1)^J · ∏_j t(f_j|e_a(j))

10 Training Alignment Models
- Given a parallel corpus, for each sentence pair (F,E) learn the best alignment A and the component probabilities:
- t(f|e) for Model 1
- lexicon probability P(f|e) and alignment probability P(a_i|a_{i-1},I)
- How do we compute these probabilities if all we have is a parallel corpus?

11 Intuition: Interdependence of Probabilities
- If you knew which words are probable translations of each other, you could guess which alignments are probable and which are improbable
- If you were given alignments with probabilities, you could compute the translation probabilities
- Looks like a chicken-and-egg problem
- The EM algorithm comes to the rescue

12 Expectation Maximization (EM) Algorithm
Used when we want a maximum-likelihood estimate of a model's parameters and the model depends on hidden variables. In the present case, the parameters are the translation probabilities, and the hidden variables are the alignments.
Init: start with an arbitrary estimate of the parameters
E-step: compute the expected values of the hidden variables
M-step: re-estimate the parameters to maximize the likelihood of the data, given the expected values of the hidden variables from the E-step
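A minimal runnable sketch of this procedure for IBM Model 1, in Python (function and variable names are ours; the NULL word from slide 7 is omitted for brevity):

from collections import defaultdict

def ibm1_em(corpus, iterations=10):
    # corpus: list of (F, E) sentence pairs, each a list of tokens.
    # Init: a constant t(f|e) stands in for the uniform estimate,
    # since the E-step normalization cancels the constant.
    t = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)  # expected count(f, e)
        total = defaultdict(float)  # expected count(e)
        for F, E in corpus:
            for f in F:
                # E-step: posterior that f was generated by each e
                z = sum(t[f, e] for e in E)
                for e in E:
                    p = t[f, e] / z
                    count[f, e] += p
                    total[e] += p
        # M-step: re-estimate t(f|e) = count(f, e) / total(e)
        t = defaultdict(float)
        for (f, e), c in count.items():
            t[f, e] = c / total[e]
    return t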

13 Example of EM Algorithm
Parallel corpus: "green house" ↔ "casa verde"; "the house" ↔ "la casa"
Init: assume that any word can generate any word with equal probability: P(la|house) = 1/3

14 E-Step
With the current t(f|e), compute for each sentence pair the posterior probability that foreign word f_j aligns to English word e_i: P(a_j = i | F, E) = t(f_j|e_i) / Σ_i' t(f_j|e_i'). Accumulate these posteriors as expected counts count(f,e).

15 M-Step
Re-estimate the translation probabilities from the expected counts: t(f|e) = count(f,e) / Σ_f' count(f',e).

16 E-Step again
[updated probability table: the uniform 1/3 estimates have shifted to values such as 1/3 and 2/3]
Repeat till convergence
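Running the ibm1_em sketch from slide 12 on this toy corpus (illustrative; assumes the function defined above):

corpus = [(["casa", "verde"], ["green", "house"]),
          (["la", "casa"], ["the", "house"])]
t = ibm1_em(corpus, iterations=20)
print(round(t["casa", "house"], 3))   # tends towards 1.0
print(round(t["verde", "green"], 3))  # tends towards 1.0
print(round(t["la", "the"], 3))       # tends towards 1.0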

17 Limitation: Only 1-to-Many Alignments Allowed

18 Phrase-based alignment
- More natural
- Many-to-one mappings allowed

19 Generating Bi-directional Alignments
- Existing models only generate uni-directional alignments
- Combine two uni-directional alignments to get many-to-many bi-directional alignments, as sketched below
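The two simplest combinations, in Python, with alignments represented as sets of (e, f) word-index pairs (toy data, our notation):

# Directional alignments as sets of (e, f) word-index pairs
e2f = {(0, 0), (1, 1), (2, 1)}
f2e = {(0, 0), (1, 1), (1, 2)}
inter = e2f & f2e  # intersection: high precision, low recall
union = e2f | f2e  # union: high recall, low precision
print(sorted(inter))  # [(0, 0), (1, 1)]
print(sorted(union))  # [(0, 0), (1, 1), (1, 2), (2, 1)]

The heuristics on the following slides interpolate between these two extremes.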

20 Hindi-Eng Alignment
[alignment grid: the Hindi sentence "छुट्टियों के लिए गोवा एक प्रमुख समुद्र-तटीय गंतव्य है" aligned word-by-word to "Goa is a premier beach vacation destination"]

21 Eng-Hindi Alignment
[alignment grid: the same sentence pair aligned in the opposite, English-to-Hindi direction]

22 Combining Alignments
[grid overlaying the two directional alignments, with '+' and '|' marking alignment points]
Precision/recall of the candidate combinations: P=2/3=.67, R=2/7=.29; P=4/5=.80, R=4/7=.57; P=5/6=.83, R=5/7=.71; P=6/9=.67, R=6/7=.86

23 A Different Heuristic from Moses-Site

GROW-DIAG-FINAL(e2f,f2e):
  neighboring = ((-1,0),(0,-1),(1,0),(0,1),(-1,-1),(-1,1),(1,-1),(1,1))
  alignment = intersect(e2f,f2e);
  GROW-DIAG(); FINAL(e2f); FINAL(f2e);

GROW-DIAG():
  iterate until no new points added
    for english word e = 0 ... en
      for foreign word f = 0 ... fn
        if ( e aligned with f )
          for each neighboring point ( e-new, f-new ):
            if ( ( e-new, f-new ) in union( e2f, f2e ) and
                 ( e-new not aligned and f-new not aligned ) )
              add alignment point ( e-new, f-new )

FINAL(a):
  for english word e-new = 0 ... en
    for foreign word f-new = 0 ... fn
      if ( ( e-new, f-new ) in alignment a and
           ( e-new not aligned or f-new not aligned ) )
        add alignment point ( e-new, f-new )

Proposed changes: after growing the diagonal, align the shorter sentence first, and use alignments only from the corresponding directional alignment.
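The same heuristic as a runnable Python sketch (our transcription of the pseudocode above; e2f and f2e are sets of (e, f) index pairs):

NEIGHBORING = [(-1, 0), (0, -1), (1, 0), (0, 1),
               (-1, -1), (-1, 1), (1, -1), (1, 1)]

def aligned_e(alignment, e):
    # True if source position e already has an alignment point
    return any(pe == e for pe, pf in alignment)

def aligned_f(alignment, f):
    return any(pf == f for pe, pf in alignment)

def grow_diag_final(e2f, f2e):
    alignment = e2f & f2e  # start from the intersection
    union = e2f | f2e
    # GROW-DIAG: iterate until no new points are added
    added = True
    while added:
        added = False
        for e, f in sorted(alignment):
            for de, df in NEIGHBORING:
                pe, pf = e + de, f + df
                if ((pe, pf) in union
                        and not aligned_e(alignment, pe)
                        and not aligned_f(alignment, pf)):
                    alignment.add((pe, pf))
                    added = True
    # FINAL(e2f) then FINAL(f2e): add leftover directional points
    # whose row or column is still unaligned
    for a in (e2f, f2e):
        for pe, pf in sorted(a):
            if not aligned_e(alignment, pe) or not aligned_f(alignment, pf):
                alignment.add((pe, pf))
    return alignment

# Example: grows (1, 1) from the union, then FINAL adds (1, 2)
print(sorted(grow_diag_final({(0, 0), (1, 1)}, {(0, 0), (1, 2)})))
# [(0, 0), (1, 1), (1, 2)]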

24 Generating Phrase Alignments
[grid of the combined word alignment with consistent phrase pairs boxed]
Example phrase pairs read off the grid: "a premier beach vacation destination" ↔ "एक प्रमुख समुद्र-तटीय गंतव्य है"; "premier beach vacation" ↔ "प्रमुख समुद्र-तटीय"
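A simplified phrase-pair extraction sketch in Python (our illustration; alignment is a set of (e, f) index pairs, and the full Moses extractor additionally handles unaligned boundary words):

def extract_phrase_spans(alignment, en, fn, max_len=7):
    # Enumerate English spans [e1, e2]; a span pair is consistent if
    # no word inside either span aligns to a word outside the other.
    spans = []
    for e1 in range(en):
        for e2 in range(e1, min(e1 + max_len, en)):
            fs = [f for (e, f) in alignment if e1 <= e <= e2]
            if not fs:
                continue
            f1, f2 = min(fs), max(fs)
            if f2 - f1 + 1 > max_len:
                continue
            if all(e1 <= e <= e2 for (e, f) in alignment if f1 <= f <= f2):
                spans.append(((e1, e2), (f1, f2)))
    return spans

On the grid above, this is how a span pair like "premier beach vacation" ↔ "प्रमुख समुद्र-तटीय" is admitted: every alignment point inside one span stays inside the other.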

25 Using Moses and GIZA++
Refer to http://www.statmt.org/moses_steps.html

26 Steps
- Install all Moses packages
- Input: sentence-aligned parallel corpus
- Training
- Tuning
- Generate output on test corpus (decoding)

27 Example
train.en (one word per line, spelled as space-separated letters):
h e l l o
h e l l o
w o r l d
c o m p o u n d
w o r d
h y p h e n a t e d
o n e
b o o m
k w e e z l e b o t t e r
train.pr (the corresponding phoneme sequences):
hh eh l ow
hh ah l ow
w er l d
k aa m p aw n d
w er d
hh ay f ah n ey t ih d
ow eh n iy
b uw m
k w iy z l ah b aa t ah r

28 Sample from Phrase-table
Each line has the form "source ||| target ||| alignment ||| alignment ||| scores"; in the usual Moses layout the final constant 2.718 (= e) is the phrase penalty, and the other four scores are the phrase translation probabilities and lexical weights in the two directions.
b o ||| b aa ||| (0) (1) ||| (0) (1) ||| 1 0.666667 1 0.181818 2.718
b ||| b ||| (0) ||| (0) ||| 1 1 1 1 2.718
c o m p o ||| aa m p ||| (2) (0,1) (1) (0) (1) ||| (1,3) (1,2,4) (0) ||| 1 0.0486111 1 0.154959 2.718
c ||| p ||| (0) ||| (0) ||| 1 1 1 1 2.718
d w ||| d w ||| (0) (1) ||| (0) (1) ||| 1 0.75 1 1 2.718
d ||| d ||| (0) ||| (0) ||| 1 1 1 1 2.718
e b ||| ah b ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
e l l ||| ah l ||| (0) (1) (1) ||| (0) (1,2) ||| 1 1 0.5 0.5 2.718
e l l ||| eh l ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.111111 0.5 0.111111 2.718
e l ||| eh ||| (0) (0) ||| (0,1) ||| 1 0.111111 1 0.133333 2.718
e ||| ah ||| (0) ||| (0) ||| 1 1 0.666667 0.6 2.718
h e ||| hh ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
h ||| hh ||| (0) ||| (0) ||| 1 1 1 1 2.718
l e b ||| l ah b ||| (0) (1) (2) ||| (0) (1) (2) ||| 1 1 1 0.5 2.718
l e ||| l ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.5 2.718
l l o ||| l ow ||| (0) (0) (1) ||| (0,1) (2) ||| 0.5 1 1 0.227273 2.718
l l ||| l ||| (0) (0) ||| (0,1) ||| 0.25 1 1 0.833333 2.718
l o ||| l ow ||| (0) (1) ||| (0) (1) ||| 0.5 1 1 0.227273 2.718
l ||| l ||| (0) ||| (0) ||| 0.75 1 1 0.833333 2.718
m ||| m ||| (0) ||| (0) ||| 1 0.5 1 1 2.718
n d ||| n d ||| (0) (1) ||| (0) (1) ||| 1 1 1 1 2.718
n e ||| eh n iy ||| (1) (2) ||| () (0) (1) ||| 1 1 0.5 0.3 2.718
n e ||| n iy ||| (0) (1) ||| (0) (1) ||| 1 1 0.5 0.3 2.718
n ||| eh n ||| (1) ||| () (0) ||| 1 1 0.25 1 2.718
o o m ||| uw m ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.5 1 0.181818 2.718
o o ||| uw ||| (0) (0) ||| (0,1) ||| 1 1 1 0.181818 2.718
o ||| aa ||| (0) ||| (0) ||| 1 0.666667 0.2 0.181818 2.718
o ||| ow eh ||| (0) ||| (0) () ||| 1 1 0.2 0.272727 2.718
o ||| ow ||| (0) ||| (0) ||| 1 1 0.6 0.272727 2.718
w o r ||| w er ||| (0) (1) (1) ||| (0) (1,2) ||| 1 0.1875 1 0.424242 2.718
w ||| w ||| (0) ||| (0) ||| 1 0.75 1 1 2.718

29 Testing output
h o t → hh aa t
p h o n e → p|UNK hh ow eh n iy
b o o k → b uw k

