Introduction to IBM Model 1&2 Alignment

1 Introduction to IBM Model 1&2 Alignment
钱炘祺

2 machine translation model
French $f$ → English $e$
$p(e \mid f) = \frac{p(e, f)}{p(f)} = \frac{p(e)\, p(f \mid e)}{\sum_{e} p(e)\, p(f \mid e)}$
$e^{*} = \arg\max_{e} p(e \mid f) = \arg\max_{e} p(e)\, p(f \mid e)$
$p(e)$: the language model
$p(f \mid e)$: the translation model (difficult)
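As a minimal sketch of the decision rule above, the snippet below picks the English candidate that maximizes $p(e)\,p(f \mid e)$ for one fixed French input. The candidate strings and all probabilities are invented for illustration; they are not taken from the slides.

```python
# Toy noisy-channel decoder for one fixed French sentence f.
# Every number below is made up for illustration.
language_model = {        # p(e)
    "the program": 0.6,
    "program the": 0.1,
    "a program": 0.3,
}
translation_model = {     # p(f | e) for the fixed f
    "the program": 0.5,
    "program the": 0.5,
    "a program": 0.2,
}

def decode(candidates, lm, tm):
    """Return the candidate e maximizing p(e) * p(f | e)."""
    return max(candidates, key=lambda e: lm[e] * tm[e])

best = decode(list(language_model), language_model, translation_model)

# The denominator sum_e p(e) p(f | e) only rescales the scores,
# so it does not change which candidate wins.
evidence = sum(language_model[e] * translation_model[e] for e in language_model)
posterior = {e: language_model[e] * translation_model[e] / evidence
             for e in language_model}
```

The language model is what penalizes the ill-formed "program the" here, since both orderings receive the same translation score.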

3 why alignment
e ($l$ words) = And the program has been implemented
f ($m$ words) = Le programme a ete mis en application
$p(f \mid e, m)$ is difficult to model directly
with the help of alignment:
$p(f \mid e, m) = \sum_{a \in \mathcal{A}} p(f, a \mid e, m)$

4 alignment
e ($l$ words) = And the program has been implemented
f ($m$ words) = Le programme a ete mis en application
alignment = {2, 3, 4, 5, 6, 6, 6}
# of all possible alignments = $(l+1)^m$ (each of the $m$ French words picks one of the $l$ English words or NULL)
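To make the alignment vector concrete, here is a small sketch that reads {2,3,4,5,6,6,6} as word pairs and counts the possible alignments for this sentence pair. The helper name `show_alignment` is hypothetical, and position 0 is assumed to stand for NULL.

```python
from itertools import product

e = "And the program has been implemented".split()   # l = 6
f = "Le programme a ete mis en application".split()  # m = 7
l, m = len(e), len(f)

# Each French word chooses one of l English words or NULL (position 0).
num_alignments = (l + 1) ** m

def show_alignment(alignment, e, f):
    """Pair each French word with the English word it is aligned to."""
    e0 = ["NULL"] + e          # shift so that index 0 means NULL
    return [(fw, e0[a]) for fw, a in zip(f, alignment)]

pairs = show_alignment([2, 3, 4, 5, 6, 6, 6], e, f)

# Brute-force check of the counting formula on a tiny case (l = 2, m = 2).
tiny = list(product(range(2 + 1), repeat=2))
```

Here `pairs` maps "Le" to "the" and the last three French words all to "implemented", which is why 6 appears three times in the vector.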

5 alignment
e ($l$ words) = And the program has been implemented
f ($m$ words) = __ _______ __ __ ___ ___ ______ programme
$p(f \mid e, m) = \sum_{a \in \mathcal{A}} p(f, a \mid e, m)$
$p(f, a \mid e, m) = p(a \mid e, m)\, p(f \mid a, e, m)$
$p(f \mid e, m) = \sum_{a \in \mathcal{A}} p(a \mid e, m)\, p(f \mid a, e, m)$

6 most likely alignment
$p(f, a \mid e, m) = p(a \mid e, m)\, p(f \mid a, e, m)$
$p(f \mid e, m) = \sum_{a \in \mathcal{A}} p(a \mid e, m)\, p(f \mid a, e, m)$
$p(a \mid f, e, m) = \frac{p(f, a \mid e, m)}{\sum_{a \in \mathcal{A}} p(f, a \mid e, m)} = \frac{p(f, a \mid e, m)}{p(f \mid e, m)}$
$a^{*} = \arg\max_{a} p(a \mid f, e, m)$

7 alignment example
French: le conseil a rendu son avis, et nous devons à présent adopter un nouvel avis sur la base de la première position.
English: the council has stated its position, and now, on the basis of the first position, we again have to give our opinion.
Alignment: the/le council/conseil has/à stated/rendu its/son position/avis ,/, and/et now/présent ,/, on/sur the/le basis/base of/de the/la first/première position/position ,/NULL we/nous again/NULL have/devons to/a give/adopter our/nouvel opinion/avis ./.

8 IBM Model 1
e ($l$ words) = And the program has been implemented
f ($m$ words) = __ _______ __ __ ___ ___ ______
$p(f, a \mid e, m) = p(a \mid e, m)\, p(f \mid a, e, m)$
all alignments are equally likely:
$p(a \mid e, m) = \frac{1}{(l+1)^m}$
there are $(l+1)^m$ possible values for $a$

9 IBM Model 1
e ($l$ words) = And the program has been implemented
f ($m$ words) = __ _______ __ __ ___ ___ ______
f ($m$ words) = Le programme a ete mis en application
$p(f, a \mid e, m) = p(a \mid e, m)\, p(f \mid a, e, m)$
all alignments are equally likely, so only the translation term depends on $a$:
$p(f \mid a, e, m) = \prod_{j=1}^{m} t(f_j \mid e_{a_j})$
i.e. the product of the translation probabilities of each aligned word pair
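The Model 1 factorization can be sketched in a few lines. The code below computes $p(f, a \mid e, m)$, sums it over every alignment to get $p(f \mid e, m)$, and normalizes to obtain the posterior $p(a \mid f, e, m)$ and its argmax. The function names, the `(french_word, english_word)` key layout for `t`, and the toy probabilities are all assumptions made for this example.

```python
from itertools import product

def model1_joint(f, e, a, t):
    """p(f, a | e, m) under IBM Model 1; a[j] is in 0..l, with 0 = NULL."""
    e0 = ["NULL"] + e
    l, m = len(e), len(f)
    prob = 1.0 / (l + 1) ** m                  # uniform p(a | e, m)
    for fw, aj in zip(f, a):
        prob *= t.get((fw, e0[aj]), 0.0)       # t(f_j | e_{a_j})
    return prob

def model1_posterior(f, e, t):
    """Brute-force p(a | f, e, m) over all (l+1)^m alignments (tiny inputs only)."""
    l, m = len(e), len(f)
    joint = {a: model1_joint(f, e, a, t)
             for a in product(range(l + 1), repeat=m)}
    marginal = sum(joint.values())             # p(f | e, m)
    return {a: p / marginal for a, p in joint.items()}, marginal

# Invented translation table t[(f_word, e_word)] = t(f | e).
t = {
    ("le", "NULL"): 0.2, ("le", "the"): 0.7, ("le", "program"): 0.1,
    ("programme", "NULL"): 0.1, ("programme", "the"): 0.05,
    ("programme", "program"): 0.8,
}
e = ["the", "program"]
f = ["le", "programme"]

posterior, marginal = model1_posterior(f, e, t)
best = max(posterior, key=posterior.get)       # (1, 2): le->the, programme->program
```

Enumerating all alignments is exponential in $m$, so this is only meant to mirror the formulas on a two-word sentence.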

10 IBM Model 1
estimate $t(f \mid e)$: train from a parallel corpus

11 IBM Model 2
$q(i \mid j, l, m)$: probability that the $j$'th French word ($f_j$) aligns to the $i$'th English word ($e_i$), given lengths $l$ and $m$
$p(a \mid e, m) = \prod_{j=1}^{m} q(a_j \mid j, l, m) \neq \frac{1}{(l+1)^m}$
$p(f, a \mid e, m) = p(a \mid e, m)\, p(f \mid a, e, m) = \prod_{j=1}^{m} q(a_j \mid j, l, m)\, t(f_j \mid e_{a_j})$

12 IBM Model 2
e ($l$ words) = And the program has been implemented
f ($m$ words) = __ _______ __ __ ___ ___ ______ programme
$p(a \mid e, m) = p(a \mid e, 7) = q(2 \mid 1, 6, 7) \cdot q(3 \mid 2, 6, 7) \cdot q(4 \mid 3, 6, 7) \cdot q(5 \mid 4, 6, 7) \cdot q(6 \mid 5, 6, 7) \cdot q(6 \mid 6, 6, 7) \cdot q(6 \mid 7, 6, 7)$
$p(f, a \mid e, m) = \prod_{j=1}^{m} q(a_j \mid j, l, m)\, t(f_j \mid e_{a_j})$
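A rough sketch of the Model 2 joint probability follows. It differs from Model 1 only in replacing the uniform alignment term with a product of $q$ values. The dictionary layouts (`q` keyed by `(english_pos, french_pos, l, m)`, `t` keyed by `(french_word, english_word)`) and all numbers are assumptions for illustration.

```python
def model2_alignment_prob(a, l, m, q):
    """p(a | e, m) = prod_j q(a_j | j, l, m), French positions j = 1..m."""
    prob = 1.0
    for j, aj in enumerate(a, start=1):
        prob *= q.get((aj, j, l, m), 0.0)
    return prob

def model2_joint(f, e, a, t, q):
    """p(f, a | e, m) = prod_j q(a_j | j, l, m) * t(f_j | e_{a_j})."""
    e0 = ["NULL"] + e                      # index 0 is the NULL word
    l, m = len(e), len(f)
    prob = 1.0
    for j, (fw, aj) in enumerate(zip(f, a), start=1):
        prob *= q.get((aj, j, l, m), 0.0) * t.get((fw, e0[aj]), 0.0)
    return prob

# Invented parameters for a two-word sentence pair (l = 2, m = 2).
e = ["the", "program"]
f = ["le", "programme"]
t = {("le", "the"): 0.7, ("programme", "program"): 0.8}
q = {(1, 1, 2, 2): 0.6, (2, 2, 2, 2): 0.5}

prior = model2_alignment_prob([1, 2], 2, 2, q)   # 0.6 * 0.5
joint = model2_joint(f, e, [1, 2], t, q)         # 0.6 * 0.7 * 0.5 * 0.8
```

Missing entries are treated as probability 0 here, which is a simplification of the sketch rather than part of the model.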

13 IBM Model 2
e ($l$ words) = (NULL) And the program has been implemented
f ($m$ words) = Le programme a ete mis en application
$a_j = \arg\max_{a \in \{0 \dots l\}} q(a \mid j, l, m) \cdot t(f_j \mid e_a)$
candidates for $f_3$ = a ($j = 3$, $l = 6$, $m = 7$):
NULL: $q(0 \mid 3, 6, 7) \cdot t(\text{a} \mid \text{NULL})$
And: $q(1 \mid 3, 6, 7) \cdot t(\text{a} \mid \text{And})$
the: $q(2 \mid 3, 6, 7) \cdot t(\text{a} \mid \text{the})$
program: $q(3 \mid 3, 6, 7) \cdot t(\text{a} \mid \text{program})$
has: $q(4 \mid 3, 6, 7) \cdot t(\text{a} \mid \text{has})$
been: $q(5 \mid 3, 6, 7) \cdot t(\text{a} \mid \text{been})$
implemented: $q(6 \mid 3, 6, 7) \cdot t(\text{a} \mid \text{implemented})$
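Because the joint probability factorizes over French positions, the best alignment can be found one word at a time. The sketch below assumes the same hypothetical dictionary layouts as before (`q[(english_pos, french_pos, l, m)]`, `t[(french_word, english_word)]`) and uses invented numbers.

```python
def best_alignment(f, e, t, q):
    """For each French position j, choose the English position (0 = NULL)
    that maximizes q(a | j, l, m) * t(f_j | e_a)."""
    e0 = ["NULL"] + e
    l, m = len(e), len(f)
    alignment = []
    for j, fw in enumerate(f, start=1):
        scores = [q.get((a, j, l, m), 0.0) * t.get((fw, e0[a]), 0.0)
                  for a in range(l + 1)]
        alignment.append(max(range(l + 1), key=scores.__getitem__))
    return alignment

# Invented parameters: uniform q, and t favoring the intuitive word pairs.
e = ["the", "program"]
f = ["le", "programme"]
q = {(a, j, 2, 2): 1.0 / 3 for a in range(3) for j in (1, 2)}
t = {
    ("le", "NULL"): 0.2, ("le", "the"): 0.7, ("le", "program"): 0.1,
    ("programme", "NULL"): 0.1, ("programme", "the"): 0.05,
    ("programme", "program"): 0.8,
}

alignment = best_alignment(f, e, t, q)   # [1, 2]
```

This costs $O(m \cdot l)$ rather than the $(l+1)^m$ of enumerating whole alignments, which is the practical payoff of the per-position factorization.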

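14 EM for IBM Model 2
input: a set of bitext $(e^{(k)}, f^{(k)})$, e.g.
$e$ = And the program has been implemented
$f$ = Le programme a ete mis en application
output: $t(f \mid e)$ & $q(j \mid i, l, m)$
challenge: do not have alignments on training data
Note: from this slide on, $i$ indexes French positions and $j$ indexes English positions, the reverse of the Model 2 definition slides.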

15 EM for IBM Model 2
if alignments are observed, estimate $t(f \mid e)$ & $q(j \mid i, l, m)$ by counting:
$t_{ML}(f \mid e) = \frac{\mathrm{Count}(e, f)}{\mathrm{Count}(e)}$
$q_{ML}(j \mid i, l, m) = \frac{\mathrm{Count}(j \mid i, l, m)}{\mathrm{Count}(i, l, m)}$
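When alignments are observed, the estimates are just ratios of counts. The following sketch assumes a corpus given as `(e, f, a)` triples, where `a[i-1]` is the English position (0 = NULL) aligned to French position `i`. The function name and the tiny corpus are invented.

```python
from collections import Counter

def mle_from_alignments(corpus):
    """Return t[(f_word, e_word)] and q[(j, i, l, m)] by relative frequency."""
    c_ef, c_e = Counter(), Counter()
    c_jilm, c_ilm = Counter(), Counter()
    for e, f, a in corpus:
        e0 = ["NULL"] + e
        l, m = len(e), len(f)
        for i, (fw, j) in enumerate(zip(f, a), start=1):
            c_ef[(e0[j], fw)] += 1      # Count(e, f)
            c_e[e0[j]] += 1             # Count(e)
            c_jilm[(j, i, l, m)] += 1   # Count(j | i, l, m)
            c_ilm[(i, l, m)] += 1       # Count(i, l, m)
    t = {(fw, ew): c / c_e[ew] for (ew, fw), c in c_ef.items()}
    q = {key: c / c_ilm[key[1:]] for key, c in c_jilm.items()}
    return t, q

# Three hand-aligned toy sentence pairs (invented).
corpus = [
    (["the", "dog"], ["le", "chien"], [1, 2]),
    (["the", "cat"], ["le", "chat"], [1, 2]),
    (["the", "dog"], ["chien", "le"], [2, 1]),
]
t, q = mle_from_alignments(corpus)
```

In this toy corpus the first French position aligns to English position 1 in two of three sentences, so `q[(1, 1, 2, 2)]` comes out as 2/3.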

16 EM for IBM Model 2
For $s = 1 \dots S$:
  Set all counts $c(\dots) = 0$
  For $k = 1 \dots n$:
    For $i = 1 \dots m_k$, for $j = 0 \dots l_k$:
      $c(e_j^{(k)}, f_i^{(k)}) \leftarrow c(e_j^{(k)}, f_i^{(k)}) + \delta(k, i, j)$
      $c(e_j^{(k)}) \leftarrow c(e_j^{(k)}) + \delta(k, i, j)$
      $c(j \mid i, l, m) \leftarrow c(j \mid i, l, m) + \delta(k, i, j)$
      $c(i, l, m) \leftarrow c(i, l, m) + \delta(k, i, j)$
  Recalculate the parameters:
    $t(f \mid e) = \frac{c(e, f)}{c(e)}$
    $q(j \mid i, l, m) = \frac{c(j \mid i, l, m)}{c(i, l, m)}$

17 EM for IBM Model 2
e ($l$ words) = (NULL) And the program has been implemented
f ($m$ words) = Le programme a ete mis en application
if alignments are observed:
$\delta(k, i, j) = 1$ if $a_i^{(k)} = j$, and $0$ otherwise
if we do not have alignments on training data:
$\delta(k, i, j) = \frac{q(j \mid i, l_k, m_k)\, t(f_i^{(k)} \mid e_j^{(k)})}{\sum_{j'=0}^{l_k} q(j' \mid i, l_k, m_k)\, t(f_i^{(k)} \mid e_{j'}^{(k)})}$
$\delta(k, i, j) = p(a_i^{(k)} = j \mid e^{(k)}, f^{(k)}; t, q)$
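Putting the count updates and the soft $\delta$ together, a compact EM loop might look like the sketch below. The uniform initialization of both `t` and `q` is an assumption (the slides do not say how to initialize), as are the function name, the dictionary layouts, and the two-sentence toy corpus.

```python
from collections import defaultdict

def em_model2(corpus, iterations=5):
    """corpus: list of (e, f) token lists.
    Returns t[(f_word, e_word)] and q[(j, i, l, m)]."""
    # Assumed uniform initialization of t and q.
    f_vocab = {fw for _, f in corpus for fw in f}
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    q = {}
    for e, f in corpus:
        l, m = len(e), len(f)
        for i in range(1, m + 1):
            for j in range(l + 1):
                q[(j, i, l, m)] = 1.0 / (l + 1)

    for _ in range(iterations):
        c_ef, c_e = defaultdict(float), defaultdict(float)
        c_jilm, c_ilm = defaultdict(float), defaultdict(float)
        for e, f in corpus:
            e0 = ["NULL"] + e
            l, m = len(e), len(f)
            for i, fw in enumerate(f, start=1):
                scores = [q[(j, i, l, m)] * t[(fw, e0[j])] for j in range(l + 1)]
                total = sum(scores)
                for j, score in enumerate(scores):
                    delta = score / total          # p(a_i = j | e, f; t, q)
                    c_ef[(fw, e0[j])] += delta
                    c_e[e0[j]] += delta
                    c_jilm[(j, i, l, m)] += delta
                    c_ilm[(i, l, m)] += delta
        # Recalculate the parameters from the expected counts.
        t = {key: c / c_e[key[1]] for key, c in c_ef.items()}
        q = {key: c / c_ilm[key[1:]] for key, c in c_jilm.items()}
    return t, q

corpus = [
    (["the", "dog"], ["le", "chien"]),
    (["the", "cat"], ["le", "chat"]),
]
t, q = em_model2(corpus, iterations=5)
```

On this corpus "le" co-occurs with every English word while "chien" appears only alongside "dog", so after a few iterations `t[("chien", "dog")]` overtakes `t[("le", "dog")]` even though no alignment was ever observed.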
