CS460/626: Natural Language Processing/Speech, NLP and the Web (Lecture 12: IBM Model 1) Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 31st Jan, 2011
Grammar-based and N-gram-based models of language
A rule-based model of a language is a grammar: a set of rules that determines whether a sentence is valid in that language, e.g. NP -> N | Adj P NP | N PP | Art NP. The outcome is a 1/0 decision. Recursive rules allow the generation of an infinite number of sentences in the language.
A statistical model (e.g. bigram, trigram) calculates a score in the range 0 to 1 that measures how well a sentence belongs to the language. The outcome is NOT a 1/0 decision, but a ranking.
Statistical Machine Translation (SMT)
A data-driven approach. Given a foreign-language sentence f, the goal is to find the English sentence e for which p(e|f) is maximum: $\hat{e} = \arg\max_e p(e \mid f)$. Translations are generated on the basis of a statistical model, whose parameters are estimated from bilingual parallel corpora.
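SMT: Language Model
Used to detect good English sentences. The probability of an English sentence $s_1 s_2 \ldots s_n$ can be written as
$\Pr(s_1 s_2 \ldots s_n) = \Pr(s_1) \cdot \Pr(s_2 \mid s_1) \cdots \Pr(s_n \mid s_1 s_2 \ldots s_{n-1})$
Here $\Pr(s_n \mid s_1 s_2 \ldots s_{n-1})$ is the probability that word $s_n$ follows the word string $s_1 s_2 \ldots s_{n-1}$.
N-gram model probability: $\Pr(s_n \mid s_1 \ldots s_{n-1}) \approx \Pr(s_n \mid s_{n-N+1} \ldots s_{n-1})$
Trigram model probability calculation: $\Pr(s_3 \mid s_1 s_2) = \mathrm{count}(s_1 s_2 s_3) / \mathrm{count}(s_1 s_2)$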
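The chain-rule product above, with each factor truncated to a trigram history, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `<s>`/`</s>` padding symbols, and the two-sentence toy corpus are hypothetical and not from the lecture, and the estimate is the plain relative-frequency one with no smoothing.

```python
from collections import Counter

def train_trigram_counts(sentences):
    """Count trigrams and their two-word histories over tokenized sentences."""
    trigrams, histories = Counter(), Counter()
    for words in sentences:
        # Two start symbols give the first real word a full trigram history.
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        for i in range(2, len(padded)):
            trigrams[tuple(padded[i - 2:i + 1])] += 1
            histories[tuple(padded[i - 2:i])] += 1
    return trigrams, histories

def sentence_probability(words, trigrams, histories):
    """Product of count(s1 s2 s3) / count(s1 s2) over the padded sentence."""
    padded = ["<s>", "<s>"] + list(words) + ["</s>"]
    prob = 1.0
    for i in range(2, len(padded)):
        history = tuple(padded[i - 2:i])
        if histories[history] == 0:
            return 0.0  # unseen history: no smoothing in this sketch
        prob *= trigrams[tuple(padded[i - 2:i + 1])] / histories[history]
    return prob

# Hypothetical toy corpus, for illustration only.
corpus = ["ram went to school".split(), "ram went to market".split()]
trigrams, histories = train_trigram_counts(corpus)
p_seen = sentence_probability("ram went to school".split(), trigrams, histories)
p_scrambled = sentence_probability("school to went ram".split(), trigrams, histories)
```

On this toy corpus the well-formed sentence receives a higher score than the scrambled one, which is the ranking behaviour a language model is meant to provide.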
SMT: Translation Model
P(f|e): the probability of some f given a hypothesized English translation e. How should values be assigned to p(f|e)? The set of possible sentences is infinite, so it is not possible to observe a pair (e, f) for every sentence. Instead, introduce a hidden variable a that represents alignments between the individual words in the sentence pair.
(Figure labels: sentence level, word level.)
Alignment
If the string $e = e_1^l = e_1 e_2 \ldots e_l$ has l words, and the string $f = f_1^m = f_1 f_2 \ldots f_m$ has m words, then the alignment a can be represented by a series $a_1^m = a_1 a_2 \ldots a_m$ of m values, each between 0 and l, such that if the word in position j of the f-string is connected to the word in position i of the e-string, then $a_j = i$, and if it is not connected to any English word, then $a_j = 0$.
Example of alignment
English: Ram went to school
Hindi: Raama paathashaalaa gayaa
Alignment between source and target sentence
e0 = Φ, e1 = Ram, e2 = went, e3 = to, e4 = school
f0 = Φ, f1 = Raama, f2 = paathashaalaa, f3 = gayaa
Alignment: a1 = 1, a2 = 4, a3 = 2
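As a small illustration, the alignment vector for this example can be stored as a plain list and used to look up the English word each Hindi word is connected to. The names below (`aligned_pairs`, the `"NULL"` placeholder for the empty word at position 0) are assumptions made for this sketch, not notation from the lecture.

```python
def aligned_pairs(e_words, f_words, a):
    """Return (foreign word, English word) pairs for an alignment a_1..a_m.

    e_words[0] must be the empty word, so that a_j = 0 means 'not connected'.
    """
    if len(a) != len(f_words):
        raise ValueError("alignment needs exactly one value per foreign word")
    if any(not 0 <= i < len(e_words) for i in a):
        raise ValueError("each a_j must lie between 0 and l")
    return [(f, e_words[i]) for f, i in zip(f_words, a)]

english = ["NULL", "Ram", "went", "to", "school"]  # e_0 .. e_4
hindi = ["Raama", "paathashaalaa", "gayaa"]        # f_1 .. f_3
alignment = [1, 4, 2]                               # a_1 .. a_3

pairs = aligned_pairs(english, hindi, alignment)

# Each of the m values ranges over 0..l, so there are (l+1)^m alignments.
num_alignments = len(english) ** len(hindi)
```

Here `pairs` connects Raama to Ram, paathashaalaa to school, and gayaa to went; the English word "to" is left unconnected, and the sentence pair admits 5^3 = 125 possible alignments in total.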
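Translation Model: Exact expression
$\Pr(f, a \mid e) = \Pr(m \mid e) \prod_{j=1}^{m} \Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \, \Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)$
The three factors correspond to three choices:
$\Pr(m \mid e)$: choose the length of the foreign language string given e.
$\Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)$: choose the alignment given e and m.
$\Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)$: choose the identity of the foreign word given e, m, a.
Five models for estimating the parameters in the expression [2]: Model-1, Model-2, Model-3, Model-4, Model-5.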
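Proof of Translation Model: Exact expression
$\Pr(f \mid e) = \sum_a \Pr(f, a \mid e)$ ; marginalization
$\Pr(f, a \mid e) = \sum_m \Pr(f, a, m \mid e)$ ; marginalization
$= \sum_m \Pr(m \mid e) \, \Pr(f, a \mid m, e)$
$= \sum_m \Pr(m \mid e) \prod_{j=1}^{m} \Pr(f_j, a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)$
$= \sum_m \Pr(m \mid e) \prod_{j=1}^{m} \Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \, \Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)$
m is fixed for a particular f, hence
$\Pr(f, a \mid e) = \Pr(m \mid e) \prod_{j=1}^{m} \Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) \, \Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)$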
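Model-1
The simplest model. Assumptions:
Pr(m|e) is independent of m and e and is equal to ε.
The alignment of foreign language words (FLWs) depends only on the length of the English sentence: $\Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) = (l+1)^{-1}$, where l is the length of the English sentence.
The identity of a foreign word depends only on the English word it is connected to: $\Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e) = t(f_j \mid e_{a_j})$.
The likelihood function will be
$\Pr(f \mid e) = \frac{\epsilon}{(l+1)^m} \sum_{a_1=0}^{l} \cdots \sum_{a_m=0}^{l} \prod_{j=1}^{m} t(f_j \mid e_{a_j})$
Maximize the likelihood function subject to the constraint $\sum_f t(f \mid e) = 1$ for each English word e.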
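To make the Model-1 likelihood concrete, the sketch below evaluates it by brute force: it enumerates every alignment, multiplies the word translation probabilities along it, and scales the sum by ε/(l+1)^m. The function name, the `"NULL"` token for the empty word, and the numbers in the table `t` are all hypothetical values chosen for illustration. Enumeration is exponential in m, so this is only sensible for very short sentences.

```python
from itertools import product

def model1_likelihood(f_words, e_words, t, epsilon=1.0):
    """Pr(f|e) under Model 1, summing over all (l+1)^m alignments."""
    e_with_null = ["NULL"] + list(e_words)  # position 0 is the empty word
    l, m = len(e_words), len(f_words)
    total = 0.0
    for a in product(range(l + 1), repeat=m):  # a = (a_1, ..., a_m)
        term = 1.0
        for j, i in enumerate(a):
            # t(f_j | e_{a_j}); pairs missing from the table count as 0
            term *= t.get((f_words[j], e_with_null[i]), 0.0)
        total += term
    return epsilon / (l + 1) ** m * total

# Hypothetical translation table t[(f, e)]; each column sums to 1 over f.
t = {
    ("Raama", "NULL"): 0.5, ("gayaa", "NULL"): 0.5,
    ("Raama", "Ram"): 0.9,  ("gayaa", "Ram"): 0.1,
    ("Raama", "went"): 0.2, ("gayaa", "went"): 0.8,
}
likelihood = model1_likelihood(["Raama", "gayaa"], ["Ram", "went"], t)
```

With l = 2 and m = 2 the loop visits 9 alignments. The `epsilon` default of 1.0 is a placeholder, since the slides give no value for ε.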
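Model-1: Parameter estimation
Using a Lagrange multiplier for the constrained maximization, the solution for the Model-1 parameters is
$t(f \mid e) = \lambda_e^{-1} \sum_a \Pr(\mathbf{f}, a \mid \mathbf{e}) \sum_{j=1}^{m} \delta(f, f_j) \, \delta(e, e_{a_j})$
$c(f \mid e; \mathbf{f}, \mathbf{e}) = \sum_a \Pr(a \mid \mathbf{e}, \mathbf{f}) \sum_{j=1}^{m} \delta(f, f_j) \, \delta(e, e_{a_j})$
Here bold $\mathbf{f}$, $\mathbf{e}$ denote the sentence pair and plain f, e denote single words. λ_e is a normalization constant; c(f|e; f, e) is the expected count of e being connected to f in the sentence pair; δ(f, f_j) is 1 if f and f_j are the same and zero otherwise, and likewise for δ(e, e_{a_j}).
Since $\Pr(\mathbf{f}, a \mid \mathbf{e}) = \Pr(\mathbf{f} \mid \mathbf{e}) \Pr(a \mid \mathbf{e}, \mathbf{f})$, the first expression equals $\lambda_e^{-1} \Pr(\mathbf{f} \mid \mathbf{e}) \, c(f \mid e; \mathbf{f}, \mathbf{e})$, so t(f|e) is proportional to the expected count.
Estimate t(f|e) using the Expectation Maximization (EM) procedure.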
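A minimal sketch of the EM procedure for t(f|e) follows. It relies on one fact that is standard for Model 1 but not stated on the slide: because every alignment position is chosen independently and uniformly, the posterior probability that f_j is connected to e_i is t(f_j|e_i) divided by the sum of t(f_j|e_i') over all positions i' from 0 to l, so expected counts can be collected without enumerating alignments. The function name, the `"NULL"` token, the uniform initialization, and the toy corpus are all assumptions made for illustration.

```python
from collections import defaultdict

def train_model1(corpus, iterations=10):
    """EM for Model 1. corpus is a list of (f_words, e_words) pairs.

    Returns a dict-like table t[(f, e)] approximating t(f|e).
    """
    f_vocab = {f for f_words, _ in corpus for f in f_words}
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)  # start from uniform t(f|e)
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(f|e), summed over pairs
        total = defaultdict(float)  # normalization constant lambda_e
        # E-step: distribute each foreign word over the English positions.
        for f_words, e_words in corpus:
            e_with_null = ["NULL"] + list(e_words)
            for f in f_words:
                z = sum(t[(f, e)] for e in e_with_null)
                for e in e_with_null:
                    posterior = t[(f, e)] / z
                    count[(f, e)] += posterior
                    total[e] += posterior
        # M-step: renormalize so that sum_f t(f|e) = 1 for every e.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

# Hypothetical two-sentence parallel corpus (Hindi, English).
corpus = [
    ("Raama gayaa".split(), "Ram went".split()),
    ("Raama aayaa".split(), "Ram came".split()),
]
t = train_model1(corpus)
```

On this toy corpus, "Raama" co-occurs with "Ram" in both pairs, so t(Raama|Ram) ends up larger than t(gayaa|Ram), and "gayaa" becomes the preferred translation of "went". A real implementation would also track the likelihood across iterations to decide when to stop; the fixed iteration count here is a simplification.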