Pushpak Bhattacharyya CSE Dept., IIT Bombay 31st Jan, 2011

CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 12 – IBM Model 1)
Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 31st Jan, 2011

Grammar based and N-gram based models of Language
The rule-based model of a language is a grammar: a set of rules determines whether a sentence is valid in that language, e.g. NP -> N | AdjP NP | N PP | Art NP. This is a 1/0 decision. Recursive rules allow the generation of an infinite number of sentences in the language. A statistical model (e.g. bigram, trigram), by contrast, computes a score in the range 0 to 1 to determine belongingness: NOT a 1/0 decision, but a ranking.
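The contrast can be sketched with a maximum-likelihood bigram scorer over a toy corpus (no smoothing, so an unseen bigram yields score 0; the corpus and sentences are made up for illustration):

```python
from collections import Counter

corpus = [
    "the dog ran home",
    "the cat slept",
    "a dog slept",
]

# Collect unigram and bigram counts from the toy corpus.
unigrams, bigrams = Counter(), Counter()
for line in corpus:
    words = ["<s>"] + line.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

def bigram_score(sentence):
    """P(s) ~= prod_i P(w_i | w_{i-1}), MLE estimate; 0 if any bigram is unseen."""
    words = ["<s>"] + sentence.split()
    p = 1.0
    for w1, w2 in zip(words, words[1:]):
        if bigrams[(w1, w2)] == 0:
            return 0.0
        p *= bigrams[(w1, w2)] / unigrams[w1]
    return p

print(bigram_score("the dog slept"))   # nonzero: sentences are ranked by score
print(bigram_score("slept the dog"))   # 0.0: contains unseen bigrams
```

Unlike a grammar's accept/reject decision, every sentence gets a graded score, so alternatives can be ranked against each other.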

Statistical Machine Translation (SMT)
A data-driven approach. The goal is to find the English sentence e, given a foreign-language sentence f, for which p(e|f) is maximum. Translations are generated on the basis of a statistical model whose parameters are estimated from bilingual parallel corpora.
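By Bayes' rule, maximizing p(e|f) amounts to maximizing p(e) * p(f|e), i.e. a language-model score times a translation-model score. A minimal sketch of this decision rule over a hypothetical candidate list with made-up probabilities:

```python
# Toy decision rule: pick the candidate e maximizing p(e|f) ∝ p(e) * p(f|e).
# The candidates and all probability values are invented for illustration.

candidates = ["Ram went to school", "Ram school went", "school went Ram"]

lm = {  # language model p(e): how fluent is e?
    "Ram went to school": 0.6,
    "Ram school went": 0.3,
    "school went Ram": 0.1,
}
tm = {  # translation model p(f|e) for one fixed foreign sentence f
    "Ram went to school": 0.5,
    "Ram school went": 0.5,
    "school went Ram": 0.5,
}

best = max(candidates, key=lambda e: lm[e] * tm[e])
print(best)  # "Ram went to school"
```

Here the translation model scores all candidates equally, so the language model breaks the tie in favor of the fluent word order.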

SMT: Language Model
The language model detects good English sentences. The probability of an English sentence s1 s2 ... sn can be written as
Pr(s1 s2 ... sn) = Pr(s1) * Pr(s2|s1) * ... * Pr(sn|s1 s2 ... sn-1)
Here Pr(sn|s1 s2 ... sn-1) is the probability that word sn follows the word string s1 s2 ... sn-1. An N-gram model approximates this history with the previous N-1 words; a trigram model, for example, uses Pr(sn|sn-2 sn-1).
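The chain-rule probability under the trigram approximation can be computed from raw counts (toy corpus, maximum-likelihood estimates, no smoothing):

```python
from collections import Counter

# Toy corpus; these counts stand in for a real training corpus.
corpus = ["the dog ran home", "the dog ran away", "the cat ran home"]

bigrams, trigrams = Counter(), Counter()
for line in corpus:
    w = ["<s>", "<s>"] + line.split()     # pad so the first word has a history
    bigrams.update(zip(w, w[1:]))
    trigrams.update(zip(w, w[1:], w[2:]))

def trigram_prob(sentence):
    """Pr(s) under the trigram approximation Pr(w_i | w_{i-2} w_{i-1})."""
    w = ["<s>", "<s>"] + sentence.split()
    p = 1.0
    for a, b, c in zip(w, w[1:], w[2:]):
        if trigrams[(a, b, c)] == 0:
            return 0.0
        p *= trigrams[(a, b, c)] / bigrams[(a, b)]
    return p

print(trigram_prob("the dog ran home"))
```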

SMT: Translation Model
P(f|e): the probability of a foreign sentence f given a hypothesized English translation e. How do we assign values to P(f|e)? Sentences are infinite, so it is not possible to estimate a probability for every pair (e, f) at the sentence level. Instead, introduce a hidden variable a that represents alignments between the individual words in the sentence pair, moving the problem from the sentence level to the word level.

Alignment
If the string e = e_1^l = e1 e2 ... el has l words, and the string f = f_1^m = f1 f2 ... fm has m words, then the alignment a can be represented by a series a_1^m = a1 a2 ... am of m values, each between 0 and l, such that if the word in position j of the f-string is connected to the word in position i of the e-string then aj = i, and if it is not connected to any English word then aj = 0.

Example of alignment
English: Ram went to school
Hindi: Raama paathashaalaa gayaa

Alignment between source and target sentence
e0 = Φ       f0 = Φ
e1 = Ram     f1 = Raama
e2 = went    f2 = paathashaalaa
e3 = to      f3 = gayaa
e4 = school
Alignment: a1 = 1 (Raama → Ram), a2 = 4 (paathashaalaa → school), a3 = 2 (gayaa → went); "to" is connected to no Hindi word.
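The alignment above can be represented directly as the series a1 ... am; a minimal sketch:

```python
# The alignment as data: a[j] gives, for the j-th Hindi word (1-indexed),
# the position of the English word it is connected to; 0 would mean the NULL word e0.

e = ["NULL", "Ram", "went", "to", "school"]        # e0 .. e4
f = ["NULL", "Raama", "paathashaalaa", "gayaa"]    # f0 .. f3
a = [None, 1, 4, 2]                                # a1 = 1, a2 = 4, a3 = 2

links = [(f[j], e[a[j]]) for j in range(1, len(f))]
for fw, ew in links:
    print(f"{fw} -> {ew}")
```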

Translation Model: Exact expression
Generate f from e in three steps: (i) choose the length m of the foreign-language string given e; (ii) choose the alignment a given e and m; (iii) choose the identity of each foreign word given e, m and a. Five models are defined for estimating the parameters in the resulting expression [2]: Model-1, Model-2, Model-3, Model-4, Model-5.
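The three-step generative story corresponds to the standard Brown et al. decomposition, reconstructed here since the slide's equations did not survive transcription:

```latex
\Pr(f \mid e) = \sum_{a} \Pr(f, a \mid e), \qquad
\Pr(f, a \mid e) = \Pr(m \mid e)
  \prod_{j=1}^{m} \Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)\,
                  \Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)
```

The three factors match the three choices: the length m, each alignment position a_j, and each foreign word f_j.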

Proof of Translation Model: Exact expression
Pr(f|e) is obtained by marginalizing first over the sentence length m and then over the hidden alignment a. Since m is fixed for a particular f, the sum over lengths collapses to the single term Pr(f, m|e) = Pr(m|e) Pr(f|m, e).
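The marginalization steps, reconstructed (the slide's equations were lost in transcription):

```latex
\Pr(f \mid e) = \sum_{m'} \Pr(f, m' \mid e)
             = \Pr(f, m \mid e)
  \quad \text{($m$ is fixed for a particular $f$)}
\Pr(f, m \mid e) = \Pr(m \mid e) \sum_{a} \Pr(f, a \mid m, e)
= \Pr(m \mid e) \sum_{a} \prod_{j=1}^{m}
    \Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)\,
    \Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)
```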

Model-1
The simplest model. Assumptions: Pr(m|e) is independent of m and e and is equal to a constant ε; the alignment of a foreign-language word (FLW) depends only on the length of the English sentence, i.e. each alignment probability is (l+1)^-1, where l is the length of the English sentence (the +1 accounts for the NULL word e0). The likelihood function then becomes
Pr(f|e) = ε / (l+1)^m * Π_{j=1..m} Σ_{i=0..l} t(fj|ei)
Maximize this likelihood subject to the constraint Σ_f t(f|e) = 1 for every English word e.
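The Model-1 likelihood can be evaluated directly once a translation table t(f|e) is given; the t values below are made up for illustration only:

```python
# Sketch of the Model-1 likelihood
#   Pr(f|e) = eps / (l+1)^m * prod_j sum_i t(f_j | e_i)
# on the Ram/Raama example, with a hypothetical translation table t.

EPS = 1.0  # the constant Pr(m|e); an arbitrary illustrative value

t = {  # hypothetical t(f|e) values, not estimated from data
    ("Raama", "Ram"): 0.9, ("Raama", "NULL"): 0.1,
    ("paathashaalaa", "school"): 0.8, ("paathashaalaa", "NULL"): 0.2,
    ("gayaa", "went"): 0.7, ("gayaa", "to"): 0.3,
}

def model1_likelihood(f_words, e_words):
    """Pr(f|e) under IBM Model 1; e_words must include the NULL word."""
    l = len(e_words) - 1          # English length, excluding NULL
    m = len(f_words)
    p = EPS / (l + 1) ** m
    for fw in f_words:
        p *= sum(t.get((fw, ew), 0.0) for ew in e_words)
    return p

e = ["NULL", "Ram", "went", "to", "school"]
f = ["Raama", "paathashaalaa", "gayaa"]
print(model1_likelihood(f, e))  # 0.008
```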

Model-1: Parameter estimation
Using a Lagrange multiplier for the constrained maximization, the solution for the Model-1 parameters is
t(f|e) = λe^-1 c(f|e; f, e)
where λe is a normalization constant and c(f|e; f, e) is the expected count of the number of times e connects to f in the sentence pair (f, e); in its definition, δ(f, fj) is 1 if f and fj are the same word and 0 otherwise. Since t(f|e) appears on both sides of the equation, estimate t(f|e) iteratively using the Expectation-Maximization (EM) procedure.
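The EM procedure for Model 1 can be sketched as follows; the two sentence pairs are invented, and the NULL word is omitted for brevity:

```python
from collections import defaultdict

# EM for IBM Model 1 on a toy two-sentence parallel corpus.
corpus = [
    (["Ram", "went"], ["Raama", "gayaa"]),
    (["Ram", "slept"], ["Raama", "soyaa"]),
]

# Initialize t(f|e) uniformly over the foreign vocabulary.
f_vocab = {fw for _, fs in corpus for fw in fs}
t = defaultdict(lambda: 1.0 / len(f_vocab))

for _ in range(10):
    count = defaultdict(float)   # expected counts c(f|e)
    total = defaultdict(float)
    for es, fs in corpus:
        for fw in fs:
            z = sum(t[(fw, ew)] for ew in es)   # normalize over alignments
            for ew in es:
                c = t[(fw, ew)] / z             # E-step: posterior of the link
                count[(fw, ew)] += c
                total[ew] += c
    for (fw, ew), c in count.items():           # M-step: renormalize counts
        t[(fw, ew)] = c / total[ew]

print(round(t[("Raama", "Ram")], 3))  # rises toward 1.0 across iterations
```

Because "Ram"/"Raama" co-occur in both sentence pairs while the other words appear only once, EM concentrates the probability mass on the correct link without ever seeing a gold alignment.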