Pushpak Bhattacharyya CSE Dept., IIT Bombay 31st Jan, 2011

CS460/626: Natural Language Processing/Speech, NLP and the Web (Lecture 12: IBM Model 1). Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 31st Jan, 2011

Grammar-based and n-gram-based models of language
Rule-based model of language: a grammar. A set of rules (the grammar) determines whether a sentence is valid in the language, for example NP -> N | Adj P NP | N PP | Art NP. This is a 1/0 decision. Recursive rules allow the generation of an infinite number of sentences in the language.
Statistical model (e.g. bigram, trigram): calculates a score in the range 0 to 1 that indicates how well a sentence belongs to the language. This is NOT a 1/0 decision, but a ranking.

Statistical Machine Translation (SMT)
A data-driven approach. Given a foreign-language sentence f, the goal is to find the English sentence e for which p(e|f) is maximum. Translations are generated on the basis of a statistical model whose parameters are estimated from bilingual parallel corpora.
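The step from maximizing p(e|f) to the language model and translation model used in the following slides is the standard noisy-channel rewrite via Bayes' rule. It is shown here as a short sketch; the symbol $\hat{e}$ for the chosen translation is notation introduced for this note.

$$\hat{e} = \arg\max_{e}\, p(e \mid f) = \arg\max_{e}\, \frac{p(e)\, p(f \mid e)}{p(f)} = \arg\max_{e}\, p(e)\, p(f \mid e)$$

The denominator p(f) is dropped because it does not depend on e. The factor p(e) is supplied by the language model and p(f|e) by the translation model.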

SMT: Language Model
Used to detect good English sentences. The probability of an English sentence $s_1 s_2 \ldots s_n$ can be written as
$$\Pr(s_1 s_2 \ldots s_n) = \Pr(s_1)\,\Pr(s_2 \mid s_1) \cdots \Pr(s_n \mid s_1 s_2 \ldots s_{n-1})$$
Here $\Pr(s_n \mid s_1 s_2 \ldots s_{n-1})$ is the probability that word $s_n$ follows the word string $s_1 s_2 \ldots s_{n-1}$. An n-gram model approximates this probability by conditioning on only the preceding n-1 words; a trigram model conditions on the preceding two words.
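As an illustration of the trigram calculation, here is a minimal sketch that estimates trigram probabilities by relative frequency (count of the trigram divided by count of its two-word history) and multiplies them along a sentence. The toy corpus, the `<s>` and `</s>` padding symbols, and all function names are assumptions made for this example; no smoothing is applied, so unseen trigrams get probability zero.

```python
from collections import Counter

def train_trigram_counts(sentences):
    """Count trigrams and their two-word histories over padded sentences."""
    tri, hist = Counter(), Counter()
    for words in sentences:
        padded = ["<s>", "<s>"] + words + ["</s>"]
        for i in range(2, len(padded)):
            history = (padded[i - 2], padded[i - 1])
            tri[history + (padded[i],)] += 1
            hist[history] += 1
    return tri, hist

def trigram_prob(word, history, tri, hist):
    """Relative-frequency estimate of Pr(word | history); 0.0 if history is unseen."""
    if hist[history] == 0:
        return 0.0
    return tri[history + (word,)] / hist[history]

def sentence_prob(words, tri, hist):
    """Product of trigram probabilities over the padded sentence."""
    padded = ["<s>", "<s>"] + words + ["</s>"]
    p = 1.0
    for i in range(2, len(padded)):
        p *= trigram_prob(padded[i], (padded[i - 2], padded[i - 1]), tri, hist)
    return p

# Hypothetical toy corpus, for illustration only.
corpus = [
    "ram went to school".split(),
    "ram went to market".split(),
    "sita went to school".split(),
]
tri, hist = train_trigram_counts(corpus)

seen = sentence_prob("ram went to school".split(), tri, hist)
unseen_but_plausible = sentence_prob("sita went to market".split(), tri, hist)
scrambled = sentence_prob("school to went ram".split(), tri, hist)
```

The scores rank the sentences rather than accept or reject them: the sentence seen in training scores highest, a new but well-formed combination scores lower but above zero, and the scrambled word order scores zero.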

SMT: Translation Model
P(f|e): the probability of some f given a hypothesized English translation e. How should values be assigned to p(f|e)? The number of possible sentences is infinite, so it is not possible to find a pair (e, f) for every sentence. Introduce a hidden variable a that represents alignments between the individual words in the sentence pair, moving the model from the sentence level to the word level.

Alignment
If the string $e = e_1^l = e_1 e_2 \ldots e_l$ has l words, and the string $f = f_1^m = f_1 f_2 \ldots f_m$ has m words, then the alignment a can be represented by a series $a_1^m = a_1 a_2 \ldots a_m$ of m values, each between 0 and l, such that if the word in position j of the f-string is connected to the word in position i of the e-string, then $a_j = i$, and if it is not connected to any English word, then $a_j = 0$.

Example of alignment
English: Ram went to school
Hindi: Raama paathashaalaa gayaa

Alignment between source and target sentence
e0 = Φ         f0 = Φ
e1 = Ram       f1 = Raama
e2 = went      f2 = paathashaalaa
e3 = to        f3 = gayaa
e4 = school
Alignment: a1 = 1, a2 = 4, a3 = 2
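The alignment vector can be stored as a plain list of English positions, one per foreign word. The sketch below encodes the example and recovers the aligned word pairs; the function names and the `"NULL"` string standing in for the empty word at position 0 are assumptions for illustration. Since each of the m values ranges over 0..l, there are (l+1)^m possible alignments for a sentence pair.

```python
def aligned_pairs(e, f, a):
    """Return (foreign word, English word) pairs for alignment a.

    e[0] is the empty word; a[j] is the English position for the
    (j+1)-th foreign word, with 0 meaning "not connected".
    """
    l = len(e) - 1
    if len(a) != len(f):
        raise ValueError("alignment needs one value per foreign word")
    if any(not 0 <= i <= l for i in a):
        raise ValueError("alignment values must lie between 0 and l")
    return [(fj, e[i]) for fj, i in zip(f, a)]

def num_alignments(l, m):
    """Each of the m foreign positions picks one of l+1 English positions."""
    return (l + 1) ** m

e = ["NULL", "Ram", "went", "to", "school"]   # e0 .. e4
f = ["Raama", "paathashaalaa", "gayaa"]       # f1 .. f3
a = [1, 4, 2]                                  # a1 .. a3

pairs = aligned_pairs(e, f, a)
total = num_alignments(len(e) - 1, len(f))
```

Here `pairs` links Raama to Ram, paathashaalaa to school, and gayaa to went, and `total` counts the 5^3 candidate alignments that the hidden variable a ranges over for this pair.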

Translation Model: Exact expression
The expression corresponds to three choices:
Choose the length of the foreign-language string given e.
Choose the alignment given e and m.
Choose the identity of each foreign word given e, m, and a.
Five models are used for estimating the parameters in the expression [2]: Model-1, Model-2, Model-3, Model-4, Model-5.

Proof of Translation Model: Exact expression
The proof applies marginalization twice and then uses the fact that m is fixed for a particular f.
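The equations for this proof are not present in the transcript. The following is a reconstruction based on the standard decomposition used for the IBM models, so the exact intermediate steps are an assumption; the notation $a_1^{j-1}$, $f_1^{j-1}$ follows the alignment slide.

$$\Pr(f \mid e) = \sum_{a} \Pr(f, a \mid e) \qquad \text{(marginalization over alignments)}$$

$$\Pr(f, a \mid e) = \sum_{m} \Pr(f, a, m \mid e) = \sum_{m} \Pr(m \mid e)\, \Pr(f, a \mid m, e) \qquad \text{(marginalization over lengths)}$$

$$\Pr(f, a \mid m, e) = \prod_{j=1}^{m} \Pr(f_j, a_j \mid a_1^{j-1}, f_1^{j-1}, m, e) = \prod_{j=1}^{m} \Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)\, \Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)$$

Because m is fixed for a particular f, only one term of the sum over m is non-zero, which gives

$$\Pr(f, a \mid e) = \Pr(m \mid e) \prod_{j=1}^{m} \Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)\, \Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)$$

The three factors line up with the three choices: the length of the foreign string, the alignment of each position, and the identity of each foreign word.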

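Model-1
The simplest model. Assumptions:
Pr(m|e) is independent of m and e and is equal to ε.
The alignment of foreign-language words (FLWs) depends only on the length of the English sentence, and each alignment value has probability $(l+1)^{-1}$, where l is the length of the English sentence.
Each foreign word depends only on the English word it is aligned to, with translation probability $t(f_j \mid e_{a_j})$.
The likelihood function will be
$$\Pr(f \mid e) = \frac{\epsilon}{(l+1)^m} \sum_{a_1=0}^{l} \cdots \sum_{a_m=0}^{l} \prod_{j=1}^{m} t(f_j \mid e_{a_j})$$
Maximize the likelihood function constrained to
$$\sum_{f} t(f \mid e) = 1$$
for each English word e, where the sum runs over foreign words f.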
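Under the Model-1 assumptions the likelihood is ε/(l+1)^m times the sum, over all (l+1)^m alignments, of the product of t(f_j | e_{a_j}). The sketch below computes this by brute-force enumeration and compares it with the well-known rearrangement of the sum of products into a product of sums, which avoids the enumeration. The translation table `t`, its values, and the function names are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import prod, isclose

def likelihood_brute_force(f, e, t, epsilon=1.0):
    """Sum the product of t(f_j | e_{a_j}) over every alignment a."""
    l, m = len(e) - 1, len(f)
    total = 0.0
    for a in product(range(l + 1), repeat=m):   # all (l+1)^m alignments
        total += prod(t[(f[j], e[a[j]])] for j in range(m))
    return epsilon / (l + 1) ** m * total

def likelihood_factored(f, e, t, epsilon=1.0):
    """Same quantity as a product over j of the sum over i of t(f_j | e_i)."""
    l, m = len(e) - 1, len(f)
    return epsilon / (l + 1) ** m * prod(sum(t[(fj, ei)] for ei in e) for fj in f)

e = ["NULL", "Ram", "went"]    # e0 is the empty word
f = ["Raama", "gayaa"]

# Hypothetical table: for each English word, the values sum to 1 over f.
t = {
    ("Raama", "NULL"): 0.5, ("gayaa", "NULL"): 0.5,
    ("Raama", "Ram"): 0.9,  ("gayaa", "Ram"): 0.1,
    ("Raama", "went"): 0.2, ("gayaa", "went"): 0.8,
}

brute = likelihood_brute_force(f, e, t)
fast = likelihood_factored(f, e, t)
```

The brute-force version costs on the order of (l+1)^m products, while the factored version needs only about m(l+1) additions, which is what makes Model-1 training tractable.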

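Model-1: Parameter estimation
Using a Lagrange multiplier for constrained maximization, the solution for the Model-1 parameters is
$$t(f \mid e) = \lambda_e^{-1}\, c(f \mid e; \mathbf{f}, \mathbf{e}), \qquad c(f \mid e; \mathbf{f}, \mathbf{e}) = \sum_{a} \Pr(a \mid \mathbf{f}, \mathbf{e}) \sum_{j=1}^{m} \delta(f, f_j)\, \delta(e, e_{a_j})$$
where bold $\mathbf{f}$, $\mathbf{e}$ denote the sentence pair and plain f, e denote individual words. $\lambda_e$ is the normalization constant; $c(f \mid e; \mathbf{f}, \mathbf{e})$ is the expected count of f being aligned to e in the sentence pair; $\delta(f, f_j)$ is 1 if f and $f_j$ are the same, zero otherwise. Because the expected count itself depends on t, estimate t(f|e) iteratively using the Expectation Maximization (EM) procedure.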
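The EM procedure for t(f|e) can be sketched as follows. The E-step accumulates expected counts using the Model-1 posterior, in which the probability that f_j is aligned to e_i is t(f_j|e_i) divided by the sum of t(f_j|e_i') over all English positions; the M-step divides each count by its per-English-word total, which plays the role of the normalization constant. The two-sentence corpus (including the word "aayaa"), the `"NULL"` token, the uniform initialization, and the iteration count are all assumptions for this toy example.

```python
from collections import defaultdict

def train_model1(corpus, iterations=10):
    """EM for Model-1 translation probabilities t[(f_word, e_word)].

    corpus: list of (foreign_words, english_words) sentence pairs.
    """
    pairs = [(f, ["NULL"] + e) for f, e in corpus]   # add the empty word e0
    f_vocab = {fj for f, _ in pairs for fj in f}
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)                  # uniform start

    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f|e)
        total = defaultdict(float)   # per-English-word normalizer
        # E-step: distribute each foreign word over the English positions.
        for f, e in pairs:
            for fj in f:
                z = sum(t[(fj, ei)] for ei in e)
                for ei in e:
                    posterior = t[(fj, ei)] / z
                    count[(fj, ei)] += posterior
                    total[ei] += posterior
        # M-step: renormalize so that t(.|e) sums to 1 for each e.
        t = defaultdict(float, {(fj, ei): c / total[ei]
                                for (fj, ei), c in count.items()})
    return t

# Hypothetical toy parallel corpus (foreign side first).
corpus = [
    (["Raama", "gayaa"], ["Ram", "went"]),
    (["Raama", "aayaa"], ["Ram", "came"]),
]
t = train_model1(corpus)
f_vocab = ["Raama", "gayaa", "aayaa"]
went_distribution = {fw: t[(fw, "went")] for fw in f_vocab}
```

Even on this tiny corpus, the fact that "Raama" co-occurs with "Ram" in both sentence pairs lets EM shift probability mass for "went" toward "gayaa". In this setup "NULL" and "Ram" appear in exactly the same sentences, so the toy data cannot tell them apart.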