Expectation-Maximization Algorithm


Expectation-Maximization Algorithm
M. B. Chandak

Principle of the EM Algorithm: Maximum Likelihood Estimation. The algorithm operates on a parallel corpus, for example an English-Hindi sentence-aligned parallel corpus. Its goal is to find maximum likelihood estimates (MLE) of the word-to-word translation probabilities used in machine translation. In the following example, English and Hindi are the two languages; let Es denote the English side and Hs the Hindi side of the corpus.
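
For reference, in standard notation (not shown on the original slides), the model assigns a sentence pair with alignment a the probability below, and EM maximizes the corpus likelihood obtained by summing over alignments. The example that follows uses this simplified form, without a NULL word or length factor:

    P(a, e \mid h) = \prod_{j} t(e_j \mid h_{a_j}),
    \qquad
    P(e \mid h) = \sum_{a} P(a, e \mid h)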

Implementation: EM is an iterative algorithm that alternates two steps: the E-step, which uses the current translation probabilities to compute the expected (fractional) counts of each word alignment, and the M-step, which re-estimates the translation probabilities from those expected counts. Initially, all alignments are assigned uniform probability.
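
In symbols, the two steps are (a standard statement of the updates, added here for reference; tc denotes the expected counts):

    \text{E-step:}\quad P(a \mid e, h) = \frac{P(a, e \mid h)}{\sum_{a'} P(a', e \mid h)},
    \qquad tc(e_j \mid h_{a_j}) \mathrel{+}= P(a \mid e, h) \ \text{for each link } j

    \text{M-step:}\quad t(e \mid h) = \frac{tc(e \mid h)}{\sum_{e'} tc(e' \mid h)}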

Example: The training data consists of two English-Hindi sentence pairs: "Green House" / "हरा घर" and "The House" / "यह घर". The English vocabulary is {green, house, the} and the Hindi vocabulary is {हरा, घर, यह}, so each translation probability is initialized to 1/3.

Uniform probability table:
t(green|हरा) = 1/3   t(house|हरा) = 1/3   t(the|हरा) = 1/3
t(green|घर) = 1/3    t(house|घर) = 1/3    t(the|घर) = 1/3
t(green|यह) = 1/3    t(house|यह) = 1/3    t(the|यह) = 1/3
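
A minimal sketch of this initialization in Python (the variable names english_vocab, hindi_vocab, and t are illustrative, not from the slides):

    english_vocab = ["green", "house", "the"]
    hindi_vocab = ["हरा", "घर", "यह"]
    # Uniform initialization: every t(e|h) starts at 1/3.
    t = {(e, h): 1.0 / len(english_vocab) for e in english_vocab for h in hindi_vocab}
    print(t[("green", "हरा")])  # 0.333...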

Example: Compute P(a, e|h) for each alignment by multiplying the corresponding "t" probabilities. For "Green House" / "हरा घर" there are two possible one-to-one alignments, (green-हरा, house-घर) and (green-घर, house-हरा); likewise "The House" / "यह घर" has (the-यह, house-घर) and (the-घर, house-यह). With the uniform table, every alignment has the same probability:

P(a, e|h) = 1/3 * 1/3 = 1/9
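
The same products for the first sentence pair, as a short sketch (only the two one-to-one alignments from the slide are enumerated):

    t = {("green", "हरा"): 1/3, ("green", "घर"): 1/3,
         ("house", "हरा"): 1/3, ("house", "घर"): 1/3}
    # Alignment 1: green-हरा, house-घर.  Alignment 2: green-घर, house-हरा.
    p_a1 = t[("green", "हरा")] * t[("house", "घर")]  # 1/9
    p_a2 = t[("green", "घर")] * t[("house", "हरा")]  # 1/9
    print(p_a1, p_a2)  # 0.111... 0.111...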

Re-calculating values: Within each sentence pair the alignment probabilities are normalized, giving P(a|e, h) = P(a, e|h) / Σ P(a, e|h), where the sum runs over the alignments of that pair. For "Green House" / "हरा घर" each of the two alignments gets (1/9) / (2/9) = 1/2, and the same holds for the two alignments of "The House" / "यह घर".
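
Normalizing those two values gives the per-alignment weights used when counting (a sketch continuing the numbers above):

    p_a1 = p_a2 = 1 / 9
    z = p_a1 + p_a2              # total over both alignments of the sentence pair, 2/9
    w_a1, w_a2 = p_a1 / z, p_a2 / z
    print(w_a1, w_a2)            # 0.5 0.5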

Calculate the expected counts ("tcounts", tc) by adding each alignment's weight of 1/2 to every word pair it links:

tc(green|हरा) = 1/2   tc(house|हरा) = 1/2           tc(the|हरा) = 0     total(हरा) = 1
tc(green|घर) = 1/2    tc(house|घर) = 1/2 + 1/2 = 1   tc(the|घर) = 1/2    total(घर) = 2
tc(green|यह) = 0      tc(house|यह) = 1/2             tc(the|यह) = 1/2    total(यह) = 1
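
Collecting the expected counts for both sentence pairs, with every alignment link contributing its weight of 1/2 (a sketch; tc and total are illustrative names):

    from collections import defaultdict

    tc = defaultdict(float)     # expected counts tc(e|h)
    total = defaultdict(float)  # totals per Hindi word
    # (English word, Hindi word, weight) links from the four alignments above.
    links = [
        ("green", "हरा", 0.5), ("house", "घर", 0.5),  # Green House / हरा घर, alignment 1
        ("green", "घर", 0.5), ("house", "हरा", 0.5),  # Green House / हरा घर, alignment 2
        ("the", "यह", 0.5), ("house", "घर", 0.5),      # The House / यह घर, alignment 1
        ("the", "घर", 0.5), ("house", "यह", 0.5),      # The House / यह घर, alignment 2
    ]
    for e, h, w in links:
        tc[(e, h)] += w
        total[h] += w
    print(tc[("house", "घर")], total["घर"])  # 1.0 2.0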

M-Step: Re-estimate each translation probability by dividing the expected count by the total for the Hindi word, t(e|h) = tc(e|h) / total(h):

t(green|हरा) = (1/2)/1 = 1/2   t(house|हरा) = (1/2)/1 = 1/2   t(the|हरा) = 0/1 = 0
t(green|घर) = (1/2)/2 = 1/4    t(house|घर) = 1/2              t(the|घर) = (1/2)/2 = 1/4
t(green|यह) = 0/1 = 0          t(house|यह) = (1/2)/1 = 1/2    t(the|यह) = (1/2)/1 = 1/2
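
The re-estimation itself is a single division per entry (a sketch continuing the counts above):

    tc = {("green", "हरा"): 0.5, ("house", "हरा"): 0.5, ("the", "हरा"): 0.0,
          ("green", "घर"): 0.5, ("house", "घर"): 1.0, ("the", "घर"): 0.5,
          ("green", "यह"): 0.0, ("house", "यह"): 0.5, ("the", "यह"): 0.5}
    total = {"हरा": 1.0, "घर": 2.0, "यह": 1.0}
    t = {(e, h): tc[(e, h)] / total[h] for (e, h) in tc}
    print(t[("green", "घर")], t[("house", "घर")])  # 0.25 0.5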

E-step, second iteration: Identifying the higher-probability alignments. Compute P(a, e|h) again by multiplying the updated "t" probabilities. For "Green House" / "हरा घर", the alignment (green-हरा, house-घर) now has 1/2 * 1/2 = 1/4, while (green-घर, house-हरा) has 1/4 * 1/2 = 1/8. For "The House" / "यह घर", (the-यह, house-घर) has 1/2 * 1/2 = 1/4 and (the-घर, house-यह) has 1/4 * 1/2 = 1/8.

Further iterations: The process continues to alternate E-steps and M-steps. After one iteration, the probability of the alignments linking green-हरा, house-घर, and the-यह has grown from 1/9 to 1/4, while the competing alignments have only reached 1/8, so with each iteration the correct word pairs receive a larger share of the probability mass.
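
Putting the steps together, a compact end-to-end sketch for this toy corpus (illustrative only; as on the slides it considers only one-to-one alignments and ignores the NULL word and the Model 1 length factor):

    from collections import defaultdict
    from itertools import permutations

    # Toy parallel corpus from the slides: (English sentence, Hindi sentence).
    corpus = [
        (["green", "house"], ["हरा", "घर"]),
        (["the", "house"], ["यह", "घर"]),
    ]

    english_vocab = sorted({e for es, _ in corpus for e in es})
    hindi_vocab = sorted({h for _, hs in corpus for h in hs})
    # Uniform initialization: t(e|h) = 1/3.
    t = {(e, h): 1.0 / len(english_vocab) for e in english_vocab for h in hindi_vocab}

    for iteration in range(1, 4):
        tc = defaultdict(float)     # expected counts (E-step)
        total = defaultdict(float)
        for es, hs in corpus:
            alignments = list(permutations(hs, len(es)))
            # P(a, e|h) for each alignment = product of t probabilities along its links.
            probs = []
            for a in alignments:
                p = 1.0
                for e, h in zip(es, a):
                    p *= t[(e, h)]
                probs.append(p)
            z = sum(probs)          # normalizer: sum over the alignments of this pair
            for a, p in zip(alignments, probs):
                for e, h in zip(es, a):
                    tc[(e, h)] += p / z   # fractional count weighted by P(a|e,h)
                    total[h] += p / z
        # M-step: t(e|h) = tc(e|h) / total(h).
        t = {(e, h): tc[(e, h)] / total[h] for e in english_vocab for h in hindi_vocab}
        print(f"iteration {iteration}: t(house|घर) = {t[('house', 'घर')]:.3f}, "
              f"t(green|घर) = {t[('green', 'घर')]:.3f}")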