Statistical Machine Translation Part III – Phrase-based SMT / Decoding

Statistical Machine Translation Part III – Phrase-based SMT / Decoding
Alex Fraser
Institute for Natural Language Processing, University of Stuttgart
Seminar: Statistical MT, 2010.05.12

Where we have been
- We defined the overall problem and talked about evaluation
- We have now covered word alignment
  - IBM Model 1, true Expectation Maximization
  - IBM Model 4, approximate Expectation Maximization
  - Symmetrization heuristics (such as Grow)
    - Applied to two Model 4 alignments
    - Results in the final word alignment

Where we are going
- We will define a high-performance translation model
- We will show how to solve the search problem for this model

Outline
- Phrase-based translation model
- Estimating parameters
- Decoding

We could use IBM Model 4 in the direction p(f|e), together with a language model p(e), to translate:

  argmax_e P(e | f) = argmax_e P(f | e) P(e)
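As a minimal illustration of this noisy-channel decision rule (not from the original slides; the candidate translations and their scores below are invented for the example), a decoder conceptually ranks candidate English strings e by P(f|e) · P(e), usually in log space:

```python
# Hypothetical toy scores for one fixed foreign sentence f:
# log P(f|e) from the translation model and log P(e) from the language model.
candidates = {
    "the house is small":  {"log_tm": -2.1, "log_lm": -5.3},
    "the house is little": {"log_tm": -2.0, "log_lm": -6.0},
    "small the house is":  {"log_tm": -2.1, "log_lm": -9.8},
}

def noisy_channel_score(scores):
    # argmax_e P(f|e) P(e)  ==  argmax_e [ log P(f|e) + log P(e) ]
    return scores["log_tm"] + scores["log_lm"]

best = max(candidates, key=lambda e: noisy_channel_score(candidates[e]))
print(best)  # -> "the house is small"
```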

However, decoding with Model 4 does not work well in practice
- One strong reason is the bad 1-to-N assumption
- Another problem is defining the search algorithm
  - If we add additional operations to allow the English words to vary, this becomes very expensive
- Despite these problems, Model 4 decoding was briefly state of the art
We will now define a better model…

Slide from Koehn 2008

Slide from Koehn 2008

Language Model
- Often a trigram language model is used for p(e)
  P(the man went home) = p(the | START) p(man | START the) p(went | the man) p(home | man went)
- Language models work well for comparing the grammaticality of strings of the same length
- However, when comparing short strings with long strings, they favor short strings
- For this reason, an important component of the language model is the length bonus
  - This is a constant > 1 multiplied in for each English word in the hypothesis
  - It makes longer strings competitive with shorter strings
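A minimal sketch of this trigram scoring with a length bonus (not from the slides; the trigram probabilities and the bonus value are invented, and the sentence is padded with two start symbols so every word has a trigram history):

```python
import math

# Hypothetical trigram log-probabilities, invented for illustration.
LOG_TRIGRAM = {
    ("<s>", "<s>", "the"):   math.log(0.30),
    ("<s>", "the", "man"):   math.log(0.20),
    ("the", "man", "went"):  math.log(0.10),
    ("man", "went", "home"): math.log(0.25),
}

LENGTH_BONUS = 1.1  # constant > 1, applied once per English word

def lm_score(words, log_trigram=LOG_TRIGRAM, length_bonus=LENGTH_BONUS):
    """log p(e) plus log(length_bonus) for each word in the hypothesis."""
    history = ("<s>", "<s>")
    total = 0.0
    for w in words:
        # unseen trigrams get a small floor probability in this toy version
        total += log_trigram.get(history + (w,), math.log(1e-7))
        total += math.log(length_bonus)   # the length bonus per English word
        history = (history[1], w)
    return total

print(lm_score("the man went home".split()))
```

Because the bonus is greater than 1, its log is positive, so each additional word adds a small reward that offsets the cost of multiplying in another probability.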

Modified from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Outline
- Phrase-based translation model
- Decoding
  - Basic phrase-based decoding
  - Dealing with complexity
    - Recombination
    - Pruning
    - Future cost estimation

Slide from Koehn 2008

Decoding
- Goal: find the best target translation of a source sentence
- Involves search
  - Find the maximum-probability path in a dynamically generated search graph
  - Generate the English string, from left to right, by covering parts of the foreign string
  - Generating the English string left to right allows scoring with the n-gram language model
- The slides that follow show an example of one path; a minimal decoder sketch is given directly below
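The following toy decoder (not from the slides; the phrase table and the language-model stand-in are invented, and reordering, recombination, and future-cost estimation are omitted) illustrates this left-to-right hypothesis expansion over a dynamically built search space:

```python
from collections import namedtuple

# Hand-made toy phrase table: foreign phrase -> [(English phrase, log score)]
PHRASE_TABLE = {
    ("das", "haus"): [(("the", "house"), -0.1)],
    ("das",):        [(("the",), -0.4), (("this",), -0.9)],
    ("haus",):       [(("house",), -0.2)],
    ("ist",):        [(("is",), -0.1)],
    ("klein",):      [(("small",), -0.2), (("little",), -0.8)],
}

def lm_logprob(prev_word, word):
    # Stand-in for an n-gram language model query.
    return -1.0 if word in {"the", "house", "is", "small"} else -2.0

Hyp = namedtuple("Hyp", "covered score english")

def decode(foreign, beam_size=10):
    """Monotone phrase-based beam search: cover the foreign sentence
    left to right, extending hypotheses with phrase translations."""
    stacks = [[] for _ in range(len(foreign) + 1)]   # indexed by #words covered
    stacks[0].append(Hyp(covered=0, score=0.0, english=()))
    for n in range(len(foreign)):
        # keep only the best few hypotheses covering n words (pruning)
        for hyp in sorted(stacks[n], key=lambda h: -h.score)[:beam_size]:
            for length in range(1, len(foreign) - n + 1):
                f_phrase = tuple(foreign[n:n + length])
                for e_phrase, tm_score in PHRASE_TABLE.get(f_phrase, []):
                    score = hyp.score + tm_score
                    prev = hyp.english[-1] if hyp.english else "<s>"
                    for w in e_phrase:                 # score with the LM left to right
                        score += lm_logprob(prev, w)
                        prev = w
                    stacks[n + length].append(Hyp(n + length, score, hyp.english + e_phrase))
    return " ".join(max(stacks[len(foreign)], key=lambda h: h.score).english)

print(decode("das haus ist klein".split()))   # -> "the house is small"
```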

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

(possible future paths are the same) Modified from Koehn 2008

(possible future paths are the same) Modified from Koehn 2008
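Recombination, as hinted at in the caption above, merges hypotheses whose possible future paths are the same: the decoder only needs to keep the best-scoring one. A minimal sketch (not from the slides; hypotheses are simplified to (covered positions, English words, score) triples, and a full decoder would also key on the end position of the last translated foreign phrase for the distortion model):

```python
def recombine(hypotheses, lm_order=3):
    """Keep only the best hypothesis per recombination class: same set of
    covered foreign positions and same last (n-1) English words, since only
    these influence how any future extension will be scored."""
    best = {}
    for covered, english, score in hypotheses:
        key = (frozenset(covered), english[-(lm_order - 1):])
        if key not in best or score > best[key][2]:
            best[key] = (covered, english, score)
    return list(best.values())

# Toy usage: the first two hypotheses cover the same foreign positions and end
# in the same two English words, so only the higher-scoring one survives.
hyps = [
    ({0, 1}, ("the", "house"), -2.1),
    ({0, 1}, ("a", "the", "house"), -3.4),
    ({0, 1, 2}, ("the", "house", "is"), -3.0),
]
print(recombine(hyps))
```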

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008

Slide from Koehn 2008