Advanced Signal Processing 05/06 Reinisch Bernhard Statistical Machine Translation Phrase Based Model.

2/37 ASP 06/07 Reinisch Bernhard Translation Model – Phrase-based
Overview
● The quality of MT systems has improved with the use of phrase translation
– Phrases from word-based alignments
– Syntactic phrases
– Phrases from phrase alignments
– IBM word-based statistical MT systems enhanced with phrase translation
● What is the best way to extract phrase translation pairs?
– Evaluation framework / outcome

Word-based approaches
● Try to model word-to-word correspondences
● Models are often restricted
– Each source word maps to exactly one target word
– Analogous to hidden Markov models in speech recognition
● Enhanced to a “one-to-many” alignment model
– Solves lexical problems like
● “Zahnarzttermin” -> “dentist’s appointment”
● The order of words may change

Statistical machine translation (1)
● argmax … search/decoding problem (generation of the output sentence)
● $\Pr(e_1^I)$ … language model
● $\Pr(f_1^J \mid e_1^I)$ … translation model
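
The decision rule that these three annotations refer to did not survive in the transcript. A plausible reconstruction, assuming the standard source-channel formulation and the source/target notation ($f_1^J$, $e_1^I$) introduced on the "Translation model (1)" slide, is:

```latex
\hat{e}_1^I = \operatorname*{argmax}_{e_1^I}
  \left\{ \Pr(e_1^I) \cdot \Pr(f_1^J \mid e_1^I) \right\}
```

Here the argmax is the search over candidate output sentences, the first factor is the language model, and the second factor is the translation model.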

Statistical machine translation (2)
[Figure taken from [2]]

Learning translation lexica
● The following slides describe methods for learning single-word and phrase-based translation lexica
– Statistical alignment models
● Used for learning word alignments
● Symmetrization
– Bilingual phrases
– Alignment templates

Statistical alignment models (1)
● In the alignment model
– A “hidden” variable a is introduced
– a describes the mapping from source position j to target position $a_j$
● “a” is represented as a matrix with binary values
– 1 entry … words are aligned
– 0 entry … words are not aligned
– A source word with no target word is aligned to the empty word $e_0$
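
A minimal sketch of the matrix representation described here, assuming that column 0 stands for the empty word and that rows are source positions. The function name and the example sentence are hypothetical and not taken from the slides.

```python
def alignment_matrix(a, target_len):
    """Turn the mapping a[j] = target position of source position j
    into a binary matrix. Column 0 is the empty word e_0 (assumption)."""
    return [
        [1 if a[j] == i else 0 for i in range(target_len + 1)]
        for j in range(len(a))
    ]


# Hypothetical example: "ja" has no target word, so it maps to position 0.
source = ["ja", "das", "Haus"]
target = ["the", "house"]          # positions 1 and 2; position 0 is e_0
a = [0, 1, 2]

matrix = alignment_matrix(a, len(target))
for word, row in zip(source, matrix):
    print(word, row)
```

Each row contains exactly one 1, which reflects the restriction that a source word is aligned to exactly one target position.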

Statistical alignment models (2)
● In general the model depends on a set of unknown parameters θ
● Several different specific statistical alignment models exist
– First compute word alignments, e.g. with Model 4
– Train the unknown parameters θ
● The alignment with the highest probability is called the Viterbi alignment
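
Model 4 is too involved for a short example. As a rough sketch only, assuming the much simpler IBM Model 1, the Viterbi alignment can be found independently for each source word by picking the target position with the highest lexical probability. The table `t` and its values are made up for illustration.

```python
def viterbi_alignment_model1(source, target, t):
    """Return a[j] for every source position j under IBM Model 1.
    Position 0 is the empty word, written "NULL" here (assumption)."""
    candidates = ["NULL"] + target
    alignment = []
    for f in source:
        # Under Model 1 each source word can be decided independently.
        best = max(range(len(candidates)),
                   key=lambda i: t.get((f, candidates[i]), 0.0))
        alignment.append(best)
    return alignment


# Hypothetical lexical probabilities t[(f, e)] = Pr(f | e).
t = {
    ("das", "the"): 0.7, ("das", "house"): 0.1, ("das", "NULL"): 0.2,
    ("Haus", "the"): 0.05, ("Haus", "house"): 0.8, ("Haus", "NULL"): 0.1,
}
viterbi = viterbi_alignment_model1(["das", "Haus"], ["the", "house"], t)
print(viterbi)
```

The higher IBM models add distortion and fertility terms, so the per-word independence used here no longer holds for them.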

Symmetrization (1)
● The baseline alignment model (e.g. Model 4) does not allow a source word to be aligned with multiple target words
– “Zahnarzttermin” -> “dentist’s appointment”
● The outcome should be an alignment matrix like the one shown
[Figure taken from [2]]

Symmetrization (2)
● To solve this problem
– Train in both directions
– For each sentence pair -> two Viterbi alignments
– Both alignment tables A1 and A2 then have to be combined (symmetrized)
● Simple union of both tables (or some refined method)
– The result is then used to train single-word-based translation lexica
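
The combination step can be sketched as plain set operations, assuming each Viterbi alignment is stored as a set of (source position, target position) links. The function name is hypothetical, and the refined methods mentioned on the slide are not covered.

```python
def symmetrize(a1, a2, method="union"):
    """Combine two directional alignments given as sets of (j, i) links."""
    if method == "union":
        return a1 | a2
    if method == "intersection":
        return a1 & a2
    raise ValueError(f"unknown method: {method}")


# "Zahnarzttermin" (source position 0) vs. "dentist's appointment" (0, 1).
# Source-to-target training allows only one target word per source word.
source_to_target = {(0, 1)}
# Target-to-source training aligns each target word to the one source word.
target_to_source = {(0, 0), (0, 1)}

combined = symmetrize(source_to_target, target_to_source)
print(sorted(combined))
```

With the union, the single source word ends up linked to both target words, which neither direction restricted to one link per source word could express on its own.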

Symmetrization (2)
– By computing relative frequencies using:
● N(e|f) … how many times e and f are aligned
● N(f) … how many times the word f occurs
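
The formula itself is missing from the transcript. A minimal sketch, assuming the estimate is the ratio of the two counts described above, p(e|f) = N(e|f) / N(f). The function name and the toy corpus are hypothetical.

```python
from collections import Counter


def lexicon_from_alignments(corpus):
    """corpus: iterable of (source_words, target_words, links),
    where links is a set of (j, i) position pairs."""
    aligned = Counter()   # N(e|f): how often e and f are aligned
    occurs = Counter()    # N(f): how often the source word f occurs
    for source, target, links in corpus:
        occurs.update(source)
        for j, i in links:
            aligned[(target[i], source[j])] += 1
    return {(e, f): n / occurs[f] for (e, f), n in aligned.items()}


toy_corpus = [
    (["das", "Haus"], ["the", "house"], {(0, 0), (1, 1)}),
    (["das", "Buch"], ["the", "book"], {(0, 0), (1, 1)}),
    (["ein", "Haus"], ["a", "home"], {(0, 0), (1, 1)}),
]
lexicon = lexicon_from_alignments(toy_corpus)
print(lexicon[("house", "Haus")])
```

Because a symmetrized alignment may link one source word to several target words, these values need not sum to one over e for a given f.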

Bilingual phrases
● Now we need an algorithm that finds relationships between whole phrases of m source words and n target words
– The “phrase extract” algorithm takes the alignment matrix A as input
[Figure taken from [2]]
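
The algorithm from [2] is only shown as a figure. As a simplified sketch, assuming the usual consistency criterion (no word inside the phrase pair is aligned to a word outside it), extraction could look as follows. The function name and the `max_len` limit are assumptions, and the extension over unaligned boundary words is left out.

```python
def extract_phrases(source, target, links, max_len=3):
    """Collect phrase pairs consistent with the alignment links (j, i)."""
    phrases = set()
    for j1 in range(len(source)):
        for j2 in range(j1, min(j1 + max_len, len(source))):
            # Target positions linked to the source span [j1, j2].
            linked = [i for (j, i) in links if j1 <= j <= j2]
            if not linked:
                continue
            i1, i2 = min(linked), max(linked)
            if i2 - i1 + 1 > max_len:
                continue
            # Reject if a target word in [i1, i2] links outside the span.
            if any(i1 <= i <= i2 and not j1 <= j <= j2 for (j, i) in links):
                continue
            phrases.add((" ".join(source[j1:j2 + 1]),
                         " ".join(target[i1:i2 + 1])))
    return phrases


source = ["der", "Zahnarzttermin"]
target = ["the", "dentist's", "appointment"]
links = {(0, 0), (1, 1), (1, 2)}
pairs = extract_phrases(source, target, links)
for pair in sorted(pairs):
    print(pair)
```

Every extracted pair is a contiguous block in the alignment matrix, which is why the symmetrized matrix A is the only input the procedure needs besides the sentences.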

Alignment templates (1)
● A more systematic approach
– Considers whole phrases
● A whole group of adjacent words in the source
● maps to a whole group of words in the target
– The context of words has a greater influence
– Changes of word order can be learned
● The idea is to model two different alignment levels
– Word level alignments
– Phrase level alignments

Alignment templates (2)
● An alignment template z consists of
– “F” … source class sequence
– “E” … target class sequence
– “A” … describes the alignment between source and target
● “F” and “E” are sequences of word classes
– The advantage is better generalization

Alignment templates (3)
[Figure taken from [2]]

Alignment templates (4)
● For training we need the probability of applying an alignment template
● The “phrase extraction” has to be modified
● The probability can be estimated by relative frequencies
● This finishes the “Learning translation lexica” task

Translation model (1)
● For notation we decompose the sentences
– $f_1^J$ … source sentence
– $e_1^I$ … target sentence
– sequence of phrases (k = 1, …, K)
● Further considerations (only one segmentation)

Translation model (2)
● The model has to allow reordering of the phrases

Translation model (3)
[Figure taken from [2]]

Translation model (4)
[Figure taken from [2]]

Alignment template approach results
● Evaluation of the approach on a translation task (“Verbmobil task”)
● Additional preprocessing
– word joining
– word splitting
[Results taken from [2]]

Alignment template approach conclusions
● Overall we see better performance
● So it is important to model word groups in the source and target languages
● By using two abstraction levels
– Phrase level alignments
– Word level alignments
– -> the context has a greater influence and can be learned explicitly

Syntactic phrases (1)
● A collection of all phrase pairs will also include non-intuitive phrases
– “Okay, the”, “house the”, etc.
– Intuitively such phrases do not help
– Restrict the collection to syntactically motivated phrases
● The idea: syntactic trees, with phrases as subtrees

Syntactic phrases (2)
● The input sentence is preprocessed by a syntactic parser
● Different operations will be performed on each node
– reordering child nodes
– inserting extra words at each node
– translating leaf words

Syntactic phrases (3)
[Figure taken from [4]]

Syntactic phrases (4)
[Figure taken from [6]]

Syntactic phrases (5)
● Reordering
– Every child sequence has a probability of reordering (N nodes -> N! possible reorderings)
– The probability of reordering is given by the model (table etc.)
● Inserting
– An extra word can be inserted (left/right)
– Another table holds the insertion probability
● Translating
– The operation is applied to every leaf
– Assumption: this operation depends only on the word itself
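
To make the N! figure concrete, here is a small sketch that enumerates all reorderings of one child sequence. The node labels are hypothetical and no probabilities are attached, since the model's tables are not given in the slides.

```python
from itertools import permutations
from math import factorial


def reorderings(children):
    """Enumerate every possible order of a node's child sequence."""
    return list(permutations(children))


# Hypothetical child sequence of one parse-tree node.
children = ["PRP", "VB1", "VB2"]
options = reorderings(children)

print(len(options), factorial(len(children)))
for order in options:
    print(" ".join(order))
```

A reordering table would assign one probability to each of these orders for a given original sequence, so the table grows quickly with the number of children.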

Experiments
● Now we have three models
● [1] builds a system to compare them and measure performance under different aspects
– Weighting syntactic phrases
– Maximum phrase length
● Setup
– Freely available Europarl corpus
– German to English
– Performance measured using BLEU score
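
For readers unfamiliar with BLEU, the following is a simplified sketch only: one candidate sentence against one reference, with modified n-gram precision and a brevity penalty. It is an assumption-laden illustration, not the evaluation script used in [1], and real BLEU is computed over a whole test corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[k:k + n]) for k in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified single-sentence, single-reference BLEU."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    c, r = len(candidate), len(reference)
    brevity = 1.0 if c > r else math.exp(1 - r / c)
    return brevity * math.exp(sum(log_precisions) / max_n)


reference = "the house is very small today".split()
candidate = "the house is small today".split()
print(sentence_bleu(candidate, reference, max_n=2))
```

The score lies between 0 and 1, and the brevity penalty lowers it when the candidate is shorter than the reference.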

Comparison of core methods
● AP … alignment templates
● M4 … IBM Model 4 for word-based translation
● Syn … syntactic phrases
[Figure taken from [1]; axis: training corpus size in sentences]

Weighting syntactic phrases (1)
● The restriction to syntactic phrases is harmful, because too many phrases are eliminated
● Intuitively this should not be the case
– Improvements in data collection, during translation, penalizing
● Results suggest
– Collecting only syntactic phrases
– Performance is not better
– But table sizes are smaller

Weighting syntactic phrases (2)
● Example:
– “es gibt” literally translates to “it gives” but really means “there is”
– Not a syntactic relationship
– Also “with regard to” and “note that”: syntactically complex but easy to translate

Maximum phrase length
● How long do phrases have to be to achieve high performance?
● All experiments use the “Phrases from word-based alignments” approach
[Figure taken from [1]]

Simpler underlying word-based models (1)
● The core of this framework is IBM Model 4 for collecting phrase pairs
● Model 4 is computationally expensive, and its parameters can only be estimated approximately
● What about IBM Models 1-3?
– Faster and easier to implement
– Models 1 and 2 compute word alignments efficiently

Simpler underlying word-based models (2)
● How much is performance affected if we base the word alignment on these simpler methods?
● M1 has the worst performance
● But M2 and M3 provide performance similar to M4
[Figure taken from [1]]

Conclusions
● Intuitively, phrase-based approaches give better performance than word-based approaches
● Experiments also show that
– “straightforward” syntax-based models have disadvantages
● The “best” outcome is achieved with short phrases
● Phrase extraction and the alignment heuristic have a great influence

References
● [1] Philipp Koehn, Franz Josef Och, Daniel Marcu; Statistical Phrase-Based Translation
● [2] Franz Josef Och, Hermann Ney; The Alignment Template Approach to Statistical Machine Translation
● [3] Franz Josef Och, Christoph Tillmann, Hermann Ney; Improved Alignment Models for Statistical Machine Translation
● [4] Kenji Yamada, Kevin Knight; A Syntax-based Translation Model
● [5] Daniel Marcu, William Wong; A Phrase-Based, Joint Probability Model for Statistical Machine Translation
● [6] Amitabha Mukerjee, Ankit Soni and Achla M. Raina; Detecting Complex Predicates in Hindi using POS Projection across Parallel Corpora
● [7]