Effective Use of Linguistic and Contextual Information for Statistical Machine Translation Libin Shen and Jinxi Xu and Bing Zhang and Spyros Matsoukas.

Presentation transcript:

Effective Use of Linguistic and Contextual Information for Statistical Machine Translation
Libin Shen, Jinxi Xu, Bing Zhang, Spyros Matsoukas and Ralph Weischedel
BBN Technologies
EMNLP 2009
Presented by Cai

Question
• Lexical features are useful in MT
• But the number of parameters is large
• How can these features be used effectively?

Previous Work
• Discriminative training of the parameters: requires a scalable development set and careful selection
• Estimating a single score or likelihood of a translation with rich features (using maximum entropy, ME): the feature space is too large to be practical

Main Contribution
• Design effective and efficient statistical models (simple probabilistic models) to capture useful linguistic and contextual information for MT decoding
• Features: robust and ideal

Features introduced
• Non-terminal labels (+performance)
• Length distribution of non-terminals (+performance)
• Source-side context information (+performance)
• Source-side structural information (dependency information): surprisingly, no performance gain

What’s special
• Assumes the length distribution of a non-terminal is Gaussian (sampling, estimation, smoothing)
• Soft dependency constraints, introduced through labels on non-terminals
• Context language model
• String-to-dependency rule -> dependency-to-dependency rule
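The Gaussian length assumption above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the sample lengths, and the variance floor (used here as a crude stand-in for the "smoothing" step the slide mentions) are all assumptions made for illustration. The idea is to estimate a mean and variance from the span lengths observed for a non-terminal, then score a candidate span length by its Gaussian log-density.

```python
import math
from statistics import mean, pvariance


def fit_gaussian(lengths, min_var=0.1):
    """Estimate mean and variance of observed span lengths.

    min_var is a hypothetical variance floor, a simple stand-in for
    smoothing; the slides do not say how smoothing is actually done.
    """
    mu = mean(lengths)
    var = max(pvariance(lengths, mu), min_var)
    return mu, var


def log_length_prob(length, mu, var):
    """Log-density of a span length under a Gaussian N(mu, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (length - mu) ** 2 / (2 * var)


# Hypothetical span lengths observed for one non-terminal in one rule.
observed = [2, 3, 3, 4, 3]
mu, var = fit_gaussian(observed)

# A span close to the mean scores higher than an unusually long one.
score_typical = log_length_prob(3, mu, var)
score_unusual = log_length_prob(9, mu, var)
```

In a decoder, a score of this kind would typically enter as one more feature alongside the existing model scores, so that spans with implausible lengths for a given non-terminal are penalized rather than forbidden.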

Experiments
• Baseline: the string-to-dependency system presented in (Shen et al., 2008)
• Test each feature and their combinations
• Arabic-to-English (A-E) and Chinese-to-English (C-E)
• Metrics: BLEU and TER
• Results: 2 points of BLEU in A-E and 1 point of BLEU in C-E (nist06); 1.7 points of BLEU in A-E and 0.8 points of BLEU in C-E (nist06); 1.7 poi

Main Related Work
• Z. He, Q. Liu, and S. Lin. Improving Statistical Machine Translation Using Lexicalized Rule Selection. COLING ’08.
• A. Ittycheriah and S. Roukos. Direct Translation Model 2. NAACL ’07.
• L. Shen, J. Xu, and R. Weischedel. A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model. ACL 2008.