Bayesian Subtree Alignment Model based on Dependency Trees
Toshiaki Nakazawa, Sadao Kurohashi (Kyoto University)
IJCNLP 2011
Outline
- Background
- Related Work
- Bayesian Subtree Alignment Model
- Model Training
- Experiments
- Conclusion
Background
The alignment quality of GIZA++ is insufficient for distant language pairs: wide-range reordering, many-to-many alignment.
En: He is my brother.  Zh: 他 是 我 哥哥 。  Fr: Il est mon frère.  Ja: 彼 は 私 の 兄 です 。
Alignment Accuracy of GIZA++
[Table: Precision, Recall, and AER of automatic (GIZA++) alignment for French-English, English-Japanese, and Chinese-Japanese, with a combination heuristic]
Sure alignment: clearly right
Possible alignment: reasonable to make, but not so clear
Background
Alignment quality is insufficient for distant language pairs: wide-range reordering and many-to-many alignment. English-Japanese and Chinese-Japanese exceed 20% AER, so we need to incorporate syntactic information.
En: He is my brother.  Zh: 他 是 我 哥哥 。  Fr: Il est mon frère.  Ja: 彼 は 私 の 兄 です 。
Related Work
Cherry and Lin (2003): discriminative alignment model using the source-side dependency tree; allows one-to-one alignment only.
Nakazawa and Kurohashi (2009): generative model using dependency trees on both sides; allows phrasal alignment, but suffers from the degeneracy of acquiring incorrect larger phrases, derived from Maximum Likelihood Estimation.
Related Work
DeNero et al. (2008): incorporates prior knowledge about the parameters to avoid the degeneracy of the model; places a Dirichlet Process (DP) prior over the phrase generation model; uses a simple position-based distortion model.
This work: combines the advantages of Nakazawa and Kurohashi (2009) and DeNero et al. (2008).
Related Work
Generative story of the (sequential) phrase-based joint probability model [DeNero et al., 2008]:
1. Choose a number of components
2. Generate each phrase pair independently (nonparametric Bayesian prior)
3. Choose an ordering for the phrases
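Step 2's nonparametric Bayesian prior can be illustrated with a Chinese Restaurant Process view of the Dirichlet Process: each phrase pair is either a reuse of an already-generated pair (proportional to its count) or a fresh draw from a base distribution (proportional to the concentration parameter). This is only a minimal sketch of the DP prior idea; the function name, the toy vocabulary, and the uniform base distribution are illustrative assumptions, not the paper's actual model.

```python
import random
from collections import Counter

def crp_sample_phrase_pairs(n, alpha, base_sampler, rng=None):
    """Draw n phrase pairs under a Dirichlet Process prior via the
    Chinese Restaurant Process: a previously generated pair is reused
    with probability proportional to its count; a fresh pair is drawn
    from the base distribution with probability proportional to alpha."""
    rng = rng or random.Random(0)
    counts = Counter()
    pairs = []
    for _ in range(n):
        total = sum(counts.values())
        if rng.random() < alpha / (alpha + total):
            pair = base_sampler(rng)       # new pair from the base measure
        else:                              # reuse an existing pair
            pair = rng.choices(list(counts), weights=list(counts.values()))[0]
        counts[pair] += 1
        pairs.append(pair)
    return pairs

# toy base distribution over (English, Japanese) phrase pairs (hypothetical)
vocab = [("he", "彼 は"), ("is", "です"), ("my", "私 の"), ("brother", "兄")]
pairs = crp_sample_phrase_pairs(10, alpha=1.0,
                                base_sampler=lambda r: r.choice(vocab))
```

The "rich get richer" reuse behaviour is what discourages the degenerate solution of memorizing whole sentences as single phrase pairs.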
Example of the Model
[Figure: "He is my brother" aligned with "彼 は 私 の 兄 です" as phrase pairs C1-C4, ordered in Step 3 by a simple position-based distortion]
Proposed Model
[Figure: the same sentence pair aligned as phrase pairs C1-C4, but ordered in Step 3 by a dependency tree-based distortion]
Bayesian Subtree Alignment Model based on Dependency Trees
Model Decomposition
The model decomposes into the generation of phrases (null and non-null) and the dependency relations between them (cf. [DeNero et al., 2008]).
Dependency Relations
rel(x, y) = (Up, Down): the number of steps for going up and the number of steps for going down in the dependency tree.
English tree for "He is my brother" (aligned with 彼 は / 私 の / 兄 / です):
rel("He", "is") = (1, 0)
rel("brother", "is") = (1, 0)
rel("my", "brother") = (1, 0)
Dependency Relations
Japanese tree for 彼 は / 私 の / 兄 / です (aligned with "He is my brother"):
rel("彼 は", "です") = (1, 0)
rel("私 の", "兄") = (1, 0)
rel("兄", "です") = (1, 0)
Dependency Relations
For "She has hair long" ↔ 彼女 は 髪 が 長い (with a NULL alignment):
rel("long", "hair") = (0, 1)
rel("hair", "she has") = (1, 2)
rel("髪 が", "長い") = (0, 1)
Dependency Relations
rel("彼女", "は") = ? — "は" is aligned to NULL.
Instead, relate to the non-null parent: rel("彼女", "長い") = (0, 2), with N("彼女") = 1, the number of NULL words on the way to the non-null parent.
Dependency Relation Probability
Assign a probability to the tuple p(R_e = (N, rel)) = p(R_e = (N, Up, Down)); the reordering model is decomposed into these dependency relation probabilities.
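The (Up, Down) part of the tuple can be computed as path lengths through the lowest common ancestor in the dependency tree. This is a minimal sketch assuming a simple child-to-head dictionary representation of the tree; the function name and the toy tree are illustrative, not the paper's implementation.

```python
def rel(tree, a, b):
    """Dependency relation between nodes a and b as (Up, Down): the
    number of steps from a up to the lowest common ancestor, and from
    there down to b.  `tree` maps each node to its head (root -> None)."""
    def path_to_root(x):
        path = [x]
        while tree[x] is not None:
            x = tree[x]
            path.append(x)
        return path
    pa, pb = path_to_root(a), path_to_root(b)
    lca = next(x for x in pa if x in pb)   # lowest common ancestor
    return pa.index(lca), pb.index(lca)

# toy English tree for "He is my brother": each word points to its head
tree = {"is": None, "He": "is", "brother": "is", "my": "brother"}
rel(tree, "He", "is")   # one step up, none down -> (1, 0)
```

The N component would additionally count NULL-aligned words encountered on the upward path, as in the 彼女/は example above.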
Model Training
Model Training
Initialization: create a heuristic phrase alignment like 'grow-diag-final-and' on the dependency trees using the results from GIZA++, then count phrase alignments and dependency relations.
Refine the model by Gibbs sampling with three operators: SWAP, TOGGLE, EXPAND.
SWAP Operator
Swap the counterparts of two alignments. [Figure: SWAP-1 and SWAP-2 examples, including NULL alignments]
TOGGLE Operator
Remove an existing alignment or add a new alignment. [Figure: TOGGLE example, including NULL]
EXPAND Operator
Expand or contract an aligned subtree. [Figure: EXPAND-1 and EXPAND-2 examples, including NULL]
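The sampling loop over these operators can be sketched generically: enumerate the alignments reachable by one operator application, then draw the next state with probability proportional to its model score. The function names, the integer toy state space, and the score function are hypothetical stand-ins for the actual alignment states and model probabilities.

```python
import random

def sample_step(state, neighbors, score, rng):
    """One sampler move: consider the current state plus all states
    reachable by a single operator application (SWAP, TOGGLE, or
    EXPAND), and sample the next state proportionally to its score."""
    candidates = [state] + neighbors(state)
    weights = [score(s) for s in candidates]
    return rng.choices(candidates, weights=weights)[0]

# toy example: states are integers, one "operator" moves +/-1,
# and the score strongly prefers states near 0
rng = random.Random(0)
state = 5
for _ in range(200):
    state = sample_step(state,
                        neighbors=lambda s: [s - 1, s + 1],
                        score=lambda s: 2.0 ** (-abs(s)),
                        rng=rng)
```

In the real model the neighborhood is generated by SWAP, TOGGLE, and EXPAND on the current tree alignment, and the score comes from the Bayesian model with counts for the current sample excluded.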
Alignment Experiment
Training: 1M sentence pairs for Ja-En, 678K for Ja-Zh
Testing: about 500 hand-annotated parallel sentences (with Sure and Possible alignments)
Measures: Precision, Recall, Alignment Error Rate (AER)
Japanese tools: JUMAN and KNP; English tool: Charniak's nlparser; Chinese tools: MMA and CNP (from NICT)
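The three measures follow the standard Sure/Possible evaluation scheme (Och & Ney): precision is computed against the Possible links, recall against the Sure links, and AER combines both. A minimal sketch, with toy alignment links as illustrative data:

```python
def precision_recall_aer(auto, sure, possible):
    """Standard alignment metrics: precision against the Possible
    links P, recall against the Sure links S (S is a subset of P),
    and AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|)."""
    a, s, p = set(auto), set(sure), set(possible)
    precision = len(a & p) / len(a)
    recall = len(a & s) / len(s)
    aer = 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
    return precision, recall, aer

# toy word-alignment links as (source index, target index) pairs
auto = [(0, 0), (1, 1), (2, 3)]
sure = [(0, 0), (1, 1)]
possible = [(0, 0), (1, 1), (2, 2)]
```

Here precision is 2/3, recall is 1.0, and AER is 0.2; lower AER is better, and AER = 0 when every automatic link is Possible and every Sure link is found.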
Alignment Experiment: Ja-En (paper abstracts, 1M sentences)
[Table: Precision, Recall, and AER for Initialization, Proposed, GIZA++ & grow, and the Berkeley Aligner]
Alignment Experiment: Ja-Zh (technical papers, 680K sentences)
[Table: Precision, Recall, and AER for Initialization, Proposed, GIZA++ & grow, and the Berkeley Aligner]
Improved Example
[Figure: alignments by GIZA++ vs. the proposed model]
Japanese-to-English Translation Experiment
Baseline: simply run Moses and MERT.
Japanese-to-English Translation Experiment
Initialization: use the initialization alignment as the alignment result for Moses.
Japanese-to-English Translation Experiment
Proposed: use the alignment result of the proposed model after a few iterations for Moses.
Conclusion
Bayesian tree-based phrase alignment model: better alignment accuracy than GIZA++ for distant language pairs.
Translation: currently (temporarily) not improved.
Future work: robustness against parsing errors (using N-best parses or a parse forest); showing improvement in translation (tree-based decoder, Hiero?).