Dependency Tree-to-Dependency Tree Machine Translation
November 4, 2011
Presented by: Jeffrey Flanigan (CMU)
Lori Levin, Jaime Carbonell
In collaboration with: Chris Dyer, Noah Smith, Stephan Vogel

Problem

Swahili: Watoto ni kusoma vitabu.
Gloss: children aux-pres read books
English: Children are reading books.
MT (phrase-based): Children are reading books.

Swahili: Watoto ni kusoma vitabu tatu mpya.
Gloss: children aux-pres read books three new
English: Children are reading three new books.
MT (phrase-based): Children are three new books.

Why?
Phrase table: Pr(reading books | kusoma vitabu), Pr(books | kusoma vitabu)
Language model hypotheses: "Children are three new reading books." / "Children are reading books three new."

Problem: Grammatical Encoding Missing

Swahili: Nimeona samaki waliokula mashua.
Gloss: I-found fish who-ate boat
English: I found the fish that ate the boat.
MT system: I found that eating fish boat.

The predicate-argument structure was corrupted.

Grammatical Relations

[Figure: dependency tree of "I found the fish that ate the boat." with arcs ROOT, SUBJ, OBJ, DOBJ, DET, RCMOD, and REF marking the predicate-argument structure.]

⇒ Dependency trees on source and target!

Approach

Source Sentence → Source Dependency Tree: undo grammatical encoding (parse)
Source Dependency Tree → Target Dependency Tree: translate
Target Dependency Tree → Target Sentence: grammatical encoding (choose surface form, linearize)

All stages are statistical.
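For concreteness, the pipeline can be pictured as three composed functions. This is scaffolding of my own, not the talk's code: the DepTree type and the stage functions are hypothetical stand-ins for the separately trained statistical models.

```python
# Minimal structural sketch of the three-stage pipeline (hypothetical names).
from dataclasses import dataclass, field

@dataclass
class DepTree:
    word: str                      # head word of this (sub)tree
    rel: str = "ROOT"              # dependency relation to the parent
    children: list = field(default_factory=list)

def parse(sentence: str) -> DepTree:
    """Undo grammatical encoding: source sentence -> source dependency tree."""
    raise NotImplementedError("statistical dependency parser goes here")

def translate_tree(src: DepTree) -> DepTree:
    """Source dependency tree -> target dependency tree (rule application)."""
    raise NotImplementedError("tree-fragment translation model goes here")

def linearize(tgt: DepTree) -> str:
    """Grammatical encoding: choose surface forms and a word order."""
    raise NotImplementedError("linearization model goes here")

def translate(sentence: str) -> str:
    return linearize(translate_tree(parse(sentence)))
```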

Extracting the rules: extract all consistent tree fragment pairs

[Figure: word-aligned dependency trees for English "Children are reading three new books" (arcs NSUBJ, AUX, DOBJ, NUM, AMOD) and Kinyarwanda "Abaana barasoma ibitabo bitatu bishya", with example extracted source/target fragment pairs, e.g. "are reading [1] [2]" / "barasoma [1] [2]", "are reading [1] books" / "barasoma [1] ibitabo", "three" / "bitatu", "new" / "bishya", "Children" / "Abaana".]
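The consistency condition here is the tree analogue of consistent phrase-pair extraction in phrase-based SMT: no alignment link may cross the fragment boundary. Below is a minimal sketch of that check, assuming alignments are given as index pairs and fragments as index sets; this is a simplification of my own, and the backup slides list further connectivity constraints the full extractor enforces.

```python
# Sketch of the alignment-consistency check for a candidate fragment pair.
# Fragments are sets of word indices; alignments are (src_idx, tgt_idx) pairs.

def is_consistent(src_nodes: set[int], tgt_nodes: set[int],
                  alignments: set[tuple[int, int]]) -> bool:
    """No alignment link may connect a word inside the candidate
    fragment pair to a word outside it, on either side."""
    for s, t in alignments:
        if (s in src_nodes) != (t in tgt_nodes):
            return False
    # Require at least one internal link so the pair is actually aligned.
    return any(s in src_nodes and t in tgt_nodes for s, t in alignments)

# Toy check with the slide's sentence pair (alignment indices hypothetical):
# "Children are reading three new books" / "Abaana barasoma ibitabo bitatu bishya"
alignments = {(0, 0), (1, 1), (2, 1), (5, 2), (3, 3), (4, 4)}
assert is_consistent({1, 2}, {1}, alignments)    # "are reading" / "barasoma"
assert not is_consistent({2}, {1}, alignments)   # "reading" alone is not
```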

Translating

Extension of phrase-based SMT:
- Linear strings → dependency trees
- Phrase pairs → tree fragment pairs
- Language model → dependency language model

Search is top-down on the target side using a beam search decoder.
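A schematic of such a decoder, by analogy with phrase-based stack decoding, might look as follows. All names here (decode, expansions, score) are illustrative, not the system's API; stack i holds hypotheses whose partial target trees cover i source-tree nodes.

```python
# Schematic stack-based beam decoder over partial target trees.
import heapq

def decode(n_source_nodes, initial_hyp, expansions, score, beam_size=100):
    """expansions(hyp) applies each applicable tree-fragment rule at the
    frontier of hyp's partial target tree, yielding (new_hyp, n_covered)
    pairs; score(hyp) is the hypothesis's log-linear model score."""
    stacks = [[] for _ in range(n_source_nodes + 1)]
    stacks[0] = [initial_hyp]
    for i in range(n_source_nodes):
        # Prune each stack to the beam, then expand the survivors top-down.
        stacks[i] = heapq.nlargest(beam_size, stacks[i], key=score)
        for hyp in stacks[i]:
            for new_hyp, covered in expansions(hyp):
                stacks[covered].append(new_hyp)
    return max(stacks[n_source_nodes], key=score)
```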

Translation Example

Input (source dependency tree): umwaana arasoma ibitabo, for "The child is reading books".

Inventory of rules (tree fragment pairs):
- arasoma ([1] NSUBJ, [2] DOBJ) → is reading (is AUX, [1] NSUBJ, [2] DOBJ), P(e|f) = .5
- umwaana → child, P(e|f) = .8
- ibitabo → books, P(e|f) = .7
- [1] NSUBJ → the (DET) [1] NSUBJ, P(e|f) = .1
- [1] NSUBJ → a (DET) [1] NSUBJ

Decoding is top-down on the target side, scored with a language model on the target dependency tree:

Step 1 (apply the arasoma rule): partial tree "is reading [1] [2]"
  Score = w1 ln(.5) + w2 ln(Pr(reading|ROOT)) + w2 ln(Pr(is|(reading,AUX)))

Step 2 (apply ibitabo → books): partial tree "is reading [1] books"
  Score = w1 ln(.5) + w1 ln(.7) + w2 ln(Pr(reading|ROOT)) + w2 ln(Pr(is|(reading,AUX))) + w2 ln(Pr(books|(reading,DOBJ)))

Step 3 (apply umwaana → child and the determiner rule): complete tree "the child is reading books"
  Score(Translation) = w1 ln(.5) + w1 ln(.7) + w1 ln(.8) + w2 ln(Pr(reading|ROOT)) + w2 ln(Pr(is|(reading,AUX))) + w2 ln(Pr(books|(reading,DOBJ))) + w2 ln(Pr(child|(reading,NSUBJ))) + w2 ln(Pr(the|(child,DET),(reading,ROOT)))
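To make the scoring concrete, here is the log-linear combination as a small Python sketch. The rule probabilities are the toy values from the slides; the dependency-LM probabilities are made-up numbers, since the slides leave them symbolic.

```python
import math

def score(rule_probs, dep_lm_probs, w1=1.0, w2=1.0):
    """Log-linear score: weighted sum of log rule translation probabilities
    and log dependency-LM probabilities, as in the example above."""
    return (w1 * sum(math.log(p) for p in rule_probs)
            + w2 * sum(math.log(p) for p in dep_lm_probs))

# Rule probabilities from the slides; dependency-LM values are hypothetical:
# Pr(reading|ROOT), Pr(is|(reading,AUX)), Pr(books|(reading,DOBJ)),
# Pr(child|(reading,NSUBJ)), Pr(the|(child,DET),(reading,ROOT))
print(score([0.5, 0.7, 0.8], [0.3, 0.9, 0.4, 0.6, 0.5]))
```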

Linearization

- Generate projective trees
- A* search, left to right, with the target LM
- Admissible heuristic: highest-scoring completion without the LM

[Figure: dependency tree headed by "strong", with dependents He (NSUBJ), is (COP), enough (ADVMOD).]

Search steps (Pr(·|is) and Pr(·|strong) are heuristic completion terms for the not-yet-placed words):
- "He": Score = Pr(He|START) ∙ Pr(·|is) ∙ Pr(·|strong)
- "He is": Score = Pr(He|START) ∙ Pr(is|He,START) ∙ Pr(·|is) ∙ Pr(·|strong)
- "He is strong": Score = Pr(He|START) ∙ Pr(is|He,START) ∙ Pr(strong|He,is) ∙ Pr(·|is) ∙ Pr(·|strong)
- "He is strong enough": Score = Pr(He) ∙ Pr(is|He) ∙ Pr(strong|He,is) ∙ Pr(enough|strong,is)
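The slides use A* with an admissible LM-free completion heuristic. The sketch below substitutes brute-force enumeration of the projective orders of the toy tree (feasible only for tiny trees) scored with a hypothetical bigram LM; it illustrates the search space rather than the A* algorithm itself.

```python
# Brute-force enumeration of projective linearizations of a dependency tree,
# scored with a toy bigram LM, standing in for the A* search described above.
from itertools import permutations, product

def linearizations(tree):
    """All projective orders: at every node, the head and each child's
    (contiguous) subtree string may be permuted freely."""
    child_options = [linearizations(c) for c in tree["children"]]
    results = set()
    for combo in product(*child_options):        # pick one order per child
        units = [(tree["word"],)] + list(combo)  # head is a unit of its own
        for perm in permutations(units):
            results.add(tuple(w for unit in perm for w in unit))
    return results

def lm_score(words, bigram):
    words = ("<s>",) + words
    return sum(bigram.get((a, b), -5.0) for a, b in zip(words, words[1:]))

# The slide's tree: "strong" heads He (NSUBJ), is (COP), enough (ADVMOD).
tree = {"word": "strong", "children": [
    {"word": "He", "children": []},
    {"word": "is", "children": []},
    {"word": "enough", "children": []},
]}
bigram = {("<s>", "He"): -1.0, ("He", "is"): -1.0,
          ("is", "strong"): -1.0, ("strong", "enough"): -1.0}  # toy log-probs
print(max(linearizations(tree), key=lambda w: lm_score(w, bigram)))
# -> ('He', 'is', 'strong', 'enough')
```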

Comparison to Major Approaches

- Old-style analysis-transfer-generate: similar in having separate analysis, transfer, and generation models; different in being statistical, with learned rules.
- Synchronous CFGs [Chiang 2005] [Zollmann et al. 2006]: similar as a model of grammatical encoding; this approach differs in allowing adjunction and head switching.
- Tree transducers [Graehl & Knight 2004]: similar as a model of grammatical encoding; different decoding.
- Quasi-synchronous grammars [Gimpel & Smith 2009]: similar in using dependency trees on source and target; different rules and decoding.
- Synchronous tree insertion grammars [DeNeefe & Knight 2009]: both allow adjuncts; this approach also allows head switching.
- Dependency treelets [Quirk et al. 2005] [Shen et al. 2008]: similar in using dependency trees on source and target; different in that word order is not in the rules, with a separate linearization procedure.
- String-to-dependency MT [Shen et al. 2008]: similar in using a target dependency language model; different in using dependency trees on both source and target.
- Dependency tree to dependency tree (JHU Summer Workshop 2002) [Čmejrek et al. 2003] [Eisner 2003]: similar in using dependency trees on source and target, with a linearization step; different learning of rules and different decoding procedure.

Conclusion

- Separate translation from reordering
- Dependency trees capture grammatical relations
- Can extend phrase-based MT to dependency trees
- Complements ISI’s approach nicely
- Work in progress!

Backup Slides

Allowable Rules
- Nodes consistent with alignments
- All variables aligned
- Nodes ∪ variables ∪ arcs ∪ alignments = connected graph

Optional Constraints
- Nodes on source connected
- Nodes on target connected
- Nodes on source and target connected

Decoding Constraint
- Target tree connected
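The connected-graph condition can be checked with an ordinary graph traversal over the union of nodes, variables, arcs, and alignment links. A minimal sketch, assuming vertices and edges are given as plain Python collections (the representation is mine, not the talk's):

```python
# Treat fragment nodes and variables as vertices, dependency arcs and
# alignment links as undirected edges, and test reachability from one vertex.
from collections import defaultdict

def is_connected(vertices, edges):
    """vertices: iterable of hashable ids; edges: iterable of (u, v) pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    vertices = set(vertices)
    if not vertices:
        return True
    stack, seen = [next(iter(vertices))], set()
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(graph[v] - seen)
    return seen >= vertices
```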

Head Switching Example

[Figure: French "Le bébé vient de tomber" aligned with English "The child just fell" (arcs NSUBJ, DET, PREP, POBJ, ADVMOD). The French head construction "vient de", with "tomber" as POBJ, corresponds to the English adverb "just" (ADVMOD) under the head "fell": the extracted fragment pairs swap which word is the head.]

Moving Up the Triangle

Surface Syntactic Dependencies → Deep Syntactic Dependencies → Propositional Semantic Dependencies

Comparison to Synchronous Phrase Structure Rules

Training dataset:
  Kinyarwanda: Abaana baasoma igitabo gishya kyose cyaa Karooli.
  English: The children are reading all of Charles ’s new book.

Test sentence:
  Kinyarwanda: Abaana baasoma igitabo cyaa Karooli gishya kyose.

Synchronous decoders (SAMT, Hiero, etc.) produce:
  The children are reading book ’s Charles new all of.
  The children are reading book Charles ’s all of new.

Problem: grammatical encoding is tied to word order.