Parser Adaptation and Projection with Quasi-Synchronous Grammar Features
David A. Smith (UMass Amherst) and Jason Eisner (Johns Hopkins)
This Talk in a Nutshell
Parser projection ("im Anfang" → "in the beginning") and parser adaptation ("now or never"), unsupervised and supervised, learned by quasi-synchronous grammar
Projecting Hidden Structure (Yarowsky & Ngai '01)
Projection
Train with bitext: parse one side, align words, project dependencies
Many-to-one links? Invalid trees?
– Hwa et al.: fix-up rules
– Ganchev et al.: trust only some links
Example: "Im Anfang war das Wort" / "In the beginning was the word" (see the sketch below)
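To make the projection step concrete, here is a minimal sketch of projecting a dependency tree through word alignments (our own illustration, not the authors' code; all names are hypothetical, and it deliberately ignores the many-to-one and invalid-tree problems noted above):

    def project_dependencies(src_heads, alignment, tgt_len):
        """Project a source dependency tree onto the target sentence.

        src_heads[i] is the head index of source word i (-1 for the root);
        alignment maps source index -> target index (one-to-one links only).
        Returns a partial head array: unaligned target words stay None,
        which is why fix-up rules or link selection are needed downstream.
        """
        tgt_heads = [None] * tgt_len
        for s_child, s_head in enumerate(src_heads):
            if s_child not in alignment:
                continue                      # no target counterpart
            t_child = alignment[s_child]
            if s_head == -1:
                tgt_heads[t_child] = -1       # projected root
            elif s_head in alignment:
                tgt_heads[t_child] = alignment[s_head]
        return tgt_heads                      # may still be cyclic or disconnected

    # "Im Anfang war das Wort" -> "In the beginning was the word"
    # (head indices are illustrative, under a preposition-as-head convention)
    src_heads = [2, 0, -1, 4, 2]              # Im Anfang war das Wort
    alignment = {0: 0, 1: 2, 2: 3, 3: 4, 4: 5}
    print(project_dependencies(src_heads, alignment, 6))
    # [3, None, 0, -1, 5, 3]  -- "the" is unaligned, hence None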
Divergent Projection
"Auf diese Frage habe ich leider keine Antwort bekommen" / "I did not unfortunately receive an answer to this question"
Alignment configurations: monotonic, null (NULL), head-swapping, siblings
Free Translation
"Tschernobyl könnte dann etwas später an die Reihe kommen" / "Then we could deal with Chernobyl sometime later"
Bad dependencies; parent-ancestors? NULL alignments
What's Wrong with Projection?
Hwa et al., Chinese data:
– 38.1% F1 after projection
– Only 26.3% with automatic English parses
– Cf. 35.9% for attach-right!
– 52.4% after fix-up rules
Only 1-to-1 alignments: 68% precision, 11% recall
"Im Anfang war das Wort" / "In the beginning was the word"
EXCURSUS: DOMAIN ADAPTATION
Projection: different languages, similar meaning, divergent syntax
"Im Anfang war das Wort" / "In the beginning was the word"
Adaptation: same sentence, divergent syntax
"In the beginning was the word" parsed under two different conventions
A Lack of Coordination
"now or never" under Prague, Mel'čuk, CoNLL, and MALT conventions
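As a rough illustration of how much such conventions can disagree on the same three words, here are two hypothetical head arrays (our own simplification; the real Prague, Mel'čuk, CoNLL, and MALT guidelines differ in further details):

    # "now or never": indices 0 = now, 1 = or, 2 = never; -1 marks the root.
    # Illustrative only -- not the exact guidelines of any one scheme.
    conjunction_headed   = [1, -1, 1]   # "or" is the root; both conjuncts attach to it
    first_conjunct_chain = [-1, 0, 1]   # "now" is the root; "or" attaches to "now", "never" to "or"

A parser trained under one convention is simply wrong under the other for two of the three words, even though both analyses are correct in their own annotation style.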
Prepositions and Auxiliaries
"in the end" (three conventions) and "I have decided" (two conventions): which word heads the phrase differs across schemes
Adaptation Recipe
Acquire (a few) trees in target domain
Run source-domain parser on training set
Train parser with features for:
– Target tree alone
– Source and target trees together
Parse test set with:
– Source-domain parser
– Target-domain parser
(See the sketch below.)
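A minimal sketch of that recipe as a pipeline, assuming hypothetical parser and trainer interfaces (source_parser, train_target_parser, and the .parse() methods are our own names, not from the paper):

    def adapt(source_parser, train_target_parser, target_treebank, test_sentences):
        """Stack a target-convention parser on top of a source-domain parser.

        source_parser can be any black box (even rule-based): we only need its
        one-best trees as an extra observation for the target model.
        """
        # 1. Run the source-domain parser over the (small) target treebank.
        stacked_training = [(sent, gold_tree, source_parser.parse(sent))
                            for sent, gold_tree in target_treebank]
        # 2. Train with features on the target tree alone and on how the target
        #    tree relates to the source parser's output (the QG features).
        target_parser = train_target_parser(stacked_training)
        # 3. At test time, parse with the source-domain parser first, then with
        #    the target-domain parser conditioned on that output.
        return [target_parser.parse(sent, source_tree=source_parser.parse(sent))
                for sent in test_sentences]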
Why? Why not just modify the source treebank?
Source parser could be a black box
– Or rule-based
Vastly shorter training times with a small target treebank
– Linguists can quickly explore alternatives
– Don't need dozens of rules
Other benefits of stacking
And sometimes, divergence is very large
MODEL STRUCTURE
What We're Modeling
Source words w' and tree t' ("im Anfang"); target words w and tree t ("in the beginning"); alignment a
This paper: generative and conditional models; joint inference is ongoing work
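As a rough paraphrase of this setup (our own notation, not necessarily the paper's exact factorization), the two training regimes model roughly

    generative:   p(t, w | t', w', a)
    conditional:  p(t | w, t', w', a)

i.e., the generative model also generates the target words, while the conditional model conditions on them, which is the natural choice when a small target treebank is available.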
Stacking
Model 1's output is input to Model 2
Model 2 has features for when to trust Model 1 …
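One concrete way to read "features for when to trust Model 1" is as indicator features that fire when a candidate edge in Model 2 agrees or disagrees with Model 1's prediction. A hypothetical sketch (the feature names and templates are ours):

    def stacking_features(head, child, model1_heads):
        """Features for a candidate dependency head -> child in Model 2,
        given Model 1's one-best head array (illustrative templates only)."""
        feats = {}
        agrees = (model1_heads[child] == head)
        feats["m1:agree"] = 1.0 if agrees else 0.0
        feats["m1:disagree"] = 0.0 if agrees else 1.0
        # Finer-grained disagreement: did Model 2 attach the child to
        # Model 1's proposed grandparent instead?
        m1_head = model1_heads[child]
        if not agrees and m1_head >= 0 and model1_heads[m1_head] == head:
            feats["m1:grandparent"] = 1.0
        return feats

Weights on such features let Model 2 learn when Model 1 is reliable and when it should be overruled.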
Quasi-Synchronous Grammar
Generative or conditional monolingual model of target language or tree
Condition target trees on source structure
Applications to:
– Alignment (D. Smith & Eisner '06)
– Question Answering (Wang, N. Smith, Mitamura '07)
– Paraphrase (Das & N. Smith '09)
– Translation (Gimpel & N. Smith '09)
Dependency relations on the source side (parent-child, siblings, …) + "none of the above"
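The "none of the above" class is the catch-all for target edges whose endpoints align to source nodes that stand in no close source-side relation. A hedged sketch of how such a configuration might be computed (the class inventory follows the labels on these slides, not necessarily the paper's full inventory):

    def qg_configuration(t_head, t_child, align, src_heads):
        """Classify a target edge (t_head -> t_child) by the source-side relation
        between the nodes its endpoints align to. Illustrative inventory only."""
        s_h, s_c = align.get(t_head), align.get(t_child)
        if s_h is None or s_c is None:
            return "null"                    # one endpoint aligns to NULL
        if src_heads[s_c] == s_h:
            return "parent-child"            # monotonic: the source relation is preserved
        if src_heads[s_h] == s_c:
            return "head-swapping"           # child and parent are reversed on the source side
        if src_heads[s_h] == src_heads[s_c]:
            return "siblings"                # both endpoints align under the same source head
        return "none-of-the-above"           # everything else ("breakage")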
QG Generative Story
Observed source: "Auf diese Frage habe ich leider keine Antwort bekommen" (plus NULL); target: "I did not unfortunately receive an answer to this question"
Factors: P(parent-child), P(PRP | no left children of did), P(I | ich), P(breakage)
Inference: O(m²n³)
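A very rough sketch of how one target dependency might be scored under this story, combining a configuration probability, a monolingual attachment probability, and a word-translation probability (this factorization is our simplified paraphrase of the factors listed above, not the paper's exact model; it reuses the qg_configuration sketch from earlier):

    import math

    def edge_logprob(t_head, t_child, tgt_words, align, src_words, src_heads,
                     p_config, p_attach, p_translate):
        """log-probability of one target dependency in a QG-style generative story.
        p_config, p_attach, p_translate are hypothetical lookup tables."""
        config = qg_configuration(t_head, t_child, align, src_heads)
        logp = math.log(p_config[config])                                     # e.g. P(parent-child), P(breakage)
        logp += math.log(p_attach[(tgt_words[t_head], tgt_words[t_child])])   # cf. P(PRP | no left children of did)
        s_c = align.get(t_child)
        src_word = src_words[s_c] if s_c is not None else "NULL"
        logp += math.log(p_translate[(src_word, tgt_words[t_child])])         # cf. P(I | ich)
        return logp

Summing such scores over all target attachments and alignments with a dependency-parsing dynamic program yields the O(m²n³) inference noted on the slide.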
EXPERIMENTS
Experimental Plan
Proof of concept on English dependency-convention adaptation
Unsupervised projection
– No target trees
– Generative target model + QG features
Supervised projection
– Small number of target trees
– Conditional target model + QG features
Adaptation Results: different PTB dependency conversions (see paper for more results)
Unsupervised Projection
Supervised Projection
Conclusions
Unified adaptation and projection
Conditional and generative training with quasi-synchronous grammar features
Learned regular divergences
Ongoing work:
– Joint, but not synchronous, inference
– Better alignments
– More adaptation problems
QUESTIONS? Thanks!
Really Different Domains
"中国在基本建设方面，开始…" / "In the area of infrastructure construction, China has begun…"