Joint Parsing and Alignment with Weakly Synchronized Grammars David Burkett, John Blitzer, & Dan Klein TexPoint fonts used in EMF. Read the TexPoint manual.

Slides:



Advertisements
Similar presentations
Statistical Machine Translation
Advertisements

Semi-Supervised Learning in Gigantic Image Collections
Thomas Schoenemann University of Düsseldorf, Germany ACL 2013, Sofia, Bulgaria Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment TexPoint.
The Chinese Room: Understanding and Correcting Machine Translation This work has been supported by NSF Grants IIS Solution: The Chinese Room Conclusions.
Authors Sebastian Riedel and James Clarke Paper review by Anusha Buchireddygari Incremental Integer Linear Programming for Non-projective Dependency Parsing.
Regression “A new perspective on freedom” TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAA A A A A AAA A A.
Statistical Machine Translation Part II: Word Alignments and EM Alexander Fraser Institute for Natural Language Processing University of Stuttgart
Statistical Machine Translation Part II: Word Alignments and EM Alexander Fraser ICL, U. Heidelberg CIS, LMU München Statistical Machine Translation.
Learning Accurate, Compact, and Interpretable Tree Annotation Recent Advances in Parsing Technology WS 2011/2012 Saarland University in Saarbrücken Miloš.
Statistical Machine Translation Part II – Word Alignments and EM Alex Fraser Institute for Natural Language Processing University of Stuttgart
Proceedings of the Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2007) Learning for Semantic Parsing Advisor: Hsin-His.
Constrained Approximate Maximum Entropy Learning (CAMEL) Varun Ganapathi, David Vickrey, John Duchi, Daphne Koller Stanford University TexPoint fonts used.
Structured SVM Chen-Tse Tsai and Siddharth Gupta.
Chinese Word Segmentation Method for Domain-Special Machine Translation Su Chen; Zhang Yujie; Guo Zhen; Xu Jin’an Beijing Jiaotong University.
1 Quasi-Synchronous Grammars  Based on key observations in MT: translated sentences often have some isomorphic syntactic structure, but not usually in.
Discriminative Learning of Extraction Sets for Machine Translation John DeNero and Dan Klein UC Berkeley TexPoint fonts used in EMF. Read the TexPoint.
1 A Tree Sequence Alignment- based Tree-to-Tree Translation Model Authors: Min Zhang, Hongfei Jiang, Aiti Aw, et al. Reporter: 江欣倩 Professor: 陳嘉平.
CS 188: Artificial Intelligence Fall 2009 Lecture 21: Speech Recognition 11/10/2009 Dan Klein – UC Berkeley TexPoint fonts used in EMF. Read the TexPoint.
Machine Translation via Dependency Transfer Philip Resnik University of Maryland DoD MURI award in collaboration with JHU: Bootstrapping Out of the Multilingual.
Inducing Information Extraction Systems for New Languages via Cross-Language Projection Ellen Riloff University of Utah Charles Schafer, David Yarowksy.
Application of RNNs to Language Processing Andrey Malinin, Shixiang Gu CUED Division F Speech Group.
Approximate Factoring for A* Search Aria Haghighi, John DeNero, and Dan Klein Computer Science Division University of California Berkeley.
Parameter estimate in IBM Models: Ling 572 Fei Xia Week ??
Maximum Entropy Model & Generalized Iterative Scaling Arindam Bose CS 621 – Artificial Intelligence 27 th August, 2007.
Standard EM/ Posterior Regularization (Ganchev et al, 10) E-step: M-step: argmax w E q log P (x, y; w) Hard EM/ Constraint driven-learning (Chang et al,
Microsoft Research Faculty Summit Robert Moore Principal Researcher Microsoft Research.
Czech-to-English Translation: MT Marathon 2009 Session Preview Jonathan Clark Greg Hanneman Language Technologies Institute Carnegie Mellon University.
Parser Adaptation and Projection with Quasi-Synchronous Grammar Features David A. Smith (UMass Amherst) Jason Eisner (Johns Hopkins) 1.
Named Entity Recognition based on Bilingual Co-training Li Yegang School of Computer, BIT.
SI485i : NLP Set 8 PCFGs and the CKY Algorithm. PCFGs We saw how CFGs can model English (sort of) Probabilistic CFGs put weights on the production rules.
Scalable Inference and Training of Context- Rich Syntactic Translation Models Michel Galley, Jonathan Graehl, Keven Knight, Daniel Marcu, Steve DeNeefe.
Sparse Matrix Factorizations for Hyperspectral Unmixing John Wright Visual Computing Group Microsoft Research Asia Sept. 30, 2010 TexPoint fonts used in.
Training dependency parsers by jointly optimizing multiple objectives Keith HallRyan McDonaldJason Katz- BrownMichael Ringgaard.
Recent Major MT Developments at CMU Briefing for Joe Olive February 5, 2008 Alon Lavie and Stephan Vogel Language Technologies Institute Carnegie Mellon.
1 CS546: Machine Learning and Natural Language Latent-Variable Models for Structured Prediction Problems: Syntactic Parsing Slides / Figures from Slav.
1 Co-Training for Cross-Lingual Sentiment Classification Xiaojun Wan ( 萬小軍 ) Associate Professor, Peking University ACL 2009.
NUDT Machine Translation System for IWSLT2007 Presenter: Boxing Chen Authors: Wen-Han Chao & Zhou-Jun Li National University of Defense Technology, China.
Reordering Model Using Syntactic Information of a Source Tree for Statistical Machine Translation Kei Hashimoto, Hirohumi Yamamoto, Hideo Okuma, Eiichiro.
Part E: conclusion and discussions. Topics in this talk Dependency parsing and supervised approaches Single model Graph-based; Transition-based; Easy-first;
Statistical NLP Spring 2011 Lecture 25: Summarization Dan Klein – UC Berkeley TexPoint fonts used in EMF. Read the TexPoint manual before you delete this.
Why Not Grab a Free Lunch? Mining Large Corpora for Parallel Sentences to Improve Translation Modeling Ferhan Ture and Jimmy Lin University of Maryland,
11 Chapter 14 Part 1 Statistical Parsing Based on slides by Ray Mooney.
Bayesian Subtree Alignment Model based on Dependency Trees Toshiaki Nakazawa Sadao Kurohashi Kyoto University 1 IJCNLP2011.
Coarse-to-Fine Efficient Viterbi Parsing Nathan Bodenstab OGI RPE Presentation May 8, 2006.
What’s in a translation rule? Paper by Galley, Hopkins, Knight & Marcu Presentation By: Behrang Mohit.
Prototype-Driven Learning for Sequence Models Aria Haghighi and Dan Klein University of California Berkeley Slides prepared by Andrew Carlson for the Semi-
INSTITUTE OF COMPUTING TECHNOLOGY Forest-to-String Statistical Translation Rules Yang Liu, Qun Liu, and Shouxun Lin Institute of Computing Technology Chinese.
NRC Report Conclusion Tu Zhaopeng NIST06  The Portage System  For Chinese large-track entry, used simple, but carefully- tuned, phrase-based.
Mutual bilingual terminology extraction Le An Ha*, Gabriela Fernandez**, Ruslan Mitkov*, Gloria Corpas*** * University of Wolverhampton ** Universidad.
University of Texas at Austin Machine Learning Group Department of Computer Sciences University of Texas at Austin Learning a Compositional Semantic Parser.
Haitham Elmarakeby.  Speech recognition
1 Minimum Error Rate Training in Statistical Machine Translation Franz Josef Och Information Sciences Institute University of Southern California ACL 2003.
Bayesian Speech Synthesis Framework Integrating Training and Synthesis Processes Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda Nagoya Institute.
Statistical Machine Translation Part II: Word Alignments and EM Alex Fraser Institute for Natural Language Processing University of Stuttgart
Discriminative Modeling extraction Sets for Machine Translation Author John DeNero and Dan KleinUC Berkeley Presenter Justin Chiu.
Structured learning: overview Sunita Sarawagi IIT Bombay TexPoint fonts used in EMF. Read the TexPoint manual before.
Statistical NLP Spring 2010 Lecture 20: Compositional Semantics Dan Klein – UC Berkeley.
A Syntax-Driven Bracketing Model for Phrase-Based Translation Deyi Xiong, et al. ACL 2009.
LING 575 Lecture 5 Kristina Toutanova MSR & UW April 27, 2010 With materials borrowed from Philip Koehn, Chris Quirk, David Chiang, Dekai Wu, Aria Haghighi.
This research is supported by the U.S. Department of Education and DARPA. Focuses on mistakes in determiner and preposition usage made by non-native speakers.
Statistical Machine Translation Part II: Word Alignments and EM
Statistical NLP Winter 2009
Urdu-to-English Stat-XFER system for NIST MT Eval 2008
Statistical NLP Spring 2010
Statistical NLP Spring 2011
--Mengxue Zhang, Qingyang Li
Statistical Machine Translation Papers from COLING 2004
TexPoint fonts used in EMF.
Statistical NLP Spring 2011
Dekai Wu Presented by David Goss-Grubbs
Presentation transcript:

Joint Parsing and Alignment with Weakly Synchronized Grammars David Burkett, John Blitzer, & Dan Klein TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: A A A A AAA A A

Statistical MT Training Pipeline 1) Align sentence pairs (GIZA++) 2) Parse English sentences (Berkeley parser) Parse Foreign sentences 3) Extract rules (Galley et al. 2006) 4) Tune discriminative parameters at office in read book read the book in the office } Joint model for (1) & (2)

Data Setting for Joint Models (; ) English WSJ ( EN ; ) (; ) Chinese CTB Parallel, Aligned CTB ( EN,; ) Unlabeled parallel text ( EN ;)

Word alignment grids at office in read book read the book in the office

Syntactic Correspondences EN Build a model

Correspondence via Synchronous Grammars

Synchronous derivation

Synchronous Derivation

Weakly Synchronized Example

Separate PCFGs

Weakly Synchronized Example ITG alignment

Weakly Synchronized Example Points for synchronization, but not required

Correspondence Model & Feature Types office Feature type 1: Word Alignment EN Feature type 3: Correspondence Feature type 2: Monolingual Parser EN PP in the office EN EN EN EN EN [HBDK09]

Estimating EN EN Set to maximize the log-likelihood of the correct parses & alignments EN EN EN normalizes to sum to 1

Computing Correspondence features tie pieces together EN EN Computing exactly is intractable EN EN Individual,, have polynomial-time dynamic programming algorithms

Approximating : Mean Field Exploit tractability in individual models: Factored approximation: EN 1)Initialize separately 2)Iterate: Set to minimize EN EN Algorithm

Large scale inference We can approximate in polynomial time, but... EN Sum over possible alignments is an algorithm. But computers are fast, right? Medium-length sentences are 50 words long Small translation data sets are 250,000 sentences ~4 quadrillion operations (See for speedup details) [BBK10, HBDK09]

Quantitative Results: Parsing

85.7% 83.6%

Quantitative Results: Parsing 81.2% 84.5%

Incorrect English PP Attachment

Corrected English PP Attachment

Quantitative Results: Translation 69.5% 85.0% BLEU improvement from 29.4 to %

Better Translations with Bilingual Adaptation Reference At this point the cause of the plane collision is still unclear. The local caa will launch an investigation into this. Baseline (GIZA++) The cause of planes is still not clear yet, local civil aviation department will investigate this., Cur- rently causeplanecrashDEreasonyetnotclear,local civil aero- nautics bureauwilltowardopen investi- gations Bilingual Adaptation Model The cause of plane collision remained unclear, local civil aviation departments will launch an investigation.

Thanks