Machine Translation via Dependency Transfer. Philip Resnik, University of Maryland. DoD MURI award in collaboration with JHU: Bootstrapping Out of the Multilingual Resource Bottleneck.

Presentation transcript:

Machine Translation via Dependency Transfer
Philip Resnik, University of Maryland
DoD MURI award in collaboration with JHU: Bootstrapping Out of the Multilingual Resource Bottleneck
Start date: May 2001

Current statistical MT
IBM models 1-5
– Fail to model syntactic dependencies
– Don't take advantage of morphological features
Bilingual grammar approaches
– Have not evolved into a stochastic setting (SyTAG)
– Model constituency rather than dependency (SITG)
Dependency transduction models
– Are linguistically underconstrained
– Don't take advantage of asymmetrical resources

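Modeling richer linguistic features: syntactic dependency
[Figure: word-aligned Basque-English sentence pair annotated with part-of-speech tags and dependency labels]
Gloss: I-erg MY BROTHER-dat GIFT a BUY-past WEDDING
Basque: nik nire anaiari opari bat erosi nion ezkontza
English tokens with tags: wedding/nn I/prp got/vbd for/in my/prp$ brother/nn a/dt gift/nn
Dependency labels: subj, obj
Additional tags: prp prp$ nn vbd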

[Figure: diagram linking analysis, projection, training, New Statistical MT Models, and translation and analysis of new data]
English example: [National laws] applying in [Hong Kong]; POS tags JJ NNS VBG IN NNP; dependency labels mod, subj, pobj, PLACE
Projected side (English glosses): In Hong Kong implementing of national law(s); POS tags IN NNP VBG NNS JJ; dependency labels mod, subj, PLACE
New data example: The urgent response to...; POS tags NN JJ NN; dependency label mod
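The projection step in this figure can be illustrated with a minimal sketch. It assumes dependencies are stored as (head, relation, dependent) index triples and that the word alignment maps each English token to at most one token on the other side. The tree, the alignment, and the function name `project_dependencies` are illustrative assumptions, not the project's actual data or code.

```python
def project_dependencies(english_deps, alignment):
    """Copy (head, relation, dependent) triples across a word alignment.

    english_deps: set of triples over English token indices.
    alignment: dict from an English token index to one foreign token index.
    Triples with an unaligned head or dependent are dropped.
    """
    projected = set()
    for head, rel, dep in english_deps:
        if head in alignment and dep in alignment:
            f_head, f_dep = alignment[head], alignment[dep]
            if f_head != f_dep:  # skip links that collapse onto one token
                projected.add((f_head, rel, f_dep))
    return projected


# English side of the slide's example (tree structure is an assumption).
english = ["National", "laws", "applying", "in", "Hong_Kong"]
english_deps = {(1, "mod", 0), (2, "subj", 1), (2, "mod", 3), (3, "pobj", 4)}

# Other side, written as glosses: In Hong_Kong implementing of national law(s)
foreign_gloss = ["In", "Hong_Kong", "implementing", "of", "national", "law(s)"]

# Hypothetical one-to-one alignment; the gloss "of" (index 3) stays unaligned.
alignment = {0: 4, 1: 5, 2: 2, 3: 0, 4: 1}

projected = project_dependencies(english_deps, alignment)
for head, rel, dep in sorted(projected):
    print(foreign_gloss[head], rel, foreign_gloss[dep])
```

Unaligned words and many-to-one links are where direct projection of this kind produces noisy trees, which is one reason the projected trees need to be checked against a gold standard.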

Baseline Dependency Transfer Architecture
[Figure: architecture diagram with components Parallel text, GIZA++, Minipar (Lin), Collins parser, Parser Training, Source text, Lexical selection, Linearization, and Target text; regions labeled English-specific processing and Language-specific processing]
Example: [National laws] applying in [Hong Kong] (JJ NNS VBG IN NNP), aligned with the gloss In Hong Kong implementing of national law(s) (IN NNP VBG NNS JJ); dependency labels mod, subj, pobj
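The last two stages named in the architecture, lexical selection and linearization, might look roughly like the toy sketch below. Everything in it is hypothetical: the source nodes are written as uppercase glosses, `LEXICON` is a hand-made one-entry-per-word table, and `SIDE` is an invented rule saying on which side of its head each relation's dependent is placed. It only shows the shape of the computation, not the system described on the slide.

```python
# Hypothetical lexicon: one target word per source node (no ambiguity modeled).
LEXICON = {
    "IMPLEMENTING": "applying",
    "LAW": "laws",
    "NATIONAL": "National",
    "IN": "in",
    "HONG-KONG": "Hong Kong",
}

# Hypothetical ordering rule: where a dependent goes relative to its head.
SIDE = {"subj": "left", "mod": "left", "PLACE": "right", "pobj": "right"}


def linearize(node, children):
    """Return the target words for the subtree rooted at `node`, in order."""
    left, right = [], []
    for rel, child in children.get(node, []):
        words = linearize(child, children)
        (left if SIDE[rel] == "left" else right).extend(words)
    # Lexical selection happens here: the head is replaced by its translation.
    return left + [LEXICON[node]] + right


# Source dependency tree keyed by node label (labels must be unique here).
children = {
    "IMPLEMENTING": [("subj", "LAW"), ("PLACE", "IN")],
    "LAW": [("mod", "NATIONAL")],
    "IN": [("pobj", "HONG-KONG")],
}

target = " ".join(linearize("IMPLEMENTING", children))
print(target)  # National laws applying in Hong Kong
```

A realistic system would choose among several candidate translations and orderings with a statistical model rather than fixed tables.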

Parser Gold Standard Development
Test set: 188 English-Chinese sentence pairs from the Penn Chinese Treebank
Two bilingual annotators
Independent extraction of dependency triples
Precision/recall
– Inter-annotator
– Projected Chinese dependency trees
– Trained parser dependency tree output
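Precision and recall over dependency triples can be computed as in the following sketch. It assumes each analysis is a set of (head, relation, dependent) triples and that a triple counts as correct only on an exact match; the sample triples are made up for illustration and are not taken from the 188-sentence test set.

```python
def precision_recall(predicted, gold):
    """Score a set of predicted dependency triples against a gold set."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Illustrative triples for one sentence: (head index, relation, dependent index).
gold = {(2, "subj", 5), (5, "mod", 4), (2, "mod", 0), (0, "pobj", 1)}
predicted = {(2, "subj", 5), (5, "mod", 4), (2, "obj", 0)}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f}")
```

The same function covers all three comparisons listed on the slide: one annotator against the other, projected trees against the gold standard, and trained-parser output against the gold standard.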

Research Targets
Rapid, automatic creation of resources and tools
Models for effective use of noisy training data
Non-direct transfer: improving alignment models using lexical decomposition
Monolingual parsing performance versus effective transfer to English
Evaluation of dependency transfer on MT performance