Suggestions for Class Projects


Systematic study of phrase table generation heuristics (experimentation)
      - Use the Moses pipeline with different extraction heuristics
      - Systematically vary alignment density, alignment quality, and corpus size when extracting phrase pairs
Comparison of Moses phrase-based and hierarchical decoding (experimentation)
      - Experiment with at least 2 language pairs; analyze the differences
Compare GIZA and the Berkeley aligner (experimentation)
      - Compare AER (alignment error rate), the impact on the phrase table, and the impact on the resulting translation
Word alignment for languages without word boundaries (experimentation)
      - Can standard word alignment models help to detect word/morpheme boundaries?
      - Experiment with a simple language (e.g. Spanish) and a difficult one (e.g. Inupiaq)
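For the aligner comparison, AER can be computed directly from sets of alignment links. The sketch below uses the common definition AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the aligner output, S the sure gold links, and P the possible gold links. The function name and the representation of links as (source index, target index) pairs are assumptions made for illustration, not part of the slides.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    All arguments are iterables of (source_index, target_index) pairs.
    Sure links are treated as possible links as well (S is a subset of P).
    """
    a = set(hypothesis)
    s = set(sure)
    p = set(possible) | s
    if not a and not s:
        # Nothing proposed and nothing required: no error by convention here.
        return 0.0
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


# Hypothetical three-word sentence pair.
hyp = {(0, 0), (1, 2), (2, 1)}
gold_sure = {(0, 0), (1, 1)}
gold_possible = {(2, 1)}
print(f"AER = {alignment_error_rate(hyp, gold_sure, gold_possible):.3f}")
```

Running the same function over the output of both aligners on one hand-aligned test set gives a like-for-like AER comparison. Lower AER does not always mean better translation quality, which is why the project also looks at the phrase table and the final translations.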

Detect discontinuous phrase pairs (implementation)
      - Analyze discontiguous phrases in hand-aligned data
      - Implement an extension to the PESA aligner to detect discontiguous phrases
Use part-of-speech features to improve phrase alignment (implementation)
      - Evaluate phrase alignment quality
      - Implement an extension to the PESA aligner
      - Evaluate the impact on translation
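A first step in analyzing hand-aligned data could be to find source words whose aligned target positions contain a gap. The sketch below is only an illustration of that analysis step: it does not use or extend the PESA aligner, and the function name and the (source index, target index) link format are assumptions.

```python
from collections import defaultdict


def discontiguous_source_words(links):
    """Map each source index to its aligned target indices, keeping only
    those source words whose target positions are not one contiguous span."""
    targets = defaultdict(set)
    for src, tgt in links:
        targets[src].add(tgt)

    gaps = {}
    for src, tgts in targets.items():
        ordered = sorted(tgts)
        # A contiguous span of n positions covers exactly max - min + 1 == n.
        if ordered[-1] - ordered[0] + 1 != len(ordered):
            gaps[src] = ordered
    return gaps


# Hypothetical links: source word 2 aligns to target positions 1 and 3,
# as with English "not" and French "ne ... pas".
links = {(0, 0), (1, 2), (2, 1), (2, 3)}
print(discontiguous_source_words(links))
```

Counting such cases over a whole hand-aligned corpus gives a rough estimate of how often discontiguous phrases occur before any aligner work begins.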

Sentence-level Confidence Estimation of MT quality, possibly including identifying poor MT translations
Tuning an MT system to different automatic metrics (including the new METEOR-tuning), and comparing the outcomes
Learning DNT (Do Not Translate) Lists and Incorporating them into Moses
Improving a Baseline Moses MT System by:
      - Filtering out bad word-alignments
      - Filtering the phrase-table
      - Adding decoding features
Building a Hierarchical or Syntax-based MT system for any language-pair
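One simple way to approach phrase-table filtering is to drop low-probability entries and keep only the best few translations per source phrase. The sketch below assumes a plain-text table with `|||`-separated fields (source, target, scores) and assumes the third score is the direct translation probability p(e|f); check the score order of your own table before relying on it. The function name and the thresholds are hypothetical choices.

```python
from collections import defaultdict


def filter_phrase_table(lines, min_prob=0.01, top_n=20):
    """Yield phrase-table lines whose direct probability is at least
    min_prob, keeping at most top_n entries per source phrase."""
    by_source = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        fields = [field.strip() for field in line.split("|||")]
        if len(fields) < 3:
            continue  # skip malformed lines
        scores = fields[2].split()
        if len(scores) < 3:
            continue
        direct = float(scores[2])  # assumed to be p(e|f)
        if direct >= min_prob:
            by_source[fields[0]].append((direct, line))

    for entries in by_source.values():
        # sorted() is stable, so ties keep their original order.
        best = sorted(entries, key=lambda item: -item[0])[:top_n]
        for _, line in best:
            yield line


# Hypothetical entries for one German source phrase.
table = [
    "das haus ||| the house ||| 0.8 0.5 0.7 0.4",
    "das haus ||| the home ||| 0.1 0.05 0.005 0.02",
    "das haus ||| house ||| 0.2 0.1 0.2 0.1",
]
for kept in filter_phrase_table(table, min_prob=0.01, top_n=2):
    print(kept)
```

This version holds the whole table in memory, which is fine for a small experiment. A full-size table would need a streaming pass that relies on the table being sorted by source phrase.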