1 Duluth Word Alignment System
Bridget Thomson McInnes and Ted Pedersen
University of Minnesota Duluth, Computer Science Department
31 May 2003

2 Duluth Word Alignment System
Perl implementation of IBM Model 2
Learns a probabilistic model from sentence-aligned parallel corpora
– The parallel text consists of a source-language text and its translation into some target language
– Determines the word alignments of the sentence pairs
Missing data problem
– No examples of word alignments in the training data
– Use the Expectation Maximization (EM) algorithm (sketched below)
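The EM step above can be made concrete. What follows is a minimal sketch in Perl, the language of the system itself, of how EM estimates lexical translation probabilities t(target|source) from sentence pairs alone; it implements the simpler Model 1 E-step (no position term), and the toy corpus and all names are illustrative rather than taken from the Duluth system.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Toy sentence-aligned corpus: [source words, target words] pairs.
    my @pairs = (
        [ [qw(la grande maison)], [qw(the big house)] ],
        [ [qw(la maison)],        [qw(the house)]     ],
        [ [qw(la fleur)],         [qw(the flower)]    ],
    );

    # Initialize t(t|s) uniformly over co-occurring word pairs.
    my %t;
    for my $pair (@pairs) {
        my ($src, $tgt) = @$pair;
        for my $s (@$src) { $t{$s}{$_} = 1 for @$tgt; }
    }
    for my $s (keys %t) {
        my $n = keys %{ $t{$s} };
        $t{$s}{$_} = 1 / $n for keys %{ $t{$s} };
    }

    for my $iter (1 .. 10) {
        my (%count, %total);
        # E-step: expected alignment counts under the current model.
        for my $pair (@pairs) {
            my ($src, $tgt) = @$pair;
            for my $w_t (@$tgt) {
                my $z = 0;                         # normalizer over source words
                $z += $t{$_}{$w_t} for @$src;
                for my $w_s (@$src) {
                    my $c = $t{$w_s}{$w_t} / $z;   # fractional count
                    $count{$w_s}{$w_t} += $c;
                    $total{$w_s}       += $c;
                }
            }
        }
        # M-step: re-estimate t(t|s) from the expected counts.
        for my $w_s (keys %count) {
            $t{$w_s}{$_} = $count{$w_s}{$_} / $total{$w_s}
                for keys %{ $count{$w_s} };
        }
    }

    printf "t(house|maison) = %.3f\n", $t{maison}{house};

Each iteration raises the likelihood of the observed sentence pairs, so the hidden alignments never need to be annotated by hand; that is exactly the missing-data problem the slide describes.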

3 IBM Model 2
Takes into account:
– the probability of two words being translations of each other
– how likely it is for words at particular positions in a sentence pair to be alignments of each other
Example: "La grande maison" / "The big house"
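For reference, the standard formulation of Model 2 (Brown et al., 1993) combines exactly these two factors. Up to a sentence-length term, it scores a sentence pair (e, f) and alignment a as

    P(f, a \mid e) \;\propto\; \prod_{j=1}^{m} t(f_j \mid e_{a_j}) \, a(a_j \mid j, l, m)

where l and m are the lengths of e and f, a_j is the position of the e word that f_j aligns to, t is the word-translation probability, and a(a_j | j, l, m) is the position (alignment) probability.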

4 Distortion Factor
How far from its original (source) position a word is allowed to move
Example: [source and target sentences lost in the transcript]
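One plausible way to apply such a factor, shown below as an invented sketch rather than the system's actual code, is to admit only source positions within a window of width d around the position that corresponds proportionally to the target position:

    # Candidate source positions for target position $j under distortion
    # factor $d (1-based positions; all names are illustrative).
    sub candidate_positions {
        my ($j, $src_len, $tgt_len, $d) = @_;
        my $center = int($j * $src_len / $tgt_len + 0.5);  # proportional position
        my $lo = $center - $d;  $lo = 1        if $lo < 1;
        my $hi = $center + $d;  $hi = $src_len if $hi > $src_len;
        return ($lo .. $hi);
    }

    my @cands = candidate_positions(4, 10, 12, 2);
    print "@cands\n";    # 1 2 3 4 5

With d = 0 only the proportional position itself is allowed, which is consistent with the lower scores that slide 10 reports for a distortion factor of 0.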

5 Types of Alignments
Sure and Probable alignments
– Sure: alignment judged to be very likely
– Probable: alignment judged to be less certain
– Our system does not make this distinction; we take the highest-scoring alignment regardless of its value
No-null and Null alignments
– Null alignments: source words that do not align to any word in the target sentence
– Our system does not include null alignments
One-to-One and One-to-Many alignments
– Our system includes one-to-many as well as one-to-one alignments

6 Alignments
[Diagram: source words S1–S5 linked to target words T1–T5, illustrating one-to-one, one-to-many, and many-to-one alignments]
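"Taking the highest alignment" (slide 5) can be sketched as an argmax: for each target word, choose the source word with the largest translation probability, so a source word picked more than once yields the one-to-many case in the diagram above. This illustrative routine ignores Model 2's position term; %t is a translation table such as the one trained in the earlier EM sketch.

    # For each target word, pick the source word with the highest t(t|s).
    sub align_pair {
        my ($t, $src, $tgt) = @_;
        my @links;
        for my $j (0 .. $#$tgt) {
            my ($best_i, $best_p) = (0, -1);
            for my $i (0 .. $#$src) {
                my $p = $t->{ $src->[$i] }{ $tgt->[$j] } // 0;
                ($best_i, $best_p) = ($i, $p) if $p > $best_p;
            }
            push @links, [ $best_i, $j ];   # [source index, target index]
        }
        return @links;
    }

On the toy table trained above, align_pair(\%t, [qw(la grande maison)], [qw(the big house)]) yields the one-to-one links [0,0], [1,1], [2,2].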

7 Data
English–French
– Trained on a 5% subset of the Aligned Hansards of the 36th Parliament of Canada: approximately 50,000 of the 1,200,000 given sentence pairs, a mixture of the House and Senate debates
– We wanted to train the models on data sets of comparable size
– Tested on 447 manually word-aligned sentence pairs
Romanian–English
– Trained on all available training data (49,284 sentence pairs)
– Tested on 248 manually word-aligned sentence pairs

8 Results
[Table: precision, recall, and F-measure for four UMD-RE runs and four UMD-EF runs; the numeric scores did not survive the transcript]

9 Precision and Recall Results
Precision for the two language pairs was similar
– This may reflect the fact that we used approximately the same amount of training data for each of the models
The recall for the English–French data was low
– This system does not find alignments in which many English words align to one French word
– This reduced the number of alignments made by the system in comparison to the number of alignments in the gold standard
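For reference, precision, recall, and F-measure over alignment links follow the standard definitions: precision is the fraction of system links found in the gold standard, recall the fraction of gold links found by the system. The scoring routine below is illustrative only; the shared task's actual scorer also handles the sure/probable distinction that slide 5 mentions.

    # Score a system alignment against a gold standard.
    sub score {
        my ($sys, $gold) = @_;               # array refs of "i-j" links
        my %g    = map { $_ => 1 } @$gold;
        my $hits = grep { $g{$_} } @$sys;    # links found in the gold standard
        my $p = @$sys  ? $hits / @$sys  : 0; # precision
        my $r = @$gold ? $hits / @$gold : 0; # recall
        my $f = ($p + $r) ? 2 * $p * $r / ($p + $r) : 0;
        return ($p, $r, $f);
    }

    my ($p, $r, $f) = score([qw(1-1 2-2 3-2)], [qw(1-1 2-2 3-3)]);
    printf "P=%.2f R=%.2f F=%.2f\n", $p, $r, $f;   # P=0.67 R=0.67 F=0.67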

10 Distortion Results
The precision and recall were not significantly affected by the distortion factor
– A distortion factor of 0 resulted in lower precision and recall than a factor of 2, 4, or 6
– Distortion factors of 2, 4, and 6 resulted in approximately the same precision and recall values for each of the language pairs
– That distortion factors of 4 and 6 contain no more information than a factor of 2 suggests that word movement is limited

11 Conclusions on Training Data
Small amount of training data
– We wanted to compare the Romanian–English and English–French results
– Although the Romanian–English data was different from the English–French data, the results were comparable
– We would like to increase the training data to determine how much of an improvement in the results could be obtained

12 Conclusions
Considering modifying the existing Perl implementation to allow for this
Database approach (sketched below)
– Berkeley DB
– NDBM
Re-implementing the algorithm in Perl Data Language (PDL)
– a Perl module that is optimized for matrix and scientific computing
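A minimal sketch of the database idea, assuming Perl's DB_File module tied to a Berkeley DB file; the flat "source target" key scheme and the stored value are invented for illustration, not the Duluth system's actual layout.

    use strict;
    use warnings;
    use Fcntl;
    use DB_File;

    # Tie the translation table to a Berkeley DB file so it lives on disk
    # rather than in memory.
    tie my %t, 'DB_File', 't_table.db', O_RDWR | O_CREAT, 0644, $DB_HASH
        or die "Cannot tie t_table.db: $!";

    $t{'maison house'} = 0.84;               # store t(house|maison)
    printf "%.2f\n", $t{'maison house'};     # read it back through the tie

    untie %t;

The alternative the slide lists, re-implementing with the Perl Data Language (PDL), would instead keep the counts in packed numeric arrays; both routes aim at scaling past what plain in-memory Perl hashes allow.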