Statistical Machine Translation

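General Framework
Given sentences S and T, assume there is a “translator oracle” that can calculate P(T|S), the probability that an “ideal translator” will produce sentence T given sentence S.
Our statistical translator tries to “reverse engineer” the ideal translator. That is, given T, it finds the S with the highest probability P(S|T).
We have: P(T|S)
We want: argmax_S P(S|T) = argmax_S P(S) × P(T|S) / P(T) = argmax_S P(S) × P(T|S), by Bayes' rule, since P(T) is the same for every candidate S.
In this expression, P(S) is the language model, P(T|S) is the translation model, and the argmax over S is the search method.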
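The decision rule above (pick the S that maximizes P(S) × P(T|S)) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `best_source`, `lm_logprob`, and `tm_logprob` are hypothetical, the probabilities are made up, and the candidate list is given by hand rather than produced by a real search method.

```python
import math

def best_source(target, candidates, lm_logprob, tm_logprob):
    """Return the candidate S that maximizes log P(S) + log P(T|S)."""
    return max(candidates, key=lambda s: lm_logprob(s) + tm_logprob(target, s))

# Toy, made-up probabilities (assumptions for illustration only).
LM = {"les propositions": 0.6, "propositions les": 0.1}      # P(S)
TM = {("the proposal", "les propositions"): 0.5,             # P(T|S)
      ("the proposal", "propositions les"): 0.5}

def lm_logprob(s):
    return math.log(LM[s])

def tm_logprob(t, s):
    return math.log(TM[(t, s)])

# Both candidates explain T equally well, so the language model decides.
result = best_source("the proposal", list(LM), lm_logprob, tm_logprob)
print(result)
```

Working in log space is a design choice: sums of logs avoid the numerical underflow that products of many small probabilities would cause.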

Language model
The language model, which scores candidate sentences S, can use an n-gram model.
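As a rough sketch of what "can use an n-gram model" might look like, the following trains a bigram model (n = 2) from a few sentences. The add-one smoothing, the `<s>` and `</s>` boundary markers, and the function names are assumptions made for this example, not details from the slides.

```python
from collections import Counter

def train_bigram(sentences):
    """Count context words and bigrams, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        contexts.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams

def bigram_prob(sentence, contexts, bigrams, vocab_size):
    """P(sentence) as a product of add-one-smoothed bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
    return p

corpus = ["the proposal will be implemented",
          "the proposal will not be implemented"]
contexts, bigrams = train_bigram(corpus)
# Vocabulary of predictable tokens: all word types plus the end marker.
vocab_size = len({w for s in corpus for w in s.split()} | {"</s>"})

fluent = bigram_prob("the proposal will be implemented", contexts, bigrams, vocab_size)
scrambled = bigram_prob("implemented be will proposal the", contexts, bigrams, vocab_size)
print(fluent > scrambled)
```

The model assigns a higher probability to the word order it has seen than to a scrambled version of the same words, which is what lets it prefer fluent candidates.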


Translation model
Need an alignment model that will allow us to calculate the probabilities of alignments, e.g.,
P[The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8) | Les propositions ne seront pas mises en application maintenant]
Here the English sentence is the target sentence and the French sentence is the source sentence.
Notation for alignment: Les propositions ne seront pas mises en application maintenant | The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8)
The numbers after each target word are the positions of the source words it is aligned to; “be” is aligned to no source word.

Translation model
Alignment model consists of:
– fertility model (fertility = number of source words each target word is mapped to)
– term-translation model
– distortion model

Translation model (from Brown et al. paper):
Need to calculate P(alignment), that is:
P[The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8) | Les propositions ne seront pas mises en application maintenant]
To calculate this, we need:
– Fertility model: P(fertility = n | term) for each n (up to a maximum value) and each target term
– Term-translation model: P(term_S | term_T), the probability that term_S appears in the source given that term_T appears in the target
– Distortion model: one simple version assumes that the position of a target term depends only on the position of the source term and the length of the target sentence, giving P(i | j, L) for each target position i, source position j, and target length L (limited to some maximum value for L)

Translation model (from Brown et al. paper):
Example:
P[The (1) proposal (2) will (4) not (3, 5) now (9) be implemented (6, 7, 8) | Les propositions ne seront pas mises en application maintenant]
= P(fertility=1 | the) × P(les | the) × P(1 | 1, 7)
× P(fertility=1 | proposal) × P(propositions | proposal) × P(2 | 2, 7)
× P(fertility=1 | will) × P(seront | will) × P(3 | 4, 7)
× P(fertility=2 | not) × P(ne | not) × P(pas | not) × P(4 | 3, 7) × P(4 | 5, 7)
× etc.
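The product in this example can be computed mechanically once the three tables exist. The sketch below assumes the tables are plain dictionaries keyed as shown in the comments; the function name `alignment_prob` and every probability value are made up for illustration. It scores only a two-word fragment (“will not” against “ne seront pas”) to keep the tables small.

```python
def alignment_prob(target, source, alignment, fert, trans, dist):
    """Multiply fertility, term-translation, and distortion probabilities.

    alignment[k] lists the 1-based source positions aligned to target word k.
    fert  is keyed by (n, target_term)          -> P(fertility = n | term)
    trans is keyed by (source_term, target_term) -> P(term_S | term_T)
    dist  is keyed by (i, j, L)                  -> P(i | j, L)
    """
    L = len(target)                      # target length
    p = 1.0
    for i, (t_word, js) in enumerate(zip(target, alignment), start=1):
        p *= fert[(len(js), t_word)]
        for j in js:
            p *= trans[(source[j - 1], t_word)]
            p *= dist[(i, j, L)]
    return p

# Made-up probabilities (assumptions, not estimates from real data).
fert = {(1, "will"): 0.8, (2, "not"): 0.5}
trans = {("seront", "will"): 0.5, ("ne", "not"): 0.4, ("pas", "not"): 0.5}
dist = {(1, 2, 2): 0.5, (2, 1, 2): 0.5, (2, 3, 2): 0.5}

# will (2) not (1, 3) | ne seront pas
p = alignment_prob(["will", "not"], ["ne", "seront", "pas"],
                   [[2], [1, 3]], fert, trans, dist)
print(p)
```

A target word with fertility 2, such as “not”, contributes one fertility factor but two term-translation factors and two distortion factors, matching the P(ne | not) × P(pas | not) × P(4 | 3, 7) × P(4 | 5, 7) pattern in the example.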

How does the statistical translator learn these various models?
From data, of course! E.g., a massive number of paired source/target sentences from UN translations.
How does the statistical translator search the database for the highest-probability source sentence?
See paper.
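To give a flavor of learning from paired sentences, here is a minimal sketch that estimates only the term-translation probabilities P(term_S | term_T) with expectation-maximization. It is a deliberately simplified stand-in, not the full procedure from the paper: it ignores fertility and distortion, and the function name, the two-sentence corpus, and the iteration count are all assumptions.

```python
from collections import defaultdict

def train_term_translation(pairs, iterations=10):
    """Estimate t[(s, w)] ~ P(source term s | target term w) with EM.

    pairs is a list of (source_tokens, target_tokens) sentence pairs.
    """
    source_vocab = {s for src, _ in pairs for s in src}
    t = defaultdict(lambda: 1.0 / len(source_vocab))   # uniform starting point
    for _ in range(iterations):
        count = defaultdict(float)   # expected co-occurrence counts
        total = defaultdict(float)   # expected counts per target term
        for src, tgt in pairs:
            for s in src:
                # E-step: share each source word among the target words.
                norm = sum(t[(s, w)] for w in tgt)
                for w in tgt:
                    frac = t[(s, w)] / norm
                    count[(s, w)] += frac
                    total[w] += frac
        # M-step: renormalize the expected counts into probabilities.
        for (s, w), c in count.items():
            t[(s, w)] = c / total[w]
    return t

pairs = [(["la", "maison"], ["the", "house"]),
         (["la", "fleur"], ["the", "flower"])]
t = train_term_translation(pairs)
print(t[("la", "the")] > t[("maison", "the")])
```

Because “la” co-occurs with “the” in both pairs while “maison” and “fleur” each appear only once, the estimates shift toward pairing “la” with “the”, which in turn frees “maison” to pair with “house”.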