Machine Translation Om Damani (Ack: Material taken from Jurafsky & Martin, 2nd Ed., and Brown et al. 1993)


2 State of the Art
The spirit is willing but the flesh is weak -> English-Russian Translation System -> Дух охотно готов но плоть слаба -> Russian-English Translation System -> The vodka is good, but the meat is rotten
Babelfish: Spirit is willingly ready but flesh it is weak
Google: The spirit is willing but the flesh is week

3 State of the Art (English-Hindi) – March 19, 2009
The spirit is willing but the flesh is weak -> Google English-Hindi Translation System -> आत्मा पर शरीर दुर्बल है -> Google Hindi-English Translation System -> Spirit on the flesh is weak

4 Is State of the Art (English-Hindi) so bad?
Is state of the art so bad -> Google English-Hindi Translation System -> कला की हालत इतनी खराब है -> Google Hindi-English Translation System -> The state of the art is so bad

5 State of the english-hindi translation is not so bad
State of the english hindi translation is not so bad -> Google English-Hindi Translation System -> राज्य के अंग्रेज़ी हिन्दी अनुवाद का इतना बुरा नहीं है -> Google Hindi-English Translation System -> State of the English translation of English is not so bad
OK. Maybe it is __ bad.

6 State of the English-Hindi translation is not so bad
State of the English Hindi translation is not so bad -> Google English-Hindi Translation System -> राज्य में अंग्रेजी से हिंदी अनुवाद का इतना बुरा नहीं है -> Google Hindi-English Translation System -> English to Hindi translation in the state is not so bad
OK. Maybe it is __ __ bad.
राज्य के अंग्रेज़ी हिन्दी अनुवाद का इतना बुरा नहीं है

7 Your Approach to Machine Translation

8 Translation Approaches

9 Direct Transfer – What Novices do

10 Direct Transfer: Limitations
कई बंगाली कवियों ने इस भूमि के गीत गाए हैं
Kai Bangali kaviyon ne is bhoomi ke geet gaaye hain
Morph: कई बंगाली कवि-PL,OBL ने इस भूमि के गीत {गाए है}-PrPer,Pl
Kai Bangali kavi-PL,OBL ne is bhoomi ke geet {gaaye hai}-PrPer,Pl
Lexical Transfer: Many Bengali poet-PL,OBL this land of songs {sing has}-PrPer,Pl
Local Reordering: Many Bengali poet-PL,OBL of this land songs {has sing}-PrPer,Pl
Final: Many Bengali poets of this land songs have sung
Desired translation: Many Bengali poets have sung songs of this land

11 Syntax Transfer (Analysis-Transfer-Generation) Here phrases NP, VP etc. can be arbitrarily large

12 Syntax Transfer Limitations
He went to Patna -> Vah Patna gaya
He went to Patil -> Vah Patil ke pas gaya
The translation of 'went' depends on the semantics of the object of 'went'.
Fatima eats salad with spoon: what happens if you change 'spoon'?
Semantic properties need to be included in transfer rules – Semantic Transfer

13 Interlingua Based Transfer
[Figure: interlingua graph with nodes you, this, farmer, contact, region, khatav, manchar, taluka; relation labels agt, obj, pur, plc, nam, or; scope label :01]
For this, you contact the farmers of Manchar region or of Khatav taluka.
In theory: N analysis and N transfer modules instead of N^2
In practice: Amazingly complex system to tackle N^2 language pairs

14 Difficulties in Translation – Language Divergence (Concepts from Dorr 1993, Text/Figures from Dave, Parikh and Bhattacharyya 2002)
Constituent Order
Prepositional Stranding
Null Subject
Conflational Divergence
Categorical Divergence

15 Lost in Translation: We are talking mostly about syntax, not semantics, or pragmatics
You: Could you give me a glass of water
Robot: Yes. ….wait..wait..nothing happens..wait… …Aha, I see…
You: Will you give me a glass of water
…wait…wait..wait..
Image from

16 CheckPoint
• State of the Art
• Different Approaches
• Translation Difficulty
• Need for a novel approach

17 Statistical Machine Translation: Most ridiculous idea ever
Consider all possible partitions of a sentence.
For a given partition, consider all possible translations of each part.
Consider all possible combinations of all possible translations.
Consider all possible permutations of each combination.
And somehow select the best partition/translation/permutation.
कई बंगाली कवियों ने इस भूमि के गीत गाए हैं
Kai Bangali kaviyon ne is bhoomi ke geet gaaye hain
Partition: कई बंगाली | कवियों | ने इस | भूमि | के | गीत गाए हैं
Translation options per part (rows as they appear on the slide):
Many Bengali | Poets | this | land | of | have sung poem
Several Bengali | to this | place | ‘s | sing songs
Many poets from Bangal in this | space | song sung
Poets from Bangladesh farm | have sung songs
One resulting candidate: To this space have sung songs of many poets from Bangal

18 How many combinations are we talking about?
Number of choices for an N-word sentence, N=20: ??
Number of possible chess games
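The slide leaves the count as a question. A rough sketch of one way to estimate it, under assumptions that are not in the original: an N-word sentence can be cut into k contiguous phrases in C(N-1, k-1) ways, every phrase is assumed to have the same number of translation options (the hypothetical parameter `options_per_phrase`), and the k translated phrases can be permuted in k! ways.

```python
from math import comb, factorial

def count_candidates(n_words: int, options_per_phrase: int) -> int:
    """Count partition/translation/permutation choices for an n-word sentence.

    Assumes (hypothetically) that every phrase has the same number of
    translation options and that any phrase order is allowed.
    """
    total = 0
    for k in range(1, n_words + 1):                 # k = number of phrases
        partitions = comb(n_words - 1, k - 1)       # ways to cut into k contiguous phrases
        translations = options_per_phrase ** k      # one choice per phrase
        orderings = factorial(k)                    # permutations of the k phrases
        total += partitions * translations * orderings
    return total

if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, count_candidates(n, options_per_phrase=3))
```

Even with a single translation option per phrase, the k = N term alone contributes N! orderings, which is why exhaustive enumeration is hopeless for N=20 and a decoder has to search heuristically.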

19 How do we get the Phrase Table
Collect a large amount of bilingual parallel text.
For each sentence pair, consider all possible partitions of both sentences.
For a given partition pair, consider all possible mappings between parts (phrases) on the two sides.
Somehow assign a probability to each phrase pair.
इसके लिए आप मंचर क्षेत्र के किसानों सॆ संपर्क कीजिए
For this you contact the farmers of Manchar region

20 Data Sparsity Problems in Creating Phrase Table
Sunil is eating mango -> Sunil aam khata hai
Noori is eating banana -> Noori kela khati hai
Sunil is eating banana -> ?
We need examples of everyone eating everything !!
We want to figure out that 'eating' can be either 'khata hai' or 'khati hai',
and let the Language Model select from ‘Sunil kela khata hai’ and ‘Sunil kela khati hai’.
Select well-formed sentences among all candidates using the LM.
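The idea of over-generating candidates and leaving the choice to the language model can be sketched as follows. The phrase table entries and the fixed subject-object-verb output order are hypothetical, hand-written for illustration; nothing here is learned from data, and the language-model scoring step is left out.

```python
from itertools import product

# Hypothetical toy phrase table (hand-written, not estimated from a corpus).
PHRASE_TABLE = {
    "Sunil": ["Sunil"],
    "Noori": ["Noori"],
    "mango": ["aam"],
    "banana": ["kela"],
    "is eating": ["khata hai", "khati hai"],  # both verb forms are kept
}

def candidates(subject: str, verb: str, obj: str) -> list[str]:
    """Return every Hindi candidate for an English subject-verb-object clause.

    The output order subject-object-verb is hard-coded as an assumption.
    """
    options = product(PHRASE_TABLE[subject], PHRASE_TABLE[obj], PHRASE_TABLE[verb])
    return [" ".join(parts) for parts in options]

if __name__ == "__main__":
    print(candidates("Sunil", "is eating", "banana"))
```

The unseen pair 'Sunil is eating banana' now yields both 'Sunil kela khata hai' and 'Sunil kela khati hai'; picking the well-formed one is the language model's job.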

21 Formulating the Problem. A language model to compute P(E). A translation model to compute P(F|E). A decoder, which is given F and produces the most probable E
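The three components fit together through Bayes' rule. Written out (the hat notation for the decoder's output is an addition here, not from the slide):

```latex
\hat{E} \;=\; \arg\max_{E} P(E \mid F)
        \;=\; \arg\max_{E} \frac{P(F \mid E)\, P(E)}{P(F)}
        \;=\; \arg\max_{E} P(F \mid E)\, P(E)
```

P(F) can be dropped because F is fixed while the decoder searches over E.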

22 P(F|E) vs. P(E|F)
P(F|E) is the translation probability: we need to look at the generation process by which the (E,F) pair is obtained.
Parts of F correspond to parts of E. With suitable independence assumptions, P(F|E) measures whether all parts of E are covered by F.
E can be quite ill-formed. It is OK if P(F|E) for an ill-formed E is greater than P(F|E) for a well-formed E. Multiplication by P(E) should hopefully take care of it.
We do not have that luxury in estimating P(E|F) directly: we would need to ensure that well-formed E score higher.
Summary: For computing P(F|E), we may make several independence assumptions that are not valid. P(E) compensates for that.
P(बारिश हो रही है | It is raining) = .02
P(बरसात आ रही है | It is raining) = .03
P(बारिश हो रही है | rain is happening) = .420
We need to estimate P(It is raining | बारिश हो रही है) vs. P(rain is happening | बारिश हो रही है)
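A small numeric sketch of how P(E) compensates. The P(F|E) values are the ones on the slide; the P(E) values are hypothetical, chosen only to reflect that 'It is raining' is far more fluent English than 'rain is happening'.

```python
# P(F|E) for F = 'बारिश हो रही है', taken from the slide.
TRANSLATION_MODEL = {
    "It is raining": 0.02,
    "rain is happening": 0.420,
}

# P(E): hypothetical language-model probabilities (assumed, not from the slide).
LANGUAGE_MODEL = {
    "It is raining": 1e-4,
    "rain is happening": 1e-7,
}

def best_translation(tm: dict[str, float], lm: dict[str, float]) -> str:
    """Return the E that maximizes P(F|E) * P(E)."""
    return max(tm, key=lambda e: tm[e] * lm[e])

if __name__ == "__main__":
    print("translation model alone:", max(TRANSLATION_MODEL, key=TRANSLATION_MODEL.get))
    print("with language model:    ", best_translation(TRANSLATION_MODEL, LANGUAGE_MODEL))
```

The translation model alone prefers the ill-formed 'rain is happening'; once multiplied by the assumed P(E), 'It is raining' wins.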

23 CheckPoint
• From a parallel corpus, generate a probabilistic phrase table
• Given a sentence, generate various candidate translations using the phrase table
• Evaluate the candidates using Translation and Language Models

24 What is the meaning of Probability of Translation
• What is the meaning of P(F|E)?
• By Magic: you simply know P(F|E) for every (E,F) pair – counting in a parallel corpus
• Or, each word in E generates one word of F, independent of every other word in E or F
• Or, we need a ‘random process’ to generate F from E
• A semantic graph G is generated from E and F is generated from G
We are no better off. We now have to estimate P(G|E) and P(F|G) for various G and then combine them – How? We may have a deterministic procedure to convert E to G, in which case we still need to estimate P(F|G)
• A parse tree T_E is generated from E; T_E is transformed to T_F; finally T_F is converted into F
Can you write the mathematical expression?
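One possible answer to the question, offered as a sketch rather than the author's intended expression. It assumes each stage depends only on the output of the previous stage, so the intermediate structures can be summed out:

```latex
% Semantic-graph process: E -> G -> F
P(F \mid E) \;=\; \sum_{G} P(G \mid E)\, P(F \mid G)

% Parse-tree process: E -> T_E -> T_F -> F
P(F \mid E) \;=\; \sum_{T_E}\sum_{T_F} P(T_E \mid E)\, P(T_F \mid T_E)\, P(F \mid T_F)
```

If the conversion from E to G (or from E to T_E) is deterministic, the corresponding sum collapses to a single term.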

25 The Generation Process
• Partition: Think of all possible partitions of the source language
• Lexicalization: For a given partition, translate each phrase into the foreign language
• Spurious insertion: add foreign words that are not attributable to any source phrase
• Reordering: permute the set of all foreign words - words possibly moving across phrase boundaries
Try writing the probability expression for the generation process
We need the notion of alignment
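With an alignment A as a hidden variable, the standard formulation in Brown et al. 1993 sums over every way F could have been generated from E:

```latex
P(F \mid E) \;=\; \sum_{A} P(F, A \mid E)
```

How P(F, A | E) itself factors into lexicalization, insertion and reordering terms depends on the specific model and is not fixed by this expression.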

26 Generation Example: Alignment

27 Simplify Generation: Only 1->Many Alignments allowed

28 Alignment
A function from target position to source position.
The alignment sequence is: 2,3,4,5,6,6,6
Alignment function A: A(1) = 2, A(2) = 3, ..
A different alignment function will give the sequence 1,2,1,2,3,4,3,4 for A(1), A(2), ..
To allow spurious insertion, allow alignment with word 0 (NULL)
No. of possible alignments: (I+1)^J, where I is the source length and J is the target length
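A minimal sketch of alignments as functions from target positions to source positions, with position 0 standing for NULL. The function names are illustrative, and the source length of 6 used for the slide's example is an assumption (it is the smallest length consistent with the sequence 2,3,4,5,6,6,6).

```python
from itertools import product

def all_alignments(source_len: int, target_len: int):
    """Yield every alignment as a tuple a, where a[j-1] is the source
    position (0 = NULL) aligned to target position j."""
    return product(range(source_len + 1), repeat=target_len)

def count_alignments(source_len: int, target_len: int) -> int:
    """Closed-form count: (I+1)^J."""
    return (source_len + 1) ** target_len

# The alignment sequence from the slide, read as a function A(j).
SLIDE_ALIGNMENT = (2, 3, 4, 5, 6, 6, 6)

def A(j: int) -> int:
    """Source position for target position j (1-indexed)."""
    return SLIDE_ALIGNMENT[j - 1]

if __name__ == "__main__":
    # Small case: enumeration agrees with the formula.
    print(len(list(all_alignments(2, 3))), count_alignments(2, 3))
    # Slide example with assumed source length I=6 and target length J=7.
    print(A(1), A(2), count_alignments(6, len(SLIDE_ALIGNMENT)))
```

Because each target position picks its source position independently, several target words may share one source word (the 1->Many case), but one target word never maps to two source words.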

29 CheckPoint
• From a parallel corpus, generate a probabilistic phrase table
• Given a sentence, generate various candidate translations using the phrase table
• Evaluate the candidates using Translation and Language Models
• Understanding of the Generation Process is critical
• Notion of Alignment is important