LING 388: Language and Computers Sandiway Fong Lecture 27: 12/1

Today
Three things:
– a bit more on the technology behind Statistical Machine Translation (SMT)
– Homework 4 Review
– Class Evaluations

Part 1

Last Time
Statistical Machine Translation (SMT)
– popular now
– Language Weaver (Arabic, also French etc.)
  – newest one: Persian

Translating is EU's new boom industry
– 2004 article

The market is there: opportunities for machine translation?

Statistical MT
– avoids the explicit construction of linguistically sophisticated models of grammar
– pioneered by IBM researchers (Brown et al., 1990)
  – Language Model: P(S), estimated by n-grams
  – Translation Model: P(T|S), estimated through alignment models
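
The two models combine via Bayes' rule: for an observed target sentence T, we want the source S maximizing P(S|T) = P(S)P(T|S)/P(T); since P(T) is fixed once T is given, this is the S that maximizes P(S)P(T|S), the search problem taken up a few slides below.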

N-grams
language model: P(sentence)
idea:
– collect statistics on the co-occurrence of adjacent words
Brown corpus (1 million words):
– word w | frequency(w) | probability(w)
– the | 69,971 | ≈ 0.07
– rabbit | … | …
example:
– Just then, the white …
– expectation is p(white rabbit) > p(white the)
– but p(the) > p(rabbit)

Statistical Machine Translation
Machine Translation:
– source sentence S
– target sentence T
– every pair (S,T) has a probability
– P(T|S) = probability that the target is T given source S

Statistical Machine Translation
The Language Model: P(S)
– bigrams: the sentence w1 w2 w3 w4 w5 yields the bigrams w1 w2, w2 w3, w3 w4, w4 w5
– sequences of words S = w1 … wn
  P(S) = P(w1) P(w2|w1) … P(wn|w1…wn-1)
  – the product of the probability of each wi given the preceding context for wi
  – problem: we need to know too many probabilities
– bigram approximation: limit the context to the previous word
  P(S) ≈ P(w1) P(w2|w1) … P(wn|wn-1)
– bigram probability estimation from corpora:
  P(wi|wi-1) ≈ freq(wi-1 wi) / freq(wi-1) in a corpus
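
As an illustration only (not code from the lecture), here is a minimal SWI-Prolog sketch of these bigram estimates, assuming the corpus is represented as a flat list of word atoms (aggregate_all/3 is SWI-Prolog-specific):

% frequency of word W in the corpus
unigram_count(W, Corpus, N) :-
    aggregate_all(count, member(W, Corpus), N).

% frequency of the adjacent pair W1 W2
bigram_count(W1, W2, Corpus, N) :-
    aggregate_all(count, append(_, [W1,W2|_], Corpus), N).

% P(W2|W1) ~ freq(W1 W2) / freq(W1)
bigram_prob(W1, W2, Corpus, P) :-
    bigram_count(W1, W2, Corpus, N12),
    unigram_count(W1, Corpus, N1),
    N1 > 0,
    P is N12 / N1.

% P(S) under the bigram approximation (conditional terms only,
% ignoring the initial P(w1) factor)
sentence_prob([_], _, 1.0).
sentence_prob([W1,W2|Ws], Corpus, P) :-
    bigram_prob(W1, W2, Corpus, P12),
    sentence_prob([W2|Ws], Corpus, Rest),
    P is P12 * Rest.

e.g. ?- sentence_prob([the,white,rabbit], [just,then,the,white,rabbit,ran,off], P). gives P = 1.0 on this tiny corpus.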

Statistical Machine Translation
The Translation Model: P(T|S)
– alignment model: assume there is a transfer relationship between source and target words
  – not necessarily 1-to-1
– example:
  S = w1 w2 w3 w4 w5 w6 w7
  T = u1 u2 u3 u4 u5 u6 u7 u8 u9
  w4 -> u3 u5 (fertility of w4 = 2)
  w5 -> u9 (distortion)

Statistical Machine Translation
Alignment notation:
– use word positions in parentheses
– no word position means no mapping
– example:
  (Les propositions ne seront pas mises en application maintenant | The(1) proposal(2) will(4) not(3,5) now(9) be implemented(6,7,8))
– this particular alignment is not correct; it is an artifact of their algorithm

Statistical Machine Translation
How do we compute the probability of an alignment? We need to estimate:
– fertility probabilities
  P(fertility=n|w) = probability that word w has fertility n
– distortion probabilities
  P(i|j,l) = probability that the target word is at position i, given a source word at position j and target length l
Example:
(Le chien est battu par Jean | John(6) does beat(3,4) the(1) dog(2))
P(f=1|John) P(Jean|John) x
P(f=0|does) x
P(f=2|beat) P(est|beat) P(battu|beat) x
P(f=1|the) P(Le|the) x
P(f=1|dog) P(chien|dog) x
P(f=1|NULL) P(par|NULL) x
distortion probabilities…
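
Schematically, the probability of producing T with alignment A from S is a product of one fertility term per source word, one word-translation term per generated target word, and one distortion term per target position, i.e. P(T,A|S) ≈ ∏ P(fertility|w) × ∏ P(u|w) × ∏ P(i|j,l). This is a simplification (the full model in Brown et al. adds combinatorial factors and a special treatment of the empty word), but it is exactly the shape of the product in the example above.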

Statistical Machine Translation
Not done yet:
– given T, the translation problem is to find the S that maximizes P(S)P(T|S)
– we can't look at all possible S in the language
Idea (Search):
– construct the best S incrementally
  – start with a highly likely word transfer and find a valid alignment
  – extend the candidate S at each step
  – (Jean aime Marie | *)
  – (Jean aime Marie | John(1) *)
Failure?
– the best S is not a good translation
  – the language model failed, or the translation model failed
– couldn't find the best S
  – search failure

Statistical Machine Translation
Parameter Estimation
– English/French, from the Hansard corpus
  – 100 million words
  – bilingual Canadian parliamentary proceedings
  – unaligned corpus
– Language Model P(S): from the bigram model
– Translation Model: how to estimate this with an unaligned corpus?
  – used the EM (Expectation-Maximization) algorithm, an iterative algorithm for re-estimating probabilities
– need:
  – P(u|w) for words u in T and w in S
  – P(n|w) for fertility n and w in S
  – P(i|j,l) for target position i, source position j and target length l
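
To make the EM idea concrete with the simplest alignment model (word-translation probabilities only, ignoring fertility and distortion; an illustration, not the exact model of the experiments): in the E-step, each target word u in a sentence pair spreads a fractional count over the source words w in proportion to P(u|w) / Σw' P(u|w'), using the current estimates; in the M-step, these fractional counts are summed over the whole corpus and renormalized per source word to give the new P(u|w). Starting from a uniform initial guess, iterating shifts probability mass onto word pairs that consistently co-occur across sentence pairs.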

Statistical Machine Translation
Experiment 1: Parameter Estimation for the Translation Model
– pick the 9,000 most common words for French and English
– 40,000 sentence pairs
– 81,000,000 parameters
– initial guess: minimal assumptions

Statistical Machine Translation
Experiment 1: results
– (English) Hear, hear!
– (French) Bravo!

Statistical Machine Translation
Experiment 2: Translation from French to English
– make the task manageable:
  – English lexicon: the 1,000 most frequent English words in the corpus
  – French lexicon: the 1,700 most frequent French words in translations completely covered by the selected English words
– 117,000 sentence pairs with words covered by the lexicons
– 17 million parameters estimated for the translation model
– bigram model of English: 570,000 sentences, 12 million words
– 73 test sentences
– categories: (exact, alternate, different), wrong, ungrammatical

Statistical Machine Translation
(Experiment 2 results table; not reproduced in this transcript)

Statistical Machine Translation
– 48% of the test sentences fell into the categories (exact, alternate, different)
– editing: 776 keystrokes to correct the system output vs. 1,916 keystrokes for the Hansard translations

Part 2 Homework 4 Review

English Grammar: e21.pl
DCG rules
sbar(PA) --> np(X,wh), do(_,_), s_objectwh(_,S,P), {headof(X,O), PA =.. [P,S,O]}.
sbar(S) --> s(S).
s_objectwh(s(Y,Z),S,P) --> np(Y,_), vp_objectwh(Z), {headof(Y,S), headof(Z,P)}.
s(PA) --> np(Y,_), vp(Z,_), {predarg(Y,Z,1,PA)}.
np(np(Y),Q) --> pronoun(Y,Q).
np(np(Y),notwh) --> proper_noun(Y).
np(np(D,N),Q) --> det(D,Number), common_noun(N,Number,Q).
vp(vp(v(died)),ed) --> [kicked,the,bucket].
vp(vp(Y,Z),F) --> transitive(Y,F), np(Z,_).
vp(vp(A,V),F) --> aux(A,F), transitive(V,en).
vp_objectwh(vp(Y)) --> transitive(Y,root).
det(det(the),_) --> [the].
det(det(a),sg) --> [a].
common_noun(n(bucket),sg,notwh) --> [bucket].
common_noun(n(buckets),pl,notwh) --> [buckets].
common_noun(n(apple),sg,notwh) --> [apple].
common_noun(n(apples),pl,notwh) --> [apples].
common_noun(n(man),sg,notwh) --> [man].
common_noun(n(book),sg,notwh) --> [book].
common_noun(n(books),pl,notwh) --> [books].

English Grammar: e21.pl
pronoun(who,wh) --> [who].
pronoun(what,wh) --> [what].
proper_noun(john) --> [john].
transitive(v(eats),s) --> [eats].
transitive(v(ate),ed) --> [ate].
transitive(v(eaten),en) --> [eaten].
transitive(v(buy),root) --> [buy].
transitive(v(buys),s) --> [buys].
transitive(v(bought),ed) --> [bought].
transitive(v(bought),en) --> [bought].
transitive(v(kicks),s) --> [kicks].
transitive(v(kicked),ed) --> [kicked].
transitive(v(kicked),en) --> [kicked].
aux(aux(was),ed) --> [was].
aux(aux(is),s) --> [is].
do(aux(does),s) --> [does].
do(aux(did),ed) --> [did].

Japanese Grammar: j21.pl
DCG Rules
% j-prefixed predicate names avoid clashes with e21.pl when both grammars
% are loaded, and match the js/3 call in the translator t.pl below
js(PA) --> jnp(Y,Q1), nomcase, jvp(Z,Q2), sf(Q1,Q2), {predarg(Y,Z,2,PA)}.
jvp(vp(Z,Y),Q) --> jnp(Z,Q), acccase, jtransitive(Y).
jtransitive(v(katta)) --> [katta].
nomcase --> [ga].
acccase --> [o].
jnp(np(taroo),notwh) --> [taroo].
jnp(np(hon),notwh) --> [hon].
jnp(np(dare),wh) --> [dare].
jnp(np(nani),wh) --> [nani].
sf(wh,notwh) --> [ka].
sf(notwh,wh) --> [ka].
sf(notwh,notwh) --> [].
sf(wh,wh) --> [ka].
predarg(X,Y,Order,PA) :-
    headof(X,S), headof(Y,P),
    order(Order,Y,NP), headof(NP,O),
    PA =.. [P,S,O].
predarg(X,Y,_,PA) :-
    headof(X,S), headof(Y,P),
    Y = vp(_),
    PA =.. [P,S].
order(1,vp(_,NP),NP).
order(2,vp(NP,_),NP).
headof(np(_,n(N)),N).
headof(vp(v(V),_),V).
headof(vp(_,v(V)),V).
headof(vp(v(V)),V).
headof(np(N),N).

Translator: t.pl
Prolog translation code
translate(E,J) :-        % translator
    sbar(X,E,[]),        % English grammar
    mapPA(X,Xp),
    js(Xp,J,[]).         % Japanese grammar
mapPA(E,J) :-            % map predicate-argument structure
    E =.. [P,S,O],
    je(PJ,P), je(SJ,S), je(OJ,O),
    J =.. [PJ,SJ,OJ].
je(katta,bought).        % bilingual dictionary
je(hon,book).
je(taroo,john).
je(dare,who).
je(nani,what).
je(katta,buy).
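
For example, with e21.pl, j21.pl and t.pl all loaded, the query below should translate a simple transitive sentence (before the Question 2 fix on the next slides, the same answer comes back twice, once per tense parse of bought):

?- translate([john,bought,a,book],J).
J = [taroo,ga,hon,o,katta]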

Question 2: Tense
Homework Question
(A) English morphology and tense
– (1) (1pt) Why does ?- translate(X,[taroo,ga,hon,o,katta]). return duplicate answers?
– (2) (2pts) Fix the problem.
– (3) (1pt) *John buy the book is accepted by the English grammar (alongside John buys the book); fix the problem.
– submit both your grammar and relevant examples

Question 2: Tense
English Grammar
code:
vp(vp(Y,Z),F) --> transitive(Y,F), {F \== en, F \== root}, np(Z,_).
replacing the unrestricted
vp(vp(Y,Z),F) --> transitive(Y,F), np(Z,_).
this blocks both:
– the spurious parse of John bought the book with bought as the -en form (the source of the duplicate answers)
– *John buy the book (buy = root form)

Question 2: Tense
Homework Question
(B) Tense and predicate-argument structure
– let's expand the grammar slightly
– assume kau (buy(s)) is the present tense form of katta (bought)
(3pts) Modify the translator to respect tenses when translating between:
– John buys a book ↔ taroo-ga hon-o kau
– John bought a book ↔ taroo-ga hon-o katta
– submit both your code and all relevant translations, e.g.
  ?- translate([john,buys,a,book],X).
  ?- translate(X,[taroo,ga,hon,o,kau]).

Question 2: Tense
Translator code:
je(kau,buys).
Japanese grammar code:
jtransitive(v(kau)) --> [kau].
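
With these additions, the tense pairs from the question should come out as follows (a sketch of the expected behavior):

?- translate([john,buys,a,book],X).
X = [taroo,ga,hon,o,kau]

?- translate([john,bought,a,book],X).
X = [taroo,ga,hon,o,katta]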

Question 3: Yes-No Questions
Homework Question
– modify the English and Japanese grammars to incorporate yes-no questions
– modify the translator to operate on yes-no questions
– examples:
  Did John buy a book? → yesno(buy(john,book))
  Taroo-ga hon-o katta ka → yesno(katta(taroo,hon))

Question 3: Yes-No Questions
English grammar code:
sbar(yesno(PA)) --> do(_,_), s_rootv(PA).
s_rootv(PA) --> np(Y,_), vp_rootv(Z,_), {predarg(Y,Z,1,PA)}.
vp_rootv(vp(Y,Z),root) --> transitive(Y,root), np(Z,_).
Japanese grammar code:
js(yesno(PA)) --> jnp(Y,notwh), nomcase, jvp(Z,notwh), [ka], {predarg(Y,Z,2,PA)}.
Translator code:
mapPA(yesno(E),yesno(J)) :- mapPA(E,J).
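
Expected behavior on the homework example (assuming the rules above are added to the grammars and translator):

?- translate([did,john,buy,a,book],X).
X = [taroo,ga,hon,o,katta,ka]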

Question 4: English Idiom
Complete the translator so that:
– John kicked the bucket has both a literal and an idiomatic translation:
  Taroo-ga buketsu-o ketta
  Taroo-ga shinda
  (buketsu = bucket, shinda = died, ketta = kicked)
– John kicked the buckets has only a literal translation:
  Taroo-ga buketsu-o ketta
  (assuming Japanese does not distinguish number)

Question 4: English Idiom
Japanese grammar code:
jvp(vp(Y),notwh) --> jintransitive(Y).
jintransitive(v(shinda)) --> [shinda].
jtransitive(v(ketta)) --> [ketta].
jnp(np(buketsu),notwh) --> [buketsu].
Translator code:
je(ketta,kicked).
je(shinda,died).
je(buketsu,bucket).
je(buketsu,buckets).
mapPA(E,J) :-            % for intransitive P(S) → PJ(SJ)
    E =.. [P,S],
    je(PJ,P), je(SJ,S),
    J =.. [PJ,SJ].
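
Expected behavior (solution order may vary; the idiomatic reading comes from the [kicked,the,bucket] vp rule in e21.pl):

?- translate([john,kicked,the,bucket],J).
J = [taroo,ga,shinda] ;
J = [taroo,ga,buketsu,o,ketta]

?- translate([john,kicked,the,buckets],J).
J = [taroo,ga,buketsu,o,ketta]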

Question 5: Japanese Idiom
examples:
– taroo-ga sensei-ni goma-o sutta
  taroo-nom teacher-dat sesame-acc ground
  "John flattered the teacher"
– taroo-ga Hanako-ni goma-o sutta
  taroo-nom Hanako-dat sesame-acc ground
  "John flattered Mary"
– ni = dative (dat) Case marker
– odateta is the Japanese counterpart of flattered

Question 5: Japanese Idiom
English grammar code:
common_noun(n(teacher),sg,notwh) --> [teacher].
proper_noun(mary) --> [mary].
transitive(v(flattered),ed) --> [flattered].
Japanese grammar code:
jvp(vp(Z,v(odateta)),Q) --> jnp(Z,Q), datcase, [goma], acccase, [sutta].
jtransitive(v(odateta)) --> [odateta].
jnp(np(hanako),notwh) --> [hanako].
jnp(np(sensei),notwh) --> [sensei].
datcase --> [ni].
Translator code:
je(odateta,flattered).
je(hanako,mary).
je(sensei,teacher).
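
Expected behavior (solution order may vary; both the idiomatic and the literal Japanese forms are generated):

?- translate([john,flattered,the,teacher],J).
J = [taroo,ga,sensei,ni,goma,o,sutta] ;
J = [taroo,ga,sensei,o,odateta]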

Part 3 Class Evaluations