Machine Translation Divergences: A Formal Description and Proposed Solution Bonnie J. Dorr University of Maryland Presented by: Soobia Afroz.

2 What is Machine Translation Divergence? Source Language → Machine Translation System (cross-linguistic distinctions) → Target Language. There are two kinds of distinctions between the source language and the target language. Translation divergences: the same information is conveyed in the source and target texts, but the structures of the sentences are different. Translation mismatches: the information conveyed differs between the source and target languages. The first type is the focus of this paper.

3 Formal Definitions Lexical conceptual structure (LCS): The LCS is the representation used to map between the interlingual representation and the surface syntactic representation. It conforms to the following structural form: [T(X') X' ([T(W') W'], [T(Z'1) Z'1] ... [T(Z'n) Z'n] [T(Q'1) Q'1] ... [T(Q'm) Q'm])] where X' = the logical head, W' = the logical subject, Z'1 ... Z'n = the logical arguments, Q'1 ... Q'm = the logical modifiers, and T(phi') = the logical type (Event, State, Path, Position, etc.) corresponding to the primitive phi' (CAUSE, LET, GO, STAY, BE, etc.).

4 Example LCS: The LCS representation of “John went happily to school”: [Event GO_Loc ([Thing JOHN], [Path TO_Loc ([Position AT_Loc ([Thing JOHN], [Location SCHOOL])])], [Manner HAPPILY])]
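The bracketed form above can be held in a small tree structure. The following is a minimal sketch, not the representation used in the paper's system: the class name `LCS`, its field names, and the `render` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LCS:
    """One LCS node (hypothetical layout): a logical type, a primitive,
    an optional logical subject, logical arguments, and logical modifiers."""
    type: str                                  # e.g. "Event", "Thing", "Path"
    primitive: str                             # e.g. "GO_Loc", "JOHN"
    subject: Optional["LCS"] = None            # W'
    arguments: List["LCS"] = field(default_factory=list)   # Z'1 ... Z'n
    modifiers: List["LCS"] = field(default_factory=list)   # Q'1 ... Q'm


def render(node: LCS) -> str:
    """Print a node in the bracket notation used on the slides."""
    daughters = ([node.subject] if node.subject else []) + node.arguments + node.modifiers
    head = f"{node.type} {node.primitive}"
    if not daughters:
        return f"[{head}]"
    return f"[{head} ({', '.join(render(d) for d in daughters)})]"


# "John went happily to school"
john = LCS("Thing", "JOHN")
at = LCS("Position", "AT_Loc", subject=john, arguments=[LCS("Location", "SCHOOL")])
to = LCS("Path", "TO_Loc", arguments=[at])
went = LCS("Event", "GO_Loc", subject=john, arguments=[to],
           modifiers=[LCS("Manner", "HAPPILY")])

print(render(went))
```

Keeping subject, arguments, and modifiers in separate fields mirrors the W', Z', and Q' positions of the formal definition, so later mapping steps can address each position by name.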

5 RLCS and CLCS: Root LCS (RLCS) = an uninstantiated LCS that is associated with a word definition in the lexicon. Example: the RLCS associated with the word “go” = [Event GO_Loc ([Thing X], [Path TO_Loc ([Position AT_Loc ([Thing X], [Location Z])])])] Composed LCS (CLCS) = an instantiated LCS that is the result of combining two or more RLCSs by means of unification. This is the interlingua, or language-independent, form that serves as the pivot between the source and target languages. Example: compose the RLCS for “go” with the RLCSs for John ([Thing JOHN]), school ([Location SCHOOL]), and happily ([Manner HAPPILY]) to get the CLCS corresponding to “John went happily to school”: [Event GO_Loc ([Thing JOHN], [Path TO_Loc ([Position AT_Loc ([Thing JOHN], [Location SCHOOL])])], [Manner HAPPILY])]
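The composition step can be sketched as follows. This is a deliberately simplified stand-in for unification: it only substitutes typed fillers for the variables X and Z and rejects a filler whose type does not match the slot. The tuple encoding and the function name `instantiate` are assumptions made for this sketch, not part of the paper.

```python
# An LCS is encoded here (hypothetically) as a tuple: (type, primitive, *daughters).
GO_RLCS = (
    "Event", "GO_Loc",
    ("Thing", "X"),
    ("Path", "TO_Loc",
     ("Position", "AT_Loc", ("Thing", "X"), ("Location", "Z"))),
)


def instantiate(rlcs, bindings):
    """Replace variable slots in an RLCS with the LCSs bound to them.

    A slot is filled only if the filler's logical type matches the slot's
    type; this type check is the only part of unification modelled here.
    """
    type_, primitive, *daughters = rlcs
    if primitive in bindings:
        filler = bindings[primitive]
        if filler[0] != type_:
            raise ValueError(f"type clash: slot {type_}, filler {filler[0]}")
        return filler
    return (type_, primitive, *[instantiate(d, bindings) for d in daughters])


bindings = {"X": ("Thing", "JOHN"), "Z": ("Location", "SCHOOL")}

# The manner modifier is attached after the argument slots are filled.
clcs = instantiate(GO_RLCS, bindings) + (("Manner", "HAPPILY"),)
print(clcs)
```

Both occurrences of X receive the same filler, which is why JOHN appears twice in the composed structure.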

6 Syntactic Phrase: A fundamental component of the mapping between the interlingual representation and the surface syntactic representation. Example: “John went happily to school” = [C-MAX [I-MAX [N-MAX John] [V-MAX [V went] [ADV happily] [P-MAX to [N-MAX school]]]]] where the syntactic head is [V went], the external argument is [N-MAX John], the internal argument is [P-MAX to ...], and the syntactic adjunct is [ADV happily].

7 Formalizing the Mapping: Generalized linking routine (GLR): Systematically relates the logical positions of the LCS definition to the syntactic positions of the syntactic phrase definition as follows: 1. X' ⇔ X 2. W' ⇔ W 3. Z'1 ... Z'n ⇔ Z1 ... Zn 4. Q'1 ... Q'm ⇔ Q1 ... Qm. The correspondence between the LCS and the syntactic structure for the sentence “John went happily to school”: X' = GO_Loc ⇔ X = [V went]; W' = JOHN ⇔ W = [N-MAX John]; Z' = TO_Loc ⇔ Z = [P-MAX to ...]; Q' = HAPPILY ⇔ Q = [ADV happily]
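The default GLR can be read as a fixed pairing of logical and syntactic positions. The sketch below pairs the fillers from the slide's example; the table layout and the function name `correspondences` are hypothetical, and each position holds a single filler for simplicity.

```python
# Default GLR relations: (logical position, syntactic position).
GLR = [
    ("X'", "X"),  # logical head      <-> syntactic head
    ("W'", "W"),  # logical subject   <-> external argument
    ("Z'", "Z"),  # logical argument  <-> internal argument
    ("Q'", "Q"),  # logical modifier  <-> syntactic adjunct
]

# Fillers for "John went happily to school", taken from the slide.
clcs = {"X'": "GO_Loc", "W'": "JOHN", "Z'": "TO_Loc", "Q'": "HAPPILY"}
syntax = {"X": "[V went]", "W": "[N-MAX John]",
          "Z": "[P-MAX to ...]", "Q": "[ADV happily]"}


def correspondences(clcs, syntax):
    """Pair each logical filler with the syntactic constituent the GLR links it to."""
    return [(clcs[logical], syntax[syntactic]) for logical, syntactic in GLR]


for lcs_filler, constituent in correspondences(clcs, syntax):
    print(f"{lcs_filler:8} <-> {constituent}")
```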

8 Formalizing the Mapping: Canonical syntactic realization (CSR): Systematically relates an LCS type T(phi') to a syntactic category CAT(phi), where phi' is a CLCS constituent related to the syntactic constituent phi by the GLR. Example: the LCS type Thing corresponds to the syntactic category N, which is ultimately projected up to a maximal level (i.e., N-MAX).
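A lookup table is one way to picture the canonical syntactic realization. In this sketch only the row Thing → N is stated on the slide; the other rows are read off the example “John went happily to school” and should be treated as assumptions. The names `CSR`, `canonical_category`, and `project` are hypothetical.

```python
# LCS type -> canonical syntactic category.
CSR = {
    "Thing": "N",     # stated on the slide
    "Event": "V",     # assumption: GO_Loc is realized as [V went]
    "Path": "P",      # assumption: TO_Loc is realized as [P-MAX to ...]
    "Manner": "ADV",  # assumption: HAPPILY is realized as [ADV happily]
}


def canonical_category(lcs_type: str) -> str:
    """Return CAT(phi) for a constituent whose logical type is T(phi')."""
    return CSR[lcs_type]


def project(category: str) -> str:
    """Project a lexical category up to its maximal level."""
    return f"{category}-MAX"


print(project(canonical_category("Thing")))
```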

9 GLR mapping between the CLCS and the syntactic structure, where X = syntactic head, W = external argument, Z = internal argument, and Q = syntactic adjunct.

10 1. Thematic Divergence: Repositioning of arguments with respect to a given head. The GLR invokes the following sets of relations: W' ⇔ Z and Z' ⇔ W. Thematic divergence arises only in cases in which there is a logical subject, e.g., reversal of the subject with an object, as in: E: I like Mary ⇔ S: María me gusta a mí 'Mary pleases me'. [C-MAX [I-MAX [N-MAX I] [V-MAX [V like] [N-MAX Mary]]]] ⇔ [State BE_Ident ([Thing I], [Position AT_Ident ([Thing I], [Thing MARY])], [Manner LIKINGLY])] ⇔ [C-MAX [I-MAX [N-MAX María] [V-MAX [V me gusta]]]]. The object Mary has reversed places with the subject I in the Spanish translation: the object Mary turns into the subject María, and the subject I turns into the object me.

11 1. Thematic Divergence

12 2. Promotional Divergence: Promotion (placement "higher up") of a logical modifier into a main verb position (or vice versa). The logical modifier is associated with the syntactic head position, and the logical head is then associated with an internal argument position. The GLR invokes the following sets of relations: X' ⇔ Z and Q' ⇔ X. E: John usually goes home ⇔ S: Juan suele ir a casa 'John tends to go home'. Here the main verb go is modified by an adverbial adjunct usually, but in Spanish, usually has been placed into a higher position as the main verb soler, and the "going home" event has been realized as the internal argument of this verb.

13 2. Promotional Divergence

14 3. Demotional Divergence: The demotion (placement "lower down") of a logical head into an internal argument position (or vice versa). In such a situation, the logical head is associated with the syntactic adjunct position, and the logical argument is then associated with a syntactic head position. The GLR invokes the following sets of relations: X' ⇔ Q and Z' ⇔ X. E: I like eating ⇔ G: Ich esse gern 'I eat likingly'. Here the main verb like takes the "to eat" event as an internal argument; but in German, like has been placed into a lower position as the adjunct gern, and the "eat" event has been realized as the main verb.
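The three position-changing divergences (thematic on slide 10, promotional on slide 12, demotional on slide 14) can each be viewed as a small override of the default GLR relations. The sketch below assumes exactly that view. The filler labels (LIKE, EAT, USUALLY, and so on) are illustrative glosses rather than actual LCS primitives, and the flat mapping ignores the fact that the arguments of a promoted or demoted head travel with it.

```python
DEFAULT = {"X'": "X", "W'": "W", "Z'": "Z", "Q'": "Q"}

# Relation sets quoted on the slides for each divergence type.
OVERRIDES = {
    "thematic":    {"W'": "Z", "Z'": "W"},
    "promotional": {"X'": "Z", "Q'": "X"},
    "demotional":  {"X'": "Q", "Z'": "X"},
}


def link(fillers, divergence=None):
    """Map each logical filler to a syntactic position.

    `fillers` maps logical positions (X', W', Z', Q') to illustrative labels.
    With no divergence the default GLR applies; otherwise the divergence's
    relations replace the defaults for the positions they mention.
    """
    relations = {**DEFAULT, **OVERRIDES.get(divergence, {})}
    return {filler: relations[pos] for pos, filler in fillers.items()}


# S: "María me gusta a mí": the logical subject I surfaces as the object.
print(link({"X'": "LIKE", "W'": "I", "Z'": "MARY"}, "thematic"))

# S: "Juan suele ir a casa": the modifier becomes the main verb (soler).
print(link({"X'": "GO", "W'": "JOHN", "Q'": "USUALLY"}, "promotional"))

# G: "Ich esse gern": the logical head becomes the adjunct (gern).
print(link({"X'": "LIKE", "W'": "I", "Z'": "EAT"}, "demotional"))
```

Treating each divergence as a per-language override keeps the default routine unchanged, which is in the spirit of the claim that the system does not use rules tailored to a specific language pair.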

15 3. Demotional Divergence

16 4. Structural Divergence: Structural divergence changes the nature of the relation between positions, but it does not alter the positions used in the GLR mapping. E: John entered the house ⇔ S: Juan entró en la casa 'John entered in the house'. Here the verbal object is realized as a noun phrase (the house) in English and as a prepositional phrase (en la casa) in Spanish.
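Because the argument stays in the internal argument position and only its phrasal form changes, structural divergence can be sketched as a per-verb lexical setting. The lexicon format below, including the key `arg_preposition`, is hypothetical and is not the parameter mechanism of the paper.

```python
# Hypothetical lexicon: does the verb's internal argument need a preposition?
LEXICON = {
    "enter":  {"arg_preposition": None},   # English: bare noun phrase
    "entrar": {"arg_preposition": "en"},   # Spanish: prepositional phrase
}


def realize_internal_argument(verb: str, noun_phrase: str) -> str:
    """Realize the internal argument Z as N-MAX or as P-MAX, per the lexicon."""
    preposition = LEXICON[verb]["arg_preposition"]
    if preposition is None:
        return f"[N-MAX {noun_phrase}]"
    return f"[P-MAX {preposition} [N-MAX {noun_phrase}]]"


print(realize_internal_argument("enter", "the house"))
print(realize_internal_argument("entrar", "la casa"))
```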

17 4. Structural Divergence

18 5. Conflational Divergence: Conflational divergence is characterized by the suppression of a CLCS constituent (or the inverse of this process). The constituent generally occurs in logical argument or logical modifier position. E: I stabbed John ⇔ S: Yo le di puñaladas a Juan 'I gave knife-wounds to John'
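Suppression of a CLCS constituent can be sketched as a set of constituents that a verb absorbs into its own meaning. The lexicon layout, the key `conflated`, and the labels JOHN and KNIFE-WOUND are illustrative assumptions for this sketch only.

```python
# Hypothetical lexicon: CLCS constituents each verb incorporates (conflates).
LEXICON = {
    "stab": {"conflated": {"KNIFE-WOUND"}},  # English verb absorbs the wound
    "dar":  {"conflated": set()},            # Spanish "dar" expresses it overtly
}


def overt_arguments(verb, clcs_arguments):
    """Return the CLCS arguments that must still be realized in the syntax."""
    absorbed = LEXICON[verb]["conflated"]
    return [arg for arg in clcs_arguments if arg not in absorbed]


shared_arguments = ["KNIFE-WOUND", "JOHN"]
print(overt_arguments("stab", shared_arguments))  # only John is realized
print(overt_arguments("dar", shared_arguments))   # puñaladas and Juan are realized
```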

19 5. Conflational Divergence

20 6. Categorial divergence: It is characterized by a situation in which CAT(phi) is forced to have a different value than would normally be assigned to T(phi'). E: I am hungry ⇔ G: Ich habe Hunger 'I have hunger'. Here, the predicate is adjectival (hungry) in English but nominal (Hunger) in German. 7. Lexical divergence: Lexical divergence arises only in the context of other divergence types. In the following example, a conflational divergence forces the occurrence of a lexical divergence. E: John broke into the room ⇔ S: Juan forzó la entrada al cuarto 'John forced (the) entry to the room'. Here, the event is lexically realized as the main verb break in English but as a different verb forzar (literally force) in Spanish.
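The categorial divergence described above amounts to a lexical override of the category that the LCS type would normally receive. In this sketch the type name Property, its default category A, and the primitive label HUNGRY are assumptions introduced for illustration; the slide states only that the predicate is adjectival in English and nominal in German.

```python
# Default category per LCS type (Property -> A is an assumption of this sketch).
DEFAULT_CSR = {"Thing": "N", "Property": "A"}

# Hypothetical per-language overrides keyed by (language, primitive).
CATEGORY_OVERRIDES = {("German", "HUNGRY"): "N"}


def category(language: str, lcs_type: str, primitive: str) -> str:
    """Return CAT(phi), letting a lexical override win over the default."""
    return CATEGORY_OVERRIDES.get((language, primitive), DEFAULT_CSR[lcs_type])


print(category("English", "Property", "HUNGRY"))  # adjectival: hungry
print(category("German", "Property", "HUNGRY"))   # nominal: Hunger
```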

21 Conclusion: The proposed system addresses the following issues: (1) Lexical selection: the task of deciding which target-language words accurately reflect the meaning of the corresponding source-language words. This is done by matching the LCS-based interlingua (the CLCS) against the LCS-based entries (the RLCSs) in the dictionary in order to select the appropriate word. (2) Syntactic realization: the task of determining how target-language words are mapped to their appropriate syntactic structures. This is done by realizing the positions marked by * (and other parametric markers) in the appropriate syntactic structure.

22 Conclusion (cont'd): The proposed system is used in UNITRAN. It does not use rules specifically tailored to a particular source-target language pair. It translates one sentence at a time, so a mismatch between the number of sentences in the source and target languages is not allowed.