[1].Handling Structural Divergences and Recovering Dropped Arguments in a Korean/English Machine Translation System [2].Learning to express motion events.


[1] Handling Structural Divergences and Recovering Dropped Arguments in a Korean/English Machine Translation System
[2] Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns
2004 Fall
Presented by Yeongmi Jeon

Handling Structural Divergences and Recovering Dropped Arguments in a Korean/English Machine Translation System
Chung-hye Han, Martha Palmer (IRCS/CIS, UPenn)
Benoit Lavoie, Richard Kittredge, Tanya Korelsky, Myunghee Kim (CoGenTex, Inc.)
Owen Rambow (AT&T Labs-Research)
Nari Kim (Konan Technology, Inc.)
AMTA ’2000, Oct , 2000

Outline of the Talk
- Linguistic issues
- System overview
- Deep Syntactic Structure (DSyntS)
- Parser output conversion
- Handling structural divergences: Transfer
- Dropped argument recovery

Linguistic Issues in Korean/English MT-1: Word Order
SOURCE: chuka kongkwupmul-eul 103 ceonwiciweontaetae-eke saryeongpu-ka cueossta
GLOSS: additional supply-Acc 103rd forward support battalion-Dat headquarters-Nom gave
TARGET: Headquarters gave 103rd forward support battalion additional supplies.
OUTPUT: Headquarters gave an additional supply to a 103rd forward support battalion.

Linguistic Issues in Korean/English MT-2: Dropped arguments and Morphology
SOURCE: IBP hwail-eul keomsaekhaci moshaess-tamyeon cikeum tasi ponaekessta.
GLOSS: IBP file-Acc retrieve could_not-if now again will_send
TARGET: If (NP1) could not retrieve IBP file, (NP2) will send again (NP3) now.
OUTPUT: If one can not retrieve an IBP file, one will send it again now.

Overview of the System

Deep Syntactic Structure-1
- Dependency structure based on Meaning-Text Theory (Mel’čuk 1988).
- Nodes are labeled by lexemes.
- Directed arcs carry dependency relation labels: I, II, III, ATTR. Critical to the success of translation!
- Grammatical information is represented as features on the node labels.
- Well suited to MT: abstracts away from superficial grammatical differences between languages, such as linear order and the usage of function words.

DSyntS-2: example ‘John often eats beans.’
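
The tree for this example can be sketched as a small data structure. This is only an illustration of the properties listed on the previous slide (lexeme-labeled nodes, features on nodes, arcs labeled I, II, III, ATTR); the class name `DSyntNode` and the feature names such as `class` and `tense` are assumptions made for the sketch, not the system's actual format.

```python
from dataclasses import dataclass, field


@dataclass
class DSyntNode:
    lexeme: str                                   # uninflected lexeme labels the node
    features: dict = field(default_factory=dict)  # grammatical information as features
    deps: list = field(default_factory=list)      # (relation, DSyntNode) pairs

    def add(self, relation, child):
        """Attach a dependent under a labeled arc (I, II, III or ATTR)."""
        self.deps.append((relation, child))
        return self

    def show(self, relation="ROOT", depth=0):
        """Return the tree as indented text lines, one node per line."""
        lines = [f"{'  ' * depth}{relation}: {self.lexeme} {self.features}"]
        for rel, child in self.deps:
            lines.extend(child.show(rel, depth + 1))
        return lines


# 'John often eats beans.' (feature names are hypothetical)
tree = DSyntNode("eat", {"class": "verb", "tense": "pres"})
tree.add("I", DSyntNode("John", {"class": "proper_noun"}))
tree.add("II", DSyntNode("bean", {"class": "common_noun", "number": "pl"}))
tree.add("ATTR", DSyntNode("often", {"class": "adverb"}))

print("\n".join(tree.show()))
```

Note that linear order and inflection do not appear in the tree: "eats" is stored as the lexeme eat plus a tense feature, which is what makes the representation convenient for transfer.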

Predicate-Argument Lexicon-1: English Subcategorization information for verbs and adjectives. Critical for recovery of dropped arguments!!!

Predicate-Argument Lexicon-2: Korean
Arguments are listed with case or adverbial postpositions.
- Case postpositions: nominative, accusative.
- Adverbial postpositions: {e-Ke} (‘to’), {Ro} (‘to’), {e-Seo} (‘from’).
Critical for conversion!

Conversion-1
Generic dependency structure (Yoon et al. 1997) → MTT-based DSyntS
- STEP 1: Rewriting feature labels.

Conversion-2 -STEP 2: Making dependency relationships more explicit. Korean predicate-argument lexicon is used as a guide.
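
A minimal sketch of how a predicate-argument lexicon could guide this step, assuming each verb entry maps the postposition on a dependent to a DSyntS relation. The entry for cueossta, the mapping to I/II/III, and the fallback to ATTR are all hypothetical choices for illustration; the words come from the word-order example earlier in the talk.

```python
# Hypothetical lexicon entry: postposition on the dependent -> DSyntS relation.
KOREAN_LEXICON = {
    "cueossta": {"nominative": "I", "accusative": "II", "e-Ke": "III"},
}


def label_dependents(verb, dependents, lexicon=KOREAN_LEXICON):
    """Return (relation, lexeme) pairs.

    dependents is a list of (lexeme, postposition) pairs as they might come
    out of a generic dependency parse, in surface order.
    """
    frame = lexicon.get(verb, {})
    labeled = []
    for lexeme, postposition in dependents:
        # Anything the frame does not list is treated as a modifier (ATTR).
        labeled.append((frame.get(postposition, "ATTR"), lexeme))
    return labeled


parsed = [
    ("kongkwupmul", "accusative"),
    ("ceonwiciweontaetae", "e-Ke"),
    ("saryeongpu", "nominative"),
]
result = label_dependents("cueossta", parsed)
print(result)
```

Because the relation is read off the postposition rather than the position, the scrambled source order (object, recipient, subject) does not affect the labels.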

Conversion-3 -STEP 3: Promoting features to lexemes and vice versa.

Conversion-4: from Korean Parser Output to DSyntS

Transfer-1 Based on DSyntS grammars that are independently motivated by source and target languages. Transfer rules relate DSyntS subtrees. Map source DSyntS subtrees to target DSyntS subtrees. Use of variables allows generalization of rule application. Features on DSyntS nodes constrain rule application.

Transfer-2 Simplest case: The related subtrees are reduced to a single node. Structural divergence is represented in the transfer lexicon by including contextual information in the related subtrees.
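
The simplest case, single-node rules constrained by features, might look roughly like the sketch below. The rule format, the `make_node` helper, and the feature names are assumptions; the lexeme pairs are taken from the gloss of the word-order example. Structural divergences would need rules whose source and target sides are larger subtrees, which this sketch does not attempt.

```python
def make_node(lexeme, features=None, deps=None):
    """Build a tree node as a plain dict; deps is a list of (relation, node)."""
    return {"lexeme": lexeme, "features": features or {}, "deps": deps or []}


# Hypothetical single-node rules: (source lexeme, required features, target lexeme).
TRANSFER_LEXICON = [
    ("cueossta", {"class": "verb"}, "give"),
    ("saryeongpu", {}, "headquarters"),
    ("kongkwupmul", {}, "supply"),
]


def transfer(node, rules=TRANSFER_LEXICON):
    """Map a source tree to a target tree, keeping relations and features."""
    target_lexeme = node["lexeme"]  # lexemes without a rule pass through unchanged
    for source, required, target in rules:
        matches = all(node["features"].get(k) == v for k, v in required.items())
        if node["lexeme"] == source and matches:
            target_lexeme = target
            break
    # The dependents play the role of rule variables: they are transferred
    # recursively and re-attached under the same relation.
    deps = [(rel, transfer(child, rules)) for rel, child in node["deps"]]
    return make_node(target_lexeme, dict(node["features"]), deps)


source = make_node("cueossta", {"class": "verb", "tense": "past"}, [
    ("I", make_node("saryeongpu")),
    ("II", make_node("kongkwupmul")),
])
target = transfer(source)
print(target["lexeme"], [(rel, child["lexeme"]) for rel, child in target["deps"]])
```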

Transfer-3: Multi-word Transfer of predicative adjectives

Transfer-4: from Inflection to a Lexeme

Transfer-5: More Complex Example
Korean complex NP whose head noun is lexicalized as an auxiliary noun {Keos} in the context of a copular → English to-infinitive.

Transfer-6: from Korean DSyntS to English DSyntS

Argument Recovery-1 Dropped arguments must be recovered in order to obtain grammatical English sentences. Add default pronouns for missing arguments using grammatical and lexical knowledge. - English predicate-argument lexicon is critical. This is performed just before English realization, by preprocessing the English DSyntS obtained from transfer.

Argument Recovery-2: Rules
- Insertion of missing Actant I
- Determining whether pronouns are animate or not

Argument Recovery-3: Before ‘If (NP1) could not retrieve IBP file, (NP2) will send (NP3) again now.’

Argument Recovery-4: After ‘If one cannot retrieve an IBP file, one will send it again now.’
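
The before/after pair above can be approximated with a short sketch: look up which actants the English verb requires and fill any missing slot with a default pronoun chosen by animacy. Storing animacy per slot in the lexicon, and the two entries shown, are assumptions made for illustration; the slides only state that the English predicate-argument lexicon is used and that animacy decides the pronoun.

```python
# Hypothetical English predicate-argument entries: required actant -> animacy.
ENGLISH_LEXICON = {
    "retrieve": {"I": "animate", "II": "inanimate"},
    "send": {"I": "animate", "II": "inanimate"},
}

# Default pronouns as they appear in the system output on the slide.
DEFAULT_PRONOUN = {"animate": "one", "inanimate": "it"}


def recover_arguments(verb, actants, lexicon=ENGLISH_LEXICON):
    """Return a copy of actants with default pronouns for missing required slots."""
    recovered = dict(actants)
    for relation, animacy in lexicon.get(verb, {}).items():
        if relation not in recovered:
            recovered[relation] = DEFAULT_PRONOUN[animacy]
    return recovered


# 'If (NP1) could not retrieve IBP file, (NP2) will send (NP3) again now.'
if_clause = recover_arguments("retrieve", {"II": "IBP file"})
main_clause = recover_arguments("send", {})
print(if_clause, main_clause)
```

Arguments that are already present are never overwritten, so the step can run over every verb in the English DSyntS just before realization.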

Conclusion and Future Work
- Transfer based on predicate-argument structures of each language → allows us to use off-the-shelf parsers.
- The development of a TreeBank for a Korean-English parallel corpus.
- Use of a syntactically annotated corpus for automatic extraction of transfer rules.
- Explicit annotation of empty arguments, as well as the incorporation of a discourse model for a more principled recovery of implicit arguments.

Current Status
Parallel corpus: military language training manual
- 50,000 word tokens, 3800 word types, 5000 sentences.
Predicate-argument lexicon entries. Transfer lexicon entries.
Grammatical analysis
- simple clause (declaratives, imperatives, interrogatives),
- complex clause (subordination, coordination),
- scrambling, empty argument, adjective phrase,
- noun phrase (compound nouns, NP modifiers, relative clauses, complex noun phrases),
- verb phrase (auxiliary verbs, light verbs, compound verbs),
- negation, copular sentence, adverb modification, etc.

Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns
Soonja Choi and Melissa Bowerman

Outline of the Talk
- Introduction
- Semantic components of a motion event
- English: Conflation of Motion with Manner or Cause
- Korean: Mixed conflation pattern
  - Spontaneous motion
  - Caused motion

Introduction-1
Encoding of motion events
- provides core structuring principles for many meanings
- differs across languages
Language acquisition
- two sources: nonlinguistic knowledge and the semantic organization of the language
- the question is how they interact during the acquisition of a language

Introduction-2
4 basic components of a (dynamic) motion event
- Motion, Figure, Ground, Path
Additional components
- Manner, Cause, Deixis
Fundamental typological differences [Talmy] in how a motion event is expressed: 3 patterns
1> [Motion + [Manner|Cause]] - [Path]
2> [Motion + Path] - [Manner|Cause]
3> [Motion + Figure] - [Path] - [Manner|Cause]
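
The three patterns can be written down as data to make the contrast explicit: the first slot is the main verb, and every later slot is a component expressed outside it. The encoding below (tuples of component names, and the two helper functions) is purely an illustrative assumption, not part of Talmy's or the authors' formalism.

```python
# Each pattern: first tuple = what the main verb conflates, rest = separate slots.
PATTERNS = {
    1: (("Motion", "Manner|Cause"), ("Path",)),
    2: (("Motion", "Path"), ("Manner|Cause",)),
    3: (("Motion", "Figure"), ("Path",), ("Manner|Cause",)),
}


def conflated_in_verb(pattern_id):
    """Components expressed together with Motion in the main verb."""
    main_verb = PATTERNS[pattern_id][0]
    return [component for component in main_verb if component != "Motion"]


def expressed_separately(pattern_id):
    """Components expressed outside the main verb, in slot order."""
    return [slot[0] for slot in PATTERNS[pattern_id][1:]]


for pattern_id in PATTERNS:
    print(pattern_id, conflated_in_verb(pattern_id), expressed_separately(pattern_id))
```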

English
Usual pattern = [Motion + [Manner | Cause]] - [Path]
- [Motion + Manner]: The rock SLID/ROLLED/BOUNCED down (the hill)
- [Motion + Cause]: The wind BLEW the napkin off the table
- [Motion + Deixis] (towards vs. away from the speaker): John CAME/WENT into the room
The same verb conflations occur in both intransitive and transitive sentences.
Path is marked in the same way in both intransitive and transitive sentences.

Korean-1: Basic
Different encoding patterns for transitive and intransitive verbs
Path markers are also verbs: there is no dedicated system of morphemes like the prepositions or particles of English
- 3 locative case endings are suffixed to a Ground nominal and function like prepositions: -EY “at, to”, -LO “toward”, -EYSE “from”
Basic word order: subject-object-verb
Verb phrase: one or more “full” verbs
- The final verb bears all the inflectional suffixes
- Compound verb: verbs connected by a “connecting” suffix

Korean-2: Spontaneous motion
Main verb: usually KATA “go” or OTA “come”
Pattern = [Manner] - [Path] - [Motion+Deixis]
Path verbs
- Do not express posture changes (English up and down are used for changes of both location and posture)
Posture changes are expressed with monomorphemic verbs
- ANCTA “sit down”, NWUPTA “lie down”
- [Path]-[posture verb]: serialized events, e.g. OLLA ANCTA “get on to a higher surface and sit down”

Korean-2: Spontaneous motion verb-1

Korean-2: Spontaneous motion verb-2

Korean-3: Caused Motion Verbs-1

Korean-3: Caused Motion Verbs-2

Korean-3: Caused motion-1
Pattern = [Motion+Path]
Path
- Different forms
- Different meanings: require finer distinctions among actions, e.g. KKITA/PPAYTA
- Path category “putting in/on/together”: resulting in a fitting relationship = KKITA; loose = NEHTA; surface contact = NOHTA, PWUTHITA
- Also incorporate aspects of Figure and Ground: different verbs for different Figures or Grounds

Korean-3: Caused motion-2
Deixis
- No deictic transitive verbs (compare take and bring in English, and intransitive KATA and OTA in Korean)
- Special encoding:
  take = KACY-E “have” - KATA “go”
  bring = KACY-E “have” - OTA “come”
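
The special encoding is compositional, which a few lines of code can make concrete: the deictic direction selects the intransitive verb, and KACY-E “have” is prefixed to it. The dictionary names and direction labels below are assumptions for the sketch; only the four Korean and English forms come from the slide.

```python
# Intransitive deictic verbs, keyed by direction relative to the speaker.
DEICTIC_VERBS = {"away from speaker": "KATA", "towards speaker": "OTA"}

# Direction implied by each English transitive deictic verb.
ENGLISH_DIRECTION = {"take": "away from speaker", "bring": "towards speaker"}


def caused_deictic(english_verb):
    """Compose the Korean serial form for English 'take' or 'bring'."""
    direction = ENGLISH_DIRECTION[english_verb]
    return f"KACY-E {DEICTIC_VERBS[direction]}"


print(caused_deictic("take"))
print(caused_deictic("bring"))
```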

Korean-3: Caused motion-3
[Manner|Cause]-[Path]
- Possible, but less frequent than in English
- Reason = different restrictions on obligatory information
English: Path is better spelled out completely
- John threw his keys TO his desk ( x )
- John threw his keys ONTO his desk ( o )
Korean: Path can often be omitted
- if Manner or Cause is supplied
- if the relationship between Figure and Ground can be easily inferred → locative case endings are sufficient

Conclusion

English
- The same verb conflation patterns in both spontaneous motion expressions and caused motion expressions
- Encodes Path separately, with the same markers for both kinds of motion
Korean
- Different lexicalization patterns for spontaneous and caused motion
- Path markers (verbs) are different for the two kinds of motion and have narrower usage ranges