[1].Handling Structural Divergences and Recovering Dropped Arguments in a Korean/English Machine Translation System [2].Learning to express motion events.

1 [1]. Handling Structural Divergences and Recovering Dropped Arguments in a Korean/English Machine Translation System [2]. Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns. Fall 2004. Presented by Yeongmi Jeon

2 Handling Structural Divergences and Recovering Dropped Arguments in a Korean/English Machine Translation System
Chung-hye Han, Martha Palmer (IRCS/CIS, UPenn)
Benoit Lavoie, Richard Kittredge, Tanya Korelsky, Myunghee Kim (CoGenTex, Inc.)
Owen Rambow (ATT Labs-Research)
Nari Kim (Konan Technology, Inc.)
AMTA 2000, Oct. 12-14, 2000

3 Outline of the Talk Linguistic issues System overview Deep Syntactic Structure (DSyntS) Parser output conversion Handling structural divergences: Transfer Dropped argument recovery

4 Linguistic Issues in Korean/English MT-1: Word Order
SOURCE: chuka kongkwupmul-eul 103 ceonwiciweontaetae-eke saryeongpu-ka cueossta
GLOSS: additional supply-Acc 103rd forward support battalion-Dat headquarters-Nom gave
TARGET: Headquarters gave 103rd forward support battalion additional supplies.
OUTPUT: Headquarters gave an additional supply to a 103rd forward support battalion.

5 Linguistic Issues in Korean/English MT-2: Dropped arguments and Morphology
SOURCE: IBP hwail-eul keomsaekhaci moshaess-tamyeon cikeum tasi ponaekessta.
GLOSS: IBP file-Acc retrieve could_not-if now again will_send
TARGET: If (NP1) could not retrieve IBP file, (NP2) will send again (NP3) now.
OUTPUT: If one can not retrieve an IBP file, one will send it again now.

6 Overview of the System

7 Deep Syntactic Structure-1 Dependency structure based on Meaning-Text Theory (Mel'čuk 1988). Nodes are labeled by lexemes. Directed arcs carry dependency relation labels: I, II, III, ATTR. Critical to the success of translation!!! Grammatical information is represented as features on the node labels. Well suited to MT: it abstracts away from superficial grammatical differences between languages, such as linear order and the use of function words.

8 DSyntS-2: example ‘John often eats beans.’
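The figure for this slide did not survive in the transcript. As a rough sketch only, a DSyntS for this sentence could be represented as below, following the description on the previous slide (lexeme-labeled nodes, features on nodes, arcs labeled I, II, ATTR). The class name `DSyntNode` and the feature names are assumptions made for illustration, not the system's actual format.

```python
from dataclasses import dataclass, field


@dataclass
class DSyntNode:
    """Hypothetical DSyntS node: a lexeme, its features, and labeled arcs."""
    lexeme: str
    features: dict = field(default_factory=dict)
    # Outgoing arcs stored as (relation label, dependent node) pairs.
    deps: list = field(default_factory=list)

    def add(self, relation, child):
        self.deps.append((relation, child))
        return self

    def show(self, relation="ROOT", depth=0):
        feats = " ".join(f"{k}={v}" for k, v in sorted(self.features.items()))
        line = "  " * depth + f"{relation}: {self.lexeme}"
        if feats:
            line += f" [{feats}]"
        return "\n".join([line] + [c.show(r, depth + 1) for r, c in self.deps])


# 'John often eats beans.' Word order and inflection are not stored as such:
# tense and number are features, and the arcs carry no linear order.
eat = DSyntNode("eat", {"class": "verb", "tense": "pres"})
eat.add("I", DSyntNode("John", {"class": "proper_noun"}))
eat.add("II", DSyntNode("bean", {"class": "common_noun", "number": "pl"}))
eat.add("ATTR", DSyntNode("often", {"class": "adverb"}))

print(eat.show())
```

Because the plural of "beans" and the tense of "eats" live in features rather than in surface forms, the same structure can be handed to a realizer for any target word order.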

9 Predicate-Argument Lexicon-1: English Subcategorization information for verbs and adjectives. Critical for recovery of dropped arguments!!!

10 Predicate-Argument Lexicon-2: Korean Arguments are listed with case or adverbial postpositions. -case postpositions: nominative, accusative. -adverbial postpositions: {e-Ke}(‘to’), {Ro} (‘to’), {e-Seo} (‘from’). Critical for conversion!!!

11 Conversion-1 Generic dependency structure (Yoon et al. 1997) → MTT-based DSyntS -STEP 1: Rewriting feature labels.

12 Conversion-2 -STEP 2: Making dependency relationships more explicit. Korean predicate-argument lexicon is used as a guide.
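One way to picture Step 2, as a sketch under stated assumptions: the parser attaches each dependent with only its postposition, and the Korean predicate-argument lexicon tells the converter which actant label (I, II, III) that postposition corresponds to for a given verb. The lexicon entry, the lemma key `cuta`, and the fallback to ATTR are all hypothetical; the word forms come from the word-order example on slide 4.

```python
# Hypothetical lexicon entry: for the verb 'give', which postposition
# marks which actant. The real lexicon format is not shown in the slides.
KOREAN_LEXICON = {
    "cuta": {"nominative": "I", "accusative": "II", "e-Ke": "III"},
}


def label_dependents(verb, dependents):
    """Assign a DSyntS relation to each (lexeme, postposition) dependent.

    Dependents whose marker is not in the verb's frame are treated as
    modifiers (ATTR); that fallback is an assumption of this sketch.
    """
    frame = KOREAN_LEXICON.get(verb, {})
    return [(frame.get(marker, "ATTR"), lexeme) for lexeme, marker in dependents]


# Dependents in their scrambled surface order from slide 4.
parsed = [
    ("kongkwupmul", "accusative"),
    ("ceonwiciweontaetae", "e-Ke"),
    ("saryeongpu", "nominative"),
]
labeled = label_dependents("cuta", parsed)
print(labeled)
```

The relation labels depend only on the postpositions, so the scrambled source order has no effect on the resulting structure.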

13 Conversion-3 -STEP 3: Promoting features to lexemes and vice versa.

14 Conversion-4: from Korean Parser Output to DSyntS

15 Transfer-1 Based on DSyntS grammars that are independently motivated by source and target languages. Transfer rules relate DSyntS subtrees. Map source DSyntS subtrees to target DSyntS subtrees. Use of variables allows generalization of rule application. Features on DSyntS nodes constrain rule application.

16 Transfer-2 Simplest case: The related subtrees are reduced to a single node. Structural divergence is represented in the transfer lexicon by including contextual information in the related subtrees.
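The simplest case described above, where each related subtree is a single node, amounts to a recursive lexeme substitution that leaves relations and features untouched. The sketch below shows only that case, not the contextual rules used for structural divergences. The tuple encoding, the lexicon contents, and the Korean lemma spellings (guessed from the surface forms on slide 5) are illustrative assumptions.

```python
# Hypothetical single-node transfer lexicon (Korean lexeme -> English lexeme).
TRANSFER_LEXICON = {
    "ponaeta": "send",
    "keomsaekhata": "retrieve",
    "hwail": "file",
}


def transfer(node):
    """Map a source DSyntS tree to a target tree, node by node.

    A node is (lexeme, features, deps) with deps a list of
    (relation, child) pairs. Lexemes without an entry pass through
    unchanged, which is an assumption of this sketch.
    """
    lexeme, features, deps = node
    target = TRANSFER_LEXICON.get(lexeme, lexeme)
    return (target, dict(features), [(rel, transfer(child)) for rel, child in deps])


source = (
    "keomsaekhata",
    {"polarity": "neg"},
    [("II", ("hwail", {}, [("ATTR", ("IBP", {}, []))]))],
)
print(transfer(source))
```

A structural divergence would need a rule whose source side spans more than one node, for example a head together with a particular dependent, which this single-node lookup cannot express.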

17 Transfer-3: Multi-word Transfer of predicative adjectives

18 Transfer-4: from Inflection to a Lexeme

19 Transfer-5: More Complex Example Korean complex NP whose head noun is lexicalized as an auxiliary noun {Keos} in a copular context → English to-infinitive.

20 Transfer-6: from Korean DSyntS to English DSyntS

21 Argument Recovery-1 Dropped arguments must be recovered in order to obtain grammatical English sentences. Add default pronouns for missing arguments using grammatical and lexical knowledge. - English predicate-argument lexicon is critical. This is performed just before English realization, by preprocessing the English DSyntS obtained from transfer.

22 Argument Recovery-2: Rules Insertion of Missing Actant I: Determining whether pronouns are animate or not :

23 Argument Recovery-3: Before ‘If (NP1) could not retrieve IBP file, (NP2) will send (NP3) again now.’

24 Argument Recovery-4: After ‘If one cannot retrieve an IBP file, one will send it again now.’
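The before/after pair can be reproduced with a small sketch of default-pronoun insertion. It assumes that the English predicate-argument lexicon lists each verb's required actants together with an animacy value, and that "one" and "it" are the defaults for animate and inanimate arguments, as in the output shown above. The lexicon layout and function name are hypothetical; the actual rules were in a figure that is not part of this transcript.

```python
# Hypothetical English predicate-argument lexicon: required actants per
# verb, each with the animacy used to choose a default pronoun.
ENGLISH_LEXICON = {
    "retrieve": {"I": "animate", "II": "inanimate"},
    "send": {"I": "animate", "II": "inanimate"},
}
DEFAULT_PRONOUN = {"animate": "one", "inanimate": "it"}


def recover_arguments(node):
    """Return a copy of the tree with default pronouns for missing actants.

    A node is (lexeme, deps) with deps a list of (relation, child) pairs.
    """
    lexeme, deps = node
    deps = [(rel, recover_arguments(child)) for rel, child in deps]
    present = {rel for rel, _ in deps}
    for rel, animacy in ENGLISH_LEXICON.get(lexeme, {}).items():
        if rel not in present:
            deps.append((rel, (DEFAULT_PRONOUN[animacy], [])))
    return (lexeme, deps)


# 'If (NP1) could not retrieve IBP file, (NP2) will send (NP3) again now.'
before = (
    "send",
    [
        ("ATTR", ("retrieve", [("II", ("file", []))])),
        ("ATTR", ("again", [])),
        ("ATTR", ("now", [])),
    ],
)
after = recover_arguments(before)
print(after)
```

Here "retrieve" gains a subject "one", while "send" gains both a subject "one" and an object "it", matching the three recovered NPs in the slide.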

25 Conclusion and Future Work
Transfer is based on the predicate-argument structures of each language → allows us to use off-the-shelf parsers.
-The development of a TreeBank for a Korean-English parallel corpus.
-Use of a syntactically annotated corpus for automatic extraction of transfer rules.
-Explicit annotation of empty arguments, as well as the incorporation of a discourse model for a more principled recovery of implicit arguments.

26 Current Status
Parallel corpus: military language training manual
-50,000 word tokens, 3800 word types, 5000 sentences.
Predicate-argument lexicon: 1000 entries.
Transfer lexicon: 4000 entries.
Grammatical analysis
-simple clause (declaratives, imperatives, interrogatives),
-complex clause (subordination, coordination),
-scrambling, empty argument, adjective phrase,
-noun phrase (compound nouns, NP modifiers, relative clauses, complex noun phrases),
-verb phrase (auxiliary verbs, light verbs, compound verbs),
-negation, copular sentence, adverb modification, etc.

27 Learning to express motion events in English and Korean: The influence of language-specific lexicalization patterns Soonja Choi and Melissa Bowerman

28 Outline of the Talk Introduction Semantic components of a motion event English: -Conflation of Motion with Manner or Cause Korean: Mixed conflation pattern -Spontaneous motion -Caused motion

29 Introduction-1
Encoding of motion events
-provides core structuring principles for many meanings
-differs across languages
Language acquisition
-two sources: nonlinguistic knowledge and the semantic organization of the language
-the question is how these two interact as a language is acquired

30 Introduction-2
4 basic components of a (dynamic) motion event
-Motion, Figure, Ground, Path
Additional components
-Manner, Cause, Deixis
Fundamental typological differences [Talmy] in how a motion event is expressed
-3 patterns
1> [Motion + [Manner|Cause]] - [Path]
2> [Motion + Path] - [Manner|Cause]
3> [Motion + Figure] - [Path] - [Manner|Cause]

31 English
Usual pattern = [Motion + [Manner|Cause]] - [Path]
-[Motion + Manner]: The rock SLID/ROLLED/BOUNCED down (the hill)
-[Motion + Cause]: The wind BLEW the napkin off the table
-[Motion + Deixis] (towards vs. away from the speaker): John CAME/WENT into the room
The same verb conflations appear in both intransitive and transitive sentences
Path is marked in the same way in both intransitive and transitive sentences

32 Korean-1: Basic
Different encoding patterns for transitive and intransitive verbs
Path markers are also verbs: no dedicated system of Path morphemes like the prepositions or particles of English
-3 locative case endings are suffixed to a Ground nominal and function like prepositions: -EY “at, to”, -LO “toward”, -EYSE “from”
Basic word order: subject-object-verb
Verb phrase: one or more “full” verbs
-The final verb bears all the inflectional suffixes
-Compound verb: verbs joined by a “connecting” suffix

33 Korean-2: Spontaneous motion
Main verb: usually KATA “go” or OTA “come”
Pattern = [Manner] - [Path] - [Motion+Deixis]
Path verbs
-Express changes of location only, not posture changes (English up and down cover both changes of location and changes of posture)
Posture changes are expressed with monomorphemic verbs
-ANCTA “sit down”, NWUPTA “lie down”
-[Path]-[posture verb]: serialized events, e.g. OLLA ANCTA “get onto a higher surface and sit down”
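The contrast between the two spontaneous-motion patterns can be made concrete with a small sketch that lays out which slot carries which semantic component. Only KATA and OTA are taken from the slide; the Manner and Path slots are filled with English glosses because the Korean verb tables are on slides whose content is not in this transcript. The function names and the tuple layout are assumptions made for illustration.

```python
def english_spontaneous(manner, path):
    """English: Manner is conflated with Motion in the verb; Path is separate."""
    return [("Motion+Manner", manner), ("Path", path)]


def korean_spontaneous(manner, path, toward_speaker):
    """Korean: [Manner] - [Path] - [Motion+Deixis].

    Manner and Path are separate verbs (given here as glosses only), and
    the main verb is OTA 'come' or KATA 'go' depending on Deixis.
    """
    main_verb = "OTA" if toward_speaker else "KATA"
    return [("Manner", manner), ("Path", path), ("Motion+Deixis", main_verb)]


# Someone runs into a room where the speaker is.
print(english_spontaneous("run", "into"))
print(korean_spontaneous("run", "enter", toward_speaker=True))
```

The English layout never records Deixis when a Manner verb is used, whereas the Korean layout always ends in a deictic main verb.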

34 Korean-2: Spontaneous motion verb-1

35 Korean-2: Spontaneous motion verb-2

36 Korean-3: Caused Motion Verbs-1

37 Korean-3: Caused Motion Verbs-2

38 Korean-3: Caused motion-1
Pattern = [Motion+Path]
Path
-Different forms
-Different meanings: require finer distinctions among actions, e.g. KKITA/PPAYTA
The English Path category “putting in/on/together” is divided according to the resulting relationship
-fitting relationship = KKITA
-loose = NEHTA
-surface contact = NOHTA, PWUTHITA
-Path verbs also incorporate aspects of Figure and Ground: different verbs for different Figures or Grounds

39 Korean-3: Caused motion-2
Deixis
-No deictic transitive verbs corresponding to English take and bring; KATA “go” and OTA “come” are intransitive only
-Special encoding
take = KACY-E “have” - KATA “go”
bring = KACY-E “have” - OTA “come”
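The special encoding is compositional: the deictic part of English take and bring is supplied by the same intransitive KATA/OTA used for spontaneous motion, preceded by KACY-E “have”. A minimal sketch of that mapping, with hypothetical names and using only the forms given on the slide:

```python
# Deictic main verbs from the slide, keyed by direction relative to the speaker.
DEICTIC_VERB = {"away": "KATA", "toward": "OTA"}
# English counterparts, which fuse caused motion and Deixis into one verb.
ENGLISH_VERB = {"away": "take", "toward": "bring"}


def korean_caused_deictic(direction):
    """Return the Korean verb sequence for carrying something along."""
    return ["KACY-E", DEICTIC_VERB[direction]]


for direction in ("away", "toward"):
    print(ENGLISH_VERB[direction], "=", " - ".join(korean_caused_deictic(direction)))
```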

40 Korean-3: Caused motion-3
[Manner|Cause]-[Path]
-Possible, but less frequent than in English
-Reason = different restrictions on obligatory information
English: Path is better spelled out completely
John threw his keys TO his desk ( x )
John threw his keys ONTO his desk ( o )
Korean: Path can often be omitted
-if Manner or Cause is supplied
-if the relationship between Figure and Ground can be easily inferred → locative case endings are sufficient

41 Conclusion

42 English
-The same verb conflation patterns in both spontaneous motion expressions and caused motion expressions
-Encodes Path separately, with the same markers for both kinds of motion
Korean
-Different lexicalization patterns for spontaneous and caused motion
-Path markers (verbs) are different for the two kinds of motion and have narrower ranges of use

