1
ZERO PRONOUN RESOLUTION IN JAPANESE
Jeffrey Shu
Ling 575 Discourse and Dialogue
2
Review of Zero Pronouns
- (Pro)nouns are often dropped if pragmatically/semantically inferable from context
- General preference for names/nouns rather than pronouns in polite/formal speech (except 1st person and demonstrative pronouns)
- Grammatical markers are removed along with the nouns
- The pronominal referent may or may not appear in the discourse prior to the appearance of the zero pronoun
3
Overview of Objectives
- Find general syntactic rules that can be used to identify the presence of zero pronouns (ZPs)
- Find syntactic/semantic/pragmatic clues to identify referents for ZPs, or determine the appropriate pronoun/noun to insert if the referent is not explicitly found in the textual context
- Determine priority in case of conflict
4
Syntactic Identification of Zero Pronouns
- Determining the presence of a zero pronoun is fairly straightforward for subjects/topics and objects
  - Determine the syntactic argument structure of the verb
  - Identify whether a noun exists to fill each argument (a simple task with grammatical markers)
  - Sometimes grammatical markers are not used in casual speech
- Not so straightforward for other nouns (e.g. locatives)
  - Might not be important if unstated (e.g. “I’ll go to the store” vs. “I’ll go”)
  - Usually has a precedent
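The identification step above can be sketched in code. This is a minimal illustration, not a full parser: the `CASE_FRAMES` table, the function name, and the decision to let the topic marker は fill the subject slot are all assumptions made for the example.

```python
# Hypothetical case frames: which particle-marked arguments each verb expects.
CASE_FRAMES = {
    "食べる": ["が", "を"],  # eat: subject, direct object
    "行く": ["が"],          # go: subject (the goal is treated as optional)
}

# Simplification: a topic marked with は is assumed to fill the subject slot.
SUBJECT_MARKERS = {"が", "は"}

def find_zero_arguments(verb, marked_nouns):
    """Return the particles of expected arguments that have no overt noun.

    marked_nouns is a list of (noun, particle) pairs from one clause.
    """
    present = {particle for _, particle in marked_nouns}
    if present & SUBJECT_MARKERS:
        present.add("が")
    return [p for p in CASE_FRAMES.get(verb, []) if p not in present]
```

For example, `find_zero_arguments("食べる", [("りんご", "を")])` reports a missing subject. Casual speech that drops the markers themselves would need a different signal, as the slide notes.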
5
Findings: Some Generalizations from Semantics/Syntax
- In consecutive statements with ZPs, the topic/subject is often the same
- Many verbs have a semantic preference for certain types of nouns (e.g. animate vs. inanimate), which can narrow the possibilities
- Imperatives are always 2nd person (unless the speaker is talking to him/herself)
  - They rarely have an explicit subject/topic
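A rough sketch of how these generalizations might combine when picking a subject for a ZP. The `SUBJECT_PREFERENCE` table, the animacy labels, and the function name are hypothetical; the self-addressed imperative case is ignored.

```python
# Hypothetical selectional preferences: verb -> required animacy of its subject.
SUBJECT_PREFERENCE = {"話す": "animate", "壊れる": "inanimate"}

def resolve_subject(verb, candidates, is_imperative=False, previous_subject=None):
    """Pick a subject for a zero pronoun.

    candidates is a list of (noun, animacy) pairs, most recent first.
    """
    if is_imperative:
        return "2P"  # imperatives are treated as 2nd person
    wanted = SUBJECT_PREFERENCE.get(verb)
    # Keep only candidates compatible with the verb's semantic preference.
    viable = [noun for noun, animacy in candidates
              if wanted is None or animacy == wanted]
    # Prefer the subject of the previous statement if it is still viable.
    if previous_subject in viable:
        return previous_subject
    return viable[0] if viable else None
```

Note that the semantic filter runs before the same-subject heuristic, so a previous subject that the verb cannot take is skipped.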
6
Findings: Conversations
- 1P is by far the most commonly used pronoun
- If the ZP in a statement is 1P, it usually has an explicit precedent
- If 2P, it may or may not have an explicit precedent
- Question/answer format generalizations
  - Subject/topic of the question is usually 2/3P
  - If 1P, it is often directly stated
  - Subject of the answer is almost always the opposite pronoun of the question (2/3P vs. 1P), with the same referent (i.e. same person)
- 3P is both the simplest and potentially the most complicated
  - 3P personal pronouns (e.g. he/she) are very rarely used
  - If 3P ZP, it is almost always preceded by an explicit reference to a name/noun
  - Gender is indeterminate (a possible problem for MT)
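One possible reading of the question/answer generalization, sketched under the assumption that the referent of the question's subject and both speakers are already known. The function name and the tuple format are illustrative only.

```python
def answer_subject(question_subject_referent, asker, answerer):
    """Return (person, referent) for the zero subject of an answer.

    The referent stays the same as in the question; only the grammatical
    person changes with the point of view of the answerer.
    """
    if question_subject_referent == answerer:
        # The question addressed the answerer (2P, or by name): answer in 1P.
        return ("1P", answerer)
    if question_subject_referent == asker:
        # The question was about the asker (1P): the answer refers to them as 2P.
        return ("2P", asker)
    # A third party keeps both person and referent.
    return ("3P", question_subject_referent)
```

Because Japanese speakers often address the listener by name rather than with a 2P pronoun, a name-based question subject can still map to 1P in the answer, which is why the sketch compares referents rather than pronoun forms.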
7
Findings: Domain Specific
- Formal situations usually dictate specific conventions to be followed
- Representatives almost never use 1P singular or 3P personal pronouns
- 2P ZPs can be inferred from domain context (e.g. a corporate statement would probably be addressing its customers or investors)
8
Findings: Domain Specific
- Reference articles (e.g. Wikipedia)
  - While ZPs are commonly used, the antecedent is usually found no more than a few sentences prior
  - ZPs in consecutive sentences usually have the same referent
  - If no clear antecedent can be found, the subject of the article is often a reasonable assumption
  - 1st and 2nd person are virtually non-existent and can be safely eliminated as possibilities
    - Exceptions: reader-addressed texts (e.g. reference guides)
- News articles follow similar conventions
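The reference-article heuristics could be sketched as a windowed search with a fallback. The `window=3` default stands in for "a few sentences" and is an assumption, as are the function name and the mention format.

```python
def resolve_in_article(sentence_index, mentions, article_subject, window=3):
    """Resolve a ZP in a reference article.

    mentions is a list of (sentence_index, noun, person) tuples in document
    order. Returns the most recent 3P mention within `window` sentences,
    or the article subject if none is found.
    """
    for idx, noun, person in reversed(mentions):
        if person != "3P":
            continue  # 1st/2nd person are ruled out in this domain
        if 0 <= sentence_index - idx <= window:
            return noun
    return article_subject
```

Reader-addressed texts such as reference guides would need the person filter relaxed.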
9
Findings: Domain Specific
- Visual media (e.g. comics, TV)
  - Most problematic
  - Referent is very often in the visual context with no textual context
  - Heavy reliance on visuals results in not only (pro)noun dropping but “anything” dropping (including verbs), losing syntactic information
  - Quite possibly impossible to resolve without human intervention
- Purely textual works (e.g. novels) usually have enough information
10
Priority
- Domain has the highest priority in all situations
  - Rules for separate domains are largely mutually exclusive
  - Failure to determine reasonable pronouns can lead to misinformation
  - Japanese is particularly sensitive to social context
  - Generic pronoun insertion may be highly inappropriate
  - Simple domain information can be extremely valuable
- Unless ruled out by the domain, general conversation rules may be applicable to many different media
11
Priority
- Pragmatics/semantics > syntax
  - Ex: Question/answer conventions are of greater relevance than the assumption that the subject of the previous sentence is the same as the subject of the current sentence
  - Ex: The semantic preference of verbs for certain types of nouns is of greater relevance than the grammatical role of a ZP (e.g. the naïve assumption that direct objects tend to be inanimate)
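The priority ordering (domain first, then pragmatics/semantics, then syntax) might be applied as a simple rule cascade. This is only a sketch: the level names, the rule signature, and the dictionary describing the ZP are assumptions for illustration.

```python
# Lower number = higher priority. Pragmatics and semantics share a level.
PRIORITY = {"domain": 0, "pragmatics": 1, "semantics": 1, "syntax": 2}

def resolve_by_priority(zp, rules):
    """Apply rules in priority order and return the first answer.

    rules is a list of (level, rule_fn) pairs; rule_fn(zp) returns a
    referent, or None when the rule does not apply.
    """
    # sorted() is stable, so rules on the same level keep their given order.
    for level, rule in sorted(rules, key=lambda r: PRIORITY[r[0]]):
        referent = rule(zp)
        if referent is not None:
            return level, referent
    return None, None
```

With a syntax rule that always proposes the previous subject and a pragmatics rule that fires inside answers, the pragmatics rule wins whenever it applies, matching the first example on the slide.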
12
An Idea
- Instead of inserting “best guess” pronouns, provide a selection of the best candidates in the text for the user to disambiguate
  - In current MT systems that insert generic pronouns, users have to “interpret” (guess) what is really meant anyway
  - Insertion of pronouns is never 100% certain
  - Some media (visual) require human intervention
  - Insertion of pronouns/nouns can lead to misinformation, faux pas, and a sense that the system is unreliable
  - It is much faster to pick from a set of provided candidates than to guess whether the pronoun is right or wrong and then go back to figure out what is going on
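A minimal sketch of what presenting candidates in the text could look like. The `{ZP}` placeholder, the bracketed output format, and the candidate scores are all hypothetical; a real system would take its scores from whatever resolution rules it uses.

```python
def annotate_with_candidates(translation, slot, scored_candidates, top_n=3):
    """Replace a ZP slot with a bracketed list of the best candidates.

    scored_candidates is a list of (candidate, score) pairs; higher is better.
    """
    # Rank by score, keep the top_n, and show them inline for the user to pick.
    ranked = sorted(scored_candidates, key=lambda c: c[1], reverse=True)[:top_n]
    options = "/".join(name for name, _ in ranked)
    return translation.replace(slot, "[" + options + "]")
```

For instance, calling it on `"{ZP} went to the store."` with candidates scored for "I", "he", and "she" yields a sentence with the alternatives shown in brackets rather than a single guessed pronoun.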