1 Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution
Ryu Iida, Kentaro Inui and Yuji Matsumoto
Nara Institute of Science and Technology
{ryu-i,inui,matsu}@is.naist.jp
June 20th, 2006
2 Zero-anaphora resolution
Zero-anaphor = a gap with an anaphoric function
Zero-anaphora resolution is becoming important in many applications
In Japanese, even obligatory arguments of a predicate are often omitted when they are inferable from the context
45.5% of the nominative arguments of verbs are omitted in newspaper articles
3 Zero-anaphora resolution (cont'd)
Three sub-tasks:
Zero-pronoun detection: detect a zero-pronoun
Antecedent identification: identify the antecedent for a given zero-pronoun
Anaphoricity determination:
Example:
Mary-wa John-ni (φ-ga) tabako-o yameru-youni it-ta
Mary-NOM John-DAT (φ-NOM) smoking-OBJ quit-COMP say-PAST
[Mary asked John to quit smoking.]
Here φ is an anaphoric zero-pronoun and John is its antecedent.
4 Zero-anaphora resolution (cont'd)
Three sub-tasks:
Zero-pronoun detection: detect a zero-pronoun
Antecedent identification: identify the antecedent from the set of candidate antecedents for a given zero-pronoun
Anaphoricity determination: classify whether a given zero-pronoun is anaphoric or non-anaphoric
Anaphoric zero-pronoun:
Mary-wa John-ni (φ-ga) tabako-o yameru-youni it-ta
Mary-NOM John-DAT (φ-NOM) smoking-OBJ quit-COMP say-PAST
[Mary asked John to quit smoking.]  (antecedent: John)
Non-anaphoric zero-pronoun:
(φ-ga) ie-ni kaeri-tai
(φ-NOM) home-DAT want to go back
[(φ=I) want to go home.]
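The following is a minimal sketch, not part of the original slides, of how the three sub-tasks could be composed as a pipeline; all function names and signatures are hypothetical placeholders.

```python
# Hypothetical sketch of the three zero-anaphora sub-tasks composed as a pipeline.
from typing import List, Optional, Tuple

def detect_zero_pronouns(sentence: str) -> List[str]:
    """Zero-pronoun detection: find predicates with a missing (omitted) argument."""
    raise NotImplementedError  # e.g. "yameru-youni" has no overt NOM argument

def identify_antecedent(zero_pronoun: str, candidates: List[str]) -> str:
    """Antecedent identification: pick the best candidate antecedent."""
    raise NotImplementedError  # e.g. "John" in the quit-smoking example

def is_anaphoric(zero_pronoun: str, best_candidate: str) -> bool:
    """Anaphoricity determination: anaphoric vs. non-anaphoric (e.g. '(I) want to go home')."""
    raise NotImplementedError

def resolve(sentence: str, candidates: List[str]) -> List[Tuple[str, Optional[str]]]:
    results = []
    for zp in detect_zero_pronouns(sentence):
        best = identify_antecedent(zp, candidates)
        results.append((zp, best if is_anaphoric(zp, best) else None))
    return results
```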
5 Previous work on anaphora resolution
The research trend has been shifting from rule-based approaches (Baldwin, 95; Lappin and Leass, 94; Mitkov, 97, etc.) to empirical, learning-based approaches (Soon et al., 2001; Ng, 04; Yang et al., 05, etc.)
Learning-based approaches are a cost-efficient solution for achieving performance comparable to the best-performing rule-based systems
They represent the problems of anaphoricity determination and antecedent identification as sets of feature vectors and apply machine learning algorithms to them
6 Useful clues for both anaphoricity determination and antecedent identification: syntactic pattern features
[Figure: dependency tree of the example sentence, connecting Mary-wa (Mary-TOP), the antecedent John-ni (John-DAT), the zero-pronoun φ-ga (φ-NOM), tabako-o (smoking-OBJ) and the predicates yameru-youni (quit-COMP) and it-ta (say-PAST)]
7 Useful clues for both anaphoricity determination and antecedent identification: syntactic pattern features
Questions:
How to encode syntactic patterns as features
How to avoid the data sparseness problem
[Figure: the same dependency tree as on the previous slide]
8 Talk outline
1. Zero-anaphora resolution: Background
2. Selection-then-classification model (Iida et al., 05)
3. Proposed model
   - Represents syntactic patterns based on dependency trees
   - Uses a tree mining technique to seek useful sub-trees to solve the data sparseness problem
   - Incorporates syntactic pattern features into the selection-then-classification model
4. Experiments on Japanese zero-anaphora
5. Conclusion and future work
9 Selection-then-Classification Model (SCM) (Iida et al., 05)
Example: "A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, …"
Candidate anaphor: USAir
Candidate antecedents: federal judge, order, USAir Group Inc, suit, …
The candidate anaphor and its candidate antecedents are passed to the tournament model.
10 Selection-then-Classification Model (SCM) (Iida et al., 05)
Step 1: the tournament model (Iida et al., 03) compares the candidate antecedents (federal judge, order, USAir Group Inc, suit, …) in pairwise matches for the candidate anaphor USAir; the winner of each match (here USAir Group Inc) goes on to the next match.
11 Selection-then-Classification Model (SCM) (Iida et al., 05)
The winner of the tournament, USAir Group Inc, is selected as the most likely candidate antecedent of USAir.
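As a rough sketch of the tournament idea, under the assumption of some pairwise classifier prefer_right(left, right, anaphor); this is illustrative only, not the authors' implementation.

```python
# Sketch of the tournament model: candidates play pairwise "matches";
# the survivor is returned as the most likely candidate antecedent.
from typing import Callable, List

def tournament(candidates: List[str], anaphor: str,
               prefer_right: Callable[[str, str, str], bool]) -> str:
    """prefer_right(left, right, anaphor) -> True if the right candidate wins the match."""
    winner = candidates[0]
    for challenger in candidates[1:]:
        if prefer_right(winner, challenger, anaphor):
            winner = challenger
    return winner

# Toy usage with a dummy classifier that always prefers the later candidate:
cands = ["federal judge", "order", "USAir Group Inc", "suit"]
print(tournament(cands, "USAir", lambda l, r, a: True))  # -> "suit" under this dummy rule
```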
12 Selection-then-Classification Model (SCM) (Iida et al., 05)
Step 2: the anaphoricity determination model scores the pair (USAir, USAir Group Inc):
if score < θ_ana, USAir is non-anaphoric
if score ≧ θ_ana, USAir is anaphoric and USAir Group Inc is its antecedent
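A compact sketch of the selection-then-classification decision, assuming a trained selector and a scoring function; θ_ana and both callables are placeholders rather than the paper's actual classifiers.

```python
# Sketch of the SCM decision: first select the best candidate (tournament model),
# then accept or reject it with the anaphoricity determination model.
from typing import Callable, List, Optional

THETA_ANA = 0.0  # placeholder threshold

def scm_resolve(anaphor: str,
                candidates: List[str],
                select_best: Callable[[str, List[str]], str],
                anaphoricity_score: Callable[[str, str], float]) -> Optional[str]:
    """Return the antecedent if the anaphor is judged anaphoric, else None."""
    best = select_best(anaphor, candidates)             # selection step
    if anaphoricity_score(anaphor, best) >= THETA_ANA:  # classification step
        return best                                     # anaphoric: best is the antecedent
    return None                                         # non-anaphoric
```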
14 Training the anaphoricity determination model
Anaphoric instances: for an anaphoric noun phrase ANP with candidate antecedents NP1-NP5 (NPi: candidate antecedent) whose antecedent is NP4, the pair (NP4, ANP) becomes a positive training instance.
Non-anaphoric instances: for a non-anaphoric noun phrase NANP with candidate antecedents NP1-NP5, the candidate selected by the tournament model (here NP3) is paired with NANP to form a negative training instance.
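A sketch of how the training instances described above might be generated, assuming a select_best function standing in for the tournament model; the instance encoding is simplified to (candidate, noun phrase, label) triples.

```python
# Positive instance: (true antecedent, anaphoric NP).
# Negative instance: (candidate picked by the tournament model, non-anaphoric NP).
from typing import Callable, List, Tuple

Instance = Tuple[str, str, int]  # (candidate antecedent, noun phrase, label)

def make_instances(anaphoric: List[Tuple[str, List[str], str]],
                   non_anaphoric: List[Tuple[str, List[str]]],
                   select_best: Callable[[str, List[str]], str]) -> List[Instance]:
    instances: List[Instance] = []
    for np, _candidates, antecedent in anaphoric:        # e.g. (ANP, [NP1..NP5], NP4)
        instances.append((antecedent, np, 1))            # positive instance
    for np, candidates in non_anaphoric:                 # e.g. (NANP, [NP1..NP5])
        instances.append((select_best(np, candidates), np, 0))  # negative instance
    return instances
```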
15 Talk outline
1. Zero-anaphora resolution: Background
2. Selection-then-classification model (Iida et al., 05)
3. Proposed model
   - Represents syntactic patterns based on dependency trees
   - Uses a tree mining technique to seek useful sub-trees to solve the data sparseness problem
   - Incorporates syntactic pattern features into the selection-then-classification model
4. Experiments on Japanese zero-anaphora
5. Conclusion and future work
16 New model
Same architecture as the SCM: the tournament model selects the most likely candidate antecedent (USAir Group Inc) for the candidate anaphor USAir from the candidate antecedents, and the anaphoricity determination model then judges USAir non-anaphoric (score < θ_ana) or anaphoric with USAir Group Inc as its antecedent (score ≧ θ_ana). Syntactic pattern features are incorporated into both steps.
17 Use of syntactic pattern features
Encoding parse tree features
Learning useful sub-trees
18 Encoding parse tree features
[Figure: dependency tree of the example sentence: Mary-wa (Mary-TOP), the antecedent John-ni (John-DAT), the zero-pronoun φ-ga (φ-NOM), tabako-o (smoking-OBJ), and the predicates yameru-youni (quit-COMP) and it-ta (say-PAST)]
19 Encoding parse tree features
[Figure: the same tree, highlighting the nodes relevant to the zero-anaphoric relation: the zero-pronoun φ-ga (φ-NOM), its predicate yameru-youni (quit-COMP), the antecedent John-ni (John-DAT) and the predicate it-ta (say-PAST); Mary-wa (Mary-TOP) and tabako-o (smoking-OBJ) are set aside]
20 Encoding parse tree features
[Figure: the extracted sub-tree with word nodes generalized to the labels Antecedent, zero-pronoun and predicate: Antecedent = John-ni (John-DAT), zero-pronoun = φ-ga (φ-NOM), predicate = yameru-youni (quit-COMP), predicate = it-ta (say-PAST)]
21 Encoding parse tree features
[Figure: the same sub-tree with the functional words kept as separate nodes: youni (CONJ), ni (DAT), ga (NOM), ta (PAST)]
22 Encoding parse trees
Three sub-trees are extracted from the parse tree (here LeftCand = Mary-wa (Mary-TOP), RightCand = John-ni (John-DAT)):
T_L: the sub-tree covering LeftCand, the zero-pronoun and their predicates
T_R: the sub-tree covering RightCand, the zero-pronoun and their predicates
T_I: the sub-tree covering LeftCand, the predicate and RightCand
23 Encoding parse trees
Antecedent identification: the three sub-trees are attached under a common root node.
24 Encoding parse trees
Antecedent identification: in addition to the three sub-trees, binary feature nodes 1, 2, …, n (lexical, grammatical, semantic, positional and heuristic features) are attached under the root.
25 Encoding parse trees
Antecedent identification: the resulting tree (the three sub-trees plus the binary feature nodes under one root) forms a single instance; its class label indicates whether the left or the right candidate is the antecedent.
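An illustrative data structure for the encoded instance (a root node holding the three sub-trees T_L, T_R, T_I plus the fired binary feature nodes); the node labels and the flat toy sub-trees below are simplified stand-ins for the actual encoding.

```python
# Sketch: one tournament instance encoded as a single labeled tree.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

def encode_instance(t_left: Node, t_right: Node, t_inter: Node,
                    binary_features: List[str]) -> Node:
    """Attach T_L, T_R, T_I and the fired binary features under a common root."""
    root = Node("root", [t_left, t_right, t_inter])
    root.children += [Node(f) for f in binary_features]  # lexical/grammatical/... features
    return root

# Toy usage (sub-tree structure flattened for brevity):
t_l = Node("T_L", [Node("LeftCand"), Node("predicate"), Node("zero-pronoun")])
t_r = Node("T_R", [Node("RightCand"), Node("predicate"), Node("zero-pronoun")])
t_i = Node("T_I", [Node("LeftCand"), Node("predicate"), Node("RightCand")])
inst = encode_instance(t_l, t_r, t_i, ["POS:noun", "DIST:1"])
```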
26 Learning useful sub-trees
Kernel methods:
Tree kernel (Collins and Duffy, 01)
Hierarchical DAG kernel (Suzuki et al., 03)
Convolution tree kernel (Moschitti, 04)
Boosting-based algorithm:
BACT (Kudo and Matsumoto, 04): learns a list of weighted decision stumps with the Boosting algorithm
27 Learning useful sub-trees
Boosting-based algorithm: BACT
Learns a list of weighted decision stumps (each stump is a sub-tree with a weight and a label, e.g. a sub-tree with weight 0.4 and label positive) from labeled training trees with Boosting
Classifies a given input tree by the weighted voting of the stumps (e.g. score +0.34 → positive)
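A highly simplified sketch of the inference side of a boosting-based tree classifier like BACT: a list of (sub-tree, weight) decision stumps votes on an input tree. Sub-tree matching is reduced here to label-set containment for brevity; this is not the real BACT algorithm, only an illustration of weighted stump voting.

```python
# Simplified weighted voting over sub-tree decision stumps (BACT-style inference).
from typing import List, Set, Tuple

Stump = Tuple[Set[str], float]  # (set of node labels standing in for a sub-tree, weight)

def classify(tree_labels: Set[str], stumps: List[Stump]) -> float:
    """Sum the weights of all stumps whose 'sub-tree' occurs in the input tree."""
    score = 0.0
    for subtree, weight in stumps:
        if subtree <= tree_labels:   # crude containment test instead of real sub-tree matching
            score += weight
    return score                     # e.g. a positive score -> classified as positive

stumps = [({"zero-pronoun", "predicate"}, 0.4), ({"LeftCand", "ni"}, -0.06)]
print(classify({"LeftCand", "zero-pronoun", "predicate", "ga"}, stumps))  # 0.4
```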
28 Overall process
Input: a zero-pronoun φ in sentence S
Intra-sentential model (using the syntactic pattern features):
if score_intra ≧ θ_intra, output the most-likely candidate antecedent appearing in S
if score_intra < θ_intra, fall through to the inter-sentential model
Inter-sentential model:
if score_inter ≧ θ_inter, output the most-likely candidate appearing outside of S
if score_inter < θ_inter, return "non-anaphoric"
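A sketch of this two-stage cascade, assuming scoring functions for the intra- and inter-sentential models that each return their best candidate and its score; the thresholds and callables are placeholders.

```python
# Sketch of the overall process: try the intra-sentential model first,
# back off to the inter-sentential model, otherwise return non-anaphoric.
from typing import Callable, List, Optional, Tuple

Scorer = Callable[[str, List[str]], Tuple[str, float]]  # -> (best candidate, score)

def resolve_zero_anaphor(zp: str,
                         intra_candidates: List[str],
                         inter_candidates: List[str],
                         intra_model: Scorer, inter_model: Scorer,
                         theta_intra: float, theta_inter: float) -> Optional[str]:
    best, score = intra_model(zp, intra_candidates)
    if score >= theta_intra:
        return best                                  # antecedent found in the same sentence S
    best, score = inter_model(zp, inter_candidates)
    if score >= theta_inter:
        return best                                  # antecedent found outside S
    return None                                      # non-anaphoric
```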
29 Table of contents
1. Zero-anaphora resolution
2. Selection-then-classification model (Iida et al., 05)
3. Proposed model
   - Parse encoding
   - Tree mining
4. Experiments
5. Conclusion and future work
30 Experiments
Japanese newspaper article corpus comprising zero-anaphoric relations: 197 texts (1,803 sentences)
995 intra-sentential anaphoric zero-pronouns
754 inter-sentential anaphoric zero-pronouns
603 non-anaphoric zero-pronouns
Recall = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns)
Precision = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns the model detected)
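For concreteness, the two measures written out as tiny functions; this is a direct transcription of the definitions above, not code from the paper, and the example numbers are only illustrative.

```python
def recall(correct: int, anaphoric_zero_pronouns: int) -> float:
    """Correctly resolved relations over all anaphoric zero-pronouns."""
    return correct / anaphoric_zero_pronouns

def precision(correct: int, detected_as_anaphoric: int) -> float:
    """Correctly resolved relations over the zero-pronouns the model judged anaphoric."""
    return correct / detected_as_anaphoric

# e.g. if 701 of the 995 anaphoric zero-pronouns were correctly resolved:
print(f"{recall(701, 995):.1%}")  # 70.5%
```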
31 Experimental settings
Conducted five-fold cross validation
Comparison among four models:
BM: Ng and Cardie (02)'s model: identifies an antecedent with candidate-wise classification; determines the anaphoricity of a given anaphor as a by-product of the search for its antecedent
BM_STR: BM + syntactic pattern features
SCM: Selection-then-classification model (Iida et al., 05)
SCM_STR: SCM + syntactic pattern features
32 Results of intra-sentential ZAR
Antecedent identification (accuracy):
BM (Ng 02): 48.0% (478/995)
BM_STR: 63.5% (632/995)
SCM (Iida 05): 65.1% (648/995)
SCM_STR: 70.5% (701/995)
The performance of antecedent identification improved by using syntactic pattern features.
33 Results of intra-sentential ZAR: antecedent identification + anaphoricity determination
34 Impact on overall ZAR
Evaluate the overall performance for both intra-sentential and inter-sentential ZAR
Baseline model: SCM resolves intra-sentential and inter-sentential zero-anaphora simultaneously, with no syntactic pattern features
35 Results of overall ZAR
36 AUC curve
AUC (Area Under the recall-precision Curve), plotted by altering θ_intra
The curve is not peaky, so optimizing the parameter θ_intra is not difficult
37 Conclusion
We have addressed the issue of how to use syntactic patterns for zero-anaphora resolution:
How to encode syntactic pattern features
How to seek useful sub-trees
Incorporating syntactic pattern features into our selection-then-classification model improves the accuracy of intra-sentential zero-anaphora resolution, which consequently improves the overall performance of zero-anaphora resolution.
38 Future work
How to find zero-pronouns? Design a broader framework that interacts with the analysis of predicate-argument structure
How to find a globally optimal solution to the set of zero-anaphora resolution problems in a given discourse? Explore methods such as those discussed by McCallum and Wellner (03)