
1 Incorporating Contextual Cues in Trainable Models for Coreference Resolution. 14 April 2003. Ryu Iida, Computational Linguistics Laboratory, Graduate School of Information Science, Nara Institute of Science and Technology.

2 Background
Two approaches to coreference resolution:
- Rule-based approach [Mitkov 97, Baldwin 95, Nakaiwa 96, Okumura 95, Murata 97]
  - Many attempts encoded linguistic cues into rules, significantly influenced by Centering Theory [Grosz 95, Walker et al. 94, Kameyama 86].
  - Best performance achieved at MUC (Message Understanding Conference): precision roughly 70%, recall roughly 60%.
  - Problem: further manual refinement is needed, but it would be prohibitively costly.
- Corpus-based machine learning approach [Aone and Bennett 95, Soon et al. 01, Ng and Cardie 02, Seki 02]
  - Cost effective, and has achieved performance comparable to the best-performing rule-based systems.
  - Problem: this previous work tends to lack an appropriate reference to theoretical linguistic work on coherence and coreference.

3 Background
Challenging issue: achieving a good union between theoretical linguistic findings and corpus-based empirical methods.

4 Outline of this Talk
- Background
- Problems with previous statistical approaches
- Two methods
  - Centering features
  - Tournament-based search model
- Experiments
- Conclusions

5 Statistical approaches [Soon et al. 01, Ng and Cardie 02]
- Reach a level of performance comparable to state-of-the-art rule-based systems.
- Recast the task of anaphora resolution as a sequence of classification problems.

6 Statistical approaches [Soon et al. 01, Ng and Cardie 02]
- The task is to classify pairs of noun phrases as positive or negative:
  - positive instance: the pair of an anaphor and its antecedent
  - negative instances: pairs of the anaphor and each NP located between the anaphor and the antecedent
- Example [MUC-6]: "A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, dealt another blow to TWA's bid to buy the company for $52 a share."
- For the anaphor "USAir", the pair (USAir, USAir Group Inc) is positive, while (USAir, suit) and (USAir, order) are negative.
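The pair-generation scheme above can be sketched in Python. This is our own minimal illustration, not code from the cited papers; the names `Mention` and `make_pairs` are hypothetical.

```python
# Sketch of Soon-et-al.-style instance generation: one positive pair for
# the true antecedent, one negative pair per NP in between.
from dataclasses import dataclass

@dataclass
class Mention:
    text: str
    position: int  # linear order of the NP in the document

def make_pairs(anaphor, antecedent, all_nps):
    """One positive instance for (anaphor, antecedent) and one negative
    instance for every NP located between the antecedent and the anaphor."""
    pairs = [((anaphor.text, antecedent.text), "positive")]
    for np in all_nps:
        if antecedent.position < np.position < anaphor.position:
            pairs.append(((anaphor.text, np.text), "negative"))
    return pairs
```

On the MUC-6 example, the anaphor USAir yields one positive pair with USAir Group Inc and negative pairs with the intervening NPs order and suit.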

7 Statistical approaches [Soon et al. 01, Ng and Cardie 02]
- Each candidate-anaphor pair is encoded as a feature vector, and a model (a C4.5 decision tree) is trained on the labeled instances.
- Feature set [Ng and Cardie 02]: POS, DEMONSTRATIVE, STRING_MATCH, NUMBER, GENDER, SEM_CLASS, DISTANCE, SYNTACTIC ROLE, ...
- Example: the pair (USAir, USAir Group Inc) with features Organization:1, Prp_noun:1, Pronoun:0, STR_MATCH:0, SENT_DIST:0 is labeled positive.

8 Statistical approaches [Soon et al. 01, Ng and Cardie 02]
- Test phase [Ng and Cardie 02]: extract the NPs preceding the anaphor as candidates, input each pair of the given anaphor and a candidate to the decision tree, and select the best-scored candidate as the output (e.g. NP6, scored 1.5, wins over candidates scored -3.5, -2.5, -2.0, -1.1, -0.4, -0.3).
- Precision 78.0%, recall 64.2%: slightly better than the best-performing rule-based model at MUC-7.
- We refer to Ng and Cardie's model as the baseline of our empirical evaluation.
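The baseline test phase, scoring every candidate independently and keeping the best, can be sketched as follows. `score` stands in for the trained decision tree, and the function name is our own.

```python
# Sketch of the baseline search: each candidate is scored against the
# anaphor in isolation, and the highest-scoring candidate is returned.
def resolve_baseline(anaphor, candidates, score):
    """Return the candidate whose pair with the anaphor gets the
    highest classifier score."""
    return max(candidates, key=lambda cand: score(cand, anaphor))
```

With the scores shown on the slide, NP6 (1.5) would be selected as the antecedent.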

9 A drawback of the previous statistical models
- Positive and negative instances may have identical feature vectors [Kameyama 98]:
  "Sarah went downstairs and received another curious shock, for when Glendora flapped into the dining room in her home made moccasins, Sarah asked her when she had brought coffee to her room, and Glendora said she hadn't."
- For the anaphor "she", both (Sarah, she) (positive) and (Glendora, she) (negative) receive the features POS: Noun, Prop_Noun: Yes, Pronoun: No, NE: PERSON, SEM_CLASS: Person, SENT_DIST: 0.
- The previous models do not capture local context appropriately.

10 Two methods

11 Two methods
- Use more sophisticated linguistic cues: centering features. Augment the feature set with new features, inspired by Centering Theory, that implement local contextual factors.
- Improve the search algorithm: tournament model. A new model that makes pairwise comparisons between candidates.

12 Centering Features
- The problem is that the current feature set does not tell the difference between the two candidates: in "Sarah went downstairs and received another curious shock, ... she hadn't.", choosing Sarah as the antecedent yields the transition CHAIN (Cb = Cp = Sarah), while choosing Glendora yields CHAIN (Cb = Cp = Glendora), yet (Sarah, she) and (Glendora, she) receive identical feature vectors.
- Introduce extra devices such as the forward-looking center list, and encode state transitions on them into a set of additional features.
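A centering transition of this kind can be computed mechanically. The sketch below uses the standard Grosz-et-al. transition labels (CONTINUE / RETAIN / SMOOTH-SHIFT / ROUGH-SHIFT) rather than Kameyama's CHAIN / ESTABLISH labels from the slides, and the function name is our own.

```python
# Sketch: classify a centering transition from the previous utterance's
# backward-looking center (prev_cb), the current Cb, and the current
# preferred center Cp.
def transition(prev_cb, cb, cp):
    """Standard four-way centering transition classification."""
    if prev_cb is None or cb == prev_cb:
        return "CONTINUE" if cb == cp else "RETAIN"
    return "SMOOTH-SHIFT" if cb == cp else "ROUGH-SHIFT"
```

Features derived from such transitions let the model distinguish a candidate that keeps the current center (Sarah) from one that shifts it (Glendora).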

13 Two methods
- Use more sophisticated linguistic cues: centering features. We augment the feature set with a set of new features, inspired by Centering Theory, that implement local contextual factors.
- Improve the search algorithm: tournament model. We propose a new model that makes pairwise comparisons between antecedent candidates.

14 Tournament model
- What we want is to answer the question: which is more likely to be coreferent, Sarah or Glendora?
- Conduct a tournament consisting of a series of matches in which candidates compete with each other.
- Match victory is determined by a pairwise comparison between candidates, cast as a binary classification problem.
- The most likely candidate is selected through a single-elimination series of matches: in the example, Sarah defeats "dining room" and "downstairs" in successive matches.

15 Tournament model: Training Phase
- For an anaphor ANP whose correct antecedent is NP5, with NP1, NP4, NP7, and NP8 as the other candidates, the correct antecedent NP5 must prevail over each of the other four candidates in the tournament.
- Extract four training instances, one per pairing: (NP1, NP5) and (NP4, NP5) labeled "right", (NP5, NP7) and (NP5, NP8) labeled "left"; the class records which side of the pair wins, i.e. is more likely to be the antecedent.
- Induce a pairwise classifier from the set of extracted training instances; it classifies a given pair of candidates into "left" or "right".
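The instance extraction can be sketched as follows. This is our own illustration: candidates are (text, position) tuples and the helper name is hypothetical.

```python
# Sketch: pair the correct antecedent with every other candidate, order
# each pair by text position, and label it with the side on which the
# correct antecedent sits.
def tournament_instances(correct, others):
    instances = []
    for cand in others:
        left, right = sorted([correct, cand], key=lambda m: m[1])
        label = "left" if left is correct else "right"
        instances.append(((left[0], right[0]), label))
    return instances
```

For NP5 against NP1, NP4, NP7, NP8 this produces exactly the four labeled pairs listed above.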

16 Tournament model: Test Phase
1. The first match is arranged between the two candidates nearest the anaphor (NP7 and NP8).
2. Each of the following matches is arranged in turn between the winner of the previous match (NP8) and a new challenger (NP5).

17 Tournament model: Test Phase (continued)
3. The winner is next matched against the next challenger (NP4).
4. This process is repeated until the last candidate has participated.
5. The model selects the candidate that prevails through the final match as the answer (here NP5, the antecedent).
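Steps 1-5 amount to a single pass over the candidates, nearest to the anaphor first. A sketch, with `beats` standing in for the pairwise left/right classifier (names are our own):

```python
# Sketch of the single-elimination test procedure: each new challenger
# plays the current winner; the survivor of the final match is the answer.
def run_tournament(candidates_nearest_first, beats):
    """beats(challenger, winner) is True if the challenger wins the match."""
    winner = candidates_nearest_first[0]
    for challenger in candidates_nearest_first[1:]:
        if beats(challenger, winner):
            winner = challenger
    return winner
```

With the slide's candidate order, the first comparison pits NP7 against NP8, then the winner meets NP5, and so on toward the beginning of the document.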

18 Experiments

19 Experiments
- Empirical evaluation on Japanese zero-anaphora resolution: Japanese does not normally use personal pronouns as anaphors; instead, it uses zero pronouns.
- Comparison among four models:
  1. Baseline model
  2. Baseline model with centering features
  3. Tournament model
  4. Tournament model with centering features

20 Centering Features in Japanese
- Japanese anaphora resolution model [Nariyama 02]: an expansion of Kameyama's work applying Centering Theory to Japanese zero-anaphora resolution.
- Expands the original forward-looking center list into the Salience Reference List (SRL) to take broader contextual information into account, making more use of linguistic information.
- In the experiments, we introduce two features to reflect the SRL-related contextual factors.

21 Data
GDA-tagged Japanese newspaper article corpus (compared with MUC-6):

                                      GDA      MUC-6
  Texts                               2,176    60
  Sentences                           24,475   -
  Tags of anaphoric relation          14,743   8,946
  Tags of ellipsis (zero-anaphor)     5,966    0

Method: as a preliminary test, we resolve only subject zero-anaphors, 2,155 instances in total, and conduct five-fold cross-validation on that data set with support vector machines.

22 Feature set (see our paper for details)
1. Features simulating Ng and Cardie's feature set: POS, pronoun, particle, named entity, semantic class, animacy, selectional restrictions, distance between the anaphor and the candidate, number of anaphoric relations.
2. Centering features: order in the SRL, heuristic rule of preference.
3. Features capturing the relations between two candidates (introduced only in the tournament model, not in the baseline): preference in the SRL between the two candidates, preference of animacy between the two candidates, distance between the two candidates.

23 Results
(Chart comparing the four models: baseline, baseline + centering features, tournament, tournament + centering features.)

24 Results (1/3): the effect of incorporating centering features
- Baseline model: 64.0%; baseline model + centering features: 67.0%.
- Centering features were reasonably effective.

25 Results (2/3)
- Tournament model: 70.8% (baseline: 64.0%; baseline + centering features: 67.0%).
- Introducing the tournament model significantly improved the performance regardless of the size of the training data.

26 Results (3/3)
- Tournament model + centering features: 69.7%; the most complex model did not outperform the tournament model without centering features (70.8%).
- However, the improvement ratio of this model with respect to the data size is the best of all.

27 Results after cleaning the data (March '03)
- Tournament model + centering features: 74.3%; tournament model: 72.5%.
- On the cleaned data, the tournament model with centering features is more effective than the one without centering features.

28 Conclusions
- Our concern is achieving a good union between theoretical linguistic findings and corpus-based empirical methods.
- We presented a trainable coreference resolution model designed to incorporate contextual cues by means of centering features and a tournament-based search algorithm.
- These two improvements worked effectively in our experiments on Japanese zero-anaphora resolution.

29 Future Work
In Japanese zero-anaphora resolution:
1. Identification of relations between the topic and subtopics
2. Analysis of complex and quoted sentences
3. Refinement of the treatment of selectional restrictions


31 Tournament model: Training Phase
- In the tournament for the anaphor ANP, the correct antecedent NP5 must prevail over each of the other four candidates (NP1, NP4, NP7, NP8).
- Extract four training instances, (NP1, NP5) and (NP4, NP5) labeled "right" and (NP5, NP7) and (NP5, NP8) labeled "left", and induce a pairwise classifier from the set of extracted instances.

32 Tournament model: Test Phase
A tournament consists of a series of matches in which candidates compete with each other; NP5 prevails through the matches and is selected as the antecedent of ANP.

33 Tournament model
- What we want is to answer the question: which is more likely to be coreferent, Sarah or Glendora?
- Implement a pairwise comparison between candidates as a binary classification problem.
- In the example "Sarah went downstairs ... Glendora said she hadn't.", choosing Sarah corresponds to the transition CHAIN (Cb = Cp = Sarah); choosing Glendora corresponds to CHAIN (Cb = Cp = Glendora).

34 Tournament model: Training Phase
- Extract the NPs (downstairs, Glendora, moccasins, Sarah, coffee, room, ...) from the example as candidates for the anaphor "she", which corefers with Glendora.
- Training instances pair Glendora with each of the other candidates, e.g. (downstairs, Glendora), (moccasins, Glendora), (Sarah, Glendora), (coffee, Glendora), (room, Glendora), each labeled with the side on which Glendora, the correct antecedent, appears.

35 Conclusions
- To incorporate linguistic cues into trainable approaches:
  - Add features that take into consideration linguistic cues such as Centering Theory: centering features.
  - Propose a novel search model in which candidates are compared in terms of their likelihood as antecedents: the tournament model.
- On the Japanese zero-anaphora resolution task, the tournament model significantly outperforms earlier machine learning approaches [Ng and Cardie 02].
- Incorporating linguistic cues in machine learning models is effective.

36 Data
- GDA-tagged Japanese newspaper article corpus (GDA / MUC-6): texts 2,176 / 60; sentences 24,475 / -; tags of anaphoric relation 14,743 / 8,946; tags of ellipsis (zero-anaphor) 5,966 / 0. We extract 2,155 examples.
- Example article (coreference chains and an elided AGENT are marked in the original): "A motion to approve moving to deliberation and a vote on the omnibus crime bill, one of U.S. President Clinton's biggest domestic policy issues, was rejected 225 to 210 at the House plenary session on the 11th. With this, the bill was effectively driven toward major amendment or abandonment. The president showed his anger at an emergency press conference and demanded the bill's revival. The president had aimed to score points ahead of the midterm elections, but instead suffered a heavy blow." (同法案 "the bill" corefers with the crime bill; 同大統領 "the president" corefers with Clinton; the agent of "score points" is a zero pronoun.)

37 Statistical approaches [Soon et al. 01, Ng and Cardie 02]
- Reach a level of performance comparable to state-of-the-art rule-based systems by recasting anaphora resolution as a sequence of classification problems.
- Positive instance: the pair of an anaphor and its antecedent. Negative instances: pairs of the anaphor and each NP located between the anaphor and the antecedent.
- [MUC-6 example, as on slide 6] For the anaphor "USAir": (USAir, USAir Group Inc) is positive; (USAir, suit) and (USAir, order) are negative.

38 *Centering Features
- Centering Theory [Grosz 95, Walker et al. 94, Kameyama 86] is part of an overall theory of discourse structure and meaning.
- It distinguishes two levels of discourse coherence, global and local; centering models the local-level component of attentional state, e.g. intrasentential centering [Kameyama 97].
- Example: "Sarah went downstairs and received another curious shock, for when Glendora flapped into the dining room in her home made moccasins, Sarah asked her when she had brought coffee to her room, and Glendora said she hadn't."

39 *Centering Features in English [Kameyama 97]
For the example above, the utterance-by-utterance transitions are: CHAIN (Cb = Cp = Sarah); ESTABLISH (Cb = Cp = Glendora); CHAIN (Cb = Glendora, Cp = Sarah); CHAIN (Cb = Cp = Glendora); CHAIN (Cb = NULL, Cp = Glendora); CHAIN (Cb = Cp = Glendora).

40 *Centering Features in English [Kameyama 97]
- The essence is that the analysis takes the preference between candidates into account: Cb and Cp distinguish the two candidates.
- In "Sarah went downstairs and received another curious shock, ... she hadn't.", choosing Sarah yields the transition CHAIN (Cb = Cp = Sarah), while choosing Glendora yields CHAIN (Cb = Cp = Glendora).
- Centering features implement this local contextual factor.

41 *Tournament model: Test Phase
- A tournament consists of a series of matches in which candidates compete with each other.
- For the anaphor "she", the candidates (downstairs, shock, room, moccasins, Sarah, coffee, her, ...) are defeated in turn, and Glendora is selected as the antecedent.

42 Rule-based Approaches
- Encode linguistic cues into rules manually: thematic roles of the candidates, order of the candidates, semantic relations between anaphors and antecedents, etc.
- These approaches are influenced by Centering Theory [Grosz 95, Walker et al. 94, Kameyama 86].
- In the coreference resolution task of the Message Understanding Conference (MUC-6 / MUC-7): precision roughly 70%, recall roughly 60%.
- Further manual refinement of rule-based models would be prohibitively costly.

43 Statistical Approaches with Tagged Corpora
- Statistical approaches have achieved performance comparable to the best-performing rule-based systems.
- But they lack an appropriate reference to theoretical linguistic work on coherence and coreference.
- Goal: making a good marriage between theoretical linguistic findings and corpus-based empirical methods.

44 *Test Phase [Soon et al. 01]
- Extract the candidate NPs preceding the anaphor (judge, Pittsburgh, order, Trans World Airlines, share, USAir Group Inc, a suit, ...), test each pair with the classifier, and output the antecedent: here USAir Group Inc for the anaphor USAir, while a suit and order are rejected.
- Precision 67.3%, recall 58.6% on the MUC data set.

45 Improving Soon's model [Ng and Cardie 02]
- Expanding the feature set: 12 features ⇒ 53 features (POS, DEMONSTRATIVE, STRING_MATCH, NUMBER, GENDER, SEM_CLASS, DISTANCE, SYNTACTIC ROLE, ...).
- Introducing a new search algorithm.


47 Task of Coreference Resolution
- Two processes: resolution of anaphors and resolution of antecedents.
- Applications: machine translation, IR, etc.
- [MUC-6 example] "A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, dealt another blow to TWA's bid to buy the company for $52 a share." (Coreferential NPs, e.g. the antecedent "a temporary restraining order" and the anaphor "The order", are marked in the same color on the slide.)

48 Future Work
- Evaluate examples the tournament model does not deal with, such as direct quotes.
- The proposed methods cannot deal with different discourse structures. Example: "Mohammed, on his way to prison, left his wife these words: 'While I am in prison, you must not work outside.' He meant that she should remain faithful. Perhaps he thought there was no chance of being blessed with a new child while in prison." (SRL at this point: Topic = モハンメド "Mohammed", Focus = おれ "I", I-Obj = 刑務所 "prison", D-Obj = NULL, Others = 外 "outside")

49 Centering Features of Japanese
- Add the likelihood of antecedents into the features: in Japanese, wa-marked NPs tend to be topics, and topics tend to be omitted.
- Salience Reference List (SRL) [Nariyama 02]: store NPs in the SRL from the beginning of the text, overwriting the old entity when a new entity fills the same slot.
- Preference order: Topic/φ (wa) > Focus (ga) > I-Obj (ni) > D-Obj (wo) > Others.
- Example: processing "... NP1-wa NP2-wo ... NP3-ga ..., NP4-wa ... NP5-ni ... (φ-ga) V" updates the SRL step by step, ending with Topic = NP4, Focus = NP3, I-Obj = NP5, D-Obj = NP2, Others = NULL.
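The SRL bookkeeping described above can be sketched as a slot-overwrite loop. The particle-to-slot mapping below is simplified, and the names are our own, not from [Nariyama 02].

```python
# Sketch of SRL maintenance: each incoming NP overwrites the slot
# selected by its case particle; unlisted particles fall into Others.
PARTICLE_TO_SLOT = {"wa": "Topic", "ga": "Focus", "ni": "I-Obj", "wo": "D-Obj"}

def update_srl(srl, np, particle):
    """Return a new SRL with `np` overwriting its particle's slot."""
    new_srl = dict(srl)
    new_srl[PARTICLE_TO_SLOT.get(particle, "Others")] = np
    return new_srl

srl = {"Topic": None, "Focus": None, "I-Obj": None, "D-Obj": None, "Others": None}
for np, prt in [("NP1", "wa"), ("NP2", "wo"), ("NP3", "ga"), ("NP4", "wa"), ("NP5", "ni")]:
    srl = update_srl(srl, np, prt)
```

Running the slide's sequence leaves Topic = NP4 (NP1 was overwritten), Focus = NP3, I-Obj = NP5, D-Obj = NP2, matching the final SRL state on the slide.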

50 Evaluation of models
- Introduce a confidence measure: the confidence coefficient is the classifier's output value when the two candidates are the closest competitors in the tournament.
- Example: for the anaphor "he" (coreferent with President B), the candidates President A, armistice, this, and action are defeated in turn, with match values 3.8, 3.2, 2.4, and 0.9.

51 Evaluation of Tournament model
Investigate the tournament model (the best-performing configuration).

52 Centering Features example
- ドゥダエフ大統領は、正月休戦を提案したが、エリツィン・ロシア大統領はこれを黙殺し、(φ-ga) 行動を開始した。 ("President A proposed the armistice, but President B ignored this. And he started action.")
- SRL after the first clause: ドゥダエフ大統領 (President Dudayev) > NULL > NULL > NULL > NULL; after the second: エリツィン・ロシア大統領 (Russian President Yeltsin) > NULL > 行動 (action) > NULL > NULL.

53 *Features (1/3)
- Used in both Ng's model and the tournament model; determined by one candidate alone.
- POS, pronoun, particle, named entity, number of anaphoric relations, first NP in a sentence, order in the SRL.

54 *Features (2/3)
- Used in both Ng's model and the tournament model; determined by the pair of the anaphor and a candidate.
- Selectional restrictions: whether the candidate-anaphor pair satisfies the selectional restrictions in Nihongo Goi Taikei, and a log-likelihood ratio calculated from co-occurrence data.
- Distance in sentences between the anaphor and the candidate.

55 *Features (3/3)
- Used only in the tournament model; determined by the relation between the two candidates.
- Distance in sentences between the two candidates.
- Animacy: whether or not a candidate belongs to the class PERSON or ORGANIZATION.
- Which candidate is preferred in the SRL.

56 Anaphoric relations
- Variety of antecedents: noun phrase (NP), sentence, text, etc.
- Endophora: the antecedent exists in the context; exophora: the antecedent does not exist in the context. Anaphora: antecedents precede anaphors; cataphora: anaphors precede antecedents.
- Many previous works deal with anaphora resolution; antecedent-anaphor examples are the most numerous of all these types.

57 Results
- After modifying some tagging errors by hand (examples: 2,155 ⇒ 2,681), we compare the tournament model and Ng's model, each with and without centering features.
- Here the model using centering features gets worse than the model without centering features.

