1
Detecting Anaphoricity and Antecedenthood for Coreference Resolution
Olga Uryupina (uryupina@gmail.com)
Institute of Linguistics, RAS
13.11.08
2
Overview
- Anaphoricity and Antecedenthood
- Experiments
- Incorporating A&A detectors into a CR system
- Conclusion
3
A&A: example
Shares in Loral Space will be distributed to Loral shareholders. The new company will start life with no debt and $700 million in cash. Globalstar still needs to raise $600 million, and Schwartz said that the company would try to raise the money in the debt market.
5
Anaphoricity
- Likely anaphors: pronouns, definite descriptions
- Unlikely anaphors: indefinites
- Unknown: proper names
Poesio & Vieira: more than 50% of definite descriptions in newswire text are not anaphoric!
8
Antecedenthood
- Related to referentiality (Karttunen, 1976): non-referential NPs such as "no debt" cannot serve as antecedents
- Antecedenthood vs. referentiality: a corpus-based decision
9
Experiments
- Can we learn anaphoricity/antecedenthood classifiers?
- Do they help for coreference resolution?
10
Methodology
- MUC-7 dataset
- Anaphoricity/antecedenthood labels induced from the MUC coreference annotations
- Learners: Ripper, SVM
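A minimal sketch of this setup, assuming a scikit-learn SVM and invented toy features in place of the original MUC-7 feature set:

```python
# Minimal sketch of the classification setup: one binary decision
# ("anaphoric or not") per noun phrase. The toy features below are
# hypothetical stand-ins for the MUC-7 feature set.
import numpy as np
from sklearn.svm import SVC

# Rows: mentions; columns: is_pronoun, is_definite, is_proper_name.
X_train = np.array([
    [1, 0, 0],  # pronoun      -> anaphoric
    [0, 1, 0],  # definite NP  -> anaphoric
    [0, 0, 0],  # indefinite   -> discourse-new
    [0, 0, 1],  # proper name  -> discourse-new (here)
])
y_train = np.array([1, 1, 0, 0])  # 1 = anaphoric

clf = SVC(kernel="linear").fit(X_train, y_train)
print(clf.predict([[1, 0, 0]]))  # unseen pronoun -> [1]
```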
11
Features
- Surface form (12)
- Syntax (20)
- Semantics (3)
- Salience (10)
- "Same-head" (2)
- From Karttunen, 1976 (7)
49 features, encoded as 123 boolean/continuous values
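The expansion from 49 features to 123 boolean/continuous values presumably comes from one-hot-encoding the multi-valued features; a sketch with hypothetical feature names:

```python
# Sketch: categorical features expand to one boolean column per
# value, while continuous features pass through unchanged. The
# feature names here are invented for illustration.
from sklearn.feature_extraction import DictVectorizer

mentions = [
    {"np_type": "pronoun",  "same_head": 1, "sent_dist": 0.0},
    {"np_type": "definite", "same_head": 0, "sent_dist": 2.0},
    {"np_type": "proper",   "same_head": 0, "sent_dist": 5.0},
]
vec = DictVectorizer(sparse=False)
X = vec.fit_transform(mentions)
print(vec.get_feature_names_out())
# 3 symbolic features -> 5 columns: one per np_type value,
# plus the two numeric features kept as-is
```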
12
Results: anaphoricity

| Feature group | R | P | F |
|---|---|---|---|
| Baseline | 100 | 66.5 | 79.9 |
| All | 93.5 | 82.3 | 87.6 |
| Surface | 100 | 66.5 | 79.9 |
| Syntax | 97.4 | 72.0 | 82.8 |
| Semantics | 98.5 | 68.9 | 81.1 |
| Salience | 91.2 | 69.3 | 78.7 |
| Same-head | 84.5 | 81.1 | 82.8 |
| Karttunen's | 91.6 | 71.1 | 80.1 |
| Synt+SH | 90.0 | 83.5 | 86.6 |
13
Results: antecedenthood

| Feature group | R | P | F |
|---|---|---|---|
| Baseline | 100 | 66.5 | 79.9 |
| All | 95.7 | 69.2 | 80.4 |
| Surface | 94.6 | 68.5 | 79.5 |
| Syntax | 95.7 | 69.2 | 80.3 |
| Semantics | 94.9 | 69.4 | 80.2 |
| Salience | 98.9 | 67.0 | 79.9 |
| Same-head | 100 | 66.5 | 79.9 |
| Karttunen's | 99.3 | 67.3 | 80.2 |
14
Integrating A&A into a CR system
Apply A&A prefiltering before CR starts:
- Saves time
- Improves precision
Problem: we can filter out good candidates:
- Will lose some recall
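A sketch of what such prefiltering looks like in a pairwise CR pipeline (the predicate names are hypothetical, not the original implementation):

```python
# Sketch: restrict pairwise coreference to pairs that survive A&A
# prefiltering. is_anaphoric/is_antecedent stand in for the learned
# detectors (hypothetical names).
def candidate_pairs(mentions, is_anaphoric, is_antecedent):
    for j, anaphor in enumerate(mentions):
        if not is_anaphoric(anaphor):
            continue  # fewer pairs: saves time, improves precision
        for antecedent in mentions[:j]:
            if is_antecedent(antecedent):
                yield (antecedent, anaphor)

# Toy usage on the running example: a false negative from either
# detector removes a good candidate pair, i.e. costs recall.
mentions = ["Loral Space", "the new company", "no debt", "it"]
pairs = list(candidate_pairs(
    mentions,
    is_anaphoric=lambda m: m in {"it", "the new company"},
    is_antecedent=lambda m: m != "no debt",
))
print(pairs)
```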
15
Oracle-based A&A prefiltering
- Take the MUC-based A&A classifier ("gold standard")
- CR system: Soon et al. (2001) with SVMs
- MUC-7 validation set (3 "training" documents)
16
Oracle-based A&A prefiltering

| Setting | R | P | F |
|---|---|---|---|
| No prefiltering | 54.5 | 56.9 | 55.7 |
| ±ana | 49.6 | 73.6 | 59.3 |
| ±ante | 54.2 | 69.4 | 60.9 |
| ±ana & ±ante | 52.9 | 81.9 | 64.3 |
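F in these tables is the harmonic mean of precision and recall, which the last row confirms:

```python
# F is the harmonic mean of P and R; checking the ±ana & ±ante row:
# P = 81.9, R = 52.9 -> F ≈ 64.3.
def f_score(p, r):
    return 2 * p * r / (p + r)

print(round(f_score(81.9, 52.9), 1))  # 64.3
```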
17
Automatically induced classifiers
- Precision is more crucial than recall
- Learn Ripper classifiers with different values of L (loss ratio)
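Ripper's loss ratio reweights false positives against false negatives; a loose analogue (an assumption, not the original setup) is class re-weighting in a modern learner:

```python
# Sweeping a loss-ratio-like knob: weighting the negative class more
# makes the classifier conservative (higher precision, lower recall
# on the positive class). Mirrors Ripper's loss ratio in spirit only.
import numpy as np
from sklearn.svm import LinearSVC

X = np.array([[0.1], [0.35], [0.4], [0.6], [0.65], [0.9]])
y = np.array([0, 0, 1, 0, 1, 1])

for L in (0.5, 1.0, 2.0, 4.0):
    clf = LinearSVC(class_weight={0: L, 1: 1.0}).fit(X, y)
    print(L, clf.predict(X))  # typically fewer positives as L grows
```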
18
Anaphoricity prefiltering
19
Antecedenthood prefiltering
20
Conclusion
Automatically induced detectors:
- Reliable for anaphoricity
- Much less reliable for antecedenthood (a corpus explicitly annotated for referentiality could help)
A&A prefiltering:
- Ideally, should help
- In practice, substantial optimization is required
21
Thank You!