
1 Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution
Ryu Iida, Kentaro Inui and Yuji Matsumoto
Nara Institute of Science and Technology
June 20th, 2006

2 Zero-anaphora resolution
Zero-anaphor = a gap with an anaphoric function
Zero-anaphora resolution is becoming important in many applications
In Japanese, even obligatory arguments of a predicate are often omitted when they are inferable from the context
45.5% of the nominative arguments of verbs are omitted in newspaper articles

3 Zero-anaphora resolution (cont'd)
Three sub-tasks:
Zero-pronoun detection: detect a zero-pronoun
Antecedent identification: identify the antecedent for a given zero-pronoun
Anaphoricity determination: classify whether a given zero-pronoun is anaphoric or non-anaphoric
Example:
Mary-wa John-ni (φ-ga) tabako-o yameru-youni it-ta
Mary-TOP John-DAT (φ-NOM) smoking-OBJ quit-COMP say-PAST
[Mary asked John to quit smoking.]
Here φ is an anaphoric zero-pronoun, and John is its antecedent.

4 Zero-anaphora resolution (cont'd)
Three sub-tasks:
Zero-pronoun detection: detect a zero-pronoun
Antecedent identification: identify the antecedent from the set of candidate antecedents for a given zero-pronoun
Anaphoricity determination: classify whether a given zero-pronoun is anaphoric or non-anaphoric
Anaphoric zero-pronoun:
Mary-wa John-ni (φ-ga) tabako-o yameru-youni it-ta
Mary-TOP John-DAT (φ-NOM) smoking-OBJ quit-COMP say-PAST
[Mary asked John to quit smoking.]
Non-anaphoric zero-pronoun:
(φ-ga) ie-ni kaeri-tai
(φ-NOM) home-DAT want to go back
[(φ = I) want to go home.]

5 Previous work on anaphora resolution
The research trend has been shifting from rule-based approaches (Baldwin, 95; Lappin and Leass, 94; Mitkov, 97, etc.) to empirical, learning-based approaches (Soon et al., 01; Ng, 04; Yang et al., 05, etc.)
A cost-efficient solution for achieving performance comparable to the best-performing rule-based systems
Learning-based approaches represent both anaphoricity determination and antecedent identification as sets of feature vectors and apply machine learning algorithms to them

6 Useful clues for both anaphoricity determination and antecedent identification
Syntactic pattern features
[Figure: dependency tree of the example "Mary-wa John-ni (φ-ga) tabako-o yameru-youni it-ta", marking the zero-pronoun (φ-ga) under its predicate yameru-youni (quit-COMP), the antecedent John-ni (John-DAT), the matrix predicate it-ta (say-PAST), and the other arguments Mary-wa (Mary-TOP) and tabako-o (smoking-OBJ)]

7 Useful clues for both anaphoricity determination and antecedent identification
Syntactic pattern features
Questions:
How to encode syntactic patterns as features?
How to avoid the data sparseness problem?
[Figure: the same dependency tree as on the previous slide]

8 Talk outline
1. Zero-anaphora resolution: Background
2. Selection-then-classification model (Iida et al., 05)
3. Proposed model
- Represents syntactic patterns based on dependency trees
- Uses a tree mining technique to seek useful sub-trees, addressing the data sparseness problem
- Incorporates syntactic pattern features into the selection-then-classification model
4. Experiments on Japanese zero-anaphora
5. Conclusion and future work

9 Selection-then-Classification Model (SCM) (Iida et al., 05)
Example text: "A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, …"
Candidate anaphor: USAir; candidate antecedents: federal judge, order, USAir Group Inc, suit, …
The candidate antecedents are passed to the tournament model.

10 Selection-then-Classification Model (SCM) (Iida et al., 05)
The tournament model (Iida et al., 03) runs pairwise matches between the candidate antecedents (federal judge, order, USAir Group Inc, suit, …) for the candidate anaphor USAir; the winner of each match advances to the next round.

11 Selection-then-Classification Model (SCM) (Iida et al., 05)
The tournament model outputs USAir Group Inc as the most likely candidate antecedent.

12 Selection-then-Classification Model (SCM) (Iida et al., 05)
The anaphoricity determination model then scores the pair (USAir Group Inc, USAir):
score < θ_ana: USAir is non-anaphoric
score ≥ θ_ana: USAir is anaphoric and USAir Group Inc is its antecedent

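To make the two-stage flow concrete, here is a minimal Python sketch of the SCM. The names prefer_right and anaphoricity_score are hypothetical stand-ins for the trained classifiers described in the talk, and theta_ana for the tuned threshold; this illustrates the control flow, not the authors' implementation.

```python
# Minimal sketch of the selection-then-classification model (SCM).
# `prefer_right` and `anaphoricity_score` stand in for trained
# classifiers (hypothetical names); `theta_ana` is the tuned threshold.

def tournament(candidates, anaphor, prefer_right):
    """Pairwise tournament (Iida et al., 03): each candidate plays the
    current winner; the survivor is the most likely antecedent."""
    best = candidates[0]
    for cand in candidates[1:]:
        # prefer_right(a, b, anaphor) is True if b beats a as an
        # antecedent for the anaphor.
        if prefer_right(best, cand, anaphor):
            best = cand
    return best

def resolve(candidates, anaphor, prefer_right, anaphoricity_score, theta_ana):
    """Selection then classification: select the most likely antecedent
    first, then decide whether the anaphor is anaphoric at all."""
    if not candidates:
        return None                  # nothing to select: non-anaphoric
    best = tournament(candidates, anaphor, prefer_right)
    if anaphoricity_score(best, anaphor) >= theta_ana:
        return best                  # anaphoric; `best` is the antecedent
    return None                      # non-anaphoric
```

On the slides' example, resolve over the candidates (federal judge, order, USAir Group Inc, suit) for the anaphor USAir returns USAir Group Inc exactly when the tournament prefers it and the pair's anaphoricity score clears θ_ana.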

14 Training the anaphoricity determination model
Anaphoric instances: for an anaphoric noun phrase (ANP) with candidate antecedents NP1 … NP5, the candidate selected by the tournament model (e.g. NP4, the antecedent) is paired with the ANP: (NP4, ANP) is a positive instance.
Non-anaphoric instances: for a non-anaphoric noun phrase (NANP) with candidate antecedents NP1 … NP5, the candidate selected by the tournament model (e.g. NP3) is paired with the NANP: (NP3, NANP) is a negative instance.
(NPi: candidate antecedent)
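
A compact sketch of this instance generation, reusing the tournament function from the SCM sketch above; the dict-based data layout is illustrative, not the paper's format.

```python
# Sketch: building training pairs for the anaphoricity model.
# Reuses `tournament` from the SCM sketch; data layout is illustrative.

def make_training_pairs(noun_phrases, prefer_right):
    """Pair each noun phrase with the candidate selected by the
    tournament model: a positive instance if the NP is anaphoric
    (the winner being its antecedent), negative if non-anaphoric."""
    pairs = []
    for np in noun_phrases:
        if not np["candidates"]:
            continue
        winner = tournament(np["candidates"], np["phrase"], prefer_right)
        label = +1 if np["is_anaphoric"] else -1
        pairs.append(((winner, np["phrase"]), label))
    return pairs
```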

15 Talk outline
1. Zero-anaphora resolution: Background
2. Selection-then-classification model (Iida et al., 05)
3. Proposed model
- Represents syntactic patterns based on dependency trees
- Uses a tree mining technique to seek useful sub-trees, addressing the data sparseness problem
- Incorporates syntactic pattern features into the selection-then-classification model
4. Experiments on Japanese zero-anaphora
5. Conclusion and future work

16 New model
The same selection-then-classification architecture, now with syntactic pattern features: the tournament model selects the most likely candidate antecedent (USAir Group Inc) for the candidate anaphor (USAir), and the anaphoricity determination model declares USAir anaphoric, with USAir Group Inc as its antecedent, if the pair's score ≥ θ_ana.

17 Use of syntactic pattern features
Encoding parse tree features
Learning useful sub-trees

18 Encoding parse tree features
[Figure: dependency tree of the example sentence, with the zero-pronoun (φ-ga / φ-NOM) under its predicate yameru-youni (quit-COMP), the antecedent John-ni (John-DAT), the matrix predicate it-ta (say-PAST), and the other arguments Mary-wa (Mary-TOP) and tabako-o (smoking-OBJ)]

19 Encoding parse tree features
[Figure: the same dependency tree; the nodes relevant to the anaphoric relation are highlighted]

20 Encoding parse tree features
[Figure: the tree pruned to the sub-tree connecting the antecedent, the zero-pronoun, and their predicates; content words are replaced by the generalized labels Antecedent, zero-pronoun, and predicate]

21 Encoding parse tree features
[Figure: functional words are kept as typed nodes attached to the generalized labels: youni/CONJ on the embedded predicate, ni/DAT on the antecedent, ga/CONJ on the zero-pronoun, and ta/PAST on the matrix predicate]

22 Encoding parse trees
From the example tree (LeftCand = Mary-wa / Mary-TOP, RightCand = John-ni / John-DAT), three sub-trees are extracted:
T_L: connects LeftCand, the zero-pronoun, and their predicates
T_R: connects RightCand, the zero-pronoun, and their predicates
T_I: connects LeftCand, RightCand, and the predicate

23 Encoding parse trees
Antecedent identification: the three sub-trees are attached under a common root node.

24 Encoding parse trees
Antecedent identification: binary feature nodes 1 … n, encoding lexical, grammatical, semantic, positional and heuristic features, are attached under the root alongside the three sub-trees.

25 Encoding parse trees
Antecedent identification: each tree instance is labeled "left" or "right", indicating which candidate is the better antecedent. A sketch of this encoding follows below.
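
Here is a minimal sketch of how such a tree instance might be assembled, using a trivial Node class. The root/sub-tree/feature-node layout follows the slides; the concrete labels and feature ids are illustrative, not the authors' exact encoding.

```python
# Sketch: assembling one antecedent-identification instance
# (root -> T_L, T_R, T_I plus binary feature nodes).

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def __repr__(self):
        if not self.children:
            return self.label
        return "(%s %s)" % (self.label, " ".join(map(repr, self.children)))

def encode_instance(t_left, t_right, t_inter, binary_features, label):
    """Attach the three generalized sub-trees and one node per active
    binary feature under a common root; `label` says whether the left
    or the right candidate is the better antecedent."""
    feature_nodes = [Node("f%d" % i) for i in binary_features]
    return Node("root", [t_left, t_right, t_inter] + feature_nodes), label

# Illustration, loosely mirroring the slides' generalized tree:
t_L = Node("pred:it-ta", [Node("LeftCand"),
                          Node("pred:yameru-youni", [Node("zero-pronoun")])])
t_R = Node("pred:it-ta", [Node("RightCand"),
                          Node("pred:yameru-youni", [Node("zero-pronoun")])])
t_I = Node("pred:it-ta", [Node("LeftCand"), Node("RightCand")])
# John-ni, the right candidate, is the true antecedent in the example.
instance, gold = encode_instance(t_L, t_R, t_I, binary_features=[3, 17],
                                 label="right")
```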

26 Learning useful sub-trees
Kernel methods:
Tree kernel (Collins and Duffy, 01)
Hierarchical DAG kernel (Suzuki et al., 03)
Convolution tree kernel (Moschitti, 04)
Boosting-based algorithm:
BACT (Kudo and Matsumoto, 04): the system learns a list of weighted decision stumps with the Boosting algorithm

27 Learning useful sub-trees
Boosting-based algorithm: BACT
Learns a list of weighted decision stumps with Boosting; each stump tests whether a particular sub-tree occurs in the labeled training instances (e.g. a sub-tree with weight 0.4 voting for the label "positive")
Classifies a given input tree by the weighted vote of the stumps; the sign of the summed score gives the label
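
A minimal sketch of the classification side of such a learner: each decision stump is a (sub-tree pattern, signed weight) pair, and the prediction is the sign of the weighted vote. contains_subtree is a naive matcher over the Node class from the earlier sketch; BACT's actual training step, which mines useful sub-trees with Boosting, is not shown.

```python
# Sketch of BACT-style classification by weighted decision stumps.
# A stump fires if its sub-tree pattern occurs in the input tree.

def contains_subtree(tree, pattern):
    """Naive check that `pattern` embeds at some node of `tree`
    (label equality plus an ordered embedding of the children)."""
    def matches_at(node, pat):
        if node.label != pat.label:
            return False
        i = 0
        for pc in pat.children:
            while i < len(node.children) and not matches_at(node.children[i], pc):
                i += 1
            if i == len(node.children):
                return False
            i += 1
        return True
    if matches_at(tree, pattern):
        return True
    return any(contains_subtree(child, pattern) for child in tree.children)

def classify(tree, stumps):
    """stumps: list of (pattern, signed_weight) pairs; the sign of the
    summed weights of the firing stumps gives the predicted label."""
    score = sum(w for pattern, w in stumps if contains_subtree(tree, pattern))
    return ("positive" if score >= 0 else "negative"), score
```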

28 Overall process
Input: a zero-pronoun φ in the sentence S
1. Apply the intra-sentential model (with syntactic pattern features):
if score_intra ≥ θ_intra, output the most likely candidate antecedent appearing in S
2. If score_intra < θ_intra, apply the inter-sentential model:
if score_inter ≥ θ_inter, output the most likely candidate appearing outside of S
if score_inter < θ_inter, return "non-anaphoric"
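
The cascade can be written down directly; a sketch with a hypothetical best_candidate interface for the two trained models, not the authors' code:

```python
# Sketch of the overall process for a zero-pronoun `phi` in sentence S.
# `best_candidate` (hypothetical interface) returns the top-scoring
# candidate antecedent and its score.

def resolve_zero_pronoun(phi, intra_model, inter_model,
                         theta_intra, theta_inter):
    # Stage 1: intra-sentential model (with syntactic pattern features).
    antecedent, score = intra_model.best_candidate(phi)
    if antecedent is not None and score >= theta_intra:
        return antecedent            # antecedent found within S
    # Stage 2: inter-sentential model over the preceding discourse.
    antecedent, score = inter_model.best_candidate(phi)
    if antecedent is not None and score >= theta_inter:
        return antecedent            # antecedent found outside of S
    return None                      # non-anaphoric
```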

29 Table of contents
1. Zero-anaphora resolution
2. Selection-then-classification model (Iida et al., 05)
3. Proposed model
- Parse encoding
- Tree mining
4. Experiments
5. Conclusion and future work

30 Experiments
Japanese newspaper article corpus annotated with zero-anaphoric relations: 197 texts (1,803 sentences)
995 intra-sentential anaphoric zero-pronouns
754 inter-sentential anaphoric zero-pronouns
603 non-anaphoric zero-pronouns
Recall = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns)
Precision = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns the model detected)

31 Experimental settings
Five-fold cross validation
Comparison among four models:
BM: Ng and Cardie (02)'s model: identifies an antecedent with candidate-wise classification; determines the anaphoricity of a given anaphor as a by-product of the search for its antecedent
BM_STR: BM + syntactic pattern features
SCM: selection-then-classification model (Iida et al., 05)
SCM_STR: SCM + syntactic pattern features

32 Results of intra-sentential ZAR
Antecedent identification (accuracy):
BM (Ng 02): 48.0% (478/995)
BM_STR: 63.5% (632/995)
SCM (Iida 05): 65.1% (648/995)
SCM_STR: 70.5% (701/995)
The performance of antecedent identification improved with syntactic pattern features.

33 Results of intra-sentential ZAR
Antecedent identification + anaphoricity determination
[Figure: results of the four models on the combined task]

34 Impact on overall ZAR
Evaluates the overall performance on both intra-sentential and inter-sentential ZAR
Baseline model: the SCM resolving intra-sentential and inter-sentential zero-anaphora simultaneously, with no syntactic pattern features

35 Results of overall ZAR

36 AUC curve
AUC (Area Under the recall-precision Curve), plotted by altering θ_intra
The curve is not peaky, so optimizing the parameter θ_intra is not difficult
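
For reference, a generic sketch of how such an AUC can be computed: sweep θ_intra, collect (recall, precision) points, and integrate with the trapezoidal rule. This is a standard recipe, not the authors' evaluation script.

```python
# Sketch: area under the recall-precision curve, sweeping theta_intra.
# `scored`: list of (score, is_correct) pairs, one per zero-pronoun the
# model would resolve; `n_gold`: number of anaphoric zero-pronouns.

def recall_precision_points(scored, n_gold, thresholds):
    points = []
    for t in thresholds:
        kept = [correct for score, correct in scored if score >= t]
        n_correct = sum(kept)
        recall = n_correct / n_gold
        precision = n_correct / len(kept) if kept else 1.0
        points.append((recall, precision))
    return sorted(points)

def auc(points):
    """Trapezoidal integration over recall-sorted (recall, precision)."""
    return sum((r1 - r0) * (p0 + p1) / 2.0
               for (r0, p0), (r1, p1) in zip(points, points[1:]))
```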

37 Conclusion
We have addressed the issue of how to use syntactic patterns in zero-anaphora resolution:
How to encode syntactic pattern features
How to seek useful sub-trees
Incorporating syntactic pattern features into our selection-then-classification model improves the accuracy of intra-sentential zero-anaphora resolution, which consequently improves the overall performance of zero-anaphora resolution.

38 Future work
How to find zero-pronouns? Designing a broader framework that interacts with the analysis of predicate-argument structure
How to find a globally optimal solution to the set of zero-anaphora resolution problems in a given discourse? Exploring methods such as those discussed by McCallum and Wellner (03)