A multiple knowledge source algorithm for anaphora resolution
Allaoua Refoufi, Computer Science Department, University of Setif, Setif 19000, Algeria

A multiple knowledge source algorithm for anaphora resolution
Allaoua Refoufi, Computer Science Department, University of Setif, Setif 19000, Algeria
Outline: Introduction, Types of anaphora, Knowledge sources, The main algorithm, Discussion, Conclusion

What is anaphora?
The term anaphora relates to the presence in a text of entities (noun phrases, pronouns, etc.) which, on one hand, refer to the same entity (are coreferential) and, on the other hand, supply additional information.
Reference to an entity is generally termed anaphora; the entity to which the anaphora refers is the antecedent or referent; the anaphor is the entity used to make the reference. Example: "My brother called last night, he wanted to see me."
An anaphor is a linguistic unit which substitutes for another linguistic unit already introduced.

Types of anaphora
Pronominal: the most common type; the reference is made by a pronoun: "Sabrina took the apple on the table. She ate it."
Definite noun phrase: the antecedent is referred to by a definite noun phrase: "The president visited the city. The host of the people's palace inaugurated several realisations."
Verb phrase as antecedent: "Sarah tried to convince him to stay. The attempt was vain."
Ordinal anaphora: the anaphor is an ordinal such as first, second, etc.: "Sarah was not satisfied by the solution. She looked for a new one."

Knowledge sources
Morphology is concerned with the structure of words; it tells us how to extract the base forms from the inflected forms that occur in texts.
Syntax is concerned with the ways words combine to form phrases, and phrases combine to form sentences. It assigns each word its syntactic category (verb, noun, pronoun, etc.); this process is known as parsing.
Semantics deals with the meaning of words, phrases and sentences.
Pragmatic knowledge uses the context in order to disambiguate among different settings.
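The presentation does not name any particular tool for the morphosyntactic level. As a purely illustrative sketch, assuming spaCy with its French model fr_core_news_sm (an assumption, not the author's implementation), the base form, category and agreement features needed later by the constraints could be obtained as follows:

import spacy

# Assumption: spaCy's small French model is installed
# (python -m spacy download fr_core_news_sm).
nlp = spacy.load("fr_core_news_sm")

doc = nlp("La dame discute avec Sarah. Elle est une voisine.")
for token in doc:
    # token.morph carries the gender, number and person features
    # (e.g. Gender=Fem|Number=Sing) used later by the consistency constraints.
    print(token.text, token.lemma_, token.pos_, token.morph)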

The main algorithm
Recognition phase:
Morphosyntactic analysis
Recognition of non-anaphoric pronouns
Identification of focusing expressions
Building of the data structures
Resolution phase, for each anaphor:
Apply the constraints, in order
Apply the preferences, in order
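The slides give the algorithm only in outline; the short Python sketch below shows one plausible way to organise the two phases. All names (the mention dictionaries, recognition_phase, resolution_phase) are hypothetical and introduced here for illustration only; they are not taken from the author's implementation.

def recognition_phase(mentions):
    # Recognition: keep only genuinely anaphoric pronouns as anaphors
    # (pleonastic 'il' is discarded) and all other mentions as candidates.
    anaphors = [m for m in mentions if m["is_pronoun"] and not m["is_pleonastic"]]
    candidates = [m for m in mentions if not m["is_pronoun"]]
    return anaphors, candidates

def resolution_phase(anaphors, candidates, constraints, preferences):
    # Resolution: constraints eliminate candidates outright; preferences are
    # applied in order of decreasing weight and only narrow the survivors.
    links = {}
    for anaphor in anaphors:
        survivors = [c for c in candidates if all(ok(anaphor, c) for ok in constraints)]
        for prefer in preferences:
            narrowed = [c for c in survivors if prefer(anaphor, c)]
            if narrowed:            # a preference may be violated: never empty the list
                survivors = narrowed
        # Recency as the final tie-break: take the closest remaining candidate.
        links[anaphor["text"]] = max(survivors, key=lambda c: c["position"]) if survivors else None
    return links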

Constraints
Constraints are rules that eliminate candidates from the structures built during the parsing process.
Consistency conditions: candidates are eliminated on morphological grounds (number, gender, person).
Condition on insertions: an expression which is included in an insertion cannot be the antecedent of an anaphor located outside the insertion.
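As an illustration, again with invented field names on the mention dictionaries, the two constraints could be written as simple boolean tests, directly usable as the constraints list in the sketch above:

def agreement_constraint(anaphor, candidate):
    # Consistency conditions: number, gender and person must all match.
    return (anaphor["number"] == candidate["number"]
            and anaphor["gender"] == candidate["gender"]
            and anaphor["person"] == candidate["person"])

def insertion_constraint(anaphor, candidate):
    # A candidate located inside an insertion cannot be the antecedent of an
    # anaphor located outside it (simplified here to a single boolean flag).
    return not (candidate["in_insertion"] and not anaphor["in_insertion"])

# Example: the feminine singular pronoun 'elle' cannot take 'le passage' as antecedent.
elle = {"gender": "f", "number": "sg", "person": 3, "in_insertion": False}
passage = {"gender": "m", "number": "sg", "person": 3, "in_insertion": False}
print(agreement_constraint(elle, passage))   # False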

Preferences
Preferences, unlike constraints, may be violated by antecedent candidates; they are used to rank the candidates, and those that satisfy a preference are retained. The order in which the preferences are applied reflects their weight.
Syntactic parallelism
Antecedent not occurring in a prepositional phrase
Focus expressions
Recency

Some preferences
Syntactic parallelism states that we prefer the antecedent that shares the same syntactic function as the anaphor: "The child recognized the king, although he had never met him before."
An expression included in a prepositional phrase is unlikely to be referred to, because it only brings additional information: "La voiture de la voisine bloque le passage, il faut la déplacer" (the neighbour's car is blocking the way; it has to be moved).
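The preferences only rank the candidates that survive the constraints. A hedged sketch of the first three (syntactic parallelism, avoidance of prepositional phrases, focus), using the same invented mention fields; recency is handled at the end of the resolution loop by picking the closest survivor:

def parallelism_preference(anaphor, candidate):
    # Prefer the candidate that shares the anaphor's syntactic function.
    return candidate["function"] == anaphor["function"]

def not_in_pp_preference(anaphor, candidate):
    # Prefer candidates that do not occur inside a prepositional phrase.
    return not candidate["in_pp"]

def focus_preference(anaphor, candidate):
    # Prefer candidates identified as the current focus of attention.
    return candidate["is_focus"]

# Applied in this order, reflecting their relative weight.
PREFERENCES = [parallelism_preference, not_in_pp_preference, focus_preference]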

Focus expressions
They identify the main theme, the focus of attention. They are of the form:
C'est NP qui … ("It is NP who/that …")
Il y a NP qui … ("There is NP who/that …")
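These two constructions can be spotted by surface matching. A rough sketch, in which the noun-phrase pattern is a deliberately crude stand-in for a real noun-phrase recogniser:

import re

# "C'est NP qui ..." and "Il y a NP qui ..." mark NP as the focus of attention.
FOCUS_PATTERNS = [
    re.compile(r"C'est\s+(?P<np>\w+(?:\s+\w+){0,3}?)\s+qui", re.IGNORECASE),
    re.compile(r"Il y a\s+(?P<np>\w+(?:\s+\w+){0,3}?)\s+qui", re.IGNORECASE),
]

def focus_expressions(sentence):
    # Return the noun phrases singled out by either construction.
    found = []
    for pattern in FOCUS_PATTERNS:
        for match in pattern.finditer(sentence):
            found.append(match.group("np"))
    return found

print(focus_expressions("C'est la voisine qui a appelé hier soir."))   # ['la voisine']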

Appositions
Fragments of sentences which can be eliminated without altering the main meaning. Goal: eliminate candidates which occur inside them. Mainly three forms:
Delimited by separators "," or "(": "La dame, assise en face de Sarah, était anxieuse. Elle voulait prendre la parole." (The lady, seated opposite Sarah, was anxious. She wanted to speak.)
Relative clauses: "La dame qui discute avec Sarah est une voisine." (The lady talking with Sarah is a neighbour.)
A single comma: "Caesar, the Roman emperor VERB …"
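For the separator-delimited case, a deliberately naive sketch is given below; relative clauses and the single-comma case need the parser's output and are not handled here. A real implementation would mark the mentions inside the insertion rather than delete text.

import re

def strip_insertions(sentence):
    # Remove parenthesised and comma-delimited insertions so that the mentions
    # inside them are not proposed as antecedent candidates.
    sentence = re.sub(r"\([^)]*\)", "", sentence)   # ( ... )
    sentence = re.sub(r",[^,]*,", ",", sentence)    # , assise en face de Sarah ,
    return re.sub(r"\s+", " ", sentence).strip()

print(strip_insertions("La dame, assise en face de Sarah, était anxieuse."))
# La dame, était anxieuse.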

Discussion
The algorithm achieves a success rate of 68%. The evaluation has been carried out so far on more than 100 texts of reasonable size (one page) taken from literary stories.
The results show that the resolution of pronouns such as il(s), elle(s), le, la is relatively successful (success rate of 93%).
The insertion constraint tends to add complexity to the implementation.

Unresolved problems
Multiple-source anaphora (one anaphor with several antecedents): "Sarah and Sofia left early this morning. They have an appointment at the university."
Self-referring expressions: "Everyone knows it, John is a good driver."
Reference to verb phrases or sentences: "On two wheels we are vulnerable. The problem is to forget it."

Conclusion
The main idea of our work is to establish a link between nominal phrases that share a similar context with constituents in the input text.
It relies heavily on a morphosyntactic parser. The application of a set of constraints followed by a set of preferences yields an elegant, modular, easy-to-update anaphora resolution algorithm.
Unfortunately, the current state of the art in practically applicable parsing technology still falls short of delivering robust and reliable syntactic analyses of real texts at the level of detail and precision that most algorithms assume.
Shallow parsing, on the other hand, can greatly affect the performance and efficiency of the algorithm.

Related work

System             Type of knowledge   Success rate   Corpus
Lappin & Leass     Robust parser       75%            Computer texts
Kennedy et al.     Shallow parser      85%            Web documents
Mitkov             P.O.S. tagger       89.7%          Manual texts