Chapter 18: Discourse
Tianjun Fu
Ling538 Presentation, Nov 30th, 2006


Introduction

Language consists of collocated, related groups of sentences. We refer to such a group of sentences as a discourse. There are three forms of discourse:
- Monologue
- Dialogue
- Human-computer interaction (HCI)

This chapter focuses on techniques commonly applied to the interpretation of monologues.

Reference Resolution

- Reference: the process by which speakers use expressions to denote an entity.
- Referring expression: an expression used to perform reference.
- Referent: the entity that is referred to.
- Coreference: two or more referring expressions used to refer to the same entity.
- Anaphora: reference to a previously introduced entity.

Reference Resolution

Discourse Model (Webber, 1978):
- It contains representations of the entities that have been referred to in the discourse and the relationships in which they participate.

Two components are required by a system to produce and interpret referring expressions:
- A method for constructing a discourse model that evolves dynamically.
- A method for mapping between referring expressions and referents.
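The two components above can be sketched as a small data structure. This is a minimal illustration, not Webber's formulation; the class and method names are invented for the example.

```python
# A minimal discourse-model sketch: entities are stored as they are
# introduced, and each referring expression is mapped to the entity
# (referent) it denotes. Names here are illustrative only.

class DiscourseModel:
    def __init__(self):
        self.entities = []     # discourse entities, in order of introduction
        self.referents = {}    # referring expression -> entity index

    def introduce(self, description):
        """Add a new entity evoked by, e.g., an indefinite NP ('a Ford Escort')."""
        self.entities.append(description)
        idx = len(self.entities) - 1
        self.referents[description] = idx
        return idx

    def corefer(self, expression, idx):
        """Map a later referring expression (e.g. 'it') to an existing entity."""
        self.referents[expression] = idx

# "I saw a Ford Escort today. The Escort was white. It was white."
model = DiscourseModel()
e = model.introduce("a Ford Escort")
model.corefer("the Escort", e)
model.corefer("it", e)
```

The model evolves dynamically (new entities are appended as the discourse proceeds), and the `referents` map realizes the mapping between referring expressions and referents.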

Reference Phenomena

Five common types of referring expression:

Type                     Example
Indefinite noun phrase   I saw a Ford Escort today.
Definite noun phrase     I saw a Ford Escort today. The Escort was white.
Pronoun                  I saw a Ford Escort today. It was white.
Demonstrative            I like this better than that.
One-anaphora             I saw 6 Ford Escorts today. Now I want one.

Three types of referring expression that complicate reference resolution:

Type                     Example
Inferrables              I almost bought a Ford Escort, but a door had a dent.
Discontinuous sets       John and Mary love their Escorts. They often drive them.
Generics                 I saw 6 Ford Escorts today. They are the coolest cars.

Reference Resolution

How do we develop successful algorithms for reference resolution? Two steps are necessary:
- First, filter the set of possible referents with certain hard-and-fast constraints.
- Second, rank the remaining possible referents by preference.

Constraints (for English)

Number agreement:
- Distinguishes singular from plural references.
  *John has a new car. They are red.

Gender agreement:
- Distinguishes male, female, and nonpersonal genders.
  John has a new car. It is attractive. [It = the new car]

Person and case agreement:
- Distinguishes the three forms of person.
  *You and I have Escorts. They love them.
- Distinguishes subject position, object position, and genitive position.

Constraints (for English)

Syntactic constraints:
- Syntactic relationships between a referring expression and a possible antecedent noun phrase.
  John bought himself a new car. [himself = John]
  John bought him a new car. [him != John]

Selectional restrictions:
- A verb places restrictions on its arguments.
  John parked his Acura in the garage. He had driven it around for hours. [it = the Acura]
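Hard constraints like number and gender agreement can be applied as a simple filter before any preference ranking. The sketch below assumes a hand-built feature lexicon (the entries and feature values are invented for illustration); a real system would obtain these features from morphological analysis.

```python
# Agreement filtering: candidates that fail number or gender agreement
# with the pronoun are removed. The lexicon below is hypothetical.

FEATURES = {
    "John": {"number": "sg", "gender": "masc"},
    "car":  {"number": "sg", "gender": "neut"},
    "cars": {"number": "pl", "gender": "neut"},
}

PRONOUNS = {
    "he":   {"number": "sg", "gender": "masc"},
    "it":   {"number": "sg", "gender": "neut"},
    "they": {"number": "pl", "gender": None},  # 'they' is unmarked for gender
}

def agree(pronoun, candidate):
    p, c = PRONOUNS[pronoun], FEATURES[candidate]
    if p["number"] != c["number"]:
        return False
    if p["gender"] is not None and p["gender"] != c["gender"]:
        return False
    return True

def filter_candidates(pronoun, candidates):
    """Keep only candidates passing the hard agreement constraints."""
    return [c for c in candidates if agree(pronoun, c)]

# "John has a new car. It is attractive."
print(filter_candidates("it", ["John", "car"]))   # -> ['car']
```

Syntactic constraints and selectional restrictions would be further filters in the same pipeline, but they require parse trees and lexical semantics rather than a feature table.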

Preferences in Pronoun Interpretation

Recency:
- Entities introduced recently are more salient than those introduced earlier.
  John has a Legend. Bill has an Escort. Mary likes to drive it. [it = the Escort]

Grammatical role:
- Entities mentioned in subject position are more salient than those in object position.
  John went to the Ford dealership with Bill. He bought an Escort. [He = John]

Repeated mention:
- Entities that have been focused on in the prior discourse are more salient.

Preferences in Pronoun Interpretation

Parallelism:
- There are also strong preferences that appear to be induced by parallelism effects.
  Mary went with Sue to the cinema. Sally went with her to the mall. [her = Sue]

Verb semantics:
- Certain verbs appear to place a semantically oriented emphasis on one of their argument positions.
  John telephoned Bill. He lost the book in the mall. [He = John]
  John criticized Bill. He lost the book in the mall. [He = Bill]

These preferences are not perfect.

An Algorithm for Pronoun Resolution

The algorithm (Lappin & Leass, 1994) employs a simple weighting scheme that integrates the effects of several preferences:
- For each new entity, a representation is added to the discourse model and a salience value is computed for it.
- The salience value is computed as the sum of the weights assigned by a set of salience factors. The weight a salience factor assigns to a referent is the highest weight it assigns to any of the referent's referring expressions.
- Salience values are cut in half each time a new sentence is processed.

An Algorithm for Pronoun Resolution

Salience factor            Salience value
Sentence recency           100
Subject emphasis            80
Existential emphasis        70
Accusative emphasis         50
Indirect object emphasis    40
Non-adverbial emphasis      50
Head noun emphasis          80

*The weights were arrived at by experimentation on a particular corpus.

An Algorithm for Pronoun Resolution

The steps taken to resolve a pronoun are as follows:
1. Collect potential referents (up to four sentences back).
2. Remove potential referents that do not semantically agree.
3. Remove potential referents that do not syntactically agree.
4. Compute salience values for the remaining potential referents.
5. Select the referent with the highest salience value.
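The weighting scheme can be sketched roughly as below. This is a simplified illustration, not Lappin and Leass's implementation: the salience factors satisfied by each candidate are hand-supplied here, whereas the real algorithm derives them from a syntactic parse, and the agreement filtering steps are omitted.

```python
# Salience-based ranking in the style of Lappin & Leass (1994).
# Weights follow the table above; candidates are (entity, factors,
# sentences_back) tuples supplied by hand for this sketch.

WEIGHTS = {
    "recency": 100,
    "subject": 80,
    "existential": 70,
    "accusative": 50,
    "indirect_object": 40,
    "non_adverbial": 50,
    "head_noun": 80,
}

def salience(factors):
    """Sum the weights of the salience factors a candidate satisfies."""
    return sum(WEIGHTS[f] for f in factors)

def resolve(candidates):
    """Pick the highest-salience candidate; salience is halved once
    per intervening sentence, implementing the degradation step."""
    best, best_score = None, float("-inf")
    for entity, factors, sentences_back in candidates:
        score = salience(factors) / (2 ** sentences_back)
        if score > best_score:
            best, best_score = entity, score
    return best

# "John went to the Ford dealership with Bill. He bought an Escort."
cands = [
    ("John",       ["recency", "subject", "non_adverbial", "head_noun"], 1),
    ("dealership", ["recency", "non_adverbial", "head_noun"], 1),
]
print(resolve(cands))   # -> John
```

John wins because subject emphasis (80) outweighs the dealership's otherwise identical factor set, matching the grammatical-role preference described earlier.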

Other Algorithms for Pronoun Resolution

A centering algorithm (Grosz et al., 1995):
- There is a single entity being "centered" on at any given point in the discourse.
- It also has an explicit representation of a discourse model.
- The major difference from the previous algorithm is that there are no numerical weights; the factors are simply ordered relative to each other.

A tree-search algorithm (Hobbs, 1978):
- No explicit representation of a discourse model.
- It searches the syntactic parse tree.

Disadvantages and Limitations of Lappin and Leass's Algorithm

- It was developed on the assumption that correct syntactic structures are available.
- The weights used were based on a corpus of computer training manuals, which limits generalizability.
- It works only for pronouns, not for all noun phrases.

Related Work

- Ge, Hale, and Charniak (1998) used a statistical model for resolving pronouns.
- Kehler (1997) used maximum entropy modeling to assign a probability distribution to coreference relationships.
- Soon et al. (2001) used decision tree learning to resolve general noun phrases.
- Aone and Bennett (1995) used decision tree learning for coreference resolution in Japanese texts.

Comparison

How do we compare these algorithms? Which one is best?

"a long-standing weakness in the area of anaphora resolution: the inability to fairly and consistently compare anaphora resolution algorithms due not only to the difference of evaluation data used, but also to the diversity of pre-processing tools employed by each system." (Barbu & Mitkov, 2001)

It now seems popular to evaluate algorithms on the MUC-6 and MUC-7 coreference corpora.
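Evaluation on the MUC corpora typically reports the link-based MUC score of Vilain et al. (1995). A rough sketch, with invented data (coreference chains are treated as sets of mention identifiers):

```python
# Link-based MUC score sketch: recall measures how few pieces the
# system (response) splits each gold (key) chain into.

def pieces(key, response_chains):
    """Number of parts a key chain is split into by the response."""
    covered, n = set(), 0
    for chain in response_chains:
        overlap = key & chain
        if overlap:
            n += 1
            covered |= overlap
    return n + len(key - covered)  # unresolved mentions count as singletons

def muc_recall(key_chains, response_chains):
    num = sum(len(k) - pieces(k, response_chains) for k in key_chains)
    den = sum(len(k) - 1 for k in key_chains)
    return num / den

def muc_precision(key_chains, response_chains):
    # Precision is recall with the roles of key and response swapped.
    return muc_recall(response_chains, key_chains)

# One gold chain {1,2,3,4}; the system split it into {1,2} and {3,4}.
key, resp = [{1, 2, 3, 4}], [{1, 2}, {3, 4}]
print(round(muc_recall(key, resp), 3))   # -> 0.667
print(muc_precision(key, resp))          # -> 1.0
```

A shared metric like this addresses half of Barbu and Mitkov's complaint (comparable scoring), though differences in preprocessing tools remain.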