Pronouns and Reference Resolution CS 4705 Julia Hirschberg

A Reference Joke Gracie: Oh yeah... and then Mr. and Mrs. Jones were having matrimonial trouble, and my brother was hired to watch Mrs. Jones. George: Well, I imagine she was a very attractive woman. Gracie: She was, and my brother watched her day and night for six months. George: Well, what happened? Gracie: She finally got a divorce. George: Mrs. Jones? Gracie: No, my brother's wife.

Terminology Discourse: anything longer than a single utterance or sentence –Monologue –Dialogue: May be multi-party May be human-machine

Reference Resolution Process of associating Bloomberg/he/his with a particular person and big budget problem/it with a concept Giuliani left Bloomberg to be mayor of a city with a big budget problem. It’s unclear how he’ll be able to handle it during his term. Referring exprs.: Giuliani, Bloomberg, he, it, his Presentational it, there: non-referential Referents: the person named Bloomberg, the concept of a big budget problem

Co-referring referring expressions: Bloomberg, he, his Antecedent: Bloomberg Anaphors: he, his

Discourse Models Needed to model reference because referring expressions (e.g., Giuliani, Bloomberg, he, it, budget problem) encode information about beliefs about the referent When a referent is first mentioned in a discourse, a representation is evoked in the model –Information predicated of it is also stored in the model –On subsequent mention, it is accessed from the model

Types of Referring Expressions Entities, concepts, places, propositions, events,... According to John, Bob bought Sue an Integra, and Sue bought Fred a Legend. –But that turned out to be a lie. (a speech act) –But that was false. (proposition) –That struck me as a funny way to describe the situation. (manner of description) –That caused Sue to become rather poor. (event) –That caused them both to become rather poor. (combination of multiple events)

Reference Phenomena Indefinite NPs A homeless man hit up Bloomberg for a dollar. Some homeless guy hit up Bloomberg for a dollar. This homeless man hit up Bloomberg for a dollar. Definite NPs The poor fellow only got a lecture. Demonstratives This homeless man got a lecture but that one got carted off to jail.

One-anaphora Clinton used to have a dog called Buddy. Now he’s got another one.

Pronouns A large tiger escaped from the Central Park zoo chasing a tiny sparrow. It was recaptured by a brave policeman. –Referents of pronouns usually require some degree of salience in the discourse (as opposed to definite and indefinite NPs, e.g.) –How do items become salient in discourse?

Salience vs. Recency: The Rule of 2 Sentences He had dodged the press for 36 hours, but yesterday the Buck House Butler came out of the cocoon of his room at the Millennium Hotel in New York and shoveled some morsels the way of the panting press. First there was a brief, if obviously self-serving, statement, and then, in good royal tradition, a walkabout. Dapper in a suit and colourfully striped tie, Paul Burrell was stinging from a weekend of salacious accusations in the British media. He wanted us to know: he had decided after his acquittal at his theft trial to sell his story to the Daily Mirror because he needed the money to stave off "financial ruination". And he was here in America further to spill the beans to the ABC TV network simply to tell "my side of the story".

If he wanted attention in America, he was getting it. His lawyer in the States, Richard Greene, implored us to leave alone him, his wife, Maria, and their two sons, Alex and Nicholas, as they spent three more days in Manhattan. Just as quickly he then invited us outside to take pictures and told us where else the besieged family would be heading: Central Park, the Empire State Building and ground zero. The "blabbermouth", as The Sun – doubtless doubled up with envy at the Mirror's coup – has taken to calling Mr Burrell, said not a word during the 10-minute outing to Times Square. But he and his wife, in pinstripe jacket and trousers, wore fixed smiles even as they struggled to keep their footing against a surging scrum of cameramen and reporters. Only the two boys looked resolutely miserable.

Salience vs. Structural Recency E: So you have the engine assembly finished. Now attach the rope. By the way, did you buy the gas can today? A: Yes. E: Did it cost much? A: No. E: OK, good. Have you got it attached yet?

Inferables I almost bought an Acura Integra today, but a door had a dent and the engine seemed noisy. Mix the flour, butter, and water. Knead the dough until smooth and shiny.

Discontinuous Sets Entities evoked together but mentioned in different sentences or phrases John has a St. Bernard and Mary has a Yorkie. They arouse some comment when they walk them in the park.

Generics I saw two Corgis and their seven puppies today. They are the funniest dogs.

Constraints on Coreference Number agreement John’s parents like opera. John hates it/John hates them. Person and case agreement –Nominative: I, we, you, he, she, they –Accusative: me,us,you,him,her,them –Genitive: my,our,your,his,her,their George and Edward brought bread and cheese. They shared them.

Gender agreement John has a Porsche. He/it/she is attractive. Syntactic constraints: binding theory John bought himself a new Volvo. (himself = John) John bought him a new Volvo. (him = not John) Selectional restrictions John left his plane in the hangar. He had flown it from Memphis this morning.
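To make the role of these hard constraints concrete, here is a minimal sketch (not from the lecture) of agreement checking as a filter over candidate antecedents; the Mention class, its feature fields, and the candidate list are hypothetical stand-ins for whatever a real pipeline provides.

# Hypothetical sketch: filtering candidate antecedents with hard agreement
# constraints (number, gender, person) before any preference ranking.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Mention:
    text: str
    number: Optional[str] = None   # "sg" or "pl", None if unknown
    gender: Optional[str] = None   # "masc", "fem", "neut", None if unknown
    person: Optional[str] = None   # "1", "2", "3"

def agrees(feat_a: Optional[str], feat_b: Optional[str]) -> bool:
    # Unknown features are treated as compatible rather than as violations.
    return feat_a is None or feat_b is None or feat_a == feat_b

def compatible(pronoun: Mention, candidate: Mention) -> bool:
    return (agrees(pronoun.number, candidate.number)
            and agrees(pronoun.gender, candidate.gender)
            and agrees(pronoun.person, candidate.person))

# "John has a Porsche. He is attractive."
pronoun = Mention("he", number="sg", gender="masc", person="3")
candidates = [Mention("John", "sg", "masc", "3"),
              Mention("a Porsche", "sg", "neut", "3")]
print([c.text for c in candidates if compatible(pronoun, c)])  # ['John']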

Interpretation of Pronouns Recency John bought a new boat. Bill bought a bigger one. Mary likes to sail it. But…grammatical role raises its ugly head… John went to the Acura dealership with Bill. He bought an Integra. Bill went to the Acura dealership with John. He bought an Integra. ?John and Bill went to the Acura dealership. He bought an Integra.

And so does…repeated mention –John needed a car to go to his new job. He decided that he wanted something sporty. Bill went to the dealership with him. He bought a Miata. –Who bought the Miata? –What about grammatical role preference? Parallel constructions Saturday, Mary went with Sue to the farmer’s market. Sally went with her to the bookstore. Sunday, Mary went with Sue to the mall. Sally told her she should get over her shopping obsession.

Verb semantics/thematic roles John telephoned Bill. He’d lost the directions to his house. John criticized Bill. He’d lost the directions to his house.

Pragmatics Context-dependent meaning Jeb Bush was helped by his brother and so was Frank Lautenberg. (Strict vs. Sloppy) Mike Bloomberg bet George Pataki a baseball cap that he could/couldn’t run the marathon in under 3 hours. Mike Bloomberg bet George Pataki a baseball cap that he could/couldn’t be hypnotized in under 1 minute.

What Affects Reference Resolution? Lexical factors –Reference type: Inferrability, discontinuous set, generics, one anaphora, pronouns,… Discourse factors: –Recency –Focus/topic structure, digression –Repeated mention Syntactic factors: –Agreement: gender, number, person, case –Parallel construction –Grammatical role

–Selectional restrictions Semantic/lexical factors –Verb semantics, thematic role Pragmatic factors

Reference Resolution Algorithms Given these types of constraints, can we construct an algorithm that will apply them such that we can identify the correct referents of anaphors and other referring expressions?

Reference Resolution Task Finding in a text all the referring expressions that have one and the same denotation –Pronominal anaphora resolution –Anaphora resolution between named entities –Full noun phrase anaphora resolution

Issues Which constraints/features can/should we make use of? How should we order them? I.e. which override which? What should be stored in our discourse model? I.e., what types of information do we need to keep track of? How to evaluate?

Some Algorithms Lappin & Leass ‘94: weighting via recency and syntactic preferences Hobbs ‘78: syntax tree-based referential search Supervised approaches

Hobbs: Syntax-Based Approach Search for the antecedent in the parse tree of the current sentence, then in prior sentences in order of recency –For the current S, walk a path p from the pronoun up to the first NP or S node (X) above it; search the nodes to the left of p, left-to-right, breadth-first –Propose as the pronoun’s antecedent any NP found, as long as it has an NP or S node between itself and X –If X is the highest node in the sentence, search prior sentences, left-to-right, breadth-first, for candidate NPs –Otherwise, continue searching the current tree by going to the next S or NP above X before moving to prior sentences
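A rough sketch of the search order this imposes, assuming sentences have already been parsed into nltk.Tree objects; it keeps only the left-to-right, breadth-first traversal and the current-sentence-then-prior-sentences ordering, and omits the path-based restrictions of the full algorithm, so it is an illustration rather than a faithful implementation.

# Simplified sketch of the candidate ordering in Hobbs-style search.
# In practice the pronoun itself (and NPs ruled out by the path
# restrictions) would be excluded from the candidates.
from collections import deque
from nltk import Tree

def np_candidates_bfs(tree):
    """Yield NP subtrees of a parse tree left-to-right, breadth-first."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, Tree):
            if node.label() == "NP":
                yield node
            queue.extend(node)  # children, in left-to-right order

def propose_antecedents(current_tree, prior_trees):
    """Candidates from the current sentence first, then prior sentences
    from most to least recent (prior_trees given in document order)."""
    for tree in [current_tree] + list(reversed(prior_trees)):
        for np in np_candidates_bfs(tree):
            yield " ".join(np.leaves())

s1 = Tree.fromstring("(S (NP (NNP John)) (VP (VBD bought) (NP (DT a) (NN boat))))")
s2 = Tree.fromstring("(S (NP (PRP He)) (VP (VBD sailed) (NP (PRP it))))")
print(list(propose_antecedents(s2, [s1])))  # ['He', 'it', 'John', 'a boat']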

Hobbs’ Example

Lappin & Leass: Salience Weights candidate antecedents by recency and syntactic preference (86% accuracy) Two major functions to perform: –Update the discourse model when an NP that evokes a new entity is found in the text, computing the salience of this entity for future anaphora resolution –Find most likely referent for current anaphor by considering possible antecedents and their salience values

Saliency Factor Weights
Sentence recency (in current sentence?) 100
Subject emphasis (is it the subject?) 80
Existential emphasis (existential prednom?) 70
Accusative emphasis (is it the dir obj?) 50
Indirect object/oblique comp emphasis 40
Non-adverbial emphasis (not in PP) 50
Head noun emphasis (is head noun) 80

Implicit ordering of arguments: subj/exist pred/obj/indobj-oblique/dem.advPP On the sofa, the cat was eating candy. It…. Computing salience scores: sofa: 100 + 80 = 180; cat: 100 + 80 + 50 + 80 = 310; candy: 100 + 50 + 50 + 80 = 280 Update: –Weights accumulate over time –Cut in half after each sentence processed –Salience values for subsequent referents accumulate for equivalence class of co-referential items (exceptions, e.g. multiple references in same sentence)
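A hedged sketch of the bookkeeping this implies: factor weights from the table above are summed per entity and halved at each sentence boundary. The factor names and data structures below are invented for illustration and are not taken from Lappin & Leass’ implementation.

# Hypothetical sketch of Lappin & Leass-style salience bookkeeping.
FACTOR_WEIGHTS = {
    "sentence_recency": 100,
    "subject": 80,
    "existential": 70,
    "accusative": 50,
    "indirect_object": 40,
    "non_adverbial": 50,
    "head_noun": 80,
}

def salience(factors):
    """Sum the weights of the factors that apply to a mention."""
    return sum(FACTOR_WEIGHTS[f] for f in factors)

# "On the sofa, the cat was eating candy."
scores = {
    "sofa":  salience(["sentence_recency", "head_noun"]),                                  # 180
    "cat":   salience(["sentence_recency", "subject", "non_adverbial", "head_noun"]),      # 310
    "candy": salience(["sentence_recency", "accusative", "non_adverbial", "head_noun"]),   # 280
}

def decay(scores):
    """Halve all salience values when moving past a sentence boundary."""
    return {entity: value / 2 for entity, value in scores.items()}

print(scores)          # {'sofa': 180, 'cat': 310, 'candy': 280}
print(decay(scores))   # weights cut in half after the sentence is processed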

Evaluation Walker ‘89 manual comparison of Centering vs. Hobbs ‘78 –Only 281 examples from 3 genres –Assumed correct features given as input to each –Centering 77.6% vs. Hobbs 81.8% –Lappin and Leass’ 86% accuracy on test set from computer training manuals Type of text used for the evaluation –Lappin and Leass’ computer manual texts (86% accuracy) –Statistical approach on WSJ articles (83% accuracy) –Syntactic approach on different genres (75% accuracy)

Supervised Anaphora Resolution Input: pronoun plus current and preceding sentences Training: hand-labeled corpus of positive examples plus inferred negative examples Features Strict number Compatible number Strict gender Compatible gender Sentence distance

Hobbs distance (noun groups between pronoun and candidate antecedent) Grammatical role Linguistic form (proper name, definite, indefinite, pronominal antecedent)
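A minimal sketch of this supervised setup: each (pronoun, candidate antecedent) pair becomes a feature vector along the lines listed above and is scored by an off-the-shelf classifier. The feature values and the choice of scikit-learn are illustrative assumptions, not the specific system described in the lecture.

# Hypothetical sketch of supervised pronoun resolution as binary
# classification over (pronoun, candidate) pairs, using scikit-learn.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Invented training pairs; 1 = candidate is the antecedent, 0 = it is not.
pairs = [
    ({"strict_number": 1, "compatible_gender": 1, "sentence_distance": 0,
      "hobbs_distance": 1, "gram_role": "subject", "form": "proper"}, 1),
    ({"strict_number": 0, "compatible_gender": 1, "sentence_distance": 1,
      "hobbs_distance": 4, "gram_role": "object", "form": "indefinite"}, 0),
    ({"strict_number": 1, "compatible_gender": 0, "sentence_distance": 2,
      "hobbs_distance": 6, "gram_role": "object", "form": "definite"}, 0),
    ({"strict_number": 1, "compatible_gender": 1, "sentence_distance": 1,
      "hobbs_distance": 2, "gram_role": "subject", "form": "pronoun"}, 1),
]

vectorizer = DictVectorizer()
X = vectorizer.fit_transform([features for features, _ in pairs])
y = [label for _, label in pairs]

model = LogisticRegression().fit(X, y)

# At resolution time, score every candidate for a pronoun and pick the best.
candidate = {"strict_number": 1, "compatible_gender": 1, "sentence_distance": 0,
             "hobbs_distance": 1, "gram_role": "subject", "form": "proper"}
print(model.predict_proba(vectorizer.transform([candidate]))[0][1])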

Newer Approaches Reason over all possible coreference relations as sets (within-doc) –Culotta, Hall and McCallum '07, First-Order Probabilistic Models for Coreference Resolution Reasoning over proper probabilistic models of clustering (across doc) –Haghighi and Klein '07, Unsupervised Coreference Resolution in a Nonparametric Bayesian Model

Summary Many approaches to reference resolution Use similar information/features but different methods –Lappin & Leass’ salience –Hobbs’ syntax –Centering’s grammatical function –Supervised and shallow methods use simple techniques with reasonable success