Download presentation
Presentation is loading. Please wait.
Published byHortense Jackson Modified over 9 years ago
1
1 CS3730/ISP3120 Discourse Processing and Pragmatics Lecture Notes Jan 10, 12
2
2 Outline Finish going over the syllabus and list of assigned papers. Signup for presentation dates. Questions about Bonnie Webber’s Chapter in the Handbook of Discourse processing? Questions about the Intro to the Handbook of Discourse Processing? Information about Reference Information about Annotation
3
3 Syllabus and List of Assigned Papers Note that the weighting for calculating final grades has changed, to give you more credit for presentations and reaction essays. Instructions for the course yahoo! group were added to the syllabus. The data cannot be posted to the web, so information about where it is has been posted to the group. Note that a journal article was removed from the list of assigned papers, and the schedule changed accordingly Auditors: the grade you will receive is NC, for “no credit” This is a change effective this term.
4
4 List of Assigned Papers Schedule
5
5 Signup for Presentation Dates Non-auditors (NAs) should sign up for two lectures. NAs should randomly select a number NAs will select a presentation date in increasing order, and then select their second presentation date in decreasing order
6
6 Signup for presentation dates Let doubles = (# NAs * 2) – 24 At most doubles/2 (rounded up) paired presentations may be chosen during the first round. An individual NA may be involved in at most one paired presentation A day may have at most 2 presenters The following papers much have sole presenters: Lappin & Leass 94, Mann & Thompson 88, Hobbs 79
7
7 Questions on Bonnie Webber’s Chapter? Good for pointers into the literature. P. 800: ellipsis, eventualities, information structure (general idea of what they are?) P. 802-803: understand the examples?
8
8 Questions about the Introduction to The Handbook of DP? Many fields study discourse. Different definitions, theoretical paradigms, and methodologies Interesting links are described on page 6 Most of the topics in Part 1, Discourse Analysis and Linguistics, have influenced work in NLP (except historical linguistics; diachronic) Throughout, many possibilities for NLP, some in interdisciplinary work
9
9 Reference –Kinds of reference phenomena –Constraints on co-reference –Preferences for co-reference From Dan Jurafsky’s Lecture notes. Here are his acknowledgements: Thanks to Diane Litman, Andy Kehler, Jim Martin!!! This material is from J+M Chapter 18, written by Andy Kehler, slides inspired by Diane Litman + Jim Martin
10
10 Reference Resolution John went to Bill’s car dealership to check out an Acura Integra. He looked at it for half an hour I’d like to get from Boston to San Francisco, on either December 5th or December 6th. It’s ok if it stops in another city along they way
11
11 Why reference resolution? Conversational Agents: Airline reservation system needs to know what “it” refers to in order to book correct flight Information Extraction: First Union Corp. is continuing to wrestle with severe problems unleashed by a botched merger and a troubled business strategy. According to industry insiders at Paine Webber, their president, John R. Georgius, is planning to retire by the end of the year.
12
12 Some terminology John went to Bill’s car dealership to check out an Acura Integra. He looked at it for half an hour Reference: process by which speakers use words John and he to denote a particular person –Referring expression: John, he –Referent: the actual entity (but as a shorthand we might call “John” the referent). –John and he “corefer” –Antecedent: John –Anaphor: he
13
13 Many types of reference (after Webber 91) According to John, Bob bought Sue an Integra, and Sue bought Fred a Legend –But that turned out to be a lie (a speech act) –But that was false (proposition) –That struck me as a funny way to describe the situation (manner of description) –That caused Sue to become rather poor (event) –That caused them both to become rather poor (combination of several events)
14
14 Reference Phenomena Indefinite noun phrases: new to hearer –I saw an Acura Integra today –Some Acura Integras were being unloaded… –I am going to the dealership to buy an Acura Integra today. (specific/non-specific) I hope they still have it I hope they have a car I like Definite noun phrases: identifiable to hearer because –Mentioned: I saw an Acura Integra today. The Integra was white –Identifiable from beliefs: The Indianapolis 500 –Inherently unique: The fastest car in …
15
15 Indefinites (an aside) Lots of complexities; an e.g. of one type: The king and his men don’t know Merry and Pippin, and they can’t even see what they are (superordinate term “figures” versus basic level term “hobbits“) There they [the King and his men] saw close beside them a great rubbleheap; and suddenly they were aware of two small figures lying on it at their ease, grey-clad, hardly to be seen among the stones. [The Two Towers, Tolkein]
16
16 Reference Phenomena: Pronouns I saw an Acura Integra today. It was white Compared to definite noun phrases, pronouns require more referent salience. –John went to Bob’s party, and parked next to a beautiful Acura Integra –He went inside and talked to Bob for more than an hour. –Bob told him that he recently got engaged. –??He also said that he bought it yesterday. –OK He also said that he bought the Acura yesterday
17
17 More on Pronouns Cataphora: pronoun appears before referent: –Before he bought it, John checked over the Integra very carefully.
18
18 Inferrables I almost bought an Acura Integra today, but the engine seemed noisy. Mix the flour, butter, and water. –Kneed the dough until smooth and shiny –Spread the paste over the blueberries –Stir the batter until all lumps are gone.
19
19 Generics I saw no less than 6 Acura Integras today. They are the coolest cars.
20
20 Pronominal Reference Resolution Given a pronoun, find the reference (either in text or as a entity in the world) We will look at constraints. The first student presentations will look at resolution algorithms. –Hard constraints on reference –Soft constraints on reference
21
21 Hard constraints on coreference Number agreement –John has an Acura. It is red. Person and case agreement –*John and Mary have Acuras. We love them (where We=John and Mary) Gender agreement –John has an Acura. He/it/she is attractive. Syntactic constraints –John bought himself a new Acura (himself=John) –John bought him a new Acura (him = not John)
22
22 Pronoun Interpretation Preferences Selectional Restrictions –John parked his Acura in the garage. He had driven it around for hours. Recency –John has an Integra. Bill has a Legend. Mary likes to drive it.
23
23 Pronoun Interpretation Preferences Grammatical Role: Subject preference –John went to the Acura dealership with Bill. He bought an Integra. –Bill went to the Acura dealership with John. He bought an Integra –(?) John and Bill went to the Acura dealership. He bought an Integra
24
24 Repeated Mention preference John needed a car to get to his new job. He decided that he wanted something sporty. Bill went to the Acura dealership with him. He bought an Integra.
25
25 Parallelism Preference Mary went with Sue to the Acura dealership. Sally went with her to the Mazda dealership. Mary went with Sue to the Acura dealership. Sally told her not to buy anything.
26
26 Verb Semantics Preferences John telephoned Bill. He lost the pamphlet on Acuras. John criticized Bill. He lost the pamphlet on Acuras. Implicit causality –Implicit cause of criticizing is object. –Implicit cause of telephoning is subject.
27
27 Manual Annotation AKA coding, labeling
28
28 From Webber’s Chapter The aims of computational work in [discourse and dialog]: –Modeling particular phenomena in discourse and dialog in terms of underlying computational processes –Providing useful natural language services, whose success depends in part on handling aspects of discourse and dialog What computation contributes is a coherent framework for modeling these phenomena in terms of … search through a space of possible candidate interpretations (in language analysis) or candidate realizations (in language generation)
29
29 Desiderata Interesting and rich enough Not so rich that automation is too far ahead of the current state of the art –Too complex logical structure –Knowledge bottleneck (without viable source) –Too fine-grained or subtle Annotation instructions (aka coding manual) feasible Time required for training is reasonable Annotators can reliably perform the annotations in a reasonable amount of time
30
30 Minimal Process for NLP 1.Develop initial coding manual 2.At least two people perform sample annotations, and discuss their disagreements and experiences 3.Revise coding manual 4.Repeat 2-3 until agreement on training data is sufficient 5.Independently annotate a fresh test set 6.Evaluate agreement
31
31 Additional Steps 1.Develop initial coding manual 2.At least two people perform sample annotations, and discuss their disagreements and experiences. Analysis of patterns of agreement and disagreement using probability models (Wiebe et al. ACL-99; Bruce & Wiebe NLE-99; from work in applied statistics) 3.Revise coding manual 4.Repeat 2-3 until agreement on training data is sufficient 5.Independently annotate a fresh test set 6.Evaluate agreement 7.Train more annotators, assess average time for training and annotation 8.Evaluate other types of reliability (psychology, content analysis, applied statistics literatures)
32
32 Measures of Agreement Percentage Agreement: OK, but not sufficient If the distribution of classes is highly skewed, then the baseline algorithm of always assigning the most frequent class would have high agreement Kappa: measures agreement over and above agreement expected by chance Details available in section 3 of this paper by our groupthis paper
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.