August 2001. Discourse Structure and Anaphoric Accessibility. Massimo Poesio and Barbara Di Eugenio, with help from Gerard Keohane.


August 2001 Information Structure and Discourse Structure

Content
- Empirical Investigations of Discourse Structure
- Grosz and Sidner’s theory of the Global Focus
- Relational Discourse Analysis
- How we used RDA to study G&S
- Results
- Discussion

Empirical Investigations of Discourse Structure: A new opportunity
- Original proposals concerning the effect of discourse structure on accessibility (Reichman, 1985; Fox, 1987; Grosz and Sidner, 1986) were based on unsystematic analysis of data
- These days we know more about reliable studies of discourse phenomena (Passonneau and Litman, 1993; Carletta et al, 1997)
- These new resources have already been used to propose new theories of anaphora and discourse structure, such as Veins Theory (Cristea, Ide, Marcu, et al, 1998, 1999, 2000)
- The goal of this project: use a reliably annotated corpus (the Sherlock corpus from the University of Pittsburgh; Moser and Moore, 1996; Di Eugenio et al, 1997) to study the claims of G&S

Grosz and Sidner’s Theory of the Global Focus
- The structure of a discourse is determined by the intentions utterances are meant to convey (DISCOURSE SEGMENT PURPOSES)
- INTENTIONAL STRUCTURE: DOMINANCE and SAT-PRECEDES relations between DSPs
- ATTENTIONAL STRUCTURE: a stack of FOCUS SPACES
  - Focus spaces on the stack contain accessible discourse entities
  - Presence on the stack reflects intentional structure
- The problem: how to identify DSPs in a discourse
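The attentional structure described above can be sketched as a small stack of focus spaces. This is a minimal illustration, not the authors' implementation: the class names `FocusSpace` and `FocusStack`, their methods, and the example entities are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class FocusSpace:
    """Entities made salient by one discourse segment (identified by its DSP)."""
    dsp: str
    entities: set = field(default_factory=set)


class FocusStack:
    """Attentional state modelled as a stack of focus spaces (hypothetical API)."""

    def __init__(self):
        self.spaces = []

    def push(self, dsp):
        # Opening a segment pushes a fresh focus space.
        self.spaces.append(FocusSpace(dsp))

    def pop(self):
        # Closing a segment removes its focus space and everything in it.
        return self.spaces.pop()

    def add_entity(self, entity):
        # New discourse entities go into the focus space on top of the stack.
        self.spaces[-1].entities.add(entity)

    def is_accessible(self, entity):
        # An entity is accessible if some focus space still on the stack holds it.
        return any(entity in space.entities for space in self.spaces)


stack = FocusStack()
stack.push("DSP1")
stack.add_entity("the test package")
stack.push("DSP2")
stack.add_entity("test station drawers")
stack.pop()  # the segment for DSP2 closes
print(stack.is_accessible("the test package"))      # True
print(stack.is_accessible("test station drawers"))  # False
```

Once the focus space for DSP2 is popped, its entities are no longer reachable, which is the behaviour the evaluation later in the deck puts to the test.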

Relational Discourse Analysis (RDA)
- Moore and Pollack, 1992; Moser and Moore, 1996
- Combines ideas from RST and Grosz and Sidner’s theory
- From Grosz and Sidner: discourse structure is determined by intentional structure
  - RDA-SEGMENT: a segment expressing an intentional relation
- From RST: segments have internal structure
  - CORE (cf. NUCLEUS)
  - CONTRIBUTOR (cf. SATELLITE)
- Both INTENTIONAL and INFORMATIONAL relations
- A fixed number of intentional relations
- Has been shown to be usable for reliable analysis
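One way to picture an RDA segment is as a core plus a list of contributors, each attached by an intentional relation and optionally an informational one. The sketch below is an assumption-laden illustration: the class names, field names, and the way the example clauses are attached are hypothetical, and only the four relation names come from the Sherlock annotation described in this deck.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

# Intentional relations named for the Sherlock annotation.
INTENTIONAL = {"CONCEDE", "CONVINCE", "ENABLE", "JOINT"}


@dataclass
class Contributor:
    """A constituent supporting the core; may itself be a nested segment."""
    content: Union[str, "Segment"]
    intentional: str                      # e.g. "CONVINCE"
    informational: Optional[str] = None   # e.g. "cause:effect"

    def __post_init__(self):
        if self.intentional not in INTENTIONAL:
            raise ValueError(f"unknown intentional relation: {self.intentional}")


@dataclass
class Segment:
    """An RDA segment: one core plus any number of contributors."""
    core: str
    contributors: List[Contributor] = field(default_factory=list)


def count_segments(segment):
    """Count this segment plus all segments nested inside its contributors."""
    nested = sum(
        count_segments(c.content)
        for c in segment.contributors
        if isinstance(c.content, Segment)
    )
    return 1 + nested


# Illustrative attachment only; not the annotation shown on the slides.
reason = Segment(
    core="it is prone to damage",
    contributors=[Contributor("the test package is moved frequently",
                              "CONVINCE", "cause:effect")],
)
advice = Segment(
    core="eliminate both the UUT and the TP",
    contributors=[Contributor(reason, "CONVINCE")],
)
print(count_segments(advice))  # 2
```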

RDA Analysis of an excerpt from a tutorial
1.1 Before troubleshooting inside the test station,
1.2 It’s always best to eliminate both the UUT and the TP
2.1 Since the test package is moved frequently
2.2 It is prone to damage
3.1 Also, testing the test package is much easier and faster
3.2 than opening up test station drawers
[Diagram labels: CONVINCE, ENABLE; step1:step2, cause:effect, prescribed-act:wrong-act]

Moser and Moore: mapping between RST relations and G&S
- Basic principles:
  - Every DSP must be associated with a core
  - Constituents of the RDA structure that do not include cores (such as clusters) do not introduce DSPs
- Consequences for attentional state:
  - A new focus space is only pushed when a segment is opened
  - Informational relations do not affect the attentional state
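The mapping principles above can be read as a traversal rule: walk the RDA structure and emit a push and a pop only for constituents that have a core. The following is a rough sketch under that reading; the `Constituent` class, the `has_core` flag, and the labels are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Constituent:
    label: str
    has_core: bool  # segments have a core; clusters do not
    children: List["Constituent"] = field(default_factory=list)


def focus_events(node, events=None):
    """Walk the structure depth-first, pushing only for constituents with a core."""
    if events is None:
        events = []
    if node.has_core:
        events.append(("push", node.label))
    for child in node.children:
        focus_events(child, events)
    if node.has_core:
        events.append(("pop", node.label))
    return events


# A segment containing a cluster, which in turn contains another segment.
tree = Constituent("seg1", True, [
    Constituent("cluster1", False, [Constituent("seg2", True)]),
])
print(focus_events(tree))
# [('push', 'seg1'), ('push', 'seg2'), ('pop', 'seg2'), ('pop', 'seg1')]
```

The cluster contributes no event of its own, so it leaves the attentional state unchanged.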

Mapping RDA into Attentional State
1.1 Before troubleshooting inside the test station,
1.2 It’s always best to eliminate both the UUT and the TP
2.1 Since the test package is moved frequently
2.2 It is prone to damage
3.1 Also, testing the test package is much easier and faster
3.2 than opening up test station drawers.
[Diagram labels: CONVINCE, ENABLE; step1:step2, cause:effect, prescribed-act:wrong-act; DSP1, DSP2]

Using an RDA-annotated corpus to study anaphoric accessibility
- The data: the SHERLOCK corpus, already annotated according to RDA instructions (Moser, 1996)
- Added anaphoric annotation according to GNOME instructions (Poesio, 2000), derived from the MATE scheme (Poesio, Bruneseaux and Romary, 1999)
- Use RDA analysis to drive focus space construction
- Measures: accessibility, perplexity

The Data: the SHERLOCK corpus
- 17 tutorial dialogues collected within the Sherlock project (Lesgold et al, 1992)
- Students solve electronic troubleshooting problems
- 313 turns, 1333 clauses
- RDA annotation: Moser and Moore, 1996
  - Reliability verified at different levels
- Intentional relations: CONCEDE, CONVINCE, ENABLE, JOINT

An example of Sherlock dialogue
STUDENT:
1.1 Why isn't measurement signal path green during good test readings (steps)?
TUTOR:
2.1 For each step that passed,
2.2 you know the measurement path is good.
2.3 You also know that one of the measurement paths is bad.
2.4 Showing the UUT, Test Package, and measurement section as unknown is correct
2.5 because, you know when you get your fail that something was wrong,
2.6 but you didn't know exactly what.
2.7 The DMM is green
2.8 because it has been working all along.
2.9 The stimulus section is green
2.10 because it was not used
2.11 and is assumed to be good.

Anaphoric Annotation
- The GNOME scheme (Poesio, 2000)
- Mark up all NPs as NE elements, with a variety of attributes
  - About 3000 NEs
- Use a separate ANTE element to mark up anaphoric relations (including bridges)
  - In this annotation: only direct anaphoric relations (about 1500 total)
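To make the NE/ANTE split concrete, here is a small sketch that reads anaphoric links from stand-off markup. Only the element names NE and ANTE come from the slide; the attribute names (`id`, `current`, `antecedent`), the `anchor` child element, and the sample sentence are assumptions and may not match the actual GNOME scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: NPs are <ne> elements, links are separate <ante> elements.
SAMPLE = """
<unit>
  <ne id="ne1">the test package</ne> is moved frequently, so
  <ne id="ne2">it</ne> is prone to damage.
  <ante current="ne2"><anchor antecedent="ne1"/></ante>
</unit>
"""


def anaphoric_links(xml_text):
    """Return (anaphor text, antecedent text) pairs found in the markup."""
    root = ET.fromstring(xml_text)
    # Map each NE id to the text it spans.
    text_of = {ne.get("id"): "".join(ne.itertext()).strip()
               for ne in root.iter("ne")}
    links = []
    for ante in root.iter("ante"):
        for anchor in ante.iter("anchor"):
            links.append((text_of[ante.get("current")],
                          text_of[anchor.get("antecedent")]))
    return links


print(anaphoric_links(SAMPLE))  # [('it', 'the test package')]
```

Keeping the links in separate elements means the NE markup never has to change when relations are added or revised.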

Evaluation
- A Perl script simulates focus space construction and computes accessibility and perplexity
  - Accessibility: whether the antecedent is in the focus stack
  - Perplexity: $\sum_i \frac{1}{d(x_i)}\, m(x_i)$, where $m(x_i) = 1$ if $x_i$ matches the anaphor, 0 otherwise
- Parameters for focus space construction:
  - PUSHING:
    - Whenever a relation is encountered (either informational or intentional)
    - Only when the relation is intentional
  - POPPING:
    - As soon as the associated constituent is completed
    - Immediate popping of contributors, delayed popping of cores
    - Delayed popping of contributors
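The two measures can be sketched in a few lines of Python (the original script was written in Perl and is not reproduced here). Two things are assumed: the slide does not define d, so it is taken to be the 1-based depth of an entity's focus space counted from the top of the stack, and the matching test m is supplied by the caller. The function names and example entities are likewise hypothetical.

```python
def accessible(antecedent, stack):
    """True if the antecedent is in some focus space on the stack."""
    return any(antecedent in space for space in stack)


def perplexity(anaphor, stack, matches):
    """Sum of m(x) / d(x) over all entities x on the stack.

    d(x) is ASSUMED to be the 1-based depth of x's focus space, counted
    from the top of the stack; the slides do not define d.
    """
    total = 0.0
    for depth, space in enumerate(reversed(stack), start=1):
        for entity in space:
            if matches(anaphor, entity):  # m(x) = 1
                total += 1.0 / depth
    return total


# The stack is a list of focus spaces (sets of entities); the last item is the top.
stack = [{"the test package", "the UUT"}, {"the DMM"}]


def matches(anaphor, entity):
    # Hypothetical matcher: the pronoun "it" is compatible with every entity here.
    return anaphor == "it"


print(accessible("the UUT", stack))      # True
print(perplexity("it", stack, matches))  # 1/1 + 1/2 + 1/2 = 2.0
```

Under these assumptions, a higher value means more competing candidates close to the top of the stack.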

Evaluation I: Intentional vs Informational
- Accessibility: [table with columns OK, NO, Out of, APPN and rows All, Intentional (immediate popping); cell values missing]
- Perplexity: All = 0.83, Intentional = 1.23

Complications
24.13a Since S52 puts a return (0 VDC) on it’s outputs
24.13b when they are active,
the inactive state must be some other voltage
So even though you may not know what the “other” voltage is,
You can test to ensure that
24.17a the active pins are 0 VDC
24.17b and all the inactive pins are not 0 VDC.
[Diagram labels: ENABLE, CONCEDE, ENABLE; effect:cause, contrast1:contrast2; DSP]


Evaluation II: Delayed Popping

Accessibility
                             OK    NO
Immediate popping           280    20
Delay pop of cores          287    16
Delay pop of contributors   310     8

Perplexity
- Average perplexity with immediate popping: 1.23
- Delayed popping of cores: 1.3
- Delayed popping of contributors: 1.33

Discussion
- Accessibility:
  - The intentional vs. informational distinction makes sense (cf. Fox)
  - Want to keep contributors as well as cores on the stack (cf. Veins Theory)
- An evaluation of Grosz and Sidner’s framework:
  - The most direct implementation makes quite a few discourse entities inaccessible
  - Difficult to interpret more complex operations in terms of intentional structure
- Alternative: a cache model (cf. Guindon 1985; Walker 1996, 1998)
  - Version 1 (conservative): cache of focus spaces
  - Version 2: cache of forward-looking centers

Cache-based global focus: a conservative proposal
- Cache elements are FOCUS SPACES
- Cache elements are RANKED: current focus space < other constituents of same segment < dominating segments < focus spaces of contributors to closed spaces (cf. Reichman 85)
- Search algorithm: follow ranking
- Cache replacement algorithm:
  - Opening RDA segment: open new focus space, replace lowest-ranked element of cache, assign it highest rank
  - Closing RDA segment: assign lowest rank to embedded contributors
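The replacement and search rules above can be sketched as a bounded, ordered list of focus spaces. This is a simplified illustration rather than the proposal itself: the four-level ranking is collapsed into list order (index 0 is searched first), and the class name, method names, capacity, and example entities are all assumptions.

```python
class FocusCache:
    """Bounded, ranked cache of focus spaces; index 0 is searched first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.spaces = []  # each item: (segment_id, set_of_entities)

    def open_segment(self, segment_id):
        # Replace the lowest-ranked element if the cache is full,
        # then give the new focus space the highest rank.
        if len(self.spaces) >= self.capacity:
            self.spaces.pop()
        self.spaces.insert(0, (segment_id, set()))

    def add_entity(self, entity):
        # Entities are added to the highest-ranked (current) focus space.
        self.spaces[0][1].add(entity)

    def close_contributor(self, segment_id):
        # Demote the focus space of a closed contributor to the lowest rank
        # instead of discarding it, as a stack pop would.
        for i, (sid, _) in enumerate(self.spaces):
            if sid == segment_id:
                self.spaces.append(self.spaces.pop(i))
                return

    def find(self, entity):
        # Search follows the ranking and returns the first matching segment.
        for sid, entities in self.spaces:
            if entity in entities:
                return sid
        return None


cache = FocusCache(capacity=2)
cache.open_segment("seg1")
cache.add_entity("the test package")
cache.open_segment("seg2")
cache.add_entity("S52")
cache.close_contributor("seg2")  # closed, but still cached at lowest rank
print(cache.find("S52"))         # seg2
cache.open_segment("seg3")       # cache is full, so seg2 is replaced
print(cache.find("S52"))         # None
```

Unlike the stack, a closed contributor stays reachable until a newly opened segment displaces it.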