Recognizing Discourse Structure: Text Discourse & Dialogue CMSC 35900-1 October 16, 2006.

Roadmap
– Discourse structure
– TextTiling (Hearst 1994)
– RST parsing (Marcu 1999)
– Contrasts & issues

Discourse Structure Models
– Attention, Intentions, and the Structure of Discourse (Grosz & Sidner)
– Rhetorical Structure Theory (RST)

Contrasts
G&S:
– Participants: monologue, dialogue, multi-party
– Relations: 2 (dominance, satisfaction-precedence); purposes: infinite
– Speech issues: addresses some
– Granularity: variable; minimum: phrase/clause
RST:
– Participants: monologue
– Relations: myriad
– Speech issues: largely ignored
– Granularity: clausal

Similarities
Structure:
– Predominantly hierarchical, tree-structured
– Some notion of core intentions/topics, i.e., the intentions of speaker and hearer
– Discourse structure and linguistic form mutually constrain each other

Questions
– Which model would be better for generation? For recognition? For summarization?
– Why?

Recognizing Discourse Structure
Decompose text into subunits.
Questions:
– What type of structure is derived? Sequential spans, hierarchical trees, arbitrary graphs?
– What is the granularity of the subunits? Clauses? Sentences? Paragraphs?
– What information guides segmentation? Local cue phrases? Lexical cohesion?
– How is the information modeled? Learned?
– How do we measure success?

TextTiling (Hearst 1994)
Identifies the discourse structure of expository text.
Hypothesized structure:
– A small number of main topics, realized as a sequence of subtopical discussions
– Could be marked by headings and subheadings
– Non-overlapping, contiguous, multi-paragraph blocks, forming a linear sequence of segments
– Recovers just the extents, not (summaries of) the contents
– Useful for WSD, IR, and summarization, which assume small coherent units

Characterizing Topic Structure
Presumes "piecewise" structure (Skorochod'ko 1972):
– Sequences of densely interrelated discussions, linked linearly
– Allows multiple concurrently active themes
– Segment at points of sudden radical change in subject matter (e.g., earth/moon vs. binary stars)
Characterize similarity via lexical cohesion:
– Repeated terms, literal or semantically related

Core Algorithm
Tokenization:
– Pseudo-sentences of fixed length (20 words), stemmed and stopped
Block similarity computation:
– Blocks: groups of pseudo-sentences of length k (k = 6 works well for many texts)
– Gap: a between-block position, a possible segment boundary
– Compute similarity between the blocks adjacent to each gap
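The tokenization step above can be sketched as follows. This is a minimal illustration, not Hearst's implementation: the stopword list is a tiny stand-in and "stemming" is crudely approximated by stripping a final "s".

```python
import re

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "are"}

def tokenize(text, w=20):
    """Split text into pseudo-sentences of w tokens: lowercased, stopped,
    and crudely 'stemmed' by stripping a trailing 's'."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t.rstrip("s") or t for t in tokens if t not in STOPWORDS]
    # Fixed-length pseudo-sentences, ignoring real sentence boundaries.
    return [tokens[i:i + w] for i in range(0, len(tokens), w)]
```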

Block Similarity
– Term weight w = term frequency within the block
– Compute a similarity measure at each gap
– Perform average smoothing over the gap scores
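Hearst's block comparison is a normalized inner product (cosine) over term-frequency vectors. A sketch of a single gap score follows; the function name and block representation are my own, and the smoothing step is omitted.

```python
from collections import Counter
from math import sqrt

def gap_similarity(left_block, right_block):
    """Cosine similarity between the blocks on either side of a gap.
    Each block is a list of pseudo-sentences (lists of tokens); the
    term weight is the raw term frequency within the block."""
    a = Counter(t for ps in left_block for t in ps)
    b = Counter(t for ps in right_block for t in ps)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0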

Boundary Detection
Based on changes in the similarity scores:
– For each gap, move left and right until the similarity stops increasing
– Compute the difference between each peak and the current gap score; sum the two peak differences
– The largest scores become boundaries
– Require some minimum separation between boundaries (3)
– Threshold the depth score required for a boundary
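The peak-climbing depth score can be sketched as below (a minimal illustration; the minimum-separation and thresholding steps from the slide are omitted):

```python
def depth_scores(sims):
    """Depth score for each gap: climb left and right from the gap
    while similarity keeps increasing, then sum the two
    (peak - gap) differences."""
    scores = []
    for i, s in enumerate(sims):
        left = i
        while left > 0 and sims[left - 1] > sims[left]:
            left -= 1
        right = i
        while right < len(sims) - 1 and sims[right + 1] > sims[right]:
            right += 1
        scores.append((sims[left] - s) + (sims[right] - s))
    return scores
```

A deep "valley" between two similarity peaks gets a large score and is a candidate boundary.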

Evaluation
Contrast with reader judgments (alternatives: author markup, task-based evaluation):
– 7 readers, 13 articles, instructed to "mark topic changes"; a position counts as a boundary if 3 readers agree
– Run the algorithm and align its output with the nearest paragraph boundary
– Contrast with random assignment at the same boundary frequency
– Automatic: 0.66, 0.61; human: 0.81, 0.71; random: 0.44, 0.42

Discussion
Overall, the automatic method does much better than random:
– Errors are often "near misses" within one paragraph (0.83, 0.78)
Issues:
– Summary material is often not lexically similar to adjacent paragraphs
– Similarity measures: is raw term frequency the best we can do? What other cues could help?
– Other experiments with TextTiling perform less well; why?

RST Parsing (Marcu 1999)
Learn and apply classifiers for segmentation and parsing of discourse.
Discourse structure: RST trees
– Fine-grained, hierarchical structure
– Clause-based units
– Inter-clausal relations: 71 relations grouped into 17 clusters, a mix of intentional and informational relations

Corpus-Based Approach
Training and testing on 90 RST trees:
– Texts from MUC, Brown (science), and WSJ (news)
Annotations:
– Identify "edus", elementary discourse units: clause-like units carrying a key relation; parentheticals, which could be deleted with no effect
– Identify nucleus/satellite status
– Identify the relation that holds (e.g., elaboration, contrast, …)

RST Parsing Model
Shift-reduce parser; assumes the input is already segmented into edus.
– edu → edt, a minimal discourse tree unit with status and relation undefined; promotion set: the unit itself
– Shift: push the next input edt onto the stack
– Reduce: pop the top 2 edts and combine them into a new tree, updating status, relation, and promotion set; push the new tree onto the stack
– Requires 1 shift operation and 6 reduce operations: reduce-ns, reduce-sn, reduce-nn are binary; the "below" operations are non-binary
– Learn relation + operation jointly: 17 × 6 + 1 classes
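The shift-reduce loop can be sketched as below. This is a toy illustration: the `classify` stub always shifts while input remains and otherwise reduces with a fixed `elaboration-ns` label, standing in for the learned 17 × 6 + 1-way classifier, and promotion sets are omitted.

```python
class Node:
    """A discourse (sub)tree: a relation label over children, or a leaf edu."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def parse(edus, classify=None):
    """Shift-reduce over a list of edu strings. `classify` picks the next
    action from the current configuration; the default stub shifts while
    input remains, then reduces under a fixed 'elaboration-ns' label."""
    if classify is None:
        classify = lambda stack, rest: "shift" if rest else "reduce-ns"
    stack, rest = [], [Node(e) for e in edus]
    while rest or len(stack) > 1:
        action = classify(stack, rest)
        if action == "shift" and rest:
            stack.append(rest.pop(0))
        else:
            # Reduce: pop the top two trees and combine them into one.
            right, left = stack.pop(), stack.pop()
            stack.append(Node("elaboration-ns", (left, right)))
    return stack[0]
```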

Learning Segmentation
Per-word classification: sentence, edu, or parenthetical boundary.
Features:
– Local context: 5-word window; POS tags of the 2 words before and after; discourse marker present? abbreviations?
– Global context: discourse markers, commas and dashes, verbs present
– 2,417 binary features per example
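A sketch of per-position boundary features in this spirit follows; the marker list and feature names are illustrative stand-ins, and the POS-tag features from the original are omitted.

```python
# Illustrative subset of discourse markers; Marcu's lexicon is far larger.
DISCOURSE_MARKERS = {"because", "although", "however", "but", "and"}

def boundary_features(tokens, i, window=2):
    """Sparse binary features for whether a boundary follows tokens[i]:
    the words in a 5-word window plus a discourse-marker flag."""
    feats = set()
    for off in range(-window, window + 1):
        j = i + off
        if 0 <= j < len(tokens):
            feats.add(f"w[{off}]={tokens[j].lower()}")
    window_tokens = tokens[max(0, i - window):i + window + 1]
    if any(t.lower() in DISCOURSE_MARKERS for t in window_tokens):
        feats.add("has_discourse_marker")
    return feats
```

Each feature set would be fed, as binary indicators, to a classifier over the boundary labels.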

Segmentation Results
Good news: ~97% overall accuracy
– Contrast: majority baseline (no boundary): ~92%
– Contrast: "period = boundary" baseline: 93%
– Comparable to alternative approaches
Bad news:
– Problem cases: the starts of parentheticals and edus

Learning Shift-Reduce
Construct the sequence of actions from each training tree.
For a configuration (3 top trees on the stack, next edt in the input), features include:
– Number of trees on the stack and edts in the input
– Tree characteristics: size, branching, relations
– Words and POS of the first and last 2 lexemes in the spans
– Presence and position of any discourse markers
– Previous 5 parser actions
– Hearst-style semantic similarity across trees; WordNet-based similarity measures

Classifying Shift-Reduce
C4.5 classifier:
– Overall: 61% accuracy (majority baseline: 50%; random: 27%)
End-to-end evaluation:
– Good on spans and nucleus/satellite status; poor on relations

Discussion
– Noise in segmentation degrades parsing: poor segmentation leads to poor parsing
– Sufficient training data is needed: a 27-text subset was insufficient, and more varied data helped more than a smaller amount of similar data
– Constituency and nucleus/satellite status are good; relation labeling remains far below human performance