
From Extracting to Abstracting: Generating Quasi-abstractive Summaries Zhuli Xie, Application & Software Research Center, Motorola Labs; Barbara Di Eugenio, Peter C. Nelson, Department of Computer Science, University of Illinois at Chicago

2 Outline  Introduction  Quasi-abstractive summaries  Model & Approach  Experimental Results  Conclusion & Discussion

3 Introduction  Types of text summaries –Extractive: composed of whole sentences or clauses from the source text; the paradigm adopted by most automatic text summarization systems –Abstractive: obtained using various techniques such as paraphrasing; equivalent to human-written abstracts, and still well beyond the state of the art

4 Quasi-abstractive Summaries  Composed not of whole sentences from source text but of fragments that form new sentences [Jing 02]  We will show they are more similar to human-written abstracts, as measured with cosine similarity & ROUGE-1,2 metrics

5 Quasi-abstractive: Rationale

Two sentences from a human-written abstract:
A1: We introduce the bilingual dual-coding theory as a model for bilingual mental representation.
A2: Based on this model, lexical selection neural networks are implemented for a connectionist transfer project in machine translation.

Extractive summary (by ADAMS):
E1: We have explored an information theoretical neural network that can acquire the verbal associations in the dual-coding theory.
E2: The bilingual dual-coding theory partially answers the above questions.

Candidate sentence set for A1:
S1: The bilingual dual-coding theory partially answers the above questions.
S2: There is a well-known debate in psycholinguistics concerning the bilingual mental representation...

Candidate sentence set for A2:
S3: We have explored an information theoretical neural network that can acquire the verbal associations in the dual-coding theory.
S4: It provides a learnable lexical selection sub-system for a connectionist transfer project in machine translation.

6 Model & Approach  Learn a model that can identify Candidate Sentence Sets (CSS's) a. Label: generate patterns of correspondence b. Train classifier: to identify the CSS's  Generate a summary for a new document a. Generate CSS's b. Realize summary

7 CSS’s Discovery Diagram

8 Learn the CSS Model (1)  Label: –decomposition of abstract sentences based on string overlaps –in our test data (the CMP-LG corpus), 70.8% of abstract sentences are composed of fragments of length >= 2 that can be found in the text to be summarized
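The labeling step above can be sketched as a greedy longest-match search for abstract-sentence fragments in the body text. This is an illustrative reconstruction under assumed details, not the paper's exact algorithm; the function name and matching strategy are hypothetical.

```python
def find_fragments(abstract_sent, body_sents, min_len=2):
    """Decompose an abstract sentence into word fragments (length >= min_len)
    that occur verbatim in some body sentence, greedily preferring the
    longest match at each position."""
    words = abstract_sent.lower().split()
    body = [" ".join(s.lower().split()) for s in body_sents]
    matches = []  # (fragment, index of matching body sentence)
    i = 0
    while i < len(words):
        best_len, best_src = 0, None
        for src, btext in enumerate(body):
            # try the longest span starting at position i first
            for j in range(len(words), i + min_len - 1, -1):
                frag = " ".join(words[i:j])
                if frag in btext and j - i > best_len:
                    best_len, best_src = j - i, src
                    break
        if best_len >= min_len:
            matches.append((" ".join(words[i:i + best_len]), best_src))
            i += best_len  # skip past the matched fragment
        else:
            i += 1  # no fragment starts here; advance one word
    return matches
```

On the slide's example, this recovers fragments of abstract sentence A1 from candidate sentences like S1 and S2.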

9 Learn the CSS Model (2)  Train classifier: given documents where all CSS's have been labelled, transform each document into a set of sentence pairs. Each instance is represented by a feature vector; the target feature is whether the pair belongs to the same CSS –Used Decision Trees; also tried Support Vector Machines [Joachims, 2002] and Naïve Bayes classifiers [Borgelt, 1999] –Sparse (class-imbalanced) data problem: [Japkowicz 2000; Chawla et al., 2003]
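A minimal sketch of the pair classifier using scikit-learn's decision tree. The features shown (word overlap, position distance, length ratio) and the toy data are illustrative assumptions, not the paper's actual feature set.

```python
from sklearn.tree import DecisionTreeClassifier

# Each instance describes a sentence pair with hypothetical features:
# [word overlap, |position difference|, length ratio]
X = [
    [0.6, 1, 0.9],   # high overlap, nearby -> same CSS
    [0.5, 2, 1.1],
    [0.1, 30, 0.4],  # little in common, far apart -> different CSS
    [0.0, 25, 2.0],
]
y = [1, 1, 0, 0]     # target: 1 = pair belongs to the same CSS

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X, y)
```

At prediction time, every sentence pair of a new document is scored and positive pairs are merged into CSS's.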

10 Summary Generation  Generate CSS's for unseen documents: –Use the classifier to identify sentence pairs belonging to the same CSS and merge them –The CSS's exhibit a natural order, since sentences and sentence pairs are labeled sequentially: i.e., the first CSS will contain at least one fragment that appears earlier in the source text than any fragment in the second CSS  Summary Realization

11 Summary Realization  Simple Quasi-abstractive (SQa) – New sentence generated by appending new word to previously generated sequence according to n-gram probabilities calculated from CSS –Each CSS is used only once
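SQa-style realization can be sketched with a bigram model and greedy next-word selection; the paper's actual n-gram order and selection strategy may differ, so treat this as a simplified illustration.

```python
from collections import defaultdict

def bigram_counts(sentences):
    """Count bigrams over the sentences of one CSS, with sentence markers."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in sentences:
        words = ["<s>"] + s.lower().split() + ["</s>"]
        for a, b in zip(words, words[1:]):
            counts[a][b] += 1
    return counts

def generate(counts, max_len=20):
    """Greedily append the most probable next word until </s> or max_len."""
    word, out = "<s>", []
    while len(out) < max_len:
        nxt = counts.get(word)
        if not nxt:
            break
        word = max(nxt, key=nxt.get)  # most frequent continuation
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)
```

Each CSS yields one generated sentence, matching the slide's "each CSS is used only once" constraint.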

12 Summary Realization  Quasi-abstractive with Salient Topics (QaST) –Salient NPs model based on social networks [Wasserman & Faust, 94; Xie 2005] –Sort predicted salient NPs according to their lengths –Traverse list of salient NPs and of CSS-based n-gram probabilities in parallel to generate sentence: use highest ranked NP which has not been used yet, and first n-gram probability model that contains this NP

13 Topic Prediction  Salient NPs –Abstract should contain salient topics of article –Topics are often expressed by NPs –We assume that NPs in an abstract represent most salient topics in article  NP Network & NP Centrality –Collocated NPs can be connected and hence network can be formed –Social network analysis techniques used to analyze network [Wasserman & Faust 94] and calculate centrality for nodes [Xie 05]
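A rough sketch of NP salience over a co-occurrence network, using plain degree centrality as a stand-in for the centrality measure of [Xie 05]; the collocation criterion (NPs sharing a sentence) is an assumption for illustration.

```python
from collections import defaultdict
from itertools import combinations

def np_centrality(sentences_nps):
    """sentences_nps: one list of NPs per sentence.
    NPs co-occurring in a sentence are linked; salience = degree."""
    neighbors = defaultdict(set)
    for nps in sentences_nps:
        for a, b in combinations(set(nps), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {np: len(ns) for np, ns in neighbors.items()}
```

NPs with the highest centrality are predicted as the article's salient topics and anchor QaST's generation.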

14 Experiments  Data: 178 documents from CMP-LG corpus, 3-fold cross validation  Four Models: –Lead: first sentence from first m paragraphs. –ADAMS: top m sentences ranked according to sentence ranking function ADAMS learned. –SQa: uses n-gram probabilities over first m discovered CSS’s to generate new sentences. –QaST: anchors choice of specific set of n-gram probabilities in salient topics. Stops after m sentences have been generated.

15 Evaluation Metrics  Cosine similarity: bag of words method  ROUGE-1,2: [Lin 2004] –A recall measure to compare machine-generated summary and its reference summaries –Still bag of words/n-gram method –But showed high correlation with human judges
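Both metrics can be sketched in a few lines, assuming bare whitespace tokenization (the official ROUGE toolkit additionally supports stemming and stopword removal, omitted here).

```python
import math
from collections import Counter

def cosine_sim(a, b):
    """Bag-of-words cosine similarity between two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall: matched reference n-grams / total reference n-grams."""
    def ngrams(text):
        w = text.lower().split()
        return Counter(tuple(w[i:i + n]) for i in range(len(w) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    matched = sum(min(cand[g], ref[g]) for g in ref)
    total = sum(ref.values())
    return matched / total if total else 0.0
```

Being recall-oriented, ROUGE-N rewards summaries that cover the reference abstract's n-grams, which is why it correlates well with human judgments of content coverage.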

16 Experimental Results

(Table: Cosine Similarity, ROUGE-1, and ROUGE-2 scores for Lead, ADAMS, SQa, and QaST)

 SQa's performance is even lower than Lead's
 ADAMS achieved +13.6%, +27.9%, and +37.8% improvements over Lead on the three metrics
 QaST achieved +29.4%, +31.5%, and +64.3% improvements over Lead, and +13.9%, +2.8%, and +19.3% over ADAMS
 All differences between QaST and the other systems are statistically significant (two-sample t-test), except QaST vs. ADAMS on ROUGE-1

17 Generated Sentence Sample

QaST: In collaborative expert-consultation dialogues, two participants ( executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan.

Original: In collaborative expert-consultation dialogues, two participants (executing agent and consultant) work together to construct a plan for achieving the executing agent's domain goal. The executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan.

18 Sample Summary

QaST: In this paper, we present a plan-based architecture for response generation in collaborative consultation dialogues, with emphasis on cases in which the user has indicated preferences. to an existing tripartite model might require inferring a chain of actions for addition to the shared plan, can appropriately respond to user queries that are motivated by ill-formed or suboptimal solutions, and handles in a unified manner the negotiation of proposed domain actions, proposed problem-solving actions, and beliefs proposed by discourse actions as well as the relationship amongst them. In collaborative expert-consultation dialogues, two participants( executing agent and the consultant bring to the plan construction task different knowledge about the domain and the desirable characteristics of the resulting domain plan. In suggesting better alternatives, our system differs from van Beek's in a number of ways.

Abstract: This paper presents a plan-based architecture for response generation in collaborative consultation dialogues, with emphasis on cases in which the system (consultant) and user (executing agent) disagree. Our work contributes to an overall system for collaborative problem-solving by providing a plan-based framework that captures the Propose-Evaluate-Modify cycle of collaboration, and by allowing the system to initiate subdialogues to negotiate proposed additions to the shared plan and to provide support for its claims. In addition, our system handles in a unified manner the negotiation of proposed domain actions, proposed problem-solving actions, and beliefs proposed by discourse actions. Furthermore, it captures cooperative responses within the collaborative framework and accounts for why questions are sometimes never answered.

19 Conclusion & Discussion  New type of machine generated summary: Quasi-abstractive summary  N-gram model anchored by salient NPs gives good results  Further investigation needed in several aspects –CSS’s Discovery with cost-sensitive classifiers [Domingos, 1999; Ting, 2002] –Grammaticality and length of generated summaries [Wan et al, 2007]