NYU/CRL system for DUC and Prospect for Single Document Summaries

Presentation transcript:

NYU/CRL system for DUC and Prospect for Single Document Summaries
Satoshi Sekine (New York University) and Chikashi Nobata (CRL – Japan)
September 14, 2001, DUC2001 Workshop

My name is Chikashi Nobata, from the Communications Research Laboratory, Japan. Mr. Sekine was planning to give this talk, but he could not come to the workshop because of the situation in New York and his flight, so I will give the talk instead. In the first half of the talk, I will explain our system for DUC; I developed the system with him, so I can answer your questions about that part. The latter part, about single document summaries and information extraction, presents Mr. Sekine's ideas and thoughts, so I may not be able to answer your questions there. Please send him e-mail regarding that part.

Objective

- Use IE technologies for summarization
  - Named entity tagging
  - Automatic pattern discovery: find important phrases (patterns) of the domain
- Combine with summarization technologies
  - Important sentence extraction: sentence position, length, TF/IDF, headline

The objective of our participation in DUC is this: we aim to use information extraction technologies for summarization, in particular for single document summaries this time. We used a named entity tagger, which tags names such as organizations, locations, and persons. We also used an automatic pattern discovery procedure, which is currently a hot topic in IE research. Automatic pattern discovery is a method for finding phrase patterns that are useful for IE; for example, in the executive succession domain, it tries to find phrases that express the event, such as "Organization hires Person". We combined these methods with standard summarization technologies, namely important sentence extraction using information such as sentence position, sentence length, TF/IDF, and headline information. We did not use co-reference, which is also a well-used technique, because we do not have a good co-reference system.

Important Sentence Extraction

Combining 5 scores:
1. Sentence position
2. Sentence length
3. TF/IDF
4. Similarity to headline
5. Pattern

Functions and weights are optimized on the training data.

So, we used five components in important sentence extraction; namely, we combined five independent scores by linear interpolation, as sketched below. The weights, and which variant of each function to use, were decided by optimizing the performance of the system on the training data. I am going to explain each function in the next slides and show you the contribution of each component.
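To make the combination concrete, here is a minimal sketch of the interpolation. Everything about it is illustrative: the function names and the (i, sent, doc) interface are my assumptions, not the actual system's code.

    # Minimal sketch of combining five component scores by linear
    # interpolation. Each scorer sees the sentence index, the sentence,
    # and the whole document (position needs the index; TF/IDF and
    # headline scores need document statistics).

    def sentence_score(i, sent, doc, scorers, weights):
        """Linear interpolation of independent component scores."""
        return sum(w * f(i, sent, doc) for f, w in zip(scorers, weights))

    def extract_summary(doc, scorers, weights, n):
        """Rank sentences by interpolated score, keep the top n, and
        restore the original document order."""
        ranked = sorted(range(len(doc)),
                        key=lambda i: sentence_score(i, doc[i], doc,
                                                     scorers, weights),
                        reverse=True)
        return [doc[i] for i in sorted(ranked[:n])]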

Alternative scores for sentence position

1. Step function: score = 1 if the sentence is within the first T sentences, 0 otherwise
2. Reciprocal: score = 1/i
3. Both ends: score = max(1/i, 1/(n-i+1))

[Figure: the three functions plotted against sentence position, from 1 to n, with the step function (cutoff T) shown in red.]

We prepared three functions for sentence position; the one we used in the system is shown in red on the slide. It is a step function: it returns 1 if the position of the sentence is within T of the beginning, and 0 otherwise, where T is decided by the number of words allowed for the summary. So if we used only this function, the system would be exactly a lead-based system. We have two more functions. One is the reciprocal of the position of the sentence, by which the first sentence receives the highest score and the score gradually decreases to its minimum at the final sentence. The other is the maximum of the reciprocals of the position from the top and from the tail. This looks strange in general, but it was the optimal function for summarizing editorial articles in the Japanese summarization evaluation project, TSC; it means that in editorials, the concluding sentences are also important. (Incidentally, we got the top score in the 10% summary subjective judgement in that evaluation.)
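Here is a hedged sketch of the three position functions under the assumptions stated on the slide (1-based positions i, n sentences in total, cutoff T derived from the target summary length); the function names are mine.

    def position_step(i, T):
        # Used in the submitted system: 1 for the first T sentences, else 0.
        # Used alone, this reduces to a lead-based summarizer.
        return 1.0 if i <= T else 0.0

    def position_reciprocal(i):
        # Highest for the first sentence, decaying toward the end.
        return 1.0 / i

    def position_both_ends(i, n):
        # High at both the beginning and the end of the document; this was
        # the optimal function for editorials in the Japanese TSC evaluation.
        return max(1.0 / i, 1.0 / (n - i + 1))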

Alternative scores for sentence length & TF/IDF

Sentence length:
1. Score = Length
2. Score = Length (if Length > C); Length − C (otherwise)

TF/IDF, with three variants of the term frequency:
TF = tf(w), (tf(w) − 1)/tf(w), or tf(w)/(tf(w) + 1)

For the sentence length we have two scores: one is simply the length, and the other is the length with a penalty for short sentences. We have three scores for TF/IDF, in which the term frequencies are calculated differently: one uses the raw term frequency, and the other two normalize it in the two ways shown.
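A hedged sketch of these scores follows. The cutoff value C, the function names, and the idea that a sentence's TF/IDF score sums a TF variant times IDF over its words are illustrative assumptions, not the system's exact definitions.

    def length_plain(length):
        return float(length)

    def length_with_penalty(length, C=10):
        # Sentences of length C or less are penalized by subtracting C.
        # C is an arbitrary example value here.
        return float(length) if length > C else float(length - C)

    def tf_raw(tf):
        return float(tf)                 # TF = tf(w)

    def tf_norm_a(tf):
        return (tf - 1.0) / tf           # TF = (tf(w) - 1) / tf(w)

    def tf_norm_b(tf):
        return tf / (tf + 1.0)           # TF = tf(w) / (tf(w) + 1)

    def tfidf_score(words, tf, idf, variant=tf_norm_b):
        # tf and idf map each word to its term frequency / IDF value.
        return sum(variant(tf[w]) * idf[w] for w in words)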

Alternative scores for headline similarity

- TF/IDF ratio between the words that overlap the headline and all words in the sentence
- TF ratio between the overlapping named entities (NEs) and all NEs in the sentence, with TF = tf(e)/(1 + tf(e))

We use a measure of the similarity of the sentence to the headline. The basic idea is that the more words in the sentence overlap with the words in the headline, the more important the sentence is. The words are weighted by their TF/IDF. There are two ways of selecting words: one uses all words except stop words, and the other uses only named entities. From the training data, we found that using NEs only was better than using all words. Here the term frequency was normalized in the way shown on the slide.
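Here is a sketch of the NE-only variant under one plausible reading of the ratio: the normalized-TF/IDF mass of entities shared with the headline, divided by the mass of all entities in the sentence. The interface and the precomputed tf/idf dictionaries are assumptions.

    def headline_score(sentence_entities, headline_entities, tf, idf):
        def weight(e):
            return (tf[e] / (1.0 + tf[e])) * idf[e]   # TF = tf(e) / (1 + tf(e))
        total = sum(weight(e) for e in sentence_entities)
        if total == 0.0:
            return 0.0
        overlap = sum(weight(e) for e in sentence_entities
                      if e in headline_entities)
        return overlap / total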

Pattern

Assumption: patterns (phrases) that appear often in the domain are important.

Strategy: we intended to use IR to find a larger set of documents in the domain, but we used the given document set instead. NEs were treated as classes rather than as literal words.

Now I will explain the main IE technology we introduced in our system. The assumption of IE pattern discovery is that patterns that appear often in the domain are important. For example, in the earthquake report domain, we may find many patterns like "There was an earthquake in LOCATION at TIME on DATE"; such a pattern is then regarded as important in the domain. We intended to use IR techniques to find a larger set of similar documents, but because of time limitations, we used only the given document set provided by DUC. Although the number of documents is small, around 10, their quality is good, as the sets were created by humans. In the pattern instances, each type of named entity was treated as a class rather than as the literal words.

Pattern discovery

Procedure:
1. Analyze the sentences (NE tagging, dependency analysis)
2. Extract all sub-trees from the dependency trees in the domain
3. Score the trees based on the frequency of the tree and the TF/IDF of its words
4. Regard high-scoring trees as important patterns

The basic procedure of pattern discovery is this. First, analyze all the sentences in the domain documents, namely named entity tagging and dependency analysis. Then extract all sub-trees from the dependency trees, and score each tree based on its frequency and the TF/IDF of the words in it. High-scoring trees are regarded as important patterns. A simplified sketch of the scoring step is given below.
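As a rough illustration of the scoring step only (real sub-tree extraction requires a dependency parser), here is a sketch under the assumption that each pattern arrives already extracted as a tuple of tokens with NE classes substituted, e.g. ('<ORGANIZATION>', 'hire', '<PERSON>'). Multiplying tree frequency by the IDF mass of its content words is one plausible reading of the scoring rule, not necessarily the one the system used.

    from collections import Counter

    def score_patterns(subtrees_per_doc, idf):
        freq = Counter()
        for subtrees in subtrees_per_doc:      # one list of patterns per doc
            freq.update(subtrees)
        scores = {}
        for pattern, count in freq.items():
            # NE class tokens like '<PERSON>' carry no IDF weight here.
            word_mass = sum(idf.get(tok, 0.0)
                            for tok in pattern if not tok.startswith('<'))
            scores[pattern] = count * word_mass
        return scores      # high-scoring trees become the domain's patterns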

Optimal weights

The optimal weights were found on the training set. Contribution = weight × standard deviation of the score:

  Score      Contribution
  Position   277
  Length     8
  TF/IDF     96
  Headline   18
  Pattern    2

Now I will explain the contribution of each component. As I mentioned before, the optimal weights were found on the training set. The table shows the contribution of each component, which is the product of the optimized weight and the standard deviation of the score. The biggest contribution comes from sentence position, and the second biggest from TF/IDF. The others are relatively small, and unfortunately the contribution of the IE pattern component is very small. However, in the evaluation our average subjective score was the best, with a very small margin over the second-place system, so even this small contribution from the IE patterns could have made the difference, although we have not investigated the system without the pattern component.

Evaluation Result

Subjective evaluation (V; out of 12), averaged over all documents (rank in parentheses):

                   System      Lead    Average
  Grammaticality   3.711 (5)   3.236   3.580
  Cohesion         3.054 (1)   2.926   2.676
  Organization     3.215 (1)   3.081   2.870
  Total            9.980 (1)   9.243   9.126

This is the evaluation result, showing the average of the subjective evaluation over all documents. Our system scored 5th in grammaticality and first in all the other measurements, including the total.

Prospect for Single Document Summaries

Important sentence extraction CAN be summarization, but is NOT summarization.

Now that I have finished talking about our system for DUC, let me talk about the prospects for single document summaries. In our system, we used an important sentence extraction technique, and I believe some other systems did, too. However, I believe many of us agree that "important sentence extraction can be summarization, but summarization is not important sentence extraction". We would like to, and we have to, go beyond important sentence extraction.

DUC

We are aiming for document understanding. How can understanding be instantiated?
- Make a summary
- Extract essential points, principal relations
- Answer questions
- Take a comprehension test

Let's think about document understanding, as we are DUC people. The problem with document understanding is that it is difficult to assess whether a human or a system "understands" a document. For example, we can ask a high school student to make a summary in order to prove that he understands the document, but do we think the student understands it if he extracts the top n sentences as the summary? Alternatively, we can ask him to extract the essential points or the principal relations. Or we can ask him questions about the contents and judge whether he can answer them. Or, finally, we can give him a comprehension test. The last two methods are attractive, but to use them we have to prepare questions or a test, which may also introduce a biased point of view. I would like to pursue the second method to instantiate document understanding.

Example

Earthquake jolts Los Angeles area

LOS ANGELES (AP) — An earthquake shook the greater Los Angeles area Sunday, but there were no immediate reports of damage or injuries. The quake had a preliminary magnitude of 4.2 and was centered about one mile southeast of West Hollywood, said Lucy Jones of the U.S. Geological Survey. The quake was felt in downtown Los Angeles where it rolled for about four seconds and also shook in the suburban areas of Van Nuys, Whittier and Glendale.

I will show you an example. This is part of a Web article from CNN last Sunday. It is already a summary, as it is only the first three sentences of a document of more than ten sentences. Following the earlier discussion, do you think the student understands the whole document if he returns these three sentences? Alternatively, how about if he gives back…

Essential points

Event: Earthquake
When: Sunday, September 9, 2001
Where: greater Los Angeles area
Magnitude: 4.2
Injury: No
Death: No
Damage: No

…THIS. I call it the "essential points", but it is what IE calls a template, and it may be called other things as well. It says that the document mentions an event, an earthquake, which happened on Sunday, September 9, in the greater Los Angeles area, with magnitude 4.2, and that no injuries, deaths, or damage were reported. I believe this is good enough to show that the student, or the system, understands the document.
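For illustration, such a filled template is just a small structured record; a minimal sketch follows, with field names that are mine, not a standard IE schema.

    # One possible representation of the filled "essential points" template.
    earthquake_event = {
        "event":     "Earthquake",
        "when":      "Sunday, September 9, 2001",
        "where":     "greater Los Angeles area",
        "magnitude": "4.2",
        "injury":    "No",
        "death":     "No",
        "damage":    "No",
    }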

How can we make it?

- IE is a hint (a step)
- IE is a version of document understanding limited to a specific domain and task which are given in advance
- Document understanding can be achieved by upgrading IE technologies so as to delete "specific" and "given in advance"

So how can we make it? This is research conducted at NYU with some collaboration from CRL. To achieve it, IE can be a hint, or a step: we can say that IE is a version of document understanding limited to a specific domain and task which are given in advance. In other words, document understanding can be achieved by upgrading IE technologies so as to delete "specific" and "given in advance". So the question now is how to delete these restrictions; it can be done by dynamically extracting domain knowledge for IE.

Our approach

Essential points can be found by searching for frequently mentioned patterns in the same domain.

Strategy:
1. Given a document, find its domain by IR
2. Find frequently mentioned patterns
3. Extract the information matching those patterns

Our approach is partially the same as the pattern discovery I explained for our system: essential points can be found by searching for frequently mentioned patterns in the same domain. The strategy is this. Given a document, find documents in its domain by IR, then find frequently mentioned patterns in that document set; up to here, it is the same as the pattern discovery we used in our DUC system. Then, to extract the information, match the extracted patterns against the given document. In short, the method tries to find the essential points with the help of similar articles. A toy sketch of the pipeline follows.
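To make the three-step strategy concrete, here is a deliberately crude, self-contained toy: IR is approximated by token overlap, "patterns" by frequent word bigrams, and matching by bigram lookup. None of these stand-ins reflect the actual components, which use real IR and dependency sub-trees.

    from collections import Counter

    def tokens(doc):
        return doc.lower().split()

    def retrieve_similar(document, corpus, k=10):
        # Step 1: find documents in the same domain (toy IR: token overlap).
        query = set(tokens(document))
        return sorted(corpus, key=lambda d: len(query & set(tokens(d))),
                      reverse=True)[:k]

    def frequent_patterns(docs, top_n=20):
        # Step 2: find frequently mentioned patterns across the domain set.
        counts = Counter()
        for d in docs:
            t = tokens(d)
            counts.update(zip(t, t[1:]))
        return [p for p, _ in counts.most_common(top_n)]

    def essential_points(document, corpus):
        # Step 3: keep the patterns that actually occur in the document.
        patterns = frequent_patterns(retrieve_similar(document, corpus))
        t = tokens(document)
        bigrams = set(zip(t, t[1:]))
        return [p for p in patterns if p in bigrams]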

Single Document Summarization

Has to be continued:
- To pursue research on "understanding"
- To find something more than sentence extraction
- To observe humans in the summary task
- To have newcomers (like us)

In conclusion, I hope single document summarization will be continued in some shape. First, to pursue research on understanding, in particular to find something more than important sentence extraction. Second, to observe humans in the summary task; this has been done by SUMMAC, DUC, and TSC in Japan, but these are basically text based, and we may want to think about different styles of summary as well. Finally, it is needed for the field to grow: if there were no single document summarization and only multi-document summarization evaluation, it would be hard for newcomers to join. I think we could not have joined if there had been no single document summarization this time.

Some people may worry about the evaluation if we submit the essential points as a summary; in other words, the question is how to evaluate different styles of summary in the same manner. I am not sure whether a single measurement is necessary, but if it is, the two-dimensional measurement used by SUMMAC could be a hint: they draw a graph in which one dimension is some kind of goodness score and the other is the time needed to judge the summary. I believe our style of summary does not need much time to be read, but it may get a bad goodness score at the moment. Another idea is to ask humans to extract essential points, and then measure recall and precision for both styles of summary; in this case, the time needed to judge may also be an important factor in the evaluation.

Anyway, I hope the single document summarization project will be continued in some shape. Again, I would like to convey Mr. Sekine's apologies, and thank you for listening. I can answer questions about the first half of the talk.