NYU/CRL system for DUC and Prospect for Single Document Summaries


1 NYU/CRL system for DUC and Prospect for Single Document Summaries
September 14, 2001, DUC2001 Workshop. Satoshi Sekine (New York University), Chikashi Nobata (CRL, Japan). My name is Chikashi Nobata, from the Communications Research Laboratory, Japan. Mr. Sekine was planning to give this talk, but as he could not come to the workshop because of the situation in New York and his flight, I will give it instead. In the first half of the talk I will explain our system for DUC; I developed the system with him, so I can answer your questions about that part. The latter part, on single document summaries and information extraction, presents Mr. Sekine's ideas and thoughts, so I may not be able to answer questions about it; please send those to him.

2 Objective
Use IE technologies for Summarization
- Named Entity
- Automatic pattern discovery: find important phrases (patterns) of the domain
Combine with Summarization technologies
- Important Sentence Extraction: sentence position, length, TF/IDF, headline
Our objective in participating in DUC is this: we aim to use information extraction technologies for summarization, in particular for single document summaries this time. We used a named entity tagger, which tags names such as organizations, locations, and persons. We also used an automatic pattern discovery procedure, currently a hot topic in IE research. Automatic pattern discovery is a method for finding phrase patterns that are useful for IE; for example, in the executive succession domain, it tries to find phrases that express the event, such as "Organization hires Person". We combined those methods with standard summarization technology: important sentence extraction using information such as sentence position, sentence length, TF/IDF, and headline information. This time we did not use coreference, another widely used technique, because we do not have a good coreference system.

3 Important Sentence Extraction
Combining 5 scores
- Sentence position
- Sentence length
- TF/IDF
- Similarity to Headline
- Pattern
Optimize functions/weights on training data
We used 5 components in important sentence extraction; that is, we combined 5 independent scores by interpolation, as sketched below. The weights and the functions to be used were decided by optimizing the performance of the system on the training data. I will explain each function in the following slides and show you the contribution of each component.
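As a rough illustration of this interpolation, here is a minimal Python sketch; the names and data structures are our own illustration, not the actual implementation:

```python
def sentence_score(sentence, weights, scorers):
    """Combine the five component scores by linear interpolation.

    `scorers` maps a component name (position, length, tfidf,
    headline, pattern) to a function returning a numeric score for
    the sentence; `weights` holds the weights tuned on the training
    data. All names here are illustrative.
    """
    return sum(weights[name] * score_fn(sentence)
               for name, score_fn in scorers.items())
```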

4 Alternative scores for Sentence position
- Score 1: 1 (if i <= T), 0 (otherwise)
- Score 2: 1/i
- Score 3: max(1/i, 1/(n-i+1))
(i: sentence position; n: number of sentences in the document. Figure: the three functions plotted against sentence position, with the cutoff T marked.)
We prepared three functions for sentence position; the one used in our system is shown in red on the slide. It is a step function: it returns 1 if the sentence is within the first T sentences and 0 otherwise, where T is determined by the number of words allowed for the summary. If we used only this function, the system would be exactly a lead-based system. We have two more functions. One is the reciprocal of the sentence position, by which the first sentence receives the highest score and the score gradually decreases to its minimum at the final sentence. The other is the maximum of the reciprocals of the position from the top and from the tail. This may look strange in general, but it was the optimal function for summarizing editorial articles in TSC, the Japanese summarization evaluation project, which suggests that in editorial articles the concluding sentences are also important. (Incidentally, we received the top score for the 10% summaries in the subjective judgment of that evaluation.)
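In code, the three alternatives could look like this (a sketch; 1-indexed sentence positions are assumed):

```python
def position_step(i, T):
    """The function used in the system: 1 for the first T sentences,
    0 otherwise. On its own this is exactly a lead-based system."""
    return 1.0 if i <= T else 0.0

def position_reciprocal(i):
    """1/i: the first sentence scores highest, decreasing to the end."""
    return 1.0 / i

def position_both_ends(i, n):
    """max(1/i, 1/(n-i+1)): rewards sentences near either end of the
    document; optimal for Japanese editorials in the TSC evaluation."""
    return max(1.0 / i, 1.0 / (n - i + 1))
```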

5 Alternative scores for Sentence length & TF/IDF
Sentence length
- Score 1: Length
- Score 2: Length (if Length > C), Length - C (otherwise)
TF/IDF
- TF = tf(w), (tf(w)-1)/tf(w), or tf(w)/(tf(w)+1)
For sentence length we have two scores: one is simply the length, and the other is the length with a penalty for short sentences. We have three scores for TF/IDF, in which the term frequency is calculated differently: one uses the raw term frequency, and the other two normalize the figure in the two ways shown.
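A sketch of these alternatives, assuming a standard logarithmic IDF (the exact IDF form and tokenization are not specified on the slide):

```python
import math

def length_score(length, C=None):
    """Score 1 when C is None: the raw length. Score 2 otherwise:
    sentences of length <= C are penalized by C."""
    if C is None or length > C:
        return float(length)
    return float(length - C)

def tfidf_score(sentence_words, tf, df, n_docs, variant="raw"):
    """Sum of TF/IDF over a sentence's words, with the three TF
    variants named on the slide. `tf` and `df` are assumed to map
    words to term and document frequencies."""
    def tf_term(w):
        t = tf[w]
        if variant == "raw":
            return t
        if variant == "minus_one":
            return (t - 1) / t
        return t / (t + 1)
    return sum(tf_term(w) * math.log(n_docs / df[w]) for w in sentence_words)
```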

6 Alternative scores for Headline
- TF/IDF ratio between words overlapping with the headline and all words in the sentence
- TF ratio between overlapping Named Entities (NEs) and all NEs in the sentence, with TF = tf(e)/(1+tf(e))
We use a measure of the similarity of the sentence to the headline. The basic idea is that the more of a sentence's words overlap with the words in the headline, the more important the sentence is. The words are weighted by their TF/IDF. There are two ways of selecting words: one uses all words except stop words, and the other uses only named entities. On the training data we found that using NEs only was better than using all words. Here the term frequency is normalized as shown on the slide.
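A minimal sketch of the NE variant, assuming precomputed tf and idf maps (the data structures are our assumption):

```python
def headline_score(sentence_entities, headline_entities, tf, idf):
    """Ratio of the TF/IDF mass of named entities shared with the
    headline to that of all named entities in the sentence, with the
    term frequency normalized as tf(e)/(1+tf(e))."""
    def weight(e):
        return tf[e] / (1.0 + tf[e]) * idf[e]
    total = sum(weight(e) for e in sentence_entities)
    if total == 0.0:
        return 0.0
    overlap = sum(weight(e) for e in sentence_entities
                  if e in headline_entities)
    return overlap / total
```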

7 Pattern
Assumption
- Patterns (phrases) that appear often in the domain are important
Strategy
- Intended to use IR to find a larger set of documents in the domain, but used the given document set
- NEs were treated as classes rather than as literals
Now I will explain the main IE technology we introduced in our system. The assumption behind IE pattern discovery is that patterns that appear often in the domain are important. For example, in the earthquake report domain, we may find many patterns like "There was an earthquake in LOCATION at TIME on DATE"; such a pattern is then regarded as important in the domain. We intended to use IR techniques to find a larger set of similar documents, but because of time limitations we used only the given document sets provided by DUC. Although the number of documents is small, around 10, their quality is good, as these sets were created by humans. In the pattern instances, each type of named entity was treated as a class rather than as the literal words.

8 Pattern discovery
Procedure
- Analyze sentences (NE, dependency)
- Extract all sub-trees from the dependency trees in the domain
- Score the trees based on the frequency of the tree and the TF/IDF of the words
- High-scoring trees are regarded as important patterns
The basic procedure of pattern discovery is as follows. First, analyze all the sentences in the domain documents, namely with named entity tagging and dependency analysis. Then extract all sub-trees from the dependency trees and score each tree based on its frequency and the TF/IDF of the words in it. High-scoring trees are regarded as important patterns.
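A rough sketch of the counting and scoring steps in Python; the sub-tree enumeration is left as a hypothetical helper, and the exact scoring formula is our assumption:

```python
from collections import Counter

def discover_patterns(dep_trees, word_tfidf, min_count=2):
    """Count sub-trees across the domain's dependency trees and score
    each pattern by its frequency times the TF/IDF of its words.

    A pattern is represented here as a tuple of words;
    `enumerate_subtrees` is a hypothetical helper, and `min_count`
    is an illustrative cutoff."""
    counts = Counter()
    for tree in dep_trees:
        for subtree in enumerate_subtrees(tree):
            counts[subtree] += 1
    scored = {
        pattern: count * sum(word_tfidf.get(w, 0.0) for w in pattern)
        for pattern, count in counts.items() if count >= min_count
    }
    # The highest-scoring sub-trees are the candidate domain patterns.
    return sorted(scored.items(), key=lambda kv: -kv[1])
```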

9 Optimal weight
Optimal weights are found on training set
Contribution (weight x std. dev.):

  Score     Contribution
  Position  277
  Length    8
  TF/IDF    96
  Headline  18
  Pattern   2

Now I will explain the contribution of each component. As I mentioned before, the optimal weights were found on the training set. The table shows the contribution of each component, which is the optimized weight multiplied by the standard deviation of the score. The biggest contribution comes from sentence position, and the second biggest from TF/IDF. The others are relatively small, and unfortunately the contribution of the IE pattern is very small. In the evaluation our average subjective score was the best, with a very small difference to the second system, so even this small contribution from the IE pattern could have made the difference, although we have not evaluated the system without the pattern component.
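The table's quantity is straightforward to compute; a one-function sketch, assuming per-sentence score lists are available:

```python
import statistics

def contribution(weight, component_scores):
    """Contribution of one component, as reported in the table:
    its optimized weight times the standard deviation of its
    score over the training sentences."""
    return weight * statistics.stdev(component_scores)
```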

10 Evaluation Result
Subjective evaluation (V; out of 12), averaged over all documents (parentheses give our rank):

                  System      Lead    Average
  Grammaticality  3.711 (5)   3.236   3.580
  Cohesion        3.054 (1)   2.926   2.676
  Organization    3.215 (1)   3.081   2.870
  Total           9.980 (1)   9.243   9.126

This is the evaluation result: the average of the subjective evaluation over all documents. Our system scored 5th in grammaticality and first on all the other measures, including the total.

11 Prospect for Single Document Summaries
Important Sentence Extraction CAN be Summarization but is NOT
Now that I have finished talking about our system for DUC, let me talk about the prospects for single document summaries. In our system we used an important sentence extraction technique, and I believe some other systems did too. However, I believe many of us agree that "important sentence extraction can be summarization, but summarization is not important sentence extraction." We would like to, and we have to, go beyond important sentence extraction.

12 DUC
We are aiming for Document understanding
How can understanding be instantiated?
- Make a summary
- Extract essential points, principal relations
- Answer questions
- Comprehension test
Let's think about document understanding, as we are DUC people. The problem with document understanding is that it is difficult to assess whether a human or a system "understands" a document. For example, we can ask a high school student to make a summary in order to prove he understands the document, but do we think the student understands it if he extracts the top n sentences as a summary? Alternatively, we can ask him to extract the essential points or principal relations. Or we can ask him questions about the contents and judge whether he can answer them. Or, finally, we can give him a comprehension test. The last two methods are attractive, but to use them we have to prepare questions or a test, and this may introduce a biased point of view. I would like to pursue the second method to instantiate document understanding.

13 Example
Earthquake jolts Los Angeles area
LOS ANGELES (AP) — An earthquake shook the greater Los Angeles area Sunday, but there were no immediate reports of damage or injuries. The quake had a preliminary magnitude of 4.2 and was centered about one mile southeast of West Hollywood, said Lucy Jones of the U.S. Geological Survey. The quake was felt in downtown Los Angeles where it rolled for about four seconds and also shook in the suburban areas of Van Nuys, Whittier and Glendale.
I will show you an example. This is part of a Web article from CNN last Sunday. It is already a summary, as it contains only the first three sentences of a document of more than ten sentences. Following the earlier discussion, do you think the student understands the whole document if he returns these three sentences? Alternatively, how about if he gives back …

14 Essential points
Event (Earthquake)
- When: Sunday, September 9, 2001
- Where: greater Los Angeles area
- Magnitude: 4.2
- Injury: No
- Death: No
- Damage: No
THIS. I call it the "essential points", but it is a table in IE terms, and it may be called other things. It tells us that the document mentions an event, an earthquake, which happened on Sunday, September 9, in the greater Los Angeles area, with magnitude 4.2, and with no reported injuries, deaths, or damage. I believe this is good enough to show that the student, or the system, understands the document.
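Such a table maps naturally onto a simple structured record; a hypothetical sketch, where the field names are illustrative rather than a defined schema:

```python
# Hypothetical representation of the "essential points" for the
# earthquake example above; field names are illustrative.
essential_points = {
    "event": "Earthquake",
    "when": "Sunday, September 9, 2001",
    "where": "greater Los Angeles area",
    "magnitude": 4.2,
    "injury": False,
    "death": False,
    "damage": False,
}
```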

15 How can we make it?
IE is a hint (a step)
- IE is a version of document understanding limited to a specific domain and task which are given in advance
- Document understanding can be achieved by upgrading IE technologies to delete "specific" and "given in advance"
So how can we make it? This is research conducted at NYU with some collaboration from CRL. To achieve it, IE can be a hint, or a step. We can say that IE is a version of document understanding limited to a specific domain and task which are given in advance. In other words, document understanding can be achieved by upgrading IE technologies to delete "specific" and "given in advance". So now the question is how to delete these terms, and it can be done by dynamically extracting domain knowledge for IE.

16 Our approach
Essential points can be found by searching for frequently mentioned patterns in the same domain
Strategy
- Given a document, find its domain by IR
- Find frequently mentioned patterns
- Extract information matching those patterns
Our approach is partially the same as the pattern discovery I explained for our system: essential points can be found by searching for frequently mentioned patterns in the same domain. The strategy, sketched below, is as follows. Given a document, find documents in its domain by IR. Find frequently mentioned patterns in that document set. Up to here it is the same as the pattern discovery we used in our DUC system. Then, to extract the information, match the extracted patterns against the given document. In short, it tries to find the essential points with the help of similar articles.
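Put together, the three steps form a small pipeline; a sketch in which all three callables are hypothetical placeholders for the components described above:

```python
def extract_essential_points(document, retrieve, discover_patterns, match):
    """Sketch of the proposed strategy; `retrieve`, `discover_patterns`,
    and `match` are hypothetical placeholder functions."""
    domain_docs = retrieve(document)            # 1. find the domain by IR
    patterns = discover_patterns(domain_docs)   # 2. frequent patterns in the domain
    return match(patterns, document)            # 3. extract matching information
```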

17 Single Document Summarization
Has to be continued
- To pursue research on "understanding"
- To find something more than sentence extraction
- To observe humans in the summary task
- To have newcomers (like us)
In conclusion, I hope the single document summarization task will be continued in some shape. First, in order to pursue research on understanding, in particular to find something more than important sentence extraction. Also, to observe humans in the summary task: this has been done by SUMMAC, DUC, and TSC in Japan, but these are basically text based, and we may want to think about different styles of summary as well. Finally, this is for the field to grow: if there were no single document summarization and only multi-document summarization evaluation, it would be hard for newcomers to join. I think we could not have joined if there had been no single document summarization this time.

I think some people worry about the evaluation if we were to submit the essential points as a summary; in other words, the question is how to evaluate different styles of summary in the same manner. I am not sure whether a single measurement is necessary, but if it is needed, the two-dimensional measurement done by SUMMAC, for example, could be a hint. They drew a graph in which one dimension is some kind of goodness score and the other is the time taken to judge the summary. I believe that our style of summary does not need much time to be read, though it might get a bad score at the moment. Another idea is to ask humans to extract essential points and measure recall and precision on both styles of summaries; in this case the time to judge may also be an important factor in the evaluation.

Anyway, I hope the single document summarization project will be continued in some shape. Again, I would like to convey Mr. Sekine's apologies, and thank you for listening. I can answer questions about the first half of the talk.

