A Compositional Context Sensitive Multi-document Summarizer: Exploring the Factors That Influence Summarization Ani Nenkova, Stanford University Lucy Vanderwende,


1 A Compositional Context Sensitive Multi-document Summarizer: Exploring the Factors That Influence Summarization
Ani Nenkova, Stanford University
Lucy Vanderwende, Microsoft Research
Kathleen McKeown, Columbia University
SIGIR 06

2 Why is summarization important?
Ani Nenkova

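3 Summarizing multiple documents
[Figure: example input set on the Pan Am bombing, with the terms "Pan Am bombing", "Libya", "suspects", "Gadhafi", "trial", "UK and USA", and the example sentence "Libya refuses to surrender two Pan Am bombing suspects"]
Ani Nenkova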

4 Ani Nenkova

5 Introduction
Most current automatic summarization systems rely on sentence extraction.
Common approaches for identifying important sentences to include in the summary:
- training a binary classifier
- training a Markov model
- directly assigning weights to sentences
But the question of which components and features of automatic summarizers contribute most to their performance has largely remained unanswered.

6 Introduction (cont’d)
This paper examines several design decisions and the impact they have on the performance of generic multi-document summarizers of news:
- Content word frequency: content words such as nouns, verbs and adjectives serve as surrogates for the atomic units of meaning in text.
- Choice of composition function: it must estimate the importance of larger text units, typically sentences.
- Context sensitivity: the notion of importance is not static; it depends on what has already been said in the summary.

7 Frequency in Human Summaries
Performance evaluation and human agreement:
- Different people choose different content for their summaries.
- We look at the degree of overlap between input documents and human summaries.
- We investigate the association between content that appears frequently in the input and the likelihood that a human summarizer will select it.

8 Frequency in Human Summaries (cont’d) Content word frequency and importance
Data: DUC (Document Understanding Conference) 2003, 30 test sets
- Each set provides input documents, four human abstracts, and automatic summaries.
- Each set: 10 documents, 100-word summaries.
- The counts for frequency in the input were taken over the concatenation of all documents (All) in the input set.

9 Frequency in Human Summaries (cont’d) Content word frequency and importance
Words frequent in the input appear in human summaries
- Question: are content words that are very frequent in the input likely to appear in at least one of the human summaries?
- Method: exclude stop words; use only nouns, verbs and adjectives.
- Result: the high-frequency words from the input are very likely to appear in the human models.
- For the automatic summarizer, the trend to include more frequent words is preserved.
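The coverage question on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop list is a tiny made-up one, tokenization is plain whitespace splitting, and no part-of-speech filtering is done (the study kept only nouns, verbs and adjectives). The names `content_words` and `top_n_coverage` are hypothetical.

```python
from collections import Counter

# Tiny illustrative stop list (an assumption; the slides do not specify one).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "on", "for"}


def content_words(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (tok.strip(".,;:!?\"'()").lower() for tok in text.split())
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


def top_n_coverage(input_docs, human_summaries, n):
    """Fraction of the n most frequent input content words that occur
    in at least one human summary."""
    # Counts are taken over the concatenation of all input documents.
    counts = Counter(w for doc in input_docs for w in content_words(doc))
    top_words = [w for w, _ in counts.most_common(n)]
    summary_vocab = set()
    for summary in human_summaries:
        summary_vocab.update(content_words(summary))
    hits = sum(1 for w in top_words if w in summary_vocab)
    return hits / len(top_words) if top_words else 0.0
```

Calling `top_n_coverage(docs, summaries, n)` for increasing `n` would trace how coverage changes as less frequent words are included.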

10 Frequency in Human Summaries (cont’d) Content word frequency and importance
Humans agree on words that are frequent in the input
- The words that human summarizers agreed to use in their summaries include the high-frequency ones.
- In the 30 sets of DUC 2003 data, the state-of-the-art machine summary contained 69% of the words appearing in all 4 human models and 46% of the words that appeared in 3 models.

11 Frequency in Human Summaries (cont’d) Content word frequency and importance
Formalizing frequency: the multinomial model
- The findings from the previous sections suggest that frequency in the input is strongly indicative of whether a word will be used in a human summary.
- Likelihood of a summary under the multinomial model:
$$L(\mathrm{sum}; p(w_i)) = \frac{N!}{N_1! \cdots N_r!} \, p(w_1)^{N_1} \cdots p(w_r)^{N_r}$$
  - N is the number of words in the summary
  - r is the number of unique words in the summary
  - N_i is the number of times word w_i appears in the summary
  - p(w_i) is the probability of w_i appearing in the summary, estimated from the input documents

12 Frequency in Human Summaries (cont’d) Content word frequency and importance
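Formalizing frequency: the multinomial model (cont’d)
- The log-likelihoods of summaries produced by human summarizers were overall higher than those of summaries produced by systems.
- Taking the logarithm of the multinomial likelihood gives:
$$\log L(\mathrm{sum}; p(w_i)) = \log N! - \sum_{i=1}^{r} \log N_i! + \sum_{i=1}^{r} N_i \log p(w_i)$$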
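A minimal sketch of how the multinomial log-likelihood of a summary could be computed, using the symbols from the previous slide (N, N_i, p(w_i)). The function name and the handling of unseen words are assumptions: a word with zero probability makes the likelihood zero, so the sketch returns negative infinity rather than applying any smoothing.

```python
import math
from collections import Counter


def multinomial_log_likelihood(summary_words, word_prob):
    """log L = log N! - sum_i log N_i! + sum_i N_i * log p(w_i).

    summary_words: list of tokens in the summary.
    word_prob: dict mapping word -> p(w), estimated from the input documents.
    """
    counts = Counter(summary_words)
    n_total = sum(counts.values())
    # lgamma(n + 1) equals log(n!) and avoids overflow for long summaries.
    log_l = math.lgamma(n_total + 1)
    for word, n_i in counts.items():
        p = word_prob.get(word, 0.0)
        if p <= 0.0:
            # Assumption: no smoothing, so an unseen word gives likelihood 0.
            return float("-inf")
        log_l += n_i * math.log(p) - math.lgamma(n_i + 1)
    return log_l
```

Comparing this value for a human summary and a system summary of the same input would correspond to the comparison reported on this slide.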

13 Frequency in Human Summaries (cont’d) Frequency of semantic content
- We established that high-frequency content words in the input are very likely to be used in human summaries.
- But using the same words does not mean that the same facts have been covered.
- A better granularity for such an investigation is the semantic content unit, an atomic fact expressed in a text.
- Pyramid approach for evaluation: in the annotation procedure, the content units are manually annotated, and expressions with the same meaning are linked together.
- Data: 11 sets of DUC 2004

14 Frequency in Human Summaries (cont’d) Frequency of semantic content
- As in the study for words, we looked at the N most frequent content units in the inputs and calculated the percentage of these that appeared in any of the human summaries.
- Of the 5 most frequent content units, 96% appeared in a human summary across the 11 sets.
- For the top 8 and top 12 content units, the figures were 92% and 85%.

15 Composition Functions
- Sentences are the usual units for extraction in summarization.
- How should the frequencies of words be combined to get an estimate of the importance of a sentence?
- We can define a family of summarizers, SUM_CF, where CF is the combination function.

16 Composition Functions (cont’d)
Context-sensitive frequency-based summarizer
[Diagram: Input text -> Sentence splitter -> Sentences -> Calculate term probability based on term frequency (words counted: verbs, nouns, adjectives, numbers; example word probabilities: aa 0.1, bb 0.3, cc 0.4, 0.6) -> Calculate CF value for each sentence (example sentence CF values: 0.011, 0.322, 0.566, 0.12)]
Note on the diagram: 96% belong to the most frequent words.
The specific choice of CF has a huge impact on the performance of the summarizer; not all frequency-based summarizers perform well.
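The two computation steps in the diagram (term probability, then a CF value per sentence) can be sketched as follows. This is an illustration under assumptions: sentences are already split, tokenized and reduced to content words, and the three combination functions are the Product, Sum and Average variants named on the evaluation slides. All function names are hypothetical.

```python
import math
from collections import Counter


def term_probabilities(sentences):
    """p(w) = count(w) / total number of tokens, over the whole input.

    sentences: list of sentences, each a list of content-word tokens.
    """
    counts = Counter(w for sent in sentences for w in sent)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def cf_product(probs):
    # Multiplying values below 1 penalizes every extra word.
    return math.prod(probs)


def cf_sum(probs):
    # Adding values rewards every extra word.
    return sum(probs)


def cf_average(probs):
    # Normalizes by sentence length.
    return sum(probs) / len(probs) if probs else 0.0


def score_sentence(sentence, word_prob, cf):
    """CF value of one sentence under the chosen combination function."""
    return cf([word_prob[w] for w in sentence])
```

Because each probability is below 1, the product shrinks with every added word while the sum grows, which is one way to read the sentence-length effects noted for the different CF choices.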

17 Context Adjustment
Using frequency alone to determine summary content in multi-document summarization will result in a repetitive summary.
[Diagram: Input text -> Sentence splitter -> Sentences -> Calculate term probability based on term frequency (words counted: verbs, nouns, adjectives, numbers; example word probabilities: aa 0.1, bb 0.3, cc 0.4, 0.6) -> Calculate CF value for each sentence (example sentence CF values: 0.011, 0.322, 0.566, 0.12) -> Select the sentence with the maximum score in each round and reassign word probabilities]

18 Context Adjustment (cont’d)
- Context adjustment gives the summarizer sensitivity to context.
- It also allows words with initially low probability to have a higher impact on the choice of subsequent sentences.
- In terms of content units, the inclusion of the same unit twice in the same summary is rather improbable.
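A rough sketch of the select-and-reassign loop described on the Context Adjustment slides. The slides do not say how probabilities are reassigned; squaring the probability of every word in the chosen sentence is an assumption made here for illustration. Sentences are assumed to be lists of content-word tokens, length is counted in those tokens, and the average is used as the combination function. The function names are hypothetical.

```python
from collections import Counter


def term_probabilities(sentences):
    """p(w) = count(w) / total number of tokens, over the whole input."""
    counts = Counter(w for sent in sentences for w in sent)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def average(probs):
    return sum(probs) / len(probs) if probs else 0.0


def select_sentences(sentences, cf=average, max_words=100):
    """Greedy selection with context adjustment; returns chosen indices."""
    probs = term_probabilities(sentences)
    remaining = list(range(len(sentences)))
    chosen = []
    length = 0
    while remaining and length < max_words:
        # Pick the sentence with the maximum CF value in this round.
        best = max(remaining, key=lambda i: cf([probs[w] for w in sentences[i]]))
        chosen.append(best)
        remaining.remove(best)
        length += len(sentences[best])
        # Assumed update rule: square the probability of each used word,
        # so content that is already in the summary becomes less attractive.
        for w in set(sentences[best]):
            probs[w] = probs[w] ** 2
    return chosen
```

With two identical high-scoring sentences in the input, the second copy loses its advantage after the first is selected, so a sentence built from initially rarer words can be chosen next. The last selected sentence may push the summary past `max_words`; truncating afterwards is left to the caller.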

19 Evaluation Results
Document Understanding Conference (DUC): 50 test sets
- We used the data from DUC 2003 for development and the data from DUC 2004 as test data.
- The choice of combination function CF has a significant impact on summarizer performance:
  - CF=Product: shorter sentences
  - CF=Sum: sentences with more words
  - CF=Average: a compromise between the two

20 Evaluation Results (cont’d)
All summaries were truncated to 100 words for the evaluation.

System           ROUGE-1   #sentences
Peer 65          0.305
CF=Sum           0.283     3.10
CF=Average       0.280     4.46
CF=AvrNoAdjust   0.252
CF=Product       0.227     5.40
Baseline         0.202

Slide annotations: HMM system; SUM_CF summarizers are unsupervised; 3 vs 13 repeated content units; significantly worse.
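For readers unfamiliar with the ROUGE-1 column, here is a simplified sketch of unigram recall against one or more reference summaries, with the 100-word truncation applied to the candidate. It is not the official ROUGE toolkit: stemming, stop-word options and jackknifing are omitted, and the function name and signature are assumptions.

```python
from collections import Counter


def rouge1_recall(candidate_tokens, references, max_words=100):
    """Clipped unigram matches summed over references, divided by the
    total number of reference unigrams.

    candidate_tokens: list of tokens in the system summary.
    references: list of reference summaries, each a list of tokens.
    """
    # Truncate the candidate before scoring, as in the evaluation setup.
    cand = Counter(candidate_tokens[:max_words])
    matched = 0
    total = 0
    for ref_tokens in references:
        ref = Counter(ref_tokens)
        # A reference word is matched at most as often as the candidate uses it.
        matched += sum(min(count, cand[word]) for word, count in ref.items())
        total += sum(ref.values())
    return matched / total if total else 0.0
```

Scores from this sketch would not be directly comparable to the table, since the official tool applies further processing.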

21 Evaluation Results (cont’d)
Machine Translation and Summarization Evaluation 2005 workshop at ACL
- Only 10 of the test sets

22 Conclusions (1/2)
- The analysis using the DUC datasets shows that frequency has a powerful impact on the performance of summarization systems, provided that a good composition function is used.
- Results show that SUM_Avr yields a system that performs comparably to other state-of-the-art systems and outperforms many of the participating systems.
- More importantly, repetition in the summary decreases significantly.

23 Conclusions (2/2)
These results suggest that:
- The more complex combinations of features used by state-of-the-art systems today may not be necessary.
- Composition plays an important role in performance, yet it is an unknown for most state-of-the-art systems, which often do not report it.

24 Comments

25 Pyramid score for evaluation
New summary with n content units Ani Nenkova

