A progressive sentence selection strategy for document summarization
You Ouyang, Wenjie Li, Renxian Zhang, Sujian Li, Qin Lu (IPM 2013)
Hao-Chin Chang, Department of Computer Science & Information Engineering, National Taiwan Normal University, 2013/03/05
Outline
–Introduction
–Methodology
–Experiments and evaluation
–Conclusion and future work
Introduction
Many studies
–It is well acknowledged that sentence selection strategies are very important; they mainly aim at reducing the redundancy among the selected sentences so that the summary can cover more concepts.
Different from the existing methods, in our study we'd like to explore the idea of directly examining the uncovered parts of the sentences for saliency estimation, in order to maximize the coverage of the summary.
Introduction
To avoid the possible saliency problem, we make use of the subsuming relationship between sentences to improve the saliency measure.
–The idea is to use the salient general concepts, which are more significant, to help discover the salient supporting concepts.
–For example, once we have selected a general word "school" in a sentence of the summary, we would like to select "student" or "teacher" in the next sentences.
Sentence A: the schools that have vigorous music programs tend to have higher academic performance.
Sentence B: among the lower-income students without music involvement, only 15.5% achieved high math scores.
The question is: when sentence A is selected, how much do we want to include another sentence B to support the ideas in sentence A?
Identifying word relations
Two common ways to identify word relations:
1. linguistic relation databases such as WordNet
2. frequency-based statistics such as co-occurrence or pointwise mutual information
In our study, the target is to study the subsuming relations between the words in the input documents.
–Word a subsumes word b if the documents in which b occurs are a subset, or nearly a subset, of the documents in which a occurs.
–The relation is defined by two conditions: P(a|b) ≥ 0.8 and P(b|a) < P(a|b).
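A minimal sketch of this document-level subsumption test, assuming the input is a list of token sets (one per document); the 0.8 threshold comes from the slide, while the helper names and everything else are illustrative:

```python
# Document-level subsumption test: word a subsumes word b if the documents
# containing b are (nearly) a subset of the documents containing a.
from collections import defaultdict

def doc_occurrences(docs):
    """Map each word to the set of document indices in which it occurs."""
    occ = defaultdict(set)
    for i, doc in enumerate(docs):
        for w in doc:
            occ[w].add(i)
    return occ

def subsumes(a, b, occ, threshold=0.8):
    """True if P(a|b) >= threshold and P(b|a) < P(a|b)."""
    docs_a, docs_b = occ[a], occ[b]
    if not docs_a or not docs_b:
        return False
    inter = len(docs_a & docs_b)
    p_a_given_b = inter / len(docs_b)   # share of b's documents that also contain a
    p_b_given_a = inter / len(docs_a)   # share of a's documents that also contain b
    return p_a_given_b >= threshold and p_b_given_a < p_a_given_b
```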
Identifying word relations
Sentence-level coverage
–Sometimes a document set consists of only a few documents.
–To get more available information, we study sentence-level co-occurrence statistics instead.
Set-based coverage
–Sentence-level co-occurrence is sparser than document-level co-occurrence due to the shorter length of sentences.
–We therefore examine the coverage not only between two words, but also between a word and a word set.
–For example, given the two common phrases "King Norodom" and "Prince Norodom", the word "Norodom" is almost entirely covered by the set {King, Prince}.
Identifying word relations
Transitive reduction
–The subsuming relation between two words also reflects the recommendation status between them.
–For three words a, b, c that satisfy a > b, b > c, and a > c (a > b denotes a subsuming b), the long-range relation a > c is ignored (see the sketch below).
Spanned sentence set
–The spanned sentence set SPAN(w) of a word w in a document set D, whose sentence set is denoted S_D, is defined as the set of the sentences that contain w.
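A minimal sketch of the transitive reduction step, assuming the subsuming relations are given as a set of (a, b) pairs meaning "a subsumes b" and form an acyclic graph; per the slide's a > b > c example, only edges implied by a two-step path are dropped:

```python
def transitive_reduction(edges):
    """Drop a > c whenever some b exists with a > b and b > c."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    reduced = set(edges)
    for a, b in edges:
        for mid in children.get(a, ()):
            if mid != b and b in children.get(mid, ()):
                reduced.discard((a, b))   # a > b is implied by a > mid > b
                break
    return reduced

# e.g. {("a","b"), ("b","c"), ("a","c")} -> {("a","b"), ("b","c")}
```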
Identifying word relations an existing non-empty word set Concept coverage of a word w over W is devised to reflect to what extent w brings new information relative to the known information provided in W COV(w) is defined as the proportion of the sentences in SPAN(w) that appear in SPAN(W) The smaller the coverage is, the more likely w will bring new information to W 8
Identifying word relations
When comparing a word w to a former word w_0 that already subsumes a set of words S, two coverage constraints (each taking values in [0, 1]) are checked in order to align a relation between w and w_0.
The definition of the subsuming relationship
Denote the word set of s as W and the word set of s' as W'.
Connected word
–A word w_i in W is regarded as "connected" to a word w'_j in W' if it satisfies the condition that a chain of subsuming relations links w'_j to w_i; the word that directly connects to w_i is denoted w_l1.
–The weight of that edge is COV(w_i|w_l1).
–The strength of the connection between w_i and w'_j is denoted CON(w_i|w'_j).
The definition of the subsuming relationship
The conditional saliency (CS for short) of s to s' is calculated as a weighted sum of the importance of all the "connected words" in s with respect to s'.
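A minimal sketch of CS(s|s'), where `importance` maps words to salience scores and `connection(w, w2)` stands in for the connection strength CON(w|w') built from the COV edge weights; both are simplified stand-ins for the paper's exact definitions:

```python
def conditional_saliency(sent_words, prev_words, importance, connection):
    """Weighted sum of the importance of the words in s that are
    'connected' to some word of the already-selected sentence s'."""
    score = 0.0
    for w in sent_words:
        # take the strongest connection of w to any word of s'
        strength = max((connection(w, w2) for w2 in prev_words), default=0.0)
        if strength > 0.0:            # w contributes only if connected to s'
            score += importance.get(w, 0.0) * strength
    return score
```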
Progressive sentence selection strategy
The process can be viewed as a random walk on a DAG (directed acyclic graph) from the center to its surrounding nodes.
We introduce a virtual word besides the real words that appear in the input documents.
–The virtual word is used as the center of the DAG (denoted ROOT-W).
–We can view it as a virtual word that spans the whole sentence set, so that it perfectly covers any actual word.
Progressive sentence selection strategy
The corresponding virtual sentence ROOT-S is regarded as already selected at the beginning of the sentence selection process.
The conditional saliency of a sentence to ROOT-S indicates its ability to describe the general ideas of the input documents, because the words attached to ROOT-W are the general words.
Progressive sentence selection strategy
The sentence selection process is cast as:
–first adding ROOT-S to the initial summary
–then iteratively adding the sentence that best supports the existing sentence(s) (denoted S_old)
The score of each unselected sentence is based on its conditional saliency to each selected sentence; the maximum saliency indicates how much supporting information the sentence contributes.
When different sentences contain the same "connected words", they have equal scores.
–We use two popular criteria, length and position, to break such ties and obtain the final measure of the sentence score (a sketch of the loop follows).
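A minimal sketch of the greedy loop under stated assumptions: sentences are word lists, `cs(s, s_sel)` is a stand-in for the conditional saliency above, `root_s` is the virtual sentence, and the length limit is in words; the tie-breaking by length and position is omitted for brevity:

```python
def progressive_select(sentences, cs, root_s, length_limit):
    selected = [root_s]               # ROOT-S is pre-selected
    candidates = list(sentences)
    total_len = 0
    while candidates and total_len < length_limit:
        # score each candidate by its best conditional saliency to any
        # already-selected sentence (how much supporting information it adds)
        best = max(candidates,
                   key=lambda s: max(cs(s, sel) for sel in selected))
        candidates.remove(best)
        selected.append(best)
        total_len += len(best)        # sentences assumed to be word lists
    return selected[1:]               # drop the virtual ROOT-S
```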
Redundancy control by penalizing repetitive words
To ensure that each selected sentence always brings new concepts, a damping factor a is applied to the word importance during the sentence selection process.
In the extreme case when a equals 0, an effective "connected word" is required not to appear in any selected sentence.
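A minimal sketch of the damping step, assuming `importance` is a mutable dict of word scores and `a` is the damping factor from the slides:

```python
def damp_repeated_words(importance, selected_sentence, a):
    """Multiply the importance of every word of the newly selected sentence
    by a. With a = 0, a word already in the summary can no longer act as an
    effective 'connected word'."""
    for w in set(selected_sentence):
        if w in importance:
            importance[w] *= a
```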
Experiments and evaluation
Document Understanding Conference (DUC)
–The proposed summarization methods are first evaluated on a generic multi-document summarization data set
–and then extended to several query-focused multi-document summarization data sets.
We use the automatic evaluation toolkit ROUGE to evaluate the system summaries.
DUC 2004 generic multi-document summarization data set
–45 document sets, each consisting of 10 documents
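The paper scores summaries with the original ROUGE toolkit; as an assumption, the Python rouge-score package (pip install rouge-score) gives comparable ROUGE-1/ROUGE-2 F-measures for a quick check:

```python
from rouge_score import rouge_scorer

reference = "the schools with vigorous music programs tend to have higher academic performance"
system = "schools that have vigorous music programs show higher academic performance"

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2"], use_stemmer=True)
scores = scorer.score(reference, system)   # reference first, then the system summary
print(scores["rouge1"].fmeasure, scores["rouge2"].fmeasure)
```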
Experiments and evaluation
The resulting summary tends to include more diverse words and thus stands a better chance of sharing more words with the reference summaries, which may lead to a higher ROUGE-1 score.
The ROUGE-2 score may decrease even more, as it requires matching two consecutive words.
The sequential system obtains the highest ROUGE-1 score with full penalty on repetitive words (a equals 0); however, its ROUGE-2 scores drop significantly.
The best ROUGE-2 scores are obtained when a equals 0.5; we can observe that the dropping rate is much lower for the progressive system.
Experiments and evaluation
This clearly demonstrates the advantage of the progressive sentence selection strategy: it guarantees both the novelty and the saliency of the selected sentences.
Experiments and evaluation
The damping factor is used to handle the redundancy issue.
–The reason is that it is more consistent with the word importance estimation method used in the systems, and is thus better at handling the redundancy for the system.
Experiments and evaluation
When the threshold is too small
–many unrelated words may be wrongly associated, which unavoidably impairs the reliability of the word relations and leads to worse performance.
When the threshold is too large
–the discovered word relations will be very limited, which weakens the progressive system.
Experiments and evaluation
DUC 2005-2007 query-focused multi-document summarization data sets
–The data set in each year contains about 50 topics
–each topic consisting of 25–50 documents
–system-generated summaries are strictly limited to 250 English words in length
Experiments and evaluation
It is also shown that incorporating the query to refine the word importance is effective for both the progressive system and the sequential system.
Conclusion and future work
In the process, a sentence can be selected either as a general sentence or as a supporting sentence.
The sentence relationship is used to improve the saliency estimation of the supporting sentences.
A single word alone is often insufficient to represent a complex concept, and the sense of a word can be ambiguous in a document set.
–In future work, we'd like to explore concept relations.
Conclusion and future work
Due to the limitations of current natural language generation techniques, automatic summarization systems still cannot freely compose ideal sentences as humans do.
In the future, we'd like to investigate other means of breaking the limitation of the original sentences, such as sentence compression or sentence fusion, which can generate additional candidate sentences in order to express the desired concepts more accurately.
Speech summarization
Experiments: data
Experimental corpus lists
–ds2_all_list.txt: 100-document training-corpus list
–ds2_all_list_test.txt: 105-document test-corpus list
–ds2_all_list_train.txt: 20-document test-corpus list
–test_difficult.txt
RM, WRM: data using additional information
–2002_News_Content.txt.seg
Experiments: data
Background (BG) data
–CNA0102.GT3-7.lm.wid: N-gram, LM score (log base 10, to be converted to base e), LMWID, back-off score
Dictionary
–NTNULexicon2003-72K.txt: AcousticWID, LMWID, N-gram, Chinese character, Zhuyin phonetic symbols, toneless syllable WID, tonal syllable WID
Experiments: data
ROUGE dictionary
–RougeDict.txt
–a1 a2 a3
–a (LMWID)
Sentence modeling
Unigram language model (ULM) with KL-divergence ranking
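The slide gives only the method name; the sketch below shows one common formulation of KL-based sentence ranking with unigram language models, assuming linear smoothing with a background distribution. All details here are illustrative, not taken from the slides:

```python
import math
from collections import Counter

def unigram_lm(tokens, background, lam=0.5):
    """ML unigram model, linearly smoothed with a background distribution."""
    counts, n = Counter(tokens), len(tokens)
    return lambda w: lam * counts[w] / n + (1.0 - lam) * background.get(w, 1e-9)

def kl_rank_score(doc_tokens, sent_tokens, background):
    """Negative KL(doc || sentence): higher means the sentence's model is
    closer to the document model, i.e. a better summary candidate."""
    p_doc = unigram_lm(doc_tokens, background)
    p_sent = unigram_lm(sent_tokens, background)
    return -sum(p_doc(w) * math.log(p_doc(w) / p_sent(w))
                for w in set(doc_tokens))
```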
Sentence modeling
Relevance model (RM)