A Recurrent Neural Network Language Modeling Framework for Extractive Speech Summarization

Kuan-Yu Chen†,*, Shih-Hung Liu†,*, Berlin Chen#, Hsin-Min Wang†, Wen-Lian Hsu†, Hsin-Hsi Chen*
†Institute of Information Science, Academia Sinica, Taiwan; #National Taiwan Normal University, Taiwan; *National Taiwan University, Taiwan

Summary
• Extractive speech summarization aims to automatically select an indicative set of sentences from a spoken document to concisely represent the important aspects of the document
• An emerging stream of work employs the language modeling framework in an unsupervised manner
• The major challenge is how to formulate the sentence models and accurately estimate their parameters
• We propose a novel and effective recurrent neural network language modeling (RNNLM) framework for speech summarization
• The deduced models render not only word usage cues but also long-span structural information of word co-occurrence relationships

The Proposed RNNLM Summarization Method
[Pipeline figure] Speech Signal → Speech Recognition System → Sentences S1 … SN → Document-level RNNLM → Sentence-specific RNNLM → Sentence Ranking by P_RNNLM(D|S) → Speech Summary

Language Modeling Framework
• A principal realization of using language modeling for summarization is a probabilistic generative paradigm that ranks each sentence S of a spoken document D to be summarized
• The simplest way is to estimate a unigram language model (ULM) from the frequency of each distinct word w occurring in sentence S, with the maximum likelihood (ML) criterion

Recurrent Neural Network Language Modeling
• RNNLM has recently emerged as a promising modeling framework for several tasks
• The statistical cues of previously encountered words retained in the hidden layer can be fed back for predicting an arbitrary succeeding word
• Both word usage cues and long-span structural information of word co-occurrence relationships can be taken into account naturally

RNNLM for Summarization
• A hierarchical training strategy is proposed to obtain a corresponding RNNLM for each sentence:
1) A document-level RNNLM is trained for each document to be summarized; the model memorizes both word usage and long-span word dependence cues in the document
2) A sentence-specific RNNLM is then trained, starting from the document-level RNNLM
3) The resulting sentence-specific RNNLM can be used in place of, or to complement, the original sentence model (i.e., ULM)

Experimental Results
• Dataset: MATBN corpus
  - Collected by Academia Sinica and the Public Television Service Foundation of Taiwan between November 2001 and April 2003
  - TD: uses the manual transcripts of the spoken documents (without speech recognition errors)
  - SD: uses the speech recognition transcripts, which may contain speech recognition errors
• Baseline approaches
  - Summarization results achieved by a few state-of-the-art unsupervised methods
  - ULM is competitive with the other state-of-the-art unsupervised methods, confirming the applicability of the language modeling approach to speech summarization
• The RNNLM framework
  - RNNLM: the deduced sentence-specific RNNLM model used in isolation
  - RNNLM+ULM: the RNNLM model linearly combined with the unigram language model

                  ROUGE-1   ROUGE-2   ROUGE-L
  TD  ULM           41.1      29.8      36.1
      BLM           40.8      29.8      35.9
      RNNLM         43.3      31.9      39.0
      RNNLM+ULM     53.3      43.9      48.3
  SD  ULM           36.4      21.0      30.7
      BLM           36.7      21.8      31.1
      RNNLM         33.0      18.4      29.4
      RNNLM+ULM     43.9      30.4      39.3
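The ULM baseline described above can be sketched in a few lines: estimate a unigram model per sentence with the ML criterion and rank sentences by the log-likelihood log P(D|S) they assign to the whole document. The background-model smoothing (weight alpha) is a hypothetical choice added here only to avoid zero probabilities; the poster does not specify its smoothing scheme.

```python
from collections import Counter
from math import log

def ulm_score(sentence, document, alpha=0.5):
    """Log-likelihood log P(D|S) of the document under a unigram
    sentence model estimated with the ML criterion, mixed with a
    document-level background model (a hypothetical smoothing
    choice to avoid log(0) for unseen words)."""
    sent_counts = Counter(sentence)
    bg_counts = Counter(document)
    n_sent, n_bg = len(sentence), len(document)
    score = 0.0
    for w in document:
        p_sent = sent_counts[w] / n_sent
        p_bg = bg_counts[w] / n_bg
        score += log(alpha * p_sent + (1 - alpha) * p_bg)
    return score

# Rank candidate sentences of a toy "spoken document" by P(D|S):
# the sentence that better predicts the document ranks first.
document = "the cat sat on the mat the cat slept".split()
sentences = ["the cat sat on the mat".split(), "slept".split()]
ranked = sorted(sentences, key=lambda s: ulm_score(s, document), reverse=True)
```

Sentences with higher scores are selected into the summary until the length budget is reached.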
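The RNNLM+ULM configuration linearly combines the sentence-specific RNNLM with the unigram model. A minimal sketch of that combination at the word-probability level is below; `rnnlm_prob` and `ulm_prob` are stand-ins for the trained models, and the mixture weight `lam` is a hypothetical parameter (the poster does not state how the weight is set).

```python
from math import log

def combined_score(document, rnnlm_prob, ulm_prob, lam=0.5):
    """Log P(D|S) when a sentence-specific RNNLM is linearly
    combined with the unigram sentence model per word.
    rnnlm_prob(w, history) conditions on the preceding words;
    ulm_prob(w) ignores them."""
    score = 0.0
    for i, w in enumerate(document):
        p = lam * rnnlm_prob(w, document[:i]) + (1 - lam) * ulm_prob(w)
        score += log(p)
    return score

# Toy stand-ins: a uniform "RNNLM" and a fixed-frequency ULM.
vocab = {"the": 0.4, "cat": 0.3, "sat": 0.3}
doc = ["the", "cat", "sat"]
score = combined_score(doc,
                       rnnlm_prob=lambda w, h: 1.0 / len(vocab),
                       ulm_prob=lambda w: vocab[w])
```

With `lam=0` the score reduces exactly to the ULM log-likelihood, so the interpolation degrades gracefully to the baseline.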
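The results table reports ROUGE-1, ROUGE-2, and ROUGE-L. As a reference point, ROUGE-N recall counts the fraction of reference n-grams matched by the candidate summary; the sketch below is a minimal illustration, whereas the reported numbers come from the standard ROUGE toolkit (which also handles stemming, stopword options, and ROUGE-L's LCS computation).

```python
from collections import Counter

def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: matched reference n-grams divided by the
    total number of reference n-grams (clipped counting, so a
    candidate n-gram cannot match more times than it appears
    in the reference)."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    matched = sum(min(count, ref[g]) for g, count in cand.items())
    total = sum(ref.values())
    return matched / total if total else 0.0
```

For example, the candidate "the cat sat" against the reference "the cat sat on the mat" matches 3 of 6 reference unigrams (ROUGE-1 = 0.5) and 2 of 5 reference bigrams (ROUGE-2 = 0.4).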

