1 Sentence Extraction-based Presentation Summarization Techniques and Evaluation Metrics Makoto Hirohata, Yousuke Shinnaka, Koji Iwano and Sadaoki Furui.


1 Sentence Extraction-based Presentation Summarization Techniques and Evaluation Metrics Makoto Hirohata, Yousuke Shinnaka, Koji Iwano and Sadaoki Furui ICASSP 2005 Presenter: Yao-Min Huang Date: 04/07/2005

2 Introduction One of the major applications of automatic speech recognition is to transcribe speech documents such as talks, presentations, lectures, and broadcast news. Spontaneous speech is ill-formed and very different from written text. – Redundant information  repetitions, repairs, etc. – Irrelevant or incorrect information  recognition errors

3 Introduction Therefore, simply transcribing all words is not an effective approach for spontaneous speech. Instead, speech summarization, which extracts important information and removes redundant and incorrect information, is well suited to processing spontaneous speech.

4 Introduction (SAP2004)

5 Sentence Extraction Methods Review of SAP2004 – Important sentence extraction: a score is calculated for each sentence W: S(W) = (1/N) Σ_{i=1..N} { λ_L L(w_i) + λ_I I(w_i) + λ_C C(w_i) } – N: number of words in the sentence W; L(w_i), I(w_i) and C(w_i) are the linguistic score (trigram), the significance score (nouns, verbs, adjectives and out-of-vocabulary (OOV) words) and the confidence score of word w_i, respectively; the λ's are weighting factors. – The SAP2004 experiments showed that the significance score is more effective than the L and C scores, so this paper simply uses the significance score.

6 Sentence Extraction Methods Review of SAP2004 – Significance score: measures the amount of information carried by content words (nouns, verbs, adjectives and out-of-vocabulary (OOV) words), based on word occurrence in a corpus: I(w_i) = f_i log(F_A / F_i), where f_i is the number of occurrences of w_i in the presentation, F_i its number of occurrences in a large corpus, and F_A the total number of content words in that corpus. – Important keywords are weighted up, and words unrelated to the original content, such as recognition errors, are de-weighted by this score.
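The significance score can be sketched in a few lines of Python. This is an illustrative implementation (function and parameter names are my own), assuming the standard amount-of-information form I(w) = f_w log(F_A / F_w):

```python
import math
from collections import Counter

def significance_scores(doc_words, corpus_counts, corpus_total):
    """Amount-of-information score I(w) = f_w * log(F_A / F_w):
    f_w = occurrences of w in this presentation,
    F_w = occurrences of w in a large corpus,
    F_A = corpus_total = total content-word count of that corpus."""
    doc_counts = Counter(doc_words)
    scores = {}
    for w, f in doc_counts.items():
        F_w = corpus_counts.get(w, 1)  # floor unseen words at one occurrence
        scores[w] = f * math.log(corpus_total / F_w)
    return scores
```

In practice the input would be restricted to content words (nouns, verbs, adjectives, OOV) before scoring, and sentence scores would be sums of these word scores.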

7 Sentence Extraction Methods Extraction using latent semantic analysis (LSA) – Equivalent to "Generic Text Summarization Using Relevance Measure and Latent Semantic Analysis" (SIGIR 2001) – Apply singular value decomposition to the term-by-sentence matrix: A = U Σ V^T – For each of the top singular vectors, select the sentence with the largest index value in the corresponding right singular vector (column vector of V)
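The LSA-based selection can be sketched with NumPy. This is a minimal illustration (names are my own), assuming a precomputed term-by-sentence matrix and one extracted sentence per top singular vector:

```python
import numpy as np

def lsa_extract(term_sentence_matrix, num_sentences):
    """Gong & Liu-style extraction: decompose the term-by-sentence matrix
    A = U S Vt, then for each of the top singular vectors select the
    sentence with the largest magnitude in the corresponding right
    singular vector (row of Vt)."""
    _, _, vt = np.linalg.svd(term_sentence_matrix, full_matrices=False)
    selected = []
    for row in vt:  # rows are ordered by decreasing singular value
        for idx in np.argsort(-np.abs(row)):
            if int(idx) not in selected:  # one new sentence per vector
                selected.append(int(idx))
                break
        if len(selected) == num_sentences:
            break
    return selected
```

Each singular vector roughly corresponds to one salient topic, so this picks the most representative sentence of each of the top topics in turn.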

8 Sentence Extraction Methods Extraction using dimension reduction based on SVD – K = 5

9 Sentence Extraction Methods Extraction using sentence location – Human subjects tend to extract sentences from the first and the last segments under the condition of a 10% summarization ratio, whereas there is no such tendency at a 50% summarization ratio.

10 Sentence Extraction Methods Extraction using sentence location – The introduction and conclusion segments are estimated based on the Hearst method (1997) using sentence cohesiveness, which is measured by the cosine value between content-word frequency vectors consisting of more than a fixed number of content words.
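The cohesiveness measure is a plain cosine between frequency vectors. A minimal illustration with invented names, ignoring the fixed-minimum-content-word condition:

```python
import math
from collections import Counter

def cohesiveness(block_a, block_b):
    """Cosine similarity between the content-word frequency vectors of
    two adjacent blocks of sentences; a dip suggests a topic boundary."""
    ca, cb = Counter(block_a), Counter(block_b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```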

11 Sentence Extraction Methods The introduction/conclusion boundary is set at the first segmentation boundary found from the beginning or the end of the presentation speech.

12 Objective Evaluation Metrics Summarization accuracy (SumACCY) – In SAP2004, summarization accuracy was measured by comparison with the closest word string extracted from a word network of manual summaries. – This metric works reasonably well at relatively high summarization ratios such as 50%, but has problems at low summarization ratios such as 10%, since the variation between manual summaries is so large that the network accepts inappropriate summaries.

13 Objective Evaluation Metrics Summarization accuracy – Therefore, this paper also investigates the word accuracy obtained by using each manual summary individually (SumACCY-E). – SumACCY-E/max: the largest score over the human summaries; equivalent to the NrstACCY proposed in [C. Hori et al., 2004, ACL]. – SumACCY-E/ave: the average score over the human summaries.
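SumACCY-E can be sketched as word accuracy against each manual summary. An illustrative implementation (names are my own) using a standard word-level edit-distance alignment:

```python
def word_accuracy(hyp, ref):
    """1 - (word edit distance)/(reference length), computed with a
    standard dynamic-programming alignment."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return 1.0 - d[m][n] / m

def sumaccy_e(hyp, manual_summaries, mode="max"):
    """SumACCY-E/max or SumACCY-E/ave over a set of manual summaries."""
    accs = [word_accuracy(hyp, ref) for ref in manual_summaries]
    return max(accs) if mode == "max" else sum(accs) / len(accs)
```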

14 Objective Evaluation Metrics Sentence recall/precision – Since sentence boundaries are not explicitly indicated in the input speech, the solution of [T. Kitade et al., 2004] is adopted: – Extraction of a sentence in the recognition result is counted as extraction of one or more sentences in the manual summary when they overlap by 50% or more of their words. – Sentence recall/precision is then measured by the largest (F-measure/max) or the average (F-measure/ave) of the F-measures over the manual summaries.
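The F-measure can be sketched as follows. This illustrative version (names are my own) assumes the 50%-word-overlap mapping from recognized sentences to manual-summary sentences has already been applied, so both sides are plain sets of sentence identifiers:

```python
def sentence_f_measure(extracted_ids, reference_ids):
    """F-measure of sentence recall/precision, with both summaries given
    as sets of sentence identifiers."""
    extracted, reference = set(extracted_ids), set(reference_ids)
    hits = len(extracted & reference)
    if hits == 0:
        return 0.0
    recall = hits / len(reference)
    precision = hits / len(extracted)
    return 2 * recall * precision / (recall + precision)
```

F-measure/max and F-measure/ave would then be the max or mean of this value over all manual summaries.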

15 Objective Evaluation Metrics ROUGE-N
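ROUGE-N is n-gram recall against reference summaries. A minimal single-reference illustration (names are my own), using clipped n-gram counts:

```python
from collections import Counter

def rouge_n(candidate, reference, n=2):
    """ROUGE-N: n-gram recall of the candidate summary against one
    reference summary (clipped overlap / reference n-gram count)."""
    def ngrams(words):
        return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total
```

With multiple references, counts are aggregated over all reference summaries before dividing.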

16 Experiments Experimental conditions – 30 presentations by 20 males and 10 females in the CSJ were automatically summarized at a 10% summarization ratio. – Mean word recognition accuracy was 69%. – Sentence boundaries in the recognition results were automatically determined using language models, which achieved 72% recall and 75% precision. – Six sentence extraction techniques were compared: significance score (SIG); latent semantic analysis (LSA); dimension reduction based on SVD (DIM); SIG combined with the sentence-location method (SIG+IC); LSA combined with it (LSA+IC); and DIM combined with it (DIM+IC).

17 Experiments Subjective evaluation – 180 automatic summaries (30 presentations x 6 summarization methods) were evaluated by 12 human subjects. – The summaries were evaluated in terms of ease of understanding and appropriateness as summaries, on a five-level scale: 1 very bad; 2 bad; 3 normal; 4 good; 5 very good. – The subjective evaluation results were converted into factor scores using factor analysis in order to normalize subjective differences.

18 Experiments Combining each technique with the sentence-location (IC) method significantly improved every summarization method. SIG+IC achieved the best score, but the difference between SIG+IC and DIM+IC was not significant.

19 Experiments Correlation between subjective and objective evaluation results – In order to investigate the relationship between subjective and objective evaluation results, the automatic summaries were evaluated by eight objective evaluation metrics: SumACCY; SumACCY-E/max; SumACCY-E/ave; F-measure/max; F-measure/ave; ROUGE-1 (unigram); ROUGE-2 (bigram); ROUGE-3 (trigram).

20 Experiments (max)

21 Experiments

22 Experiments All the objective metrics correlated with human judgment. If the effect of word recognition accuracy for each sentence is removed, all the metrics except ROUGE-1 yield high correlations. – ROUGE-1 measures overlapping unigrams, which probably explains the correlation between ROUGE-1 and recognition accuracy.

23 Experiments

24 Experiments In contrast with the results averaged over all the presentations, no metric shows a strong correlation for individual presentations. – This is due to the large variation of scores over the whole set of presentations.

25 CONCLUSION This paper has presented several sentence extraction methods for automatic presentation speech summarization and objective evaluation metrics. We have proposed sentence extraction methods using dimension reduction based on SVD and sentence location. Under the condition of a 10% summarization ratio, it was confirmed that the method using sentence location improves summarization results.

26 CONCLUSION Among the objective evaluation metrics, SumACCY, SumACCY-E, F-measure, ROUGE-2 and ROUGE-3 were found to be effective. Although the correlation between the subjective and objective scores averaged over presentations is high, the correlation for each individual presentation is not so high – due to the large variation of scores across presentations.

27 Future Work Future research includes investigation of – other objective evaluation metrics – evaluation of summarization methods containing sentence compaction – producing optimum summarization techniques by employing objective evaluation metrics