Product Review Summarization Ly Duy Khang
Outline
1. Motivation
2. Problem statement
3. Related works
4. Baseline
5. Discussion
1. Motivation (1)
E-commerce is expanding rapidly, and more and more products are sold through online portals (Amazon, eBay, …).
Online product reviews have thus become an important resource:
– Customers can easily share and find opinions about products
– Producers can obtain some degree of feedback on their products
1. Motivation (2)
2. Problem statement
Given a set of reviews of a product, produce an abstractive summary that captures users’ opinions about that product.
3. Related works (1)
Single-document summarization
– Extractive-based approach: sentence score + ranking; machine learning techniques
– Abstractive-based approach: templates; concept hierarchy
3. Related works (2)
Multi-document summarization
– Extractive-based approach: sentence score + ranking + MMR + ordering
– Abstractive-based approach: templates; concept hierarchy; sentence fusion with paraphrasing rules
3. Related works (3)
Sentiment analysis
– Review polarity classification
– PROS/CONS identification
– Mining review opinions: identify product facets; identify the opinion orientation on each facet
4. Baseline (1)
Extractive summary
An integration of Hu and Liu (2004) [6, 7] and the NUS system at DUC 2005 [8]
4. Baseline (2)
4. Baseline (3)
Product facets identification
– Association rule mining: each transaction consists of the nouns/noun phrases from a single sentence; the frequent itemsets are the candidate product facets
– Redundancy pruning: remove single-word facets that are subsumed by a longer facet (e.g. life -> battery life)
– Compactness pruning: remove meaningless multi-word facets, i.e. those whose words do not occur close together in the review sentences
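The facet identification step can be illustrated with a minimal sketch. It is not the baseline's implementation: it enumerates itemsets by brute force rather than running a full Apriori miner, and the redundancy check is a simplified "pure support" count. The function names, the toy transactions, and both thresholds are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support} for itemsets found in at least
    min_support (a fraction) of the transactions."""
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def redundancy_prune(itemsets, transactions, min_pure_count):
    """Drop a single-word facet when it is part of a longer frequent facet
    and rarely appears without that longer facet (simplified pure support)."""
    multi = [set(s) for s in itemsets if len(s) > 1]
    kept = {}
    for s, support in itemsets.items():
        if len(s) == 1:
            word = s[0]
            covered = any(word in m for m in multi)
            # Sentences that contain the word but no longer facet including it.
            pure = sum(
                1
                for t in transactions
                if word in t and not any(word in m and m <= set(t) for m in multi)
            )
            if covered and pure < min_pure_count:
                continue
        kept[s] = support
    return kept


# Hypothetical transactions: the nouns extracted from one sentence each.
transactions = [
    ["battery", "life"],
    ["battery", "life", "camera"],
    ["picture", "quality"],
    ["battery", "life"],
    ["screen"],
]
candidates = frequent_itemsets(transactions, min_support=0.4)
pruned = redundancy_prune(candidates, transactions, min_pure_count=1)
print(sorted(pruned))
```

On this toy data, "battery" and "life" never occur apart from each other, so only the two-word facet survives redundancy pruning. Compactness pruning is not shown because it needs word positions inside each sentence, which the transactions above do not carry.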
4. Baseline (4)
Sentiment classification
– WordNet is used to grow seed lists of positive (+) and negative (-) adjectives
– Adjectives share the same orientation as their synonyms and the opposite orientation to their antonyms
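The seed-growing idea can be sketched as a breadth-first propagation over synonym and antonym links. To stay self-contained, the sketch below replaces WordNet with two tiny hand-made dictionaries; `SYNONYMS`, `ANTONYMS`, and `grow_orientations` are hypothetical names, and a real system would look the relations up in WordNet instead.

```python
from collections import deque

# Hypothetical miniature lexicon standing in for WordNet lookups.
SYNONYMS = {
    "good": ["great", "nice"],
    "great": ["excellent"],
    "bad": ["poor"],
    "poor": ["terrible"],
}
ANTONYMS = {
    "good": ["bad"],
    "excellent": ["terrible"],
}


def grow_orientations(seeds, synonyms, antonyms):
    """Propagate +1/-1 orientations from seed adjectives:
    synonyms inherit the sign, antonyms get the opposite sign."""
    orientation = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        sign = orientation[word]
        neighbours = [(s, sign) for s in synonyms.get(word, [])]
        neighbours += [(a, -sign) for a in antonyms.get(word, [])]
        for other, other_sign in neighbours:
            if other not in orientation:  # first assignment wins
                orientation[other] = other_sign
                queue.append(other)
    return orientation


lexicon = grow_orientations({"good": +1}, SYNONYMS, ANTONYMS)
print(lexicon)
```

Starting from the single seed "good", the propagation labels "great", "nice", and "excellent" as positive and "bad", "poor", and "terrible" as negative. Letting the first assignment win is a design simplification; conflicting evidence for the same word is not resolved here.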
4. Baseline (5)
Labeling reviews with facets and polarity
– The unit of labeling is the sentence
– Summing the polarities of the opinion words in a sentence yields the polarity of the whole sentence
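A minimal sketch of sentence-level labeling, under stated assumptions: the facet list and the polarity lexicon are given as plain Python values, tokenization is a simple regular expression, and negation is ignored. The name `label_sentence` and the example data are hypothetical.

```python
import re


def label_sentence(sentence, facets, lexicon):
    """Return (facets mentioned, summed polarity score, polarity label)."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    padded = " " + " ".join(tokens) + " "
    # Match whole words only, so "life" does not match "lifetime".
    found = [f for f in facets if f" {f} " in padded]
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        polarity = "positive"
    elif score < 0:
        polarity = "negative"
    else:
        polarity = "neutral"
    return found, score, polarity


# Hypothetical inputs for illustration.
facets = ["battery life", "screen"]
lexicon = {"great": +1, "poor": -1, "dim": -1}
sentence = "The battery life is great, but the screen is poor and dim."

facets_found, score, polarity = label_sentence(sentence, facets, lexicon)
print(facets_found, score, polarity)
```

The example sentence praises one facet and criticizes another, yet it receives a single negative label that is attached to both facets. This is the kind of case where a sentence-level unit loses information.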
4. Baseline (6)
Summary generation
– Sentences are clustered based on their labels
– For each facet, we produce a summary: sentences are scored by concept link similarity, then ranked with MMR (maximal marginal relevance)
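The MMR ranking step can be sketched as a greedy selection that trades relevance against redundancy with the sentences already chosen. In this sketch the relevance scores are hypothetical numbers standing in for concept link similarity, and word-overlap (Jaccard) similarity is swapped in as the redundancy measure; neither is claimed to match the baseline system.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def mmr_rank(sentences, relevance, k, lam=0.7, sim=jaccard):
    """Greedily pick k sentences maximizing
    lam * relevance - (1 - lam) * max similarity to already selected ones."""
    selected = []
    remaining = list(range(len(sentences)))
    while remaining and len(selected) < k:

        def mmr_score(i):
            redundancy = max(
                (sim(sentences[i], sentences[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [sentences[i] for i in selected]


# Hypothetical sentences labeled with the same facet, with made-up relevance scores.
sentences = [
    "the battery life is great",
    "the battery life is really great",
    "battery drains fast when recording video",
]
relevance = [0.9, 0.85, 0.6]

summary = mmr_rank(sentences, relevance, k=2, lam=0.5)
print(summary)
```

With `lam=0.5`, the second sentence is skipped even though it is more relevant than the third, because it nearly duplicates the first one already in the summary.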
5. Discussion (1)
Evaluation
– We plan to carry out a human evaluation.
5. Discussion (2)
The baseline:
– Inherits all the problems of extractive summarization
– Uses the sentence as its unit, which is too coarse-grained
– Does not address the relationships between facets
References
[1] V. Hatzivassiloglou, J. L. Klavans, M. L. Holcombe, R. Barzilay, M. Y. Kan, and K. R. McKeown. SimFinder: A Flexible Clustering Tool for Summarization. Machine Learning.
[2] R. Barzilay, K. R. McKeown, and M. Elhadad. Information fusion in the context of multidocument summarization. Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics.
[3] I. Mani and M. T. Maybury. Advances in automatic text summarization.
[4] R. Mooney and G. DeJong. Learning schemata for natural language processing. Strategies for Natural Language Processing.
[5] E. Hovy and C. Lin. Automated text summarization in SUMMARIST. Advances in Automatic Text Summarization, 94, 1999.
[6] M. Hu and B. Liu. Mining and summarizing customer reviews. Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining.
[7] M. Hu and B. Liu. Mining opinion features in customer reviews. Proceedings of the National Conference on Artificial Intelligence.
[8] S. Ye, L. Qiu, T. S. Chua, and M. Y. Kan. NUS at DUC 2005: Understanding Documents via Concept Links. Document Understanding Conference (DUC05).
[9] X. Ding, B. Liu, and P. S. Yu. A holistic lexicon-based approach to opinion mining. Proceedings of the international conference on Web search and web data mining – WSDM '08, page 231, 2008.