Baseline Evaluation: An Empirical Study of the Performance of Machine Learning Algorithms in Short Snippet Sentiment Analysis
Farag Saad (farag.saad@gesis.org), i-KNOW 2014, Graz, Austria, 18-09-2014

Outline
- Introduction
- Sentiment Analysis
- Sentiment Classification
  - Feature extraction
  - Feature weighting
  - Classification algorithms (Binarized Multinomial Naïve Bayes)
- Training data
- Evaluation
  - Classification performance comparison between various classifiers
  - Is feature selection useful for classification?
  - Does up-weighting adjectives improve the classifiers' performance?
- Conclusion

Introduction
With the emergence of Web 2.0, the Internet has become more user interactive: many users generate content daily, and the rich opinions it contains are important. However, user-generated content lacks organization, often has improper structure, and reviews can be long. Automatically mining user-generated content is therefore a very difficult but very important task.

Sentiment Analysis
Sentiment analysis, also known as opinion mining, attempts to identify the opinion or sentiment that a person holds towards an object (Bing Liu, 2010). Our task is, first, to determine whether a piece of text is objective or subjective, e.g. "Yesterday I bought a Nikon camera" is objective text, while "The video capability is truly amazing" is subjective text. Second, it is to detect the text's polarity: positive or negative sentiment.

Sentiment Analysis
- A piece of text can fall between positive and negative, e.g. "With the exception of burst shooting, this camera's performance is excellent"
- The sentiment might be expressed explicitly or implicitly, e.g. "poor picture quality" is explicit sentiment, while "The laptop battery lasted for 3 hours" is implicit sentiment
- The sentiment is domain dependent, e.g. "gangster kills a guy in a fight" bears a negative sentiment, while "fight illness with healthy food" bears a positive sentiment

Sentiment Classification
Consists of three main steps:
- Feature extraction
  - Unigram features
  - Data preprocessing: remove stop words but keep the rest
  - Feature reduction (select only the most useful features)
- Feature weighting
  - Term frequency (TF)
  - Term frequency and inverse document frequency (TF-IDF)
  - Term presence (TP)
  - Part of speech (only adjectives are selected)
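The three weighting schemes above can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's implementation; the function name and the toy tokenized sentences are our own:

```python
import math
from collections import Counter

def weight_vectors(docs):
    """Compute TF, term presence (TP), and TF-IDF weights for tokenized docs."""
    n = len(docs)
    # document frequency: in how many docs each term occurs
    df = Counter(t for doc in docs for t in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)                        # term frequency: raw counts
        tp = {t: 1 for t in tf}                  # term presence: binary indicator
        tfidf = {t: f * math.log(n / df[t]) for t, f in tf.items()}
        weighted.append({"tf": dict(tf), "tp": tp, "tfidf": tfidf})
    return weighted

# Two toy "sentences" echoing the camera examples from the slides
docs = [["truly", "amazing", "camera"],
        ["poor", "picture", "quality", "camera"]]
weights = weight_vectors(docs)
```

Note that a term occurring in every document (here "camera") gets a TF-IDF weight of zero, which is one reason TP and TF-IDF can behave differently from plain TF.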

Sentiment Classification Algorithms
Naïve Bayes (NB) and its variations, Support Vector Machine (SVM), and J48. We will focus on describing the best-performing classifier, the Binarized Multinomial NB.

The Binarized Multinomial NB
Given an unlabeled set of test sentences, where $t_i$ denotes the i-th test sentence and $w_t$ denotes a word within it, and given a set of manually annotated training sentences with their sentiment polarities, where $c_j$ denotes a polarity class:

$P(c_j \mid t_i) = \frac{P(c_j)\, P(t_i \mid c_j)}{P(t_i)}$  (1)

$P(t_i \mid c_j) = \left(\sum_{t=1}^{n} x_t\right)! \prod_{t=1}^{|V|} \frac{P(w_t \mid c_j)^{x_t}}{x_t!}$  (2)

where $x_t$ is the number of occurrences of word $w_t$ in $t_i$ and $|V|$ is the vocabulary size.

The Binarized Multinomial NB
The probability $P(w_t \mid c_j)$ based on a set of documents $D$ is computed as follows:
- Duplicates in each document in $D$ are eliminated: for each word $w_t$ in a document $d_j$, only one instance is kept
- All documents resulting from the first step are concatenated into a single document $d_k$
- The number of occurrences of each $w_t$ in $d_k$ is counted
Laplace smoothing is used to avoid zero estimates:

$P(w_t \mid c_j) = \frac{f(w_t, c_j) + \mu}{f(c_j) + \mu\,|V|}$  (3)
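The steps above can be sketched as a small self-contained classifier. This is a minimal illustration under our own naming, assuming Laplace smoothing with mu = 1; it is not the paper's code:

```python
import math
from collections import Counter, defaultdict

class BinarizedMultinomialNB:
    """Sketch of Binarized Multinomial NB: duplicate words within each
    sentence are dropped before counting (binarization)."""

    def __init__(self, mu=1.0):
        self.mu = mu

    def fit(self, sentences, labels):
        self.vocab = set()
        self.word_counts = defaultdict(Counter)   # binarized counts f(w, c)
        self.class_counts = Counter(labels)
        for tokens, c in zip(sentences, labels):
            for w in set(tokens):                 # keep one instance per sentence
                self.word_counts[c][w] += 1
                self.vocab.add(w)
        self.n = len(labels)
        return self

    def predict(self, tokens):
        best, best_lp = None, float("-inf")
        V = len(self.vocab)
        for c in self.class_counts:
            lp = math.log(self.class_counts[c] / self.n)   # log prior P(c)
            f_c = sum(self.word_counts[c].values())        # total class count f(c)
            for w in set(tokens):
                # smoothed estimate (f(w,c) + mu) / (f(c) + mu*|V|);
                # with mu = 1 this is plain Laplace smoothing
                lp += math.log((self.word_counts[c][w] + self.mu)
                               / (f_c + self.mu * V))
            if lp > best_lp:
                best, best_lp = c, lp
        return best

# Toy training set in the spirit of the review-snippet task
train = [(["truly", "amazing"], "+"), (["amazing", "video"], "+"),
         (["poor", "quality"], "-"), (["poor", "battery"], "-")]
clf = BinarizedMultinomialNB().fit([t for t, _ in train],
                                   [l for _, l in train])
```

Binarization is the only difference from the standard multinomial model: repeated words in one sentence count once, which tends to help on short snippets where repetition carries little extra signal.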

The Training/Test Data
- Data collection from (Blitzer et al., 2007)
- The dataset covers 12 domains, each binary annotated (+/-)
- It consists of reviews for different product types such as books, apparel, health and personal care, magazines, etc.
- The selected products contain an equal number of sentences from positive and negative reviews: each product contributes 1,000 positive and 1,000 negative sentences
- The total number of annotated sentences across all products is 24,000

J. Blitzer, M. Dredze, and F. Pereira. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. ACL 2007.

Evaluation
- Previous studies are inconsistent with regard to algorithm performance
- We designed a binary classification task, where each sentence is a test instance and the target class attribute is either positive or negative
- We trained the selected classifier models to predict the class of unlabeled test instances

Evaluation
Three main experiments:
- A comparison of the classifiers' performance
- The effect of feature selection methods (mainly using Information Gain)
- Does up-weighting adjectives lead to a classification improvement?
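Information Gain for a single binary feature scores how much knowing the feature reduces class uncertainty, IG(C; X) = H(C) - H(C | X). The sketch below is a generic illustration of that formula, not the paper's implementation:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(C) of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_present, labels):
    """IG(C; X) = H(C) - H(C|X) for one binary feature X over class labels C.

    feature_present[i] is True if the feature occurs in sentence i.
    """
    n = len(labels)
    with_f = [l for p, l in zip(feature_present, labels) if p]
    without = [l for p, l in zip(feature_present, labels) if not p]
    # conditional entropy H(C|X), weighting each partition by its size
    h_cond = sum(len(part) / n * entropy(part)
                 for part in (with_f, without) if part)
    return entropy(labels) - h_cond

labels = ["+", "+", "-", "-"]
ig_perfect = information_gain([True, True, False, False], labels)  # predicts class
ig_useless = information_gain([True, False, True, False], labels)  # independent
```

Ranking all candidate features by this score and keeping the top-scoring ones is the usual way IG is applied as a feature selection method.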

Evaluation / Cross-Validation
For each data set, break the data up into 10 folds. For each fold:
- Select the current fold as the test set
- Train the classifier on the rest (9 folds)
Compute the average classification performance over the 10 runs.
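The cross-validation loop above can be sketched as follows. This is a minimal version without shuffling or stratification; `train_fn` and `score_fn` are placeholder callables standing in for any classifier's fit and evaluation routines:

```python
def k_fold_indices(n, k=10):
    """Split range(n) into k contiguous folds (real setups usually shuffle first)."""
    folds, start = [], 0
    base, extra = divmod(n, k)
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, labels, train_fn, score_fn, k=10):
    """Average score over k runs: each fold serves once as the test set,
    and the classifier is trained on the remaining k-1 folds."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for test_idx in folds:
        test_set = set(test_idx)
        train = [(data[i], labels[i]) for i in range(len(data)) if i not in test_set]
        test = [(data[i], labels[i]) for i in test_idx]
        model = train_fn(train)
        scores.append(score_fn(model, test))
    return sum(scores) / k

folds = k_fold_indices(24, 10)
# Dummy run: a trivial "classifier" scored 1.0 on every fold averages to 1.0
avg = cross_validate(list(range(20)), [i % 2 for i in range(20)],
                     train_fn=lambda train: None,
                     score_fn=lambda model, test: 1.0)
```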

Evaluation
F1-measure per domain, for the classifiers NB, MNB/BMNB, SVM, and J48, each under TF, TP, and TF-IDF feature weighting:
- Apparel: 0.747 0.781 0.685 0.828 0.827 0.811 0.699 0.762 0.815 0.736 0.737 0.734
- Baby: 0.706 0.756 0.700 0.797 0.803 0.792 0.655 0.503 0.790 0.693 0.701 0.686
- Books: 0.658 0.697 0.767 0.769 0.763 0.588 0.555 0.768 0.636 0.638 0.656
- Camera & Photo: 0.698 0.728 0.798 0.665 0.820 0.735 0.727
- DVD: 0.682 0.726 0.661 0.773 0.782 0.780 0.597 0.547 0.786 0.672 0.676 0.673
- Electronics: 0.708 0.680 0.777 0.779 0.614 0.678 0.670 0.671
- Health & personal care: 0.707 0.72 0.810 0.800 0.640 0.642 0.808 0.695
- Kitchen & housewares: 0.681 0.739 0.691 0.794 0.690 0.793 0.724 0.717
- Magazines: 0.722 0.821 0.822 0.823 0.621 0.562 0.826 0.778
- Software: 0.630 0.709 0.731 0.806 0.583 0.719 0.713 0.712
- Sport & outdoors: 0.758 0.725 0.801 0.789 0.743
- Video: 0.703 0.745 0.688 0.765 0.647 0.714 0.729
Table 1: The polarity classification results using F1-measure for different classifiers applied on 12 test data domains (the best-performing method for each domain is in bold and underlined).

Evaluation
F1-measure per domain after feature selection, for the classifiers NB, MNB/BMNB, SVM, and J48, each under TF, TP, and TF-IDF feature weighting:
- Apparel: 0.780 0.789 0.746 0.848 0.852 0.839 0.785 0.792 0.826 0.741 0.743
- Baby: 0.760 0.771 0.735 0.819 0.818 0.808 0.769 0.764 0.812 0.716 0.718 0.705
- Books: 0.703 0.709 0.701 0.766 0.777 0.763 0.693 0.695 0.774 0.667 0.664 0.674
- Camera & Photo: 0.744 0.742 0.817 0.828 0.825 0.748 0.739 0.719
- DVD: 0.751 0.715 0.783 0.805 0.702 0.725 0.691 0.685 0.704
- Electronics: 0.720 0.707 0.813 0.806 0.738 0.683
- Health & personal care: 0.757 0.827 0.821 0.824 0.807 0.69
- Kitchen & housewares: 0.736 0.759 0.833 0.829 0.728 0.734
- Magazines: 0.761 0.843 0.849 0.858 0.755 0.753
- Software: 0.670 0.724 0.747 0.815 0.669 0.714 0.713
- Sport & outdoors: 0.778 0.800 0.820 0.781 0.809 0.723 0.706
- Video: 0.749 0.776 0.801 0.721 0.727 0.730
Table 2: The impact of the feature selection method Information Gain (IG) on the classifiers' performance using F1-measure (the best-performing method for each domain is in bold and underlined).

Evaluation
Average improvement from feature selection (IG) by weighting method:
- TF: 6.10%
- TP: 5.62%
- TF-IDF: 2.63%
- Overall average: 4.7%

Conclusion
- We conducted a series of comparative experiments to compare the performance of various machine learning classifiers on the sentiment analysis task
- We studied the impact of feature selection methods on classification performance
- The best classification results were obtained with the BMNB classifier
- Using feature selection methods led to a significant increase in all classifiers' performance across the different feature weighting methods

Conclusion
- Based on the experiments carried out in this paper, our findings raise the possibility that the BMNB classifier performs best in short snippet sentiment analysis
- We further support the recent finding of Wang and Manning (2012*) that, in short snippet sentiment analysis, MNB actually performs better than other classifiers, particularly the SVM classifier

* S. Wang and C. D. Manning. Baselines and bigrams: Simple, good sentiment and topic classification. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Volume 2, ACL '12, pages 90-94, 2012.