Grey Sentiment Analysis

Slides:



Advertisements
Similar presentations
Sentiment Analysis on Twitter Data
Advertisements

Large-Scale Entity-Based Online Social Network Profile Linkage.
#Retweet this: HIV stigma in the twitterverse Miriam Y. Vega, PhD Latino Commission on AIDS & SPSSI UN NGO Abstract: TUAD0301.
Distant Supervision for Emotion Classification in Twitter posts 1/17.
Data Analysis Outline What is an experiment? What are independent vs. dependent variables in an experimental study? What are our dependent measures/variables.
Sarcasm Detection on Twitter A Behavioral Modeling Approach
Pollyanna Gonçalves (UFMG, Brazil) Matheus Araújo (UFMG, Brazil) Fabrício Benevenuto (UFMG, Brazil) Meeyoung Cha (KAIST, Korea) Comparing and Combining.
AUTOMATIC SPEECH CLASSIFICATION TO FIVE EMOTIONAL STATES BASED ON GENDER INFORMATION ABSTRACT We report on the statistics of global prosodic features of.
Sunita Sarawagi.  Enables richer forms of queries  Facilitates source integration and queries spanning sources “Information Extraction refers to the.
Sentiment Lexicon Creation from Lexical Resources BIS 2011 Bas Heerschop Erasmus School of Economics Erasmus University Rotterdam
Sentiment Analysis with a Multilingual Pipeline 12th International Conference on Web Information System Engineering (WISE 2011) October 13, 2011 Daniëlla.
More than words: Social networks’ text mining for consumer brand sentiments A Case on Text Mining Key words: Sentiment analysis, SNS Mining Opinion Mining,
Fuzzy Sets Introduction/Overview Material for these slides obtained from: Modern Information Retrieval by Ricardo Baeza-Yates and Berthier Ribeiro-Neto.
Empirical Methods in Information Extraction Claire Cardie Appeared in AI Magazine, 18:4, Summarized by Seong-Bae Park.
U & I: Users & Information Lab Sept 2008  Alice Oh 
Automated Scoring of Picture- based Story Narration Swapna Somasundaran Chong Min Lee Martin Chodorow Xinhao Wang.
Thanks to Bill Arms, Marti Hearst Documents. Last time Size of information –Continues to grow IR an old field, goes back to the ‘40s IR iterative process.
Designing Ranking Systems for Consumer Reviews: The Economic Impact of Customer Sentiment in Electronic Markets Anindya Ghose Panagiotis Ipeirotis Stern.
A Regression Approach to Music Emotion Recognition Yi-Hsuan Yang, Yu-Ching Lin, Ya-Fan Su, and Homer H. Chen, Fellow, IEEE IEEE TRANSACTIONS ON AUDIO,
14/12/2009ICON Dipankar Das and Sivaji Bandyopadhyay Department of Computer Science & Engineering Jadavpur University, Kolkata , India ICON.
How Useful are Your Comments? Analyzing and Predicting YouTube Comments and Comment Ratings Stefan Siersdorfer, Sergiu Chelaru, Wolfgang Nejdl, Jose San.
TEXT ANALYTICS - LABS Maha Althobaiti Udo Kruschwitz Massimo Poesio.
Sentiment Analysis with Incremental Human-in-the-Loop Learning and Lexical Resource Customization Shubhanshu Mishra 1, Jana Diesner 1, Jason Byrne 2, Elizabeth.
Performance Comparison of Speaker and Emotion Recognition
From Words to Senses: A Case Study of Subjectivity Recognition Author: Fangzhong Su & Katja Markert (University of Leeds, UK) Source: COLING 2008 Reporter:
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
A Simple English-to-Punjabi Translation System By : Shailendra Singh.
Automatic Categorization of Patent Applications Presentation to the 3rd IPC Workshop, WIPO, Feb , The need for automatic categorization of.
2014 Lexicon-Based Sentiment Analysis Using the Most-Mentioned Word Tree Oct 10 th, 2014 Bo-Hyun Kim, Sr. Software Engineer With Lina Chen, Sr. Software.
Lecture: Sentiment Analysis Krista Lagus Statistical Natural Language Processing course at Aalto
Twitter as a Corpus for Sentiment Analysis and Opinion Mining
Multi-Class Sentiment Analysis with Clustering and Score Representation Yan Zhu.
A Document-Level Sentiment Analysis Approach Using Artificial Neural Network and Sentiment Lexicons Yan Zhu.
© NCSR, Frascati, July 18-19, 2002 CROSSMARC big picture Domain-specific Web sites Domain-specific Spidering Domain Ontology XHTML pages WEB Focused Crawling.
More than words: Social network’s text mining for consumer brand sentiments Expert Systems with Applications 40 (2013) 4241–4251 Mohamed M. Mostafa Reporter.
Computational Models of Discourse Analysis Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute.
A Sentiment-Based Approach to Twitter User Recommendation BY AJAY ABDULPUR RAJARAM NIKKAM.
The problem. Psychologically plausible ways of
A SURVEY ON SENTIMENT ANALYSIS
Jonatas Wehrmann, Willian Becker, Henry E. L. Cagnini, and Rodrigo C
Kim Schouten, Flavius Frasincar, and Rommert Dekker
Lecture: Sentiment Analysis
Like It or Not: A Survey of Twitter Sentiment Analysis Methods
Name: Sushmita Laila Khan Affiliation: Georgia Southern University
Emotion propagation in online communities
Just do it. Obey your thirst. Intelligence everywhere.
Approaches to Machine Translation
Evaluating Classifiers
Sentiment analysis algorithms and applications: A survey
University of Rochester
An Artificial Intelligence Approach to Precision Oncology
Social Dynamics: Coding Blame and Emotion in Social Media
Building & Applying Emotion Recognition
Aspect-based sentiment analysis
Artificial Intelligence with Heart: Improving Customer Experience through Sentiment Analysis.
What is Pattern Recognition?
Proportion of Original Tweets
Retrieval of audio testimonials via voice search
Quanzeng You, Jiebo Luo, Hailin Jin and Jianchao Yang
An Inteligent System to Diabetes Prediction
David Cyphert CS 2310 – Software Engineering
Sentiment/opinion analysis
iSRD Spam Review Detection with Imbalanced Data Distributions
Approaches to Machine Translation
Final Project Presentation | CIS3203
How Can Brands Best Understand Their Consumers: Social Listening
Text Mining & Natural Language Processing
Introduction to Sentiment Analysis
NAÏVE BAYES CLASSIFICATION
Measuring The Influence In The News Media’s Narratives
Presentation transcript:

Grey Sentiment Analysis //TODO – know how they were created Liviu-Adrian Cotfas, Camelia Delcea liviu-adrian.cotfas@univ-fcomte.fr liviu.cotfas@ase.ro

Sentiment Analysis Sentiment analysis is a growing area of Natural Language Processing, commonly used to get insights from customer reviews, blogs and more recently from social media messages. It requires a multidisciplinary approach, that combines elements from fields such as linguistics, psychology and artificial intelligence. It is used to determine whether a text expresses a positive, negative or neutral perception, also known as polarity.

“Everyone’s customer will sooner or later express what they want on the Net. […]. You just put your ear to the ground and listen. You probably want to ask questions too but that will be to get details, to fine tune — not to understand the picture, only to understand what particular shade of green the customer is seeing out of 3,500 shades of green.” Source: Sony [http://breakthroughanalysis.com/]

How? Main approaches: Lexicon based; Machine learning based; Hybrid approaches. //ToSay: Lexicon based approaches can exploit linguistics such as the topic of the phrase, while machine learning approaches rely mostly on statistics. Also, a large corpora of examples is needed in order to successfully train machine learning algorithms. The performance of ML based approaches is highly domain specific.

Many Sentiment Analysis Lexicons…

What if? we could combine existing lexicons using grey systems theory, in order to improve sentiment analysis accuracy Validate using a base-line sentiment analysis

Grey Sentiment Lexicons MaxDiff-Twitter-Lexicon + MaxDiff-Twitter-Lexicon VaderSentiment-Lexicon Sentiment140 SentiWordNet SenticNet MPQA-Sentiment-Lexicon Grey Sentiment Lexicons VaderSentiment Lexicon + Sentiment140 Lexicon

Maxdiff-Twitter-Lexicon_-1to1 number of tokens: 1,515 contains: English words, emoticons, sentiment-related acronyms and initialisms (e.g. LOL), commonly used slang (e.g. nah) intensity range: –1 (extremely negative) to 1 (extremely positive) ex: the word "okay" has a positive valence of 0.376, "good" of 0.656, and "great" of 0.734 url: http://www.purl.com/net/lexicons

VaderSentiment Lexicon number of tokens: 7,500 contains: English words, emoticons, sentiment-related acronyms and initialisms (e.g. LOL), commonly used slang (e.g. nah) intensity range: –4 (extremely negative) to 4 (extremely positive) ex: the word "okay" has a positive valence of 0.9, "good" of 1.9, and "great" of 3.1 url: https://github.com/cjhutto/vaderSentiment We kept every lexical feature that had a non-zero mean rating, and whose standard deviation was less than 2.5 as determined by the aggregate of ten independent raters.

Sentiment140 Lexicon number of tokens: 62,468 contains: English words, sentiment-related acronyms and initialisms (e.g. LOL), commonly used slang (e.g. nah) intensity range: –1 (extremely negative) to 1 (extremely positive) ex: the word "okay" has a positive valence of 0.056, "good" of 0.165, and "great" of 0.2354 url: http://www.purl.com/net/lexicons We kept every lexical feature that had a non-zero mean rating, and whose standard deviation was less than 2.5 as determined by the aggregate of ten independent raters.

Comparison between the three lexicons Positive vs Negative terms   Maxdiff-Twitter-Lexicon -1to1 VaderSentiment Lexicon Sentiment140 Lexicon Positive 789 52% 4,171 56% 38,260 61% Negative 726 48% 3,332 44% 24,089 39%

Grey Sentiment Lexicon Steps: normalize the values in all lexicons in the interval [-1,1]; [min, max] Example (using all three lexicons): Compute the difference Maxdiff-Twitter-Lexicon -1to1 VaderSentiment Lexicon Sentiment140 Lexicon Grey Sentiment Lexicon ok 0.376 0.225 0.056 [0.056 - 0.376] good 0.656 0.475 0.165 [0.165 - 0.656] great 0.734 0.775 0.2354 [0.2354 - 0.775]

Grey Sentiment Analysis "I am very upset and disappointed by this iphone update failed backup. @iphone“ [-0.766, -0.4] + [-0.688,-0.44] + [-0.688,-0.334] = [-2.142, -1.174] booster word

Demo http://scopanum.cloudapp.net/greysentimentanalysis :) :| :(

Vander, MaxDiff2, Sentiment 140 Results* Vander Maxdiff Sentiment 140 Vander, MaxDiff Vander, MaxDiff2, Sentiment 140 TruePositive 429 436 324 451 445 TrueNegative 317 341 166 430 432 FalsePositive 188 164 339 75 73 FalseNegative 39 32 144 17 23 Precision 70% 73% 49% 86% Recall 92% 93% 69% 96% 95% Accuracy 77% 80% 50% 91% 90% *obtained on the Sanders dataset: http://www.sananalytics.com/lab/twitter-sentiment/

Conclusions visible improvements in precision, recall and accuracy , when using Grey Lexicons; extremely noisy lexicons can negatively affect the performance of the resulting Grey Lexicon.

Further Research Directions integrate additional lexicons such as SentiWordNet and SenticNet; validate the approach using a state-of-the-art sentiment analysis algorithm; extend the proposed approach for emotion analysis. //ToSay: While knowing the perception of the user is definitely important, analyzing the categories of emotions contained in Twitter messages using emotion analysis can provide even more information, by putting the focus on the actual feelings, such as joy, surprise, sadness or anger.

Source code https://github.com/lcotfas/GreySentimentAnalysis

Thank you!

?