Sentiment Analysis An Overview of Concepts and Selected Techniques.


1 Sentiment Analysis An Overview of Concepts and Selected Techniques

2 Terms
- Sentiment: a thought, view, or attitude, especially one based mainly on emotion instead of reason
- Sentiment analysis (aka opinion mining): the use of natural language processing (NLP) and computational techniques to automate the extraction or classification of sentiment from typically unstructured text

3 Motivation
- Consumer information
  - Product reviews
- Marketing
  - Consumer attitudes
  - Trends
- Politics
  - Politicians want to know voters' views
  - Voters want to know politicians' stances and who else supports them
- Social
  - Find like-minded individuals or communities

4 Problem
- Which features to use?
  - Words (unigrams)
  - Phrases/n-grams
  - Sentences
- How to interpret features for sentiment detection?
  - Bag of words (IR)
  - Annotated lexicons (WordNet, SentiWordNet)
  - Syntactic patterns
  - Paragraph structure
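The unigram and n-gram feature choices above can be illustrated with a minimal sketch. The function names (`tokenize`, `ngrams`, `bag_of_ngrams`) and the sample sentence are assumptions made for illustration, not part of the original slides.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous n-gram in the token list as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(text, max_n=2):
    """Count all n-grams of length 1..max_n; position in the text is discarded."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


features = bag_of_ngrams("Not good, not good at all")
print(features[("good",)])        # unigram count
print(features[("not", "good")])  # bigram count keeps the negation attached
```

A unigram bag treats "good" the same whether or not "not" precedes it; adding bigrams is one simple way to retain some of that context.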

5 Challenges
- Harder than topical classification, for which bag-of-words features perform well
- Must consider other features due to:
  - Subtlety of sentiment expression
    - Irony
    - Expression of sentiment using neutral words
  - Domain/context dependence
    - Words/phrases can mean different things in different contexts and domains
  - Effect of syntax on semantics

6 Approaches
- Machine learning
  - Naïve Bayes
  - Maximum Entropy Classifier
  - SVM
  - Markov Blanket Classifier
    - Accounts for conditional feature dependencies
    - Allowed reduction of discriminating features from thousands of words to about 20 (movie review domain)
- Unsupervised methods
  - Use lexicons
  - Assume pairwise independent features
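As an illustration of the first machine-learning approach listed, here is a minimal sketch of a multinomial Naïve Bayes classifier with add-one smoothing over unigram counts. The class name, the smoothing choice, and the four toy training documents are assumptions for illustration; none of the cited systems is reproduced here.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over word tokens with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, docs):
        for tokens, label in docs:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def log_score(self, tokens, label):
        # log P(label) + sum of log P(word | label), words assumed independent
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        total_words = sum(self.word_counts[label].values())
        for token in tokens:
            if token not in self.vocab:
                continue  # skip words never seen in training
            count = self.word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def classify(self, tokens):
        return max(self.doc_counts, key=lambda label: self.log_score(tokens, label))


# Toy training data, invented for this example.
classifier = NaiveBayes()
classifier.train([
    ("great fun great cast".split(), "pos"),
    ("fun and moving".split(), "pos"),
    ("dull plot awful acting".split(), "neg"),
    ("awful and dull".split(), "neg"),
])
print(classifier.classify("great fun".split()))
print(classifier.classify("awful plot".split()))
```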

7 LingPipe Polarity Classifier
- First eliminate objective sentences, then use the remaining sentences to classify document polarity (reduce noise)

8 LingPipe Polarity Classifier
- Uses unigram features extracted from movie review data
- Assumes that adjacent sentences are likely to have similar subjective-objective (SO) polarity
- Uses a min-cut algorithm to efficiently extract subjective sentences
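The min-cut step can be sketched as follows, under stated assumptions: each sentence gets an individual subjectivity probability, nearby sentences get an association weight, and a minimum s-t cut separates subjective from objective sentences. This follows the general formulation of the Pang and Lee (2004) paper listed in the references; it is not LingPipe's code. The function names, the Edmonds-Karp max-flow routine, and all scores are illustrative assumptions.

```python
from collections import defaultdict, deque


def min_cut_source_side(capacity, source, sink):
    """Run Edmonds-Karp max flow; return nodes still reachable from the source."""
    residual = defaultdict(lambda: defaultdict(float))
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual[u][v] += cap
            residual[v][u] += 0.0  # ensure the reverse edge exists

    def reachable():
        # Breadth-first search over edges with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable()
        if sink not in parent:
            return set(parent)  # source side of a minimum cut
        # Find the bottleneck along the augmenting path, then push flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]


def extract_subjective(subj_prob, assoc):
    """Return indices of sentences placed on the subjective side of the cut."""
    capacity = defaultdict(dict)
    for i, p in enumerate(subj_prob):
        capacity["s"][i] = p        # cost of labelling sentence i objective
        capacity[i]["t"] = 1.0 - p  # cost of labelling sentence i subjective
    for (i, j), weight in assoc.items():
        capacity[i][j] = weight     # cost of separating sentences i and j
        capacity[j][i] = weight
    side = min_cut_source_side(capacity, "s", "t")
    return sorted(i for i in side if i != "s")


# Invented scores for three sentences; sentences 0 and 1 are strongly associated.
subjective = extract_subjective(
    [0.8, 0.5, 0.1],
    {(0, 1): 1.0, (1, 2): 0.2, (0, 2): 0.1},
)
print(subjective)
```

In this toy input, sentence 1 is ambiguous on its own (0.5), and the strong association with sentence 0 is what pulls it to the subjective side.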

9 LingPipe Polarity Classifier
[Figure: graph for classifying three items.]

10 LingPipe Polarity Classifier
- As accurate as the baseline, but uses only 22% of the content in the test data (on average)
- Metrics suggest properties of movie review structure

11 SentiWordNet
- Based on WordNet "synsets"
  - http://wordnet.princeton.edu/
- Ternary classifier
  - Positive, negative, and neutral scores for each synset
- Provides a means of gauging sentiment for a text

12 SentiWordNet: Construction
- Created training sets of synsets, L_p and L_n
  - Start with a small number of synsets with fundamentally positive or negative semantics, e.g., "nice" and "nasty"
  - Use WordNet relations, e.g., direct antonymy, similarity, derived-from, to expand L_p and L_n over K iterations
  - L_o (objective) is the set of synsets not in L_p or L_n
- Trained classifiers on the training set
  - Rocchio and SVM
  - Use four values of K to create eight classifiers with different precision/recall characteristics
  - As K increases, P decreases and R increases
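The seed-expansion step can be sketched as below. This is a simplified illustration and not the authors' implementation: it assumes that similarity links preserve polarity and antonymy links flip it, and it uses a tiny hand-made relation table in place of WordNet. All names and data are hypothetical.

```python
def expand_seeds(pos_seeds, neg_seeds, similar, antonyms, k):
    """Grow the positive and negative sets for k iterations over toy relations."""
    l_p, l_n = set(pos_seeds), set(neg_seeds)
    for _ in range(k):
        new_p, new_n = set(l_p), set(l_n)
        for term in l_p:
            new_p.update(similar.get(term, ()))   # assumed: similarity keeps polarity
            new_n.update(antonyms.get(term, ()))  # assumed: antonymy flips polarity
        for term in l_n:
            new_n.update(similar.get(term, ()))
            new_p.update(antonyms.get(term, ()))
        l_p, l_n = new_p, new_n
    return l_p, l_n


# Hand-made stand-ins for WordNet relations (hypothetical data).
SIMILAR = {"nice": ["pleasant"], "pleasant": ["agreeable"], "nasty": ["awful"]}
ANTONYMS = {"nice": ["nasty"], "pleasant": ["unpleasant"]}
VOCABULARY = {"nice", "pleasant", "agreeable", "nasty", "awful", "unpleasant", "table"}

for k in (1, 2):
    l_p, l_n = expand_seeds({"nice"}, {"nasty"}, SIMILAR, ANTONYMS, k)
    l_o = VOCABULARY - l_p - l_n  # objective set: everything not reached
    print(k, sorted(l_p), sorted(l_n), sorted(l_o))
```

Each extra iteration reaches terms further from the seeds, which is consistent with the slide's note that recall rises and precision falls as K increases.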

13 SentiWordNet: Results
- 24.6% of synsets with Objective < 1.0
  - Many terms are classified with some degree of subjectivity
- 10.45% with Objective <= 0.5
- 0.56% with Objective <= 0.125
  - Only a few terms are classified as definitively subjective
- Difficult (if not impossible) to accurately assess performance

14 SentiWordNet: How to use it
- Use score to select features (+/-)
  - e.g. Zhang and Zhang (2006) used words in the corpus with a subjectivity score of 0.5 or greater
- Combine pos/neg/objective scores to calculate a document-level score
  - e.g. Devitt and Ahmad (2007) conflated polarity scores with a WordNet-based graph representation of documents to create predictive metrics
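Both uses can be sketched with a toy lexicon. The word-level `LEXICON`, its scores, and the function names are invented for illustration; real SentiWordNet scores are attached to synsets, not bare words. Subjectivity is assumed here to be the positive plus negative score, and the document score is a plain average of (positive minus negative), which is a simplification and not the graph-based method of Devitt and Ahmad.

```python
# Hypothetical word-level scores as (positive, negative); objective = 1 - pos - neg.
LEXICON = {
    "excellent": (0.875, 0.0),
    "poor": (0.0, 0.75),
    "slow": (0.125, 0.25),
    "plot": (0.0, 0.0),
}


def subjectivity(word):
    """Assumed definition: positive score plus negative score."""
    pos, neg = LEXICON.get(word, (0.0, 0.0))
    return pos + neg


def select_features(tokens, threshold=0.5):
    """Keep only tokens whose subjectivity meets the threshold."""
    return [t for t in tokens if subjectivity(t) >= threshold]


def document_score(tokens):
    """Average (positive - negative) over tokens found in the lexicon."""
    scored = [LEXICON[t] for t in tokens if t in LEXICON]
    if not scored:
        return 0.0
    return sum(pos - neg for pos, neg in scored) / len(scored)


tokens = "excellent plot but slow pacing".split()
print(select_features(tokens))
print(document_score(tokens))
```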

15 References
1. http://www.answers.com/sentiment, 9/22/08.
2. B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment classification using machine learning techniques," in Proc. Conf. on Empirical Methods in Natural Language Processing (EMNLP), pp. 79–86, 2002.
3. A. Esuli and F. Sebastiani, "SentiWordNet: A publicly available lexical resource for opinion mining," in Proc. of LREC 2006, 5th Conf. on Language Resources and Evaluation, 2006.
4. E. Zhang and Y. Zhang, "UCSC on TREC 2006 Blog Opinion Mining," TREC 2006 Blog Track, Opinion Retrieval Task.
5. A. Devitt and K. Ahmad, "Sentiment polarity identification in financial news: A cohesion-based approach," ACL 2007.
6. B. Pang and L. Lee, "A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts," in Proc. of the 42nd Annual Meeting of the Association for Computational Linguistics, p. 271-es, July 21-26, 2004.

