Analyzing Sentiment in a Large Set of Web Data while Accounting for Negation AWIC 2011 Bas Heerschop Erasmus School of Economics Erasmus University Rotterdam.


1 Analyzing Sentiment in a Large Set of Web Data while Accounting for Negation
AWIC 2011, January 26, 2011
- Bas Heerschop, Erasmus School of Economics, Erasmus University Rotterdam, basheerschop@gmail.com
- Paul van Iterson, Erasmus School of Economics, Erasmus University Rotterdam, paulvaniterson@gmail.com
- Alexander Hogenboom, Erasmus School of Economics, Erasmus University Rotterdam, hogenboom@ese.eur.nl
- Flavius Frasincar, Erasmus School of Economics, Erasmus University Rotterdam, frasincar@ese.eur.nl
- Uzay Kaymak, Erasmus School of Economics, Erasmus University Rotterdam, kaymak@ese.eur.nl

2 Outline
- Introduction
- Sentiment Analysis
- Framework
- Impact of Accounting for Negation
- Conclusions
- Future Work

3 Introduction
- Need for information monitoring tools for tracking sentiment in today's complex economic systems
- The Web offers an overwhelming amount of textual data containing traces of sentiment
- Existing sentiment analysis approaches are based on word frequencies, yet there is a growing tendency to involve various other aspects of content
- Accounting for negation seems promising, but to what extent does it advance sentiment analysis?

4 Sentiment Analysis (1)
- Sentiment analysis typically focuses on determining the polarity of natural language texts
- Applications: summarizing reviews, determining a general mood (consumer confidence, politics)
- Common approach to sentiment analysis:
  - Creation of a lexicon (list of words and their sentiment scores)
  - Utilization of the lexicon to determine sentiment in text
- Sentiment analysis approaches differ in several distinguishing characteristics

5 Sentiment Analysis (2)
- Lexicon creation: manual, machine learning
- Analysis level: document, sentence, window
- Filtering: topic relevance, subjectivity, part-of-speech
- Syntactical variants: stemming, lemmatization
- Modification: comparison, amplification, negation

6 Framework (1)
- Sentiment lexicon creation and subsequent lexicon-based document scoring
- Optional support for sentiment negation
- Individual words (adjectives only) are assigned scores in the range [-1,1]
- Word scores are used to classify a document as positive (1), neutral (0), or negative (-1)

7 Framework (2)
- Sentiment lexicon creation from a training corpus of documents with document-level sentiment scores
- Retrieve all adjectives from the training corpus
- For each adjective:
  - Calculate sentiment score as the average sentiment score of all documents containing the adjective
  - Weight document scores for the influence of the adjective in the respective documents
  - Document influence is an adjective's frequency in the document in relation to the total frequency of all adjectives in the document
- When accounting for negation, adjective frequencies are corrected for the number of negated occurrences
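The lexicon creation step on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' C# implementation: the function name `build_lexicon`, the input layout, and the choice to normalize by the sum of influences (an influence-weighted average) are assumptions, since the slide does not give the exact formula. The negation correction of adjective frequencies is left out because the slide does not specify how the correction is computed.

```python
from collections import Counter, defaultdict


def build_lexicon(training_docs):
    """Build an adjective -> sentiment score lexicon.

    training_docs: iterable of (adjectives, doc_score) pairs, where
    `adjectives` is the list of adjectives found in one document
    (duplicates kept) and `doc_score` is the document-level sentiment.
    Hypothetical input layout, chosen for illustration only.
    """
    weighted_sum = defaultdict(float)   # sum of influence * document score
    weight_total = defaultdict(float)   # sum of influences per adjective

    for adjectives, doc_score in training_docs:
        counts = Counter(adjectives)
        total = sum(counts.values())
        if total == 0:
            continue  # document without adjectives carries no information
        for adj, freq in counts.items():
            # Influence: the adjective's frequency relative to the total
            # frequency of all adjectives in this document.
            influence = freq / total
            weighted_sum[adj] += influence * doc_score
            weight_total[adj] += influence

    # Assumption: the "average" is an influence-weighted average.
    return {adj: weighted_sum[adj] / weight_total[adj] for adj in weighted_sum}


example_docs = [
    (["good", "good", "bad"], 1.0),
    (["bad"], -1.0),
]
example_lexicon = build_lexicon(example_docs)
```

Because each score is a weighted average of document scores, it stays within the range of the document scores themselves, which fits the [-1,1] range the framework assigns to words.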

8 Framework (3)
- Score documents in the test corpus for their sentiment
- For an arbitrary document:
  - Retrieve all adjectives (duplicates allowed)
  - Retrieve adjectives' sentiment scores from the lexicon
  - Calculate document score as the sum of the adjectives' scores
  - Classify document as positive (score > 0.002), neutral (-0.021 ≤ score ≤ 0.002), or negative (score < -0.021)
- When accounting for negation, negate sentiment scores of negated adjectives
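A minimal sketch of the scoring and classification steps above, using the thresholds stated on the slide. The function names, the `(word, negated)` input layout, and the decision to let adjectives missing from the lexicon contribute a score of 0 are assumptions made for illustration.

```python
POSITIVE_THRESHOLD = 0.002    # from the slide: positive if score > 0.002
NEGATIVE_THRESHOLD = -0.021   # from the slide: negative if score < -0.021


def score_document(adjectives, lexicon, account_for_negation=True):
    """Sum the lexicon scores of a document's adjectives.

    adjectives: list of (word, negated) pairs, duplicates allowed
    (hypothetical layout). Unknown adjectives contribute 0, which is
    an assumption; the slide does not say how they are handled.
    """
    score = 0.0
    for word, negated in adjectives:
        word_score = lexicon.get(word, 0.0)
        if account_for_negation and negated:
            word_score = -word_score  # negate the score of a negated adjective
        score += word_score
    return score


def classify(score):
    """Map a document score to 1 (positive), 0 (neutral), or -1 (negative)."""
    if score > POSITIVE_THRESHOLD:
        return 1
    if score < NEGATIVE_THRESHOLD:
        return -1
    return 0


toy_lexicon = {"good": 0.5, "bad": -0.4}
with_negation = classify(score_document([("good", True)], toy_lexicon))
without_negation = classify(
    score_document([("good", True)], toy_lexicon, account_for_negation=False)
)
```

In this toy case the negated "good" flips the document from positive to negative once negation is taken into account.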

9 Impact of Accounting for Negation
- Corpus of 13,628 Dutch documents on 40 topics, manually classified as positive, neutral, or negative (60% training corpus, 40% testing corpus)
- Determine document-level sentiment with and without taking into account negation
- Implementation in C#, Microsoft SQL Server database, commercial POS tagger (OpenNLP-based)
- Basic negation detection: negation key words that directly precede a sentiment-carrying word
- Accounting for negation yields a 2% increase in precision and recall, with less than 1% of the sentences in our corpus containing negation
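The basic negation detection rule described above (a negation key word directly preceding a sentiment-carrying word) might look like the sketch below. The Dutch key word list, the `"ADJ"` tag name, and the `(token, tag)` input layout are all hypothetical; the slides do not list the key words used or the tag set of the POS tagger.

```python
# Hypothetical negation key words; the actual list is not given in the slides.
NEGATION_WORDS = {"niet", "geen", "nooit"}


def extract_adjectives(tagged_tokens, negation_words=NEGATION_WORDS, adj_tag="ADJ"):
    """Return (adjective, negated) pairs for one sentence.

    tagged_tokens: list of (token, pos_tag) pairs as a POS tagger might
    produce them (assumed layout). An adjective counts as negated only
    when the token directly before it is a negation key word.
    """
    result = []
    for i, (token, tag) in enumerate(tagged_tokens):
        if tag != adj_tag:
            continue
        previous = tagged_tokens[i - 1][0].lower() if i > 0 else None
        result.append((token.lower(), previous in negation_words))
    return result


sentence = [
    ("De", "DET"), ("film", "NOUN"), ("is", "VERB"),
    ("niet", "ADV"), ("goed", "ADJ"),
]
negated_pairs = extract_adjectives(sentence)
```

A window of exactly one token keeps the rule simple, at the cost of missing negations separated from the adjective by other words.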

10 Conclusions
- Recent sentiment analysis approaches consider more and more aspects of content other than just word frequencies
- Even simple ways of accounting for negation already appear to help improve sentiment analysis performance

11 Future Work
- Optimize scope of influence of negation key words
- Account for various degrees of negation
- Experiment with other word types in our semantic lexicon (e.g., adverbs)
- Assess performance on English texts

12 Questions?
Feel free to contact:
Alexander Hogenboom
Erasmus School of Economics, Erasmus University Rotterdam
P.O. Box 1738, 3000 DR, The Netherlands
hogenboom@ese.eur.nl

