Ontology-Driven Sentiment Analysis of Product and Service Aspects

Presentation transcript:

Ontology-Driven Sentiment Analysis of Product and Service Aspects Kim Schouten and Flavius Frasincar

Problem statement
- What sentiment is expressed about which aspect of a given entity?
- Usually we only look at polarity: is it positive, neutral, or negative?
- SemEval-2015/2016 ABSA task data
  - Reviews are split into sentences
  - Sentences are annotated with aspects
  - For each aspect, determine positive/negative/neutral
- Can we do this task using an ontology, and not just as a glorified sentiment lexicon?
- Lexalytics, Brandwatch, startups: Wonderflow, Vita.io

Role of ontology
- Previous work used an ontology to get additional features to improve a classifier
  - Hard to interpret results
- What if we used just the ontology to infer sentiment?
  - Results are 100% explainable
  - Ontology has to be large enough (which it isn’t)
- To compensate for the small ontology size, we also train a bag-of-words classifier
  - Used when the ontology does not provide conclusive evidence:
    - No sentiment words at all
    - Both positive and negative words

Ontology

Purpose of ontology
- Sentiment lexicon
  - Aspect and sentiment concepts have lexicalizations
    - Link classes to words in text
  - High-level aspect concepts have an aspect annotation
    - Link classes to ‘aspect category’ annotations in the data set
  - Sentiment words that are always positive are subclasses of the Positive class
    - Same for Negative; there is no Neutral class
  - Sentiment words that have the same sentiment value regardless of aspect are called type-1 sentiment words in the paper

Data Snippet

<sentence id="1032695:1">
  <text>Everything is always cooked to perfection, the service is excellent, the decor cool and understated.</text>
  <Opinions>
    <Opinion target="NULL" category="FOOD#QUALITY" polarity="positive" from="0" to="0"/>
    <Opinion target="service" category="SERVICE#GENERAL" polarity="positive" from="47" to="54"/>
    <Opinion target="decor" category="AMBIENCE#GENERAL" polarity="positive" from="73" to="78"/>
  </Opinions>
</sentence>
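As a rough illustration of how such an annotated sentence could be read, the sketch below parses the snippet with Python's standard `xml.etree.ElementTree` module and uses the `from`/`to` offsets to recover each target. The function name `read_opinions` is made up for this example, and the sentence text is written without spaces before punctuation so that the character offsets line up with the targets.

```python
import xml.etree.ElementTree as ET

# The sentence from the slide, stored as a string for the example.
SNIPPET = """<sentence id="1032695:1">
  <text>Everything is always cooked to perfection, the service is excellent, the decor cool and understated.</text>
  <Opinions>
    <Opinion target="NULL" category="FOOD#QUALITY" polarity="positive" from="0" to="0"/>
    <Opinion target="service" category="SERVICE#GENERAL" polarity="positive" from="47" to="54"/>
    <Opinion target="decor" category="AMBIENCE#GENERAL" polarity="positive" from="73" to="78"/>
  </Opinions>
</sentence>"""


def read_opinions(xml_string):
    """Return the sentence text and one dict per annotated opinion."""
    sentence = ET.fromstring(xml_string)
    text = sentence.findtext("text")
    opinions = []
    for op in sentence.iter("Opinion"):
        start, end = int(op.get("from")), int(op.get("to"))
        opinions.append({
            "target": op.get("target"),
            "category": op.get("category"),
            "polarity": op.get("polarity"),
            # Empty string for implicit targets (target="NULL", from=to=0).
            "span": text[start:end],
        })
    return text, opinions


text, opinions = read_opinions(SNIPPET)
for o in opinions:
    print(o["category"], o["polarity"], repr(o["span"]))
```

Each opinion carries its own aspect category, so one sentence can yield several aspect-level classification instances.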

Purpose of ontology
- Sentiment lexicon
- Sentiment scope
  - Some sentiment words are only ever used for a single aspect category
  - These classes have this aspect class as an extra superclass
  - Such a sentiment word will not be used to determine sentiment for other aspect categories
  - For example: “noisy” implies the “ambience” aspect in the restaurant domain
  - If the sentence also has the “food” aspect to compute sentiment for, “noisy” will be ignored
  - In the paper, these are called type-2 sentiment words

Purpose of ontology
- Sentiment lexicon
- Sentiment scope
- Context-dependent sentiment
  - The same sentiment word can have a different polarity for different aspects
    - “For such a high price, the quality is indeed high, as expected.”
  - This is modeled in the ontology using class axioms; such sentiment words are referred to as type-3
    - Quality and High SubclassOf Positive
    - Price and High SubclassOf Negative
  - Creating a subclass of both the aspect class and the sentiment class will trigger the axiom
  - The reasoner will infer the right sentiment class
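The effect of the two class axioms can be imitated in a few lines of Python. This is only a toy stand-in for an OWL reasoner: the `AXIOMS` list and the `infer_superclasses` helper are invented for illustration, and a new class is represented simply by the set of its direct superclasses.

```python
# Toy encoding of the two axioms from the slide:
#   Quality and High SubclassOf Positive
#   Price and High SubclassOf Negative
AXIOMS = [
    (frozenset({"Quality", "High"}), "Positive"),
    (frozenset({"Price", "High"}), "Negative"),
]


def infer_superclasses(direct_superclasses):
    """Return the direct superclasses plus every class implied by an axiom."""
    inferred = set(direct_superclasses)
    for conjuncts, implied in AXIOMS:
        # An axiom fires when the class is a subclass of all its conjuncts.
        if conjuncts <= inferred:
            inferred.add(implied)
    return inferred


print(sorted(infer_superclasses({"Quality", "High"})))  # includes "Positive"
print(sorted(infer_superclasses({"Price", "High"})))    # includes "Negative"
```

The same word, "high", thus ends up Positive or Negative depending on which aspect class it is combined with.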

Sentiment classification

Sentiment classification
- The ontology (Ont) method uses a very simple mechanism to compute sentiment
- For each aspect, get all sentiment concepts in the sentence
- For each sentiment concept, check its type
  - If type-1: save its superclasses in a set
  - If type-2: save its superclasses only when the aspect matches
  - If type-3: for each directly related word that is the lexicalization of an aspect class, make a new subclass with both the aspect and the sentiment class as superclasses, and save the superclasses of this new class
- If a negation is found, flip the sentiment
  - Negator word in the preceding 3 words, or a ‘neg’ relation
- The set of superclasses hopefully includes the Positive or Negative class
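A minimal sketch of the type-1 and type-2 handling together with the three-word negation window might look as follows. The lexicon, the negator list, and the function name are all hypothetical; type-3 words and the ‘neg’ dependency relation are left out, since they would need a reasoner and a dependency parser.

```python
NEGATORS = {"not", "no", "never"}  # hypothetical negator list

# Hypothetical toy lexicon: word -> (sentiment type, superclasses)
LEXICON = {
    "excellent": (1, {"Positive"}),
    "terrible": (1, {"Negative"}),
    "noisy": (2, {"Negative", "Ambience"}),
}

FLIP = {"Positive": "Negative", "Negative": "Positive"}


def collect_superclasses(tokens, aspect):
    """Collect the superclasses of all sentiment words relevant to `aspect`."""
    found = set()
    for i, token in enumerate(tokens):
        entry = LEXICON.get(token.lower())
        if entry is None:
            continue
        kind, supers = entry
        # Type-2 words only count for the aspect they are scoped to.
        if kind == 2 and aspect not in supers:
            continue
        supers = set(supers)
        # Flip polarity when a negator occurs in the preceding 3 words.
        window = tokens[max(0, i - 3):i]
        if any(w.lower() in NEGATORS for w in window):
            supers = {FLIP.get(c, c) for c in supers}
        found |= supers
    return found


print(collect_superclasses("The service is not excellent".split(), "Service"))
print(collect_superclasses("The place is noisy".split(), "Food"))
```

In the second call "noisy" is ignored because it is scoped to the Ambience aspect, so the returned set is empty.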

Sentiment classification
- If the set contains only Positive -> predict “positive”
- If the set contains only Negative -> predict “negative”
- If the set contains both or neither, the ontology method cannot decide
  - We experimented with counting Positive and Negative and picking the highest, but that did not improve performance
- When the method is inconclusive, we can do two things:
  - Predict the majority class (Positive) (Ont method in paper)
  - Use a bag-of-words model to predict sentiment (Ont+BoW method in paper)
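The decision rule can be sketched as a small function that takes the collected superclasses and a fallback. The names `decide` and `majority_class` are illustrative; in the Ont+BoW variant the fallback would call a trained bag-of-words classifier instead.

```python
def decide(superclasses, fallback):
    """Predict polarity from the superclass set, deferring when inconclusive."""
    has_positive = "Positive" in superclasses
    has_negative = "Negative" in superclasses
    if has_positive and not has_negative:
        return "positive"
    if has_negative and not has_positive:
        return "negative"
    # Both or neither found: the ontology gives no conclusive evidence.
    return fallback()


def majority_class():
    """Fallback of the plain Ont method: always predict the majority class."""
    return "positive"


print(decide({"Negative", "Ambience"}, majority_class))  # negative
print(decide({"Positive", "Negative"}, majority_class))  # positive (fallback)
print(decide(set(), majority_class))                     # positive (fallback)
```

Passing the fallback as a callable keeps the ontology rule identical for both the Ont and the Ont+BoW setups.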

The bag-of-words model
- Simple model using as features:
  - The presence of words in the whole review
  - The aspect category of the current aspect
  - The sentiment value of the sentence as computed by the CoreNLP sentiment module
- Classifier is the standard Weka SVM model with RBF kernel and optimized hyperparameters
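One way to picture the three feature groups is as a set of string-valued binary features. The sketch below is an assumption-laden illustration: tokenization is plain whitespace splitting, the feature naming scheme is invented, and `sentence_sentiment` is simply passed in by the caller rather than computed by CoreNLP.

```python
def bow_features(review_text, aspect_category, sentence_sentiment):
    """Build a set of binary features for one aspect in one sentence."""
    # Presence (not counts) of each word in the whole review.
    features = {f"word={w}" for w in review_text.lower().split()}
    # The category of the aspect currently being classified.
    features.add(f"category={aspect_category}")
    # Sentence-level sentiment value supplied by an external module.
    features.add(f"sentence_sentiment={sentence_sentiment}")
    return features


feats = bow_features("Great food great service", "FOOD#QUALITY", 3)
print(sorted(feats))
```

A set like this would still have to be mapped to a numeric vector before it can be fed to an SVM.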

The alternative bag-of-words model (BoW+Ont)
- Basic bag-of-words model augmented with ontology features
- Use the ontology method to find the classes for a given aspect
- If the set contains only Positive, add Positive to the feature set
- In this way it has the same information as the two-stage Ont+BoW method
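The augmentation step could be sketched as below. The slide only mentions the Positive case; treating Negative symmetrically is an assumption made here, and the `ont=` feature names are invented.

```python
def add_ontology_features(features, superclasses):
    """Return a copy of `features` extended with the ontology's verdict."""
    augmented = set(features)
    has_positive = "Positive" in superclasses
    has_negative = "Negative" in superclasses
    if has_positive and not has_negative:
        augmented.add("ont=Positive")
    elif has_negative and not has_positive:
        # Assumed symmetric case; not spelled out on the slide.
        augmented.add("ont=Negative")
    return augmented


base = {"word=great", "category=FOOD#QUALITY"}
print(sorted(add_ontology_features(base, {"Positive"})))
print(sorted(add_ontology_features(base, {"Positive", "Negative"})))
```

When the ontology finds both polarities or none, no feature is added, which mirrors the inconclusive case of the two-stage method.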

Results

Sentiment distribution (2016 data set) Use decision tree to find triggers for neutral case.

Results SemEval-2015 data

Method    Out-of-sample accuracy   In-sample accuracy   10-fold cv accuracy   10-fold cv st.dev.
Ont       63.3%                    79.4%                79.3%                 0.0508
BoW       80.0%                    91.1%                81.9%                 0.0510
Ont+BoW   82.5%                    89.9%                84.2%                 0.0444
BoW+Ont   81.5%                    91.7%                83.9%                 0.0453

All differences between the averages are statistically significant, except Ont+BoW vs. BoW+Ont

Results SemEval-2016 data

Method    Out-of-sample accuracy   In-sample accuracy   10-fold cv accuracy   10-fold cv st.dev.
Ont       76.1%                    73.9%                74.2%                 0.0527
BoW       82.0%                    90.0%                81.9%                 0.0332
Ont+BoW   86.0%                    89.3%                84.3%                 0.0319
BoW+Ont   85.7%                    90.4%                83.7%                 0.0370

All differences between the averages are statistically significant

Data size sensitivity analysis (SemEval-2015 data)
- Keep the test set the same
- Use only n% of the available training data
- Set n to 10, 20, ..., 100 (step size 10)

Data size sensitivity analysis (SemEval-2016 data)
- Keep the test set the same
- Use only n% of the available training data
- Set n to 10, 20, ..., 100 (step size 10)
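The sensitivity setup can be sketched as a generator of training subsets. The slides do not say how the n% subset is chosen; shuffling once with a fixed seed and taking a growing prefix is an assumption of this example, as is the name `training_subsets`.

```python
import random


def training_subsets(train, seed=0):
    """Yield (n, subset) for n = 10, 20, ..., 100 percent of the training data."""
    shuffled = list(train)
    random.Random(seed).shuffle(shuffled)  # fixed seed: reproducible subsets
    for n in range(10, 101, 10):
        size = len(shuffled) * n // 100
        # Prefixes are nested, so each larger subset contains the smaller ones.
        yield n, shuffled[:size]


train = [f"review_{i}" for i in range(50)]
for n, subset in training_subsets(train):
    # Here one would train on `subset` and evaluate on the unchanged test set.
    print(n, len(subset))
```

Keeping the test set fixed means that any change in accuracy across n can be attributed to the amount of training data.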

Performance of Ont and BoW per scenario
- Ontology method is only used when it finds just Positive or just Negative
- Bag-of-words model is only used in the remaining cases
- Measure performance for each of the scenarios

SemEval-2016 data     Size     Ontology accuracy   Bag-of-words accuracy
Found only Positive   42.7%    88.1%               83.7%
Found only Negative   9.8%     94.0%               85.5%
Found both            4.3%     47.2%               52.8%
Found none            43.2%    33.4%               77.3%
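The per-scenario figures can be combined into overall accuracies by weighting each scenario by its share of the data. The sketch below assumes that the four shares partition the test set and that the "Ontology accuracy" in the two inconclusive rows reflects the Ont method's majority-class fallback; neither is stated explicitly on the slide.

```python
SCENARIOS = [
    # (scenario, share of data, ontology accuracy, bag-of-words accuracy)
    ("only Positive", 0.427, 0.881, 0.837),
    ("only Negative", 0.098, 0.940, 0.855),
    ("both", 0.043, 0.472, 0.528),
    ("none", 0.432, 0.334, 0.773),
]
CONCLUSIVE = {"only Positive", "only Negative"}

# Pure ontology method: ontology column everywhere.
ont_only = sum(share * ont for _, share, ont, _ in SCENARIOS)

# Two-stage hybrid: ontology when conclusive, bag-of-words otherwise.
hybrid = sum(
    share * (ont if name in CONCLUSIVE else bow)
    for name, share, ont, bow in SCENARIOS
)

print(f"Ont: {ont_only:.1%}  Ont+BoW: {hybrid:.1%}")
```

Under these assumptions the weighted sums come to about 63.3% and 82.5%. These coincide with the Ont and Ont+BoW out-of-sample accuracies listed for the SemEval-2015 data, although this slide is labelled SemEval-2016; the slides do not comment on this.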

Conclusions & Future Work

Conclusions
- Ontology method and bag-of-words method complement each other
  - Good hybrid performance
- Performance of the pure ontology method is low due to lack of coverage
- However, when applicable:
  - Gives good performance
  - Explainable results
  - No training data necessary
- Of course… good domain ontologies do not appear instantly either…

Future Work
- Automate ontology population
  - We currently have a semi-automatic approach that provides suggestions for ontology population
- Include multi-word sentiment expressions
  - “The food here is out of this world.”
- Investigate the best way to combine sentiment and distance information
  - Currently just one step in the dependency graph, but not really investigated
- Improve speed
  - The Jena library rebuilds the full inference model every time a class is added
  - Solved a bit by caching, but not pretty
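The caching idea mentioned under "Improve speed" can be illustrated with generic memoization. This is not the Heracles or Jena code: `expensive_inference` is a hypothetical stand-in for rebuilding the inference model, and the cache key is the frozen set of superclasses of the newly created class.

```python
from functools import lru_cache

STATS = {"inference_calls": 0}  # counts how often the slow path runs


def expensive_inference(superclasses):
    """Hypothetical stand-in for re-running the reasoner on a new class."""
    STATS["inference_calls"] += 1
    inferred = set(superclasses)
    if {"Quality", "High"} <= inferred:
        inferred.add("Positive")
    if {"Price", "High"} <= inferred:
        inferred.add("Negative")
    return frozenset(inferred)


@lru_cache(maxsize=None)
def cached_inference(superclasses):
    """Memoized wrapper; `superclasses` must be hashable (a frozenset)."""
    return expensive_inference(superclasses)


key = frozenset({"Price", "High"})
first = cached_inference(key)
second = cached_inference(key)  # served from the cache
print(sorted(first), STATS["inference_calls"])
```

Because the same aspect and sentiment combinations recur across sentences, repeated lookups avoid the slow path entirely.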

Questions? https://github.com/KSchouten/Heracles