Ontology-Enhanced Aspect-Based Sentiment Analysis

Ontology-Enhanced Aspect-Based Sentiment Analysis
Kim Schouten, Flavius Frasincar, and Franciska de Jong

What is sentiment analysis?
- Many people freely express their opinions on the Web
- Extract sentiment from unstructured text
- Useful:
  - For consumers when making a purchase decision
  - For producers to assess the impact of marketing campaigns, success of product launches, etc.
- Many companies in business analytics include some form of sentiment analysis
  - Lexalytics, Brandwatch, startups: Wonderflow, Vita.io

Why aspect-based?
- Finer level of analysis
- Directly link expressed sentiment to actual topic or aspect
- Useful when topic is not known beforehand, or when there can be multiple topics
- Two main tasks:
  - Aspect detection
  - Sentiment analysis for aspects

Data Snippet
<sentence id="1032695:1">
  <text>Everything is always cooked to perfection , the service is excellent , the decor cool and understated .</text>
  <Opinions>
    <Opinion target="NULL" category="FOOD#QUALITY" polarity="positive" from="0" to="0"/>
    <Opinion target="service" category="SERVICE#GENERAL" polarity="positive" from="47" to="54"/>
    <Opinion target="decor" category="AMBIENCE#GENERAL" polarity="positive" from="73" to="78"/>
  </Opinions>
</sentence>
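As a minimal sketch of how an annotated sentence like the one in the data snippet could be read, the following uses Python's standard `xml.etree.ElementTree`. The function name `read_opinions` and the dictionary layout are assumptions made for illustration, as is treating `target="NULL"` as "no explicit target word"; none of this is taken from the authors' code.

```python
import xml.etree.ElementTree as ET

# The sentence from the data snippet, copied verbatim.
SNIPPET = """<sentence id="1032695:1">
  <text>Everything is always cooked to perfection , the service is excellent , the decor cool and understated .</text>
  <Opinions>
    <Opinion target="NULL" category="FOOD#QUALITY" polarity="positive" from="0" to="0"/>
    <Opinion target="service" category="SERVICE#GENERAL" polarity="positive" from="47" to="54"/>
    <Opinion target="decor" category="AMBIENCE#GENERAL" polarity="positive" from="73" to="78"/>
  </Opinions>
</sentence>"""


def read_opinions(xml_string):
    """Return (sentence text, list of opinion dicts) for one <sentence> element."""
    sentence = ET.fromstring(xml_string)
    text = sentence.findtext("text")
    opinions = []
    for op in sentence.iter("Opinion"):
        target = op.get("target")
        opinions.append({
            # Assumption: "NULL" means the aspect has no explicit target word.
            "target": None if target == "NULL" else target,
            "category": op.get("category"),   # label for aspect detection
            "polarity": op.get("polarity"),   # label for sentiment classification
            "span": (int(op.get("from")), int(op.get("to"))),
        })
    return text, opinions


text, opinions = read_opinions(SNIPPET)
for op in opinions:
    print(op["category"], op["polarity"], op["target"])
```

The `category` attribute is the label for aspect detection and the `polarity` attribute is the label for aspect sentiment classification.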

Task 1: Aspect Detection
- Find the aspect category of each opinion in the data snippet (the category attribute, e.g. FOOD#QUALITY)

Task 2: Sentiment Analysis for Aspects
- Find the sentiment of each aspect in the data snippet (the polarity attribute, e.g. positive)

Motivation and Setup

Why use external knowledge?
- Using external knowledge alleviates the need for data
  - Good for resource-sparse languages / domains
- Use machine learning for its high performance
- Our setup:
  - Use linear SVM
  - Basic linguistic features (lemmas/synsets)
  - Add features and/or adjust feature weights based on external knowledge

Ontology as external knowledge
- An ontology is a formal representation of knowledge, usually created manually by domain experts
- An ontology allows
  - easy sharing of knowledge (e.g. LOD)
  - reasoning to derive (new) facts
- Our ontology describes a part of the restaurant domain
  - 56 sentiment expressions
  - 3 sentiment values
  - 185 target concepts

Ontology Snippet

Setup for Aspect Detection
- Linear SVMs, one classifier per aspect category (13 in total)
- Input features:
  - Basic binary features: lemmas and synsets
  - Binary ontology features:
    - For each concept: include all its types
      When we find “Heineken”, we include the presence of Beer, Drink, Food, and Target in our input feature vector
    - For each sentiment word: include types of its target
      When we find “delicious”, we include the types of its target: Food and Target
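The ontology features described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the superclass chain, the lemma-to-concept lookup, and the names `SUPERCLASS`, `SENTIMENT_TARGET`, `LEMMA_TO_CONCEPT`, `types_of`, and `binary_features` are all hypothetical. The chain is chosen only so that the two examples from the slide come out as stated. Synset features are left out because they would need a WordNet lookup.

```python
# Toy superclass links (assumed for illustration; not the authors' ontology).
SUPERCLASS = {
    "Heineken": "Beer",
    "Beer": "Drink",
    "Drink": "Food",
    "Food": "Target",
}
# Toy link from a sentiment word to the concept it describes (assumed).
SENTIMENT_TARGET = {"delicious": "Food"}
# Toy lexicalizations: lemma -> ontology concept (assumed).
LEMMA_TO_CONCEPT = {"heineken": "Heineken"}


def types_of(concept):
    """Return all superclasses of `concept`, nearest first."""
    types = []
    while concept in SUPERCLASS:
        concept = SUPERCLASS[concept]
        types.append(concept)
    return types


def binary_features(lemmas):
    """Return the set of active binary features for one sentence."""
    features = set()
    for lemma in lemmas:
        features.add("lemma:" + lemma)
        # Concept found: add all of its types.
        if lemma in LEMMA_TO_CONCEPT:
            for t in types_of(LEMMA_TO_CONCEPT[lemma]):
                features.add("onto:" + t)
        # Sentiment word found: add its target and the target's types.
        if lemma in SENTIMENT_TARGET:
            target = SENTIMENT_TARGET[lemma]
            for t in [target] + types_of(target):
                features.add("onto:" + t)
    return features


print(sorted(binary_features(["heineken"])))
print(sorted(binary_features(["delicious"])))
```

Each feature in the returned set would correspond to a 1 in the input vector of every per-category linear SVM, with all other positions 0.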

Setup for Aspect Sentiment Classification
- Single linear multiclass SVM to find positive, neutral, and negative
- Input features:
  - Aspect category
  - Stanford sentiment tool for
    - Overall sentence sentiment value
    - Sentiment value of the phrase that contains aspect (if applicable)
  - Word-based features: lemmas, synsets, and ontology concepts

Setup for Aspect Sentiment Classification
- Instead of binary input features for lemma, synset, and ontology concepts, assign weights to these word-based features
  - Weight = avg. sentiment value * distance factor
- Features get an average sentiment value by checking these three sources for the originating word:
  - Stanford Sentiment tool
  - NRC Canada’s Yelp Restaurant Review sentiment word list
  - Our own ontology
- Feature weights are factored based on grammatical distance from the aspect (if applicable)
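The weighting rule "Weight = avg. sentiment value * distance factor" can be illustrated with a short sketch. The slides do not give the form of the distance factor or say how a word missing from one of the three sources is handled, so both choices below are assumptions: the factor is taken to be 1 / (1 + distance), and sources that do not know the word are skipped when averaging. All function names are hypothetical.

```python
def average_sentiment(scores):
    """Average the scores of the sources that know the word.

    `scores` holds one entry per source; None means the source has no value.
    Assumption: unknown sources are skipped, and 0.0 is used if none know the word.
    """
    known = [s for s in scores if s is not None]
    return sum(known) / len(known) if known else 0.0


def distance_factor(distance):
    """Assumed decay: 1.0 at distance 0, shrinking as the distance grows."""
    return 1.0 / (1.0 + distance)


def feature_weight(scores, distance=None):
    """Weight = average sentiment value * distance factor.

    `distance` is the grammatical distance to the aspect, or None when
    the aspect has no explicit position in the sentence.
    """
    factor = 1.0 if distance is None else distance_factor(distance)
    return average_sentiment(scores) * factor


# Made-up scores for one word: two sources know it, one does not.
print(feature_weight([1.0, 0.5, None], distance=1))
```

With these assumptions, a word scored 1.0 and 0.5 by two sources at distance 1 from the aspect gets weight 0.75 * 0.5.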

Context Dependent Sentiment
- Sentiment values in our ontology are context dependent
- The aspect text is checked for ontology concepts
- The SentimentExpression has to relate to one of these concepts
- Then the positive and negative concepts are translated to a +1 and -1 sentiment score, resp.

Context Dependent Sentiment - Example
<sentence>
  <text>Food was delicious, staff was rude</text>
  <Opinions>
    <Opinion target="Food" category="FOOD#QUALITY" polarity="positive" from="0" to="4"/>
    <Opinion target="staff" category="SERVICE#GENERAL" polarity="negative" from="20" to="25"/>
  </Opinions>
</sentence>
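The context check can be sketched on this example sentence. The ontology fragments below are toy data assumed for illustration (the real ontology is not shown in the slides), and returning 0 when a sentiment expression does not relate to the aspect is likewise an assumption. All names are hypothetical.

```python
# Toy ontology fragments (assumed for illustration).
SUPERCLASS = {"Staff": "Service", "Service": "Target", "Food": "Target"}
LEMMA_TO_CONCEPT = {"food": "Food", "staff": "Staff"}
# Sentiment expression -> (concept it applies to, polarity).
SENTIMENT_EXPRESSIONS = {
    "delicious": ("Food", "positive"),
    "rude": ("Staff", "negative"),
}
POLARITY_SCORE = {"positive": 1, "negative": -1}


def concept_and_types(concept):
    """Return the concept together with all of its superclasses."""
    result = {concept}
    while concept in SUPERCLASS:
        concept = SUPERCLASS[concept]
        result.add(concept)
    return result


def ontology_score(sentiment_word, aspect_text):
    """Return +1 or -1 if the word relates to a concept in the aspect text, else 0."""
    if sentiment_word not in SENTIMENT_EXPRESSIONS:
        return 0
    applies_to, polarity = SENTIMENT_EXPRESSIONS[sentiment_word]
    # Collect the ontology concepts mentioned in the aspect text.
    aspect_concepts = set()
    for token in aspect_text.lower().split():
        if token in LEMMA_TO_CONCEPT:
            aspect_concepts |= concept_and_types(LEMMA_TO_CONCEPT[token])
    return POLARITY_SCORE[polarity] if applies_to in aspect_concepts else 0


for aspect in ("Food", "staff"):
    for word in ("delicious", "rude"):
        print(aspect, word, ontology_score(word, aspect))
```

In this sketch "delicious" contributes +1 only to the Food aspect and "rude" contributes -1 only to the staff aspect, so each sentiment word affects only the aspect it relates to.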

Results

Results – Aspect Detection
Average over 10 runs with 10-fold cross-validation

Aspect Detection   Avg. F1   St.dev.   p-values of two-sided paired t-test (vs. base, +S, +O)
base               0.5749    0.0057
+S                 0.6317    0.0039    <0.0001
+O                 0.6870    0.0026
+SO                0.6981    0.0040

Results – Aspect Sentiment Classification
Average over 10 runs with 10-fold cross-validation

Sentiment Classification   Avg. F1   St.dev.   p-values of two-sided paired t-test (vs. base, +S, +O)
base                       0.7823    0.0079
+S                         0.7862    0.0049    0.0294
+O                         0.7958    0.0069    0.0002, 0.0008
+SO                        0.7995    0.0063    <0.0001

Results
Single run on training and test data

Aspect Detection           In-sample F1 (on training data)   Out-of-sample F1 (on test data)
base                       0.803                             0.5392
+S                         0.896                             0.5728
+O                         0.858                             0.6135
+SO                        0.920                             0.6281

Sentiment Classification   In-sample F1 (on training data)   Out-of-sample F1 (on test data)
base                       0.831                             0.7372
+S                         0.847                             0.7349
+O                         0.863                             0.7479
+SO                        0.884                             0.7527

Data size sensitivity analysis
- Keep the test set the same
- Use only n% of available training data
- Vary n from 10 to 100 in steps of 10
- Performed for both aspect detection and aspect sentiment classification
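The subsampling loop for this analysis can be sketched as below. The slides do not say whether the training subsets are nested or drawn independently for each n. This sketch assumes nested subsets taken from one fixed shuffle, so each larger subset contains the smaller ones. The name `training_subsets` and the `seed` parameter are hypothetical.

```python
import random


def training_subsets(train, seed=0):
    """Yield (n, subset) for n = 10, 20, ..., 100 percent of the training data.

    Assumption: subsets are prefixes of a single shuffled copy, so they are nested.
    """
    rng = random.Random(seed)
    shuffled = list(train)
    rng.shuffle(shuffled)
    for n in range(10, 101, 10):
        size = len(shuffled) * n // 100
        yield n, shuffled[:size]


# Usage with placeholder sentence ids; the test set is not touched here.
for n, subset in training_subsets(range(50)):
    print(n, len(subset))
```

For each subset, a model would be trained on it and scored on the unchanged test set, once for aspect detection and once for aspect sentiment classification.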

Feature analysis
- Investigate the weights assigned by SVM to input features
  - Gives a measure of how important a feature is
- Interpretation of these weights is only feasible with binary input values
  - Hence, it is only performed for the aspect detection task
- Two success stories, one failure

Feature analysis for Aspect Detection
Aspect: DRINK#STYLE_OPTIONS

SVM Weight   Feature
0.369        Ontology Concept: Menu
0.356        Ontology Concept: Drink
0.307        Synset: list
0.265        Synset: enough

Feature analysis for Aspect Detection
Aspect: DRINK#PRICES

SVM Weight   Feature
0.428        Ontology Concept: Price
0.303        Ontology Concept: Drink
0.232        Synset: drink
0.204        Lemma: drink
0.184        Lemma: at
0.180        Lemma: price
0.176        Synset: wine

Feature analysis for Aspect Detection
Aspect: RESTAURANT#GENERAL

SVM Weight   Feature
0.636        Lemma: worth
0.505        Lemma: love
0.487        Lemma: back
0.419        Lemma: wrong
0.406        Lemma: return
0.402        Lemma: up
0.399        Lemma: again
0.368        Lemma: overall
0.360        Lemma: favorite
0.355        Lemma: recommend
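Rankings like the ones in these feature-analysis slides can be produced by sorting features on their learned weight. The sketch below assumes the weights are already available as a mapping from feature name to weight; how they are read out of a trained linear SVM depends on the library used and is not shown. The helper name `top_features` is hypothetical, and the numbers are copied from the DRINK#PRICES slide.

```python
def top_features(weights, k=3):
    """Return the k (feature, weight) pairs with the largest weight, highest first."""
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)[:k]


# Weights copied from the DRINK#PRICES slide.
drink_prices = {
    "Lemma: at": 0.184,
    "Synset: wine": 0.176,
    "Ontology Concept: Drink": 0.303,
    "Lemma: drink": 0.204,
    "Ontology Concept: Price": 0.428,
    "Synset: drink": 0.232,
    "Lemma: price": 0.180,
}

for feature, weight in top_features(drink_prices, k=3):
    print(f"{weight:.3f}  {feature}")
```

Comparing weights this way is meaningful only because the aspect detection features are binary, as noted on the feature analysis slide.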

Conclusions & Future Work

Conclusions
- External knowledge improves aspect-based sentiment analysis
- Especially aspect detection is improved
  - Better overall performance
  - Less data is needed when using ontology
  - Ontology coverage is important, as shown in feature analysis
- Sentiment analysis is less impacted by ontology
  - Could be due to coverage
  - Could be due to already using external knowledge in form of sentiment dictionaries

Future Work
- Include more sentiment expressions
- Improve reasoning
- Investigate best way to combine sentiment and distance information
- Locate aspects within sentence
- Investigate the use of domain relations
- Automate ontology population

Discussion

Bonus slides: Data Analysis

Natural Language Processing

Aspect Statistics

Sentiment Statistics
