Review-Level Aspect-Based Sentiment Analysis Using an Ontology


Review-Level Aspect-Based Sentiment Analysis Using an Ontology Flavius Frasincar* frasincar@ese.eur.nl * Joint work with Sophie de Kok, Linda Punt, Rosita van den Puttelaar, Karoliina Ranta, and Kim Schouten

Contents Motivation Related Work Data Methodology Evaluation Conclusion Future Work

Motivation Due to the convenience of shopping online there is an increasing number of Web shops Web shops often provide a platform for consumers to share their experiences, which leads to an increasing number of product reviews: In 2014: the number of reviews on Amazon exceeded 10 million Product reviews are used for decision making: Consumers: decide or confirm which products to buy Producers: improve or develop new products, marketing campaigns, etc.

Motivation Reading all reviews is time consuming, therefore the need for automation Sentiment mining is defined as the automatic assessment of the sentiment expressed in text (in our case by consumers in product reviews) Several granularities of sentiment mining: Review-level Sentence-level Aspect-level (product aspects are sometimes referred to as product features): Aspect-Based Sentiment Analysis (ABSA): Review-level [our focus here]

Motivation Aspect-Based Sentiment Analysis (ABSA) has two stages: Aspect detection: Explicit aspect detection: aspects appear literally in product reviews Implicit aspect detection: aspects do not appear literally in the product reviews Sentiment detection: assigning the sentiment associated with explicit or implicit aspects [our focus here] Main problem: In previous work we have proposed an approach to detect the sentiment for an aspect at sentence-level How to find the sentiment for an aspect at review-level?

Main Idea and Evaluation Result Approach: Ontology-Driven Machine Learning (Multi-class classification with ontology-related features) Ontology advantages: Deal with small training data Use axioms to derive implicit information Two solutions: Use a classifier to predict the aspect-based sentiment at review level Use a classifier to predict the aspect-based sentiment at sentence level and aggregate the sentiment

Main Idea and Evaluation Result Collection of restaurant reviews from SemEval 2016 Evaluation result 1: The review-level approach has an F1 of 81.19% on test data The sentence-level approach has an F1 of 77.17% on test data There is a 4.02 percentage point increase in F1 for the review-level classifier compared to the sentence-level classifier Evaluation result 2: With ontology features the review-level approach F1 increases from 80.20% to 81.19% on test data (+0.99 percentage points) With ontology features the sentence-level approach F1 increases from 68.24% to 77.17% on test data (+8.93 percentage points) Using the ontology features both classifiers get a better F1, with a larger increase for the sentence-level classifier than for the review-level classifier

Related Work (Schouten et al., 2017): Uses a sentiment ontology and an SVM for classification Finds ontology concepts associated with review words and related to the considered aspect, and adds superclasses as ontology features No treatment of synonyms and does only sentence-level ABSA (Wei and Gulla, 2010): Uses a Sentiment Ontology Tree (SOT) where aspect nodes form a hierarchy and there are two leaf nodes (positive and negative) for each internal node Learns a classifier for each leaf node (Lau et al., 2009): Uses a sentiment ontology and manually crafted NLP rules for classification

Data SemEval 2016 dataset: restaurant reviews Training set: 335 reviews: 1435 review-aspect pairs 2455 sentence-aspect pairs Test set: 90 reviews: 404 review-aspect pairs 859 sentence-aspect pairs Each review-aspect pair is annotated with sentiment: positive, negative, neutral, or conflict Each sentence-aspect pair is annotated with sentiment: positive, negative, or neutral A sentence or review can contain multiple aspects Task: detect the aspect-based sentiment at review-level

Relative Frequencies of Aspects in Reviews RESTAURANT#GENERAL has a frequency of 100% (present in all reviews)

Relative Frequencies of Sentiment in Reviews Unbalanced sentiment distribution (Positive labels are the most frequent)

Methodology Multi-class classifier: linear SVM (shown to give good results for sentiment analysis in literature) Review-level: 4 classes (positive, negative, neutral, and conflict) Sentence-level: 3 classes (positive, negative, and neutral) SVM implementation: Weka, one-versus-one Data processing: Stanford CoreNLP Toolkit Tokenization Part-of-speech tagging Lemmatization Grammatical dependencies Ontology gazetteering

Ontology Available online: http://www.kimschouten.com/papers/sac2018-ontology.owl (manually created using the training set and three external resources: http://quizlet.com, http://www.macmillandictionary.com/, and https://wordnet.princeton.edu/) Three main classes: Entity class with subclasses (noun aspect hierarchy): Ambience, Experience, Location, Person, Price, Restaurant, Service, StyleOptions, and Sustenance which have their own subclasses The aspect relation links an entity class with its corresponding aspect (e.g., FOOD#QUALITY) Property class (adjectives): Generic properties (e.g., GenericPositiveProperty): general positive or negative properties related to many Entity classes Entity-specific properties (e.g., AmbienceNegativeProperty): specific positive or negative properties for one Entity class (e.g., subclass of Property and subclass of Ambience) Sentiment class with subclasses Positive, Negative, and Neutral

Ontology Context specific sentiment properties: these properties (e.g., Cold) in combination with an entity (e.g., WarmDrinks) imply a subclass of sentiment (e.g., Negative)

Example Let us assume that the word “cramped” appears in text lex.{“cramped”} ⊑ Cramped Cramped ⊑ AmbienceNegativeProperty AmbienceNegativeProperty ⊑ Ambience AmbienceNegativeProperty ⊑ Negative Ambience ⊑ aspect.{“AMBIENCE#GENERAL”} Thus “cramped” implies a negative sentiment about the aspect AMBIENCE#GENERAL
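
The inference chain in this example can be sketched as a transitive walk over superclass links. This is a minimal illustration, not the authors' reasoner: the dictionaries below are a hand-written toy fragment that mirrors only the axioms on this slide, and the names SUPERCLASSES, LEXICON, ASPECT_OF, ancestors, and infer are assumptions made for the sketch.

```python
# Toy fragment of the ontology (assumption: hand-copied from the slide's axioms).
# Each concept maps to its direct superclasses.
SUPERCLASSES = {
    "Cramped": {"AmbienceNegativeProperty"},
    "AmbienceNegativeProperty": {"Ambience", "Negative"},
}
# Lexicalizations: surface word -> ontology concept.
LEXICON = {"cramped": "Cramped"}
# The aspect relation: entity class -> aspect category.
ASPECT_OF = {"Ambience": "AMBIENCE#GENERAL"}
SENTIMENTS = {"Positive", "Negative", "Neutral"}


def ancestors(concept):
    """Return the concept together with all of its transitive superclasses."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUPERCLASSES.get(current, ()))
    return seen


def infer(word):
    """Return (sentiments, aspects) implied by a word, or None if it has no concept."""
    concept = LEXICON.get(word.lower())
    if concept is None:
        return None
    closure = ancestors(concept)
    sentiments = sorted(closure & SENTIMENTS)
    aspects = sorted(ASPECT_OF[c] for c in closure if c in ASPECT_OF)
    return sentiments, aspects


print(infer("cramped"))  # (['Negative'], ['AMBIENCE#GENERAL'])
```

A real OWL reasoner would derive the same conclusion from the published ontology file; the dictionary walk only shows why the subclass axioms are enough for this case.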

Two Algorithms Use a linear SVM classifier to predict the aspect-based sentiment at review level: Four classes: positive, negative, neutral, and conflict Use a linear SVM classifier to predict the aspect-based sentiment at sentence level and aggregate the sentiment: Three classes: positive (1), negative (-1), and neutral (0) Aggregation: Compute the average sentiment for an aspect If both positive and negative sentiment is present for an aspect then the overall sentiment for this aspect is conflict Otherwise: If average sentiment for an aspect is 0 then the overall sentiment for this aspect is neutral Otherwise the overall sentiment for this aspect is given by the sign of the average sentiment for an aspect (positive or negative)
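
The aggregation step of the sentence-based algorithm can be sketched as follows. The function name aggregate and the choice to raise an error on an empty list are assumptions; the rules themselves follow the slide, with sentence labels encoded as 1 (positive), -1 (negative), and 0 (neutral).

```python
def aggregate(sentence_labels):
    """Combine sentence-level labels for one aspect into a review-level label.

    sentence_labels: list of ints in {1, -1, 0}, one per sentence that
    mentions the aspect in the review.
    """
    if not sentence_labels:
        # Assumption: an aspect with no labelled sentence is treated as an error.
        raise ValueError("no sentence-level predictions for this aspect")
    # Both polarities present: the review is in conflict about the aspect.
    if 1 in sentence_labels and -1 in sentence_labels:
        return "conflict"
    average = sum(sentence_labels) / len(sentence_labels)
    if average == 0:
        return "neutral"
    # Otherwise the sign of the average decides.
    return "positive" if average > 0 else "negative"


print(aggregate([1, 0, 1]))   # positive
print(aggregate([1, -1, 1]))  # conflict
print(aggregate([0, 0]))      # neutral
print(aggregate([-1, 0]))     # negative
```

Because the conflict check runs first, an average of 0 can only arise when every sentence is neutral.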

Model Features Feature Generators: create one or more features [Ontology Independent] Aspect: the aspects present in a sentence/review Sentence count: the number of sentences in a review Lemma: the words present in a sentence/review [Ontology Dependent] Ontology concepts: If a concept lexicalization is found in a sentence/review and one of the superclasses relates to the current aspect category then add all superclasses as features Sentiment count: Whenever a concept lexicalization is found and the associated concept is a subclass of Positive or Negative, then increment the respective counter feature (positive or negative)
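
The two ontology-dependent generators can be sketched on a toy ontology. Everything below is an illustrative assumption rather than the authors' code: the ontology fragment (including the hypothetical concepts Tasty and SustenancePositiveProperty and the mapping of Sustenance to FOOD#QUALITY), the function names, and the decision to emit the matched concept itself alongside its superclasses.

```python
from collections import Counter

# Toy ontology fragment (assumption: hand-written for this sketch).
SUPERCLASSES = {
    "Cramped": {"AmbienceNegativeProperty"},
    "AmbienceNegativeProperty": {"Ambience", "Negative"},
    "Tasty": {"SustenancePositiveProperty"},
    "SustenancePositiveProperty": {"Sustenance", "Positive"},
}
LEXICON = {"cramped": "Cramped", "tasty": "Tasty"}
ASPECT_OF = {"Ambience": "AMBIENCE#GENERAL", "Sustenance": "FOOD#QUALITY"}


def ancestors(concept):
    """Return the concept together with all of its transitive superclasses."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUPERCLASSES.get(current, ()))
    return seen


def ontology_features(lemmas, aspect):
    """Return (concept features, sentiment counts) for one text unit and aspect."""
    concepts = Counter()
    counts = {"positive": 0, "negative": 0}
    for lemma in lemmas:
        concept = LEXICON.get(lemma)
        if concept is None:
            continue
        closure = ancestors(concept)
        # Sentiment count: every hit under Positive or Negative bumps a counter.
        if "Positive" in closure:
            counts["positive"] += 1
        if "Negative" in closure:
            counts["negative"] += 1
        # Ontology concepts: only emitted when a superclass relates to the aspect.
        if any(ASPECT_OF.get(c) == aspect for c in closure):
            concepts.update(closure)
    return concepts, counts


lemmas = ["the", "room", "be", "cramped", "but", "the", "food", "be", "tasty"]
concepts, counts = ontology_features(lemmas, "AMBIENCE#GENERAL")
print(sorted(concepts))  # ['Ambience', 'AmbienceNegativeProperty', 'Cramped', 'Negative']
print(counts)            # {'positive': 1, 'negative': 1}
```

In this sketch "tasty" contributes to the positive counter but adds no concept features, because none of its superclasses relates to AMBIENCE#GENERAL.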

Model Features Feature Adaptors: adapt existing features [Ontology Dependent] Ontology concept score: multiplies the ontology concept score by 1 (for superclasses that do not relate to the current aspect category) or m > 1 (for superclasses that do relate to the current aspect category) Negation handling: for the sentiment count, an ontology hit word that has a negation word in front of it negates the sentiment class of the associated concept Synonyms: for the ontology concepts, use the WordNet synonyms in addition to a given concept lexicalization (for a given domain there is in general only one WordNet synset associated with a concept)
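
Two of these adaptors, the ontology concept score and negation handling, can be sketched in a few lines. The negation word list, the POLARITY lookup (a stand-in for the ontology's Positive and Negative subclasses), the function names, and the reading of "in front of" as the immediately preceding token are all assumptions made for illustration.

```python
# Assumption: a small hypothetical negation list; the slides do not give one.
NEGATIONS = {"not", "no", "never", "n't"}
# Assumption: flat polarity lookup standing in for ontology subclass checks.
POLARITY = {"good": "positive", "cozy": "positive", "cramped": "negative"}
FLIP = {"positive": "negative", "negative": "positive"}


def sentiment_counts(tokens):
    """Count positive and negative ontology hits, flipping negated ones."""
    counts = {"positive": 0, "negative": 0}
    for i, token in enumerate(tokens):
        polarity = POLARITY.get(token)
        if polarity is None:
            continue
        # Negation handling: a negation word directly in front flips the class.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = FLIP[polarity]
        counts[polarity] += 1
    return counts


def apply_concept_score(concept_features, related, m=5):
    """Scale concept features by m if related to the current aspect, else by 1."""
    return {
        concept: value * (m if concept in related else 1)
        for concept, value in concept_features.items()
    }


print(sentiment_counts(["the", "food", "was", "not", "good"]))
# {'positive': 0, 'negative': 1}
print(apply_concept_score({"Ambience": 1.0, "Negative": 1.0}, related={"Ambience"}))
# {'Ambience': 5.0, 'Negative': 1.0}
```

The default m=5 is the value the slides report for the sentence-level model.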

Model Features Feature Adaptors: adapt existing features [Ontology Dependent] Weight: for the ontology concepts use the TF-IDF of the associated lexicalization of the superclasses of a found ontology concept Word window: for the feature generators, when a concept lexical representation is found (including synonyms) we use as textual unit (context) the words at most k grammatical dependency steps away
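
The word window adaptor amounts to a bounded breadth-first search over the dependency graph. The sketch below assumes the parse is given as (head, dependent) token-index pairs and that steps may follow an edge in either direction, which the slides do not specify; the example parse is hand-written, not CoreNLP output.

```python
from collections import deque


def word_window(edges, start, k):
    """Return indices of tokens at most k dependency steps from `start`.

    edges: iterable of (head, dependent) token-index pairs.
    Assumption: edges are traversed in both directions.
    """
    adjacency = {}
    for head, dependent in edges:
        adjacency.setdefault(head, set()).add(dependent)
        adjacency.setdefault(dependent, set()).add(head)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue  # do not expand beyond k steps
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return set(distance)


tokens = ["The", "room", "was", "cramped", "but", "the", "pasta", "was", "great"]
# Hypothetical dependency edges for the sentence above.
edges = [(3, 1), (1, 0), (3, 2), (3, 8), (8, 4), (8, 6), (6, 5), (8, 7)]
window = word_window(edges, start=3, k=1)
print([tokens[i] for i in sorted(window)])  # ['room', 'was', 'cramped', 'great']
```

With k = 2, the value reported for the sentence-level model, the window around "cramped" in this toy parse would cover every token except the second "the".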

Evaluation Collection of restaurant reviews from SemEval 2016 We use the average F1 score for 10-fold cross-validation on the training data to determine the parameters and set of features Review-level approach: base (no ontology features): feature generators aspect, sentence count, and lemma final (with ontology features): feature generators aspect, sentence count, lemma, ontology concepts, and sentiment count, and feature adaptors: negation handling, synonyms, and weight For both models the optimized complexity parameter was c = 0.1

Evaluation Review-level approach: The final review-level model (i.e., w/ ontology features) performs better than the base review-level model (i.e., w/o ontology features) for both training and test sets

Evaluation Sentence-level approach: baseSL (no ontology features): feature generators aspect and lemma ontSL (with ontology features): feature generators aspect, lemma, ontology concepts, and sentiment count, and feature adaptors: ontology concept score, negation handling, synonyms, weight, and word window Parameters: The ontology concept score parameter was m = 5 The word window parameter was k = 2 The optimized complexity parameter was c = 1 for baseSL and c = 0.1 for ontSL

Evaluation Sentence-level approach: sentence The ontology-based sentence-level model performs better than the base sentence-level model (i.e., w/o ontology features) for both training and test sets at sentence level

Evaluation Sentence-level approach: review The ontology-based sentence-level model performs better than the base sentence-level model (i.e., w/o ontology features) for both training and test sets at review level The gold value is an upper bound of F1 when using the gold annotations at sentence level

Evaluation SemEval 2016 ranking (on test set):

Evaluation There is a 4.02 percentage point increase in F1 for the review-level classifier compared to the sentence-level classifier Using the ontology features both classifiers get a better F1, with a larger increase for the sentence-level classifier than for the review-level classifier The accuracy difference between the review-level classifier and the best performing SemEval 2016 classifier is less than 1 percentage point

Evaluation Data size sensitivity (on test set): 10 runs on training set The ontology gives better results for all training data sizes The ontology boost does not seem to depend on the training data size

Evaluation Top 10 most important features for the final review-level model based on information gain (feature generator: feature) Most features relate to the dominant class (Negative) The top 80 features with the largest SVM weight are all ontology features such as Negative, Boring, and Cozy

Conclusion We proposed two algorithms for review-level aspect-based sentiment analysis: Review-based algorithm Sentence-based algorithm The review-based algorithm performs better than the sentence-based algorithm The use of ontology features boosts the performance of both algorithms The ontology performance boost does not seem to depend on the size of the training data

Future Work Apply a two-step approach: Use ontology reasoning first If the ontology is inconclusive, apply SVM without ontology features (worked well for sentence-based sentiment analysis, results to be presented at ESWC 2018) Automatic creation of the ontology from text Extend the ontology coverage using word embeddings Extract the strength of a sentiment instead of just the polarity: positive, negative, neutral, and conflict Replace the SVM classifier with a deep learning solution