Exploring Lexico-Semantic Patterns for Aspect-Based Sentiment Analysis


Exploring Lexico-Semantic Patterns for Aspect-Based Sentiment Analysis
Flavius Frasincar* (frasincar@ese.eur.nl)
* Joint work with Frederique Baas, Olivier Bus, Alexander Osinga, Nikki van de Ven, Steffie van Loenhout, Lisanne Vrolijk, and Kim Schouten

Contents: Motivation, Related Work, Data, Methodology, Evaluation, Conclusion (including Future Work)

Motivation
Due to the convenience of shopping online, there is an increasing number of Web shops.
Web shops often provide a platform for consumers to share their experiences, which leads to an increasing number of product reviews: in 2014, the number of reviews on Amazon exceeded 10 million.
Product reviews are used for decision making:
- Consumers: decide or confirm which products to buy
- Producers: improve existing or develop new products, marketing strategies, etc.

Motivation
Reading all reviews is time consuming, hence the need for automation.
Sentiment mining is defined as the automatic assessment of the sentiment expressed in text (in our case, by consumers in product reviews).
Several granularities of sentiment mining:
- Review-level
- Sentence-level
- Aspect-level (product aspects are sometimes referred to as product features): Aspect-Based Sentiment Analysis (ABSA) at the sentence level [our focus here]

Motivation
Aspect-Based Sentiment Analysis (ABSA) has two stages:
- Aspect detection:
  - Explicit aspect detection: aspects appear literally in the product reviews
  - Implicit aspect detection: aspects do not appear literally in the product reviews
- Sentiment detection: assigning the sentiment associated with explicit or implicit aspects [our focus here]
Main problem: which lexico-semantic patterns provide the best features for an SVM-based ABSA sentiment detection algorithm?

Main Idea and Evaluation Result
People use fixed patterns to express their opinions.
We investigated five classes of lexico-semantic patterns:
- Lexical
- Syntactic
- Semantic
- Sentiment
- Hybrid
Evaluated on collections of restaurant and laptop reviews from SemEval 2015.

Main Idea and Evaluation Result
Some of the selected patterns: synset bigram, negator-POS bigram, POS bigram.
Compared to an SVM with a bag-of-words (BoW) baseline:
- On the restaurant test data, F1 increased from 63.6% to 69.0%
- On the laptop test data, F1 increased from 70.0% to 73.1%

Related Work
Lexical features:
- Use unigram features (Kiritchenko, 2014)
- Use the surface context (the n words surrounding the aspect) and possibly assign weights based on the distance to the aspect (Brychcin, 2014)
Syntactic features:
- Use POS tags and POS sequences (Koto and Adriani, 2015)
- Assign higher weights to some POS bigrams, e.g., adjective+noun (Das and Balabantaray, 2014)
Semantic features:
- Use synsets for sentiment analysis (Schouten and Frasincar, 2015)
- Use dependency relations to propagate sentiment from synset to synset (Poria et al., 2014)

Methodology
Linear SVM:
- Robust in high-dimensional spaces
- Performs well even for small samples
- Allows interpretation of the weights
Multi-class classification: positive, negative, and neutral (one-vs-all strategy).
Aspects are given and have a surface context:
- Explicit aspects: two parameters, the number of words before the target and the number of words after the target (window size)
- Implicit aspects: use all words in the sentence
Hyper-parameters (C and window size) are optimized and features are selected using 10-fold cross-validation on the training dataset.
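
A minimal sketch of this setup follows, assuming scikit-learn; the toy feature dictionaries and the small C grid are hypothetical stand-ins for the binary pattern features and the hyper-parameter search described above.

```python
# Minimal sketch of the classification setup (not the authors' code):
# a linear SVM over sparse binary pattern features, with C tuned by
# cross-validation on the training data only. Toy data are hypothetical.
from sklearn.feature_extraction import DictVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# One dict of binary pattern features per aspect instance (word unigrams here).
train_feats = [
    {"w=not": 1, "w=good": 1},
    {"w=terrible": 1, "w=service": 1},
    {"w=great": 1, "w=food": 1},
    {"w=delicious": 1},
    {"w=ok": 1},
    {"w=average": 1, "w=place": 1},
]
train_labels = ["negative", "negative", "positive", "positive", "neutral", "neutral"]

vec = DictVectorizer()
X = vec.fit_transform(train_feats)

# LinearSVC uses a one-vs-rest strategy for the three sentiment classes;
# the slide tunes C with 10-fold cross-validation, reduced to 2 folds here
# so the toy data suffice.
grid = GridSearchCV(LinearSVC(), {"C": [0.01, 0.1, 1, 10]},
                    cv=2, scoring="f1_micro")
grid.fit(X, train_labels)
print(grid.best_params_, grid.best_score_)
```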

Lexico-Semantic Patterns
Lexical: word unigrams, bigrams, trigrams, and quadgrams (larger n-grams are too sparse).
Syntactic: POS unigrams, bigrams, trigrams, and quadgrams (based on the Stanford POS tagger).
Semantic: synset unigrams and synset bigrams (based on an adapted Lesk algorithm for word sense disambiguation).
Example of synset bigrams: “small#JJ#1” + “device#NN#1” in general denotes positive sentiment, but “small#JJ#1” + “meal#NN#1” in general denotes negative sentiment.
For synset bigrams the order of the synsets was discarded (to reduce sparsity).
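
To make the semantic patterns concrete, here is a minimal sketch assuming NLTK, where NLTK's simplified Lesk implementation stands in for the adapted Lesk algorithm mentioned above; the example sentence is made up.

```python
# Minimal sketch of synset bigram extraction (assumption: NLTK's simplified
# Lesk stands in for the adapted Lesk algorithm mentioned on the slide).
# Requires the NLTK data packages punkt, averaged_perceptron_tagger, wordnet.
from nltk import pos_tag, word_tokenize
from nltk.wsd import lesk

sentence = "The small meal was served on a huge plate"
tokens = word_tokenize(sentence)

synsets = []
for token, tag in pos_tag(tokens):
    wn_pos = {"J": "a", "N": "n", "V": "v", "R": "r"}.get(tag[0])
    if wn_pos:
        sense = lesk(tokens, token, wn_pos)      # word sense disambiguation
        if sense:
            synsets.append(sense.name())         # e.g. 'small.a.01'

# Consecutive pairs of disambiguated senses; order is discarded (frozenset)
# to reduce sparsity, as described on the slide.
synset_bigrams = {frozenset(pair) for pair in zip(synsets, synsets[1:])}
print(synset_bigrams)
```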

Lexico-Semantic Patterns
Sentiment: sentisynset unigram and negator-sentisynset bigram (based on SentiWordNet, with sentiment = positive_score - negative_score; negator words that flip the sentiment are taken from the General Inquirer). These are the only non-binary features.
Hybrid: negator-POS bigram and synset-POS bigram.
The negator-POS bigram generalizes over the word bigram (e.g., “not good” and “not tasty” are both generalized as negator-adjective) and specializes over the POS bigram (“not good”, “not tasty”, and “very delicious” are all adverb-adjective bigrams, but their sentiment is very different).
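
A minimal sketch of the sentisynset value, assuming NLTK's SentiWordNet reader; treating negation as a simple sign flip, and the short negator list, are illustrative assumptions rather than the exact scheme above.

```python
# Minimal sketch (assumptions: NLTK's SentiWordNet corpus reader, and a simple
# sign flip for negated synsets). Requires the NLTK data packages wordnet and
# sentiwordnet. The small negator set stands in for the General Inquirer list.
from nltk.corpus import sentiwordnet as swn

NEGATORS = {"not", "never", "no", "n't"}

def sentisynset_value(synset_id, preceding_word=None):
    s = swn.senti_synset(synset_id)          # e.g. 'good.a.01'
    value = s.pos_score() - s.neg_score()    # the only non-binary feature value
    if preceding_word in NEGATORS:           # negator-sentisynset bigram
        value = -value
    return value

print(sentisynset_value("good.a.01"))          # positive value
print(sentisynset_value("good.a.01", "not"))   # flipped by the negator
```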

Data
SemEval 2015 provides two datasets: restaurants and laptops. Example:

<sentence id="1486041:2">
  <text>Despite a slightly limited menu, everything prepared is done to perfection, ultra fresh and a work of food art.</text>
  <Opinions>
    <Opinion target="menu" category="FOOD#STYLE_OPTIONS" polarity="negative" from="27" to="31"/>
    <Opinion target="NULL" category="FOOD#QUALITY" polarity="positive" from="0" to="0"/>
    <Opinion target="NULL" category="FOOD#STYLE_OPTIONS" polarity="positive" from="0" to="0"/>
  </Opinions>
</sentence>
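
For reference, a minimal sketch that reads such opinion annotations with the Python standard library; the file name is a placeholder.

```python
# Minimal sketch for reading SemEval-2015 opinion annotations; the file name
# is a placeholder for the actual dataset file.
import xml.etree.ElementTree as ET

tree = ET.parse("ABSA15_Restaurants_Train.xml")
for sentence in tree.iter("sentence"):
    text = sentence.findtext("text")
    for opinion in sentence.iter("Opinion"):
        target = opinion.get("target")       # "NULL" marks an implicit aspect
        category = opinion.get("category")   # e.g. FOOD#STYLE_OPTIONS
        polarity = opinion.get("polarity")   # positive / negative / neutral
        print(category, target, polarity, "|", text[:40])
```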

Data
Restaurants: 254 reviews (1,315 sentences), a mix of explicit and implicit aspects.
Laptops: 277 reviews (1,739 sentences), only implicit aspects (targets are not specified).
Each sentence contains zero, one, or multiple aspects.
Each aspect is labeled as positive, neutral, or negative.
Sentences can have different sentiments for different aspects, and sometimes even for the very same aspect!

Evaluation
Marginal effect of adding one feature versus the majority baseline (always predicting positive), measured using F1.
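
For reference, a minimal sketch of the comparison point, assuming scikit-learn; the labels are hypothetical.

```python
# Minimal sketch of the majority baseline used as the reference point: always
# predict positive and score with F1 (labels here are hypothetical).
from sklearn.metrics import f1_score

y_true = ["positive", "negative", "positive", "neutral", "positive", "negative"]
y_majority = ["positive"] * len(y_true)
print(f1_score(y_true, y_majority, average="micro"))   # micro-averaged F1
```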

Feature Selection
Using forward feature selection:
- Laptop: word unigram, synset bigram, sentisynset unigram, and synset unigram
- Restaurant: word unigram, synset bigram, sentisynset unigram, POS bigram, and negator-POS bigram
Observation: POS-based bigrams are selected for the restaurant dataset but not for the laptop dataset.
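
A minimal sketch of greedy forward selection over feature classes, assuming scikit-learn; build_matrix is a hypothetical helper that stacks the chosen feature classes into one design matrix.

```python
# Minimal sketch of greedy forward feature selection over feature classes
# (not the authors' code); build_matrix is a hypothetical helper that stacks
# the selected feature classes into a single sparse matrix.
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def forward_select(feature_classes, build_matrix, y, cv=10):
    selected, best = [], 0.0
    while True:
        candidates = {}
        for fc in feature_classes:
            if fc in selected:
                continue
            X = build_matrix(selected + [fc])
            candidates[fc] = cross_val_score(LinearSVC(), X, y, cv=cv,
                                             scoring="f1_micro").mean()
        if not candidates:
            break
        fc, score = max(candidates.items(), key=lambda kv: kv[1])
        if score <= best:
            break                 # no remaining class improves cross-validated F1
        selected.append(fc)
        best = score
    return selected, best
```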

Feature Prevalence
Deviation from the expected negative polarity for POS bigrams (if a feature does not discriminate the aspect sentiment, it appears in the same proportion in negative contexts as in the whole dataset).
Legend: DT = determiner, NN = noun, IN = preposition, PRP = personal pronoun, VBD = verb (past tense), RB = adverb, JJ = adjective.
The bigram PRP VBD (e.g., “I thought”) is a good indicator of negative sentiment for restaurants.

Surface Context and Feature Ablation
Optimal surface context: 8 words to the left of the target and 8 words to the right of the target.
Feature ablation (using all features except the analyzed feature):
- The word unigram is important for laptops
- The synset bigram is important for both restaurants and laptops
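
A minimal sketch of the surface context for an explicit aspect, using the 8-word window found optimal above; the token indices are illustrative.

```python
# Minimal sketch: take up to 8 words before and after an explicit target as its
# surface context (the window size found optimal on the slide); for implicit
# aspects the whole sentence would be used instead.
def surface_context(tokens, target_start, target_end, window=8):
    left = tokens[max(0, target_start - window):target_start]
    right = tokens[target_end:target_end + window]
    return left + tokens[target_start:target_end] + right

tokens = ("Despite a slightly limited menu , everything prepared "
          "is done to perfection").split()
print(surface_context(tokens, 4, 5))   # target "menu" occupies token index 4
```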

Feature Weight
SVM with the optimal set of features and optimal hyper-parameters.
Only positive/negative classification is used for interpretation (neutral removed).
Legend: W = word unigram, SS = synset unigram, SSS = sentisynset unigram, * = present tense, ** = past tense.
- The word “soggy” is a strong indicator of negative sentiment for restaurants
- The word “be” is associated with positive sentiment in the present tense and negative sentiment in the past tense
- The word “Dell” is a strong indicator of negative sentiment for laptops
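
A minimal sketch of how such weights can be inspected, assuming scikit-learn; the toy features and labels are hypothetical.

```python
# Minimal sketch of inspecting SVM feature weights (not the authors' code).
# With only the positive and negative classes, as on the slide, each weight's
# sign is directly interpretable. Toy features and labels are hypothetical.
import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

feats = [{"w=soggy": 1}, {"w=slow": 1}, {"w=fresh": 1}, {"w=tasty": 1}]
labels = ["negative", "negative", "positive", "positive"]

vec = DictVectorizer()
X = vec.fit_transform(feats)
clf = LinearSVC().fit(X, labels)

weights = clf.coef_.ravel()                 # binary case: one weight per feature
names = np.array(vec.get_feature_names_out())
order = np.argsort(weights)
print("most negative:", names[order[:2]])   # strong indicators of negative sentiment
print("most positive:", names[order[-2:]])  # strong indicators of positive sentiment
```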

Classification Results
Using the optimal set of features and optimal hyper-parameters.
Majority baseline: always predict positive.
It is harder to predict the less frequent sentiment classes (neutral and negative).
The restaurant dataset is harder: relatively more negative sentiment in the test set than in the training set, fewer instances, and more conflicting sentiments per sentence.
Restaurants: a 5.4 percentage point increase compared to the SVM with BoW.
Laptops: a 3.1 percentage point increase compared to the SVM with BoW.

Classification Results
F1 scores: the restaurant training dataset has relatively more positive sentiment instances than the restaurant test dataset. As a result, the cross-validation scores for positive sentiment on the restaurant training dataset are better than the scores for positive sentiment on the restaurant test dataset (this is not the case for the laptop dataset).

Failure Analysis – Example 1
“It is not a small, compact laptop, but I won’t be traveling via air often, so its size and weight is not a problem for me.”
The implicit feature ‘laptop design’ is annotated as positive but predicted as negative.
Observation 1: the current features cannot deal with double negation (negations are considered only in bigrams).
Observation 2: some people might argue that the sentiment is neutral, underlining the subjectivity involved in sentiment analysis.

Failure Analysis – Example 2
“kinda to light and plastic feeling.”
The implicit feature ‘laptop design’ is annotated as negative but predicted as positive.
Observation 1: there is a spelling error, “to” instead of “too” (undetected by the algorithm).
Observation 2: the spelling error implies a positive sentiment, as “light” carries a positive sentiment in relation to laptops (while “too light” is negative).

Failure Analysis – Examples 1 and 2
The sentiment is user-specific:
- Example 1: the laptop is heavy, but the user does not mind (positive sentiment)
- Example 2: the laptop is light, but the user prefers a heavier one (negative sentiment)
Observation: it is difficult to predict sentiment even when taking domain knowledge into account.

Failure Analysis – Example 3
“The biggie though is the fact that it disconnects from the internet whenever if feels like it, even when the strength bar is filled.”
‘Laptop connectivity’ is annotated as negative but predicted as positive.
Observation: the writing style is very specific here (“whenever it feels like it”); lexico-semantic patterns are unable to capture the negative sentiment.

Conclusion
Lexico-semantic patterns help aspect-based sentiment analysis:
- Laptop: word unigram, synset bigram, sentisynset unigram, and synset unigram (F1 increases to 73.1% from 70.0% with BoW)
- Restaurant: word unigram, synset bigram, sentisynset unigram, POS bigram, and negator-POS bigram (F1 increases to 69.0% from 63.6% with BoW)
Future work:
- Applying a spell-checker before sentiment analysis
- Using concept patterns based on a sentiment domain ontology
- Developing hybrid approaches:
  - Knowledge reasoning (using a sentiment domain ontology)
  - Machine learning (with a focus on deep learning solutions)