Opinion Mining : A Multifaceted Problem Lei Zhang University of Illinois at Chicago Some slides are based on Prof. Bing Liu’s presentation.



Introduction Most text information processing methods (e.g., web search, text mining) work with factual information and do not deal with opinions. Opinion mining is the computational study of opinions and sentiments expressed in text. Why opinion mining now? Mainly because the Web makes huge volumes of opinionated text available.

Why opinion mining is important Whenever we need to make a decision, we would like to hear others' advice. In the past, individuals asked friends or family, and businesses relied on surveys and consultants. Now there is word of mouth on the Web: people can express their opinions in reviews, forum discussions, blogs, and so on.

A popular problem Intellectually challenging, with major applications. It has been a popular research topic in recent years in NLP (Natural Language Processing) and Web data mining, and many companies in the US work on it. It touches every aspect of NLP and yet is well scoped. Potentially it would be a major application for NLP. But this problem is NOT easy.

An example review “I bought an iPhone a few days ago. It was such a nice phone. The touch screen was really cool. The voice quality was clear too. Although the battery life was not long, that is ok for me. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive, and wanted me to return it to the shop. …” What do we see? Opinions, targets of opinions, and opinion holders.

Target entity Definition ( entity ): an entity e is a product, person, event or organization. e is represented as a hierarchy of components, sub-components, and so on. Each node represents a component and is associated with a set of attributes of the component. An opinion can be expressed on any node or attribute of the node. To simplify our discussion, we use the term features to represent both components and attributes.

What is an opinion An opinion is a quintuple $(e_j, f_{jk}, so_{ijkl}, h_i, t_l)$, where $e_j$ is a target entity; $f_{jk}$ is a feature of the entity $e_j$; $so_{ijkl}$ is the sentiment value of the opinion of the opinion holder $h_i$ on feature $f_{jk}$ of entity $e_j$ at time $t_l$, and is positive, negative, or neutral, or a more granular rating; $h_i$ is an opinion holder; and $t_l$ is the time when the opinion is expressed.
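As a rough illustration, the quintuple can be held in a small record type. The sketch below is only one possible representation: the class name `Opinion`, the field names, and the hand-assigned labels for the iPhone review are assumptions made for this example, and the time field is left as "unknown" because the review does not state a date.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """One opinion quintuple (e_j, f_jk, so_ijkl, h_i, t_l)."""
    entity: str     # e_j: the target entity
    feature: str    # f_jk: a feature (component or attribute) of the entity
    sentiment: str  # so_ijkl: "positive", "negative" or "neutral"
    holder: str     # h_i: who holds the opinion
    time: str       # t_l: when the opinion was expressed


# Hand-labelled quintuples for the example review (illustrative only).
opinions = [
    Opinion("iPhone", "touch screen", "positive", "reviewer", "unknown"),
    Opinion("iPhone", "voice quality", "positive", "reviewer", "unknown"),
    Opinion("iPhone", "battery life", "negative", "reviewer", "unknown"),
    Opinion("iPhone", "price", "negative", "reviewer's mother", "unknown"),
]

for op in opinions:
    print(f"{op.holder} is {op.sentiment} about the {op.feature} of the {op.entity}")
```

Note that the last record has a different opinion holder from the first three, which is exactly the kind of detail a document-level label would lose.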

Opinion mining objective Objective: given an opinionated document, discover all quintuples $(e_j, f_{jk}, so_{ijkl}, h_i, t_l)$, i.e., mine the five corresponding pieces of information in each quintuple, or solve some simpler problems. With the quintuples, unstructured text becomes structured data: traditional data and visualization tools can be used to slice, dice and visualize the results in all kinds of ways, enabling qualitative and quantitative analysis.

Sentiment classification: doc-level Classify a document (e.g., a review) based on the overall sentiment expressed by the opinion holder. Classes: positive or negative (and neutral). It assumes that each document focuses on a single entity and contains opinions from a single opinion holder.

Subjectivity analysis: sentence-level Sentence-level sentiment analysis has two tasks. Subjectivity classification: subjective or objective. Objective: e.g., “I bought an iPhone a few days ago.” Subjective: e.g., “It is such a nice phone.” Sentiment classification: for subjective sentences or clauses, classify as positive or negative. Positive: e.g., “It is such a nice phone.” Negative: e.g., “The screen is bad.”

Feature-based sentiment analysis Sentiment classification at the document and sentence (or clause) levels is NOT sufficient: it does not tell what people like and/or dislike. A positive opinion on an entity does not mean that the opinion holder likes everything about it. A negative opinion on an entity does not mean that the opinion holder dislikes everything about it.

Feature-based opinion summary “I bought an iPhone a few days ago. It was such a nice phone. The touch screen was really cool. The voice quality was clear too. Although the battery life was not long, that is ok for me. However, my mother was mad with me as I did not tell her before I bought the phone. She also thought the phone was too expensive, and wanted me to return it to the shop. …”
Feature-based summary:
Feature 1: Touch screen
  Positive: 212
    The touch screen was really cool.
    The touch screen was so easy to use and can do amazing things.
    …
  Negative: 6
    The screen is easily scratched.
    I have a lot of difficulty in removing finger marks from the touch screen.
    …
Feature 2: Battery life
  …
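The counts in a summary of this kind can be obtained by grouping extracted opinions by feature. The following is a minimal sketch under the assumption that an earlier step has already produced (feature, sentiment) pairs; the function name `summarize` and the sample data are made up for illustration.

```python
from collections import defaultdict


def summarize(pairs):
    """Count positive and negative opinions per feature.

    `pairs` is an iterable of (feature, sentiment) tuples; neutral
    opinions are ignored, as in the summary above.
    """
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for feature, sentiment in pairs:
        if sentiment in ("positive", "negative"):
            summary[feature][sentiment] += 1
    return dict(summary)


# Toy input (not the real review data behind the 212 / 6 counts).
sample = [
    ("touch screen", "positive"),
    ("touch screen", "positive"),
    ("touch screen", "negative"),
    ("battery life", "negative"),
    ("battery life", "neutral"),
]
result = summarize(sample)
for feature, counts in result.items():
    print(feature, counts)
```

Once the data is in this shape, ordinary charting or database tools can produce the per-feature comparison between products.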

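Visual comparison [Figure: two bar charts. First panel: summary of reviews of Cell Phone 1. Second panel: comparison of reviews of Cell Phone 1 and Cell Phone 2. Features shown: Voice, Screen, Size, Weight, Battery; bars extend toward positive (+) and negative (-).]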

Feature-based opinion summary

Opinion mining is challenging “ This past Saturday, I bought a Nokia phone and my girlfriend bought a Moto phone with Bluetooth. We called each other when we got home. The voice on my phone was not so clear, worse than my previous phone. The battery life was long. My girlfriend was quite happy with her phone. I wanted a phone with good sound quality. So my purchase was a real disappointment. I returned the phone yesterday.”

Opinion mining is a multifaceted problem Each element of the quintuple $(e_j, f_{jk}, so_{ijkl}, h_i, t_l)$ calls for a different task. $e_j$, an entity: named entity extraction (and more). $f_{jk}$, a feature of $e_j$: information extraction. $so_{ijkl}$, the sentiment: sentiment determination. $h_i$, the opinion holder: information/data extraction. $t_l$, the time: data extraction. Also needed: co-reference resolution, relation extraction, synonym matching (voice = sound quality), …

Entity extraction (competing entities) An entity can be a product, service, person, organization or event in an opinion document. “This past Saturday, I bought a Nokia phone and my girlfriend bought a Moto phone with Bluetooth.” Nokia and Moto (Motorola) are entities.

Why we need entity extraction Without knowing the entity, the piece of opinion has little value. Companies want to know the competitors in the market. This is the first step to understand the competitive landscape from opinion documents.

Related work Named entity recognition (NER) aims to identify entities such as names of persons, organizations and locations in natural language text. Our problem is similar to the NER problem, but with some differences: 1. Fine-grained entity classes (products, services) rather than coarse-grained entity classes (people, locations, organizations). 2. Only a specific type is wanted, e.g., a particular type of drug name. 3. Neologisms, e.g., “Sammy” (Sony), “SE” (Sony-Ericsson). 4. Feature sparseness (lack of contextual patterns). 5. Data noise (over-capitalization, under-capitalization).

NER methods Supervised learning methods The current dominant technique for addressing the NER problem: Hidden Markov Models (HMM), Maximum Entropy Models (ME), Support Vector Machines (SVM), Conditional Random Fields (CRF). Shortcomings: they rely on large sets of labeled examples, and labeling is labor-intensive and time-consuming.

NER methods Unsupervised learning methods Mainly clustering: named entities are gathered from clustered groups based on the similarity of context. The techniques rely on lexical resources (e.g., WordNet), on lexical patterns and on statistics computed on a large unannotated corpus. Shortcomings: low precision and recall.

NER methods Semi-supervised learning methods Show promise for identifying and labeling entities. Starting with a set of seed entities, semi-supervised methods use either class-specific patterns to populate an entity class or distributional similarity to find terms similar to the seeds. Specific methods: bootstrapping, co-training, distributional similarity.

Set expansion problem To find competing entities, the extracted entities must be relevant, i.e., they must be of the same class/type as the user provided entities. The user can only provide a few names because there are so many different brands and models. Our problem is actually a set expansion problem, which expands a set of given seed entities.

Set expansion problem Given a set Q of seed entities of a particular class C, and a set D of candidate entities, we wish to determine which of the entities in D belong to C. That is, we “grow” the class C based on the set of seed examples Q. This is a classification problem. However, in practice, the problem is often solved as a ranking problem.

Distributional similarity Distributional similarity is a classical method for the set expansion problem. It compares the distribution of the words surrounding a candidate entity with that of the words surrounding the seed entities, and then ranks the candidate entities by the similarity values. Our experiment shows this approach is inaccurate.
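To make the baseline concrete, here is a small sketch of ranking candidates by distributional similarity. It assumes whitespace tokenization, a fixed context window of two words on each side, and cosine similarity between raw count vectors; these choices, the function names, and the toy corpus are all assumptions for illustration and are not taken from the experiments mentioned above.

```python
import math
from collections import Counter


def context_vector(term, sentences, window=2):
    """Count the words appearing within `window` tokens of `term`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == term:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(seeds, candidates, sentences):
    """Rank candidates by similarity of their contexts to the seeds' contexts."""
    seed_vec = Counter()
    for seed in seeds:
        seed_vec.update(context_vector(seed, sentences))
    scored = [(c, cosine(context_vector(c, sentences), seed_vec)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


corpus = [
    "i bought a nokia phone yesterday",
    "i bought a motorola phone last week",
    "she bought a samsung phone today",
    "the bluetooth headset was cheap",
]
ranking = rank_candidates(["nokia", "motorola"], ["samsung", "bluetooth"], corpus)
for candidate, score in ranking:
    print(candidate, round(score, 3))
```

In this toy corpus "samsung" shares the "bought a ... phone" context with the seeds and ranks first, while "bluetooth" shares none. On real review text, contexts are sparser and noisier, which is one plausible reason for the inaccuracy noted above.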

Positive and unlabeled learning model (PU learning model) A two-class classification model. Given a set P of positive examples of a particular class and a set U of unlabeled examples (containing hidden positive and negative cases), a classifier is built using P and U for classifying the data in U or future test cases. The set expansion problem can be mapped into PU learning exactly.

S-EM algorithm S-EM is an algorithm under the PU learning model. It is based on Naïve Bayes classification and the Expectation-Maximization (EM) algorithm. The main idea of S-EM is to use a spy technique to identify some reliable negatives (RN) from the unlabeled set U, and then use an EM algorithm to learn from P, RN and U - RN. We use the classification score to rank entities.
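The spy step can be sketched as follows. This is a simplified illustration and not the published S-EM procedure: it trains a single Naïve Bayes classifier (no EM iterations), uses the lowest spy probability as the threshold (no noise tolerance), and represents each example as a list of context words. The function names, the `spy_ratio` default, and the toy data are assumptions for this example.

```python
import math
import random
from collections import Counter


def train_nb(pos_docs, neg_docs):
    """Multinomial Naive Bayes with Laplace smoothing.

    Both lists must be non-empty. Returns a function giving P(+ | doc).
    """
    vocab = {w for d in pos_docs + neg_docs for w in d}
    counts = {"+": Counter(w for d in pos_docs for w in d),
              "-": Counter(w for d in neg_docs for w in d)}
    totals = {c: sum(counts[c].values()) for c in counts}
    priors = {"+": len(pos_docs), "-": len(neg_docs)}
    n = len(pos_docs) + len(neg_docs)

    def prob_positive(doc):
        log_p = {}
        for c in ("+", "-"):
            lp = math.log(priors[c] / n)
            for w in doc:
                if w in vocab:  # unseen words are skipped
                    lp += math.log((counts[c][w] + 1) / (totals[c] + len(vocab)))
            log_p[c] = lp
        m = max(log_p.values())  # subtract the max for numerical stability
        e = {c: math.exp(v - m) for c, v in log_p.items()}
        return e["+"] / (e["+"] + e["-"])

    return prob_positive


def reliable_negatives(P, U, spy_ratio=0.15, seed=0):
    """Spy step: hide some positives in U and see how low they score.

    Requires at least two examples in P so that some positives remain.
    """
    rng = random.Random(seed)
    n_spies = max(1, int(len(P) * spy_ratio))
    spy_idx = set(rng.sample(range(len(P)), n_spies))
    spies = [d for i, d in enumerate(P) if i in spy_idx]
    rest = [d for i, d in enumerate(P) if i not in spy_idx]
    # Treat U plus the spies as negative, the remaining positives as positive.
    classify = train_nb(rest, U + spies)
    # Anything in U scoring below every spy is taken as a reliable negative.
    threshold = min(classify(s) for s in spies)
    return [u for u in U if classify(u) < threshold]


P = [["bought", "nokia", "phone"], ["bought", "motorola", "phone"],
     ["bought", "samsung", "phone"], ["bought", "sony", "phone"]]
U = [["bought", "lg", "phone"], ["the", "weather", "was", "nice"],
     ["nice", "weather", "today"]]
rn = reliable_negatives(P, U)
print(rn)
```

In the toy data, the unlabeled example about an "lg" phone behaves like the spy and is therefore not taken as a reliable negative, while the two weather examples are.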

S-EM algorithm (Liu et al., ICML 2002)

Our algorithm (Li, Zhang, et al., ACL 2010) Given the positive set P and the unlabeled set U, S-EM produces a Bayesian classifier C, which is used to classify each vector $u \in U$ and to assign a probability $p(+|u)$ indicating the likelihood that u belongs to the positive class.

Entity ranking Rank candidate d: let $M_d$ be the median of $\{P(+|\text{Vector}_1), P(+|\text{Vector}_2), \ldots, P(+|\text{Vector}_n)\}$. The final score $fs$ for d is defined as $fs(d) = M_d \times \log(1 + n)$, where n is the frequency count of candidate entity d in the corpus. A high $fs(d)$ implies a high likelihood that d is in the expanded entity set. Candidate entities with a higher median score and a higher frequency count in the corpus will be ranked high.
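A minimal sketch of this scoring rule follows. It assumes that each occurrence of a candidate contributes one probability, so n is the length of the list, and that the logarithm is natural (the base is not stated above). The function name and the probabilities are invented for illustration.

```python
import math
from statistics import median


def final_score(probabilities):
    """fs(d) = M_d * log(1 + n) for one candidate entity d.

    `probabilities` holds P(+ | vector) for each of the n occurrences of d.
    """
    n = len(probabilities)
    return median(probabilities) * math.log(1 + n)


# Hypothetical classifier outputs for two candidates.
candidates = {
    "samsung": [0.9, 0.8, 0.7],  # seen three times, consistently likely
    "bluetooth": [0.95],         # seen once, with one high score
}
ranked = sorted(candidates, key=lambda d: final_score(candidates[d]), reverse=True)
print(ranked)
```

The log factor rewards candidates that are frequent, and the median keeps a single unusually high or low occurrence from dominating the score.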

Feature extraction Feature indicators (1) Dependency relation Opinion words modify object features, e.g., “This camera takes great pictures”. The method exploits the dependency relations between opinion words and features to extract features. Given a set of seed opinion words (no feature input), we can extract features and also opinion words iteratively. “The voice on my phone was not so clear, worse than my previous phone. The battery life was long”

Extraction rules

Feature extraction (2) Part-whole relation pattern A part-whole pattern indicates that one object is part of another object. It is a good indicator for features if the class concept word (the “whole” part) is known. (3) “No” pattern A specific pattern for product reviews and forum posts. People often express their comments or opinions on features with this short pattern (e.g., “no noise”).
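The “no” pattern is simple enough to approximate with a regular expression. The sketch below is an assumption about how such a pattern might be matched: it takes the single word following “no” as a feature candidate and does no part-of-speech check, so it will also pick up phrases such as “no one” that a real system would have to filter out.

```python
import re

# "no" followed by one lowercase word; the word is the feature candidate.
NO_PATTERN = re.compile(r"\bno\s+([a-z]+)")


def no_pattern_candidates(text):
    """Return the words that directly follow "no" in the text."""
    return NO_PATTERN.findall(text.lower())


found = no_pattern_candidates("There is no noise and no lag. Nothing else to report.")
print(found)
```

The word boundary `\b` keeps words like “nothing” from matching.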

Feature ranking Rank extracted feature candidates by feature importance. If a feature candidate is correct and important, it should be ranked high. An unimportant feature or noise should be ranked low. Two major factors affect feature importance. Feature relevance: how likely a feature candidate is to be a correct feature. Feature frequency: a feature is important if it appears frequently in opinion documents.

HITS algorithm for feature relevance There is a mutual enforcement relation between features and their indicators (opinion words, part-whole patterns and the “no” pattern). If an adjective modifies many correct features, it is highly likely to be a good opinion word. Similarly, if a feature candidate can be extracted by many opinion words, part-whole patterns, or the “no” pattern, it is also highly likely to be a correct feature. The Web page ranking algorithm HITS is applicable.

Our algorithm (Zhang, et al., COLING 2010) (1) Extract features by dependency relation, part-whole pattern, etc. (2) Compute the feature score using HITS without considering frequency. (3) Compute the final score, which takes feature frequency into account: $S = S(f) \times \log(freq(f))$, where $freq(f)$ is the frequency count of feature f and $S(f)$ is the authority score of feature f.
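Steps (2) and (3) can be sketched on a small bipartite graph in which indicators (opinion words or patterns) act as hubs and feature candidates act as authorities. This is an illustrative reading of the steps above, not the authors' implementation: the edge list, the frequency counts, the fixed number of iterations, and the natural logarithm are all assumptions.

```python
import math


def hits_authority(edges, iterations=50):
    """Run HITS on (indicator, feature) edges; return authority scores.

    `edges` must be non-empty. Scores are L2-normalized each round.
    """
    hubs = {h: 1.0 for h, _ in edges}
    auths = {a: 1.0 for _, a in edges}
    for _ in range(iterations):
        # Authority update: sum of the hub scores of linked indicators.
        new_auths = {a: 0.0 for a in auths}
        for h, a in edges:
            new_auths[a] += hubs[h]
        norm = math.sqrt(sum(v * v for v in new_auths.values()))
        auths = {a: v / norm for a, v in new_auths.items()}
        # Hub update: sum of the authority scores of linked features.
        new_hubs = {h: 0.0 for h in hubs}
        for h, a in edges:
            new_hubs[h] += auths[a]
        norm = math.sqrt(sum(v * v for v in new_hubs.values()))
        hubs = {h: v / norm for h, v in new_hubs.items()}
    return auths


def rank_features(edges, freq):
    """Final score S = S(f) * log(freq(f)), highest first."""
    auths = hits_authority(edges)
    scores = {f: auths[f] * math.log(freq[f]) for f in auths}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical extraction results: which indicator extracted which candidate.
edges = [
    ("great", "screen"), ("great", "battery"), ("great", "thing"),
    ("clear", "screen"), ("long", "battery"),
]
freq = {"screen": 50, "battery": 30, "thing": 2}
ranking = rank_features(edges, freq)
print(ranking)
```

Note that with this formula a candidate that occurs only once gets a final score of zero, since log(1) = 0.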

Identify opinion orientation For each feature, we identify the sentiment or opinion orientation expressed by a reviewer. Almost all approaches make use of opinion words and phrases (the lexicon-based method). Some opinion words have context-independent orientations, e.g., “great”. Other opinion words have context-dependent orientations, e.g., “short”. There are many ways to use opinion words. Machine learning methods for sentiment classification at the sentence and clause levels are also applicable.

Aggregation of opinion words Input: a pair $(f, s)$, where f is a feature and s is a sentence that contains f. Output: whether the opinion on f in s is positive, negative, or neutral. Two steps. Step 1: split the sentence if needed based on BUT words (but, except that, etc.). Step 2: work on the segment $s_f$ containing f. Let the set of opinion words in $s_f$ be $w_1, \ldots, w_n$. Sum up their orientations (1, -1, 0), and assign the orientation to $(f, s)$ accordingly. Step 2 can be changed to $\sum_{i=1}^{n} \frac{w_i.o}{d(w_i, f)}$ with better results, where $w_i.o$ is the opinion orientation of $w_i$ and $d(w_i, f)$ is the distance from f to $w_i$.
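The two steps can be sketched as below, using the distance-weighted form in which each opinion word contributes its orientation divided by its distance to the feature. Several simplifications are assumed: the feature is a single token, distance is counted in tokens, only the single word “but” is used for splitting, and the tiny lexicon is invented for the example.

```python
def feature_orientation(feature, sentence, lexicon, but_words=("but",)):
    """Return "positive", "negative" or "neutral" for `feature` in `sentence`.

    `lexicon` maps opinion words to orientations (+1 or -1).
    """
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()

    # Step 1: split the sentence into segments at BUT words.
    segments, current = [], []
    for tok in tokens:
        if tok in but_words:
            segments.append(current)
            current = []
        else:
            current.append(tok)
    segments.append(current)

    # Step 2: score the segment that contains the feature.
    segment = next((s for s in segments if feature in s), [])
    if not segment:
        return "neutral"
    pos = segment.index(feature)
    score = sum(
        lexicon[w] / abs(i - pos)
        for i, w in enumerate(segment)
        if w in lexicon and i != pos
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


lexicon = {"great": 1, "nice": 1, "poor": -1}
sentence = "The screen is great, but the battery is poor."
print(feature_orientation("screen", sentence, lexicon))
print(feature_orientation("battery", sentence, lexicon))
```

Splitting at “but” keeps “poor” from counting against the screen and “great” from counting for the battery; the distance weighting matters most when a segment contains several opinion words.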

Basic opinion rules (Liu, Ch. in NLP Handbook) Negation rules: a negation word or phrase usually reverses the opinion expressed in a sentence. Negation words include “no”, “not”, etc., e.g., “this cellphone is not good.” But-clause rules: a sentence containing “but” also needs special treatment. The opinions before and after “but” are usually opposite to each other. Phrases such as “except that” and “except for” behave similarly, e.g., “I love Nicholas Cage but I really have no desire to see the Sorcerer’s Apprentice”. More…

Two main types of opinion Direct Opinions: direct sentiment expressions on some entity or feature e.g., “the picture quality of this camera is great.” Comparative Opinions: Comparisons expressing similarities or differences of more than one entity or feature. Usually stating an ordering or preference. e.g., “car x is cheaper than car y.”

Comparative opinions Gradable comparisons come in three types. Non-equal gradable: relations of the type greater or less than, e.g., “optics of camera A is better than that of camera B”. Equative: relations of the type equal to, e.g., “camera A and camera B both come in 7MP”. Superlative: relations of the type greater or less than all others, e.g., “camera A is the cheapest camera available in market”.

Mining comparative opinions (Jindal and Liu, SIGIR 2006; Ding, Liu, Zhang, KDD 2009) Objective: given an opinionated document d, extract comparative opinions $(O_1, O_2, F, po, h, t)$, where $O_1$ and $O_2$ are the object sets being compared based on their shared features F, po is the preferred object set of the opinion holder h, and t is the time when the comparative opinion is expressed.

Thank you