1
Sentiment analysis overview in the plain-text domain
-- Yuanyuan Liu
2
Sentiment analysis
Sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information from source materials. It is widely applied to reviews and social media for a variety of applications, ranging from marketing to customer service. Generally speaking, sentiment analysis aims to determine the attitude of a speaker or writer with respect to some topic, or the overall contextual polarity of a document. The attitude may be a judgment or evaluation (see appraisal theory), an affective state (the emotional state of the author when writing), or the intended emotional communication (the emotional effect the author wishes to have on the reader).
3
Introduction
Goal
Granularity: document level, paragraph level, sentence level, feature/aspect level
Evaluation: accuracy (precision and recall)
4
Methods
Knowledge-based techniques: classify text by affect categories based on the presence of unambiguous affect words such as happy, sad, afraid, and bored, or assign arbitrary words a probable "affinity" to particular emotions (a minimal sketch follows below).
Statistical methods: machine learning
Hybrid approaches
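A minimal Python sketch of the knowledge-based approach, classifying text by the presence of unambiguous affect words; the word sets here are invented stand-ins for a real lexicon such as the Bing Liu Opinion Lexicon listed later in this deck.

```python
# Lexicon-based polarity: count positive vs. negative affect words.
# POSITIVE/NEGATIVE are hypothetical toy lists, not a real lexicon.
POSITIVE = {"happy", "great", "good", "wonderful"}
NEGATIVE = {"sad", "afraid", "bored", "dreadful"}

def lexicon_polarity(text: str) -> str:
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(lexicon_polarity("great food , the service was wonderful"))  # positive
```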
5
Measures using ML
Classifiers: Naïve Bayes, Maximum Entropy (MaxEnt), feature-based SVM, …
Neural networks: recurrent neural network (RNN), convolutional neural network (CNN), deep memory network and attention model
6
Sentiment Lexicons
GI (The General Inquirer)
LIWC (Linguistic Inquiry and Word Count)
MPQA Subjectivity Cues Lexicon
Bing Liu Opinion Lexicon
SentiWordNet
7
Naïve Bayes
Assign to a given document d the class c* = arg max_c P(c | d). By Bayes' rule, P(c | d) = P(c) P(d | c) / P(d).
Assumption: the features fi are conditionally independent given d's class:
P_NB(c | d) = P(c) (∏i P(fi | c)^ni(d)) / P(d)
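A hedged sketch of a Naïve Bayes sentiment classifier over bag-of-words features, mirroring c* = arg max_c P(c | d); the four toy documents are invented (Pang & Lee used IMDb reviews).

```python
# Naive Bayes over word-count features with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = ["a wonderful , moving film", "dull and lifeless plot",
        "great acting and direction", "boring , a waste of time"]
labels = ["pos", "neg", "pos", "neg"]

nb = make_pipeline(CountVectorizer(), MultinomialNB())
nb.fit(docs, labels)                     # estimates P(c) and P(f_i | c)
print(nb.predict(["wonderful acting"]))  # expected: ['pos']
```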
8
Naïve Bayes
Advantages: simple.
Disadvantages: its conditional independence assumption clearly does not hold in real-world situations.
9
MaxEnt
The MaxEnt principle: when characterizing some unknown events with a statistical model, we should always choose the one that has maximum entropy.
10
MaxEnt
Advantages: MaxEnt makes no assumptions about the relationships between features, and so might potentially perform better when conditional independence assumptions are not met.
Disadvantages: a lot of computation.
[Adam Berger]
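For text classification, MaxEnt is equivalent to multinomial logistic regression; a sketch with scikit-learn on invented toy data:

```python
# MaxEnt as logistic regression: no feature-independence assumption.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["a wonderful , moving film", "dull and lifeless plot",
        "great acting and direction", "boring , a waste of time"]
labels = ["pos", "neg", "pos", "neg"]

maxent = make_pipeline(CountVectorizer(), LogisticRegression())
maxent.fit(docs, labels)                 # iterative optimization: heavier than NB
print(maxent.predict(["a great film"]))  # expected: ['pos']
```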
11
SVM
Find a hyperplane that separates the classes and maximizes the margin.
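A sketch of a linear SVM text classifier: LinearSVC fits a maximum-margin hyperplane over tf-idf features (toy data invented for the example).

```python
# Linear SVM for sentiment classification with scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

docs = ["a wonderful , moving film", "dull and lifeless plot",
        "great acting and direction", "boring , a waste of time"]
labels = ["pos", "neg", "pos", "neg"]

svm = make_pipeline(TfidfVectorizer(), LinearSVC())
svm.fit(docs, labels)              # maximum-margin separating hyperplane
print(svm.predict(["dull plot"]))  # expected: ['neg']
```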
12
Accuracy comparison
Thumbs up? Sentiment Classification using Machine Learning Techniques [Bo Pang and Lillian Lee]
Datasets: movie reviews from the Internet Movie Database (IMDb)
13
Papers
Survey: Thumbs up? Sentiment Classification using Machine Learning Techniques (Pang & Lee)
Opinion mining and sentiment analysis (Pang & Lee)
Comprehensive Review of Opinion Summarization (Kim et al.)
New Avenues in Opinion Mining and Sentiment Analysis (Cambria et al.)
14
RNN
A recurrent neural network (RNN) is a class of artificial neural network where connections between units form a directed cycle. This creates an internal state that allows the network to exhibit dynamic temporal behavior.
Applications: handwriting recognition, speech recognition.
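A minimal Elman-style RNN step in NumPy; the recurrent weights W_hh carry the internal state that gives the network its dynamic temporal behavior (sizes and random weights are arbitrary for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 4, 3
W_xh = rng.normal(size=(d_h, d_in))  # input-to-hidden weights
W_hh = rng.normal(size=(d_h, d_h))   # hidden-to-hidden (recurrent) weights
b_h = np.zeros(d_h)

def rnn_step(x, h):
    # h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)
    return np.tanh(W_xh @ x + W_hh @ h + b_h)

h = np.zeros(d_h)                     # initial internal state
for x in rng.normal(size=(5, d_in)):  # a sequence of 5 input vectors
    h = rnn_step(x, h)                # state carries information across time
```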
15
RNN [figure]
16
RNN [figure]
17
CNN
A convolutional neural network (CNN, or ConvNet) is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex.
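A sketch of one 1-D convolutional filter over a sentence matrix, in the style of text CNNs; the window size and random data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, win = 7, 5, 3                 # sentence length, embedding dim, window
sent = rng.normal(size=(n, d))      # stand-in word embeddings for one sentence
filt = rng.normal(size=(win, d))    # one convolutional filter
bias = 0.0

# Slide the filter over every window of `win` consecutive words, apply ReLU.
feats = np.array([max(0.0, np.sum(sent[i:i + win] * filt) + bias)
                  for i in range(n - win + 1)])
feature = feats.max()               # max-over-time pooling -> one scalar feature
```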
18
CNN [figure]
19
Aspect Level Sentiment Classification with Deep Memory Network
Duyu Tang, Bing Qin, Ting Liu
20
Motivation
Drawbacks of conventional neural models:
they capture context information in an implicit way, and are incapable of explicitly exhibiting important context clues of an aspect;
expensive computation.
Intuition: only a subset of context words is needed to infer the sentiment towards an aspect, e.g. "great food but the service was dreadful!"
21
Background: memory network
Application: question answering (e.g. IBM's famous Watson)
Central idea: inference with a long-term memory (storage) component that can be read from and written to
Components:
Memory m: an array of objects
I: converts input to an internal feature representation
G: updates old memories with new input
O: generates an output representation given a new input and the current memory state
R: outputs a response based on the output representation
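A skeleton mapping the I/G/O/R components above onto code; the method bodies are placeholders, since in a concrete memory network these mappings are learned.

```python
class MemoryNetwork:
    def __init__(self):
        self.memory = []             # m: an array of stored objects

    def I(self, x):                  # input -> internal feature representation
        return x

    def G(self, feature):            # update old memories with the new input
        self.memory.append(feature)

    def O(self, feature):            # output representation from input + memory
        return list(self.memory)

    def R(self, output):             # final response from the output representation
        return output

net = MemoryNetwork()
net.G(net.I("fact: the food was great"))         # write to long-term memory
answer = net.R(net.O(net.I("how was the food?")))  # read memory to answer
```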
22
Background: attention model
One important property of human perception is that one does not tend to process a whole scene in its entirety at once. Instead, humans focus attention selectively on parts of the visual space to acquire information when and where it is needed, and combine information from different fixations over time to build up an internal representation of the scene, guiding future eye movements and decision making.
23
Deep memory network model
Input: a sentence s = {w1, w2, …, wi, …, wn} containing the aspect word wi.
Word embedding matrix L ∈ R^(d×|V|), where |V| is the vocabulary size and d is the dimension of a word vector; the word embedding of wi is ei ∈ R^(d×1), a column of L.
Task: determine the sentiment polarity of sentence s towards the aspect wi.
24
Overview of the approach
Figure 1: An illustration of our deep memory network with three computational layers (hops) for aspect level sentiment classification
25
Attention model
Content attention
Location attention
26
Content attention
Intuition:
context words do not contribute equally to the semantic meaning of a sentence;
the importance of a word differs when we focus on different aspects.
27
Content attention
Input: external memory m ∈ R^(d×n) and aspect vector vaspect ∈ R^(d×1)
Output: vec = ∑i αi mi, where mi is a piece of memory m, αi ∈ [0,1] is the weight of mi, and ∑i αi = 1
28
Calculation of αi
Softmax function: αi = exp(gi) / ∑j exp(gj), where the score gi = tanh(Watt · [mi; vaspect] + batt) measures the relevance of memory piece mi to the aspect.
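A NumPy sketch of the content-attention layer as described above; the shapes of the assumed attention parameters Watt and batt are my choices.

```python
import numpy as np

def content_attention(m, v_aspect, W_att, b_att):
    # m: (d, n) external memory; v_aspect: (d,) aspect vector
    d, n = m.shape
    # g_i = tanh(W_att [m_i; v_aspect] + b_att), one score per context word
    g = np.tanh(np.array([(W_att @ np.concatenate([m[:, i], v_aspect])
                           + b_att).item() for i in range(n)]))
    alpha = np.exp(g - g.max())
    alpha /= alpha.sum()              # softmax: alpha_i in [0,1], sums to 1
    return m @ alpha                  # vec = sum_i alpha_i * m_i

rng = np.random.default_rng(0)
d, n = 4, 6
vec = content_attention(rng.normal(size=(d, n)), rng.normal(size=d),
                        rng.normal(size=(1, 2 * d)), np.zeros(1))
```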
29
Location attention
Intuition: a context word closer to the aspect should be more important than a farther one.
30
Location attention—model 1
The memory vector mi is built from a location vector vi ∈ R^(d×1) for word wi, where n is the sentence length, k is the hop number, and li is the location of wi.
31
Location attention—model 2
The memory vector mi: vi ∈ R^(d×1) is a location vector for word wi.
32
Location attention—model 3
The memory vector mi: vi is regarded as a parameter and learned during training.
33
Location attention—model 4
The memory vector mi: different from model 3, location representations are regarded as neural gates that control what proportion of each word's semantics is written into the memory.
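A hedged sketch of the location-weighting idea: a model-2-style linear decay with distance, and a model-4-style sigmoid gate. The exact per-model formulas are not reproduced here, only the stated intuitions.

```python
import numpy as np

def location_weight(l_i, n):
    # Scalar weight for word w_i at distance l_i from the aspect: closer
    # words get larger weights (a linear decay is one natural choice).
    return 1.0 - l_i / n

def gated_memory(e_i, v_i):
    # Model-4 style: the location vector acts as a neural gate on the
    # embedding, controlling how much of each dimension enters memory.
    return (1.0 / (1.0 + np.exp(-v_i))) * e_i

n, d = 8, 4
e_i = np.ones(d)
print(location_weight(1, n) * e_i)     # a near word: weight 0.875
print(gated_memory(e_i, np.zeros(d)))  # zero gate logits pass 0.5 of semantics
```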
34
The Need for Multiple Hops
Computational models composed of multiple processing layers can learn representations of data with multiple levels of abstraction. In this work, the attention in a single layer is essentially a weighted-average compositional function, which is not powerful enough to handle sophisticated linguistic phenomena such as negation, intensification, and contrast.
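A sketch of stacking hops, reusing content_attention from the earlier sketch; combining each hop's attention output with a linear transform of the hop input (weight W_hop) is an assumption in the spirit of the description above.

```python
def multi_hop(m, v_aspect, W_att, b_att, W_hop, hops=3):
    # Requires content_attention() and numpy arrays from the sketch above.
    x = v_aspect                     # hop-0 input: the aspect vector
    for _ in range(hops):
        attended = content_attention(m, x, W_att, b_att)
        x = attended + W_hop @ x     # each hop refines the representation
    return x                         # feature vector for the softmax layer
```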
35
Aspect level sentiment classification
The output vector of the last hop is regarded as the feature representation and fed to a softmax layer for aspect level sentiment classification. Training minimizes the cross-entropy error of sentiment classification as the loss function, optimized by gradient descent.
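A hedged rendering of that objective in LaTeX: the cross-entropy between the gold sentiment label and the softmax output of the last hop, minimized by gradient descent (symbol names are mine).

```latex
% y_c(s,a): gold indicator that sentence s has sentiment c towards aspect a;
% \hat{y}_c(s,a): softmax probability computed from the last hop's output.
\mathcal{L} = -\sum_{(s,a) \in T} \sum_{c \in C} y_c(s,a)\, \log \hat{y}_c(s,a)
```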
36
Experiments
Datasets [from SemEval 2014]
37
Comparison to other methods
[table: accuracy and runtime]
38
Effects of location attention
39
Visualize Attention Models
40
Error Analysis
1. Non-compositional sentiment expressions, e.g. "dessert was also to die for!"
2. Complex aspect expressions consisting of many words, e.g. "ask for the round corner table next to the large window."
3. Sentimental relations between context words such as negation, comparison, and condition, e.g. "but dinner here is never disappointing, even if the prices are a bit over the top."
41
Conclusion
Developed deep memory networks that capture the importance of context words for aspect level sentiment classification.
Leveraged both content and location information.
Using multiple computational layers in the memory network yields improved performance.
42
Thanks