Hierarchical emotion classification and emotion component analysis on Chinese micro-blog posts
Hua Xu (1), Weiwei Yang (1), Jiushuo Wang (1, 2)
(1) State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University
(2) School of Information Science and Engineering, Hebei University of Science and Technology
Expert Systems with Applications, 2015
Presenter: 劉憶年, 2015/8/18
Outline Introduction Related work Emotion classification Emotion component analysis Experiment results and analysis Application Conclusion 2
Introduction (1/4) For years, researchers have been trying to classify the emotions in text automatically. Micro-blog posts directly reflect users' opinions, and these views and attitudes often contain emotions. The limited length of posts makes emotion classification challenging and calls for more effective feature extraction methods. Internet slang is also hard to cope with because it does not follow language rules. Emotion, by definition, is a subjective thought or feeling such as happy or angry, while sentiment addresses objective positive and negative attitudes. A post may therefore contain sentiment but no emotion. Example 1: The phone broke within two days. 3
Introduction (2/4) Currently, most researchers focus on sentiment analysis and on emotion classification over six basic coarse-grained emotion classes: happy, surprise, angry, disgusted, fear and sad. However, coarse-grained emotions cannot fully depict the emotions in text. Example 2: This car is not so easy to drive as the ad says. I am so disappointed. To describe emotions better, fine-grained emotions need to be added under the coarse-grained categories, which forms a hierarchy. Adopting fine-grained emotions also greatly increases the number of classes, which is difficult for flat classifiers, so hierarchical classification is required. 4
Introduction (3/4) So far, most work uses English corpora, and few papers report results on Chinese. A psychological emotion dictionary, an Internet slang dictionary and an emoticon dictionary are employed to segment posts and form the feature space. Features are then selected by a combination of the χ²-test, word frequency and pointwise mutual information (PMI) in order to retain the effective ones. Finally, we employ support vector regression (SVR) and rule sets generated from PMI values to obtain the classification results, which, as reported later, are very encouraging. 5
Introduction (4/4) In this paper, a four-level fine-grained emotion hierarchy with 19 basic emotions is adopted. However, posts usually contain more than one kind of emotion. We therefore propose an emotion component analysis (ECA) algorithm that detects the principal emotions in a post and calculates their ratios from the classification results, more specifically from the distances between regression values and class thresholds. 6
Related work (1/3) Although probability-based algorithms are quite useful, machine learning approaches are now preferred by researchers. To classify text better, researchers spend time constructing and improving emotion lexicons, which bring significant improvement to emotion classification on text. In addition to classification algorithms and emotion lexicons, the choice of corpus also varies. Some researchers try to classify emotions on blog posts. 7
Related work (2/3) Flat classification assigns examples to the final classes directly. Hierarchical classification, in contrast, classifies examples from top to bottom according to a pre-determined multi-layer classification system and obtains the final result at the bottom level. Flat classification is the most widely adopted, but on a large dataset it makes it difficult for each classifier to distinguish the examples that belong to its class from those of other classes. In recent years, as micro-blogs have become more and more widely used, micro-blog posts have become a new corpus source for emotion classification. Experiments on other kinds of corpus are also reported, e.g. s, novels and Japanese dialog systems. 8
Related work (3/3) Our contributions are different. We hierarchically classify Chinese micro-blog posts into 19 fine-grained emotion classes with a machine learning approach and propose an ECA algorithm based on the regression values. In the segmentation process, a psychological emotion dictionary is adopted to improve the effect of the algorithm, which has important scientific value for both social network knowledge discovery and data mining. 9
Emotion classification -- Hierarchy This hierarchy contains 19 fine-grained emotion classes at the bottom level, or 20 leaf nodes if neutral, which denotes the non-emotional class, is counted. 10
Emotion classification -- Preprocessing Usernames. –This part is surely non-emotional, so we locate it by the @ symbol and remove the symbol together with the username. Topics. –Every user can take part in discussions under a certain topic. To participate, users only need to include the topic in their posts, enclosed in two # symbols, e.g. #Emotion Analysis#. Links. –Users can include links in their posts. The micro-blog platform converts the links into short links to save space. Position information. –Micro-blog platforms allow users to add position information at the end of posts, which does not help emotion classification. 11
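The preprocessing steps above can be sketched with a few regular expressions. This is a minimal illustration, not the authors' implementation: the slide only states explicitly that usernames are removed, so stripping topics, links and position information is an assumption, and the position marker ("我在:" / "我在这里:") is a hypothetical pattern chosen for the example.

```python
import re

# All patterns below are illustrative assumptions, not taken from the paper.
MENTION = re.compile(r"@[\w\-]+")           # @username
TOPIC = re.compile(r"#[^#]+#")              # #topic# enclosed in two # symbols
LINK = re.compile(r"https?://\S+")          # short links such as http://t.cn/...
POSITION = re.compile(r"我在(这里)?[::].*$")  # hypothetical trailing position marker


def preprocess(post: str) -> str:
    """Strip non-emotional parts of a micro-blog post."""
    text = POSITION.sub("", post)
    text = LINK.sub("", text)
    text = TOPIC.sub("", text)
    text = MENTION.sub("", text)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


print(preprocess("@user_1 #Emotion Analysis# so happy today! http://t.cn/abc123"))
```

The position pattern is applied first so that a link inside the trailing position text is removed together with it.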
Emotion classification -- Feature extraction Emoticons can express some fairly complex emotions, so extracting emoticon features is important. To mine POS features, we employ the ICTCLAS package to segment posts and then extract adjectives, nouns, verbs, etc. to form the feature space. Meanwhile, two semantic rules are applied. The first is to extract repeated exclamation marks (!) and question marks (?). The second is to join negative words with adjacent adjectives, since such phrases have the opposite meaning of the original adjectives. Because adverbs may appear between them, we set a distance threshold of 3 according to Chinese language habits. 12
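The two semantic rules can be illustrated as follows. This is a sketch under stated assumptions: the input is a list of already segmented (word, POS) pairs, the tag "a" for adjectives and the negation word list are hypothetical, English tokens stand in for Chinese ones, and the distance threshold of 3 is read as "an adjective within the next 3 tokens".

```python
import re

NEGATIONS = {"not", "no", "never"}  # hypothetical negation word list


def repeated_marks(text):
    """Rule 1: extract runs of two or more exclamation or question marks."""
    return re.findall(r"[!!]{2,}|[??]{2,}", text)


def negation_features(tokens, max_dist=3):
    """Rule 2: join a negation word with the first adjective within max_dist tokens."""
    negated = {}  # index of adjective -> negation word attached to it
    for i, (word, _) in enumerate(tokens):
        if word not in NEGATIONS:
            continue
        for j in range(i + 1, min(i + 1 + max_dist, len(tokens))):
            if tokens[j][1] == "a":  # "a" = adjective (assumed tag)
                negated[j] = word
                break
    features = []
    for j, (word, pos) in enumerate(tokens):
        if pos == "a":
            features.append(f"{negated[j]}_{word}" if j in negated else word)
    return features


tokens = [("not", "d"), ("so", "d"), ("easy", "a"), ("sad", "a")]
print(negation_features(tokens))
print(repeated_marks("What??? No way!!"))
```

Replacing the adjective with the joined feature, rather than keeping both, is a design choice of this sketch; the slide does not say which variant is used.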
Emotion classification -- Feature selection (1/3) More than 20,000 words are extracted in the previous step, so it is necessary to select effective features from the original feature space. Here the χ²-test, as implemented in Weka, is adopted together with word frequency and PMI. (1) 13
Emotion classification -- Feature selection (2/3) 14
Emotion classification -- Feature selection (3/3) The χ²-test can pick out the words that are highly correlated with classes. However, it can be affected by word frequency, so the word frequency ratio is adopted as auxiliary information. The selection of low-frequency words depends on PMI, as it is less sensitive to word frequency. All words whose PMI values exceed a positive threshold are picked out to form the low-frequency word set. 15
Emotion classification -- Classification (1/2) SVR allows us to select the classification threshold dynamically, rather than using a fixed one as in SVM. 16
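To make the selection criteria concrete, the sketch below computes the standard χ² statistic and PMI for one word and one class from a 2×2 contingency table. It does not reproduce the paper's Equation (1) or its Weka setup: the way the two scores are combined in `select_feature`, and the thresholds `min_freq`, `chi_thr` and `pmi_thr`, are hypothetical.

```python
import math


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 table.

    a: posts in the class containing the word
    b: posts outside the class containing the word
    c: posts in the class without the word
    d: posts outside the class without the word
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def pmi(a, b, c, d):
    """PMI(word, class) = log2( P(word, class) / (P(word) * P(class)) )."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")
    return math.log2(a * n / ((a + b) * (a + c)))


def select_feature(a, b, c, d, min_freq=5, chi_thr=3.84, pmi_thr=0.0):
    """Assumed rule: chi-square for frequent words, PMI for low-frequency words."""
    if a + b >= min_freq:
        return chi_square(a, b, c, d) >= chi_thr
    return pmi(a, b, c, d) > pmi_thr


print(chi_square(10, 0, 0, 10), pmi(10, 0, 0, 10))
```

A word that occurs only inside the class scores high on both measures, while a word spread evenly across classes scores zero on both.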
Emotion classification -- Classification (2/2) The class with maximum distance between regression value and threshold is selected as the final result, as it is the most confident one. 17
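The decision rule above can be sketched as follows, assuming each class has its own regressor output and its own threshold and that "distance" means the signed margin (regression value minus threshold). The class names and numbers are made up for illustration.

```python
def pick_class(scores, thresholds):
    """Return the class whose regression value exceeds its threshold by the most."""
    margins = {label: scores[label] - thresholds[label] for label in scores}
    best = max(margins, key=margins.get)
    return best, margins[best]


# Hypothetical regression values and per-class thresholds.
scores = {"happy": 0.62, "sad": 0.40, "angry": 0.55}
thresholds = {"happy": 0.50, "sad": 0.45, "angry": 0.30}
print(pick_class(scores, thresholds))
```

Here "happy" has the highest raw regression value, but "angry" is selected because it clears its own threshold by the widest margin.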
Emotion component analysis (1/2) Usually, a micro-blog post contains more than one kind of emotion, so classification results alone cannot accurately reflect its emotion components. Based on the confidence concept in multi-class classification, we propose an ECA algorithm to detect the principal emotions in a post and calculate their ratios. Example 3: This flower is picked at the side of road and brings me good mood. If you can find such little nice things in daily life, you will be a happy guy. 18
Emotion component analysis (2/2) 19
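The slides do not give the ECA formula, so the following is only one possible reading of "ratios according to distances between regression values and class thresholds": keep the classes whose regression value exceeds the threshold, take at most `top_k` of them, and normalize their margins so they sum to 1. The function name, the normalization and `top_k=4` are assumptions.

```python
def emotion_components(scores, thresholds, top_k=4):
    """Return {emotion: ratio} for the principal emotions of one post."""
    margins = {label: scores[label] - thresholds[label] for label in scores}
    # Keep only classes that clear their threshold, largest margin first.
    positive = sorted(
        ((m, label) for label, m in margins.items() if m > 0), reverse=True
    )[:top_k]
    total = sum(m for m, _ in positive)
    if total == 0:
        return {}  # no emotion clears its threshold
    return {label: m / total for m, label in positive}


# Hypothetical regression values with a shared threshold of 0.5.
scores = {"happy": 0.9, "like": 0.6, "sad": 0.2}
thresholds = {"happy": 0.5, "like": 0.5, "sad": 0.5}
print(emotion_components(scores, thresholds))
```

With these numbers the post would be reported as roughly 80% "happy" and 20% "like", while "sad" is dropped because it falls below its threshold.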
Experiment results and analysis -- Dataset (1/2) As there is no benchmark dataset for fine-grained emotion classification, we randomly chose and crawled 9960 original Chinese micro-blog posts from Sina Weibo as our dataset, which keeps the posts authentic and practical. Two annotators labeled the posts independently. They disagreed on about 35% of the annotations. This is acceptable considering the lack of clear boundaries between emotions and the existence of emotion combinations. Disagreements are resolved by the first author, who chooses one of the competing labels as the final label. 20
Experiment results and analysis -- Dataset (2/2) 21
Experiment results and analysis -- Experimental group setting The psychological emotion dictionary contains more than 52,000 words, which we put into 6 groups. Each group describes one kind of emotion: happy, distressed, surprised, fearful, angry or disgusted. 22
Experiment results and analysis -- Level results (1/2) 23
Experiment results and analysis -- Level results (2/2) The results demonstrate the effect of our feature selection method, which removes many noisy features and retains highly correlated ones. The psychological emotion dictionary does have a positive effect on classification, as it is the only difference between the compared groups. All classifiers perform well, so good performance of the whole model can be expected. 24
Experiment results and analysis -- Hierarchical results (1/2) In hierarchical classification, each test example is classified from the top level successively to the bottom level. 25
Experiment results and analysis -- Hierarchical results (2/2) In flat classification, it is not easy for each classifier to distinguish the examples that belong to its class from those of other classes when given the whole dataset. In hierarchical classification, by contrast, the upper-level classifiers filter out most of the irrelevant examples, which makes classification easier for the lower-level ones. 26
Experiment results and analysis -- ECA results We use human judgement to evaluate the ECA results. In general, if the analysis result of a post is supported by more than half of the judges, we consider it plausible. 27
Application First, we will apply our algorithm to consumer behavior analysis. Second, we can also apply it to analyzing the effect of commercial promotion. Third, it is possible to track how the emotions of micro-blog users change, so that we can track their happiness, the happiness index of certain areas, and so on. 28
Conclusion (1/4) This paper focuses on emotion classification and emotion component analysis on Chinese micro-blog posts. We obtain good classification results on our dataset by applying several optimization methods, which the comparison between groups shows to be effective. We also propose an ECA algorithm, which can detect the four principal emotions in posts and calculate their proportions. 29
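Top-down hierarchical classification can be sketched as a walk down a tree in which every internal node owns a classifier that picks one child. The toy hierarchy below has fewer levels than the paper's four-level hierarchy, and its labels and keyword-based choosers are hypothetical stand-ins for the trained SVR classifiers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    label: str
    children: Dict[str, "Node"] = field(default_factory=dict)
    choose: Optional[Callable[[str], str]] = None  # picks a child label for a post


def classify_top_down(root: Node, post: str) -> List[str]:
    """Classify a post level by level and return the path from root to leaf."""
    node, path = root, [root.label]
    while node.children:
        node = node.children[node.choose(post)]
        path.append(node.label)
    return path


# Toy hierarchy with keyword rules in place of trained classifiers.
positive = Node("positive", {"happy": Node("happy"), "like": Node("like")},
                lambda p: "happy" if "glad" in p else "like")
negative = Node("negative", {"sad": Node("sad"), "angry": Node("angry")},
                lambda p: "angry" if "furious" in p else "sad")
emotional = Node("emotional", {"positive": positive, "negative": negative},
                 lambda p: "negative" if any(w in p for w in ("furious", "disappointed")) else "positive")
root = Node("root", {"emotional": emotional, "neutral": Node("neutral")},
            lambda p: "emotional" if any(w in p for w in ("glad", "furious", "disappointed")) else "neutral")

print(classify_top_down(root, "I am so disappointed."))
print(classify_top_down(root, "The phone broke within two days."))
```

Each lower-level chooser only ever sees posts that the upper levels routed to it, which is the filtering effect described above.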
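The plausibility criterion is a strict majority vote, which can be written in one line. The function name and the boolean vote encoding are assumptions made for this sketch.

```python
def is_plausible(votes):
    """True if strictly more than half of the judges support the ECA result."""
    return sum(votes) > len(votes) / 2


print(is_plausible([True, True, False]))  # two of three judges agree
print(is_plausible([True, False]))        # exactly half is not enough
```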
Conclusion (2/4) First, in the application area of social management, the government can find existing problems by analyzing public emotions in social media. Second, in the segmentation process, a psychological emotion dictionary is adopted to improve the effect of the algorithm, which has important scientific value for both social network knowledge discovery and data mining. Third, many researchers currently focus on positive / negative classification or coarse-grained basic emotion classification with 6–7 classes, while this classification procedure adopts a four-level fine-grained emotion hierarchy with 19 basic emotions. 30
Conclusion (3/4) First, this paper employs the ICTCLAS package to segment Chinese posts, but because blogs contain more colloquial expressions, the word segmentation is not very good. Second, due to the complexity of the feature space in the classification process, the feature extraction and feature selection algorithms need to be refined. Third, our ECA algorithm is designed on a limited set of factors. Although it is reasonable to a certain extent, it could be improved in the future. 31
Conclusion (4/4) First, we will focus on building a new dictionary that contains more emotional words and micro-blog slang, so that the effect of feature extraction can be improved. Second, we will try to improve our ECA algorithm by adding more factors in order to get better analysis results, such as redesigning the calculation formula, normalizing the classification value and so on. Third, sarcastic expressions in micro-blog posts involve deeper semantic analysis, scenario analysis and contextual analysis, and we leave them for further research. As for applications: first, our research could be applied to precision marketing for product recommendation. Second, our research can also be used to develop an opinion analysis system. 32