Download presentation
Presentation is loading. Please wait.
Published byAmi Campbell Modified over 9 years ago
1
1 Entity Discovery and Assignment for Opinion Mining Applications (ACM KDD 09’) Xiaowen Ding, Bing Liu, Lei Zhang Date: 09/01/09 Speaker: Hsu, Yu-Wen Advisor: Dr. Koh, Jia-Ling
2
2 Outline Introduction Problem Define Entity Discovery Entity Assignment Opinion Mining Empirical Evaluation Conclusion & Future Work
3
3 Introduction Most opinion mining researches are based on product reviews because a review usually focuses on a specific product or entity and contains little irrelevant information. However, in forum discussions and blogs, the situation is very different, where the authors often talk about multiple entities (e.g., products), and compare them.
4
4 Introduction *This raises two important issues: (1) how to discover the entities that are talked about in a sentence the named entity recognition (NER) problem Over-capitalization Under-capitalization
5
5 (2) how to assign entities to each sentence because in many sentences entity names are not explicitly mentioned, but are implied. similar to pronoun resolution in NLP harder due to ungrammatical sentences, and missing or wrong punctuations.
6
6 Example 1: “(1) I bought Camera-A yesterday. (2) I took some pictures in the evening in my living room. (3) The images are very clear. (4) They are definitely better than those from my old Camera-B. (5) The battery is very good too.” Example 2: “(1) (2) (3) (4) (5) The pictures of that camera were blurring for night shots, but for day shots it was ok”
7
7 sentiment consistency : which says that consecutive sentiment expressions should be consistent with each other. It would be ambiguous if this consistency is not observed in writing.
8
8 Opinion Mining: Two tasks are necessary: (1) for a comparative sentence, we need to identify which entity is superior (2) for the subsequent sentence, we need to determine whether its first clause (sentence 5 of Example 2) is positive, negative, or neutral.
9
9 Problem Definition Thread: consists of a start post and a list of follow-up posts or replies. : A thread thus can be modeled as a sequence of posts is the start post. : Each post consists of a sequence of sentences : Each sentence describes something on a subset of entities
10
10 Problem statement: Given a set of threads T in a particular domain, two tasks are performed in this paper: 1. Entity discovery: discover the set of entities E discussed in the posts of the threads 2. Entity assignment: assign the entities in E that each sentence of each post in talks about.
11
11 Entity Discovery Step 1 – data preparation for sequential pattern mining Step 2 – Sequential pattern mining Step 3 – Pattern matching to extract candidate entities a/DT Nokia/NNP 7390/CD at/IN Nokia 7390
12
12 Step 4 – Candidate pruning … with/IN all/PDT the/DT Sony/NNP Ericsson/NNP walkman/NN phone/NN accessories/CD NNS (pruning) Step 5 – Pruning using brand and model relation and syntactic patterns discover relationships from the entities Nokia, 7390 Nokia: brand 7390: Model remove those entities discover in step 4 that never appear together with a or a, or never appear with a candidate in the syntactic patterns.
13
13 Entity Assignment * Comparatives and Superlatives Comparative Sentences Non-equal gradable: “greater or less” Equative: “equal to” Non-gradable: compare two or more entities Superlative Sentences: –est
14
14 Entity Assignment *Sentiment Consistency If he/she wants to introduce a new entity e, he/she has to state the name of the entity explicitly in a sentence, which can be (1) a normal, : normal, : normal e : normal, : comparative e & new entity
15
15 (2) is a comparative : normal non-equal gradable :positive (respectively negative) sentiment the superior (or inferior) entity equative the previous entity before. non-gradable the previous entity before. : comparative the entities in (3) is a superlative sentence. : normal the superlative entity in : comparative the entities in
16
16 Opinion Mining Opinion Indicators Opinion words and phrases opinion lexicon orientations depend on contexts Negations “not” without “not only…but also” But-clauses The orientation before “but” is opposite to that after “but”. not contain “but also”
17
17 *Specification for Opinion Indicators we propose a specification language to enable the user to specify indicators, which are (1)opinion words and phrases, (2) negation words and phrases, (3)but-like words and phrases, (4) non-opinion phrases involving sentiment words, a good deal of (5) non-negation phrases involving negation words, not only (6) non-but phrases involving but-like words. but also
18
18 Specification of Individual Words ex: like [VB] => Po *Two Type of Specification
19
19 Specification for Phrases “great => Po” “a great[T] + deal + of => NEU”
20
20 Opinion Mining Step 1 – Part-of-speech tagging Step 2 – Applying indicator word rules The picture quality is not[Ng] good[Po], reaction is too slow[Neu], but[But] the battery life is long[Neu]. Step 3 - Applying phrase rules The picture quality is not[Ng] good[Po], reaction is too slow[NE], but[But] the battery life is long[Neu].
21
21 Step 4 - Handling negations The picture quality is not[Ng] good[Negative], reaction is too slow[NE], but[But] the battery life is long[Neu]. Step 5 - Aggregating opinions Opinion aggregation : postive:1 negative: -1 sum up >0:postive, =0: neutral, <0: nagative
22
22 Opinion Mining of Comparisons more/most + Pos → Positive more/most + Neg → Negative less/least + Pos → Negative less/least + Neg → Positive Non-standard words “In term of battery life, Camera-X is superior to Camera- Y” depend on the meaning Identify comparative and superlative sentences Discover superior entities
23
23 Empirical Evluation
24
24 Experimental Results Entity Discovery NET: Named Entity Tagger CRF: Conditional Random Fields Method
25
25
26
26 Entity Assignment
27
27 Conclusion This paper presented two problem: mining entities discussed in a set of posts and assigning entities to each sentence. Our experimental results show that the proposed techniques are effective.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.