Published by Laurence Watson. Modified over 9 years ago.
1
Selecting Attributes for Sentiment Classification Using Feature Relation Networks
Authors: Ahmed Abbasi, Stephen France, Zhu Zhang, and Hsinchun Chen
2011, IEEE TKDE
Presenter: JIAN-REN CHEN
Intelligent Database Systems Lab
2
Outline
- Motivation
- Objectives
- Methodology
- Experiments
- Conclusions
- Comments
3
Motivation
Sentiment analysis has emerged as a method for mining opinions from text archives.
It is a challenging problem:
1. It requires the use of large quantities of linguistic features.
2. Integrating these heterogeneous n-gram categories into a single feature set introduces noise, redundancy, and computational limitations.
Sentiment is described along two dimensions:
1) polarity
2) intensity
Example: “I don’t like you” and “I hate you” share the same polarity but differ in intensity.
4
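n-gram (Markov model)
- Markov model example, weather states: sunny, cloudy, rainy
- Homophone example: 美麗 ("beautiful") vs 美痢 (same pronunciation, wrong character)
- “HAPAX” and “DIS” tags: replace words that occur only once (hapax legomena) or only twice (dis legomena)
- Example: “I hate Jim” is replaced with “I hate HAPAX”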
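The tag replacement on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `tag_legomena`, the whitespace tokenization, the case-insensitive counting, and the toy corpus are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def tag_legomena(sentences):
    """Replace once-occurring words with HAPAX and twice-occurring words with DIS.

    Assumption: words are split on whitespace and counted case-insensitively
    over the whole list of sentences.
    """
    tokenized = [s.split() for s in sentences]
    counts = Counter(tok.lower() for sent in tokenized for tok in sent)
    tagged = []
    for sent in tokenized:
        out = []
        for tok in sent:
            n = counts[tok.lower()]
            if n == 1:
                out.append("HAPAX")   # word seen exactly once in the corpus
            elif n == 2:
                out.append("DIS")     # word seen exactly twice in the corpus
            else:
                out.append(tok)       # frequent word, kept as-is
        tagged.append(" ".join(out))
    return tagged

# Toy corpus (invented for illustration).
corpus = ["I hate Jim", "I hate rain", "I hate Mondays", "I like rain"]
tagged_corpus = tag_legomena(corpus)
print(tagged_corpus[0])  # I hate HAPAX
```

In this toy corpus "Jim" occurs once, so the first sentence becomes "I hate HAPAX", matching the slide's example.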
5
Objectives
Feature Relation Network (FRN)
- considers semantic information and also leverages the syntactic relationships between n-gram features
- aims at enhanced sentiment classification on extended sets of heterogeneous n-gram features
6
Methodology - Extended N-Gram Feature Set
7
Methodology - Subsumption Relations
A subsumes B (A → B)
Example: “I love chocolate”
unigrams: I, LOVE, CHOCOLATE
bigrams: I LOVE, LOVE CHOCOLATE
trigrams: I LOVE CHOCOLATE
What about the bigrams and trigrams? It depends on their weight: they are retained only if their weight exceeds that of their more general lower-order counterparts by a threshold t.
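A minimal sketch of the subsumption check, not the authors' implementation. Several things here are assumptions: the names `sub_ngrams` and `apply_subsumption`, the invented weights, reading "exceeds by threshold t" as an additive margin, and comparing each higher-order n-gram against the best-weighted lower-order n-gram it contains.

```python
def sub_ngrams(ngram):
    """Yield every contiguous lower-order n-gram contained in `ngram` (a tuple of tokens)."""
    n = len(ngram)
    for size in range(1, n):
        for start in range(n - size + 1):
            yield ngram[start:start + size]

def apply_subsumption(weights, t):
    """Keep a higher-order n-gram only if its weight beats its best
    lower-order counterpart by more than t (assumed additive margin)."""
    kept = {}
    for ngram, w in weights.items():
        lower = [weights[s] for s in sub_ngrams(ngram) if s in weights]
        # Unigrams have no lower-order counterparts, so they are always kept.
        if not lower or w > max(lower) + t:
            kept[ngram] = w
    return kept

# Hypothetical feature weights for the "I love chocolate" example.
weights = {
    ("I",): 0.001,
    ("LOVE",): 0.30,
    ("CHOCOLATE",): 0.05,
    ("I", "LOVE"): 0.301,
    ("LOVE", "CHOCOLATE"): 0.40,
    ("I", "LOVE", "CHOCOLATE"): 0.41,
}
selected = apply_subsumption(weights, t=0.05)
for ngram in selected:
    print(" ".join(ngram))
```

With these made-up weights, "LOVE CHOCOLATE" survives because it adds enough over "LOVE", while "I LOVE" and the trigram are subsumed.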
8
Methodology - Parallel Relations
A parallel B (A - B)
Examples:
POS tag: “ADMIRE_VP” → “like”
semantic class: “SYN-Affection” → “love”
If A and B have a correlation coefficient greater than some threshold p, one of the two attributes is removed to avoid redundancy.
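The redundancy check can be sketched as follows. This is an illustration under assumptions, not the method as published: the Pearson coefficient is used as the correlation measure, the lower-weighted feature of a correlated pair is the one dropped, and the names `pearson`, `apply_parallel`, the occurrence vectors, and the weights are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def apply_parallel(occurrences, weights, p):
    """Visit features from highest to lowest weight and keep one only if its
    correlation with every already-kept feature does not exceed p."""
    ranked = sorted(occurrences, key=lambda f: weights[f], reverse=True)
    kept = []
    for feat in ranked:
        if all(pearson(occurrences[feat], occurrences[k]) <= p for k in kept):
            kept.append(feat)
    return kept

# Hypothetical per-document occurrence vectors (1 = feature present).
occurrences = {
    "like":      [1, 0, 1, 1, 0, 0],
    "ADMIRE_VP": [1, 0, 1, 1, 0, 0],
    "love":      [0, 1, 0, 0, 1, 0],
}
weights = {"like": 0.20, "ADMIRE_VP": 0.25, "love": 0.30}
kept = apply_parallel(occurrences, weights, p=0.90)
print(kept)
```

Here "like" and "ADMIRE_VP" occur in exactly the same documents, so only the higher-weighted of the two is kept.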
9
Methodology - The Complete Network
10
Methodology - Incorporating Semantic Information
11
Experiments - Datasets
12
Experiments - FRN vs Univariate
13
Experiments - FRN vs Univariate (WithinOne)
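The slides do not define the WithinOne measure. A common reading for ordinal classes (such as star ratings) is the percentage of predictions that fall within one class of the true label. The sketch below assumes that reading; the function names and the toy labels are hypothetical.

```python
def accuracy(actual, predicted):
    """Percentage of predictions that match the true class exactly."""
    return 100.0 * sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def within_one(actual, predicted):
    """Percentage of predictions at most one class away from the true class.

    Assumption: classes are ordinal integers, e.g. 1 to 5 stars.
    """
    return 100.0 * sum(abs(a - p) <= 1 for a, p in zip(actual, predicted)) / len(actual)

# Toy labels (invented for illustration).
actual = [1, 2, 3, 4, 5]
predicted = [1, 3, 5, 4, 2]
print(accuracy(actual, predicted))    # 40.0
print(within_one(actual, predicted))  # 60.0
```

Under this reading, within-one is never lower than plain accuracy, since every exact match also counts as within one class.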
14
Experiments - FRN vs Multivariate
15
Experiments - FRN vs Multivariate (WithinOne)
16
Experiments - FRN vs Hybrid
17
Experiments - FRN vs Hybrid (WithinOne)
18
Experiments - Ablation
19
Experiments - Parameters
t (subsumption threshold): 0.0005, 0.005, 0.05, and 0.5
p (parallel relation threshold): 0.80, 0.90, and 1.00
20
Experiments - Average Runtimes
21
Conclusions
- FRN had significantly higher best accuracy and best percentage within-one across three testbeds than the univariate, multivariate, and hybrid feature selection methods it was compared against.
- The ablation and parameter testing results indicate that the subsumption and parallel relations, and their thresholds, play an important role.
22
Comments
Advantages
- accurate and computationally efficient
Disadvantage
- results are sensitive to which relations are ablated and to the parameter settings
Applications
- sentiment classification
- feature selection method