Intelligent Database Systems Lab
A fuzzy conceptualization model for text mining with application in opinion polarity classification
Authors: Sheng-Tun Li, Fu-Ching Tsai
2013, KBS
Presenter: JIAN-REN CHEN
Outline
- Motivation
- Objectives
- Methodology
- Experiments
- Conclusions
- Comments
Motivation
Most existing document classification algorithms are easily affected by ambiguous terms. A classifier's ability to disambiguate is thus as important as its ability to classify accurately.
- Application: opinion polarity classification
Objectives
We propose a concept-driven text classification approach based on Formal Concept Analysis (FCA) that trains a classifier on concepts instead of documents, so as to reduce the inherent ambiguities. We further use fuzzy formal concept analysis (FFCA) to take uncertain information into account.
Formal concept analysis
Objects: {Review6, Review7}
Attributes: {Phenomenal, Fantastic, Love}
=> formal concept
positive class: "Phenomenal", "Fantastic" and "Love" {Review1, Review4, Review6, Review7}
neutral class: "Cover" {Review5}
negative class: "Awful" {Review2, Review3}
Formal concept analysis
positive class: {Review1, Review4, Review6, Review7}
negative class: {Review2, Review3}
neutral class: {Review5}
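The formal concept shown for Review6 and Review7 can be reproduced on a small formal context. The following is a minimal sketch, not the authors' implementation: the review-to-term table `CONTEXT` is hypothetical except for the rows of Review6 and Review7, and the helper names `extent`, `intent` and `formal_concepts` are chosen here for illustration. It enumerates concepts by brute force over attribute subsets, which is only practical for toy examples.

```python
from itertools import combinations

# Hypothetical incidence table (review -> opinion terms it contains).
# Only the rows for Review6 and Review7 follow the slide; the rest are assumed.
CONTEXT = {
    "Review1": {"Phenomenal"},
    "Review2": {"Awful"},
    "Review3": {"Awful"},
    "Review4": {"Fantastic", "Love"},
    "Review5": {"Cover"},
    "Review6": {"Phenomenal", "Fantastic", "Love"},
    "Review7": {"Phenomenal", "Fantastic", "Love"},
}


def extent(attributes, context):
    """Return the objects that have every attribute in `attributes`."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)


def intent(objects, context):
    """Return the attributes shared by every object in `objects`."""
    if not objects:
        # By convention the empty object set shares all attributes.
        return frozenset(set().union(*context.values()))
    return frozenset(set.intersection(*(set(context[o]) for o in objects)))


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every attribute subset.

    Exponential in the number of attributes, so suitable for toy data only.
    """
    all_attrs = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for subset in combinations(all_attrs, size):
            ext = extent(frozenset(subset), context)
            concepts.add((ext, intent(ext, context)))
    return concepts


if __name__ == "__main__":
    for ext, itt in sorted(formal_concepts(CONTEXT), key=lambda c: -len(c[0])):
        print(sorted(ext), sorted(itt))
```

Under these assumed rows, the pair ({Review6, Review7}, {Phenomenal, Fantastic, Love}) appears among the enumerated concepts, matching the example above.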
Methodology - Architecture
Methodology
Term measures:
- tf-idf
- Inverted Conformity Frequency (ICF)
- Uniformity (Uni)
Thresholds: tf-idf > 26, ICF < log(2), Uni > 0.2
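The tf-idf cut-off can be illustrated with a short sketch. The formulas on this slide did not survive, so the code below assumes a common collection-level variant, total term frequency times log(N / document frequency), which may differ from the weighting used in the paper. ICF and Uni are left out because their definitions are not given here. The function names `tfidf_scores` and `select_terms` are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Score each term as total_tf * log(N / df) over tokenized documents.

    This weighting is an assumption; the paper's exact formula may differ.
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    tf = Counter(term for doc in docs for term in doc)       # collection frequency
    return {term: tf[term] * math.log(n_docs / df[term]) for term in tf}


def select_terms(docs, threshold=26.0):
    """Keep the terms whose score exceeds `threshold` (26 on the slide)."""
    return {term for term, score in tfidf_scores(docs).items() if score > threshold}


if __name__ == "__main__":
    # Toy corpus; a much lower threshold is used because the corpus is tiny.
    toy_docs = [
        ["love", "love", "movie"],
        ["awful", "movie"],
        ["love", "movie"],
    ]
    print(select_terms(toy_docs, threshold=1.1))
```

A term that occurs in every document, such as "movie" in the toy corpus, scores zero under this weighting and is always filtered out.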
Experiments - Data set and evaluation
Data set:
- Reuters
- movie review
- e-book review
Evaluation
Experiments (parameters)
Experiments (conceptualization)
Conclusions
FFCM successfully reduces the impact of textual ambiguity.
The experimental results show that FFCM outperforms other state-of-the-art algorithms on both Reuters and the two opinion polarity collections.
Comments
Advantages
- formal concepts play an important role
Disadvantages
- α may differ across datasets
- focuses only on single-class classification
Applications
- text mining