Intelligent Database Systems Lab, 國立雲林科技大學 (National Yunlin University of Science and Technology)
Mining Positive and Negative Patterns for Relevance Feature Discovery
Presenter: Cheng-Hui Chen
Authors: Yuefeng Li, Abdulmohsen Algarni, Ning Zhong
KDD 2010
Outline
Motivation
Objectives
Methodology
Experiments
Conclusions
Comments
Motivation
Over the years, people have often held the hypothesis that pattern-based methods should describe user preferences better than term-based ones, but many experiments do not support this hypothesis.
Many text mining methods consider only the frequency distributions of terms.
Objectives
The innovative technique presented in this paper makes a breakthrough on this difficulty.
The proposal is to consider both the distributions of terms and their specificities when using them for text mining and classification.
Methodology
(Overview figure: a frequency weight and a specificity weight are combined into a new term weight.)
Definitions
Frequent pattern
─ Absolute support: sup_a(X) = |coverset(X)|, the number of positive documents containing the termset X.
─ Relative support: sup_r(X) = |coverset(X)| / |D+|.
─ A termset X is called frequent if sup_a(X) (or sup_r(X)) >= min_sup.
Closed pattern
─ Cls(X) = termset(coverset(X))
─ A termset X is called closed if and only if X = Cls(X), i.e., sup_a(X1) < sup_a(X) for all patterns X1 ⊃ X.
Closed sequential pattern: the same closure condition applied to sequential patterns.
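The support and closure definitions above can be sketched directly in code. This is a minimal illustration, assuming the positive set D+ is a list of documents, each represented as a set of terms; the names coverset, sup_a, sup_r, and Cls follow the slide notation.

```python
# Sketch of the frequent-pattern and closed-pattern definitions, assuming
# D+ is a list of documents, each a set of terms (slide notation preserved).

def coverset(X, D_pos):
    """Positive documents whose term set contains every term of X."""
    return [d for d in D_pos if X <= d]

def sup_a(X, D_pos):
    return len(coverset(X, D_pos))           # absolute support

def sup_r(X, D_pos):
    return sup_a(X, D_pos) / len(D_pos)      # relative support

def cls(X, D_pos):
    """Cls(X) = termset(coverset(X)): terms shared by all covering docs."""
    docs = coverset(X, D_pos)
    return set.intersection(*docs) if docs else set(X)

def is_closed(X, D_pos):
    return X == cls(X, D_pos)

D_pos = [{"t1", "t2"}, {"t1", "t2", "t3"}, {"t2", "t4"}]
print(sup_a({"t1"}, D_pos))   # 2
print(cls({"t1"}, D_pos))     # {"t1", "t2"} -> {"t1"} is not closed
```

In this toy example {"t1"} is not closed because every document covering it also contains "t2", so {"t1", "t2"} carries the same support.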
The deploying method
To improve the efficiency of pattern taxonomy mining (PTM), an algorithm SPMining(D+, min_sup) discovers the closed sequential patterns.
─ For a given term t, its support (also called its weight) in the discovered patterns is obtained by deploying the patterns onto their constituent terms.
─ The resulting rank is assigned to every incoming document d to decide its relevance.
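The deploying step can be sketched as follows. This is an illustrative reading, assuming each discovered pattern carries its relative support, each term receives an equal share of the support of every pattern containing it, and a document's rank sums the weights of the terms it contains; the paper's exact normalization may differ.

```python
# Hedged sketch of deploying pattern supports onto terms and ranking a
# document. Assumption: weight(t) accumulates sup_r(p)/|p| over patterns p
# containing t; the paper's exact normalization may differ.
from collections import defaultdict

def deploy(patterns):
    """patterns: list of (termset, relative_support) pairs."""
    weight = defaultdict(float)
    for terms, sup in patterns:
        for t in terms:
            weight[t] += sup / len(terms)   # equal share of the support
    return weight

def rank(doc, weight):
    """Sum of weight(t) over every weighted term present in doc."""
    return sum(w for t, w in weight.items() if t in doc)

patterns = [({"global", "economy"}, 0.6), ({"economy"}, 0.8)]
w = deploy(patterns)
print(round(rank({"economy", "news"}, w), 2))   # 1.1
```

The unseen term "news" contributes nothing, while "economy" collects support from both patterns.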
Mining Algorithms
(Figure: pseudo-code of the pattern-mining algorithm.)
Specificity of low-level features
The specificity of a given term t in the training set D = D+ ∪ D− is defined as follows:
─ (equation shown on the slide)
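Since the slide's equation was lost with the image, here is one plausible sketch of a term-specificity score over D = D+ ∪ D−. The exact formula — spe(t) = (|coverset+(t)| − |coverset−(t)|) / |D+| — is an assumption for illustration: terms concentrated in positive documents score high, terms common in negative documents score low.

```python
# Hedged sketch of term specificity over D = D+ ∪ D-.
# Assumed formula (not verified against the paper):
#   spe(t) = (|coverset+(t)| - |coverset-(t)|) / |D+|

def specificity(t, D_pos, D_neg):
    n_pos = sum(1 for d in D_pos if t in d)   # positive docs containing t
    n_neg = sum(1 for d in D_neg if t in d)   # negative docs containing t
    return (n_pos - n_neg) / len(D_pos)

D_pos = [{"mining", "pattern"}, {"mining", "pattern"}]
D_neg = [{"mining", "sports"}, {"sports"}]
print(specificity("pattern", D_pos, D_neg))   # 1.0 (only in D+)
print(specificity("mining", D_pos, D_neg))    # 0.5 (also in one D- doc)
```

"pattern" appears only in positive documents and gets the maximum score, while "mining" is penalized for also covering a negative document.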
Revision of discovered features
─ (revision equations shown on the slide)
Revision Algorithms
(Figure: pseudo-code of the feature-revision algorithm.)
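The revision idea — classifying low-level terms into three categories, as the conclusions later recommend — can be sketched as follows. The threshold theta and the update rules here are illustrative assumptions, not the paper's exact equations: positive-specific terms are boosted, general terms kept, and negative-specific terms penalized.

```python
# Hedged sketch of weight revision via three term categories.
# The threshold and update rules are illustrative assumptions only.

def revise(weights, spe, theta=0.2):
    revised = {}
    for t, w in weights.items():
        s = spe.get(t, 0.0)
        if s > theta:              # positive specific term: boost
            revised[t] = w * (1 + s)
        elif s < -theta:           # negative specific term: s < 0 shrinks w
            revised[t] = w * (1 + s)
        else:                      # general term: keep as is
            revised[t] = w
    return revised

weights = {"pattern": 1.0, "mining": 1.0, "sports": 1.0}
spe = {"pattern": 0.5, "mining": 0.0, "sports": -0.5}
print(revise(weights, spe))   # pattern boosted, mining kept, sports shrunk
```

The three-way split lets negative documents reduce the influence of misleading terms without discarding general vocabulary.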
Experiments
Data
─ This research uses Reuters Corpus Volume 1 (RCV1) and the 50 assessor topics to evaluate the proposed model.
Baselines
─ Up-to-date pattern-mining methods
─ Well-known term-based methods
Experiments
The well-known term-based methods:
─ The Rocchio model
─ BM25
─ SVM
Experiments
(Result tables and figures comparing the proposed model with the baselines.)
Conclusions
Compared with the state-of-the-art models, the experiments on RCV1 and the TREC topics demonstrate that the effectiveness of relevance feature discovery can be significantly improved by the proposed approach.
This paper recommends classifying low-level terms into three categories in order to largely improve the performance of the revision.
Comments
Advantages
─ The effectiveness of relevance feature discovery can be significantly improved by the proposed approach.
Drawback
─ …
Applications
─ Text mining
─ Classification