
Intelligent Database Systems Lab Presenter : Chang,Chun-Chih Authors : Youngjoong Ko, Jungyun Seo 2009, IPM Text classification from unlabeled documents.


Presentation transcript:

1 Intelligent Database Systems Lab. Presenter: Chang, Chun-Chih. Authors: Youngjoong Ko, Jungyun Seo. 2009, IPM. Text classification from unlabeled documents with bootstrapping and feature projection techniques

2 Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

3 Motivation: Supervised learning is a general inductive process that automatically builds a text classifier by learning from labeled examples. Its most notable problem is that it requires a large number of labeled training documents for accurate learning.

4 Objectives: To propose a new text classification method based on unsupervised or semi-supervised learning. The proposed method launches text classification tasks with only unlabeled documents.

5 Methodology - Framework

6 Methodology - Creating keyword lists

7 Methodology - Creating keyword lists. Worked examples with the title words Student and traffic: when the values for Student and traffic are both 1.0, the score is 1 = 1.0 + (1.0 - 1.0). For the word "book", with 0.05 for Student and 0.6 for traffic, the score is 1.15 = 0.6 + (0.6 - 0.05).
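
The arithmetic on this slide, 1.15 = 0.6 + (0.6 - 0.05), suggests that a candidate word is scored by its highest value plus the margin between the highest and second-highest values. The sketch below is only a reading of that arithmetic, not the authors' implementation: the function names `keyword_score` and `best_title_word` are hypothetical, the dictionary values are the numbers shown on the slide, and treating a missing runner-up as 0.0 is an assumption.

```python
def keyword_score(values):
    """Score a candidate word from its per-title-word values.

    Inferred from the slide's arithmetic:
    score = top + (top - runner_up).
    """
    ranked = sorted(values.values(), reverse=True)
    top = ranked[0]
    # Assumption: with a single title word there is no runner-up, so use 0.0.
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return top + (top - runner_up)


def best_title_word(values):
    """Return the title word with the highest value for this candidate."""
    return max(values, key=values.get)


# Values for the word "book" as shown on the slide.
book = {"Student": 0.05, "traffic": 0.6}
print(best_title_word(book), round(keyword_score(book), 2))
```

Under this reading, a word that is equally close to two title words gains nothing from the margin term, while a word that clearly favors one title word is rewarded for it.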

8 Methodology - Extracting & verifying centroid-context

9 Methodology - Creating the context-cluster of each category 1.

10 Methodology - Creating the context-cluster of each category 2. 3.

11 Methodology - Creating the context-cluster of each category. Example: 1. eat Banana 2. taste Banana 3. eat Apple

12 Methodology - The TCFP classifier with robustness from noisy data

13 Methodology - The TCFP classifier with robustness from noisy data

14 Experiments

15 Experiments

16 Experiments

17 Experiments

18 Experiments

19 Experiments

20 Experiments

21 Conclusions: The proposed method is useful for low-cost text classification. If a text classification task requires high accuracy, the method can be used as an assistant tool for easily creating training data.

22 Comments: Advantages: faster, less expensive. Applications: text classification.

