Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Luca Cagliero, Paolo Garza 2013.DKE. Improving classification models with taxonomy.


1 Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Luca Cagliero, Paolo Garza 2013.DKE. Improving classification models with taxonomy information

2 Intelligent Database Systems Lab Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

3 Intelligent Database Systems Lab Motivation Many approaches to building accurate classifiers have been proposed, but integrating taxonomy information into the data used for classifier training has not been investigated so far.

4 Intelligent Database Systems Lab Objectives This paper presents a general-purpose strategy that improves the accuracy of structured-data classifiers by exploiting a taxonomy built over the data items.

5 Intelligent Database Systems Lab Definition. Aggregation tree Definition. Multiple-taxonomy Let $T = \{t_1, \ldots, t_n\}$ be a set of attributes. A multiple-taxonomy $\Theta = \{AT_1, \ldots, AT_m\}$ is a forest of aggregation trees defined on the domains of the attributes in $T$.
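One way to picture a multiple-taxonomy is as one child-to-parent map per attribute, where each map encodes an aggregation tree. The sketch below is only an illustration under that assumption: the `taxonomy` dictionary, the `ancestors` helper, and the location values are hypothetical and do not come from the paper.

```python
# Hypothetical representation: one child -> parent map per attribute.
# Attribute names and values are illustrative, not taken from the paper.
taxonomy = {
    "Location": {
        "Turin": "Italy",
        "Rome": "Italy",
        "Paris": "France",
        "Italy": "Europe",
        "France": "Europe",
    },
}


def ancestors(taxonomy, attribute, value):
    """Return the generalizations of `value`, from nearest parent up to the root."""
    parents = taxonomy.get(attribute, {})
    chain = []
    while value in parents:
        value = parents[value]
        chain.append(value)
    return chain


print(ancestors(taxonomy, "Location", "Turin"))  # ['Italy', 'Europe']
```

An attribute without an aggregation tree simply has no generalizations, so `ancestors` returns an empty list for it.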

6 Intelligent Database Systems Lab Methodology

7 Intelligent Database Systems Lab Methodology – Multiple-taxonomy over data items in D

8 Intelligent Database Systems Lab Two-step process: (i) Generalized classification rule mining, e.g. {(Location, Italy)} ⇒ {(User category, Entrepreneur)} (s = 50%, c = 100%): (1) an extended version of the training dataset is generated first; (2) an FP-tree-like representation of the extended dataset is then built, which includes only frequent items. (ii) Rule selection by means of lazy pruning.
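The extension step and the support/confidence figures of the example rule can be sketched as follows. This is a minimal illustration, not the paper's implementation: it skips the FP-tree entirely and scans the records directly, and the `PARENT` map, the four records, and the helper names are made up so that the rule {(Location, Italy)} ⇒ Entrepreneur comes out at s = 50%, c = 100%.

```python
# Illustrative data only; the taxonomy and records are invented for this sketch.
PARENT = {"Location": {"Turin": "Italy", "Rome": "Italy", "Paris": "France"}}

records = [
    ({"Location": "Turin"}, "Entrepreneur"),
    ({"Location": "Rome"}, "Entrepreneur"),
    ({"Location": "Paris"}, "Student"),
    ({"Location": "Paris"}, "Employee"),
]


def extend(record):
    """Return the record's (attribute, value) items plus all their generalizations."""
    items = set()
    for attribute, value in record.items():
        items.add((attribute, value))
        parents = PARENT.get(attribute, {})
        while value in parents:
            value = parents[value]
            items.add((attribute, value))
    return items


# Extended training dataset: each record now also carries its higher-level items.
extended = [(extend(rec), label) for rec, label in records]


def support_confidence(antecedent, label, dataset):
    """Support and confidence of the rule `antecedent => label`."""
    covered = [lab for items, lab in dataset if antecedent <= items]
    correct = sum(1 for lab in covered if lab == label)
    support = correct / len(dataset)
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence


s, c = support_confidence({("Location", "Italy")}, "Entrepreneur", extended)
print(f"s={s:.0%}, c={c:.0%}")  # s=50%, c=100%
```

Neither Turin nor Rome alone reaches 50% support here, which is the point of mining over the extended dataset: the generalized item (Location, Italy) covers both records.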

9 Intelligent Database Systems Lab Methodology – lazy pruning (1) Rules that only misclassify training data are pruned. (2) Rules that correctly classify at least one training record are grouped in the Level I rule set, while rules that remain unused during the training phase are kept in Level II.
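A rough sketch of this rule selection is given below. It assumes the rules arrive already sorted by rank (the ranking criteria are not stated on the slide) and that records covered by a Level I rule are removed before later rules are evaluated, which is an assumption of this sketch rather than something the slide confirms. The function name `lazy_prune` and the sample data are hypothetical.

```python
def lazy_prune(rules, training):
    """Split ranked rules into Level I and Level II, dropping harmful ones.

    rules: list of (antecedent_item_set, label), assumed sorted by rank.
    training: list of (item_set, label).
    """
    remaining = list(training)
    level1, level2 = [], []
    for antecedent, label in rules:
        matched = [lab for items, lab in remaining if antecedent <= items]
        if not matched:
            # Rule was never used on the training data: keep it in Level II.
            level2.append((antecedent, label))
        elif any(lab == label for lab in matched):
            # Rule classifies at least one record correctly: keep it in Level I.
            level1.append((antecedent, label))
            # Assumption: covered records are removed from further consideration.
            remaining = [rec for rec in remaining if not antecedent <= rec[0]]
        # Otherwise the rule only misclassifies training data and is pruned.
    return level1, level2


# Illustrative data, invented for this sketch.
training = [
    ({("Location", "Italy"), ("Location", "Turin")}, "Entrepreneur"),
    ({("Location", "Italy"), ("Location", "Rome")}, "Entrepreneur"),
    ({("Location", "France"), ("Location", "Paris")}, "Student"),
]
rules = [
    ({("Location", "Italy")}, "Entrepreneur"),  # correct on two records
    ({("Location", "Paris")}, "Entrepreneur"),  # only misclassifies
    ({("Location", "Turin")}, "Entrepreneur"),  # its record is already covered
]

level1, level2 = lazy_prune(rules, training)
print(level1)
print(level2)
```

In this toy run the Italy rule lands in Level I, the Paris rule is pruned, and the Turin rule ends up in Level II because the only record it matches was already covered.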

10 Intelligent Database Systems Lab Methodology – The G−L3 algorithm

11 Intelligent Database Systems Lab Methodology

12 Intelligent Database Systems Lab Methodology – G−L3 class prediction When a new test case rt has to be classified, G−L3 considers the sorted rule sets in Level I and then Level II. The top-ranked Level I rule matching rt is used first. If none of the Level I rules match rt, the top-ranked rule in Level II matching rt is considered. If no rule in either level matches rt, the default class label is assigned to rt.
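The prediction procedure amounts to a two-level lookup with a fallback. The sketch below assumes each level is a list of (antecedent, label) pairs already sorted by rank and that a test case is given as a set of (attribute, value) items; the function name `predict`, the rule sets, and the default label are all hypothetical.

```python
def predict(items, level1, level2, default_label):
    """Return the label of the top-ranked matching rule, Level I before Level II."""
    for rule_set in (level1, level2):
        for antecedent, label in rule_set:
            if antecedent <= items:  # every item of the rule appears in the test case
                return label
    return default_label


# Illustrative rule sets, invented for this sketch.
level1 = [({("Location", "Italy")}, "Entrepreneur")]
level2 = [({("Location", "Paris")}, "Student")]

case_rome = {("Location", "Rome"), ("Location", "Italy")}
case_paris = {("Location", "Paris"), ("Location", "France")}
case_oslo = {("Location", "Oslo")}

print(predict(case_rome, level1, level2, "Employee"))   # Entrepreneur (Level I)
print(predict(case_paris, level1, level2, "Employee"))  # Student (Level II)
print(predict(case_oslo, level1, level2, "Employee"))   # Employee (default)
```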

13 Intelligent Database Systems Lab Experiments – Dataset characteristics

14 Intelligent Database Systems Lab Experiments – Accuracy comparison (baseline vs. extended)

15 Intelligent Database Systems Lab Experiments – Accuracy comparison

16 Intelligent Database Systems Lab Experiments –

17 Intelligent Database Systems Lab Experiments –

18 Intelligent Database Systems Lab Experiments –

19 Intelligent Database Systems Lab Experiments – execution time comparison

20 Intelligent Database Systems Lab Conclusions – Taxonomy integration is shown to yield significant accuracy improvements.

21 Intelligent Database Systems Lab Comments Advantages – More accurate. Applications – Classification, data mining.

