Intelligent Database Systems Lab
國立雲林科技大學 National Yunlin University of Science and Technology

Decision trees for hierarchical multi-label classification

Presenter: Shao-Wei Cheng
Authors: Celine Vens, Jan Struyf, Leander Schietgat, Sašo Džeroski, Hendrik Blockeel
ML 2008
Outline
  Motivation
  Objectives
  Methodology
  Experiments
  Conclusion
  Comments
Motivation
  The hierarchical multi-label classification (HMC) problem:
    Instances may belong to multiple classes at the same time.
    The classes are organized in a hierarchy.
  More complex class hierarchies have a DAG (directed acyclic graph) structure, in which a class may have multiple parents.
Objectives
  The paper presents several approaches to the induction of decision trees for the HMC problem, together with an empirical study of their use in functional genomics.
  It also shows how the decision tree approaches can be modified to support class hierarchies with a DAG structure.
  The three proposed approaches:
    SC: learn a separate binary decision tree for each class label.
    HSC: learn and apply such single-label decision trees in a hierarchical way.
    HMC: learn one tree that predicts all classes at once.
Methodology
  Definition.
  The Predictive Clustering Tree (PCT) framework.
  The three approaches.
  Adaptation to the DAG structure.

Methodology: definition of the HMC task
  Given:
    an instance space X;
    a class hierarchy (C, ≤_h), where for all c_1, c_2 ∈ C, c_1 ≤_h c_2 means that c_1 is a superclass of c_2;
    a set T of examples (x_i, S_i);
    a quality criterion q.
  Find: a function f : X → 2^C such that f maximizes q and c ∈ f(x) ⇒ ∀ c′ ≤_h c : c′ ∈ f(x).
  Example: with C = { 1, 2, 2.1, 2.2, 3 }, the label set S_1 = { 2, 2.1, 2.2 } is encoded as the class vector V_1 = { 0, 1, 1, 1, 0 }.
  [Figure: a decision tree annotated with class counts (C0: 8, C1: 2; C0: 2, C1: 3; C0: 10, C1: 5), next to a set T of examples (x_i, S_i) with class vectors V_2 = { 1, 0, 0, 0, 1 }, V_3 = { 1, 0, 0, 0, 0 }, V_5 = { 1, 1, 0, 0, 0 }, V_6 = { 1, 1, 0, 0, 1 }, V_1 = { 0, 1, 1, 1, 0 }, V_3 = { 0, 1, 1, 0, 0 }]
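The encoding and the hierarchy constraint in the definition above can be illustrated with a minimal Python sketch. It uses the slide's example class set; the function names and the idea of deriving a parent from a dotted class ID ('2.1' has parent '2') are assumptions made for illustration and only cover tree-shaped hierarchies.

```python
# Class set from the slide's example; order fixes the vector components.
CLASSES = ["1", "2", "2.1", "2.2", "3"]

def parent(c):
    """Parent of a class, assuming dotted IDs encode a tree ('2.1' -> '2')."""
    return c.rsplit(".", 1)[0] if "." in c else None

def to_vector(labels):
    """Encode a label set S_i as a 0/1 class vector over CLASSES."""
    return [1 if c in labels else 0 for c in CLASSES]

def satisfies_hierarchy(labels):
    """Check the constraint: every class in the set has all its ancestors in the set."""
    for c in labels:
        p = parent(c)
        while p is not None:
            if p not in labels:
                return False
            p = parent(p)
    return True

S1 = {"2", "2.1", "2.2"}
V1 = to_vector(S1)  # the slide's V_1
```

A label set such as { 2.1 } alone would fail the check, because its superclass 2 is missing.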
Methodology: the Clus-HMC approach
  Example class vectors:
    V_2 = { 1, 0, 0, 0, 1 }, V_3 = { 1, 0, 0, 0, 0 }, V_5 = { 1, 1, 0, 0, 0 }, V_6 = { 1, 1, 0, 0, 1 }
    V_1 = { 0, 1, 1, 1, 0 }, V_3 = { 0, 1, 1, 0, 0 }
  Leaf vectors:
    V_l = { 0, 1, 1, 0.5, 0 }, V_r = { 0, 1, 1, 0.5, 0 }
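The vector V_l = { 0, 1, 1, 0.5, 0 } shown above is consistent with a component-wise mean of the class vectors { 0, 1, 1, 1, 0 } and { 0, 1, 1, 0, 0 }. The sketch below assumes that a leaf stores such a mean vector and that a class is predicted when its component reaches a threshold. The function names and the threshold values are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of equally long class vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def predict(prototype, threshold=0.5):
    """Predict class i when the i-th mean component is at least the threshold (assumed rule)."""
    return [1 if p >= threshold else 0 for p in prototype]

# The two class vectors from the slide that average to V_l.
leaf_examples = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
prototype = mean_vector(leaf_examples)
```

Lowering the threshold predicts more classes and raising it predicts fewer, which is what makes a precision-recall evaluation of such a tree possible.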
Methodology: the Clus-SC approach
  The per-class trees can be constructed with any classification tree induction algorithm.
  Clus-HMC reduces to a single-label binary classification tree learner when there is only one class, so HMC and SC can use the same induction algorithm.
Methodology: the Clus-HSC approach
  The probability of a class is obtained from that of its parent: P(c) = P(c | par(c)) · P(par(c))
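The product rule P(c) = P(c | par(c)) · P(par(c)) can be applied recursively from a class up to the top of the hierarchy. The following sketch assumes that each per-class tree returns a conditional probability for one instance; the dictionaries, the probability values, and the function name are made up for illustration.

```python
def hsc_probability(c, cond_prob, parent):
    """P(c) = P(c | par(c)) * P(par(c)); a top-level class uses its own estimate."""
    p = cond_prob[c]
    par = parent.get(c)
    if par is None:
        return p
    return p * hsc_probability(par, cond_prob, parent)

# Hypothetical tree hierarchy over the slide's class set.
parent = {"1": None, "2": None, "2.1": "2", "2.2": "2", "3": None}

# Hypothetical outputs of the per-class trees for one instance.
cond_prob = {"1": 0.1, "2": 0.8, "2.1": 0.5, "2.2": 0.25, "3": 0.3}

p_2_1 = hsc_probability("2.1", cond_prob, parent)  # 0.5 * 0.8
p_2_2 = hsc_probability("2.2", cond_prob, parent)  # 0.25 * 0.8
```

Because every factor is at most 1, a subclass never receives a higher probability than its parent, so thresholded predictions respect the hierarchy constraint.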
Methodology: adaptation to the DAG structure
  Adaptations to Clus-HMC.
  Adaptations to Clus-HSC: for a class c with parents par_j(c), P(c) = min_j P(c | par_j(c)) · P(par_j(c))
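For a DAG, the rule above combines the estimates obtained through each parent by taking their minimum. The sketch below reads the formula as a minimum over the products P(c | par_j(c)) · P(par_j(c)), which is an interpretation of the slide. The toy DAG, the probability values, and the function name are hypothetical.

```python
def dag_probability(c, cond_prob, parents, memo=None):
    """P(c) = min over parents j of P(c | par_j(c)) * P(par_j(c)).

    cond_prob maps (class, parent) to a conditional estimate; a class
    without parents uses the key (class, None).
    """
    if memo is None:
        memo = {}
    if c in memo:
        return memo[c]
    pars = parents.get(c, [])
    if not pars:
        result = cond_prob[(c, None)]
    else:
        result = min(
            cond_prob[(c, p)] * dag_probability(p, cond_prob, parents, memo)
            for p in pars
        )
    memo[c] = result
    return result

# Hypothetical DAG: class "c" has two parents, "a" and "b".
parents = {"a": [], "b": [], "c": ["a", "b"]}
cond_prob = {
    ("a", None): 0.5,
    ("b", None): 0.8,
    ("c", "a"): 0.5,   # via a: 0.5 * 0.5 = 0.25
    ("c", "b"): 0.25,  # via b: 0.25 * 0.8 = 0.2
}
p_c = dag_probability("c", cond_prob, parents)
```

Taking the minimum keeps P(c) no larger than the probability of any of its parents, so the hierarchy constraint still holds when a class has several parents.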
Experiments
  Predictive performance measures (precision-recall based evaluation, PR curves):
    Area under the average PR curve: AU(PRC)
    Average area under the PR curves: AUPRC_w
  Data sets: yeast functional genomics
    There are 12 yeast data sets.
    FunCat: a tree-structured class hierarchy.
    Gene Ontology (GO): a directed acyclic graph (DAG).
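One point of an average PR curve can be sketched as follows, assuming the curve is built by pooling true positives, false positives, and false negatives over all classes at a given threshold (a micro-average). That pooling, the toy data, the function name, and the convention of returning precision 1.0 when nothing is predicted are assumptions, not details given on the slide.

```python
def pooled_precision_recall(y_true, y_score, threshold):
    """Precision and recall at one threshold, with counts pooled over all classes."""
    tp = fp = fn = 0
    for true_row, score_row in zip(y_true, y_score):
        for truth, score in zip(true_row, score_row):
            predicted = score >= threshold
            if predicted and truth:
                tp += 1
            elif predicted and not truth:
                fp += 1
            elif truth:
                fn += 1
    # Convention (assumed): precision is 1.0 when no class is predicted.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy data: two instances, five classes each.
y_true = [[0, 1, 1, 1, 0], [1, 0, 0, 0, 1]]
y_score = [[0.0, 1.0, 1.0, 0.5, 0.0], [1.0, 0.5, 0.0, 0.0, 0.5]]

point_low = pooled_precision_recall(y_true, y_score, 0.5)
point_high = pooled_precision_recall(y_true, y_score, 0.75)
```

Sweeping the threshold from 1 down to 0 traces the curve, and the area under it can then be approximated from the resulting points.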
Experiments
  [Figures: results for FunCat and Gene Ontology]
Conclusion
  Clus-HMC has better predictive performance than Clus-SC and Clus-HSC, for both tree-structured and DAG-structured class hierarchies and for all evaluation measures.
  The HMC tree is much smaller than the total size of the models output by Clus-HSC and Clus-SC.
  Learning a single HMC tree is also much faster than learning many regular trees.
Comments
  Advantage: many examples and detailed explanations.
  Drawback: …
  Applications: text classification, functional genomics, object recognition.