Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Decision trees for hierarchical multi-label classification.

Presentation transcript:

Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology

Decision trees for hierarchical multi-label classification
Presenter: Shao-Wei Cheng
Authors: Celine Vens, Jan Struyf, Leander Schietgat, Sašo Džeroski, Hendrik Blockeel
Machine Learning (ML), 2008

Outline
- Motivation
- Objective
- Methodology
- Experiments
- Conclusion
- Comments

Motivation
- The hierarchical multi-label classification (HMC) problem:
  - Instances may belong to multiple classes at the same time.
  - These classes are organized in a hierarchy.
- More complex class hierarchies have a DAG (directed acyclic graph) structure, in which classes may have multiple parents.

Objectives
- The paper presents several approaches to the induction of decision trees for the HMC problem, as well as an empirical study of their use in functional genomics.
- The paper shows how the decision tree approaches can be modified to support class hierarchies with a DAG structure.
- The three proposed approaches:
  - SC: learning a separate binary decision tree for each class label.
  - HSC: learning and applying such single-label decision trees in a hierarchical way.
  - HMC: learning one tree that predicts all classes at once.

Methodology
- Definition.
- The Predictive Clustering Tree (PCT) framework.
- The three approaches.
- Adaptation to the DAG structure.

Definition
Given:
- an instance space $X$;
- a class hierarchy $(C, \le_h)$, where for all $c_1, c_2 \in C$, $c_1 \le_h c_2$ means that $c_1$ is a superclass of $c_2$;
- a set $T$ of examples $(x_i, S_i)$;
- a quality criterion $q$.
Find: a function $f : X \to 2^C$ such that $f$ maximizes $q$ and $c \in f(x) \Rightarrow \forall c' \le_h c : c' \in f(x)$.

Example: each label set $S_i$ is encoded as a 0/1 class vector $V_i$ over $C = \{1, 2, 2.1, 2.2, 3\}$. For instance, $S_1 = \{2, 2.1, 2.2\}$ gives $V_1 = \{0, 1, 1, 1, 0\}$.

Class vectors shown for the set $T$ of examples:
- $V_2 = \{1, 0, 0, 0, 1\}$, $V_3 = \{1, 0, 0, 0, 0\}$, $V_5 = \{1, 1, 0, 0, 0\}$, $V_6 = \{1, 1, 0, 0, 1\}$
- $V_1 = \{0, 1, 1, 1, 0\}$, $V_3 = \{0, 1, 1, 0, 0\}$

[Figure: decision tree nodes labeled with class counts (C0: 8, C1: 2), (C0: 2, C1: 3), (C0: 10, C1: 5)]
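The encoding of a label set as a class vector can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `PARENT` map and the helper names `close_upward` and `to_vector` are assumptions introduced here, and the hierarchy is assumed to be the tree-shaped example $C = \{1, 2, 2.1, 2.2, 3\}$ from the slide.

```python
# Class order used for the vector components (taken from the slide example).
CLASSES = ["1", "2", "2.1", "2.2", "3"]

# Hypothetical parent map for this tree-shaped example; top-level classes have no parent.
PARENT = {"1": None, "2": None, "2.1": "2", "2.2": "2", "3": None}


def close_upward(labels):
    """Return the label set extended with every superclass of each label.

    This enforces the hierarchy constraint: if c is in the set,
    all superclasses of c are in the set as well.
    """
    closed = set()
    for c in labels:
        while c is not None and c not in closed:
            closed.add(c)
            c = PARENT[c]
    return closed


def to_vector(labels):
    """Encode a label set as a 0/1 vector over CLASSES."""
    closed = close_upward(labels)
    return [1 if c in closed else 0 for c in CLASSES]


v1 = to_vector({"2", "2.1", "2.2"})  # the slide's S_1
v_leaf_only = to_vector({"2.1"})     # superclass 2 is added automatically
```

Closing the label set upward before encoding means a vector can never mark a class without also marking its superclasses, which is the constraint placed on $f(x)$ in the definition.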

Methodology: the Clus-HMC approach

Class vectors of the examples, in two groups:
- $V_2 = \{1, 0, 0, 0, 1\}$, $V_3 = \{1, 0, 0, 0, 0\}$, $V_5 = \{1, 1, 0, 0, 0\}$, $V_6 = \{1, 1, 0, 0, 1\}$
- $V_1 = \{0, 1, 1, 1, 0\}$, $V_3 = \{0, 1, 1, 0, 0\}$

Mean vectors as shown on the slide: $V_l = \{0, 1, 1, 0.5, 0\}$, $V_r = \{0, 1, 1, 0.5, 0\}$
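As a rough sketch of how a mean class vector for a group of examples can be computed, the snippet below averages the class vectors component-wise and measures their spread around that mean. The function names are assumptions made for illustration, and the variance uses a plain (unweighted) squared Euclidean distance, which is a simplification chosen here rather than something stated on the slide. The slide lists the same vector for both $V_l$ and $V_r$; the code simply averages each group as listed, and the group $\{V_1, V_3\}$ averages to the $\{0, 1, 1, 0.5, 0\}$ shown.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length class vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def variance(vectors):
    """Average squared Euclidean distance of the vectors to their mean.

    Unweighted distance is an assumption made for this sketch.
    """
    mean = mean_vector(vectors)
    total = sum(
        sum((a - b) ** 2 for a, b in zip(v, mean))
        for v in vectors
    )
    return total / len(vectors)


# The two groups of class vectors listed on the slide.
group_a = [
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
]
group_b = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
]

mean_a = mean_vector(group_a)
mean_b = mean_vector(group_b)
```

Each component of a mean vector can be read as the proportion of examples in the group that belong to the corresponding class.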

Methodology

The Clus-SC approach
- The per-class trees can be constructed with any classification tree induction algorithm.
- Clus-HMC reduces to single-label binary classification when applied to a single class, so HMC and SC can use the same induction algorithm.

The Clus-HSC approach
- $P(c) = P(c \mid \mathrm{par}(c)) \cdot P(\mathrm{par}(c))$
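The product rule for Clus-HSC can be illustrated with a short sketch. It assumes a tree-shaped hierarchy and that each per-class tree yields a conditional probability $P(c \mid \mathrm{par}(c))$; the `PARENT` map, the function name, and the numbers in `cond` are all made up for illustration.

```python
# Hypothetical tree-shaped hierarchy; top-level classes have no parent.
PARENT = {"1": None, "2": None, "2.1": "2", "2.2": "2", "3": None}


def class_probability(c, cond):
    """Apply P(c) = P(c | par(c)) * P(par(c)) recursively up to the top level.

    cond[c] holds P(c | par(c)); for a top-level class it is simply P(c).
    """
    parent = PARENT[c]
    if parent is None:
        return cond[c]
    return cond[c] * class_probability(parent, cond)


# Illustrative conditional probabilities for one instance (made-up values).
cond = {"1": 0.1, "2": 0.8, "2.1": 0.5, "2.2": 0.25, "3": 0.3}

p_2_1 = class_probability("2.1", cond)  # 0.5 * 0.8
p_2_2 = class_probability("2.2", cond)  # 0.25 * 0.8
```

Because each conditional probability is at most 1, the probability of a class never exceeds that of its parent, so predictions obtained by thresholding stay consistent with the hierarchy.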

Methodology: adaptation to the DAG structure
- Adaptations to Clus-HMC.
- Adaptations to Clus-HSC:
  $P(c) = \min_j P(c \mid \mathrm{par}_j(c)) \cdot P(\mathrm{par}_j(c))$
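The minimum rule for a class with several parents can be sketched as below. The three-class DAG, the `PARENTS` map, the function name, and the probabilities are hypothetical; the sketch assumes one conditional estimate $P(c \mid \mathrm{par}_j(c))$ is available per (class, parent) pair.

```python
# Hypothetical DAG: class "c" has two parents, "a" and "b".
PARENTS = {"a": [], "b": [], "c": ["a", "b"]}


def class_probability(c, cond):
    """Apply P(c) = min_j P(c | par_j(c)) * P(par_j(c)).

    cond[(c, p)] holds P(c | p); for a class without parents,
    cond[(c, None)] holds P(c) directly.
    """
    parents = PARENTS[c]
    if not parents:
        return cond[(c, None)]
    return min(cond[(c, p)] * class_probability(p, cond) for p in parents)


# Illustrative values for one instance (made up).
cond = {
    ("a", None): 0.9,
    ("b", None): 0.5,
    ("c", "a"): 0.5,  # path through a gives 0.5 * 0.9 = 0.45
    ("c", "b"): 0.8,  # path through b gives 0.8 * 0.5 = 0.40
}

p_c = class_probability("c", cond)  # the smaller of the two path products
```

Taking the minimum over the parents keeps the probability of a class no larger than that of any of its parents, which preserves the hierarchy constraint in a DAG.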

Experiments

Predictive performance measures (precision-recall based evaluation, PR curves):
- Area under the average PR curve: $AU(\overline{PRC})$
- Average area under the PR curves: $\overline{AUPRC}_w$

Data sets: yeast functional genomics
- There are 12 yeast data sets.
- FunCat: a tree-structured class hierarchy.
- Gene Ontology (GO): a directed acyclic graph (DAG).
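One point on an average PR curve can be sketched by pooling the true positives, false positives, and false negatives of all classes at a single threshold. This is a simplified illustration under that pooling assumption; the function name, the scores, and the labels are made up, and the area computation itself is not shown.

```python
def pooled_precision_recall(scores, truths, threshold):
    """Precision and recall at one threshold, pooled over all classes.

    scores[i][k] is the predicted probability of class k for instance i;
    truths[i][k] is 1 if instance i belongs to class k, else 0.
    """
    tp = fp = fn = 0
    for score_row, truth_row in zip(scores, truths):
        for score, truth in zip(score_row, truth_row):
            predicted = score >= threshold
            if predicted and truth:
                tp += 1
            elif predicted and not truth:
                fp += 1
            elif truth:
                fn += 1
    # Conventions for empty denominators are a choice made for this sketch.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Two instances, three classes (made-up values).
scores = [[0.9, 0.2, 0.6], [0.4, 0.8, 0.1]]
truths = [[1, 0, 0], [1, 1, 0]]

precision, recall = pooled_precision_recall(scores, truths, 0.5)
```

Sweeping the threshold from 1 down to 0 and collecting the resulting (recall, precision) pairs traces out a curve; computing one such curve per class and averaging their areas would correspond to the second kind of measure listed above.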

Experiments
[Figure panels: FunCat; Gene Ontology]

Experiments (continued)
[Figure panels: FunCat; Gene Ontology]

Conclusion
- CLUS-HMC has better predictive performance than CLUS-SC and CLUS-HSC, both for tree-structured and DAG-structured class hierarchies, and for all evaluation measures.
- The size of the HMC tree is much smaller than the total size of the models output by CLUS-HSC and CLUS-SC.
- Learning a single HMC tree is also much faster than learning many regular trees.

Comments
Advantage
- Many examples and detailed explanations.
Drawback
- …
Application
- Text classification.
- Functional genomics.
- Object recognition.