Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Luca Cagliero, Paolo Garza 2013.DKE. Improving classification models with taxonomy.

Presentation transcript:

Intelligent Database Systems Lab Presenter: CHANG, SHIH-JIE Authors: Luca Cagliero, Paolo Garza 2013.DKE. Improving classification models with taxonomy information

Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

Motivation: Many approaches to building accurate classifiers have been proposed, but integrating taxonomy information into the data used for classifier training has not been investigated so far.

Objectives: This paper presents a general-purpose strategy to improve the accuracy of structured data classifiers by exploiting a taxonomy built over the data items.

Definition. Aggregation tree
Definition. Multiple-taxonomy: Let $T = \{t_1, \ldots, t_n\}$ be a set of attributes. A multiple-taxonomy $\Theta = \{AT_1, \ldots, AT_m\}$ is a forest of aggregation trees defined on the domains of the attributes in $T$.
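As an illustration of the definition above, a multiple-taxonomy can be sketched as one child-to-parent map per attribute. This is only a minimal sketch: the names `MULTI_TAXONOMY` and `ancestors`, and the sample values, are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical multiple-taxonomy: one aggregation tree per attribute,
# each stored as a child -> parent map. Values are illustrative only.
MULTI_TAXONOMY: Dict[str, Dict[str, str]] = {
    "Location": {
        "Turin": "Italy",
        "Rome": "Italy",
        "Paris": "France",
        "Italy": "Europe",
        "France": "Europe",
    },
}


def ancestors(attribute: str, value: str) -> List[str]:
    """Return the generalizations of `value`, nearest parent first."""
    tree = MULTI_TAXONOMY.get(attribute, {})
    result: List[str] = []
    # Walk up the aggregation tree until the root is reached.
    while value in tree:
        value = tree[value]
        result.append(value)
    return result


print(ancestors("Location", "Turin"))
```

An attribute without an aggregation tree simply has no generalizations, so `ancestors` returns an empty list for it.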

Methodology

Methodology – Multiple-taxonomy over data items in D

Methodology – Two-step process:
(i) Generalized classification rule mining. Example: {(Location, Italy)} ⇒ {(User category, Entrepreneur)} (s = 50%, c = 100%)
    (1) An extended version of the training dataset is generated first.
    (2) An FP-tree-like representation of the extended dataset is then generated. Only frequent items are included in the FP-tree.
(ii) Rule selection by means of lazy pruning.
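The support and confidence figures in the example rule can be reproduced on a toy dataset. The sketch below assumes that "extending" a record means adding the taxonomy ancestors of each of its items, and it counts by brute force instead of building an FP-tree. The records, the taxonomy and all function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Item = Tuple[str, str]  # (attribute, value)

# Hypothetical aggregation tree for one attribute (child -> parent).
TAXONOMY: Dict[str, Dict[str, str]] = {
    "Location": {"Turin": "Italy", "Rome": "Italy",
                 "Paris": "France", "Lyon": "France"},
}


def extend_record(record: Dict[str, str]) -> Set[Item]:
    """Add every taxonomy ancestor of each item to the record (assumed semantics)."""
    items: Set[Item] = set()
    for attribute, value in record.items():
        items.add((attribute, value))
        tree = TAXONOMY.get(attribute, {})
        while value in tree:
            value = tree[value]
            items.add((attribute, value))
    return items


def support_confidence(extended: List[Set[Item]], labels: List[str],
                       antecedent: Set[Item], label: str) -> Tuple[float, float]:
    """Support and confidence of the rule antecedent => label."""
    matched = [lab for items, lab in zip(extended, labels) if antecedent <= items]
    correct = sum(1 for lab in matched if lab == label)
    support = correct / len(labels)
    confidence = correct / len(matched) if matched else 0.0
    return support, confidence


# Toy training data: two records generalize to Italy, two to France.
records = [{"Location": "Turin"}, {"Location": "Rome"},
           {"Location": "Paris"}, {"Location": "Lyon"}]
labels = ["Entrepreneur", "Entrepreneur", "Student", "Student"]

extended = [extend_record(r) for r in records]
s, c = support_confidence(extended, labels, {("Location", "Italy")}, "Entrepreneur")
print(f"s={s:.0%}, c={c:.0%}")
```

No single city reaches 50% support here, but the generalized item (Location, Italy) does, which is the kind of rule the extended dataset makes available.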

Methodology – Lazy pruning
(1) Rules that only misclassify training data are pruned.
(2) Rules that correctly classify at least one training record are grouped in the Level I rule set, while rules that remain unused during the training phase are kept in Level II.
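The two pruning cases above can be sketched as a single pass over ranked rules. This is a simplified sketch, not the paper's procedure: it assumes the rules arrive already sorted by rank and that records matched by a Level I rule are removed from further consideration. The name `lazy_prune` and the toy data are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], str]    # (antecedent items, class label)
Record = Tuple[FrozenSet[str], str]  # (items, true class label)


def lazy_prune(rules: List[Rule],
               records: List[Record]) -> Tuple[List[Rule], List[Rule]]:
    """Split ranked rules into Level I and Level II; drop harmful rules."""
    remaining = list(records)
    level1: List[Rule] = []
    level2: List[Rule] = []
    for antecedent, label in rules:
        covered = [(items, lab) for items, lab in remaining if antecedent <= items]
        if not covered:
            # Rule was never used during training: keep it as a spare (Level II).
            level2.append((antecedent, label))
        elif any(lab == label for _, lab in covered):
            # Rule classifies at least one record correctly: Level I.
            level1.append((antecedent, label))
            # Assumption: covered records are removed once a Level I rule uses them.
            remaining = [(i, l) for i, l in remaining if not antecedent <= i]
        # Otherwise the rule only misclassifies training data and is discarded.
    return level1, level2


train = [(frozenset({"a"}), "X"), (frozenset({"b"}), "Y")]
ranked = [
    (frozenset({"a"}), "X"),  # correct on the first record
    (frozenset({"b"}), "X"),  # only misclassifies the second record
    (frozenset({"c"}), "Y"),  # matches nothing
]
level1, level2 = lazy_prune(ranked, train)
print(level1, level2)
```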

Methodology – The G−L3 algorithm

Methodology – G−L3 class prediction
When a new test case rt has to be classified, G−L3 considers the sorted rule sets in Level I and Level II. If no Level I rule matches rt, the top-ranked Level II rule matching rt is considered. If no rule in either set matches rt, the default class label is assigned to rt.
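The fallback order described above can be sketched as follows. The sketch assumes that both rule sets are already sorted by rank, that the top-ranked matching rule decides the label in Level I as well, and that the test case is given as a set of items already extended with taxonomy ancestors. The function name `predict` and the sample rules are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (antecedent items, class label)


def predict(items: Set[str], level1: List[Rule], level2: List[Rule],
            default_label: str) -> str:
    """Try Level I, then Level II, then fall back to the default class."""
    for rule_set in (level1, level2):
        for antecedent, label in rule_set:
            # A rule matches when all of its antecedent items occur in the test case.
            if antecedent <= items:
                return label
    return default_label


level1_rules = [(frozenset({"Location=Italy"}), "Entrepreneur")]
level2_rules = [(frozenset({"Location=France"}), "Student")]

print(predict({"Location=Turin", "Location=Italy"}, level1_rules, level2_rules, "Other"))
print(predict({"Location=Paris", "Location=France"}, level1_rules, level2_rules, "Other"))
print(predict({"Location=Tokyo"}, level1_rules, level2_rules, "Other"))
```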

Experiments – Dataset characteristics

Experiments – Accuracy comparison (baseline vs. extended)

Experiments – Accuracy comparison

Experiments – Execution time comparison

Conclusions: Taxonomy integration is shown to yield significant accuracy improvements.

Comments
Advantages: more accurate.
Applications: classification, data mining.