Classification by Association Rules: Use Minimum Set of Rules Jianyu Yang December 10, 2003.

Presentation transcript:

Classification System
Problem: (A, B, C) => y | n ?
–Decision tree learning, etc.
Association rules: X => c
–X: antecedent, c: consequent
–Support & Confidence
–Algorithms: Apriori
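The slide names support and confidence without defining them. A minimal sketch, assuming the usual definitions for a class association rule X => c (support: fraction of all training instances that contain X and have class c; confidence: fraction of instances containing X that have class c). The toy data set `DATA` and the function name `support_and_confidence` are hypothetical and not taken from the presentation.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical toy training set: (set of attribute items, class label).
Instance = Tuple[FrozenSet[str], str]

DATA: List[Instance] = [
    (frozenset({"A", "B"}), "y"),
    (frozenset({"A", "B", "C"}), "y"),
    (frozenset({"A", "C"}), "n"),
    (frozenset({"B", "C"}), "n"),
    (frozenset({"A", "B", "C"}), "n"),
]


def support_and_confidence(
    antecedent: Iterable[str], consequent: str, data: List[Instance]
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent => consequent."""
    x = frozenset(antecedent)
    # Labels of all instances whose items include the antecedent.
    covered = [label for items, label in data if x <= items]
    # Covered instances that also carry the predicted class.
    correct = sum(1 for label in covered if label == consequent)
    support = correct / len(data) if data else 0.0
    confidence = correct / len(covered) if covered else 0.0
    return support, confidence


sup, conf = support_and_confidence({"A", "B"}, "y", DATA)
print(f"(A, B) => y: support={sup:.2f}, confidence={conf:.2f}")
# prints: (A, B) => y: support=0.40, confidence=0.67
```

Here (A, B) occurs in three of the five instances, two of which have class y, giving support 2/5 and confidence 2/3.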

Association Rules: Issues
Too many rules
–Inefficient
–Overfitting
The order in which rules are applied matters
–Example: (A, B) => y, (C) => n
Minimum Support (minsup)
Minimum Confidence (minconf)
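The ordering issue can be illustrated with the two example rules. This is a small sketch, assuming a first-match classifier (the first rule whose antecedent is contained in the instance decides the class); the function `classify` and its `default` class are hypothetical choices for illustration, not part of the presentation.

```python
from typing import FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], str]  # (antecedent, class label)


def classify(instance: FrozenSet[str], rules: List[Rule], default: str = "n") -> str:
    """Return the class of the first rule that covers the instance.

    The default class (an assumption here) is used when no rule fires.
    """
    for antecedent, label in rules:
        if antecedent <= instance:
            return label
    return default


r1: Rule = (frozenset({"A", "B"}), "y")  # (A, B) => y
r2: Rule = (frozenset({"C"}), "n")       # (C) => n
instance = frozenset({"A", "B", "C"})    # covered by both rules

print(classify(instance, [r1, r2]))  # prints: y
print(classify(instance, [r2, r1]))  # prints: n
```

The same instance receives a different class depending on which rule is tried first, which is why a rule-based classifier needs a well-defined rule order.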

MSR Algorithm
Ideas:
No redundant rules
–(A, B) => y
–(A, B, C) => y
Total order of rules
–"Occam's razor": favor general rules
Pre-pruning
–(A, B) => y
–(A, B, D) => ?

L_1 = {large 1-ruleitems};
CAR_1 = genRules(L_1);
pruneSet(L_1);
for (k = 2; L_{k-1} ≠ ∅; k++) do begin
    C_k = apriori-gen(L_{k-1});
    forall training instances t ∈ D do begin
        C_t = subset(C_k, t);
        forall candidates c ∈ C_t do
            c_i.count++ for class label i;
    end
    L_k = {c ∈ C_k | c_i.count ≥ minsup for any class i};
    CAR_k = genRules(L_k);
    pruneSet(L_k);
end
CARs = ∪_k CAR_k;
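The level-wise loop above can be sketched in Python. This is one plausible reading and not the author's implementation: the slides do not define genRules or pruneSet, so the sketch assumes that genRules emits a rule for the majority class of an itemset when its confidence reaches minconf, and that pruneSet drops every itemset that already produced a rule, so that no more specific rule such as (A, B, D) => ? is generated from it. The names `mine_cars` and `apriori_gen` and the toy data are hypothetical; minsup is treated as a fraction of the training set.

```python
from collections import Counter
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

Instance = Tuple[FrozenSet[str], str]
Rule = Tuple[FrozenSet[str], str, float, float]  # antecedent, class, support, confidence


def apriori_gen(prev_level: Set[FrozenSet[str]], k: int) -> Set[FrozenSet[str]]:
    """Join (k-1)-itemsets into k-itemsets whose every (k-1)-subset is in prev_level."""
    candidates = set()
    for a in prev_level:
        for b in prev_level:
            union = a | b
            if len(union) == k and all(
                frozenset(s) in prev_level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def mine_cars(data: List[Instance], minsup: float, minconf: float) -> List[Rule]:
    """Level-wise generation of class association rules with pre-pruning (sketch)."""
    n = len(data)
    rules: List[Rule] = []
    candidates = {frozenset({item}) for items, _ in data for item in items}
    k = 1
    while candidates:
        # Per candidate, count the covered instances of each class.
        counts = {c: Counter() for c in candidates}
        for items, label in data:
            for c in candidates:
                if c <= items:
                    counts[c][label] += 1
        # L_k: candidates reaching minsup for at least one class.
        large = {c for c, cnt in counts.items() if cnt and max(cnt.values()) / n >= minsup}
        survivors = set()
        for c in large:
            # Majority class; ties are broken by label order (an assumption).
            label, hits = max(sorted(counts[c].items()), key=lambda kv: kv[1])
            conf = hits / sum(counts[c].values())
            if conf >= minconf:
                rules.append((c, label, hits / n, conf))  # genRules
            else:
                survivors.add(c)  # pruneSet: only rule-less itemsets are extended
        k += 1
        candidates = apriori_gen(survivors, k)
    return rules


toy = [
    (frozenset({"A", "B"}), "y"),
    (frozenset({"A", "B", "C"}), "y"),
    (frozenset({"A", "C"}), "n"),
    (frozenset({"B", "C"}), "n"),
    (frozenset({"A", "B", "C"}), "n"),
]
for antecedent, label, sup, conf in mine_cars(toy, minsup=0.2, minconf=0.6):
    print(sorted(antecedent), "=>", label, f"sup={sup:.2f} conf={conf:.2f}")
```

On this toy data the sketch yields the two rules used as the ordering example, (C) => n and (A, B) => y. Because rules are emitted level by level, shorter (more general) antecedents come first, which is one simple way to realize the "favor general rules" ordering.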

minsup

minconf

Results: Error Rate Comparison

Conclusions
A new algorithm was designed to build a classification system using a minimum set of association rules.
In general, low minsup and high minconf produce low error rates.
Experiments on 26 benchmark datasets showed lower error rates than C4.5 (R8) on 17 datasets and than CBA (v2.0) on 16.
The new algorithm does not always produce lower error rates than other algorithms.