Rule Generation from Decision Tree

Presentation transcript:

Rule Generation from Decision Tree
Decision tree classifiers are a popular method of classification because they are easy to understand.
However, a decision tree can become large and difficult to interpret.
Compared with a decision tree, IF-THEN rules may be easier for humans to understand, particularly if the tree is very large.

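Rule Generation from Decision Tree
Rules are easier to understand than large trees.
One rule is created for each path from the root to a leaf.
Each attribute-value pair along a path becomes one condition in the rule antecedent (the IF part); the conditions are joined by AND.
The leaf holds the class prediction, which forms the rule consequent (the THEN part).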

Rule Generation from Decision Tree
Example: rule extraction from our buys_computer decision tree
IF age = young AND student = no THEN buys_computer = no
IF age = young AND student = yes THEN buys_computer = yes
IF age = mid-age THEN buys_computer = yes
IF age = old AND credit_rating = excellent THEN buys_computer = yes
IF age = old AND credit_rating = fair THEN buys_computer = no

Rule Generation from Decision Tree
Rules extracted from a decision tree are expected to be mutually exclusive and exhaustive.
Mutually exclusive: no two rules are triggered for the same tuple, so the rules cannot conflict.
Exhaustive: there is one rule for each possible attribute-value combination, so the rule set does not require a default rule.
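
Both properties can be checked by brute force on a small rule set: enumerate every attribute-value combination and count how many rules fire. The sketch below assumes three attributes with the value sets shown in DOMAINS and a five-rule set in which the last two rules test age = old; these names and values are illustrative assumptions.

```python
from itertools import product

# Assumed attribute domains for the buys_computer example.
DOMAINS = {
    "age": ["young", "mid-age", "old"],
    "student": ["no", "yes"],
    "credit_rating": ["fair", "excellent"],
}

# Each rule is (conditions, predicted class).
RULES = [
    ({"age": "young", "student": "no"}, "no"),
    ({"age": "young", "student": "yes"}, "yes"),
    ({"age": "mid-age"}, "yes"),
    ({"age": "old", "credit_rating": "excellent"}, "yes"),
    ({"age": "old", "credit_rating": "fair"}, "no"),
]


def fires(conditions, record):
    """A rule fires when every one of its conditions matches the record."""
    return all(record[attr] == val for attr, val in conditions.items())


def coverage_counts(rules, domains):
    """Map each possible tuple to the number of rules it triggers."""
    names = list(domains)
    counts = {}
    for values in product(*(domains[n] for n in names)):
        record = dict(zip(names, values))
        counts[values] = sum(fires(cond, record) for cond, _ in rules)
    return counts


COUNTS = coverage_counts(RULES, DOMAINS)
MUTUALLY_EXCLUSIVE = all(n <= 1 for n in COUNTS.values())
EXHAUSTIVE = all(n >= 1 for n in COUNTS.values())
print(MUTUALLY_EXCLUSIVE, EXHAUSTIVE)
```

If any tuple triggered two rules, or none, the corresponding flag would come out False and the rule set would need conflict resolution or a default rule.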

Association Classification
Association rules show strong associations between items that occur frequently in a given data set.
The discovery of association rules is based on frequent itemset mining.
The general idea of association classification is that we can search for strong associations between frequent patterns and class labels.

Association Classification
All association rules must satisfy certain criteria regarding their:
Support: the proportion of the data set that they actually represent
Confidence: their accuracy, that is, the proportion of tuples matching the rule antecedent that also match the rule consequent

Association Classification
Association rules can have any number of items in the rule antecedent (left-hand side) and any number of items in the rule consequent (right-hand side).
However, in association classification we are only interested in association rules of the form p1 ^ p2 ^ … => A_class = C, where each p is an attribute-value pair and C is a class label.

Association Classification
Age = young ^ credit = ok => buys_computer = yes [support = 20%, confidence = 93%]
The percentage of tuples in D that satisfy the rule's antecedent and have class label C is called the support of rule R.
A support of 20% for this rule means that 20% of the customers in D are young, have an OK credit rating, and belong to the class buys_computer = yes.
The confidence is the rule's accuracy: 93% of the customers in D who are young and have an OK credit rating belong to the class buys_computer = yes.

Association Classification
Regard each row as one transaction.

Association Classification
Age  Income  Student  Credit  Class
A1   B1      C1       D1      N
A1   B1      C1       D2      N
A2   B1      C1       D1      Y
A3   B2      C1       D1      Y
A3   B3      C2       D1      Y
A3   B3      C2       D2      N
A2   B3      C2       D2      Y
A1   B2      C1       D1      N
A1   B3      C2       D1      Y
A3   B2      C2       D1      Y
A1   B2      C2       D2      Y
A2   B2      C1       D2      Y
A2   B1      C2       D1      Y
A3   B2      C1       D2      N
A1: age <= 30; A2: age between 31 and 40; A3: age > 40
B1: high income; B2: medium income; B3: low income
C1: not a student; C2: student
D1: fair credit; D2: excellent credit
Y: buys a computer; N: does not buy a computer

Association Classification
Set the minimum support to 20%.
14 * 20% = 2.8, so the minimum support count is 3 (2.8 rounded up).

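Association Classification
Find the 1-itemsets on the attributes and count their occurrences:
A1: 5, A2: 4, A3: 5
B1: 4, B2: 6, B3: 4
C1: 7, C2: 7
D1: 8, D2: 6
Every 1-itemset meets the minimum support count of 3.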
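
The minimum support count and the 1-itemset counts can be reproduced with a short script. This is a sketch under assumptions: the 14 transactions are typed in as the packed codes used on the data slide, and the names ROWS, parse, MIN_SUPPORT, MIN_COUNT, ITEM_COUNTS and FREQUENT_1 are made up for the example.

```python
import math
from collections import Counter

# The 14 transactions; each string packs the age (A), income (B),
# student (C) and credit (D) codes followed by the class label.
ROWS = """A1B1C1D1N A1B1C1D2N A2B1C1D1Y A3B2C1D1Y A3B3C2D1Y A3B3C2D2N A2B3C2D2Y
A1B2C1D1N A1B3C2D1Y A3B2C2D1Y A1B2C2D2Y A2B2C1D2Y A2B1C2D1Y A3B2C1D2N""".split()


def parse(row):
    """Split 'A1B1C1D1N' into (['A1', 'B1', 'C1', 'D1'], 'N')."""
    items = [row[i:i + 2] for i in range(0, 8, 2)]
    return items, row[8]


TRANSACTIONS = [parse(r) for r in ROWS]

MIN_SUPPORT = 0.20
# 14 * 0.20 = 2.8, rounded up to a whole number of transactions.
MIN_COUNT = math.ceil(MIN_SUPPORT * len(TRANSACTIONS))

# Count how many transactions contain each attribute value.
ITEM_COUNTS = Counter(item for items, _ in TRANSACTIONS for item in items)
FREQUENT_1 = {item: n for item, n in ITEM_COUNTS.items() if n >= MIN_COUNT}

for item in sorted(ITEM_COUNTS):
    print(item, ITEM_COUNTS[item])
```

With a minimum count of 3, none of the ten attribute values is pruned at this level, so all of them feed into the 2-itemset step.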

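Association Classification
Generate all 2-item combinations: there are a lot of them!
Use Apriori and some observations to prune the candidates. For example, A1 and A2 can never form a frequent itemset, because they are values of the same attribute and never occur in the same transaction.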

Association Classification
A1 B1, A1 B2, A1 B3, A1 C1, A1 C2, A1 D1, A1 D2
A2 B1, A2 B2, A2 B3, A2 C1, A2 C2, A2 D1, A2 D2

Association Classification
A3 B1, A3 B2, A3 B3, A3 C1, A3 C2, A3 D1, A3 D2
B1 C1, B1 C2, B1 D1, B1 D2
B2 C1, B2 C2, B2 D1, B2 D2

Association Classification
B3 C1, B3 C2, B3 D1
C1 D1, C1 D2
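
The 2-itemset step can be sketched as follows. The code pairs up the frequent 1-itemsets, skips pairs that are values of the same attribute, counts each remaining candidate, and keeps those that reach the minimum support count of 3. It is a simplified illustration of the Apriori candidate-and-count idea rather than a full implementation, and the names ROWS, CANDIDATES, PAIR_COUNTS and FREQUENT_2 are assumptions.

```python
from collections import Counter
from itertools import combinations

ROWS = """A1B1C1D1N A1B1C1D2N A2B1C1D1Y A3B2C1D1Y A3B3C2D1Y A3B3C2D2N A2B3C2D2Y
A1B2C1D1N A1B3C2D1Y A3B2C2D1Y A1B2C2D2Y A2B2C1D2Y A2B1C2D1Y A3B2C1D2N""".split()

# Keep only the attribute codes of each transaction (class label dropped).
TRANSACTIONS = [frozenset(r[i:i + 2] for i in range(0, 8, 2)) for r in ROWS]
MIN_COUNT = 3

# Level 1: items that reach the minimum support count.
item_counts = Counter(item for t in TRANSACTIONS for item in t)
FREQUENT_ITEMS = sorted(i for i, n in item_counts.items() if n >= MIN_COUNT)

# Level 2 candidates: pairs of frequent items from different attributes.
# The first character of a code ('A', 'B', 'C', 'D') names the attribute.
CANDIDATES = [
    pair for pair in combinations(FREQUENT_ITEMS, 2)
    if pair[0][0] != pair[1][0]
]

# Count the transactions containing both items of each candidate pair.
PAIR_COUNTS = {
    pair: sum(set(pair) <= t for t in TRANSACTIONS) for pair in CANDIDATES
}
FREQUENT_2 = {pair: n for pair, n in PAIR_COUNTS.items() if n >= MIN_COUNT}

for pair, n in sorted(FREQUENT_2.items()):
    print(pair, n)
```

Skipping same-attribute pairs removes 8 of the 45 possible pairs before any counting is done, which is the kind of observation the slides use to keep the candidate list manageable.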

Association Classification
A2 => Y: support 4/14, confidence(A2 => Y) = 4/4
A1, C1 => N: support 3/14, confidence(A1, C1 => N) = 3/3
A1, C2 => Y: support 2/14
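
The support and confidence figures for these class rules can be recomputed directly from the 14 transactions. The sketch below assumes the packed row codes from the data slide; the helper names rule_stats and describe are hypothetical.

```python
ROWS = """A1B1C1D1N A1B1C1D2N A2B1C1D1Y A3B2C1D1Y A3B3C2D1Y A3B3C2D2N A2B3C2D2Y
A1B2C1D1N A1B3C2D1Y A3B2C2D1Y A1B2C2D2Y A2B2C1D2Y A2B1C2D1Y A3B2C1D2N""".split()

# Each transaction is (set of attribute codes, class label).
TRANSACTIONS = [
    (frozenset(r[i:i + 2] for i in range(0, 8, 2)), r[8]) for r in ROWS
]


def rule_stats(antecedent, label):
    """Return (hits, covered, total) for the rule antecedent => label.

    covered: transactions containing every item of the antecedent
    hits:    covered transactions that also carry the class label
    """
    covered = [cls for items, cls in TRANSACTIONS if set(antecedent) <= items]
    hits = sum(cls == label for cls in covered)
    return hits, len(covered), len(TRANSACTIONS)


def describe(antecedent, label):
    """Format support as hits/total and confidence as hits/covered."""
    hits, covered, total = rule_stats(antecedent, label)
    confidence = f"{hits}/{covered}" if covered else "undefined"
    lhs = ", ".join(antecedent)
    return f"{lhs} => {label}: support {hits}/{total}, confidence {confidence}"


for antecedent, label in [(("A2",), "Y"), (("A1", "C1"), "N"), (("A1", "C2"), "Y")]:
    print(describe(antecedent, label))
```

Support divides by the size of the whole data set, while confidence divides by the number of transactions that match the antecedent, so a rule can have perfect confidence and still fall short of the minimum support count.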