Classification - CBA CS 485: Special Topics in Data Mining Jinze Liu.

Slides:



Advertisements
Similar presentations
Association Rule and Sequential Pattern Mining for Episode Extraction Jonathan Yip.
Advertisements

Huffman Codes and Asssociation Rules (II) Prof. Sin-Min Lee Department of Computer Science.
CSE 634 Data Mining Techniques
Data Mining Techniques Association Rule
Association rules and frequent itemsets mining
FUNGSI MAYOR Assosiation. What Is Association Mining? Association rule mining: –Finding frequent patterns, associations, correlations, or causal structures.
Chapter 5: Mining Frequent Patterns, Association and Correlations
Learning Fuzzy Association Rules and Associative Classification Rules Jianchao Han Computer Science Department California State University Dominguez Hills.
Data Mining Techniques So Far: Cluster analysis K-means Classification Decision Trees J48 (C4.5) Rule-based classification JRIP (RIPPER) Logistic Regression.
Data Mining Association Analysis: Basic Concepts and Algorithms Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach, Kumar Introduction.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Data Mining Association Analysis: Basic Concepts and Algorithms
1 Mining Quantitative Association Rules in Large Relational Database Presented by Jin Jin April 1, 2004.
Data Mining Association Analysis: Basic Concepts and Algorithms
6/23/2015CSE591: Data Mining by H. Liu1 Association Rules Transactional data Algorithm Applications.
© Vipin Kumar CSci 8980 Fall CSci 8980: Data Mining (Fall 2002) Vipin Kumar Army High Performance Computing Research Center Department of Computer.
Fast Algorithms for Association Rule Mining
Research Project Mining Negative Rules in Large Databases using GRD.
Mining Association Rules
1 Fast Algorithms for Mining Association Rules Rakesh Agrawal Ramakrishnan Srikant Slides from Ofer Pasternak.
Pattern Recognition Lecture 20: Data Mining 3 Dr. Richard Spillman Pacific Lutheran University.
Association Discovery from Databases Association rules are a simple formalism for expressing positive connections between columns in a 0/1 matrix. A classical.
CS685 : Special Topics in Data Mining, UKY The UNIVERSITY of KENTUCKY Classification - SVM CS 685: Special Topics in Data Mining Jinze Liu.
Association Rules. CS583, Bing Liu, UIC 2 Association rule mining Proposed by Agrawal et al in Initially used for Market Basket Analysis to find.
Chapter 2: Association Rules & Sequential Patterns.
Chapter 2: Association Rules & Sequential Patterns.
3.Mining Association Rules in Large Database 3.1 Market Basket Analysis:Example for Association Rule Mining 1.A typical example of association rule mining.
CS685 : Special Topics in Data Mining, UKY The UNIVERSITY of KENTUCKY Classification - SVM CS 685: Special Topics in Data Mining Spring 2008 Jinze Liu.
CSE4334/5334 DATA MINING CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai.
Association Rule Mining Data Mining and Knowledge Discovery Prof. Carolina Ruiz and Weiyang Lin Department of Computer Science Worcester Polytechnic Institute.
Association Rule.. Association rule mining  It is an important data mining model studied extensively by the database and data mining community.  Assume.
Association rule mining Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf). Assume all data.
Association rule mining Goal: Find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf). Assume all data.
Mining Quantitative Association Rules in Large Relational Tables ACM SIGMOD Conference 1996 Authors: R. Srikant, and R. Agrawal Presented by: Sasi Sekhar.
1 What is Association Analysis: l Association analysis uses a set of transactions to discover rules that indicate the likely occurrence of an item based.
Association Rule Mining
© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/ Data Mining: Association Analysis This lecture node is modified based on Lecture Notes for.
CS685 : Special Topics in Data Mining, UKY The UNIVERSITY of KENTUCKY Classification - SVM CS 685: Special Topics in Data Mining Jinze Liu.
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL Classification COMP Seminar BCB 713 Module Spring 2011.
Mining Frequent Patterns. What Is Frequent Pattern Analysis? Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs.
Data Mining  Association Rule  Classification  Clustering.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Chapter 8 Association Rules. Data Warehouse and Data Mining Chapter 10 2 Content Association rule mining Mining single-dimensional Boolean association.
CS685 : Special Topics in Data Mining, UKY The UNIVERSITY of KENTUCKY Association Rule Mining CS 685: Special Topics in Data Mining Jinze Liu.
1 Data Mining Lecture 6: Association Analysis. 2 Association Rule Mining l Given a set of transactions, find rules that will predict the occurrence of.
Chapter 3 Data Mining: Classification & Association Chapter 4 in the text box Section: 4.3 (4.3.1),
Data Mining – Association Rules
Chapter 2: Mining Association Rules
Data Mining Find information from data data ? information.
Data Mining: Concepts and Techniques
Association rule mining
Knowledge discovery & data mining Association rules and market basket analysis--introduction UCLA CS240A Course Notes*
Frequent Pattern Mining
William Norris Professor and Head, Department of Computer Science
Waikato Environment for Knowledge Analysis
Data Mining Association Analysis: Basic Concepts and Algorithms
Association Rule Mining
Data Mining Association Analysis: Basic Concepts and Algorithms
Transactional data Algorithm Applications
Data Mining Association Analysis: Basic Concepts and Algorithms
Association Rule Mining
©Jiawei Han and Micheline Kamber
Department of Computer Science National Tsing Hua University
Chapter 2: Association Rules & Sequential Patterns
Association Rule Mining
CS 685: Special Topics in Data Mining Spring 2008 Jinze Liu
CS 685: Special Topics in Data Mining Spring 2009 Jinze Liu
Association Analysis: Basic Concepts
Presentation transcript:

Classification - CBA CS 485: Special Topics in Data Mining Jinze Liu

Association Rules Itemset X = {x 1, …, x k } Find all the rules X  Y with minimum support and confidence support, s, is the probability that a transaction contains X  Y confidence, c, is the conditional probability that a transaction having X also contains Y Let sup min = 50%, conf min = 50% Association rules: A  C (60%, 100%) C  A (60%, 75%) Customer buys diaper Customer buys both Customer buys beer Transaction- id Items bought 100f, a, c, d, g, I, m, p 200a, b, c, f, l,m, o 300b, f, h, j, o 400b, c, k, s, p 500a, f, c, e, l, p, m, n 2

Classification based on Association Classification rule mining versus Association rule mining Aim A small set of rules as classifier All rules according to minsup and minconf Syntax X  y X  Y 3

Why & How to Integrate Both classification rule mining and association rule mining are indispensable to practical applications. The integration is done by focusing on a special subset of association rules whose right-hand-side are restricted to the classification class attribute. CARs: class association rules 4

CBA: Three Steps Discretize continuous attributes, if any Generate all class association rules (CARs) Build a classifier based on the generated CARs. 5

Our Objectives To generate the complete set of CARs that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf) constraints. To build a classifier from the CARs. 6

Rule Generator: Basic Concepts Ruleitem :condset is a set of items, y is a class label Each ruleitem represents a rule: condset->y condsupCount The number of cases in D that contain condset rulesupCount The number of cases in D that contain the condset and are labeled with class y Support =(rulesupCount/|D|)*100% Confidence =(rulesupCount/condsupCount)*100% 7

RG: Basic Concepts (Cont.) Frequent ruleitems A ruleitem is frequent if its support is above minsup Accurate rule A rule is accurate if its confidence is above minconf Possible rule For all ruleitems that have the same condset, the ruleitem with the highest confidence is the possible rule of this set of ruleitems. The set of class association rules (CARs) consists of all the possible rules (PRs) that are both frequent and accurate. 8

RG: An Example A ruleitem: assume that the support count of the condset (condsupCount) is 3, the support of this ruleitem (rulesupCount) is 2, and |D|=10 then (A,1),(B,1) -> (class,1) supt=20% (rulesupCount/|D|)*100% confd=66.7% (rulesupCount/condsupCount)*100% 9

RG: The Algorithm 1 F 1 = {large 1-ruleitems}; 2 CAR 1 = genRules (F 1 ); 3 prCAR 1 = pruneRules (CAR 1 ); //count the item and class occurrences to determine the frequent 1-ruleitems and prune it 4 for (k = 2; F k-1  Ø; k++) do 5C k = candidateGen (F k-1 ); //generate the candidate ruleitems C k using the frequent ruleitems F k-1 6 for each data case d  D do //scan the database 7C d = ruleSubset (C k, d); //find all the ruleitems in C k whose condsets are supported by d 8 for each candidate c  C d do 9 c.condsupCount++; 10 if d.class = c.class then c.rulesupCount++; //update various support counts of the candidates in C k 11 end 12 end 10

RG: The Algorithm(cont.) 13F k = {c  C k | c.rulesupCount  minsup}; //select those new frequent ruleitems to form F k 14 CAR k = genRules(F k ); //select the ruleitems both accurate and frequent 15 prCAR k = pruneRules(CAR k ); 16 end 17 CARs =  k CAR k ; 18 prCARs =  k prCAR k ; 11