Association Rule Mining


What Is Association Rule Mining? Association rule mining finds frequent patterns or associations among sets of items or objects, usually in transactional data. Applications include market basket analysis, cross-marketing, catalog design, etc.

Association Mining Examples. Rule form: "Body → Head [support, confidence]".
buys(x, "diapers") → buys(x, "beers") [0.5%, 60%]
buys(x, "bread") → buys(x, "milk") [0.6%, 65%]
major(x, "CS") ∧ takes(x, "DB") → grade(x, "A") [1%, 75%]
age(X, 30-45) ∧ income(X, 50K-75K) → buys(X, SUV)
age="30-45", income="50K-75K" → car="SUV"

Market-Basket Analysis & Finding Associations. Do items occur together? Proposed by Agrawal et al. in 1993, it is an important data mining model studied extensively by the database and data mining community. It assumes all data are categorical, and was initially used for market basket analysis to find how items purchased by customers are related. Example: Bread → Milk [sup = 5%, conf = 100%]

Association Rule: Basic Concepts. Given: (1) a database of transactions, where (2) each transaction is a list of items (purchased by a customer in a visit). Find: all rules that correlate the presence of one set of items with that of another set of items. E.g., 98% of people who purchase tires and auto accessories also get automotive services done. Applications: * → Maintenance Agreement (what the store should do to boost maintenance agreement sales); Home Electronics → * (what other products the store should stock up on); detecting "ping-ponging" of patients or faulty "collisions".

Association Rule Mining. Given a set of transactions, find rules that will predict the occurrence of an item based on the occurrences of other items in the transaction. Example association rules from market-basket transactions: {Diaper} → {Beer}, {Milk, Bread} → {Eggs, Coke}, {Beer, Bread} → {Milk}. Implication means co-occurrence, not causality! An itemset is simply a set of items.

Examples from a Supermarket. Can you think of association rules from a supermarket? Suppose you identify association rules from a supermarket; how might you exploit them? That is, if you are the store manager, how might you make money? Assume you have a rule of the form X → Y.

Supermarket Examples. If you have a rule X → Y, you could:
- run a sale on X if you want to increase sales of Y
- locate the two items near each other
- locate the two items far from each other to make the shopper walk through the store
- print out a coupon at checkout for Y if the shopper bought X but not Y

Association "Rules" – Standard Format. Rule format: If {set of items} Then {set of items} (a set can consist of just a single item). The condition implies the result, e.g., If {Diapers, Baby Food} Then {Beer, Chips}. The right side is very often a single item. Rules do not imply causality. (Slide figure: Venn diagram of customers who buy diapers, customers who buy beer, and customers who buy both.)

What Is an Interesting Association? It requires domain-knowledge validation: actionable, non-trivial, understandable. Algorithms provide a first pass based on statistics measuring how "unexpected" an association is. Some standard statistics used for a rule C → R: support ≈ p(R ∧ C), the percent of "baskets" where the rule holds; confidence ≈ p(R|C), the percent of times R holds when C holds.

Support and Confidence. Find all the rules X → Y with minimum confidence and support. Support = probability that a transaction contains {X, Y}, i.e., the ratio of transactions in which X and Y occur together to all transactions in the DB. Confidence = conditional probability that a transaction having X contains Y, i.e., the ratio of transactions in which X and Y occur together to those in which X occurs. The confidence of a rule LHS ⇒ RHS can be computed as the support of the whole itemset divided by the support of the LHS: Confidence(LHS ⇒ RHS) = Support(LHS ∪ RHS) / Support(LHS)
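To make the two definitions concrete, here is a minimal Python sketch; the five-basket dataset is hypothetical, invented purely for illustration:

```python
# Support = fraction of all transactions containing the itemset;
# confidence(X -> Y) = support(X ∪ Y) / support(X).
baskets = [  # hypothetical mini-dataset
    {"diaper", "beer", "milk"},
    {"diaper", "beer"},
    {"milk", "bread"},
    {"diaper", "milk"},
    {"beer", "bread"},
]

def support(itemset, transactions):
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs, transactions):
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

print(support({"diaper", "beer"}, baskets))       # 2/5 = 0.4
print(confidence({"diaper"}, {"beer"}, baskets))  # 2/3 ≈ 0.67
```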

Definition: Frequent Itemset. An itemset is a collection of one or more items, e.g., {Milk, Bread, Diaper}; a k-itemset is an itemset with k items. Support count (σ) is the frequency count of occurrence of an itemset, e.g., σ({Milk, Bread, Diaper}) = 2. Support (s) is the fraction of transactions containing the itemset, e.g., s({Milk, Bread, Diaper}) = 2/5. A frequent itemset is an itemset whose support is greater than or equal to a minsup threshold.

Support and Confidence Calculations. Given the association rule {Milk, Diaper} → {Beer}, the rule evaluation metrics are: support (s), the fraction of transactions that contain both X and Y; and confidence (c), which measures how often items in Y appear in transactions that contain X. Now compute these two metrics.

Support and Confidence – 2nd Example. Itemset {A, C} has a support of 2/5 = 40%. Rule {A} ⇒ {C} has a confidence of 50%. Rule {C} ⇒ {A} has a confidence of 100%. What is the support of {A, C, E}? The support of {A, D, F}? The confidence of {A, D} ⇒ {F}? The confidence of {A} ⇒ {D, F}? Goal: find all rules that satisfy the user-specified minimum support (minsup) and minimum confidence (minconf).

Example Transaction Data. Assume minsup = 30% and minconf = 80% over the transactions:
t1: Beef, Chicken, Milk
t2: Beef, Cheese
t3: Cheese, Boots
t4: Beef, Chicken, Cheese
t5: Beef, Chicken, Clothes, Cheese, Milk
t6: Chicken, Clothes, Milk
t7: Chicken, Milk, Clothes
An example frequent itemset: {Chicken, Clothes, Milk} [sup = 3/7]. Rules from the itemset are partitions of its items. Association rules from the above itemset include: Clothes → Milk, Chicken [sup = 3/7, conf = 3/3] … Clothes, Chicken → Milk [sup = 3/7, conf = 3/3]

Mining Association Rules. Example rules: {Milk, Diaper} → {Beer} (s=0.4, c=0.67); {Milk, Beer} → {Diaper} (s=0.4, c=1.0); {Diaper, Beer} → {Milk} (s=0.4, c=0.67); {Beer} → {Milk, Diaper} (s=0.4, c=0.67); {Diaper} → {Milk, Beer} (s=0.4, c=0.5); {Milk} → {Diaper, Beer} (s=0.4, c=0.5). Observations: all the above rules are binary partitions of the same itemset, {Milk, Diaper, Beer}. Rules originating from the same itemset have identical support (by definition) but may have different confidence values.

Drawback of Confidence. Consider the contingency table:
            Coffee   No Coffee   Total
Tea           15         5         20
No Tea        75         5         80
Total         90        10        100
Association rule: Tea → Coffee. Confidence = P(Coffee|Tea) = 15/20 = 0.75, but P(Coffee) = 0.9. Although the confidence is high, the rule is misleading: P(Coffee|No Tea) = 75/80 = 0.9375, so buying tea actually lowers the chance of buying coffee.
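The numbers above can be checked directly from the table; a small sketch (the ratio computed at the end is the lift measure introduced later in these slides):

```python
# Tea/Coffee contingency table from the slide.
tea_coffee, tea_no_coffee = 15, 5
no_tea_coffee, no_tea_no_coffee = 75, 5
total = 100

p_coffee = (tea_coffee + no_tea_coffee) / total                   # 0.90
conf = tea_coffee / (tea_coffee + tea_no_coffee)                  # P(Coffee|Tea) = 0.75
conf_no_tea = no_tea_coffee / (no_tea_coffee + no_tea_no_coffee)  # 0.9375

print(conf, p_coffee, conf_no_tea)
print(conf / p_coffee)  # 0.833 < 1: tea buyers are *less* likely to buy coffee
```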

Mining Association Rules: Two-Step Approach. (1) Frequent itemset generation: generate all itemsets whose support ≥ minsup. (2) Rule generation: generate high-confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset. Frequent itemset generation is still computationally expensive.

Transaction Data Representation. This is a simplistic view of "shopping baskets": some important information is not considered, e.g., the quantity of each item purchased and the price paid.

Many Mining Algorithms. There are a large number of them, using different strategies and data structures, but their resulting sets of rules are all the same: given a transaction data set T, a minimum support, and a minimum confidence, the set of association rules existing in T is uniquely determined. Any algorithm should find the same set of rules, although their computational efficiencies and memory requirements may differ. We study only one: the Apriori algorithm.

The Apriori Algorithm. The best-known algorithm, it has two steps: (1) find all itemsets that have minimum support (frequent itemsets, also called large itemsets); (2) use the frequent itemsets to generate rules. E.g., from the frequent itemset {Chicken, Clothes, Milk} [sup = 3/7], one rule is Clothes → Milk, Chicken [sup = 3/7, conf = 3/3].

Step 1: Mining All Frequent Itemsets. A frequent itemset is an itemset whose support is ≥ minsup. Key idea: the Apriori property (downward closure property): every subset of a frequent itemset is also a frequent itemset. (Slide figure: the itemset lattice over items A, B, C, D, with levels {A, B, C, D}, {AB, AC, AD, BC, BD, CD}, and {ABC, ABD, ACD, BCD}.)

Steps in Association Rule Discovery. First, find frequent itemsets (itemsets with at least minimum support). Support is "downward closed," so every subset of a frequent itemset must be frequent: if {A, B} is a frequent itemset, both {A} and {B} are frequent itemsets. Conversely, if an itemset does not satisfy minimum support, none of its supersets will either (this is the key point that allows pruning of the search space). Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets). Then use the frequent itemsets to generate association rules: generate all binary partitions, though they may have to fit a template, e.g., only one item on the right side or only two items on the left side.

Frequent Itemset Generation. Given d items, there are 2^d possible candidate itemsets.

Mining Association Rules—An Example. The user specifies these thresholds: min. support 50%, min. confidence 50%. For rule A → C: support = support({A, C}) = 50%; confidence = support({A, C}) / support({A}) = 66.6%. The Apriori principle: any subset of a frequent itemset must be frequent.

Illustrating the Apriori Principle. (Slide figure: the itemset lattice, where an itemset found to be infrequent has all of its supersets pruned.)

The Apriori Algorithm. Terminology: Ck is the set of candidate k-itemsets; Lk is the set of frequent k-itemsets. Join step: Ck is generated by joining two elements from Lk-1; there must be a lot of overlap for the join to increase the length by only 1. Prune step: any (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset. This is a bit confusing since we want to use it the other way: we prune a candidate k-itemset if any of its (k-1)-itemsets is not in our list of frequent (k-1)-itemsets. To utilize this you simply start with k=1, which gives single-item itemsets, and then you work your way up from there!

The Algorithm. An iterative algorithm (also called level-wise search): find all 1-item frequent itemsets, then all 2-item frequent itemsets, and so on. In each iteration k, only consider itemsets that contain a frequent (k-1)-itemset. Find frequent itemsets of size 1: F1. For k = 2 onward: Ck = candidates of size k, i.e., those itemsets of size k that could be frequent given Fk-1; Fk = those candidates that are actually frequent, Fk ⊆ Ck (this requires scanning the database once).
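Below is a minimal, self-contained Python sketch of this level-wise loop (the function name and structure are mine, not from the slides; the join test used here, keeping the union of two frequent (k-1)-itemsets when that union has exactly k items, is a simple equivalent of the prefix-based join described on the next slides):

```python
from itertools import combinations

def apriori(transactions, minsup):
    """Return {frozenset: support_count} for all frequent itemsets."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]
    # F1: count single items, keep those with support >= minsup
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    freq = {s: c for s, c in counts.items() if c / n >= minsup}
    all_frequent = dict(freq)
    k = 2
    while freq:
        # Join step: merge frequent (k-1)-itemsets into size-k candidates
        prev = list(freq)
        candidates = set()
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                # Prune step: every (k-1)-subset must itself be frequent
                if len(union) == k and all(
                        frozenset(s) in freq for s in combinations(union, k - 1)):
                    candidates.add(union)
        # Scan the database once to count candidate support
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        freq = {s: c for s, c in counts.items() if c / n >= minsup}
        all_frequent.update(freq)
        k += 1
    return all_frequent

# On the seven-transaction example from earlier in the deck:
T = [{"Beef", "Chicken", "Milk"}, {"Beef", "Cheese"}, {"Cheese", "Boots"},
     {"Beef", "Chicken", "Cheese"},
     {"Beef", "Chicken", "Clothes", "Cheese", "Milk"},
     {"Chicken", "Clothes", "Milk"}, {"Chicken", "Milk", "Clothes"}]
for s, c in apriori(T, minsup=0.3).items():
    if len(s) == 3:
        print(sorted(s), c)  # ['Chicken', 'Clothes', 'Milk'] 3, i.e. sup = 3/7
```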

Apriori Candidate Generation. The candidate-gen function takes Lk-1 and returns a superset (called the candidates) of the set of all frequent k-itemsets. It has two steps: the join step generates all possible candidate itemsets Ck of length k, and the prune step removes those candidates in Ck that cannot be frequent.

How to Generate Candidates? Suppose the items in Lk-1 are listed in an order. Step 1: self-join Lk-1. (The description below is a bit confusing; all we do is splice two sets together so that only one new item is added, as shown on the next slide.)
insert into Ck
select p.item1, p.item2, …, p.itemk-1, q.itemk-1
from Lk-1 p, Lk-1 q
where p.item1 = q.item1, …, p.itemk-2 = q.itemk-2, p.itemk-1 < q.itemk-1
Step 2: pruning.
forall itemsets c in Ck do
  forall (k-1)-subsets s of c do
    if (s is not in Lk-1) then delete c from Ck

Self-Joining Step. All items in the itemsets to be self-joined are kept in a consistent order; any order will do, such as lexicographic (alphabetical) order. Two itemsets can be joined only if they differ only in the last position; when you join them, the size of the itemset goes up by one. See the example on the next slide.

Example of Generating Candidates (1). L3 = {abc, abd, acd, ace, bcd}. Self-joining L3*L3: abc and abd yield abcd; acd and ace yield acde. We do not join abd and acd, even though that would give abcd, which is a candidate: if the product were a candidate, it would have already been generated given the ordering. This may not be obvious at first glance.

Example of Generating Candidates (2). Note that for abcd to be frequent, by the Apriori property abc, abd, and bcd must all be frequent. abc and abd come alphabetically before bcd, so when we see abc and bcd we do not need to generate abcd from them: if abd were frequent, abcd would already have been generated from abc and abd, and if abd is not there, abcd would be pruned later anyway.

Example of Generating Candidates (3). Given the candidates abcd and acde, we go to the pruning phase: acde is removed because ade is not in L3. The merge step does not ensure that all subsets are frequent. Result: C4 = {abcd}.
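This whole worked example fits in a few lines of Python; here is a standalone sketch of the join and prune steps on exactly the L3 above (tuples are kept in lexicographic order, as the self-join requires):

```python
from itertools import combinations

L3 = [("a","b","c"), ("a","b","d"), ("a","c","d"), ("a","c","e"), ("b","c","d")]
L3_set = set(L3)

C4 = set()
for p in L3:
    for q in L3:
        # Join only itemsets that agree everywhere except the last item
        if p[:-1] == q[:-1] and p[-1] < q[-1]:
            cand = p + (q[-1],)
            # Prune: every 3-subset of the candidate must be in L3
            if all(s in L3_set for s in combinations(cand, 3)):
                C4.add(cand)

print(C4)  # {('a','b','c','d')}: acde is pruned because ade is not in L3
```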

The Apriori Algorithm — Example (minsup = 30%). (Slide figure: database D is scanned to produce C1 and then L1; L1 is joined to form C2, which is scanned against D to produce L2; L2 forms C3, which is scanned to produce L3.)

Warning: Do Not Forget Pruning. Candidates get pruned in two ways: (1) the Apriori property is violated; (2) if the Apriori property is not violated, you still must scan the database, and if minsup is not exceeded, the candidate is pruned. The Apriori property is necessary but not sufficient to keep a candidate. If you forget to prune via the Apriori property you will get the same results, since the database scan will catch it, but I will take off points on an exam: make it clear when you prune using the Apriori property (do not fill in a count when crossing an itemset off). The Apriori property cannot be violated until k = 3, and things begin to get trickier at k = 4 since there are more subsets to check.

Step 2: Rules from Frequent Itemsets. Going from frequent itemsets to association rules requires one more step. For each frequent itemset X and each proper nonempty subset A of X, let B = X - A; then A → B is an association rule if confidence(A → B) ≥ minconf, where support(A → B) = support(A ∪ B) = support(X) and confidence(A → B) = support(A ∪ B) / support(A).
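A sketch of this rule-generation step in Python, assuming the supports of all frequent itemsets were recorded during step 1 (the name `gen_rules` is mine; the demo supports are read off the seven-transaction Beef/Chicken example earlier in the deck):

```python
from itertools import combinations

def gen_rules(itemset, support, minconf):
    """All rules A -> B with B = itemset - A and confidence >= minconf."""
    items = frozenset(itemset)
    rules = []
    for r in range(1, len(items)):                 # proper nonempty subsets
        for lhs in map(frozenset, combinations(items, r)):
            conf = support[items] / support[lhs]   # no database scan needed
            if conf >= minconf:
                rules.append((set(lhs), set(items - lhs), conf))
    return rules

# Supports taken from the earlier transactions t1..t7:
sup = {frozenset(s): v for s, v in [
    ({"Chicken"}, 5/7), ({"Clothes"}, 3/7), ({"Milk"}, 4/7),
    ({"Chicken", "Clothes"}, 3/7), ({"Chicken", "Milk"}, 4/7),
    ({"Clothes", "Milk"}, 3/7), ({"Chicken", "Clothes", "Milk"}, 3/7)]}
for lhs, rhs, conf in gen_rules({"Chicken", "Clothes", "Milk"}, sup, 0.8):
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
# e.g. ['Clothes'] -> ['Chicken', 'Milk'] 1.0, matching the earlier slide
```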

Generating Rules: An Example. Suppose {2,3,4} is frequent, with sup = 50%. Its proper nonempty subsets {2,3}, {2,4}, {3,4}, {2}, {3}, {4} have sup = 50%, 50%, 75%, 75%, 75%, 75% respectively. These generate the association rules: 2,3 → 4 (confidence = 100%); 2,4 → 3 (confidence = 100%); 3,4 → 2 (confidence = 67%); 2 → 3,4 (confidence = 67%); 3 → 2,4 (confidence = 67%); 4 → 2,3 (confidence = 67%). All rules have support = 50%. Then apply the confidence threshold to identify strong rules, i.e., rules that meet both the support and confidence requirements. If the confidence threshold is 80%, we are left with 2 strong rules.
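The same arithmetic, checked directly from the subset supports listed above (plain arithmetic, no assumptions beyond the slide's numbers):

```python
sup = {"234": 0.50, "23": 0.50, "24": 0.50, "34": 0.75,
       "2": 0.75, "3": 0.75, "4": 0.75}
for lhs in ["23", "24", "34", "2", "3", "4"]:
    print(f"{lhs} -> rest: conf = {sup['234'] / sup[lhs]:.0%}")
# 23 and 24 give 100%; the other four give 67%. With minconf = 80%,
# only the two 100%-confidence rules survive as strong rules.
```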

Generating Rules: Summary. To recap: in order to obtain A → B, we need support(A ∪ B) and support(A). All the information required for the confidence computation has already been recorded during itemset generation, so there is no need to look at the data T again. This step is not as time-consuming as frequent itemset generation. Hint: I almost always ask this on the exam. [Han and Kamber 2001]

On the Apriori Algorithm. It seems to be very expensive, but: it is a level-wise search; with K the size of the largest itemset, it makes at most K passes over the data; and in practice K is bounded (around 10). The algorithm is very fast; under some conditions, all rules can be found in linear time. It scales up to large data sets.

Granularity of Items. One exception to the "ease" of applying association rules is selecting the granularity of the items. Should you choose diet coke, coke product, soft drink, or beverage? Should you include more than one level of granularity? Some association-finding techniques allow you to represent hierarchies explicitly.

Multiple-Level Association Rules. Items often form a hierarchy. Items at the lower level are expected to have lower support. Rules regarding itemsets at appropriate levels could be quite useful. The transaction database can be encoded based on dimensions and levels. (Slide figure: a hierarchy with Food at the root, Milk and Bread below it, Skim and 2% under Milk, Wheat and White under Bread.)

Mining Multi-Level Associations. A top-down, progressive deepening approach: first find high-level strong rules, e.g., milk → bread [20%, 60%]; then find their lower-level "weaker" rules, e.g., 2% milk → wheat bread [6%, 50%]. This usually requires different thresholds at different levels to find meaningful rules: lower support at lower levels.

Interestingness Measurements. Objective measures: two popular measurements are support and confidence. Subjective measures (Silberschatz & Tuzhilin, KDD95): a rule (pattern) is interesting if it is unexpected (surprising to the user) and/or actionable (the user can do something with it).

Criticism of Support and Confidence. Lift of A ⇒ B = P(B|A) / P(B), and a rule is interesting if its lift is not near 1.0. Example: among 5000 students, 3000 play basketball, 3750 eat cereal, and 2000 both play basketball and eat cereal. The rule play basketball → eat cereal [40%, 66.7%] is misleading, because the overall percentage of students eating cereal is 75%, which is higher than 66.7%. The rule play basketball → not eat cereal [20%, 33.3%] is far more interesting, although it has lower support and confidence. What is the lift of this rule? (1/3) / (1250/5000) = 1.33.
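Checking the slide's numbers (plain arithmetic on the counts given above):

```python
students, basketball, cereal, both = 5000, 3000, 3750, 2000

# play basketball -> eat cereal: lift = P(cereal|basketball) / P(cereal)
print((both / basketball) / (cereal / students))  # 0.667/0.75 = 0.89 (< 1)

# play basketball -> not eat cereal
print(((basketball - both) / basketball) / ((students - cereal) / students))
# (1/3)/(1/4) = 1.33, the lift quoted on the slide
```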

Customer Number vs. Transaction ID. In the homework you may have a problem where there is a customer ID for each transaction, and you may be asked to do association analysis based on the customer ID. If so, you need to aggregate the transactions to the customer level: if a customer has 3 transactions, you create an itemset containing all of the items in the union of the 3 transactions. Note that we ignore the frequency of purchase.
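A sketch of that aggregation (the customer IDs and items are hypothetical):

```python
from collections import defaultdict

transactions = [("cust1", {"milk", "bread"}),   # hypothetical data
                ("cust1", {"beer"}),
                ("cust2", {"milk"}),
                ("cust1", {"bread", "eggs"})]

baskets = defaultdict(set)
for cust, items in transactions:
    baskets[cust] |= items   # union: purchase frequency is ignored

print(dict(baskets))  # cust1 -> {'milk','bread','beer','eggs'}, cust2 -> {'milk'}
```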

Virtual Items. If you're interested in including other possible variables, you can create "virtual items": gift-wrap, used-coupon, new-store, winter-holidays, bought-nothing, …

Associations: Pros and Cons. Pros: you can quickly mine patterns describing business/customers/etc. without major effort in problem formulation; virtual items allow much flexibility; it is an unparalleled tool for hypothesis generation. Cons: it is unfocused, and it is not clear exactly how to apply the mined "knowledge"; it is only hypothesis generation; it can produce many, many rules, among which there may be only a few nuggets (or none).

Association Rules. Association rule types: actionable rules contain high-quality, actionable information; trivial rules contain information already well known to those familiar with the business; inexplicable rules have no explanation and do not suggest action. Trivial and inexplicable rules occur most often.