Published by Griselda Lorin Jefferson. Modified over 9 years ago.
1
Data Mining: Association Rule, Classification, Clustering
2
Data Mining: Association Rule
3
What Is Association Mining?
Association Rule Mining
– Finding frequent patterns, associations, correlations, or causal structures among item sets in transaction databases, relational databases, and other information repositories
Applications
– Market basket analysis (marketing strategy: items to put on sale at reduced prices), cross-marketing, catalog design, shelf space layout design, etc.
Examples
– Rule form: Body ⇒ Head [Support, Confidence]
– buys(x, “Computer”) ⇒ buys(x, “Software”) [2%, 60%]
– major(x, “CS”) ^ takes(x, “DB”) ⇒ grade(x, “A”) [1%, 75%]
4
Market Basket Analysis
Typically, association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold.
5
Rule Measures: Support and Confidence
With a minimum support of 50% and a minimum confidence of 50%, we have:
– A ⇒ C [50%, 66.6%]
– C ⇒ A [50%, 100%]
6
Support & Confidence
7
Association Rule: Basic Concepts
Given
– (1) a database of transactions
– (2) each transaction is a list of items (purchased by a customer in a visit)
Find all rules that correlate the presence of one set of items with that of another set of items
Find all the rules A ⇒ B with minimum confidence and support
– support, s: P(A ∪ B), the probability that a transaction contains all the items of both A and B
– confidence, c: P(B|A), the conditional probability that a transaction containing A also contains B
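The two measures can be sketched in a few lines of Python. The four transactions below are an assumption chosen for illustration, since the slide's own database is not reproduced in this transcript; they happen to yield the [50%, 66.6%] and [50%, 100%] figures quoted on the earlier slide. The helper names `support` and `confidence` are hypothetical.

```python
# Illustrative transactions (assumed, not taken from the slides).
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(body, head, transactions):
    """Estimate P(head | body) as support(body and head together) / support(body)."""
    both = set(body) | set(head)
    return support(both, transactions) / support(body, transactions)

print(support({"A", "C"}, transactions))       # support of A => C
print(confidence({"A"}, {"C"}, transactions))  # confidence of A => C
print(confidence({"C"}, {"A"}, transactions))  # confidence of C => A
```

Note that the support of a rule is symmetric in body and head, while the confidence is not, which is why A ⇒ C and C ⇒ A can share a support value and still differ in confidence.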
8
Terminologies
Item
– I1, I2, I3, …
– A, B, C, …
Itemset
– {I1}, {I1, I7}, {I2, I3, I5}, …
– {A}, {A, G}, {B, C, E}, …
1-Itemset
– {I1}, {I2}, {A}, …
2-Itemset
– {I1, I7}, {I3, I5}, {A, G}, …
9
Terminologies
K-Itemset
– An itemset of length K
Frequent (Large) K-Itemset
– An itemset of length K that satisfies a minimum support threshold
Association Rule
– A rule that satisfies both a minimum support threshold and a minimum confidence threshold
10
Analysis
The number of itemsets of a given cardinality tends to grow exponentially.
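To make the growth concrete, here is a small sketch that counts itemsets over n distinct items: there are C(n, k) possible k-itemsets and 2^n − 1 non-empty itemsets in total. The function names are hypothetical and the values of n are arbitrary.

```python
from math import comb

def count_k_itemsets(n_items, k):
    """Number of distinct k-itemsets that can be formed from n_items items."""
    return comb(n_items, k)

def count_all_itemsets(n_items):
    """Number of non-empty itemsets over n_items items."""
    return 2 ** n_items - 1

for n in (5, 10, 20):
    print(n, count_k_itemsets(n, 2), count_all_itemsets(n))
```

Even a catalog of 20 items already admits over a million candidate itemsets, which is why exhaustive counting is impractical.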
11
Fast Algorithms for Mining Association Rules
12
Mining Association Rules: Apriori Principle
Min. support 50%, min. confidence 50%
For rule A ⇒ C:
– support = support({A, C}) = 50%
– confidence = support({A, C}) / support({A}) = 66.6%
The Apriori principle:
– Any subset of a frequent itemset must be frequent
13
Mining Frequent Itemsets: the Key Step
Find the frequent itemsets: the sets of items that have minimum support
– A subset of a frequent itemset must also be a frequent itemset, i.e., if {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets
– Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets)
Use the frequent itemsets to generate association rules
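The final step, turning frequent itemsets into rules, might look like the sketch below. The support values are the ones implied by the A/C example on the earlier slides (support({A, C}) = 50%, and 75% for {A}, 50% for {C}, as the quoted confidences require); the dictionary layout and the name `generate_rules` are assumptions made for illustration.

```python
from itertools import combinations

# Support of each frequent itemset, as fractions (illustrative values).
supports = {
    frozenset({"A"}): 0.75,
    frozenset({"C"}): 0.50,
    frozenset({"A", "C"}): 0.50,
}

def generate_rules(supports, min_conf):
    """Return (body, head, support, confidence) for every rule meeting min_conf."""
    rules = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty body and a non-empty head
        for r in range(1, len(itemset)):
            for body in combinations(sorted(itemset), r):
                body = frozenset(body)
                head = itemset - body
                # Every subset of a frequent itemset is frequent,
                # so its support is expected to be present in the table.
                conf = sup / supports[body]
                if conf >= min_conf:
                    rules.append((set(body), set(head), sup, conf))
    return rules

for body, head, sup, conf in generate_rules(supports, min_conf=0.5):
    print(sorted(body), "=>", sorted(head), f"[{sup:.0%}, {conf:.1%}]")
```

Because the supports are already known from the frequent-itemset phase, rule generation needs no further scan of the database.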
14
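Another Example 1
Database D (one transaction per line):
– 1 3 4
– 2 3 5
– 1 2 3 5
– 2 5
Scan D to count C1 (itemset: count):
– {1}: 2, {2}: 3, {3}: 3, {4}: 1, {5}: 3
Generate L1 (itemsets with a count below 2 are dropped):
– {1}, {2}, {3}, {5}
Generate C2 from L1:
– {1 2}, {1 3}, {1 5}, {2 3}, {2 5}, {3 5}
Scan D to count C2 (itemset: count):
– {1 2}: 1, {1 3}: 2, {1 5}: 1, {2 3}: 2, {2 5}: 3, {3 5}: 2
Generate L2:
– {1 3}, {2 3}, {2 5}, {3 5}
Generate C3 from L2:
– {2 3 5}
Scan D to count C3 (itemset: count):
– {2 3 5}: 2
Generate L3:
– {2 3 5}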
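The level-wise procedure in this example can be sketched in Python as follows. This is a minimal illustration, not the slides' own implementation: the function name `apriori` is hypothetical, the minimum support count of 2 is inferred from the counts in the example, and candidates are joined by taking unions of frequent k-itemsets, a simpler (and slower) join than the prefix-based self-join usually described.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {k: {itemset: count}} for all frequent k-itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # C1: every single item is a candidate.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Scan D and count how many transactions contain each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_count}
        if not level:
            break
        frequent[k] = level
        # Join: unions of two frequent k-itemsets that have exactly k+1 items.
        joined = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune: keep a candidate only if all of its k-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k))
        }
        k += 1
    return frequent

D = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
result = apriori(D, min_count=2)
for k, level in result.items():
    print(f"L{k}:", sorted(sorted(s) for s in level))
```

On database D the prune step discards {1 2 3} and {1 3 5} before the third scan, because {1 2} and {1 5} are not frequent, leaving {2 3 5} as the only 3-candidate.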
15
Example of Generating Candidates
L3 = {abc, abd, acd, ace, bcd}
Self-joining: L3 * L3
– abcd from abc and abd
– acde from acd and ace
Pruning:
– acde is removed because ade is not in L3
C4 = {abcd}
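A minimal Python sketch of the self-join and prune steps, run on the L3 from this slide. Itemsets are represented as sorted tuples of single-character items; that representation and the name `generate_candidates` are assumptions made for illustration.

```python
from itertools import combinations

def generate_candidates(frequent_k):
    """Build C(k+1) from L(k); each itemset is a tuple in sorted order."""
    frequent = set(frequent_k)
    if not frequent:
        return []
    k = len(next(iter(frequent)))
    ordered = sorted(frequent)
    joined = []
    # Self-join: merge two itemsets that share their first k-1 items.
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if a[:-1] == b[:-1]:
                joined.append(a + (b[-1],))
    # Prune: drop any candidate that has an infrequent k-subset.
    return [
        c for c in joined
        if all(s in frequent for s in combinations(c, k))
    ]

L3 = [tuple(s) for s in ("abc", "abd", "acd", "ace", "bcd")]
C4 = generate_candidates(L3)
print(["".join(c) for c in C4])
```

Joining only on a shared prefix means each candidate is produced once, and pruning before the database scan avoids counting itemsets such as acde that cannot be frequent.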
16
Example
17
Apriori Algorithm
20
Another Example 2
21
Demo: IBM Intelligent Miner
22
Demo Database
26
Multi-Dimensional Association
Single-Dimensional (Intra-Dimension) Rules: Single Dimension (Predicate) with Multiple Occurrences
– buys(X, “milk”) ⇒ buys(X, “bread”)
Multi-Dimensional Rules: 2 or more Dimensions
– Inter-dimension association rules (no repeated predicates): age(X, “19-25”) ^ occupation(X, “student”) ⇒ buys(X, “coke”)
– Hybrid-dimension association rules (repeated predicates): age(X, “19-25”) ^ buys(X, “popcorn”) ⇒ buys(X, “coke”)
Categorical (Nominal) Attributes
– finite number of possible values, no ordering among values
Quantitative Attributes
– numeric, implicit ordering among values
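One way to make these distinctions concrete is to encode each rule item as a (predicate, value) pair and inspect which predicates occur. The sketch below does this for the three example rules; the encoding and the function name `classify_rule` are assumptions made for illustration, not something the slides specify.

```python
def classify_rule(body, head):
    """Classify a rule whose items are (predicate, value) pairs."""
    predicates = [predicate for predicate, _ in list(body) + list(head)]
    distinct = set(predicates)
    if len(distinct) == 1:
        # One predicate occurring several times, e.g. buys ... => buys ...
        return "single-dimensional"
    if len(distinct) == len(predicates):
        # Several predicates, none of them repeated.
        return "inter-dimension"
    # Several predicates, at least one of them repeated.
    return "hybrid-dimension"

examples = [
    ([("buys", "milk")], [("buys", "bread")]),
    ([("age", "19-25"), ("occupation", "student")], [("buys", "coke")]),
    ([("age", "19-25"), ("buys", "popcorn")], [("buys", "coke")]),
]
for body, head in examples:
    print(body, "=>", head, ":", classify_rule(body, head))
```

With this encoding, a multi-dimensional record can be treated as a transaction of (predicate, value) items, so the same frequent-itemset machinery applies.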
27
An Example