1
Generating Non-Redundant Association Rules Mohammed J. Zaki
2
Yaeer Master©2 Outline: Introduction; Association Rules (reminder); Closed Frequent Itemsets; Generating Rules; Complexity Analysis; Experimental Evaluation
3
Introduction
Association rule discovery: the set of association rules can grow unwieldy, especially as the minimum support threshold is lowered. Many rules are redundant, and the number of redundant rules can be exponential in the length of the longest frequent itemset. For dense datasets it is not feasible to mine all frequent itemsets.
4
Introduction
Solution: use closed frequent itemsets. The set of closed frequent itemsets is orders of magnitude smaller than the set of all frequent itemsets, with no loss of information. From it we create a "generating set" of rules. Algorithm for mining closed itemsets: CHARM.
5
Association Rules
6
Mining Association Rules
7
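Mining Association Rules
Step 1, find all frequent itemsets: the search space contains $2^m$ itemsets ($m$ = number of items), and the problem is NP-complete. Assuming a bound on transaction length, the cost is $O(r \cdot n \cdot 2^l)$, where $r$ is the number of maximal frequent itemsets, $n$ is the number of transactions, and $l$ is the length of the longest frequent itemset.
Step 2, generate confident rules: each frequent itemset of size $k$ yields $2^k$ potential rules. Complexity: $O(f \cdot 2^l)$, where $f$ is the number of frequent itemsets and $l$ is the length of the longest frequent itemset.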
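The two steps can be illustrated by brute force. This is a minimal Python sketch, not the method of the talk: the six-transaction database `DB` is an assumption (the slide text does not include the data; it was chosen to agree with the tidsets quoted later in the deck), and `count`, `frequent_itemsets` and `confident_rules` are hypothetical helper names.

```python
from itertools import combinations

# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}
ITEMS = sorted(set().union(*DB.values()))  # m = 5 items


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(set(itemset) <= items for items in DB.values())


def frequent_itemsets(minsup):
    """Step 1: test all 2^m - 1 non-empty itemsets against minsup."""
    freq = {}
    for k in range(1, len(ITEMS) + 1):
        for combo in combinations(ITEMS, k):
            n = count(combo)
            if n >= minsup * len(DB):
                freq[frozenset(combo)] = n
    return freq


def confident_rules(freq, minconf):
    """Step 2: for each frequent itemset, try every proper non-empty antecedent."""
    rules = []
    for itemset, n in freq.items():
        for k in range(1, len(itemset)):
            for ante in map(frozenset, combinations(sorted(itemset), k)):
                conf = n / freq[ante]  # every subset of a frequent itemset is frequent
                if conf >= minconf:
                    rules.append((ante, itemset - ante, conf))
    return rules


freq = frequent_itemsets(minsup=0.5)
rules = confident_rules(freq, minconf=0.8)
print(len(freq), "frequent itemsets;", len(rules), "confident rules")
```

Both loops are exponential (in the number of items and in the itemset size respectively), which is the cost the complexity bounds describe.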
8
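Closed Frequent Itemsets: Defining a Galois Connection
Let $I$ be the set of items, $T$ the set of transaction ids (tids), and $\delta \subseteq I \times T$ the relation "item $x$ occurs in transaction $y$".
The mappings:
$t: P(I) \to P(T)$, $t(X) = \{y \in T \mid \forall x \in X,\ x\,\delta\,y\}$ (the tids of all transactions that contain itemset $X$)
$i: P(T) \to P(I)$, $i(Y) = \{x \in I \mid \forall y \in Y,\ x\,\delta\,y\}$ (the items common to all transactions in tidset $Y$)
These mappings define a Galois connection between the partially ordered sets $P(I)$ and $P(T)$.
Galois connection (order-reversing form): mappings $F: A \to B$ and $G: B \to A$ between partially ordered sets such that, for all $a$ in $A$ and $b$ in $B$: $b \le F(a) \Leftrightarrow a \le G(b)$. Here: $Y \subseteq t(X) \Leftrightarrow X \subseteq i(Y)$.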
9
Galois Connection (cont.)
Properties:
1. $X_1 \subseteq X_2 \Rightarrow t(X_1) \supseteq t(X_2)$
2. $Y_1 \subseteq Y_2 \Rightarrow i(Y_1) \supseteq i(Y_2)$
3. $X \subseteq i(t(X))$ and $Y \subseteq t(i(Y))$
10
Galois Connection
11
Example
$t(ACW) = t(A) \cap t(C) \cap t(W) = 1345 \cap 123456 \cap 12345 = 1345$
$i(245) = CDW$
$ACW \subseteq ACTW \Rightarrow t(ACW) = 1345 \supseteq 135 = t(ACTW)$
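The two mappings are easy to try out in code. The sketch below is only an illustration: the database `DB` is an assumed six-transaction example (the data behind the slide is not in the text; it was picked so that it reproduces the tidsets quoted in the example), and the function names simply mirror the symbols $t$ and $i$.

```python
# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}


def t(itemset):
    """Tidset of an itemset: ids of the transactions containing every item."""
    return {tid for tid, items in DB.items() if set(itemset) <= items}


def i(tidset):
    """Itemset of a tidset: items common to all listed transactions.

    Assumes a non-empty tidset.
    """
    return set.intersection(*(DB[tid] for tid in tidset))


print(sorted(t("ACW")))                # tidset of ACW
print("".join(sorted(i({2, 4, 5}))))   # itemset of 245
print(t("ACTW") <= t("ACW"))           # a larger itemset has a smaller tidset
```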
12
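Closure Operator
$c: P(S) \to P(S)$ is a closure operator if it satisfies the following for all $X, Y \subseteq S$:
1. Extension: $X \subseteq c(X)$
2. Monotonicity: $X \subseteq Y \Rightarrow c(X) \subseteq c(Y)$
3. Idempotency: $c(c(X)) = c(X)$
Closure by composition: $c_{it}(X) = i \circ t(X) = i(t(X))$ and $c_{ti}(Y) = t \circ i(Y) = t(i(Y))$.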
13
Closure Operator: Round Trip
14
Closed Itemset: Definition
A closed itemset $X$ is an itemset that equals its own closure, $X = c_{it}(X)$.
Example: $c_{it}(AC) = i(t(AC)) = i(1345) = ACW$.
Conclusion: $AC$ is not closed; $ACW$ is closed.
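The closure and the closedness test can be sketched as follows. The database `DB` is an assumed example consistent with the tidsets quoted in the deck, and `c_it` and `is_closed` are hypothetical names; the last lines check the three closure-operator properties by brute force over all itemsets of this small database.

```python
from itertools import combinations

# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}
ITEMS = "ACDTW"


def t(itemset):
    """Tids of the transactions containing every item of `itemset`."""
    return {tid for tid, items in DB.items() if set(itemset) <= items}


def i(tidset):
    """Items common to all transactions in a non-empty `tidset`."""
    return set.intersection(*(DB[tid] for tid in tidset))


def c_it(itemset):
    """Closure of an itemset: the round trip items -> tids -> items."""
    return i(t(itemset))


def is_closed(itemset):
    return set(itemset) == c_it(itemset)


# In this database transaction 5 holds every item, so t(X) is never empty.
SUBSETS = [set(c) for k in range(len(ITEMS) + 1) for c in combinations(ITEMS, k)]
extension = all(X <= c_it(X) for X in SUBSETS)
monotone = all(c_it(X) <= c_it(Y) for X in SUBSETS for Y in SUBSETS if X <= Y)
idempotent = all(c_it(c_it(X)) == c_it(X) for X in SUBSETS)

print("".join(sorted(c_it("AC"))), is_closed("AC"), is_closed("ACW"))
print(extension, monotone, idempotent)
```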
15
Closed vs. Frequent Itemsets
16
Concept: Definition
For any closed itemset $X$ there exists a closed tidset $Y$ with the property $Y = t(X)$. The pair $X \times Y$ is called a concept.
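As an illustration, all concepts of a small database can be listed by closing every itemset and pairing each closed itemset with its tidset. This is a brute-force sketch under the assumption that `DB` (not given in the slide text) is the six-transaction example matching the tidsets quoted in the deck; it is not how CHARM enumerates closed itemsets.

```python
from itertools import combinations

# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}
ITEMS = "ACDTW"


def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= items}


def i(tidset):
    return set.intersection(*(DB[tid] for tid in tidset))


# Map each closed itemset X = i(t(X)) to its closed tidset Y = t(X).
concepts = {}
for k in range(len(ITEMS) + 1):
    for combo in combinations(ITEMS, k):
        closed = frozenset(i(t(combo)))
        concepts[closed] = frozenset(t(closed))

for X, Y in sorted(concepts.items(), key=lambda c: (-len(c[1]), sorted(c[0]))):
    print("".join(sorted(X)), "x", "".join(map(str, sorted(Y))))
```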
17
Galois Lattice
A concept $X_1 \times Y_1$ is a sub-concept of $X_2 \times Y_2$, written $X_1 \times Y_1 \le X_2 \times Y_2$, iff $X_1 \subseteq X_2$ (iff $Y_2 \subseteq Y_1$). Let $B(\delta)$ be the set of all concepts. The ordered set $(B(\delta), \le)$ is a complete lattice, called the Galois lattice.
18
Galois Lattice of Concepts
19
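Frequent Closed Itemsets vs. Frequent Itemsets
Lattice operations:
Join: $(X_1 \times Y_1) \vee (X_2 \times Y_2) = c_{it}(X_1 \cup X_2) \times (Y_1 \cap Y_2)$
Meet: $(X_1 \times Y_1) \wedge (X_2 \times Y_2) = (X_1 \cap X_2) \times c_{ti}(Y_1 \cup Y_2)$
Frequent concept: a concept whose support is at least minsup, where the support of a concept is the cardinality of its closed tidset.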
20
Join and Meet: Example
Join: $(ACDW \times 45) \vee (CDT \times 56) = c_{it}(ACDW \cup CDT) \times (45 \cap 56) = ACDTW \times 5$
Meet: $(ACDW \times 45) \wedge (CDT \times 56) = (ACDW \cap CDT) \times c_{ti}(45 \cup 56) = CD \times 2456$
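The example can be replayed in a few lines of Python. This is a sketch under assumptions: `DB` is an assumed six-transaction database consistent with the tidsets on the slides, a concept is represented as an `(itemset, tidset)` pair of Python sets, and `join` and `meet` are hypothetical names for the two lattice operations.

```python
# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}


def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= items}


def i(tidset):
    return set.intersection(*(DB[tid] for tid in tidset))


def c_it(itemset):
    return i(t(itemset))


def c_ti(tidset):
    return t(i(tidset))


def join(c1, c2):
    """Close the union of the itemsets; intersect the tidsets."""
    (x1, y1), (x2, y2) = c1, c2
    return c_it(x1 | x2), y1 & y2


def meet(c1, c2):
    """Intersect the itemsets; close the union of the tidsets."""
    (x1, y1), (x2, y2) = c1, c2
    return x1 & x2, c_ti(y1 | y2)


a = (set("ACDW"), {4, 5})
b = (set("CDT"), {5, 6})
print(join(a, b))
print(meet(a, b))
```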
21
Frequent Concepts
22
Frequent Concepts
Lemma 1: The support of an itemset $X$ equals the support of its closure, i.e. $\sigma(X) = \sigma(c_{it}(X))$.
Therefore the frequent itemsets and their supports are uniquely determined by the closed frequent itemsets, and can be recovered with the join operation on the frequent concepts.
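Lemma 1 can be checked exhaustively on a small example. The sketch below assumes the six-transaction database `DB` (not part of the slide text) and uses the hypothetical helper `support`, which returns an absolute transaction count.

```python
from itertools import combinations

# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}
ITEMS = "ACDTW"


def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= items}


def i(tidset):
    return set.intersection(*(DB[tid] for tid in tidset))


def c_it(itemset):
    return i(t(itemset))


def support(itemset):
    """Absolute support: number of transactions containing the itemset."""
    return len(t(itemset))


SUBSETS = [set(c) for k in range(len(ITEMS) + 1) for c in combinations(ITEMS, k)]
lemma_holds = all(support(X) == support(c_it(X)) for X in SUBSETS)

# The non-closed itemset AT inherits its support from its closure.
closure_of_AT = c_it("AT")
print(lemma_holds, "".join(sorted(closure_of_AT)), support("AT"))
```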
23
Redundant Rules
Definition: Let $R_i$ denote the rule $X_1^i \xrightarrow{p_i} X_2^i$. A rule $R_1$ is more general than a rule $R_2$, denoted $R_1 \prec R_2$, provided that $R_2$ can be generated by adding items to the antecedent or consequent of $R_1$.
The non-redundant rules are the most general ones among rules with equal confidence.
24
Rule Generation
Lemma 2 (transitivity): Let $X_1, X_2, X_3$ be frequent closed itemsets with $X_1 \subseteq X_2 \subseteq X_3$. If $X_1 \xrightarrow{p} X_2$ and $X_2 \xrightarrow{q} X_3$, then $X_1 \xrightarrow{pq} X_3$.
Observation: it is sufficient to consider rules among adjacent concepts.
25
Rule Generation: 100% Confidence
Lemma 3: An association rule $X_1 \xrightarrow{1.0} X_2$ has confidence $p = 1.0$ if and only if $t(X_1) \subseteq t(X_2)$.
The 100% confidence rules are those directed from a super-concept to a sub-concept, i.e. down-arcs.
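A brute-force check of Lemma 3 on a small example may help. The database `DB` is an assumption (it is not in the slide text), and `confidence` is a hypothetical helper computing $|t(X_1 \cup X_2)| / |t(X_1)|$.

```python
from itertools import combinations

# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}
ITEMS = "ACDTW"


def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= items}


def confidence(x1, x2):
    """conf(X1 -> X2) = |t(X1 u X2)| / |t(X1)|."""
    return len(t(set(x1) | set(x2))) / len(t(x1))


# Every itemset of this database has a non-empty tidset (transaction 5
# contains all items), so the division is always defined.
NONEMPTY = [set(c) for k in range(1, len(ITEMS) + 1) for c in combinations(ITEMS, k)]
lemma_holds = all(
    (confidence(X1, X2) == 1.0) == (t(X1) <= t(X2))
    for X1 in NONEMPTY
    for X2 in NONEMPTY
)
print(lemma_holds, confidence("TW", "A"), confidence("W", "A"))
```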
26
Rule Generation: 100% Confidence
27
Rule Generation: 100% Confidence
Theorem 1. Let $R = \{R_1, \ldots, R_n\}$ be a set of rules with 100% confidence ($p_i = 1.0$ for all $i$), such that $I_1 = c_{it}(X_1^i \cup X_2^i)$ and $I_2 = c_{it}(X_2^i)$ for all rules $R_i$. Let $R_I$ denote the 100% confidence rule $I_1 \xrightarrow{1.0} I_2$. Then all rules $R_i \neq R_I$ are more specific than $R_I$, and thus are redundant.
28
Rule Generation: 100% Confidence
Example: $TW \to A$, $TW \to AC$, $CTW \to A$
$c_{it}(TW \cup A) = c_{it}(ATW) = ACTW$
$c_{it}(TW \cup AC) = ACTW$
$c_{it}(CTW \cup A) = ACTW$
$TW \to A$ is the most general of the three.
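The example can be reproduced with a short sketch. Assumptions: `DB` is the six-transaction database that matches the tidsets quoted in the deck (it is not in the slide text), a rule is an `(antecedent, consequent)` pair of strings, and `more_general` is a hypothetical helper implementing the "add items to antecedent or consequent" definition.

```python
# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}


def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= items}


def i(tidset):
    return set.intersection(*(DB[tid] for tid in tidset))


def c_it(itemset):
    return i(t(itemset))


def more_general(r1, r2):
    """True if r2 arises from r1 by adding items to antecedent or consequent."""
    (a1, c1), (a2, c2) = r1, r2
    grown = set(a1) <= set(a2) and set(c1) <= set(c2)
    return grown and (set(a1), set(c1)) != (set(a2), set(c2))


rules = [("TW", "A"), ("TW", "AC"), ("CTW", "A")]

# I1 = c_it(X1 u X2) and I2 = c_it(X2) for every rule X1 -> X2.
closures = [(c_it(set(x1) | set(x2)), c_it(x2)) for x1, x2 in rules]
for (x1, x2), (i1, i2) in zip(rules, closures):
    print(f"{x1} -> {x2}: I1 = {''.join(sorted(i1))}, I2 = {''.join(sorted(i2))}")

print(more_general(rules[0], rules[1]), more_general(rules[0], rules[2]))
```

All three rules map to the same pair of closed itemsets, so only the most general one needs to be reported.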
29
Rule Generation: Confidence < 100%
These rules run from sub-concepts to super-concepts, i.e. they correspond to up-arcs. Rules between non-adjacent concepts can be derived by transitivity.
For example: $C \xrightarrow{p} W$ with $p = 0.83$ and $W \xrightarrow{q} A$ with $q = 0.8$ give $C \xrightarrow{pq} A$ with $pq = 0.67$.
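The arithmetic of the transitivity example can be verified with exact fractions. The sketch assumes the six-transaction database `DB` (not given in the slide text) and a hypothetical helper `conf`.

```python
from fractions import Fraction

# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}


def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= items}


def conf(x1, x2):
    """Exact confidence of X1 -> X2 as a fraction."""
    return Fraction(len(t(set(x1) | set(x2))), len(t(x1)))


p = conf("C", "W")   # 5/6
q = conf("W", "A")   # 4/5
pq = conf("C", "A")  # 4/6, the same value as p * q

print(round(float(p), 2), round(float(q), 2), round(float(pq), 2))
print(p * q == pq)
```

The product rule works here because the tidsets are nested, $t(A) \subseteq t(W) \subseteq t(C)$, so the intermediate support cancels.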
30
Rule Generation: Confidence < 100%
31
Rule Generation: Confidence < 100%
Theorem 2. Let $R = \{R_1, \ldots, R_n\}$ be a set of rules with confidence $p < 1.0$ ($p_i = p$ for all $i$), such that $I_1 = c_{it}(X_1^i)$ and $I_2 = c_{it}(X_1^i \cup X_2^i)$ for all rules $R_i$. Let $R_I$ denote the rule $I_1 \xrightarrow{p} I_2$. Then all rules $R_i \neq R_I$ are more specific than $R_I$, and thus are redundant.
32
Generating Set
Combining the two sets gives a generating set for rules with minsup = 50% and minconf = 80%:
{TW → A, A → W, W → C, T → C, D → C, W → A (0.8), C → W (0.83)}
The first five rules have 100% confidence. All association rules can be derived from this set.
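As a sanity check, the support and confidence of every rule in the generating set can be recomputed. The sketch below only verifies the listed rules against an assumed database `DB` (the data is not in the slide text); it does not derive the generating set itself.

```python
# Assumed example database (tid -> items); it is not given in the slide text.
DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTW"),
      4: set("ACDW"), 5: set("ACDTW"), 6: set("CDT")}


def t(itemset):
    return {tid for tid, items in DB.items() if set(itemset) <= items}


# Rules of the generating set as (antecedent, consequent) pairs.
GENERATING_SET = [("TW", "A"), ("A", "W"), ("W", "C"), ("T", "C"),
                  ("D", "C"), ("W", "A"), ("C", "W")]

stats = []
for x1, x2 in GENERATING_SET:
    both = t(set(x1) | set(x2))
    support = len(both) / len(DB)        # relative support of the rule
    confidence = len(both) / len(t(x1))
    stats.append((support, confidence))
    print(f"{x1} -> {x2}: support {support:.2f}, confidence {confidence:.2f}")
```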
33
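Complexity of Rule Generation
Let $l$ be the length of the longest frequent itemset, and suppose all $2^l$ of its subsets are frequent.
Traditional: $\sum_{i=0}^{l} \binom{l}{i} 2^{l-i} \le \sum_{i=0}^{l} \binom{l}{i} 2^{l} = 2^{l} \sum_{i=0}^{l} \binom{l}{i} = 2^{l} \cdot 2^{l} = O(2^{2l})$ rules.
New framework:
Best case: one closed itemset, no rules.
Worst case: all frequent itemsets are closed. Number of rules: $\sum_{i=0}^{l} \binom{l}{i} (l - i) \le \sum_{i=0}^{l} \binom{l}{i}\, l = O(l \cdot 2^{l})$.
Reduction factor: $O(2^{l} / l)$.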
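The two sums can be evaluated numerically to see how quickly the gap opens. This is a small sketch with hypothetical function names; the closed forms $3^l$ and $l \cdot 2^{l-1}$ follow from the binomial theorem and are tighter than the big-O bounds on the slide.

```python
from math import comb


def traditional_rules(l):
    """Sum over i of C(l, i) * 2^(l - i): the traditional rule count."""
    return sum(comb(l, i) * 2 ** (l - i) for i in range(l + 1))


def closed_worst_case_rules(l):
    """Sum over i of C(l, i) * (l - i): worst case for closed itemsets."""
    return sum(comb(l, i) * (l - i) for i in range(l + 1))


for l in (5, 10, 20):
    trad = traditional_rules(l)
    closed = closed_worst_case_rules(l)
    print(f"l={l}: traditional={trad}, closed worst case={closed}, "
          f"ratio={trad / closed:.1f}")
```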
34
Experimental Evaluation
35
Experimental Evaluation
36
Experimental Evaluation
37
Number of Rules: Traditional vs. Closed Itemset
38
Number of Rules: Traditional vs. Closed Itemset
39
Conclusion
The new framework based on closed itemsets can drastically reduce the rule set, which can then be presented to the user in a succinct manner.
Future work: interactive visualization and exploration of mined associations; generating rules on demand based on the user's interest; finding a minimal generating set.