Using Attribute Value Lattice to Find Closed Frequent Itemsets – Lin, Hu, Louie
New Approach to Data Mining
  Find closed frequent itemsets
  Search only the attribute-value lattice
  Enables finding only the non-redundant association rule set
Frequent and Closed Itemsets
  Frequent itemset: an itemset that occurs in at least a user-specified percentage of the tuples in the database
  Closure of an itemset, Cl(A): all items that appear in every tuple that contains A
  Closed itemset: an itemset A that is identical to its closure Cl(A)
  Example database (tuple id, items):
    1 ACTW
    2 CDW
    3 ACTWHG
    4 ACDWHF
    5 ACDTWH
  E.g. the tuples containing A are {1,3,4,5}, those containing C are {1,2,3,4,5}, and those containing W are {1,2,3,4,5}. Their intersection is {1,3,4,5}, and the only items common to tuples 1, 3, 4 and 5 are A, C and W, so Cl(A) = ACW = Cl(ACW).
  ACW is a closed frequent itemset.
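The definitions on this slide can be checked directly against the five example tuples. The following is a minimal sketch, not the authors' implementation; the names `DB`, `tidset`, `closure`, `is_closed` and `is_frequent` are hypothetical helpers introduced here for illustration, and the closure of an itemset that occurs in no tuple is assumed (by a common convention) to be the set of all items.

```python
# Example database from the slide: tuple id -> items in that tuple.
DB = {
    1: set("ACTW"),
    2: set("CDW"),
    3: set("ACTWHG"),
    4: set("ACDWHF"),
    5: set("ACDTWH"),
}

def tidset(itemset, db=DB):
    """Ids of the tuples that contain every item of `itemset`."""
    items = set(itemset)
    return {tid for tid, row in db.items() if items <= row}

def closure(itemset, db=DB):
    """Items that appear in all tuples containing `itemset`."""
    rows = [db[tid] for tid in tidset(itemset, db)]
    if not rows:
        # Assumed convention: an unsupported itemset closes to all items.
        return set().union(*db.values())
    return set.intersection(*rows)

def is_closed(itemset, db=DB):
    """An itemset is closed when it equals its own closure."""
    return closure(itemset, db) == set(itemset)

def is_frequent(itemset, min_support, db=DB):
    """Frequent when it occurs in at least `min_support` (a fraction) of tuples."""
    return len(tidset(itemset, db)) / len(db) >= min_support

print(sorted(tidset("A")))    # tuples containing A
print(sorted(closure("A")))   # items shared by those tuples
print(is_closed("ACW"), is_frequent("ACW", 0.5))
```

With a minimum support of 50%, ACW occurs in four of the five tuples and equals its own closure, which matches the slide's conclusion.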
Partial Order and Lattice
  Partial order: a binary relation that is reflexive (a <= a), antisymmetric (if a <= b and b <= a, then a = b) and transitive (if a <= b and b <= c, then a <= c)
  Lattice: a partially ordered set in which every non-empty finite subset has a least upper bound and a greatest lower bound
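As a small illustration of these two definitions, the subsets of a finite set ordered by inclusion form a lattice: the least upper bound of two subsets is their union and the greatest lower bound is their intersection. The sketch below verifies this by brute force; the helper names (`is_partial_order`, `least_upper_bound`, `greatest_lower_bound`) and the choice of the set {1, 2, 3} are assumptions made for the example only.

```python
from itertools import combinations, product

def is_partial_order(elements, leq):
    """Check reflexivity, antisymmetry and transitivity by enumeration."""
    elements = list(elements)
    reflexive = all(leq(a, a) for a in elements)
    antisymmetric = all(
        a == b
        for a, b in product(elements, repeat=2)
        if leq(a, b) and leq(b, a)
    )
    transitive = all(
        leq(a, c)
        for a, b, c in product(elements, repeat=3)
        if leq(a, b) and leq(b, c)
    )
    return reflexive and antisymmetric and transitive

def least_upper_bound(elements, leq, a, b):
    """Smallest element above both a and b, or None if there is none."""
    uppers = [u for u in elements if leq(a, u) and leq(b, u)]
    for u in uppers:
        if all(leq(u, v) for v in uppers):
            return u
    return None

def greatest_lower_bound(elements, leq, a, b):
    """Largest element below both a and b, or None if there is none."""
    lowers = [l for l in elements if leq(l, a) and leq(l, b)]
    for l in lowers:
        if all(leq(v, l) for v in lowers):
            return l
    return None

# All subsets of {1, 2, 3}, ordered by set inclusion.
base = (1, 2, 3)
subsets = [frozenset(c) for r in range(len(base) + 1)
           for c in combinations(base, r)]

def subset_leq(a, b):
    return a <= b

print(is_partial_order(subsets, subset_leq))
print(least_upper_bound(subsets, subset_leq, frozenset({1}), frozenset({2})))
print(greatest_lower_bound(subsets, subset_leq,
                           frozenset({1, 2}), frozenset({2, 3})))
```

By contrast, the strict relation `a < b` on integers fails the reflexivity check, so it is not a partial order in the sense used here.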
Lin’s Algorithm
  The attribute value lattice is constructed from the database:
  Construct the bitmap B(Ii) of each frequent itemset Ii
  Set the level number Li of Ii to 1
  nodes contains (Ii, Li, B(Ii)) for every Ii where bitcount(B(Ii)) > threshold
  Sort the items in nodes by bitcount
  For each node (Ii, Li, B(Ii)) in nodes
    For each sibling Ij
      I = Ii ∪ Ij and Bcomb = B(Ii) ∧ B(Ij)
      If bitcount(Bcomb) > threshold
        1. If B(Ii) = B(Ij): remove Ij from nodes and replace Ii with I
        2. If B(Ii) ⊂ B(Ij): create an edge from Ii to Ij and set Lj = max(Lj, Li + 1)
        3. If B(Ij) ⊂ B(Ii): create an edge from Ij to Ii and set Li = max(Li, Lj + 1)
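The construction steps can be sketched in a few lines of Python. This is an illustrative reading of the slide, not the authors' code, and it rests on several assumptions: the comparison in cases 2 and 3 is taken to be proper containment of bitmaps, the level update is taken to be max(target level, source level + 1), "greater than threshold" is read as a strict comparison on the bitcount, nodes are sorted by ascending bitcount, and items with identical bitmaps (case 1) are grouped in a pre-pass instead of being merged during the pairwise loop. All function names are hypothetical.

```python
# Example database: tuple id -> items in that tuple.
DB = {1: "ACTW", 2: "CDW", 3: "ACTWHG", 4: "ACDWHF", 5: "ACDTWH"}

def bitcount(bitmap):
    """Number of tuples recorded in a bitmap."""
    return bin(bitmap).count("1")

def item_bitmaps(db):
    """Bitmap per item: bit (tid - 1) is set when tuple tid contains the item."""
    bitmaps = {}
    for tid, row in db.items():
        for item in row:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << (tid - 1))
    return bitmaps

def build_lattice(db, threshold):
    """Return (levels, edges) for the items whose bitcount exceeds threshold."""
    # Case 1, done up front: items with identical bitmaps share one node.
    groups = {}
    for item, bitmap in item_bitmaps(db).items():
        if bitcount(bitmap) > threshold:
            groups.setdefault(bitmap, set()).add(item)

    nodes = [
        {"items": "".join(sorted(items)), "level": 1, "bitmap": bitmap}
        for bitmap, items in groups.items()
    ]
    # Assumed sort order: ascending bitcount, ties broken by label.
    nodes.sort(key=lambda n: (bitcount(n["bitmap"]), n["items"]))

    edges = []
    for i, ni in enumerate(nodes):
        for nj in nodes[i + 1:]:
            comb = ni["bitmap"] & nj["bitmap"]
            if bitcount(comb) <= threshold:
                continue  # the combined itemset is not frequent
            if comb == ni["bitmap"]:
                # Case 2: B(Ii) is properly contained in B(Ij).
                edges.append((ni["items"], nj["items"]))
                nj["level"] = max(nj["level"], ni["level"] + 1)
            elif comb == nj["bitmap"]:
                # Case 3: B(Ij) is properly contained in B(Ii).
                edges.append((nj["items"], ni["items"]))
                ni["level"] = max(ni["level"], nj["level"] + 1)

    levels = {n["items"]: n["level"] for n in nodes}
    return levels, edges

levels, edges = build_lattice(DB, threshold=2)
print(levels)
print(edges)
```

With a threshold of 2 on the example database, C and W share a bitmap and collapse into one node, and the infrequent items G and F are dropped before any comparison takes place. Because nodes are visited in ascending bitcount order, a node's level is already final by the time it is used to update a later node.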
Lin’s Algorithm
  Searches the attribute value lattice to find closed frequent itemsets