Published by Molly Emily Hubbard. Modified over 6 years ago.
1
CFI-Stream: Mining Closed Frequent Itemsets in Data Streams
Nan Jiang, Le Gruenwald. SIGKDD’06
Presenter: 林靜怡, 2006/10/04
2
Introduction
Mining closed frequent itemsets in data streams. CFI-Stream:
computes and maintains closed itemsets online and incrementally
performs the closure checking
outputs the current closed frequent itemsets in real time, based on users’ specified thresholds
3
Definition
D: the data stream
I = {i_1, i_2, …, i_n}: a set of n elements, called items
T: a subset of the transactions in D
X: a subset of the items appearing in the data stream
4
Definition
C(X): the smallest closed set containing X
Definition 1: An itemset X is said to be closed if and only if C(X) = f(g(X)) = (f∘g)(X) = X
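The slides do not spell out f and g. The sketch below assumes the definitions usually used for closed itemsets: g(X) is the set of transactions containing every item of X, and f(T) is the set of items common to every transaction in T. The function names and the small example window are illustrative only.

```python
from typing import FrozenSet, List, Set

Itemset = FrozenSet[str]


def g(itemset: Itemset, transactions: List[Itemset]) -> Set[int]:
    """Indices of the transactions that contain every item of `itemset`."""
    return {i for i, t in enumerate(transactions) if itemset <= t}


def f(tids: Set[int], transactions: List[Itemset]) -> Itemset:
    """Items common to all transactions whose indices are in `tids`.

    For an empty `tids` this returns every item seen (assumed convention).
    """
    common = set().union(*transactions)
    for i in tids:
        common &= transactions[i]
    return frozenset(common)


def closure(itemset: Itemset, transactions: List[Itemset]) -> Itemset:
    """C(X) = f(g(X)): the smallest closed itemset containing X."""
    return f(g(itemset, transactions), transactions)


def is_closed(itemset: Itemset, transactions: List[Itemset]) -> bool:
    """X is closed if and only if C(X) = X."""
    return closure(itemset, transactions) == itemset


# Hypothetical window of four transactions.
window = [frozenset("CD"), frozenset("AB"), frozenset("ABC"), frozenset("ABC")]
print(sorted(closure(frozenset("A"), window)))  # ['A', 'B']
print(is_closed(frozenset("C"), window))        # True
```

In this example A is not closed because every transaction containing A also contains B, so C({A}) = {A, B}.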
5
Algorithm
CFI-Stream algorithm
DIrect Update (DIU) tree
Performs the closure checking online over a sliding window on the data stream.
The conditions for closed itemsets are checked when performing addition and deletion operations on the DIU tree.
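To make the sliding-window setting concrete, here is a naive baseline that recomputes the closed frequent itemsets from scratch on request. It is not the CFI-Stream algorithm and does not use a DIU tree; it only shows what output an incremental method has to reproduce. The class name, the count-based window, and the absolute `min_support` threshold are assumptions made for this sketch.

```python
from collections import deque
from itertools import combinations


class NaiveClosedWindow:
    """Brute-force reference: exponential in transaction length."""

    def __init__(self, window_size, min_support):
        self.window = deque()
        self.window_size = window_size
        self.min_support = min_support  # absolute count (assumption)

    def add(self, transaction):
        """Append a transaction; drop the oldest one if the window is full."""
        self.window.append(frozenset(transaction))
        if len(self.window) > self.window_size:
            self.window.popleft()

    def closed_frequent_itemsets(self):
        """Return {itemset: support} for closed itemsets meeting min_support."""
        # Candidates: every non-empty subset of every transaction in the window.
        candidates = set()
        for t in self.window:
            items = sorted(t)
            for k in range(1, len(items) + 1):
                candidates.update(frozenset(c) for c in combinations(items, k))

        result = {}
        for c in candidates:
            covering = [t for t in self.window if c <= t]
            # c is closed when it equals the intersection of its covering transactions.
            if len(covering) >= self.min_support and frozenset.intersection(*covering) == c:
                result[c] = len(covering)
        return result


w = NaiveClosedWindow(window_size=4, min_support=2)
for t in ["CD", "AB", "ABC", "ABC"]:
    w.add(t)
print(w.closed_frequent_itemsets())
```

Each arrival here can trigger both an addition and, once the window is full, a deletion of the oldest transaction, which are the two operations the DIU tree has to handle.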
6
DIU tree
Maintains the current closed itemsets.
The DIU tree has k levels; each level i stores the closed i-itemsets.
7
DIU tree
Each node in the DIU tree stores:
a closed itemset
its current support information
links to its parent and children nodes
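One possible in-memory layout for the node and level structure described on these two slides is sketched below. The class and method names are hypothetical, and the linking rule (a parent is a stored itemset one level up that is a proper subset) is an assumption; the slides only say that nodes link to their parent and children nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Optional


@dataclass(eq=False)
class DIUNode:
    itemset: FrozenSet[str]   # the closed itemset stored in this node
    support: int              # its current support
    parents: List["DIUNode"] = field(default_factory=list, repr=False)
    children: List["DIUNode"] = field(default_factory=list, repr=False)


class DIUTree:
    def __init__(self) -> None:
        # levels[i] maps each stored closed i-itemset to its node.
        self.levels: Dict[int, Dict[FrozenSet[str], DIUNode]] = {}

    def find(self, itemset: Iterable[str]) -> Optional[DIUNode]:
        key = frozenset(itemset)
        return self.levels.get(len(key), {}).get(key)

    def insert(self, itemset: Iterable[str], support: int) -> DIUNode:
        key = frozenset(itemset)
        existing = self.find(key)
        if existing is not None:
            existing.support = support
            return existing
        node = DIUNode(key, support)
        # Assumed linking rule: connect to subsets one level up ...
        for parent in self.levels.get(len(key) - 1, {}).values():
            if parent.itemset < key:
                parent.children.append(node)
                node.parents.append(parent)
        # ... and to supersets one level down.
        for child in self.levels.get(len(key) + 1, {}).values():
            if key < child.itemset:
                node.children.append(child)
                child.parents.append(node)
        self.levels.setdefault(len(key), {})[key] = node
        return node


tree = DIUTree()
tree.insert("C", 3)
tree.insert("AB", 3)
tree.insert("ABC", 2)
print(sorted(tree.levels))  # [1, 2, 3]
```

Keying each level by itemset gives constant-time lookup of whether an itemset is currently stored, which is the test the addition and deletion conditions rely on.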
8
Add a Transaction to the DIU Tree
T1: the original transaction set
t: the newly arrived transaction
Conditions to check for closed itemsets:
(1) When t is in T1: if the largest itemset X it contains is not currently in the DIU tree, check all of X’s subsets Y that are in T1.
9
(2) When t is not in T1: for each subset Y of t, if Y is in T1, Y needs to be checked.
10
Closure Checking for Addition
11
[Figure: example of adding transactions ({C, D}, {A, B}, {A, B, C}) to the DIU tree, with nodes C, AB, CD, and ABC and their supports.]
12
Delete a Transaction in DIU Tree
Conditions to check for closed itemsets:
the number of transactions with the same itemset as X is equal to zero;
Y is a subset of X;
Y is a closed itemset in the original transaction set.
13
Closure Checking for Deletion
14
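[Figure: example of deleting a transaction from the DIU tree, with transactions {C, D}, {A, B}, {A, B, C} and nodes C, AB, CD, and ABC and their supports.]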
15
Experiment
Synthetic datasets: T10.I6.D100K and T5.I4.D100K
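The dataset names follow the usual naming convention for IBM Quest-style synthetic data: T is the average transaction size, I is the average size of the maximal potentially frequent itemsets, and D is the number of transactions. The small helper below decodes such a name; the function name and the dictionary keys are made up for illustration.

```python
import re


def parse_dataset_name(name):
    """Decode a name such as 'T10.I6.D100K' into its three parameters."""
    m = re.fullmatch(r"T(\d+)\.I(\d+)\.D(\d+)(K?)", name)
    if m is None:
        raise ValueError(f"unrecognized dataset name: {name!r}")
    t, i, d, k = m.groups()
    return {
        "avg_transaction_size": int(t),
        "avg_max_pattern_size": int(i),
        "num_transactions": int(d) * (1000 if k else 1),
    }


print(parse_dataset_name("T10.I6.D100K"))
print(parse_dataset_name("T5.I4.D100K"))
```

Under this reading, both datasets contain 100,000 transactions, and T10.I6.D100K has longer transactions and longer embedded patterns than T5.I4.D100K.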
16
Experiment
17
Experiment