Published by Loraine Simpson. Modified over 9 years ago.
1
Frequent Pattern Mining with Uncertain Data
KDD’09, June 28-July 1, 2009, Paris, France. Copyright 2009 ACM.
2
Outline
Introduction
Definition
Algorithm
Experiment Results
Conclusion
3
Introduction
This paper studies the problem of frequent pattern mining with uncertain data by examining the relative behavior of extensions of well-known classes of deterministic algorithms.
4
Definition
6
Algorithm
Step 1. Extending the H-mine Algorithm
Step 2. Extending the FP-growth Algorithm
Step 3. Computation of Support Upper Bounds
Step 4. Mining Frequent Patterns with UFP-tree
Step 5. Determining Support with a Trie Tree
7
H-Mine (Example)
TDB (min_sup_count = 2):
ID    Items               Frequent-item projection
100   c, d, e, f, g, i    c, d, e, g
200   a, c, d, e, m       a, c, d, e
300   a, b, d, e, g, k    a, d, e, g
400   a, c, d, h          a, c, d
First scan of TDB: the complete set of frequent items is found and output: { a:3, c:3, d:4, e:3, g:2 }.
The frequent items are listed in alphabetical order (called the F-list): a-c-d-e-g.
Second scan of TDB: build the H-struct in main memory from the frequent-item projections.
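The two scans in this example can be sketched in a few lines of Python. This is only an illustration of the counting and projection steps, not the H-struct itself; the names `frequent_items` and `project` are hypothetical helpers introduced here.

```python
from collections import Counter

# Transaction database from the slide (ID -> items).
TDB = {
    100: ["c", "d", "e", "f", "g", "i"],
    200: ["a", "c", "d", "e", "m"],
    300: ["a", "b", "d", "e", "g", "k"],
    400: ["a", "c", "d", "h"],
}
MIN_SUP_COUNT = 2


def frequent_items(tdb, min_sup):
    """First scan: count each item once per transaction, keep the frequent ones."""
    counts = Counter(item for items in tdb.values() for item in set(items))
    return {item: n for item, n in counts.items() if n >= min_sup}


def project(tdb, f_list):
    """Second scan: keep only frequent items, listed in F-list order."""
    return {
        tid: [item for item in f_list if item in set(items)]
        for tid, items in tdb.items()
    }


freq = frequent_items(TDB, MIN_SUP_COUNT)
f_list = sorted(freq)              # alphabetical F-list, as on the slide
projections = project(TDB, f_list)

print(freq)
print("-".join(f_list))
for tid, items in projections.items():
    print(tid, ", ".join(items))
```

The alphabetical ordering is the one the slide uses; any fixed total order on the frequent items would work for the projection step.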
10
H-Mine (Example) (Cont.)
H-struct: header table H linked to the frequent projections
Header table H:  a:3  c:3  d:4  e:3  g:2
Frequent projections:
100   c, d, e, g
200   a, c, d, e
300   a, d, e, g
400   a, c, d
11
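H-Mine (Example) (Cont.)
Header table H:  a:3  c:3  d:4  e:3  g:2
Header table for prefix a (a-projected database):  c:2  d:3  e:2  g:1
Frequent projections:
100   c, d, e, g
200   a, c, d, e
300   a, d, e, g
400   a, c, d
Frequent 2-itemsets with prefix a: ac:2, ad:3, ae:2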
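The header table for prefix a can be recomputed from the frequent projections. The sketch below assumes the projections are stored as plain lists in F-list order, whereas H-mine follows hyper-links inside the H-struct; `header_for_prefix` is a hypothetical helper name.

```python
from collections import Counter

# Frequent-item projections from the slide, items in F-list order.
PROJECTIONS = {
    100: ["c", "d", "e", "g"],
    200: ["a", "c", "d", "e"],
    300: ["a", "d", "e", "g"],
    400: ["a", "c", "d"],
}
MIN_SUP_COUNT = 2


def header_for_prefix(projections, prefix_item):
    """Count the items that follow prefix_item in each projection containing it."""
    counts = Counter()
    for items in projections.values():
        if prefix_item in items:
            counts.update(items[items.index(prefix_item) + 1:])
    return dict(counts)


h_a = header_for_prefix(PROJECTIONS, "a")
frequent_with_a = {
    "a" + item: n for item, n in sorted(h_a.items()) if n >= MIN_SUP_COUNT
}

print(h_a)
print(frequent_with_a)
```

Item g appears only once after a, so ag is dropped, which matches the three itemsets listed on the slide.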
12
H-Mine (Example) (Cont.)
Output for the TDB above (min_sup_count = 2):
a:3, c:3, d:4, e:3, g:2,
ac:2, ad:3, ae:2, acd:2, ade:2,
cd:3, ce:2, cde:2,
de:3, dg:2, deg:2,
eg:2
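The full output can be checked with a small recursive pattern-growth routine. This is a simplified sketch: it copies suffixes into new lists at every level instead of re-linking entries inside a shared H-struct as H-mine does, so it reproduces the result but not the memory behavior. The function name `mine` is an assumption made for this example.

```python
# Frequent-item projections from the slide, items in F-list order.
PROJECTIONS = [
    ["c", "d", "e", "g"],
    ["a", "c", "d", "e"],
    ["a", "d", "e", "g"],
    ["a", "c", "d"],
]
MIN_SUP_COUNT = 2


def mine(projections, min_sup, prefix=()):
    """Return {itemset tuple: support} for all frequent extensions of prefix."""
    counts = {}
    for items in projections:
        for item in items:
            counts[item] = counts.get(item, 0) + 1

    result = {}
    for item in sorted(counts):
        if counts[item] < min_sup:
            continue
        pattern = prefix + (item,)
        result[pattern] = counts[item]
        # Projected database for this pattern: what follows item in each
        # transaction that contains it.
        suffixes = [t[t.index(item) + 1:] for t in projections if item in t]
        result.update(mine(suffixes, min_sup, pattern))
    return result


patterns = {"".join(k): v for k, v in mine(PROJECTIONS, MIN_SUP_COUNT).items()}
print(len(patterns))
print(patterns)
```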
13
FP-growth (Example)
min_support = 3
TID   Items bought                (ordered) frequent items
100   {f, a, c, d, g, i, m, p}    {f, c, a, m, p}
200   {a, b, c, f, l, m, o}       {f, c, a, b, m}
300   {b, f, h, j, o, w}          {f, b}
400   {b, c, k, s, p}             {c, b, p}
500   {a, f, c, e, l, p, m, n}    {f, c, a, m, p}
Header table (item: frequency):  f:4  c:4  a:3  b:3  m:3  p:3
FP-tree:
{}
  f:4
    c:3
      a:3
        m:2
          p:2
        b:1
          m:1
    b:1
  c:1
    b:1
      p:1
f-c-a-m-p
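A minimal sketch of building this FP-tree in Python follows. It takes the item order directly from the slide's header table (f, c, a, b, m, p) rather than deriving it, because several items tie on frequency and the tie-break is not stated. The `Node` class and the helper names are assumptions made for this example, and the header-table node links used by FP-growth are omitted.

```python
from collections import Counter

TRANSACTIONS = {
    100: ["f", "a", "c", "d", "g", "i", "m", "p"],
    200: ["a", "b", "c", "f", "l", "m", "o"],
    300: ["b", "f", "h", "j", "o", "w"],
    400: ["b", "c", "k", "s", "p"],
    500: ["a", "f", "c", "e", "l", "p", "m", "n"],
}
MIN_SUPPORT = 3
HEADER_ORDER = ["f", "c", "a", "b", "m", "p"]  # order of the slide's header table


class Node:
    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}


def order_items(items, header_order):
    """Drop infrequent items and sort the rest in header-table order."""
    present = set(items)
    return [item for item in header_order if item in present]


def build_fp_tree(transactions, header_order):
    """Insert each ordered transaction as a path, sharing common prefixes."""
    root = Node(None)
    for items in transactions.values():
        node = root
        for item in order_items(items, header_order):
            if item not in node.children:
                node.children[item] = Node(item)
            node = node.children[item]
            node.count += 1
    return root


def as_dict(node):
    """Nested-dict view of the tree, keyed by 'item:count'."""
    return {f"{c.item}:{c.count}": as_dict(c) for c in node.children.values()}


counts = Counter(i for items in TRANSACTIONS.values() for i in set(items))
frequent = {i: n for i, n in counts.items() if n >= MIN_SUPPORT}
tree = build_fp_tree(TRANSACTIONS, HEADER_ORDER)

print(frequent)
print(as_dict(tree))
```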
14
Computation of Support Upper Bounds corollary
15
Mining Frequent Patterns with UFP-tree
Goal: avoid recursively constructing conditional FP-trees.
16
Trie Tree
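One way to picture Step 5 is a trie that stores candidate itemsets and accumulates their support in a single scan of the uncertain database. The sketch below assumes the usual expected-support definition for uncertain data (the sum over transactions of the product of the item probabilities, with items treated as independent). The transactions, probabilities, and candidates are made up for illustration and do not come from the slides, and `TrieNode`, `build_trie`, `accumulate`, and `collect` are hypothetical names.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_candidate = False
        self.expected_support = 0.0


def build_trie(candidates):
    """Store each candidate itemset as a path of items in sorted order."""
    root = TrieNode()
    for itemset in candidates:
        node = root
        for item in sorted(itemset):
            node = node.children.setdefault(item, TrieNode())
        node.is_candidate = True
    return root


def accumulate(node, items, probs, start, prob):
    """Add this transaction's contribution to every candidate it contains."""
    for i in range(start, len(items)):
        child = node.children.get(items[i])
        if child is None:
            continue
        p = prob * probs[items[i]]
        if child.is_candidate:
            child.expected_support += p
        accumulate(child, items, probs, i + 1, p)


def collect(node, prefix=()):
    """Return {itemset tuple: expected support} for all stored candidates."""
    out = {prefix: node.expected_support} if node.is_candidate else {}
    for item, child in node.children.items():
        out.update(collect(child, prefix + (item,)))
    return out


# Hypothetical uncertain transactions: item -> existence probability.
UNCERTAIN_TDB = [
    {"a": 0.5, "b": 1.0},
    {"a": 1.0, "b": 0.5, "c": 0.5},
]
CANDIDATES = [("a",), ("a", "b"), ("b", "c")]

trie = build_trie(CANDIDATES)
for transaction in UNCERTAIN_TDB:
    accumulate(trie, sorted(transaction), transaction, 0, 1.0)

supports = collect(trie)
print(supports)
```

Sharing prefixes in the trie means the probability product for a prefix such as a is computed once per transaction and reused for every candidate that extends it.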
17
Experiment Results
21
Conclusion
In these tests, we found that UApriori and UH-mine are both efficient at mining frequent itemsets.