KDD’09, June 28-July 1, 2009, Paris, France. Copyright 2009 ACM. Frequent Pattern Mining with Uncertain Data.

Outline
Introduction
Definition
Algorithm
Experiment Results
Conclusion

Introduction
This paper studies the problem of frequent pattern mining with uncertain data by examining the relative behavior of extensions of well-known classes of deterministic algorithms.

Definition

Algorithm
Step 1. Extending the H-mine Algorithm
Step 2. Extending the FP-growth Algorithm
Step 3. Computation of Support Upper Bounds
Step 4. Mining Frequent Patterns with UFP-tree
Step 5. Determining Support with a Trie Tree

H-Mine (Example)
TDB (min_sup_count = 2)
ID   Items
100  c, d, e, f, g, i
200  a, c, d, e, m
300  a, b, d, e, g, k
400  a, c, d, h
Scan TDB: the complete set of frequent items can be found and output: { a:3, c:3, d:4, e:3, g:2 }
Following the alphabetical order of frequent items (called the F-list): a-c-d-e-g
ID   Frequent-item projection
100  c, d, e, g
200  a, c, d, e
300  a, d, e, g
400  a, c, d
Scan TDB: build the H-struct in main memory
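
The two scans on this slide can be sketched in a few lines of Python. This is only an illustration of the counting and projection steps, not the H-struct itself; the function names `frequent_items` and `frequent_projections` are made up for this sketch.

```python
from collections import Counter

# Transaction database copied from the slide.
TDB = {
    100: ["c", "d", "e", "f", "g", "i"],
    200: ["a", "c", "d", "e", "m"],
    300: ["a", "b", "d", "e", "g", "k"],
    400: ["a", "c", "d", "h"],
}
MIN_SUP_COUNT = 2


def frequent_items(tdb, min_sup_count):
    """First scan: count each item once per transaction, keep the frequent ones."""
    counts = Counter(item for items in tdb.values() for item in set(items))
    return {item: n for item, n in counts.items() if n >= min_sup_count}


def frequent_projections(tdb, f_list):
    """Second scan: drop infrequent items and order the rest by the F-list."""
    rank = {item: i for i, item in enumerate(f_list)}
    return {
        tid: sorted((i for i in items if i in rank), key=rank.__getitem__)
        for tid, items in tdb.items()
    }


supports = frequent_items(TDB, MIN_SUP_COUNT)
f_list = sorted(supports)  # alphabetical order, as on the slide
projections = frequent_projections(TDB, f_list)

print(supports)
print(f_list)
print(projections)
```

Sorting the supports dictionary gives the alphabetical F-list used on the slide; a different ordering (for example by frequency) would only change the `rank` mapping.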

H-Mine (Example) (Cont.)
H-struct
Header table H: a:3, c:3, d:4, e:3, g:2
Frequent projections
100  c, d, e, g
200  a, c, d, e
300  a, d, e, g
400  a, c, d

H-Mine (Example) (Cont.)
Header table H: a:3, c:3, d:4, e:3, g:2
Frequent projections
100  c, d, e, g
200  a, c, d, e
300  a, d, e, g
400  a, c, d
Header (counts within the projections that contain a): c:2, d:3, e:2, g:1
ac: 2
ad: 3
ae: 2

H-Mine (Example) (Cont.)
TDB (min_sup_count = 2)
ID   Items
100  c, d, e, f, g, i
200  a, c, d, e, m
300  a, b, d, e, g, k
400  a, c, d, h
Output: a:3, c:3, d:4, e:3, g:2, ac:2, ad:3, ae:2, acd:2, ade:2, cd:3, ce:2, cde:2, de:3, dg:2, deg:2, eg:2
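
The output list can be checked with a small pattern-growth sketch. Note the assumption: this version copies suffixes into new lists for each projected database, whereas H-mine works in place on the H-struct. It is meant only to reproduce the supports from the example, and the function name `mine` is made up here.

```python
from collections import Counter

# Frequent-item projections from the example, in F-list order a-c-d-e-g.
PROJECTIONS = [
    ["c", "d", "e", "g"],  # 100
    ["a", "c", "d", "e"],  # 200
    ["a", "d", "e", "g"],  # 300
    ["a", "c", "d"],       # 400
]
MIN_SUP_COUNT = 2


def mine(transactions, min_sup_count, prefix=""):
    """Return {pattern: support} for every frequent pattern extending prefix."""
    counts = Counter(item for t in transactions for item in t)
    result = {}
    for item in sorted(counts):
        if counts[item] < min_sup_count:
            continue
        pattern = prefix + item
        result[pattern] = counts[item]
        # Projected database for this pattern: whatever follows `item`
        # in each transaction that contains it.
        projected = [t[t.index(item) + 1:] for t in transactions if item in t]
        result.update(mine(projected, min_sup_count, pattern))
    return result


patterns = mine(PROJECTIONS, MIN_SUP_COUNT)
print(patterns)
```

Because every transaction is already sorted in F-list order, each pattern is generated exactly once, from its first item onward.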

FP-growth (Example)
min_support = 3
TID  Items bought               (Ordered) frequent items
100  {f, a, c, d, g, i, m, p}   {f, c, a, m, p}
200  {a, b, c, f, l, m, o}      {f, c, a, b, m}
300  {b, f, h, j, o, w}         {f, b}
400  {b, c, k, s, p}            {c, b, p}
500  {a, f, c, e, l, p, m, n}   {f, c, a, m, p}
Header table
Item  Frequency
f     4
c     4
a     3
b     3
m     3
p     3
FP-tree
{}
  f:4
    c:3
      a:3
        m:2
          p:2
        b:1
          m:1
    b:1
  c:1
    b:1
      p:1
f-c-a-m-p
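
The tree on this slide can be rebuilt with a minimal sketch like the one below. It assumes the item order shown in the header table (f, c, a, b, m, p) rather than deriving it, since several items tie on frequency; the `Node` class and the helpers `build_fp_tree` and `dump` are illustrative names, and header-table node links are left out.

```python
from collections import Counter

# Transactions copied from the slide.
TRANSACTIONS = [
    "f a c d g i m p".split(),  # 100
    "a b c f l m o".split(),    # 200
    "b f h j o w".split(),      # 300
    "b c k s p".split(),        # 400
    "a f c e l p m n".split(),  # 500
]
MIN_SUPPORT = 3
# Item order taken from the header table on the slide (assumed, not derived).
HEADER_ORDER = ["f", "c", "a", "b", "m", "p"]


class Node:
    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, order):
    """Insert each transaction's frequent items, in header order, into a prefix tree."""
    rank = {item: i for i, item in enumerate(order)}
    root = Node(None)
    for items in transactions:
        node = root
        for item in sorted((i for i in items if i in rank), key=rank.__getitem__):
            if item not in node.children:
                node.children[item] = Node(item)
            node = node.children[item]
            node.count += 1
    return root


def dump(node, depth=0):
    """Return the tree as indented 'item:count' lines, children in insertion order."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(dump(child, depth + 1))
    return lines


item_counts = Counter(i for t in TRANSACTIONS for i in set(t))
frequent = {i: n for i, n in item_counts.items() if n >= MIN_SUPPORT}
tree = build_fp_tree(TRANSACTIONS, HEADER_ORDER)
print(frequent)
print("\n".join(dump(tree)))
```

Transactions that share a prefix in header order share a path in the tree, which is why the three transactions starting with f, c, a collapse into a single branch.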

Computation of Support Upper Bounds corollary

Mining Frequent Patterns with UFP-tree
Goal: avoid recursively constructing conditional FP-trees.

Trie Tree

Experiment Results

Conclusion
In these tests, we found that UApriori and UH-mine are both efficient in mining frequent itemsets.
