Detailed Description of an Algorithm for Enumeration of Maximal Frequent Sets with Irredundant Dualization: Irredundant Border Enumerator. Takeaki Uno.


1 Detailed Description of an Algorithm for Enumeration of Maximal Frequent Sets with Irredundant Dualization: Irredundant Border Enumerator
Takeaki Uno, Ken Satoh
National Institute of Informatics, JAPAN
19/Nov/2003, FIMI 2003

2 Outline of This Talk
・ Explanation of our algorithm (improved version of the algorithm of Gunopulos et al.)
・ Algorithm technique using sparseness
・ Computational experiments for datasets

3 Algorithm of Gunopulos et al.
1. Find minimal sets by dualization
2. If one of them is frequent, then find a maximal frequent set including it, and go to 1.
[Figure: itemset lattice between 11…1 and 00…0]

6 Algorithm of Gunopulos et al.
1. Find minimal sets by dualization
2. If one of them is frequent, then find a maximal frequent set including it, and go to 1.
[Figure: itemset lattice between 11…1 and 00…0]
Drawbacks:
- solves dualization many times
- finds the same minimal set many times
Our algorithm dualizes and finds maximal elements simultaneously (Irredundant Dualization)
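The two-step loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of Gunopulos et al.: the function names (`support`, `grow_to_maximal`, `minimal_transversals`, `dualize_and_advance`) are hypothetical, transactions are assumed to be frozensets of items, and the dualization step is a brute-force stand-in that is exponential in the number of items. Following the note on slide 11, the dualization input is taken to be the complements of the maximal frequent sets found so far.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def grow_to_maximal(itemset, items, transactions, minsup):
    """Greedily add items while the set stays frequent."""
    current = set(itemset)
    for e in sorted(items - itemset):
        if support(current | {e}, transactions) >= minsup:
            current.add(e)
    return frozenset(current)


def minimal_transversals(hyperedges, items):
    """Brute-force dualization (illustrative stand-in only):
    all minimal sets that intersect every hyperedge."""
    found = []
    for k in range(len(items) + 1):
        for cand in combinations(sorted(items), k):
            c = frozenset(cand)
            hits_all = all(c & h for h in hyperedges)
            if hits_all and not any(f <= c for f in found):
                found.append(c)
    return found


def dualize_and_advance(transactions, minsup):
    """Step 1: dualize from scratch. Step 2: if a minimal set is
    frequent, grow it to a maximal frequent set and go back to 1."""
    items = frozenset().union(*transactions)
    maximal = []
    while True:
        complements = [items - m for m in maximal]
        for h in minimal_transversals(complements, items):
            if support(h, transactions) >= minsup:
                maximal.append(grow_to_maximal(h, items, transactions, minsup))
                break  # restart the dualization with the enlarged input
        else:
            return maximal  # every minimal set is infrequent: done
```

Each pass of the `while` loop recomputes all minimal sets from scratch, which is the redundancy the slide points out.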

7 Our Algorithm
1. When a frequent minimal set is found during dualization, find a maximal frequent set including it, and add it to the current set.
[Figure: itemset lattice between 11…1 and 00…0]

10 Our Algorithm
1. When a frequent minimal set is found during dualization, find a maximal frequent set including it, and add it to the current set.
[Figure: itemset lattice between 11…1 and 00…0]
- finds each minimal set once
- solves one dualization
- the dualization must accept additional input: incremental dualization by Kavvadias and Stavropoulos, or by Uno
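A minimal sketch of this single-pass scheme is given below. It is an assumption-laden illustration rather than the authors' implementation: the names `irredundant_border`, `is_minimal` and `search` are hypothetical, transactions are assumed to be frozensets, and the depth-first search processes the input sets one at a time in the spirit of incremental dualization, without the crit or sparseness speed-ups. The dualization input is again the list of complements of the maximal frequent sets, and it grows while the search is running.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def grow_to_maximal(itemset, items, transactions, minsup):
    """Greedily add items while the set stays frequent."""
    current = set(itemset)
    for e in sorted(items - itemset):
        if support(current | {e}, transactions) >= minsup:
            current.add(e)
    return frozenset(current)


def irredundant_border(transactions, minsup):
    items = frozenset().union(*transactions)
    maximal = []         # maximal frequent sets found so far
    edges = []           # their complements: the growing dualization input
    min_infrequent = []  # minimal infrequent sets

    def is_minimal(h, upto):
        # Naive check: every e in h must be the only element of h
        # in at least one of the first `upto` edges.
        return all(any(h & edges[j] == {e} for j in range(upto)) for e in h)

    def search(h, i):
        # Invariant: h is a minimal set hitting edges[0..i-1].
        if i == len(edges):
            if support(h, transactions) < minsup:
                min_infrequent.append(h)
                return
            # h is frequent: extend it and feed the new edge to the
            # running dualization instead of restarting it.
            m = grow_to_maximal(h, items, transactions, minsup)
            maximal.append(m)
            edges.append(items - m)
        if h & edges[i]:
            search(h, i + 1)
            return
        for e in sorted(edges[i]):
            h2 = h | {e}
            if is_minimal(h2, i + 1):
                search(h2, i + 1)

    search(frozenset(), 0)
    return maximal, min_infrequent
```

On a toy database such as `[ABC, AB, BC, CD, CD]` with minimum support 2, the call returns the maximal frequent sets AB, BC, CD together with the minimal infrequent sets AC, AD, BD, each reported once.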

11 Incremental Dualization
- Algorithms of Kavvadias and Stavropoulos, and of Uno (!! in terms of dualization, the input sets are the complements)
[Figure: example of incremental dualization on item sets over the items A to E; layout not recoverable from the transcript]

12 Algorithm Technique: crit
- The algorithms of Kavvadias and Stavropoulos and of Uno check minimality many times (each check takes O(|max sets|×|items|) time)
- The algorithm of Uno checks it by using "crit" (critical elements): H is minimal ⇔ crit(e,H) ≠ φ for every e ∈ H
- crit can be updated for H ∪ {e} in O(|max sets|) time
- improving factor = O(|items|)
Notation: items: the set of all items; |max sets|: # max. sets
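The crit bookkeeping can be illustrated as follows. The sketch assumes the usual definition, which the slide does not spell out: crit(e,H) is the set of input sets (edges) whose intersection with H is exactly {e}. The names `crit_table` and `add_item` are hypothetical, edges are assumed to be frozensets, and crit values are stored as sets of edge indices.

```python
def crit_table(h, edges):
    """Compute crit(e, H) for every e in H from scratch:
    the indices of the edges whose only element in H is e."""
    table = {e: set() for e in h}
    for j, edge in enumerate(edges):
        hit = h & edge
        if len(hit) == 1:
            (e,) = hit
            table[e].add(j)
    return table


def add_item(h, table, uncovered, new_item, edges):
    """Update crit for H ∪ {new_item}.

    `uncovered` holds the indices of edges not hit by H. The crit sets
    and `uncovered` are pairwise disjoint, so the update touches each
    edge index at most once."""
    new_table = {
        f: {j for j in js if new_item not in edges[j]}
        for f, js in table.items()
    }
    new_table[new_item] = {j for j in uncovered if new_item in edges[j]}
    new_uncovered = uncovered - new_table[new_item]
    return h | {new_item}, new_table, new_uncovered


def is_minimal(table):
    """H is minimal iff every element has a non-empty crit set."""
    return all(table.values())
```

For example, with edges CD, AD, AB and H = {D}, adding A leaves crit(D) = {CD} and gives crit(A) = {AB}, so {A, D} is minimal, whereas adding C empties crit(C) and the extension is rejected.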

13 Using Sparseness
- Checking minimality for all H ∪ {e} takes O(|max. sets|×|items|) time
- Instead, check them by tracing each max. set
- The factor |items| becomes the average size of max. sets
[Figure: crit(*, H ∪ *) over the items e1 to e6 for each max. set, marking which entries remain]
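One possible reading of "tracing each max. set" is sketched below; the slide gives no details, so this is an interpretation and not the authors' code. It relies on the input edges being complements of the maximal sets: an edge in crit(f,H) stays critical after adding e exactly when e belongs to the corresponding maximal set. Scanning the (short) maximal sets therefore tests all extensions H ∪ {e} at once. The function name `minimal_extensions` and its parameters are hypothetical; `crit` maps each element of H to a set of edge indices and `uncovered` holds the indices of edges not yet hit by H.

```python
from collections import Counter


def minimal_extensions(h, crit, uncovered, max_sets, items):
    """Items e such that H ∪ {e} stays minimal and hits at least one
    uncovered edge, where edge j is the complement of max_sets[j]."""
    candidates = set(items - h)
    for f in h:
        # e keeps crit(f, H ∪ {e}) non-empty iff e lies in some
        # maximal set whose complement is critical for f.
        remains = set()
        for j in crit[f]:
            remains |= max_sets[j]
        candidates &= remains
    # e hits uncovered edge j iff e is NOT in max_sets[j]; count, per
    # item, the uncovered edges that it fails to hit.
    blocked = Counter()
    for j in uncovered:
        blocked.update(max_sets[j])
    return {e for e in candidates if blocked[e] < len(uncovered)}
```

The work is proportional to the total size of the maximal sets that are scanned, plus one pass over the items, instead of |max. sets| × |items|.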

14 Summary
- Irredundant dualization → O(1/|max. sets|)
- Checking minimality by crit → O(1/|items|)
- Speeding up by sparseness → O(size of max sets / |items|)
Computation time is reduced to O(size of max sets / (|items|² × |max sets|))
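The overall factor is simply the product of the three individual factors listed above. Writing s for the size of the max sets (a symbol introduced here only for brevity), the arithmetic is:

```latex
\frac{1}{|\text{max sets}|}
\cdot \frac{1}{|\text{items}|}
\cdot \frac{s}{|\text{items}|}
= \frac{s}{|\text{items}|^{2}\,|\text{max sets}|}
```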

15 Comparison to Bottom Up
- Computation time depends on:
  Bottom-up approach (e.g. apriori) → #frequent sets, #closed sets
  Our algorithm → #max. frequent sets, #min. infrequent sets
- For instances with few minimal infrequent sets, our algorithm performs well

16 Experiments

17 Conclusion
- We improved the algorithm of Gunopulos et al. by irredundant dualization and sparse algorithms
- The computation time depends on #max. frequent sets and #min. infrequent sets (reduced to size of max sets / (|items|² × |max sets|))
For further improvements
- Speed up dualization by pruning unnecessary items
- Speed up updating occurrences by usual techniques

