1
Detailed Description of an Algorithm for Enumeration of Maximal Frequent Sets with Irredundant Dualization
Irredundant Border Enumerator (IBE)
Takeaki Uno, Ken Satoh
National Institute of Informatics, JAPAN
19/Nov/2003, FIMI 2003
2
Outline of This Talk
・ Explanation of our algorithm (an improved version of the algorithm of Gunopulos et al.)
・ Algorithmic technique using sparseness
・ Computational experiments on datasets
3
Algorithm of Gunopulos et al.
1. Find the minimal sets not included in any current maximal frequent set, by dualization.
2. If one of them is frequent, then find a maximal frequent set including it, and go to 1.
[Figure: itemset lattice between 11…1 and 00…0]
6
Algorithm of Gunopulos et al.
1. Find the minimal sets not included in any current maximal frequent set, by dualization.
2. If one of them is frequent, then find a maximal frequent set including it, and go to 1.
[Figure: itemset lattice between 11…1 and 00…0]
- solves dualization many times
- finds the same minimal set many times
Our algorithm dualizes and finds maximal elements simultaneously: Irredundant Dualization
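The two steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the dualization is a brute-force enumeration (exponential, for tiny inputs only), and the names `is_frequent`, `grow_to_maximal`, `minimal_uncovered_sets` and `dualize_and_advance` are hypothetical.

```python
from itertools import combinations

def is_frequent(itemset, transactions, minsup):
    """True if at least `minsup` transactions contain `itemset`."""
    return sum(1 for t in transactions if itemset <= t) >= minsup

def grow_to_maximal(itemset, items, transactions, minsup):
    """Greedily add items while the set stays frequent (one pass suffices,
    because a superset of an infrequent set is infrequent)."""
    current = set(itemset)
    for item in sorted(items):
        if item not in current and is_frequent(current | {item}, transactions, minsup):
            current.add(item)
    return frozenset(current)

def minimal_uncovered_sets(items, max_sets):
    """Brute-force stand-in for dualization (an assumption of this sketch):
    minimal itemsets that are included in no set of `max_sets`."""
    found = []
    for size in range(len(items) + 1):
        for combo in combinations(sorted(items), size):
            cand = frozenset(combo)
            if any(cand <= m for m in max_sets):
                continue  # covered by a current maximal set
            if any(f <= cand for f in found):
                continue  # a smaller uncovered set exists, so not minimal
            found.append(cand)
    return found

def dualize_and_advance(transactions, minsup):
    items = set().union(*transactions)
    max_sets = []
    while True:
        new_max = None
        # Step 1: a full dualization is recomputed in every round.
        for cand in minimal_uncovered_sets(items, max_sets):
            # Step 2: a frequent minimal set is extended to a maximal one.
            if is_frequent(cand, transactions, minsup):
                new_max = grow_to_maximal(cand, items, transactions, minsup)
                break
        if new_max is None:
            return max_sets  # every minimal set is infrequent: done
        max_sets.append(new_max)
```

On the transactions `abc`, `ab`, `ac`, `bd` with minimum support 2, the loop runs three dualizations and the minimal set `{d}` is produced in two of them, which is the redundancy the slide points out.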
7
Our Algorithm
1. When a frequent minimal set is found during dualization, find a maximal frequent set including it, and add it to the current set.
[Figure: itemset lattice between 11…1 and 00…0]
10
Our Algorithm
1. When a frequent minimal set is found during dualization, find a maximal frequent set including it, and add it to the current set.
[Figure: itemset lattice between 11…1 and 00…0]
- finds each minimal set once
- solves one dualization
- dualization can accept additional input
Incremental dualization by Kavvadias and Stavropoulos, or by Uno
11
Incremental Dualization
- Algorithms of Kavvadias and Stavropoulos, and of Uno (note: the input sets are the complements, in terms of dualization)
[Figure: example of incremental dualization on itemsets over the items A to E]
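To illustrate how a dualization that accepts additional input lets the whole enumeration run as a single pass, here is a small Python sketch. It is an assumption-laden illustration: the update step `add_max_set` uses a simple Berge-style multiplication, swapped in for brevity, and is not the algorithm of Kavvadias and Stavropoulos or of Uno. All function names are hypothetical.

```python
def is_frequent(itemset, transactions, minsup):
    """True if at least `minsup` transactions contain `itemset`."""
    return sum(1 for t in transactions if itemset <= t) >= minsup

def grow_to_maximal(itemset, items, transactions, minsup):
    """Greedily add items while the set stays frequent."""
    current = set(itemset)
    for item in sorted(items):
        if item not in current and is_frequent(current | {item}, transactions, minsup):
            current.add(item)
    return frozenset(current)

def add_max_set(minimal_sets, items, max_set):
    """Berge-style update (an assumption of this sketch): given the minimal
    sets for the old max. sets, return those for the old ones plus `max_set`."""
    edge = items - max_set  # dualization works on the complement
    kept = [s for s in minimal_sets if s & edge]  # still in no max. set
    candidates = {s | {e} for s in minimal_sets if not (s & edge) for e in edge}
    for cand in sorted(candidates, key=len):
        if not any(k <= cand for k in kept):  # keep only minimal candidates
            kept.append(cand)
    return kept

def enumerate_borders(transactions, minsup):
    items = frozenset().union(*transactions)
    max_sets, minimal_sets = [], [frozenset()]
    infrequent = set()  # minimal sets already tested, never tested again
    while True:
        untested = [s for s in minimal_sets if s not in infrequent]
        if not untested:
            return max_sets, minimal_sets
        cand = untested[0]
        if is_frequent(cand, transactions, minsup):
            new_max = grow_to_maximal(cand, items, transactions, minsup)
            max_sets.append(new_max)
            # The new max. set is fed into the running dualization.
            minimal_sets = add_max_set(minimal_sets, items, new_max)
        else:
            infrequent.add(cand)
```

In this sketch each minimal set is tested for frequency once: a frequent one disappears from `minimal_sets` as soon as the maximal set containing it is added, and an infrequent one stays minimal and is cached in `infrequent`.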
12
Algorithm Technique: crit
- Algorithms of Kavvadias and Stavropoulos, and of Uno, check minimality many times (each check takes O(|max sets|×|items|) time)
- Algorithm of Uno checks it by using "crit" (critical elements): crit(e,H) ≠ φ for every e in H ⇔ H is minimal
- crit can be updated for H ∪ {e} in O(|max sets|) time
improving factor = O(|items|)
items: the itemset (set of all items); |max sets|: # max. sets
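The crit bookkeeping can be sketched as below. The sketch assumes that crit(e,H) is the set of input sets whose only element in H is e, and that the input sets (called `edges` here) are the complements of the max. sets. The function names and the `uncovered` set are illustrative choices, not taken from the slides.

```python
def crit_from_scratch(hitting_set, edges):
    """crit[e] = indices of edges whose only element in hitting_set is e.
    Costs one intersection per edge, so it is the slow reference version."""
    crit = {e: set() for e in hitting_set}
    for i, edge in enumerate(edges):
        hit = edge & hitting_set
        if len(hit) == 1:
            (e,) = hit
            crit[e].add(i)
    return crit

def extend(hitting_set, crit, uncovered, edges, new_item):
    """Update crit and the uncovered edges for hitting_set plus new_item.
    The crit sets are disjoint, so each edge index is looked at once."""
    # An edge stops being critical for e once new_item also hits it.
    new_crit = {e: {i for i in c if new_item not in edges[i]}
                for e, c in crit.items()}
    # Edges hit only by new_item become critical for new_item.
    new_crit[new_item] = {i for i in uncovered if new_item in edges[i]}
    return hitting_set | {new_item}, new_crit, uncovered - new_crit[new_item]

def satisfies_minimality(crit):
    """True if every element of the set still has a critical edge."""
    return all(crit.values())
```

For example, with the edges `cd` and `bd` (complements of `ab` and `ac` over the items a to d), extending the empty set by `c` and then `b` leaves both crit sets non-empty, while adding `d` on top empties all of them, so `{b, c, d}` is rejected without re-intersecting it with every edge.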
13
Using Sparseness
- Checking minimality for all H ∪ {e} takes O(|max. sets|×|items|) time
- Checking them by tracing each max. set
- |items| ≫ ave. size of max. sets
[Figure: crit(*, H ∪ *) shown over the items e1 … e6, one row per max. set, with the remaining entries marked]
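One way to read "tracing each max. set" is sketched below. It is a reconstruction under assumptions, not the authors' code: crit(f,H) is taken to hold the input sets whose only element in H is f, and input set i is taken to be the complement of `max_sets[i]`. Under those assumptions, adding e keeps a critical set for f exactly when e lies in some max. set indexed by crit(f,H), so the max. sets can be walked instead of the much larger complements. The function name is hypothetical.

```python
def extendable_items(hitting_set, crit, max_sets, items):
    """Items e such that every f in hitting_set keeps a non-empty crit
    after e is added. Only the old elements are checked here; whether e
    itself gets a critical set must be checked separately."""
    ok = set(items) - set(hitting_set)
    for f in hitting_set:
        survivors = set()
        for i in crit[f]:
            # e in max_sets[i] means e is not in the complement (edge i),
            # so edge i stays critical for f after e is added.
            survivors |= max_sets[i]
        ok &= survivors
    return ok
```

Because the crit sets are disjoint, each max. set is traced at most once per call, so the work is bounded by the total size of the max. sets rather than by |max. sets|×|items|.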
14
Summary
- Irredundant dualization: O(1/|max. sets|)
- Checking minimality by crit: O(1/|items|)
- Speeding up by sparseness: O(size of max sets / |items|)
Computation time is reduced by a factor of O(size of max sets / (|items|² × |max sets|))
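The overall figure appears to be the product of the three listed factors, assuming the three improvements apply independently and therefore multiply:

```latex
\frac{1}{|\text{max sets}|}
\times \frac{1}{|\text{items}|}
\times \frac{\text{size of max sets}}{|\text{items}|}
= \frac{\text{size of max sets}}{|\text{items}|^{2}\,|\text{max sets}|}
```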
15
Comparison to Bottom Up
- Computation time depends on:
  Bottom-up approach (e.g., apriori): #frequent sets, #closed sets
  Our algorithm: #max. frequent sets, #min. infrequent sets
For instances with few minimal infrequent sets, our algorithm performs well
16
Experiments
17
Conclusion
- We improved the algorithm of Gunopulos et al. by irredundant dualization and sparse algorithms
- The computation time depends on #max. frequent sets and #min. infrequent sets (reduced by a factor of size of max sets / (|items|² × |max sets|))
For further improvements
- Speed up dualization by pruning unnecessary items
- Speed up updating occurrences by usual techniques