Summary of Frequent Pattern Mining
What is FPM? Why is being frequent so important? Applications of FPM: decision making/business; software debugging; bioinformatics; other data mining tasks (indexing, clustering/classification/association rules)
What has been done: frequent itemset mining; frequent sequential pattern mining; frequent subgraph mining; frequent tree mining; mining a single large graph; frequent motifs
FPM is a way to think [Figure: illustration with elements labeled A to F]
Algorithm foundations: the Apriori property; enumeration algorithms (level-wise search, depth-first search); data structures (for patterns, for data)
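The depth-first option listed above can be sketched as follows. This is a minimal illustration, not a method taken from the slides: it assumes itemset data, a vertical layout (each item mapped to the set of transaction ids containing it) as the data structure for the data, and an absolute support threshold. The names `vertical_layout`, `dfs_mine`, and `min_support` are hypothetical.

```python
def vertical_layout(transactions):
    """Map each item to the set of transaction ids that contain it."""
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def dfs_mine(transactions, min_support):
    """Depth-first enumeration of frequent itemsets (illustrative sketch).

    Returns a dict mapping each frequent itemset (a frozenset) to its
    absolute support, i.e. the number of transactions containing it.
    """
    tidsets = vertical_layout(transactions)
    # Keep only frequent single items, in a fixed order so that every
    # itemset is generated exactly once.
    items = sorted(i for i, tids in tidsets.items() if len(tids) >= min_support)
    result = {}

    def extend(prefix, prefix_tids, candidates):
        for idx, item in enumerate(candidates):
            tids = prefix_tids & tidsets[item]
            if len(tids) < min_support:
                # Apriori property: no superset of an infrequent
                # pattern can be frequent, so this branch is cut.
                continue
            pattern = prefix + (item,)
            result[frozenset(pattern)] = len(tids)
            # Only extend with later items to avoid duplicates.
            extend(pattern, tids, candidates[idx + 1:])

    extend((), set(range(len(transactions))), items)
    return result
```

On the toy transactions `[{"A","B","C"}, {"A","B"}, {"A","C"}, {"B","C"}, {"A","B","C","D"}]` with `min_support=3`, the sketch finds the three single items A, B, C and the three pairs AB, AC, BC; the branch for ABC is pruned because only two transactions contain it.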
Lattice
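As a rough illustration of the lattice view, assuming the patterns are itemsets: the candidate patterns form the subset lattice of the items, and support can only stay the same or drop when moving from an itemset to one of its supersets. The helper names below (`support`, `lattice_levels`, `supports_are_anti_monotone`) are hypothetical and the code is a sketch for small examples only, since the lattice has 2^n nodes.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def lattice_levels(items):
    """Return the subset lattice of `items`, one list per itemset size."""
    items = sorted(items)
    return [
        [frozenset(c) for c in combinations(items, k)]
        for k in range(len(items) + 1)
    ]


def supports_are_anti_monotone(transactions):
    """Check support(parent) >= support(child) along every lattice edge."""
    items = set().union(*transactions)
    for level in lattice_levels(items)[1:]:
        for node in level:
            for item in node:
                parent = node - {item}  # one step down in the lattice
                if support(parent, transactions) < support(node, transactions):
                    return False
    return True
```

The check always succeeds for real transaction data, which is exactly what makes pruning along lattice edges safe.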
Apriori: R. Agrawal and R. Srikant. Fast algorithms for mining association rules. VLDB, 1994.
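A minimal sketch of the level-wise idea behind Apriori, under stated assumptions: candidates of size k are built from frequent itemsets of size k-1, candidates with an infrequent subset are discarded before counting, and the data is scanned once per level. The join step here is a simplified pairwise union rather than the prefix-based join of the original paper, and the names `apriori` and `min_support` (an absolute count) are illustrative.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise frequent itemset mining (simplified sketch).

    Returns a dict mapping each frequent itemset (a frozenset) to its
    absolute support.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join: unions of two frequent (k-1)-itemsets that have size k.
        # Prune: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # One pass over the data to count the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent
```

The loop stops at the first level that produces no frequent itemsets, because by the Apriori property no larger itemset can be frequent either.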
Resources and Tools. Important FPM websites: FIMI workshop website; Mining Structure Data website. Commercial databases: Oracle, IBM DB2, SQL Server. General data mining information: KDnuggets (general/jobs/software, etc.); Weka (
Why does FPM not work? Too many patterns? What can we do? Pattern pruning (additional constraints?); pattern summarization (representative patterns?); pattern ranking
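One common way to shrink the output, offered here as an assumption about what "representative patterns" could mean rather than as the approach the slides have in mind, is to keep only closed itemsets (no proper superset has the same support) or maximal itemsets (no proper superset is frequent). The sketch below takes a dict of frequent itemsets and their supports; the function names are hypothetical.

```python
def closed_patterns(frequent):
    """Keep itemsets with no proper superset of equal support.

    `frequent` maps frozenset itemsets to their support. A superset with
    the same support is itself frequent, so looking only inside
    `frequent` is sufficient.
    """
    return {
        p: s
        for p, s in frequent.items()
        if not any(p < q and s == sq for q, sq in frequent.items())
    }


def maximal_patterns(frequent):
    """Keep itemsets with no proper superset among the frequent ones."""
    return {
        p: s
        for p, s in frequent.items()
        if not any(p < q for q in frequent)
    }
```

Closed itemsets preserve every support value exactly, while maximal itemsets are more compact but only record which itemsets are frequent, not how often their subsets occur. Both filters are quadratic in the number of frequent itemsets, which is acceptable for illustration but not for large outputs.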
What is missing: a common foundation for FPM, clustering, classification, etc.; FPM formalization (language/compiler/automatic discovery); FPM understanding (how and why are the patterns generated? what is the relationship between dataset and pattern?)
How does FIM relate to the underlying structure of the dataset?