Mining General Temporal Association Rules for Items with Different Exhibition Cheng-Yue Chang, Ming-Syan Chen, Chang-Hung Lee, Proc. of the 2002 IEEE international.


1 Mining General Temporal Association Rules for Items with Different Exhibition Periods. Cheng-Yue Chang, Ming-Syan Chen, Chang-Hung Lee. Proc. of the 2002 IEEE International Conference on Data Mining (ICDM’02). Adviser: Jia-Ling Koh. Speaker: Yu-ting Kung.

2 Introduction. This paper explores a new model for mining general temporal association rules from a large database in which the exhibition periods of the items are allowed to differ from one another. (See next slide.)

3 Introduction (Cont.) What goes wrong when a conventional mining algorithm is applied to this database? For example, with min_supp = 30% and min_conf = 75%, conventional mining finds only {A}, {B}, {C} and {F} as frequent itemsets, so no association rule is discovered. But some rules do exist in this database!

4 Introduction (Cont.) What is the problem with the conventional mining algorithm? It does not take the individual exhibition periods of the items into consideration.

5 Introduction (Cont.) To allow items to have different exhibition periods, three basic definitions are introduced. The first is the maximal common exhibition period (MCP): for an itemset X, MCP(X) = [p, q], where p is the latest exhibition start time and q is the earliest exhibition end time among the items in X. For example (in Figure 1): MCP(BC) = [2,3].
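The MCP definition can be sketched in a few lines of Python. This is only an illustration: the exhibition periods in `PERIODS` are assumed values, chosen to be consistent with the MCPs quoted in these slides (for example MCP(BC) = [2,3]); they are not copied from Figure 1, and the function name `mcp` is hypothetical.

```python
# Assumed exhibition periods (start, end) per item, as partition indices.
# Hypothetical values, picked only to agree with the MCPs quoted in the slides.
PERIODS = {"A": (1, 4), "B": (2, 4), "C": (1, 3),
           "D": (1, 2), "E": (3, 4), "F": (1, 3)}

def mcp(itemset, periods=PERIODS):
    """Return MCP(X) = (p, q): latest start and earliest end over the items.

    Returns None when the items are never exhibited at the same time.
    """
    p = max(periods[i][0] for i in itemset)  # latest exhibition start time
    q = min(periods[i][1] for i in itemset)  # earliest exhibition end time
    return (p, q) if p <= q else None

print(mcp("BC"))   # MCP of itemset {B, C}
print(mcp("BCD"))  # MCP of itemset {B, C, D}
```

Under these assumed periods, `mcp("BC")` gives (2, 3) and `mcp("BCD")` gives (2, 2), matching the intervals used later in the slides.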

6 Introduction (Cont.) The second and third definitions are relative support and confidence, each illustrated on the slide with an example from Figure 1.
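For reference, the two measures can be written out as below. This is a sketch of how relative support and confidence are usually stated in this model, not a copy of the slide's formulas: the notation db^{p,q} for the transactions in partitions p through q is an assumption, and the Figure 1 examples are not reproduced.

```latex
\mathrm{supp}\big(X^{MCP(X)}\big)
  = \frac{\left|\{\, T \in db^{MCP(X)} : X \subseteq T \,\}\right|}{\left| db^{MCP(X)} \right|}
\qquad
\mathrm{conf}\big((X \Rightarrow Y)^{MCP(XY)}\big)
  = \frac{\mathrm{supp}\big((X \cup Y)^{MCP(XY)}\big)}{\mathrm{supp}\big(X^{MCP(XY)}\big)}
```

The point of both definitions is that the denominator counts only the transactions inside the relevant exhibition window, rather than the whole database.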

7 Introduction (Cont.) Based on the definitions above, the frequent general temporal association rules in this database are:

8 Introduction (Cont.) In this model, the “downward closure” property no longer holds. For example (in Figure 1), itemset BCD is frequent in [2,2], but BC, BD and CD are not all frequent in their corresponding MCPs: BC’s relative support is only 25% (< 30%).
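The failure of downward closure can be reproduced on a toy database. The sketch below is an assumption-laden illustration: `PERIODS` and `DB` are invented so that BCD is frequent in [2,2] while BC reaches only 25% over its own MCP [2,3]; they are not the contents of Figure 1, and the helper names are hypothetical.

```python
from fractions import Fraction

# Assumed exhibition periods (start, end) per item; hypothetical values.
PERIODS = {"A": (1, 4), "B": (2, 4), "C": (1, 3),
           "D": (1, 2), "E": (3, 4), "F": (1, 3)}

# Assumed toy database: partition index -> list of transactions.
DB = {
    1: [{"A", "C"}, {"C", "F"}, {"A", "F"}, {"A", "D"}],
    2: [{"B", "C", "D"}, {"B", "C", "D"}, {"A", "B"}, {"A", "F"}],
    3: [{"A", "B"}, {"C", "F"}, {"E", "F"}, {"A", "E"}],
    4: [{"A", "B"}, {"A", "E"}, {"B", "E"}, {"A"}],
}

def mcp(itemset):
    """Latest start and earliest end over the items of the itemset."""
    return (max(PERIODS[i][0] for i in itemset),
            min(PERIODS[i][1] for i in itemset))

def relative_support(itemset):
    """Fraction of transactions inside MCP(itemset) that contain the itemset."""
    p, q = mcp(itemset)
    window = [t for part in range(p, q + 1) for t in DB[part]]
    hits = sum(1 for t in window if set(itemset) <= t)
    return Fraction(hits, len(window))

MIN_SUPP = Fraction(30, 100)
for x in ("BCD", "BC"):
    s = relative_support(x)
    print(x, mcp(x), s, "frequent" if s >= MIN_SUPP else "not frequent")
```

BCD is counted against the 4 transactions of partition 2 only, whereas BC is counted against the 8 transactions of partitions 2 and 3. The wider window enlarges the denominator without adding occurrences, which is why a subset can be infrequent even though its superset is frequent.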

9 Problem Description. Maximal temporal itemset: a temporal itemset X^{p,q} is maximal when [p,q] = MCP(X). For example: BCD^{2,2} (yes), BD^{2,2} (yes), BC^{2,2} (no, since MCP(BC) = [2,3]). Temporal sub-itemset of a maximal temporal itemset: for example, BCD^{2,2} is a maximal temporal itemset, so BD^{2,2}, BC^{2,2} and CD^{2,2} are temporal sub-itemsets of BCD^{2,2}.

10 Problem Description (Cont.) Frequent maximal temporal itemset: for a maximal TI X^{MCP(X)}, if supp(X^{MCP(X)}) >= min_supp, then X^{MCP(X)} is frequent. Property: all temporal sub-itemsets of a frequent maximal temporal itemset are frequent. General temporal association rule: it is frequent iff (condition given as a formula on the slide).
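A minimal sketch of how the temporal sub-itemsets (SIs) of a maximal temporal itemset could be enumerated is given below. The function name is hypothetical, and including single-item subsets is an assumption; the slides only list the 2-item SIs of BCD^{2,2}.

```python
from itertools import combinations

def temporal_sub_itemsets(itemset, period):
    """List the proper, non-empty subsets of `itemset`, each kept in the
    same period [p, q] as the maximal temporal itemset it comes from."""
    items = sorted(itemset)
    return [("".join(sub), period)
            for k in range(len(items) - 1, 0, -1)   # larger subsets first
            for sub in combinations(items, k)]

for name, period in temporal_sub_itemsets("BCD", (2, 2)):
    print(name, period)
```

The property on the slide follows from the shared window: every transaction in [p, q] that contains X also contains each subset Y of X, and both supports use the same denominator, so supp(Y^{p,q}) >= supp(X^{p,q}) >= min_supp.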

11 Mining General Temporal Association Rules: SPF Algorithm. SPF consists of two major procedures: Segmentation (ProcSG) and Progressively Filtering (ProcPF). First, SPF divides the database into partitions according to the time granularity imposed. Second, SPF employs ProcSG. Third, SPF utilizes ProcPF. Then it generates all candidate k-itemsets from the (k-1)-itemsets, transforms them to TIs, and generates the SIs. Finally, it scans the database to determine all frequent TIs and SIs.

12 SPF Algorithm: ProcSG. ProcSG segments the database into sub-databases such that the items in each sub-database have either a common starting time or a common ending time. For example: db^{1,6} → db^{1,3}, db^{4,4} and db^{5,6}.
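The segmentation condition can be checked with a short sketch. This is not ProcSG itself, whose steps the slides do not spell out. It only tests whether a given segment satisfies the stated condition, under the assumption that each item's period is clipped to the segment before starts and ends are compared. `PERIODS` holds assumed values and `shares_boundary` is a hypothetical name.

```python
# Assumed exhibition periods (start, end) per item; hypothetical values.
PERIODS = {"A": (1, 4), "B": (2, 4), "C": (1, 3),
           "D": (1, 2), "E": (3, 4), "F": (1, 3)}

def shares_boundary(segment, periods=PERIODS):
    """True if all items exhibited inside `segment` have a common start
    or a common end once their periods are clipped to the segment."""
    s, e = segment
    clipped = [(max(a, s), min(b, e))
               for a, b in periods.values()
               if a <= e and b >= s]          # keep items overlapping the segment
    starts = {a for a, _ in clipped}
    ends = {b for _, b in clipped}
    return len(starts) <= 1 or len(ends) <= 1

for seg in [(1, 4), (1, 2), (3, 4)]:
    print(seg, shares_boundary(seg))
```

With these assumed periods the whole range [1,4] fails the condition, while [1,2] (common end) and [3,4] (common start) both satisfy it.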

13 SPF Algorithm: ProcPF. After the entire database has been segmented by ProcSG, ProcPF progressively filters candidate 2-itemsets from one partition to the next within each sub-database.

14 An Illustrative Example (SPF). Illustrative example: Figure 1, with min_supp = 30% and min_conf = 75%. Using ProcSG, the database is split into sub-databases: db^{1,4} → db^{1,2} and db^{3,4} (two sub-segments).

15 An Illustrative Example (SPF). Use ProcPF to progressively filter the candidate 2-itemsets.

16 An Illustrative Example (SPF). After the 1st database scan, C2 = {AB, BC, BD, CD, CF, EF}. Generate C3: C3 = {BCD}. Transform to TIs and generate SIs. After the 2nd database scan, the frequent TIs = {AB^{2,4}, BD^{2,2}, CF^{1,3}, EF^{3,3}, BCD^{2,2}}.
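The step from C2 to C3 can be reproduced with a standard Apriori-style join and prune. The sketch below assumes that this is how the candidate 3-itemsets are generated; the slides only say that candidate k-itemsets come from the (k-1)-itemsets. The name `apriori_gen` is hypothetical.

```python
from itertools import combinations

def apriori_gen(prev):
    """Join (k-1)-itemsets that share their first k-2 items, then prune any
    candidate that has a (k-1)-subset missing from `prev`."""
    prev = {tuple(sorted(x)) for x in prev}
    if not prev:
        return set()
    k = len(next(iter(prev))) + 1
    candidates = set()
    for a in prev:
        for b in prev:
            if a[:-1] == b[:-1] and a[-1] < b[-1]:       # join step
                cand = a + (b[-1],)
                if all(sub in prev for sub in combinations(cand, k - 1)):
                    candidates.add(cand)                  # survives the prune step
    return candidates

C2 = ["AB", "BC", "BD", "CD", "CF", "EF"]
C3 = apriori_gen(C2)
print(C3)
```

Here BC and BD join to BCD, which is kept because CD is also in C2. CD and CF join to CDF, which is pruned because DF is not in C2. The result agrees with C3 = {BCD} on the slide.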

17 Experiment. Data: |D| = the number of transactions; |T| = the average size of the transactions; |N| = the number of different items; |L| = the number of potential frequent itemsets. Algorithms compared: SPF and Apriori^{IP}.

18 Experiment (Cont.)

19 Experiment (Cont.)

