On Mining General Temporal Association Rules in a Publication Database Chang-Hung Lee, Cheng-Ru Lin and Ming-Syan Chen, Proceedings of the 2001 IEEE.

1 On Mining General Temporal Association Rules in a Publication Database
Chang-Hung Lee, Cheng-Ru Lin and Ming-Syan Chen
Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM’01), 29 Nov.-2 Dec. 2001, pp. 337–344.
Advisor: Jia-Ling Koh
Speaker: Chen-Yi Lin
Department of Information & Computer Education, NTNU

2 Outlines
–Introductions
–Problem Description
–General Temporal Association Rules (Progressive Partition Miner, PPM)
–Experimental Results
–Conclusions

3 Introductions (1/3)
A publication database is a set of transactions in which each transaction T is a set of items, and each item has its own individual exhibition period.

4 Introductions (2/3)

Bookstore Transaction Database
Date    TID  Itemset
Jan-01  T1   BD
        T2   BCD
        T3   BC
        T4   AD
Feb-01  T5   BCE
        T6   DE
        T7   ABC
        T8   CDE
Mar-01  T9   BCEF
        T10  BF
        T11  AD
        T12  BDF

Item Information
Item  Publication Date
A     Jan-95
B     Apr-96
C     July-97
D     Aug-00
E     Feb-01
F     Mar-01

min_sup = 30%, min_conf = 75%
Frequent itemsets: {B, C, D, E, BC}
Frequent association rule: C => B
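
The frequent itemsets and the rule on this slide can be reproduced with plain (non-temporal) support counting over all 12 transactions. The following is a minimal brute-force sketch, not the authors' code; the names `TRANSACTIONS`, `support_count` and `frequent_itemsets` are illustrative assumptions.

```python
from itertools import combinations

# Transactions T1..T12 copied from the slide.
TRANSACTIONS = ["BD", "BCD", "BC", "AD", "BCE", "DE",
                "ABC", "CDE", "BCEF", "BF", "AD", "BDF"]
MIN_SUP = 0.30
MIN_CONF = 0.75


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for tx in transactions if items <= set(tx))


def frequent_itemsets(transactions, min_sup):
    """Brute-force enumeration; acceptable for a six-item toy example."""
    items = sorted(set("".join(transactions)))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            count = support_count(combo, transactions)
            if count / len(transactions) >= min_sup:
                result["".join(combo)] = count
    return result


frequent = frequent_itemsets(TRANSACTIONS, MIN_SUP)
print(frequent)  # {'B': 8, 'C': 6, 'D': 7, 'E': 4, 'BC': 5}

# Confidence of the two rules that can be built from BC.
conf_c_b = frequent["BC"] / frequent["C"]  # 5/6, above min_conf
conf_b_c = frequent["BC"] / frequent["B"]  # 5/8, below min_conf
```

Items A and F each occur in only 3 of 12 transactions (25%), so they are dropped even though F is in every transaction of its own publication month, which is the imbalance the next slide points out.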

5 Introductions (3/3)
The current model of association rule mining is not able to handle the publication database:
–Lack of consideration of the exhibition period of each individual item
–Lack of an equitable support counting basis for each item

6 Problem Description (1/4)

Bookstore Transaction Database D = P1 + P2 + P3
Partition  Date    TID  Itemset  Data set starting here
P1         Jan-01  T1   BD       db^{1,3}
                   T2   BCD
                   T3   BC
                   T4   AD
P2         Feb-01  T5   BCE      db^{2,3}
                   T6   DE
                   T7   ABC
                   T8   CDE
P3         Mar-01  T9   BCEF     db^{3,3}
                   T10  BF
                   T11  AD
                   T12  BDF

Time granularity: Month

7 Problem Description (2/4)
Maximal common exhibition period (MCP) of items
–MCP(x): the MCP value of item x
–For example: MCP(C) = (1, 3) and MCP(E) = (2, 3) => MCP(CE) = (2, 3)
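
A small sketch of how the MCP of an itemset follows from the MCP of its items: the common period begins at the latest start among the items and ends at the last partition. The `START` table is an assumption derived from the publication dates (items published before Jan-01 are taken to start in partition 1), and `mcp` is a hypothetical helper name.

```python
# Assumed start partition per item: A-D were published before Jan-01,
# E in Feb-01 (partition 2), F in Mar-01 (partition 3).
START = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 2, "F": 3}
N_PARTITIONS = 3


def mcp(itemset, start=START, n=N_PARTITIONS):
    """Maximal common exhibition period of an itemset as (start, end)."""
    return (max(start[item] for item in itemset), n)


print(mcp("C"), mcp("E"), mcp("CE"))  # (1, 3) (2, 3) (2, 3)
```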

8 Problem Description (3/4)
An association rule (X => Y)^{MCP(XY)} is called a general temporal association rule.
(X => Y)^{MCP(XY)} is frequent if and only if
–supp((X => Y)^{MCP(XY)}) >= min_supp
–and conf((X => Y)^{MCP(XY)}) >= min_conf
–For example: (C => E)^{2,3} is a general temporal association rule. With min_supp = 30% and min_conf = 75%, its support of 37.50% and confidence of 75.00% make (C => E)^{2,3} frequent.
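
The (C => E)^{2,3} example can be checked by counting only within db^{2,3}, that is, partitions P2 and P3. This is a minimal sketch under that reading of the definition; `db`, `count` and `temporal_supp_conf` are illustrative names, not from the paper.

```python
# Partitions P1..P3 copied from the bookstore example.
PARTITIONS = {
    1: ["BD", "BCD", "BC", "AD"],
    2: ["BCE", "DE", "ABC", "CDE"],
    3: ["BCEF", "BF", "AD", "BDF"],
}
MIN_SUPP = 0.30
MIN_CONF = 0.75


def db(t, n, partitions=PARTITIONS):
    """Transactions of db^{t,n}: partitions t through n, concatenated."""
    return [tx for p in range(t, n + 1) for tx in partitions[p]]


def count(itemset, transactions):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for tx in transactions if set(itemset) <= set(tx))


def temporal_supp_conf(x, y, period):
    """Support and confidence of (x => y) measured inside db^{period}."""
    window = db(*period)
    both = count(x + y, window)
    return both / len(window), both / count(x, window)


supp, conf = temporal_supp_conf("C", "E", (2, 3))
print(supp, conf)  # 0.375 0.75
is_frequent = supp >= MIN_SUPP and conf >= MIN_CONF
```

Counted over the whole database instead, CE would have support 3/12 = 25% and would miss the 30% threshold, which is why the counting basis matters.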

9 Problem Description (4/4)
When a maximal temporal k-itemset is frequent in a data set, each of its corresponding sub-itemsets is also frequent in that data set.
–For example: [itemset not preserved in the transcript] is frequent => [sub-itemsets not preserved in the transcript] are also frequent.

10 General Temporal Association Rules (1/6)

Bookstore Transaction Database D
Partition  Date    TID  Itemset  Data set starting here
P1         Jan-01  T1   BD       db^{1,3}
                   T2   BCD
                   T3   BC
                   T4   AD
P2         Feb-01  T5   BCE      db^{2,3}
                   T6   DE
                   T7   ABC
                   T8   CDE
P3         Mar-01  T9   BCEF     db^{3,3}
                   T10  BF
                   T11  AD
                   T12  BDF

min_sup = 30%, min_conf = 75%

11 General Temporal Association Rules (2/6)
Scan DB (1)

P1
C2  start  count  status
BD  1      2
BC  1      2
CD  1      1      pruned
AD  1      1      pruned

P1+P2
C2  start  count  status
BD  1      2      pruned
BC  1      4
BE  2      1      pruned
CE  2      2
DE  2      2
AB  2      1      pruned
AC  2      1      pruned
CD  2      1      pruned

P1+P2+P3
C2  start  count  status
BC  1      5
CE  2      3
DE  2      2      pruned
BE  3      1      pruned
BF  3      3
CF  3      1      pruned
EF  3      1      pruned
AD  3      1      pruned
BD  3      1      pruned
DF  3      1      pruned
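
The three tables can be reproduced with a short progressive-filtering sketch: each 2-itemset carries the partition where it was first seen (start) and a running count, and after every partition it is kept only if its count reaches 30% of the transactions from its start partition up to the current one. This is an illustrative reading of the slide, not the authors' implementation; `first_scan` is a hypothetical name.

```python
from itertools import combinations

PARTITIONS = [
    ["BD", "BCD", "BC", "AD"],      # P1
    ["BCE", "DE", "ABC", "CDE"],    # P2
    ["BCEF", "BF", "AD", "BDF"],    # P3
]
MIN_SUP = 0.30


def first_scan(partitions, min_sup):
    """Return the surviving candidate 2-itemsets as {pair: (start, count)}."""
    survivors = {}  # pair -> [start partition, accumulated count]
    for h, part in enumerate(partitions, start=1):
        # Count every 2-itemset of the current partition; unseen pairs
        # (including ones pruned earlier) restart with start = h.
        for tx in part:
            for pair in combinations(sorted(tx), 2):
                key = "".join(pair)
                if key not in survivors:
                    survivors[key] = [h, 0]
                survivors[key][1] += 1
        # Keep a pair only if it is frequent relative to the transactions
        # in partitions start..h.
        kept = {}
        for key, (start, cnt) in survivors.items():
            window = sum(len(p) for p in partitions[start - 1:h])
            if cnt / window >= min_sup:
                kept[key] = [start, cnt]
        survivors = kept
    return {key: tuple(value) for key, value in survivors.items()}


candidates = first_scan(PARTITIONS, MIN_SUP)
print(candidates)  # {'BC': (1, 5), 'CE': (2, 3), 'BF': (3, 3)}
```

BD illustrates the restart behaviour: it survives P1 with count 2, is pruned after P2 (2 of 8 transactions), and reappears in P3 as a new entry with start 3 and count 1.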

12 General Temporal Association Rules (3/6)
After the 1st scan of database D, we have candidate itemsets as follows:
no candidate k-itemset is generated (k >= 3)

13 General Temporal Association Rules (4/6)
Scan DB (2)

Candidate itemsets
    Itemset     count  S^R
C1  {B^{1,3}}   8      4
    {B^{3,3}}   3      2
    {C^{1,3}}   6      4
    {C^{2,3}}   4      3
    {E^{2,3}}   4      3
    {F^{3,3}}   3      2
C2  {BC^{1,3}}  5      4
    {BF^{3,3}}  3      2
    {CE^{2,3}}  3      3

After pruning: frequent itemsets
    Itemset     count
L1  {B^{1,3}}   8
    {B^{3,3}}   3
    {C^{1,3}}   6
    {C^{2,3}}   4
    {E^{2,3}}   4
    {F^{3,3}}   3
L2  {BC^{1,3}}  5
    {BF^{3,3}}  3
    {CE^{2,3}}  3
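
The count and threshold columns of the second scan can be recomputed as follows. The sketch assumes the second number in each row is the minimum support count for the itemset's own period, i.e. the ceiling of 30% of |db^{t,3}|; that interpretation, the candidate list layout and the name `second_scan` are assumptions made for illustration.

```python
PARTITIONS = [
    ["BD", "BCD", "BC", "AD"],      # P1
    ["BCE", "DE", "ABC", "CDE"],    # P2
    ["BCEF", "BF", "AD", "BDF"],    # P3
]
MIN_SUP_PERCENT = 30

# (itemset, start partition) pairs as listed under C1 and C2 on the slide.
CANDIDATES = [("B", 1), ("B", 3), ("C", 1), ("C", 2), ("E", 2), ("F", 3),
              ("BC", 1), ("BF", 3), ("CE", 2)]


def second_scan(candidates, partitions, min_sup_percent):
    """Return rows of (itemset, start, count, threshold, is_frequent)."""
    rows = []
    for items, start in candidates:
        # db^{start,n}: all transactions from the start partition onward.
        window = [tx for part in partitions[start - 1:] for tx in part]
        cnt = sum(1 for tx in window if set(items) <= set(tx))
        # Integer ceiling of len(window) * min_sup, avoiding float rounding.
        threshold = (len(window) * min_sup_percent + 99) // 100
        rows.append((items, start, cnt, threshold, cnt >= threshold))
    return rows


for row in second_scan(CANDIDATES, PARTITIONS, MIN_SUP_PERCENT):
    print(row)
```

Every candidate meets its threshold in this example, so the pruning step removes nothing and L1 and L2 contain the same itemsets as C1 and C2.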

14 General Temporal Association Rules (5/6)
After the 2nd scan of database D, the frequent itemsets yield the following rules:

Rules           Supp.    Conf.
(B=>C)^{1,3}    41.67%   62.50%
(C=>B)^{1,3}    41.67%   83.33%
(B=>F)^{3,3}    75.00%   100.00%
(F=>B)^{3,3}    75.00%   100.00%
(C=>E)^{2,3}    37.50%   75.00%
(E=>C)^{2,3}    37.50%   75.00%

After pruning:

Rules           Supp.    Conf.
(C=>B)^{1,3}    41.67%   83.33%
(B=>F)^{3,3}    75.00%   100.00%
(F=>B)^{3,3}    75.00%   100.00%
(C=>E)^{2,3}    37.50%   75.00%
(E=>C)^{2,3}    37.50%   75.00%
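
Rule generation from the frequent 2-itemsets needs only the counts gathered in the second scan. The sketch below is illustrative rather than the paper's procedure: `COUNTS` restates the L1 and L2 counts, `WINDOW_SIZE` gives |db^{t,3}| for each start partition, and `rules_from_pairs` is a hypothetical helper. For the itemset CE it produces both directions, (C => E) and (E => C).

```python
# Counts of frequent itemsets keyed by (itemset, start partition).
COUNTS = {
    ("B", 1): 8, ("B", 3): 3, ("C", 1): 6, ("C", 2): 4,
    ("E", 2): 4, ("F", 3): 3,
    ("BC", 1): 5, ("BF", 3): 3, ("CE", 2): 3,
}
# Number of transactions in db^{t,3} for t = 1, 2, 3.
WINDOW_SIZE = {1: 12, 2: 8, 3: 4}
MIN_CONF = 0.75


def rules_from_pairs(counts, window_size, min_conf):
    """Return (lhs, rhs, start, supp, conf) for rules meeting min_conf."""
    rules = []
    for (items, start), pair_count in counts.items():
        if len(items) != 2:
            continue
        supp = pair_count / window_size[start]
        # Try both directions of the 2-itemset, e.g. B=>C and C=>B.
        for lhs, rhs in (items, items[::-1]):
            conf = pair_count / counts[(lhs, start)]
            if conf >= min_conf:
                rules.append((lhs, rhs, start, round(supp, 4), round(conf, 4)))
    return rules


for rule in rules_from_pairs(COUNTS, WINDOW_SIZE, MIN_CONF):
    print(rule)
```

Only (B => C)^{1,3} is rejected, because its confidence of 5/8 = 62.5% is below min_conf.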

15 General Temporal Association Rules (6/6)
The flowchart of PPM
–Partition database based on exhibition periods
–Produce candidate 2-TIs
–Use candidate 2-TIs to produce candidate k-TIs and k-SIs
–Generate frequent k-TIs and k-SIs
–Rule generation
Scan labels in the flowchart: 1st scan database, 2nd scan database

16 Experimental Results (1/3)
Meaning of various parameters:

Parameter  Meaning
|D|        Number of transactions in the database
|T|        Average size of the transactions
|I|        Average size of the maximal frequent itemsets
|L|        Number of maximal potentially frequent itemsets (default 2000)
N          Number of items (default 10000)
|Pi|       Number of transactions in the partition database Pi

17 Experimental Results (2/3)
Relative performance

18 Experimental Results (3/3)
Scaleup performance

19 Conclusions
Algorithm PPM is particularly powerful for efficient mining of transaction databases, video rental store records, library rental records, book rental records, and transactions in electronic commerce.

