Rapid Association Rule Mining. Amitabha Das, Wee-Keong Ng, Yew-Kwong Woon. Proc. of the 10th ACM International Conference on Information and Knowledge Management (CIKM’01), 2001.

Adviser: Jia-Ling Koh
Speaker: Yu-ting Kung

Introduction
The paper proposes a new algorithm, Rapid Association Rule Mining (RARM), to push the speed-up barrier.
RARM uses a tree structure, the Support-Ordered Trie Itemset (SOTrieIT), which holds pre-processed transactional data.
With the SOTrieIT, RARM quickly discovers large 1-itemsets and 2-itemsets without scanning the database and without generating candidate 2-itemsets.

A Complete TrieIT: Definition
Let $I = \{a_1, a_2, \ldots, a_N\}$ be the set of items, lexicographically ordered.
A complete TrieIT is a set of tree nodes such that every tree node $w$ is a 2-tuple $(w_i, w_c)$, where $w_i \in I$ is the label of the node and $w_c$ is a support count.
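The node definition above can be sketched as a small Python class. This is only an illustration: the names `TrieNode` and `child` are hypothetical, and the `children` mapping is an assumption added so that nodes can be linked into a tree (the slide defines only the label and the support count).

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    label: str                 # w_i: an item drawn from I
    count: int = 0             # w_c: the support count
    # Assumption: child links keyed by label; not part of the 2-tuple itself.
    children: dict = field(default_factory=dict)

    def child(self, label):
        """Return the child with this label, creating it if it is absent."""
        if label not in self.children:
            self.children[label] = TrieNode(label)
        return self.children[label]
```

A node for item A with one child for item B would be built with `TrieNode("A").child("B")`.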

A Complete TrieIT (Cont.): Example
[Figure: complete TrieITs $W_1$ (item A), $W_2$ (item B), $W_3$ (item C), and $W_4$ (item D)]
Note: database D is stored as a set of complete TrieITs.

A Complete TrieIT (Cont.): Insertion
[Figure: a transaction database with N = 4; the tree after transactions 100 to 300 have been inserted; the tree after transaction 400 has been inserted]
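Since the slide's figures are not available here, the following is a minimal sketch of insertion under a stated assumption: each transaction contributes every non-empty subset of its items, and each subset is stored as a root-to-node path in lexicographic order, with the count kept on the last node of the path. The transactions in the usage note, the nested-dict node layout, and the function names are all hypothetical.

```python
from itertools import combinations


def new_node():
    # A node holds a support count and its children, keyed by item label.
    return {"count": 0, "children": {}}


def insert_transaction(forest, transaction):
    """Add one transaction to a forest that maps a root label to its tree."""
    items = sorted(set(transaction))
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            # Walk (and create) the path for this itemset, then count it once.
            node = forest.setdefault(itemset[0], new_node())
            for label in itemset[1:]:
                node = node["children"].setdefault(label, new_node())
            node["count"] += 1


def support(forest, itemset):
    """Look up the support count of a non-empty itemset; 0 if never seen."""
    path = sorted(itemset)
    node = forest.get(path[0])
    for label in path[1:]:
        if node is None:
            return 0
        node = node["children"].get(label)
    return node["count"] if node is not None else 0
```

For example, after inserting the hypothetical transactions `["A", "C"]`, `["B", "C"]`, and `["A", "B", "C"]`, `support(forest, ("A", "C"))` looks up the path A, C. Enumerating every subset grows exponentially with transaction length, which is one reason a depth-limited structure is attractive.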

Support-Ordered Trie Itemset: Definition
A SOTrieIT is a complete TrieIT with a depth of 1; i.e., it consists only of:
1. A root node $w_i$
2. Some child nodes
All nodes in the forest of SOTrieITs are sorted according to their support counts, in descending order from the left.

SOTrieIT (Cont.): Example
[Figure: the SOTrieITs for item A, item C, and item B]

SOTrieIT (Cont.): Insertion
[Figure: a transaction database with N = 4; the SOTrieIT after inserting TID 100, TID 200, TID 300, and TID 400; the resultant SOTrieIT]
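A depth-1 forest of this kind can be sketched as follows. The sketch assumes that each transaction contributes its 1-itemsets (counted on root nodes) and its 2-itemsets (counted on child nodes under the lexicographically smaller item). For simplicity it sorts on read rather than keeping the nodes ordered during insertion, and it breaks ties by label; both are choices made for this illustration. The class name, method names, and sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


class SOTrieIT:
    def __init__(self):
        self.roots = Counter()   # item -> support count of the 1-itemset
        self.children = {}       # item -> Counter of second item -> 2-itemset count

    def insert(self, transaction):
        items = sorted(set(transaction))
        for item in items:
            self.roots[item] += 1
        for first, second in combinations(items, 2):
            self.children.setdefault(first, Counter())[second] += 1

    @staticmethod
    def _by_support(counts):
        # Descending support count; ties broken by label (an assumption).
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    def ordered_roots(self):
        return self._by_support(self.roots)

    def ordered_children(self, item):
        return self._by_support(self.children.get(item, Counter()))
```

After inserting a few transactions, `ordered_roots()` lists the items from most to least frequent, and `ordered_children("A")` does the same for the 2-itemsets that start with A.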

Algorithm RARM
The algorithm has two parts: pre-processing, and mining of large itemsets.

Algorithm RARM (Cont.): Example (support threshold is 80%)
[Figure: the sequence in which the SOTrieIT is traversed]
The total number of traversals is 3 and $L_1 = \{\{C\}\}$.
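Because the nodes are sorted by descending support count, a traversal can stop at the first node that falls below the threshold. The sketch below illustrates that idea for large 1-itemsets and 2-itemsets only. It is not the paper's pseudocode: the function name, the input layout (pre-sorted lists of `(item, count)` pairs), and the data in the usage note are hypothetical, and it does not attempt to reproduce the traversal count shown on the slide.

```python
from itertools import takewhile


def large_1_and_2_itemsets(roots, children, num_transactions, min_support_pct):
    """roots: (item, count) pairs sorted by descending count.
    children: item -> (item, count) pairs sorted by descending count.
    min_support_pct: support threshold as an integer percentage."""

    def is_large(pair):
        # Integer arithmetic avoids floating-point rounding at the threshold.
        return pair[1] * 100 >= min_support_pct * num_transactions

    # Stop at the first root below the threshold; the rest cannot be large.
    l1 = [item for item, _ in takewhile(is_large, roots)]
    l2 = []
    for first in l1:
        # A large 2-itemset can only sit under a large root.
        for second, _ in takewhile(is_large, children.get(first, [])):
            l2.append((first, second))
    return l1, l2
```

With hypothetical roots `[("C", 4), ("A", 2), ("B", 2), ("D", 1)]` over 4 transactions and a threshold of 80, only C qualifies as a large 1-itemset and no 2-itemset is large.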

Performance Evaluation
[Table: definition of parameters]
The experiments use two databases:
D1: T25.I10.N1K.D10K
D2: T25.I10.N10K.D100K
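The database names can be unpacked mechanically. The sketch below assumes the convention commonly used for synthetic market-basket data, in which T is the average transaction size, I the average size of maximal potentially large itemsets, N the number of items, and D the number of transactions, with a K suffix meaning thousands. The slide's own parameter table is not available here, so treat these meanings and the field names as assumptions.

```python
import re

# Assumed meanings of the letters; the field names are hypothetical.
FIELDS = {
    "T": "avg_transaction_size",
    "I": "avg_max_large_itemset_size",
    "N": "num_items",
    "D": "num_transactions",
}


def parse_dataset_name(name):
    """Turn a name such as 'T25.I10.N1K.D10K' into a dict of integers."""
    params = {}
    for part in name.split("."):
        match = re.fullmatch(r"([TIND])(\d+)(K?)", part)
        if match is None:
            raise ValueError(f"unrecognised component: {part!r}")
        letter, digits, kilo = match.groups()
        params[FIELDS[letter]] = int(digits) * (1000 if kilo else 1)
    return params
```

Under these assumptions, D2 has ten times as many items and ten times as many transactions as D1.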

Performance Evaluation (Cont.)
Comparison of Apriori and RARM by execution time, for D1:
[Figure: execution time of Apriori and RARM on D1]

Performance Evaluation (Cont.)
For D2:
[Figure: execution time of Apriori and RARM on D2]

Performance Evaluation (Cont.)
Why does RARM achieve a much greater speed-up in D2 than in D1?

Conclusion
Experiments have shown that RARM outperforms Apriori at all support thresholds.
