Rapid Association Rule Mining. Amitabha Das, Wee-Keong Ng, Yew-Kwong Woon. Proc. of the 10th ACM International Conference on Information and Knowledge Management (CIKM’01), 2001.


Rapid Association Rule Mining. Adviser: Jia-Ling Koh. Speaker: Yu-ting Kung.

Introduction: The paper proposes an innovative algorithm, Rapid Association Rule Mining (RARM), to push the speed-up barrier. RARM uses a tree structure, the Support-Ordered Trie Itemset (SOTrieIT), which holds pre-processed transactional data. With it, RARM can quickly discover large 1-itemsets and 2-itemsets without scanning the database and without generating candidate 2-itemsets.

A Complete TrieIT: Definition. Let I = {a_1, a_2, ..., a_N} be the set of items, lexicographically ordered. A complete TrieIT is a set of tree nodes such that every tree node w is a 2-tuple (w_i, w_c), where w_i ∈ I is the label of the node and w_c is a support count.

A Complete TrieIT (Cont.): Example. The slide shows four complete TrieITs: W_1 (item A), W_2 (item B), W_3 (item C), and W_4 (item D). Note: database D is stored as a set of complete TrieITs.

A Complete TrieIT (Cont.): Insertion. For a transaction database with N = 4, the slide shows the tree after transactions 100 to 300 have been inserted, and again after transaction 400 has been inserted.

Support-Ordered Trie Itemset: Definition. A SOTrieIT is a complete TrieIT with a depth of 1; i.e., it consists only of (1) a root node w_i and (2) some child nodes. All nodes in the forest of SOTrieITs are sorted by support count, in descending order from the left.

SOTrieIT (Cont.): Example. The slide shows three SOTrieITs, in left-to-right order: item A, item C, and item B.

SOTrieIT (Cont.): Insertion. For a transaction database with N = 4, the slide shows the SOTrieIT after inserting TID 100, TID 200, TID 300, and TID 400 in turn, ending with the resultant SOTrieIT.
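The insertion step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `SOTrieIT`, its methods, and the four toy transactions are all hypothetical, since the slide's actual transaction table did not survive in the transcript. The sketch assumes that each transaction contributes its 1-itemsets to the root nodes and its 2-itemsets to child nodes under the lexicographically smaller item, and that ties in support are broken by label. For simplicity it sorts on read instead of keeping the nodes ordered during insertion.

```python
from collections import defaultdict
from itertools import combinations


class SOTrieIT:
    """Hypothetical depth-1 trie forest: one root per item, children for 2-itemsets."""

    def __init__(self):
        # item -> support count of the 1-itemset {item}
        self.root_count = defaultdict(int)
        # item -> {later item -> support count of the 2-itemset}
        self.child_count = defaultdict(lambda: defaultdict(int))

    def insert(self, transaction):
        """Update counts for every 1-itemset and 2-itemset in one transaction."""
        items = sorted(set(transaction))
        for item in items:
            self.root_count[item] += 1
        for first, second in combinations(items, 2):
            self.child_count[first][second] += 1

    def roots(self):
        """Root nodes as (label, count), by descending support, then label (assumed tie-break)."""
        return sorted(self.root_count.items(), key=lambda kv: (-kv[1], kv[0]))

    def children(self, item):
        """Child nodes of one root as (label, count), in the same order as roots()."""
        counts = self.child_count.get(item, {})
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy database with N = 4 items (illustrative data, not taken from the slide).
toy_db = {100: "AC", 200: "BC", 300: "ABC", 400: "ABCD"}

trie = SOTrieIT()
for tid in sorted(toy_db):
    trie.insert(toy_db[tid])

print(trie.roots())
print(trie.children("A"))
```

Because only 1-itemsets and 2-itemsets are stored, the structure can be updated one transaction at a time without knowing the support threshold in advance.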

Algorithm RARM: The algorithm has two phases: (1) pre-processing and (2) mining of large itemsets.

Algorithm RARM (Cont.): Example with a support threshold of 80%. The slide shows the sequence in which the SOTrieIT is traversed; the total number of traversals is 3 and L1 = {{C}}.
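The traversal idea can be sketched as below, under stated assumptions. Nodes are visited in descending order of support, so the walk can stop at the first node whose count falls below the threshold. The function names, the tie-break by label, the convention of counting every visited node (including the one that triggers the stop) as one traversal, and the toy transactions are all hypothetical choices made for illustration; the slide's own database is not reproduced in the transcript.

```python
from collections import Counter
from itertools import combinations


def build_counts(transactions):
    """Count supports of all 1-itemsets and 2-itemsets (the pre-processing phase)."""
    singles, pairs = Counter(), Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        singles.update(items)
        pairs.update(combinations(items, 2))
    return singles, pairs


def mine_l1_l2(transactions, min_support):
    """Return (L1, L2, traversals) by walking support-ordered nodes."""
    singles, pairs = build_counts(transactions)
    min_count = min_support * len(transactions)
    by_support = lambda kv: (-kv[1], kv[0])  # descending count, then label

    l1, l2, traversals = [], [], 0
    for item, count in sorted(singles.items(), key=by_support):
        traversals += 1
        if count < min_count:
            break  # every remaining root has support at most this one
        l1.append(item)
        # Children of this root: 2-itemsets whose smaller item is `item`.
        children = [(p, c) for p, c in pairs.items() if p[0] == item]
        for pair, pair_count in sorted(children, key=by_support):
            traversals += 1
            if pair_count < min_count:
                break  # remaining children are no more frequent
            l2.append(pair)
    return l1, l2, traversals


# Toy database (illustrative data, not taken from the slide).
toy_transactions = ["AC", "BC", "ABC", "ABCD"]

print(mine_l1_l2(toy_transactions, 0.8))
print(mine_l1_l2(toy_transactions, 0.5))
```

On this toy data with an 80% threshold, only item C (support 4 of 4) qualifies, and the walk visits three nodes before stopping. No candidate 2-itemsets are generated at any point: the pair counts are read directly from the pre-processed structure.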

Performance Evaluation: Definition of parameters. The experiments use two databases, D1: T25.I10.N1K.D10K and D2: T25.I10.N10K.D100K.
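The database names can be decoded with a short helper. This is a sketch under an assumption: the slide's parameter table did not survive in the transcript, so the field meanings below follow the naming convention commonly used for synthetic transaction data in the association rule literature (T: average transaction size, I: average size of maximal potentially large itemsets, N: number of items, D: number of transactions). The function name `parse_dataset_name` is hypothetical.

```python
import re

# Assumed meanings, following the common synthetic-data naming convention.
FIELD_MEANINGS = {
    "T": "average transaction size",
    "I": "average size of maximal potentially large itemsets",
    "N": "number of items",
    "D": "number of transactions",
}


def parse_dataset_name(name):
    """Split a name such as 'T25.I10.N1K.D10K' into integer parameters."""
    params = {}
    for token in name.split("."):
        match = re.fullmatch(r"([A-Z])(\d+)(K?)", token)
        if match is None:
            raise ValueError(f"unrecognised field: {token!r}")
        letter, digits, kilo = match.groups()
        # A trailing K multiplies the value by one thousand.
        params[letter] = int(digits) * (1000 if kilo else 1)
    return params


for label, name in [("D1", "T25.I10.N1K.D10K"), ("D2", "T25.I10.N10K.D100K")]:
    for letter, value in parse_dataset_name(name).items():
        print(f"{label} {letter} = {value} ({FIELD_MEANINGS[letter]})")
```

Read this way, D2 has ten times as many items and ten times as many transactions as D1, while the T and I settings are the same.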

Performance Evaluation (Cont.): Comparison of Apriori and RARM by execution time, for D1 (chart not reproduced in the transcript).

Performance Evaluation (Cont.): Comparison of Apriori and RARM by execution time, for D2 (chart not reproduced in the transcript).

Performance Evaluation (Cont.): Why does RARM achieve a much greater speed-up in D2 than in D1?

Conclusion: Experiments have shown that RARM outperforms Apriori at all support thresholds.