Chang-Hung Lee, Jian Chih Ou, and Ming Syan Chen, Proc


Progressive Weighted Miner: An Efficient Method for Time-Constraint Mining
Chang-Hung Lee, Jian Chih Ou, and Ming Syan Chen, Proc. of the 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’03)
Adviser: Jia-Ling Koh
Speaker: Yu-ting Kung

Introduction
The paper introduces a model of transaction-weighted association rules in a time-variant database.
It proposes an efficient algorithm, Progressive Weighted Miner (PWM), to produce weighted association rules.

Introduction (Cont.)
Progressive Weighted Miner:
The importance of each transaction period is first reflected by a proper weight assigned by the user.
PWM then partitions the time-variant database according to the weighted periods of transactions and performs weighted mining.

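Problem Description
Weighted minimum support: $Min\_S^W = min\_sup \times \sum_i |P_i| \times W(P_i)$, where $|P_i|$ is the number of transactions in partition $P_i$ and $W(P_i)$ is the weight that the weighting function assigns to $P_i$.
Weighted support of X: $N^W(X) = \sum_i N_{P_i}(X) \times W(P_i)$, where $N_{P_i}(X)$ is the number of transactions in partition $P_i$ that contain itemset X.
Weighted support ratio of an itemset X: $supp^W(X) = N^W(X) / \sum_i |P_i| \times W(P_i)$.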
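The three quantities on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the numbers (three partitions of 4 transactions, weights 0.5, 1, 2, min_sup = 30%, and per-partition counts 2, 2, 1 for {BC}) are taken from the worked example later in this deck.

```python
def weighted_total(partition_sizes, weights):
    """Sum of |P_i| * W(P_i) over all partitions."""
    return sum(n * w for n, w in zip(partition_sizes, weights))


def weighted_min_support(partition_sizes, weights, min_sup):
    """Weighted minimum support threshold."""
    return min_sup * weighted_total(partition_sizes, weights)


def weighted_count(counts_per_partition, weights):
    """N^W(X): per-partition occurrence counts of X, scaled by the weights."""
    return sum(c * w for c, w in zip(counts_per_partition, weights))


def weighted_support_ratio(counts_per_partition, partition_sizes, weights):
    """N^W(X) divided by the weighted size of the database."""
    return (weighted_count(counts_per_partition, weights)
            / weighted_total(partition_sizes, weights))


# Values from the deck's example (hypothetical variable names).
sizes = [4, 4, 4]
weights = [0.5, 1, 2]
threshold = weighted_min_support(sizes, weights, 0.3)   # about 4.2
nw_bc = weighted_count([2, 2, 1], weights)              # 2*0.5 + 2*1 + 1*2
ratio_bc = weighted_support_ratio([2, 2, 1], sizes, weights)
```

Keeping the weighted database size in its own helper makes it clear that the threshold and the support ratio share the same denominator.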

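Problem Description (Cont.)
Weighted confidence: $conf^W(X \Rightarrow Y) = supp^W(X \cup Y) / supp^W(X)$.
Frequent weighted association rule: $(X \Rightarrow Y)^W$ is frequent if $supp^W(X \cup Y) \ge min\_sup$ and $conf^W(X \Rightarrow Y) \ge min\_conf$.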

Problem Description (Cont.)
Example: min_sup = 30%, min_conf = 75%, W(P1) = 0.5, W(P2) = 1, W(P3) = 2.
Transaction database (columns: Date, TID, Itemset): P1 = Jan-02, t1 to t4; P2 = Feb-02, t5 to t8; P3 = Mar-02, t9 to t12.
Min_SW = {4×0.5 + 4×1 + 4×2} × 0.3 = 4.2
Is (C=>B)W frequent? Yes: its weighted support exceeds min_sup and its weighted confidence exceeds min_conf.
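The check for a single rule can be written out as follows. This is a sketch under stated assumptions: the function name is hypothetical, and the weighted counts used (5 for {BC}, 6 for {C}, 9.5 for {B}, and a weighted database size of 14) are the ones that appear on the Procedure IV and WAR slides of this deck.

```python
def is_frequent_rule(nw_xy, nw_x, weighted_db_size, min_sup, min_conf):
    """Test a weighted rule X => Y.

    nw_xy: weighted count of X union Y
    nw_x: weighted count of the antecedent X
    weighted_db_size: sum of |P_i| * W(P_i)
    Returns (is_frequent, support, confidence).
    """
    support = nw_xy / weighted_db_size
    confidence = nw_xy / nw_x
    return support >= min_sup and confidence >= min_conf, support, confidence


# C => B: support 5/14, confidence 5/6, so both thresholds are met.
c_to_b = is_frequent_rule(5, 6, 14, min_sup=0.30, min_conf=0.75)

# B => C: same support, but confidence 5/9.5 falls below min_conf.
b_to_c = is_frequent_rule(5, 9.5, 14, min_sup=0.30, min_conf=0.75)
```

The two directions of a rule share the same support and differ only in the antecedent count, which is why C => B passes while B => C does not.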

PWM Algorithm
Procedure I: Partition the database based on weighted periods.
Procedure II (1st scan of the database): Produce C2, using the weights W(Pi).
Procedure III: Use C2 to produce Ck.
Procedure IV (2nd scan of the database): Generate Lk, using the weights W(Pi).
Output: the weighted association rules (X=>Y)W.

Procedure I
Time granularity = month; W(P1) = 0.5, W(P2) = 1, W(P3) = 2.
The transaction database is split into P1 (Jan-02, t1 to t4), P2 (Feb-02, t5 to t8), and P3 (Mar-02, t9 to t12).

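Procedure II
Scan P1: Min_SW(P1) = 4×0.5×0.3 = 0.6
C2 after P1 (itemset, start partition, NW(X) count):
  BD   1   2*0.5=1
  AD       1*0.5=0.5
  BC
  CD
Scan P2: Min_SW(P1+P2) = 1.8, Min_SW(P2) = 1.2
C2 after P2 (itemset, start partition, NW(X) count):
  AB   2   1*1=1
  AC
  BC   1   1+2*1=3
  BD       1+0*1=1
  BE
  CD
  CE       2*1=2
  DE
An itemset survives a scan only if its count reaches the threshold for the partitions since its start; for example, AD (0.5 < 0.6) is dropped after P1.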

Procedure II (Cont.)
Scan P3: Min_SW(P1+P2+P3) = 4.2, Min_SW(P2+P3) = 3.6, Min_SW(P3) = 2.4
C2 after P3 (itemset, start partition, NW(X) count):
  AD   3   1*2=2
  BC   1   3+1*2=5
  BD
  BE
  BF       3*2=6
  CE   2   2+1*2=4
  CF
  DE       2+0*2=2
  DF
  EF
The survivors are BC (5 ≥ 4.2), CE (4 ≥ 3.6), and BF (6 ≥ 2.4).
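The progressive filtering in Procedure II can be sketched as follows. This is an illustrative reading of the slides, not the authors' code: the function name and the toy transactions are hypothetical (they are not the deck's database), and partitions are indexed from 0 rather than from 1. Each candidate 2-itemset carries the partition where it started and its accumulated weighted count, and it is dropped whenever the count falls below min_sup times the weighted size of the partitions from its start up to the current one.

```python
from itertools import combinations


def progressive_c2(partitions, weights, min_sup):
    """Return {pair: [start_partition, weighted_count]} after one pass.

    partitions: list of partitions, each a list of transactions (sets of items)
    weights: one weight per partition
    """
    candidates = {}
    for i, (transactions, w) in enumerate(zip(partitions, weights)):
        # Weighted occurrences of every 2-itemset inside partition i.
        local = {}
        for t in transactions:
            for pair in combinations(sorted(t), 2):
                local[pair] = local.get(pair, 0.0) + w
        # Accumulate into existing candidates; new pairs start at partition i.
        for pair, count in local.items():
            if pair in candidates:
                candidates[pair][1] += count
            else:
                candidates[pair] = [i, count]
        # Filter with a threshold that covers partitions start..i only.
        survivors = {}
        for pair, (start, count) in candidates.items():
            span = sum(len(partitions[m]) * weights[m] for m in range(start, i + 1))
            if count >= min_sup * span:
                survivors[pair] = [start, count]
        candidates = survivors
    return candidates


# Toy data (hypothetical): three partitions of two transactions each.
toy_partitions = [
    [{"A", "B"}, {"A", "B"}],
    [{"A", "B"}, {"B", "C"}],
    [{"B", "C"}, {"B", "C"}],
]
toy_c2 = progressive_c2(toy_partitions, [0.5, 1, 2], min_sup=0.4)
```

In the toy run, AB accumulates a count of 2.0 over the first two partitions but is dropped after the third, where its threshold rises to 0.4 × 7 = 2.8; BC starts in the second partition and survives with a count of 5.0.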

Procedure III
After the 1st scan of D, the candidate itemsets are {B}, {C}, {E}, {F}, {BC}, {BF}, {CE}, because C2 = {BC, CE, BF} from Procedure II.

Procedure IV
After the 2nd scan of D, with Min_SW(D) = 4.2.
Candidates (itemset, NW(X) count):
  C1   {B}    3*0.5+2*1+3*2=9.5
       {C}    2*0.5+3*1+1*2=6
       {E}    3*1+1*2=5
       {F}    3*2=6
  C2   {BC}   2*0.5+2*1+1*2=5
       {BF}   3*2=6
       {CE}   2*1+1*2=4
Frequent itemsets after pruning ({CE} is dropped because 4 < 4.2):
  L1   {B} 9.5, {C} 6, {E} 5, {F} 6
  L2   {BC} 5, {BF} 6
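The pruning on this slide amounts to recomputing each candidate's weighted count over the whole database and comparing it with Min_SW(D). The sketch below is illustrative only: the function name is hypothetical, and the per-partition counts are read off the slide's arithmetic, with a count of 0 assumed wherever the slide omits a partition's term (for example {E} in P1).

```python
def prune_candidates(counts, partition_sizes, weights, min_sup):
    """Keep candidates whose weighted count reaches the weighted minimum support.

    counts: {itemset: [occurrences in P1, P2, ...]}
    Returns {itemset: weighted_count} for the frequent itemsets.
    """
    threshold = min_sup * sum(n * w for n, w in zip(partition_sizes, weights))
    frequent = {}
    for itemset, per_partition in counts.items():
        nw = sum(c * w for c, w in zip(per_partition, weights))
        if nw >= threshold:
            frequent[itemset] = nw
    return frequent


# Per-partition counts from the slide (zeros are an assumption, see above).
candidate_counts = {
    "B": [3, 2, 3],
    "C": [2, 3, 1],
    "E": [0, 3, 1],
    "F": [0, 0, 3],
    "BC": [2, 2, 1],
    "BF": [0, 0, 3],
    "CE": [0, 2, 1],
}
frequent_itemsets = prune_candidates(candidate_counts, [4, 4, 4], [0.5, 1, 2], 0.3)
```

Only {CE} falls below the threshold, with a weighted count of 4 against roughly 4.2.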

WAR (weighted association rules)
Rules before pruning (rule, support, confidence):
  B => C   5/(4*0.5+4*1+4*2) = 35.7%   5/9.5 = 52.6%
  B => F   6/(4*0.5+4*1+4*2) = 42.9%   6/9.5 = 63.2%
  C => B   35.7%                       5/6 = 83.3%
  F => B   42.9%                       6/6 = 100%
Rules after pruning with min_conf = 75% (rule, support, confidence):
  C => B   35.7%   83.3%
  F => B   42.9%   100%
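Rule generation from the frequent itemsets can be sketched as below. This is a minimal illustration with hypothetical names, not the authors' implementation. It assumes every non-empty proper subset of a frequent itemset is itself in the input (which holds for the example), and it uses the weighted counts and the weighted database size of 14 from the slides.

```python
from itertools import combinations


def weighted_rules(frequent, weighted_db_size, min_conf):
    """Enumerate rules X => Y from frequent itemsets and prune by confidence.

    frequent: {frozenset(itemset): weighted_count}
    Returns a list of (antecedent, consequent, support, confidence).
    """
    rules = []
    for itemset, nw_xy in frequent.items():
        # Itemsets of size 1 yield no rules: the range below is empty.
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                x = frozenset(antecedent)
                confidence = nw_xy / frequent[x]
                if confidence >= min_conf:
                    rules.append((x, itemset - x, nw_xy / weighted_db_size, confidence))
    return rules


# Frequent itemsets and weighted counts from the Procedure IV slide.
frequent = {
    frozenset("B"): 9.5,
    frozenset("C"): 6.0,
    frozenset("E"): 5.0,
    frozenset("F"): 6.0,
    frozenset("BC"): 5.0,
    frozenset("BF"): 6.0,
}
kept_rules = weighted_rules(frequent, 14, min_conf=0.75)
```

With min_conf = 75%, only C => B and F => B remain, matching the pruned table on the slide.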

Conclusion
PWM is developed to generate weighted association rules (WAR) in a time-variant database.
PWM employs a filtering threshold in each partition.
By weighting transaction periods, PWM can lead to more interesting results.