
Progressive Partition Miner: An Efficient Algorithm for Mining General Temporal Association Rules
Amer Zaheer, PC101005
Mohammad Ali Jinnah University, Islamabad

Agenda
- References
- Basic Definitions
- Association Rule Generation
- Traditional Association Rule Mining Algorithms
- General Temporal Association Rules
- Progressive Partition Miner
- Limitations of PPM

References
- C. H. Lee and M. S. Chen, "Progressive Partition Miner: An Efficient Algorithm for Mining General Temporal Association Rules"
- C. H. Lee, C. Lin and M. Chen, "On Mining General Temporal Association Rules in a Publication Database"
- C. Chang, M. Chen and C. Lee, "Mining General Temporal Association Rules for Items with Different Exhibition Periods"

Basic Definitions
- Exhibition Period
- Maximal Common Exhibition Period
- Temporal Association Rules
- Publication Database
- Frequent Itemset

Basic Definitions: Publication Database
A publication database is a set of transactions, where each transaction T is a set of items and each item has an individual exhibition period.
Publication database D (timeline figure): items A and B start in 1990, item C starts in 1992, and item D starts in 1994; all items run to 2001.

Basic Definitions: Exhibition Period
The exhibition period of an item or itemset runs from its starting time to the end of the transactions in the database.
Example, in publication database D:
- Items A and B are exhibited from 1990 to 2001
- Item C is exhibited from 1992 to 2001
- Item D is exhibited from 1994 to 2001
So each item has its own exhibition period.

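Basic Definitions: Maximal Common Exhibition Period
The maximal common exhibition period (MCP) of itemsets X and Y runs from the latest exhibition start time among the items of X and Y to the common end time.

Basic Definitions: Temporal Association Rule
An association rule is considered a temporal association rule if and only if, within its maximal common exhibition period, its support (probability) is at least the minimum support required and its confidence (conditional probability) is at least the minimum confidence needed.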
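The MCP definition can be illustrated with a minimal Python sketch. It assumes the start years of items A to D from the publication database example and a shared end year of 2001; the names START_YEAR, DB_END and mcp are hypothetical and chosen only for illustration.

```python
# Exhibition start years taken from the publication database example.
START_YEAR = {"A": 1990, "B": 1990, "C": 1992, "D": 1994}
# Assumption: every item is exhibited until the end of the database.
DB_END = 2001


def mcp(itemset, start_year=START_YEAR, db_end=DB_END):
    """Return (start, end) of the maximal common exhibition period.

    The start is the latest exhibition start among the items;
    the end is the common end time of the database.
    """
    return max(start_year[item] for item in itemset), db_end


print(mcp({"A", "C"}))  # (1992, 2001)
print(mcp({"B", "D"}))  # (1994, 2001)
```

For example, a rule relating A and C can only be evaluated fairly on transactions from 1992 onward, because C was not on exhibition before then.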

Meaning of Symbols Used
db^{t,n}             The partial database of D formed by the continuous region from P_t to P_n
|db^{t,n}|           Number of transactions in db^{t,n}
X^{t,n}              A temporal itemset in partial database db^{t,n}
MCP(X)               The maximal common exhibition period of an itemset X
(X⇒Y)^{MCP(X∪Y)}     A general temporal association rule
supp((X⇒Y)^{t,n})    The support of X⇒Y in partial database db^{t,n}
conf((X⇒Y)^{t,n})    The confidence of X⇒Y in partial database db^{t,n}
min_supp             Minimum support threshold required
min_conf             Minimum confidence threshold required
min_leng             Minimum length of exhibition period required
TI                   A maximal temporal itemset
SI                   A corresponding temporal sub-itemset of TI

Association Rule Generation
Let L = {x1, x2, x3, ..., xn} be a set of items and D be a set of transactions, where each transaction T is a set of items such that T ⊆ L.
A transaction T is said to support an itemset X if and only if X ⊆ T.
Conventionally, an association rule is an implication of the form X ⇒ Y, meaning that the presence of the set X implies the presence of another set Y, where X ⊂ L, Y ⊂ L and X ∩ Y = ∅.

Association Rule Generation
The rule X ⇒ Y holds in the transaction set D with confidence c if c% of the transactions in D that contain X also contain Y.
The rule X ⇒ Y has support s in the transaction set D if s% of the transactions in D contain X ∪ Y.
The problem of mining association rules is to find all rules whose support and confidence are at least the corresponding minimum support threshold and minimum confidence threshold.

Traditional Association Rule Mining Algorithms
Conventional association rule mining algorithms work in two steps:
- Generate all frequent itemsets that satisfy min_supp
- Generate all association rules that satisfy min_conf using the frequent itemsets
But they have two shortcomings:
- Lack of consideration of the exhibition period of each individual item
- Lack of a fair support counting basis for each item

Traditional Association Rule Mining Algorithms: Example
Transaction Database
TID   Itemset
T1    B D
T2    B C D
T3    B C
T4    A D
T5    B C E
T6    D E
T7    A B C
T8    C D E
T9    B C E F
T10   B F
T11   A D
T12   B D F

Traditional Association Rule Mining Algorithms: Example
Assumptions: min_supp = 30%, min_conf = 75%
Traditional mining technique: absolute support threshold S_A = ⌈12 × 0.3⌉ = 4
Thus B, C, D, E and BC are the frequent itemsets, and C⇒B is a frequent association rule with support 41.67% and confidence 83.33%.
Problems:
- An early publication intrinsically has a higher likelihood of being determined to be a frequent itemset
- Some discovered rules may have expired from the user's interest
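The numbers in this example can be checked with a brute-force sketch. It is not Apriori or any particular published algorithm; it simply enumerates every itemset over the twelve transactions (with T11 = A D, as in the dated version of the table). The helper names min_count, support_count and frequent_itemsets are hypothetical.

```python
from itertools import combinations

# The twelve transactions T1..T12 of the example database.
TRANSACTIONS = [
    {"B", "D"}, {"B", "C", "D"}, {"B", "C"}, {"A", "D"},
    {"B", "C", "E"}, {"D", "E"}, {"A", "B", "C"}, {"C", "D", "E"},
    {"B", "C", "E", "F"}, {"B", "F"}, {"A", "D"}, {"B", "D", "F"},
]


def min_count(n_transactions, min_supp_pct):
    """Ceiling of n * pct / 100, using integers to avoid float rounding."""
    return (n_transactions * min_supp_pct + 99) // 100


def support_count(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, threshold):
    """Brute force: test every non-empty combination of items."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            count = support_count(set(combo), transactions)
            if count >= threshold:
                result["".join(combo)] = count
    return result


THRESHOLD = min_count(len(TRANSACTIONS), 30)        # 4
FREQUENT = frequent_itemsets(TRANSACTIONS, THRESHOLD)
SUPPORT_CB = FREQUENT["BC"] / len(TRANSACTIONS)     # 5 / 12
CONFIDENCE_CB = FREQUENT["BC"] / FREQUENT["C"]      # 5 / 6

print(FREQUENT)
print(f"C=>B support {SUPPORT_CB:.2%}, confidence {CONFIDENCE_CB:.2%}")
```

Exhaustive enumeration is only practical here because the example has six distinct items; it is meant to confirm the slide's figures, not to scale.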

General Temporal Association Rules
A general temporal association rule is written (X⇒Y)^{t,n}, where t is the latest exhibition start time of the itemsets X and Y, and n denotes the end time of the publication database.
An association rule X⇒Y is termed frequent if, within db^{t,n}, its probability is at least the minimum support required and its conditional probability is at least the minimum confidence needed.
Instead of an absolute support threshold for every itemset, a relative minimum support is used: S_R = ⌈|D_X| × min_supp⌉, where |D_X| is the number of partial transactions in the exhibition period of itemset X.

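General Temporal Association Rules: Example
Transaction Database
Date     TID   Itemset
Jan-01   T1    B D
Jan-01   T2    B C D
Jan-01   T3    B C
Jan-01   T4    A D
Feb-01   T5    B C E
Feb-01   T6    D E
Feb-01   T7    A B C
Feb-01   T8    C D E
Mar-01   T9    B C E F
Mar-01   T10   B F
Mar-01   T11   A D
Mar-01   T12   B D F
Partial databases: db^{1,3} covers T1 to T12, db^{2,3} covers T5 to T12, and db^{3,3} covers T9 to T12.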

General Temporal Association Rules: Example
Assumptions: min_supp = 30%, min_conf = 75%
General temporal association rules:
- (C⇒E)^{2,3} with relative support 37.5% and confidence 75%
- (E⇒C)^{2,3} with relative support 37.5% and confidence 75%
- (B⇒F)^{3,3} with relative support 75% and confidence 100%
- (F⇒B)^{3,3} with relative support 75% and confidence 100%
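The relative support and confidence figures above can be reproduced with a short sketch. It assumes that an item's exhibition period starts in the first monthly partition in which the item appears (E in February, F in March), which is an inference from the example table rather than something the slides state. The function names start_partition and temporal_rule are hypothetical.

```python
# Monthly partitions P1 (Jan-01), P2 (Feb-01), P3 (Mar-01) of the example.
PARTITIONS = [
    [{"B", "D"}, {"B", "C", "D"}, {"B", "C"}, {"A", "D"}],
    [{"B", "C", "E"}, {"D", "E"}, {"A", "B", "C"}, {"C", "D", "E"}],
    [{"B", "C", "E", "F"}, {"B", "F"}, {"A", "D"}, {"B", "D", "F"}],
]


def start_partition(item, partitions):
    """Assumption: an item's exhibition starts where it first appears."""
    for index, partition in enumerate(partitions, start=1):
        if any(item in transaction for transaction in partition):
            return index
    raise KeyError(item)


def temporal_rule(x, y, partitions):
    """Return (t, relative support, confidence) of (x => y)^{t,n}."""
    # t is the latest exhibition start among all items of the rule.
    t = max(start_partition(item, partitions) for item in x | y)
    # db^{t,n}: every transaction from partition t to the last partition.
    partial_db = [tr for partition in partitions[t - 1:] for tr in partition]
    n_xy = sum(1 for tr in partial_db if (x | y) <= tr)
    n_x = sum(1 for tr in partial_db if x <= tr)
    confidence = n_xy / n_x if n_x else 0.0
    return t, n_xy / len(partial_db), confidence


print(temporal_rule({"C"}, {"E"}, PARTITIONS))  # (2, 0.375, 0.75)
print(temporal_rule({"B"}, {"F"}, PARTITIONS))  # (3, 0.75, 1.0)
```

Measured over all twelve transactions, CE would have support 3/12 = 25% and fall below min_supp; counting only over db^{2,3} is what makes the comparison fair for the later-starting item E.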

Progressive Partition Miner
Progressive Partition Miner (PPM) is devised to mine general temporal association rules (X⇒Y)^{t,n}.
The basic idea of PPM is to first partition the publication database according to the exhibition periods of the items, and then progressively accumulate the occurrence count of each candidate 2-itemset based on the intrinsic partitioning characteristics.

Progressive Partition Miner: Flow Chart
1. Partition the database based on exhibition periods
2. Produce candidate 2-TIs (1st database scan)
3. Use the candidate 2-TIs to produce candidate k-TIs and k-SIs
4. Generate frequent itemsets (2nd database scan)
5. Rule generation

Progressive Partition Miner: Example
Transaction database, with min_supp = 30% and min_conf = 75%.
Partition   Date     TID   Itemset
P1          Jan-01   T1    B D
P1          Jan-01   T2    B C D
P1          Jan-01   T3    B C
P1          Jan-01   T4    A D
P2          Feb-01   T5    B C E
P2          Feb-01   T6    D E
P2          Feb-01   T7    A B C
P2          Feb-01   T8    C D E
P3          Mar-01   T9    B C E F
P3          Mar-01   T10   B F
P3          Mar-01   T11   A D
P3          Mar-01   T12   B D F

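Progressive Partition Miner: Example, First Scan
min_supp = 30% and min_conf = 75%.

P1
C2   Start   Count
BD   1       2
BC   1       2
CD   1       1
AD   1       1
With four transactions in P1, the partial minimum support count is ⌈4 × 0.3⌉ = 2, so BD and BC are carried forward.

P1 + P2
C2   Start   Count
BD   1       2
BC   1       4
CE   2       2
DE   2       2
AB   2       1
AC   2       1
CD   2       1
BE   2       1
Thresholds: α = ⌈(4 + 4) × 0.3⌉ = 3 for itemsets that start in P1, and β = ⌈4 × 0.3⌉ = 2 for itemsets that start in P2.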

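Progressive Partition Miner: Example, First Scan (continued)

P1 + P2 + P3
C2   Start   Count
BC   1       5
CE   2       3
DE   2       2
DF   3       1
BE   3       1
BF   3       3
CF   3       1
EF   3       1
AD   3       1
BD   3       1

Candidate 2-itemsets after the first scan
Candidate itemset   Count
BC                  5
BF                  3
CE                  3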
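The first scan traced in these tables can be sketched as follows. This is a simplified reading of the progressive filtering idea, not the authors' implementation: each candidate 2-itemset carries the partition where it started and its accumulated count, and after each partition it is kept only if the count reaches the relative threshold for the transactions seen since its start. The names first_scan and min_count are hypothetical.

```python
from itertools import combinations

# Partitions P1, P2, P3 of the example transaction database.
PARTITIONS = [
    [{"B", "D"}, {"B", "C", "D"}, {"B", "C"}, {"A", "D"}],
    [{"B", "C", "E"}, {"D", "E"}, {"A", "B", "C"}, {"C", "D", "E"}],
    [{"B", "C", "E", "F"}, {"B", "F"}, {"A", "D"}, {"B", "D", "F"}],
]


def min_count(n_transactions, min_supp_pct):
    """Ceiling of n * pct / 100, using integers to avoid float rounding."""
    return (n_transactions * min_supp_pct + 99) // 100


def first_scan(partitions, min_supp_pct):
    """Return {pair: (start partition, count)} after progressive filtering."""
    sizes = [len(partition) for partition in partitions]
    candidates = {}
    for k, partition in enumerate(partitions, start=1):
        # Accumulate counts; a pair not currently tracked starts at k.
        for transaction in partition:
            for pair in combinations(sorted(transaction), 2):
                start, count = candidates.get(pair, (k, 0))
                candidates[pair] = (start, count + 1)
        # Keep a pair only if it meets the threshold computed over the
        # transactions from its start partition up to partition k.
        candidates = {
            pair: (start, count)
            for pair, (start, count) in candidates.items()
            if count >= min_count(sum(sizes[start - 1:k]), min_supp_pct)
        }
    return candidates


C2 = first_scan(PARTITIONS, 30)
print(C2)
```

Note that BD is dropped after P2 (count 2 against a threshold of 3) and then re-enters in P3 with a fresh start and count, which matches the BD row with start 3 in the last table.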

Progressive Partition Miner: Example
Candidate 2-itemsets: BC^{1,3}, CE^{2,3}, BF^{3,3}
Candidate 1-itemsets: B^{1,3}, C^{1,3}, C^{2,3}, E^{2,3}, B^{3,3}, F^{3,3}

      Candidate itemset   Count
C1    B^{1,3}             8
      B^{3,3}             3
      C^{1,3}             6
      C^{2,3}             4
      E^{2,3}             4
      F^{3,3}             3
C2    BC^{1,3}            5
      BF^{3,3}            3
      CE^{2,3}            3

Progressive Partition Miner: Example, Second Scan
Now count the support of each itemset within its given range.

      Frequent itemset    Count
L1    B^{1,3}             8
      B^{3,3}             3
      C^{1,3}             6
      C^{2,3}             4
      E^{2,3}             4
      F^{3,3}             3
L2    BC^{1,3}            5
      BF^{3,3}            3
      CE^{2,3}            3

After the second scan of database D, the frequent itemsets (relative support = 30%) are: B^{1,3}, C^{1,3}, C^{2,3}, E^{2,3}, B^{3,3}, F^{3,3}, BC^{1,3}, CE^{2,3}, BF^{3,3}.

General Temporal Association Rules: Rule Generation
For (B⇒C)^{1,3}:
Total transactions = 12
Support count = 5
Thus support = 5/12 = 41.67%
Confidence(X⇒Y) = Support(X∪Y) / Support(X)
Confidence(B⇒C) = 5/8 = 62.5%

General Temporal Association Rules: Rule Generation
min_supp = 30% and min_conf = 75%.
Rule            Support   Confidence
(B⇒C)^{1,3}     41.67%    62.50%
(C⇒B)^{1,3}     41.67%    83.33%
(B⇒F)^{3,3}     75.00%    100.00%
(F⇒B)^{3,3}     75.00%    100.00%
(C⇒E)^{2,3}     37.50%    75.00%
(E⇒C)^{2,3}     37.50%    75.00%

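General Temporal Association Rules: Rule Generation, Pruning
Rules remaining after pruning those with confidence below min_conf:
Rule            Support   Confidence
(C⇒B)^{1,3}     41.67%    83.33%
(B⇒F)^{3,3}     75.00%    100.00%
(F⇒B)^{3,3}     75.00%    100.00%
(C⇒E)^{2,3}     37.50%    75.00%
(E⇒C)^{2,3}     37.50%    75.00%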
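Rule generation and pruning can be sketched from the second-scan counts. The count for E^{2,3} (4) is derived from the transaction table, and the inclusive comparison against min_conf is an assumption that matches the rules with exactly 75% confidence being kept. The names COUNTS, PARTITION_SIZES and generate_rules are hypothetical.

```python
# Counts of frequent temporal itemsets, keyed by (itemset, start partition).
COUNTS = {
    (frozenset({"B"}), 1): 8, (frozenset({"B"}), 3): 3,
    (frozenset({"C"}), 1): 6, (frozenset({"C"}), 2): 4,
    (frozenset({"E"}), 2): 4, (frozenset({"F"}), 3): 3,
    (frozenset({"B", "C"}), 1): 5,
    (frozenset({"B", "F"}), 3): 3,
    (frozenset({"C", "E"}), 2): 3,
}
PARTITION_SIZES = [4, 4, 4]  # transactions in P1, P2, P3


def generate_rules(counts, partition_sizes, min_conf):
    """Return (x, y, t, support, confidence) for rules meeting min_conf."""
    rules = []
    for (itemset, t), n_xy in counts.items():
        if len(itemset) != 2:
            continue
        # |db^{t,n}|: transactions from partition t to the last partition.
        db_size = sum(partition_sizes[t - 1:])
        for x in sorted(itemset):
            (y,) = itemset - {x}
            # Confidence uses the count of x over the same range db^{t,n}.
            confidence = n_xy / counts[(frozenset({x}), t)]
            if confidence >= min_conf:
                rules.append((x, y, t, n_xy / db_size, confidence))
    return rules


RULES = generate_rules(COUNTS, PARTITION_SIZES, min_conf=0.75)
for x, y, t, support, confidence in RULES:
    print(f"({x}=>{y})^{{{t},3}} support {support:.2%} confidence {confidence:.2%}")
```

The antecedent count is looked up with the same start partition as the 2-itemset (for example C^{2,3} rather than C^{1,3} for the rule C⇒E), which is why the sub-itemsets with later starts had to be counted in the second scan.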

Limitations of PPM
- The database is partitioned by exhibition period rather than by database size, so the partitions are not uniform.
- A temporal database is updated continually, but PPM does not handle the incoming updates.

Thanks