Association Rule Mining
The University of North Carolina at Chapel Hill
COMP 790-90 Seminar, BCB 713 Module, Spring 2011



COMP Data Mining: Concepts, Algorithms, and Applications

Outline
- What is association rule mining?
- Methods for association rule mining
- Extensions of association rule

What Is Association Rule Mining?
- Frequent patterns: patterns (sets of items, sequences, etc.) that occur frequently in a database [AIS93]
- Frequent pattern mining: finding regularities in data
  - What products were often purchased together? Beer and diapers?!
  - What are the subsequent purchases after buying a car?
  - Can we automatically profile customers?

Why Essential?
- Foundation for many data mining tasks
  - Association rules, correlation, causality, sequential patterns, structural patterns, spatial and multimedia patterns, associative classification, cluster analysis, iceberg cube, …
- Broad applications
  - Basket data analysis, cross-marketing, catalog design, sales campaign analysis, web log (click stream) analysis, …

Basics
- Itemset: a set of items, e.g., acm = {a, c, m}
- Support of an itemset: the number of transactions that contain it, e.g., Sup(acm) = 3
- Given min_sup = 3, acm is a frequent pattern
- Frequent pattern mining: find all frequent patterns in a database

Transaction database TDB

TID   Items bought
100   f, a, c, d, g, i, m, p
200   a, b, c, f, l, m, o
300   b, f, h, j, o
400   b, c, k, s, p
500   a, f, c, e, l, p, m, n
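The support count on this slide can be checked with a few lines of Python. This is a minimal sketch, not part of the original deck: the names `TDB`, `support`, and `is_frequent` are chosen here for illustration, and itemsets are written as strings of single-letter items, as on the slide.

```python
# Transaction database TDB from the slide, keyed by TID.
TDB = {
    100: {"f", "a", "c", "d", "g", "i", "m", "p"},
    200: {"a", "b", "c", "f", "l", "m", "o"},
    300: {"b", "f", "h", "j", "o"},
    400: {"b", "c", "k", "s", "p"},
    500: {"a", "f", "c", "e", "l", "p", "m", "n"},
}


def support(itemset, db):
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in db.values() if items <= t)


def is_frequent(itemset, db, min_sup):
    """An itemset is frequent when its support reaches min_sup."""
    return support(itemset, db) >= min_sup


print(support("acm", TDB))                 # 3 (TIDs 100, 200, 500)
print(is_frequent("acm", TDB, min_sup=3))  # True
```

Support is computed here as an absolute count, matching Sup(acm) = 3 on the slide; dividing by the number of transactions would give the relative form.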

Frequent Pattern Mining: A Road Map
- Boolean vs. quantitative associations
  - age(x, “30..39”) ^ income(x, “42..48K”) → buys(x, “car”) [1%, 75%]
- Single-dimensional vs. multi-dimensional associations
- Single-level vs. multiple-level analysis
  - What brands of beers are associated with what brands of diapers?
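The bracketed pair after a rule, such as [1%, 75%], is conventionally read as [support, confidence]: the fraction of all transactions that satisfy both sides of the rule, and the fraction of transactions satisfying the left side that also satisfy the right side. The sketch below computes both for Boolean rules over the TDB from the Basics slide. It is an illustration only: the function name `rule_stats` is an assumption, and the rules shown are simple item rules rather than the quantitative age/income rule above.

```python
def rule_stats(antecedent, consequent, db):
    """Return (support, confidence) of the rule antecedent => consequent."""
    lhs = set(antecedent)
    both = lhs | set(consequent)
    n_lhs = sum(1 for t in db if lhs <= t)    # transactions matching the left side
    n_both = sum(1 for t in db if both <= t)  # transactions matching both sides
    support = n_both / len(db)
    confidence = n_both / n_lhs if n_lhs else 0.0
    return support, confidence


# The five transactions of TDB, each written as a string of single-letter items.
TDB = [set("facdgimp"), set("abcflmo"), set("bfhjo"), set("bcksp"), set("afcelpmn")]

print(rule_stats("f", "a", TDB))   # (0.6, 0.75)
print(rule_stats("ac", "m", TDB))  # (0.6, 1.0)
```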

Extensions & Applications
- Correlation, causality analysis & mining interesting rules
- Max-patterns and frequent closed itemsets
- Constraint-based mining
- Sequential patterns
- Periodic patterns
- Structural patterns
- Computing iceberg cubes

Frequent Pattern Mining Methods
- Apriori and its variations/improvements
- Mining frequent patterns without candidate generation
- Mining max-patterns and closed itemsets
- Mining multi-dimensional, multi-level frequent patterns with flexible support constraints
- Interestingness: correlation and causality

Apriori: Candidate Generation-and-Test
- Any subset of a frequent itemset must also be frequent (an anti-monotone property)
  - A transaction containing {beer, diaper, nuts} also contains {beer, diaper}
  - {beer, diaper, nuts} is frequent → {beer, diaper} must also be frequent
- No superset of any infrequent itemset should be generated or tested
  - Many item combinations can be pruned
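The anti-monotone property can be checked directly: every subset of an itemset appears in at least as many transactions as the itemset itself. The sketch below does this on a small basket list that is invented for illustration (it is not data from the slides); `support` and `proper_subsets` are hypothetical helper names.

```python
from itertools import combinations

# Hypothetical basket data, invented for this illustration only.
baskets = [
    {"beer", "diaper", "nuts"},
    {"beer", "diaper"},
    {"beer", "diaper", "nuts", "milk"},
    {"milk", "nuts"},
]


def support(itemset, db):
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in db if items <= t)


def proper_subsets(itemset):
    """Yield every non-empty proper subset of `itemset`."""
    items = sorted(itemset)
    for r in range(1, len(items)):
        for combo in combinations(items, r):
            yield frozenset(combo)


big = {"beer", "diaper", "nuts"}
for sub in proper_subsets(big):
    # A subset can never be rarer than its superset.
    assert support(sub, baskets) >= support(big, baskets)

print(support(big, baskets))                 # 2
print(support({"beer", "diaper"}, baskets))  # 3
```

Read in the other direction, this is what justifies pruning: once a subset falls below min_sup, no superset of it can reach min_sup, so none needs to be counted.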

Apriori-based Mining
- Generate length-(k+1) candidate itemsets from length-k frequent itemsets, and
- Test the candidates against the DB

Apriori Algorithm
A level-wise, candidate-generation-and-test approach (Agrawal & Srikant 1994)

Database D (min_sup = 2)

TID   Items
10    a, c, d
20    b, c, e
30    a, b, c, e
40    b, e

Scan D: 1-candidates

Itemset   Sup
a         2
b         3
c         3
d         1
e         3

Frequent 1-itemsets

Itemset   Sup
a         2
b         3
c         3
e         3

2-candidates: ab, ac, ae, bc, be, ce

Scan D: counting the 2-candidates

Itemset   Sup
ab        1
ac        2
ae        1
bc        2
be        3
ce        2

Frequent 2-itemsets

Itemset   Sup
ac        2
bc        2
be        3
ce        2

3-candidates: bce

Scan D: frequent 3-itemsets

Itemset   Sup
bce       2

The Apriori Algorithm
C_k: candidate itemsets of size k
L_k: frequent itemsets of size k

L_1 = {frequent items};
for (k = 1; L_k != ∅; k++) do
    C_{k+1} = candidates generated from L_k;
    for each transaction t in database do
        increment the count of all candidates in C_{k+1} that are contained in t
    L_{k+1} = candidates in C_{k+1} with min_support
return ∪_k L_k;
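The level-wise loop above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `apriori` and the dictionary return type are choices made here, and candidates are generated by a simple pairwise union followed by subset pruning rather than the prefix-based self-join described on the later slides (after pruning, both approaches yield the same candidate set).

```python
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {frozenset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    # L_1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_sup}

    frequent = {}
    k = 1
    while level:  # loop while L_k is non-empty
        frequent.update(level)

        # C_(k+1): union pairs of frequent k-itemsets that differ in one item,
        # then drop any candidate that has an infrequent k-subset.
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(s) in level for s in combinations(union, k)
            ):
                candidates.add(union)

        # One scan of the database counts every surviving candidate.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        # L_(k+1): candidates that reach min_sup.
        level = {s: c for s, c in counts.items() if c >= min_sup}
        k += 1
    return frequent


# Database D from the worked Apriori example (TIDs 10, 20, 30, 40).
D = [{"a", "c", "d"}, {"b", "c", "e"}, {"a", "b", "c", "e"}, {"b", "e"}]
result = apriori(D, min_sup=2)
for itemset, sup in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), sup)
```

On database D with min_sup = 2 this prints the nine frequent itemsets a, b, c, e, ac, bc, be, ce, and bce with supports 2, 3, 3, 3, 2, 2, 3, 2, and 2, matching the worked example.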

Important Details of Apriori
- How to generate candidates?
  - Step 1: self-joining L_k
  - Step 2: pruning
- How to count supports of candidates?

How to Generate Candidates?
Suppose the items in L_{k-1} are listed in an order

Step 1: self-join L_{k-1}
    INSERT INTO C_k
    SELECT p.item_1, p.item_2, …, p.item_{k-1}, q.item_{k-1}
    FROM L_{k-1} p, L_{k-1} q
    WHERE p.item_1 = q.item_1 AND … AND p.item_{k-2} = q.item_{k-2} AND p.item_{k-1} < q.item_{k-1}

Step 2: pruning
    For each itemset c in C_k do
        For each (k-1)-subset s of c do
            if (s is not in L_{k-1}) then delete c from C_k

Example of Candidate Generation
- L_3 = {abc, abd, acd, ace, bcd}
- Self-joining: L_3 * L_3
  - abcd from abc and abd
  - acde from acd and ace
- Pruning:
  - acde is removed because ade is not in L_3
- C_4 = {abcd}
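The self-join and pruning steps can be sketched in Python and run on this example. This is a minimal sketch, assuming itemsets are given as strings or tuples of comparable items; the function name `apriori_gen` is borrowed from common usage and is not defined in the slides.

```python
from itertools import combinations


def apriori_gen(prev_level):
    """Build C_k from L_(k-1) by a prefix self-join followed by pruning."""
    # Represent each itemset as a sorted tuple so items have a fixed order.
    prev = sorted(tuple(sorted(s)) for s in prev_level)
    prev_set = set(prev)
    k_minus_1 = len(prev[0]) if prev else 0

    # Step 1: self-join. Two itemsets join when they agree on the first
    # k-2 items and the last item of p sorts before the last item of q.
    joined = [
        p + (q[-1],)
        for p in prev
        for q in prev
        if p[:-1] == q[:-1] and p[-1] < q[-1]
    ]

    # Step 2: prune. Drop a candidate if any (k-1)-subset is not in L_(k-1).
    return [
        c for c in joined
        if all(s in prev_set for s in combinations(c, k_minus_1))
    ]


L3 = ["abc", "abd", "acd", "ace", "bcd"]
C4 = apriori_gen(L3)
print(["".join(c) for c in C4])  # ['abcd']
```

The join produces abcd and acde; acde is then dropped because its subset ade is missing from L_3, leaving C_4 = {abcd} as on the slide.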