

Association Rules Spring 2010

Data Mining: What is it?
Two definitions:
- The first, classic and well known, says that data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data (W. Frawley).
- The second, witty and rebellious, says that data mining is nothing more than torturing the data until it confesses… and if you torture it enough, you can get it to confess to anything (Fred Menger).

Data mining techniques
- Association Rules
- Classification
- Prediction
- Clustering

What is Association mining?
- Finding frequent patterns, associations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.
- It searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
- Applications
  - Basket Data Analysis
  - Cross-marketing
  - Catalog design
  - …

Introduction to Association Rules (AR)
- The ideas came from market basket analysis (MBA):
  - What do customers buy?
  - Which products are bought together?
- AIM: find associations and correlations between the different items that customers place in their shopping basket.

Some definitions in AR
- Pattern: a particular data behavior, arrangement, or form that might be of business interest.
- Itemset: a set of items, i.e. a group of elements that together represent a single entity. An itemset is itself a type of pattern.

Some definitions in AR (contd.)
- Transaction database T: a set of transactions, T = {T1, T2, …, Tn}.
- Itemset: each transaction contains a set of items (an itemset). An itemset is a collection of items, I = {I1, I2, ..., Im}.

AR General Aim
- Find frequent/interesting patterns, associations, correlations, or causal structures among sets of items or elements in databases or other information repositories.
- An AR is an implication between two itemsets: X => Y

AR (contd.)
Frequent itemsets: items that frequently appear together.
Example:
- Bread => peanut-butter
- I = {bread, peanut-butter}

Transaction ID (TID)   Items
T1                     Bread, peanut-butter, jelly
T2                     Bread, peanut-butter
T3                     Bread, peanut-butter, milk
T4                     Bread, soda
T5                     Soda, milk

An Interesting Rule
- Support count (σ): the number of transactions in which an itemset occurs.
  - σ({bread, peanut-butter}) = 3
- Support (s): the fraction of transactions that contain an itemset.
  - s({bread, peanut-butter}) = 3/5
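The two definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `support_count` and `support` are assumptions made here, and the transactions are the five baskets from the example, with item names lowercased.

```python
# The five example transactions (T1..T5), item names lowercased.
transactions = [
    {"bread", "peanut-butter", "jelly"},   # T1
    {"bread", "peanut-butter"},            # T2
    {"bread", "peanut-butter", "milk"},    # T3
    {"bread", "soda"},                     # T4
    {"soda", "milk"},                      # T5
]

def support_count(itemset, transactions):
    """Support count (sigma): number of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Support (s): fraction of transactions that contain `itemset`."""
    return support_count(itemset, transactions) / len(transactions)

sigma = support_count({"bread", "peanut-butter"}, transactions)
s = support({"bread", "peanut-butter"}, transactions)
print(sigma, s)  # 3 0.6
```

Storing each transaction as a set makes the containment test a single subset comparison (`itemset <= t`).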

AR (contd.)
The two most commonly used measures of interest for a rule X => Y:
- Support (s): how frequently the rule occurs, i.e. the fraction of transactions that contain both X and Y.
  - s = σ(X ∪ Y) / (number of transactions)
- Confidence (c): the strength of the association, i.e. how often the items in Y appear in transactions that contain X.
  - c = σ(X ∪ Y) / σ(X)

AR (contd.)

Rule                        S            C
Bread => peanut-butter      3/5 = 0.6    3/4 = 0.75
Peanut-butter => bread      3/5 = 0.6    3/3 = 1
Soda => bread               1/5 = 0.2    1/2 = 0.5
Peanut-butter => jelly      1/5 = 0.2    1/3 = 0.33
Jelly => peanut-butter      1/5 = 0.2    1/1 = 1
Jelly => milk               0            0

Transaction ID (TID)   Items
T1                     Bread, peanut-butter, jelly
T2                     Bread, peanut-butter
T3                     Bread, peanut-butter, milk
T4                     Bread, soda
T5                     Soda, milk
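As a cross-check, the support and confidence values in the rule table can be recomputed from the five transactions. The sketch below is illustrative only; the helper names `sigma` and `rule_metrics` are assumptions made here, and exact fractions are used so the results can be compared directly with the table.

```python
from fractions import Fraction

# The five example transactions (T1..T5), item names lowercased.
transactions = [
    {"bread", "peanut-butter", "jelly"},   # T1
    {"bread", "peanut-butter"},            # T2
    {"bread", "peanut-butter", "milk"},    # T3
    {"bread", "soda"},                     # T4
    {"soda", "milk"},                      # T5
]

def sigma(itemset):
    """Support count: transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def rule_metrics(X, Y):
    """Return (support, confidence) of the rule X => Y as exact fractions."""
    both = sigma(set(X) | set(Y))
    s = Fraction(both, len(transactions))
    # Confidence is undefined when X never occurs; return None in that case.
    c = Fraction(both, sigma(X)) if sigma(X) else None
    return s, c

rules = [
    ("bread", "peanut-butter"),
    ("peanut-butter", "bread"),
    ("soda", "bread"),
    ("peanut-butter", "jelly"),
    ("jelly", "peanut-butter"),
    ("jelly", "milk"),
]
table = {(x, y): rule_metrics({x}, {y}) for x, y in rules}
for (x, y), (s, c) in table.items():
    print(f"{x} => {y}: S = {s}, C = {c}")
```

Note that a rule and its reverse always share the same support but generally differ in confidence, as the first two rows show.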

Types of AR
- Binary Association Rules
- Quantitative Association Rules
- Fuzzy Association Rules

Let's start from the beginning: Binary Association Rules and A-priori.

A-priori algorithm
- A-priori is the most influential AR miner.
- It consists of two steps:
  1. Generate all frequent itemsets, i.e. those whose support >= minimum support.
  2. Use the frequent itemsets to generate association rules.
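Step 2 can be sketched as follows, assuming step 1 has already produced the frequent itemsets together with their support counts. The function name `generate_rules` and the confidence threshold `minconf` are assumptions introduced for illustration; the input corresponds to the bread and peanut-butter example with a minimum support count of 3.

```python
from itertools import combinations

# Frequent itemsets with their support counts, as step 1 would deliver them
# for the bread/peanut-butter example (minimum support count of 3).
frequent = {
    frozenset({"bread"}): 4,
    frozenset({"peanut-butter"}): 3,
    frozenset({"bread", "peanut-butter"}): 3,
}

def generate_rules(frequent, minconf):
    """Return (X, Y, confidence) for every rule X => Y with confidence >= minconf."""
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                X = frozenset(antecedent)
                Y = itemset - X
                # Every subset of a frequent itemset is itself in `frequent`,
                # so the lookup of sigma(X) cannot fail.
                conf = count / frequent[X]
                if conf >= minconf:
                    rules.append((X, Y, conf))
    return rules

rules = generate_rules(frequent, minconf=0.8)
for X, Y, conf in rules:
    print(sorted(X), "=>", sorted(Y), conf)
```

With `minconf=0.8`, only peanut-butter => bread (confidence 1) is kept; bread => peanut-butter has confidence 0.75 and is dropped.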

A-priori (contd.)
- Key idea: the downward closure property.
  - Every subset of a frequent itemset is also a frequent itemset.
- The algorithm iteratively:
  - Creates candidate itemsets
  - Continues exploring only those whose support >= minimum support
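The level-wise loop described above might look like the sketch below. It is a minimal, unoptimized illustration rather than a reference implementation: the function name `apriori`, the join-by-union candidate generation, and the use of an absolute support count as the threshold are all assumptions made here.

```python
from itertools import combinations

def apriori(transactions, minsup_count):
    """Return {itemset: support count} for all itemsets with count >= minsup_count."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # One pass over the database per level.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # Level 1: every single item is a candidate.
    items = {frozenset([i]) for t in transactions for i in t}
    level = {c: n for c, n in count(items).items() if n >= minsup_count}
    frequent = dict(level)
    k = 2
    while level:
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        prev = list(level)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune (downward closure): every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        level = {c: n for c, n in count(candidates).items() if n >= minsup_count}
        frequent.update(level)
        k += 1
    return frequent

baskets = [
    {"bread", "peanut-butter", "jelly"},
    {"bread", "peanut-butter"},
    {"bread", "peanut-butter", "milk"},
    {"bread", "soda"},
    {"soda", "milk"},
]
result = apriori(baskets, minsup_count=3)
for itemset, n in result.items():
    print(sorted(itemset), n)
```

The prune step is where downward closure pays off: a candidate with any infrequent subset is discarded before the database is scanned to count it.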

Back to our example (minsup = 3, as a support count)

Transaction ID (TID)   Items
T1                     Bread, peanut-butter, jelly
T2                     Bread, peanut-butter
T3                     Bread, peanut-butter, milk
T4                     Bread, soda
T5                     Soda, milk

Items            Count
Bread            4
Peanut-butter    3
Jelly            1
Milk             2
Soda             2

Itemset                  Count
Bread, peanut-butter     3

Example (minsup = 2)

TID   Items
10    A, C, D
20    B, C, E
30    A, B, C, E
40    B, E

Items (1-itemsets)   sup
A                    2
B                    3
C                    3
D                    1
E                    3

Itemset (2-itemsets)   sup
{A,B}                  1
{A,C}                  2
{A,E}                  1
{B,C}                  2
{B,E}                  3
{C,E}                  2

Itemset (3-itemsets)   sup
{A,B,C}                1
{B,C,E}                2

Itemset (frequent 3-itemsets)   sup
{B,C,E}                         2
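The support values in this example can be verified with a brute-force enumeration. Note that this is deliberately not A-priori: it counts every possible itemset of each size, which is only feasible because the example has five items. The variable names are assumptions made for this sketch.

```python
from itertools import combinations

# The four example transactions, keyed by TID.
transactions = {
    10: {"A", "C", "D"},
    20: {"B", "C", "E"},
    30: {"A", "B", "C", "E"},
    40: {"B", "E"},
}
MINSUP = 2  # minimum support count

items = sorted(set().union(*transactions.values()))
frequent_by_size = {}
for k in range(1, len(items) + 1):
    level = {}
    for combo in combinations(items, k):
        n = sum(1 for t in transactions.values() if set(combo) <= t)
        if n >= MINSUP:
            level["".join(combo)] = n
    if not level:
        # By downward closure, no larger itemset can be frequent either.
        break
    frequent_by_size[k] = level

for k, level in frequent_by_size.items():
    print(k, level)
# 1 {'A': 2, 'B': 3, 'C': 3, 'E': 3}
# 2 {'AC': 2, 'BC': 2, 'BE': 3, 'CE': 2}
# 3 {'BCE': 2}
```

The result agrees with the tables: D drops out at size 1, {A,B} and {A,E} drop out at size 2, and {B,C,E} is the only frequent 3-itemset. The example lists {A,B,C} with support 1; an implementation that prunes by downward closure would skip counting it, because its subset {A,B} is already infrequent.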