Association Rules. Presented by: Anilkumar Panicker.

Presentation transcript:


What is Data Mining? Data mining is the search for valuable information in large volumes of data. It is one step in knowledge discovery in databases. It enables companies to focus on customer satisfaction and corporate profits, and to determine the impact of various parameters on sales.

Association Rule Association rules are used to show the relationships between data items. Association rules detect common usage of data items. E.g. The purchasing of one product when another product is purchased represents an association rule.

Example 1 Grocery store. Association rules have their most direct application in retail businesses. They are used to assist in marketing, advertising, floor placement and inventory control.

From the transaction history several association rules can be derived. E.g. 100% of the time that PeanutButter is purchased, so is bread. 33% of the time PeanutButter is purchased, Jelly is also purchased.

Example 2 A Telephone Company. A telephone company must ensure that all calls are completed within an acceptable period of time. In this environment, a potential data mining problem would be to predict the failure of a node. This can be done by finding association rules of the type $X \Rightarrow \text{Failure}$.

If rules of this type occur with high confidence, failures can be predicted, even though the support might be low because the condition X does not occur frequently.

Association rule Given a set of items $I = \{I_1, I_2, \ldots, I_m\}$ and a database of transactions $D = \{t_1, t_2, \ldots, t_n\}$, where $t_i = \{I_{i1}, I_{i2}, \ldots, I_{ik}\}$ and $I_{ij} \in I$, an association rule is an implication of the form $X \Rightarrow Y$, where $X, Y \subset I$ are sets of items called itemsets and $X \cap Y = \emptyset$.

Support (s): The support (s) for an association rule $X \Rightarrow Y$ is the percentage of transactions in the database that contain $X \cup Y$. E.g. if bread along with peanutbutter occurs in 60% of the total transactions, then the support for bread $\Rightarrow$ peanutbutter is 60%.

Confidence or Strength (α): The confidence or strength (α) for an association rule $X \Rightarrow Y$ is the ratio of the number of transactions that contain $X \cup Y$ to the number of transactions that contain $X$. E.g. if the support for bread $\Rightarrow$ peanutbutter is 60% and bread occurs in 80% of the total transactions, then the confidence for bread $\Rightarrow$ peanutbutter is 60% / 80% = 75%.
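The two measures can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `support` and `confidence` are assumptions, and the data are the four transactions (TIDs 100 to 400) from the Apriori example in this presentation.

```python
from typing import FrozenSet, Iterable, List

# Transactions copied from the Apriori example in this presentation
# (TIDs 100, 200, 300, 400).
TRANSACTIONS: List[FrozenSet[int]] = [
    frozenset({1, 3, 4}),
    frozenset({2, 3, 5}),
    frozenset({1, 2, 3, 5}),
    frozenset({2, 5}),
]


def support(itemset: Iterable[int], transactions: List[FrozenSet[int]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[int], y: Iterable[int],
               transactions: List[FrozenSet[int]]) -> float:
    """Confidence of the rule X -> Y: support(X u Y) / support(X)."""
    support_x = support(x, transactions)
    if support_x == 0:
        raise ValueError("X never occurs, so confidence is undefined")
    return support(frozenset(x) | frozenset(y), transactions) / support_x


print(support({2, 5}, TRANSACTIONS))       # 0.75
print(confidence({2}, {5}, TRANSACTIONS))  # 1.0
```

Here items 2 and 5 appear together in three of the four transactions, and every transaction that contains item 2 also contains item 5, which is why the rule 2 -> 5 reaches full confidence.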

Selecting Association Rules The selection of association rules is based on support and confidence. Confidence measures the strength of the rule, whereas support measures how often the rule occurs in the database. Typically, a large minimum confidence and a smaller minimum support are used. Rules that satisfy both the minimum support and the minimum confidence are called strong rules.

Association Rule Problem Given a set of items $I = \{I_1, I_2, \ldots, I_m\}$ and a database of transactions $D = \{t_1, t_2, \ldots, t_n\}$, where $t_i = \{I_{i1}, I_{i2}, \ldots, I_{ik}\}$ and $I_{ij} \in I$, the association rule problem is to identify all association rules $X \Rightarrow Y$ with a minimum support and confidence. These values (s, α) are given as input to the problem.

Large Itemsets A large itemset (or frequent itemset) is an itemset whose number of occurrences is above a threshold, s (the support). Finding large itemsets is conceptually simple but computationally very costly. The naive approach is to count every itemset that appears in any transaction. Given a set of items of size m, there are $2^m$ subsets; ignoring the empty set, we are still left with $2^m - 1$ subsets.

For example, suppose the retail store sells 5 products, so the set of items has size 5. Then the number of possible itemsets is $2^5 - 1 = 31$. If the 5 products sold are bread, peanutbutter, milk, beer and jelly, then the 31 possible itemsets are:

{Bread}, {Peanutbutter}, {Milk}, {Beer}, {Jelly}, {Bread, Peanutbutter}, {Bread, Milk}, {Bread, Beer}, {Bread, Jelly}, {Peanutbutter, Milk}, {Peanutbutter, Beer}, {Peanutbutter, Jelly}, {Milk, Beer}, {Milk, Jelly}, {Beer, Jelly}, {Bread, Peanutbutter, Milk}, {Bread, Peanutbutter, Beer}, and so on.
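The enumeration above can be reproduced with a short Python sketch. The helper name `all_itemsets` is an assumption made for illustration; it simply lists every non-empty subset, which is the naive approach and is only practical for a handful of items.

```python
from itertools import combinations
from typing import Iterator, List, Tuple

ITEMS: List[str] = ["bread", "peanutbutter", "milk", "beer", "jelly"]


def all_itemsets(items: List[str]) -> Iterator[Tuple[str, ...]]:
    """Yield every non-empty subset of `items`, smallest subsets first."""
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield combo


itemsets = list(all_itemsets(ITEMS))
print(len(itemsets))   # 31, i.e. 2**5 - 1
print(itemsets[5])     # ('bread', 'peanutbutter')
print(2 ** 30 - 1)     # 1073741823 itemsets for a store with 30 products
```

The last line shows why exhaustive counting does not scale: the number of itemsets doubles with every additional product.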

For m = 30, the number of potential itemsets becomes $2^{30} - 1$ = 1,073,741,823. The challenge in solving an association rule problem is hence to determine all large itemsets efficiently. Most association rule algorithms are based on smart ways to reduce the number of itemsets to be counted.

Large Itemsets The most common approach to finding association rules is to break up the problem into two parts: 1. Finding large itemsets, and 2. Generating rules from these itemsets.

Any subset of a large itemset is also large. Once the large itemsets have been found, we know that any interesting association rule $X \Rightarrow Y$ must have $X \cup Y$ in this set of frequent itemsets. When all large itemsets are found, generating the association rules is straightforward.
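The second part, generating rules from large itemsets, can be sketched as follows. This is one possible illustration rather than the presenter's method: the function name `generate_rules` and the dictionary layout are assumptions, and the support counts are the ones that appear in the Apriori example in this presentation. Because every subset of a large itemset is also large, the count of each antecedent X is guaranteed to be in the dictionary.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[int], FrozenSet[int], float]

# Support counts of the large itemsets from the Apriori example (threshold 2).
LARGE: Dict[FrozenSet[int], int] = {
    frozenset({1}): 2, frozenset({2}): 3, frozenset({3}): 3, frozenset({5}): 3,
    frozenset({1, 3}): 2, frozenset({2, 3}): 2,
    frozenset({2, 5}): 3, frozenset({3, 5}): 2,
    frozenset({2, 3, 5}): 2,
}


def generate_rules(large: Dict[FrozenSet[int], int],
                   min_confidence: float) -> List[Rule]:
    """Split each large itemset into X -> Y and keep rules that are confident enough."""
    rules: List[Rule] = []
    for itemset, count in large.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty X and a non-empty Y
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                x = frozenset(antecedent)
                y = itemset - x
                conf = count / large[x]  # count(X u Y) / count(X)
                if conf >= min_confidence:
                    rules.append((x, y, conf))
    return rules


for x, y, conf in generate_rules(LARGE, min_confidence=1.0):
    print(sorted(x), "->", sorted(y), conf)
```

With a minimum confidence of 100%, this data yields five rules: 1 -> 3, 2 -> 5, 5 -> 2, {2, 3} -> 5 and {3, 5} -> 2.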

Apriori Algorithm The Apriori algorithm is the best-known association rule algorithm. It is used to discover large itemsets efficiently, using the property that any subset of a large itemset must also be large. Inputs: the set of items, the database of transactions, and the minimum support. Output: the large itemsets.

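Apriori Algorithm Example

Transaction database:

TID    Items
100    1, 3, 4
200    2, 3, 5
300    1, 2, 3, 5
400    2, 5

Support count of each single item:

Item set    Support
{1}         2
{2}         3
{3}         3
{4}         1
{5}         3

Support threshold = 2. Single items that meet the threshold:

Item set    Support
{1}         2
{2}         3
{3}         3
{5}         3

Candidate item sets of size 2: {1,2}, {1,3}, {1,5}, {2,3}, {2,5}, {3,5}

Support count of each candidate of size 2:

Item set    Support
{1,2}       1
{1,3}       2
{1,5}       1
{2,3}       2
{2,5}       3
{3,5}       2

Support threshold = 2. Item sets of size 2 that meet the threshold:

Item set    Support
{1,3}       2
{2,3}       2
{2,5}       3
{3,5}       2

Candidate item sets of size 3: {2,3,5}

Item set    Support
{2,3,5}     2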
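The passes traced in this example can be sketched in Python. This is a simplified, level-wise illustration under stated assumptions, not a reference implementation: the function name `apriori` is hypothetical, the threshold is an absolute count as in the example, and candidates are formed by uniting pairs of large itemsets from the previous pass and discarding any candidate that has a subset which is not large.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set


def apriori(transactions: Iterable[Iterable[int]],
            min_count: int) -> Dict[FrozenSet[int], int]:
    """Return {itemset: support count} for every itemset with count >= min_count."""
    db: List[FrozenSet[int]] = [frozenset(t) for t in transactions]

    # Pass 1: count single items and keep those that meet the threshold.
    counts: Dict[FrozenSet[int], int] = {}
    for t in db:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    large = dict(current)

    k = 2
    while current:
        # Join: unite pairs of large (k-1)-itemsets into candidates of size k.
        # Prune: drop a candidate if any (k-1)-subset is not large.
        candidates: Set[FrozenSet[int]] = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count: scan the database once for the surviving candidates.
        counts = {c: sum(1 for t in db if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        large.update(current)
        k += 1
    return large


DATABASE = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
large = apriori(DATABASE, min_count=2)
for itemset in sorted(large, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), large[itemset])
```

On the four transactions above, the sketch returns nine large itemsets, ending with {2, 3, 5} at a support count of 2. Candidates such as {1, 2, 3} are never counted because {1, 2} is not large, which is how the pruning step saves work.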

References Data Mining by Margaret Dunham. Wikipedia

Q & A …… Thanks..