Lecture 4: Association Market Basket Analysis
Analysis of Customer Behavior and Service Modeling


What Is Association Mining?
• Association rule mining:
  – Finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.
• Applications:
  – Market basket analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.
• Examples:
  – Rule form: "Body → Head [support, confidence]"
  – buys(x, "diapers") → buys(x, "beers") [0.5%, 60%]
  – major(x, "CS") ^ takes(x, "DB") → grade(x, "A") [1%, 75%]

Support and Confidence
• Support
  – Percentage of samples that contain both A and B
  – support(A → B) = P(A ∩ B)
• Confidence
  – Percentage of samples containing A that also contain B
  – confidence(A → B) = P(B|A)
• Example
  – computer → financial_management_software [support = 2%, confidence = 60%]
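The two definitions above can be tried out on a small data set. The following is a minimal sketch, not part of the lecture: the `transactions` list is hypothetical toy data, and the helper names `support` and `confidence` are chosen here for illustration.

```python
# Hypothetical toy data: each transaction is the set of items in one basket.
transactions = [
    {"computer", "financial_management_software"},
    {"computer", "printer"},
    {"computer", "financial_management_software", "printer"},
    {"printer"},
    {"computer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(lhs, rhs, transactions):
    """Estimate P(rhs | lhs) as support(lhs and rhs) / support(lhs)."""
    both = support(set(lhs) | set(rhs), transactions)
    return both / support(lhs, transactions)


rule_support = support({"computer", "financial_management_software"}, transactions)
rule_confidence = confidence({"computer"}, {"financial_management_software"}, transactions)
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

On this toy data, 2 of 5 baskets contain both items and 4 of 5 contain a computer, so the rule has 40% support and 50% confidence.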

Association Rules: Basic Concepts
• Given: (1) a database of transactions, where (2) each transaction is a list of items (purchased by a customer in one visit)
• Find: all rules that correlate the presence of one set of items with that of another set of items
  – e.g., 98% of people who purchase tires and auto accessories also get automotive services done
• Applications
  – Home electronics: which other products should the store stock up on?
  – Retailing: shelf design, promotion structuring, direct marketing

Rule Measures: Support and Confidence
[Figure: Venn diagram of "Customer buys diaper" and "Customer buys beer", overlapping in "Customer buys both"]
• Find all the rules A → C with minimum confidence and support
  – Support (s): probability that a transaction contains {A & C}
  – Confidence (c): conditional probability that a transaction containing {A} also contains {C}
• With minimum support 50% and minimum confidence 50%, we have
  – A → C (50%, 66.6%)
  – C → A (50%, 100%)

Mining Association Rules: An Example
• Target: min. support 50%, min. confidence 50%
• For rule A → C:
  – support = support({A, C}) = 50%
  – confidence = support({A, C}) / support({A}) = 66.6%
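Filtering rules by the two thresholds can be sketched as follows. The four transactions are hypothetical, chosen only so that the figures quoted above (50% support, roughly 66.6% and 100% confidence) come out. The sketch is also limited to rules between single items, which is a simplification rather than a full mining algorithm.

```python
from itertools import combinations

# Hypothetical transaction database (an assumption, see the lead-in).
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]
MIN_SUPPORT = 0.5
MIN_CONFIDENCE = 0.5


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t) / len(transactions)


items = sorted(set().union(*transactions))
rules = []
for x, y in combinations(items, 2):
    pair_support = support({x, y})
    if pair_support < MIN_SUPPORT:
        continue  # the pair is not frequent, so neither direction can qualify
    for lhs, rhs in ((x, y), (y, x)):
        conf = pair_support / support({lhs})
        if conf >= MIN_CONFIDENCE:
            rules.append((lhs, rhs, pair_support, conf))

for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}  (support {s:.1%}, confidence {c:.1%})")
```

Only the pair {A, C} reaches the support threshold here, so the sketch reports exactly the two rules A → C and C → A.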

An Example of Market Basket (1)
• There are 8 transactions on three items: A (Apple), B (Banana), and C (Carrot).
• Check the associations for the two cases below: (1) A → B and (2) (A, B) → C.

  #   Basket
  1   A
  2   B
  3   C
  4   A, B
  5   A, C
  6   B, C
  7   A, B, C
  8

An Example of Market Basket (2)
• The basic probabilities are as follows:

               (1) A → B                         (2) (A, B) → C
  LHS          P(A) = 5/8 = 0.625                P(A,B) = 3/8 = 0.375
  RHS          P(B) = 5/8 = 0.625                P(C) = 5/8 = 0.625
  Coverage     LHS = 0.625                       LHS = 0.375
  Support      P(A∩B) = 3/8 = 0.375              P((A,B)∩C) = 2/8 = 0.25
  Confidence   P(B|A) = 0.375/0.625 = 0.6        P(C|(A,B)) = 0.25/0.375 = 0.667
  Lift         0.375/(0.625*0.625) = 0.96        0.25/(0.375*0.625) = 1.07
  Leverage     0.375 - 0.625*0.625 = -0.016      0.25 - 0.375*0.625 = 0.016
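The probabilities for the two rules can be recomputed from the basket list. This is a sketch under one assumption: the contents of basket 8 are not shown, and {A, B, C} is used here because it is the only basket consistent with P(A) = P(B) = P(C) = 5/8. The function names `prob` and `measures` are illustrative.

```python
# Baskets 1-7 come from the example; basket 8 is an assumption (see lead-in).
baskets = [
    {"A"}, {"B"}, {"C"},
    {"A", "B"}, {"A", "C"}, {"B", "C"},
    {"A", "B", "C"},
    {"A", "B", "C"},  # assumed contents of basket 8
]


def prob(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)


def measures(lhs, rhs):
    """Return coverage, support, confidence, lift and leverage of lhs -> rhs."""
    coverage = prob(lhs)
    supp = prob(lhs | rhs)
    expected = coverage * prob(rhs)  # support expected under independence
    return {
        "coverage": coverage,
        "support": supp,
        "confidence": supp / coverage,
        "lift": supp / expected,
        "leverage": supp - expected,
    }


rule1 = measures({"A"}, {"B"})
rule2 = measures({"A", "B"}, {"C"})
for name, m in (("A -> B", rule1), ("(A, B) -> C", rule2)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Rule (1) comes out with lift just below 1 and slightly negative leverage, while rule (2) comes out with lift just above 1 and slightly positive leverage.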

Lift
• What are good association rules? (How should they be interpreted?)
  – If lift is close to 1, there is no association between the two items (sets).
  – If lift is greater than 1, there is a positive association between the two items (sets).
  – If lift is less than 1, there is a negative association between the two items (sets).

Leverage
• Leverage = P(A∩B) - P(A)*P(B). There are three cases:
  ① Leverage > 0: the two items (sets) are positively associated
  ② Leverage = 0: the two items (sets) are independent
  ③ Leverage < 0: the two items (sets) are negatively associated
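Lift and leverage always agree on the direction of an association, because lift > 1 exactly when P(A∩B) > P(A)*P(B). A small sketch of the three cases follows. The function name `interpret` and the tolerance `tol` are illustrative assumptions, and the first two calls reuse the probabilities from the market basket example.

```python
import math


def interpret(p_lhs, p_rhs, p_both, tol=1e-9):
    """Return (lift, leverage, label) for a rule, given its three probabilities."""
    expected = p_lhs * p_rhs  # joint probability if the two sides were independent
    lift = p_both / expected
    leverage = p_both - expected
    if math.isclose(leverage, 0.0, abs_tol=tol):
        label = "independent"
    elif leverage > 0:
        label = "positively associated"
    else:
        label = "negatively associated"
    return lift, leverage, label


case_ab = interpret(0.625, 0.625, 0.375)   # A -> B from the basket example
case_abc = interpret(0.375, 0.625, 0.25)   # (A, B) -> C from the basket example
case_ind = interpret(0.5, 0.5, 0.25)       # hypothetical independent pair

for lift, leverage, label in (case_ab, case_abc, case_ind):
    print(f"lift = {lift:.2f}, leverage = {leverage:+.3f}: {label}")
```

In practice, sample estimates are rarely exactly independent, so a looser tolerance or a significance test is a reasonable substitute for the strict equality in case ②.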

Lab on Association Rules (1)
• SPSS Clementine and SAS Enterprise Miner include association rule software.
• This exercise uses Magnum Opus.
• Go to the download page and download the Magnum Opus evaluation version.

• After you install the program, the initial screen appears. From the menu, choose File – Import Data (Ctrl-O).

• Demo data sets are already included. Magnum Opus supports two kinds of data sets: transaction data (*.idi, *.itl) and attribute-value data (*.data, *.nam).
• Transaction data comes in two formats (*.idi, *.itl):

  idi (identifier-item file):
    001, apples
    001, oranges
    001, bananas
    002, apples
    002, carrots
    002, lettuce
    002, tomatoes

  itl (item list file):
    apples, oranges, bananas
    apples, carrots, lettuce, tomatoes
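The relationship between the two transaction formats can be illustrated with a short conversion sketch. It assumes each idi line is exactly "identifier, item" separated by a comma, as in the sample above; the real Magnum Opus file format may allow more than that. The function name `idi_to_itl` is hypothetical.

```python
import csv
import io


def idi_to_itl(idi_text):
    """Group identifier-item lines into one comma-separated item list per identifier."""
    baskets = {}  # identifier -> list of items, in order of first appearance
    for row in csv.reader(io.StringIO(idi_text), skipinitialspace=True):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        identifier, item = row[0].strip(), row[1].strip()
        baskets.setdefault(identifier, []).append(item)
    return "\n".join(", ".join(items) for items in baskets.values())


sample_idi = """\
001, apples
001, oranges
001, bananas
002, apples
002, carrots
002, lettuce
002, tomatoes
"""
itl_text = idi_to_itl(sample_idi)
print(itl_text)
```

The identifiers themselves are dropped in the itl form: each output line is one basket.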

• If you open tutorial.idi in Notepad, you can see its contents, as shown at left.
• The example at left has 5 transactions (baskets).

• Choose File – Import Data, or click the import icon, then select Tutorial.idi.
• Check "Identifier – item file" and click Next >.

• Click Yes and click Next > …
• Click Next > …

• Click Next > …
• When asked what percentage of the whole file to use, type 50% and click Next > …

• Click Import Data.
• The screen shown at lower left then appears.

• Leave the settings as they are:
  – Search by: LIFT
  – Minimum lift: 1
  – Maximum no. of rules: 10
• Click GO.

• Results are saved in the tutorial.out file.
• An example of a derived rule:

  lettuce & carrots are associated with tomatoes with strength = 0.857
    coverage = 0.042: 21 cases satisfy the LHS
    support = 0.036: 18 cases satisfy both the LHS and the RHS
    lift 3.51: the strength is 3.51 times greater than the strength if there were no association
    leverage = 0.0258: the support is 0.0258 (12.9 cases) greater than if there were no association

• lettuce & carrots → tomatoes
  – Customers who purchase lettuce and carrots also tend to buy tomatoes.
• coverage = 0.042: 21 cases satisfy the LHS
  – P(lettuce & carrots) = 21/500 = 0.042
• support = 0.036: 18 cases satisfy both the LHS and the RHS
  – P((lettuce & carrots) ∩ tomatoes) = 18/500 = 0.036
• strength (confidence) = 0.857
  – P(RHS|LHS) = 18/21 = 0.036/0.042 = 0.857

• lift 3.51: the strength is 3.51 times greater than the strength if there were no association
  – That is, (18/21)/(122/500) = 3.51
• leverage = 0.0258: the support is 0.0258 (12.9 cases) greater than if there were no association
  – P(LHS ∩ RHS) - P(LHS)*P(RHS) = 0.036 - 0.042*0.244 = 0.0258
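The reported figures can be rechecked from the raw counts alone. The sketch below uses only numbers that appear in the calculations above (500 cases in total, 21 matching the LHS, 18 matching both sides, 122 containing tomatoes); the variable names are illustrative.

```python
# Counts taken from the worked calculations above.
n_total = 500  # cases in the sample
n_lhs = 21     # cases containing lettuce & carrots
n_both = 18    # cases containing lettuce, carrots and tomatoes
n_rhs = 122    # cases containing tomatoes

coverage = n_lhs / n_total              # P(LHS)
support = n_both / n_total              # P(LHS and RHS)
strength = n_both / n_lhs               # confidence, P(RHS | LHS)
p_rhs = n_rhs / n_total                 # P(RHS)
lift = strength / p_rhs                 # strength relative to no association
leverage = support - coverage * p_rhs   # excess support over independence
leverage_cases = leverage * n_total     # the same excess, expressed in cases

print(f"coverage = {coverage:.3f}")
print(f"support  = {support:.3f}")
print(f"strength = {strength:.3f}")
print(f"lift     = {lift:.2f}")
print(f"leverage = {leverage:.4f} ({leverage_cases:.1f} cases)")
```

The rounded results match the rule as reported: coverage 0.042, support 0.036, strength 0.857, lift 3.51, and leverage 0.0258 (12.9 cases).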