Elective-I
Examination Scheme:
▪ In-semester assessment: 30
▪ End-semester assessment: 70
Text Books:
▪ Data Mining: Concepts and Techniques - Jiawei Han, Micheline Kamber
▪ Introduction to Data Mining with Case Studies - G. K. Gupta
Reference Books:
▪ Mining the Web: Discovering Knowledge from Hypertext Data - Soumen Chakrabarti
▪ Reinforcement and Systemic Machine Learning for Decision Making - Parag Kulkarni
▪ Market Basket Analysis
▪ Frequent itemset, closed itemset, association rules
▪ Mining multilevel association rules
▪ Constraint-based association rule mining
▪ Apriori algorithm
▪ FP-growth algorithm
Itemset: a set of items; each transaction is itself an itemset.
Confidence: a measure of the trustworthiness of a discovered pattern. For a rule X → Y, it is the fraction of transactions containing X that also contain Y.
Support: a measure of how often the items in an association occur together, expressed as a percentage of all transactions.
Frequent itemset: an itemset that satisfies the minimum support threshold.
Def: Market Basket Analysis (Association Analysis) is a mathematical modeling technique based on the theory that customers who buy a certain group of items are likely to buy another group of items. It is used to analyze customer purchasing behavior from point-of-sale transaction data, and it helps to increase sales and maintain inventory.
Identify purchase patterns:
▪ What items tend to be purchased together? Obvious examples: steak and potatoes; diapers and baby lotion.
▪ What items are purchased sequentially? Obvious examples: house, then furniture; car, then tires.
▪ What items tend to be purchased by season?
Categorize customer purchase behavior:
▪ Purchase profiles
▪ Profitability of each purchase profile
Use it for marketing:
▪ Layout or catalogs
▪ Selecting products for promotion
▪ Space allocation
Customer 1: beer, pretzels, potato chips, aspirin
Customer 2: diapers, baby lotion, grapefruit juice, baby food, milk
Customer 3: soda, potato chips, milk
Customer 4: soup, beer, milk, ice cream
Customer 5: soda, coffee, milk, bread
Customer 6: beer, potato chips
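One simple way to see which items tend to be purchased together is to count, for every pair of items, how many baskets contain both. The sketch below does this for the six customer baskets listed above. The function name `pair_counts` and the cut-off of two baskets are illustrative assumptions, not part of the course material.

```python
from collections import Counter
from itertools import combinations

# Baskets copied from the six customers listed above.
baskets = [
    {"beer", "pretzels", "potato chips", "aspirin"},
    {"diapers", "baby lotion", "grapefruit juice", "baby food", "milk"},
    {"soda", "potato chips", "milk"},
    {"soup", "beer", "milk", "ice cream"},
    {"soda", "coffee", "milk", "bread"},
    {"beer", "potato chips"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted() puts every pair in a canonical (alphabetical) order,
        # so ("beer", "milk") and ("milk", "beer") share one key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)

# Keep only pairs bought together by at least two customers
# (the threshold of 2 is an arbitrary choice for this tiny example).
repeated = {pair: n for pair, n in counts.items() if n >= 2}
print(repeated)
```

With these six baskets, only two pairs occur in more than one basket: beer with potato chips (customers 1 and 6) and milk with soda (customers 3 and 5).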
Example purchase profiles:
▪ beauty conscious
▪ kids’ play
▪ convenience food
▪ health conscious
▪ pet lover
▪ women’s fashion
▪ sports conscious
▪ gardener
▪ kid’s fashion
▪ smoker
▪ automotive
▪ hobbyist
▪ casual drinker
▪ photographer
▪ student/home office
▪ new family
▪ tv/stereo enthusiast
▪ illness (prescription)
▪ illness (over-the-counter)
▪ seasonal/traditional
▪ personal care
▪ casual reader
▪ homemaker
▪ home handyman
▪ home comfort
▪ men’s image conscious
▪ fashion footwear
▪ sentimental
▪ men’s fashion
Beauty conscious: cotton balls, hair dye, cologne, nail polish
BENEFITS:
▪ Simple computations
▪ Can be undirected (no hypotheses are needed before analysis)
▪ Different data forms can be analyzed
Itemset: a collection of one or more items
▪ Example: {Milk, Bread, Diaper}
Support count (σ): frequency of occurrence of an itemset
▪ E.g. σ({Milk, Bread, Diaper}) = 2
Support (s): fraction of transactions that contain an itemset
▪ E.g. s({Milk, Bread, Diaper}) = 2/5
Frequent itemset: an itemset whose support is greater than or equal to a minsup threshold
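The definitions above translate directly into a few lines of code. The following is a minimal sketch, not a reference implementation. The transaction table behind the quoted figures is not reproduced in the text, so the five transactions below are an assumed dataset, chosen only because it gives σ = 2 and s = 2/5 for {Milk, Bread, Diaper}. The function names and the minsup values are likewise illustrative.

```python
# ASSUMPTION: the original transaction table is not shown in the text.
# These five transactions are a hypothetical dataset that reproduces
# the quoted values sigma = 2 and s = 2/5 for {Milk, Bread, Diaper}.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """sigma(itemset): number of transactions containing every item."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def support(itemset, transactions):
    """s(itemset): fraction of transactions containing the itemset."""
    return support_count(itemset, transactions) / len(transactions)


def is_frequent(itemset, transactions, minsup):
    """An itemset is frequent if its support is at least minsup."""
    return support(itemset, transactions) >= minsup


itemset = {"Milk", "Bread", "Diaper"}
print(support_count(itemset, transactions))      # sigma
print(support(itemset, transactions))            # s
print(is_frequent(itemset, transactions, 0.4))   # hypothetical minsup
```

Under this assumed dataset the itemset is frequent for any minsup up to 0.4 and infrequent above it, which shows how the threshold alone decides which itemsets survive.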
Given a set of transactions, find rules that will predict the occurrence of an item based on the occurrences of other items in the transaction.
Examples of association rules (from market-basket transactions):
▪ {Diaper} → {Beer}
▪ {Milk, Bread} → {Eggs, Coke}
▪ {Beer, Bread} → {Milk}
Implication here means co-occurrence, not causality.
Example of Rules:
▪ {Milk, Diaper} → {Beer} (s=0.4, c=0.67)
▪ {Milk, Beer} → {Diaper} (s=0.4, c=1.0)
▪ {Diaper, Beer} → {Milk} (s=0.4, c=0.67)
▪ {Beer} → {Milk, Diaper} (s=0.4, c=0.67)
▪ {Diaper} → {Milk, Beer} (s=0.4, c=0.5)
▪ {Milk} → {Diaper, Beer} (s=0.4, c=0.5)
Observations:
▪ All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}
▪ Rules originating from the same itemset have identical support but can have different confidence
▪ Thus, we may decouple the support and confidence requirements
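The observations above can be checked by enumerating every binary partition of an itemset and computing support and confidence for each resulting rule. This is a rough sketch under stated assumptions. The transaction table is not given in the text, so the five transactions below are a hypothetical dataset chosen to be consistent with the s and c values quoted for {Milk, Diaper, Beer}. The helper names `support_count` and `rules_from` are also illustrative.

```python
from itertools import combinations

# ASSUMPTION: hypothetical transactions, chosen to be consistent with
# the support and confidence values quoted for {Milk, Diaper, Beer}.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def rules_from(itemset, transactions):
    """Return (lhs, rhs, support, confidence) for each binary partition."""
    items = sorted(itemset)
    full = support_count(items, transactions)
    s = full / len(transactions)  # identical for every rule generated here
    rules = []
    for k in range(1, len(items)):          # size of the left-hand side
        for lhs in combinations(items, k):
            rhs = tuple(i for i in items if i not in lhs)
            lhs_count = support_count(lhs, transactions)
            if lhs_count == 0:
                continue                    # confidence undefined, skip
            rules.append((lhs, rhs, s, full / lhs_count))
    return rules


rules = rules_from({"Milk", "Diaper", "Beer"}, transactions)
for lhs, rhs, s, c in rules:
    print(f"{set(lhs)} -> {set(rhs)}  (s={s:.2f}, c={c:.2f})")
```

Because the support is computed once from the full itemset, only the confidence depends on how the itemset is split. This is why frequent itemsets can be found first and rules generated from them in a second step.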