Association Rule By Kenneth Leung. Data Mining The process of extracting valid, previously unknown, comprehensible, and actionable information from large.


1 Association Rule By Kenneth Leung

2 Data Mining The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases, and using it to make crucial business decisions. Make decisions based on previous experience or observation.

3 Association Rule Mining Formal: To find interesting associations and/or correlation relationships among a large set of data items. Association rules show attribute-value conditions that occur frequently together in a given dataset. Informal: an “If – Then” relationship. If this happens, what is most likely to happen next? Obesity => Diabetes

4 Market Basket Analysis A typical and widely-used example of association rule mining. Example: Data are collected using bar-code scanners in supermarkets. Each record will consist of all items in a single purchase transaction. Managers would be interested to know if certain groups of items are consistently purchased together. They could use this data for adjusting store layouts (placing items optimally with respect to each other), for cross-selling, for promotions, for catalog design and to identify customer segments based on buying patterns.
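As a rough illustration of finding items that are "consistently purchased together", the sketch below counts how many transactions contain each pair of items. The transactions and the helper name `count_pairs` are made up for illustration and are not from the presentation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is one purchase (illustrative data only).
transactions = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"beer", "chips"},
    {"milk", "diapers", "bread"},
]


def count_pairs(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for basket in transactions:
        # sorted() gives each pair one canonical order, e.g. ("beer", "diapers").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


pair_counts = count_pairs(transactions)
# Pairs that co-occur most often are candidates for shelf placement or cross-selling.
for pair, n in pair_counts.most_common(3):
    print(pair, n)
```

Counting every pair is only practical for small item catalogs; it is meant to show the idea, not a scalable mining algorithm.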

5 Famous & Interesting Finding Beer & Diapers “A number of convenience store clerks noticed that men often bought beer at the same time they bought diapers. The store mined its receipts and proved the clerks' observations correct. So, the store began stocking diapers next to the beer coolers, and sales skyrocketed”

6 Why Beer and Diapers? Moms are stressed out by their naughty babies, and they need some beer for relief? Diaper boxes are good for holding old beer bottles: very environmentally friendly and easy to handle.

7 Two Certainty Indices Two measures determine whether an association rule (AR) X => Y is good. Support of AR: the percentage of all transactions that contain both X and Y (X and Y are two items). Confidence of AR: the ratio of the number of transactions that contain both X and Y to the number of transactions that contain X. The higher these values, the more reliable the rule.
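The two definitions above could be sketched in code roughly as follows. The function names `support` and `confidence`, their signatures, and the sample `baskets` are assumptions made for illustration, not part of the presentation.

```python
def support(transactions, x, y):
    """Fraction of all transactions that contain every item in both x and y."""
    both = set(x) | set(y)
    hits = sum(1 for t in transactions if both <= set(t))
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Among transactions that contain x, the fraction that also contain y."""
    x = set(x)
    both = x | set(y)
    with_x = sum(1 for t in transactions if x <= set(t))
    with_both = sum(1 for t in transactions if both <= set(t))
    # By convention here, a rule whose left-hand side never occurs gets confidence 0.
    return with_both / with_x if with_x else 0.0


# Hypothetical baskets (illustrative data only).
baskets = [
    {"beer", "diapers"},
    {"beer"},
    {"beer", "diapers", "milk"},
    {"milk"},
    {"bread"},
]

print(support(baskets, {"beer"}, {"diapers"}))     # 2 of 5 baskets
print(confidence(baskets, {"beer"}, {"diapers"}))  # 2 of the 3 beer baskets
```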

8 Example: Support Supermarket has 100,000 transactions. 2,000 of the 100,000 transactions include beer. 800 of those 2,000 transactions also contain diapers. Support for the rule “beer->diapers” is 800 transactions, or 800/100,000 = 0.008, or 0.8%

9 Example: Confidence Supermarket has 100,000 transactions. 2,000 of the 100,000 transactions include beer. 800 of those 2,000 transactions also contain diapers. Confidence for the rule “beer->diapers” is 800/2000 = 0.4, or 40%
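The arithmetic for the beer and diapers example can be checked with a few lines of code. The counts are the ones given on the slide; the variable names are chosen here for illustration.

```python
total_transactions = 100_000  # all supermarket transactions
beer_transactions = 2_000     # transactions that include beer
beer_and_diapers = 800        # beer transactions that also contain diapers

# Support: share of ALL transactions containing both items.
rule_support = beer_and_diapers / total_transactions
# Confidence: share of BEER transactions that also contain diapers.
rule_confidence = beer_and_diapers / beer_transactions

print(f"support = {rule_support:.1%}, confidence = {rule_confidence:.0%}")
# prints: support = 0.8%, confidence = 40%
```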

10 Full example from Wiki
1. {Cold, Raining} => No Support: 2/5 = 40% Confidence: 2/2 = 100% => Good
2. {Calm, Dry} => Yes Support: 2/5 = 40% Confidence: 2/2 = 100% => Good
3. {Dry} => No Support: 1/5 = 20% Confidence: 1/3 = 33.3% => Bad
4. {Windy} => No Support: 0/5 = 0% Confidence: 1/1 = 100% => Bad

11 References
http://www.resample.com/xlminer/help/Assocrules/associationrules_intro.htm
http://en.wikipedia.org/wiki/Association_rule_learning
Dr Sin-Min Lee’s lecture 30

