Association Rules Analysis. Event A -> Event ?


1 Association Rules Analysis. Event A -> Event ?
Market Basket Analysis

2 What Is Association Mining?
Association rule mining: finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.
Applications: market basket analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.
Examples:
Rule form: "Body → Head [support, confidence]"
buys(x, "diapers") → buys(x, "beers") [0.5%, 60%]

3 Support and Confidence
Support: percent of samples that contain both A and B
support(A → B) = P(A ∩ B)
Confidence: percent of samples containing A that also contain B
confidence(A → B) = P(B|A)
Example: pork → lettuce [support = 2%, confidence = 60%]
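
The two definitions above can be sketched in a few lines of Python. This is only an illustration: the basket contents and the function names `support` and `confidence` are invented for the example and do not come from the slides.

```python
# Hypothetical baskets, invented for illustration (not from the slides).
baskets = [
    {"pork", "lettuce"},
    {"pork", "lettuce", "rice"},
    {"pork"},
    {"lettuce"},
    {"rice"},
]


def support(baskets, a, b):
    """P(A ∩ B): share of all baskets that contain every item of A and of B."""
    both = set(a) | set(b)
    return sum(both <= basket for basket in baskets) / len(baskets)


def confidence(baskets, a, b):
    """P(B | A): among baskets containing A, the share that also contain B."""
    a = set(a)
    n_a = sum(a <= basket for basket in baskets)
    n_ab = sum((a | set(b)) <= basket for basket in baskets)
    return n_ab / n_a  # raises ZeroDivisionError if no basket contains A


print(support(baskets, {"pork"}, {"lettuce"}))     # 2 of the 5 baskets
print(confidence(baskets, {"pork"}, {"lettuce"}))  # 2 of the 3 pork baskets
```

Representing each basket as a set makes the "contains all of these items" check a simple subset test (`<=`).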

4 A store selling fruits and vegetables
Which items are sold together frequently?

5 An Example of Market Basket(1)
There are 8 transactions on three items: A (Apple), B (Banana), and C (Carrot). Check the associations for the two cases below.
(1) A → B

#   Basket
1   A
2   B
3   C
4   A, B
5   A, C
6   B, C
7   A, B, C
8

6 An Example of Market Basket(2)
Basic probabilities for (1) A → B are below:
Coverage    P(A) = 5/8 = 0.625
Support     P(A∩B) = 3/8 = 0.375
Confidence  P(B|A) = 3/5 = 0.6
Lift        P(A∩B) / (P(A)*P(B)) = 0.375/(0.625*0.625) = 0.375/0.39 = 0.96
Leverage    P(A∩B) - P(A)*P(B) = 0.375 - 0.625*0.625 = 0.375 - 0.391 = -0.016
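
As a check on the arithmetic, the same measures can be recomputed exactly from the probabilities stated on this slide. The sketch below assumes P(A) = P(B) = 5/8 and P(A∩B) = 3/8, as used in the lift calculation; the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities as stated on the slide for the rule A -> B.
p_a = Fraction(5, 8)   # coverage, P(A)
p_b = Fraction(5, 8)   # P(B); the slide uses 0.625 for both factors
p_ab = Fraction(3, 8)  # support, P(A ∩ B)

confidence = p_ab / p_a          # P(B | A)
lift = p_ab / (p_a * p_b)        # ratio to the support expected under independence
leverage = p_ab - p_a * p_b      # difference from the support expected under independence

print(float(confidence), float(lift), float(leverage))
```

Using `Fraction` avoids the intermediate rounding of 0.625*0.625 to 0.39: the lift comes out as exactly 24/25 = 0.96 and the leverage as exactly -1/64, so the rule shows a slightly negative association.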

7 Lift What are good association rules? (How to interpret them?)
If lift is close to 1, it means there is no association between two items (sets). If lift is greater than 1, it means there is a positive association between two items (sets). If lift is less than 1, it means there is a negative association between two items (sets).

8 Leverage: Leverage = P(A∩B) - P(A)*P(B); it has three cases
① Leverage > 0: the two items (sets) are positively associated
② Leverage = 0: the two items (sets) are independent
③ Leverage < 0: the two items (sets) are negatively associated
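
The three cases can be expressed as a small helper. This is a sketch only: the function name `association_type` and the tolerance `tol` (used to treat values very close to zero as independence) are assumptions made for the example.

```python
def association_type(p_a, p_b, p_ab, tol=1e-9):
    """Classify the association between A and B by the sign of the leverage."""
    leverage = p_ab - p_a * p_b
    if leverage > tol:
        return "positive"
    if leverage < -tol:
        return "negative"
    return "independent"


print(association_type(0.5, 0.5, 0.40))  # leverage = 0.15
print(association_type(0.5, 0.5, 0.25))  # leverage = 0
print(association_type(0.5, 0.5, 0.10))  # leverage = -0.15
```

When P(A) and P(B) are both nonzero, the sign of the leverage agrees with lift being above, equal to, or below 1, so the two measures always give the same direction of association.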

9 Lab on Association Rules(1)
SPSS Clementine and SAS Enterprise Miner include association rule software. This exercise, however, uses Magnum Opus. Download the Magnum Opus evaluation version (click).

10 After you install the program, you will see the initial screen below
From the menu, choose File – Import Data (Ctrl – O).

11 Demo data sets are already there
Magnum Opus has two types of data sets available: transaction data (*.idi, *.itl) and attribute-value data (*.data, *.nam). Transaction data comes in two formats:

idi (identifier-item file):
001, apples
001, oranges
001, bananas
002, apples
002, carrots
002, lettuce
002, tomatoes

itl (item list file):
apples, oranges, bananas
apples, carrots, lettuce, tomatoes
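
The relationship between the two transaction formats can be illustrated with a short conversion sketch. It assumes the comma-separated layout shown on this slide; the function name `idi_to_itl` is hypothetical and is not part of Magnum Opus, and the real file formats may allow variations not handled here.

```python
from collections import defaultdict


def idi_to_itl(idi_text):
    """Group identifier-item lines into one comma-separated line per identifier."""
    baskets = defaultdict(list)  # dicts preserve the order identifiers first appear in
    for line in idi_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        identifier, item = (part.strip() for part in line.split(",", 1))
        baskets[identifier].append(item)
    return "\n".join(", ".join(items) for items in baskets.values())


# The identifier-item example from the slide.
idi_example = """\
001, apples
001, oranges
001, bananas
002, apples
002, carrots
002, lettuce
002, tomatoes
"""

print(idi_to_itl(idi_example))
```

Run on the slide's idi example, this produces the two item-list lines shown for the itl format, one line per basket.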

12 If you open tutorial.idi in Notepad, you can see the file contents, as shown on the left.
The example on the left has 5 transactions (baskets).

13 Choose File – Import Data, or click the icon. Select Tutorial.idi
Check Identifier – item file and click Next >.

14 Click Yes and click Next > …

15 Click Next > … What percentage of the whole file do you want to use? Type 50% and click Next > …

16 Click Import Data. You will then see a screen like the one shown below left.

17 Leave the settings as they are and click GO
Search by: LIFT
Minimum lift: 1
Maximum no. of rules: 10

18 Results are saved in the tutorial.out file. The derived rules are below:
lettuce & carrots are associated with tomatoes with strength = 0.857
coverage = 0.042: 21 cases satisfy the LHS
support = 0.036: 18 cases satisfy both the LHS and the RHS
lift 3.51: the strength is 3.51 times greater than the strength if there were no association
leverage = 0.0258: the support is 0.0258 (12.9 cases) greater than if there were no association

19 lettuce & carrots → tomatoes
When lettuce and carrots are purchased, tomatoes are purchased as well.
coverage = 0.042: 21 cases satisfy the LHS
P(lettuce & carrots) = 21/500 = 0.042
support = 0.036: 18 cases satisfy both the LHS and the RHS
P((lettuce & carrots) ∩ tomatoes) = 18/500 = 0.036
strength (confidence) = 0.857
P(RHS|LHS) = 18/21 = 0.036/0.042 = 0.857

20 lift 3.51: the strength is 3.51 times greater than the strength if there were no association
That is, (18/21)/(122/500) = 3.51
leverage = 0.0258: the support is 0.0258 (12.9 cases) greater than if there were no association
P(LHS ∩ RHS) – P(LHS)*P(RHS) = 0.036 – 0.042*0.244 = 0.0258
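
The reported figures can be reproduced directly from the case counts quoted on these slides: 500 transactions in total, 21 matching the LHS, 122 matching the RHS, and 18 matching both. The function below is a sketch for checking the numbers; its name and the dictionary layout are illustrative and do not reflect how Magnum Opus computes or stores its results.

```python
def rule_metrics(n_total, n_lhs, n_rhs, n_both):
    """Compute rule measures for LHS -> RHS from raw case counts."""
    coverage = n_lhs / n_total           # P(LHS)
    support = n_both / n_total           # P(LHS ∩ RHS)
    strength = n_both / n_lhs            # P(RHS | LHS), also called confidence
    p_rhs = n_rhs / n_total              # P(RHS)
    lift = strength / p_rhs
    leverage = support - coverage * p_rhs
    return {
        "coverage": coverage,
        "support": support,
        "strength": strength,
        "lift": lift,
        "leverage": leverage,
    }


# Counts for lettuce & carrots -> tomatoes, as quoted on the slides.
m = rule_metrics(n_total=500, n_lhs=21, n_rhs=122, n_both=18)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
print(f"leverage in cases: {m['leverage'] * 500:.1f}")
```

Multiplying the leverage by the number of transactions converts it into the "cases" figure: how many more baskets contain both sides of the rule than would be expected if the two sides were independent.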

