CURE: Clustering Using Representatives

1 CURE: Clustering Using Representatives. Handles outliers well. Hierarchical, with partitioning. Uses many points to represent a cluster instead of only one. First, a constant number of points, c, is chosen from each cluster; these points should be well scattered. The well-scattered points are then shrunk toward the cluster's centroid by applying a shrinkage factor alpha (α). CURE then uses the hierarchical algorithm. Designed for limited main memory.
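
The choose-and-shrink step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the slide author's implementation: the farthest-point heuristic used to pick "well scattered" points, the helper names (`centroid`, `scattered_points`, `shrink`), and the sample cluster are all assumptions made for the example.

```python
import math

def centroid(points):
    """Mean of the points, coordinate by coordinate."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def scattered_points(points, c):
    """Pick up to c well-scattered points with a farthest-point heuristic (assumed)."""
    center = centroid(points)
    # Start with the point farthest from the centroid.
    chosen = [max(points, key=lambda p: math.dist(p, center))]
    while len(chosen) < min(c, len(points)):
        remaining = [p for p in points if p not in chosen]
        # Next, take the point whose nearest chosen point is farthest away.
        chosen.append(max(remaining,
                          key=lambda p: min(math.dist(p, q) for q in chosen)))
    return chosen

def shrink(reps, center, alpha):
    """Move each representative a fraction alpha of the way toward the centroid."""
    return [tuple(r[d] + alpha * (center[d] - r[d]) for d in range(len(r)))
            for r in reps]

# Hypothetical cluster: four corners of a square plus its middle point.
cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0), (2.0, 2.0)]
center = centroid(cluster)
reps = shrink(scattered_points(cluster, c=4), center, alpha=0.5)
print(sorted(reps))  # [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0)]
```

With α = 0 the representatives stay where they were chosen, and with α = 1 they all collapse onto the centroid, so α controls how strongly outlying points are pulled in.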

2 CURE Approach

3 CURE Algorithm

4 CURE for Large Databases

5 Comparison of Clustering Techniques

6 Association rules are used to show the relationships between data items. Applications: retail stores, marketing, advertisement, floor placement, inventory control.

7 Example: Market Basket Data. Items frequently purchased together: Bread ⇒ PeanutButter. Uses: placement, advertising, sales, coupons. Objective: increase sales and reduce costs.

8 Association Rule Definitions. Set of items: I = {I_1, I_2, …, I_m}. Transactions: D = {t_1, t_2, …, t_n}, t_j ⊆ I. Itemset: {I_i1, I_i2, …, I_ik} ⊆ I. Support of an itemset: percentage of transactions that contain that itemset. Large (frequent) itemset: itemset whose number of occurrences is above a threshold.

9 Association Rules Example. I = {Beer, Bread, Jelly, Milk, PeanutButter}. Support of {Bread, PeanutButter} is 60%.
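
The transaction table for this slide did not survive in the transcript. As a rough sketch of how the 60% figure arises, the Python below computes itemset support over five transactions; these transactions are an assumption, chosen only so that they use the slide's item set and reproduce the stated support. The `support` helper is likewise a name made up for the example.

```python
# Assumed transactions (not from the slide), over the slide's item set I.
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

bread_pb = support({"Bread", "PeanutButter"}, transactions)
print(f"{bread_pb:.0%}")  # 60%
```

Three of the five assumed transactions contain both Bread and PeanutButter, which gives 3/5 = 60%.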

10 Association Rule Definitions. Given a set of items I = {I_1, I_2, …, I_m} and a database of transactions D = {t_1, t_2, …, t_n}, where t_i = {I_i1, I_i2, …, I_ik} and I_ij ∈ I. Association rule (AR): an implication X ⇒ Y where X, Y ⊆ I and X ∩ Y = ∅. Support of AR (s) X ⇒ Y: percentage of transactions that contain X ∪ Y. Confidence of AR (α) X ⇒ Y: ratio of the number of transactions that contain X ∪ Y to the number that contain X.
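
The two rule measures defined on this slide translate directly into counting. The sketch below is illustrative only: the five transactions are assumed (the deck's own table is not in the transcript), and `count`, `rule_support`, and `rule_confidence` are hypothetical helper names.

```python
# Assumed transactions, used only to exercise the definitions.
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]

def count(itemset, transactions):
    """Number of transactions containing every item in itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

def rule_support(x, y, transactions):
    """Support of X => Y: share of transactions containing X union Y."""
    return count(set(x) | set(y), transactions) / len(transactions)

def rule_confidence(x, y, transactions):
    """Confidence of X => Y: count(X union Y) / count(X).

    Assumes X occurs in at least one transaction.
    """
    return count(set(x) | set(y), transactions) / count(x, transactions)

print(rule_support({"Bread"}, {"PeanutButter"}, transactions))     # 0.6
print(rule_confidence({"Bread"}, {"PeanutButter"}, transactions))  # 0.75
print(rule_confidence({"PeanutButter"}, {"Bread"}, transactions))  # 1.0
```

Support is symmetric in X and Y, while confidence is not: on this assumed data, every PeanutButter purchase includes Bread, but only three of the four Bread purchases include PeanutButter.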

11 Association Rules Ex (cont’d)

12 Association Rule Problem. Given a set of items I = {I_1, I_2, …, I_m} and a database of transactions D = {t_1, t_2, …, t_n}, where t_i = {I_i1, I_i2, …, I_ik} and I_ij ∈ I, the association rule problem is to identify all association rules X ⇒ Y with a minimum support and confidence. Link Analysis. NOTE: the support of X ⇒ Y is the same as the support of X ∪ Y.

13 Large Itemsets. Finding association rules takes two steps: 1. Find the large (frequent) itemsets. 2. Generate rules from the frequent itemsets. An itemset is any subset of the set of all items.
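
The two steps above can be sketched with a brute-force search that tries every subset of the items. This is a minimal illustration under stated assumptions, not the algorithm from the deck's later slides: the transactions, the thresholds (30% support, 80% confidence), and the function names `large_itemsets` and `rules` are all made up for the example.

```python
from itertools import combinations

# Assumed transactions and thresholds, for illustration only.
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]

def large_itemsets(transactions, min_support):
    """Step 1: count every non-empty subset of the items, keep the large ones."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    large = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            itemset = frozenset(combo)
            hits = sum(1 for t in transactions if itemset <= t)
            if hits / n >= min_support:
                large[itemset] = hits
    return large

def rules(large, min_confidence):
    """Step 2: split each large itemset into X => Y and keep confident rules."""
    found = []
    for itemset, hits in large.items():
        for k in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), k):
                x = frozenset(combo)
                # x is a subset of a large itemset, so it is in `large` too.
                conf = hits / large[x]
                if conf >= min_confidence:
                    found.append((set(x), set(itemset - x), conf))
    return found

large = large_itemsets(transactions, min_support=0.3)
result = rules(large, min_confidence=0.8)
print(result)  # [({'PeanutButter'}, {'Bread'}, 1.0)]
```

Trying every subset is exponential in the number of items, so this only works for tiny examples; it is meant to show the two-step structure, not an efficient search.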

14 Algorithm to Generate ARs

15 Apriori Large Itemset Property: Any subset of a large itemset is large. Contrapositive: If an itemset is not large, none of its supersets are large.
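
The contrapositive is what makes a level-wise search practical: a candidate of size k can be discarded without counting if any of its (k-1)-subsets is not large. The sketch below shows one way to use the property. It is a simplified illustration, with assumed transactions, an assumed 30% support threshold, and a hypothetical `apriori` function name, and it should not be read as the exact procedure from the deck.

```python
from itertools import combinations

# Assumed transactions, for illustration only.
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]

def apriori(transactions, min_support):
    """Level-wise search for large itemsets using the large itemset property."""
    n = len(transactions)

    def is_large(candidate):
        hits = sum(1 for t in transactions if candidate <= t)
        return hits / n >= min_support

    items = set().union(*transactions)
    level = {frozenset([i]) for i in items if is_large(frozenset([i]))}
    all_large = set(level)
    k = 2
    while level:
        # Join: combine large (k-1)-itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop any candidate with a (k-1)-subset that is not large.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
        # Count only the surviving candidates against the database.
        level = {c for c in candidates if is_large(c)}
        all_large |= level
        k += 1
    return all_large

found = apriori(transactions, min_support=0.3)
print(len(found))  # 5
```

On this assumed data Jelly is not large on its own, so no itemset containing Jelly is ever generated as a candidate, which is the property at work.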

16 Large Itemset Property

