Wine Clustering Ling Lin. Contents ❏ Motivation ❏ Data ❏ Dimensionality Reduction-MDS, Isomap ❏ Clustering-Kmeans, Ncut, Ratio Cut, SCC ❏ Conclusion.


1 Wine Clustering Ling Lin

2 Contents ❏ Motivation ❏ Data ❏ Dimensionality Reduction-MDS, Isomap ❏ Clustering-Kmeans, Ncut, Ratio Cut, SCC ❏ Conclusion ❏ Reference

3 Motivation Clustering is a main task of exploratory data mining. Applications: market segmentation and marketing strategies; document clustering; targeting appropriate treatment to patients with similar response patterns; image segmentation. Goal: apply clustering methods to a real data set.

4 Data ➢ Wine data. Source of the data set: “Machine Learning Repository”, University of California, Irvine. Data sample size: 178 observations in 3 classes (different cultivars), 14 columns (the class label plus the 13 variables below). Variables: 1) Alcohol 2) Malic acid 3) Ash 4) Alcalinity of ash 5) Magnesium 6) Total phenols 7) Flavanoids 8) Nonflavanoid phenols 9) Proanthocyanins 10) Color intensity 11) Hue 12) OD280/OD315 of diluted wines 13) Proline
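To make the column layout concrete, here is a minimal sketch of parsing one row, assuming the comma-separated layout of the UCI file with the class label first and the 13 variables after it. The function name `parse_wine_row` is hypothetical, and the sample row in the usage note is for illustration only.

```python
def parse_wine_row(line):
    """Split one comma-separated row into (class label, 13 feature values).

    Assumes the class label is the first field, followed by the
    13 variables in the order listed on the slide.
    """
    fields = line.strip().split(",")
    if len(fields) != 14:
        raise ValueError("expected 14 fields, got %d" % len(fields))
    label = int(fields[0])
    features = [float(v) for v in fields[1:]]
    return label, features
```

For example, `parse_wine_row("1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065")` returns the label `1` and a list of 13 floats.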

5 MDS Can I separate objects better? ---> Change the way the distances are computed

6 Cityblock (L1) Distance, Chebyshev Distance, Cosine Distance, Mahalanobis Distance
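The four distances named on this slide can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the code behind the slides: the function names are hypothetical, vectors are plain sequences of numbers, and the Mahalanobis version expects the inverse covariance matrix to be supplied by the caller as nested lists.

```python
import math

def cityblock(x, y):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    """L-infinity distance: largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))

def cosine_distance(x, y):
    """One minus the cosine of the angle between x and y (both nonzero)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)

def mahalanobis(x, y, inv_cov):
    """sqrt((x - y)^T S^-1 (x - y)), where inv_cov is S^-1 as nested lists."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    tmp = [sum(inv_cov[i][j] * d[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(d[i] * tmp[i] for i in range(n)))
```

With the identity matrix as `inv_cov`, `mahalanobis` reduces to the ordinary Euclidean distance, which is a quick sanity check.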

7 Distances

8

9 [Figure: diagram with labels a, b, c and angle θ]

10 MDS in 3D

11 MDS in 2D
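For readers who want to see how an MDS embedding is obtained from a distance matrix, here is a minimal sketch of classical (Torgerson) MDS in plain Python. It is an assumption that the slides used this variant; the function name `classical_mds` is hypothetical. The eigenvectors are found by power iteration with deflation, which assumes the largest eigenvalues of the centered matrix are positive and distinct enough to converge; that holds for Euclidean-like distances but is not guaranteed for every distance.

```python
import math

def classical_mds(D, k=2, iters=200):
    """Embed n points in k dimensions from an n x n distance matrix D."""
    n = len(D)
    # Double-center the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    B = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * k for _ in range(n)]
    for dim in range(k):
        # Power iteration for the largest remaining eigenpair of B.
        v = [1.0 + 0.1 * i for i in range(n)]  # fixed, non-constant start
        for _ in range(iters):
            w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        Bv = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * Bv[i] for i in range(n))  # Rayleigh quotient
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][dim] = scale * v[i]
        # Deflate: remove the component just found before the next pass.
        B = [[B[i][j] - lam * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return coords
```

Calling `classical_mds(D, k=3)` or `classical_mds(D, k=2)` gives the 3D and 2D cases; the result is only determined up to rotation and reflection.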

12 Isomap Cosine Mahalanobis

13 Isomap Cosine Mahalanobis
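Isomap differs from plain MDS in that it replaces the input distances with shortest-path (geodesic) distances over a nearest-neighbor graph before embedding. The sketch below shows only that step, assuming a k-nearest-neighbor graph and the Floyd-Warshall algorithm; the function name `geodesic_distances` and the parameter `n_neighbors` are hypothetical, and the slides do not state which neighborhood rule was used.

```python
import math

def geodesic_distances(D, n_neighbors):
    """Shortest-path distances over a symmetric k-nearest-neighbor graph.

    D is an n x n matrix of pairwise distances (cosine, Mahalanobis, ...).
    Unreachable pairs keep the value math.inf.
    """
    n = len(D)
    G = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: D[i][j])[:n_neighbors]
        for j in nearest:
            # Keep the edge in both directions so the graph is undirected.
            G[i][j] = D[i][j]
            G[j][i] = D[i][j]
    # Floyd-Warshall: relax every pair through every intermediate node.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if G[i][m] + G[m][j] < G[i][j]:
                    G[i][j] = G[i][m] + G[m][j]
    return G
```

The resulting matrix would then be passed to an MDS step in place of the original distances.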

14 Kmeans Clustering Error rate = 0.03
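The slide does not say how the error rate was computed. One common convention, assumed here, is to match cluster indices to true classes by the best permutation and report the fraction of points that still disagree. The sketch below pairs that with a plain Lloyd-style k-means; both function names are hypothetical, the initial centers are supplied by the caller, and labels are assumed to be integers 0..k-1.

```python
from itertools import permutations

def kmeans(points, centers, iters=100):
    """Lloyd's algorithm from caller-supplied initial centers."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(len(centers)),
                key=lambda c: sum((p - q) ** 2
                                  for p, q in zip(x, centers[c])))
            for x in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(len(centers)):
            members = [x for x, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members)
                              for col in zip(*members)]
    return labels, centers

def error_rate(true_labels, cluster_labels, k):
    """Misclassification rate under the best cluster-to-class matching."""
    best_hits = 0
    for perm in permutations(range(k)):
        hits = sum(1 for t, c in zip(true_labels, cluster_labels)
                   if perm[c] == t)
        best_hits = max(best_hits, hits)
    return 1.0 - best_hits / len(true_labels)
```

Trying every permutation is affordable for the 3 wine classes (6 permutations) but grows factorially with k.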

15 Clustering Comparison: True Labels, Kmeans Clustering, Normalized Cut, Ratio Cut, SCC

16 Conclusion Dimensionality Reduction: different methods for calculating distances and reducing dimension ---> Wine data (columns V, X): 3D MDS: Cosine Distance, Mahalanobis; 2D MDS: Cosine Distance, Mahalanobis. Isomap makes Mahalanobis distance a better display.

17 Conclusion Clustering: Kmeans = Rcut → SCC → Ncut. Ncut and Rcut consider both inter-cluster and intra-cluster connections. However, in this dataset, the intra-cluster connections are weak.
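To make the two objectives concrete, the sketch below evaluates Ratio Cut and Normalized Cut for a given partition of a weighted graph, using the standard definitions: each cluster contributes its cut weight divided by its size (Rcut) or by its volume, the sum of its node degrees (Ncut). The function name `cut_values` is hypothetical, some references include an extra factor of 1/2, and every cluster is assumed to have nonzero volume.

```python
def cut_values(W, labels):
    """Return (ratio cut, normalized cut) of a partition.

    W is a symmetric n x n weight (similarity) matrix with zero diagonal;
    labels[i] is the cluster of node i.
    """
    n = len(W)
    rcut = 0.0
    ncut = 0.0
    for c in sorted(set(labels)):
        inside = [i for i in range(n) if labels[i] == c]
        outside = [i for i in range(n) if labels[i] != c]
        # Total weight of edges leaving the cluster (inter-cluster links).
        cut = sum(W[i][j] for i in inside for j in outside)
        # Volume: sum of degrees of the cluster's nodes.
        vol = sum(W[i][j] for i in inside for j in range(n))
        rcut += cut / len(inside)
        ncut += cut / vol
    return rcut, ncut
```

Because the volume includes within-cluster weights, Ncut penalizes clusters whose internal connections are weak relative to the edges that are cut, while Rcut only balances cluster sizes.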

