Published by Alexandra Sparks. Modified over 9 years ago.
1
Wine Clustering Ling Lin
2
Contents ❏ Motivation ❏ Data ❏ Dimensionality Reduction: MDS, Isomap ❏ Clustering: Kmeans, Ncut, Ratio Cut, SCC ❏ Conclusion ❏ Reference
3
Motivation ❏ Clustering is a main task of exploratory data mining ❏ Market segmentation and marketing strategies ❏ Document clustering ❏ Targeting appropriate treatment to patients with similar response patterns ❏ Image segmentation ❏ Goal: apply clustering methods to a real data set
4
Data ➢ Wine data. Source of the data set: “Machine Learning Repository”, University of California, Irvine. Data sample size: 14 variables (the class label plus the 13 measurements listed below) and 178 observations in 3 classes, each a different cultivar. Variables: 1) Alcohol 2) Malic acid 3) Ash 4) Alcalinity of ash 5) Magnesium 6) Total phenols 7) Flavanoids 8) Nonflavanoid phenols 9) Proanthocyanins 10) Color intensity 11) Hue 12) OD280/OD315 of diluted wines 13) Proline
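As a rough illustration of the layout described on this slide, the sketch below splits comma-separated rows into a class label and 13 measurements. It assumes the class label comes first in each row; the function name `parse_rows` is hypothetical and the two sample rows are invented placeholders, not records from the data set.

```python
# Column names as listed on the slide (13 measurements).
FEATURES = [
    "Alcohol", "Malic acid", "Ash", "Alcalinity of ash", "Magnesium",
    "Total phenols", "Flavanoids", "Nonflavanoid phenols", "Proanthocyanins",
    "Color intensity", "Hue", "OD280/OD315 of diluted wines", "Proline",
]

def parse_rows(lines):
    """Split 'label,f1,...,f13' lines into (labels, feature rows).

    Assumption: the class label is the first field of every row.
    """
    labels, rows = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(",")
        if len(fields) != len(FEATURES) + 1:
            raise ValueError(f"expected 14 fields, got {len(fields)}")
        labels.append(int(fields[0]))
        rows.append([float(v) for v in fields[1:]])
    return labels, rows

# Placeholder rows in the assumed layout; the numbers are invented.
sample = [
    "1,13.0,2.0,2.4,16.0,100,2.5,2.5,0.3,1.5,5.0,1.0,3.0,1000",
    "2,12.0,1.5,2.2,20.0,90,2.0,2.0,0.4,1.4,3.0,1.1,2.8,500",
]
labels, rows = parse_rows(sample)
print(labels, len(rows[0]))
```

Keeping the label separate from the 13 measurements matters later: the clustering methods see only the measurements, and the label is used only to score the result.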
5
MDS Can I separate objects better? ---> Change the way the distances are computed
6
Cityblock (L1) Distance, Chebychev Distance, Cosine Distance, Mahalanobis Distance
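The four distances named on this slide can be sketched as follows, using their textbook definitions. The slides do not say which software computed them, so the function names and the toy matrix `X` are assumptions for illustration only.

```python
import numpy as np

def cityblock(x, y):
    """L1 distance: sum of absolute coordinate differences."""
    return float(np.sum(np.abs(x - y)))

def chebychev(x, y):
    """Largest absolute coordinate difference."""
    return float(np.max(np.abs(x - y)))

def cosine_distance(x, y):
    """One minus the cosine of the angle between x and y."""
    return float(1.0 - np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def mahalanobis(x, y, cov_inv):
    """Distance that rescales differences by the inverse covariance matrix."""
    d = x - y
    return float(np.sqrt(d @ cov_inv @ d))

# Toy observations (rows) with two variables (columns); values are invented.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 1.0], [5.0, 3.0]])
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))  # inverse sample covariance

x, y = X[0], X[1]
print(cityblock(x, y), chebychev(x, y))
print(cosine_distance(x, y), mahalanobis(x, y, cov_inv))
```

Cosine distance ignores vector length, and Mahalanobis distance removes the effect of differing variable scales and correlations, which is one reason the two can produce quite different embeddings of the same data.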
7
Distances
9
[Figure: diagram labeled a, b, c, and angle θ]
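The labels a, b, c, and θ on this slide look like the setup for the cosine of an angle. Assuming that reading, with a and b the lengths of two observation vectors x and y, c the length of their difference, and θ the angle between them, the law of cosines and the cosine distance are related as follows:

```latex
c^2 = a^2 + b^2 - 2ab\cos\theta,
\qquad
d_{\cos}(\mathbf{x}, \mathbf{y}) = 1 - \cos\theta
= 1 - \frac{\mathbf{x}^\top \mathbf{y}}{\lVert \mathbf{x} \rVert \, \lVert \mathbf{y} \rVert}
```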
10
MDS in 3D
11
MDS in 2D
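The 3D and 2D embeddings can be illustrated with classical (Torgerson) MDS, which works from any symmetric distance matrix. This is a minimal sketch; the slides do not state which MDS variant was used, and the function name `classical_mds` and the toy points are assumptions.

```python
import numpy as np

def classical_mds(D, dim=2):
    """Embed points in `dim` dimensions from a symmetric distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)            # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:dim]        # indices of the `dim` largest
    scale = np.sqrt(np.clip(vals[idx], 0.0, None))  # drop tiny negative values
    return vecs[:, idx] * scale

# Toy example: four corners of a unit square (invented data).
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)

Y2 = classical_mds(D, dim=2)   # 2D embedding
Y3 = classical_mds(D, dim=3)   # 3D embedding
D2 = np.linalg.norm(Y2[:, None, :] - Y2[None, :, :], axis=-1)
print(np.allclose(D2, D))
```

Passing a cosine or Mahalanobis distance matrix as `D` in place of the Euclidean one is all it takes to change the notion of distance, which is the comparison the slides make.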
12
Isomap: Cosine, Mahalanobis
13
Isomap: Cosine, Mahalanobis
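Isomap can be sketched in three steps: build a nearest-neighbor graph from the chosen distances, replace each distance with the shortest path through that graph, and run classical MDS on the result. The version below is a simplified illustration, not the implementation behind the slides; the neighbor count and the toy arc data are assumptions.

```python
import numpy as np

def isomap(D, n_neighbors=5, dim=2):
    """Isomap embedding from a symmetric distance matrix D."""
    n = D.shape[0]
    # Step 1: symmetric k-nearest-neighbor graph (inf means "no edge").
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        order = np.argsort(D[i])
        nbrs = order[order != i][:n_neighbors]   # nearest points, excluding i
        G[i, nbrs] = D[i, nbrs]
        G[nbrs, i] = D[nbrs, i]
    # Step 2: all-pairs shortest paths (Floyd-Warshall).
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    if np.isinf(G).any():
        raise ValueError("neighbor graph is disconnected; increase n_neighbors")
    # Step 3: classical MDS on the graph (geodesic) distances.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

# Toy example: 20 points along a half circle, unrolled to one dimension.
t = np.linspace(0.0, np.pi, 20)
P = np.column_stack([np.cos(t), np.sin(t)])
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
Y = isomap(D, n_neighbors=2, dim=1)
print(Y.shape)
```

Because only the input matrix `D` changes, the same function could take cosine or Mahalanobis distances, as in the two panels named on the slide.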
14
Kmeans Clustering: error rate = 0.03
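The slide does not say how the error rate was computed. One common convention, assumed here, is the fraction of observations assigned to the wrong class under the best one-to-one matching between cluster indices and true classes; on that reading, 0.03 would correspond to roughly five of the 178 wines. The sketch below pairs a plain Lloyd-style Kmeans with that score, using invented toy points and hypothetical function names.

```python
import random
from itertools import permutations

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd iterations; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # k data points as seeds
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def clustering_error_rate(true_labels, cluster_labels):
    """Smallest misassignment rate over all one-to-one relabelings."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    if len(clusters) > len(classes):
        raise ValueError("more clusters than classes")
    n = len(true_labels)
    best = n
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        errors = sum(mapping[c] != t for t, c in zip(true_labels, cluster_labels))
        best = min(best, errors)
    return best / n

# Toy data: two well-separated groups (invented values).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
true = [1, 1, 1, 2, 2, 2]
pred = kmeans(points, 2)
print(clustering_error_rate(true, pred))
```

Trying every relabeling is practical for three classes (six permutations); for many classes an assignment solver would be the usual replacement.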
15
Clustering Comparison: True Labels, Kmeans Clustering, Normalized Cut, Ratio Cut, SCC
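For the two graph-cut methods in this comparison, a minimal two-way spectral sketch is given below. It uses the standard relaxation: split the graph by the sign of the second-smallest eigenvector of the graph Laplacian (unnormalized for Ratio Cut, degree-normalized for Normalized Cut). The slides do not describe how the affinity graph was built or which implementation was used, and the wine data has three clusters rather than two, so the function name, the zero threshold, and the toy graph are all assumptions.

```python
import numpy as np

def spectral_bipartition(W, normalized=True):
    """Two-way spectral cut of a graph with symmetric weight matrix W."""
    d = W.sum(axis=1)                 # node degrees
    L = np.diag(d) - W                # unnormalized graph Laplacian
    if normalized:
        # Normalized Cut relaxation: solve L v = lambda D v via the
        # symmetric normalized Laplacian D^{-1/2} L D^{-1/2}.
        d_inv_sqrt = 1.0 / np.sqrt(d)
        L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
        vals, vecs = np.linalg.eigh(L_sym)
        v = d_inv_sqrt * vecs[:, 1]
    else:
        # Ratio Cut relaxation: second-smallest eigenvector of L itself.
        vals, vecs = np.linalg.eigh(L)
        v = vecs[:, 1]
    return (v > 0).astype(int)        # split by sign (a simplification)

# Toy graph: two triangles joined by one weak edge (invented weights).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

ncut_labels = spectral_bipartition(W, normalized=True)
rcut_labels = spectral_bipartition(W, normalized=False)
print(ncut_labels, rcut_labels)
```

For more than two clusters, the usual extension keeps the first k eigenvectors and runs Kmeans on their rows.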
16
Conclusion Dimensionality Reduction: different methods for calculating distances and reducing dimension ---> Wine data: 3D MDS: V = Cosine Distance, X = Mahalanobis; 2D MDS: V = Cosine Distance, X = Mahalanobis. Isomap makes the Mahalanobis distance give a better display.
17
Conclusion Clustering: Kmeans = Rcut → SCC → Ncut. Ncut and Rcut consider both inter- and intra-cluster connections. However, in this dataset, the intra-cluster connections are weak.
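For reference, the textbook definitions of the two cut objectives are given below. The notation is an assumption, since the slides do not define it: w_ij is the affinity between observations i and j, d_i is the degree of node i, and A_1, ..., A_k are the clusters (some texts add a factor of 1/2).

```latex
\operatorname{cut}(A, \bar{A}) = \sum_{i \in A,\; j \notin A} w_{ij},
\qquad
\operatorname{vol}(A) = \sum_{i \in A} d_i = \sum_{i \in A} \sum_{j} w_{ij}

\operatorname{Rcut}(A_1, \dots, A_k) = \sum_{m=1}^{k} \frac{\operatorname{cut}(A_m, \bar{A}_m)}{\lvert A_m \rvert},
\qquad
\operatorname{Ncut}(A_1, \dots, A_k) = \sum_{m=1}^{k} \frac{\operatorname{cut}(A_m, \bar{A}_m)}{\operatorname{vol}(A_m)}
```

The numerators measure connections between clusters. The volume in the Ncut denominator also counts the edges inside a cluster, so weak within-cluster affinities shrink the denominator and change which partition the objective prefers.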