Slide 1: Cluster analysis
Slide 2: Partition methods divide the data into disjoint clusters; hierarchical methods build a hierarchy of the observations and deduce the clusters from it.
Slide 3: K-means
Slide 4: Criteria
Slide 5: Same criteria with multivariate data
Slide 6: Justifying the criteria. ANOVA: decomposition of the variance. Univariate: SST = SSW + SSB. Multivariate: minimizing the within-cluster variance is equivalent to maximizing the between-cluster variance (the separation between clusters).
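For reference, the decomposition the slide appeals to can be written out explicitly; the scatter-matrix notation T, W, B below is my own shorthand, not taken from the slides.

```latex
% Univariate: total sum of squares splits into within- and between-cluster parts
\mathrm{SST}
  = \sum_{k=1}^{K}\sum_{i \in C_k} (x_i - \bar{x})^2
  = \underbrace{\sum_{k=1}^{K}\sum_{i \in C_k} (x_i - \bar{x}_k)^2}_{\mathrm{SSW}}
  + \underbrace{\sum_{k=1}^{K} n_k\,(\bar{x}_k - \bar{x})^2}_{\mathrm{SSB}}

% Multivariate analogue with scatter matrices: T = W + B
\mathbf{T} = \sum_{k=1}^{K}\sum_{i \in C_k} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})' = \mathbf{W} + \mathbf{B},
\qquad
\mathbf{W} = \sum_{k=1}^{K}\sum_{i \in C_k} (\mathbf{x}_i - \bar{\mathbf{x}}_k)(\mathbf{x}_i - \bar{\mathbf{x}}_k)',
\quad
\mathbf{B} = \sum_{k=1}^{K} n_k\,(\bar{\mathbf{x}}_k - \bar{\mathbf{x}})(\bar{\mathbf{x}}_k - \bar{\mathbf{x}})'
```

Because SST (and T) is fixed by the data, any partition that minimizes the within-cluster term automatically maximizes the between-cluster term, which is the equivalence the slide states.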
Slide 7: K-means algorithm
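A minimal sketch of the k-means (Lloyd) iteration the slide outlines, written with NumPy; the function name, the random initialization and the toy data are my own choices, not taken from the slides.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iteration: assign points to the nearest centroid, then recompute centroids."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct observations chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: label of the closest centroid (squared Euclidean distance).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: assignments will no longer change
        centroids = new_centroids
    return labels, centroids

# Toy usage: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
labels, centroids = kmeans(X, k=2)
```

Each iteration can only decrease the within-cluster sum of squares, which ties the algorithm to the criterion of the previous slides.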
Slide 8: Number of clusters
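The extracted text does not say which rule the slide uses for choosing K; a common heuristic is to look for an "elbow" in the within-cluster sum of squares, sketched here with scikit-learn (its `inertia_` attribute is exactly SSW).

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1, (40, 2)) for m in (0, 5, 10)])  # three toy blobs

# Within-cluster sum of squares for a range of K; the "elbow" suggests K = 3 here.
for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_, 1))
```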
Slide 9: Consequences of standardization
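Since k-means works on Euclidean distances, variables measured on large scales dominate the solution unless the data are standardized first; a small sketch with scikit-learn (the variable scales below are invented for illustration).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
# First variable on a huge scale, second on a small scale.
X = np.column_stack([rng.normal(50_000, 10_000, 100), rng.normal(0, 1, 100)])

raw_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
std_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(X)  # each variable now has mean 0 and variance 1
)
# The two partitions generally differ: without standardization the first
# variable alone drives the clustering.
```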
Slide 10: Ruspini example
Slide 15: Problems of k-means: it is very sensitive to outliers; Euclidean distances are not appropriate for elliptical clusters; and it does not give the number of clusters.
Slide 16: Hierarchical algorithms
Slide 17: Agglomerative algorithms
Slide 18: Nearest neighbour distance
Slide 19: Farthest neighbour distance
Slide 20: Average distance
Slide 21: Centroid method distance
Slide 22: Ward's method distance
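The formulas on slides 18-22 did not survive extraction; the following are the standard textbook definitions of the five between-cluster distances named there (d is the Euclidean distance between observations, A and B are clusters with n_A and n_B points and means x̄_A, x̄_B), offered as a reconstruction rather than the slides' exact notation.

```latex
% Nearest neighbour (single linkage)
d_{\min}(A,B) = \min_{a \in A,\, b \in B} d(a,b)
% Farthest neighbour (complete linkage)
d_{\max}(A,B) = \max_{a \in A,\, b \in B} d(a,b)
% Average distance
d_{\mathrm{avg}}(A,B) = \frac{1}{n_A n_B} \sum_{a \in A} \sum_{b \in B} d(a,b)
% Centroid method
d_{\mathrm{cen}}(A,B) = d(\bar{\mathbf{x}}_A, \bar{\mathbf{x}}_B)
% Ward's method: increase in the within-cluster sum of squares when A and B are merged
d_{\mathrm{Ward}}(A,B) = \frac{n_A n_B}{n_A + n_B}\, \lVert \bar{\mathbf{x}}_A - \bar{\mathbf{x}}_B \rVert^{2}
```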
Slide 23: Dendrograms
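A minimal sketch of building and drawing a dendrogram with SciPy; the linkage choice ("ward") and the toy data are assumptions, not taken from the slides.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(5, 1, (20, 2))])

Z = linkage(X, method="ward")   # agglomerative merge history (which clusters merge, at what distance)
dendrogram(Z)                   # tree of merges; cutting it at a given height yields the clusters
plt.xlabel("observation")
plt.ylabel("merge distance")
plt.show()

labels = fcluster(Z, t=2, criterion="maxclust")  # cut the tree into 2 clusters
```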
Slide 24: Example
Slide 32: Problems of hierarchical clustering: if n is large it is slow, since each step requires n(n-1)/2 pairwise comparisons; Euclidean distances are not always appropriate; and if n is large the dendrogram is difficult to interpret.
Slide 33: Clustering by variables
Slide 35: Distances between quantitative variables
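The slide's own formula is not in the extracted text; a common way to define a distance between quantitative variables is through their correlation r_jk, for example one of the following (a reconstruction of standard choices, not necessarily the one used in the deck).

```latex
% Two usual correlation-based distances between variables X_j and X_k
d_{jk} = 1 - r_{jk}^{2}
\qquad\text{or}\qquad
d_{jk} = \sqrt{\,2\,(1 - r_{jk})\,}
```

The second choice is proportional to the Euclidean distance between the standardized variables, so clustering variables with it mirrors clustering observations.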
Slide 36: Distances between qualitative variables
Slide 37: Similarity between attributes
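For binary attributes the usual similarity coefficients are the simple matching coefficient and the Jaccard coefficient; a small sketch of both (the example vectors are made up, and the slide may use other coefficients as well).

```python
import numpy as np

def simple_matching(a, b):
    """Fraction of positions where the two binary vectors agree (0-0 and 1-1 both count)."""
    a, b = np.asarray(a), np.asarray(b)
    return np.mean(a == b)

def jaccard(a, b):
    """Agreement on presences (1-1) only, ignoring joint absences (0-0 pairs)."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

a = [1, 0, 1, 1, 0, 0, 1]
b = [1, 0, 0, 1, 0, 1, 1]
print(simple_matching(a, b))  # 5/7 ~= 0.714
print(jaccard(a, b))          # 3/5 = 0.6
```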