Estimating the Number of Clusters (k). Selection approaches: use a criterion to select among the solutions for several values of k (k-means or GMMs are used)


1 Estimating the Number of Clusters (k)
Selection approaches: use a criterion to select among the solutions for several values of k (k-means or GMMs are used).
Criterion(k): Training Objective(k) + Model Complexity(k)
Model complexity:
- Bayesian arguments (BIC): L(k) - M(k) ln N
- Information theory (MDL, MML)
- Marginal likelihood (Bayesian GMMs)
Gap statistic (Tibshirani et al., 2001)
Stability-based methods
Incremental methods are convenient
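As an illustration of criterion-based selection, the sketch below runs k-means for several values of k on 1-D data and scores each solution with the slide's BIC-style expression. It reads L(k) as the log-likelihood, M(k) as the number of free parameters and N as the number of points; that reading is an assumption, as are the shared-variance Gaussian likelihood, the quantile-based initialization and all function names. The `penalty_weight` argument is hypothetical: 1.0 reproduces the expression as written on the slide, while 0.5 gives the usual Schwarz form of BIC.

```python
import math


def assign(xs, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2) for x in xs]


def kmeans_1d(xs, k, iters=100):
    """Lloyd's algorithm on 1-D data with a deterministic quantile-based start."""
    s = sorted(xs)
    n = len(s)
    centers = [s[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        labels = assign(xs, centers)
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == j]
            # An empty cluster keeps its previous center.
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(xs, centers)


def criterion(xs, k, penalty_weight=1.0):
    """L(k) - penalty_weight * M(k) * ln N for a k-means solution (assumed model)."""
    centers, labels = kmeans_1d(xs, k)
    n = len(xs)
    sse = sum((x - centers[lab]) ** 2 for x, lab in zip(xs, labels))
    var = max(sse / n, 1e-12)  # shared variance, maximum-likelihood estimate
    counts = [labels.count(j) for j in range(k)]
    # Hard-assignment log-likelihood: mixing proportions plus Gaussian density terms.
    loglik = sum(c * math.log(c / n) for c in counts if c > 0)
    loglik += -0.5 * n * math.log(2 * math.pi * var) - sse / (2 * var)
    m = 2 * k  # k means + (k - 1) mixing weights + 1 shared variance
    return loglik - penalty_weight * m * math.log(n)


def select_k(xs, candidate_ks, penalty_weight=1.0):
    """Return the candidate k with the highest criterion value."""
    return max(candidate_ks, key=lambda k: criterion(xs, k, penalty_weight))


# Toy data (assumed): three evenly filled groups centered at 0, 10 and 20.
data = [c + (i - 9.5) / 5 for c in (0.0, 10.0, 20.0) for i in range(20)]
print(select_k(data, range(1, 7)))
```

With scores defined as "larger is better", the complexity term is subtracted; a criterion written as a cost to minimize would flip both signs.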

2 Estimating the Number of Clusters (k)
Optimal solutions with respect to clustering error do not always reveal the true clustering structure.
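A small toy case may help make this point concrete. The sketch below builds 1-D data from one large, wide group and two small, tight groups (all values are assumptions chosen for illustration), then finds the 3-cluster partition with the minimum sum of squared errors by exhaustive search. In 1-D the clusters of a minimum-error solution are contiguous intervals of the sorted data, so trying every pair of cut positions is exhaustive. The function names are hypothetical.

```python
from itertools import combinations


def partition_sse(sorted_xs, bounds):
    """Sum of squared errors of the contiguous segments delimited by bounds."""
    total = 0.0
    for a, b in zip(bounds, bounds[1:]):
        segment = sorted_xs[a:b]
        center = sum(segment) / len(segment)
        total += sum((x - center) ** 2 for x in segment)
    return total


def min_error_partition(xs, k):
    """Bounds of the contiguous k-partition of the sorted data with minimum SSE."""
    s = sorted(xs)
    candidates = ((0,) + cuts + (len(s),)
                  for cuts in combinations(range(1, len(s)), k - 1))
    return min(candidates, key=lambda bounds: partition_sse(s, bounds))


# Assumed generating structure: one wide group and two small tight groups.
wide = [0.1 * i for i in range(100)]          # 100 points on [0, 9.9]
small_1 = [12.0 + 0.05 * i for i in range(5)]  # 5 points near 12
small_2 = [13.0 + 0.05 * i for i in range(5)]  # 5 points near 13
data = wide + small_1 + small_2

generating_bounds = (0, 100, 105, 110)
best_bounds = min_error_partition(data, 3)

print("generating partition SSE:", partition_sse(sorted(data), generating_bounds))
print("minimum-error bounds:", best_bounds)
print("minimum-error SSE:", partition_sse(sorted(data), best_bounds))
```

In this example the minimum-error partition cuts through the wide group rather than separating the two small ones, so a lower clustering error does not recover the groups the data was generated from.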

3 Estimating the Number of Clusters (k)
Dynamic approaches (add or remove components on-line)
Bottom-up (GMM): starting from many initial components, gradually eliminate redundant ones.
Methods:
- MML-GMM (Law & Figueiredo, IEEE TPAMI 2002)
- Bayesian GMM (Corduneanu & Bishop, AISTATS 2001)
- Newtonian Clustering (Blekas & Lagaris, PR 2007)

4 Bayesian GMM (C-B)
C-B method: results depend on
- the number of initial components
- the initialization of components
- the specification of the scale matrix V of the Wishart prior p(T)

5 Bayesian GMM (C-B)

6 Estimating the Number of Clusters (k)
Top-down (incremental):
- Start from one component
- Iteratively add components (usually through splitting)
- Stop when no component can be split further
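The top-down loop described on this slide can be sketched as follows for 1-D data. This is a minimal outline under stated assumptions: every candidate split is a 2-means split seeded at the cluster's minimum and maximum, and the decision rule is passed in as a function. The `gap_test` rule used here (split only if the two halves are separated by an empty gap wider than `min_gap`) is a deliberately crude, hypothetical stand-in for the statistical tests used by published methods; all names are illustrative.

```python
def two_means_1d(xs, iters=50):
    """Split 1-D points into two groups with Lloyd's algorithm, seeded at min and max."""
    lo, hi = min(xs), max(xs)
    left, right = list(xs), []
    for _ in range(iters):
        left = [x for x in xs if abs(x - lo) <= abs(x - hi)]
        right = [x for x in xs if abs(x - lo) > abs(x - hi)]
        if not left or not right:
            break
        new_lo, new_hi = sum(left) / len(left), sum(right) / len(right)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return left, right


def incremental_clustering(xs, should_split):
    """Start from one component and keep splitting until no split is accepted."""
    clusters = [list(xs)]
    changed = True
    while changed:
        changed = False
        next_clusters = []
        for cluster in clusters:
            left, right = two_means_1d(cluster)
            if left and right and should_split(cluster, left, right):
                next_clusters += [left, right]
                changed = True
            else:
                next_clusters.append(cluster)
        clusters = next_clusters
    return clusters


def gap_test(cluster, left, right, min_gap=3.0):
    """Hypothetical split rule: accept only if an empty gap separates the halves."""
    return min(right) - max(left) > min_gap


# Toy data (assumed): three evenly filled groups centered at 0, 10 and 30.
data = [c + (i - 9.5) / 5 for c in (0.0, 10.0, 30.0) for i in range(20)]
for cluster in incremental_clustering(data, gap_test):
    print(len(cluster), min(cluster), max(cluster))
```

Keeping the split rule as a parameter means the same loop can host different acceptance tests without changing the control flow.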

7 Estimating the Number of Clusters (k)
Top-down (incremental) methods:
- X-means (BIC criterion for 2 clusters) (Pelleg & Moore, ICML 2000)
- G-means (1d test for Gaussianity, PCA-based projection) (Hamerly & Elkan, NIPS 2003)
- PG-means (1d tests for GMM fitness, random projections) (Feng & Hamerly, NIPS 2006)
- VBGMM (Incremental Variational Bayes) (Constantinopoulos & Likas, IEEE TNN 2007)
- Dip-means (test for unimodality) (Kalogeratos & Likas, NIPS 2012)
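To give a feel for what a "1d test for Gaussianity" can look like as a split rule, here is a minimal sketch of an Anderson-Darling normality check on 1-D values. It is not the procedure of any of the cited papers: it skips the projection step entirely, and the small-sample correction factor and the default critical value (0.752, a commonly tabulated 5% level for the case where mean and variance are estimated from the data) are assumptions that should be checked against a statistics reference before use. Function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(xs):
    """Corrected Anderson-Darling statistic against a fitted normal distribution."""
    n = len(xs)
    mu, sigma = mean(xs), stdev(xs)
    std_normal = NormalDist()
    # Normal CDF of the standardized, sorted sample, clipped away from 0 and 1.
    u = [std_normal.cdf((x - mu) / sigma) for x in sorted(xs)]
    u = [min(max(v, 1e-12), 1 - 1e-12) for v in u]
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1 - u[n - 1 - i]))
            for i in range(n))
    a2 = -n - s / n
    # Assumed correction for estimated mean and variance.
    return a2 * (1 + 0.75 / n + 2.25 / n ** 2)


def looks_gaussian(xs, critical=0.752):
    """True if the test does not reject normality at the assumed critical value."""
    return anderson_darling_normal(xs) <= critical


# Deterministic toy samples (assumed): normal quantiles, and two shifted copies.
quantiles = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
unimodal = [NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
bimodal = [q - 5 for q in quantiles] + [q + 5 for q in quantiles]

print(looks_gaussian(unimodal), looks_gaussian(bimodal))
```

In an incremental scheme, a cluster that fails such a check would be a candidate for splitting, while one that passes would be left as a single component.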

