Estimating the Number of Clusters (k)

Clustering error cannot be used as a criterion for deciding on the number of clusters: it decreases monotonically as k grows, so it always favors more clusters.

Selection approach: use a criterion to select among the solutions obtained for several values of k (k-means or GMMs are used to produce the candidate solutions):

Criterion(k) = Training Objective(k) + Model Complexity(k)

Model complexity / selection criteria:
- Bayesian arguments (BIC): BIC(k) = L(k) - (M(k)/2) ln N, where L(k) is the log-likelihood of the k-cluster solution, M(k) its number of free parameters, and N the number of data points (see the sketch below)
- Information theory (MDL, MML)
- Variance ratio criterion (VRC) (available in MATLAB)
- Davies-Bouldin criterion (available in MATLAB)
- Silhouette criterion (available in MATLAB)
- Gap statistic (available in MATLAB)
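As a concrete illustration, here is a minimal sketch of BIC-based selection of k using scikit-learn's GaussianMixture; the synthetic dataset and the candidate range of k are assumptions made for this example. Note that sklearn's bic() returns -2·log-likelihood + M(k)·ln N, so lower is better (the slide's form L(k) - (M(k)/2) ln N would instead be maximized):

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic data with 4 well-separated clusters (an assumption for the example).
X, _ = make_blobs(n_samples=500, centers=4, cluster_std=0.8, random_state=0)

# Fit a GMM for each candidate k and record the criterion value.
# sklearn's bic() uses the -2*log-likelihood + M(k)*ln(N) convention,
# so the selected k MINIMIZES the score.
bic_scores = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
              for k in range(1, 10)}
best_k = min(bic_scores, key=bic_scores.get)
print("selected k =", best_k)
```

The same loop works for any of the criteria listed above by swapping in the corresponding scoring function.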

Estimating the Number of Clusters (k)

Optimal solutions with respect to clustering error do not always reveal the true clustering structure; an illustration follows below.
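For instance, the following minimal sketch (the dataset, the anisotropic transformation, and the scoring are assumptions chosen to illustrate the point) stretches two well-separated blobs into elongated clusters; the minimum-SSE partition found by k-means can then cut across the true groups even though k is correct:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Two well-separated blobs, stretched by a linear map into elongated clusters.
X, y_true = make_blobs(n_samples=400, centers=2, random_state=1)
X = X @ np.array([[0.6, -0.8], [-0.4, 0.9]])  # anisotropic transformation (assumed)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
# Agreement with the true labels; a low score indicates that the
# minimum-error solution split the data along the "wrong" direction.
print("adjusted Rand index:", adjusted_rand_score(y_true, labels))
```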

Estimating the Number of Clusters (k)

Top-down (incremental) approach:
- Start from one component.
- Iteratively add components (usually through splitting an existing one).
- Stop when no component can be split any further based on a criterion (i.e., one cluster is preferable over two clusters); a generic sketch of this loop is shown below.
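A minimal sketch of this generic top-down loop. The prefer_split test is a hypothetical placeholder for whatever criterion is used (e.g., a BIC comparison as on the next slide), and 2-means is assumed as the splitting mechanism:

```python
import numpy as np
from sklearn.cluster import KMeans

def incremental_clustering(X, prefer_split, max_k=20):
    """Top-down clustering: keep splitting while the criterion prefers
    two clusters over one. `prefer_split` is a user-supplied test."""
    pending = [np.arange(len(X))]      # start with a single cluster: all points
    final = []
    while pending and len(final) + len(pending) < max_k:
        idx = pending.pop()
        if len(idx) > 2 and prefer_split(X[idx]):
            # Split this cluster in two with 2-means and reconsider both halves.
            labels = KMeans(n_clusters=2, n_init=10).fit_predict(X[idx])
            pending.append(idx[labels == 0])
            pending.append(idx[labels == 1])
        else:
            final.append(idx)          # criterion prefers one cluster: keep it
    return final + pending             # lists of point indices, one per cluster
```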

Estimating the Number of Clusters (k)

Top-down (incremental) methods:
- X-means: BIC criterion comparing 1 vs. 2 clusters (Pelleg & Moore, ICML 2000); see the split-test sketch below
- G-means: 1-d test for Gaussianity on a PCA-based projection (Hamerly & Elkan, NIPS 2003)
- Dip-means: test for unimodality (Kalogeratos & Likas, NIPS 2012)
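Below is a hedged sketch of the X-means-style split decision, roughly following Pelleg & Moore's BIC for spherical k-means models; the function names and the pooled spherical-variance estimate are this sketch's assumptions, not the exact X-means code. It plugs directly into the prefer_split slot of the loop on the previous slide:

```python
import numpy as np
from sklearn.cluster import KMeans

def bic_kmeans(X, k):
    """BIC (higher is better) of a k-means solution under a spherical
    Gaussian model per cluster; a sketch of Pelleg & Moore's formulation."""
    n, d = X.shape
    km = KMeans(n_clusters=k, n_init=10).fit(X)
    var = km.inertia_ / max(n - k, 1)            # pooled spherical variance estimate
    if var <= 0:
        return -np.inf                           # degenerate fit: never preferred
    counts = np.bincount(km.labels_, minlength=k)
    # Log-likelihood of the data under the fitted spherical mixture.
    loglik = (np.sum(counts * np.log(np.maximum(counts, 1) / n))
              - 0.5 * n * d * np.log(2 * np.pi * var)
              - 0.5 * (n - k))
    m = k * (d + 1)                              # free parameters: means, weights, variance
    return loglik - 0.5 * m * np.log(n)

def prefer_split(points):
    """X-means-style decision: split iff BIC prefers 2 clusters over 1."""
    return bic_kmeans(points, 2) > bic_kmeans(points, 1)
```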