Combinatorial clustering algorithms. Example: K-means clustering


Clustering algorithms

Goal: partition the observations into groups ("clusters") so that the pairwise dissimilarities between observations assigned to the same cluster tend to be smaller than those between observations in different clusters. There are three types of clustering algorithms: mixture modeling, mode seekers (e.g. the PRIM algorithm), and combinatorial algorithms. We focus on the most popular combinatorial algorithms.

Combinatorial clustering algorithms

The most popular clustering algorithms directly assign each observation to a group or cluster without regard to a probability model describing the data. Notation: label the observations by an integer i in {1,...,N} and the clusters by an integer k in {1,...,K}. The cluster assignments can be characterized by a many-to-one mapping C(i), called the encoder, that assigns the i-th observation to the k-th cluster: C(i) = k. One seeks the particular encoder C*(i) that minimizes a chosen *loss* function (also called an energy function).

Loss functions for judging clusterings

One seeks the particular encoder C*(i) that minimizes a chosen *loss* function (also called an energy function). Example: the within-cluster point scatter,

W(C) = \frac{1}{2} \sum_{k=1}^{K} \sum_{C(i)=k} \sum_{C(i')=k} d(x_i, x_{i'}),

which adds up the dissimilarities between all pairs of observations assigned to the same cluster.
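To make the loss concrete, here is a minimal NumPy sketch of the within-cluster point scatter, specialized to the squared Euclidean distance used later for K-means; the function name `within_cluster_scatter` and the 0-indexed integer array `labels` standing in for the encoder C are illustrative choices, not notation from the slides.

```python
import numpy as np

def within_cluster_scatter(X, labels):
    """W(C) = 1/2 * sum_k sum_{C(i)=k} sum_{C(i')=k} d(x_i, x_i'),
    with d specialized to the squared Euclidean distance."""
    W = 0.0
    for k in np.unique(labels):
        Xk = X[labels == k]                      # observations with C(i) = k
        diffs = Xk[:, None, :] - Xk[None, :, :]  # all within-cluster pairwise differences
        W += 0.5 * (diffs ** 2).sum()            # half the sum of squared pairwise distances
    return W
```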

Cluster analysis by combinatorial optimization

Straightforward in principle: simply minimize W(C) over all possible assignments of the N data points to K clusters. Unfortunately, such optimization by complete enumeration is feasible only for very small data sets: the number of distinct assignments is the Stirling number of the second kind S(N, K), which grows extremely rapidly (for example, S(10, 4) = 34,105, and S(19, 4) is already on the order of 10^10); a quick computation is sketched below. For this reason, practical clustering algorithms can examine only a tiny fraction of all possible encoders C. The goal is to identify a small subset that is likely to contain the optimal encoder, or at least a good sub-optimal partition. Feasible strategies are based on iterative greedy descent.
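The sketch below computes S(N, K) from its standard closed form to show how quickly enumeration becomes hopeless; `stirling2` is an illustrative helper name.

```python
from math import comb, factorial

def stirling2(n, k):
    """Stirling number of the second kind: the number of ways to
    partition n items into k non-empty, unlabeled groups."""
    return sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(k + 1)) // factorial(k)

print(stirling2(10, 4))  # 34105 -- already large for just 10 observations
print(stirling2(19, 4))  # ~10^10 -- complete enumeration is out of the question
```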

K-means clustering

K-means clustering is a very popular iterative descent clustering method. Setting: all variables are of the quantitative type, and one uses the squared Euclidean distance

d(x_i, x_{i'}) = \sum_{j=1}^{p} (x_{ij} - x_{i'j})^2 = ||x_i - x_{i'}||^2.

In this case the within-cluster point scatter becomes

W(C) = \frac{1}{2} \sum_{k=1}^{K} \sum_{C(i)=k} \sum_{C(i')=k} ||x_i - x_{i'}||^2.

Note that this can be re-expressed as

W(C) = \sum_{k=1}^{K} N_k \sum_{C(i)=k} ||x_i - \bar{x}_k||^2,

where \bar{x}_k is the mean vector of the k-th cluster and N_k is the number of observations assigned to it.
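The re-expression is easy to verify numerically; the following sketch compares the pairwise form with the mean-centered form on random data, reusing the illustrative `within_cluster_scatter` from above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))             # 50 observations in R^3
labels = rng.integers(0, 4, size=50)     # an arbitrary encoder C with K = 4

lhs = within_cluster_scatter(X, labels)  # pairwise form

rhs = 0.0
for k in np.unique(labels):
    Xk = X[labels == k]
    xbar = Xk.mean(axis=0)                     # cluster mean vector
    rhs += len(Xk) * ((Xk - xbar) ** 2).sum()  # N_k times the scatter about the mean

print(np.isclose(lhs, rhs))  # True: the two forms agree
```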

Thus one can obtain the optimal encoder C* by solving the enlarged optimization problem

\min_{C, \{m_k\}_{k=1}^{K}} \sum_{k=1}^{K} N_k \sum_{C(i)=k} ||x_i - m_k||^2,

which introduces a free mean vector m_k for each cluster. This works because, for any set of observations S, the value of m minimizing \sum_{i \in S} ||x_i - m||^2 is the mean of those observations. The enlarged problem can be minimized by the alternating optimization procedure given on the next slide.

K-means clustering algorithm (leads to a local minimum)

1. For a given cluster assignment C, the total cluster variance \sum_{k=1}^{K} N_k \sum_{C(i)=k} ||x_i - m_k||^2 is minimized with respect to {m_1,...,m_K}, yielding the means of the currently assigned clusters: m_k = \bar{x}_k.
2. Given the current set of means, the total cluster variance is minimized by assigning each observation to the closest current cluster mean: C(i) = argmin_k ||x_i - m_k||^2.
3. Steps 1 and 2 are iterated until the assignments do not change.

Each step reduces the objective, so the procedure converges, though only to a local minimum. A minimal implementation sketch follows.
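The sketch below implements this alternating procedure (Lloyd's algorithm) in NumPy; the random-assignment initialization and the re-seeding of emptied clusters are illustrative conventions, not prescriptions from the slides.

```python
import numpy as np

def kmeans(X, K, max_iter=100, seed=0):
    """Alternate steps 1 and 2 until the assignments stop changing."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, K, size=len(X))  # random initial assignment C
    for _ in range(max_iter):
        # Step 1: means of the currently assigned clusters
        # (an emptied cluster is re-seeded at a random data point)
        means = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                          else X[rng.integers(len(X))]
                          for k in range(K)])
        # Step 2: assign each observation to its closest current mean
        dists = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):  # assignments unchanged: local minimum
            break
        labels = new_labels
    return labels, means
```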

Recommendations for K-means clustering

Either start from many different random choices of initial means and keep the solution with the smallest value of the objective function, or use another clustering method (e.g. hierarchical clustering) to determine an initial set of cluster centers. A restart sketch follows.
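A minimal sketch of the restart recommendation, reusing the illustrative `kmeans` and `within_cluster_scatter` from above; in practice, scikit-learn's `KMeans` performs the same restarts via its `n_init` parameter.

```python
import numpy as np

best_labels, best_W = None, np.inf
for seed in range(20):                   # 20 random restarts
    labels, _ = kmeans(X, K=4, seed=seed)
    W = within_cluster_scatter(X, labels)
    if W < best_W:                       # keep the solution with the smallest objective
        best_labels, best_W = labels, W
```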