K-MEANS Michael Jones ENEE698Q Fall 2011

Overview • Introduction • Problem Formulation • How K-Means Works • Pros and Cons of Using K-Means • How to Improve K-Means • K-Means on a Manifold • Vector Quantization

Introduction • K-means was first proposed by Stuart Lloyd in 1957 as a technique for pulse-code modulation. • "Least squares quantization in PCM", Bell Telephone Laboratories Paper. • Groups data into K clusters so as to minimize the sum of squared distances from each data point to its cluster mean. • The algorithm works by iterating between two stages until the assignments converge.

Problem Formulation • Given a data set $\{x_1,\ldots,x_N\}$ consisting of N random instances of a D-dimensional Euclidean variable x. • Introduce a set of K prototype vectors $\mu_k$, $k=1,\ldots,K$, where $\mu_k$ corresponds to the mean of the $k$th cluster. • Goal is to find a grouping of data points and prototype vectors that minimizes the sum of squared distances from each data point to its assigned prototype.

Problem Formulation (cont.) • This can be formalized by introducing an indicator variable for each data point: $r_{nk} \in \{0,1\}$, $k=1,\ldots,K$, with $r_{nk}=1$ exactly when $x_n$ is assigned to cluster $k$. • Our objective function becomes: $J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk}\,\|x_n - \mu_k\|^2$
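For concreteness, a minimal NumPy sketch of this objective with hard assignments stored as an index array rather than the full $r_{nk}$ matrix (the name `kmeans_objective` and its arguments are illustrative, not from the slides):

```python
import numpy as np

def kmeans_objective(X, mu, labels):
    """J = sum_n sum_k r_nk ||x_n - mu_k||^2 for hard assignments.

    X: (N, D) data, mu: (K, D) prototype vectors, labels: (N,) cluster index per point.
    """
    # With r_nk in {0,1} and exactly one 1 per row, J is just each point's
    # squared distance to the prototype of the cluster it is assigned to.
    return float(((X - mu[labels]) ** 2).sum())
```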

How K-Means works • Algorithm initializes the K prototype vectors to K distinct random data points. • Cycles between two stages until convergence is reached. • 1. For each data point, determine $r_{nk}$: $r_{nk} = 1$ if $k = \arg\min_j \|x_n - \mu_j\|^2$, and $r_{nk} = 0$ otherwise. • 2. Update $\mu_k$: $\mu_k = \dfrac{\sum_n r_{nk}\, x_n}{\sum_n r_{nk}}$
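A minimal NumPy sketch of this two-stage loop (the function name `kmeans` and the parameters `n_iters`, `seed` are illustrative assumptions, not from the slides):

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """X: (N, D) data. Returns prototype vectors mu (K, D) and assignments (N,)."""
    rng = np.random.default_rng(seed)
    # Initialize the K prototype vectors to K distinct random data points.
    mu = X[rng.choice(len(X), size=K, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iters):
        # Stage 1: assign each point to its nearest prototype (sets r_nk).
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (N, K) squared distances
        labels = d2.argmin(axis=1)
        # Stage 2: move each prototype to the mean of its assigned points.
        new_mu = np.array([X[labels == k].mean(axis=0) if np.any(labels == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):  # convergence: prototypes stop moving
            break
        mu = new_mu
    return mu, labels
```

Usage with the earlier sketch: `mu, labels = kmeans(X, K=3)`, then `kmeans_objective(X, mu, labels)` to monitor J.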

How K-Means works (cont) • K-Means can be viewed as an instance of the Expectation-Maximization (EM) algorithm. • Stage 1 is the E step. • Stage 2 is the M step. • If K and D are fixed, the clustering can be solved exactly in $O(N^{KD+1})$ time; each iteration of the algorithm above costs $O(NKD)$.

How K-Means works (example)

Pros and Cons of K-Means • Convergence: J may converge to a local minimum rather than the global minimum, so the algorithm may have to be repeated several times from different initializations (see the sketch below). • Inter-Vector Relationships: works well for Euclidean data, but the squared Euclidean distance cannot make use of other inter-vector relationships among the $x_n$. • With a large data set, the Euclidean distance calculations can be slow. • K is an input parameter: if K is chosen inappropriately, the results may be poor.
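One common way to mitigate the local-minimum issue is to rerun K-means from several random initializations and keep the solution with the lowest J. A sketch reusing the illustrative `kmeans` and `kmeans_objective` functions defined in the earlier sketches:

```python
def kmeans_restarts(X, K, n_restarts=10):
    """Run K-means from several seeds and keep the run with the smallest objective J."""
    best_J, best_mu, best_labels = float("inf"), None, None
    for seed in range(n_restarts):
        mu, labels = kmeans(X, K, seed=seed)
        J = kmeans_objective(X, mu, labels)
        if J < best_J:
            best_J, best_mu, best_labels = J, mu, labels
    return best_mu, best_labels, best_J
```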

How to Improve K-Means • The E step can be modified to use a general dissimilarity measure, which leads to the K-medoids algorithm (see the sketch below). • K-means can be sped up through various methods: • Pre-compute a tree in which nearby points lie in the same subtree (Ramasubramanian and Paliwal, 1990). • Use the triangle inequality to avoid unnecessary distance computations (Hodgson, 1998).
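A rough sketch of the K-medoids idea mentioned above: assignments use an arbitrary dissimilarity V(x, x'), and prototypes are restricted to actual data points, so the update step needs only pairwise dissimilarities. The names `kmedoids` and `dissimilarity` are illustrative assumptions, not from the slides or the cited papers:

```python
import numpy as np

def kmedoids(X, K, dissimilarity, n_iters=100, seed=0):
    """X: (N, D) data; dissimilarity: callable V(x, y) >= 0. Returns medoid vectors and labels."""
    rng = np.random.default_rng(seed)
    N = len(X)
    D = np.array([[dissimilarity(a, b) for b in X] for a in X])  # (N, N) pairwise costs
    medoids = rng.choice(N, size=K, replace=False)
    for _ in range(n_iters):
        labels = D[:, medoids].argmin(axis=1)          # assignment step with the general measure
        new_medoids = medoids.copy()
        for k in range(K):
            members = np.flatnonzero(labels == k)
            if members.size:
                # Update step: pick the member minimizing total dissimilarity within the cluster.
                new_medoids[k] = members[D[np.ix_(members, members)].sum(axis=1).argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return X[medoids], labels
```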

Vector Quantization • As described by Robert M. Gray (1984) • Algorithm is nearly identical to K-Means • "Step 0. Given: A training sequence and an initial decoder. • Step 1. Encode the training sequence into a sequence of channel symbols using the given decoder minimum distortion rule. If the average distortion is small enough, quit. • Step 2. Replace the old reproduction codeword of the decoder for each channel symbol v by the centroid of all training vectors which mapped into v in Step 1. Go to Step 1."
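A sketch of how a K-means codebook acts as a vector quantizer: the cluster means serve as reproduction codewords, encoding maps each vector to the index (channel symbol) of its nearest codeword, and decoding looks that index back up. The names `vq_encode` and `vq_decode` are illustrative assumptions:

```python
import numpy as np

def vq_encode(X, codebook):
    """Map each row of X to the index of its nearest codeword (the channel symbol)."""
    d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def vq_decode(symbols, codebook):
    """Replace each channel symbol with its reproduction codeword."""
    return codebook[symbols]

# e.g. train the codebook with the earlier kmeans sketch:
#   codebook, _ = kmeans(training_vectors, K=256)
#   symbols = vq_encode(X, codebook); X_hat = vq_decode(symbols, codebook)
```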

K-Means on a Manifold • K-Means can be performed on a manifold if one can compute the mean of the data. • Fletcher et al. describe how to compute intrinsic (Karcher) means of data on Riemannian manifolds. • Turaga et al. apply K-Means on Riemannian manifolds in this way. • They use an iterative algorithm to find the sample Karcher mean. • They replace the squared Euclidean distance with a dissimilarity measure defined on the manifold.
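The slides do not give Turaga et al.'s manifold operations or dissimilarity formula, so the following is only a generic sketch of the iterative sample Karcher mean on the unit sphere, assuming its standard log and exp maps; the same fixed-point iteration (lift points to the tangent space at the current estimate, average, map back along a geodesic) is what an iterative Karcher-mean algorithm does on other Riemannian manifolds:

```python
import numpy as np

def sphere_log(p, x):
    """Log map on the unit sphere: tangent vector at p toward x, length = geodesic distance."""
    c = np.clip(p @ x, -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(p)
    return (theta / np.sin(theta)) * (x - c * p)

def sphere_exp(p, v):
    """Exp map on the unit sphere: follow the geodesic from p in direction v."""
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        return p
    return np.cos(norm) * p + np.sin(norm) * (v / norm)

def karcher_mean(points, n_iters=50, tol=1e-8):
    """Iteratively average the tangent vectors at the current estimate, then map back."""
    mu = points[0]
    for _ in range(n_iters):
        v = np.mean([sphere_log(mu, x) for x in points], axis=0)
        mu = sphere_exp(mu, v)
        if np.linalg.norm(v) < tol:
            break
    return mu
```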

Sources • Bishop, C. M., "K-means Clustering," in Pattern Recognition and Machine Learning, Springer, 2006. • Fletcher, P. T., Lu, C., Pizer, S. M., & Joshi, S., "Principal Geodesic Analysis for the Study of Nonlinear Statistics of Shape," IEEE Transactions on Medical Imaging, vol. 23, no. 8, August 2004. • Gray, R. M., "Vector Quantization," IEEE ASSP Magazine, April 1984. • Turaga, P., Veeraraghavan, A., Srivastava, A., & Chellappa, R., "Statistical Computations on Grassmann and Stiefel Manifolds for Image and Video-Based Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, accepted 2010.

Questions?