Presentation transcript:

Problem Definition

Input:
- A data set $X = \{x_1, \dots, x_n\}$ in $d$-dimensional space
- $m$: the number of clusters (we don't use $k$ here to avoid confusion with the summation index)

Output:
- Cluster centers $c_1, \dots, c_m$
- An assignment of each $x_i$ to one of the $m$ clusters, encoded as a matrix $A = [a_{ij}]$

Requirement:
- The output should minimize the objective function defined below.

Objective Function

Objective function (distortion):

$$J(X; C, A) = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij}\,\lVert x_i - c_j \rVert^2$$

- $d \cdot m$ (for the matrix $C$) plus $n \cdot m$ (for the matrix $A$) tunable parameters, with the constraint on $A$ that $a_{ij} \in \{0, 1\}$ and $\sum_{j=1}^{m} a_{ij} = 1$ for every $i$
- NP-hard problem if an exact solution is required

Task 1: How to Find the Assignment in A?

Goal:
- Find $A$ to minimize $J(X; C, A)$ with $C$ fixed

Facts:
- A closed-form solution exists: assign each point to its nearest center,

$$a_{ij} = \begin{cases} 1 & \text{if } j = \arg\min_{j'} \lVert x_i - c_{j'} \rVert^2 \\ 0 & \text{otherwise} \end{cases}$$

Task 2: How to Find the Centers in C?

Goal:
- Find $C$ to minimize $J(X; C, A)$ with $A$ fixed

Facts:
- A closed-form solution exists: each center is the mean of the data assigned to it,

$$c_j = \frac{\sum_{i=1}^{n} a_{ij}\, x_i}{\sum_{i=1}^{n} a_{ij}}$$
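Putting the two tasks together explains why alternating them works: each half-step solves its subproblem exactly, so a full iteration can never increase the distortion. In symbols (a standard argument, not spelled out on the slides):

$$J(X; C^{(t)}, A^{(t+1)}) \le J(X; C^{(t)}, A^{(t)}) \qquad \text{and} \qquad J(X; C^{(t+1)}, A^{(t+1)}) \le J(X; C^{(t)}, A^{(t+1)})$$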

Algorithm

1. Initialize: select the initial m cluster centers.
2. Find clusters: assign each x_i to the cluster with the nearest center (this finds A to minimize J(X; C, A) with C fixed).
3. Find centers: recompute each cluster center as the mean of the data in the cluster (this finds C to minimize J(X; C, A) with A fixed).
4. Stopping criterion: stop if the clusters stay the same; otherwise go to step 2.
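A minimal MATLAB sketch of this loop, using the same data layout as the demo code at the end of the deck (one data point per column). This is an illustration of the four steps, not the source of the toolbox's kMeansClustering.m, and it relies on implicit expansion (MATLAB R2016b or later):

function center = kMeansSketch(data, m)
% data: d-by-n matrix, one data point per column; m: number of clusters.
n = size(data, 2);
center = data(:, randperm(n, m));   % step 1: pick m distinct points as initial centers
assignment = zeros(1, n);
while true
    % Step 2: assign each point to its nearest center (solves for A with C fixed).
    dist = zeros(m, n);
    for j = 1:m
        dist(j, :) = sum((data - center(:, j)).^2, 1);   % squared distances to center j
    end
    [~, newAssignment] = min(dist);
    % Step 4: stop when the clusters stay the same.
    if isequal(newAssignment, assignment)
        break;
    end
    assignment = newAssignment;
    % Step 3: recompute each center as the mean of its cluster (solves for C with A fixed).
    for j = 1:m
        members = data(:, assignment == j);
        if ~isempty(members)   % an empty cluster keeps its old center
            center(:, j) = mean(members, 2);
        end
    end
end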

Stopping Criteria

Two stopping criteria:
- Repeat until there is no more change in the cluster assignment.
- Repeat until the distortion improvement is less than a threshold.

Facts:
- Convergence is assured, since each iteration can only reduce J and there are only finitely many possible assignments.
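In symbols, writing $J^{(t)}$ for the distortion and $A^{(t)}$ for the assignment matrix after iteration $t$ (a standard formulation; the threshold $\varepsilon$ is user-chosen), the two criteria read:

$$A^{(t)} = A^{(t-1)} \qquad \text{or} \qquad J^{(t-1)} - J^{(t)} < \varepsilon$$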

Properties of K-means Clustering

- K-means can find an approximate solution efficiently.
- The distortion (squared error) is a monotonically non-increasing function of the iterations.
- The goal is to minimize the squared error, but the algorithm can end up in a local minimum.
- To increase the probability of finding the global minimum, start k-means from several different initial conditions.
- "Cluster validation" refers to a set of methods that try to determine the best number of clusters m.
- Other distance measures can be used in place of the Euclidean distance, with a corresponding change in how the centers are identified.
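A sketch of such restarts, assuming the two toolbox functions used in the demo code at the end (kMeansClustering, which presumably initializes its centers randomly, and distPairwise, which returns an m-by-n matrix of center-to-point distances); the run with the smallest total nearest-center distance is kept:

bestJ = inf;
for trial = 1:10                                  % 10 restarts is an arbitrary choice
    center = kMeansClustering(data, centerNum);   % fresh random initialization
    distMat = distPairwise(center, data);
    J = sum(min(distMat));                        % distortion (or a monotone proxy of it)
    if J < bestJ
        bestJ = J;
        bestCenter = center;                      % keep the best centers so far
    end
end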

K-means Snapshots

(Figures only in the original slide.)

Demo of K-means Clustering

Toolbox download:
- Utility Toolbox
- Machine Learning Toolbox

Demos:
- kMeansClustering.m
- vecQuantize.m

Demos of K-means Clustering

- kMeansClustering.m

Application: Image Compression

Goal:
- Convert an image from true colors to indexed colors with minimum distortion.

Steps:
1. Collect data from a true-color image.
2. Perform k-means clustering to obtain the cluster centers as the indexed colors.
3. Compute the compression rate, as worked out below.
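For step 3, a true-color pixel costs 24 bits while an indexed pixel costs $\log_2 m$ bits, so, ignoring the small palette ($m$ entries of 3 bytes each), the compression rate is roughly

$$\text{rate} \approx \frac{24}{\log_2 m}$$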

Example: Image Compression

Original image:
- Dimension: 480x640
- Data size: 480*640*3*8 bits = 0.92 MB
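Worked out for the largest codebook used in the code at the end ($m = 2^6 = 64$ colors): the indexed data takes $480 \cdot 640 \cdot 6$ bits $= 0.23$ MB against the original 0.92 MB, a compression rate of about 4, while the 64-entry palette adds only $64 \cdot 3 = 192$ bytes.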

Example: Image Compression

(Two slides in the original contained only figures.)

Code

% Convert a true-color image to indexed colors via k-means.
X = imread('annie19980405.jpg');
image(X);
[m, n, p] = size(X);
% Rearrange the m-by-n-by-3 image into a 3-by-(m*n) matrix, one RGB pixel per column.
index = (1:m*n:m*n*p)'*ones(1, m*n) + ones(p, 1)*(0:m*n-1);
data = double(X(index));
maxI = 6;
for i = 1:maxI
    centerNum = 2^i;    % codebook sizes 2, 4, ..., 64
    fprintf('i=%d/%d: no. of centers=%d\n', i, maxI, centerNum);
    % Cluster the pixels; the centers become the indexed colors.
    center = kMeansClustering(data, centerNum);
    % Assign each pixel to its nearest center.
    distMat = distPairwise(center, data);
    [minValue, minIndex] = min(distMat);
    % Rebuild and display the indexed image, with the centers as the colormap.
    X2 = reshape(minIndex, m, n);
    map = center'/255;
    figure; image(X2); colormap(map); colorbar; axis image;
end
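If kMeansClustering and distPairwise are not at hand, the clustering and assignment steps could be replaced by the built-in kmeans from MATLAB's Statistics and Machine Learning Toolbox, which expects one observation per row and returns the assignments directly; a sketch of the substitution inside the loop:

% Replaces the kMeansClustering/distPairwise/min lines above:
[minIndex, center] = kmeans(data', centerNum);   % data' is (m*n)-by-3, one pixel per row
X2 = reshape(minIndex, m, n);
map = center/255;                                % center is centerNum-by-3, one color per row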