
Unsupervised Learning: Clustering Rong Jin

Outline
- Unsupervised learning
- K-means for clustering
- Expectation Maximization algorithm for clustering

Unsupervised vs. Supervised Learning
- Supervised learning: training data in which every example is labeled.
- Unsupervised learning: training data in which no example is labeled; we can still discover the structure of the data.
- Semi-supervised learning: training data that is a mixture of labeled and unlabeled examples. Can you think of ways to utilize the unlabeled data to improve prediction?

Unsupervised Learning
- Clustering
- Visualization
- Density Estimation
- Outlier/Novelty Detection
- Data Compression

Clustering/Density Estimation (figure: data points plotted by age vs. $$$)

Clustering for Visualization

Image Compression

K-means for Clustering
- Key for clustering: find the cluster centers and determine the appropriate clusters (very, very hard).
- K-means: start with a random guess of the cluster centers, determine the membership of each data point, and adjust the cluster centers.

K-means
1. Ask the user how many clusters they'd like (e.g. k = 5).
2. Randomly guess k cluster center locations.
3. Each data point finds out which center it is closest to (thus each center "owns" a set of data points).
4. Each center finds the centroid of the points it owns.
Any computational problem? Computational complexity: O(N), where N is the number of points? (A minimal sketch of these steps follows.)
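Below is one way those four steps might look in code; a minimal sketch assuming NumPy, with the synthetic data, k = 5, and the stopping tolerance chosen purely for illustration (they are not from the slides).

```python
import numpy as np

def kmeans(X, k, n_iters=100, tol=1e-6, rng=None):
    """Plain K-means (Lloyd's iteration): random centers, assign, re-center."""
    rng = np.random.default_rng(rng)
    # 2. Randomly guess k cluster center locations (here: k random data points).
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iters):
        # 3. Each data point finds out which center it is closest to.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 4. Each center moves to the centroid of the points it owns
        #    (an empty cluster keeps its previous center).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.linalg.norm(new_centers - centers) < tol:
            centers = new_centers
            break
        centers = new_centers
    return centers, labels

# Example usage on illustrative synthetic 2-D data.
X = np.random.default_rng(0).normal(size=(500, 2))
centers, labels = kmeans(X, k=5)
```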

Improve K-means
- Group points by region: KD-tree, SR-tree.
- Key difference: find the closest center for each rectangle and assign all of the points within a rectangle to one cluster.

Improved K-means (illustrations): find the closest center for each rectangle; assign all of the points within that rectangle to one cluster. A sketch of the idea follows.
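One way the per-rectangle idea might be sketched, assuming NumPy: recursively split the data into axis-aligned boxes and, whenever one center is provably at least as close as every other center to all points of a box, assign the whole box at once. This is a simplified version in the spirit of kd-tree filtering, not the exact published algorithm; the helper names, leaf size, and test data are illustrative.

```python
import numpy as np

def dominates(a, b, lo, hi):
    """True if center a is at least as close as center b to EVERY point of the
    axis-aligned box [lo, hi] (check the corner of the box most favorable to b)."""
    corner = np.where(b > a, hi, lo)
    return np.linalg.norm(corner - a) <= np.linalg.norm(corner - b)

def assign_by_boxes(X, centers, labels, idx=None, leaf_size=32):
    """Assign points to their nearest centers, one box (rectangle) at a time."""
    if idx is None:
        idx = np.arange(len(X))
    pts = X[idx]
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    # The center closest to the box midpoint is the candidate owner of the box.
    mid = (lo + hi) / 2
    cand = int(np.argmin(np.linalg.norm(centers - mid, axis=1)))
    if all(dominates(centers[cand], centers[j], lo, hi)
           for j in range(len(centers)) if j != cand):
        labels[idx] = cand            # whole rectangle owned by one center
        return
    if len(idx) <= leaf_size:         # small box: fall back to per-point assignment
        d = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        labels[idx] = d.argmin(axis=1)
        return
    # Otherwise split the box along its widest dimension and recurse.
    dim = int(np.argmax(hi - lo))
    order = idx[np.argsort(X[idx, dim])]
    half = len(order) // 2
    assign_by_boxes(X, centers, labels, order[:half], leaf_size)
    assign_by_boxes(X, centers, labels, order[half:], leaf_size)

# Usage: this replaces step 3 of plain K-means (illustrative data and centers).
X = np.random.default_rng(1).normal(size=(2000, 2))
centers = X[:5].copy()
labels = np.empty(len(X), dtype=int)
assign_by_boxes(X, centers, labels)
```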

Gaussian Mixture Model for Clustering
- Assume that the data are generated from a mixture of Gaussian distributions.
- For each Gaussian distribution: center μ_i and variance σ_i (ignored here).
- For each data point: determine its membership.
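Written as a formula, the assumed generative model is a standard Gaussian mixture density; the mixing weights π_j, which play the role of cluster priors, are an extra symbol introduced here and not shown on the slide.

```latex
p(x) \;=\; \sum_{j=1}^{k} \pi_j \, \mathcal{N}\!\left(x \mid \mu_j, \sigma_j^2 I\right),
\qquad \sum_{j=1}^{k} \pi_j = 1, \quad \pi_j \ge 0 .
```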

Learning a Gaussian Mixture (with known covariance)
- Probability of the data under the mixture
- Log-likelihood of the unlabeled data
- Find the optimal parameters (maximize the log-likelihood; written out below)
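In a standard form (assuming, per the slide title, a known covariance, taken here to be a shared spherical σ²I), the log-likelihood of the unlabeled data x_1, …, x_N and the parameters to be found are:

```latex
\ell(\pi, \mu) \;=\; \sum_{n=1}^{N} \log \sum_{j=1}^{k}
  \pi_j \, \mathcal{N}\!\left(x_n \mid \mu_j, \sigma^2 I\right),
\qquad
(\hat{\pi}, \hat{\mu}) \;=\; \arg\max_{\pi, \mu} \; \ell(\pi, \mu).
```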

Learning a Gaussian Mixture (with known covariance)
- E-Step: compute each point's posterior probability of membership in each cluster.
- M-Step: re-estimate the cluster means (and mixing weights) from the responsibility-weighted points.
(A code sketch of one EM pass follows.)
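A minimal NumPy sketch of the two steps for this model; the known spherical covariance σ²I, the initialization, and the iteration count are illustrative assumptions rather than the slides' exact updates.

```python
import numpy as np

def em_gmm_known_cov(X, k, sigma=1.0, n_iters=50, rng=None):
    """EM for a Gaussian mixture with known spherical covariance sigma^2 I."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    mu = X[rng.choice(n, size=k, replace=False)].copy()   # initial means
    pi = np.full(k, 1.0 / k)                               # mixing weights
    for _ in range(n_iters):
        # E-step: responsibilities r[n, j] = P(cluster j | x_n); the shared
        # Gaussian normalizing constant cancels, so only pi_j exp(-||x-mu_j||^2 / 2 sigma^2) matters.
        sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        log_r = np.log(pi) - sq_dist / (2 * sigma ** 2)
        log_r -= log_r.max(axis=1, keepdims=True)          # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate means and mixing weights from weighted points.
        Nk = r.sum(axis=0)
        mu = (r.T @ X) / Nk[:, None]
        pi = Nk / n
    return mu, pi, r

# Usage on illustrative synthetic data (two well-separated blobs).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 1, size=(200, 2)), rng.normal(3, 1, size=(200, 2))])
mu, pi, resp = em_gmm_known_cov(X, k=2)
```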

Gaussian Mixture Example (figures): the mixture at the start, and after the 1st, 2nd, 3rd, 4th, 5th, 6th, and 20th iterations.