Machine Learning and Data Mining Clustering


Machine Learning and Data Mining: Clustering (adapted from Prof. Alexander Ihler) First Lecture Today (Thu 30 Jun): read Chapters 18.6.1-2 and 20.3.1. Second Lecture Today (Thu 30 Jun): read Chapters 3.1-3.4. Next Lecture (Tue 5 Jul): Chapters 3.5-3.7 and 4.1-4.2. (Please read the lecture topic material before and after each lecture on that topic.)

Overview What is clustering, and what are its applications? Distances between two clusters. Hierarchical agglomerative clustering. K-means clustering.

What is Clustering? Clustering is a form of "unsupervised learning": there is no label for any instance of the data. Clustering is also called "grouping". Clustering algorithms rely on a distance metric between data points; data points close to each other end up in the same cluster.

Supervised learning Predict a target value ("y") given features ("x") Unsupervised learning Understand the patterns in the data (just "x") Useful for many reasons Data mining ("explain") Missing data values ("impute") Representation (feature generation or selection) One example: clustering

Clustering Example Clustering has wide applications in: Economic science, especially market research: grouping customers with the same preferences for advertising or for predicting sales. Text mining: search engines cluster documents.

Overview What is clustering, and what are its applications? Distances between two clusters. Hierarchical agglomerative clustering. K-means clustering.

Cluster Distances There are four common measures of the distance between a pair of clusters: 1- Min distance 2- Max distance 3- Average distance 4- Mean distance

Cluster Distances (Min)

Cluster Distances (Max)

Cluster Distances (Average) Average of the distances between all pairs of points in clusters Ci and Cj

Cluster Distances (Mean) Distance between the cluster means, where µi = mean of all data points in cluster i (over all features)
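The four cluster distances above can be sketched in Python. This is a minimal NumPy sketch; the function names are illustrative, not from the slides:

```python
import numpy as np

def min_distance(ci, cj):
    # Smallest pairwise distance between the two clusters (single linkage)
    return min(np.linalg.norm(a - b) for a in ci for b in cj)

def max_distance(ci, cj):
    # Largest pairwise distance (complete linkage)
    return max(np.linalg.norm(a - b) for a in ci for b in cj)

def average_distance(ci, cj):
    # Average of the distances between all pairs of points in Ci and Cj
    return sum(np.linalg.norm(a - b) for a in ci for b in cj) / (len(ci) * len(cj))

def mean_distance(ci, cj):
    # Distance between the cluster means µi and µj
    return np.linalg.norm(np.mean(ci, axis=0) - np.mean(cj, axis=0))
```

Each function takes two clusters as arrays of points (rows); note that the mean distance compares only two centroids, so it is much cheaper than the average distance, which examines every pair.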

Overview What is clustering, and what are its applications? Distances between two clusters. Hierarchical agglomerative clustering. K-means clustering.

Hierarchical Agglomerative Clustering Define a distance between clusters. Initialize: every example is its own cluster. Iterate: compute the distances between all clusters, then merge the two closest clusters. Save both the clustering and the sequence of merge operations as a dendrogram.
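The loop above can be sketched directly in Python. This is a naive sketch using min (single) linkage; the function name and the choice of linkage are illustrative assumptions:

```python
import numpy as np

def agglomerative(points, n_clusters):
    # Initialize: every example is its own cluster
    clusters = [[np.asarray(p, dtype=float)] for p in points]
    merges = []  # sequence of merge operations (the dendrogram record)
    while len(clusters) > n_clusters:
        # Compute distances between all clusters under min (single) linkage
        best = (None, None, float("inf"))
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(np.linalg.norm(a - b)
                        for a in clusters[i] for b in clusters[j])
                if d < best[2]:
                    best = (i, j, d)
        # Merge the two closest clusters and record the operation
        i, j, d = best
        merges.append((i, j, d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges
```

Each pass scans all cluster pairs, which is what makes the naive algorithm expensive; stopping when `n_clusters` remain corresponds to cutting the dendrogram at that level.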

Iteration 1 Merge two closest clusters

Iteration 2 Cluster 2 Cluster 1

Iteration 3 Builds up a sequence of clusters ("hierarchical") Algorithm complexity O(N²) (Why?) In MATLAB: the "linkage" function (Statistics Toolbox)
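The slide points to MATLAB's "linkage"; assuming SciPy is available, the equivalent in Python is `scipy.cluster.hierarchy.linkage`, with `fcluster` to cut the resulting dendrogram at a chosen number of clusters:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Four 2-D points forming two tight pairs
X = np.array([[0., 0.], [0., 1.], [5., 5.], [5., 6.]])

# Z encodes the full merge sequence (the dendrogram); 'single' = min linkage
Z = linkage(X, method="single")

# Cut the dendrogram so that exactly 2 clusters remain
labels = fcluster(Z, t=2, criterion="maxclust")
```

`labels` assigns each point a cluster id; the two nearby pairs land in different clusters, mirroring the iterations shown above.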

Dendrogram Two clusters Three clusters Stop the process whenever there are enough clusters.

Overview What is clustering, and what are its applications? Distances between two clusters. Hierarchical agglomerative clustering. K-means clustering.

K-Means Clustering There are always K clusters; in contrast, in agglomerative clustering the number of clusters shrinks. Iterate (until the clusters remain almost fixed): For each cluster, compute the cluster mean (Xi is feature i). For each data point, find the closest cluster.

Choosing the number of clusters The cost function for clustering is the sum of squared distances from each point to its assigned cluster center. What is the optimal value of k? (Can increasing k ever increase the cost?) This is a model-complexity issue, much like choosing lots of features: they only (seem to) help. Too many clusters leads to overfitting. One solution is to penalize for complexity: add (# parameters) * log(N) to the cost. Now more clusters can increase the cost, if they don't help "enough".
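The penalized cost can be sketched as follows. This sketch assumes the k-means sum-of-squared-distances cost and counts the k·d center coordinates as the parameters; both are assumptions consistent with the slide, not a definitive formula:

```python
import numpy as np

def kmeans_cost(points, centers, labels):
    # Sum of squared distances from each point to its assigned center
    return sum(np.linalg.norm(p - centers[l]) ** 2
               for p, l in zip(points, labels))

def penalized_cost(points, centers, labels):
    # Add (# parameters) * log(N): each of the k centers has d coordinates
    n, d = points.shape
    k = len(centers)
    return kmeans_cost(points, centers, labels) + k * d * np.log(n)
```

With the penalty, adding a cluster must reduce the raw cost by more than d·log(N) to be worthwhile, which is what stops k from growing without bound.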

Choosing the number of clusters (2) The Cattell scree test: plot dissimilarity against the number of clusters (here 1-7) and look for the "elbow"; in this example the best K is 3. (A scree is a loose accumulation of broken rock at the base of a cliff or mountain.)

2-means Clustering Select two points randomly as initial means.

2-means Clustering Find distance of all points to these two points.

2-means Clustering Find two clusters based on distances.

2-means Clustering Find new means. Green points are the mean values of the two clusters.

2-means Clustering Find new clusters.

2-means Clustering Find new means. Find the distance of all points to these two mean values. Find new clusters. Nothing changes, so the algorithm has converged.
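The 2-means walkthrough above generalizes to any K. A minimal NumPy sketch, assuming Euclidean distance; the function name and convergence test are illustrative, and empty clusters are not handled:

```python
import numpy as np

def kmeans(points, k, rng=None):
    rng = np.random.default_rng(rng)
    points = np.asarray(points, dtype=float)
    # Select k points randomly as the initial means
    means = points[rng.choice(len(points), size=k, replace=False)]
    while True:
        # Find the distance of all points to the k means; assign each
        # point to its closest mean
        dists = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Find new means from the points assigned to each cluster
        # (assumes no cluster ends up empty)
        new_means = np.array([points[labels == j].mean(axis=0)
                              for j in range(k)])
        if np.allclose(new_means, means):  # nothing changed: converged
            return labels, new_means
        means = new_means
```

Running it on two well-separated pairs of points recovers the natural grouping, matching the 2-means iterations shown above.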

Summary Definition of clustering. Difference between supervised and unsupervised learning: finding labels for each datum. Clustering algorithms: Agglomerative clustering: each datum starts as its own cluster; combine the closest clusters; stop when there is a sufficient number of clusters. K-means: K clusters always exist; find new mean values; find new clusters; stop when nothing changes in the clusters (or the changes are smaller than a very small value).