Clustering: the process of grouping samples so that the samples within each group are similar.

Presentation transcript:

Clustering

Algorithms of Clustering
Hierarchical clustering: organizes the data into larger groups, which contain smaller groups, like a tree or dendrogram. Algorithms: agglomerative, single-linkage, complete-linkage, average-linkage, Ward…
Partitional clustering: creates one set of clusters that partitions the data into similar groups. Algorithms: Forgy's, k-means, ISODATA… SOM, CLICK, CAST, …

Figures of Hierarchical Clustering: a sequence of diagrams over samples 1 to 5, with intermediate clusters labeled 1', 2' and 3'.

Hierarchical Clustering Method
Distance metric between clusters: single-link, average-link, complete-link, centroid.
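The four distance metrics named above can be sketched as follows. This is a minimal illustration, not the method from the slides: it assumes Euclidean distance between samples, and the names `cluster_distance` and `agglomerate` are hypothetical helpers introduced only for this example.

```python
from itertools import product
from math import dist  # Euclidean distance between two points (Python 3.8+)


def centroid(cluster):
    """Component-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def cluster_distance(a, b, method="single"):
    """Distance between clusters a and b (lists of points) under one linkage rule."""
    pairwise = [dist(p, q) for p, q in product(a, b)]
    if method == "single":    # closest pair of samples
        return min(pairwise)
    if method == "complete":  # farthest pair of samples
        return max(pairwise)
    if method == "average":   # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    if method == "centroid":  # distance between the two cluster means
        return dist(centroid(a), centroid(b))
    raise ValueError(f"unknown method: {method}")


def agglomerate(points, k, method="single"):
    """Start with one cluster per sample; merge the two closest until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]], method),
        )
        clusters[i] = clusters[i] + clusters[j]  # j > i, so deleting j keeps i valid
        del clusters[j]
    return clusters
```

Recording the order of the merges, rather than only the final clusters, would give the nested tree (dendrogram) described earlier.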

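K-means approach
One more input, k (the number of clusters), is required. There are many variants of k-means.
Sum-of-squares criterion: minimize the total squared distance from each sample to the centroid of its cluster, $J = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - m_j \rVert^2$, where $m_j$ is the centroid of cluster $C_j$.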
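As a small sketch of the sum-of-squares criterion, the function below computes the total squared Euclidean distance from each sample to its cluster's centroid. The function name `sum_of_squares` and the representation of clusters as lists of coordinate tuples are assumptions made for this example.

```python
def centroid(cluster):
    """Component-wise mean of the points in a cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def sum_of_squares(clusters):
    """Total squared distance from every sample to the centroid of its own cluster."""
    total = 0.0
    for cluster in clusters:
        m = centroid(cluster)
        for x in cluster:
            total += sum((xi - mi) ** 2 for xi, mi in zip(x, m))
    return total


# Two candidate partitions of the same three samples:
tight = [[(0, 0), (2, 0)], [(5, 5)]]
loose = [[(0, 0), (2, 0), (5, 5)]]
print(sum_of_squares(tight), sum_of_squares(loose))
```

A partition with a lower value groups the samples more tightly around their centroids, which is what k-means tries to achieve for a fixed k.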

An example of the k-means approach: two passes
Pass 1: Begin with k clusters, each consisting of one of the first k samples. For each of the remaining n-k samples, find the nearest centroid and put the sample in that cluster. After each sample is assigned, re-compute the centroid of the altered cluster.
Pass 2: For each sample, find the nearest centroid and put the sample in the cluster identified with this nearest centroid. (The centroids do not need to be re-computed.)
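The two passes described above can be sketched as follows. This is a minimal illustration assuming Euclidean distance and samples given as coordinate tuples; the name `two_pass_kmeans` and its return format (one cluster index per sample, plus the centroids) are choices made for this example.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def mean(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest(sample, centroids):
    """Index of the centroid closest to the sample."""
    return min(range(len(centroids)), key=lambda j: dist(sample, centroids[j]))


def two_pass_kmeans(samples, k):
    # Pass 1: the first k samples seed the k clusters.
    clusters = [[s] for s in samples[:k]]
    centroids = [mean(c) for c in clusters]
    for s in samples[k:]:
        j = nearest(s, centroids)
        clusters[j].append(s)
        centroids[j] = mean(clusters[j])  # re-compute only the altered cluster

    # Pass 2: assign every sample to its nearest final centroid,
    # without re-computing the centroids.
    labels = [nearest(s, centroids) for s in samples]
    return labels, centroids


samples = [(0, 0), (10, 10), (1, 1), (9, 9), (0, 1)]
labels, centroids = two_pass_kmeans(samples, k=2)
print(labels)  # → [0, 1, 0, 1, 0]
```

Because the seeds are simply the first k samples, the result depends on the order of the input.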

Examples

Self Organizing Maps

Examples

CLICK
Uses graph theory (connected components). The edge weights are calculated from statistical probabilities.
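To illustrate only the connected-component idea, the sketch below groups samples that are linked by sufficiently heavy edges. It is not the CLICK algorithm: the edge weights are assumed to be given as plain numbers (the probabilistic weighting is not reproduced here), and the `threshold` parameter and the function name are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(nodes, weighted_edges, threshold):
    """Group nodes linked by edges whose weight is at least `threshold`.

    weighted_edges maps (u, v) pairs to a weight; edges are undirected.
    """
    adjacency = defaultdict(set)
    for (u, v), weight in weighted_edges.items():
        if weight >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in adjacency[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(sorted(component))
    return components


edges = {("a", "b"): 0.9, ("b", "c"): 0.8, ("c", "d"): 0.2, ("d", "e"): 0.7}
print(connected_components("abcde", edges, threshold=0.5))
# → [['a', 'b', 'c'], ['d', 'e']]
```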