Agglomerative Hierarchical Clustering 1. Compute a distance matrix 2. Merge the two closest clusters 3. Update the distance matrix 4. Repeat Step 2 until.

Slides:



Advertisements
Similar presentations
SEEM Tutorial 4 – Clustering. 2 What is Cluster Analysis?  Finding groups of objects such that the objects in a group will be similar (or.
Advertisements

Clustering (2). Hierarchical Clustering Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram –A tree like.
Hierarchical Clustering
1 CSE 980: Data Mining Lecture 16: Hierarchical Clustering.
Hierarchical Clustering. Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram – A tree-like diagram that.
Data Mining Cluster Analysis Basics
Hierarchical Clustering, DBSCAN The EM Algorithm
Data Mining Cluster Analysis: Advanced Concepts and Algorithms
© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/ Midterm topics Chapter 2 Data Data preprocessing Measures of similarity/dissimilarity Chapter.
© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/ What is Cluster Analysis? l Finding groups of objects such that the objects in a group will.
Data Mining Cluster Analysis: Basic Concepts and Algorithms
Clustering Clustering of data is a method by which large sets of data is grouped into clusters of smaller sets of similar data. The example below demonstrates.
IT 433 Data Warehousing and Data Mining Hierarchical Clustering Assist.Prof.Songül Albayrak Yıldız Technical University Computer Engineering Department.
© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/ What is Cluster Analysis? l Finding groups of objects such that the objects in a group will.
Data Mining Cluster Analysis: Basic Concepts and Algorithms
Clustering II.
UPGMA Algorithm.  Main idea: Group the taxa into clusters and repeatedly merge the closest two clusters until one cluster remains  Algorithm  Add a.
1 Machine Learning: Symbol-based 10d More clustering examples10.5Knowledge and Learning 10.6Unsupervised Learning 10.7Reinforcement Learning 10.8Epilogue.
© Tan,Steinbach, Kumar Introduction to Data Mining 1/17/ Data Mining Cluster Analysis: Advanced Concepts and Algorithms Figures for Chapter 9 Introduction.
Clustering… in General In vector space, clusters are vectors found within  of a cluster vector, with different techniques for determining the cluster.
Data Mining Cluster Analysis: Advanced Concepts and Algorithms Lecture Notes for Chapter 9 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
© Tan,Steinbach, Kumar Introduction to Data Mining 1/17/ Data Mining Cluster Analysis: Basic Concepts and Algorithms Figures for Chapter 8 Introduction.
© Tan,Steinbach, Kumar Introduction to Data Mining 1/17/ Data Mining Anomaly Detection Figures for Chapter 10 Introduction to Data Mining by Tan,
Cluster Analysis.
Introduction to Information Retrieval Introduction to Information Retrieval Lecture 17: Hierarchical Clustering 1.
© Tan,Steinbach, Kumar Introduction to Data Mining 1/17/ Data Mining Classification: Alternative Techniques Figures for Chapter 5 Introduction to.
© University of Minnesota Data Mining for the Discovery of Ocean Climate Indices 1 CSci 8980: Data Mining (Fall 2002) Vipin Kumar Army High Performance.
© Tan,Steinbach, Kumar Introduction to Data Mining 1/17/ Data Mining: Exploring Data Figures for Chapter 3 Introduction to Data Mining by Tan, Steinbach,
Clustering Ram Akella Lecture 6 February 23, & 280I University of California Berkeley Silicon Valley Center/SC.
© Tan,Steinbach, Kumar Introduction to Data Mining 1/17/ Data Mining Association Analysis: Advanced Concepts Figures for Chapter 7 Introduction to.
Clustering Unsupervised learning Generating “classes”
Hierarchical Clustering
Partitional and Hierarchical Based clustering Lecture 22 Based on Slides of Dr. Ikle & chapter 8 of Tan, Steinbach, Kumar.
DOCUMENT CLUSTERING. Clustering  Automatically group related documents into clusters.  Example  Medical documents  Legal documents  Financial documents.
Clustering Algorithms k-means Hierarchic Agglomerative Clustering (HAC) …. BIRCH Association Rule Hypergraph Partitioning (ARHP) Categorical clustering.
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL Clustering COMP Research Seminar BCB 713 Module Spring 2011 Wei Wang.
CSE5334 DATA MINING CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai Li (Slides.
K-Means Algorithm Each cluster is represented by the mean value of the objects in the cluster Input: set of objects (n), no of clusters (k) Output:
Data Mining Cluster Analysis: Advanced Concepts and Algorithms Lecture Notes for Chapter 9 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Ch. Eick: Introduction to Hierarchical Clustering and DBSCAN 1 Remaining Lectures in Advanced Clustering and Outlier Detection 2.Advanced Classification.
Computational Biology Clustering Parts taken from Introduction to Data Mining by Tan, Steinbach, Kumar Lecture Slides Week 9.
Machine Learning and Data Mining Clustering (adapted from) Prof. Alexander Ihler TexPoint fonts used in EMF. Read the TexPoint manual before you delete.
Hierarchical Clustering Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram – A tree like diagram that.
Data Mining Cluster Analysis: Advanced Concepts and Algorithms
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Information Retrieval and Organisation Chapter 17 Hierarchical Clustering Dell Zhang Birkbeck, University of London.
© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/ Data Mining: Cluster Analysis This lecture node is modified based on Lecture Notes for Chapter.
Clustering Algorithms Sunida Ratanothayanon. What is Clustering?
Clustering (1) Chapter 7. Outline Introduction Clustering Strategies The Curse of Dimensionality Hierarchical k-means.
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach, Kumar Introduction.
CSE4334/5334 Data Mining Clustering. What is Cluster Analysis? Finding groups of objects such that the objects in a group will be similar (or related)
Data Mining and Text Mining. The Standard Data Mining process.
Data Mining Classification and Clustering Techniques Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach, Kumar Introduction to Data Mining.
Data Mining: Basic Cluster Analysis
More on Clustering in COSC 4335
CSE 4705 Artificial Intelligence
Hierarchical Clustering
Discrimination and Classification
CSE 5243 Intro. to Data Mining
Clustering.
Hierarchical and Ensemble Clustering
Data Mining Cluster Techniques: Basic
Information Organization: Clustering
Objectives Data Mining Course
Birch presented by : Bahare hajihashemi Atefeh Rahimi
Unsupervised Learning: Clustering
SEEM4630 Tutorial 3 – Clustering.
Hierarchical Clustering
Data Mining Cluster Analysis: Basic Concepts and Algorithms
Presentation transcript:

Agglomerative Hierarchical Clustering 1. Compute a distance matrix 2. Merge the two closest clusters 3. Update the distance matrix 4. Repeat Step 2 until only one cluster remains

Distance Between Clusters … Min distance Distance between two closest objects Min < threshold: Single-link Clustering Max distance Distance between two farthest objects Max < threshold: Complete-link Clustering Average distance Average of all pairs of objects from the two clusters

… Distance Between Clusters Mean distance Increased SSE (Ward’s Method)

Min Distance Clustering Example

... Min Distance Clustering Example ©Tan, Steinbach, Kumar Introduction to Data Mining 2004

Max Distance Clustering Example ©Tan, Steinbach, Kumar Introduction to Data Mining 2004

Average Distance Clustering Example ©Tan, Steinbach, Kumar Introduction to Data Mining 2004

Ward’s Clustering Example ©Tan, Steinbach, Kumar Introduction to Data Mining 2004

About Hierarchical Clustering Produces a hierarchy of clusters Lack of a global objective function Merging decisions are final Expensive Often used with other clustering algorithms