
Cluster Analysis
Charity Morgan, Functional Data Analysis, April 12, 2005

Sources
Everitt, B. S. (1979). Unresolved Problems in Cluster Analysis. Biometrics, 35, 169-181.
Romesburg, H. C. (1984). Cluster Analysis for Researchers. Belmont: Lifetime Learning Publications.
Everitt, B. S., Landau, S., & Leese, M. (2001). Cluster Analysis. New York: Oxford University Press.

Outline
Motivation
Introduction
Method: measure proximity; choose a clustering method (hierarchical clustering or optimization clustering); select the best clustering

Motivation – An Example
The dataset used in this presentation comes from a paper on infant temperament [Stern, H. S., Arcus, D., Kagan, J., Rubin, D. B., & Snidman, N. (1995). Using Mixture Models in Temperament Research. International Journal of Behavioral Development, 18, 407-423]. 76 infants were measured on 3 dimensions: motor activity (Motor), irritability (Cry), and fear response (Fear).

Motivation – An Example
[Figure: plot of the infant temperament data.]

Motivation
Given a data set, can we find natural groupings in the data? How can we decide how many groups exist? Could there be subgroups within the groups?

Introduction – What is Cluster Analysis?
Cluster analysis is a method to uncover groups in data. The group memberships of the data points are not known at the outset. Data points are placed into groups based on how “close” or “far apart” they are from each other.

Introduction – Examples of Cluster Analysis
Astronomy: Faundez-Abans et al. (1996) used cluster analysis to classify 192 planetary nebulae.
Psychiatry: Pilowsky et al. (1969) clustered 200 patients, using their responses to a depression symptom questionnaire.
Archaeology: Hodson (1971) used a clustering technique to group hand axes found in the British Isles.

Methods – Measurement of Proximity
Given n individuals X_1, ..., X_n, where X_i = (x_{i1}, ..., x_{ip}), we will create a dissimilarity matrix, D, where d_{ij} is the distance between individual i and individual j. There are many ways of defining distance.
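The original slide showed a table of distance measures that did not survive transcription; as an illustration, two common choices for continuous data are the Euclidean and city-block (Manhattan) distances:

d_{ij}^{\text{Euclidean}} = \left( \sum_{k=1}^{p} ( x_{ik} - x_{jk} )^2 \right)^{1/2}, \qquad d_{ij}^{\text{city-block}} = \sum_{k=1}^{p} \lvert x_{ik} - x_{jk} \rvert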


Methods – Hierarchical Clustering
The data are not partitioned into a set number of classes in one step; instead, the classification consists of a series of nested partitions. Results can be presented as a diagram known as a dendrogram. Can be agglomerative or divisive.

Methods – Hierarchical Clustering
Agglomerative: the first partition is n single-member clusters; the last partition is one cluster containing all n individuals. Divisive: the first partition is one cluster containing all n individuals; the last partition is n single-member clusters.

Methods – Agglomerative Clustering Methods
Single Linkage (Nearest Neighbor): the distance between groups is defined as that of the closest pair of individuals. Only the proximity matrix is needed, not the original data. Tends to produce unbalanced and straggly clusters, especially in large data sets.
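In symbols, the single linkage distance between clusters A and B is the smallest of the pairwise dissimilarities (standard definition, added here for completeness):

d_{AB} = \min_{i \in A,\, j \in B} d_{ij}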

Methods – Agglomerative Clustering Methods
[Figure slides: successive dissimilarity matrices for a worked example with five individuals.]

Methods – Agglomerative Clustering Methods
Continuing the worked example: add individual 3 to the cluster containing individuals 4 and 5, then merge the groups (1,2) and (3,4,5) into a single cluster.
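After each merge, the dissimilarity matrix is updated. For single linkage, the distance from the new cluster (i ∪ j) to any other cluster k reduces to a simple rule (a standard identity, added here for completeness):

d_{(i \cup j)\,k} = \min( d_{ik},\, d_{jk} )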


Methods – Agglomerative Clustering Methods
Complete Linkage (Furthest Neighbor): the distance between groups is that of the furthest pair of individuals. Tends to find compact clusters with equal diameters.
Centroid Clustering: the distance between groups is the distance between their centers. Requires the original data, not just the proximity matrix.
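In symbols (standard definitions, added for completeness; the centroid version is usually taken as the squared Euclidean distance between cluster mean vectors):

d_{AB}^{\text{complete}} = \max_{i \in A,\, j \in B} d_{ij}, \qquad d_{AB}^{\text{centroid}} = \lVert \bar{x}_A - \bar{x}_B \rVert^2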

Methods – Agglomerative Clustering Methods
[Figure slides: the worked example repeated for these methods.]

Methods – Agglomerative Clustering Methods
The final step merges clusters (1,2) and (3,4,5) into a single cluster.

Methods – Agglomerative Clustering Methods
Ward's Minimum Variance: at each stage, the objective is to fuse the two clusters whose merger keeps the variance, or within-cluster sum of squares, as small as possible.

Methods – Agglomerative Clustering Methods
That is, we want to minimize the increase in

E = \sum_{m=1}^{g} \sum_{l=1}^{n_m} \sum_{k=1}^{p} ( x_{ml,k} - \bar{x}_{m,k} )^2

where \bar{x}_{m,k} is the mean of the mth cluster for the kth variable and x_{ml,k} is the score on the kth variable for the lth object in the mth cluster.

Methods – Agglomerative Clustering Methods
Tends to find same-sized, spherical clusters. Sensitive to outliers. It is the most widely used agglomerative technique.

Methods – Divisive Clustering Methods
Can be computationally demanding if all 2^(k-1) - 1 possible divisions of a cluster of k objects into two subclusters are considered at each stage. Less commonly used than agglomerative methods.

Methods – Hierarchical Clustering of Motivating Example
Used a Euclidean distance matrix and Ward's minimum variance technique.
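The original slides showed the resulting dendrogram, which is not recoverable from the transcript. A minimal sketch of how such an analysis could be reproduced, assuming scipy as the tool (not necessarily what the author used) and random stand-in values in place of the Stern et al. data:

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Stand-in data: 76 infants scored on Motor, Cry, and Fear.
# (The actual Stern et al. values are not reproduced in this transcript.)
rng = np.random.default_rng(0)
X = rng.normal(size=(76, 3))

# Ward's method on Euclidean distances; scipy computes the distances
# internally when the raw observations are passed.
Z = linkage(X, method="ward")

# Cut the tree into a chosen number of groups (e.g. 4) and plot the tree.
labels = fcluster(Z, t=4, criterion="maxclust")
dendrogram(Z)
plt.show()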

Methods – Optimization Clustering
Assumes the number of clusters has already been fixed by the investigator. Basic idea: associated with each partition of the n individuals into the required number of groups, g, is an adequacy index c(n, g). This index is used to compare partitions.

Methods – Optimization Clustering
The concepts of homogeneity and separation can be used to develop the adequacy index. Homogeneity: objects within a group should have a cohesive structure. Separation: groups should be well isolated from each other.

Methods – Optimization Clustering Criteria
Decompose the total dispersion matrix, T, given by

T = \sum_{m=1}^{g} \sum_{l=1}^{n_m} ( x_{ml} - \bar{x} )( x_{ml} - \bar{x} )'

where \bar{x} is the overall mean vector, into T = W + B.

Methods – Optimization Clustering Criteria
W is the within-group dispersion matrix, given by

W = \sum_{m=1}^{g} \sum_{l=1}^{n_m} ( x_{ml} - \bar{x}_m )( x_{ml} - \bar{x}_m )'

and B is the between-group dispersion matrix, given by

B = \sum_{m=1}^{g} n_m ( \bar{x}_m - \bar{x} )( \bar{x}_m - \bar{x} )'

where \bar{x}_m is the mean vector and n_m the size of the mth group.
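A quick numerical check of the decomposition T = W + B (a sketch; the data and the partition into three groups are arbitrary):

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))          # 20 individuals, p = 3 variables
g = np.repeat([0, 1, 2], [7, 7, 6])   # an arbitrary partition into 3 groups

xbar = X.mean(axis=0)
T = (X - xbar).T @ (X - xbar)         # total dispersion matrix

W = np.zeros((3, 3))
B = np.zeros((3, 3))
for m in range(3):
    Xm = X[g == m]
    xm = Xm.mean(axis=0)
    W += (Xm - xm).T @ (Xm - xm)                    # within-group part
    B += len(Xm) * np.outer(xm - xbar, xm - xbar)   # between-group part

assert np.allclose(T, W + B)          # the decomposition holds exactly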

Methods – Optimization Clustering Criteria
Minimize trace(W): equivalent to maximizing trace(B), since T is fixed over all partitions. Minimizes the sum of the squared Euclidean distances between individuals and their group mean. This is the criterion optimized by the k-means algorithm. Not scale-invariant and tends to find spherical clusters.
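A minimal sketch of Lloyd's algorithm for this criterion (illustrative only; a real analysis would use a library implementation with multiple random restarts, and this sketch does not guard against empty clusters):

import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    # Lloyd's algorithm: alternately assign points to the nearest center
    # and recompute centers; each pass cannot increase trace(W).
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Each center moves to the mean of its assigned points.
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers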

Methods – Optimization Clustering Criteria
Minimize det(W): we actually want to maximize det(T)/det(W), but T is the same for all possible partitions of n individuals into g groups, so minimizing det(W) is equivalent. Can identify elliptical clusters and is scale-invariant. Tends to produce clusters that have an equal number of objects and are the same shape.

Methods – Optimization Clustering Criteria
[Figure: example partitions found under the two criteria. Panels: (a) trace(W); (b) det(W).]

Methods – Optimization Clustering Criteria
Minimize

\sum_{m=1}^{g} n_m \log \lvert W_m / n_m \rvert

where W_m is the dispersion matrix within the mth group, given by

W_m = \sum_{l=1}^{n_m} ( x_{ml} - \bar{x}_m )( x_{ml} - \bar{x}_m )'

Can produce clusters of different shapes. Not often used.

Methods – Optimization Clustering of Motivating Example
[Figure slides: optimization clustering solutions for the infant temperament data.]

Methods – Choosing the Optimal Number of Clusters
Plot the clustering criterion against the number of groups and look for large changes in the plot.
Choose g to maximize C(g) (the Calinski and Harabasz index), where

C(g) = \frac{\operatorname{trace}(B)/(g-1)}{\operatorname{trace}(W)/(n-g)}

Choose g to minimize g^2 det(W) (Marriott's criterion).
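As an illustration, C(g) can be computed directly from any candidate partition (a sketch; sklearn's calinski_harabasz_score, if that library is available, performs the same computation). The g whose partition gives the largest C(g) is chosen.

import numpy as np

def calinski_harabasz(X, labels):
    # C(g) = [trace(B)/(g-1)] / [trace(W)/(n-g)]
    n = len(X)
    groups = np.unique(labels)
    g = len(groups)
    xbar = X.mean(axis=0)
    tr_W = sum(((X[labels == m] - X[labels == m].mean(axis=0)) ** 2).sum()
               for m in groups)
    tr_B = sum((labels == m).sum()
               * ((X[labels == m].mean(axis=0) - xbar) ** 2).sum()
               for m in groups)
    return (tr_B / (g - 1)) / (tr_W / (n - g))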

Methods – Choosing the Optimal Number of Clusters
Hypothesis tests. Let J_1^2(m) be the within-cluster sum of squares of the mth cluster, and let J_2^2(m) be the within-cluster sum of squares when the mth cluster is optimally divided in two. Reject the null hypothesis that the mth cluster is homogeneous if L(m) exceeds the critical value of a standard normal, where

L(m) = \left[ 1 - \frac{J_2^2(m)}{J_1^2(m)} - \frac{2}{\pi p} \right] \left[ \frac{n_m p}{2 ( 1 - 8/(\pi^2 p) )} \right]^{1/2}

with n_m the number of objects in the mth cluster and p the number of variables.

Methods – Choosing the Optimal Number of Clusters
Let S_g^2 be the sum of squared deviations from cluster centroids when there are g clusters. A division of the n objects into g_2 clusters is significantly better than one into g_1 clusters (g_2 > g_1) if F^*(g_1, g_2) exceeds the critical value of an F distribution with degrees of freedom p(g_2 - g_1) and p(n - g_2), where

F^*(g_1, g_2) = \frac{( S_{g_1}^2 - S_{g_2}^2 ) / S_{g_2}^2}{\left( \frac{n - g_1}{n - g_2} \right) \left( \frac{g_2}{g_1} \right)^{2/p} - 1}

The End