Cluster Analysis
Dr. Michael R. Hyman
2 Introduction
Also called classification analysis and numerical taxonomy
Goal: assign objects to groups so that intra-group similarity and inter-group dissimilarity are maximized
No distinction between dependent and independent variables
Find naturally occurring groupings of objects
3 Uses in Studying Consumers
Benefit segmentation
Finding market niches
Finding homogeneous market segments for future study
Data reduction
4 Clusters Formed by Using Data on Two Characteristics
5 Scatter Plot of Income and Education Data for PC Owners and Non-owners
8 Procedure #1: Divisive (tear down)
Start with profile data
Find the variable with the highest variance
Split objects into those above and those below the mean on this variable
Find the remaining variable with the highest variance and split at its mean
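The divisive steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_on_highest_variance`, the sample profiles, the use of population variance, and the choice to put objects exactly at the mean into the lower group are all hypothetical and not taken from the slides.

```python
from statistics import mean, pvariance

def split_on_highest_variance(profiles, used=()):
    """Split objects at the mean of the unused variable with the largest variance.

    profiles: dict mapping object name -> list of numeric profile values.
    used: indices of variables already used for an earlier split.
    Returns (variable_index, low_group, high_group).
    """
    n_vars = len(next(iter(profiles.values())))
    candidates = [j for j in range(n_vars) if j not in used]
    # Population variance of each candidate variable across all objects.
    variances = {j: pvariance([row[j] for row in profiles.values()]) for j in candidates}
    best = max(variances, key=variances.get)
    cut = mean(row[best] for row in profiles.values())
    # Assumption: objects exactly at the mean go to the lower group.
    low = {k: v for k, v in profiles.items() if v[best] <= cut}
    high = {k: v for k, v in profiles.items() if v[best] > cut}
    return best, low, high

# Hypothetical profile data: [income in $000s, years of education]
profiles = {"A": [20, 12], "B": [25, 16], "C": [80, 12], "D": [90, 16]}

# First split: income has the highest variance.
var_index, low, high = split_on_highest_variance(profiles)

# Second split inside the low-income group, on the remaining variable.
next_index, low_low, low_high = split_on_highest_variance(low, used=(var_index,))
```

Each call performs one split; repeating it on the resulting groups with the already-used variables excluded mirrors the "find remaining variable" step.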
9 Procedure #2: Agglomerative (build up)
Select similarity measure
– Distance (Euclidean, city block)
– Correlation
– Similarity
Search similarity matrix for the most similar pair of clusters and merge it
Repeat iteratively until only one cluster remains
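A minimal sketch of the build-up loop follows, assuming Euclidean distance as the default measure and single linkage (closest members) as the rule for comparing clusters. The slides do not prescribe either choice, and the names `agglomerate` and `city_block` and the sample points are hypothetical. For brevity the sketch recomputes distances on each pass instead of storing a similarity matrix.

```python
from math import dist  # Euclidean distance, Python 3.8+

def city_block(p, q):
    """City block (Manhattan) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def agglomerate(points, distance=dist):
    """Merge the closest pair of clusters until only one cluster remains.

    Assumption: cluster-to-cluster distance is the distance between the
    closest members (single linkage).
    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in points]  # every object starts alone
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(distance(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

# Hypothetical objects measured on two characteristics.
points = {"A": (1.0, 1.0), "B": (1.0, 2.0), "C": (5.0, 5.0), "D": (5.0, 7.0)}
history = agglomerate(points)
history_city_block = agglomerate(points, distance=city_block)
```

The merge history records the distance at which each pair of clusters combined, which is the raw material for drawing a dendrogram or choosing where to stop.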
10 Commonly Used Similarity Coefficients 20
11 Procedure #2: Agglomerative (Stopping Rules)
Theory and practice
Distance at which clusters combine
Within/between group variance
Relative sizes of clusters
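One way to act on the "distance at which clusters combine" rule is to stop just before the largest jump in successive merge distances. The sketch below is a heuristic under that assumption, not a rule stated on the slide; the function name and the sample distances are hypothetical.

```python
def clusters_before_largest_jump(merge_distances):
    """Suggest a cluster count from the largest jump in successive merge distances.

    merge_distances[k] is the distance at which merge k happened, in order,
    so n objects produce n - 1 entries. Needs at least two merges.
    """
    n_objects = len(merge_distances) + 1
    jumps = [b - a for a, b in zip(merge_distances, merge_distances[1:])]
    k = max(range(len(jumps)), key=lambda idx: jumps[idx])
    # Stopping after merge k (0-based) means k + 1 merges were made,
    # which leaves n_objects - (k + 1) clusters.
    return n_objects - (k + 1)

# Hypothetical merge distances for four objects.
suggested = clusters_before_largest_jump([1.0, 2.0, 5.0])
```

A large jump suggests that the next merge would join clusters that are far apart, so the solution just before it is a candidate; theory and the relative sizes of the clusters still need to be weighed separately.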
12 Procedure #2: Agglomerative (Linkage Methods)
Single (nearest neighbor)
– Makes long, thin clusters
Complete (farthest neighbor, i.e., maximum distance)
– Sensitive to outliers
Average (average distance between objects)
Variance methods (minimum within-cluster variance)
Nodal (begin with two least similar objects as nodes)
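The first three linkage rules differ only in how they summarize the pairwise distances between two clusters. The sketch below shows that difference, assuming Euclidean distance; the function names and the two sample clusters are hypothetical, and the variance and nodal methods are not covered.

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+

def single_linkage(a, b):
    """Distance between the two closest members (nearest neighbor)."""
    return min(dist(p, q) for p, q in product(a, b))

def complete_linkage(a, b):
    """Distance between the two most distant members (farthest neighbor)."""
    return max(dist(p, q) for p, q in product(a, b))

def average_linkage(a, b):
    """Mean of all between-cluster pairwise distances."""
    pairs = [dist(p, q) for p, q in product(a, b)]
    return sum(pairs) / len(pairs)

# Hypothetical clusters of two objects each.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (3.0, 4.0)]

d_single = single_linkage(cluster_a, cluster_b)
d_complete = complete_linkage(cluster_a, cluster_b)
d_average = average_linkage(cluster_a, cluster_b)
```

Because complete linkage depends on the single most distant pair, one extreme object can change it substantially, which is consistent with its sensitivity to outliers.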
15 Procedure #2: Agglomerative (Reliability and Validity Assessment)
Use different distance measures
Use different clustering methods
Split data, run both halves, and compare
Shuffle cases (objects)
Solve with subset of profile variables
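Comparing two cluster solutions (for example, from different distance measures or methods) needs some agreement score. The slides do not name one; the sketch below assumes the Rand index, the share of object pairs that both solutions treat the same way. The label lists are hypothetical.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Share of object pairs on which two partitions agree.

    A pair counts as agreement when both solutions put the two objects in the
    same cluster, or both put them in different clusters. Cluster label values
    themselves do not need to match across solutions.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)

# Hypothetical cluster memberships for five objects under two solutions.
solution_1 = [0, 0, 1, 1, 1]
solution_2 = [1, 1, 0, 0, 1]
agreement = rand_index(solution_1, solution_2)
```

A value near 1 indicates that the two runs recover much the same grouping; lower values suggest the solution depends on the method or measure chosen.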
16 General Problems
Early assignments treated as permanent
– Precludes later revision for improved fit
Number of clusters
– More clusters mean greater intra-group homogeneity but less descriptive power
No good measure of cluster compactness
Lack of statistical properties makes inference difficult
17 General Problems (cont.)
Coping with inter-correlated profile variables
Must select profile variables that can discriminate among objects
Sensitive to unit of measurement and outliers
– Fix: Standardize data and delete outliers
Subjective interpretation of results (i.e., naming clusters)
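The standardization fix can be illustrated with z-scores, which put every profile variable on the same scale before distances are computed. This is a sketch assuming the population standard deviation; the function name and the sample data are hypothetical.

```python
from statistics import mean, pstdev

def standardize(columns):
    """Convert each variable to z-scores (mean 0, standard deviation 1).

    columns: dict mapping variable name -> list of numeric values.
    Assumption: every variable has nonzero variance; a constant variable
    would cause a division by zero and cannot discriminate among objects anyway.
    """
    result = {}
    for name, values in columns.items():
        m, s = mean(values), pstdev(values)
        result[name] = [(v - m) / s for v in values]
    return result

# Hypothetical data: income in dollars would otherwise dominate any distance.
data = {
    "income_dollars": [20000, 25000, 80000, 90000],
    "education_years": [12, 16, 12, 16],
}
z = standardize(data)
```

After standardizing, a one-unit difference means one standard deviation on either variable, so the clustering no longer depends on whether income is recorded in dollars or thousands of dollars.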
18 Steps for Conducting a Cluster Analysis: A Summary