Cluster Analysis Market Segmentation Document Similarity
Segment Members
Biz Tech Math = 64 Main Groups
Hierarchical Clustering
Each object is assigned to its own cluster and then the algorithm proceeds iteratively, at each stage joining the two most similar clusters, continuing until there is just a single cluster. At each stage distances between clusters are recomputed by the Lance–Williams dissimilarity update formula according to the particular clustering method being used.
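The merge sequence described above can be inspected directly on a small example. The sketch below uses five made-up points (the values and the labels a to e are assumptions, chosen so that no two distances tie) and complete linkage, which is the hclust default. It also checks one Lance–Williams update by hand: for complete linkage the coefficients are 1/2, 1/2, 0 and 1/2, which reduces to taking the larger of the two old distances. The helper lance_williams_complete is a hypothetical name, not part of R.

```r
# Toy data: five points on a line (hypothetical values, for illustration only).
x <- matrix(c(0, 1, 5, 7, 20), ncol = 1,
            dimnames = list(c("a", "b", "c", "d", "e"), NULL))
d <- dist(x)

# Agglomerative clustering with complete linkage (the hclust default).
hc <- hclust(d, method = "complete")

# Each row of hc$merge is one stage: negative entries are single objects,
# positive entries refer to clusters formed at an earlier stage.
print(hc$merge)
# Dissimilarity at which each merge happened.
print(hc$height)

# Lance-Williams update for complete linkage (hypothetical helper):
# d(k, i+j) = 0.5*d(k,i) + 0.5*d(k,j) + 0.5*|d(k,i) - d(k,j)|
lance_williams_complete <- function(d_ki, d_kj) {
  0.5 * d_ki + 0.5 * d_kj + 0.5 * abs(d_ki - d_kj)
}

# After a and b are joined, the distance from c to the new cluster {a, b}:
dm <- as.matrix(d)
lw <- lance_williams_complete(dm["c", "a"], dm["c", "b"])
print(lw)
```

With four merges for five objects, the last row of hc$merge always describes the single cluster that contains everything.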
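# Load the survey responses and convert them to a matrix
biztech <- read.csv("survey-biztech.csv")
biztech <- as.matrix(biztech)

# Hierarchical clustering: pairwise distances between rows (Euclidean by default)
d <- dist(biztech)

# Save the full distance matrix for inspection
dm <- as.matrix(d)
write.csv(dm, "distance_matrix.csv")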
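# Build the hierarchy (complete linkage is the hclust default) and plot the dendrogram
hc <- hclust(d)
plot(hc)
# Draw red boxes around the 6 clusters on the dendrogram
rect.hclust(hc, k = 6, border = "red")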
# Cut the tree into 6 clusters; ct holds one cluster label per row
ct <- cutree(hc, k = 6)
# Write the cluster assignments to file
write.csv(ct, "survey-hclust.csv")
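Once the tree is cut, a natural next step is to look at how many members each segment has and what a typical member looks like. The sketch below is one possible way to do that. It runs on simulated data, since the survey file is not available here: the data frame survey, its column names biz, tech and math, and the 1 to 5 score range are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical stand-in for the survey: 30 respondents, three scores each.
survey <- data.frame(
  biz  = runif(30, 1, 5),
  tech = runif(30, 1, 5),
  math = runif(30, 1, 5)
)

# Same steps as above: distances, hierarchy, cut into 6 clusters.
hc <- hclust(dist(survey))
ct <- cutree(hc, k = 6)

# Number of members in each segment.
sizes <- table(ct)
print(sizes)

# Mean score per segment, one row per cluster label.
profiles <- aggregate(survey, by = list(segment = ct), FUN = mean)
print(profiles)
```

The same two calls, table and aggregate, would apply to the real biztech data as long as its columns are numeric.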
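Hierarchical clustering is expensive: the distance matrix alone needs O(n^2) memory, and a naive agglomerative algorithm takes O(n^3) time. In return it often gives better results, and the dendrogram lets you choose the number of clusters after the fact.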
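To get a feel for the cost, the short calculation below counts the pairwise distances that dist has to store for a few sample sizes. It assumes 8 bytes per stored distance (a double) and ignores all other overhead, so the figures are rough lower bounds. The sample sizes are arbitrary.

```r
# Sample sizes to compare (arbitrary choices for illustration).
n <- c(100, 1000, 10000, 100000)

# A dist object stores one value per unordered pair of rows: n * (n - 1) / 2.
n_pairs <- n * (n - 1) / 2

# Approximate storage, assuming 8 bytes per distance.
approx_gib <- n_pairs * 8 / 2^30

print(data.frame(n = n, pairs = n_pairs, approx_GiB = signif(approx_gib, 3)))

# Sanity check of the pair count against a real dist object with 10 rows.
n_check <- length(dist(matrix(0, nrow = 10, ncol = 2)))
print(n_check)
```

At 100,000 rows the distances alone come to roughly 37 GiB under this assumption, which is why hierarchical clustering is usually run on samples or on modest data sets.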
Cold Weather