1
Clustering
2
Revision of Yesterday's Algorithm
3
K-Means Algorithm
Each cluster is represented by the mean value of the objects in the cluster.
Input: set of n objects, number of clusters k
Output: set of k clusters
Algorithm:
Randomly select k samples and mark them as the initial clusters.
Repeat:
    Assign (or reassign) each sample to the cluster to which it is most similar, based on the mean of the cluster.
    Update each cluster's mean.
Until no change.
4
K-Means (graph)
Step 1: Form k centroids, randomly.
Step 2: Calculate the distance between the centroids and each object. Use the Euclidean distance to determine the minimum distance: $d(A,B) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
Step 3: Assign objects to the k clusters based on minimum distance.
Step 4: Calculate the centroid of each cluster using $C = \left( \frac{x_1 + x_2 + \dots + x_n}{n}, \frac{y_1 + y_2 + \dots + y_n}{n} \right)$
Go to Step 2. Repeat until there is no change in the centroids.
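The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the sample `points`, the fixed `seed`, the `max_iter` cap and the rule that an empty cluster keeps its old centroid are all assumptions added for the example.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two 2-D points."""
    return math.sqrt((b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2)

def centroid(points):
    """Mean of the x and y coordinates of a non-empty list of points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 1: pick k objects at random as the initial centroids
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(max_iter):
        # Steps 2-3: assign each object to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 4: recompute centroids (assumption: an empty cluster keeps its old one)
        new_centroids = [centroid(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:  # no change in centroids: stop
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data: two well-separated groups of three points each
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
```

On data this well separated the result does not depend on which two points are drawn first; on real data, different random starts can give different clusters.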
5
K-Medoid (PAM)
Also called Partitioning Around Medoids.
Step 1: Choose k medoids.
Step 2: Assign all points to the closest medoid.
Step 3: Form a distance matrix for each cluster and choose the next best medoid, i.e., the point closest to all other points in the cluster.
Go to Step 2. Repeat until there is no change in any medoid.
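A minimal Python sketch of this alternating procedure follows. It implements only the assign-then-update loop described in the steps, not the full swap search of classical PAM. The sample `points`, the initial medoids and the `max_iter` cap are assumptions chosen for illustration.

```python
import math

def k_medoids(points, initial_medoids, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(initial_medoids)            # Step 1: the chosen k medoids
    clusters = []
    for _ in range(max_iter):
        # Step 2: assign every point to its closest medoid
        clusters = [[] for _ in medoids]
        for p in points:
            nearest = min(range(len(medoids)),
                          key=lambda i: math.dist(p, medoids[i]))  # Python 3.8+
            clusters[nearest].append(p)
        # Step 3: in each cluster, the new medoid is the point with the
        # smallest total distance to all other points of that cluster
        new_medoids = [min(c, key=lambda cand: sum(math.dist(cand, q) for q in c))
                       for c in clusters]
        if new_medoids == medoids:             # no medoid changed: stop
            break
        medoids = new_medoids
    return medoids, clusters

# Hypothetical data and a deliberately poor start (both medoids in one group)
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
medoids, clusters = k_medoids(points, initial_medoids=[(1.0, 1.0), (1.5, 2.0)])
```

Unlike a K-Means centroid, each medoid returned here is always one of the original data points.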
6
What are Agglomerative Algorithms?
Bottom-up approach
Simple
Outputs a hierarchy
Structure is more informative
Need not specify the number of clusters
7
Dendrogram
8
Euclidean Distance
9
Distance Matrix
10
Agglomerative Algorithm
Step 1: Make each object a cluster.
Step 2: Calculate the Euclidean distance from every point to every other point, i.e., construct a distance matrix.
Step 3: Identify the two clusters with the shortest distance and merge them.
Go to Step 2. Repeat until all objects are in one cluster.
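The loop above can be sketched as follows. The steps do not say how to measure the distance between two clusters that each hold several objects; this sketch assumes the smallest point-to-point distance (single link). The sample `points` are hypothetical.

```python
import math

def agglomerative(points):
    """Merge the two closest clusters until one remains; return the merge history."""
    n = len(points)
    # Step 2: distance matrix between every pair of objects
    d = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    # Step 1: every object starts as its own cluster (stored as a list of indices)
    clusters = [[i] for i in range(n)]
    history = []
    while len(clusters) > 1:
        # Step 3: find the pair of clusters with the shortest distance
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Assumption: cluster distance = smallest pairwise distance
                gap = min(d[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or gap < best[0]:
                    best = (gap, a, b)
        gap, a, b = best
        history.append((clusters[a], clusters[b], gap))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return history

# Hypothetical objects placed on a line at 1, 2, 4 and 8
points = [(1.0, 0.0), (2.0, 0.0), (4.0, 0.0), (8.0, 0.0)]
history = agglomerative(points)
```

Each entry of `history` records which two clusters were merged and at what distance, which is the information a dendrogram displays.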
11
Agglomerative Algorithm Approaches
Single Link
Complete Link
Average Link
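The three approaches differ only in how the distance between two clusters is derived from the pairwise distance matrix: single link takes the smallest pairwise distance, complete link the largest, and average link the mean over all pairs. A small sketch, with a hypothetical 3-object matrix `d` and clusters given as lists of object indices:

```python
def single_link(d, a, b):
    """Smallest distance between any object in a and any object in b."""
    return min(d[i][j] for i in a for j in b)

def complete_link(d, a, b):
    """Largest distance between any object in a and any object in b."""
    return max(d[i][j] for i in a for j in b)

def average_link(d, a, b):
    """Mean distance over all pairs (one object from a, one from b)."""
    return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

# Hypothetical symmetric distance matrix for three objects 0, 1, 2
d = [
    [0, 1, 4],
    [1, 0, 2],
    [4, 2, 0],
]
a, b = [0, 1], [2]
results = (single_link(d, a, b), complete_link(d, a, b), average_link(d, a, b))
```

For this matrix the three criteria give 2, 4 and 3.0 for the distance between clusters {0, 1} and {2}.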
12
Simple Example
Item   E   A   C   B   D
       1   2   3   5   6
13
Another Example
Use the single link technique to find clusters in the given data.
Point   X      Y
1       0.40   0.53
2       0.22   0.38
3       0.35   0.32
4       0.26   0.19
5       0.08   0.41
6       0.45   0.30
14
Plot given data
15
Construct a distance matrix
      1      2      3      4      5      6
1     0
2     0.24   0
3     0.22   0.15   0
4     0.37   0.20   0.15   0
5     0.34   0.14   0.28   0.29   0
6     0.23   0.25   0.11   0.22   0.39   0
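As a rough check, the distance matrix can be recomputed from the six points of the example. The sketch below stores only the lower triangle, keyed by (row, column); the function name `distance_matrix` is a hypothetical helper. Values recomputed and rounded this way may differ from the slide's figures in the second decimal place, presumably because of different rounding.

```python
import math

# The six points of the example, keyed by point number
points = {
    1: (0.40, 0.53),
    2: (0.22, 0.38),
    3: (0.35, 0.32),
    4: (0.26, 0.19),
    5: (0.08, 0.41),
    6: (0.45, 0.30),
}

def distance_matrix(points):
    """Lower triangle of the Euclidean distance matrix, rounded to 2 decimals."""
    labels = sorted(points)
    return {
        (i, j): round(math.dist(points[i], points[j]), 2)  # Python 3.8+
        for i in labels
        for j in labels
        if j < i
    }

dm = distance_matrix(points)

# The two nearest points form the first merge in any of the linkage approaches
closest = min(dm, key=dm.get)
```

Here `closest` is the pair (6, 3), so points 3 and 6 are merged first.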
16
Identify two nearest clusters
17
Repeat the process until all objects are in the same cluster
18
Average link Average distance matrix
19
Use the data below to draw single link, complete link and average link dendrograms.
Object X Y A 2 B 3 C 1 D E 1.5 0.5