Published by Prudence Carr. Modified over 9 years ago.
1
CLUSTERING
DENSITY-BASED METHODS
Elsayed Hemayed
Data Mining Course
2
Outline
Density-based Clustering Methods
  Density-Based Clustering Methods
  Density-Based Clustering: Background
  Terminology
  How does DBSCAN find clusters?
  DBSCAN
3
Clustering Methods
  Partitioning methods: K-Means
  Hierarchical methods: Agglomerative Hierarchical Clustering, Divisive Hierarchical Clustering
  Density-based methods: DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
  Grid-based methods: STING (A Statistical Information Grid Approach to Spatial Data Mining)
  Model-based methods: Expectation-Maximization, Neural Network Approach
  High-dimensional data clustering: CLIQUE (A Dimension-Growth Subspace Clustering Method)
4
DBSCAN
5
Density-Based Clustering Methods
Clusters are defined by density (for example, density-connected points) rather than directly by a distance metric.
A cluster is a set of “density-connected” points.
Major features:
  Discovers clusters of arbitrary shape
  Handles noise
  Needs “density parameters” as a termination condition (growth stops when no new objects can be added to the cluster)
Examples:
  DBSCAN (Ester et al., 1996)
  OPTICS (Ankerst et al., 1999)
  DENCLUE (Hinneburg & Keim, 1998)
6
Density-Based Clustering: Background
Eps-neighborhood: the neighborhood within a radius Eps of a given object.
MinPts: the minimum number of points required in the Eps-neighborhood of an object.
Core object: an object whose Eps-neighborhood contains at least MinPts points.
Directly density-reachable: a point p is directly density-reachable from a point q with respect to Eps and MinPts if
  1) p is within the Eps-neighborhood of q, and
  2) q is a core object.
[Figure: points p and q, with MinPts = 5 and Eps = 1]
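The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names, the use of Euclidean distance, the convention that a point counts as a member of its own Eps-neighborhood, and the sample coordinates are all assumptions made for the example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def eps_neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i].

    Assumption: points[i] itself is counted as part of its neighborhood.
    """
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def is_core(points, i, eps, min_pts):
    """A core object has at least min_pts points in its Eps-neighborhood."""
    return len(eps_neighborhood(points, i, eps)) >= min_pts


def directly_density_reachable(points, p, q, eps, min_pts):
    """True if point p is directly density-reachable from point q:
    1) p lies in the Eps-neighborhood of q, and 2) q is a core object."""
    return p in eps_neighborhood(points, q, eps) and is_core(points, q, eps, min_pts)


# Hypothetical sample data: three close points and one point on the fringe.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (1.4, 0.0)]
print(directly_density_reachable(points, p=3, q=1, eps=1.0, min_pts=3))  # True
print(directly_density_reachable(points, p=1, q=3, eps=1.0, min_pts=3))  # False
```

The two calls show that the relation is not symmetric: point 3 is reachable from the core object 1, but point 1 is not reachable from point 3, because point 3 is not a core object.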
7
Density Reachability and Density Connectivity
M, P, O and R are core objects, since each has an Eps-neighborhood containing at least 3 points.
[Figure: MinPts = 3; Eps = radius of the circles]
8
Directly density-reachable
Q is directly density-reachable from M.
M is directly density-reachable from P, and vice versa.
9
Indirectly density-reachable
Q is (indirectly) density-reachable from P, since Q is directly density-reachable from M and M is directly density-reachable from P.
However, P is not density-reachable from Q, since Q is not a core object.
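The chain argument above can be checked with a short breadth-first search that only continues through core objects. This is a sketch under assumptions: the coordinates below are invented to mimic the roles of P, M and Q (plus one extra point A so that P is a core object), the distance is Euclidean, and a point is counted in its own Eps-neighborhood.

```python
from collections import deque
from math import dist


def neighbors(points, i, eps):
    """Indices of points within eps of points[i] (the point itself included)."""
    return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]


def density_reachable(points, p, q, eps, min_pts):
    """True if p is density-reachable from q, i.e. there is a chain
    q = p1, ..., pn = p where each step is directly density-reachable."""
    seen = {q}
    queue = deque([q])
    while queue:
        cur = queue.popleft()
        nbrs = neighbors(points, cur, eps)
        if len(nbrs) < min_pts:
            continue  # cur is not a core object, so no chain continues through it
        for j in nbrs:
            if j == p:
                return True
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return False


# Hypothetical layout: index 0 = P, 1 = A (helper), 2 = M, 3 = Q.
points = [(0.0, 0.0), (-0.6, 0.0), (0.9, 0.0), (1.8, 0.0)]
P, M, Q = 0, 2, 3
print(density_reachable(points, p=Q, q=P, eps=1.0, min_pts=3))  # True (P -> M -> Q)
print(density_reachable(points, p=P, q=Q, eps=1.0, min_pts=3))  # False (Q is not core)
```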
10
Core, border, and noise points
DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise.
Density = number of points within a specified radius (Eps).
A point is a core point if it has at least a specified number of points (MinPts) within Eps. These are the points in the interior of a cluster.
A border point has fewer than MinPts points within Eps, but lies in the neighborhood of a core point.
A noise point is any point that is neither a core point nor a border point.
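A minimal sketch of this three-way classification follows. The function name, the brute-force neighbor search, the Euclidean distance, the convention that a point counts toward its own density, and the sample data are assumptions made for illustration only.

```python
from math import dist


def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border' or 'noise'."""
    n = len(points)
    # Eps-neighborhood of each point (brute force; the point itself is included).
    nbrs = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
            for i in range(n)]
    core = {i for i in range(n) if len(nbrs[i]) >= min_pts}

    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nbrs[i]):
            labels.append("border")  # not dense itself, but near a core point
        else:
            labels.append("noise")
    return labels


# Hypothetical data: a small dense group, one fringe point, one isolated point.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (1.4, 0.0), (5.0, 5.0)]
print(classify_points(points, eps=1.0, min_pts=3))
# ['core', 'core', 'core', 'border', 'noise']
```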
11
How does DBSCAN find clusters?
DBSCAN searches for clusters by checking the Eps-neighborhood of each point in the database.
If the Eps-neighborhood of a point p contains at least MinPts points, a new cluster with p as a core object is created.
DBSCAN then iteratively collects directly density-reachable objects from these core objects, which may involve merging a few density-reachable clusters.
The process terminates when no new point can be added to any cluster.
12
DBSCAN Algorithm
  1) Arbitrarily select a point p.
  2) Retrieve all points density-reachable from p with respect to Eps and MinPts.
  3) If p is a core point, a cluster is formed.
  4) If p is a border point, no points are density-reachable from p, and DBSCAN visits the next point of the database.
  5) Continue the process until all of the points have been processed.
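The steps above can be sketched as a compact, brute-force Python function. This is an illustrative sketch, not the author's implementation: the names `dbscan`, `region_query` and `NOISE`, the label encoding (cluster ids 0, 1, ... and -1 for noise), the Euclidean distance, and the sample data are all assumptions. The neighbor search is quadratic in the number of points, with no spatial index.

```python
from collections import deque
from math import dist

NOISE = -1  # assumed label for points that end up in no cluster


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def region_query(i):
        # Eps-neighborhood of point i, the point itself included.
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    cluster_id = 0
    for i in range(n):  # "arbitrarily select a point p": here, in index order
        if labels[i] is not None:
            continue
        nbrs = region_query(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # not core; may be relabelled as a border point
            continue
        labels[i] = cluster_id  # p is a core point: a new cluster is formed
        queue = deque(nbrs)
        while queue:  # collect everything density-reachable from p
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point joins the cluster
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster_id
            j_nbrs = region_query(j)
            if len(j_nbrs) >= min_pts:  # only core points extend the cluster
                queue.extend(j_nbrs)
        cluster_id += 1
    return labels


# Hypothetical data: two dense groups and one isolated point.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (1.4, 0.0), (5.0, 5.0),
          (10.0, 10.0), (10.5, 10.0), (10.0, 10.5)]
print(dbscan(points, eps=1.0, min_pts=3))  # [0, 0, 0, 0, -1, 1, 1, 1]
```

In this sketch a border point that lies near two clusters is assigned to whichever cluster reaches it first, so its label can depend on the order in which points are visited.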
13
DBSCAN Summary
DBSCAN is a density-based clustering method based on connected regions with sufficiently high density.
The algorithm grows regions with sufficiently high density into clusters and discovers clusters of arbitrary shape in spatial databases with noise.
It defines a cluster as a maximal set of density-connected points.
Cluster membership is therefore decided by density-connectivity rather than directly by a distance criterion, unlike in hierarchical methods.
14
Summary
  Density-Based Clustering Methods
  Density-Based Clustering: Background
  Terminology
  How does DBSCAN find clusters?
  DBSCAN