Clustering John Owen Sarah Smith.


1 Clustering John Owen Sarah Smith

2 What is clustering? Grouping together objects that are like one another and not like the objects in other clusters. Like sorting laundry…

3 What is Clustering?
Has its roots in statistical analysis
In data mining, clustering is used to give a user a high-level view of what is going on in their database.

4 Clustering Approach
Algorithms can be complex
The general approach contains five steps:
Pattern representation
Identifying the pattern proximity relative to the data domain
Grouping or clustering of the data
Data abstraction
Assessment of output
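The pattern proximity step can be illustrated with a small sketch. It assumes each pattern is represented as a tuple of numbers and that Euclidean distance is a suitable proximity measure for the data domain; neither choice comes from the slides, and the name `proximity_matrix` is hypothetical.

```python
import math


def proximity_matrix(patterns):
    """Return pairwise Euclidean distances between numeric patterns.

    Assumption: Euclidean distance fits the data domain. Other domains
    may call for a different proximity measure.
    """
    return [[math.dist(a, b) for b in patterns] for a in patterns]


# Three patterns, each represented as a point in two dimensions.
patterns = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = proximity_matrix(patterns)
```

Small entries in `matrix` mark patterns that are candidates for the same cluster in the grouping step.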

5 Four Clustering Methods
Partitioning (k-means clustering) Hierarchical Density-Based Grid-Based

6 Partitioning (k-means clustering)
Classification of the data into k groups that meet two requirements:
each group must contain at least one object
each object must belong to exactly one group
The analyst decides how many clusters there should be, then creates the best fit of points to clusters
The analyst must know the data to do this
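A minimal sketch of k-means in plain Python may make the partitioning idea concrete. It assumes numeric points, Euclidean distance, and a naive initialization that takes the first k points as starting centroids; these choices and the function name `kmeans` are illustrative assumptions, not something the slides specify.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Assign each point to exactly one of k clusters.

    Assumptions: Euclidean distance, and the first k points as the
    initial centroids (a naive choice made only for this sketch).
    """
    centroids = [points[i] for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
```

The analyst still has to supply `k`, which is why knowing the data matters for this method.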

7 Partitioning Example (Source k-means clustering

8 Hierarchical Clustering
Analyst need not know the data
Designed primarily for creating micro-clusters in large data sets
The hierarchical method is either agglomerative (bottom-up) or divisive (top-down)
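The agglomerative (bottom-up) variant can be sketched as follows. The sketch assumes single linkage (distance between the closest members of two clusters) and stops at a requested number of clusters; both are illustrative assumptions, and `agglomerative` is a hypothetical name.

```python
import math


def agglomerative(points, target_clusters):
    """Bottom-up clustering: start with one cluster per point and
    repeatedly merge the two closest clusters.

    Assumptions: single linkage and a fixed stopping count, chosen
    only to keep the sketch short.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > target_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
merged = agglomerative(points, target_clusters=2)
```

Each entry of `merged` is a list of point indices. Recording the order of merges instead of stopping early would give the full hierarchy.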

9 Hierarchical Example (Source

10 Density-Based Clustering
Defines clusters by the density of the data distribution
Does not require the user to identify the number of clusters before beginning the data analysis
Useful for dealing with outliers
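One well-known density-based algorithm is DBSCAN; the slides do not name a specific algorithm, so using it here is an assumption. The sketch below is a simplified plain-Python version in which `eps` (neighborhood radius) and `min_pts` (minimum neighbors for a dense region) are the tuning parameters, and points in sparse regions are labeled as noise.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Simplified DBSCAN sketch: grow clusters from dense regions.

    Returns one label per point; NOISE marks outliers. The number of
    clusters is discovered, not supplied.
    """
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is dense too, keep expanding
        cluster += 1
    return labels


points = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3),
          (8.0, 8.0), (8.1, 8.2), (7.9, 8.1),
          (4.5, 15.0)]
labels = dbscan(points, eps=1.0, min_pts=3)
```

The isolated point at the end of `points` has no dense neighborhood, so it is reported as noise instead of being forced into a cluster.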

11 Density-Based Examples
(Source:

12 Grid-Based Clustering
Adaptation of Density-Based Clustering
Data points are placed in a data grid
Each data grid is of equal size
Grids can be decomposed into smaller grids
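A rough sketch of the grid idea: place two-dimensional points into equal-size cells, keep the cells that hold enough points, and join neighboring dense cells into clusters. The parameters `cell_size` and `min_count`, the rule for joining adjacent cells, and the name `grid_cluster` are all assumptions made for illustration; the decomposition into smaller grids is not shown.

```python
from collections import defaultdict


def grid_cluster(points, cell_size, min_count):
    """Cluster 2-D points by binning them into equal-size grid cells.

    Assumptions: a cell is dense when it holds at least min_count
    points, and dense cells that touch (including diagonally) belong
    to the same cluster.
    """
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(int(x // cell_size), int(y // cell_size))].append(idx)

    dense = {cell for cell, members in cells.items()
             if len(members) >= min_count}

    clusters = []
    seen = set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        members = []
        while stack:
            cx, cy = stack.pop()
            members.extend(cells[(cx, cy)])
            # Visit the eight surrounding cells.
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbor = (cx + dx, cy + dy)
                    if neighbor in dense and neighbor not in seen:
                        seen.add(neighbor)
                        stack.append(neighbor)
        clusters.append(sorted(members))
    return clusters


points = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3),
          (8.2, 8.2), (8.4, 8.6), (9.1, 8.3),
          (4.5, 15.0)]
grid_clusters = grid_cluster(points, cell_size=2.0, min_count=2)
```

Each entry of `grid_clusters` is a list of point indices. Points that fall in sparse cells, such as the last one here, are left out of every cluster.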

13 Grid-Based Example

14 Business Uses of Clustering
Marketing
Identifying customers/clients who are outliers
Detection of credit card fraud
Scientific inquiry
Human genome

15 Future of Clustering
AI
Unsupervised learning from pattern recognition

