Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology: A New Cluster Validity Index for Data with Merged Clusters and Different Densities


1 Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
A New Cluster Validity Index for Data with Merged Clusters and Different Densities
Advisor: Dr. Hsu
Presenter: Ai-Chen Liao
Authors: Benson S.Y. Lam and Hong Yan
ICSMC 2005

2 Intelligent Database Systems Lab, N.Y.U.S.T. I. M.
Outline
• Motivation
• Objective
• The Proposed Method
• Comparisons with Existing Methods
• Conclusion
• Personal Opinions

3 Motivation
• Several cluster validity measures have been proposed for evaluating clustering results.
• However, existing methods may not work well for the following two kinds of data sets:
─ (1) The data set contains cluster groups with different densities.
─ (2) Some of the cluster groups are closely positioned.

4 Objective
• Propose a cluster validity index that produces more accurate results than existing indices and can handle the two kinds of data sets mentioned above: clusters with different densities and closely positioned clusters.

5 Background: Clustering
• The role of cluster validity is to provide a test that both estimates the optimal number of cluster groups c in the data set and examines the quality of the clustering result.
• There are three classes of validity indices:
─ (1) Cost-function-based indices, e.g. FOM, AIC, BIC
─ (2) Density-based indices, e.g. PC
─ (3) Geometric approaches, e.g. DI, I-index, CH

6 The Proposed Method
• We measure the validity of a clustering result as the squared total length of the data eigen-axes of the cluster groups relative to the between-cluster separation.
• The proposed geometric index (GI) is defined as:
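The defining equation was an image on the slide and is not preserved in this transcript. The stated principle can still be illustrated with a minimal sketch, which should not be read as the authors' exact GI formula. The assumptions here, none of which come from the slides, are: the data are 2-D; the length of each eigen-axis is taken as 2·sqrt(λ), where λ is an eigenvalue of the cluster's covariance matrix; the separation is the squared distance to the nearest other cluster centre; and the per-cluster ratios are combined by taking the largest one. The function names `cluster_stats` and `geometric_ratio` are hypothetical.

```python
import math


def cluster_stats(points):
    """Return the centre and the two eigen-axis lengths of a 2-D point set."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the 2x2 (population) covariance matrix.
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    det = sxx * syy - sxy ** 2
    disc = math.sqrt(max(half_trace ** 2 - det, 0.0))
    eigenvalues = (half_trace + disc, max(half_trace - disc, 0.0))
    # Assumption: an eigen-axis has length 2 * sqrt(eigenvalue).
    axes = [2 * math.sqrt(lam) for lam in eigenvalues]
    return (mx, my), axes


def geometric_ratio(clusters):
    """Squared total eigen-axis length over squared separation (lower is better).

    Assumption: the per-cluster ratios are combined by taking the largest,
    so the least well-separated cluster determines the score.
    Requires at least two clusters.
    """
    stats = [cluster_stats(c) for c in clusters]
    ratios = []
    for k, (centre_k, axes_k) in enumerate(stats):
        total_length = sum(axes_k)
        nearest_sq = min(
            (centre_k[0] - centre_q[0]) ** 2 + (centre_k[1] - centre_q[1]) ** 2
            for q, (centre_q, _) in enumerate(stats)
            if q != k
        )
        ratios.append(total_length ** 2 / nearest_sq)
    return max(ratios)


# Two unit squares of points: far apart, then close together.
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
far = [(x + 10, y) for x, y in square]
near = [(x + 2, y) for x, y in square]

print(geometric_ratio([square, far]))
print(geometric_ratio([square, near]))
```

Under these assumptions the far-apart pair scores lower than the close pair, matching the principle that a smaller value indicates a better-separated, more compact clustering.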

7 The Proposed Method
• V1, extent of the surrounding rectangle: $2 \cdot 2L = 4L$, so $(4L)^2 = 16L^2$, giving $16L^2/S^2$.
• V3, extent of the surrounding rectangle: $2 \cdot (L + \tfrac{1}{2}L) = 3L$, so $(3L)^2 = 9L^2$, giving $9L^2/S_3 = 18L^2/S^2$.

8 Comparisons with Existing Methods

9 Comparisons with Existing Methods

10 Conclusion
• We have introduced a new cluster validity index, which measures the geometrical features of the data set.
• Its validity principle is to measure the squared total length of the data eigen-axes of the cluster groups relative to the between-cluster separation. If this value reaches its minimum, the clustering result is an optimal solution.
• The proposed index produces more accurate results than the existing methods it was compared with.

11 Personal Opinions
• Advantage: …
• Drawback: …
• Application: evaluating clustering results

