Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology A new data clustering approach- Generalized cellular automata.


1 Intelligent Database Systems Lab, 國立雲林科技大學 (National Yunlin University of Science and Technology)
A new data clustering approach: Generalized cellular automata
Presenter: Shao-Wei Cheng
Authors: Dianxun Shuai, Yumin Dong, Qing Shuai
IS 2007

2 Outline
- Motivation
- Objective
- Methodology
- Experiments
- Conclusion
- Personal Comments

3 Motivation
Many clustering methods have the following limitations and shortcomings in enterprise computing:
- The run time increases rapidly.
- They need some pairwise computation or pre-processing.
- There is no guarantee of clustering optimality.
- Clustering performance and quality are sensitive to cluster shape and cluster distribution.
- They cannot suppress the effect of noise well.
- Clustering performance is poor for high-dimensional data.
- They have no learning ability.
- Dynamic changes to the clustered data objects are usually not allowed during algorithm execution.
[Figure labels: "Start", "Local solution"]

4 Objectives
This paper presents a novel GCA (generalized cellular automata) for self-organizing data clustering in enterprise computing, intended to overcome the limitations and shortcomings above.
A GCA has the following components and features:
- Parallel computation
- Local
- Homogeneous
- Cells
- States
- Neighborhood
- Rule
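The components listed above (cells, states, a neighborhood, a rule) and the features (parallel, local, homogeneous) can be illustrated with a toy cellular automaton. This is only a sketch of the general idea, not the GCA rule from the paper: the one-dimensional ring layout, the `spread` rule, and the function names are all assumptions made for illustration.

```python
from typing import Callable, List

# A rule maps (own state, left neighbor, right neighbor) to the next state.
Rule = Callable[[int, int, int], int]


def step(cells: List[int], rule: Rule) -> List[int]:
    """Apply one synchronous update to a ring of cells.

    Parallel: every new state is computed from the OLD list only.
    Local: a cell sees just itself and its two adjacent cells.
    Homogeneous: the same rule is applied to every cell.
    """
    n = len(cells)
    return [rule(cells[i], cells[(i - 1) % n], cells[(i + 1) % n]) for i in range(n)]


def spread(state: int, left: int, right: int) -> int:
    """Hypothetical toy rule: a cell becomes 1 if it or a neighbor is 1."""
    return 1 if (state or left or right) else 0


if __name__ == "__main__":
    cells = [0, 1, 0, 0]
    print(step(cells, spread))
```

Because `step` builds a new list instead of updating in place, the result does not depend on the order in which cells are visited, which is what makes the update effectively parallel.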

5 Methodology
Rule
- The cells form an $N \times N$ cellular array.
- $s_{ij}(t)$ is the state of the cell $c_{ij}$ at time $t$; the empty state is denoted by $\varnothing$.
- $p = 1$, $f(\Delta H)$, or $1 - f(\Delta H)$, where $\Delta H$ is the harmony increment.
- $\Gamma(t)$ is a matrix.
- $w_{ij}$ is a weight coefficient.
- Neighborhood: $N_{ij} = \{ c_{i,j-1},\ c_{i,j+1},\ c_{i-1,j},\ c_{i+1,j} \}$
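The neighborhood N_ij defined above (the four cells to the left, right, above, and below) can be sketched as a small helper. The slide does not say how cells on the edge of the array are handled; this sketch assumes out-of-range neighbors are simply dropped, and the function name is hypothetical.

```python
from typing import List, Tuple


def von_neumann_neighborhood(i: int, j: int, n: int) -> List[Tuple[int, int]]:
    """Return the indices of N_ij on an n x n array.

    Order follows the slide: (i, j-1), (i, j+1), (i-1, j), (i+1, j).
    Assumption: neighbors that fall outside the array are omitted
    (no wrap-around); the source does not specify the boundary rule.
    """
    candidates = [(i, j - 1), (i, j + 1), (i - 1, j), (i + 1, j)]
    return [(a, b) for a, b in candidates if 0 <= a < n and 0 <= b < n]


if __name__ == "__main__":
    print(von_neumann_neighborhood(1, 1, 3))  # interior cell: four neighbors
    print(von_neumann_neighborhood(0, 0, 3))  # corner cell: two neighbors
```

A wrap-around (toroidal) boundary would be an equally plausible choice; it would replace the range check with index arithmetic modulo `n`.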

6 Methodology
- $d(s_{ij}(t), s_{i'j'}(t))$ is the similarity between the states of two cells.
- If $s_{ij}(t) \neq \varnothing$ and $s_{i'j'}(t) \neq \varnothing$, then $0 \le d(s_{ij}(t), s_{i'j'}(t)) \le 1$.
- Otherwise, $d(s_{ij}(t), s_{i'j'}(t)) = -1$.
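The two cases of the similarity d above can be sketched as follows. The slide only fixes the range (between 0 and 1 when both states are non-empty, and -1 otherwise); the concrete measure used here, 1 / (1 + Euclidean distance), is an assumption chosen because it stays within that range, and `EMPTY` and `similarity` are hypothetical names.

```python
import math
from typing import Optional, Sequence

# Assumption: None stands in for the empty state.
EMPTY = None


def similarity(a: Optional[Sequence[float]], b: Optional[Sequence[float]]) -> float:
    """Similarity between two cell states.

    Returns -1 if either state is empty, as on the slide.
    Otherwise returns 1 / (1 + Euclidean distance), which lies in (0, 1];
    this particular formula is an illustrative assumption, not the paper's.
    """
    if a is EMPTY or b is EMPTY:
        return -1.0
    return 1.0 / (1.0 + math.dist(a, b))


if __name__ == "__main__":
    print(similarity((0.0, 0.0), (0.0, 0.0)))  # identical objects
    print(similarity((0.0, 0.0), (3.0, 4.0)))  # distance 5
    print(similarity(EMPTY, (1.0, 2.0)))       # one empty state
```

Returning -1 for empty states keeps "no data object here" distinguishable from "very dissimilar objects", which would otherwise both sit near 0.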

7 Methodology

8 Experiments
- Number of clusters: 60.
- Data set size: 20,000.
- t = number of iterations.
[Figure panels: t = 0, t = 20, t = 40, t = 60, t = 80, t = 200]

9 Experiments
- Number of clusters: 25.
- Average data objects per cluster: 500.
- Data set size: 12,500.
- Execution times of the GCAA: 1000.

10 Experiments
- PAM (ex. K-means)
- CLARANS: Clustering Large Applications based on RANdom Search
- CURE: Clustering Using REpresentatives

11 Conclusion
The GCA approach has shown many advantages over other widely used clustering algorithms in terms of the following:
- Faster clustering speed.
- The ability to handle and recognize shape-varying and size-varying clusters.
- Robustness to outliers.
- The ability to learn.
- Suitability for high-dimensional data sets.

12 Personal Comments
Advantage
- A novel data clustering approach.
Drawback
- …
Application
- Clustering in enterprise computing.

