Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Learning Multiple Nonredundant Clusterings
Presenter: Wei-Hao Huang
Authors: Ying Cui, Xiaoli Z. Fern, Jennifer G. Dy
TKDD, 2010
Outline
─ Motivation
─ Objectives
─ Methodology
─ Experiments
─ Conclusions
─ Comments
Motivation
─ Data often admit multiple groupings that are reasonable and interesting from different perspectives.
─ Traditional clustering is restricted to finding only a single clustering.
Objectives
─ To propose a new clustering paradigm for finding all non-redundant clustering solutions of the data.
Methodology
─ Orthogonal clustering (cluster space)
─ Clustering in orthogonal subspaces (feature space)
─ Automatically finding the number of clusters
─ Stopping criteria
Orthogonal Clustering Framework
[Figure: the iterative framework illustrated on a face dataset X]
Orthogonal Clustering (Method 1)
─ Residue space: project the data onto the space orthogonal to the current cluster centroids,
  X^(t+1) = (I − M^(t) (M^(t)T M^(t))^(−1) M^(t)T) X^(t),
  where the columns of M^(t) are the cluster means found at iteration t.
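The residue-space projection above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation; the function name `residue_projection` and the use of scikit-learn's k-means are assumptions for the demo.

```python
import numpy as np
from sklearn.cluster import KMeans

def residue_projection(X, centers):
    """Project each row of X onto the space orthogonal to the span of
    the cluster centers, i.e. apply I - M (M^T M)^+ M^T (Method 1)."""
    M = centers.T                          # d x k, centroids as columns
    P = M @ np.linalg.pinv(M.T @ M) @ M.T  # projector onto span(M)
    return X - X @ P                       # P is symmetric, so X P^T = X P

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))              # stand-in for a real dataset
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
X_res = residue_projection(X, km.cluster_centers_)
# Each residue row is orthogonal to every centroid direction.
print(np.allclose(X_res @ km.cluster_centers_.T, 0))  # → True
```

Clustering `X_res` in the next iteration then cannot rediscover the structure already captured by the current centroids.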
Clustering in Orthogonal Subspaces (Method 2)
─ Feature space: find the subspace that best captures the current clustering, e.g. by linear discriminant analysis (LDA) or singular value decomposition (SVD).
─ LDA vs. SVD: LDA seeks class-discriminative directions; SVD takes the principal directions of the centroid matrix.
─ Projection: Y = A^T X, where the columns of A span the chosen subspace.
Clustering in Orthogonal Subspaces (Method 2)
─ Residue space: X^(t+1) = (I − A^(t) A^(t)T) X^(t),
  where A^(t) = eigenvectors of M^(t) M^(t)T in the SVD variant.
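One Method 2 iteration step (the SVD variant) can be sketched as follows; the LDA variant would substitute discriminant directions for the singular vectors. The function name is hypothetical and the example data are synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans

def orthogonal_subspace_step(X, centers):
    """Method 2 step (SVD variant): remove the subspace spanned by the
    cluster centroids via X(t+1) = (I - A A^T) X(t)."""
    M = centers.T                             # d x k, centroids as columns
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    A = U[:, s > 1e-10]                       # orthonormal basis of span(M)
    return X - X @ A @ A.T                    # (I - A A^T) applied row-wise

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 6))
km = KMeans(n_clusters=3, n_init=10, random_state=1).fit(X)
X_next = orthogonal_subspace_step(X, km.cluster_centers_)
```

Because A is orthonormal, A A^T is the orthogonal projector onto the captured subspace, so `X_next` carries only the structure the current clustering missed.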
Comparing Method 1 and Method 2
─ Method 1: P1 = I − M (M^T M)^(−1) M^T (projection onto the centroids' residue space).
─ Method 2: P2 = I − A A^T, with A^(t) = eigenvectors of M′ M′^T.
─ Method 1 is a special case of Method 2: if M′ = M, then P1 = P2.
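The special-case claim can be checked numerically: for any centroid matrix M, the Method 1 projector equals the Method 2 projector built from an orthonormal basis of span(M). The matrix below is a hypothetical stand-in for real centroids.

```python
import numpy as np

# Hypothetical centroid matrix, centroids as columns (d = 6, k = 3).
rng = np.random.default_rng(2)
M = rng.normal(size=(6, 3))

# Method 1's projector onto span(M) ...
P1 = M @ np.linalg.pinv(M.T @ M) @ M.T
# ... equals Method 2's A A^T when A is an orthonormal basis of span(M).
U, s, _ = np.linalg.svd(M, full_matrices=False)
A = U[:, s > 1e-10]
P2 = A @ A.T
print(np.allclose(P1, P2))  # → True
```

Both expressions are the unique orthogonal projector onto span(M), which is exactly why Method 1 falls out of Method 2 when M′ = M.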
Experiments
─ PCA is used to reduce dimensionality.
─ Clustering:
  ─ K-means clustering: keep the solution with the smallest SSE.
  ─ Gaussian mixture model (GMM) clustering: keep the solution with the largest likelihood.
─ Datasets:
  ─ Synthetic
  ─ Real-world: Face, WebKB text, Vowel phoneme, Digit
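The PCA-then-cluster pipeline described above can be sketched with scikit-learn; the data, component count, and cluster count here are illustrative stand-ins, not the paper's settings. For GMM the analogous rule keeps the restart with the highest likelihood.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 20))          # stand-in for a real dataset

# Reduce dimensionality with PCA first, as in the experiments.
X_red = PCA(n_components=5, random_state=0).fit_transform(X)

# k-means with restarts: n_init reruns and keeps the smallest SSE.
km = KMeans(n_clusters=4, n_init=20, random_state=0).fit(X_red)
sse = km.inertia_                       # sum of squared errors of best run
```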
Experiments: Evaluation
[Table of evaluation results omitted]
Experiments: Synthetic data
[Figure omitted]
Experiments: Face dataset
[Figure omitted]
Experiments: WebKB and Vowel phoneme datasets
[Figure omitted]
Experiments: Digit dataset
[Figure omitted]
Experiments: Finding the Number of Clusters
─ K-means: gap statistic
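A generic gap statistic for k-means can be sketched as below; this is the standard Tibshirani-style form (uniform reference data over the bounding box), which may differ in detail from the variant used in the paper, and the function name is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def gap(X, k, n_ref=10, seed=0):
    """Gap statistic: mean log within-cluster dispersion of uniform
    reference data minus that of the real data, at a given k."""
    rng = np.random.default_rng(seed)
    def log_wk(data):
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(data)
        return np.log(km.inertia_)
    lo, hi = X.min(axis=0), X.max(axis=0)
    refs = [log_wk(rng.uniform(lo, hi, size=X.shape)) for _ in range(n_ref)]
    return np.mean(refs) - log_wk(X)

rng = np.random.default_rng(4)
# Two tight, well-separated blobs: a large gap at k = 2 flags the structure.
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
gaps = {k: gap(X, k) for k in (1, 2, 3)}
```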
Experiments: Finding the Number of Clusters (cont.)
─ GMM: BIC (select the model that maximizes the BIC value)
─ Stopping criteria:
  ─ The SSE reduction is less than 10% of that at the first iteration
  ─ K_opt = 1
  ─ K_opt > K_max
─ Selecting K_max: by the gap statistic (K-means) or BIC (GMM)
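BIC-based model selection for GMMs can be sketched with scikit-learn. Note a sign convention: the slide's "maximize BIC" refers to the likelihood-based form, while scikit-learn's `bic()` returns the negated form, which is minimized; the synthetic blobs here are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Two well-separated blobs; the BIC curve should favor k = 2.
X = np.vstack([rng.normal(0, 1, (60, 2)), rng.normal(8, 1, (60, 2))])

bic = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
       for k in range(1, 6)}
k_opt = min(bic, key=bic.get)   # scikit-learn's BIC is minimized
```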
Experiments: Synthetic dataset
[Figure omitted]
Experiments: Face dataset
[Figure omitted]
Experiments: WebKB dataset
[Figure omitted]
Conclusions
─ The framework discovers a variety of interesting and meaningful clustering solutions.
─ Method 2 can be combined with any clustering and dimensionality-reduction algorithm.
Comments
─ Advantages: finds multiple non-redundant clustering solutions.
─ Applications: data clustering.