Published by Hollie Shelton. Modified over 9 years ago.
1
Efficient Semi-supervised Spectral Co-clustering with Constraints
Xiaoxiao Shi, Wei Fan, Philip S. Yu
2
Motivation: co-clustering with constraints
Document-word co-clustering. How to use the constraints?
[Figure: documents Doc 1 (ICNP), Doc 2 (ICDM), Doc 3 (AAAI), Doc 4 (KDD) linked to the words Co-clustering, Network, Clustering]
3
Motivation: co-clustering with constraints
Author-conference co-clustering. How to use the constraints?
[Figure: authors John, Mary, Jack, Tom, Cathy, with collaborator links between authors, connected to ICDM 07, ICDM 08, ICDM 09, AAAI 08, AAAI 09; conference groups ICDM and AAAI]
4
Straightforward solution I: turn the constraints into edges, and solve a global graph partitioning problem
[Figure: keyword-conference co-clustering graph over the conferences ICDM, KDD, AAAI, ICNP and the keywords Co-clustering, Clustering, Network, with candidate partitions Cut I and Cut II]
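To make the link-based idea concrete, here is a minimal sketch that adds constraint edges to a bipartite graph. It is an illustration under assumptions, not the authors' code: the function names `build_link_graph` and `count_edges`, the `weight` parameter, and the toy matrix are all hypothetical.

```python
def build_link_graph(A, row_links, col_links, weight=1.0):
    """Return a symmetric adjacency matrix over all row and column nodes.

    Nodes 0..m-1 are the rows of A; nodes m..m+n-1 are the columns of A.
    """
    m, n = len(A), len(A[0])
    size = m + n
    W = [[0.0] * size for _ in range(size)]
    # Original bipartite edges: row i <-> column j with weight A[i][j].
    for i in range(m):
        for j in range(n):
            W[i][m + j] = W[m + j][i] = float(A[i][j])
    # Constraint edges between two rows (the graph is no longer bipartite).
    for a, b in row_links:
        W[a][b] = W[b][a] = weight
    # Constraint edges between two columns.
    for a, b in col_links:
        W[m + a][m + b] = W[m + b][m + a] = weight
    return W


def count_edges(W):
    """Count undirected edges with nonzero weight."""
    size = len(W)
    return sum(1 for i in range(size) for j in range(i + 1, size) if W[i][j] != 0)


# Hypothetical toy data: rows are ICDM, KDD, AAAI, ICNP;
# columns are the keywords Co-clustering, Clustering, Network.
A = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]
# One assumed constraint: ICDM and KDD should fall in the same cluster.
W = build_link_graph(A, row_links=[(0, 1)], col_links=[], weight=2.0)
print(count_edges(W))
```

Because row-row and column-column edges now exist, the result has to be partitioned as a general graph over all m + n nodes rather than as a bipartite graph.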
5
Straightforward solution II: turn the constraints into nodes, and solve a bipartite graph partitioning problem in a larger graph
[Figure: the keyword-conference graph over ICDM, KDD, AAAI, ICNP and Co-clustering, Clustering, Network, extended with pseudo nodes; candidate partitions Cut I and Cut II]
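One possible reading of the pseudo-node idea is sketched here: each constraint between two rows becomes an extra column connected to both rows, and each constraint between two columns becomes an extra row. This layout is an assumption made for illustration; `add_pseudo_nodes`, `weight`, and the toy matrix are hypothetical and not taken from the slides.

```python
def add_pseudo_nodes(A, row_links, col_links, weight=1.0):
    """Return an enlarged bipartite matrix with one pseudo node per constraint."""
    m, n = len(A), len(A[0])
    B = [[float(x) for x in row] for row in A]
    # One pseudo column per row constraint, linked to both constrained rows.
    for a, b in row_links:
        for i in range(m):
            B[i].append(weight if i in (a, b) else 0.0)
    # One pseudo row per column constraint, linked to both constrained columns.
    width = n + len(row_links)
    for a, b in col_links:
        B.append([weight if j in (a, b) else 0.0 for j in range(width)])
    return B


# Hypothetical toy data: 4 conferences (rows) by 3 keywords (columns).
A = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]
B = add_pseudo_nodes(A, row_links=[(0, 1)], col_links=[(0, 1)], weight=2.0)
print(len(B), len(B[0]))
```

The graph stays bipartite, so an ordinary co-clustering method still applies, but the matrix grows by one row or column per constraint.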
6
Problems of the two straightforward solutions
Not efficient: more edges are added and more nodes are included (10 to 80 times slower than the original co-clustering without constraints).
Not effective: the graph becomes more complicated, so its optimal partition is harder to find (in some cases, the Normalized Mutual Information drops by 30% compared with the original co-clustering without constraints).
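For readers unfamiliar with Normalized Mutual Information, a small sketch of how it can be computed from two label assignments follows. The slides do not say which normalization was used; this version divides by the geometric mean of the two entropies, which is one common choice and is an assumption here. The function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of equal length."""
    n = len(a)
    ca, cb, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (ca[x] * cb[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI normalized by sqrt(H(a) * H(b)); 0.0 if either side has one cluster."""
    ha, hb = entropy(a), entropy(b)
    if ha == 0.0 or hb == 0.0:
        return 0.0
    return mutual_information(a, b) / math.sqrt(ha * hb)


# Same grouping with swapped label names versus an unrelated grouping.
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))
```

NMI does not depend on how clusters are named, so a clustering that matches the ground truth up to a relabeling scores the maximum.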
7
Formulate the problem as an optimization problem
The solution can be obtained directly from the left and right eigenvectors of the following matrix (more details in Theorem 2 of the paper):
[Equation not preserved in this transcript. Its terms are annotated on the slide as: minimize the number of inter-group edges; maximize the number of satisfied constraints; graph Laplacian.]
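The constrained matrix from Theorem 2 is not reproduced in this transcript, so the sketch here covers only the unconstrained baseline that the talk builds on: two-way bipartite spectral co-clustering in the style of Dhillon (2001), which reads the partition from the second left and right singular vectors of the degree-normalized matrix. Several simplifications are assumptions of this sketch: power iteration stands in for a full SVD, thresholding at zero replaces the k-means step, every row and column is assumed to have at least one nonzero entry, and the toy matrix is hypothetical.

```python
import math


def spectral_cocluster(A, iters=200):
    """Split rows and columns of A into two co-clusters (unconstrained sketch)."""
    m, n = len(A), len(A[0])
    d1 = [sum(row) for row in A]                             # row degrees
    d2 = [sum(A[i][j] for i in range(m)) for j in range(n)]  # column degrees
    # Normalized matrix An = D1^{-1/2} A D2^{-1/2}.
    An = [[A[i][j] / math.sqrt(d1[i] * d2[j]) for j in range(n)] for i in range(m)]
    # The top right singular vector is known in closed form: sqrt(d2), normalized.
    total = math.sqrt(sum(d2))
    v1 = [math.sqrt(d) / total for d in d2]

    def matvec(x):   # An @ x
        return [sum(a * b for a, b in zip(row, x)) for row in An]

    def rmatvec(y):  # An.T @ y
        return [sum(An[i][j] * y[i] for i in range(m)) for j in range(n)]

    v = [float(j + 1) for j in range(n)]  # deterministic start vector
    for _ in range(iters):
        # Project out v1, then apply An.T @ An and renormalize.
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        v = rmatvec(matvec(v))
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    u = matvec(v)  # proportional to the second left singular vector
    # Rescale by D^{-1/2} and split at zero (a simple stand-in for k-means).
    row_labels = [int(u[i] / math.sqrt(d1[i]) > 0) for i in range(m)]
    col_labels = [int(v[j] / math.sqrt(d2[j]) > 0) for j in range(n)]
    return row_labels, col_labels


# Hypothetical toy matrix: two dense blocks joined by one weak entry.
A = [
    [3, 3, 0, 0],
    [3, 3, 1, 0],
    [0, 0, 3, 3],
    [0, 0, 3, 3],
]
rows, cols = spectral_cocluster(A)
print(rows, cols)
```

Which block receives label 0 or 1 depends on the sign of the computed vector, so only the grouping is meaningful. The proposed method keeps this eigenvector-based structure and, according to the slide, changes the matrix so that constraint satisfaction enters the objective.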
8
Algorithm Flow
9
Experiments: document-word co-clustering
10
Experiments: graph-pattern co-clustering
11
Results
12
Conclusions
For many co-clustering applications, some prior knowledge exists about the relationships among rows and columns.
Problem: how to use this knowledge (constraints) to find better co-clusters?
Two straightforward solutions:
  Model the constraints as linkages
  Model the constraints as additional pseudo nodes
  Problem: not efficient; not effective
Proposed method: model the problem as an optimization problem, and solve it with the selected eigenvectors
13
Related Work
Traditional co-clustering without constraints:
  Information-based co-clustering: information-theoretic co-clustering (Dhillon et al., 2003)
  Partition-based co-clustering: spectral co-clustering (Dhillon, 2001)
Previous constraint-based co-clustering models:
  Co-clustering with row constraints (Chen et al., 2008)
  Co-clustering with order-based constraints (Pensa et al., 2008); suitable for a specific type of constraint and not comparable with the proposed model
Straightforward modifications of traditional co-clustering models to use constraints:
  Link-based constraint co-clustering
  Node-based constraint co-clustering