Efficient Semi-supervised Spectral Co-clustering with Constraints

1 Efficient Semi-supervised Spectral Co-clustering with Constraints
Xiaoxiao Shi, Wei Fan, Philip S. Yu

2 Motivation: Co-clustering with constraints
Document-word co-clustering
[Figure: documents Doc 1 (ICNP), Doc 2 (ICDM), Doc 3 (AAAI), Doc 4 (KDD) and words "Co-clustering", "Network", "Clustering". Annotation on the constraints: "How to use?"]

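3 Motivation: Co-clustering with constraints
Author-conference co-clustering
[Figure: authors John, Mary, Jack, Tom, Cathy, with "Collaborators" links between authors; conference nodes ICDM 07, ICDM 08, ICDM 09, AAAI 08, AAAI 09; labels ICDM and AAAI. Annotation on the constraints: "How to use?"]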

4 Straightforward solution I: transform the constraints into edges, and solve a global graph partition problem
Keyword-conference co-clustering
[Figure: keyword nodes "Co-clustering", "Clustering", "Network" and conference nodes ICDM, KDD, AAAI, ICNP, shown with two candidate cuts, Cut I and Cut II.]
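Solution I can be illustrated with a minimal sketch, assuming the constraints are must-link pairs among rows or among columns of a data matrix `A`. The function name `build_link_graph`, the `weight` parameter, and the must-link encoding are hypothetical choices made for this example and are not taken from the slides. The sketch only builds the enlarged adjacency matrix; it shows why the graph stops being bipartite once same-side edges are added.

```python
import numpy as np

def build_link_graph(A, row_must_links=(), col_must_links=(), weight=1.0):
    """Adjacency of the bipartite graph of A, plus one edge per constraint.

    Rows of A occupy nodes 0..m-1, columns occupy nodes m..m+n-1.
    The must-link encoding and `weight` are assumptions of this sketch.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    W = np.zeros((m + n, m + n))
    # Original bipartite edges: row i <-> column j with weight A[i, j].
    W[:m, m:] = A
    W[m:, :m] = A.T
    # Constraint edges between two row nodes (same side of the graph).
    for i, j in row_must_links:
        W[i, j] += weight
        W[j, i] += weight
    # Constraint edges between two column nodes.
    for p, q in col_must_links:
        W[m + p, m + q] += weight
        W[m + q, m + p] += weight
    return W
```

Because the resulting graph has edges inside each side, it has to be partitioned as a general graph of size (m + n), rather than through the smaller m x n matrix.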

5 Straightforward solution II: transform the constraints into nodes, and solve a bipartite graph partition problem in a larger graph
[Figure: the same keyword nodes "Co-clustering", "Clustering", "Network" and conference nodes ICDM, KDD, AAAI, ICNP, with added pseudo nodes, shown with two candidate cuts, Cut I and Cut II.]
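A minimal sketch of Solution II, under the assumption that each must-link pair is represented by one pseudo node on the opposite side of the bipartite graph, connected to both constrained nodes. The function name `add_pseudo_nodes`, the `weight` parameter, and this particular wiring are hypothetical; the slides do not give the exact construction.

```python
import numpy as np

def add_pseudo_nodes(A, row_must_links=(), col_must_links=(), weight=1.0):
    """Enlarge A with one pseudo node per constraint (assumed encoding).

    A row constraint (i, j) adds a pseudo column linked to rows i and j.
    A column constraint (p, q) adds a pseudo row linked to columns p and q.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    r, c = len(row_must_links), len(col_must_links)
    B = np.zeros((m + c, n + r))
    B[:m, :n] = A  # the original data block is unchanged
    for k, (i, j) in enumerate(row_must_links):
        B[i, n + k] = weight
        B[j, n + k] = weight
    for k, (p, q) in enumerate(col_must_links):
        B[m + k, p] = weight
        B[m + k, q] = weight
    return B
```

The graph stays bipartite, but the matrix grows by one row or column per constraint, which is the source of the extra cost noted on the next slide.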

6 Problems of the two straightforward solutions
Not efficient: more edges are added and more nodes are included (10 to 80 times slower than the original co-clustering without constraints).
Not effective: the graph becomes more complicated, and its optimal partition is more difficult to find (in some cases, the Normalized Mutual Information drops 30% compared with the original co-clustering without constraints).
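For readers unfamiliar with the quality measure quoted above, here is a minimal sketch of Normalized Mutual Information between two labelings. It assumes normalization by the geometric mean of the two entropies, which is one of several common conventions; the transcript does not say which variant the authors used.

```python
import numpy as np

def normalized_mutual_information(labels_a, labels_b):
    """NMI = I(A; B) / sqrt(H(A) * H(B)), one common normalization."""
    a = np.asarray(labels_a).ravel()
    b = np.asarray(labels_b).ravel()
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)
    # Joint distribution of (cluster in A, cluster in B).
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)
    joint /= joint.sum()
    pa = joint.sum(axis=1)
    pb = joint.sum(axis=0)
    nz = joint > 0
    mi = np.sum(joint[nz] * np.log(joint[nz] / np.outer(pa, pb)[nz]))
    ha = -np.sum(pa * np.log(pa))
    hb = -np.sum(pb * np.log(pb))
    if ha == 0.0 or hb == 0.0:
        return 0.0  # a single-cluster labeling carries no information
    return float(mi / np.sqrt(ha * hb))
```

The score is invariant to renaming the clusters, so it compares a computed partition with ground-truth classes without needing to match label ids.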

7 Formulate the problem as an optimization problem
[Objective function not preserved in the transcript. Annotated terms: "Minimize the number of inter-group edges"; "Maximize the number of satisfied constraints"; "Graph Laplacian".]
The solution can be directly obtained via the left and right eigenvectors of a matrix shown on the slide (not preserved in the transcript; more details in Theorem 2 of the paper).
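Since the constrained matrix itself is missing from the transcript, the sketch below shows only the unconstrained baseline that the related-work slide cites, spectral co-clustering (Dhillon 2001), to illustrate how left and right vectors of a normalized matrix yield a co-clustering. It is not the authors' constrained method. The function name `spectral_bipartition` is hypothetical, the sketch handles two clusters only, and it thresholds by sign where the baseline would typically run k-means on the embedding.

```python
import numpy as np

def spectral_bipartition(A):
    """Two-way co-clustering of rows and columns of a nonnegative matrix A.

    Unconstrained baseline only. Assumes every row and column sum is > 0.
    """
    A = np.asarray(A, dtype=float)
    d1_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))  # row degrees
    d2_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=0))  # column degrees
    # Normalized matrix D1^{-1/2} A D2^{-1/2}.
    An = d1_inv_sqrt[:, None] * A * d2_inv_sqrt[None, :]
    U, s, Vt = np.linalg.svd(An, full_matrices=False)
    # Second left/right singular vectors, rescaled by the degrees.
    row_embed = d1_inv_sqrt * U[:, 1]
    col_embed = d2_inv_sqrt * Vt[1, :]
    # Simplification: split by sign instead of running k-means.
    row_labels = (row_embed >= 0).astype(int)
    col_labels = (col_embed >= 0).astype(int)
    return row_labels, col_labels
```

On a matrix with two dense blocks and weak background noise, rows and columns of the same block receive the same label; which block is called 0 or 1 depends on the sign returned by the SVD routine.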

8 Algorithm Flow

9 Experiments Document-word co-clustering

10 Experiments Graph-pattern co-clustering

11 Results

12 Conclusions
For many co-clustering applications, some prior knowledge exists about the relationships among rows and columns.
Problem: how to use this knowledge (constraints) to find better co-clusters?
Two straightforward solutions:
Model the constraints as linkages
Model the constraints as additional pseudo nodes
Problem: not efficient; not effective
Proposed method: model the problem as an optimization problem, and solve it with the selected eigenvectors

13 Related Work
Traditional co-clustering without constraints:
Information-based co-clustering: information-theoretic co-clustering (Dhillon et al. 2003)
Partition-based co-clustering: spectral co-clustering (Dhillon 2001)
Previous constraint-based co-clustering models:
Co-clustering with row constraints (Chen et al. 2008)
Co-clustering with order-based constraints (Pensa et al. 2008; suitable for a specific type of constraint, not comparable with the proposed model)
Straightforward modifications of traditional co-clustering models to use constraints:
Link-based constraint co-clustering
Node-based constraint co-clustering

