A General Model for Relational Clustering
Bo Long and Zhongfei (Mark) Zhang, Computer Science Dept./Watson School, SUNY Binghamton
Xiaoyun Wu, Yahoo! Inc.
Philip S. Yu, IBM Watson Research Center

Multi-type Relational Data (MTRD) is Everywhere!
- Bibliometrics: papers, authors, journals
- Social networks: people, institutions, friendship links
- Biological data: genes, proteins, conditions
- Corporate databases: customers, products, suppliers, shareholders
[Figure: papers, authors, key words]

Challenges for Clustering!
- Data objects are not identically distributed: heterogeneous data objects (papers, authors).
- Data objects are not independent: heterogeneous data objects are related to each other.
- No IID assumption.

Relational Data → Flat Data?
Flattening each object type into its own table, e.g.
- a paper table with columns (word1, word2, ……, author1, author2, ……),
- an author table with columns (Paper 1, Paper 2, ……),
- a word table with columns (Paper 1, Paper 2, ……),
leads to:
- High-dimensional and sparse data
- Data redundancy
[Figure: papers, authors, key words]

Relational Data → Flat Data?
- No interactions of the hidden structures of different types of data objects.
- Difficult to discover the global community structure.
[Figure: users, web pages, queries]

A General Model: Collective Factorization on Related Matrices
- Formulate multi-type relational data as a set of related matrices.
- Cluster different types of objects simultaneously by factorizing the related matrices simultaneously.
- Make use of the interaction of hidden structures of different types of objects.

Data Representation
Represent an MTRD set as a set of related matrices:
- Relation matrix R^(ij) denotes the relations between the ith type of objects and the jth type of objects.
- Feature matrix F^(i) denotes the feature values for the ith type of objects.
[Figure: example diagrams over users, movies, words, authors, and papers, with relation matrices R^(12), R^(23) and feature matrix F^(1)]
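As a concrete illustration of this representation (toy random data and dictionary layout invented here, not from the paper), a tri-type author–paper–word set can be stored as two relation matrices and one feature matrix:

```python
import numpy as np

# A toy tri-type MTRD set: authors (type 1), papers (type 2), words (type 3).
# R12 relates authors to papers, R23 relates papers to words, and F1 holds
# feature values for the authors. All numbers are random placeholders.
rng = np.random.default_rng(0)

n_authors, n_papers, n_words = 4, 6, 10

# Relation matrix R^(12): R12[i, j] = 1 if author i wrote paper j.
R12 = rng.integers(0, 2, size=(n_authors, n_papers)).astype(float)

# Relation matrix R^(23): R23[j, k] = count of word k in paper j.
R23 = rng.poisson(1.0, size=(n_papers, n_words)).astype(float)

# Feature matrix F^(1): one feature vector per author.
F1 = rng.normal(size=(n_authors, 3))

# The whole MTRD set is just the collection of related matrices.
mtrd = {"R": {(1, 2): R12, (2, 3): R23}, "F": {1: F1}}
```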

Matrix Factorization
Explore the hidden structure of a data matrix by factorizing it, F ≈ CB, where C is the cluster association matrix and B is the feature basis matrix.

Model: Collective Factorization on Related Matrices (CFRM)
Cluster all types of objects simultaneously by minimizing

  Σ_{i,j} w_a^(ij) || R^(ij) − C^(i) A^(ij) (C^(j))^T ||² + Σ_i w_b^(i) || F^(i) − C^(i) B^(i) ||²

over the cluster indicator matrices C^(p), the cluster association matrices A^(ij), and the feature basis matrices B^(i), where the weights w_a^(ij) and w_b^(i) balance the related matrices.

CFRM Model: Example
[Figure: factorization of the related matrices for a tri-type example]

Spectral Clustering
- Algorithms that cluster points using eigenvectors of matrices derived from the data.
- Obtain a data representation in a low-dimensional space that can be easily clustered.
- Traditional spectral clustering focuses on homogeneous data.
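A minimal numpy sketch of this idea on homogeneous data (a toy two-group example; the Laplacian-plus-sign-split recipe below is one standard variant of spectral clustering, not the paper's algorithm):

```python
import numpy as np

# Spectral bi-partitioning: embed the points with an eigenvector of the
# graph Laplacian, then split on its sign.
def spectral_bipartition(W):
    d = W.sum(axis=1)
    L = np.diag(d) - W                    # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    fiedler = vecs[:, 1]                  # 2nd smallest eigenvector
    return (fiedler > 0).astype(int)      # cluster by sign

# Two well-separated groups of points on a line.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
W = np.exp(-np.subtract.outer(x, x) ** 2)  # Gaussian similarity graph
labels = spectral_bipartition(W)
```

Because the cross-group similarities are nearly zero, the sign of the second eigenvector recovers the two groups exactly.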

Main Theorem:
Minimizing the CFRM objective is equivalent to the trace maximization

  max Σ_p tr( (C^(p))^T M^(p) C^(p) )

over the cluster indicator matrices C^(p), where each M^(p) is assembled from the relation and feature matrices together with the cluster indicators of the other object types.

Algorithm Derivation: Iterative Updating
Fixing C^(j) for all j ≠ p, the objective for the pth type reduces to

  max tr( (C^(p))^T M^(p) C^(p) ),

where

  M^(p) = Σ_{j≠p} w_a^(pj) R^(pj) C^(j) (C^(j))^T (R^(pj))^T + w_b^(p) F^(p) (F^(p))^T,

with the convention R^(pj) = (R^(jp))^T.

Spectral Relaxation
- Apply real relaxation to C^(p), allowing it to be an arbitrary orthonormal matrix.
- By the Ky Fan theorem, the optimal solution is given by the leading k_p eigenvectors of M^(p).
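The Ky Fan theorem can be checked numerically on a random symmetric matrix (an illustrative sketch; M below is arbitrary, not a CFRM-derived M^(p)): over orthonormal C, tr(C^T M C) is maximized by the k leading eigenvectors, and the maximum equals the sum of the k largest eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 3
A = rng.normal(size=(n, n))
M = A @ A.T                               # symmetric test matrix

vals, vecs = np.linalg.eigh(M)            # ascending eigenvalues
C_star = vecs[:, -k:]                     # k leading eigenvectors
best = np.trace(C_star.T @ M @ C_star)    # attained maximum

# Any other matrix with orthonormal columns cannot do better.
Q, _ = np.linalg.qr(rng.normal(size=(n, k)))
other = np.trace(Q.T @ M @ Q)
```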

Spectral Relational Clustering (SRC)
1. Initialize each C^(p) with an arbitrary orthonormal matrix.
2. Iterate until convergence: for each object type p, update C^(p) as the k_p leading eigenvectors of M^(p).
3. Recover the cluster assignments from each relaxed C^(p) (e.g., by running k-means on its rows).

Spectral Relational Clustering: Example (a tri-type chain of object types 1–2–3)
- Update C^(1) as the k_1 leading eigenvectors of R^(12) C^(2) (C^(2))^T (R^(12))^T.
- Update C^(2) as the k_2 leading eigenvectors of (R^(12))^T C^(1) (C^(1))^T R^(12) + R^(23) C^(3) (C^(3))^T (R^(23))^T.
- Update C^(3) as the k_3 leading eigenvectors of (R^(23))^T C^(2) (C^(2))^T R^(23).
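These updates can be sketched in a few lines of numpy (random toy data, unit weights w_a, a fixed iteration budget, and no final k-means step; all names are illustrative, not the paper's implementation):

```python
import numpy as np

def leading_eigvecs(M, k):
    # k leading eigenvectors of a symmetric matrix M.
    _, vecs = np.linalg.eigh(M)           # ascending eigenvalues
    return vecs[:, -k:]

rng = np.random.default_rng(2)
n1, n2, n3 = 5, 6, 7                      # objects per type
k1, k2, k3 = 2, 2, 2                      # clusters per type
R12 = rng.random((n1, n2))                # relations: type 1 - type 2
R23 = rng.random((n2, n3))                # relations: type 2 - type 3

# Initialize with random orthonormal matrices.
C1, _ = np.linalg.qr(rng.normal(size=(n1, k1)))
C2, _ = np.linalg.qr(rng.normal(size=(n2, k2)))
C3, _ = np.linalg.qr(rng.normal(size=(n3, k3)))

for _ in range(20):                       # fixed iteration budget
    C1 = leading_eigvecs(R12 @ C2 @ C2.T @ R12.T, k1)
    C2 = leading_eigvecs(R12.T @ C1 @ C1.T @ R12
                         + R23 @ C3 @ C3.T @ R23.T, k2)
    C3 = leading_eigvecs(R23.T @ C2 @ C2.T @ R23, k3)

# Each row of C^(p) is now a low-dimensional embedding of one object of
# type p; a final k-means on the rows would yield the cluster assignments.
```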

Advantages of Spectral Relational Clustering (SRC)
- As simple as traditional spectral approaches.
- Applicable to relational data with various structures.
- Adaptive low-dimensional embedding.
- Efficient: O(tmn²k); for sparse data this reduces to O(tmzk), where z denotes the number of non-zero elements.

Special case 1: k-means and spectral clustering
- Flat data is a special MTRD with only one feature matrix F, so the CFRM objective reduces to min ||F − CB||², i.e., the k-means objective.
- By the main theorem, k-means is therefore equivalent to the trace maximization max tr( C^T F F^T C ) over the normalized cluster indicator C.
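The identity behind this special case can be verified numerically (a toy sketch with random data): for any fixed assignment, the k-means objective ||F − CB||² with B the cluster means equals ||F||² − tr(Ct^T F F^T Ct), where Ct is the size-normalized cluster indicator.

```python
import numpy as np

rng = np.random.default_rng(3)
n, f, k = 9, 4, 3
F = rng.normal(size=(n, f))               # one feature row per object
labels = rng.integers(0, k, size=n)
labels[:k] = np.arange(k)                 # ensure no cluster is empty

C = np.zeros((n, k))
C[np.arange(n), labels] = 1.0             # hard cluster indicator matrix
B = np.linalg.pinv(C) @ F                 # rows of B are cluster means
kmeans_obj = np.linalg.norm(F - C @ B) ** 2

Ct = C / np.sqrt(C.sum(axis=0))           # indicator with orthonormal columns
trace_term = np.trace(Ct.T @ F @ F.T @ Ct)
```

So minimizing the k-means objective over assignments is the same as maximizing the trace term, which is what the spectral relaxation exploits.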

Special case 2: Bipartite Spectral Graph Partitioning (BSGP)
- A bipartite graph is a special case of MTRD with one relation matrix R.
- BSGP restricts the clusters of different types of objects to have one-to-one associations, i.e., it imposes diagonal constraints on A.

Experiments
- Bi-type relational data: document-word data.
- Tri-type relational data: category-document-word data.
- Comparison algorithms: Normalized Cut (NC), Bipartite Spectral Graph Partitioning (BSGP), Mutual Reinforcement K-means (MRK), Consistent Bipartite Graph Co-partitioning (CBGC).

Experimental Results on Bi-type Relational Data

Eigenvectors of a multi2 data set

Experimental Results on Tri-type Relational Data

Summary
- Collective Factorization on Related Matrices (CFRM): a general model for MTRD clustering.
- Spectral Relational Clustering (SRC): a novel spectral approach that is
  - simple and applicable to relational data with various structures,
  - an adaptive low-dimensional embedding,
  - efficient.
- Theoretical analysis and experiments demonstrate the effectiveness and promise of both the model and the algorithm.