Shan Lu, Jieqi Kang, Weibo Gong, Don Towsley UMASS Amherst


Complex Graph Similarity Testing and Multivariate Distribution Comparison Using Random Walks. Shan Lu, Jieqi Kang, Weibo Gong, Don Towsley, UMASS Amherst

Outline of the approach: small-time asymptotic results for diffusion on manifolds; analogues on graphs; analyzing graphs with 1d distributions; analyzing graphs with 2d distributions; future work and collaborations.

Consider diffusion on Riemannian manifold : with initial condition and boundary conditions Using the spectral resolution , The Dirichlet heat kernel for is and the heat content Note is the Fourier coefficient of the initial condition in the eigen space spanned by !
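For reference, the objects named on this slide have a standard textbook form, sketched here. The symbols $M$, $u$, $f$, $\lambda_k$, $\phi_k$, $K$ and $Q$ are assumed names, and the unnormalized heat equation with Dirichlet (absorbing) boundary conditions and orthonormal eigenfunctions is an assumption, so the slide's own normalization may differ.

```latex
\frac{\partial u}{\partial t}(x,t) = \Delta u(x,t), \quad x \in M, \qquad
u(x,0) = f(x), \qquad u(x,t) = 0 \ \text{for } x \in \partial M

-\Delta \phi_k = \lambda_k \phi_k, \qquad
K(t,x,y) = \sum_{k} e^{-\lambda_k t}\, \phi_k(x)\, \phi_k(y)

Q(t) = \int_M u(x,t)\,dx
     = \int_M \int_M K(t,x,y)\, f(y)\, dy\, dx
     = \sum_{k} e^{-\lambda_k t}
       \left( \int_M f\, \phi_k \right) \left( \int_M \phi_k \right)
```

In this form, $\int_M f\,\phi_k$ is the Fourier coefficient of the initial condition with respect to the eigenfunction $\phi_k$.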

random walk on graph

Notations Laplacian matrix: Normalized Laplacian: Random walk Laplacian: Let be the eigenvalues of and the corresponding eigenvectors. With and , where .
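The three Laplacians named on this slide can be sketched numerically. The definitions used below are the standard ones (L = D - A, D^{-1/2} L D^{-1/2} and D^{-1} L) and are assumed to match the slide. The function name `laplacians` and the 3-node path graph are illustrative only.

```python
import numpy as np

def laplacians(A):
    """Return (L, L_norm, L_rw) for a symmetric adjacency matrix A.

    Assumes an undirected graph with no isolated nodes, so every degree is > 0.
    """
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)                      # node degrees
    L = np.diag(d) - A                     # combinatorial Laplacian D - A
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Normalized Laplacian D^{-1/2} L D^{-1/2}
    L_norm = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    # Random walk Laplacian D^{-1} L = I - P, with P = D^{-1} A
    L_rw = L / d[:, None]
    return L, L_norm, L_rw

# Path graph on 3 nodes: 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
L, L_norm, L_rw = laplacians(A)
eigenvalues = np.linalg.eigvalsh(L_norm)   # eigenvalues of the normalized Laplacian
```

The normalized and random walk Laplacians are similar matrices, so they share the same eigenvalues; only their eigenvectors differ.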

Notations. Partition the vertex set into two subsets, the set of all interior nodes and the set of all boundary nodes . Label the interior vertices and the boundary vertices . The normalized can be partitioned into: Let the boundary vertices be absorbing in the heat equation on the next slide. Use for for convenience henceforth.

Heat Equation and Heat Content Heat Equation is associated with the normalized graph Laplacian with a given initial condition. Heat Content:
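A minimal numerical sketch of a graph heat content with absorbing boundary nodes follows. It assumes, since the formula is not shown in the transcript, that the heat kernel on the interior nodes is exp(-t S), where S is the interior block of the normalized Laplacian, and that the heat content is the sum of all entries of that kernel. The names `heat_content`, `S`, `alpha` and the 6-node cycle are illustrative.

```python
import numpy as np

def heat_content(A, boundary, times):
    """Heat content Q(t) = 1^T exp(-t S) 1, with S the interior block of the
    normalized Laplacian (assumed definition; boundary nodes are absorbing)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    d = A.sum(axis=1)                                  # degrees, assumed > 0
    L_norm = np.eye(n) - A / np.sqrt(np.outer(d, d))   # I - D^{-1/2} A D^{-1/2}
    interior = np.setdiff1d(np.arange(n), boundary)
    S = L_norm[np.ix_(interior, interior)]             # drop boundary rows/columns
    lam, phi = np.linalg.eigh(S)                       # S is symmetric
    alpha = phi.sum(axis=0) ** 2                       # alpha_i = (sum_u phi_i(u))^2
    times = np.asarray(times, dtype=float)
    # Q(t) = sum_i alpha_i * exp(-lambda_i * t)
    return (alpha[None, :] * np.exp(-np.outer(times, lam))).sum(axis=1)

# Cycle on 6 nodes with node 0 chosen as the single boundary node
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
Q = heat_content(A, boundary=[0], times=[0.0, 1.0, 2.0])
```

Under these assumptions Q(0) equals the number of interior nodes, and Q(t) decreases as heat is absorbed at the boundary.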

Heat Equation and Heat Content Time derivatives of the heat content: When ,
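The time derivatives can be written out under one assumed definition, since the slide's formulas are not shown in the transcript. Here $S$ denotes the interior block of the normalized Laplacian, with eigenpairs $(\lambda_i, \phi_i)$, and the heat content is taken to be the sum of the entries of $e^{-tS}$. All symbol names are assumptions.

```latex
Q(t) = \mathbf{1}^{\top} e^{-tS}\, \mathbf{1}
     = \sum_{i} \alpha_i\, e^{-\lambda_i t},
\qquad
\alpha_i = \Big( \sum_{u} \phi_i(u) \Big)^{2}

\frac{d^{m} Q}{dt^{m}}(t) = (-1)^{m} \sum_{i} \alpha_i\, \lambda_i^{m}\, e^{-\lambda_i t}

\text{As } t \to 0:\qquad
Q(0) = \sum_{i} \alpha_i, \qquad
Q'(0) = -\sum_{i} \alpha_i \lambda_i = -\,\mathbf{1}^{\top} S\, \mathbf{1}
```

In this form the initial derivative depends only on the entries of $S$, so it can be read off from the graph without computing any eigenvalues.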

Lazy Random Walk Approximation 1. Transition matrix . 2. Lazy random walk . 3. For any given time , take the limit with ( ),
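The three steps above can be sketched as follows. The formulas are assumptions, because they are not shown in the transcript: P = D^{-1} A for the transition matrix, P_lazy = (1 - delta) I + delta P for the lazy walk, and delta = t / k with k growing, so that P_lazy^k approaches exp(-t (I - P)). The function names are illustrative.

```python
import numpy as np

def lazy_walk_power(A, t, k):
    """k steps of a lazy random walk with step probability delta = t / k."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)          # transition matrix D^{-1} A
    delta = t / k
    P_lazy = (1.0 - delta) * np.eye(n) + delta * P
    return np.linalg.matrix_power(P_lazy, k)

def heat_kernel_rw(A, t):
    """exp(-t (I - P)), computed through the symmetric matrix D^{-1/2} A D^{-1/2}."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    S = A / np.sqrt(np.outer(d, d))               # symmetric, same spectrum as P
    mu, V = np.linalg.eigh(S)
    core = (V * np.exp(-t * (1.0 - mu))) @ V.T    # exp(-t (I - S))
    # P = D^{-1/2} S D^{1/2}, so exp(-t(I - P)) = D^{-1/2} core D^{1/2}
    return core * np.sqrt(d)[None, :] / np.sqrt(d)[:, None]

# Path graph on 4 nodes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = heat_kernel_rw(A, t=1.0)
gap_coarse = np.abs(lazy_walk_power(A, 1.0, 10) - H).max()
gap_fine = np.abs(lazy_walk_power(A, 1.0, 10000) - H).max()
```

Comparing `gap_coarse` with `gap_fine` shows the approximation tightening as the number of lazy steps grows for a fixed time t.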

Graph Similarity Testing. Undirected graphs: Barabási–Albert model vs. Erdős–Rényi model. Directed graphs: Krapivsky’s model (2001), multivariate power law degree distribution; Krapivsky’s model vs. Erdős–Rényi model.

Barabási–Albert vs. Erdős–Rényi Model. Barabási–Albert model: starts with m_0 nodes; each new node is connected to m existing nodes, each chosen with probability proportional to its degree. The degree distribution follows P(k) ~ k^{-3}. Erdős–Rényi model G(n, p): each edge is included in the graph with probability p, independently of the other edges.
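Both generative models can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not the generator used for the slides: the seed graph is taken to be m isolated nodes to which the first new node links, and the names `erdos_renyi`, `barabasi_albert` and `endpoints` are hypothetical.

```python
import random

def erdos_renyi(n, p, seed=None):
    """G(n, p): include each of the n(n-1)/2 possible edges independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]

def barabasi_albert(n, m, seed=None):
    """Preferential attachment: each new node links to m distinct existing nodes,
    chosen with probability proportional to their current degree (requires n > m >= 1)."""
    rng = random.Random(seed)
    edges = []
    targets = list(range(m))   # assumption: the first new node links to all m seed nodes
    endpoints = []             # every node appears here once per incident edge
    for new in range(m, n):
        edges.extend((new, t) for t in targets)
        endpoints.extend(targets)
        endpoints.extend([new] * m)
        # A uniform draw from `endpoints` is a degree-proportional draw over nodes
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return edges

ba_edges = barabasi_albert(300, 5, seed=1)
er_edges = erdos_renyi(200, 0.1, seed=1)
```

Keeping a list of edge endpoints makes degree-proportional sampling a single uniform draw, at the cost of memory linear in the number of edges.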

Barabási–Albert vs. Erdős–Rényi Model. Two groups of graphs: power law graphs generated by the B-A model and random graphs generated by the E-R model. The average degree varies from 20 to 50. Heat contents in each group are plotted in the same color. Boundary selection: nodes with the smallest degrees. Figures: the time derivatives of the heat contents for the two groups of graphs; the initial time derivative of the heat contents for power law graphs with different mean degrees. Legend: power law graphs generated by the B-A model; random graphs generated by the E-R model.

Barabási–Albert vs. Erdős–Rényi Model. The eigenvalues of the normalized Laplacian satisfy the semicircle law under the condition that the minimum expected degree is relatively large (Chung et al., 2003). Figure: Laplacian spectrum of two graphs with mean degree 20 (axis: Laplacian eigenvalues).

Barabási–Albert vs. Erdős–Rényi Model. Figure panels: power law graph generated by the B-A model; random graph generated by the E-R model.

Krapivsky vs. Erdős–Rényi Model Krapivsky’s Model (‘Degree distributions of growing networks’ , 2001) With probability , a new node is introduced and attached to a target node with probability proportional to . With probability , a new link from node to node is created with probability proportional to . Degree distribution Average in-degree and out-degree:
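A rough sketch of a growing directed network of this kind is given below. The attachment weights are assumptions, because the transcript does not show them: targets are chosen with weight (in-degree + lam), and sources of new internal links with weight (out-degree + mu). The offsets `lam` and `mu`, the single-node start and the function name are hypothetical. Self-loops and repeated links are not excluded in this sketch.

```python
import random

def krapivsky_graph(n_nodes, p, lam=1.0, mu=1.0, seed=None):
    """Grow a directed graph until it has n_nodes nodes.

    With probability p: add a node that links to an existing target.
    With probability 1 - p: add a link between two existing nodes.
    Requires lam > 0 and mu > 0 so that every weight is positive.
    """
    rng = random.Random(seed)
    indeg, outdeg, edges = [0], [0], []        # assumption: start from one node
    while len(indeg) < n_nodes:
        nodes = range(len(indeg))
        in_w = [k + lam for k in indeg]        # assumed target weights
        if rng.random() < p:
            target = rng.choices(nodes, weights=in_w)[0]
            new = len(indeg)
            indeg.append(0)
            outdeg.append(1)
            indeg[target] += 1
            edges.append((new, target))
        else:
            out_w = [k + mu for k in outdeg]   # assumed source weights
            src = rng.choices(nodes, weights=out_w)[0]
            dst = rng.choices(nodes, weights=in_w)[0]
            outdeg[src] += 1
            indeg[dst] += 1
            edges.append((src, dst))
    return edges, indeg, outdeg

edges, indeg, outdeg = krapivsky_graph(500, p=0.2, seed=3)
avg_degree = len(edges) / len(indeg)
```

Each step adds exactly one link and a node arrives only with probability p, so the average in-degree and out-degree both come out near 1/p in this sketch.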

Krapivsky vs. Erdős–Rényi Model. Generated a 2000-node directed graph using Krapivsky’s model with and . CCDF of the in-degrees and out-degrees of the generated graph:

Krapivsky vs. Erdős–Rényi Model. Generated four 2000-node directed graphs using Krapivsky’s model with p = 0.1, 0.15, 0.2 and 0.25, respectively. Also generated four directed graphs using the E-R model with the same average degrees. Heat contents in each group are plotted in the same color. Legend: directed graphs with bivariate power law degree distributions; directed E-R random graphs.

Krapivsky vs. Erdős–Rényi Model. Boundary selection: nodes with smallest values of . Vertices with zero in-degree are not selected as boundary vertices. This result reflects the different structures of the two types of graphs. For power law graphs, a large proportion of vertices have small in-degrees and out-degrees. Most of the selected boundary vertices have very small degrees and contribute little to the main graph structure even as their number increases. For the E-R random graphs, by contrast, the chance of a random walker reaching the boundary increases with more boundary vertices. The number of boundary vertices therefore affects the heat contents of directed E-R random graphs more.

Multivariate Distribution Comparison. Multivariate normal distributions: compare bivariate normal distributions with different covariance matrices. Multivariate power law distributions: Krapivsky’s model (2002); correlated bivariate power law degree distributions; correlated distributions vs. independent distributions.

Multivariate Normal Distribution. Probability density function of the bivariate normal distribution
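The density on this slide is not shown in the transcript. The standard form of the bivariate normal density is given here for reference, with assumed symbols $\mu$ for the mean vector, $\Sigma$ for the covariance matrix, and $\sigma_1$, $\sigma_2$, $\rho$ for the standard deviations and correlation.

```latex
f(\mathbf{x}) = \frac{1}{2\pi\, |\Sigma|^{1/2}}
  \exp\!\Big( -\tfrac{1}{2}\, (\mathbf{x} - \mu)^{\top} \Sigma^{-1} (\mathbf{x} - \mu) \Big),
\qquad \mathbf{x} \in \mathbb{R}^{2}

\Sigma = \begin{pmatrix}
  \sigma_1^{2} & \rho\, \sigma_1 \sigma_2 \\
  \rho\, \sigma_1 \sigma_2 & \sigma_2^{2}
\end{pmatrix},
\qquad
|\Sigma| = \sigma_1^{2} \sigma_2^{2}\, (1 - \rho^{2})
```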

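Multivariate Normal Distribution. Divergences between pairs of distributions. Each row lists the divergence in one direction, the divergence in the other direction, and the symmetrised divergence, which is their sum:
0.1438   0.1895   0.3333
0.4199   1.2082   1.6281
0.8304   3.4328   4.2632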
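A symmetrised divergence between two normal distributions can be computed in closed form. The sketch below assumes the divergence is the Kullback-Leibler divergence, which the transcript does not state. The function names and the example covariances are illustrative and are not the distributions compared on the slide.

```python
import numpy as np

def kl_gaussian(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) for multivariate normal distributions."""
    mu0, mu1 = np.asarray(mu0, dtype=float), np.asarray(mu1, dtype=float)
    cov0, cov1 = np.asarray(cov0, dtype=float), np.asarray(cov1, dtype=float)
    k = mu0.size
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - k
                  + logdet1 - logdet0)

def symmetrised_kl(mu0, cov0, mu1, cov1):
    """Sum of the two directed KL divergences."""
    return kl_gaussian(mu0, cov0, mu1, cov1) + kl_gaussian(mu1, cov1, mu0, cov0)

# Two zero-mean bivariate normals with hypothetical covariances I and 2I
zero = [0.0, 0.0]
forward = kl_gaussian(zero, np.eye(2), zero, 2.0 * np.eye(2))
backward = kl_gaussian(zero, 2.0 * np.eye(2), zero, np.eye(2))
total = symmetrised_kl(zero, np.eye(2), zero, 2.0 * np.eye(2))
```

The two directed values generally differ, which is why a symmetrised sum is a natural single number for comparing two distributions.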

Multivariate Normal Distribution. 20 samples for each distribution, with 5000 random numbers in each sample. A histogram is used for density estimation. Heat contents for samples of the same distribution are plotted in the same color.
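The sampling and histogram step can be sketched with NumPy. The sample size of 5000 follows the slide; the bin count, the plotting range, the covariance matrix and the function name are assumptions chosen only for illustration.

```python
import numpy as np

def histogram_density(samples, bins=20, extent=4.0):
    """2D histogram density estimate on the square [-extent, extent]^2."""
    H, xedges, yedges = np.histogram2d(
        samples[:, 0], samples[:, 1],
        bins=bins,
        range=[[-extent, extent], [-extent, extent]],
        density=True,            # normalize so the estimate integrates to 1
    )
    return H, xedges, yedges

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.5],
                [0.5, 1.0]])     # hypothetical covariance matrix
samples = rng.multivariate_normal([0.0, 0.0], cov, size=5000)
H, xedges, yedges = histogram_density(samples)
cell_area = (xedges[1] - xedges[0]) * (yedges[1] - yedges[0])
total_mass = (H * cell_area).sum()
```

Points that fall outside the chosen range are dropped before normalization, so the range should be wide enough to cover nearly all of the sample.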

Multivariate distribution with power law marginal. Krapivsky’s model (‘A statistical physics perspective on web growth’, 2002): a new node is introduced as an isolated node, to better reflect the real web. Degree distribution Marginal distribution Joint Distribution (when ) In-degree and out-degree of a node are correlated.

Multivariate distribution with power law marginal Independent bivariate power law distribution with the same marginal distributions as in Krapivsky’s model ( in the figures below) level curves
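An independent distribution with the same marginals as a given joint distribution is the outer product of those marginals. The sketch below shows this for a discrete joint table. The 2x2 table and the function name are hypothetical and are not data from the slides.

```python
import numpy as np

def independent_counterpart(joint):
    """Return the product of the marginals of a joint probability table."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()      # normalize in case counts were passed
    p_rows = joint.sum(axis=1)       # marginal over the first variable (e.g. in-degree)
    p_cols = joint.sum(axis=0)       # marginal over the second variable (e.g. out-degree)
    return np.outer(p_rows, p_cols)

# Hypothetical correlated joint table
joint = np.array([[0.30, 0.10],
                  [0.10, 0.50]])
indep = independent_counterpart(joint)
```

The result keeps both marginals unchanged while removing the dependence, which gives a baseline against which a correlated joint distribution can be compared.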

Multivariate distribution with power law marginal The joint degree distribution generated by Krapivsky’s model ( in the figure on right) The level curves for Krapivsky’s distribution with different values

Multivariate distribution with power law marginal. Heat contents and time derivatives of the two groups of distributions: distributions generated by Krapivsky’s model with different values and the corresponding independent ones. Legend: distributions generated by Krapivsky’s model; independent bivariate power law distributions.

Multivariate distribution with power law marginal Heat content derivatives of the distributions generated by Krapivsky’s model with different values.

Ongoing Work. Graph similarity testing: consider other graph generative models; consider real world network datasets. Multivariate distribution comparison: use kernel density estimation instead of a simple histogram for multi-dimensional data matching problems; compare the degree distributions of generated graphs. Collaborations: UMASS-Cornell, UMASS-UMN.

Thanks!