Modeling networks using Kronecker multiplication

Modeling networks using Kronecker multiplication Jure Leskovec Machine Learning Department Carnegie Mellon University jure@cs.cmu.edu http://www.cs.cmu.edu/~jure/

Introduction: Graphs are everywhere. What can we do with graphs? What patterns or “laws” hold for most real-world graphs? Can we build models of graph generation and evolution? Can we fit these models to real networks? [Figure panels: Web & citations; Needle exchange; Yeast protein interactions; Internet]

Traditional approach: Sociologists were the first to study networks. They studied patterns of connections between people to understand the functioning of society. People are nodes, interactions are edges. Questionnaires are used to collect link data (hard to obtain, inaccurate, subjective). Typical questions: centrality and connectivity. Limited to small graphs (~10 nodes) and to properties of individual nodes and edges.

New approach (1): Large networks (e.g., web, internet, on-line social networks) with millions of nodes. Many traditional questions are no longer useful. Traditional: What happens if a node u is removed? Now: What percentage of nodes needs to be removed to affect network connectivity? The focus moves from a single node to the statistical properties of the network as a whole. We cannot draw (plot) the network and examine it.

New approach (2): What does the network “look like” even if I can’t look at it? We need statistical methods and tools to quantify large networks. 3 parts/goals: (1) statistical properties of large networks; (2) models that help understand these properties; (3) predicting the behavior of networked systems based on measured structural properties and local rules governing individual nodes.

Outline: Introduction; Properties of real-world networks; Properties of static networks; Properties of dynamic (evolving) networks; Proposed graph generation model; Kronecker Graphs; Properties of Kronecker Graphs; Fitting Kronecker Graphs; Experiments; Observations and Conclusion

Statistical properties of networks: Features that are common to networks of different types. Properties of static networks: small-world effect; transitivity or clustering; degree distributions (scale-free networks); network resilience; community structure; subgraphs or motifs. Temporal properties: densification; shrinking diameter.

Small-world effect (1): Six degrees of separation (Milgram, 1960s). Random people in Nebraska were asked to send letters to stockbrokers in Boston. Letters could only be passed to first-name acquaintances. Only 25% of the letters reached the goal, but they reached it in about 6 steps. Measuring path lengths: Diameter (longest shortest path): $\max_{i,j} d_{ij}$. Effective diameter: the distance at which 90% of all connected pairs of nodes can be reached. Mean geodesic (shortest) distance $\ell$.
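
The three path-length measures named on this slide can be computed directly on a small graph with breadth-first search. The following is a minimal sketch, not taken from the talk: the function names and the adjacency-dictionary input are assumptions made for illustration, and the all-pairs BFS is only practical for small graphs.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_length_stats(adj, quantile=0.9):
    """Return (diameter, effective diameter, mean geodesic distance).

    All three are taken over ordered pairs of distinct, connected nodes.
    `adj` maps each node to an iterable of its neighbours (hypothetical format).
    """
    lengths = sorted(
        d
        for s in adj
        for t, d in bfs_distances(adj, s).items()
        if t != s
    )
    diameter = lengths[-1]
    # Smallest distance within which at least `quantile` of connected pairs lie.
    effective = lengths[math.ceil(quantile * len(lengths)) - 1]
    mean = sum(lengths) / len(lengths)
    return diameter, effective, mean


# Hypothetical example: an undirected path graph 0-1-2-3-4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(path_length_stats(path))
```

For graphs with millions of nodes one would sample source nodes instead of running BFS from every node; that variation is not shown here.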

Small-world effect (2): Distribution of shortest path lengths in the Microsoft Messenger network: 180 million people, 1.3 billion edges. There is an edge if two people exchanged at least one message in a one-month period. Pick a random node and count how many nodes are at distance 1, 2, 3, ... hops. [Plot: number of nodes vs. distance (hops)]

Degree distributions (1): Let $p_k$ denote the fraction of nodes with degree k. We can plot a histogram of $p_k$ vs. k. In an (Erdős–Rényi) random graph the degree distribution follows a Poisson distribution. Degrees in real networks are heavily skewed to the right: the distribution has a long tail of values that are far above the mean. Heavy (long) tail: Amazon sales, word length distribution, …

Degree distributions (2): Many low-degree nodes, few high-degree nodes. [Plot: log($p_k$) vs. log(k)]

Degree distributions (3): Many real-world networks contain hubs: highly connected nodes. We can easily distinguish between an exponential and a power-law tail by plotting on log-lin and log-log axes. We usually work with the (complementary) CDF instead of the PDF; the degree exponent is then α = |slope| + 1, where the slope is measured on the log-log plot. In scale-free networks the maximum degree scales as $n^{1/(\alpha-1)}$. [Figure: degree distribution in a blog network, $p_k$ vs. k on lin-lin, log-lin, and log-log axes]
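
To make the "work with the CDF, then add 1 to the slope" rule concrete, the sketch below draws synthetic samples from a continuous power law and estimates the exponent from the slope of the empirical complementary CDF on log-log axes. Everything here is an illustrative assumption: the sampler, the function names, and the use of plain least squares (a quick diagnostic, not a rigorous estimator).

```python
import math
import random


def sample_power_law(n, alpha, x_min=1.0, seed=0):
    """Inverse-transform samples from a density p(x) ~ x^(-alpha), x >= x_min."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def ccdf_exponent(values):
    """Estimate alpha from the log-log slope of the empirical CCDF.

    The CCDF of a power law with exponent alpha decays as x^-(alpha - 1),
    so alpha = 1 - slope (equivalently |slope| + 1).
    """
    xs = sorted(values)
    n = len(xs)
    log_x = [math.log(x) for x in xs]
    # Fraction of samples that are >= xs[i].
    log_ccdf = [math.log((n - i) / n) for i in range(n)]
    mean_x = sum(log_x) / n
    mean_y = sum(log_ccdf) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_ccdf))
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    slope = sxy / sxx
    return 1.0 - slope


samples = sample_power_law(20000, alpha=2.5)
print(ccdf_exponent(samples))
```

The CCDF is preferred over a binned histogram because it needs no bin widths and is far less noisy in the tail, which is where the power law is read off.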

Poisson vs. Scale-free network: Poisson network (Erdős–Rényi random graph): the degree distribution is Poisson. Scale-free (power-law) network: the degree distribution is a power law. A function is scale-free if $f(ax) = b\,f(x)$.

Spectral properties: Eigenvalues of the graph adjacency matrix follow a power law. Network values (components of the principal eigenvector) also follow a power law. [Scree plot: eigenvalue vs. rank]

Temporal Graph Patterns: Conventional wisdom: (1) constant average degree: the number of edges grows linearly with the number of nodes; (2) slowly growing diameter: as the network grows, the distances between nodes grow. We recently found: (1) Densification Power Law: networks are becoming denser over time; (2) Shrinking diameter: the diameter is decreasing as the network grows.

Temporal Patterns – Densification: A very basic question: what is the relation between the number of nodes and the number of edges in a network? Let N(t) be the number of nodes at time t and E(t) the number of edges at time t. Suppose that N(t+1) = 2 * N(t). Q: What is your guess for E(t+1)? 2 * E(t)? A: Over-doubled! But obeying the Densification Power Law, $E(t) \propto N(t)^a$ with exponent a > 1. [Plot: E(t) vs. N(t), exponent 1.69]

Networks over time: Densification: Networks are becoming denser over time. The number of edges grows faster than the number of nodes, so the average degree is increasing. a is the densification exponent, 1 ≤ a ≤ 2. a=1: linear growth, constant out-degree (assumed in the literature so far). a=2: quadratic growth, clique. [Plots of E(t) vs. N(t): Internet, a=1.2; Citations, a=1.7]
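
Given snapshots (N(t), E(t)) of a growing network, the densification exponent can be read off as the slope of log E(t) against log N(t). A minimal sketch under that assumption follows; the function name and the snapshot numbers are hypothetical and are not data from the talk.

```python
import math


def densification_exponent(nodes, edges):
    """Least-squares slope of log E(t) versus log N(t)."""
    log_n = [math.log(n) for n in nodes]
    log_e = [math.log(e) for e in edges]
    mean_n = sum(log_n) / len(log_n)
    mean_e = sum(log_e) / len(log_e)
    sxy = sum((x - mean_n) * (y - mean_e) for x, y in zip(log_n, log_e))
    sxx = sum((x - mean_n) ** 2 for x in log_n)
    return sxy / sxx


# Hypothetical snapshots that follow E = N^1.5 exactly.
nodes = [100, 200, 400, 800]
edges = [n ** 1.5 for n in nodes]
a = densification_exponent(nodes, edges)
print(a)

# Doubling the number of nodes multiplies the number of edges by 2^a (> 2 when a > 1).
print(edges[1] / edges[0], 2 ** a)
```

With a = 1 the ratio printed on the last line would be exactly 2 (constant average degree); any a > 1 gives the "over-doubled" behaviour described above.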

Densification & degree distribution: How does densification affect the degree distribution? Given densification exponent a, the degree exponent γ behaves as follows. (a) For γ constant over time, we obtain densification only for 1 < γ < 2, and then γ = 2/a. (b) For γ > 2, the degree distribution has to evolve over time as the network grows. Note: for a power law $y = b\,x^{-\gamma}$ with γ < 2, E[y] = ∞; here $p_k \propto k^{-\gamma}$. [Plots of the degree exponent γ(t) over time: (a) a=1.1; (b) a=1.6]

Shrinking diameters: Intuition and prior work say that distances between the nodes slowly grow as the network grows (like log N): d ~ O(log N) or d ~ O(log log N). Instead, the diameter shrinks or stabilizes over time: as the network grows, the distances between nodes slowly decrease. [Plots: diameter over time for the Internet and Citations graphs]

Patterns hold in many graphs: All these patterns can be observed in many real-life graphs: World wide web [Barabasi]; On-line communities [Holme, Edling, Liljeros]; Who-calls-whom telephone networks [Cortes]; Autonomous systems [Faloutsos, Faloutsos, Faloutsos]; Internet backbone – routers [Faloutsos, Faloutsos, Faloutsos]; Movie – actors [Barabasi]; Science citations [Leskovec, Kleinberg, Faloutsos]; Co-authorship [Leskovec, Kleinberg, Faloutsos]; Sexual relationships [Liljeros]; Click-streams [Chakrabarti].

Graph Generators: Lots of work: Random graph [Erdos and Renyi, 60s]; Preferential Attachment [Albert and Barabasi, 1999]; Copying model [Kleinberg, Kumar, Raghavan, Rajagopalan and Tomkins, 1999]; Community Guided Attachment and Forest Fire Model [Leskovec, Kleinberg and Faloutsos, 2005]. Also work on Web graph and virus propagation [Ganesh et al, Satorras and Vespignani]++. But all of these either do not obey all the patterns, or we are not able to prove that they do.

Kronecker graphs Want to have a model that can generate a realistic graph: Static Patterns Power Law Degree Distribution Small Diameter Power Law Eigenvalue and Eigenvector Distribution Temporal Patterns Densification Power Law Shrinking/Constant Diameter For Kronecker graphs all these properties can actually be proven

Recursive Graph Generation: There are many obvious (but wrong) ways to expand an initial graph recursively: the result does not obey the Densification Power Law, or has increasing diameter. The Kronecker Product is exactly what we need. [Figure: initial graph and its recursive expansion]

Kronecker Product – a Graph: [Figure: intermediate stage of the expansion, with the adjacency matrices]

Kronecker Product – a Graph: Continuing to multiply by G1, we obtain G4 and so on … [Figure: G4 adjacency matrix]

Kronecker Graphs – Formally: We create the self-similar graphs recursively. Start with an initiator graph G1 on N1 nodes and E1 edges. The recursion will then produce larger graphs G2, G3, …, Gk on $N_1^k$ nodes. Since we want to obey the Densification Power Law, graph Gk has to have $E_1^k$ edges.

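Kronecker Product – Definition: The Kronecker product of an N x M matrix A and a K x L matrix B is the N*K x M*L matrix $C = A \otimes B = \begin{pmatrix} a_{1,1}B & a_{1,2}B & \cdots & a_{1,M}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{N,1}B & a_{N,2}B & \cdots & a_{N,M}B \end{pmatrix}$. We define the Kronecker product of two graphs as the Kronecker product of their adjacency matrices.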

Kronecker Graphs: We propose a growing sequence of graphs by iterating the Kronecker product: $G_k = G_{k-1} \otimes G_1$. Each Kronecker multiplication multiplies the number of nodes by N1, so the size of the graph grows exponentially with k.
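
The iteration can be sketched in a few lines of plain Python. The initiator below (a 3-node chain with self-loops) is a hypothetical example, the function names are illustrative, and "edges" are counted as nonzero entries of the adjacency matrix; the sketch only checks the node and edge counts $N_1^k$ and $E_1^k$ stated earlier.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def kron_power(g1, k):
    """k-th Kronecker power: G_k = G_{k-1} (x) G_1, with G_1 = g1."""
    result = g1
    for _ in range(k - 1):
        result = kron(result, g1)
    return result


def count_edges(adj):
    """Number of nonzero entries of a 0/1 adjacency matrix."""
    return sum(sum(row) for row in adj)


# Hypothetical initiator: 3-node chain with a self-loop on every node.
G1 = [[1, 1, 0],
      [1, 1, 1],
      [0, 1, 1]]

for k in (1, 2, 3):
    Gk = kron_power(G1, k)
    print(k, len(Gk), count_edges(Gk))
```

Materialising the full matrix costs $N_1^{2k}$ entries, so this is only a way to inspect small cases, not a generator for large graphs.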

Kronecker Graphs – Intuition Recursive growth of graph communities Nodes get expanded to micro communities Nodes in sub-community link among themselves and to nodes from different communities Little graph, super-graph (g1, g2)

How to randomize a graph? We want a randomized version of Kronecker Graphs. Obvious solution: randomly add/remove some edges. Wrong! This is not biased: adding random edges destroys the degree distribution, diameter, … We want to add/delete edges in a biased way. How do we randomize properly and maintain all the properties?

Stochastic Kronecker Graphs: Create an $N_1 \times N_1$ probability matrix P1. Compute the kth Kronecker power Pk. For each entry $p_{uv}$ of Pk, include an edge (u,v) with probability $p_{uv}$ (flip biased coins). Example: P1 = [0.5 0.2; 0.1 0.3] gives P2 = P1 ⊗ P1 = [0.25 0.10 0.10 0.04; 0.05 0.15 0.02 0.06; 0.05 0.02 0.15 0.06; 0.01 0.03 0.03 0.09], and flipping the coins gives an instance matrix G2.
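
The three steps on this slide translate almost line by line into code. The sketch below uses the slide's 2 x 2 matrix with entries 0.5, 0.2, 0.1, 0.3; the function names and the seeded random generator are assumptions added for reproducibility.

```python
import random


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def kron_power(p1, k):
    """k-th Kronecker power of the probability matrix p1."""
    result = p1
    for _ in range(k - 1):
        result = kron(result, p1)
    return result


def sample_graph(p1, k, seed=0):
    """Flip one biased coin per entry of P_k; 1 means the edge (u, v) is present."""
    rng = random.Random(seed)
    pk = kron_power(p1, k)
    return [[1 if rng.random() < p else 0 for p in row] for row in pk]


P1 = [[0.5, 0.2],
      [0.1, 0.3]]

for row in kron_power(P1, 2):       # the 4 x 4 matrix P2
    print([round(p, 2) for p in row])

for row in sample_graph(P1, 2):     # one random instance G2
    print(row)
```

Flipping a coin for every entry takes time quadratic in the number of nodes, which is acceptable for illustration but not for large graphs.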

Problem Definition: Given a growing graph with nodes N1, N2, …, generate a realistic sequence of graphs that will obey all the patterns. Static patterns: power-law degree distribution; power-law eigenvalue and eigenvector distribution; small diameter. Dynamic patterns: growth power law; shrinking/stabilizing diameters. First and only generator for which we can prove all the properties.

Properties of Kronecker Graphs: Theorem: Kronecker Graphs have multinomial in- and out-degree distributions (which can be made to behave like a power law). Proof: Let G1 have degrees $d_1, d_2, …, d_N$. Kronecker multiplication with a node of degree d gives degrees $d \cdot d_1, d \cdot d_2, …, d \cdot d_N$. After Kronecker powering, Gk has a multinomial degree distribution.

Eigen-value/-vector Distribution: Theorem: The Kronecker Graph has a multinomial distribution of its eigenvalues. Theorem: The components of each eigenvector in a Kronecker Graph follow a multinomial distribution. Proof: Trivial by properties of Kronecker multiplication.
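
The property of Kronecker multiplication that the proof relies on is presumably the mixed-product rule. The short derivation below is a sketch under the assumption that the adjacency matrix of G1 is diagonalizable (for instance, symmetric), with eigenpairs $(\lambda_i, x_i)$; the bracket notation $G_1^{[k]}$ for the k-th Kronecker power is introduced here for illustration.

```latex
% Mixed-product rule applied to eigenpairs of A and B:
\[
A x = \lambda x,\quad B y = \mu y
\;\Longrightarrow\;
(A \otimes B)(x \otimes y) = (A x) \otimes (B y) = \lambda \mu \,(x \otimes y).
\]
% Applied k times to G_1, every eigenvalue and eigenvector of the k-th power is a product:
\[
\text{eigenvalues of } G_1^{[k]}:\quad \lambda_{i_1}\lambda_{i_2}\cdots\lambda_{i_k},
\qquad
\text{eigenvectors}:\quad x_{i_1} \otimes x_{i_2} \otimes \cdots \otimes x_{i_k}.
\]
```

Each product is determined by how many times each $\lambda_i$ is chosen among the k factors, which is where the multinomial counts come from.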

Temporal Patterns: Densification: Theorem: Kronecker graphs follow a Densification Power Law with densification exponent $a = \log(E_1)/\log(N_1)$. Proof: If G1 has N1 nodes and E1 edges, then Gk has $N_k = N_1^k$ nodes and $E_k = E_1^k$ edges. And then $E_k = N_k^a$, which is a Densification Power Law.

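Constant Diameter – Proof Sketch: Theorem: If G1 has diameter d (and a self-loop on every node), then graph Gk also has diameter d. Observation: Edges in Kronecker graphs: $(X_{ij}, X_{kl}) \in G \otimes H$ iff $(X_i, X_k) \in G$ and $(X_j, X_l) \in H$, where X are appropriate nodes. [Example figure]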
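
The constant-diameter claim is easy to check numerically on a small initiator. The sketch below assumes a hypothetical 3-node chain with a self-loop on every node (the self-loops let one coordinate "wait" while the other moves) and computes the diameter by breadth-first search; the function names are illustrative.

```python
from collections import deque


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def diameter(adj):
    """Longest shortest path of a connected graph given as a 0/1 matrix."""
    n = len(adj)
    longest = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < n:
            raise ValueError("graph is not connected")
        longest = max(longest, max(dist.values()))
    return longest


# Hypothetical initiator: chain 0-1-2 with self-loops, diameter 2.
G1 = [[1, 1, 0],
      [1, 1, 1],
      [0, 1, 1]]
G2 = kron(G1, G1)
G3 = kron(G2, G1)
print(diameter(G1), diameter(G2), diameter(G3))
```

The graph grows from 3 to 9 to 27 nodes while the diameter stays the same, in contrast to the log N growth that earlier models predict.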

Why fitting generative models? Parameters tell us about the structure of a graph Extrapolation: given a graph today, how will it look in a year? Sampling: can I get a smaller graph with similar properties? Anonymization: instead of releasing real graph (e.g., email network), we can release a synthetic version of it

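Problem definition: Find the parameter matrix Θ that maximizes the likelihood of the observed graph G: $\arg\max_\Theta P(G|\Theta)$. We need to (efficiently) calculate P(G|Θ) and maximize it over Θ (by using gradient descent), so we also need to calculate the gradient.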

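Fitting Kronecker to Real Data: Given a graph G and Kronecker matrix Θ, we can calculate the probability P(G|Θ) that Θ generated G: for a fixed node labeling, each edge (u,v) of G contributes a factor Θk[u,v] and each missing edge a factor 1 − Θk[u,v]. [Figure: Θ = [0.5 0.2; 0.1 0.3], its Kronecker power Θk, a graph G, and P(G|Θ)]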
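
For a fixed node labeling, and assuming each edge is an independent coin flip as in the stochastic Kronecker model, the likelihood is a product over all entries of Θk. The naive sketch below evaluates it in log space; the function names are illustrative and all entries of Θ are assumed to lie strictly between 0 and 1.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def kron_power(theta, k):
    """k-th Kronecker power of theta."""
    result = theta
    for _ in range(k - 1):
        result = kron(result, theta)
    return result


def log_likelihood(adj, theta, k):
    """log P(G | theta) for the identity labeling, visiting all N^2 entries."""
    pk = kron_power(theta, k)
    total = 0.0
    for u, row in enumerate(pk):
        for v, p in enumerate(row):
            # Present edges contribute log p, absent edges log(1 - p).
            total += math.log(p) if adj[u][v] else math.log(1.0 - p)
    return total


theta = [[0.5, 0.2],
         [0.1, 0.3]]
# Hypothetical 4-node graph, with nodes in the same order as the rows of theta_k.
G = [[1, 1, 0, 0],
     [0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1]]
print(log_likelihood(G, theta, 2))
```

This direct evaluation touches every one of the N^2 entries and ignores the labeling problem, which are exactly the two difficulties taken up next.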

Challenge 1: Node labeling: Graphs G’ and G” that differ only in their node labeling should have the same probability: P(G’|Θ) = P(G”|Θ). So one needs to consider all node labelings σ. There are O(N!) such labelings. All labelings are a priori equally likely. [Figure: Θ, Θk, and two labelings G’ and G” of the same graph]

Challenge 2: calculating P(G|Θ,σ): With σ a node labeling and P = Θk, evaluating P(G|Θ,σ) takes $O(N^2)$ time, which is infeasible for large graphs. [Figure: G, Θk, and P(G|Θ,σ)]

Our solutions: Naïvely calculating P(G|Θ) takes $O(N!\,N^2)$ time. We can do it in O(E). Challenge 1: We won’t consider all labelings, but use Markov Chain Monte Carlo (MCMC) sampling techniques to sample permutations from P(σ|G,Θ). Challenge 2: Real graphs are sparse, $E \ll N^2$, so we calculate P(Gempty|Θ) and then “add” the edges. This takes O(E) (and not $O(N^2)$).

Sampling node labelings (1): The gradient over the parameters is a sum over all permutations σ, weighted by P(σ|G,Θ). We sample permutations from P(σ|G,Θ) and average the gradients.

Sampling node labelings (2): Metropolis permutation sampling algorithm: propose a new permutation by swapping the labels of two nodes j and k. We need to efficiently calculate the likelihood ratios, but the permutations σ(i) and σ(i+1) differ at only 2 positions, so we only need to traverse and update 2 rows (columns) of Θk. We can therefore evaluate the likelihood ratio efficiently.
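
A Metropolis sampler over node labelings can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the proposal swaps two randomly chosen positions, the prior over labelings is uniform, and for clarity the full likelihood is recomputed at every step instead of updating only the two affected rows and columns as described above.

```python
import math
import random


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def log_likelihood(adj, pk, sigma):
    """log P(G | theta, sigma): graph node u is mapped to row sigma[u] of pk."""
    n = len(adj)
    total = 0.0
    for u in range(n):
        for v in range(n):
            p = pk[sigma[u]][sigma[v]]
            total += math.log(p) if adj[u][v] else math.log(1.0 - p)
    return total


def sample_permutations(adj, pk, steps, seed=0):
    """Metropolis chain over labelings with symmetric swap proposals."""
    rng = random.Random(seed)
    n = len(adj)
    sigma = list(range(n))
    current = log_likelihood(adj, pk, sigma)
    samples = []
    for _ in range(steps):
        j, k = rng.sample(range(n), 2)
        sigma[j], sigma[k] = sigma[k], sigma[j]          # propose a swap
        proposed = log_likelihood(adj, pk, sigma)
        # Accept with probability min(1, likelihood ratio); 1 - random() is in (0, 1].
        if math.log(1.0 - rng.random()) < proposed - current:
            current = proposed
        else:
            sigma[j], sigma[k] = sigma[k], sigma[j]      # reject: undo the swap
        samples.append(list(sigma))
    return samples


theta = [[0.5, 0.2], [0.1, 0.3]]
pk = kron(theta, theta)
G = [[1, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
print(sample_permutations(G, pk, steps=5))
```

Because a swap changes only the rows and columns of nodes j and k, the difference `proposed - current` could be computed from those entries alone, which is the speed-up the slide refers to.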

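Calculating P(G|Θ,σ): Real graphs are sparse, so we first calculate the likelihood of the empty graph. The probability of edge (i,j) is in general $p_{ij} = \theta_1^a \theta_2^b \theta_3^c \theta_4^d$ (with a + b + c + d = k for a 2 x 2 matrix Θ). By using the Taylor approximation $\log(1-x) \approx -x - \tfrac{1}{2}x^2$ for $\log(1-p_{ij})$ and summing the multinomial series we obtain the log-likelihood of the empty graph: $\ell_e(\Theta) \approx -\left(\sum_i \theta_i\right)^k - \tfrac{1}{2}\left(\sum_i \theta_i^2\right)^k$. We then approximate the likelihood by correcting for the edges that are present: $\ell(\Theta) \approx \ell_e(\Theta) + \sum_{(i,j) \in G} \left[\log p_{ij} - \log(1 - p_{ij})\right]$.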
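
The empty-graph trick can be sketched as follows. Applying the Taylor approximation log(1-x) ≈ -x - 0.5 x² to every entry of Θk and using the fact that the entries of a k-th Kronecker power sum to the k-th power of the entry sum gives a closed form for the empty graph; each present edge then adds a correction. The function names and the example edge list are assumptions for illustration, and the labeling is taken to be the identity.

```python
import math


def approx_empty_log_likelihood(theta, k):
    """Second-order Taylor approximation of sum over all (u, v) of log(1 - p_uv)."""
    s1 = sum(t for row in theta for t in row)        # sum of the entries of theta
    s2 = sum(t * t for row in theta for t in row)    # sum of the squared entries
    return -(s1 ** k) - 0.5 * (s2 ** k)


def edge_probability(theta, k, u, v):
    """Entry (u, v) of the k-th Kronecker power, without building the matrix."""
    n1 = len(theta)
    p = 1.0
    for _ in range(k):
        # One base-n1 digit of u and of v selects one factor from theta.
        p *= theta[u % n1][v % n1]
        u //= n1
        v //= n1
    return p


def approx_log_likelihood(theta, k, edges):
    """Start from the empty graph and 'add' each present edge: O(E * k) work."""
    total = approx_empty_log_likelihood(theta, k)
    for u, v in edges:
        p = edge_probability(theta, k, u, v)
        total += math.log(p) - math.log(1.0 - p)
    return total


theta = [[0.5, 0.2],
         [0.1, 0.3]]
edges = [(0, 0), (0, 1), (1, 1), (2, 0), (3, 3)]     # hypothetical edge list
print(approx_empty_log_likelihood(theta, 2))
print(approx_log_likelihood(theta, 2, edges))
```

The approximation error comes only from the truncated Taylor series, so it shrinks as the individual edge probabilities get smaller, which is the typical regime for large sparse graphs.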

Convergence of fitting: Can gradient descent recover the true parameters? How nice (smooth, without local minima) is the optimization space? Generate a graph from random parameters, start at a random point, and use gradient descent. We recover the true parameters 98% of the time. How does the algorithm converge to the true parameters over gradient descent iterations? [Plots: log-likelihood, average absolute error, 1st eigenvalue, diameter]

AS graph (N=6500, E=26500): [Plots: degree distribution, hop plot, adjacency matrix eigenvalues, network value]

Epinions graph (N=76k, E=510k): [Plots: degree distribution, hop plot, network value, adjacency matrix eigenvalues]

Scalability Fitting scales linearly with the number of edges

Model selection: How big should the parameter matrix Θ be? We propose to use the Bayesian Information Criterion (BIC), which trades off the model fit against the model complexity.

Conclusion: We proposed Kronecker Graphs. We can prove properties of the Kronecker Graph model. We presented scalable algorithms for fitting Kronecker Graphs: use simulation techniques to overcome the super-exponential number of node labelings, and use a Taylor approximation to quickly evaluate the likelihood. Kronecker Graphs fit well.

References:
Jure Leskovec, Jon Kleinberg and Christos Faloutsos. Graph Evolution: Densification and Shrinking Diameters. ACM TKDD, 2007.
Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg and Christos Faloutsos. Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication. PKDD, 2005.
Jure Leskovec and Christos Faloutsos. Scalable Modeling of Real Graphs using Kronecker Multiplication. In submission.