Nonparametric Link Prediction in Dynamic Graphs
Purnamrita Sarkar (UC Berkeley), Deepayan Chakrabarti (Facebook), Michael Jordan (UC Berkeley)


Link Prediction
- Who is most likely to interact with a given node?
- Friend suggestion in Facebook: should Facebook suggest Alice as a friend for Bob?

Link Prediction
- Movie recommendation in Netflix: should Netflix suggest this movie to Alice?
[Figure labels: Alice, Bob, Charlie]

Link Prediction
Prediction using simple features:
- degree of a node
- number of common neighbors
- last time a link appeared
What if the graph is dynamic?

Related Work
Generative models:
- Exponential family random graph models [Hanneke+/’06]
- Dynamics in latent space [Sarkar+/’05]
- Extension of mixed membership block models [Fu+/10]
Other approaches:
- Autoregressive models for links [Huang+/09]
- Extensions of static features [Tylenda+/09]

Goal
Link prediction
- incorporating graph dynamics,
- requiring weak modeling assumptions,
- allowing fast predictions,
- and offering consistency guarantees.

Outline
- Model
- Estimator
- Consistency
- Scalability
- Experiments

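The Link Prediction Problem in Dynamic Graphs
Observed graph sequence $G_1, G_2, \dots, G_T$; the graph to predict is $G_{T+1}$.
Example for a pair $(i,j)$: $Y_1(i,j) = 1$, $Y_2(i,j) = 0$, $Y_{T+1}(i,j) = ?$
$Y_{T+1}(i,j) \mid G_1, G_2, \dots, G_T \sim \mathrm{Bernoulli}\big(g_{G_1, G_2, \dots, G_T}(i,j)\big)$
- $Y_{T+1}(i,j)$: edge in $T+1$
- $g_{G_1, G_2, \dots, G_T}(i,j)$: features of previous graphs and this pair of nodes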

Including graph-based features
Example set of features for pair $(i,j)$:
- $cn(i,j)$ (common neighbors)
- $\ell\ell(i,j)$ (last time a link was formed)
- $deg(j)$
Represent dynamics using “datacubes” of these features (≈ a multi-dimensional histogram on binned feature values). For one cell, e.g. $1 \le cn \le 3$, $3 \le deg \le 6$, $1 \le \ell\ell \le 2$:
- $\eta_t$ = number of pairs in $G_t$ with these features
- $\eta_t^+$ = number of pairs in $G_t$ with these features that had an edge in $G_{t+1}$
High $\eta_t^+ / \eta_t$ ⇒ this feature combination is more likely to create a new edge at time $t+1$.
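The counting behind a datacube can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`bin_index`, `build_datacube`), the bin edges, and the toy feature values are hypothetical and do not come from the slides.

```python
from collections import Counter

def bin_index(value, upper_edges):
    """Index of the first bin whose (inclusive) upper edge is >= value."""
    for k, upper in enumerate(upper_edges):
        if value <= upper:
            return k
    return len(upper_edges)  # overflow bin for values above the last edge

def build_datacube(pair_features, edges_next, upper_edges):
    """Count pairs per binned feature cell.

    pair_features: {(i, j): (cn, ll, deg)} measured in G_t.
    edges_next:    set of pairs (i, j) that are linked in G_{t+1}.
    Returns (eta, eta_plus), both keyed by the binned cell.
    """
    eta, eta_plus = Counter(), Counter()
    for pair, feats in pair_features.items():
        cell = tuple(bin_index(v, e) for v, e in zip(feats, upper_edges))
        eta[cell] += 1
        if pair in edges_next:
            eta_plus[cell] += 1
    return eta, eta_plus

# Hypothetical bins: bin 1 of each feature is 1<=cn<=3, 1<=ll<=2, 3<=deg<=6.
UPPER_EDGES = [(0, 3), (0, 2), (2, 6)]
pair_features = {
    ("a", "b"): (2, 1, 4),
    ("a", "c"): (1, 2, 5),
    ("b", "d"): (0, 1, 3),
    ("c", "d"): (3, 2, 6),
}
edges_next = {("a", "b"), ("b", "d")}

eta, eta_plus = build_datacube(pair_features, edges_next, UPPER_EDGES)
cell = (1, 1, 1)
ratio = eta_plus[cell] / eta[cell]  # fraction of pairs in this cell that linked
```

In this toy input three pairs fall into the cell (1, 1, 1) and one of them is linked at the next step, so the ratio for that cell is one third.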

Including graph-based features
How do we form these datacubes?
Vanilla idea: one datacube for $G_t \to G_{t+1}$, aggregated over all pairs $(i,j)$
- Does not allow for differently evolving communities
[Figure: graph sequence $G_1, G_2, \dots, G_T$ with $Y_1(i,j) = 1$, $Y_2(i,j) = 0$, $Y_{T+1}(i,j) = ?$; example cell $1 \le cn(i,j) \le 3$, $3 \le deg(i,j) \le 6$, $1 \le \ell\ell(i,j) \le 2$]

Our Model
How do we form these datacubes?
Our model: one datacube for each neighborhood
- Captures local evolution
[Figure: graph sequence $G_1, G_2, \dots, G_T$ with $Y_1(i,j) = 1$, $Y_2(i,j) = 0$, $Y_{T+1}(i,j) = ?$; example cell $1 \le cn(i,j) \le 3$, $3 \le deg(i,j) \le 6$, $1 \le \ell\ell(i,j) \le 2$]

Our Model
Datacube (example cell: $1 \le cn(i,j) \le 3$, $3 \le deg(i,j) \le 6$, $1 \le \ell\ell(i,j) \le 2$), with two counts per feature cell $s$:
- $\eta$: number of node pairs with feature $s$, in the neighborhood of $i$, at time $t$
- $\eta^+$: number of node pairs with feature $s$, in the neighborhood of $i$, at time $t$, which got connected at time $t+1$
Neighborhood $N_t(i)$ = nodes within 2 hops
Features extracted from $(N_{t-p}, \dots, N_t)$
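A per-neighborhood datacube can be sketched as follows. The sketch makes several simplifying assumptions that are not in the slides: the only feature is the number of common neighbors (capped at 2), every pair inside the neighborhood is counted, and the graphs are plain adjacency dictionaries. The function names are hypothetical.

```python
from collections import Counter

def two_hop_neighborhood(adj, i):
    """Nodes within 2 hops of i, including i itself."""
    hood = {i} | adj[i]
    for u in adj[i]:
        hood |= adj[u]
    return hood

def neighborhood_datacube(adj_t, adj_next, i, cap=2):
    """Counts (eta, eta_plus) over node pairs inside the 2-hop neighborhood of i.

    The single feature used here is min(#common neighbors, cap); this is an
    illustrative simplification of the multi-feature datacube.
    """
    hood = sorted(two_hop_neighborhood(adj_t, i))
    eta, eta_plus = Counter(), Counter()
    for idx, u in enumerate(hood):
        for v in hood[idx + 1:]:
            s = min(len(adj_t[u] & adj_t[v]), cap)
            eta[s] += 1
            if v in adj_next[u]:  # pair is linked at time t+1
                eta_plus[s] += 1
    return eta, eta_plus

# Toy path graph a-b-c-d-e at time t; at time t+1 the edge a-c appears.
adj_t = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
         "d": {"c", "e"}, "e": {"d"}}
adj_next = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
            "d": {"c", "e"}, "e": {"d"}}

eta_a, plus_a = neighborhood_datacube(adj_t, adj_next, "a")
eta_e, plus_e = neighborhood_datacube(adj_t, adj_next, "e")
```

Both neighborhoods contain one pair with a single common neighbor, but only the one around `a` sees that pair become linked. A single global datacube would average the two, which is the motivation for keeping one datacube per neighborhood.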

Our Model
Datacube $d_t(i)$ captures graph evolution
- in the local neighborhood of a node
- in the recent past
Model: $Y_{T+1}(i,j) \mid G_1, G_2, \dots, G_T \sim \mathrm{Bernoulli}\big(g_{G_1, G_2, \dots, G_T}(i,j)\big)$, with $g_{G_1, G_2, \dots, G_T}(i,j)$ taken to be $g(d_t(i), s_t(i,j))$
- $s_t(i,j)$: features of the pair
- $d_t(i)$: local evolution patterns
What is $g(\cdot)$?


Kernel Estimator for g
[Figure: the query (datacube at T-1 and feature vector at time T) is compared against the (datacube, feature) pairs from t=1, t=2, t=3, …; similarities are computed between the query and each pair.]

Kernel Estimator for g
Factorize the similarity function: for (datacube, feature) pairs $(d, s)$ and $(d', s')$, the similarity is $K(d, d') \cdot I\{s = s'\}$
- Allows computation of $g(\cdot)$ via simple lookups

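Kernel Estimator for g
[Figure: datacubes from t=1, t=2, t=3, … drawn from $G_1, G_2, \dots, G_{T-2}, G_{T-1}, G_T$; similarities are computed only between datacubes, giving weights $w_1, w_2, w_3, w_4$ attached to the counts $(\eta_1, \eta_1^+), (\eta_2, \eta_2^+), (\eta_3, \eta_3^+), (\eta_4, \eta_4^+)$.]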

Kernel Estimator for g
Factorize the similarity function: for (datacube, feature) pairs $(d, s)$ and $(d', s')$, the similarity is $K(d, d') \cdot I\{s = s'\}$
- Allows computation of $g(\cdot)$ via simple lookups
- What is $K(d, d')$?
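One way to read the factorized similarity is as a weighted ratio of counts: the indicator keeps only the query's feature cell in every past datacube, and the datacube similarities act as weights. The sketch below assumes that form (weighted sum of $\eta^+$ over weighted sum of $\eta$); the function name and the toy numbers are hypothetical, and the weights are passed in directly rather than computed from a kernel.

```python
def estimate_g(weights, cubes, cell):
    """Weighted-ratio estimate of the link probability for one feature cell.

    weights: similarity w_k between the query datacube and past datacube k.
    cubes:   list of (eta, eta_plus) dicts, one per past datacube.
    cell:    the query pair's feature cell s; because of the indicator
             I{s = s'}, only this cell of each past datacube is looked up.
    Returns None when no past datacube has any pair in that cell.
    """
    num = sum(w * plus.get(cell, 0) for w, (eta, plus) in zip(weights, cubes))
    den = sum(w * eta.get(cell, 0) for w, (eta, plus) in zip(weights, cubes))
    return num / den if den > 0 else None

# Two past datacubes; the second is only half as similar to the query.
cubes = [
    ({"s": 10}, {"s": 5}),     # 5 of 10 pairs linked
    ({"s": 100}, {"s": 10}),   # 10 of 100 pairs linked
]
g_hat = estimate_g([1.0, 0.5], cubes, "s")    # (5 + 5) / (10 + 50)
g_missing = estimate_g([1.0, 0.5], cubes, "unseen")
```

Each prediction needs only one lookup per past datacube, which is what makes the factorization cheap once the datacube similarities are known.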

Similarity between two datacubes
Idea 1: for each cell $s$, take $(\eta_1^+/\eta_1 - \eta_2^+/\eta_2)^2$ and sum
Problem:
- Magnitude of $\eta$ is ignored
- 5/10 and 50/100 are treated equally
Consider the distribution

Similarity between two datacubes
Idea 2: for each cell $s$, compute the posterior distribution of the edge creation probability
- dist = total variation distance between the distributions, summed over all cells
- $0 < b < 1$; as $b \to 0$, $K(d, d') \to 0$ unless $\mathrm{dist}(d, d') = 0$
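The second idea can be illustrated numerically. The sketch below rests on assumptions the slide does not state: a uniform Beta(1, 1) prior, so that the posterior for a cell is Beta($\eta^+ + 1$, $\eta - \eta^+ + 1$); a midpoint-rule approximation of the total variation distance; and the kernel form $K = b^{\mathrm{dist}}$, chosen only because it matches the limiting behaviour described above. All function names are hypothetical.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1), computed in log space."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))

def tv_distance(eta1, plus1, eta2, plus2, grid=2000):
    """Total variation distance between two Beta posteriors (midpoint rule)."""
    total = 0.0
    for k in range(grid):
        x = (k + 0.5) / grid
        p = beta_pdf(x, plus1 + 1, eta1 - plus1 + 1)
        q = beta_pdf(x, plus2 + 1, eta2 - plus2 + 1)
        total += abs(p - q)
    return 0.5 * total / grid

def cube_kernel(cube1, cube2, b=0.5):
    """Assumed kernel b ** dist, with dist summed over all occupied cells.

    Each cube maps cell -> (eta, eta_plus); a missing cell is treated as
    (0, 0), i.e. the uniform prior.
    """
    cells = set(cube1) | set(cube2)
    dist = sum(
        tv_distance(*cube1.get(s, (0, 0)), *cube2.get(s, (0, 0)))
        for s in cells
    )
    return b ** dist

same = tv_distance(10, 5, 10, 5)        # identical counts
scaled = tv_distance(10, 5, 100, 50)    # same ratio, different magnitude
k_self = cube_kernel({"s": (10, 5)}, {"s": (10, 5)})
k_half = cube_kernel({"s": (10, 5)}, {"s": (100, 50)}, b=0.5)
k_tiny = cube_kernel({"s": (10, 5)}, {"s": (100, 50)}, b=0.01)
```

Unlike the squared difference of ratios, the posterior-based distance separates 5/10 from 50/100, because the second posterior is much more concentrated around 0.5. Shrinking `b` makes the kernel more selective.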

Kernel Estimator for g
Want to show:


Consistency of Estimator
Lemma 1: As $T \to \infty$, for some $R > 0$,
Proof using: As $T \to \infty$,

Consistency of Estimator
Lemma 2: As $T \to \infty$,

Consistency of Estimator
Assumption: finite graph
Proof sketch:
- Dynamics are Markovian with a finite state space
- ⇒ the chain must eventually enter a closed, irreducible communication class
- ⇒ geometric ergodicity if the class is aperiodic (if not, more complicated…)
- ⇒ strong mixing with exponential decay
- ⇒ variances decay as o(1/T)

Consistency of Estimator
Theorem:
Proof sketch:
- for some $R > 0$
- So


Scalability
Full solution:
- Summing over all n datacubes for all T timesteps
- Infeasible
Approximate solution:
- Sum over nearest neighbors of the query datacube
How do we find nearest neighbors?
- Locality Sensitive Hashing (LSH) [Indyk+/98, Broder+/98]

Using LSH
Devise a hashing function for datacubes such that
- “Similar” datacubes tend to be hashed to the same bucket
- “Similar” = small total variation distance between cells of datacubes

Using LSH
Step 1: Map datacubes to bit vectors
- Use $B_1$ buckets to discretize $[0,1]$
- Use $B_2$ bits for each bucket
- For probability mass p, the first bits are set to 1
- Total $M \cdot B_1 \cdot B_2$ bits, where M = max number of occupied cells << total number of cells
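The bit-vector mapping can be sketched for a single cell. Two details here are assumptions rather than statements from the slide: the number of leading 1-bits for mass p is taken to be `round(p * B2)`, and the hash is plain bit sampling (a standard LSH family for Hamming distance). The names and the example masses are hypothetical.

```python
import random

def unary_encode(masses, b2):
    """Encode bucket masses as bits: B2 bits per bucket, leading bits set to 1.

    masses: probability mass of the cell's posterior in each of the B1 buckets.
    The count of leading 1-bits, round(p * b2), is an assumed rounding rule.
    """
    bits = []
    for p in masses:
        ones = round(p * b2)
        bits.extend([1] * ones + [0] * (b2 - ones))
    return bits

def hamming(u, v):
    """Number of positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(u, v))

def bit_sample_hash(bits, positions):
    """Bit-sampling hash key: the bits at a fixed random subset of positions."""
    return tuple(bits[k] for k in positions)

B2 = 8
p_masses = [0.5, 0.25, 0.25, 0.0]   # B1 = 4 buckets over [0, 1]
q_masses = [0.25, 0.25, 0.5, 0.0]

u = unary_encode(p_masses, B2)
v = unary_encode(q_masses, B2)
l1 = sum(abs(p - q) for p, q in zip(p_masses, q_masses))
scaled_hamming = hamming(u, v) / B2   # tracks the L1 distance between masses

positions = random.Random(0).sample(range(len(u)), 6)
key_u = bit_sample_hash(u, positions)
key_v = bit_sample_hash(v, positions)
```

With this encoding the Hamming distance divided by `B2` equals the L1 distance between the discretized distributions (up to rounding), and the L1 distance is twice the total variation distance. Bit vectors at small Hamming distance are therefore likely to share a sampled key, which is what lets a hash table return candidate nearest datacubes.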

Using LSH

Fast Search Using LSH


Experiments

Setup
- Training data: $G_1, G_2, \dots, G_T$
- Test data: $G_{T+1}$

Simulations
Social network model of Hoff et al.
- Each node has an independently drawn feature vector
- Edge(i,j) depends on features of i and j
- Seasonality effect: feature importance varies with season ⇒ different communities in each season
- Feature vectors evolve smoothly over time ⇒ evolving community structures

Simulations
- NonParam is much better than the others in the presence of seasonality
- CN (common neighbors), AA (Adamic/Adar), and Katz implicitly assume smooth evolution

Sensor Network

Summary
Link formation is assumed to depend on
- the neighborhood’s evolution
- over a time window
Admits a kernel-based estimator
- Consistency
- Scalability via LSH
Works particularly well for
- Seasonal effects
- Differently evolving communities