Link Prediction and Network Inference

Slides:



Advertisements
Similar presentations
Great Theoretical Ideas in Computer Science
Advertisements

Ch. Eick: More on Machine Learning & Neural Networks Different Forms of Learning: –Learning agent receives feedback with respect to its actions (e.g. using.
Great Theoretical Ideas in Computer Science for Some.
Spread of Influence through a Social Network Adapted from :
Maximizing the Spread of Influence through a Social Network
Maximizing the Spread of Influence through a Social Network
Link Analysis: PageRank
Online Social Networks and Media. Graph partitioning The general problem – Input: a graph G=(V,E) edge (u,v) denotes similarity between u and v weighted.
Absorbing Random walks Coverage
DATA MINING LECTURE 12 Link Analysis Ranking Random walks.
Great Theoretical Ideas in Computer Science.
1 Discrete Structures & Algorithms Graphs and Trees: II EECE 320.
1 On Compressing Web Graphs Michael Mitzenmacher, Harvard Micah Adler, Univ. of Massachusetts.
INFERRING NETWORKS OF DIFFUSION AND INFLUENCE Presented by Alicia Frame Paper by Manuel Gomez-Rodriguez, Jure Leskovec, and Andreas Kraus.
Today Logistic Regression Decision Trees Redux Graphical Models
CSE 550 Computer Network Design Dr. Mohammed H. Sqalli COE, KFUPM Spring 2007 (Term 062)
Maximizing Product Adoption in Social Networks
Models of Influence in Online Social Networks
1 1 MPI for Intelligent Systems 2 Stanford University Manuel Gomez Rodriguez 1,2 David Balduzzi 1 Bernhard Schölkopf 1 UNCOVERING THE TEMPORAL DYNAMICS.
Modeling Relationship Strength in Online Social Networks Rongjing Xiang: Purdue University Jennifer Neville: Purdue University Monica Rogati: LinkedIn.
Jure Leskovec Computer Science Department Cornell University / Stanford University Joint work with: Eric Horvitz, Michael Mahoney,
Using Transactional Information to Predict Link Strength in Online Social Networks Indika Kahanda and Jennifer Neville Purdue University.
DATA MINING LECTURE 13 Absorbing Random walks Coverage.
1 1 Stanford University 2 MPI for Biological Cybernetics 3 California Institute of Technology Inferring Networks of Diffusion and Influence Manuel Gomez.
1 1 Stanford University 2 MPI for Biological Cybernetics 3 California Institute of Technology Inferring Networks of Diffusion and Influence Manuel Gomez.
Information Spread and Information Maximization in Social Networks Xie Yiran 5.28.
Influence Maximization in Dynamic Social Networks Honglei Zhuang, Yihan Sun, Jie Tang, Jialin Zhang, Xiaoming Sun.
Chengjie Sun,Lei Lin, Yuan Chen, Bingquan Liu Harbin Institute of Technology School of Computer Science and Technology 1 19/11/ :09 PM.
DATA MINING LECTURE 13 Pagerank, Absorbing Random Walks Coverage Problems.
Jure Leskovec Computer Science Department Cornell University / Stanford University Joint work with: Jon Kleinberg (Cornell), Christos.
CS 782 – Machine Learning Lecture 4 Linear Models for Classification  Probabilistic generative models  Probabilistic discriminative models.
Maximizing the Spread of Influence through a Social Network Authors: David Kempe, Jon Kleinberg, É va Tardos KDD 2003.
Online Social Networks and Media
Link Prediction Topics in Data Mining Fall 2015 Bruno Ribeiro
CIAR Summer School Tutorial Lecture 1b Sigmoid Belief Nets Geoffrey Hinton.
Manuel Gomez Rodriguez Bernhard Schölkopf I NFLUENCE M AXIMIZATION IN C ONTINUOUS T IME D IFFUSION N ETWORKS , ICML ‘12.
Chapter 2-OPTIMIZATION G.Anuradha. Contents Derivative-based Optimization –Descent Methods –The Method of Steepest Descent –Classical Newton’s Method.
CS 590 Term Project Epidemic model on Facebook
Jure Leskovec (Stanford), Daniel Huttenlocher and Jon Kleinberg (Cornell)
A Latent Social Approach to YouTube Popularity Prediction Amandianeze Nwana Prof. Salman Avestimehr Prof. Tsuhan Chen.
Supervised Random Walks: Predicting and Recommending Links in Social Networks Lars Backstrom (Facebook) & Jure Leskovec (Stanford) Proc. of WSDM 2011 Present.
1 1 MPI for Intelligent Systems 2 Stanford University Manuel Gomez Rodriguez 1,2 Bernhard Schölkopf 1 S UBMODULAR I NFERENCE OF D IFFUSION NETWORKS FROM.
1 1 Stanford University 2 MPI for Biological Cybernetics 3 California Institute of Technology Inferring Networks of Diffusion and Influence Manuel Gomez.
Link Prediction Class Data Mining Technology for Business and Society
Hidden Markov Models BMI/CS 576
Inferring Networks of Diffusion and Influence
Independent Cascade Model and Linear Threshold Model
Greedy & Heuristic algorithms in Influence Maximization
DM-Group Meeting Liangzhe Chen, Nov
Link Prediction Seminar Social Media Mining University UC3M
Great Theoretical Ideas in Computer Science
Social and Information Network Analysis: Review of Key Concepts
Community detection in graphs
Friend Recommendation with a Target User in Social Networking Services
Location Recommendation — for Out-of-Town Users in Location-Based Social Network Yina Meng.
Maximizing the Spread of Influence through a Social Network
Discrete Mathematics for Computer Science
The Importance of Communities for Learning to Influence
Effective Social Network Quarantine with Minimal Isolation Costs
Bayesian Models in Machine Learning
Collaborative Filtering Matrix Factorization Approach
Cost-effective Outbreak Detection in Networks
CS224w: Social and Information Network Analysis
Boltzmann Machine (BM) (§6.4)
Graph-based Security and Privacy Analytics via Collective Classification with Joint Weight Learning and Propagation Binghui Wang, Jinyuan Jia, and Neil.
Lecture 21 Network evolution
Viral Marketing over Social Networks
Independent Cascade Model and Linear Threshold Model
Human-centered Machine Learning
Analysis of Large Graphs: Overlapping Communities
Presentation transcript:

Link Prediction and Network Inference CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu

Link Prediction in Networks [LibenNowell-Kleinberg ‘03] Link Prediction in Networks The link prediction task: Given G[t0,t0’] a graph on edges up to time t0’ output a ranked list L of links (not in G[t0,t0’]) that are predicted to appear in G[t1,t1’] Evaluation: n=|Enew|: # new edges that appear during the test period [t1,t1’] Take top n elements of L and count correct edges G[t0, t’0] G[t1, t’1] 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Link Prediction via Proximity [LibenNowell-Kleinberg ‘03] Link Prediction via Proximity Predict links in a evolving collaboration network Core: Since network data is very sparse Consider only nodes with in-degree and out-degree of at least 3 Picture – Core vs non-core 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Link Prediction via Proximity Methodology: For every pair of nodes (x,y) compute proximity c # of common neighbors c(x,y) of x and y Sort pairs by the decreasing score only condier/predict edges where both endpoints are in the core (deg. >3) Predict top n pairs as new links X X 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Link Prediction via Proximity [LibenNowell-Kleinberg ‘03] Link Prediction via Proximity For every pair of nodes (x,y) compute: Γ(x) … degree of node x 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Results: Improvement over Random [LibenNowell-Kleinberg ’ 03] Results: Improvement over Random 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Results: Common Neighbors Improvement over #common neighbors 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Supervised Link Prediction [WSDM ‘11] Supervised Link Prediction How to learn to predict new friends? Facebook’s People You May Know Let’s look at the data: 92% of new friendships on FB are friend-of-a-friend More common friends helps v z w u 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Supervised Link Prediction Recommend a list of possible friends Supervised machine learning setting: Training example: For every node s have a list of nodes she will create links to {v1 … vk} Use FB network from May 2011 and {v1…vk} are the new friendships you created since then Task: For a given node s rank nodes {v1 … vk} higher than other nodes in the network Supervised Random Walks s “positive” nodes “negative” nodes Green nodes are the nodes to which s creates links in the future 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Supervised Link Prediction How to combine node/edge attributes and the network structure? Learn a strength of each edge based on: Profile of user u, profile of user v Interaction history of u and v Do a PageRank-like random walk from s to measure the “proximity” between s and other nodes Rank nodes by their “proximity” (i.e., visiting prob.) s s     “positive” nodes “negative” nodes 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Supervised Random Walks [WSDM ’11] Supervised Random Walks Let s be the center node Let fβ(u,v) be a function that assigns a strength to each edge: 𝑎 𝑢𝑣 = 𝑓 𝛽 𝑢,𝑣 = exp − 𝑖 𝛽 𝑖 ⋅ Ψ 𝑢𝑣 𝑖 Ψuv is a feature vector Features of node u Features of node v Features of edge (u,v) (β is the parameter vector we want to learn!) Do Random Walk with Restarts from s where transitions are according to edge strengths auv s “positive” nodes “negative” nodes 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

SRW: Prediction How to estimate edge strengths? Personalized PageRank on the weighted graph. Each node u gets a PageRank proximity pu Set edge strengths auv = fβ(u,v) Network Sort nodes by the decreasing PageRank proximity pu How to estimate edge strengths? How to set parameters β of fβ(u,v)? Recommend top k nodes with the highest proximity pu to node s 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Personalized PageRank [WSDM ’11] Personalized PageRank auv …. Strength of edge (u,v,) Random walk transition matrix: PageRank transition matrix: with prob. α jump back to s Compute PageRank vector: p = pT Q Rank nodes u by pu s “positive” nodes “negative” nodes 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

The Optimization Problem [WSDM ’11] The Optimization Problem Each node u has a score pu Positive nodes D ={d1,…, dk} Negative nodes L = {the rest} What do we want? Want to find β such that pl < pd The exact solution to the above problem may not exist So we make the constrains “soft” (i.e., optional) s “positive” nodes “negative” nodes 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Making Constraints “Soft” [WSDM ’11] Making Constraints “Soft” Want to minimize: Loss: h(x)=0 if x<0, x2 else Loss pl < pd pl=pd pl > pd 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Solving the Problem: Intuition [WSDM ’11] Solving the Problem: Intuition How to minimize F? Both pl and pd depend on β Given β assign edge weights auv=fβ(u,v) Using transition matrix Q=[auv] compute PageRank scores pu Rank nodes by the PageRank score Want to find β such that pl < pd s 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Gradient Descent i.e. How to minimize F? We know: So: [WSDM ’11] Gradient Descent How to minimize F? Take the derivative! We know: i.e. So: Looks like the PageRank equation! Easy 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Optimizing F To optimize F, use gradient based method: Pick a random starting point β0 Compute the personalized PageRank vector p Compute the gradient with respect to the weight vector β Update β Optimize using quasi-Newton method 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Data: Facebook Facebook Iceland network For every node s: [WSDM ’11] Data: Facebook Facebook Iceland network 174,000 nodes (55% of population) Avg. degree 168 Avg. person added 26 friends/month For every node s: Positive examples: D={ new friendships of s created in Nov ‘09 } Negative examples: L={ other nodes s did not create new links to } Limit to friends of friends: on avg. there are 20k FoFs (max 2M)! s 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Experimental setting Node and Edge features for learning: Baselines: Node: Age, Gender, Degree Edge: Age of an edge, Communication, Profile visits, Co-tagged photos Baselines: Decision trees and logistic regression: Above features + 10 network features (PageRank, common friends) Evaluation: AUC and Precision at Top20 Skip AUC evaluation, Skip Logistic Regression, DT? 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Results: Facebook Iceland Facebook: predict future friends Adamic-Adar already works great Logistic regression also strong SRW gives slight improvement No AUC column, No DT, LR What are feature sets? Node, Network, Path 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Network Inference

Hidden and implicit networks Many networks are implicit or hard to observe: Hidden/hard-to-reach populations: Network of needle sharing between drug injection users Implicit connections: Network of information propagation in online news media But we can observe results of the processes taking place on such (invisible) networks: Virus propagation: Drug users get sick, and we observe when they see the doctor Information networks: We observe when media sites mention information Question: Can we infer the hidden networks? 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Inferring the Diffusion Networks [KDD ’10, NIPS ‘10] Inferring the Diffusion Networks There is a hidden diffusion network: We only see times when nodes get “infected”: Cascade c1: (a,1), (c,2), (b,3), (e,4) Cascade c2: (c,1), (a,4), (b,5), (d,6) Want to infer who-infects-whom network! a a a b b b d d c c c e e 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Examples and Applications Word of mouth & Viral marketing Virus propagation Viruses propagate through the network We only observe when people get sick But NOT who infected whom Recommendations and influence propagate We only observe when people buy products But NOT who influenced whom Process We observe It’s hidden Can we infer the underlying network? 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Inferring the Diffusion Network 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Network Inference: The Task Goal: Find a graph G that best explains the observed information times Given a graph G, define the likelihood P(C|G): Define a model of information diffusion over a graph Pc(u,v) … prob. that u infects v in cascade c P(c|T) … prob. that c spread in particular pattern T P(c|G) … prob. that cascade c occurred in G P(C|G) … prob. that a set of cascades C occurred in G Questions: How to efficiently compute P(G|C)? (given a single G) How to efficiently find G* that maximizes P(G|C)? (over O(2N*N) graphs) Have a diagram of what we will define and how it will all fit together. Maybe even on a separate slide. 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Cascade Diffusion Model Continuous time cascade diffusion model: Cascade c reaches node u at tu and spreads to u’s neighbors: With probability β cascade propagates along edge (u, v) and we determine the infection time of node v tv = tu + Δ e.g.: Δ ~ Exponential or Power-law tu tb tc Δ1 Δ2 u u a b b b c c We assume each node v has only one parent! c d 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Cascade Diffusion Model The model for 1 cascade: Cascade reaches node u at time tu, and spreads to u’s neighbors v: With prob. β cascade propagates along edge (u,v) and tv = tu+Δ Transmission probability: Pc(u,v)  P(tv -tu ) if tv> tu else ε e.g.: Pc(u,v)  e -Δt ε captures influence external to the network At any time a node can get infected from outside with small probability ε tv ε ε ε b d e a c β 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Cascade Probability Given node infection times and pattern T: c = { (a,1), (c,2), (b,3), (e,4) } T = { a→b, a→c, b→e } Prob. that c propagates in pattern T Approximate it as: T b d e a c Graph G Edges that “propagated” Edges that failed to “propagate” 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Complication: Too Many Trees How likely is cascade c to spread in graph G? c = {(a,1), (c,2), (b,3), (e,4)} Need to consider all possible ways for c to spread over G (i.e., all spanning trees T): b d e a c b d e a c b d e a c Consider only the most likely propagation tree 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

The Optimization Problem Score of a graph G for a set of cascades C: Want to find the “best” graph: The problem is NP-hard: MAX-k-COVER [KDD ’10] 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

How to Find the Best Tree? Given a cascade c, what is the most likely propagation tree? A maximum directed spanning tree Edge (i,j) in G has weight wc(i,j)=log Pc(i,j) The maximum weight spanning tree on infected nodes: Each node picks an in-edge of max weight: b d e a c Parent of node i in tree T Local greedy selection gives optimal tree! 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Great News: Submodularity! Theorem: Fc(G) is monotonic, and submodular Proof: Single cascade c, some edge e=(r,s) of weight. wrs Show Fc(G {e}) – Fc(G) ≥ Fc(G’ {e}) – FC (G’) Let w.s be max weight in-edge of s in G Let w’.s be max weight in-edge of s in G’ Since G  G’ : w.s  w’.s and wrs= w’rs 𝐹 𝑐 (𝐺∪ 𝑟,𝑠 − 𝐹 𝑐 𝐺 = max 𝑤 .𝑠 , 𝑤 𝑟𝑠 − 𝑤 .𝑠 ≥ max 𝑤 .𝑠 ′ , 𝑤 𝑟𝑠 − 𝑤 .𝑠 ′ = 𝐹 𝑐 𝐺 ′ ∪ 𝑟,𝑠 − 𝐹 𝑐 (𝐺′) b a i j s r s picks in-edge of max weight 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

[KDD ’10] NetInf: The Algorithm The algorithm: Use greedy hill-climbing to maximize FC(G): Start with empty G0 (G with no edges) Add k edges (k is parameter) At every step i add an edge to Gi that maximizes the marginal improvement 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Experiments: Synthetic data Take a graph G on k edges Simulate info. diffusion Record node infection times Reconstruct G Evaluation: How many edges of G can NetInf find? Break-even point: 0.95 Performance is independent of the structure of G! 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

How Good is Our Graph? We achieve ≈ 90 % of the best possible network! 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

How Many Cascades Do We Need? With 2x as many infections as edges, the break-even point is already 0.8 - 0.9! 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Experiments: Real data [KDD, ‘09] Experiments: Real data Memetracker dataset: 172m news articles Aug ‘08 – Sept ‘09 343m textual phrases Times tc(w) when site w mentions phrase c Given times when sites mention phrases Infer the network of information diffusion: Who tends to copy (repeat after) whom http://memetracker.org 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Example: Diffusion Network 5,000 news sites: Blogs Mainstream media 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Diffusion Network (small part) Blogs Mainstream media 9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu