Link Prediction and Network Inference


1 Link Prediction and Network Inference
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

2 Link Prediction in Networks
[Liben-Nowell & Kleinberg '03]
The link prediction task: given G[t0, t0'], a graph on edges up to time t0', output a ranked list L of links (not in G[t0, t0']) that are predicted to appear in G[t1, t1'].
Evaluation: let n = |Enew| be the number of new edges that appear during the test period [t1, t1']. Take the top n elements of L and count the correct edges.
9/19/2018 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis

3 Link Prediction via Proximity
[Liben-Nowell & Kleinberg '03]
Predict links in an evolving collaboration network.
Core: since network data is very sparse, consider only nodes with in-degree and out-degree of at least 3.

4 Link Prediction via Proximity
Methodology:
For every pair of nodes (x, y), compute the proximity score c(x, y), e.g., the number of common neighbors of x and y.
Sort pairs by decreasing score; only consider/predict edges where both endpoints are in the core (degree ≥ 3).
Predict the top n pairs as new links.
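
The methodology above can be written as a short Python sketch (the toy graph and node names are hypothetical, not the paper's data or code):

```python
from itertools import combinations

# toy undirected graph as adjacency sets (hypothetical example)
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}

def common_neighbors(x, y):
    """Proximity score c(x, y): number of neighbors shared by x and y."""
    return len(graph[x] & graph[y])

# score every non-adjacent pair, then sort by decreasing score
candidates = [(x, y) for x, y in combinations(sorted(graph), 2)
              if y not in graph[x]]
ranked = sorted(candidates, key=lambda p: common_neighbors(*p), reverse=True)
# the top n pairs of `ranked` are the predicted new links
```
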

5 Link Prediction via Proximity
[Liben-Nowell & Kleinberg '03]
For every pair of nodes (x, y), compute a proximity score (the score formulas appeared on the slide).
Γ(x) … the set of neighbors of node x.
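
Two of the proximity scores studied by Liben-Nowell & Kleinberg, Jaccard and Adamic–Adar (the latter appears in the results later in this deck), can be sketched as follows; the toy graph is a hypothetical example:

```python
import math

# toy undirected graph as adjacency (neighbor) sets Γ(x)
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}

def jaccard(x, y):
    """|Γ(x) ∩ Γ(y)| / |Γ(x) ∪ Γ(y)|"""
    return len(graph[x] & graph[y]) / len(graph[x] | graph[y])

def adamic_adar(x, y):
    """Sum over common neighbors z of 1 / log|Γ(z)|: rare common
    neighbors contribute more than highly connected ones."""
    return sum(1.0 / math.log(len(graph[z])) for z in graph[x] & graph[y])
```
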

6 Results: Improvement over Random
[Liben-Nowell & Kleinberg '03]
(Results table shown on slide.)

7 Results: Common Neighbors
Improvement over the #common neighbors baseline.

8 Supervised Link Prediction
[WSDM '11]
How do we learn to predict new friends? Think of Facebook's "People You May Know".
Let's look at the data: 92% of new friendships on FB are friend-of-a-friend, and more common friends helps.

9 Supervised Link Prediction
Recommend a list of possible friends.
Supervised machine learning setting:
Training example: for every node s, we have the list of nodes {v1, …, vk} she will create links to. Use the FB network from May 2011; {v1, …, vk} are the new friendships created since then.
Task: for a given node s, rank the nodes {v1, …, vk} higher than the other nodes in the network.
(In the figure, green "positive" nodes are the nodes to which s creates links in the future; the rest are "negative" nodes.)

10 Supervised Link Prediction
How do we combine node/edge attributes and the network structure?
Learn a strength for each edge based on: the profile of user u, the profile of user v, and the interaction history of u and v.
Do a PageRank-like random walk from s to measure the "proximity" between s and other nodes.
Rank nodes by their "proximity" (i.e., visiting probability).

11 Supervised Random Walks
[WSDM '11]
Let s be the center node, and let fβ(u,v) be a function that assigns a strength to each edge:
a_uv = f_β(u, v) = exp(−Σ_i β_i · Ψ_uv(i))
where Ψ_uv is a feature vector containing features of node u, features of node v, and features of edge (u, v). (β is the parameter vector we want to learn!)
Do a Random Walk with Restarts from s, where transitions are made according to the edge strengths a_uv.
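
A minimal sketch of the edge-strength function above; the β and Ψ_uv values are toy numbers chosen only for illustration:

```python
import math

def edge_strength(beta, psi_uv):
    """a_uv = f_beta(u, v) = exp(-sum_i beta_i * psi_uv[i])."""
    return math.exp(-sum(b * f for b, f in zip(beta, psi_uv)))

beta = [0.5, 1.0]   # parameter vector to be learned (toy values)
psi = [0.2, 0.3]    # feature vector Psi_uv for edge (u, v)
a_uv = edge_strength(beta, psi)   # strength used by the random walk
```
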

12 SRW: Prediction
How to estimate edge strengths? Set the edge strengths a_uv = f_β(u, v); the open question is how to set the parameters β of f_β(u, v).
Run Personalized PageRank on the weighted graph; each node u gets a PageRank proximity p_u.
Sort nodes by decreasing PageRank proximity p_u and recommend the top k nodes with the highest proximity p_u to node s.

13 Personalized PageRank
[WSDM '11]
a_uv … strength of edge (u, v).
Random walk transition matrix P; PageRank transition matrix Q: with probability α, jump back to s.
Compute the PageRank vector p satisfying p^T = p^T Q, and rank nodes u by p_u.
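
The Personalized PageRank step can be sketched as a NumPy power iteration; the restart probability α = 0.15 and the toy strength matrix are assumptions for illustration, not the paper's settings:

```python
import numpy as np

def personalized_pagerank(A, s, alpha=0.15, iters=100):
    """Solve p^T = p^T Q, where Q mixes the strength-weighted walk with
    a probability-alpha jump back to the seed node s."""
    n = A.shape[0]
    row_sums = A.sum(axis=1, keepdims=True)
    P = np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)
    Q = (1 - alpha) * P
    Q[:, s] += alpha          # restart: every node jumps to s w.p. alpha
    p = np.full(n, 1.0 / n)
    for _ in range(iters):    # power iteration to the stationary vector
        p = p @ Q
    return p

A = np.array([[0, 1, 1],      # toy edge-strength matrix a_uv
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
p = personalized_pagerank(A, s=0)   # proximities p_u; rank nodes by p_u
```

The seed node ends up with the highest proximity, as expected from the restart.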

14 The Optimization Problem
[WSDM '11]
Each node u has a score p_u. Positive nodes: D = {d1, …, dk}; negative nodes: L = {the rest}.
What do we want? We want to find β such that p_l < p_d for every l ∈ L, d ∈ D.
The exact solution to this problem may not exist, so we make the constraints "soft" (i.e., optional).

15 Making Constraints “Soft”
[WSDM '11]
Want to minimize the total loss, with loss function h(x) = 0 if x < 0, and x² otherwise: each pair contributes h(p_l − p_d), so the penalty is zero while p_l < p_d and grows quadratically once p_l ≥ p_d.
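
The soft constraint can be sketched as a squared hinge penalty (the score values are toy numbers):

```python
def h(x):
    """h(x) = 0 if x < 0, x^2 otherwise: no penalty while p_l < p_d,
    quadratic penalty once the constraint is violated."""
    return 0.0 if x < 0 else x * x

p_l, p_d = 0.3, 0.2            # toy PageRank scores
penalty = h(p_l - p_d)          # violated constraint -> positive penalty
no_penalty = h(0.1 - 0.4)       # satisfied constraint -> zero
```
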

16 Solving the Problem: Intuition
[WSDM '11]
How to minimize F? Both p_l and p_d depend on β:
Given β, assign the edge weights a_uv = f_β(u, v).
Using the transition matrix Q = [a_uv], compute the PageRank scores p_u.
Rank nodes by their PageRank score.
We want to find β such that p_l < p_d.

17 Gradient Descent
[WSDM '11]
How to minimize F? Take the derivative! The resulting recurrence for the gradient of p with respect to β looks like the PageRank equation itself, so it is easy to evaluate by iteration (the derivation appeared as equations on the slide).

18 Optimizing F
To optimize F, use a gradient-based method:
Pick a random starting point β0.
Compute the personalized PageRank vector p.
Compute the gradient with respect to the weight vector β.
Update β.
Optimize using a quasi-Newton method.

19 Data: Facebook
[WSDM '11]
Facebook Iceland network: 174,000 nodes (55% of the population), average degree 168; the average person added 26 friends/month.
For every node s:
Positive examples: D = { new friendships of s created in Nov '09 }.
Negative examples: L = { other nodes s did not create new links to }.
Limit to friends of friends: on average there are 20k FoFs (max 2M)!

20 Experimental setting
Node and edge features for learning:
Node: age, gender, degree.
Edge: age of an edge, communication, profile visits, co-tagged photos.
Baselines: decision trees and logistic regression, using the above features plus 10 network features (PageRank, common friends).
Evaluation: AUC and precision at top 20.

21 Results: Facebook Iceland
Facebook: predict future friends.
Adamic–Adar already works great; logistic regression is also strong; SRW gives a slight further improvement.

22 Network Inference

23 Hidden and implicit networks
Many networks are implicit or hard to observe:
Hidden/hard-to-reach populations: e.g., the network of needle sharing between drug injection users.
Implicit connections: e.g., the network of information propagation in online news media.
But we can observe the results of the processes taking place on such (invisible) networks:
Virus propagation: drug users get sick, and we observe when they see the doctor.
Information networks: we observe when media sites mention information.
Question: can we infer the hidden networks?

24 Inferring the Diffusion Networks
[KDD '10, NIPS '10]
There is a hidden diffusion network; we only see the times when nodes get "infected":
Cascade c1: (a,1), (c,2), (b,3), (e,4)
Cascade c2: (c,1), (a,4), (b,5), (d,6)
We want to infer the who-infects-whom network!

25 Examples and Applications
Word of mouth & viral marketing; virus propagation.
Viruses propagate through the network: we only observe when people get sick, but NOT who infected whom.
Recommendations and influence propagate: we only observe when people buy products, but NOT who influenced whom.
We observe the process; the network itself is hidden. Can we infer the underlying network?

26 Inferring the Diffusion Network

27 Network Inference: The Task
Goal: find a graph G that best explains the observed infection times.
Given a graph G, define the likelihood P(C|G):
Define a model of information diffusion over a graph.
Pc(u,v) … prob. that u infects v in cascade c.
P(c|T) … prob. that c spread in a particular pattern T.
P(c|G) … prob. that cascade c occurred in G.
P(C|G) … prob. that the set of cascades C occurred in G.
Questions:
How do we efficiently compute P(C|G) for a given graph G?
How do we efficiently find the G* that maximizes P(C|G)? (over O(2^(N·N)) candidate graphs)

28 Cascade Diffusion Model
Continuous-time cascade diffusion model: cascade c reaches node u at time tu and spreads to u's neighbors:
With probability β, the cascade propagates along edge (u, v), and we determine the infection time of node v as tv = tu + Δ, where e.g. Δ ~ Exponential or Power-law.
We assume each node v has only one parent!

29 Cascade Diffusion Model
The model for one cascade: the cascade reaches node u at time tu and spreads to u's neighbors v: with prob. β it propagates along edge (u, v) and tv = tu + Δ.
Transmission probability: Pc(u, v) ∝ P(tv − tu) if tv > tu, else ε; e.g. Pc(u, v) ∝ e^(−Δt).
ε captures influence external to the network: at any time, a node can get infected from outside with small probability ε.
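
The generative model can be sketched as a simulation; the toy graph, β, and exponential Δ are illustrative assumptions, and ε-style external infections are omitted for brevity:

```python
import random

def simulate_cascade(graph, seed, beta=0.6, rate=1.0, rng=None):
    """With prob. beta an infected node u infects each uninfected neighbor v
    at time t_v = t_u + Delta, Delta ~ Exponential(rate); the first
    successful infector becomes v's single parent."""
    rng = rng or random.Random(0)
    times = {seed: 0.0}
    frontier = [seed]
    while frontier:
        u = frontier.pop()
        for v in graph[u]:
            if v not in times and rng.random() < beta:
                times[v] = times[u] + rng.expovariate(rate)
                frontier.append(v)
    return times   # node -> infection time; uninfected nodes are absent

graph = {"a": ["b", "c"], "b": ["a", "e"], "c": ["a", "d"],
         "d": ["c"], "e": ["b"]}
cascade = simulate_cascade(graph, seed="a")
```

Running this repeatedly yields a set of cascades C, i.e., exactly the kind of infection-time data the inference task starts from.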

30 Cascade Probability
Given node infection times and a propagation pattern T:
c = { (a,1), (c,2), (b,3), (e,4) }, T = { a→b, a→c, b→e }
The probability that c propagates in pattern T is approximated as a product over the edges of G that "propagated" and the edges that failed to "propagate".
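
The approximation can be sketched as a product of per-edge factors; the graph, β, and the chosen pattern T are toy assumptions:

```python
import math

beta = 0.6   # hypothetical propagation probability

# toy graph edges, the cascade's infected nodes, and one pattern T
edges = {("a", "b"), ("a", "c"), ("b", "e"), ("c", "b"), ("d", "e")}
infected = {"a", "b", "c", "e"}
T = {("a", "b"), ("a", "c"), ("b", "e")}

# edges between infected nodes that are NOT in T failed to propagate
failed = {(u, v) for (u, v) in edges
          if u in infected and v in infected and (u, v) not in T}

# log P(c spreads in pattern T) ~ sum of log(beta) over edges in T
#                               + sum of log(1 - beta) over failed edges
log_p = len(T) * math.log(beta) + len(failed) * math.log(1 - beta)
```
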

31 Complication: Too Many Trees
How likely is cascade c to spread in graph G? c = {(a,1), (c,2), (b,3), (e,4)}.
In principle we need to consider all possible ways for c to spread over G (i.e., all spanning trees T).
Instead, consider only the most likely propagation tree.

32 The Optimization Problem
[KDD '10]
Score of a graph G for a set of cascades C: FC(G) (the objective appeared as an equation on the slide). We want to find the "best" graph G*.
The problem is NP-hard, by reduction from MAX-k-COVER.

33 How to Find the Best Tree?
Given a cascade c, what is the most likely propagation tree? A maximum directed spanning tree:
Edge (i, j) in G has weight wc(i, j) = log Pc(i, j).
The maximum-weight spanning tree on the infected nodes: each node picks its in-edge of maximum weight (this edge determines the parent of node i in tree T).
Local greedy selection gives the optimal tree!
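
The greedy parent selection can be sketched directly; the cascade order and the edge log-probabilities are toy values:

```python
import math

def best_tree(infected, log_probs):
    """Each infected node except the root picks its maximum-weight in-edge
    w_c(i, j) = log P_c(i, j); this local choice yields the optimal tree."""
    parent = {}
    for j in infected[1:]:               # infected[0] is the cascade root
        candidates = [(log_probs[(i, j)], i)
                      for i in infected if (i, j) in log_probs]
        if candidates:
            parent[j] = max(candidates)[1]
    return parent                        # child -> chosen parent

# cascade c = {(a,1), (c,2), (b,3), (e,4)} with hypothetical edge probs
infected = ["a", "c", "b", "e"]
log_probs = {("a", "c"): math.log(0.5), ("a", "b"): math.log(0.3),
             ("c", "b"): math.log(0.4), ("b", "e"): math.log(0.6)}
tree = best_tree(infected, log_probs)
```

Here b picks c over a because log 0.4 > log 0.3, illustrating the local max-weight choice.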

34 Great News: Submodularity!
Theorem: Fc(G) is monotone and submodular.
Proof sketch: take a single cascade c and some edge e = (r, s) of weight w_rs. We show Fc(G ∪ {e}) − Fc(G) ≥ Fc(G′ ∪ {e}) − Fc(G′) for G ⊆ G′.
Let w.s be the max-weight in-edge of s in G, and w′.s the max-weight in-edge of s in G′. Since G ⊆ G′, we have w.s ≤ w′.s and w_rs = w′_rs. Because s picks its in-edge of maximum weight:
Fc(G ∪ {(r, s)}) − Fc(G) = max(w.s, w_rs) − w.s ≥ max(w′.s, w_rs) − w′.s = Fc(G′ ∪ {(r, s)}) − Fc(G′)

35 NetInf: The Algorithm
[KDD '10]
Use greedy hill-climbing to maximize FC(G):
Start with the empty graph G0 (G with no edges).
Add k edges (k is a parameter).
At every step i, add the edge to Gi that maximizes the marginal improvement.
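
The hill-climbing loop can be sketched as follows; the coverage-style `score` is a hypothetical toy stand-in for the submodular objective F_C, not the NetInf likelihood itself:

```python
def greedy_netinf(candidate_edges, score, k):
    """Start from the empty graph and greedily add the edge with the
    largest marginal improvement of the (submodular) score, k times."""
    G = set()
    for _ in range(k):
        gains = {e: score(G | {e}) - score(G)
                 for e in candidate_edges if e not in G}
        if not gains:
            break
        G.add(max(gains, key=gains.get))
    return G

# toy stand-in objective: an edge "explains" the cascades it appears in
cascades = {("a", "b"): {0, 1}, ("b", "c"): {1}, ("a", "c"): {0, 1}}

def score(G):
    covered = set()
    for e in G:
        covered |= cascades[e]
    return len(covered)

G = greedy_netinf(list(cascades), score, k=2)
```

Because the score is monotone and submodular, this greedy loop carries the usual (1 − 1/e) approximation guarantee.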

36 Experiments: Synthetic data
Take a graph G on k edges; simulate information diffusion; record the node infection times; reconstruct G.
Evaluation: how many edges of G can NetInf find? Break-even point: 0.95.
Performance is independent of the structure of G!

37 How Good is Our Graph?
We achieve ≈90% of the best possible network!

38 How Many Cascades Do We Need?
With 2× as many infections as edges, the break-even point is already high (the exact value appeared in the plot)!

39 Experiments: Real data
[KDD '09]
Memetracker dataset: 172M news articles from Aug '08 – Sept '09, containing 343M textual phrases; we record the times tc(w) when site w mentions phrase c.
Given the times when sites mention phrases, infer the network of information diffusion: who tends to copy (repeat after) whom.

40 Example: Diffusion Network
5,000 news sites: blogs and mainstream media.

41 Diffusion Network (small part)
A small part of the inferred network: blogs and mainstream media.

