1 Fast Direction-Aware Proximity for Graph Mining
KDD 2007, San Jose
Hanghang Tong, Yehuda Koren, Christos Faloutsos
2 Defining Direction-Aware Proximity (DAP): escape probability
Define a random walk (RW) on the graph
Esc_Prob(A->B)
– Prob(starting at A, the walk reaches B before returning to A)
Esc_Prob = Pr(smile before cry)
[Figure: nodes A and B, and the remaining graph]
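
The definition above can be checked empirically. The following is a minimal Python sketch, not taken from the slides, that estimates Esc_Prob by simulating random walks. The function name, the toy path graph, and the choice of uniform steps over out-neighbors are illustrative assumptions.

```python
import random

def simulate_escape_probability(adj, start, target, trials=20000, seed=0):
    """Estimate Pr(walk from `start` reaches `target` before returning to `start`).

    Assumes every node has at least one out-neighbor and that `start` or
    `target` is reachable from every node, so each walk terminates.
    """
    rng = random.Random(seed)
    escapes = 0
    for _ in range(trials):
        node = rng.choice(adj[start])        # first step away from start
        while node != start and node != target:
            node = rng.choice(adj[node])     # uniform step to an out-neighbor
        if node == target:
            escapes += 1
    return escapes / trials

# Hypothetical undirected path graph 0 - 1 - 2, stored as adjacency lists.
adj = {0: [1], 1: [0, 2], 2: [1]}
estimate = simulate_escape_probability(adj, start=0, target=2)
```

On this toy graph the walk from node 0 must first step to node 1 and then moves to node 0 or node 2 with equal probability, so the estimate should be close to 0.5.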
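3 $\mathrm{Esc\_Prob}(1 \to 5) = P(1, \mathcal{I})\,(I - P(\mathcal{I}, \mathcal{I}))^{-1}\,P(\mathcal{I}, 5) + P(1, 5)$
P: transition matrix (row-normalized); $\mathcal{I}$: all nodes other than 1 and 5; $I$: identity matrix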
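
As a rough illustration of the closed form on this slide, the sketch below computes an escape probability from a row-normalized transition matrix using NumPy. It assumes the standard first-step decomposition: the direct step i->j plus a step into the remaining nodes followed by reaching j before i. The function name and the 3-node example are hypothetical, and a linear solve is used in place of an explicit inverse.

```python
import numpy as np

def escape_probability(P, i, j):
    """Pr(walk from i hits j before returning to i), for row-normalized P."""
    n = P.shape[0]
    rest = [k for k in range(n) if k not in (i, j)]   # the "remaining graph"
    if not rest:
        return float(P[i, j])
    P_rr = P[np.ix_(rest, rest)]                      # transitions inside the rest
    # v[k] = Pr(starting from rest node k, hit j before i);
    # it solves (I - P_rr) v = P[rest, j]. Assumes this system is nonsingular.
    v = np.linalg.solve(np.eye(len(rest)) - P_rr, P[rest, j])
    return float(P[i, rest] @ v + P[i, j])

# Hypothetical path graph 0 - 1 - 2 (row-normalized adjacency matrix).
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
esc_0_to_2 = escape_probability(P, 0, 2)
```

Each call solves one linear system over the remaining nodes, which is the per-pair cost that the rest of the talk tries to avoid.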
4 Intuition of Formula
[Figure: the matrix product P*P]
6 Challenges
Case 1: Medium-size graph
– Matrix inversion is feasible, but…
– What if we want many proximities?
– Q: How to get all (n^2) proximities efficiently?
– A: FastAllDAP!
Case 2: Large graph
– Matrix inversion is infeasible
– Q: How to get one proximity efficiently?
– A: FastOneDAP!
7 FastAllDAP
Q1: How to efficiently compute all possible proximities on a medium-size graph?
– a.k.a. how to efficiently solve multiple linear systems simultaneously?
Goal: reduce the number of matrix inversions!
8 FastAllDAP: Observation
Need two different matrix inversions!
[Figure: transition matrix P]
9 FastAllDAP: Rescue
Redundancy among different linear systems!
[Figure: transition matrix P, with the parts used for Prox(1->5) and Prox(1->6) shaded gray]
Overlap between the two gray parts!
10 FastAllDAP: Theorem
Theorem: [formula shown as an image on the slide]
Proof: by the SM (Sherman-Morrison) Lemma
Example: [shown as an image on the slide]
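
The theorem's formula did not survive in this transcript. One plausible reconstruction, offered only as a sketch, expresses each proximity through four entries of a single matrix Q, which matches the later remark that four elements of Q are enough. It assumes Q = (I - cP)^{-1}, where c is one minus the fly-out probability introduced later in the talk; both the symbol c and this definition of Q are assumptions here, not text from the slide.

```latex
\mathrm{Prox}(i \to j) \;=\; \frac{q_{i,j}}{q_{i,i}\, q_{j,j} - q_{i,j}\, q_{j,i}},
\qquad Q = (I - cP)^{-1}
```

Under this reading, $q_{a,b}$ is the expected number of visits to b for a walk started at a, and the denominator corrects for walks that bounce between i and j.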
11 FastAllDAP: Algorithm
Alg.
– Compute Q
– For i, j = 1, …, n, compute Prox(i->j)
Computational savings: O(1) matrix inversions instead of O(n^2)!
Example
– With 1,000 nodes: 1 million matrix inversions vs. 1 matrix inversion!
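
The two steps of the algorithm can be sketched as follows. This is an illustrative NumPy version, not the authors' code. It assumes Q = (I - cP)^{-1} with c = 0.9, and that each proximity is obtained from Q as q_ij / (q_ii q_jj - q_ij q_ji); both are assumptions, since the slide's formulas are not in the transcript. The function name and toy graph are hypothetical.

```python
import numpy as np

def all_pairs_proximity(P, c=0.9):
    """All n^2 proximities from one matrix inversion (assumed formula, see text)."""
    n = P.shape[0]
    Q = np.linalg.inv(np.eye(n) - c * P)      # the single matrix inversion
    d = np.diag(Q)
    denom = np.outer(d, d) - Q * Q.T          # q_ii * q_jj - q_ij * q_ji
    prox = np.zeros_like(Q)
    off_diag = ~np.eye(n, dtype=bool)         # skip i == j, where denom is 0
    prox[off_diag] = Q[off_diag] / denom[off_diag]
    return prox

# Hypothetical path graph 0 - 1 - 2 (row-normalized adjacency matrix).
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
prox = all_pairs_proximity(P)
```

On this toy graph the result is asymmetric: `prox[0, 1]` differs from `prox[1, 0]`, which is the direction-aware behavior the talk is after.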
12 FastOneDAP
Q1: How to efficiently compute one single proximity on a large graph?
– a.k.a. how to solve one linear system efficiently?
Goal: avoid matrix inversion!
13 FastOneDAP: Observation
Partial information (4 elements in 2 columns) of Q is enough!
14 FastOneDAP: Observation
Q: How to compute one column of Q?
A: Taylor expansion
Reminder: i-th col of Q = Q [0, …, 0, 1, 0, …, 0]^T
15 FastOneDAP: Observation
[Figure: the expansion of the i-th col of Q, applied term by term to [0, …, 0, 1, 0, …, 0]^T]
Sparse matrix-vector multiplications!
16 FastOneDAP: Iterative Alg.
Alg. to estimate the i-th col of Q
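
The algorithm itself is not in this transcript. A minimal sketch of the idea from the preceding slides, assuming Q = (I - cP)^{-1} and the truncated series Q e_i ≈ e_i + cP e_i + (cP)^2 e_i + …, could look like the following. The sparse row-dictionary layout, the function name, and c = 0.9 are illustrative assumptions.

```python
def estimate_column_of_q(P_rows, i, c=0.9, iterations=50):
    """Approximate column i of Q = (I - cP)^{-1} by a truncated power series.

    P_rows[u] is a dict {v: P[u][v]} holding only the nonzero entries of row u.
    """
    n = len(P_rows)
    term = [0.0] * n
    term[i] = 1.0                  # e_i = [0, ..., 0, 1, 0, ..., 0]^T
    column = term[:]               # running sum, starts with the k = 0 term
    for _ in range(iterations):
        # One sparse matrix-vector product: term <- c * P * term
        term = [c * sum(p * term[v] for v, p in row.items()) for row in P_rows]
        column = [x + t for x, t in zip(column, term)]
    return column

# Hypothetical path graph 0 - 1 - 2 in sparse row form.
P_rows = [{1: 1.0}, {0: 0.5, 2: 0.5}, {1: 1.0}]
# 300 iterations are used here only to make the truncation error negligible.
column_0 = estimate_column_of_q(P_rows, i=0, iterations=300)
```

Each iteration touches every stored edge once, so the cost per iteration is proportional to the number of edges, and no matrix is ever inverted. With c < 1 the terms shrink geometrically, so the sum converges.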
17 FastOneDAP: Property
Convergence guaranteed!
Computational savings
– Example: 100K nodes and 1M edges (50 iterations): 10,000,000x faster!
Footnote: 1 col is enough!
– (details in paper)
18 Esc_Prob is good, but…
Issue #1:
– 'Degree-1 node' effect
Issue #2:
– Weakly connected pair
Need some practical modifications!
19 Issue #1: 'degree-1 node' effect [Faloutsos+] [Koren+]
No influence for degree-1 nodes (E, F)!
– known as the 'pizza delivery guy' problem in undirected graphs
Esc_Prob(a->b) = 1
Solution: Universal Absorbing Boundary!
20 Universal Absorbing Boundary
U-A-B is a black hole!
Footnote: fly-out probability = 0.1
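
One way to picture the boundary, offered as a sketch under stated assumptions: at every step the walker flies out to the absorbing node with the fly-out probability and otherwise follows the original transition matrix, so P is effectively scaled by c = 1 - fly_out. The function name and the two 3-node graphs below are hypothetical and are not the graphs from the slides.

```python
import numpy as np

def proximity_with_boundary(P, i, j, fly_out=0.1):
    """Escape probability from i to j when each step survives with prob. 1 - fly_out."""
    c = 1.0 - fly_out                       # probability of not flying out per step
    rest = [k for k in range(P.shape[0]) if k not in (i, j)]
    direct = c * P[i, j]
    if not rest:
        return float(direct)
    # v[k] = Pr(from rest node k, reach j before i and before being absorbed)
    v = np.linalg.solve(np.eye(len(rest)) - c * P[np.ix_(rest, rest)],
                        c * P[rest, j])
    return float(c * P[i, rest] @ v + direct)

# Hypothetical graphs, node order (a, b, x); each row lists a node's out-edges.
P_short = np.array([[0.0, 1.0, 0.0],    # a -> b
                    [1.0, 0.0, 0.0],    # b -> a
                    [1.0, 0.0, 0.0]])   # x -> a
P_long = np.array([[0.0, 0.0, 1.0],     # a -> x
                   [1.0, 0.0, 0.0],     # b -> a
                   [0.0, 1.0, 0.0]])    # x -> b
short_path = proximity_with_boundary(P_short, 0, 1)
long_path = proximity_with_boundary(P_long, 0, 1)
```

With `fly_out=0.0` both graphs give an escape probability of 1, so path length makes no difference. With a fly-out probability of 0.1, the longer route scores lower than the direct edge.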
21 Introducing Universal-Absorbing-Boundary
Prox(a->b) = 0.91
Prox(a->b) = 0.74
Esc_Prob(a->b) = 1
Footnote: fly-out probability = 0.1
22 Issue #2: Weakly connected pair
Prox(A->B) = Prox(B->A) = 0
Solution: Partial symmetry!
23 Practical Modifications: Partial Symmetry
Before: Prox(A->B) = Prox(B->A) = 0
After: Prox(A->B) = 0.081 > Prox(B->A) = 0.009
24 Efficiency: FastAllDAP
[Plot: time (sec) vs. size of graph, Straight-Solver vs. FastAllDAP]
1,000x faster!
25 Efficiency: FastOneDAP
[Plot: time (sec) vs. size of graph, FastOneDAP vs. Straight-Solver]
10,000x faster!
26 Link Prediction: existence
Dataset   Accuracy (DAP)   Accuracy (UDAP)
WL        65.40%
PC        79.60%           80.78%
AE        81.51%           80.60%
CN        86.71%           84.00%
EP        92.21%           92.09%
27 Link Prediction: direction
Q: Given the existence of the link, what is the direction of the link?
A: Compare Prox(i->j) and Prox(j->i)
[Plot: density of Prox(i->j) - Prox(j->i)]
>70%