Published by Sabina Ward. Modified over 9 years ago.
1
SimRank: A Measure of Structural-Context Similarity
Advisor: Dr. Hsu. Graduate: Sheng-Hsuan Wang. Authors: Glen Jeh, Jennifer Widom.
2
Outline Motivation; Objective; Introduction; Basic Graph Model; SimRank; Random Surfer-Pairs Model; Future Work; Personal Opinion
3
Motivation The problem of measuring “similarity” of objects arises in many applications.
4
Objective An approach applicable in any domain with object-to-object relationships, built on one intuition: two objects are similar if they are related to similar objects.
5
Introduction
6
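Basic Graph Model We model objects and relationships as a directed graph G=(V,E), where nodes represent objects and edges represent relationships between them. For a node v in the graph, we denote by I(v) and O(v) the sets of in-neighbors and out-neighbors of v, respectively.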
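The graph model above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `build_neighbors` and the toy edge list are assumptions made here for the example.

```python
def build_neighbors(edges):
    """Return (I, O): dicts mapping each node to its in- and out-neighbor sets."""
    in_nbrs, out_nbrs = {}, {}
    for u, v in edges:
        # Make sure both endpoints appear as keys, even with no neighbors.
        for node in (u, v):
            in_nbrs.setdefault(node, set())
            out_nbrs.setdefault(node, set())
        out_nbrs[u].add(v)  # edge u -> v: v is an out-neighbor of u
        in_nbrs[v].add(u)   # ... and u is an in-neighbor of v
    return in_nbrs, out_nbrs


# Hypothetical toy graph: u points to a and b, b points to c.
edges = [("u", "a"), ("u", "b"), ("b", "c")]
I, O = build_neighbors(edges)
print(I["b"], O["u"])
```

Storing I(v) and O(v) as sets keeps the neighbor counts |I(v)| and |O(v)| available through `len`, which is what the SimRank equations need.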
7
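SimRank Basic SimRank Equation
If a = b then s(a,b) is defined to be 1. Otherwise,
$$s(a,b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} s\big(I_i(a), I_j(b)\big) \qquad (1)$$
where C is a constant between 0 and 1 and $I_i(a)$ denotes the i-th in-neighbor of a. Set s(a,b) = 0 when $I(a) = \emptyset$ or $I(b) = \emptyset$.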
8
SimRank Bipartite SimRank The graph contains two types of objects.
Example: a shopping graph G, with edges from persons to the items they purchase.
9
SimRank
10
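SimRank Let s(A,B) denote the similarity between persons A and B. For $A \neq B$,
$$s(A,B) = \frac{C_1}{|O(A)|\,|O(B)|} \sum_{i=1}^{|O(A)|} \sum_{j=1}^{|O(B)|} s\big(O_i(A), O_j(B)\big) \qquad (2)$$
Let s(c,d) denote the similarity between items c and d. For $c \neq d$,
$$s(c,d) = \frac{C_2}{|I(c)|\,|I(d)|} \sum_{i=1}^{|I(c)|} \sum_{j=1}^{|I(d)|} s\big(I_i(c), I_j(d)\big) \qquad (3)$$
Here $C_1$ and $C_2$ are constants between 0 and 1, and s(A,A) = s(c,c) = 1.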
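The mutual reinforcement between person similarity and item similarity can be sketched as follows. This is an illustrative sketch only: the function name `bipartite_simrank`, the values C1 = C2 = 0.8, the iteration count, and the two-person purchase data are all assumptions chosen for the example, and the fixed-point iteration is one simple way to solve the two equations.

```python
def bipartite_simrank(purchases, C1=0.8, C2=0.8, K=20):
    """purchases maps each person to the set of items bought (edges person -> item)."""
    persons = sorted(purchases)
    items = sorted({c for bought in purchases.values() for c in bought})
    # I(c): the persons who bought item c.
    buyers = {c: {p for p in persons if c in purchases[p]} for c in items}

    # Start with similarity 1 for identical objects and 0 otherwise.
    sp = {(A, B): float(A == B) for A in persons for B in persons}
    si = {(c, d): float(c == d) for c in items for d in items}
    for _ in range(K):
        new_sp, new_si = {}, {}
        for A in persons:
            for B in persons:
                OA, OB = purchases[A], purchases[B]
                if A == B:
                    new_sp[(A, B)] = 1.0
                elif not OA or not OB:
                    new_sp[(A, B)] = 0.0
                else:
                    # Persons are similar if they buy similar items.
                    total = sum(si[(c, d)] for c in OA for d in OB)
                    new_sp[(A, B)] = C1 * total / (len(OA) * len(OB))
        for c in items:
            for d in items:
                Ic, Id = buyers[c], buyers[d]
                if c == d:
                    new_si[(c, d)] = 1.0
                else:
                    # Items are similar if they are bought by similar persons.
                    total = sum(sp[(A, B)] for A in Ic for B in Id)
                    new_si[(c, d)] = C2 * total / (len(Ic) * len(Id))
        sp, si = new_sp, new_si
    return sp, si


# Hypothetical data: person A buys items x and y, person B buys only y.
purchases = {"A": {"x", "y"}, "B": {"y"}}
person_sim, item_sim = bipartite_simrank(purchases)
```

For this toy data the two equations reduce to s(A,B) = 0.4(s(x,y) + 1) and s(x,y) = 0.4(1 + s(A,B)), so both scores approach 2/3 as the iteration proceeds.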
11
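SimRank Computing SimRank: Naive Method $R_k(a,b)$ is a lower bound on the actual SimRank score $s(a,b)$. The iteration starts from
$$R_0(a,b) = \begin{cases} 0 & (\text{if } a \neq b) \\ 1 & (\text{if } a = b) \end{cases}$$
To compute $R_{k+1}(a,b)$ from $R_k(\cdot,\cdot)$:
$$R_{k+1}(a,b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{i=1}^{|I(a)|} \sum_{j=1}^{|I(b)|} R_k\big(I_i(a), I_j(b)\big) \qquad (4)$$
for $a \neq b$, and $R_{k+1}(a,b) = 1$ for $a = b$.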
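The naive method can be sketched directly from the iteration: keep a score for every node pair and repeatedly recompute each score from the scores of the in-neighbor pairs. The function name `simrank_naive`, the choice C = 0.8, K = 5, and the three-node graph are assumptions made for this illustration.

```python
def simrank_naive(in_nbrs, C=0.8, K=5):
    """in_nbrs maps every node to its set of in-neighbors I(v)."""
    nodes = list(in_nbrs)
    # R_0: 1 on the diagonal, 0 everywhere else.
    R = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(K):
        R_next = {}
        for a in nodes:
            for b in nodes:
                Ia, Ib = in_nbrs[a], in_nbrs[b]
                if a == b:
                    R_next[(a, b)] = 1.0
                elif not Ia or not Ib:
                    # No in-neighbors: similarity is defined to be 0.
                    R_next[(a, b)] = 0.0
                else:
                    # Average the previous scores over all in-neighbor pairs.
                    total = sum(R[(i, j)] for i in Ia for j in Ib)
                    R_next[(a, b)] = C * total / (len(Ia) * len(Ib))
        R = R_next
    return R


# Hypothetical toy graph: u points to both a and b.
in_nbrs = {"u": set(), "a": {"u"}, "b": {"u"}}
scores = simrank_naive(in_nbrs, C=0.8, K=5)
```

In this toy graph a and b share their only in-neighbor u, so the score for (a, b) is C times s(u,u), that is 0.8, while (u, a) stays at 0 because u has no in-neighbors.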
12
SimRank The space required is simply $O(n^2)$ to store the results $R_k$, where n is the number of nodes.
The time required is $O(K n^2 d_2)$. K: the number of iterations. $d_2$: the average of |I(a)||I(b)| over all node pairs (a,b).
13
SimRank Computing SimRank: Pruning
Set the similarity between two nodes that are far apart to 0. Consider node-pairs only for nodes that are near each other.
14
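SimRank If we consider only nodes within a radius r of each other, and there are on average $d_r$ such neighbors for a node, then there will be $n d_r$ node-pairs. The time and space complexities become $O(K n d_r d_2)$ and $O(n d_r)$ respectively, where n is the number of nodes, K the number of iterations, and $d_2$ the average of |I(a)||I(b)| over node pairs.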
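A rough sketch of the pruning idea follows. It assumes that "radius" means hop distance in the undirected version of the graph, which the slides do not spell out, and the names `pairs_within_radius` and `simrank_pruned` as well as the toy graph are hypothetical. Pairs outside the radius are never stored and are read back as 0.

```python
from collections import deque


def pairs_within_radius(in_nbrs, r):
    """Return all ordered pairs (a, b) at most r hops apart, ignoring edge direction."""
    # Build undirected adjacency (assumption: distance ignores edge direction).
    adj = {v: set(preds) for v, preds in in_nbrs.items()}
    for v, preds in in_nbrs.items():
        for u in preds:
            adj[u].add(v)
    pairs = set()
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            x = queue.popleft()
            if dist[x] == r:
                continue  # do not expand beyond the radius
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        pairs.update((src, v) for v in dist)
    return pairs


def simrank_pruned(in_nbrs, r=2, C=0.8, K=5):
    pairs = pairs_within_radius(in_nbrs, r)
    R = {(a, b): 1.0 if a == b else 0.0 for (a, b) in pairs}
    for _ in range(K):
        R_next = {}
        for (a, b) in pairs:
            Ia, Ib = in_nbrs[a], in_nbrs[b]
            if a == b:
                R_next[(a, b)] = 1.0
            elif not Ia or not Ib:
                R_next[(a, b)] = 0.0
            else:
                # Pairs that were pruned away count as similarity 0.
                total = sum(R.get((i, j), 0.0) for i in Ia for j in Ib)
                R_next[(a, b)] = C * total / (len(Ia) * len(Ib))
        R = R_next
    return R


# Hypothetical toy graph: u -> a, u -> b, b -> c.
in_nbrs = {"u": set(), "a": {"u"}, "b": {"u"}, "c": {"b"}}
pruned = simrank_pruned(in_nbrs, r=2)
```

With r = 2 the pair (a, c) is three hops apart and is dropped, while (a, b) is kept and still receives the score 0.8 it would get from the unpruned iteration.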
15
Random Surfer-Pair Model
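Expected Distance Let H be any strongly connected graph. Let u, v be any two nodes in H. We define the expected distance d(u,v) from u to v as
$$d(u,v) = \sum_{t:\,u \rightsquigarrow v} P[t]\, l(t) \qquad (5)$$
The sum is taken over all tours t that start at u, end at v, and do not touch v except at the end; l(t) is the length of tour t and P[t] is the probability of traveling it.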
16
Random Surfer-Pair Model
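Expected Meeting Distance (EMD).
$$m(a,b) = \sum_{t:\,(a,b) \rightsquigarrow (x,x)} P[t]\, l(t) \qquad (6)$$
The sum ranges over tours t, of length l(t) and probability P[t], that lead from the node-pair (a,b) to a singleton pair (x,x). Intuitively, m(a,b) is the expected number of steps before two surfers, starting at a and b and walking randomly in lock-step, first meet at the same node.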
17
Random Surfer-Pair Model
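Expected-f Meeting Distance To circumvent the “infinite EMD” problem, map all distances to a finite interval using an exponential function $f(z) = c^z$, where $c \in (0,1)$ is a constant.
$$s'(a,b) = \sum_{t:\,(a,b) \rightsquigarrow (x,x)} P[t]\, c^{\,l(t)} \qquad (7)$$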
18
Random Surfer-Pair Model
Equivalence to SimRank
19
Random Surfer-Pair Model
Theorem. The SimRank score, with parameter C, between two nodes is their expected-f meeting distance traveling back-edges, for $f(z) = C^z$.
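The theorem can be checked empirically with a small Monte Carlo simulation: two surfers start at a and b, each steps to a uniformly random in-neighbor at every tick, and a walk that first meets after τ steps contributes C^τ. This is only a sketch under stated assumptions: the function name `estimate_meeting_score`, the trial count, the step cap, the seed, and the toy graph are all choices made here, and walks that get stuck at a node without in-neighbors are treated as never meeting.

```python
import random


def estimate_meeting_score(in_nbrs, a, b, C=0.8, trials=20000, max_steps=50, seed=0):
    """Estimate the expected value of C**tau, tau being the first meeting time on back-edges."""
    rng = random.Random(seed)
    # Sorted lists make rng.choice usable and the run reproducible.
    preds = {v: sorted(ns) for v, ns in in_nbrs.items()}
    total = 0.0
    for _ in range(trials):
        x, y, steps = a, b, 0
        while x != y and steps < max_steps:
            if not preds[x] or not preds[y]:
                break  # a surfer cannot move: this pair never meets
            x = rng.choice(preds[x])
            y = rng.choice(preds[y])
            steps += 1
        if x == y:
            total += C ** steps  # f(z) = C^z applied to the meeting time
    return total / trials


# Hypothetical toy graph: u -> a, u -> b, w -> b.
in_nbrs = {"u": set(), "w": set(), "a": {"u"}, "b": {"u", "w"}}
estimate = estimate_meeting_score(in_nbrs, "a", "b")
print(round(estimate, 2))
```

For this graph the basic SimRank equation gives s(a,b) = 0.8/(1·2) · (s(u,u) + s(u,w)) = 0.4, so the estimate should land close to 0.4, up to sampling noise.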
20
Future Work Divide, conquer, and merge.
Divide a corpus into chunks… Ternary (or more) relationships.
21
Personal Opinion We believe that the intuition behind SimRank can be applied in many domains that are based on object-to-object relationships.