
1 Applications of Relative Importance
- Why is relative importance interesting?
  - Web
  - Social networks
  - Citation graphs
  - Biological data
- Graphs become too complex for manual analysis

2 Existing Techniques
- Web: PageRank (Google)
- Social networks: 'centrality'
- All focus on global measures of node importance; we are interested in importance relative to a set of root nodes R

3 Use Existing Techniques?
- Use a global algorithm on the subgraph surrounding the root nodes?
- That gives no preferential treatment to the root nodes; it only ranks the surrounding nodes.

4 Organization: Relative Importance Algorithms
- Notation
- Problem Formulation
- General Framework
- Algorithms

5 Notation
- Digraph G = (V, E)
- Edges: ordered pairs of nodes (u, v)
- Graphs are directed, unweighted, simple
- Walks from u to v: a path is a walk with no repeated nodes

6 Notation
- k-short paths
- P(u,v): set of paths between u and v
- [symbol missing]: set of distinct outgoing edges from u
- Similarly, we have [symbols missing]

7 Problem Formulation
1. Given G and nodes r and t, where r, t ∈ V, compute the "importance" of t w.r.t. root node r: I(t|r)

8 Problem Formulation
2. Given G and a node r ∈ V, rank all vertices in T(G), T ⊆ V, w.r.t. r.

9 Problem Formulation
3. Given G, a set of nodes T(G) to rank, and a set of root nodes R(G) where R ⊆ V, rank all vertices in T w.r.t. R. This is similar to the last case, except that we compute I(t|R) rather than I(t|r). One option is the average importance of t over the root nodes in R.

10 Problem Formulation (3 cont'd.)
- Rather than averaging each node's importance score over the roots, we could define I(t|R) as the minimum score over the roots in R
- This requires 'important' nodes to have a high importance score with respect to all nodes in R
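The two ways of combining per-root scores described on slides 9 and 10 can be written as below. This is a sketch that assumes the notation I(t|r) for the importance of t with respect to a single root r; the subscripts "avg" and "min" are labels introduced here for illustration, since the slide's own formulas are not preserved in the transcript.

```latex
I_{\mathrm{avg}}(t \mid R) = \frac{1}{|R|} \sum_{r \in R} I(t \mid r),
\qquad
I_{\mathrm{min}}(t \mid R) = \min_{r \in R} I(t \mid r)
```

The minimum is the stricter choice: a node that scores highly for one root but poorly for another ranks low under the minimum yet can still rank well under the average.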

11 Problem Formulation
4. Given G, rank all nodes, where R = T = V.

12 General Framework: Weighted Paths
- Nodes are related according to the paths that connect them
- The longer the path, the less importance it conveys
- In the weighting formula, a scalar coefficient controls the decay, P(r,t) is a set of paths from r to t, and p_i is the i-th path in P(r,t)
- Importance decays exponentially with path length
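One way to write the weighting described on this slide, assuming the form given in White and Smyth (listed in the references) and writing the decay coefficient as λ ≥ 1 (the symbol itself is missing from the transcript):

```latex
I(t \mid r) = \sum_{i=1}^{|P(r,t)|} \lambda^{-|p_i|}
```

Here |p_i| is the length of the i-th path. With λ = 2, for example, a path of length 2 contributes 1/4 and a path of length 3 contributes 1/8, so each extra edge halves a path's contribution.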

13 How to choose P(r,t)?
- Path examples: graphs a. and b.
- Shortest paths from R to T: {R-C-T, R-D-T}, which fail to capture much of the connectivity from R to T.

14 Shortest Path
- E.g. transporting cargo from r to t
- The shortest path does not always give a good approximation of importance, e.g. on the web (graph b.)

15 k-Short Paths
- Paths of length K
- Idea: paths longer than the shortest ones are often important and should be taken into account
- Fixes the shortest-path problem of ignoring longer, important paths (e.g. graph b. with 3-short paths)
- Problem: capacity constraints are not respected (e.g. network topology)
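A minimal Python sketch of scoring with k-short paths follows. It assumes that "k-short" means simple paths with at most K edges and that a path with |p| edges contributes lam ** -|p| (an exponentially decaying weight with an assumed coefficient lam). The graph `toy_graph` and the function names are hypothetical illustrations, not taken from the slides.

```python
def k_short_paths(graph, r, t, k):
    """Enumerate simple paths (no repeated nodes) from r to t with at most k edges."""
    paths = []

    def extend(path):
        node = path[-1]
        if node == t:
            paths.append(list(path))
            return
        if len(path) - 1 == k:  # already used k edges, stop extending
            return
        for nxt in graph.get(node, []):
            if nxt not in path:  # keep the path simple
                path.append(nxt)
                extend(path)
                path.pop()

    extend([r])
    return paths


def importance(graph, r, t, k, lam=2.0):
    """Sum lam ** -(number of edges) over all k-short paths from r to t."""
    return sum(lam ** -(len(p) - 1) for p in k_short_paths(graph, r, t, k))


# Hypothetical graph: two shortest paths of length 2 and one longer path of length 3.
toy_graph = {"R": ["C", "D", "A"], "C": ["T"], "D": ["T"], "A": ["B"], "B": ["T"]}

print(k_short_paths(toy_graph, "R", "T", 2))  # only the two shortest paths
print(importance(toy_graph, "R", "T", 3))     # also counts R-A-B-T
```

Raising K from 2 to 3 adds the longer route, which is the effect the slide describes: connectivity that shortest paths alone would miss now contributes to the score, at a reduced weight.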

16 k-Short Node-Disjoint Paths
- No nodes and no edges are repeated
  - Implicitly enforces capacity constraints
  - Motivated by 'mass flow', where importance can 'flow' along paths (e.g. graph b.)
- Breadth-first search with some heuristic, with some K and some [symbol missing]
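The slide does not say which heuristic is used, so the following Python sketch assumes a simple greedy one: repeatedly run a breadth-first search for a shortest remaining path of at most K edges, then block its interior nodes and its edges before searching again. The function names and the example graph are hypothetical, and a greedy search of this kind is not guaranteed to find the largest possible set of disjoint paths.

```python
from collections import deque


def _bfs_path(graph, r, t, k, used_nodes, used_edges):
    """Shortest path from r to t with at most k edges, avoiding blocked nodes and edges."""
    parent = {r: None}
    depth = {r: 0}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        if u == t:
            path = []
            while u is not None:  # walk parents back to r
                path.append(u)
                u = parent[u]
            return path[::-1]
        if depth[u] == k:
            continue
        for v in graph.get(u, []):
            if v in parent or v in used_nodes or (u, v) in used_edges:
                continue
            parent[v] = u
            depth[v] = depth[u] + 1
            queue.append(v)
    return None


def disjoint_short_paths(graph, r, t, k):
    """Greedily collect paths from r to t that share no interior nodes and no edges."""
    if r == t:
        return [[r]]
    used_nodes, used_edges, found = set(), set(), []
    while True:
        path = _bfs_path(graph, r, t, k, used_nodes, used_edges)
        if path is None:
            return found
        found.append(path)
        used_nodes.update(path[1:-1])            # interior nodes cannot be reused
        used_edges.update(zip(path, path[1:]))   # nor can edges (covers a direct r-t edge)


# Hypothetical graph: node E is reachable through both C and D.
example_graph = {"R": ["C", "D"], "C": ["E", "T"], "D": ["E"], "E": ["T"]}
print(disjoint_short_paths(example_graph, "R", "T", 3))
```

Because each interior node can carry only one path, a well-connected hub cannot be counted many times over, which is one reading of the slide's remark that capacity constraints are enforced implicitly.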

17 Markov Chains & Relative Importance
- Graph viewed as a stochastic process
  - Explanation of Markov chains
  - Token traversing the chain…
  - Obviously good for modeling the web

18 Markov Chains & Relative Importance
- Markov Centrality
  - Mean first passage time from r to t: expected number of steps until first arrival at node t starting at node r
  - First-passage probability: probability that the chain first returns to state t in exactly n steps

19 Markov Chains & Relative Importance
- Bias toward 'central nodes'
- Computationally expensive
  - Time: O(|V|^3) (inversion of a |V| x |V| transition matrix)
  - Space: O(|V|^2)
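The mean first passage times from slide 18 can be illustrated on a small chain. The Python sketch below uses fixed-point iteration on the recurrence m[r] = 1 + sum over k != t of P[r][k] * m[k], which is a swapped-in technique chosen to keep the example short; it is not the matrix inversion mentioned on the slide. The scoring function assumes that relative importance is the inverse of the average passage time from the roots, and all names are hypothetical.

```python
def mean_first_passage_times(P, t, tol=1e-12, max_iter=100_000):
    """Expected number of steps to first reach state t from every state.

    P is a row-stochastic transition matrix given as a list of lists.
    The entry for t itself is the expected return time to t.
    """
    n = len(P)
    m = [0.0] * n
    for _ in range(max_iter):
        new = [
            1.0 + sum(P[r][k] * m[k] for k in range(n) if k != t)
            for r in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, m)) < tol:
            return new
        m = new
    raise RuntimeError("did not converge; t may be unreachable from some state")


def markov_importance(P, t, roots):
    """Assumed score: inverse of the average passage time from the roots to t."""
    m = mean_first_passage_times(P, t)
    return len(roots) / sum(m[r] for r in roots)


# Two-state chain in which every transition has probability 0.5.
P = [[0.5, 0.5], [0.5, 0.5]]
print(mean_first_passage_times(P, t=1))
print(markov_importance(P, t=1, roots=[0, 1]))
```

Nodes that are reached quickly from the roots get short passage times and therefore high scores, which matches the slide's note about a bias toward central nodes.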

20 Markov Chains & Relative Importance
- PageRank
  - Uses backlinks to assign importance to web pages

21 Markov Chains & Relative Importance
- PageRank
  - Less computationally expensive than Markov centrality
  - Convergence is roughly logarithmic in the size of the graph
  - 322 million links processed in 52 iterations
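For readers unfamiliar with how PageRank turns backlinks into scores, here is a minimal power-iteration sketch in Python. It assumes the common formulation with a damping factor of 0.85 and uniform redistribution of rank from nodes without out-links; neither detail is stated on the slides, and the function name and example graph are hypothetical.

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on a dict mapping each node to its list of out-neighbours."""
    nodes = sorted(set(graph) | {v for outs in graph.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-links is spread uniformly (assumption).
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            outs = graph.get(u, [])
            for v in outs:
                new[v] += damping * rank[u] / len(outs)  # u shares its rank among out-links
        if sum(abs(new[u] - rank[u]) for u in nodes) < tol:
            return new
        rank = new
    return rank


# Hypothetical graph: C has two backlinks, B has none.
links = {"A": ["C"], "B": ["C"], "C": ["A"]}
scores = pagerank(links)
print(sorted(scores, key=scores.get, reverse=True))
```

Each iteration touches every edge once, so the cost per iteration is linear in the number of links, in contrast to the cubic cost quoted for Markov centrality.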

22 Markov Chains & Relative Importance
- Retrofit PageRank so that all nodes in R have a uniform bias at the start
- The 'surfer' begins at a root node and traverses the graph, returning to the root set R with some fixed probability at each time step
- I(t|R) = probability that the surfer visits t during a walk
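The root-biased variant described on this slide can be sketched in Python as follows. The sketch assumes that the return probability is a parameter called `beta` here, that the surfer returns to a uniformly chosen root, and that a node without out-links sends the surfer back to the roots; these choices and all names are illustrative assumptions rather than details from the slides.

```python
def pagerank_with_priors(graph, roots, beta=0.3, tol=1e-12, max_iter=1000):
    """Stationary visit probabilities of a surfer who jumps back to the root set.

    graph maps each node to its list of out-neighbours; roots is a set of nodes.
    """
    nodes = sorted(set(graph) | {v for outs in graph.values() for v in outs})
    prior = {u: (1.0 / len(roots) if u in roots else 0.0) for u in nodes}
    rank = dict(prior)  # the surfer starts at a root
    for _ in range(max_iter):
        new = {u: beta * prior[u] for u in nodes}  # jump back to a root
        for u in nodes:
            outs = graph.get(u, [])
            if outs:
                for v in outs:
                    new[v] += (1 - beta) * rank[u] / len(outs)
            else:
                for v in nodes:  # dead end: return to the roots (assumption)
                    new[v] += (1 - beta) * rank[u] * prior[v]
        if sum(abs(new[u] - rank[u]) for u in nodes) < tol:
            return new
        rank = new
    return rank


# Hypothetical 3-cycle with a single root A.
cycle = {"A": ["B"], "B": ["C"], "C": ["A"]}
biased = pagerank_with_priors(cycle, roots={"A"}, beta=0.3)
print(biased)
```

On a plain cycle, ordinary PageRank would score all three nodes equally; with the bias toward A, scores fall off with distance from the root, which is the relative ranking the slide is after.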

23 Experiments (Simulated Data)

24 Experiments (Simulated Data)
- More complex
  - In- and out-degrees changed
  - Shortest path lengths between nodes changed (e.g. A-B)
- In the analysis that follows, R = {A, F}

25 Experiments (Simulated Data)
- HITSPa: A 0.252, F 0.241, G 0.128, C 0.110, E 0.099, H 0.052, D 0.032, J 0.025, I 0.032, B 0.024
- HITSPh: F 0.225, A 0.186, D 0.162, B 0.119, E 0.090, I 0.067, H 0.061, J 0.050, G 0.028, C 0.008

26 Experiments (Simulated Data)
- MarkovC: J 0.180, C 0.133, G 0.130, H 0.129, E 0.111, I 0.101, F 0.069, D 0.051, A 0.047, B 0.044
- KSMarkov: H 0.146, G 0.142, E 0.142, J 0.140, C 0.120, I 0.098, F 0.087, D 0.061, A 0.034, B 0.024

27 Experiments (9/11 Terrorist Network)
- 63 nodes (terrorists)
- 308 edges (interactions)

28 ("-" marks a cell that is not preserved in the transcript)
Rank  PRankP        HITSP         WKPaths     MarkovC    KSMarkov
1     Khemais       -             Beghal      Atta       Khemais
2     Beghal        -             Khemais     Al-Shehhi  Beghal
3     Moussaoui     Atta          Moussaoui   Al-Shibh   Moussaoui
4     Maaroufi      Moussaoui     Maaroufi    Moussaoui  Maaroufi
5     Qatada        Maaroufi      Bensakhria  Jarrah     Qatada
6     Daoudi        Qatada        Daoudi      Hanjour    Daoudi
7     Courtaillier  Bensakhria    Qatada      Al-Omari   Bensakhria
8     -             Daoudi        Walid       Khemais    Courtaillier
9     Walid         Courtaillier  -           Qatada     Walid
10    Khammoun      -             -           Bahaji     Khammoun

29 Conclusion
- Provides a first step toward addressing 'relative importance'
- Scaling can be an issue for algorithms such as Markov centrality
- Using different algorithms and comparing their results can reveal interesting information
- …Paper Analysis…

30 References
- White, Smyth. Algorithms for Estimating Relative Importance in Networks. SIGKDD '03.
- Page, Brin, Motwani, Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Stanford University, Computer Science Department Technical Report.
- Wikipedia on Markov chains: http://en.wikipedia.org/wiki/Markov_chain and http://en.wikipedia.org/wiki/Examples_of_Markov_chains

31 Weather Markov Chain Example

32 Markov Chain Steady State
- The further ahead the prediction, the less accurate it is; the forecast converges on a steady state
  - We'll skip the proof in the interest of time…
- Probabilities are derived from gathering experimental data
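The convergence to a steady state mentioned on slides 31 and 32 can be reproduced in a few lines of Python. The slide's own figures are not preserved in the transcript, so the sketch assumes the two-state weather chain from the Wikipedia page cited in the references: a sunny day is followed by a sunny day with probability 0.9, and a rainy day is followed by a sunny day with probability 0.5. The function names are hypothetical.

```python
def step(dist, P):
    """Advance a probability distribution over states by one day."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def steady_state(P, start, tol=1e-12, max_iter=10_000):
    """Repeat the one-day update until the distribution stops changing."""
    dist = list(start)
    for _ in range(max_iter):
        nxt = step(dist, P)
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    return dist


# Assumed transition matrix; states are [sunny, rainy].
weather = [[0.9, 0.1],
           [0.5, 0.5]]

tomorrow = step([1.0, 0.0], weather)        # forecast given that today is sunny
long_run = steady_state(weather, [1.0, 0.0])
print(tomorrow, long_run)
```

Starting from a rainy day instead gives the same long-run distribution, which is what "converges on a steady state" means: far enough ahead, the forecast no longer depends on today's weather.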

