Proximity in Graphs by Using Random Walks


Many of the slides are borrowed from Dr. Hanghang Tong’s talk slides and Dr. Jure Leskovec’s lecture notes.

Proximity on Graphs
What is the proximity between A and B? For example, ‘how close is Smith to Johnson?’

Proximity on Graphs: Why?
Link prediction, ranking, email management, image captioning, neighborhood formulation, connection subgraphs, pattern matching, collaborative filtering, and many more…

Link Prediction
How to predict the existence of a link? Use proximity [Liben-Nowell+ 2003].

Center-Piece Subgraph (CEPS)
Given: Q query nodes. Find: center-piece ( ).
Input of CEPS: Q query nodes; budget b; k, the softAND coefficient.
Applications: social networks, law enforcement, gene networks, …

Example of CEPS

CEPS: Overview
Individual Score Calculation: measure importance wrt each individual query.
Combine Individual Scores: measure importance wrt the query set.
“Extract” Alg.: … the connection subgraphs.

Issue: ‘degree-1 node’ effect [Faloutsos+] [Koren+]
Esc_Prob(a->b) = 1 in both example graphs: no influence for degree-1 nodes (E, F)! Known as the ‘pizza delivery guy’ problem in undirected graphs.

RWR: Individual Score Calculation
Goal: an individual importance score r(i,j) = r_{i,j} for each node j wrt each query i.
How: random walk with restart; r(i,j) is the steady-state probability.
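One common way to write the steady-state condition as an equation is sketched here. The symbols are assumptions that do not appear on the slides: \tilde{W} is a column-normalized adjacency matrix, \mathbf{e}_i is the indicator vector of query i, p is the restart probability, and \mathbf{r}_i is the vector whose j-th entry is r(i,j). The slides do not say which normalization they use, so the numbers in their table need not match this exact form.

```latex
\mathbf{r}_i = (1-p)\,\tilde{W}\,\mathbf{r}_i + p\,\mathbf{e}_i
\quad\Longrightarrow\quad
\mathbf{r}_i = p\,\bigl(I - (1-p)\,\tilde{W}\bigr)^{-1}\mathbf{e}_i
```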

An Illustrating Example
Starting from node 1, the walker moves randomly to a neighbor, with some probability p of returning to node 1. The score of node j is Prob(RW will finally stay at j).
[Figure: example graph with nodes 1 to 13]
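The walk described above can be sketched as a simple power iteration. This is a minimal illustration, not the speakers' implementation: the function name `rwr_scores`, the adjacency-dict representation, and the parameter `restart` (the "some p" on the slide) are all assumptions, and every node is assumed to have at least one neighbor.

```python
def rwr_scores(adj, query, restart=0.5, tol=1e-12, max_iter=10000):
    """Steady-state probabilities of a random walk with restart.

    adj     : dict mapping each node to a list of its neighbors (assumed format)
    query   : the node the walker restarts from
    restart : probability of jumping back to `query` at each step (assumed name)
    """
    nodes = list(adj)
    r = {v: 0.0 for v in nodes}
    r[query] = 1.0  # the walker starts at the query node
    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            # With probability (1 - restart), move to a uniformly random neighbor.
            share = (1.0 - restart) * r[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        # With probability `restart`, jump back to the query node.
        nxt[query] += restart
        diff = sum(abs(nxt[v] - r[v]) for v in nodes)
        r = nxt
        if diff < tol:
            break
    return r


# Tiny usage example on a path graph a - b - c, restarting at a.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = rwr_scores(path, "a", restart=0.5)
```

On this toy path the scores fall off with distance from the query node, which is the behavior the proximity measure relies on.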

Individual Score Calculation

           Q1      Q2      Q3
Node 1   0.5767  0.0088  0.0088
Node 2   0.1235  0.0076  0.0076
Node 3   0.0283  0.0283  0.0283
Node 4   0.0076  0.1235  0.0076
Node 5   0.0088  0.5767  0.0088
Node 6   0.0076  0.0076  0.1235
Node 7   0.0088  0.0088  0.5767
Node 8   0.0333  0.0024  0.1260
Node 9   0.1260  0.0024  0.0333
Node 10  0.1260  0.0333  0.0024
Node 11  0.0333  0.1260  0.0024
Node 12  0.0024  0.1260  0.0333
Node 13  0.0024  0.0333  0.1260

Variant: escape probability
Define a random walk (RW) on the graph.
Esc_Prob(A->B) = Prob(starting at A, the walk reaches B before returning to A).
[Figure: A, B, and the remaining graph; Esc_Prob = Pr(smile before cry)]
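The definition above can be turned into a small computation. The sketch below is one possible approach under stated assumptions, not the method from the cited work: it iterates the hitting probability v[x] = Prob(walk from x reaches B before A) until it settles, then averages v over the neighbors of A. The function name, the adjacency-dict format, and the requirement that the graph is connected are assumptions.

```python
def escape_probability(adj, a, b, sweeps=10000, tol=1e-12):
    """Prob(a walk starting at `a` reaches `b` before returning to `a`).

    adj : dict mapping each node to a list of neighbors (assumed format,
          undirected and connected).
    """
    # v[x] = probability that a walk currently at x hits b before a.
    v = {x: 0.0 for x in adj}
    v[b] = 1.0
    for _ in range(sweeps):
        change = 0.0
        for x in adj:
            if x == a or x == b:
                continue  # boundary values stay fixed at 0 and 1
            new = sum(v[y] for y in adj[x]) / len(adj[x])
            change = max(change, abs(new - v[x]))
            v[x] = new
        if change < tol:
            break
    # First step from a goes to a uniformly random neighbor.
    return sum(v[y] for y in adj[a]) / len(adj[a])


# Path A - x - B, and the same path with a degree-1 node E hanging off x.
plain = {"A": ["x"], "x": ["A", "B"], "B": ["x"]}
with_leaf = {"A": ["x"], "x": ["A", "B", "E"], "B": ["x"], "E": ["x"]}
esc_plain = escape_probability(plain, "A", "B")
esc_leaf = escape_probability(with_leaf, "A", "B")
```

Both toy graphs give the same value, which illustrates the degree-1 node effect: a walker that steps onto E can only come straight back, so E does not change the escape probability.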

AND: Combine Scores
Q: How to combine scores?
A: Multiply … = prob. that 3 random particles coincide on node j.
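As a small worked example of the multiplication rule, the sketch below applies it to four rows of the individual-score table. It assumes the flattened table values read row by row (Q1, Q2, Q3 for each node); the names `scores` and `and_score` are illustrative only.

```python
import math

# Rows taken from the individual-score table (Q1, Q2, Q3), assuming the
# flattened values read row by row. Only four nodes are shown.
scores = {
    1: (0.5767, 0.0088, 0.0088),
    2: (0.1235, 0.0076, 0.0076),
    3: (0.0283, 0.0283, 0.0283),
    8: (0.0333, 0.0024, 0.1260),
}


def and_score(row):
    """AND score of one node: the product of its per-query scores."""
    return math.prod(row)


and_scores = {node: and_score(row) for node, row in scores.items()}
# Nodes ordered from highest to lowest AND score.
ranking = sorted(and_scores, key=and_scores.get, reverse=True)
```

A node scores well under AND only if it is reasonably close to every query; node 2, which is close to Q1 alone, drops to the bottom of this small ranking.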

K_SoftAnd: Combine Scores
Generalization (softAND): we want nodes close to k of the Q query nodes (k < Q).
Q: How to do that?
A: Prob(at least k-out-of-Q will meet each other at j).
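The "at least k-out-of-Q" probability can be computed with a short dynamic program. This is a sketch under the assumption that the Q particles move independently, so particle i sits at node j with probability r(i,j); the function name `at_least_k` is hypothetical and the slides do not say how the authors compute this quantity.

```python
def at_least_k(probs, k):
    """Probability that at least k of the independent events occur.

    probs : per-query scores r(i, j) for one node j (assumed independent)
    k     : the softAND coefficient
    """
    # dp[m] = probability that exactly m of the events seen so far occurred.
    dp = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dp) + 1)
        for m, mass in enumerate(dp):
            nxt[m] += mass * (1.0 - p)   # this event does not occur
            nxt[m + 1] += mass * p       # this event occurs
        dp = nxt
    return sum(dp[k:])


# Three queries, each particle at node j with probability 0.5.
example = [0.5, 0.5, 0.5]
p_two_of_three = at_least_k(example, 2)
p_all_three = at_least_k(example, 3)
p_any = at_least_k(example, 1)
```

Setting k equal to the number of queries reproduces the plain product, and lowering k relaxes the requirement step by step.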

K_SoftAnd: Relaxation of AND
Disconnected communities, noise: asking an AND query? No answer!

AND query vs. K_SoftAnd query
[Figure: And Query and 2_SoftAnd Query panels; axis scale x 1e-4]

1_SoftAnd query = OR query
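Why k = 1 behaves like OR can be seen from a short derivation. It assumes, as the multiplication rule for AND suggests, that the Q particles move independently, so that particle i is at node j with probability r(i,j).

```latex
\Pr(\text{at least 1 of } Q \text{ at } j) = 1 - \prod_{i=1}^{Q}\bigl(1 - r(i,j)\bigr),
\qquad
\Pr(\text{all } Q \text{ at } j) = \prod_{i=1}^{Q} r(i,j)
```

The first expression is the usual probabilistic OR; the second is the AND score, which is the other extreme of the same family.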

Measuring Importance
Individual scores: random walk with restart, steady-state probability (the Q1/Q2/Q3 table for Nodes 1 to 13).
Combining scores: K_SoftAnd, meeting probability (And, 2_SoftAnd, OR).
[Figure values for the combined scores: 0.4505 0.0710 0.2267 0.1010; 0.0103 0.0019 0.0024 0.0046]

“Extract” Alg.
Goal: maximize total scores and ‘appropriate’ connections.
How to: dynamic programming; greedy alg.; pick up a promising node; find the ‘best’ path.
[Figure: example graph with numbered nodes]
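The greedy idea (pick a promising node, then connect it) can be illustrated with a deliberately simplified sketch. This is not the “Extract” algorithm from the slides: it swaps the ‘best’ path search for a plain breadth-first shortest path, and it treats the budget as the total number of nodes in the result including the query nodes. The names `bfs_path` and `greedy_extract` are hypothetical.

```python
from collections import deque


def bfs_path(adj, src, dst):
    """Shortest path from src to dst as a list of nodes, or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    if dst not in prev:
        return None
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def greedy_extract(adj, score, queries, budget):
    """Grow a subgraph around the queries, best-scoring nodes first."""
    chosen = set(queries)
    candidates = sorted(
        (v for v in adj if v not in chosen), key=lambda v: score[v], reverse=True
    )
    for dest in candidates:
        if len(chosen) >= budget:
            break
        if dest in chosen:
            continue
        # Connect the promising node to every query (simplification: BFS paths).
        new_nodes = set()
        reachable = True
        for q in queries:
            path = bfs_path(adj, q, dest)
            if path is None:
                reachable = False
                break
            new_nodes.update(path)
        # Accept the node only if its connecting paths fit within the budget.
        if reachable and len(chosen | new_nodes) <= budget:
            chosen |= new_nodes
    return chosen


# Three queries around a center c, plus a low-scoring leaf x attached to q1.
graph = {
    "q1": ["c", "x"], "q2": ["c"], "q3": ["c"],
    "c": ["q1", "q2", "q3"], "x": ["q1"],
}
combined = {"q1": 1.0, "q2": 1.0, "q3": 1.0, "c": 0.9, "x": 0.1}
small = greedy_extract(graph, combined, ["q1", "q2", "q3"], budget=4)
large = greedy_extract(graph, combined, ["q1", "q2", "q3"], budget=5)
```

With a budget of 4 only the center node is added; raising the budget to 5 also admits the low-scoring leaf.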

Case Study: AND query

2_SoftAnd query
[Figure labels: database, Statistic]