Published by Sonja Peters. Modified over 5 years ago.
1
Random Walk on Graph
Random walk (t = 0): start from a given node at time 0. Choose a neighbor at random (possibly the previous node) and move there. Repeat until time t = n. Q1. Where does this converge to as n → ∞? Q2. How fast does it converge? Q3. What are the implications for different applications?
2
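Random Walks on Graphs
Node i has degree $k_i$: move to any neighbor with probability $1/k_i$. This is a Markov chain! Its transition matrix $A$ has entries $A_{ij} = 1/k_i$ if j is a neighbor of i, and 0 otherwise. Start at a node i: $p(0) = (0, 0, \dots, 1, \dots, 0, 0)$. Then $p(n) = p(0) A^n$ and $\pi = \pi A$, where $\pi = \lim_{n \to \infty} p(n)$. Q: What is π for a random walk on a graph? [Figure: example graph with transition probabilities $1/k_1, \dots, 1/k_5$ and its matrix A.]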
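The iteration p(n) = p(0) Aⁿ can be tried out numerically. The sketch below is a minimal illustration, not part of the slides: the 4-node graph `adj` and the helper `step` are assumptions made up for the example.

```python
# Hypothetical 4-node undirected graph (not from the slides):
# a triangle 0-1-2 with an extra node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def step(p, adj):
    """One step of the walk: p(n+1) = p(n) A, with A[i][j] = 1/k_i for j in N(i)."""
    nxt = [0.0] * len(p)
    for i, neighbors in adj.items():
        share = p[i] / len(neighbors)   # node i splits its probability equally
        for j in neighbors:
            nxt[j] += share
    return nxt

p = [1.0, 0.0, 0.0, 0.0]   # p(0): start at node 0
for _ in range(200):
    p = step(p, adj)

print([round(x, 4) for x in p])
```

On this graph the vector settles to values proportional to the node degrees (2, 2, 3, 1), which hints at the answer to the question above.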
3
Random Walks on Undirected Graphs
Stationarity: $\pi(z) = \sum_x \pi(x)\,p(x,z)$, with $p(x,y) = 1/k_x$ for $y \in N(x)$. One could try to solve these equations, or global balance, directly. Not easy! Instead, define $N(z)$ = {neighbors of z}. Then $\sum_{x \in N(z)} k_x\,p(x,z) = \sum_{x \in N(z)} k_x \cdot (1/k_x) = \sum_{x \in N(z)} 1 = k_z$. Normalize by dividing both sides by $\sum_x k_x = 2|E|$ (|E| = m = number of edges): $\sum_{x \in N(z)} \frac{k_x}{2|E|}\,p(x,z) = \frac{k_z}{2|E|}$. So $\pi(x) = k_x / 2|E|$ is the stationary distribution: it always satisfies the stationarity equation $\pi = \pi P$.
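The claim π(x) = k_x / 2|E| can be checked exactly on a small example. This is a sketch under assumptions: the graph `adj` is hypothetical, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

# Hypothetical undirected graph (an assumption for illustration).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

num_edges = sum(len(nb) for nb in adj.values()) // 2          # |E|
pi = {x: Fraction(len(adj[x]), 2 * num_edges) for x in adj}   # pi(x) = k_x / 2|E|

# Right-hand side of the stationarity equation:
# sum over neighbors x of z of pi(x) * p(x, z), with p(x, z) = 1/k_x.
pi_next = {z: sum(pi[x] * Fraction(1, len(adj[x])) for x in adj[z]) for z in adj}

is_stationary = (pi_next == pi)
print(is_stationary)
```

Because the graph is undirected, the neighbors of z are exactly the nodes that can move to z, which is why `adj[z]` can be used for the sum.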
4
What about Random Walks on Directed Graphs?
Assign each node centrality 1/n (for n nodes). [Figure: directed graph annotated with node centrality values such as 1/8, 4/13, 2/13, 1/13.]
5
A Problematic Graph
Q: What is the problem with this graph? A: All centrality "points" will eventually go to F and G. Solution: when at node i, with probability β jump to any of the N nodes; with probability 1−β jump to a random neighbor of i. Q: Does this remind you of something? A: The PageRank algorithm! The PageRank of node i is the stationary probability of a random walk on this (modified) directed graph. The factor β in PageRank avoids the problem by "leaking" a small amount of centrality from each node to all other nodes.
6
PageRank Centrality
PageRank as a random walk: a (bored) web surfer either follows a link to a linked webpage, with probability 1−β, or jumps to a random page (e.g. a new search), with probability β. The probability of ending up at page X after a large enough time is the PageRank of page X! PageRank can be generalized with a per-node vector β = (β1, β2, …, βn). Undirected network: removing β (β = 0) gives degree centrality.
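The random-surfer description translates into a short power iteration. The following is a minimal sketch, not a reference implementation: the graph `links`, the function name `pagerank`, the default β = 0.15 and the treatment of nodes without out-links (jump anywhere) are all assumptions for illustration.

```python
def pagerank(out_links, beta=0.15, iters=200):
    """Stationary probabilities of the surfer: teleport w.p. beta, follow a link w.p. 1 - beta."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}           # start from centrality 1/n
    for _ in range(iters):
        nxt = {v: beta / n for v in nodes}        # share received through random jumps
        for v in nodes:
            targets = out_links[v] or nodes       # assumption: a node with no links jumps anywhere
            for u in targets:
                nxt[u] += (1 - beta) * rank[v] / len(targets)
        rank = nxt
    return rank

# Hypothetical directed graph: once the walk enters {F, G} it cannot leave.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A", "F"], "F": ["G"], "G": ["F"]}

rank = pagerank(links)               # with teleportation
leaky = pagerank(links, beta=0.0)    # plain random walk, no teleportation
print(rank)
print(leaky)
```

With β = 0 essentially all centrality ends up in F and G, while with β > 0 every node keeps at least β/N.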
7
Applications of RW: Measuring Large Networks
We are interested in studying the properties (degree distribution, path lengths, clustering, connectivity, etc.) of many real networks (Internet, Facebook, YouTube, Flickr, etc.), as these contain a lot of important ($$$) information. E.g. to plot the degree distribution, we need to crawl the whole network and obtain a "degree value" for each node. These networks might contain millions of nodes!
8
Online Social Networks (OSNs)
Size / Traffic: 500 million / 2; 200 million / 9; 130 million / 12; 100 million / 43; 75 million / 10; (size missing) / 29. > 1 billion users, October 2010 (over 15% of the world's population, and over 50% of the world's Internet users!)
9
Measuring Facebook
Facebook: 500+M users, 130 friends each (on average), 8 bytes (64 bits) per user ID. The raw connectivity data, with no attributes: 500M x 130 x 8 B = 520 GB. To get this data, one would have to download 100+ TB of (uncompressed) HTML data! This is neither feasible nor practical. Solution: sampling!
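The 520 GB figure is simple arithmetic and can be reproduced directly. The variable names below are illustrative, and decimal gigabytes (10⁹ bytes) are assumed.

```python
users = 500_000_000          # 500+M users (from the slide)
avg_friends = 130            # average number of friends per user
bytes_per_id = 8             # one 64-bit user ID per friend entry

raw_bytes = users * avg_friends * bytes_per_id
raw_gb = raw_bytes / 10**9   # decimal gigabytes (assumption)
print(raw_gb)                # 520.0
```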
10
Measuring Large Networks (for the mere mortals)
Obtaining a complete dataset is difficult: companies are usually unwilling to share data, for privacy and performance reasons (e.g. Facebook will ban accounts if it sees extensive crawling), and there is tremendous overhead to measure everything (~100 TB for Facebook). Representative samples are desirable: to study properties and to test algorithms.
11
Sampling
What: Topology? Nodes? How: Directly? Exploration?
12
(1) Breadth-First-Search (BFS)
Starting from a seed, BFS explores all neighbor nodes; the process continues iteratively, without replacement. BFS leads to a bias towards high-degree nodes (Lee et al., "Statistical properties of Sampled Networks", Phys. Review E, 2006). Early measurement studies of OSNs use BFS as the primary sampling technique, e.g. [Mislove et al.], [Ahn et al.], [Wilson et al.].
13
(2) Random Walk (RW)
Explores the graph one node at a time, with replacement. Restart from different seeds, or run multiple seeds in parallel. Does this lead to a good sample?
14
Implications for Random Walk Sampling
Say we collect a small part of the Facebook graph using a random walk (RW) [1]. There is a higher chance of visiting high-degree nodes, so high-degree nodes are over-represented and low-degree nodes are under-represented. Which is the sampled degree distribution: 1 or 2? [Figure: real degree distribution with two candidate sampled degree distributions.]
[1] M. Gjoka, M. Kurant, C. T. Butts and A. Markopoulou, "Walking in Facebook: A Case Study of Unbiased Sampling of OSNs", INFOCOM 2010.
15
Random Walk Sampling of Facebook
[Figure: sampled vs. real degree distribution.] Real average node degree: 94. Observed average node degree: 338. Q: How can we fix this? A: Intuition: we need to reduce the probability of visiting high-degree nodes (and increase it for low-degree nodes).
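The gap between the real and the observed average degree follows from the stationary distribution π(x) = k_x / 2|E| of the plain random walk: a long walk reports an average degree of Σ k_x² / Σ k_x rather than the true mean. The sketch below illustrates this on a hypothetical 6-node graph; the graph and variable names are assumptions, not the Facebook data.

```python
# Hypothetical graph: hub 0 linked to nodes 1..5, plus one edge between 1 and 2.
adj = {0: [1, 2, 3, 4, 5], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0], 5: [0]}

degrees = [len(nb) for nb in adj.values()]

true_mean = sum(degrees) / len(degrees)               # uniform average over nodes
# Average degree seen by a long random walk: sum_x pi(x) * k_x, pi(x) = k_x / sum(k).
rw_mean = sum(k * k for k in degrees) / sum(degrees)

print(true_mean, rw_mean)   # 2.0 3.0
```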
16
Markov Chain Monte Carlo (MCMC)
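Q: How should we modify the random walk? A: Use Markov Chain Monte Carlo theory. Original chain: move x → y with probability Q(x,y); stationary distribution π(x). Desired chain: stationary distribution w(x) (for uniform sampling: w(x) = 1/N). New transition probabilities: $P(x,y) = Q(x,y)\,a(x,y)$ for $y \neq x$, where a(x,y) is the probability of accepting a proposed move, and $P(x,x) = 1 - \sum_{y \neq x} P(x,y)$.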
17
MCMC (2)
a(x,y): probability of accepting the proposed move x → y. Q: How should we choose a(x,y) so as to converge to the desired stationary distribution w(x)? A: w(x) is the stationary distribution if $w(x)P(x,y) = w(y)P(y,x)$ for all x, y. Q: Why? These are the local balance (time-reversibility) equations. With $P(x,y) = Q(x,y)\,a(x,y)$ they read $w(x)Q(x,y)a(x,y) = w(y)Q(y,x)a(y,x)$; denote this common value by $b(x,y) = b(y,x)$. Since a(x,y) ≤ 1 (it is a probability), $b(x,y) \le w(x)Q(x,y)$ and $b(x,y) = b(y,x) \le w(y)Q(y,x)$.
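One common way to satisfy both bounds at once is to make b(x,y) as large as they allow. The lines below are a sketch of that step, added for illustration; this is the usual Metropolis-Hastings choice and assumes Q(x,y) > 0 whenever a move x → y is proposed.

```latex
\begin{align*}
b(x,y) &= \min\{\, w(x)Q(x,y),\; w(y)Q(y,x) \,\} \\
a(x,y) &= \frac{b(x,y)}{w(x)Q(x,y)}
        = \min\left\{ 1,\; \frac{w(y)Q(y,x)}{w(x)Q(x,y)} \right\}
\end{align*}
```

This b(x,y) is symmetric in x and y by construction, so the local balance equations hold, and a(x,y) never exceeds 1.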
18
MCMC for Uniform Sampling
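$w(x) = w(y)$ (= 1/n; the exact value doesn't really matter), and $Q(y,x)/Q(x,y) = k_x/k_y$ since $Q(x,y) = 1/k_x$. Metropolis-Hastings random walk: a move to a lower-degree node is always accepted; a move to a higher-degree node is rejected with a probability that depends on the degree ratio (it is accepted with probability $k_x/k_y$).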
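The acceptance rule can be turned into an explicit transition matrix and checked for uniformity. The following is a sketch under assumptions: the graph `adj` and the helper `mhrw_matrix` are hypothetical, and the acceptance probability is taken to be min(1, k_x/k_y), the standard Metropolis-Hastings choice for a uniform target.

```python
from fractions import Fraction

# Hypothetical graph: hub 0 linked to nodes 1..5, plus one edge between 1 and 2.
adj = {0: [1, 2, 3, 4, 5], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0], 5: [0]}

def mhrw_matrix(adj):
    """Transition probabilities of a Metropolis-Hastings random walk with uniform target."""
    nodes = sorted(adj)
    P = {x: {y: Fraction(0) for y in nodes} for x in nodes}
    for x in nodes:
        kx = len(adj[x])
        for y in adj[x]:
            ky = len(adj[y])
            accept = min(Fraction(1), Fraction(kx, ky))   # a(x,y) = min(1, k_x / k_y)
            P[x][y] = Fraction(1, kx) * accept            # propose y w.p. 1/k_x, then accept
        P[x][x] = 1 - sum(P[x].values())                  # rejected proposals stay at x
    return P

P = mhrw_matrix(adj)

# The uniform distribution is stationary iff every column of P sums to 1.
column_sums = {y: sum(P[x][y] for x in P) for y in P}
print(column_sums)
```

Low-degree nodes get a large self-loop probability (here a leaf stays put with probability 4/5), which is how the walk compensates for being pulled towards the hub.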
19
Metropolis-Hastings (MH) Random Walk
Explore the graph one node at a time, with replacement. In the stationary distribution, every node is equally likely: $\pi(x) = 1/N$.
20
Degree Distribution of FB with MHRW
The sampled degree distribution is almost identical to the real one. MCMC methods have MANY other applications: sampling, optimization.
21
Spectral Analysis of (ergodic) Markov Chains
If a Markov chain (defined by transition matrix P) is ergodic (irreducible, aperiodic, and positive recurrent), then $P^{(n)}_{ik} \to \pi_k$ as $n \to \infty$, with $\pi = [\pi_1, \pi_2, \dots, \pi_n]$. Q: But how fast does the chain converge? E.g. how many steps until we are "close enough" to π? A: This depends on the eigenvalues of P. The convergence time is also called the mixing time.
22
Eigenvalues and Eigenvectors of matrix P
Left eigenvectors: a row vector π is a left eigenvector for eigenvalue λ of matrix P iff $\pi P = \lambda \pi$, i.e. $\sum_k \pi_k p_{ki} = \lambda \pi_i$. Right eigenvectors: a column vector v is a right eigenvector for eigenvalue λ of matrix P iff $Pv = \lambda v$, i.e. $\sum_k p_{ik} v_k = \lambda v_i$. Q: What eigenvalues and eigenvectors can we guess already? A: λ = 1 is an eigenvalue with left eigenvector π, the stationary distribution, and with right eigenvector v = 1 (all 1s).
23
Eigenvalues and Eigenvectors for 2-state Chains
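Both sets of equations have non-zero solutions ⇔ (P − λI) is singular ⇔ there exists v ≠ 0 such that (P − λI)v = 0 ⇔ the determinant |P − λI| = 0. For two states: $(p_{11} - \lambda)(p_{22} - \lambda) - p_{12}p_{21} = 0$, which gives $\lambda_1 = 1$ and $\lambda_2 = 1 - p_{12} - p_{21}$ (substitute above and confirm with some algebra), with $|\lambda_2| < 1$. (Eigenvectors are normalized so that $\pi^{(1)}$ is a stationary distribution AND $v^{(i)} \cdot \pi^{(i)} = 1$, ∀i.)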
24
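Diagonalization
Eigenvalue decomposition: $P = U \Lambda U^{-1}$. Q: What is $P^n$? A: $P^n = U \Lambda^n U^{-1}$, since the inner factors $U^{-1}U$ cancel. Q: How fast does the chain converge to the stationary distribution? A: It converges exponentially fast in n, as $(\lambda_2)^n$.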
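The exponential rate can be observed on a 2-state chain, where λ2 = 1 − p12 − p21. The sketch below is illustrative only: the values of `p12` and `p21` are made up, and the stationary distribution is obtained by solving π = πP for two states.

```python
p12, p21 = 0.3, 0.1                        # hypothetical transition probabilities
P = [[1 - p12, p12], [p21, 1 - p21]]
lam2 = 1 - p12 - p21                       # second eigenvalue of the 2-state chain
pi = [p21 / (p12 + p21), p12 / (p12 + p21)]   # solves pi = pi P for two states

p = [1.0, 0.0]                             # start in state 1
errors = []
for n in range(1, 11):
    # One step of p(n) = p(n-1) P, written out for two states.
    p = [p[0] * P[0][0] + p[1] * P[1][0], p[0] * P[0][1] + p[1] * P[1][1]]
    errors.append(abs(p[0] - pi[0]))       # distance of state-1 probability from pi

# Each step shrinks the error by the same factor.
ratios = [errors[n] / errors[n - 1] for n in range(1, len(errors))]
print(lam2, ratios)
```

Every ratio comes out equal to λ2 up to rounding, so the error after n steps is proportional to (λ2)ⁿ.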
25
Generalization for M-state Markov Chains
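We'll assume that there are M distinct eigenvalues (see notes for repeated ones). Matrix P is stochastic ⇒ all eigenvalues satisfy $|\lambda_i| \le 1$. Q: Why? A: Each row of P sums to 1, so for a right eigenvector v whose largest entry in absolute value is $v_i$, $|\lambda||v_i| = |\sum_k p_{ik} v_k| \le |v_i|$. Q: How fast does an (ergodic) chain converge to the stationary distribution? A: Exponentially, at a rate set by the second-largest eigenvalue.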
26
Speed of Sampling on this Network?
λ2 (the second-largest eigenvalue) is related to the (balanced) min-cut of the graph. The more "partitioned" a graph is into clusters with few links between them, the longer the convergence time of the respective Markov chain and the slower the random walk search.
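The effect of a bottleneck on convergence can be seen by comparing two small graphs. This is a sketch under assumptions: both 6-node graphs and the helper `tv_after` are hypothetical, and a lazy walk (stay put with probability 1/2) is used to rule out periodicity; it has the same stationary distribution k_x / 2|E|.

```python
def tv_after(adj, start, steps):
    """Total variation distance to pi(x) = k_x / 2|E| after `steps` lazy-walk steps."""
    nodes = sorted(adj)
    two_e = sum(len(adj[x]) for x in nodes)
    pi = {x: len(adj[x]) / two_e for x in nodes}
    p = {x: 1.0 if x == start else 0.0 for x in nodes}
    for _ in range(steps):
        nxt = {x: 0.5 * p[x] for x in nodes}           # lazy: stay put w.p. 1/2
        for x in nodes:
            for y in adj[x]:
                nxt[y] += 0.5 * p[x] / len(adj[x])     # otherwise move to a random neighbor
        p = nxt
    return 0.5 * sum(abs(p[x] - pi[x]) for x in nodes)

# Hypothetical graphs: a complete graph versus two triangles joined by a single edge.
complete = {x: [y for y in range(6) if y != x] for x in range(6)}
two_clusters = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
                3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}

tv_complete = tv_after(complete, 0, 10)
tv_clusters = tv_after(two_clusters, 0, 10)
print(tv_complete, tv_clusters)
```

After the same number of steps, the walk on the two-cluster graph is still far from stationarity, while the walk on the complete graph has essentially converged.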
27
Community Detection - Clustering
28
Device-to-Device Communication (e.g. Bluetooth or WiFi Direct)
29
Data/Malware Spreading Over Opportunistic Networks
Contact process: due to node mobility. Q: How long until X% of nodes are "infected"? [Figure: contact graph of mobile nodes.]
30
News/Videos on Online Social Networks
[Figure: interaction (post, share) between users i and j.] Contact/interaction: (random) times when user i posts/writes to user j, or user j checks out i's page. A "transfer" happens during a contact with probability p. Q: How long until a video goes "viral"?
31
Email Network
An email with a virus or worm. A graph showing which users send emails to whom. Pairwise contact process: (random) times of emails between i and j. Q: How long for the worm to spread?
32
Diffusion in networks: ER graphs
33
ER graphs: connectivity and density
[Figure: nodes infected after 10 steps, infection rate = 0.15. Panels: average degree = 2.5; average degree = 10.]
34
Quiz
Q: When the density of the network increases, diffusion in the network is: (a) faster, (b) slower, (c) unaffected.
35
Diffusion in “grown networks”
[Figure: nodes infected after 4 steps, infection rate = 1. Panels: preferential attachment; non-preferential growth.]
36
Quiz
Q: When nodes preferentially attach to high-degree nodes, the diffusion over the network is: (a) faster, (b) slower, (c) unaffected.
37
Diffusion in small worlds
What is the role of the long-range links in diffusion over small world topologies?
38
Quiz
Q: As the probability of rewiring increases, the speed with which the infection spreads: (a) increases, (b) decreases, (c) remains the same.
39
Analysis of Epidemics: The Usual Approach
Assumption 1) Underlay graph: fully meshed. Assumption 2) Contact process: Poisson(λij), independent. Assumption 3) Contact rate: λij = λ (homogeneous).
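Under these three assumptions the spreading delay has a simple closed form. The sketch below adds one further assumption that is not on the slide: every contact between an infected and a susceptible node transmits the infection. With k infected nodes there are k(N − k) such pairs, each meeting at rate λ, so the wait for the next infection is exponential with rate λ·k·(N − k). The function name and parameters are illustrative.

```python
def expected_infection_time(n_nodes, rate, target):
    """Expected time until `target` nodes are infected, starting from one infected node.

    Assumes a fully meshed graph, independent Poisson contacts at the same
    pairwise `rate`, and transmission on every infected-susceptible contact.
    """
    return sum(1.0 / (rate * k * (n_nodes - k)) for k in range(1, target))

# Hypothetical example: 100 nodes, pairwise contact rate 0.01 per unit time.
half = expected_infection_time(100, 0.01, 50)      # time until 50% are infected
everyone = expected_infection_time(100, 0.01, 100) # time until all are infected
print(half, everyone)
```

The sum counts one waiting period per new infection, so a question of the form "how long until X% of nodes are infected" only changes the upper limit `target`.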
40
Modeling Epidemic Spreading: Markov Chains (MC)
2-hop infection
41
How realistic is this?
[Figure: a real contact graph (ETH Wireless LAN trace) versus a Poisson graph.]
42
Arbitrary Contact Graphs
43
Bounding the Transition Delay
What are we really saying here? Let a = 3: how can we split the graph into a subgraph of 3 nodes and a subgraph of N−3 nodes by removing a set of edges whose total weight is minimum?
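For a small graph, the question can be answered by trying every subset of a nodes. The following is a brute-force sketch for illustration only: the weighted graph `weights` and the helper `min_cut_of_size` are assumptions, and the search is exponential in the graph size, so it is not meant for real contact traces.

```python
from itertools import combinations

def min_cut_of_size(weights, nodes, a):
    """Return (cut weight, subset) for the cheapest split into `a` and N - a nodes."""
    best_weight, best_subset = None, None
    for subset in combinations(nodes, a):
        inside = set(subset)
        # Sum the weights of edges with exactly one endpoint inside the subset.
        cut = sum(w for (u, v), w in weights.items() if (u in inside) != (v in inside))
        if best_weight is None or cut < best_weight:
            best_weight, best_subset = cut, inside
    return best_weight, best_subset

# Hypothetical weighted contact graph: edge (i, j) -> contact rate.
weights = {(0, 1): 3.0, (0, 2): 3.0, (1, 2): 3.0, (2, 3): 0.5,
           (3, 4): 2.0, (3, 5): 2.0, (4, 5): 2.0}

cut_weight, subset = min_cut_of_size(weights, range(6), 3)
print(cut_weight, subset)
```

In this example the cheapest split with a = 3 separates the two triangles by removing only the weak edge between nodes 2 and 3.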
44
A 2nd Bound on Epidemic Delay
Φ is a fundamental property of a graph, related to the graph spectrum and community structure.