
Thrasyvoulos Spyropoulos / Eurecom, Sophia-Antipolis: How many samples do we need to converge? How many Random Walk steps to get.


1 Thrasyvoulos Spyropoulos / spyropou@eurecom.fr, Eurecom, Sophia-Antipolis
• How many samples do we need to converge?
• How many Random Walk steps to get the plot on the right?

2
• Consider the standard M/M/1 chain
• Assume we start with K initial customers, K ≠ E[N]
Q: How long until convergence to π (stationary distribution)?
• Start on a sunny day
Q: How long until P(rainy) = π(rainy)?
[Figure: M/M/1 chain with states 0, 1, 2, …, upward transitions at rate λ and downward transitions at rate μ]

3
• Community Detection: Identify (sub)sets of nodes that are better connected to each other than to the rest of the network
• Not easy!
• Visually easy a posteriori, but at first the network on the right is just a large matrix
• Clustering/Machine Learning/Pattern Recognition

4
• If a Markov Chain (defined by transition matrix P) is ergodic (irreducible, aperiodic, and positive recurrent)
• then $P^{(n)}_{ik} \to \pi_k$ as $n \to \infty$, where $\pi = [\pi_1, \pi_2, \dots, \pi_n]$
Q: But how fast does the chain converge? E.g. how many steps until we are “close enough” to π?
A: This depends on the eigenvalues of P
The convergence time is also called the mixing time
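As a minimal sketch of what "close enough" can mean, the snippet below iterates the distribution of a two-state chain and counts the steps until its total variation distance to π drops below a threshold. The weather transition matrix, the threshold `eps`, and the helper names `step` and `steps_to_mix` are illustrative assumptions, not values from the slides.

```python
def step(dist, P):
    """One step of the chain: row vector dist times transition matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][k] for i in range(n)) for k in range(n)]


def steps_to_mix(P, start, pi, eps=1e-3, max_steps=10_000):
    """Steps until the total variation distance to pi drops below eps."""
    dist = list(start)
    for n in range(max_steps + 1):
        tv = 0.5 * sum(abs(a - b) for a, b in zip(dist, pi))
        if tv < eps:
            return n
        dist = step(dist, P)
    raise RuntimeError("did not mix within max_steps")


# Hypothetical weather chain (assumed numbers): state 0 = sunny, state 1 = rainy.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = [5 / 6, 1 / 6]  # solves pi = pi P for this particular P

# Start on a sunny day and count steps until we are within eps of pi.
n_mix = steps_to_mix(P, start=[1.0, 0.0], pi=pi)
print(n_mix)
```

Shrinking `eps` by a constant factor adds only a constant number of steps here, which is the exponential convergence discussed on the following slides.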

5 Left Eigenvectors
A row vector π is a left eigenvector for eigenvalue λ of matrix P iff πP = λπ, i.e. $\sum_k \pi_k p_{ki} = \lambda \pi_i$
Right Eigenvectors
A column vector v is a right eigenvector for eigenvalue λ of matrix P iff Pv = λv, i.e. $\sum_k p_{ik} v_k = \lambda v_i$
Q: What eigenvalues and eigenvectors can we guess already?
A: λ = 1 is a left eigenvalue with eigenvector π, the stationary distribution
λ = 1 is a right eigenvalue with eigenvector v = 1 (all 1s)
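The two "guessed" eigenvectors can be checked numerically. The sketch below uses a hypothetical 3-state transition matrix (an assumption, chosen so that π is easy to verify by hand) and confirms that πP = π and P·1 = 1.

```python
# Hypothetical 3-state chain (assumed numbers); every row sums to 1.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
pi = [0.25, 0.50, 0.25]   # candidate stationary distribution
ones = [1.0, 1.0, 1.0]    # candidate right eigenvector

n = len(P)
# Left eigenvector check: (pi P)_i = sum_k pi_k p_ki should equal pi_i.
left = [sum(pi[k] * P[k][i] for k in range(n)) for i in range(n)]
# Right eigenvector check: (P 1)_i = sum_k p_ik * 1 should equal 1.
right = [sum(P[i][k] * ones[k] for k in range(n)) for i in range(n)]

print(left)
print(right)
```

The right-eigenvector check is just the statement that each row of a stochastic matrix sums to 1.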

6
• Both sets have non-zero solutions ⇔ (P − λI) is singular
• There exists v ≠ 0 such that (P − λI)v = 0
• Determinant |P − λI| = 0
• For a two-state chain: $(p_{11} - \lambda)(p_{22} - \lambda) - p_{12} p_{21} = 0$
• $\lambda_1 = 1$, $\lambda_2 = 1 - p_{12} - p_{21}$ (substitute above and confirm using some algebra)
• $|\lambda_2| < 1$
(Eigenvectors normalized: $\pi^{(1)}$ to be a stationary distribution AND $v^{(i)} \cdot \pi^{(i)} = 1$, ∀i)
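The "some algebra" can be written out as follows. This is a sketch that only uses the fact that each row of P sums to 1, i.e. $p_{11} = 1 - p_{12}$ and $p_{22} = 1 - p_{21}$.

```latex
\begin{aligned}
0 &= (p_{11}-\lambda)(p_{22}-\lambda) - p_{12}p_{21} \\
  &= \lambda^2 - (p_{11}+p_{22})\,\lambda + p_{11}p_{22} - p_{12}p_{21} \\
  &= \lambda^2 - (2 - p_{12} - p_{21})\,\lambda + (1 - p_{12} - p_{21}) \\
  &= (\lambda - 1)\bigl(\lambda - (1 - p_{12} - p_{21})\bigr)
\end{aligned}
```

The third line uses $p_{11}p_{22} - p_{12}p_{21} = (1-p_{12})(1-p_{21}) - p_{12}p_{21} = 1 - p_{12} - p_{21}$, so the two roots are $\lambda_1 = 1$ and $\lambda_2 = 1 - p_{12} - p_{21}$.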

7
• Eigenvalue decomposition: $P = U \Lambda U^{-1}$
Q: What is $P^{(n)}$?
A: $P^{(n)} = P^n = U \Lambda^n U^{-1}$
Q: How fast does the chain converge to the stationary distribution?
A: It converges exponentially fast in n, as $(\lambda_2)^n$
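For the two-state chain the decomposition can be made explicit: $P^n = \mathbf{1}\pi + \lambda_2^n (I - \mathbf{1}\pi)$, so every entry of $P^n$ differs from its limit by a multiple of $\lambda_2^n$. The sketch below compares this closed form against repeated matrix multiplication; the values of `a` and `b` and the helper names are assumptions for illustration.

```python
def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matpow(P, n):
    """P multiplied by itself n times (P^0 is the identity)."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, P)
    return result


a, b = 0.3, 0.2                      # assumed p12 and p21
P = [[1 - a, a], [b, 1 - b]]
lam2 = 1 - a - b                     # second eigenvalue of the two-state chain
pi = [b / (a + b), a / (a + b)]      # stationary distribution


def closed_form(n):
    """P^n = 1*pi + lam2^n * (I - 1*pi), written out entry by entry."""
    c = lam2 ** n
    return [[pi[0] + pi[1] * c, pi[1] - pi[1] * c],
            [pi[0] - pi[0] * c, pi[1] + pi[0] * c]]


# Largest entrywise gap between the closed form and brute-force powers.
max_err = max(abs(closed_form(n)[i][j] - matpow(P, n)[i][j])
              for n in range(1, 21) for i in range(2) for j in range(2))
print(max_err)
```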

8
• We’ll assume that there are M distinct eigenvalues (see notes for repeated ones)
• Matrix P is stochastic ⇒ all eigenvalues satisfy $|\lambda_i| \le 1$
Q: Why?
A:
Q: How fast does an (ergodic) chain converge to the stationary distribution?
A: Exponentially, with rate given by the 2nd largest eigenvalue (in magnitude)
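One standard argument for the bound on the eigenvalues, given here as a sketch rather than as the answer intended on the slide: let $Pv = \lambda v$ with $v \neq 0$, and pick the index $i$ with the largest $|v_i|$.

```latex
|\lambda|\,|v_i|
  = \Bigl|\sum_k p_{ik} v_k\Bigr|
  \le \sum_k p_{ik}\,|v_k|
  \le |v_i| \sum_k p_{ik}
  = |v_i|
```

Dividing by $|v_i| > 0$ gives $|\lambda| \le 1$. The only properties used are $p_{ik} \ge 0$ and $\sum_k p_{ik} = 1$, i.e. that P is stochastic.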

9
• $\lambda_2$ (2nd largest eigenvalue) is related to the (balanced) min-cut of the graph
• The more “partitioned” a graph is into clusters with few links between them
  ⇒ the longer the convergence time for the respective MC
  ⇒ the slower the random walk search
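A small experiment illustrates the effect of a bottleneck. The sketch below runs a lazy random walk (stay put with probability 1/2, an assumption made to avoid periodicity) on two hypothetical 10-node graphs: a complete graph, and two 5-node cliques joined by a single edge. It counts the steps until the walk is within total variation distance 0.01 of its stationary distribution; the graphs, the threshold, and all function names are illustrative.

```python
def lazy_walk_matrix(adj):
    """Lazy random walk: stay with prob 1/2, else move to a uniform neighbor."""
    n = len(adj)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = 0.5
        for j in adj[i]:
            P[i][j] += 0.5 / len(adj[i])
    return P


def mixing_steps(adj, eps=0.01, max_steps=100_000):
    """Steps, starting from node 0, until TV distance to stationarity < eps."""
    n = len(adj)
    P = lazy_walk_matrix(adj)
    total_degree = sum(len(nb) for nb in adj)
    pi = [len(nb) / total_degree for nb in adj]   # stationary: degree / 2|E|
    dist = [0.0] * n
    dist[0] = 1.0
    for t in range(max_steps + 1):
        if 0.5 * sum(abs(x - y) for x, y in zip(dist, pi)) < eps:
            return t
        dist = [sum(dist[i] * P[i][k] for i in range(n)) for k in range(n)]
    raise RuntimeError("did not mix within max_steps")


def complete_graph(n):
    return [[j for j in range(n) if j != i] for i in range(n)]


def two_cliques(m):
    """Two m-cliques joined by a single edge between node m-1 and node m."""
    adj = [[j for j in range(m) if j != i] for i in range(m)]
    adj += [[j for j in range(m, 2 * m) if j != i] for i in range(m, 2 * m)]
    adj[m - 1].append(m)
    adj[m].append(m - 1)
    return adj


fast = mixing_steps(complete_graph(10))
slow = mixing_steps(two_cliques(5))
print(fast, slow)
```

Both graphs have the same number of nodes; the only structural difference is the single-edge cut between the two cliques.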


11 L = D − A, where D is the diagonal matrix with $d_{ii} = d_i$
[Figure: example graph with nodes 1 to 4 and its matrix L]

12 [Figure: example graph with nodes 1 to 4; labels 10, 0.3, 2, 4]

13
• So, zero is an eigenvalue of L
• If there are k connected components, zero is an eigenvalue with multiplicity k
• Fiedler (‘73) called the second-smallest eigenvalue of L the “algebraic connectivity of a graph”
• The further from 0, the more connected.

14 G(V,E): number of zeros in eig(L) = number of connected components
[Figure: example graph G(V,E) with nodes 1 to 7, its matrix L, and eig(L)]
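One direction of the "#zeros = #components" statement is easy to check by hand: the indicator vector of each connected component is mapped to zero by L = D − A, so k components give k linearly independent eigenvectors for eigenvalue 0. The sketch below verifies this on a hypothetical 7-node graph with two components (the edge list is an assumption, not the graph from the slide). It does not show the converse, that there are no further zero eigenvalues.

```python
from collections import deque


def laplacian(n, edges):
    """L = D - A for an undirected, unweighted graph on nodes 0..n-1."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L


def components(n, edges):
    """Connected components found by breadth-first search."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = [False] * n
    comps = []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        comp, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            comp.append(u)
            for w in adj[u]:
                if not seen[w]:
                    seen[w] = True
                    queue.append(w)
        comps.append(comp)
    return comps


# Hypothetical graph: a triangle {0,1,2} and a 4-cycle {3,4,5,6}.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (5, 6), (3, 6)]
n = 7
L = laplacian(n, edges)
comps = components(n, edges)

# For each component, apply L to its indicator vector x; L x should be zero.
null_vectors = []
for comp in comps:
    x = [1 if i in comp else 0 for i in range(n)]
    Lx = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
    null_vectors.append((x, Lx))
    print(x, Lx)
```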

15 G(V,E): number of zeros in eig(L) = number of connected components
[Figure: example graph G(V,E) with nodes 1 to 7, its matrix L, and eig(L); label 0.01]
• Indicates a “good cut”


19 Device-to-Device Communication (e.g. Bluetooth or WiFi Direct)

20 Data/Malware Spreading Over Opportunistic Networks
• Contact Process: Due to node mobility
• Q: How long until X% of nodes “infected”?
[Figure: mobile nodes A to F]

21 News/Videos on Online Social Networks
• Contact/Interaction: (random) times when user i posts/writes to user j, or user j checks out i’s page.
• “Transfer” during a contact with probability p
• Q: How long until a video goes “viral”?
[Figure: interaction (post, share) between users i and j]

22 Email Network
• An email with a virus or worm
• A graph showing which users send emails to whom
• Pairwise contact process: (random) times of emails between i and j
• Q: How long for the worm to spread?

23 Analysis of Epidemics: The Usual Approach
• Assumption 1) Underlay Graph: Fully meshed
• Assumption 2) Contact Process: Poisson($\lambda_{ij}$), independent
• Assumption 3) Contact Rate: $\lambda_{ij} = \lambda$ (homogeneous)
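Under these three assumptions the spreading time has a simple form. With k infected nodes there are k(N − k) infected/susceptible pairs, each meeting at rate λ, so the time until the next infection is exponential with rate λk(N − k). The sketch below sums the expected waiting times; it additionally assumes that every contact transfers the infection, and the function name and parameters are illustrative.

```python
import math


def expected_time_to_infect(n_nodes, lam, fraction, initial=1):
    """Expected time until at least `fraction` of n_nodes are infected.

    Assumes a fully meshed graph, independent Poisson contacts with a
    homogeneous pairwise rate `lam`, and transfer at every contact.
    """
    # Small tolerance guards against floating-point noise in fraction * n_nodes.
    target = max(initial, math.ceil(fraction * n_nodes - 1e-9))
    # With k infected nodes the next infection happens at rate lam * k * (N - k).
    return sum(1.0 / (lam * k * (n_nodes - k)) for k in range(initial, target))


# Example with assumed numbers: 100 nodes, one contact per pair per 50 time units.
for x in (0.5, 0.9, 1.0):
    print(x, expected_time_to_infect(100, lam=1 / 50, fraction=x))
```

The terms are smallest around k = N/2, so in this model most of the time is spent infecting the first few and the last few nodes.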

24 2-hop infection

25 [Figure panels: A Poisson Graph; A Real Contact Graph (ETH Wireless LAN trace)]


27 Bounding the Transition Delay
• What are we really saying here?
• Let a = 3: how can we split the graph into a subgraph of 3 nodes and a subgraph of N−3 nodes, by removing a set of edges whose weight sum is minimum?
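For a small graph the question can be answered by brute force: try every subset of size a and sum the weights of the edges crossing the split. The sketch below does exactly that on a hypothetical weighted graph (two tightly connected triangles joined by one weak edge). The function name and the example weights are assumptions, and the enumeration is exponential in general, so this is only meant to make the definition concrete.

```python
from itertools import combinations


def min_cut_of_size(weights, a):
    """Cheapest way to separate some set of exactly `a` nodes from the rest.

    `weights` maps undirected edges (u, v) to non-negative contact rates.
    Returns (cut weight, node set of size a).
    """
    nodes = sorted({u for edge in weights for u in edge})
    best_cut, best_set = float("inf"), None
    for subset in combinations(nodes, a):
        inside = set(subset)
        # An edge is cut when exactly one endpoint lies inside the subset.
        cut = sum(w for (u, v), w in weights.items()
                  if (u in inside) != (v in inside))
        if cut < best_cut:
            best_cut, best_set = cut, inside
    return best_cut, best_set


# Hypothetical contact graph: triangles {0,1,2} and {3,4,5}, one weak link.
weights = {(0, 1): 5, (1, 2): 5, (0, 2): 5,
           (3, 4): 5, (4, 5): 5, (3, 5): 5,
           (2, 3): 0.3}
cut, side = min_cut_of_size(weights, a=3)
print(cut, side)
```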

28
• Φ is a fundamental property of a graph
• Related to graph spectrum, community structure

29 Navigability: Decentralized Search of Large Networks

30 http://ccl.northwestern.edu/netlogo/models/run.cgi?GiantComponent.884.534

31 http://projects.si.umich.edu/netlearn/NetLogo4/RAndPrefAttachment.html


33 Source: http://maps.google.com

34 [Figure from Milgram's original article in Psychology Today]

35
• How to choose among hundreds of acquaintances?
• Strategy: simple greedy algorithm. Each participant chooses the correspondent who is closest to the target with respect to the given property
Models
• Geography: Kleinberg (2000)
• Hierarchical groups: Watts, Dodds, Newman (2001), Kleinberg (2001)
• High degree nodes: Adamic, Puniyani, Lukose, Huberman (2001), Newman (2003)

36 Erdos-Renyi (Poisson) Random Graphs are small-world!
• Consider the following simple search algorithm (to find a destination): “Decentralized Search”
1. I know all my neighbors and their location
2. At every step I move to my neighbor closest to the destination
Q: Does this greedy algorithm find short paths (i.e. O(log N) hops)?

37
• Cannot find short paths with local, greedy algorithms (even though they exist)
Q: What is the problem?
A: Even if y is closer than x to dest (in some embedded coordinate system)
• we have no expectations about y’s neighbors
• all its neighbors might be further than x
Q: How many steps, on average, to reach the destination?

38
• What about the Watts-Strogatz small-world model?
  • Regular links to k closest
  • Random rewiring with prob p
• Or this one?
  • Regular 2D lattice
  • Each node has an additional k random links (to any node)
Decentralized search cannot find short paths either!

39
Q: What is wrong this time?
A: We have two options
1. Set of “close” neighbors
2. Random shortcut
(from Milgram's original article in Psychology Today)

40
• Option 1: Close neighbors
  • Traverse up to k hops (constant)
• Option 2: Shortcuts
  • Traverse ~$n^{1/2}$ hops
Q: What happens when the remaining distance to the destination is < $n^{1/2}$?
A: Small probability that a shortcut will take us closer ⇒ need to follow lattice links only
Q: How many hops for this last part?
A: Order of $n^{1/2}/k$

41 Intuition:
1. Long-range links (shortcuts) do not go to any node with equal probability
2. The further away the node, the smaller the chance we “know” him
• Shortcuts: Prob(of link at distance d) ∼ $d^{-q}$
• Original model: q = 0
[Figure panels: small q; large q]

42
• How does decentralized search perform for different q values?
• q = 0 (random shortcuts): already saw it doesn’t work
Result: best performance for q = 2
Q: Why q = 2?
A: The number of shortcuts at different “scales” is constant
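The "constant per scale" claim can be sketched with a counting argument on the 2D lattice, up to constant factors. Here $Z$ denotes the normalization constant of the shortcut distribution and $n$ the number of nodes, so lattice distances range up to order $\sqrt{n}$; both are notational assumptions.

```latex
\begin{aligned}
\#\{v : d \le d(u,v) < 2d\} &= \Theta(d^2), \\
Z = \sum_{v \ne u} d(u,v)^{-2}
  &= \sum_{d=1}^{\Theta(\sqrt{n})} \Theta(d)\, d^{-2} = \Theta(\log n), \\
\Pr[\text{shortcut lands at distance in } [d, 2d)]
  &= \frac{\Theta(d^2)\cdot\Theta(d^{-2})}{Z} = \Theta\!\left(\frac{1}{\log n}\right).
\end{aligned}
```

The result does not depend on d: every distance scale receives roughly the same share of shortcuts. For q < 2 the far scales dominate, and for q > 2 the near scales do.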

43
• 1D lattice (i.e. ring)?
• Long-range link at distance d with prob. ∼ $d^{-q}$
• Optimal if q = 1
• Result: n-dimensional lattice ⇒ optimal q = n
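The ring version is easy to simulate. The sketch below gives every node its two ring neighbors plus one shortcut whose length d is drawn with weight d^(-q), then routes greedily: at each step move to whichever known contact is closest to the destination. Ring size, number of trials, the seed, and all function names are illustrative assumptions; the printed averages depend on them and are not results from the slides.

```python
import random


def ring_dist(u, v, n):
    """Hop distance between u and v on a ring of n nodes."""
    d = abs(u - v) % n
    return min(d, n - d)


def build_shortcuts(n, q, rng):
    """One shortcut per node; its length d is drawn with weight d**(-q)."""
    lengths = list(range(1, n // 2 + 1))
    weights = [d ** (-q) for d in lengths]
    shortcuts = []
    for u in range(n):
        d = rng.choices(lengths, weights=weights)[0]
        sign = rng.choice((-1, 1))        # clockwise or counter-clockwise
        shortcuts.append((u + sign * d) % n)
    return shortcuts


def greedy_route(src, dst, n, shortcuts):
    """Decentralized search: always move to the contact closest to dst."""
    hops, cur = 0, src
    while cur != dst:
        candidates = [(cur - 1) % n, (cur + 1) % n, shortcuts[cur]]
        cur = min(candidates, key=lambda v: ring_dist(v, dst, n))
        hops += 1
    return hops


def average_hops(n, q, trials, seed=0):
    rng = random.Random(seed)
    shortcuts = build_shortcuts(n, q, rng)
    total = 0
    for _ in range(trials):
        src, dst = rng.randrange(n), rng.randrange(n)
        total += greedy_route(src, dst, n, shortcuts)
    return total / trials


for q in (0, 1, 2):
    print(q, average_hops(n=1000, q=q, trials=200))
```

Because one ring neighbor always reduces the distance by one, the loop terminates after at most `ring_dist(src, dst, n)` hops; the shortcuts can only shorten the route.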

44
• (as before) move to direct neighbor, unless there is a shortcut leading closer
• Break the sequence of steps into phases
• Phase j ⇔ $2^j < \text{distance} \le 2^{j+1}$

45
Q: What is the total number of phases?
A: Not more than $\log_2 n$ (why?)
• $X_i$ = number of steps to finish phase i
• This is a random variable
• $E[X] = E[X_1 + X_2 + \dots + X_{\log n}]$
• Goal: Show that $E[X_i] \sim \log n$ steps

46
• Prob(link d hops far): P(d) ~ 1/d
• Need to normalize over the n nodes

47
Q: How many nodes are within distance d/2 of the destination?
A: N(d/2) = d/2 + d/2 = d
Q: What is the probability of a shortcut to one of them?
A: Furthest away at 3d/2
Q: How many steps to leave phase j?
A: $X_i \sim \text{Geometric}(p)$ with $p = (3 \log n)^{-1}$ ⇒ $E[X_i] \le 3 \log n$ ⇒ $E[X] \sim (\log n)^2$
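The value of p can be reconstructed as follows. This is a sketch for the ring with P(d) ~ 1/d; it assumes natural logarithms and the large-n approximation $Z \approx 2\ln n$ for the normalization constant, and $u$, $w$, $t$ (current node, candidate shortcut endpoint, destination) are notation introduced here.

```latex
\begin{aligned}
Z &= \sum_{v \ne u} \frac{1}{d(u,v)} = 2\sum_{d=1}^{n/2} \frac{1}{d} \approx 2 \ln n, \\
\Pr[u \to w] &= \frac{1/d(u,w)}{Z} \;\ge\; \frac{2/(3d)}{2\ln n} = \frac{1}{3d\,\ln n}
  \qquad \text{for every } w \text{ with } d(w,t) \le d/2, \\
p &\ge d \cdot \frac{1}{3d\,\ln n} = \frac{1}{3 \ln n}, \\
E[X_i] &\le \frac{1}{p} \le 3\ln n, \qquad
E[X] = \sum_{i=1}^{\log_2 n} E[X_i] = O\bigl((\log n)^2\bigr).
\end{aligned}
```

The second line uses that each of the d nodes within d/2 of the destination is at most 3d/2 away from the current node, which sits at distance d from the destination. Reaching any of them at least halves the remaining distance and ends the phase.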

48
• The average user will have ~2.5 non-geographic friends
• The other friends (5.5 on average) are distributed according to an approximate 1/distance relationship
• But 1/d was proved not to be navigable by Kleinberg, so what gives?

49
• $\delta = d(u,v)$: the distance between a pair of people
• The probability that two people are friends given their distance is $P(\delta) = \epsilon + f(\delta)$, where ε is a constant independent of geography
• ε is $5.0 \times 10^{-6}$ for LiveJournal users who are very far apart

50
• Kleinberg assumed a uniformly populated 2D lattice
• But population is far from uniform
• Population networks and rank-based friendship
• Probability of knowing a person depends not on absolute distance but on relative distance, i.e. how many people live closer: $\Pr[u \to v] \sim 1/\text{rank}_u(v)$
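Rank-based friendship is straightforward to compute for a small population. In the sketch below, rank_u(v) is taken to be v's position (starting at 1) when all other people are sorted by distance from u, and the link probability is proportional to 1/rank. The 1-based rank convention, the tie-breaking by sort order, the coordinates, and the function name are all assumptions for illustration.

```python
import math


def rank_based_probs(u, positions):
    """Pr[u -> v] proportional to 1 / rank_u(v).

    rank_u(v) is v's 1-based position when everyone except u is sorted by
    Euclidean distance from u (ties broken by the stable sort order).
    """
    others = [v for v in positions if v != u]
    others.sort(key=lambda v: math.dist(positions[u], positions[v]))
    weights = {v: 1.0 / (r + 1) for r, v in enumerate(others)}
    z = sum(weights.values())
    return {v: w / z for v, w in weights.items()}


# Hypothetical population with non-uniform density (assumed coordinates).
positions = {"u": (0, 0), "a": (1, 0), "b": (0, 3), "c": (10, 10)}
probs = rank_based_probs("u", positions)
print(probs)
```

Moving "c" ten times further away would not change these probabilities, since only the ordering matters, not the absolute distance.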


