
1 Coarse-Grained Topology Estimation via Graph Sampling. Maciej Kurant (1), Minas Gjoka (2), Yan Wang (2), Zack W. Almquist (2), Carter T. Butts (2), Athina Markopoulou (2). (1) ETHZ, (2) UC Irvine. 17 Aug 2012, WOSN

2 Coarse-grained topology. Nodes belong to different categories (A, B). Example categories: countries, universities, workplaces, religion, age, music genres, … www.facebook.com/notes/facebook-data-team/mapping-global-friendship-ties/ (19 March 2012)

3 Coarse-grained topology. Nodes belong to different categories (A, B). Edge weight: the number of edges between A and B? Not normalized by the size of the categories!

4 Coarse-grained topology. Nodes belong to different categories. A, B - all nodes labeled 'A' and 'B', respectively. Edge weight w(A,B): the probability that a random node in A is a neighbor of a random node in B, $w(A,B) = \frac{|E_{A,B}|}{|A| \cdot |B|}$, where $|E_{A,B}|$ counts all existing edges between A and B and $|A| \cdot |B|$ counts all possible edges between A and B.
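When the graph is fully known, the definition above (existing edges between A and B divided by all possible edges between them) can be computed directly. The following is a minimal sketch on a hypothetical toy graph; the function name `edge_weight` and the data are illustrative assumptions, and the two categories are assumed to be distinct.

```python
def edge_weight(edges, label, a, b):
    """Exact w(A,B) = |E_{A,B}| / (|A| * |B|) for two distinct categories."""
    size_a = sum(1 for c in label.values() if c == a)
    size_b = sum(1 for c in label.values() if c == b)
    # Count undirected edges with one endpoint in A and the other in B.
    e_ab = sum(1 for u, v in edges if {label[u], label[v]} == {a, b})
    return e_ab / (size_a * size_b)

# Hypothetical toy data: 2 nodes in A, 3 nodes in B.
label = {1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
edges = [(1, 3), (1, 4), (2, 5), (1, 2), (3, 4)]
w_ab = edge_weight(edges, label, "A", "B")  # 3 of 6 possible A-B edges exist
```

Because the count is divided by |A| * |B|, two large categories do not get a large weight merely because they contain many nodes.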

5 Facebook: 800+M users, 150 friends each (on average), 8 bytes (64 bits) per user ID. The raw connectivity data, with no attributes: 800M x 150 x 8 B = 960 GB. To get this data, one would have to download 200 TB of HTML data! This is neither feasible nor practical. Solution: sampling! Per-user fields: name, school / workplace, city or country (before 2010), list of friends.
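The size estimate on this slide is plain arithmetic and can be checked as follows. The variable names are illustrative, and the sketch assumes decimal gigabytes (10^9 bytes) and that each user's friend list is stored in full, so every friendship is counted from both endpoints.

```python
users = 800_000_000   # 800+M users (lower bound from the slide)
avg_friends = 150     # average number of friends per user
bytes_per_id = 8      # 64-bit user ID

raw_bytes = users * avg_friends * bytes_per_id
raw_gb = raw_bytes / 10**9  # decimal gigabytes
```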

6 Coarse-grained topology from a sample. UIS: Uniform Independence Sample. RW: Random Walk sample. From either sample of nodes in A and B we estimate the coarse-grained topology.

7 Coarse-grained topology from a sample. UIS: Uniform Independence Sample. RW: Random Walk sample, with sampling probability w(v) proportional to node degree. From either sample of nodes in A and B we estimate the coarse-grained topology.
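The two sampling designs can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' crawler: the graph is a hypothetical in-memory adjacency dict, UIS is taken to mean independent uniform draws with replacement, and RW is a simple random walk, whose stationary visiting probability on a connected, non-bipartite undirected graph is proportional to node degree.

```python
import random

def uniform_independence_sample(nodes, size, seed=0):
    """UIS: draw nodes independently and uniformly, with replacement."""
    rng = random.Random(seed)
    return [rng.choice(nodes) for _ in range(size)]

def random_walk_sample(adj, start, steps, seed=0):
    """RW: at every step move to a uniformly chosen neighbor."""
    rng = random.Random(seed)
    v = start
    sample = []
    for _ in range(steps):
        v = rng.choice(adj[v])
        sample.append(v)
    return sample

# Hypothetical toy graph as an adjacency dict.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
uis = uniform_independence_sample(list(adj), 10)
walk = random_walk_sample(adj, start=1, steps=10)
```

In the walk, node 3 (degree 3) is expected to appear about three times as often as node 4 (degree 1) in the long run, which is why RW samples need reweighting before they can be used for estimation.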

8 Estimating category size |A| (UIS, RW). Notation: N - number of nodes in the graph; A, B - all nodes labeled 'A' and 'B', respectively; S - all sampled nodes; S_A, S_B - nodes sampled in A and B, respectively; w(v) - sampling weight of node v (under RW, equal to the degree of v).

9 Estimating category size |A| (UIS, RW). Notation as on slide 8. This correction is essential!
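A minimal sketch of category-size estimators in this notation follows. The exact forms are an assumption (the standard ones for these designs), not copied from the slide: under UIS, N times the fraction of sampled nodes that fall in A; under RW, the same ratio with every sampled node reweighted by 1/w(v). Function names and the toy data are hypothetical.

```python
def category_size_uis(n, sample, label, a):
    """UIS: N * |S_A| / |S|."""
    s_a = sum(1 for v in sample if label[v] == a)
    return n * s_a / len(sample)

def category_size_rw(n, sample, label, weight, a):
    """RW: N * sum_{v in S_A} 1/w(v) / sum_{v in S} 1/w(v)."""
    num = sum(1.0 / weight[v] for v in sample if label[v] == a)
    den = sum(1.0 / weight[v] for v in sample)
    return n * num / den

# Hypothetical path graph 1-3-4-2: A = {1, 2}, B = {3, 4}.
label = {1: "A", 2: "A", 3: "B", 4: "B"}
degree = {1: 1, 2: 1, 3: 2, 4: 2}
# Idealized RW sample: every node appears in proportion to its degree.
rw_sample = [1, 2, 3, 3, 4, 4]

naive = category_size_uis(4, rw_sample, label, "A")            # ignores the bias
corrected = category_size_rw(4, rw_sample, label, degree, "A") # reweighted
```

On this toy sample the uncorrected estimate underestimates |A| = 2 because the low-degree nodes of A are visited less often, while the reweighted estimate recovers it.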

10 Estimating edge weights w(A,B) (induced; UIS, RW). Notation: N - number of nodes in the graph; A, B - all nodes labeled 'A' and 'B', respectively; S - all sampled nodes; S_A, S_B - nodes sampled in A and B, respectively; w(v) - sampling weight of node v (under RW, equal to the degree of v). True weight: all existing edges between A and B, divided by all possible edges between A and B. Estimate: all observed edges between A and B, divided by all edges we could have observed between A and B.

11 Estimating edge weights w(A,B) (induced; UIS, RW). Notation as on slide 10.
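Under induced subgraph sampling only edges between sampled nodes are visible, so the estimate is the ratio of observed A-B edges to the A-B pairs that could have been observed. The sketch below is an assumption about the form of these estimators, not a copy of the slide: with `weight=None` it is the plain UIS ratio, and with RW weights every sampled pair is reweighted by 1/(w(a) w(b)). Names and data are hypothetical, and the sample is assumed to contain at least one node from each category.

```python
def induced_weight(sample, label, edges, a, b, weight=None):
    """Estimate w(A,B) from the subgraph induced on the sampled nodes."""
    edge_set = {frozenset(e) for e in edges}
    w = (lambda v: 1.0) if weight is None else (lambda v: weight[v])
    s_a = [v for v in sample if label[v] == a]
    s_b = [v for v in sample if label[v] == b]
    # Observed edges between sampled A and B nodes, reweighted.
    num = sum(1.0 / (w(u) * w(v))
              for u in s_a for v in s_b
              if frozenset((u, v)) in edge_set)
    # Pairs between sampled A and B nodes that could have been observed.
    den = sum(1.0 / w(u) for u in s_a) * sum(1.0 / w(v) for v in s_b)
    return num / den

# Hypothetical path graph 1-3-4-2: A = {1, 2}, B = {3, 4}; true w(A,B) = 2/4.
label = {1: "A", 2: "A", 3: "B", 4: "B"}
edges = [(1, 3), (3, 4), (4, 2)]
degree = {1: 1, 2: 1, 3: 2, 4: 2}

w_uis = induced_weight([1, 2, 3, 4], label, edges, "A", "B")
w_rw = induced_weight([1, 2, 3, 3, 4, 4], label, edges, "A", "B", weight=degree)
```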

12 Estimating edge weights w(A,B) (star sampling; UIS, RW). Notation: N - number of nodes in the graph; A, B - all nodes labeled 'A' and 'B', respectively; S - all sampled nodes; S_A, S_B - nodes sampled in A and B, respectively; w(v) - sampling weight of node v (under RW, equal to the degree of v); E_{a,B} - all edges between node a and nodes in B.

13 Estimating edge weights w(A,B) (star sampling; UIS, RW). Notation as on slide 12.

14 Estimating edge weights w(A,B) (star sampling; UIS, RW). Notation as on slide 12.

15 Estimating edge weights w(A,B) (star sampling; UIS, RW). Notation as on slide 12.
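Under star sampling, every sampled node a also reveals all of its neighbors and their categories, so |E_{a,B}| is known exactly for each sampled a. One way to turn this into an estimate of w(A,B) is sketched below. The symmetric form (using sampled nodes from both A and B) and the requirement that the category sizes be supplied, whether known or estimated, are assumptions made for illustration and are not taken from the slide. With `weight=None` the sketch corresponds to UIS; with RW weights each term is divided by w(v).

```python
def star_weight(sample, label, adj, a, b, size_a, size_b, weight=None):
    """Estimate w(A,B) from stars (sampled nodes plus all their neighbors)."""
    w = (lambda v: 1.0) if weight is None else (lambda v: weight[v])
    num = 0.0
    den = 0.0
    for v in sample:
        if label[v] == a:
            # |E_{v,B}| observed; v could have had up to |B| such edges.
            num += sum(1 for u in adj[v] if label[u] == b) / w(v)
            den += size_b / w(v)
        elif label[v] == b:
            num += sum(1 for u in adj[v] if label[u] == a) / w(v)
            den += size_a / w(v)
    return num / den

# Hypothetical path graph 1-3-4-2: A = {1, 2}, B = {3, 4}; true w(A,B) = 2/4.
label = {1: "A", 2: "A", 3: "B", 4: "B"}
adj = {1: [3], 3: [1, 4], 4: [3, 2], 2: [4]}
degree = {v: len(nbrs) for v, nbrs in adj.items()}

w_uis = star_weight([1, 2, 3, 4], label, adj, "A", "B", 2, 2)
w_rw = star_weight([1, 2, 3, 3, 4, 4], label, adj, "A", "B", 2, 2, weight=degree)
```

Compared with the induced case, each sampled node contributes information about all of its edges, not only those to other sampled nodes.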

16 Estimators. Category size: under UIS and under RW. Edge weight: induced and star, each under UIS and under RW. We prove the consistency of all these estimators.

17 Performance evaluation

18 Fully known graph. Facebook: Texas. [Plot; x-axis: sample size |S|]

19 Online graph. Facebook online. [Plot; x-axis: sample size |S|] [swrw10] M. Kurant, M. Gjoka, C. T. Butts and A. Markopoulou, “Walking on a Graph with a Magnifying Glass”, SIGMETRICS 2011.

20 geosocialmap.com

21 geosocialmap.com

24 Public and private colleges in the USA (geosocialmap.com)

25 The world according to Facebook (geosocialmap.com)

26 Strong clusters among Middle Eastern countries: Egypt, Saudi Arabia, United Arab Emirates, Lebanon, Jordan, Israel.

27 Summary. From a UIS or RW sample of the original (unknown) topology, we estimate the coarse-grained topology over categories A, B. Consistent estimators under induced and star sampling. geosocialmap.com. More info: http://odysseas.calit2.uci.edu/osn. Kiitos (thank you)!

