Published by Myrtle Long. Modified over 9 years ago.
1
CMU SCS, KDD 2006, Leskovec & Faloutsos
2
Sampling from Large Graphs (poster #305)
Jurij (Jure) Leskovec, Christos Faloutsos
Carnegie Mellon University
3
Problems and recommendations
Q: How to sample from a large graph?
A: FF (Forest Fire), RN (random nodes)
Q: Which properties to preserve?
A: (at least) the 13 ones we list
Q: How to measure success/similarity?
A: K-S (Kolmogorov-Smirnov) D-statistic, towards the ‘back-in-time’ version
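The K-S answer above can be made concrete with a small sketch. The Kolmogorov-Smirnov D-statistic is the largest vertical gap between two empirical cumulative distribution functions. The function name `ks_d_statistic` and the degree sequences below are hypothetical and used only for illustration. The slides do not say how the authors computed it.

```python
from bisect import bisect_right


def ks_d_statistic(sample_a, sample_b):
    """Return D = max over x of |F_a(x) - F_b(x)| for two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both CDFs are step functions, so the largest gap occurs at an observed value.
    for x in set(a) | set(b):
        f_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        f_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(f_a - f_b))
    return d


# Hypothetical degree sequences: the full graph vs. a sample of it.
full_degrees = [1, 1, 2, 2, 3, 5, 8]
sample_degrees = [1, 2, 2, 3]
print(ks_d_statistic(full_degrees, sample_degrees))
```

D lies between 0 (identical distributions) and 1 (no overlap), so a smaller value means the sample's distribution is closer to the target's.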
4
Criteria
STATIC
  in-degree distribution; out-degree distribution
  distribution of WCC; SCC (weakly and strongly connected components)
  hop-plot; hop-plot for WCC
  distribution of first left singular vector values
  scree plot
  distribution of clustering coefficient
TEMPORAL
  densification power law
  shrinking diameter
  normalized size of largest connected component
  first eigenvalue
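As one example of a temporal criterion, the densification power law says that the number of edges grows as a power of the number of nodes, e(t) ∝ n(t)^a. A minimal sketch of estimating the exponent a is a least-squares fit on log-log axes. The function name `dpl_exponent` and the snapshot counts are hypothetical, and the slides do not describe the fitting procedure actually used.

```python
import math


def dpl_exponent(node_counts, edge_counts):
    """Least-squares slope of log(edges) against log(nodes) over graph snapshots."""
    if len(node_counts) != len(edge_counts) or len(node_counts) < 2:
        raise ValueError("need at least two (nodes, edges) snapshots of equal length")
    xs = [math.log(n) for n in node_counts]
    ys = [math.log(e) for e in edge_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx  # the slope is the estimated exponent a


# Hypothetical snapshots of a growing graph: (number of nodes, number of edges).
nodes = [100, 1_000, 10_000]
edges = [400, 6_000, 90_000]
print(dpl_exponent(nodes, edges))
```

An exponent above 1 means the graph densifies over time: the average degree grows as nodes are added.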
5
Targets
scale-down (= fewer nodes; same diameter, same degree, etc.)
back-in-time (match an earlier, real, smaller version of the graph)
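For the back-in-time target, a sample with n' nodes is compared against the real graph as it looked when it had about n' nodes. The sketch below recovers such an earlier snapshot. It assumes the edges are available in arrival order (oldest first), which the slides do not state. The function name `back_in_time_edges` is hypothetical.

```python
def back_in_time_edges(timed_edges, n_target):
    """Return the earliest edges whose endpoints span at most n_target nodes.

    timed_edges: iterable of (u, v) pairs, assumed sorted oldest first.
    """
    seen = set()
    kept = []
    for u, v in timed_edges:
        new_nodes = {u, v} - seen
        # Stop as soon as the next edge would push the graph past n_target nodes.
        if len(seen) + len(new_nodes) > n_target:
            break
        seen |= new_nodes
        kept.append((u, v))
    return kept, seen


# Hypothetical edge arrivals, oldest first.
history = [(1, 2), (2, 3), (3, 4), (1, 4), (4, 5)]
snapshot_edges, snapshot_nodes = back_in_time_edges(history, n_target=4)
print(snapshot_edges, snapshot_nodes)
```

The returned snapshot is the reference graph. The sample's properties would then be compared against it rather than against the final graph.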
6
Sampling Methods
RN: random nodes
RPN: PageRank random nodes
RDN: random nodes, degree-biased
RE: random edges
RNE
HYB (Hybrid)
RNN
RJ: random jump
RW: random walk
FF: Forest Fire
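Two of the methods listed above, RN and FF, can be sketched as follows. This is a simplified illustration, not the authors' implementation. The graph is assumed to be a dict mapping every node to a list of its out-neighbors. The parameter name `p_forward`, the function names, and the use of an explicit stack instead of recursion are assumptions made for this sketch. In the FF sketch, each burning node ignites a geometrically distributed number of its unburned neighbors, and a new random seed is picked whenever the fire dies out.

```python
import random


def random_node_sample(adj, k, rng):
    """RN: pick k nodes uniformly at random and keep the edges among them."""
    nodes = set(rng.sample(sorted(adj), k))
    return {u: [v for v in adj[u] if v in nodes] for u in nodes}


def forest_fire_sample(adj, k, p_forward, rng):
    """Simplified FF: burn outward from random seeds until k nodes are burned."""
    if not 0 <= p_forward < 1:
        raise ValueError("p_forward must be in [0, 1)")
    if k > len(adj):
        raise ValueError("k cannot exceed the number of nodes")
    all_nodes = sorted(adj)
    burned = set()
    while len(burned) < k:
        seed = rng.choice(all_nodes)  # restart at a random node when the fire dies
        if seed in burned:
            continue
        burned.add(seed)
        frontier = [seed]
        while frontier and len(burned) < k:
            u = frontier.pop()
            candidates = [v for v in adj[u] if v not in burned]
            rng.shuffle(candidates)
            # Geometric draw: keep adding links with probability p_forward,
            # giving p_forward / (1 - p_forward) links on average.
            n_burn = 0
            while rng.random() < p_forward:
                n_burn += 1
            for v in candidates[:n_burn]:
                if len(burned) >= k:
                    break
                if v not in burned:
                    burned.add(v)
                    frontier.append(v)
    return {u: [v for v in adj[u] if v in burned] for u in burned}


# Hypothetical toy graph: each node links to the next two nodes on a ring.
toy = {i: [(i + 1) % 20, (i + 2) % 20] for i in range(20)}
print(sorted(random_node_sample(toy, 5, random.Random(0))))
print(sorted(forest_fire_sample(toy, 5, 0.5, random.Random(0))))
```

RN tends to return scattered nodes with few edges among them, while FF returns nodes that are reachable from each other through burned links.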
7
4 Datasets
Arxiv (author-paper)
Citation (HEP-TH, HEP-PH)
A.S.
epinions.com
26K - 500K edges
8
[Plots: diameter vs. N; CC vs. degree]
9
[Plots: degree distribution; avg. CC vs. N]
10
[Plots: diameter; DPL]
11
[Plots: D-statistic vs. sample size; panels: scale-down, back-in-time; annotation: "better"]
12
Conclusions
random nodes + a little exploration -> FF (RN, RJ are close)
15% sample seems enough
back-in-time concept