The Ion P2P Project: Characterizing the Two-Tier Gnutella Topology
Daniel Stutzbach and Reza Rejaie, University of Oregon

1. Motivation
- Most large file-sharing Peer-to-Peer (P2P) applications with millions of users are based on unstructured, two-tier overlay topologies.
- Characterizing the unstructured overlays of these applications is important for their design and evaluation.
- Characterizing P2P overlays requires capturing accurate, fine-grained snapshots of the overlays.
- Snapshots (as graphs) are captured with a crawler, which records peers (as nodes) and connections (as edges).
- Snapshots captured by a slow crawler can be distorted by 1) dynamic changes in the overlay during the crawl and 2) peers that the crawler cannot reach.
- Previous studies are outdated, used slow crawlers (1 or 2 hours), and did not examine the accuracy of their captured snapshots.

2. Two-Tier Topologies
- Gnutella, FastTrack, and eDonkey use two-tier overlay topologies.
- Top-level peers form the core overlay.
- Each leaf connects to a few top-level peers.
- Our initial study focuses on Gnutella.

3. Approach
- We developed a parallel and tunable crawler, Cruiser.
- Cruiser increases crawling speed by:
  - using a master-slave architecture
  - crawling many peers in parallel
  - leveraging the two-tier topology
- Cruiser captures an accurate Gnutella snapshot with 1 million peers in around 7 minutes (140k peers/min).

4. Results
- Peer degree is fairly homogeneous, not power-law.
  - Most top-level peers have a degree around 30 (Fig. 1).
  - Below 30, degree is nearly uniformly distributed (Fig. 1).
  - Previous studies reported a power law.
  - Our prior work shows that slow crawling can erroneously produce a power-law degree distribution (Fig. 2).
- Degree distributions between tiers are also fairly homogeneous.
  - The number of leaves per top-level peer is similar across top-level peers; version differences cause different spikes (Fig. 3).
  - Most leaf peers have a very low degree, but a small number have a high degree (Fig. 4).
- Despite exponential growth in size, overlay path lengths in Gnutella are very short (Fig. 5 and Fig. 6).
  - 60% of top-level paths are exactly four hops long.
  - 99.5% of top-level paths are five hops or fewer.
  - Leaf-to-leaf paths are 1 or 2 hops longer, on average.
- Gnutella is not power-law, but it is still a small world.
  - Path lengths are close to those of same-size random graphs.
  - The top-level overlay is not tightly clustered (clustering coefficient 0.018).
  - However, it is much more clustered than same-size random graphs (0.018 >> [value missing in source]).

Small World Properties
Graph            Path Length   Path Length of Random   Clustering Coefficient   CC of Random
Modern Gnutella  4.17          —
Older Gnutella   3.30          —
Movie Actors
Power Grid
(The remaining cell values are missing from the source.)

Figures
Fig. 1: Top-to-Top Degree Distribution
Fig. 2: Inaccuracy of Slow Crawling
Fig. 3: Top-to-Leaf Degree Distribution
Fig. 4: Leaf-to-Top Degree Distribution
Fig. 5 and Fig. 6: Path Length Distribution

5. Ongoing Work
- Characterizing and modeling the dynamics of overlay topologies: 1) peer churn, 2) edge churn
- Developing an overlay topology generator for simulation
- Characterizing file distribution and query workload
- Characterizing Kademlia-based DHTs
- Examining performance bottlenecks in BitTorrent
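
As a rough illustration of the small-world comparison in the Results section, the sketch below computes the two metrics involved (mean path length and clustering coefficient) for a graph stored as an adjacency dictionary, and compares them with the standard random-graph approximations ln(n)/ln(k) and k/n. It is not Cruiser or the authors' analysis code. The function names, the synthetic random graph, and the parameters (2,000 nodes, average degree 30) are all assumptions chosen for illustration; no Gnutella data is used.

```python
import math
import random
from collections import deque


def make_random_graph(n, avg_degree, seed=0):
    """Build an undirected random graph with n*avg_degree/2 distinct edges.

    Hypothetical stand-in for a crawled top-level overlay snapshot.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    target_edges = n * avg_degree // 2
    edges = 0
    while edges < target_edges:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj


def clustering_coefficient(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges among the neighbors; each edge is seen from both ends.
        links = sum(1 for a in nbrs for b in adj[a] if b in nbrs) / 2
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def mean_path_length(adj, sources):
    """Average BFS hop count from each sampled source to every reachable node."""
    total, count = 0, 0
    for s in sources:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        count += len(dist) - 1  # exclude the source itself
    return total / count


def random_graph_estimates(n, avg_degree):
    """Textbook approximations for a same-size random graph:
    path length ~ ln(n)/ln(k), clustering coefficient ~ k/n."""
    return math.log(n) / math.log(avg_degree), avg_degree / n


if __name__ == "__main__":
    n, k = 2000, 30  # illustrative sizes only, not measured Gnutella values
    graph = make_random_graph(n, k)
    est_len, est_cc = random_graph_estimates(n, k)
    print("measured path length:", mean_path_length(graph, range(50)))
    print("measured clustering: ", clustering_coefficient(graph))
    print("random-graph estimates:", est_len, est_cc)
```

To apply the same comparison to a crawled snapshot, the adjacency dictionary would be built from the recorded top-level peers and connections instead of `make_random_graph`. Because k/n shrinks as the graph grows, a modest clustering coefficient such as 0.018 can still be far above the random baseline when the overlay has many more peers than the average degree. The ln(n)/ln(k) figure is only a coarse estimate, so sampling BFS sources on a generated random graph of the same size gives a more reliable baseline.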