Analyzing Peer-to-Peer Traffic Across Large Networks Jia Wang Joint work with Subhabrata Sen AT&T Labs - Research.


1 Analyzing Peer-to-Peer Traffic Across Large Networks Jia Wang Joint work with Subhabrata Sen AT&T Labs - Research

2 P2P applications
Distributed file sharing
  - Napster, Gnutella, FastTrack, eDonkey, DirectConnect…
  - Searching vs. data-fetching phases
  - All communication occurs over default ports
  - SuperNodes and hubs
Why is this interesting?
  - Large and growing traffic volume

3 Outline
Methodology
  - Data collection
  - Characterization metrics
Analysis results
  - Traffic volume and overlay topology
  - System dynamics
  - Traffic characterization
P2P vs. Web

4 Methodology
Challenges
  - Decentralized system
  - Transient peer membership
  - Some popular protocols are closed and proprietary
Large-scale passive measurement
  - Flow-level data from routers across a large tier-1 ISP backbone
  - Analyze both signaling and data-fetching traffic
  - 3 levels of granularity: IP, prefix, AS
P2P protocols (default ports)
  - FastTrack: 1214 (including Morpheus)
  - Gnutella: 6346/6347
  - DirectConnect: 411/412
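The port-based identification listed on this slide can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name `classify_flow` and the idea of passing the two port numbers of a flow record are assumptions; only the port-to-system mapping comes from the slide.

```python
# Default ports taken from the slide. Everything else here is illustrative.
P2P_PORTS = {
    1214: "FastTrack",       # includes Morpheus, per the slide
    6346: "Gnutella",
    6347: "Gnutella",
    411: "DirectConnect",
    412: "DirectConnect",
}


def classify_flow(src_port, dst_port):
    """Return the P2P system name if either port is a known default port.

    Returns None when neither port matches, i.e. the flow is not
    attributed to any of the three systems.
    """
    for port in (src_port, dst_port):
        if port in P2P_PORTS:
            return P2P_PORTS[port]
    return None


if __name__ == "__main__":
    print(classify_flow(51234, 1214))   # a flow to the FastTrack default port
    print(classify_flow(80, 50000))     # not matched by any default port
```

Checking both ports matters because a flow record may describe either direction of a connection.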

5 Methodology discussion
Advantages
  - Requires minimal knowledge of P2P protocols: only the port number
  - Large-scale, non-intrusive measurement
  - More complete view of P2P traffic
  - Allows localized analysis
Limitations
  - Flow-level data: no application-level details
  - Incomplete traffic flows
Other issues
  - DHCP, NAT, proxy: a host is not the same as an IP address
  - Asymmetric IP routing

6 Measurements
Characterization
  - Overlay network topology
  - Traffic distribution
  - Dynamic behavior
Metrics
  - Host distribution
  - Host connectivity
  - Traffic volume
  - Mean bandwidth usage
  - Traffic pattern over time
  - Connection duration and on-time

7 Data cleaning
Invalid IPs
  - 10.0.0.0-10.255.255.255
  - 172.16.0.0-172.31.255.255
  - 192.168.0.0-192.168.255.255
No matched prefixes in routing tables
Invalid AS numbers
  - > 64512
Removed 4% of flows
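The cleaning rules on this slide could be applied to flow records roughly as below. This is a sketch under stated assumptions: the flow record layout (a dict with `src_ip`, `dst_ip`, `src_as`, `dst_as`) and the `routed_prefixes` list are hypothetical, and the linear prefix scan stands in for whatever routing-table lookup the authors actually used.

```python
import ipaddress

# The three address ranges from the slide, written as CIDR blocks.
INVALID_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),       # 10.0.0.0-10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16.0.0-172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),   # 192.168.0.0-192.168.255.255
]


def is_valid_ip(ip, routed_prefixes):
    """True if ip is outside the invalid ranges and matches a routed prefix."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in INVALID_NETS):
        return False
    return any(addr in prefix for prefix in routed_prefixes)


def is_valid_asn(asn):
    """The slide treats AS numbers > 64512 as invalid."""
    return asn <= 64512


def keep_flow(flow, routed_prefixes):
    """Keep a flow only if both endpoints pass the IP and AS checks.

    `flow` is a hypothetical dict with src_ip, dst_ip, src_as, dst_as.
    """
    return (
        is_valid_ip(flow["src_ip"], routed_prefixes)
        and is_valid_ip(flow["dst_ip"], routed_prefixes)
        and is_valid_asn(flow["src_as"])
        and is_valid_asn(flow["dst_as"])
    )
```

A real routing table would call for a longest-prefix-match structure rather than a list scan; the list keeps the sketch short.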

8 Overview of P2P traffic
Total: 800 million flow records
FastTrack is the most popular system
Date (2001)               9/10-9/15   10/9-10/13   12/10-12/16
# flows                   111M        184M         341M
# IPs                     3.4M        4.5M         5.9M
# IPs / day               1M          1.5M         1.9M
Total traffic (GB/day)    773         1153         1776
Traffic per IP (MB/day)   1.6         1.6          1.8

9 Host distribution

10 Host connectivity
Connectivity is very small for most hosts and very high for a few hosts
The distribution is less skewed at the prefix and AS levels
FastTrack (9/14/2001)
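One way to compute a connectivity metric from flow records is sketched below. It assumes that connectivity means the number of distinct peers a host exchanges flows with; the slide does not spell out the definition, and the `(src, dst)` tuple format is hypothetical.

```python
from collections import defaultdict


def connectivity(flows):
    """Map each endpoint to the number of distinct peers it communicates with.

    `flows` is an iterable of (src, dst) pairs. The endpoints can be IP
    addresses, prefixes, or AS numbers, depending on the granularity.
    """
    peers = defaultdict(set)
    for src, dst in flows:
        peers[src].add(dst)
        peers[dst].add(src)
    return {endpoint: len(others) for endpoint, others in peers.items()}


if __name__ == "__main__":
    sample = [("a", "b"), ("a", "c"), ("b", "a")]
    print(connectivity(sample))
```

Mapping each IP to its prefix or AS before calling the function gives the coarser-grained view mentioned on the slide.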

11 Traffic volume distribution
Significant skews in traffic volume across granularities
  - A few entities source most of the traffic
  - A few entities receive most of the traffic
FastTrack (9/14/2001)

12 Mean bandwidth usage
Upstream usage < downstream usage. Possible causes:
  - Asymmetric available bandwidth, e.g., DSL, cable
  - Users/ISPs rate-limiting upstream data transfers
FastTrack (9/14/2001)

13 Time-of-day effect
  - Traffic volume exhibits a very strong time-of-day effect
  - Milder time-of-day variation for the number of hosts in the system
FastTrack (9/14/2001 GMT)

14 Host connection duration & on-time
  - Substantial transience: most hosts stay in the system for a short time
  - The distribution is less skewed at the prefix and AS levels
  - Using per-cluster or per-AS indexing/caching nodes may help
FastTrack (9/14/2001), threshold (thd) = 30 min
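The slide does not define how on-time is derived from flow data. The sketch below shows one plausible reading, stated here as an assumption: a host is treated as having left the system once it is idle for longer than the threshold (30 minutes on the slide), and on-time is the summed length of the resulting active periods. The function name and the use of flow timestamps in seconds are illustrative.

```python
def on_time(timestamps, thd=30 * 60):
    """Total active time in seconds for one host.

    ASSUMPTION: consecutive flows separated by more than `thd` seconds
    belong to different active periods; on-time is the sum of the
    period lengths. This definition is not taken from the slide.
    """
    if not timestamps:
        return 0
    ordered = sorted(timestamps)
    total = 0
    period_start = previous = ordered[0]
    for t in ordered[1:]:
        if t - previous > thd:
            # Idle gap exceeded the threshold: close the current period.
            total += previous - period_start
            period_start = t
        previous = t
    total += previous - period_start
    return total


if __name__ == "__main__":
    # Two active periods: 0-600 s and 10000-10300 s.
    print(on_time([0, 600, 10000, 10300]))
```

Under this reading, a smaller threshold splits activity into more, shorter periods and so lowers the reported on-time.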

15 Traffic characterization
The power law
  - May not be a suitable model for P2P traffic
Relationship between metrics
  - Traffic volume
  - Number of IPs
  - On-time
  - Mean bandwidth usage

16 Traffic volume vs. on-time
FastTrack (9/14/2001): top 1% hosts (73% volume)
1. Volume heavy hitters tend to have long on-times
2. Hosts with short on-times contribute small traffic volumes

17 Connectivity vs. on-time
FastTrack (9/14/2001): top 1% hosts (73% volume)
1. Hosts with high connectivity have long on-times
2. Hosts with short on-times communicate with few other hosts

18 P2P vs. Web
Observations
  - 97% of prefixes contributing P2P traffic also contribute Web traffic
  - Heavy-hitter prefixes for P2P traffic tend to be heavy hitters for Web traffic
Prefix stability: the daily traffic volume (in %) from the prefix does not change over days
Experiments: 0.01%, 0.1%, 1%, 10% heavy hitters => 10%, 30%, 50%, 90% of the traffic volume
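The heavy-hitter figures quoted above are shares of total volume contributed by the top fraction of prefixes. A minimal sketch of that calculation follows; the function name and the rule of rounding the cutoff up to at least one entity are assumptions, and the sample volumes are made up for illustration only.

```python
import math


def top_share(volumes, fraction):
    """Share of total volume contributed by the top `fraction` of entities.

    `volumes` holds one traffic volume per entity (for example per prefix).
    The number of heavy hitters is rounded up and is at least one.
    """
    ranked = sorted(volumes, reverse=True)
    k = max(1, math.ceil(fraction * len(ranked)))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


if __name__ == "__main__":
    # Made-up volumes for ten prefixes, not data from the study.
    sample = [50, 20, 10, 10, 5, 5, 0, 0, 0, 0]
    print(top_share(sample, 0.1))
```

Comparing this share for the same set of prefixes across several days is one way to examine the prefix stability described on the slide.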

19 Traffic stability
March 2002 (figure panels: top 0.01% prefixes; top 1% prefixes)
P2P traffic contributed by the top heavy-hitter prefixes is more stable than either Web or total traffic

20 Summary
Measure and characterize P2P traffic across a large network
Three popular P2P systems
  - Significant increase in both number of users and traffic volume
  - Traffic distributions are highly skewed
  - High level of system dynamics
  - P2P is a significant but stable component of Internet traffic

21 Acknowledgement
AT&T Labs
  - Matt Grossglauser, Carsten Lund, Jennifer Rexford, Matt Roughan, Fred True
External
  - Steve Gribble

