Analyzing Peer-To-Peer Traffic Across Large Networks. Subhabrata Sen, Member, IEEE, and Jia Wang, Member, IEEE. Group members: 李英宗 (d96725004), 林慶和 (d95725005). June 2009.


1 Analyzing Peer-To-Peer Traffic Across Large Networks. Subhabrata Sen, Member, IEEE, and Jia Wang, Member, IEEE. Group members: 李英宗 (d96725004), 林慶和 (d95725005). June 15, 2009. IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 12, NO. 2, APRIL 2004

2 ACN 2009 Authors Subhabrata Sen received the B.Eng. degree in computer science from Jadavpur University, India, in 1992, and the M.S. and Ph.D. degrees in computer science from the University of Massachusetts, Amherst, in 1997 and 2001, respectively. Jia Wang received the B.S. degree in computer science from the State University of New York, Binghamton, in 1996, and the M.S. and Ph.D. degrees in computer science from Cornell University, Ithaca, NY, in 1999 and 2001, respectively. Both are currently members of the Internet and Networking Systems Research Center at AT&T Labs–Research in Florham Park, NJ. Their research interests include network measurement, routing and topology analysis, traffic flow measurement, overlay networks and applications, network security and anomaly detection, Web performance, content distribution networks, and other Internet-related research. Dr. Sen and Dr. Wang are members of the Association for Computing Machinery (ACM).

3 Introduction
- Motivation and goals
  - P2P applications are used for distributed file sharing
  - Their large and growing traffic volume impacts the underlying network
  - Goal: characterize P2P behavior in order to understand how these systems impact the network
  - Goal: gain insights into developing P2P systems with superior performance
- Previous research
  - Focused almost exclusively on P2P signaling traffic
  - Set up P2P crawlers on the Internet, using an "active probing" approach
- Early version
  - Based on data from the edge networks
  - Provides a view of local P2P usage
- This work provides a complementary "backbone view"
  - From a large tier-1 ISP
  - Data gathered at multiple border routers across the ISP

4 Outline
- Methodology
- Characterization metrics
- View and analysis results
- P2P vs Web

5 Methodology
- Popular P2P applications
  - Three systems: Gnutella, FastTrack, DirectConnect
  - All decentralized and self-organizing
  - Data and index information distributed over peers
  - Transient peer membership
- Measurement approach
  - Large-scale passive measurement
  - Flow-level data gathered from routers across a large tier-1 ISP's backbone
  - Analyzes both signaling and data traffic
  - Three levels of granularity: IP address, network prefix, autonomous system (AS)
  - Data collected using Cisco's NetFlow
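As a rough illustration of the port-based, flow-level approach described on this slide, the sketch below tags flow records with a P2P system and sums bytes per source IP. The port numbers are the commonly cited defaults for these systems and are an assumption here, as are the tuple layout of a flow record and the names `P2P_PORTS`, `classify_flow`, and `volume_by_source`; none of them come from the slides.

```python
from collections import defaultdict

# Assumed default ports per system (not stated on the slides);
# peers configured to use other ports would be missed.
P2P_PORTS = {
    6346: "Gnutella", 6347: "Gnutella",
    1214: "FastTrack",
    411: "DirectConnect", 412: "DirectConnect",
}

def classify_flow(src_port, dst_port):
    """Return the P2P system name if either port matches, else None."""
    return P2P_PORTS.get(src_port) or P2P_PORTS.get(dst_port)

def volume_by_source(flows):
    """Sum bytes per (system, source IP).

    Each flow is a hypothetical tuple:
    (src_ip, dst_ip, src_port, dst_port, n_bytes).
    """
    totals = defaultdict(int)
    for src_ip, dst_ip, src_port, dst_port, n_bytes in flows:
        system = classify_flow(src_port, dst_port)
        if system is not None:
            totals[(system, src_ip)] += n_bytes
    return dict(totals)
```

The same aggregation could be repeated at the prefix or AS level by replacing the source IP in the key with the prefix or AS it maps to.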

6 Methodology
- Advantages
  - Requires little knowledge of the P2P protocols: only port numbers
  - Non-intrusive measurement
  - Easier than running a crawler
  - More complete view of P2P traffic
  - Allows localized analysis
- Limitations
  - Flow-level data only; no application-level details
  - May not capture the complete flow

7 Characterization Metrics
- Characterization
  - Topology: host distribution, application-level overlay
  - Traffic distribution: downstream and upstream
  - Dynamic behavior: how frequently hosts join and leave the system, how long a host stays…

8 Characterization Metrics
- Metrics
  - Host distribution
  - Traffic volume
  - Host connectivity
  - Traffic pattern over time
  - Connection duration and on-time
- Data cleaning
  - Invalid IP addresses: 10.x.x.x/8, x.x/13, x.x/16
  - No matched prefix in routing tables
  - Invalid AS numbers (>64512)
  - Result: 4% of flow records removed
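A minimal sketch of the data-cleaning step, under stated assumptions. The slide's list of invalid address blocks is partly garbled, so the code substitutes the RFC 1918 private ranges as a stand-in; the AS threshold is the one quoted on the slide. The record layout (a dict with `src_ip`, `dst_ip`, `src_as`, `dst_as`) and the function names are hypothetical, and the routing-table prefix check is omitted.

```python
import ipaddress

# Assumption: RFC 1918 private ranges stand in for the slide's
# partly garbled list of invalid address blocks.
INVALID_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
AS_LIMIT = 64512  # threshold quoted on the slide

def is_invalid_ip(ip):
    """True if the address falls inside one of the assumed invalid blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in INVALID_NETS)

def clean(records):
    """Drop records with an invalid IP or AS; return (kept, removed_fraction)."""
    kept = [
        r for r in records
        if not (
            is_invalid_ip(r["src_ip"]) or is_invalid_ip(r["dst_ip"])
            or r["src_as"] > AS_LIMIT or r["dst_as"] > AS_LIMIT
        )
    ]
    removed = 1 - len(kept) / len(records) if records else 0.0
    return kept, removed
```

Returning the removed fraction makes it easy to compare a cleaning pass against the 4% figure reported on the slide.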

9 Overview of P2P traffic
- Table I: NetFlow data set of P2P traffic over TCP
- Total: around 800 million flow records

10 Host distribution
Fig. 2. Host density: the distribution of the hosts participating in three P2P systems per day (y-axis is in log scale).

11 Traffic volume distribution
Fig. 3. Cumulative distribution of traffic volume associated with IP addresses ranked in decreasing order of volume, for September 14, 2001 (x-axis is in log scale). Aggregate traffic observed for FastTrack on this day was 960 GB.
- Significant skews in traffic volume across granularities
- A few entities source or receive most of the traffic
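The kind of skew shown in this figure can be computed from per-host volumes as sketched below. This is an illustrative sketch, not the authors' code; `cumulative_distribution` and `top_share` are hypothetical names, and the input is assumed to be a plain list of byte counts, one per IP address (or prefix, or AS).

```python
def cumulative_distribution(volumes):
    """Cumulative fraction of total volume, entities ranked by decreasing volume."""
    ranked = sorted(volumes, reverse=True)
    total = sum(ranked)
    running, out = 0, []
    for v in ranked:
        running += v
        out.append(running / total if total else 0.0)
    return out

def top_share(volumes, fraction):
    """Share of total volume contributed by the top `fraction` of entities."""
    ranked = sorted(volumes, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # always count at least one entity
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0
```

Plotting `cumulative_distribution` against rank on a log-scale x-axis gives a curve of the same form as the figure.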

12 Host connectivity
Fig. 5. Cumulative distribution of network connectivity at the IP and network prefix (PR) levels, for hosts participating in FastTrack on September 14
- Connectivity is very small for most hosts and very high for a few hosts
- Distribution is less skewed at the prefix and AS levels
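One way to compute connectivity at two granularities is sketched below, taking connectivity to mean the number of distinct peers an entity exchanges traffic with (an assumed reading of the metric). The fixed-length /24 mapping in `to_prefix` is only a stand-in: the study maps addresses to prefixes via routing tables, which this sketch does not attempt. All names are hypothetical.

```python
import ipaddress
from collections import defaultdict

def to_prefix(ip, prefix_len=24):
    """Stand-in prefix mapping: fixed-length mask, NOT a routing-table lookup."""
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))

def connectivity(pairs, key=lambda ip: ip):
    """Count distinct peers per entity.

    `pairs` is an iterable of (ip_a, ip_b) communicating endpoints;
    `key` maps an IP to the aggregation level (identity = IP level).
    """
    peers = defaultdict(set)
    for a, b in pairs:
        ka, kb = key(a), key(b)
        if ka != kb:  # ignore traffic that stays inside one entity
            peers[ka].add(kb)
            peers[kb].add(ka)
    return {k: len(v) for k, v in peers.items()}
```

Calling `connectivity(pairs)` gives IP-level counts and `connectivity(pairs, key=to_prefix)` gives prefix-level counts; peers that share a prefix collapse into one entry at the coarser level.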

13 Time of day effect
Fig. 6. Distribution of number of IP addresses and traffic volume across hours in FastTrack on September 14, 2001 (GMT). (a) The traffic volume transferred in each bin. (b) The number of unique IP addresses, network prefixes, and ASes that are active in each bin.

14 Host connection duration & on-time
- Substantial transience: most hosts stay in the system for a short time
- Distribution is less skewed at the prefix and AS levels
- FastTrack (9/14/2001), threshold (thd) = 30 min
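The slide gives only the threshold value, so the sketch below rests on an assumed definition of on-time: a host's flow intervals are merged whenever the idle gap between them is at most the threshold, and on-time is the total length of the merged intervals. The function name and the (start, end) interval format in seconds are hypothetical.

```python
def on_time(intervals, threshold=30 * 60):
    """Total on-time in seconds, bridging idle gaps of up to `threshold` seconds.

    `intervals` is an iterable of (start, end) timestamps for one host.
    Assumed definition: gaps no longer than the threshold count as on-time.
    """
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_start is None:
            cur_start, cur_end = start, end
        elif start - cur_end <= threshold:
            cur_end = max(cur_end, end)  # extend the current session
        else:
            total += cur_end - cur_start  # close the session, start a new one
            cur_start, cur_end = start, end
    if cur_start is not None:
        total += cur_end - cur_start
    return total
```

A larger threshold merges more activity into one session and so lengthens the measured on-time, which is why the threshold is reported alongside the results.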

15 Mean bandwidth usage
Fig. 9. Cumulative distribution of the mean upstream and downstream bandwidth usage of hosts participating in FastTrack and DirectConnect on September 14, 2001 (x-axis is in log scale). (a) FastTrack. (b) DirectConnect.
- Upstream < downstream (ADSL, rate limiting)

16 Traffic Characterization
- P2P traffic does not fit power-law distributions well
- Relationships between measures
  - Traffic volume
  - Number of IPs
  - On-times
  - Mean bandwidth usage

17 The power laws
Fig. 10. Rank-frequency plots of the P2P metrics for FastTrack on September 14, 2001: (a) overall host connectivity; (b) host connectivity for the top 10% IP addresses; (c) traffic volume of the top 10% IP addresses; (d) on-time of the top 10% IP addresses (both x-axis and y-axis are in log scale).
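A rank-frequency plot follows a power law when it is close to a straight line on log-log axes. The sketch below fits such a line by ordinary least squares and reports the slope and R². It is a crude screening check under that assumption, not the analysis used in the paper, and a high R² alone does not establish a power law. The function name `loglog_fit` is hypothetical, and it expects at least two positive values.

```python
import math

def loglog_fit(values):
    """Least-squares line through (log10 rank, log10 value).

    Returns (slope, r_squared). Non-positive values are dropped
    because their logarithm is undefined.
    """
    ranked = sorted((v for v in values if v > 0), reverse=True)
    xs = [math.log10(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log10(v) for v in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, r_squared
```

Fitting separately to all hosts and to the top 10% of hosts, as the figure panels do, shows whether a straight line describes only part of the ranking.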

18 Relationships: traffic volume vs on-time, connectivity, bandwidth
- Volume heavy hitters are likely to have long on-times; hosts with short on-times contribute small traffic volumes
- A host communicating with many others may transmit only a small amount of traffic; a host communicating with few others can still source significant traffic
- Volume heavy hitters are likely to have large bandwidths; hosts with small bandwidths contribute small traffic volumes

19 Traffic volume vs on-time, connectivity, bandwidth
Fig. 11. FastTrack data set for September 14, 2001, top 1% IP addresses ranked by volume of data sent out. Scatter plots (log-log scale): (a) upstream volume versus upstream on-time; (b) upstream volume versus number of unique upstream IP addresses that an IP address connects to; (c) upstream volume versus average upstream bandwidth of an IP address.

20 Connectivity, on-time, bandwidth
- Hosts with high connectivity have long on-times; hosts with short on-times communicate with few other hosts
- Hosts with high upstream bandwidths have low connectivity counts; hosts that send traffic to many others span a range of bandwidths, but none has the highest bandwidths
- Hosts with low upstream bandwidths have very long on-times (possibly downloading large files or acting as SuperNodes)

21 Connectivity, on-time, bandwidth
Fig. 12. FastTrack data set for September 14, 2001, top 1% IP addresses ranked by volume of data sent out. Scatter plots (log-log scale): (a) number of unique upstream IP addresses that a host connects to versus total upstream on-time of the IP address; (b) number of unique upstream IP addresses versus average upstream bandwidth; (c) average upstream bandwidth versus total upstream on-time.

22 P2P vs Web
- 97% of prefixes contributing P2P traffic also contribute Web traffic
- Heavy-hitter prefixes for P2P traffic tend to be heavy hitters for Web traffic
- P2P traffic contributed by the top heavy-hitter prefixes is more stable than either Web or total traffic
- The top 0.01%, 0.1%, 1%, and 10% heavy hitters contribute 10%, 30%, 50%, and 90% of the traffic volume, respectively

23 P2P vs Web
Fig. 13. Cumulative distribution of the traffic volume changes for top heavy-hitter prefixes. (a) Top 0.01% prefixes. (b) Top 1% prefixes.

24 Summary
- The analysis covers both signaling and data traffic
  - Complements previous work on Gnutella
- Significant increase in both traffic volume and number of users
- The traffic volume generated by individual hosts is extremely variable
  - Fewer than 10% of IPs account for 99% of the traffic volume
- Traffic distributions are extremely skewed
  - This holds for traffic volume, connectivity, on-time, and average bandwidth usage
  - But they do not strictly obey power laws

25 Summary
- All three P2P systems exhibit a high level of system dynamics
  - But only a small fraction of hosts are persistent over long time periods
- P2P is a significant but stable component of Internet traffic
  - More stable than Web traffic or overall traffic
- Application-specific layer-3 traffic engineering is a promising way to manage the P2P workload in an ISP's network