1 Reading Report 4 Yin Chen 26 Feb 2004 Reference: Peer-to-Peer Architecture Case Study: Gnutella Network, Matei Ruoeanu, In Int. Conf. on Peer-to-Peer.

Slides:



Advertisements
Similar presentations
Peer-to-Peer and Social Networks An overview of Gnutella.
Advertisements

INF 123 SW ARCH, DIST SYS & INTEROP LECTURE 12 Prof. Crista Lopes.
Predicting Tor Path Compromise by Exit Port IEEE WIDA 2009December 16, 2009 Kevin Bauer, Dirk Grunwald, and Douglas Sicker University of Colorado Client.
Routing: Cores, Peers and Algorithms
1 An Overview of Gnutella. 2 History The Gnutella network is a fully distributed alternative to the centralized Napster. Initial popularity of the network.
Search in Power-Law Networks Presented by Hakim Weatherspoon CS294-4: Peer-to-Peer Systems Slides also borrowed from the following paper Path Finding Strategies.
LightFlood: An Optimal Flooding Scheme for File Search in Unstructured P2P Systems Song Jiang, Lei Guo, and Xiaodong Zhang College of William and Mary.
Denial-of-Service Resilience in Peer-to-Peer Systems D. Dumitriu, E. Knightly, A. Kuzmanovic, I. Stoica and W. Zwaenepoel Presenter: Yan Gao.
Gnutella 2 GNUTELLA A Summary Of The Protocol and it’s Purpose By
Peer-to-Peer Networks João Guerreiro Truong Cong Thanh Department of Information Technology Uppsala University.
Evaluation of Ad hoc Routing Protocols under a Peer-to-Peer Application Authors: Leonardo Barbosa Isabela Siqueira Antonio A. Loureiro Federal University.
Rheeve: A Plug-n-Play Peer- to-Peer Computing Platform Wang-kee Poon and Jiannong Cao Department of Computing, The Hong Kong Polytechnic University ICDCSW.
Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.
1 Denial-of-Service Resilience in P2P File Sharing Systems Dan Dumitriu (EPFL) Ed Knightly (Rice) Aleksandar Kuzmanovic (Northwestern) Ion Stoica (Berkeley)
Responder Anonymity and Anonymous Peer-to-Peer File Sharing. by Vincent Scarlata, Brian Levine and Clay Shields Presentation by Saravanan.
Building Low-Diameter P2P Networks Eli Upfal Department of Computer Science Brown University Joint work with Gopal Pandurangan and Prabhakar Raghavan.
1 Unstructured Routing : Gnutella and Freenet Presented By Matthew, Nicolai, Paul.
Gnutella, Freenet and Peer to Peer Networks By Norman Eng Steven Hnatko George Papadopoulos.
presented by Hasan SÖZER1 Scalable P2P Search Daniel A. Menascé George Mason University.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
Secure Overlay Services Adam Hathcock Information Assurance Lab Auburn University.
Improving Data Access in P2P Systems Karl Aberer and Magdalena Punceva Swiss Federal Institute of Technology Manfred Hauswirth and Roman Schmidt Technical.
Lecture Week 3 Introduction to Dynamic Routing Protocol Routing Protocols and Concepts.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
Presentation by Manasee Conjeepuram Krishnamoorthy.
P2P File Sharing Systems
Freenet: A Distributed Anonymous Information Storage and Retrieval System Presentation by Theodore Mao CS294-4: Peer-to-peer Systems August 27, 2003.
Freenet. Anonymity  Napster, Gnutella, Kazaa do not provide anonymity  Users know who they are downloading from  Others know who sent a query  Freenet.
Peer-to-Peer Computing CS587x Lecture Department of Computer Science Iowa State University.
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
Introduction Widespread unstructured P2P network
P2P Architecture Case Study: Gnutella Network
Peer to Peer Network Anas Hardan. What is a Network? What is a Network? A network is a group of computers and other devices (such as printers) that are.
Developing Analytical Framework to Measure Robustness of Peer-to-Peer Networks Niloy Ganguly.
1 Telematica di Base Applicazioni P2P. 2 The Peer-to-Peer System Architecture  peer-to-peer is a network architecture where computer resources and services.
Introduction of P2P systems
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
Jonathan Walpole CSE515 - Distributed Computing Systems 1 Teaching Assistant for CSE515 Rahul Dubey.
Peer-to-Pee Computing HP Technical Report Chin-Yi Tsai.
Mapping the Gnutella Network Presented By: Tony Young M.Math Candidate October 7th, 2004.
03/19/02Scalab Seminar Series1 Mapping the Gnutella Network Macroscopic Properties of Large Scale P2P Systems Ramaswamy N.Vadivelu Scalab, ASU.
Adaptive Web Caching CS411 Dynamic Web-Based Systems Flying Pig Fei Teng/Long Zhao/Pallavi Shinde Computer Science Department.
A Peer-to-Peer Approach to Resource Discovery in Grid Environments (in HPDC’02, by U of Chicago) Gisik Kwon Nov. 18, 2002.
Peer Pressure: Distributed Recovery in Gnutella Pedram Keyani Brian Larson Muthukumar Senthil Computer Science Department Stanford University.
FastTrack Network & Applications (KaZaA & Morpheus)
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
P2PComputing/Scalab 1 Gnutella and Freenet Ramaswamy N.Vadivelu Scalab.
An overview of Gnutella
LightFlood: An Efficient Flooding Scheme for File Search in Unstructured P2P Systems Song Jiang, Lei Guo, and Xiaodong Zhang College of William and Mary.
ADVANCED COMPUTER NETWORKS Peer-Peer (P2P) Networks 1.
Peer to Peer Network Design Discovery and Routing algorithms
Peer-to-peer systems (part I) Slides by Indranil Gupta (modified by N. Vaidya)
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
Mapping the Gnutella Network: Properties of Large-Scale Peer-to-Peer Systems and Implications for System Design Authors: Matei Ripeanu Ian Foster Adriana.
Peer-to-Peer (P2P) Networks By Bongju Yu. Contents  What is P2P?  Features of P2P systems  P2P Architecture  P2P Protocols  P2P Projects  Reference.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
09/13/04 CDA 6506 Network Architecture and Client/Server Computing Peer-to-Peer Computing and Content Distribution Networks by Zornitza Genova Prodanoff.
Distributed Caching and Adaptive Search in Multilayer P2P Networks Chen Wang, Li Xiao, Yunhao Liu, Pei Zheng The 24th International Conference on Distributed.
CS 425 / ECE 428 Distributed Systems Fall 2015 Indranil Gupta (Indy) Peer-to-peer Systems All slides © IG.
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 24 – Introduction to Peer-to-Peer (P2P) Systems Klara Nahrstedt (presented by Long Vu)
A Survey of Peer-to-Peer Content Distribution Technologies Stephanos Androutsellis-Theotokis and Diomidis Spinellis ACM Computing Surveys, December 2004.
An example of peer-to-peer application
BitTorrent Vs Gnutella.
An overview of Gnutella
Peer-to-Peer and Social Networks
Presentation by Theodore Mao CS294-4: Peer-to-peer Systems
Unstructured Routing : Gnutella and Freenet
Presentation transcript:

1 Reading Report 4 Yin Chen 26 Feb 2004 Reference: Peer-to-Peer Architecture Case Study: Gnutella Network, Matei Ruoeanu, In Int. Conf. on Peer-to-Peer Computing (P2P2001), Linkoping, Sweden, August

2 P2P Systems Functionality of P2P system  Share CPU cycles, i.e., Entropia,  Share storage space, i.e., FreeNet, Gnutella  Support interpersonal collaborative environments, i.e., Groove Become more popular  Low cost and high availability of huge computing and storage resources  Increased network connectivity Design Goals of P2P File Sharing Systems  Ability to operate in a dynamic environment  Performance and scalability (aggregate storage size, response time, availability, query throughput)  Reliability  Anonymity

3 Gnutella Protocol Description Gnutella nodes  Called servents by developers.  Perform tasks as servers and clients.  Provide client-side interface to users.  Have functionalities as : issue queries, display search results, accept queries from other servents, check for matches against their local data set, send back response, manage background traffic. Join the system  A new node connects to one of several known hosts that are almost always available (i.e. gnutellahosts.com).  Once attached to the network, nodes send messages to interact with each other.  A node periodically PINGs its neighbors to discover other participating nodes.  Nodes decide where to connect based only on local information, and thus form a dynamic, self-organizing network of independent entities.

4 Gnutella Protocol Description Gnutella Message  Can be broadcasted  Can be back-propagated : send on a specific connection on the reverse of the path taken by an initial, broadcasted, message.  Protocol features: Each message has a randomly generated identifier Each node keeps a short memory of the recently routed messages, used to prevent re- broadcasting and implement back-propagation Messages are flagged with time-to-live (TTL) and “hops passed” fields  Message types Group Membership (PING / PONG) Messages: A new node broadcasts a PING message to announce its presence. Each receiving node forwards PING message to its neighbors and back-propagates a PONG message, which contains information (its IP address, the number / size of shared files). Search (QUERY / QUERY RESPONSE) Messages : QUERY messages contain a user specified search string and be broadcasted. Each receiving node matches against locally stored file names, and back-propagates QUERY RESPONSE message which includes information necessary to download a file. File Transfer (GET / PUSH) Messages : To download files.

5 Data Collection Developed a Crawler join the network as a servent, to collect the data. Crawler working mechanism  The crawler initiates a TCP connection to each node in a list.  Newly discovered neighbors are added to the list.  Collects IP address, port, the number of files and total space shared of neighbors.  Start with small group of nodes, but grow to more than 400,000 nodes. Distributed crawling strategy:  A client/ server architecture: The server is responsible with managing the list of nodes to be contacted, assembling the final graph and assigning work to clients. The clients receive a small list of initial points and discover the network topology around these points. Used 50 clients  The crawling time reduced to a couple of hours. (Compare to sequential crawling strategy, which used 50 hours searching time, even for a small network (4000 nodes)

6 Gnutella Network Analysis Gnutella Network Growth The growth of the Gnutella network in the past 6 months ( From 11/2000 ~ 2(3)/2001) (Figure 1)  Grew about 25 times in 6 months In 11/2000, 2,063 hosts In 3/2001,14,949 hosts  Reason for this: Careful engineering The network connectivity of Gnutella participation machines improved significantly. Sending nodes with low available bandwidth at the edges of the network eventually paid off. The number of connected components is relatively small:  The largest connected components always includes more that 95% of the active nodes discovered.  The second biggest connected component usually has less than 10 nodes. Dynamic graph structure  40% of the nodes leave the network in less than 4 hours  25% of the nodes are alive for more than 24 hours

7 Gnutella Network Analysis Gnutella Generated Traffic Classified Gnutella generated traffic in 11/2000, according to message type (Figure 2). The volume of generated traffic is the main obstacle for better scaling and wider deployment. 36% of the total traffic was user-generated traffic (QUERY messages). 55% of the traffic was pure overhead (PING / PONG ) 9% of the traffic contained either bogus messages (1%) or PUSH messages that were broadcasted by servents that were not fully compliant with the latest version of the protocol. Newer Gnutella improved this: 92% QUERY messages, 8% PING messages …

8 Gnutella Network Analysis Shared Data Distribution Correlation between the amount of data shared and the number of links a node maintains. (Figure 3)  Most nodes share few files and maintain only a couple of connections  A small group provides more that half of the information shared, while other group provides most of the essential connectivity. Power-law data distribution  The number of nodes sharing k files is, where c is a constant in the range 0.8 to  The power-law distribution observed makes the network extremely dependent on the information provided by the largest: The largest 1% of all nodes provide 30% of the files available. The largest 10% of nodes provide 71% of the files available.

9 Gnutella Network Analysis Power-Law Network Many natural networks such as molecules in a cell, species in an ecosystem, and people in a social group organize themselves as so called power-law networks. In power-law networks most nodes have few links and a tiny number of hubs have a large number of links. Node connectivity follows a power law distribution. i.e., the fraction of node with L links is proportional to, where k is a network dependent constant. Strong similarities with Zipf’s distributions. A large number of man-made and naturally occurring phenomena including incomes, word frequencies, earthquake magnitudes, web-cache object popularity and even Gnutella queries are distributed according to a Zipf’s law distribution. Can explain why networks ranging from metabolisms to ecosystems to the Internet are generally highly stable and resilient :Since most nodes (molecules, Internet routers, Gnutella servents) are sparsely connected, little depends on them : a large fraction can be taken away and the network stays connected. But if just a few highly connected nodes are eliminated, the whole system could crash. These networks are extremely robust when facing random node failures, but vulnerable to well-planned attacks.

10 Gnutella Network Analysis Node Connectivity and Network Topology Analysis The connectivity distribution in 11/2000 (Figure 6) : can easily recognize a power-law distribution. The connectivity distribution in 3/2001 (Figure 7) : recent networks move away from this organization : too few nodes with low connectivity to form a pure power-law network. The power-law distribution is preserved for nodes with more than 10 links while nodes with few links follow an almost constant distribution. The work believes the more uniform connectivity distribution preserves the network capability to deal with random node failures while reducing the network dependence on highly connected, easy to single out (and attack) nodes.

11 Gnutella Network Analysis Distribution of Node-To-Node Shortest Paths While the network scaled up in size about 50 times, the average node-to-node distance grew only 26% :  The average is 4.24 in 11/2000  The average is 5.35 in 3/2001

12 End …