By Jonathan Drake

• The Gnutella protocol is simply not scalable
• This is due to the flooding approach it currently uses
• As the number of nodes increases, this approach causes a steep increase in bandwidth usage (the messages generated per query grow exponentially with the hop count)
• DHTs are one alternative that has been considered, but they perform poorly on multi-keyword searches
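To make the cost of flooding concrete, here is a rough sketch that counts the messages a single flooded query can generate. It assumes every node has the same number of neighbors and forwards the query to all of them except the sender, and it ignores duplicate suppression, so it gives an upper bound. The function name and the sample values (degree 4, TTL 7) are illustrative assumptions, not figures from the slides.

```python
def flood_messages(degree: int, ttl: int) -> int:
    """Upper bound on messages sent for one flooded query.

    Assumes a uniform node degree and no duplicate suppression
    (both are simplifications made for illustration).
    """
    total = 0
    frontier = degree  # the originator sends to all of its neighbors
    for _ in range(ttl):
        total += frontier
        # every receiver forwards to all neighbors except the sender
        frontier *= degree - 1
    return total


if __name__ == "__main__":
    for ttl in (1, 3, 5, 7):
        print(ttl, flood_messages(4, ttl))
```

Each extra hop multiplies the message count by roughly the node degree, which is why raising the TTL to reach more of a growing network gets expensive so quickly.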

• A DHT indexes specific keywords, which suits queries that search for a specific needle in the haystack
• In reality, most P2P searches use multiple keywords
• Users tend to look for general results, where any of several files could satisfy their needs
• A DHT would be great for finding one specific file, or one entry among thousands, but that’s not the typical case

• Random walking is one solution, but unfortunately it takes a lot of hops and doesn’t guarantee it will find all the results the user wants
• Random walking selects a random peer node to query at each step and may end up missing results the user wants unless it runs for long periods of time, making it no better than flooding
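A minimal sketch of an unbiased random walk, to show why hop counts can be high and results can be missed. The graph representation (a dict of neighbor lists), the function name, and the `max_hops` cutoff are assumptions made for this example.

```python
import random


def random_walk(graph, start, holders, max_hops, rng):
    """Walk from `start`, picking a uniformly random neighbor each hop.

    graph:   dict mapping node -> list of neighbor nodes
    holders: set of nodes that hold a matching file
    Returns the hop count at which a holder was reached, or None if the
    walk gave up after `max_hops` hops.
    """
    current = start
    for hop in range(1, max_hops + 1):
        current = rng.choice(graph[current])
        if current in holders:
            return hop
    return None


if __name__ == "__main__":
    ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
    print(random_walk(ring, 0, {4}, max_hops=50, rng=random.Random(1)))
```

Because the next hop is chosen blindly, the walk may wander back and forth for a long time before it reaches a node that holds a match, and a single walk returns at most the matches it happens to pass.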

• Supernodes work better, but they still consume considerable resources and bandwidth because flooding (broadcasting) still takes place within the supernode mesh
• This can cause failures among supernodes, and it still doesn’t scale well when a file may exist only on a regular edge node

• Sure!
• The idea is that random walking costs less than flooding, but it still picks only a random node to forward the query to
• Nodes are not identical; some have more resources than others, so why not take advantage of this?
• That’s what GIA proposes: a query should be forwarded to the node that’s least overloaded and has the most available bandwidth

• Dynamic topology adaptation: choose neighbors that have high capacity, so queries are passed to nodes that can handle them
• Active flow control: when a node gets overloaded it hands out fewer tokens for queries, so that it isn’t overwhelmed
• One-hop replication: keep an index of the files on all neighbors to help speed up querying
• Search protocol: direct queries to the node with the highest capacity

• Topology adaptation in GIA chooses neighbors based on their overall capacity and current number of neighbors
• When a node gets a connection request from another node, it accepts it outright only if it has room for another neighbor
• If it doesn’t, it still favors the new node and drops an existing neighbor: among the neighbors with lower capacity than the newcomer, the one that has the most neighbors. The reasoning is that the dropped node has the least to lose
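The accept-or-drop decision described above can be sketched as follows. This is a simplified reading of the slide, not the full algorithm from the paper. The `Peer` type, the `handle_connect` name, the `max_neighbors` limit, and the choice to reject the newcomer when no lower-capacity neighbor exists are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    capacity: float
    num_neighbors: int


def handle_connect(neighbors, newcomer, max_neighbors):
    """Decide whether to accept `newcomer` as a neighbor.

    Returns (accepted, dropped), where `dropped` is the existing
    neighbor to disconnect, or None if nobody has to go.
    """
    if len(neighbors) < max_neighbors:
        return True, None  # there is room, so just accept

    # Candidates to drop: neighbors no more capable than the newcomer.
    lower = [p for p in neighbors if p.capacity <= newcomer.capacity]
    if not lower:
        return False, None  # assumption: reject if everyone is stronger

    # Drop the candidate with the most neighbors; it has the least to lose.
    victim = max(lower, key=lambda p: p.num_neighbors)
    return True, victim


if __name__ == "__main__":
    current = [Peer("a", 10, 5), Peer("b", 100, 2), Peer("c", 10, 9)]
    print(handle_connect(current, Peer("n", 50, 1), max_neighbors=3))
```

Dropping the best-connected weak neighbor keeps the damage small: that node already has plenty of other links to fall back on.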

• Tokens are assigned to neighbors based on their capacity (rather than uniformly)
• A node needs a token from a neighbor before it can send that neighbor a query
• Allocation can start out uniform, but tokens that nodes don’t use can be redistributed to other nodes until the allocation is weighted towards capacity
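As a rough illustration of capacity-weighted allocation, the sketch below splits a node's token budget among its neighbors in proportion to their capacities. It is a plain proportional split chosen for clarity, not the scheduling scheme GIA actually uses, and the function name and sample numbers are assumptions.

```python
def allocate_tokens(capacities, token_rate):
    """Split `token_rate` tokens per interval among neighbors by capacity.

    capacities: dict mapping neighbor name -> advertised capacity
    Returns a dict mapping neighbor name -> tokens per interval.
    """
    total = sum(capacities.values())
    if total <= 0:
        raise ValueError("at least one neighbor needs positive capacity")
    return {name: token_rate * cap / total for name, cap in capacities.items()}


if __name__ == "__main__":
    print(allocate_tokens({"slow": 1, "fast": 3}, token_rate=8))
```

An overloaded node can lower `token_rate` to throttle all of its neighbors at once while still favoring the high-capacity ones.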

• Replicate the contents of your neighbors in an index, so that when a query arrives you can respond with their file matches as well
• When a neighbor leaves, the node removes that neighbor’s information from the index
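One way to picture one-hop replication is a small index keyed by owner, as in the sketch below. The class and method names are hypothetical, and matching is done by simple case-insensitive substring checks, which is an assumption rather than anything the slides specify.

```python
class OneHopIndex:
    """Index of a node's own files plus those of its direct neighbors."""

    def __init__(self, own_name, own_files):
        self.files_by_owner = {own_name: set(own_files)}

    def add_neighbor(self, name, files):
        # Called when a neighbor connects and shares its file list.
        self.files_by_owner[name] = set(files)

    def remove_neighbor(self, name):
        # Called when a neighbor leaves; forget its entries.
        self.files_by_owner.pop(name, None)

    def search(self, keywords):
        """Return sorted (owner, filename) pairs matching every keyword."""
        words = [w.lower() for w in keywords]
        return sorted(
            (owner, filename)
            for owner, files in self.files_by_owner.items()
            for filename in files
            if all(w in filename.lower() for w in words)
        )


if __name__ == "__main__":
    index = OneHopIndex("me", ["blue song.mp3"])
    index.add_neighbor("n1", ["Blue Sky.mp3", "red.mp3"])
    print(index.search(["blue"]))
```

With this index a node can answer on behalf of its neighbors, so a query that reaches a well-connected node effectively searches that node's whole neighborhood in one step.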

• Searching is essentially a biased random walk
• Each node sends the query to the highest-capacity neighbor it has tokens for (otherwise it’s queued for later)
• Bookkeeping is done with GUIDs to make sure we don’t follow redundant paths
• TTL is used to end the query if it’s taking too long
• MAX_RESPONSES is the total number of responses that should be retrieved before sending results back
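The next-hop choice in the biased walk can be sketched like this. The data layout (a dict of `(capacity, tokens)` pairs) and the `already_sent` set, which stands in for the per-GUID bookkeeping, are assumptions made for this example.

```python
def next_hop(neighbors, already_sent):
    """Pick the neighbor to forward a query to.

    neighbors:    dict mapping name -> (capacity, tokens_available)
    already_sent: names this query (identified by its GUID) was already
                  forwarded to, so the same path is not followed twice
    Returns the chosen neighbor's name, or None if the query has to be
    queued until a token arrives.
    """
    candidates = [
        (capacity, name)
        for name, (capacity, tokens) in neighbors.items()
        if tokens > 0 and name not in already_sent
    ]
    if not candidates:
        return None
    return max(candidates)[1]  # highest capacity wins


if __name__ == "__main__":
    nbrs = {"a": (10, 1), "b": (100, 0), "c": (50, 2)}
    print(next_hop(nbrs, set()))
```

In the example, `b` has the highest capacity but no tokens to spare, so the query goes to `c` instead; this is how flow control steers load away from busy nodes.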

• You want a 90% success rate
• The Collapse Point (CP) is the per-node query rate beyond which the success rate drops below 90%; you can see that just over 10 is the Collapse Point
• More replication makes things easier

• A higher CP and lower hop counts are preferred
• As the replication rate increases, CP increases and hop count decreases
• GIA wins!

• The authors thought of that and did some comparisons

• Yes, but GIA scales to multiple responses with no issues
• They even found a proportional relationship between MAX_RESPONSES and the replication factor!

• You can achieve even capacities by allowing nodes to replicate files and not just index the checksum and location (one-hop replication)
• I’m sure the RIAA and MPAA love this idea…

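• Satisfaction levels are used to decide when to keep looking for higher-capacity neighbors
• $I = T \times K^{-(1-S)}$, where $I$ is the adaptation interval, $S$ is the satisfaction level, $T$ is the maximum interval between adaptation attempts, and $K$ sets how aggressively the node adapts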
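Reading the formula as I = T × K^(−(1−S)), with S between 0 (unsatisfied) and 1 (fully satisfied), a short worked example shows how the interval shrinks as satisfaction drops. The default values T = 10 and K = 256 are illustrative assumptions, as is the function name.

```python
def adaptation_interval(satisfaction, max_interval=10.0, aggressiveness=256.0):
    """Time to wait before the next topology adaptation attempt.

    Implements I = T * K ** -(1 - S), where S is `satisfaction`,
    T is `max_interval` and K is `aggressiveness`.
    """
    if not 0.0 <= satisfaction <= 1.0:
        raise ValueError("satisfaction must be between 0 and 1")
    return max_interval * aggressiveness ** -(1.0 - satisfaction)


if __name__ == "__main__":
    for s in (0.0, 0.5, 1.0):
        print(s, adaptation_interval(s))
```

A fully satisfied node (S = 1) waits the maximum interval T, while an unsatisfied node (S = 0) waits only T / K, so nodes with poor neighbor sets search for better ones much more often.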

• Yes, that’s true. If a node loses a result, the fallback is that the node that issued the request stops receiving keep-alive messages from other nodes, which signals it to reissue the request
• For cases involving topology adaptation, a node won’t accept new queries after changing neighbors, but it will still forward outstanding ones along the old path

• Then ask me a question!
• Seriously, any questions?