Peer to Peer (1)

References
- Chapter 2.9 of Kurose and Ross
- Papers
  - OpenNap: Open Source Napster Server
  - J. Liang, R. Kumar and K. Ross, Understanding KaZaA
- Acknowledgements: Many of the figures are from other presentations, especially from the original authors.

Client-Server Model
- Let’s look at the client-server model
- Servers are centrally maintained and administered
- Client has fewer computing resources than a server
- This is the way the web works
- No interaction between clients

Client-Server Model
- Disadvantages of the client-server model
  - Reliability
    - The network depends on a possibly highly loaded server to function properly.
    - Server needs to be replicated to some extent to provide better reliability.
  - Scalability
    - More users imply more demand for computing power, storage space and bandwidth

Peer-to-Peer Model
- All nodes have the same functional capabilities and responsibilities
- No reliance on central services or resources.
- A node acts as both a “server” and a client.
- Considered more scalable

Peer-to-Peer Model
- Peer-to-peer systems provide access to information resources located on computers throughout the network.
- Algorithms for the placement and subsequent retrieval of information are a key aspect of the design of P2P systems.

Why P2P?
- The Internet has three valuable fundamental assets
  - Information
  - Computing resources
  - Bandwidth
- All of which are vastly underutilized, partly due to the traditional client-server model

Why P2P?
- No single search engine can locate and catalog the ever-increasing amount of information on the Web in a timely way
- Moreover, a huge amount of information is transient and not subject to capture by techniques such as Web crawling
  - Google claims that it searches about 1.3 x 10^8 web pages
  - Finding useful information in real time is increasingly difficult!

Why P2P?
- Although miles of new fiber have been installed, the new bandwidth gets little use if everyone goes to Yahoo for content and to eBay
- Instead, hot spots just get hotter while cold pipes remain cold
- This is partly why most people still feel the congestion over the Internet while a single fiber’s bandwidth has increased by a factor of 10^6 since 1975, doubling every 16 months

Why P2P?
- P2P can potentially eliminate the single-source bottleneck
- P2P can be used to distribute data and control and load-balance requests across the Net
- P2P potentially eliminates the risk of a single point of failure
- P2P infrastructure allows direct access and shared space, and this can enable remote maintenance capability

Brief History
- Generation 1 of P2P Systems
  - Napster music exchange
- Generation 2
  - Freenet, Gnutella, Kazaa, BitTorrent
- Generation 3
  - Characterized by the emergence of middleware layers for the application-independent management of distributed resources on a global scale
    - Pastry, Tapestry, Chord, Kademlia

Environment Characteristics for Peer-to-Peer Systems
- Unreliable environments
- Peers connecting/disconnecting: network failures to participation
- Random failures, e.g. power outages, cable and DSL failures, hackers
- Personal machines are much more vulnerable than servers

Evaluating Peer-to-Peer Systems
- A node’s database:
  - What does a node need to save in order to operate properly/efficiently?
- Success rate (if the file is in the network, what are the chances that a search will find it?)
- Lookup cost:
  - Time
  - Communication (bandwidth usage)
- Join/departure cost
- Fault tolerance: resilience to faults
- Resilience to denial of service attacks, security.

Issues in File Sharing Services
- Publish: How to insert a new file into the network
- Lookup: Find a specific file
- Retrieval: Getting a copy of a file

P2P File Sharing Software
- Allows a user to open up a directory in their file system
  - Anyone can retrieve a file from the directory
  - Like a Web server
- Allows the user to copy files from other users’ open directories:
  - Like a Web client
- Allows users to search nodes for content based on keyword matches:
  - Like Google

Napster: How Did It Work
- Application-level, client-server protocol over point-to-point TCP
- Centralized directory server
- Steps:
  - Connect to Napster server
  - Give server keywords to search the full list with.
  - Select “best” of correct answers.
    - One approach is to select based on the response time of pings.
      - Shortest response time is chosen.

Napster: How Did It Work
1. File list and IP address are uploaded to the napster.com centralized directory.

Napster: How Did It Work
2. User requests search at server (napster.com centralized directory): query and results.

Napster: How Did It Work
3. User pings hosts that apparently have data. Looks for best transfer rate.

Napster: How Did It Work
4. User chooses server and retrieves file.
- Napster’s centralized server farm had a difficult time keeping up with traffic
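The four steps can be sketched in a few lines of Python. This is only an illustration of the centralized-directory idea, not Napster's actual protocol: the `DirectoryServer` class, the `pick_best` helper, the addresses and the ping times are all made up for the example.

```python
class DirectoryServer:
    """Hypothetical in-memory stand-in for the centralized directory."""

    def __init__(self):
        self.index = {}  # file name -> set of peer addresses

    def register(self, peer_addr, file_names):
        # Step 1: a peer uploads its file list and IP address.
        for name in file_names:
            self.index.setdefault(name, set()).add(peer_addr)

    def search(self, keyword):
        # Step 2: return every (file name, peer) pair whose name contains the keyword.
        keyword = keyword.lower()
        return sorted(
            (name, peer)
            for name, peers in self.index.items()
            if keyword in name.lower()
            for peer in peers
        )


def pick_best(results, ping_ms):
    # Step 3: choose the host with the shortest ping response time.
    return min(results, key=lambda result: ping_ms[result[1]])


server = DirectoryServer()
server.register("10.0.0.1", ["homer_song.mp3", "jazz.mp3"])
server.register("10.0.0.2", ["homer_song.mp3"])

results = server.search("homer")
# Made-up round-trip times; a real client would measure them.
ping_ms = {"10.0.0.1": 120, "10.0.0.2": 35}
best = pick_best(results, ping_ms)
# Step 4 (not shown): the file is fetched directly from the chosen peer.
```

Note that the server only ever holds names and addresses. The files themselves stay on the peers, which is why step 4 bypasses the directory entirely.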

Napster
- The indexes were centralized, but users supplied the files, which were stored and accessed on their personal computers
- Napster became very popular for music exchange

Napster
- History:
  - 5/99: Shawn Fanning (freshman, Northeastern U.) founds Napster Online music service
  - 12/99: first lawsuit
  - 3/00: 25% UWisc traffic Napster
  - 2/01: US Circuit Court of Appeals: Napster knew users were violating copyright laws
  - 7/01: # simultaneous online users: Napster 160K, Gnutella: 40K, Morpheus (KaZaA): 300K

Napster
- Judge orders Napster to pull plug in July '01
- Other file sharing apps take over!
[Figure: traffic in bits per sec (0.0 to 8M) for gnutella, napster and fastrack (KaZaA)]

Napster’s Downfall
- Napster’s developers argued they were not liable for infringement of the copyrights
  - Why? They were not participating in the copying process, which was performed entirely between users’ machines.
- This argument was not accepted by the courts
  - Why? The index servers were deemed an essential part of the process
- Since the index servers were located at well-known addresses, their operators were unable to remain anonymous.
  - Makes for an easy lawsuit target

Napster’s Downfall
- A more fully distributed file sharing service spreads the responsibility across all of the users
  - Makes the pursuit of legal remedies difficult

Napster: Discussion
- Locates files quickly
- Vulnerable to censorship and technical failure
- Popular data become less accessible because of the load of the requests on a central server
- People started to look for more distributed solutions to file-sharing as a result of Napster’s failure.

Gnutella
- Napster’s legal problems motivated Gnutella, which does not use centralized indexes
- The focus is on a decentralized method of searching for files
  - Central directory server no longer the bottleneck
  - More difficult to “pull plug”
- Each application instance serves to:
  - Store selected files
  - Route queries from and to its neighboring peers
  - Respond to queries if file stored locally
  - Serve files

Gnutella
- Gnutella history:
  - 3/14/00: release by AOL, almost immediately withdrawn
  - Became open source
  - Many iterations to fix poor initial design (poor design turned many people off)
- Issues:
  - How much traffic does one query generate?
  - How many hosts can it support at once?
  - What is the latency associated with querying?
  - Is there a bottleneck?

Gnutella: Searching
- Searching by flooding: A Query packet might ask, "Do you have any content that matches the string 'Homer'?"
  - If a node does not have the requested file, then 7 (default set by Gnutella) of its neighbors are queried.
  - If the neighbors do not have it, they contact 7 of their neighbors.
  - Maximum hop count: 10 (this is called the time-to-live, TTL)
  - Reverse path forwarding for responses (not files)
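A minimal sketch of search by flooding, assuming an overlay stored as an adjacency list. The function name, the toy graph and the `seen` set used to drop repeated queries are assumptions made for this example. The default fan-out of 7 and TTL of 10 are the figures from the slide.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl=10, fanout=7):
    """Return (peers that reported a hit, number of Query messages sent)."""
    seen = {start}          # assumption: a peer handles a given query only once
    hits = []
    messages = 0
    queue = deque([(start, ttl)])
    while queue:
        node, hops_left = queue.popleft()
        if node != start and wanted in files.get(node, ()):
            hits.append(node)   # a peer that has the file answers instead of forwarding
            continue
        if hops_left == 0:
            continue            # TTL exhausted: the query dies here
        for neighbor in graph.get(node, [])[:fanout]:
            messages += 1       # every forwarded Query costs one message
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops_left - 1))
    return hits, messages


# Toy overlay: A is the searching peer, only E holds the file.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"E": {"homer.mp3"}}

hits, messages = flood_search(graph, files, "A", "homer.mp3", ttl=3)
```

With `ttl=3` the query reaches E at the cost of 9 messages on a 5-node network. With `ttl=2` it dies at D and finds nothing, even though the file exists. This is the success-rate versus lookup-cost trade-off in miniature.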

Gnutella: Searching
- Downloading
  - Peers respond with a “QueryHit” (contains contact info)
  - File transfers use a direct connection using the HTTP protocol’s GET method
  - When there is a firewall, a "Push" packet is used: reroutes via Push path

Gnutella: Searching

Gnutella: Discovering Peers
- A peer has to know at least one other peer to send requests to.
- Addresses of some peers have been published on a website.
- When a peer enters the network, it contacts a designated peer and receives a list of other peers that have recently entered the network.

Gnutella: Discussion
- Robust: The failure of a peer is not a failure of Gnutella.
- Performance: Flooding leads to poor performance
- Free riders: Those who get data but do not share data.
- The model of Gnutella just presented was found not to be workable.
- This led to models in which some peer nodes hold indexes.
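To see why flooding performs poorly, a rough upper bound on the traffic of one query can be worked out from the defaults quoted earlier (fan-out 7, TTL 10). The calculation assumes an idealized tree in which every peer forwards to 7 peers that have not seen the query and nobody has the file. Real overlays are far smaller and have overlapping neighbors, so this is a ceiling, not a measurement.

```python
def max_query_messages(fanout=7, ttl=10):
    # Hop h sends at most fanout**h messages, so sum over hops 1..ttl.
    return sum(fanout ** hop for hop in range(1, ttl + 1))


worst_case = max_query_messages()
print(f"{worst_case:,} messages in the worst case")
```

The bound grows geometrically with the TTL, which is why a single unanswered query can load a large part of the network.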

KaZaA: The Service
- More than 3 million up peers sharing over 3,000 terabytes of content
- More popular than Napster ever was
- More than 50% of Internet traffic?
- MP3s & entire albums, videos, games
- Optional parallel downloading of files
- Automatically switches to new download server when current server becomes unavailable
- Provides estimated download times

KaZaA: The Service
- A user can configure the maximum number of simultaneous uploads and maximum number of simultaneous downloads
- Queue management at server and client
  - Frequent uploaders can get priority in server queue
- Keyword search
  - User can configure “up to x” responses to keywords
- Responses to keyword queries come in waves; stops when x responses are found

KaZaA: The Technology
- Proprietary
- Control data encrypted
- Everything in HTTP request and response messages

KaZaA: Architecture
- Each peer is either a supernode (SN) or is assigned to a supernode
  - 56 min avg connect
  - Each SN has about 100-150 children
  - Roughly 30,000 SNs
- Each supernode has TCP connections with 30-50 supernodes
  - 23 min avg connect

KaZaA: Architecture
- Nodes that have more connection bandwidth and are more available are designated as supernodes
- Each supernode acts as a mini-Napster hub, tracking the content and IP addresses of its descendants
- A supernode tracks only the content of its children.
- Considered a cross between Napster and Gnutella
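As a quick sanity check, the architecture figures (roughly 30,000 supernodes, about 100-150 children each) can be multiplied out and compared with the "more than 3 million up peers" quoted for the service. The variable names below are only for illustration.

```python
SUPERNODES = 30_000              # "Roughly 30,000 SNs"
CHILDREN_PER_SN = (100, 150)     # "about 100-150 children"

# Total ordinary peers implied by the two ends of the range.
low, high = (SUPERNODES * children for children in CHILDREN_PER_SN)
print(f"{low:,} to {high:,} peers")
```

The result, 3 to 4.5 million peers, is consistent with the service figure, and it shows how small each supernode's index is compared with one central Napster-style directory.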

KaZaA: Finding Supernodes
- List of potential supernodes included within software download
- New peer goes through list until it finds operational supernode
  - Connects, obtains more up-to-date list, with 200 entries
  - Nodes in list are “close” to ON (the ordinary node).
  - Node then pings 5 nodes on list and connects with the one
- If supernode goes down, node obtains updated list and chooses new supernode

KaZaA Queries
- Node first sends query to supernode
  - Supernode responds with matches
  - If x matches found, done.
- Otherwise, supernode forwards query to subset of supernodes
  - If total of x matches found, done.
- Otherwise, query further forwarded
  - Probably by original supernode rather than recursively
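The escalation described above can be sketched as follows. KaZaA is proprietary, so this is only a guess at the control flow: the function names, the dictionary-based indexes and the `batch_size` parameter (how many supernodes are asked per wave) are hypothetical. The sketch follows the slide's guess that the original supernode drives each wave rather than the query spreading recursively.

```python
def matches_at(supernode_index, keyword):
    # A supernode only knows the content of its own children.
    return [
        (name, child)
        for name, children in sorted(supernode_index.items())
        if keyword in name
        for child in children
    ]


def kazaa_query(keyword, home_index, other_indexes, x, batch_size=2):
    """Return (up to x matches, number of other supernodes contacted)."""
    found = matches_at(home_index, keyword)
    contacted = 0
    # Ask further supernodes wave by wave until x matches are collected.
    while len(found) < x and contacted < len(other_indexes):
        batch = other_indexes[contacted:contacted + batch_size]
        for index in batch:
            found.extend(matches_at(index, keyword))
        contacted += len(batch)
    return found[:x], contacted


home = {"homer_a.mp3": ["c1"]}
others = [
    {"homer_b.mp3": ["c2"]},
    {"jazz.mp3": ["c3"]},
    {"homer_c.mp3": ["c4", "c5"]},
    {"homer_d.mp3": ["c6"]},
]

one_match = kazaa_query("homer", home, others, x=1)
three_matches = kazaa_query("homer", home, others, x=3)
```

Asking for one match is answered by the home supernode alone. Asking for three needs two waves, which is the "responses come in waves" behavior described for the service.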

Bootstrapping
- How do I find out about a peer to begin with?
- Use a bootstrapping node (or multiple bootstrapping nodes).
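A minimal sketch of bootstrapping from a list of well-known addresses, in the spirit of the published peer lists used by Gnutella and the supernode list shipped with KaZaA. The `bootstrap` function, the addresses and the two stubbed network calls are assumptions for illustration. A real client would open connections instead of looking things up in a dictionary.

```python
def bootstrap(candidates, is_alive, fetch_peer_list):
    """Try known addresses in order; return (contact, peers it reports)."""
    for addr in candidates:
        if is_alive(addr):
            return addr, fetch_peer_list(addr)
    return None, []   # no bootstrapping node reachable


# Made-up addresses and stubbed network calls, for illustration only.
known = ["192.0.2.1:6346", "192.0.2.2:6346", "192.0.2.3:6346"]
online = {"192.0.2.2:6346": ["198.51.100.7:6346", "198.51.100.9:6346"]}

contact, peers = bootstrap(
    known,
    is_alive=lambda addr: addr in online,
    fetch_peer_list=lambda addr: online[addr],
)
```

Listing several bootstrapping nodes matters because any single one is both a point of failure and an easy target, the same weakness that the central Napster directory had.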

Summary
- The use of centralized indexes in Napster lands you in legal woes
- The use of Gnutella avoids legal woes but is painfully slow.
- Kazaa is somewhere in between.
- Can we do better?
