Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.

Similar presentations


Presentation on theme: "1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in."— Presentation transcript:

1 1 Napster & Gnutella An Overview

2 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in mid 1999, and remained operational for two years. Most Innovative Software (Computer Shopper, Dec 2000) Best Cyber Tech 2000 (Time Magazine) Sparked off a host of lawsuits for copyright infringement.

3 3 Napster Principles The actual MP3 files are stored on client machines. A central server holds meta- information about online clients along with the files held by them. Clients contact Napster Server to locate files, and contact each other to download them Index Server

4 4 Remarks Simple but elegant Central index server is a bottleneck and a point of failure Replication / chaining of servers eased the bottleneck to some extent.

5 5 What is KaZaA? Extension of Napster - uses the FastTrack technology. Powerful nodes (supernodes) act as local index servers, and client queries are propagated to other supernodes. Step in the right direction. Second generation P2P Supernode download

6 6 What is Gnutella? Gnutella is a protocol for distributed search peer-to-peer comm decentralized model No third party lookup ? ? ? Two stages : 1.Join Network … later 2.Use Network 1.Discover other peers 2.Search other peers

7 7 The Jargon Horizon/TTL: how many hops a packet can go before it dies (default setting is 7 in Gnutella) Hops: a hop is a pass through an intermediate node Servent: A Gnutella node. 1 Hop 2 Hops Each servent is both a client and a server

8 8 Gnutella Messages Ping: used to actively discover hosts on the network. A servent receiving a Ping descriptor is expected to respond with one or more Pong descriptors. Pong: the response to a Ping. Each Pong packet contains a Globally Unique Identifier (GUID) plus address of servent and information regarding the amount of data it is making available to the network) Query: the primary mechanism for searching the distributed network. A servent receiving a Query descriptor will respond with a QueryHit if a match is found against its local data set. QueryHit: the response to a Query: contains IP address, GUID and search results Push: allows downloading from firewalled servents

9 9 Gnutella Scenario Step 0: Join the network Step 1: Determining who is on the network "Ping" packet is used to announce your presence on the network. Other peers respond with a "Pong" packet. Also forwards your Ping to other connected peers A Pong packet also contains: an IP address port number amount of data that peers is sharing Pong packets come back via same route Step 2: Searching Gnutella "Query" ask other peers if they have the file you desire (and have an acceptably fast network connection). A Query packet might ask, "Do you have any content that matches the string ‘Homer"? Peers check to see if they have matches & respond (if they have any matches) & send packet to connected peers Continues for TTL Step 3: Downloading Peers respond with a “QueryHit” (contains contact info) File transfers use direct connection using HTTP protocol’s GET method When there is a firewall a "Push" packet is used – reroutes via Push path

10 10 Remarks Simple idea - But lacks scalability, since bandwidth is wasted. Sometimes, objects that are present may not be located due to limited TTL. Various improved search strategies have been proposed.

11 11 Efficient search: Gia approach Flooding was replaced by (biased) random walk Topology adaptation: high-capacity nodes maintain more clues via one hop replication. They should be easily reachable Flow Control to prevent overloading To improve the scalability of Gnutella on a heterogeneous Platform, Gia used the following approach:

12 12 Search via random walk Gnutella is a protocol for distributed search peer-to-peer decentralized model Two stages : 1.Join Network … later 2.Use Network 1.Discover other peers 2.Search other peers via random walk

13 13 Topology control Goals are (1)high capacity nodes have high degrees and (2) low capacity nodes are within a short distance from the high capacity nodes. S = satisfaction level. S=0 means dissatisfied (# neighbors < max # according to its capacity), S=1 means satisfied. To increase the level of satisfaction, each host picks a node from its cache as neighbor if it is not already a neighbor, or not marked dead.

14 14 Topology Control {To add a neighborY} If # of neighbors + 1 < max then accept Y as a neighbor {To drop an existing neighbor for the sake of adding Y} pick the highest degree current neighbor Z; If Y has higher capacity or Y has fewer neighbors than Z then drop Z, accept Y else reject Y {This ensures we do not drop poorly connected neighbors}

15 15 One hop replication Each node keeps track of the indices of the files at its immediate neighbors. As a result, high capacity/ high degree nodes can provide useful response to a a large number of queries.

16 16 Biased random walk P=5/10 P=3/10 P=2/10 How does it help?

17 17 Analysis of random walk p = number of nodes that have the desired object O = overhead, i.e. distance traveled by the walkers to complete the search (that can be successful or unsuccessful) T = TTL (time to live for each walker) Expected overhead = 1-(1-p) T-1 + (1-p) T-1.k p One can deploy k random walkers to expedite the search

18 18 Analysis of random walk Expected overhead = 1-(1-p) T-1 + (1-p) T-1. k p As k increases, the probability of success increases, the overhead increases, but the delay decreses! There is a tradeoff here. Probability of success = 1-(1-p) kT


Download ppt "1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in."

Similar presentations


Ads by Google