
1 Gnutella

2 Overview
- P2P search mechanism
- Simple and straightforward
- Completely decentralized
- Creates an overlay network
- Different applications can run over Gnutella, especially file sharing
- “Older” unstructured network
  - Data can be located at any node
  - Data may not be available at all
- No incentive mechanism

3 History
- First distributed in March 2000
  - Created by Justin Frankel
  - Distributed by his company Nullsoft as freeware
- Various freeware clients
- Creates a legal problem for copyright holders because there is no central or semi-central node
- Several clients
  - Limewire
  - Bearshare
  - Morpheus (began as KaZaa, but that was disabled by the KaZaa company)
- Deficiencies in the protocol led to the creation of Gnutella2 (which is a completely different protocol)

4 Network Architecture
- Nodes are called servents (server + client)
- Each node maintains its own neighborhood
  - All nodes that it knows directly
  - Maintenance is determined by the Gnutella client and is not standardized
- GWebCache: special nodes that store some servent addresses
  - Help in bootstrap
  - Usually a module in a web server

5 Joining a Network
- A new peer finds an existing servent by
  - Querying a GWebCache
  - Manually, through an acquaintance
- New peer broadcasts a Ping message
- Ping advertises that a new peer has joined the network
- Every node that receives the Ping may add the peer to its neighborhood

6 Message Propagation
- Broadcast
- Limited by
  - TTL (default is 7)
  - A Globally Unique ID (GUID), so that a node doesn’t forward the same message twice
- Response message is sent over the same route as the original message
- A response to Ping is Pong
  - Joining peer may add answering node to its neighborhood
  - Response has IP address of responding peer in its payload
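The propagation rules on this slide can be sketched as a small simulation. This is only an illustration, not the Gnutella wire protocol: the `graph` adjacency map, the function names, and the use of a dictionary to stand in for per-node GUID bookkeeping are all assumptions made for the example.

```python
from collections import deque

def flood(graph, origin, ttl=7):
    """Flood one message from origin; return {node: previous hop}.

    graph is a hypothetical adjacency map {node: [neighbors]}. Membership in
    came_from stands in for "this node has already seen the message GUID".
    """
    came_from = {origin: None}
    queue = deque([(origin, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL exhausted, the message is not forwarded further
        for neighbor in graph[node]:
            if neighbor in came_from:
                continue  # GUID already seen, the duplicate is dropped
            came_from[neighbor] = node
            queue.append((neighbor, remaining - 1))
    return came_from

def reverse_route(came_from, responder):
    """Path a response (e.g. a Pong) takes back to the origin."""
    route = [responder]
    while came_from[route[-1]] is not None:
        route.append(came_from[route[-1]])
    return route

# Hypothetical four-node chain a-b-c-d.
overlay = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
reached = flood(overlay, "a", ttl=2)
print(sorted(reached))              # nodes that saw the message, origin included
print(reverse_route(reached, "c"))  # hops a response from c would follow
```

With a TTL of 2 the message never reaches node d, which is the effect the TTL limit is meant to have.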

7 Querying Data
- Peer broadcasts a “Query” message
  - Includes pattern to match (with different possible interpretations)
  - TTL
- Computer with matching data returns “Query Response”
  - Response has IP address of responding peer in its payload
- If more than one node holds the data, the peer may decide from which to download
- Data may not be found even if it exists in the network (unlike Chord)
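A minimal sketch of why a query can miss data that exists in the network. The pattern is interpreted here as a case-insensitive substring match, which is just one of the possible interpretations; the `graph` and `files` maps, the addresses, and the function name are made up for the example.

```python
def search(graph, files, origin, pattern, ttl):
    """Flood a query for at most ttl hops; return addresses of peers with a hit.

    graph: hypothetical {address: [neighbor addresses]}
    files: hypothetical {address: [file names shared by that peer]}
    """
    seen = {origin}
    frontier = [origin]
    hits = []
    for _ in range(ttl):
        next_frontier = []
        for node in frontier:
            for neighbor in graph[node]:
                if neighbor in seen:
                    continue  # this peer already handled the query
                seen.add(neighbor)
                next_frontier.append(neighbor)
                names = files.get(neighbor, [])
                if any(pattern.lower() in name.lower() for name in names):
                    hits.append(neighbor)  # the hit carries the peer's address
        frontier = next_frontier
    return sorted(hits)

# Hypothetical chain of four peers; only the last one shares a matching file.
graph = {
    "10.0.0.1": ["10.0.0.2"],
    "10.0.0.2": ["10.0.0.1", "10.0.0.3"],
    "10.0.0.3": ["10.0.0.2", "10.0.0.4"],
    "10.0.0.4": ["10.0.0.3"],
}
files = {"10.0.0.4": ["Lecture-Notes.pdf"]}
print(search(graph, files, "10.0.0.1", "notes", ttl=2))
print(search(graph, files, "10.0.0.1", "notes", ttl=3))
```

The file is three hops away, so the first search comes back empty even though the data exists; only the larger TTL finds it.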

8 Downloading Data
- Download is directly via HTTP, not in the overlay network
- Download is usually by HTTP GET
- Let Alice be the downloading peer and Bob be the peer that holds the data
- If Bob is behind a firewall and Alice is not, then
  - Gnutella messages (specifically Query) go through the firewall
  - Alice can’t initiate HTTP through the firewall
- In that case, Alice can use Push to get the data from Bob

9 Free Riding
- No incentive mechanism in Gnutella
- Significant free riding
  - Different studies find that between 50% and 90% of clients don’t share data
- 50% of all data and files are shared by the top 1% of users
- Conclusion: voluntary cooperation does not work

10 Scalability Issues
- Broadcast
- TTL
- Overlay does not have the same topology as the underlying IP network
  - Example of “bad” topology
  - In real-life studies, the topology of Gnutella is independent of the IP topology, so nodes that are close in Gnutella may be distant over IP and vice versa

11 Scalability (cont.)
- Let the neighborhood size be $n$ (actual number is between 3 and 4)
- Let the TTL be $t$
- Maximal number of reachable nodes is
  - $n \sum_{i=1}^{t} (n-1)^{i-1}$, since hop $i$ reaches at most $n(n-1)^{i-1}$ new nodes when each node forwards to every neighbor except the sender
- Maximal number is reached in a tree. The actual number depends on the graph of the overlay network
- Query message is of length 83 bytes
- Bandwidth requirements: bad
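A quick way to get a feel for these numbers, under the assumption that each node forwards a message to every neighbor except the one it came from, so that hop i reaches at most n(n-1)^(i-1) new nodes. The byte estimate further assumes a tree overlay in which each reached node receives exactly one 83-byte copy of the query; both function names are invented for the sketch.

```python
def max_reachable(n, t):
    """Upper bound on nodes reached with neighborhood size n and TTL t."""
    return sum(n * (n - 1) ** (i - 1) for i in range(1, t + 1))

def query_bytes(n, t, message_size=83):
    """Bytes sent for one query if every reached node gets one copy (tree case)."""
    return max_reachable(n, t) * message_size

for n in (3, 4):
    print(n, max_reachable(n, 7), query_bytes(n, 7))
```

In a real overlay with cycles, fewer distinct nodes are reached but duplicate copies still cost bandwidth, so the byte figure is only indicative.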

12 Security Issues
- Standard P2P security issues
- Redirection and DoS
  - When Eve receives a Query, she returns a Query Hit with a different IP address: Alice’s address
- Alice may be hit with a large number of requests
  - Even if Alice doesn’t have a Gnutella client!
- Since Gnutella peers may fail and there may not be many copies of the data, requests tend to be
  - Frequent: as often as once every few seconds
  - Long term: 24 hours or more

13 Gnutella-BitTorrent Comparison
- Gnutella
  - Simple
  - Minimal initial bootstrap (no need for a web server that maps the required file to a tracker)
  - Robust to failures
- BitTorrent
  - Incentives
  - More efficient design reduces control messages (tracker vs. broadcast)
  - Download in pieces is better adapted to P2P and large files

14 Gnutella-Chord Comparison
- Chord
  - Search always returns correct answer
  - Search in O(log n)
  - Only exact search possible
  - High churn affects network structure
- Gnutella
  - Search may not find item
  - Search in ~O(n)
  - Approximate search possible
  - High churn does not affect structure

15 Strategies for coordinated download

16 Strategy 1: linear transfer of pieces
- Simple case: assume that $\forall i,\ u_i = d_i = \alpha$, the file size is $F$, and there are $F/\alpha$ pieces, one piece transferred at each time slot
- Peer 1 downloads from the server
  - One piece at a time
- Peer $i$ downloads from peer $i-1$ for $i = 2, \dots, N$
  - One piece at a time
- Analysis:
  - Completion time for the network: $N - 1 + F/\alpha$
  - Average completion time: $(N-1)/2 + F/\alpha$
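The two figures in the analysis can be checked with a small slot-by-slot simulation of the chain. This is a sketch under the slide's assumptions (one piece per slot per link, pieces sent in order); `num_pieces` plays the role of F/α, and the function name is made up.

```python
def linear_completion_times(num_peers, num_pieces):
    """Slot at which each peer in the chain holds the whole file."""
    have = [0] * (num_peers + 1)   # have[0] is the server, have[i] is peer i
    have[0] = num_pieces
    done = [None] * (num_peers + 1)
    slot = 0
    while any(d is None for d in done[1:]):
        slot += 1
        snapshot = have[:]         # state at the start of the slot
        for i in range(1, num_peers + 1):
            if snapshot[i] < snapshot[i - 1]:  # upstream holds a piece we lack
                have[i] += 1
                if have[i] == num_pieces:
                    done[i] = slot
    return done[1:]

times = linear_completion_times(num_peers=4, num_pieces=3)
print(times)                    # completion slot of peers 1..4
print(max(times))               # compare with N - 1 + F/alpha
print(sum(times) / len(times))  # compare with (N - 1)/2 + F/alpha
```

Peer i receives its first piece one slot after peer i-1, which is where the N-1 pipeline delay in the completion time comes from.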

17 Strategy 2: exponential transfer of file
- Same assumptions as linear strategy
- Between time $i(F/\alpha)$ and $(i+1)(F/\alpha)$, peer $j$ uploads the full file to peer $j + 2^i$
  - $i = 0, 1, \dots, (\log N) - 1$
  - $j = 1, \dots, 2^i$
- Analysis
  - Completion time for the network: $(F/\alpha)\log N$
  - Claim: the average completion time (for $N = 2^k$) is $(F/\alpha)(\log N - (N-1)/N)$
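The schedule can be replayed directly to check the claim for small N. The sketch assumes that peer 1 holds the file at time 0 and is counted in the average with completion time 0, and that a transfer started in round i is complete at the end of that round; times are expressed in units of F/α, and the function name is invented.

```python
def exponential_completion_rounds(k):
    """Time (in units of F/alpha) at which each of N = 2**k peers has the file."""
    done = {1: 0}  # assumption: peer 1 starts with the whole file
    for i in range(k):
        for j in range(1, 2 ** i + 1):
            # peer j uploads to peer j + 2**i during round i,
            # which ends at time (i + 1) * F/alpha
            done[j + 2 ** i] = i + 1
    return done

k = 3
done = exponential_completion_rounds(k)
n = len(done)
print(max(done.values()))      # compare with log2(N)
print(sum(done.values()) / n)  # compare with log2(N) - (N - 1)/N
```

For N = 8 the average matches log N - (N-1)/N once both are expressed in units of F/α.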

18 Proof I
- $2^i$ peers complete download at time $i$ for $i = 0, 1, \dots, k-1$ ($k = \log N$)
- Average time to download
- Define

19 Proof II
- Calculate S by:
- Therefore:
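The equations of the two proof slides are not reproduced in this transcript. One derivation that is consistent with the claimed average is sketched below; it is a reconstruction, not necessarily the slides' own. It assumes that "time i" means round i, which ends at $(i+1)F/\alpha$, that peer 1 holds the file at time 0, and that the average is taken over all $N = 2^k$ peers. The definition of $S$ is likewise an assumption.

```latex
\begin{align*}
\bar{T} &= \frac{F}{\alpha}\cdot\frac{1}{N}\sum_{i=0}^{k-1}(i+1)\,2^{i},
  \qquad S := \sum_{i=0}^{k-1}(i+1)\,2^{i} \\
2S &= \sum_{i=0}^{k-1}(i+1)\,2^{i+1} = \sum_{i=1}^{k} i\,2^{i} \\
S &= 2S - S = k\,2^{k} - \sum_{i=1}^{k-1} 2^{i} - 1
   = k\,2^{k} - (2^{k}-2) - 1 = (k-1)\,2^{k} + 1 \\
\bar{T} &= \frac{F}{\alpha}\cdot\frac{(k-1)N + 1}{N}
   = \frac{F}{\alpha}\left(\log N - \frac{N-1}{N}\right)
\end{align*}
```

As a sanity check, $k = 2$ gives $S = 1 + 2\cdot 2 = 5 = (2-1)\cdot 4 + 1$.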

20 Comparison of strategies 1 & 2
- Completion times to compare
  - $N - 1 + F/\alpha$ and $(F/\alpha)\log N$
- First strategy is better for large files:
  - $(N-1)/(\log N - 1) < F/\alpha$
- Otherwise, second strategy is better
- Can we improve on both?
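Rearranging the two completion times gives a crossover in terms of F/α: the linear strategy wins when (N-1)/(log N - 1) < F/α, for N > 2. The sketch below evaluates both sides, with `slots` standing for F/α, a base-2 logarithm assumed, and hypothetical function names.

```python
import math

def linear_time(n, slots):
    """Network completion time of strategy 1, with slots = F/alpha."""
    return n - 1 + slots

def exponential_time(n, slots):
    """Network completion time of strategy 2, with slots = F/alpha."""
    return slots * math.log2(n)

def linear_is_faster(n, slots):
    return linear_time(n, slots) < exponential_time(n, slots)

n = 16
threshold = (n - 1) / (math.log2(n) - 1)  # crossover value of F/alpha
print(threshold)
for slots in (4, 5, 6):
    print(slots, linear_time(n, slots), exponential_time(n, slots),
          linear_is_faster(n, slots))
```

For 16 peers the crossover sits at F/α = 5: below it the exponential strategy finishes first, above it the pipeline does.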

21 Strategy 3
- $N = 2^k$, including one seed
- $A_i$ is the set of peers that have piece $i$, except for the seed
- Initial strategy
- Strategy 2
- Used when at least one user does not have a piece
- Ends after $k$ time-slots (steps)

22 Strategy 3 (cont.)
- After the initial strategy, peer selection changes:
  - At time slot $k+i-1$, $A_i$ includes $n/2$ nodes
  - $A_i$ sends the $i$-th piece to all the other nodes ($n/2 - 1$)
  - Nodes from other sets and the seed replicate their pieces on $A_i$
  - At each round one peer of $A_i$ is idle

23 Initial Strategy (Arnaud Legout © 2010)
- At t=0
  - Seed has all pieces
- At t=S
  - $|A_1| = 2^0$
- At t=2S
  - $|A_1| = 2^1$, $|A_2| = 2^0$
- At t=3S
  - $|A_1| = 2^2$, $|A_2| = 2^1$, $|A_3| = 2^0$
[Figure: timeline at t=0, t=S, t=2S showing the seed and the pieces 1, 2, 3 held by the peers]

24 Initial Strategy
- At t=jS
  - $|A_i| = 2^{j-i}$, $i \le j$
- This strategy ends when j=k
  - All $n - 1 = 2^k - 1$ leechers have a piece
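The growth of the sets during the initial strategy can be replayed by counting holders per piece. This is a counting sketch rather than a peer-level simulation: it assumes that in every slot each leecher holding a piece uploads it to a leecher that has nothing yet, while the seed injects one new piece. The function name is made up.

```python
def initial_strategy(k):
    """Return {piece: number of leechers holding it} after k slots."""
    holders = {}
    for slot in range(1, k + 1):
        for piece in holders:
            holders[piece] *= 2  # every holder serves one empty leecher
        holders[slot] = 1        # the seed hands out a new piece
    return holders

k = 3
sets = initial_strategy(k)
print(sets)                # compare with |A_i| = 2**(k - i)
print(sum(sets.values()))  # compare with 2**k - 1 leechers
```

After k slots the set sizes add up to 2^k - 1, so every leecher holds exactly one piece, which is the stopping condition on the slide.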

25 Second Peer Selection Strategy
- An example with 4 pieces and k=3
- Assume that the seed stops sending pieces once a copy of the content has been served
  - Easier to model
  - Lower bounds the performance, because it uses fewer resources

26 Second Peer Selection Strategy
- We confirm that for k=3 all peers have a piece
- t=3S
  - There are $2^3/2$ copies of piece 1
  - There are $2^3/2^2$ copies of piece 2
  - There are $2^3/2^3$ copies of piece 3
- t=4S
  - All have piece 1
  - There are $2^3/2$ copies of piece 2
  - There are $2^3/2^2$ copies of piece 3
  - There are $2^3/2^3$ copies of piece 4
[Figure: pieces held by each peer at t=3S, 4S, 5S, 6S, and 7S; at t=7S all peers hold all pieces]

27 Second Peer Selection Strategy
- t=5S
  - All have pieces 1 and 2
  - There are $2^3/2$ copies of piece 3
  - There are $2^3/2^2$ copies of piece 4
- t=6S
  - All have pieces 1, 2, and 3
  - There are $2^3/2$ copies of piece 4
- t=7S
  - All have pieces 1, 2, 3, and 4

28 Results
- At t=kS each peer has a single piece
  - $|A_i| = 2^{k-i}$, $i \le k$
- After slot k+i for $i \le |F|/\alpha$
  - Each peer has pieces 1,…,i
  - $|A_{i+1}| = n/2$ peers have piece i+1 and replicate it on the n/2-1 other peers
    - The seed already has piece i+1
  - Each other peer replicates a piece on the peers in $A_{i+1}$
    - At slot m, the seed stops serving pieces
    - For all $j > i+1$, $|A_j| \leftarrow 2|A_j|$

29 Results
- Termination
  - At each slot the number of copies of each piece is doubled
  - When there are $n = 2^k$ peers, a piece needs k+1 slots to appear on all peers
    - We consider that the first slot for piece x is when x is sent by the seed to the first peer
  - For m pieces, k+m slots are required to distribute all pieces on all peers

30 Results
- Termination time
  - All peers have finished at t=(k+m)S
  - $t = (k+m)S = T(k+m)/m = (T/m)\log_2 n + T$
  - Decreases in 1/m compared to the content-based model
  - Does not account for piece overhead
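To see the 1/m effect numerically, the termination time can be evaluated for a few piece counts. The sketch assumes S = T/m, where T is the time to transfer the whole content, and k = log2 n; the function name and the sample values are made up.

```python
import math

def piece_based_termination(T, m, n):
    """Termination time (k + m) * S with S = T/m and k = log2(n)."""
    k = math.log2(n)
    slot = T / m
    return (k + m) * slot

T, n = 100.0, 8
for m in (1, 4, 100):
    print(m, piece_based_termination(T, m, n))
```

As m grows, the (T/m) log2 n term shrinks and the termination time approaches T. Per-piece overhead, which this formula ignores, limits how far m can usefully be increased.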

31 Results
- Mean download time
  - With the proposed strategy, at kS each peer has only one piece
  - As the number of pieces doubles at each slot, one needs k+m-1 slots for half of the peers to have all the pieces
    - At k, 1 piece; at k+1, 2 pieces; at k+m-1, m pieces
    - But at m, the seed stops serving pieces, thus at k+m-1 only half of the peers have m pieces and the rest have m-1 pieces
  - The other half receives the last pieces at k+m

32 Results
- Mean download time
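The formula of this slide is not reproduced in the transcript. A sketch that follows from the preceding argument is given below. It assumes that exactly half of the peers finish at slot $k+m-1$ and the other half at slot $k+m$, with $S = T/m$ and $k = \log_2 n$; it is a reconstruction, not necessarily the slide's own expression.

```latex
\begin{align*}
\bar{t} &= \tfrac{1}{2}(k+m-1)\,S + \tfrac{1}{2}(k+m)\,S
         = \left(k + m - \tfrac{1}{2}\right) S \\
        &= \frac{T}{m}\log_2 n + T - \frac{T}{2m}
\end{align*}
```

Under these assumptions the mean download time is $T/(2m)$ less than the termination time $(T/m)\log_2 n + T$.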

