Published by Janice Allen. Modified over 8 years ago.
1 Gnutella
2 Overview
- P2P search mechanism
- Simple and straightforward
- Completely decentralized
- Creates an overlay network
- Different applications can run over Gnutella, especially file sharing
- “Older” unstructured network
  - Data can be located at any node
  - Data may not be available at all
- No incentive mechanism
3 History
- First distributed in March 2000
  - Created by Justin Frankel
  - Distributed by his company Nullsoft as freeware
- Various freeware clients
- Creates a legal problem for copyright holders because there is no central or semi-central node
- Several clients
  - Limewire
  - Bearshare
  - Morpheus (began as KaZaa, but that was disabled by the KaZaa company)
- Deficiencies in the protocol led to the creation of Gnutella2 (which is a completely different protocol)
4 Network Architecture
- Nodes are called servents (server + client)
- Each node maintains its own neighborhood
  - All nodes that it knows directly
  - Maintenance is determined by the Gnutella client and is not standardized
- GWebCache: special nodes that store some servent addresses
  - Help in bootstrap
  - Usually a module in a web server
5 Joining a Network
- A new peer finds an existing servent by
  - Querying a GWebCache
  - Manually, through an acquaintance
- New peer broadcasts a Ping message
- Ping advertises that a new peer has joined the network
- Every node that receives the Ping may add the peer to its neighborhood
6 Message Propagation
- Broadcast
- Limited by TTL (default is 7)
  - Each message carries a Globally Unique ID (GUID) so that a node doesn’t forward the same message twice
- Response message is sent back over the same route as the original message
- A response to Ping is Pong
  - Joining peer may add the answering node to its neighborhood
  - Response has the IP address of the responding peer in its payload
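The TTL and GUID rules can be illustrated with a small simulation. This is a minimal sketch, not Gnutella client code: the adjacency-dict `graph`, the function name `flood`, and the per-node `seen` sets are all assumptions made for illustration. A message is forwarded to every neighbor except the sender, its TTL drops by one per hop, and a node that has already seen the GUID drops the duplicate.

```python
import uuid
from collections import deque


def flood(graph, origin, ttl=7):
    """Return {node: hop_count} for every node a flooded message reaches.

    graph is a hypothetical adjacency dict {node: [neighbors]}.
    """
    guid = uuid.uuid4().hex            # one GUID identifies this message
    seen = {origin: {guid}}            # per-node set of GUIDs already handled
    reached = {}
    queue = deque([(origin, None, ttl, 0)])
    while queue:
        node, sender, ttl_left, hops = queue.popleft()
        if ttl_left == 0:
            continue                   # TTL exhausted: do not forward further
        for neighbor in graph[node]:
            if neighbor == sender:
                continue               # never send back to the sender
            if guid in seen.setdefault(neighbor, set()):
                continue               # duplicate GUID: neighbor drops it
            seen[neighbor].add(guid)
            reached[neighbor] = hops + 1
            queue.append((neighbor, node, ttl_left - 1, hops + 1))
    return reached


if __name__ == "__main__":
    line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(flood(line, 0, ttl=2))       # nodes 3 and 4 are beyond the TTL
```

On a ring, the GUID check is what stops the message from circulating: the two copies travelling in opposite directions meet and are dropped.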
7 Querying Data
- Peer broadcasts a “Query” message
  - Includes pattern to match (with different possible interpretations)
  - TTL
- Computer with matching data returns a “Query Response”
  - Response has the IP address of the responding peer in its payload
- If more than one node holds the data, the peer may decide from which to download
- Data may not be found even if it exists in the network (unlike Chord)
8 Downloading Data
- Download is directly via HTTP, not in the overlay network
- Download is usually by HTTP GET
- Let Alice be the downloading peer and Bob be the peer that holds the data
- If Bob is behind a firewall, Alice is not behind a firewall, and Gnutella messages (specifically Query) go through the firewall
  - Alice can’t initiate HTTP through the firewall
  - Then Alice can use Push to get the data from Bob
9 Free Riding
- No incentive mechanism in Gnutella
- Significant free riding
  - Different studies find that between 50% and 90% of clients don’t share data
- 50% of all data and files are shared by the top 1% of users
- Conclusion: voluntary cooperation does not work
10 Scalability Issues
- Broadcast
- TTL
- Overlay does not have the same topology as the underlying IP network
  - Example of “bad” topology
  - In real-life studies, the topology of Gnutella is independent of the IP topology, so nodes that are close in Gnutella may be distant over IP and vice versa
11 Scalability (cont.)
- Let the neighborhood size be $n$ (actual number is between 3 and 4)
- Let the TTL be $t$
- Maximal number of reachable nodes is $n \sum_{i} (n+1)^{i-1}$, where $i = 1, \ldots, t$
- Maximal number is reached in a tree. Actual number depends on the graph of the overlay network
- Query message is of length 83 bytes
- Bandwidth requirements: bad
12 Security Issues
- Standard P2P security issues
- Redirection and DoS
  - When Eve receives a Query, she returns a Query Hit with a different IP address: Alice’s address
- Alice may be hit with a large number of requests
  - Even if Alice doesn’t have a Gnutella client!
- Since Gnutella peers may fail and there may not be many copies of the data, requests tend to be
  - Frequent: as often as once every few seconds
  - Long term: 24 hours or more
13 Gnutella-BitTorrent Comparison
- Gnutella
  - Simple
  - Minimal initial bootstrap (no need for a web server that maps the required file to a tracker)
  - Robust to failures
- BitTorrent
  - Incentives
  - More efficient design reduces control messages (tracker vs. broadcast)
  - Download in pieces is better adapted to P2P and large files
14 Gnutella-Chord comparison
- Chord
  - Search always returns the correct answer
  - Search in O(log n)
  - Only exact search possible
  - High churn affects network structure
- Gnutella
  - Search may not find the item
  - Search in ~O(n)
  - Approximate search possible
  - High churn does not affect structure
15 Strategies for coordinated download
16 Strategy 1: linear transfer of pieces
- Simple case: assume that for all $i$, $u_i = d_i = \alpha$, the file size is $F$, and there are $F/\alpha$ pieces, with one piece transferred in each time slot
- Peer 1 downloads from the server
  - One piece at a time
- Peer $i$ downloads from peer $i-1$ for $i = 2, \ldots, N$
  - One piece at a time
- Analysis
  - Completion time for network: $N - 1 + F/\alpha$
  - Average completion time: $(N-1)/2 + F/\alpha$
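The two completion times for the linear strategy can be checked with a slot-by-slot simulation. This is a sketch under stated assumptions: time is measured in slots, `num_pieces` stands for $F/\alpha$, node 0 is the server, and a peer can only forward a piece it held at the start of the slot. The function name and data layout are illustrative.

```python
def linear_completion_times(num_peers, num_pieces):
    """Slot at which each peer 1..N holds every piece in the linear chain."""
    # have[i] = number of pieces held by node i; node 0 is the server.
    have = [num_pieces] + [0] * num_peers
    done = [None] * (num_peers + 1)
    slot = 0
    while any(d is None for d in done[1:]):
        slot += 1
        prev = have[:]  # transfers in a slot use the start-of-slot state
        for i in range(1, num_peers + 1):
            # Peer i pulls its next piece if its predecessor already has it.
            if prev[i] < num_pieces and prev[i - 1] > prev[i]:
                have[i] = prev[i] + 1
                if have[i] == num_pieces:
                    done[i] = slot
    return done[1:]


if __name__ == "__main__":
    times = linear_completion_times(num_peers=4, num_pieces=5)
    print(times, max(times), sum(times) / len(times))
```

With $N = 4$ and $F/\alpha = 5$, the last peer finishes at slot $N - 1 + F/\alpha = 8$ and the mean is $(N-1)/2 + F/\alpha = 6.5$, matching the analysis on the slide.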
17 Strategy 2: exponential transfer of file
- Same assumptions as the linear strategy
- Between time $i(F/\alpha)$ and $(i+1)(F/\alpha)$, peer $j$ uploads the full file to peer $j + 2^i$
  - $i = 0, 1, \ldots, (\log N) - 1$
  - $j = 1, \ldots, 2^i$
- Analysis
  - Completion time for network: $(F/\alpha)\log N$
  - Claim: average completion time (for $N = 2^k$) is $(F/\alpha)(\log N - (N-1)/N)$
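The schedule for the exponential strategy can be replayed directly to check both figures. This sketch makes two assumptions that the slide does not state outright: peer 1 holds the file at time 0 and is counted among the $N$ peers, and an upload that starts at time $i(F/\alpha)$ completes at $(i+1)(F/\alpha)$. Times are returned in units of $F/\alpha$; the function name is illustrative.

```python
def exponential_completion_rounds(num_peers):
    """Time (in units of F/alpha) at which each peer 1..N holds the file."""
    k = num_peers.bit_length() - 1
    if num_peers < 1 or (1 << k) != num_peers:
        raise ValueError("num_peers must be a power of two")
    done = {1: 0}  # assumption: peer 1 already has the file at time 0
    for i in range(k):
        for j in range(1, 2 ** i + 1):
            # Peer j uploads the whole file to peer j + 2^i during round i.
            done[j + 2 ** i] = i + 1
    return [done[p] for p in range(1, num_peers + 1)]


if __name__ == "__main__":
    rounds = exponential_completion_rounds(8)
    print(rounds, max(rounds), sum(rounds) / len(rounds))
```

For $N = 8$ the last peers finish after $\log_2 N = 3$ rounds, and the mean is $3 - 7/8 = 2.125$ rounds, which agrees with the claimed $(F/\alpha)(\log N - (N-1)/N)$.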
18 Proof I
- $2^i$ peers complete download at time $i$ for $i = 0, 1, \ldots, k-1$ ($k = \log N$)
- Average time to download: [equation missing from transcript]
- Define: [equation missing from transcript]
19 Proof II
- Calculate $S$ by: [equation missing from transcript]
- Therefore: [equation missing from transcript]
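The equations on the two proof slides did not survive in this transcript. One derivation that reproduces the claimed average is sketched here. It rests on readings that are assumptions, not statements from the slides: the $2^i$ peers served in round $i$ finish at the end of that round, at time $(i+1)F/\alpha$; peer 1 holds the file at time 0 and is counted among the $N = 2^k$ peers; and $S$ denotes $\sum_{i=0}^{k-1} i\,2^i$.

```latex
\begin{align*}
S &= \sum_{i=0}^{k-1} i\,2^i, \\
2S - S &= (k-1)\,2^k - \sum_{i=1}^{k-1} 2^i
        = (k-1)\,2^k - (2^k - 2)
        = (k-2)\,2^k + 2, \\
\sum_{i=0}^{k-1} (i+1)\,2^i &= S + (2^k - 1) = (k-1)\,2^k + 1, \\
\text{average time} &= \frac{F}{\alpha}\cdot\frac{(k-1)N + 1}{N}
        = \frac{F}{\alpha}\left(\log N - \frac{N-1}{N}\right).
\end{align*}
```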
20 Comparison of strategies 1 & 2
- Completion time: compare $N - 1 + F/\alpha$ and $(F/\alpha)\log N$
- First strategy is better for large files: $(N-1)/(\log N - 1) < F/\alpha$
- Otherwise, second strategy is better
- Can we improve on both?
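A short numeric check makes the crossover concrete. This is a sketch that only evaluates the two network completion times stated for strategies 1 and 2; it assumes base-2 logarithms, and `file_time` is a hypothetical parameter name standing for $F/\alpha$.

```python
import math


def network_completion_times(num_peers, file_time):
    """Return (linear, exponential) network completion times.

    file_time stands for F/alpha, the time to send the whole file once.
    """
    linear = num_peers - 1 + file_time
    exponential = file_time * math.log2(num_peers)
    return linear, exponential


if __name__ == "__main__":
    for file_time in (2, 10):
        linear, exponential = network_completion_times(8, file_time)
        winner = "linear" if linear < exponential else "exponential"
        print(file_time, linear, exponential, winner)
```

For $N = 8$ the two times are equal at $F/\alpha = 7/2$: a file with $F/\alpha = 2$ finishes sooner with the exponential strategy (6 versus 9), while $F/\alpha = 10$ favors the linear one (17 versus 30).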
21 Strategy 3
- $N = 2^k$, including one seed
- $A_i$ is the set of peers that have piece $i$, except for the seed
- Initial strategy
  - Strategy 2
  - Used when at least one user does not have a piece
  - Ends after $k$ time slots (steps)
22 Strategy 3 (cont.)
- After the initial strategy, peer selection changes:
  - At time slot $k+i-1$, $A_i$ includes $n/2$ nodes
  - $A_i$ sends the $i$-th piece to all the other nodes ($n/2 - 1$)
  - Nodes from other sets and the seed replicate their pieces on $A_i$
  - At each round one peer of $A_i$ is idle
23 Initial Strategy
- At $t = 0$: seed has all pieces
- At $t = S$: $|A_1| = 2^0$
- At $t = 2S$: $|A_1| = 2^1$, $|A_2| = 2^0$
- At $t = 3S$: $|A_1| = 2^2$, $|A_2| = 2^1$, $|A_3| = 2^0$
[Figure: pieces spreading from the seed over time, at t=0, t=S, t=2S]
Arnaud Legout © 2010
24 Initial Strategy
- At $t = jS$: $|A_i| = 2^{j-i}$, $i \le j$
- This strategy ends when $j = k$
  - All $n - 1 = 2^k - 1$ leechers have a piece
[Figure: pieces spreading from the seed over time, at t=0, t=S, t=2S]
25 Second Peer Selection Strategy
- An example
  - 4 pieces and $k = 3$
- Assume that the seed stops sending pieces once it has served one copy of the content
  - Easier to model
  - Lower bounds the performance, because it uses fewer resources
26 Second Peer Selection Strategy
- We confirm that for $k = 3$ all peers have a piece
- $t = 3S$
  - There are $2^3/2$ copies of piece 1
  - There are $2^3/2^2$ copies of piece 2
  - There are $2^3/2^3$ copies of piece 3
- $t = 4S$
  - All have piece 1
  - There are $2^3/2$ copies of piece 2
  - There are $2^3/2^2$ copies of piece 3
  - There are $2^3/2^3$ copies of piece 4
[Figure: pieces held by each peer from t=3S to t=7S, ending with all peers holding all pieces]
27 Second Peer Selection Strategy
- $t = 5S$
  - All have pieces 1 and 2
  - There are $2^3/2$ copies of piece 3
  - There are $2^3/2^2$ copies of piece 4
- $t = 6S$
  - All have pieces 1, 2, and 3
  - There are $2^3/2$ copies of piece 4
- $t = 7S$
  - All have pieces 1, 2, 3, and 4
[Figure: pieces held by each peer from t=3S to t=7S, ending with all peers holding all pieces]
28 Results
- At $t = kS$ each peer has a single piece
  - $|A_i| = 2^{k-i}$, $i \le k$
- After slot $k+i$ for $i \le |F|/\alpha$
  - Each peer has pieces $1, \ldots, i$
  - $|A_{i+1}| = n/2$ peers have piece $i+1$ and replicate it on the $n/2 - 1$ other peers
    - The seed already has piece $i+1$
  - Each other peer replicates a piece on the peers in $A_{i+1}$
  - At slot $m$, the seed stops serving pieces
  - For all $j > i+1$, $|A_j|$ doubles: $|A_j| \to 2|A_j|$
29 Results
- Termination
  - At each slot the number of copies of each piece is doubled
  - When there are $n = 2^k$ peers, a piece needs $k+1$ slots to appear on all peers
  - We consider that the first slot for piece $x$ is when $x$ is sent by the seed to the first peer
  - For $m$ pieces, $k+m$ slots are required to distribute all pieces on all peers
30 Results
- Termination time
  - All peers have finished at $t = (k+m)S$
  - $t = (k+m)S = T(k+m)/m = (T/m)\log_2 n + T$
  - Decreases in $1/m$ compared to the content-based model
  - Does not account for piece overhead
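The termination formula can be evaluated for a few piece counts to see the $1/m$ effect. This is a minimal sketch that assumes $S = T/m$ (the slot is the time to send one piece) and $k = \log_2 n$, as the chain of equalities on the slide implies. The function and parameter names are illustrative.

```python
def termination_time(num_peers, num_pieces, content_time):
    """Time (k + m) * S at which all peers hold all m pieces.

    content_time is T, the time to send the whole content once.
    """
    k = num_peers.bit_length() - 1
    if num_peers < 1 or (1 << k) != num_peers:
        raise ValueError("num_peers must be a power of two")
    slot = content_time / num_pieces   # S = T / m
    return (k + num_pieces) * slot


if __name__ == "__main__":
    for m in (1, 4, 16):
        print(m, termination_time(num_peers=8, num_pieces=m, content_time=4.0))
```

With $n = 8$ and 4 pieces the result is 7 slots, the same $t = 7S$ at which the slide example ends. With a single piece the time is $(k+1)T$, and as $m$ grows it approaches $T$.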
31 Results
- Mean download time
  - With the proposed strategy, at $kS$ each peer has only one piece
  - As the number of pieces doubles at each slot, one needs $k+m-1$ slots for half of the peers to have all the pieces
    - At $k$, 1 piece; at $k+1$, 2 pieces; at $k+m-1$, $m$ pieces
  - But at $m$, the seed stops serving pieces, thus at $k+m-1$ only half of the peers have $m$ pieces; the rest have $m-1$ pieces
  - The other half receives the last pieces at $k+m$
32 Results
- Mean download time