Published by Kerry Lawrence. Modified over 9 years ago.
1
Scalability
1. Optimizing P2P Networks: Lessons learned from social networking
   a) Social Networks
   b) Lessons Learned
   c) Are P2P Networks Social?
   d) Organizing P2P Networks
2. Peer Topologies
   a) Centralized, Ring, Hierarchical & Decentralized
   b) Hybrid:
      o Centralized-Ring
      o Centralized-Centralized
      o Centralized-Decentralized
   c) Reflector Nodes
3. Gnutella Case Studies
   a) 3 case studies
2
“You can’t scale better than by utilising someone else’s computer.” Paul James
3
LimeWire Gnutella Coding
4
Social Networks
Stanley Milgram (Harvard professor): 1967 social networking experiment
How many ‘social hops’ would it take for a message to traverse the US population (200 million)?
Posted 160 letters to randomly chosen people in Omaha, Nebraska
Asked them to try to pass these letters to a stockbroker working in Boston, Massachusetts
Rules:
- use intermediaries whom they know on a first-name basis
- choose them intelligently
- make a note at each hop
42 letters made it!
Average of 5.5 hops
Demonstrated the ‘small world effect’
Showed that the social network of the United States is indeed connected with a path length (number of hops) of around 6: the ‘6 degrees of separation’
Does this mean that it takes 6 hops to traverse 200 million people?
5
Lessons Learned from Milgram’s Experiment
Social circles are highly clustered
A few members have wide-ranging connections
- these form a bridge between far-flung social clusters
- this bridging plays a critical role in bringing the network closer together
For example:
- a quarter of all letters passed through a local storekeeper
- a half were mediated by just 3 people
Lessons Learned
These people acted as gateways or hubs between the source and the wider world
A small number of bridges dramatically reduces the number of hops
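The effect of a few bridges on hop counts can be illustrated with a small sketch. It is not Milgram's data: the 100-node ring of tight local clusters, the choice of node 0 as the "storekeeper" hub, and the nine bridge targets are all assumptions made for illustration.

```python
from collections import deque


def ring_lattice(n, k):
    """Each node links to its k nearest neighbours on either side (tight local clusters)."""
    graph = {node: set() for node in range(n)}
    for node in range(n):
        for offset in range(1, k + 1):
            neighbour = (node + offset) % n
            graph[node].add(neighbour)
            graph[neighbour].add(node)
    return graph


def average_hops(graph):
    """Mean shortest-path length over all ordered pairs, using BFS from every node."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in dist:
                    dist[neighbour] = dist[current] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def add_bridges(graph, hub, targets):
    """Give one well-connected member long-range links to distant clusters."""
    for target in targets:
        graph[hub].add(target)
        graph[target].add(hub)


clustered = ring_lattice(100, 2)
hops_before = average_hops(clustered)
add_bridges(clustered, hub=0, targets=range(10, 100, 10))  # hypothetical hub
hops_after = average_hops(clustered)
print(f"average hops without bridges: {hops_before:.2f}")
print(f"average hops with 9 bridges:  {hops_after:.2f}")
```

Only nine extra links are added to a graph that already has two hundred, yet every pair of nodes can now route through the hub, so the average hop count drops.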
6
From Social Networks to Computer Networks…
There are a number of similarities to social networks:
- People = peers
- Intermediaries = hubs, gateways or rendezvous nodes (in JXTA terms)
- Number of intermediaries passed through = number of hops
Are P2P Networks Special then?
P2P networks are more like social networks than other types of computer network because they are often:
- Self-organizing
- Ad hoc
- Clustered based on prior interactions (like we form relationships)
- Decentralized in discovery and communication (like we form neighbourhoods, villages, cities etc.)
7
Peer to Peer: What’s the problem?
Problem: how do we organize peers within ad hoc, multi-hop pervasive P2P networks?
- a network of self-organizing peers organized in a decentralized fashion
- such networks can rapidly expand from a few hundred peers to several thousand or even millions
P2P Environment Recap: Unreliable Environments
- Peers connecting/disconnecting: network failures to participation
- Random failures, e.g. power outages, cable or DSL failure, hackers
- Personal machines are much more vulnerable than servers
- Algorithms have to cope with this continuous restructuring of the network core
P2P systems need to treat failures as normal occurrences, not freak exceptions
- they must be designed in a way that promotes redundancy, with the tradeoff of a degradation of performance
8
So, how do we Organize Networks in Order to Get Optimum Performance?
For P2P, performance does not mean abstract numerical benchmarks, e.g. how many milliseconds will it take to compute this many millions of FFTs?
Rather, it means asking questions like:
- How long will it take to retrieve this particular file?
- How much bandwidth will this query consume?
- How many hops will it take for my packet to get to a peer on the far side of the network?
- If I add/remove a peer, will the network still be fault tolerant?
- Does the network scale as we add more peers?
Such networks can rapidly expand from a few hundred peers to several thousand or even millions
9
Performance Issues in P2P Networks
3 main factors make P2P networks more sensitive to performance issues:
1. Communication
   - Fundamental necessity
   - Users connected via different connection speeds
   - Multi-hop
2. Searching
   - No central control, so more effort is needed
   - Each hop adds to total bandwidth; problems: time-outs
3. Equal Peers
   - Free riders: an imbalance in the harmony of the network
   - Degrades performance for others
   - Need to get this right to adjust accordingly
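To make "each hop adds to total bandwidth" concrete, here is a minimal sketch of a flooded search with a hop limit (TTL). It is a simplification rather than the Gnutella wire protocol: the six-peer overlay, the peer names and the `flood` helper are all hypothetical.

```python
from collections import deque

# Hypothetical overlay: each peer lists the peers it is connected to.
overlay = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "E"],
    "C": ["A", "B", "F"],
    "D": ["A", "F"],
    "E": ["B"],
    "F": ["C", "D"],
}


def flood(graph, origin, ttl):
    """Flood a query from origin; return (peers reached, messages sent)."""
    seen = {origin}
    messages = 0
    queue = deque([(origin, None, ttl)])
    while queue:
        peer, sender, remaining = queue.popleft()
        if remaining == 0:
            continue  # hop limit exhausted, the query stops here
        for neighbour in graph[peer]:
            if neighbour == sender:
                continue  # never send the query back where it came from
            messages += 1  # every forwarded copy costs bandwidth
            if neighbour not in seen:
                # duplicates are still received, but only the first copy is forwarded
                seen.add(neighbour)
                queue.append((neighbour, peer, remaining - 1))
    return len(seen) - 1, messages


for ttl in (1, 2, 3):
    reached, sent = flood(overlay, "A", ttl)
    print(f"TTL {ttl}: reached {reached} peers using {sent} messages")
```

On this toy overlay a TTL of 2 already reaches all five other peers with 8 messages; raising the TTL to 3 sends a ninth message and reaches nobody new, which is the kind of wasted traffic that flooding produces.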
10
Peer Topologies
Core
- Centralized
- Ring
- Hierarchical
- Decentralized
Hybrid
- Centralized-Ring
- Centralized-Centralized
- Centralized-Decentralized
11
Centralized
- Client/server
- Web servers
- Databases
- Napster search
- Instant Messaging
- Popular Power
12
Ring
- Fail-over clusters
- Simple load balancing
- Assumption
  - Single owner
13
Hierarchical
- Tree structure
- DNS
- Usenet (sort of)
14
Decentralized
- Gnutella
- Freenet
- Internet routing
15
Centralized + Ring
- Robust web applications
- High availability of servers
16
Centralized + Centralized
- N-tier apps
- Database-heavy systems
- Web services gateways
- Google.com uses this topology to deliver its service
17
Centralized + Decentralized
- New wave of P2P
- Clip2 Gnutella Reflector (next)
- FastTrack
  - KaZaA
  - Morpheus
- Email
- Like social networks, perhaps?
18
Reflector Nodes
[Figure: Reflector index diagram; labels: ID0:F1.mp3, F1.mp3, F2.mp3, F3.mp3, peers 0, 1, 2]
Known as ‘super peers’; in JXTA these are Rendezvous peers
Cache the file lists of connected users and maintain an index
When a query is issued, the Reflector does not retransmit it; it answers the query from its own memory
Do they remind you of anything?
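The indexing behaviour of a Reflector can be sketched as a small in-memory lookup table. This is an illustration of the idea only, not Clip2's implementation: the `Reflector` class, its method names, and the peer IDs and file lists (loosely echoing the labels in the slide's figure) are hypothetical.

```python
class Reflector:
    """Hypothetical super peer that indexes the files of its connected leaf peers."""

    def __init__(self):
        self.index = {}  # file name -> set of peer ids sharing it

    def register(self, peer_id, file_names):
        """Cache a connecting peer's file list in the index."""
        for name in file_names:
            self.index.setdefault(name, set()).add(peer_id)

    def unregister(self, peer_id):
        """Forget a peer that has disconnected."""
        for name in list(self.index):
            self.index[name].discard(peer_id)
            if not self.index[name]:
                del self.index[name]

    def query(self, file_name):
        """Answer from the local index; nothing is forwarded to the leaves."""
        return sorted(self.index.get(file_name, set()))


reflector = Reflector()
reflector.register(0, ["F1.mp3"])
reflector.register(1, ["F1.mp3", "F2.mp3"])
reflector.register(2, ["F3.mp3"])
print(reflector.query("F1.mp3"))  # peers 0 and 1 share this file
```

Because the answer comes from the Reflector's own memory, the leaf peers behind it see no query traffic at all, which is what lets slow peers sit at the edge of the network.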
19
Napster = Gnutella?
[Figure: Napster (users connecting to Napster.com; duplicated servers Napster, N2, N3) compared with Gnutella (users connecting through super peers)]
Gnutella super peers: 1. Natural? 2. Reflector (clip2.com)
20
The Gnutella Network Today
The figure below is a view of the topology of a Gnutella network as shown on the web site of LimeWire, the popular Gnutella file-sharing client. Notice how the power-law or centralized-decentralized structure is demonstrated.
21
Another View of the Gnutella Network
22
Gnutella Studies 1: Free Riding
E. Adar and B.A. Huberman (2000), “Free Riding on Gnutella,” First Monday 5(10), http://firstmonday.org/issues/issue5_10/adar/index.html
Two types of free riding:
1. users that download files but never provide any files for others to download
2. users that have undesirable content
They found:
- 22,084 of the 33,335 peers in the network (66%) share no files
- 24,347 (73%) share ten or fewer files
- the top 1 percent (333 hosts) represent 37 percent of the total files shared
- 20 percent (6,667 hosts) share 98% of the files
This shows that, even without Gnutella Reflector nodes, the Gnutella network naturally converges into a centralized + decentralized topology, with the top 20% of nodes acting as super peers or reflectors
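The percentages and host counts on this slide can be checked against the quoted totals with a few lines of arithmetic. Only the three peer counts come from the slide; the variable names and the `percent` helper are illustrative.

```python
TOTAL_PEERS = 33_335
SHARING_NO_FILES = 22_084
SHARING_TEN_OR_FEWER = 24_347


def percent(part, whole):
    """Share of the whole, as a percentage."""
    return 100 * part / whole


no_files_pct = percent(SHARING_NO_FILES, TOTAL_PEERS)
ten_or_fewer_pct = percent(SHARING_TEN_OR_FEWER, TOTAL_PEERS)
top_1_percent_hosts = round(0.01 * TOTAL_PEERS)
top_20_percent_hosts = round(0.20 * TOTAL_PEERS)

print(f"share no files:      {no_files_pct:.0f}%")
print(f"share ten or fewer:  {ten_or_fewer_pct:.0f}%")
print(f"top 1% of hosts:     {top_1_percent_hosts}")
print(f"top 20% of hosts:    {top_20_percent_hosts}")
```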
23
Gnutella Studies 2: Equal Peers
Study on Reflector Nodes [clip] www.clip2.com
Studied Gnutella for one month
Noted an apparent scalability barrier when query rates went above 10 per second. Why?
A Gnutella query is 560 bits long, and queries make up approximately one quarter of traffic
Each peer is connected to three peers, so: 560 * 10 * 3 = 16,800 bits per second
This is a quarter of the traffic, so total traffic is 67,200 bits per second
A 56K link cannot keep up with this amount of traffic
One node connected in the wrong place can grind the whole network to a halt
This is why P2P networks place slower nodes at the edges
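The calculation can be reproduced directly. Since the query length is given in bits, the products are in bits per second, which is what makes the comparison with a 56 kbit/s modem meaningful. The constants come from the slide; the names, and the derived break-even query rate, are additions for illustration.

```python
QUERY_BITS = 560              # length of one Gnutella query
QUERIES_PER_SECOND = 10       # rate at which the barrier was observed
CONNECTIONS_PER_PEER = 3      # each peer is connected to three others
QUERY_SHARE_OF_TRAFFIC = 0.25 # queries are about a quarter of all traffic
MODEM_BITS_PER_SECOND = 56_000

# Bits per second spent on queries alone, then scaled up to all traffic.
query_traffic = QUERY_BITS * QUERIES_PER_SECOND * CONNECTIONS_PER_PEER
total_traffic = query_traffic / QUERY_SHARE_OF_TRAFFIC
saturated = total_traffic > MODEM_BITS_PER_SECOND

# Highest query rate a 56K link could carry under the same assumptions.
max_query_rate = (MODEM_BITS_PER_SECOND * QUERY_SHARE_OF_TRAFFIC) / (
    QUERY_BITS * CONNECTIONS_PER_PEER
)

print(f"query traffic: {query_traffic} bit/s")
print(f"total traffic: {total_traffic:.0f} bit/s")
print(f"56K link saturated: {saturated}")
print(f"break-even query rate: {max_query_rate:.1f} queries/s")
```

Under these assumptions a modem peer saturates at a little over 8 queries per second, which is consistent with a barrier appearing at around 10.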
24
Gnutella Studies 3: Communication
Matei Ripeanu, “Peer-to-Peer Architecture Case Study: Gnutella Network”, on-line at: http://people.cs.uchicago.edu/~matei/PAPERS/P2P2001.pdf
Studied the topology of Gnutella over several months and reported two findings:
1. The Gnutella network shares the benefits and drawbacks of a power-law structure
   - networks that organize themselves so that most nodes have a few links and a small number of nodes have many
   - found to show an unexpected degree of robustness when facing random node failures
   - vulnerable to attacks, e.g. removing a few of the super nodes can have a massive effect on the function of the network as a whole
2. The Gnutella network topology does not match well with the underlying Internet topology, leading to inefficient use of network bandwidth
He gave 2 suggestions:
1. use an agent to monitor the network and intervene by asking servents to drop/add links to keep the topology optimal
2. replace the Gnutella flooding mechanism with a smarter routing and group communication mechanism
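The contrast between random failures and targeted attacks can be sketched on a toy overlay. This is not Ripeanu's measured topology: it assumes a hypothetical centralized + decentralized network of 10 fully meshed super peers with 20 leaf peers each, and compares losing 5 random peers with losing the 5 best-connected ones.

```python
import random
from collections import deque


def super_peer_overlay(hubs, leaves_per_hub):
    """Fully meshed super peers, each serving its own set of leaf peers."""
    graph = {}
    hub_ids = [f"hub{i}" for i in range(hubs)]
    for hub in hub_ids:
        graph[hub] = {other for other in hub_ids if other != hub}
    for hub in hub_ids:
        for j in range(leaves_per_hub):
            leaf = f"{hub}-leaf{j}"
            graph[leaf] = {hub}
            graph[hub].add(leaf)
    return graph


def largest_component(graph, removed):
    """Size of the largest connected group once the removed peers are gone."""
    alive = set(graph) - set(removed)
    seen = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        size = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in graph[node]:
                if neighbour in alive and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


overlay = super_peer_overlay(hubs=10, leaves_per_hub=20)
rng = random.Random(1)  # fixed seed so the run is repeatable
random_failures = rng.sample(sorted(overlay), 5)
targeted_attack = sorted(overlay, key=lambda n: len(overlay[n]), reverse=True)[:5]

after_random = largest_component(overlay, random_failures)
after_attack = largest_component(overlay, targeted_attack)
print(f"peers still connected after 5 random failures: {after_random} of {len(overlay)}")
print(f"peers still connected after 5 targeted removals: {after_attack} of {len(overlay)}")
```

Random failures mostly hit leaf peers, so they usually cost little more than the removed peers themselves, whereas removing five super peers strands every leaf behind them and leaves only 105 of the 210 peers connected.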
25
What about other topologies: The Future?
Centralized + Hierarchical?
- Back-end tree of information
- Caching architectures
Decentralized + Ring?
- P2P network of fail-over clusters
More?
26
Closing Remarks
1. Summary
   a) Centralized + Decentralized: understand from the original Gnutella to the new models
   b) The role of Reflector nodes
2. Further Information: Distributed Hashtable Models
   a) Pastry: http://research.microsoft.com/~antr/pastry
   b) Chord: http://www.pdos.lcs.mit.edu/chord/