© 2013 A. Haeberlen, Z. Ives NETS 212: Scalable and Cloud Computing 1 University of Pennsylvania Peer-to-Peer Systems November 26, 2013.

© 2013 A. Haeberlen, Z. Ives Announcements How is the PennBook project going? You can sign up for demo slots tomorrow There will be slots on several different days If your team has special constraints (finals, travel,...), please sign up as soon as you can - the slots go quickly! Visualizer example code & handout available! Should be useful for building the "friend visualization" URL available on Piazza and on the course web page Second midterm on December 10th HW4MS2 is due tomorrow at 10pm EST 2 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives PennBook FAQ Should I have multiple domains - one for the user, one for activities, etc.? Should we have a separate MapReduce job for ranking activities? Should we have a numeric UserID that is different from the user name? Should we have a status ID as well? How frequently should we run friend recommendation? How do we do offline message queueing? EJS confusion 3 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Scalability is important! 4 University of Pennsylvania http://www.washingtonpost.com/blogs/wonkblog/wp/2013/10/05/a-techie-walks-us-through-healthcare-govs-two-big-problems/

© 2013 A. Haeberlen, Z. Ives Scalability and robustness What do we have to do to scale the systems we discussed so far? How to double capacity of a Hadoop cluster? A web service? What limits scalability? Are these systems resilient to faults and attacks? If so, against which ones? How is this achieved? 5 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Problem #1: Centralized resources What if the system needs LOTS of resources? Example: Telephony system for 10s of millions of users Cloud can provide the resources, but who should pay? How can they raise the necessary money? 6 University of Pennsylvania AliceBob Alice's customers

© 2013 A. Haeberlen, Z. Ives Problem #2: Central state + control What can happen if some nodes are much more important than the others? Example: Name node in Hadoop Consequences for scalability? For robustness? For security? What can happen if someone doesn't like the system? 7 University of Pennsylvania foo.txt: 3,9,6 bar.data: 2,4 9 9 9 9 3 3 3 2 2 2 4 4 4 6 6 Name node Data nodes Client

© 2013 A. Haeberlen, Z. Ives Approach: Decentralization How can we make systems less centralized? Idea #1: Utilize resources of ALL nodes The 'client' nodes can help the system by contributing storage, bandwidth, computation,... Idea #2: Remove centralized components Avoid having individual nodes or systems that are crucial to the operation of the system 8 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives "P2P = File sharing" Some of the early P2P applications were used for file sharing (Napster, Gnutella,...) Some people even believe they are the same But P2P is not the same as file sharing! File sharing: A specific application P2P: A design principle for distributed systems And file sharing is not the only application! Other examples: Streaming media, telephony, content distribution, routing, volunteer computing,... 9 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Examples of deployed systems Examples of partly centralized systems: Microsoft's Skype (telephony) Akamai NetSession (content distribution) BitTorrent (content distribution) SETI@home/BOINC (volunteer computing) Amazon Dynamo (key-value store) Examples of decentralized systems: Freenet (censorship-resistant data store) Gnutella (file sharing) CoralCDN (content distribution) BGP (the Internet's interdomain routing system) NNTP and SMTP (news and mail distribution) 11 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Recap: Decentralization Systems can be centralized in different ways Centralized resources, centralized functionality Example: Client-server system Centralization can cause problems Example: What if centralized components fail? Idea: Decentralization Distribute the work more equally among the nodes Leverage additional resources, eliminate bottlenecks, etc. Partly vs. fully decentralized 12 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Our plan for today Partly decentralized system: BitTorrent Swarming Incentives: Tit for tat Decentralized system: Pastry Overlays; unstructured vs. structured Key-based routing Distributed hashtables KBR in Pastry Challenges Swarming and ISPs Gaming BitTorrent; Sybil and Eclipse attacks 13 University of Pennsylvania NEXT

© 2013 A. Haeberlen, Z. Ives Characteristics of partly centralized systems Contains some infrastructure components Example: Central controller that maintains a list of participating nodes But: Infrastructure component is not involved in resource-intensive operations Example: Data is downloaded or uploaded directly to peers Sometimes called a "peer-assisted" system 14 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives An example Suppose we want to ship a DVD image to 10,000 clients. How do we do this? Option #1: Server does all the work Example: 1 Gbps upstream  Need about 190 hours Option #2: Let the clients help 1 Mbps upstream x 10,000 = 10 Gbps! Even if the server has only 1 Mbps, can finish in 19 hours! 15 University of Pennsylvania 1 Gbps 1 Mbps x 1 x 10,000

© 2013 A. Haeberlen, Z. Ives Trackers and torrent files How do clients find peers to connect to? Clients connect to a special tracker node Tracker responds with the IP+port of a few other peers who are downloading the same file Modern BitTorrent clients are trackerless and use a DHT instead (more about this later) How do clients find the tracker? Clients begin by downloading a 'torrent file' (e.g., from a web server), which has the URL of the tracker Torrent file also contains a SHA1 hash of each file block Why is this needed? 17 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives BitTorrent Simplified BitTorrent session: 1. Download the 'torrent file' 2. Connect to the tracker and get a list of peers 3. Connect to the peers - initially as a 'leecher' 4. While file is not yet fully downloaded: Advertise to peers which blocks are available locally Request blocks from peers Compare hash of downloaded blocks to hash in torrent file (why?) 5. Turn into a 'seeder', i.e., continue uploading to peers without downloading 18 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Incentives Many users would rather not upload content Some users pay per byte (e.g., cellular networks) Uploading may take bandwidth from other applications Upload traffic may introduce jitter or queueing delay (VoIP!) Danger: Tragedy of the commons Everyone wants to download, but nobody uploads Idea: Provide an incentive for uploading Many possible incentives (name a few!) BitTorrent's approach is based on reciprocity 19 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Tit for tat Idea: Upload to peers with best download rate Result: Everyone has an incentive to upload Instance of an old, successful idea Goes back to Axelrod's tournament (iterated prisoner's dilemma) http://www-cs-faculty.stanford.edu/~eroberts/courses/soco/projects/1998-99/game-theory/axelrod.html Attempts to achieve pareto efficiency How this is used in BitTorrent: At any given time, peer uploads to a fixed # of other peers Peers are chosen based on current download rate All other peers are 'choked' (no uploads) Additionally, one peer is optimistically 'unchoked' (why?) 20 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Recap: Partly centralized systems Contain a few centralized components Example: HDFS namenode, BitTorrent tracker,... However, most of the actual work is done by the peers Some pros and cons: More scalable than centralized systems 'Organic growth': More peers potentially means more demand, but also more resources But: Centralized component is single point of failure and eventually becomes a bottleneck 22 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Characteristics of decentralized systems No dedicated nodes that are critical for the operation of the system No inherent bottlenecks Can scale very very far Potentially resilient to failure, attack, legal challenges Often a high degree of symmety Most functions can be performed by any node Example: Storing a certain file, forwarding packets,... Does this mean that all nodes have to be exactly the same? 24 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives The bootstrap problem How does a new node join the system? Partly centralized system: Can ask the 'controller' for peers But in a decentralized system, there is no 'controller'! Need to find at least one existing peer. How? Idea #1: Hardcode a few IPs in the software Idea #2: Keep a web page with a few current IPs Idea #3: Use an IRC channel Idea #4: Use DNS Idea #5: Use anycast... 25 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Overlays Peers must somehow connect to each other Result: Logical network topology Relationship to the physical network topology? Direct link in logical topology does not imply direct link in physical topology, or vice versa But link in logical topology requires path in physical topology Logical topology is an overlay 26 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Unstructured overlay What should the logical topology be like? One option: No constraints at all When a peer joins, it randomly connects to some other peers How many should it connect to? Result: Unstructured overlay (Example: Gnutella) Where should we store content? How can we locate it? Name at least two ways to do each 27 University of Pennsylvania X Where is object X?

© 2013 A. Haeberlen, Z. Ives Store+retrieve: Unstructured overlay Content can be stored anywhere Example: On the node that inserted it Maybe add a few replicas or pointers on other nodes To retrieve: Flooding or random walks What should be the scope of the flood? Length of the walk? Pros and cons? Very good at finding hay, not so good at finding needles Finding unpopular objects can be expensive Object may not be found even if it exists 28 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Coordination: Epidemic protocol How can we disseminate information in an unstructured overlay? Idea: Use epidemic protocol ('Gossiping') Source node tells a few of its peers, who in turn tell a few of their own peers, and so on Similar to spread of infections or rumors in a population Pro: Very simple and extremely robust Con: Can be slow; difficult to target a subset of the nodes 29 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Maintenance Peers join and leave over time ('churn') Temporary departure: Node switched off, loses connectivity... Permanent departure: Reinstallation, node decommissioned... Need overlay maintenance Replace links to departed nodes with new links Otherwise, overlay will disconnect eventually! May need replication + replica maintenance If objects need to remain available despite churn Permanent departures destroy replicas; need to replace them How many replicas should we make? 30 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Recap: Unstructured overlays No restrictions on which peers can connect Result: Random topology A few techniques for unstructured overlays Flooding or random walks to locate content Epidemic protocol for coordination Both advantages and disadvantages Advantages: Simple, robust, low-maintenance Disadvantage: Not very efficient (flooding!) 31 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Idea: Add structure! Give each node a unique identifier Use the identifier to restrict who can connect to whom Example: Identifiers are from a circular ID space, and nodes form a ring (Pastry) Result: Can assign specific roles to nodes For example, node 42 can be responsible for object 39 Everyone knows where to look for object 39! Structured overlay 89 34 42 51 Unstructured (e.g. Gnutella) Structured (e.g. Pastry) Identifier space 39 Node Object University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Consistent hashing On which nodes should objects be stored? Assumption: Each object has an identifier too ('key') Idea #1: Hashing Example: k nodes, object O is stored on node (O mod k) What happens when nodes join or leave? Idea #2: Consistent hashing Object O is stored on node whose ID is closest to O What happens when nodes join or leave? If each object has k replicas, where should we put these? 34 University of Pennsylvania 89 34 42 51 39

© 2013 A. Haeberlen, Z. Ives Key-based routing How do nodes in a structured overlay communicate? Suppose node with ID i wants to get object with key o Object is on node with ID closest to o - but how to find it? Idea: Key-based routing Overlay implements a function route(k,m), which delivers message m to the node whose ID is closest to k This would help: Node i can simply invoke route(o,GET(i,o)) Why does the GET have to include o? Why does it include i? But how to implement route(k,m)? 35 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Problem: Efficiency How do we implement route(k,m)? Idea #1: Every node knows every other node Not very scalable: For N nodes, needs O(N 2 ) state! When node joins or leaves, need to inform N-1 other nodes! Idea #2: Every node knows just its neighbors O(N) state, need to inform only 2 nodes when joining But routing is very inefficient - O(N) hops are expected! Idea #3: Every node knows some other nodes Can we get a good tradeoff between state, overhead, #hops? 36 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives The O(log N) trick Intuition: Each node knows only O(log N) other nodes Example: Identifiers from [0,2 X -1], node i knows nodes following (i+2 k ) mod 2 X -1 (~MIT Chord) Typically need fewer than X links because ID space is typically large (X=160 is common) and sparsely populated Result: Can do multihop routing How would you implement route(k,m) in this setting? How do we know when a message has arrived? 37 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Distributed hashtable (DHT) Implements a simple storage system on top of key-based routing To insert an object O with key k, send PUT(k,O) To retrieve object with key k from node i, send GET(k,i) Many technical details, e.g., replica maintenance 38 University of Pennsylvania 63 84 63 24 O PUT(65,O) O O GET(65,24) Optimization

© 2013 A. Haeberlen, Z. Ives Recap: Structured overlays Peers form a specific topology Each peer has a set of potential neighbors, which depends on the peer's ID and the current membership Tradeoff between state, routing efficiency, maintenance O(log N) trick: Chord topology Some techniques for structured overlays Key-based routing Distributed hashtables 39 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives 41 Pastry: Routing 128bit circular ID space for nodes and objects 32023 Leaf set If t is covered by the leaf set of s, deliver it Otherwise, forward it to a node that shares a longer prefix with t Message is delivered to the node whose node ID is numerically 'closest' to t Converges; avoids loops Hop count: O(log n) 21301 To route message s  t: 21232 32023 30122 32310

© 2013 A. Haeberlen, Z. Ives 42 Pastry: Data structures Each node maintains its own routing table 4 3 2 1 -- (self)-- (self) 21232 -- 21031 --22313(self)20110 30110(self)1221003121 0 3210Lvl Routing table for node 21301 Entries in row r share a prefix of r digits with the node, but differ in the r+1 st digit Entries may be empty N nodes  log b N rows 21310213032123321230 Leaf set for node 21301 Two other data structs: Leaf set L 222100312131320 Neighbourhood set for node 21301 Neighbourhood set M State: b·log b N+|L|+|M|

© 2013 A. Haeberlen, Z. Ives 43 Pastry: Node arrival Must update data structures when a new node joins Algorithm: Routing tableLeaves Neighbours i 1. Choose random node ID i 3. Ask A to route a special 'join' message to i 4. Receive routing tables from nodes on the path 5. Combine the replies and initialize the routing table * 6. Send the result to all nodes that 'need to know' Concurrent arrivals? A 2. Find a nearby member A

© 2013 A. Haeberlen, Z. Ives 44 Pastry: Routing table maintenance Fail-stop assumption Leaf set? To repair the leaf set: Obtain leaf set from a neighbour, choose suitable nodes, check for liveness... 213312132321310 (self)3 212322110321031 2... To replace a routing table entry: Obtain routing table from nodes in the same row or lower Neighbourhood set: Ask other neighbours

© 2013 A. Haeberlen, Z. Ives 45 Pastry: Locality Problem: Routes may be unnecessarily long Solution: Add concept of network 'distance', prefer routes to 'nearby' nodes Proximity metric d(n 1,n 2 ) i=01320 A=21302 03321 01201 01312 First approximation: Get row i of the routing table from i th node along the path of the join message To improve: "Second stage" in the join protocol Perfection not necessary University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives 46 Pastry: Replica routing Idea: Spread replicas of an object around the world and always request a 'nearby' copy 23333 30002 30011 30022 23331 23330 23323 21011 21013 21022 21031 21010 21002 21000 Store replicas on the k nodes numerically closest to the key Random node IDs  Good diversity Route locality  Request reaches nearby node first Proximity heuristic

© 2013 A. Haeberlen, Z. Ives Recap: Pastry Implements a structured overlay Key-based routing primitive Other services can be built on top of Pastry. Examples: PAST, SplitStream, Scribe, ePOST, Glacier,... Open source implementation available (FreePastry) Join protocol, routing table maintenance,... Several refinements compared to basic overlay design presented earlier Leaf sets Proximity-based neighbor selection 47 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Swarming and ISPs Swarming reduces bandw. cost of the source But the content is still being downloaded - so who pays now? ISPs typically charge each other based on bytes sent Result: Local and regional ISPs have to pay more! "But I am paying for a flat rate!" - why is this not a good argument? What can be done? ISPs have tried to rate-limit or block P2P traffic (e.g., Comcast and BitTorrent) P2P networks have tried to become more 'ISP friendly'; similar effect can be achieved with middleboxes ISPs still not very happy about P2P content distribution because it conflicts with their own services (triple play etc) 49 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Protecting data Suppose we retrieve an object from a DHT. How do we know it is genuine? Studies show that unprotected systems tend to be rife with mislabeled or corrupt content (remember Gnutella?) Idea: Make the object self-certifying When inserting object O, choose key as hash(O) When retrieving object, check whether key=hash(response) What do we do if the object is mutable? 50 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Controlling membership Who can join the decentralized system? Many systems allow anyone to join Problem: Sybil attack Adversary joins system with a large number of (physical or virtual) peers Can censor objects, snoop on data, disrupt routing,... Successfully used, e.g., on the Vanish system How to defend a system against this? Proof of work, controlled membership,... 51 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Topology attacks Suppose adversary wants to censor objects in a DHT Is a defense against Sybil enough? Problem: Adversary can hide other nodes Example: Choose keys to control an object, or to control a specific peer's routing table Possible countermeasure: Certified identities Example: Adversary tells the victim only about other nodes controller by the adversary (Eclipse attack) 52 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Strategies in BitTorrent Does BitTorrent protect against freeloading? No. Several attacks have been proposed over the years. Examples: BitThief: Exploits optimistic unchoke + large view BitTyrant: Maximizes reciprocation by deviating from equal split and by increasing #peers ('last place is good enough') Strategically reveal available pieces (over- or underreport) 53 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Recap: Challenges Strength of decentralized systems is also a weakness No dedicated infrastructure, no centralized control Many systems allow anyone to join Result: Some unique challenges not faced by other types of distributed systems, or not to the same degree Decentralization is not a panacea The resources still have to come from somewhere 54 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Our plan for today Partly decentralized system: BitTorrent Swarming Incentives: Tit for tat Decentralized system: Pastry Overlays; unstructured vs. structured Key-based routing Distributed hashtables KBR in Pastry Challenges Swarming and ISPs Gaming BitTorrent; Sybil and Eclipse attacks 55 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives Sources and further reading R. Rodrigues, P. Druschel: "Peer-to-peer systems", CACM 2010 vol. 53 no. 10 (assigned for today) B. Cohen: "Incentives build robustness in BitTorrent", IPTPS 2003 D. Levin, K. LaCurts, N. Spring, B. Bhattacharjee: "BitTorrent is an auction", SIGCOMM 2008 A. Rowstron, P. Druschel: "Scalable, decentralized object location and routing for large-scale peer-to- peer systems", Middleware 2001 56 University of Pennsylvania

© 2013 A. Haeberlen, Z. Ives NETS 212: Scalable and Cloud Computing 1 University of Pennsylvania Peer-to-Peer Systems November 26, 2013.

Similar presentations

Presentation on theme: "© 2013 A. Haeberlen, Z. Ives NETS 212: Scalable and Cloud Computing 1 University of Pennsylvania Peer-to-Peer Systems November 26, 2013."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

© 2013 A. Haeberlen, Z. Ives NETS 212: Scalable and Cloud Computing 1 University of Pennsylvania Peer-to-Peer Systems November 26, 2013.

Similar presentations

Presentation on theme: "© 2013 A. Haeberlen, Z. Ives NETS 212: Scalable and Cloud Computing 1 University of Pennsylvania Peer-to-Peer Systems November 26, 2013."— Presentation transcript:

Similar presentations

About project

Feedback