Peer-to-Peer Infrastructure and Applications

Peer-to-Peer Infrastructure and Applications Andrew Herbert Microsoft Research, Cambridge

Presentation transcript:

Peer-to-Peer Infrastructure and Applications Andrew Herbert Microsoft Research, Cambridge +44 1223 479818 aherbert@microsoft.com

Microsoft and the Grid

Shared vision of the “virtual organization”, but focused on e-Business rather than e-Science.

Grid investments:
Windows clusters for large scale computations
TerraServer projects for large data sets
Globus port to Windows
Globus implementation of OGSA

Wrap Grid services as Web services.
Leverage web services as Grid infrastructure, e.g., Hailstorm user authentication.

Microsoft Research

Web services enable wide area integration.
How to extend this to enable efficient wide scale information sharing and collaboration?
Move to a model of peer-to-peer service implementation, in contrast to today’s server-based model:
Necessarily scalable and self-organizing
Necessarily simple developer framework
No conflict with WSDL, SOAP etc.

Peer-to-Peer today

Music / video download: Napster, Morpheus, Gnutella
Distributed computing: SETI@Home
Research community looking for general purpose frameworks and discovering useful applications

Peer-to-Peer applications

Publish/Subscribe Event Notification (SCRIBE): share load of supporting topics and disseminating messages from publishers to subscribers
Distributed document archive (PAST): share load of storing documents reliably
Web caching (SQUIRREL): share load of caching web pages
Dynamic directory (OVERLOOK): share load of storing directory entries for dynamic data

Here are a number of peer-to-peer applications we’ve built at Cambridge to illustrate the benefits of peer-to-peer computing.

Publish/subscribe multicast: There are many scenarios where a publish/subscribe model of communication works well. Press releases are a classic example: agencies publish them and journalists subscribe to topics relevant to their interests. When a topic is current, news arrives thick and fast and more people want to know about it, so the load on the system increases dramatically. For example, on Sept 11th the Internet itself coped with the increased level of traffic, but many streaming services struggled to keep up with the demand. We’ll talk later about Scribe, our middleware package that enables the construction of message dissemination trees for scalable, reliable and efficient Internet multicast.

Peer-to-peer document storage: Journalists rely on press archives and photo libraries for their research. Archives are useful but costly and often slow, because someone has to provide central storage, registration and distribution services. Imagine a system where the reports and images on every journalist’s laptop are accessible to colleagues, and the task of backing them up is shared between them, so that at any time of the day or night you can access current material. PAST is a large-scale, decentralized, peer-to-peer, fault-tolerant, persistent document archive utility that meets this need.

Peer-to-peer, cooperative web caching: In corporate networks people use proxy servers to cache downloaded web pages to save (slow) fetches across the WAN. This makes sense as the proxy can be part of the firewall / LAN-WAN router infrastructure. For ad hoc groups, why can’t we share each other’s caches, especially if we are on a fast local (wireless) network with a slow Internet link? Squirrel is a decentralized, peer-to-peer web caching package that enables client desktop machines to cooperatively share their local web caches with each other in an efficient, scalable, fault-tolerant manner. It turns out to perform as well as a centralized cache, doesn’t require extra hardware, and scales up automatically as more machines are added to the network.

Dynamically-scalable name resolution service: We increasingly talk about the disaggregated PC, in which the box on (or under) our desk is replaced by a collection of cooperating devices that we hook up via a (wireless) network to provide our virtual computer. This raises the question of where we keep the registry and other directory-like data. Moreover, since we might expect the disaggregated PC to change configuration frequently, this directory information will change a great deal. This rules out relying on the current generation of directory servers (Active Directory, Dynamic DNS), since they are optimized for relatively static data. Overlook is a scalable name resolution service that uses Pastry to distribute dynamic directory information and adapt the distribution to meet applied load.

A P2P framework requires:

Content-based addressing: hash content to key; route message to computer hosting that key
Dynamic caching and proxying: local computers stand in for remote ones; faster access, reduced load on key holder
Replication and automatic failover: store at K computers adjacent to key holder
Multicast cascade for group communication: each computer needs a spanning tree of routes for reaching every other computer

Peer-to-peer applications need a number of capabilities in order to share and access information efficiently.

Content-based addressing: We often need to send a message to the specific computer that looks after some part of the shared information, for example to retrieve a document by name. Usually we don’t know beforehand which computer holds the document, unlike a web page, where the URL names the physical web server holding the document.

Dynamic caching and interposition: Often information will be primarily located at a specific computer (or set of computers, if replicated for availability). These computers may not, however, be close to the users of that information. If the information is popular, the computer hosting it may become a bottleneck. Therefore it should be possible for other computers to retain “cache” copies that stand in as proxies for the primary source.

Replication with automatic fail-over: We can’t assume all the computers in a peer-to-peer structure will always be working: they might individually fail or be temporarily disconnected. Therefore copies of shared information have to be stored at several computers if it is to be always available. The system must automatically switch between these copies as computers come up and go down.

Routing trees among overlay nodes: Often it is necessary for one peer to be able to speak to several others, maybe the whole set. It can be very inefficient, and put a huge strain on its network links, if the source computer has to send messages to every individual peer directly. A peer-to-peer system should provide a means to cascade a multicast down a tree: the source tells a set of neighbours, who then tell their neighbours, and so on until the message has reached all of its recipients.
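The first and third requirements (hash content to a key, store at K computers adjacent to the key holder) can be sketched in a few lines. This is a minimal illustration, not Pastry's or PAST's actual code: the function names are hypothetical, MD5 is used only because it conveniently yields 128 bits, the Id space is assumed to be circular, and a single list of all node Ids stands in for what a real overlay discovers by routing.

```python
import hashlib
from typing import List

ID_BITS = 128  # Ids of this width are an assumption for the sketch


def key_for(name: str) -> int:
    """Hash a content name to a 128-bit key (MD5 chosen only for its width)."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


def ring_distance(a: int, b: int) -> int:
    """Distance between two Ids, assuming a circular Id space."""
    d = abs(a - b)
    return min(d, (1 << ID_BITS) - d)


def holders(key: int, node_ids: List[int], k: int) -> List[int]:
    """Return the key holder followed by the k nodes next closest to the key."""
    return sorted(node_ids, key=lambda n: ring_distance(n, key))[: k + 1]


# Usage: 20 hypothetical nodes, one document, 3 replicas beside the key holder.
nodes = [key_for("node-%d" % i) for i in range(20)]
doc_key = key_for("press-release.doc")
primary, *replicas = holders(doc_key, nodes, k=3)
```

If the primary fails, the first replica is by construction the next closest node to the key, which is what makes automatic failover possible without any central registry.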

Overlay Networks

Peer-to-peer requires richer routing semantics than IP:
IP routes to destination computer, not content
URLs route to destination computer, not content
IP multicast isn’t widely deployed

Solution: overlay networks allow applications to participate in hop-by-hop routing decisions.
Ideal overlay is efficient, self-organizing, scalable, and fault-tolerant.

Unfortunately, it turns out that the Internet as it is today doesn’t provide the facilities we need to deliver peer-to-peer computing. Its routing infrastructure provides very limited functionality: send a message to the IP address associated with a specific node. Many distributed applications and services need richer routing semantics. Examples include scalable, efficient Internet multicast (the ability to reliably send a message to several recipients at the same time) and content addressing (the ability to route a message to the node responsible for a particular name or data object).

Because the Internet routing infrastructure is so limited, peer-to-peer functionality has to be provided by means of so-called overlay networks, which allow applications to directly participate in hop-by-hop routing decisions. A current theme in distributed computing research is the design of generic overlay networks that are scalable, self-organizing, fault-tolerant, and reasonably efficient.

Pastry outline

Computers (nodes) have unique Id, typically 128 bits long
Primitive: Route(msg, key) delivers msg to the currently alive node with Id closest numerically to key
Scalable, efficient: per node routing table has O(log(N)) entries; route in O(log(N)) steps
Fault tolerant: self-fixes routing tables when nodes are added, deleted or fail

Pastry is a generalized overlay network developed by researchers at MSR Cambridge that provides the common infrastructure for all our peer-to-peer applications. Pastry connects a set of cooperating computers called “nodes” and has the following basic properties.

Messages are sent to abstract keys allocated from a large Id space that is typically 128 bits wide. Nodes are randomly assigned Ids from the same space. A message sent to a key is delivered to the currently-alive node whose Id is closest to the destination key.

Pastry’s routing is scalable to very large overlays and it is efficient. Nodes in an overlay network of N nodes need only maintain O(log(N)) routing table entries. For a one million node overlay, the routing table size is about 65. Messages are routed to their destination in O(log(N)) overlay hops on average. For a million node overlay, the average number of hops is 5. Pastry routes through the overlay are typically within a factor of two of the cost of direct IP routes between the same nodes.

Pastry is also self-organizing and fault-tolerant: it fixes its routing tables when nodes are added, deleted, or fail. This repair mechanism does not involve any administrator or centralized control, and it requires only O(log(N)) messages.
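The figure of about 5 hops for a million nodes can be checked with a quick calculation. The sketch below assumes that each hop resolves one hexadecimal digit of the key (4 bits per digit), so that the hop count grows as the base-16 logarithm of N; the digit width is not stated in this talk, and the function name and parameter are illustrative.

```python
import math


def expected_hops(num_nodes, bits_per_digit=4):
    """Approximate average hop count: log of N to the base 2**bits_per_digit."""
    return math.log(num_nodes, 2 ** bits_per_digit)


# One million nodes with hexadecimal digits: roughly 5 overlay hops.
print(round(expected_hops(10**6), 2))  # about 4.98
```

Under the same assumption, growing the overlay by a factor of 16 adds only one more hop on average, which is what O(log(N)) routing means in practice.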

Pastry routing

[Figure: Id space divided into regions 0XXX, 1XXX, 2XXX and 3XXX, showing nodes 0112, 2321, 2032 and 2001. START: 0112 routes a message to key 2000. First hop fixes the first digit (2). Second hop fixes the second digit (20). END: 2001, the closest live node to 2000.]

Think of Pastry as providing a form of content-addressable routing.
Steer message towards physically nearest node that is numerically closer to the key
Give preference to longest common prefix
Greedy algorithm: at each step the set of nodes left to “search” reduces exponentially, while the physical radius of the search increases exponentially
Always converges
Routing table retains information about a node’s neighbours in the Id name space

The namespace doesn’t have to be fully populated: a message will always be routed to the nearest live Pastry node.
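The greedy rule on this slide can be sketched as follows, using the four-digit, base-4 Ids from the example. This is a simplified illustration rather than Pastry's implementation: the `tables` dictionary is a hypothetical stand-in for each node's routing table, its contents are invented so that the walk visits the nodes named in the figure, and physical proximity is ignored.

```python
def shared_prefix_len(a, b):
    """Number of leading digits two Id strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current, key, known, base=4):
    """Choose the next node among `known` peers, or None to deliver locally."""
    def dist(node):
        return abs(int(node, base) - int(key, base))

    here = shared_prefix_len(current, key)
    # Prefer peers that share a longer prefix with the key than we do.
    better = [n for n in known if shared_prefix_len(n, key) > here]
    if not better:
        # Otherwise accept an equally long prefix that is numerically closer.
        better = [n for n in known
                  if shared_prefix_len(n, key) == here and dist(n) < dist(current)]
    return min(better, key=dist) if better else None


def route(start, key, tables, base=4):
    """Follow next_hop from `start` until no peer is a better match."""
    path = [start]
    while True:
        hop = next_hop(path[-1], key, tables.get(path[-1], []), base)
        if hop is None:
            return path
        path.append(hop)


# Hypothetical per-node knowledge, chosen to mirror the slide's example.
tables = {
    "0112": ["2321"],
    "2321": ["0112", "2032"],
    "2032": ["2321", "2001"],
    "2001": ["2032"],
}
path = route("0112", "2000", tables)
```

Each hop either lengthens the shared prefix or keeps it and moves numerically closer, so the walk cannot loop, which is the sense in which the algorithm always converges.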

Pastry routing table

Routing table: for each level, nearest peer for other domains
Namespace leaf set: nearest Ids to “left” and “right” in name space
Each entry gives IP address for host associated with Id

The routing table is like an optimisation: imagine the worst case where every node only knows about 1 neighbour in each direction. You’d still get there… eventually.

The namespace doesn’t have to be fully populated: a message will always be routed to the nearest live Pastry node. The namespace leaf set ensures that concurrent failures of up to l/2 nodes with adjacent node Ids are tolerated (l is the size of the leaf set; half of its nodes are numerically smaller and half larger than the present node). The namespace leaf set can also be used by applications to replicate stored objects.
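To make the leaf set concrete, here is a small sketch that picks the l/2 numerically smaller and l/2 numerically larger neighbours of a node. It assumes a circular Id space and a global sorted list of node Ids, which a real node would not have; the function name and the small integer Ids are purely illustrative, and the overlay is assumed to contain more than l nodes.

```python
def leaf_set(node, all_ids, l=4):
    """Return (smaller, larger): the l/2 nearest Ids on each side of `node`."""
    ring = sorted(all_ids)
    i = ring.index(node)
    half = l // 2
    # Indices wrap around because the Id space is treated as a ring.
    smaller = [ring[(i - j) % len(ring)] for j in range(1, half + 1)]
    larger = [ring[(i + j) % len(ring)] for j in range(1, half + 1)]
    return smaller, larger


ids = [10, 20, 30, 40, 50, 60]
print(leaf_set(30, ids))  # ([20, 10], [40, 50])
print(leaf_set(10, ids))  # ([60, 50], [20, 30])
```

Because every node knows its immediate neighbours on both sides, a message can still reach the right node by walking the leaf sets even when the routing table has gaps.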

Pastry Routing Demo

Pastry node addition

Want to add a new node to the system:
Invent a new random nodeId X
Go to a nearby or well-known node A
Route to “key” X via A (finds node Z with Id closest to X)
Obtain leaf set from Z and rebuild
Obtain routing table entries from each node along the route from A to Z and rebuild
Register with each member of A’s namespace leaf set so they adjust their leaf sets and rebuild
Find nearest leaf set node and use its routing table to improve locality

When joining, we need to find out where the adjacent Ids are in the Pastry network. If we find one of them, we can find the others from it. Having found our neighbours, we can insert ourselves into their routing tables. From everyone else’s routing tables we can pick the routes that are physically closer to us.

Scribe: A Pastry Application

[Figure: two publishers and two subscribers connected through a topic of interest.]

Publish / subscribe is a popular model for “event driven” systems with volatile membership
Decouple event publishers from event subscribers
Publishers don’t know in advance who subscribers are
Subscribers don’t know in advance who publishers are
Challenge is how to multicast notifications from topics efficiently

Now let’s look at how we can use Pastry to provide a multicast facility for publish/subscribe communication. In a publish / subscribe system there are publishers of information about a topic of interest and subscribers who wish to receive that information. The publishers simply fire messages at the “topic” as a logical entity, and the subscribers register to have these messages forwarded to them. The advantage of this structure is that publishers and subscribers can join and leave the topic at any time. A classic example is the feeding of stock market data from various sources to financial trading desks. We can implement a publish / subscribe system using a topic server (just as MSN hosts IM), but this could become a bottleneck. With Pastry we can share the load of handling messages over the publishers and subscribers.

Scribe: architecture

Topic hashed to a key
Construct a multicast tree based on the Pastry network
Have the (Pastry) node with the closest Id to the topic key be the root
This node replicates knowledge of the topic to its k nearest neighbours for resilience
Pass event notification down through the tree
Each parent forwards event to its children
Avoids over-stressing network links close to the topic node

We treat the topic name as the key, so topics are randomly located around the Pastry network. We replicate topics so that if a node fails we don’t lose the topic. Note that the replicas are likely to be geographically scattered, making a denial of service attack hard to mount. Each topic “node” is the root of a tree for cascading messages out to the subscribers.

Scribe: Topic creation

Each topic is assigned a topicId
Root of the multicast tree = node with nodeId numerically closest to the topicId
Create(topic): route through Pastry to the topicId

[Figure: a Create(T) message routed through the overlay to the root node for topic T.]

Scribe: subscribing

[Figure: subscription example over Pastry nodes 0100, 0111, 1000, 1001, 1011, 1100, 1101 and 1111.]

Scribe: event dissemination

Publish(topic, event)
Route through the Pastry network using the topicId as the destination
Dissemination along the multicast tree starting from the root

[Figure: multicast tree over nodes 1100, 1101, 1011, 0100 and 0111.]
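The tree-based dissemination can be sketched as below. The slides do not spell out how subscribers attach to the tree; this sketch assumes, following the published Scribe design, that each subscription is routed towards the root and every node on that route adopts the previous node as a child. The class and method names are hypothetical, the routes are invented for illustration rather than taken from the figure, and routes are assumed to be consistent so that the links form a tree.

```python
from collections import defaultdict


class ScribeTopic:
    """Toy multicast tree for one topic, rooted at the node nearest the topicId."""

    def __init__(self, root):
        self.root = root
        self.children = defaultdict(set)  # parent node -> child nodes
        self.subscribers = set()

    def subscribe(self, path_to_root):
        """Attach a subscriber, given its overlay route ending at the root."""
        if path_to_root[-1] != self.root:
            raise ValueError("route must end at the topic root")
        self.subscribers.add(path_to_root[0])
        # Every node on the route becomes the parent of the node before it.
        for child, parent in zip(path_to_root, path_to_root[1:]):
            self.children[parent].add(child)

    def publish(self, event):
        """Cascade an event down the tree; return the (subscriber, event) deliveries."""
        delivered = []
        frontier = [self.root]
        while frontier:
            node = frontier.pop()
            if node in self.subscribers:
                delivered.append((node, event))
            frontier.extend(sorted(self.children.get(node, ())))
        return delivered


# Hypothetical routes from three subscribers to root 1100.
topic = ScribeTopic("1100")
topic.subscribe(["0111", "1001", "1100"])
topic.subscribe(["0100", "1001", "1100"])
topic.subscribe(["1011", "1101", "1100"])
deliveries = topic.publish("price update")
```

In this example the root forwards the event to only two children although there are three subscribers, which illustrates how the tree avoids over-stressing the links close to the topic node.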

Scribe demo

Summary

Peer-to-peer techniques are good for wide area information sharing and collaborative computation
Overlay networks enable peer-to-peer distributed computing
Pastry is an efficient, scalable, self-organizing peer-to-peer framework
Pastry makes it easy to build powerful peer-to-peer applications

For more see: http://research.microsoft.com/~antr/Pastry/