Peer-to-Peer Infrastructure and Applications

Andrew Herbert Microsoft Research, Cambridge

Microsoft and the Grid
Shared vision of the “virtual organization”
  But focused on e-Business rather than e-Science
Grid investments
  Windows clusters for large scale computations
  TerraServer projects for large data sets
  Globus port to Windows
  Globus implementation of OGSA
  Wrap Grid services as Web services
Leverage web services as Grid infrastructure
  E.g., Hailstorm user authentication

Microsoft Research
Web services enable wide area integration
How to extend this to enable efficient wide scale information sharing and collaboration?
Move to a model of peer-to-peer service implementation in contrast to today’s server-based model
  Necessarily scalable and self-organizing
  Necessarily simple developer framework
  No conflict with WSDL, SOAP etc.

Peer-to-Peer today
Music / video download
  Napster, Morpheus, Gnutella
Distributed computing
Research community
  Looking for general purpose frameworks
  Discovering useful applications

Peer-to-Peer applications
Publish/Subscribe Event Notification (SCRIBE)
  Share load of supporting topics and disseminating messages from publishers to subscribers
Distributed document archive (PAST)
  Share load of storing documents reliably
Web caching (SQUIRREL)
  Share load of caching web pages
Dynamic directory (OVERLOOK)
  Share load of storing directory entries for dynamic data

Here are a number of peer-to-peer applications we’ve built at Cambridge to illustrate the benefits of peer-to-peer computing:

Publish/subscribe multicast: There are many scenarios where a publish/subscribe model of communication works well. Press releases are a classic example: agencies publish them and journalists subscribe to topics relevant to their interests. When a topic is current, news arrives thick and fast and more people want to know about it, so the load on the system increases dramatically. For example, on Sept 11th the Internet itself coped with the increased level of traffic, but many streaming services struggled to keep up with the demand. We’ll talk later about our Scribe middleware package, which enables the construction of message dissemination trees for scalable, reliable and efficient Internet multicast.

Peer-to-peer document storage: Journalists rely on press archives and photo libraries for their research. Archives are useful but costly and often slow, because someone has to provide central storage, registration and distribution services. Imagine a system where the reports and images on every journalist’s laptop are accessible to colleagues and the task of backing up is shared between them, so that at any time of the day or night you can access current material. PAST is a large-scale, decentralized, peer-to-peer, fault-tolerant, persistent document archive utility that meets this need.

Peer-to-peer, cooperative web caching: In corporate networks people use proxy servers to cache downloaded web pages to save (slow) fetches across the WAN. This makes sense as the proxy can be part of the firewall / LAN-WAN router infrastructure. For ad hoc groups, why can’t we share each other’s caches, especially if we are on a fast local (wireless) network with a slow Internet link? Squirrel is a decentralized, peer-to-peer web caching package that enables client desktop machines to cooperatively share their local web caches with each other in an efficient, scalable, fault-tolerant manner. It turns out to perform as well as a centralized cache, doesn’t require extra hardware and scales up automatically as more machines are added to the network.

Dynamically-scalable name resolution service: We increasingly talk about the disaggregated PC, in which the box on (or under) our desk is replaced by a collection of cooperating devices that we hook up via a (wireless) network to provide our virtual computer. This raises the question of where we keep the registry and other directory-like information. Moreover, since we might expect the disaggregated PC to change configuration frequently, this directory information will change a great deal. This rules out relying on the current generation of directory servers (Active Directory, Dynamic DNS), since they are optimized for relatively static data. Overlook is a scalable name resolution service that uses Pastry to distribute dynamic directory information and adapt the distribution to meet applied load.

A P2P framework requires:
Content-based addressing
  Hash content to key
  Route message to computer hosting that key
Dynamic caching and proxying
  Local computers stand in for remote ones
  Faster access, reduced load on key holder
Replication and automatic failover
  Store at K computers adjacent to key holder
Multicast cascade for group communication
  Each computer needs a spanning tree of routes for reaching every other computer

Peer-to-peer applications need a number of capabilities in order to share and access information efficiently:

Content-based addressing: We often need to send a message to the specific computer that looks after some part of the shared information, for example to retrieve a document by name. Usually we don’t know beforehand which computer holds the document, unlike a web page, where the URL names the physical web server holding the document.

Dynamic caching and interposition: Often information will be primarily located at a specific computer (or set of computers if replicated for availability). These computers may not, however, be close to the users of that information. If the information is popular, the computer hosting it may become a bottleneck. Therefore it should be possible for other computers to retain “cache” copies that stand in as proxies for the primary source.

Replication with automatic fail-over: We can’t assume all the computers in a peer-to-peer structure will always be working; they might individually fail or be temporarily disconnected. Therefore copies of shared information have to be stored at several computers if it is to be always available. The system must automatically switch between these copies as computers come up and go down.

Routing trees among overlay nodes: Often it is necessary for one peer to be able to speak to several others, maybe the whole set. It can be very inefficient and put huge strain on its network links if the source computer has to send messages to every individual peer directly. A peer-to-peer system should provide a means to cascade multicast messages: the source tells a set of neighbours, who then tell their neighbours, and so on until the message has reached all of its recipients.
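
As a rough sketch of the content-based addressing and replication points above, the snippet below hashes a document name to a 128-bit key, treats the node whose ID is numerically closest as the key holder, and places replicas on the next closest nodes. MD5 is used only because it happens to produce 128 bits; the host names, the document name, and the helpers `content_key` and `replica_set` are assumptions made for illustration and are not taken from any of the systems described here.

```python
import hashlib

ID_BITS = 128  # assumed width of the key space


def content_key(name: str) -> int:
    """Hash a content name to a 128-bit integer key (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


def replica_set(key: int, node_ids: list[int], k: int) -> list[int]:
    """Return the k live node IDs numerically closest to key; the first is the key holder."""
    return sorted(node_ids, key=lambda n: (abs(n - key), n))[:k]


# Hypothetical overlay of eight nodes whose IDs are derived from made-up host names.
nodes = [content_key(f"host-{i}") for i in range(8)]
key = content_key("press-release.txt")
holders = replica_set(key, nodes, k=3)
print(f"key holder:    {holders[0]:032x}")

# If the key holder fails, the next closest replica becomes the key holder.
survivors = [n for n in nodes if n != holders[0]]
print(f"after failure: {replica_set(key, survivors, k=3)[0]:032x}")
```

Because the replicas are chosen by closeness to the key, a lookup issued after the failure lands on a node that already holds a copy, which is the automatic fail-over behaviour the slide asks for.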

Overlay Networks
Peer-to-peer requires richer routing semantics than IP
  IP routes to destination computer, not content
  URLs route to destination computer, not content
  IP multicast isn’t widely deployed
Solution: Overlay networks allow applications to participate in hop-by-hop routing decisions
Ideal overlay is efficient, self-organizing, scalable, and fault-tolerant

Unfortunately it turns out that the Internet as it is today doesn’t provide the facilities we need to deliver peer-to-peer computing. Its routing infrastructure provides very limited functionality: send a message to the IP address associated with a specific node. Many distributed applications and services need richer routing semantics. Examples include scalable, efficient Internet multicast (the ability to reliably send a message to several recipients at the same time) and content addressing (the ability to route a message to the node responsible for a particular name or data object). Because the Internet routing infrastructure is so limited, peer-to-peer functionality has to be provided by means of so-called overlay networks, which allow applications to directly participate in hop-by-hop routing decisions. A current theme in distributed computing research is the design of generic overlay networks that are scalable, self-organizing, fault-tolerant, and reasonably efficient.

Pastry outline
Computers (Nodes) have unique Id
  Typically 128 bits long
Primitive: Route (msg, key)
  Deliver msg to currently alive node with Id closest numerically to key
Scalable, efficient
  Per node routing table O(log(N)) entries
  Route in O(log(N)) steps
Fault tolerant
  Self-fixes routing tables when nodes added, deleted or fail

Pastry is a generalized overlay network developed by researchers at MSR Cambridge that provides the common infrastructure for all our peer-to-peer applications. Pastry connects a set of cooperating computers called “nodes” and has the following basic properties:

Messages are sent to abstract keys allocated from a large ID space that is typically 128 bits wide. Nodes are randomly assigned IDs from the same ID space. A message sent to a key is delivered to the currently-alive node whose ID is closest to the destination key.

Pastry’s routing is scalable to very large overlays and it is efficient. Nodes in an overlay network of N nodes need only maintain O(log(N)) routing table entries. For a one million node overlay, the routing table size is about 65. Messages are routed to their destination in O(log(N)) overlay hops on average. For a million node overlay, the average number of hops is 5. Pastry routes through the overlay are typically within a factor of two of the cost of direct IP routes between the same nodes.

Pastry is also self-organizing and fault-tolerant: it fixes its routing tables when nodes are added, deleted, or fail. This repair mechanism does not involve any administrator or centralized control, and it requires only O(log(N)) messages.
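
The five-hop figure for a million nodes can be sanity-checked with a line of arithmetic, assuming IDs are read as base-16 digits and each hop fixes at least one more digit, so that the hop count is roughly the base-16 logarithm of N. The digit base is an assumption here (the notes do not state it), and `expected_hops` is a hypothetical helper.

```python
import math


def expected_hops(n_nodes: int, digit_base: int = 16) -> int:
    """Rough hop count when every hop fixes one more ID digit: ceil(log_base(N))."""
    return math.ceil(math.log(n_nodes) / math.log(digit_base))


print(expected_hops(1_000_000))  # log16(10^6) is about 4.98, so this prints 5
```

Under the same assumption, a smaller digit base means more hops: `expected_hops(1_000_000, 4)` gives 10.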

Pastry routing
[Diagram: Id space divided into domains 0XXX, 1XXX, 2XXX and 3XXX, showing nodes 0112, 2321, 2032 and 2001]
START: 0112 routes a message to key 2000.
First hop fixes first digit (2)
Second hop fixes second digit (20)
END: 2001 is the closest live node to 2000.

Think of Pastry as providing a form of content-addressable routing.
Steer message towards physically nearest node that is numerically closer to the key
Give preference to longest common prefix
Greedy algorithm: at each step
  set of nodes left to “search” reduces exponentially
  physical radius of search increases exponentially
Always converges
Routing table retains information about a node’s neighbours in Id name space

The namespace doesn’t have to be fully populated: a message will always be routed to the nearest live Pastry node.
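
The greedy prefix-matching idea can be sketched in a few lines. The example below reuses the four node Ids from the diagram, reads them as base-4 digit strings, and forwards the message to whichever known peer shares a longer prefix with the key (or, failing that, is numerically closer). The per-node routing tables in `TABLES` are invented for the example, the resulting path is only one reading consistent with the slide captions, and physical proximity is ignored entirely.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two Ids have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def closeness(node: str, key: str) -> tuple[int, int]:
    """Rank a node: longer shared prefix first, then smaller numeric distance."""
    distance = abs(int(node, 4) - int(key, 4))  # Ids are base-4 digit strings
    return (shared_prefix_len(node, key), -distance)


def route(start: str, key: str, tables: dict[str, list[str]]) -> list[str]:
    """Forward greedily until no known peer is strictly closer to the key."""
    path = [start]
    current = start
    while True:
        best = max(tables.get(current, []), key=lambda n: closeness(n, key), default=current)
        if closeness(best, key) <= closeness(current, key):
            return path  # current node is the closest live node it knows of
        path.append(best)
        current = best


# Hypothetical routing tables: each node knows only a few peers.
TABLES = {
    "0112": ["2321"],
    "2321": ["2032"],
    "2032": ["2001", "2321"],
    "2001": ["2032"],
}

print(route("0112", "2000", TABLES))
```

Each hop strictly improves the (prefix length, distance) ranking, which is why the loop always terminates; this mirrors the "always converges" point above.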

Pastry routing table
Routing table:
  For each level, nearest peer for other domains
Namespace leaf set:
  Nearest Ids to “left” and “right” in name space
Each entry gives IP address for host associated with Id

The namespace doesn’t have to be fully populated: a message will always be routed to the nearest live Pastry node. The namespace leaf set ensures that concurrent failures of up to l/2 nodes with adjacent node Ids are tolerated (l is the size of the leaf set; half of its nodes are numerically smaller and half larger than the present node). The namespace leaf set can also be used by applications to replicate stored objects. The routing table is like an optimisation: imagine the worst case where every node only knows about one neighbour in each direction. You’d still get there, eventually.
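
To make the leaf set concrete, the sketch below picks the l/2 numerically smaller and l/2 numerically larger neighbours of a node from a sorted list of live Ids. The small integer Ids, the wrap-around at the ends of the Id space, and the `leaf_set` helper are assumptions for illustration only.

```python
def leaf_set(node_ids: list[int], node: int, l: int) -> tuple[list[int], list[int]]:
    """Return (smaller, larger): the l/2 nearest Ids on each side of node.

    Assumes the Id space wraps around and that there are more than l live nodes.
    """
    ring = sorted(node_ids)
    i = ring.index(node)
    half = l // 2
    n = len(ring)
    smaller = [ring[(i - j) % n] for j in range(1, half + 1)]
    larger = [ring[(i + j) % n] for j in range(1, half + 1)]
    return smaller, larger


live = [5, 12, 20, 33, 41, 58, 63]   # made-up node Ids
print(leaf_set(live, 33, l=4))       # neighbours of a node in the middle
print(leaf_set(live, 5, l=4))        # neighbours of the lowest Id wrap around
```

An application that wants to replicate an object stored at node 33 could place the copies on the nodes returned here, since those are the nodes that take over its keys if it fails.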

Pastry Routing Demo

Pastry node addition
Want to add new node to the system
  Invent a new random nodeId X
  Go to a nearby or well-known node A
  Route to “key” X via A (finds node Z with Id closest to X)
  Obtain leaf set from Z and rebuild
  Obtain routing table entries from each node along the route from A to Z and rebuild
  Register with each member of A’s namespace leaf set so they adjust their leaf sets and rebuild
  Find nearest leaf set node and use its routing table to improve locality

When joining, we need to find out where the adjacent Ids are in the Pastry network. If we find one of them, we can find the others from it. Having found our neighbours, we can insert ourselves into their routing tables. From everyone else’s routing tables we can pick the routes that are physically closer to us.
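
The leaf-set part of the join can be sketched as follows. This is a deliberately simplified model: a global lookup stands in for routing to key X via A, only leaf sets are maintained (no routing tables or locality), the leaf set is simply the l numerically nearest Ids, and the new node registers with the members of the leaf set it has just built, which is an assumption about the intended registration step. The integer Ids and the helpers `nearest` and `join` are hypothetical.

```python
def nearest(candidates: set[int], x: int, count: int) -> list[int]:
    """The count Ids numerically nearest to x, returned in ascending order."""
    ranked = sorted(candidates, key=lambda n: (abs(n - x), n))
    return sorted(ranked[:count])


def join(leaf_sets: dict[int, list[int]], x: int, l: int = 2) -> int:
    """Add node x to the overlay and return Z, the existing node closest to x."""
    # Stand-in for "route to key X via A": find the live node closest to X.
    z = min(leaf_sets, key=lambda n: (abs(n - x), n))
    # Obtain the leaf set from Z and rebuild it around X.
    leaf_sets[x] = nearest(set(leaf_sets[z]) | {z}, x, l)
    # Register with each new neighbour so it adjusts its own leaf set.
    for member in leaf_sets[x]:
        leaf_sets[member] = nearest(set(leaf_sets[member]) | {x}, member, l)
    return z


# Made-up overlay of five nodes, each knowing its two nearest neighbours.
overlay = {10: [20, 30], 20: [10, 30], 30: [20, 40], 40: [30, 50], 50: [30, 40]}
z = join(overlay, 34)
print(z, overlay[34], overlay[30], overlay[40])
```

After the call, node 34 appears in the leaf sets of both of its neighbours, so messages for keys near 34 will now be delivered to it.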

Scribe: A Pastry Application
[Diagram: publishers sending to a topic of interest, which forwards to subscribers]
Publish / subscribe is a popular model for “event driven” systems with volatile membership
Decouple event publishers from event subscribers
  Publishers don’t know in advance who subscribers are
  Subscribers don’t know in advance who publishers are
Challenge is how to multicast notifications from topics efficiently

Now let’s look at how we can use Pastry to provide a multicast facility that supports publish/subscribe communication. In a publish/subscribe system there are publishers of information about a topic of interest and subscribers who wish to receive that information. The publishers simply fire messages at the “topic” as a logical entity, and the subscribers register to have these messages forwarded to them. The advantage of this structure is that publishers and subscribers can join and leave the topic at any time. A classic example is the feeding of stock market data from various sources to financial trading desks. We can implement a publish/subscribe system using a topic server (just as MSN hosts IM), but this could become a bottleneck. With Pastry we can share the load of handling messages over the publishers and subscribers.

Scribe: architecture
Topic hashed to a key
Construct a multicast tree based on the Pastry network
  Have the (Pastry) node with the closest Id to the topic key be the root
  This node replicates knowledge of the topic to its k nearest neighbours for resilience
Pass event notification down through the tree
  Each parent forwards event to its children
  Avoids over stressing network links close to the topic node

We treat the topic name as the key. Thus topics are randomly located around the Pastry network. We replicate topics so that if a node fails we don’t lose them. Note that the replicas are likely to be geographically scattered, making a denial of service attack hard to mount. Each topic “node” is the root of a tree for cascading messages out to the subscribers.

Scribe: Topic creation
Each topic is assigned a topicId
Root of the multicast tree = node with nodeId numerically closest
Create(topic): route through Pastry to the topicId
[Diagram: Create(T) routed through the overlay to the root for topic T]

Scribe: subscribing
[Diagram: nodes 1111, 1000, 1100, 1101, 1011, 1001, 0100 and 0111 joining the multicast tree]

Scribe: event dissemination
Publish(topic, event)
  Route through the Pastry network using the topicId as the destination
  Dissemination along the multicast tree starting from the root
[Diagram: event spreading through the tree over nodes 1100, 1101, 1011, 0100 and 0111]
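
A minimal sketch of tree construction and dissemination follows. It assumes the tree is formed from the overlay routes that subscribers' join messages take towards the root, with every node on a route adopting the previous hop as a child; the slides do not spell this mechanism out. The node Ids are borrowed from the diagrams, but the routes themselves, the choice of 1100 as root, and the helpers `build_tree` and `publish` are invented for the example.

```python
from collections import defaultdict, deque


def build_tree(routes_to_root: list[list[str]]) -> dict[str, set[str]]:
    """Merge subscriber-to-root routes into a parent -> children map."""
    children: dict[str, set[str]] = defaultdict(set)
    for path in routes_to_root:
        for child, parent in zip(path, path[1:]):
            children[parent].add(child)
    return children


def publish(children: dict[str, set[str]], root: str, event: str) -> dict[str, str]:
    """Cascade an event from the root; each parent forwards only to its own children."""
    inbox: dict[str, str] = {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        inbox[node] = event
        queue.extend(sorted(children.get(node, ())))
    return inbox


# Hypothetical routes taken by three subscribers towards the root 1100.
routes = [
    ["0111", "1001", "1100"],
    ["0100", "1001", "1100"],
    ["1011", "1101", "1100"],
]
tree = build_tree(routes)
delivered = publish(tree, "1100", "price update")
print(list(delivered))          # order in which nodes receive the event
print(len(tree["1100"]))        # messages the root itself has to send
```

The root sends only two messages even though three subscribers (and two forwarding nodes) receive the event, which is how the tree avoids overloading the links close to the topic node.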

Scribe demo

Summary
Peer-to-peer techniques are good for wide area information sharing and collaborative computation
Overlay networks enable peer-to-peer distributed computing
Pastry is an efficient, scalable, self-organizing peer-to-peer framework
Pastry makes it easy to build powerful peer-to-peer applications
For more see:
