Applications (2)
Outline:
Overlay Networks
Peer-to-Peer Networks
Overlay Network
An overlay network is a virtual network constructed on top of a physical network.
Virtual nodes correspond to physical nodes.
Virtual links correspond to (possibly multi-hop) paths/tunnels in the physical network.
Overlays are used for implementing application-specific functions that the traditional Internet otherwise does not provide.
Examples of Overlay Networks
VPN (virtual private network): used as a private/secure overlay network connecting remote company sites; private/secure tunnels connect the sites.
MBone: used for multicast; MBone-aware nodes construct a multicast tree, tunneling through regular routers using unicast.
6Bone: used for deploying IPv6; IPv6-aware nodes use IPv6 addresses, tunneling through regular nodes using IPv4 addressing.
Overlay Network
IHdr (inner header): can be an IPv6 header, application-specific names, or even content attributes.
OHdr (outer header): a regular IPv4 header.
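A minimal sketch of this encapsulation idea, with illustrative field names (not a real packet format): the inner packet keeps its own header while the outer IPv4 header carries it between tunnel endpoints.

```python
# Toy illustration of overlay tunneling: an inner packet (with its own header)
# is wrapped in a regular IPv4 outer header for transit between tunnel endpoints.
# Field names here are illustrative assumptions, not a real protocol format.
from dataclasses import dataclass

@dataclass
class InnerPacket:
    ihdr: str        # e.g. an IPv6 header, an application-specific name, or content attributes
    payload: bytes

@dataclass
class OuterPacket:
    ohdr_src: str    # regular IPv4 address of the tunnel entry point
    ohdr_dst: str    # regular IPv4 address of the tunnel exit point
    inner: InnerPacket

def encapsulate(inner: InnerPacket, src_v4: str, dst_v4: str) -> OuterPacket:
    """Wrap the overlay packet in an outer IPv4 header for the physical network."""
    return OuterPacket(src_v4, dst_v4, inner)

pkt = encapsulate(InnerPacket("IPv6: 2001:db8::1 -> 2001:db8::2", b"data"),
                  "192.0.2.1", "198.51.100.7")
```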
Examples of Overlay Networks
End system multicast:
Only hosts participate (unlike VPN, MBone, and 6Bone, which require router participation).
Hosts connect with each other using UDP tunnels, forming an overlay mesh, on top of which multicast trees are built.
Resilient overlay networks:
For finding more efficient, robust routes.
Problem: BGP does not guarantee the shortest path; it is possible to have latency(A,B) > latency(A,C) + latency(C,B).
Solution: monitor path metrics (delay, bandwidth, loss probability) among all pairs of participating hosts (n*n pairs); select the best path between each pair and adapt to changes in network conditions.
Result: modest performance improvements but much quicker recovery from failures; not scalable.
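A small sketch of the path-selection step, assuming pairwise latencies have already been measured; the function and data names are illustrative, not from the RON implementation.

```python
# Sketch of the resilient-overlay idea: for a pair of hosts, compare the direct
# path latency with every one-intermediate-hop overlay path and pick the best.
def best_overlay_path(latency: dict[tuple[str, str], float],
                      hosts: list[str], a: str, b: str):
    """Return (path, latency) where path is either [a, b] or [a, c, b]."""
    best = ([a, b], latency[(a, b)])
    for c in hosts:
        if c in (a, b):
            continue
        via = latency[(a, c)] + latency[(c, b)]      # relay through intermediate host c
        if via < best[1]:
            best = ([a, c, b], via)
    return best

# Example: the BGP route A->B is slow, but relaying through C is faster.
lat = {("A", "B"): 120.0, ("A", "C"): 30.0, ("C", "B"): 40.0}
print(best_overlay_path(lat, ["A", "B", "C"], "A", "B"))   # (['A', 'C', 'B'], 70.0)
```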
Peer-to-Peer Networks
Definition: a P2P network allows a community of peers (users) to share content, storage, processing, bandwidth, etc.
This is in contrast to the traditional client-server model of distributed computing.
Characteristics: decentralized control, self-organization, symmetric (peer) relationships.
Examples
Pioneers:
Napster: centralized directory, distributed files.
Gnutella: each node knows a couple of peers, forming an irregular mesh overlay network; a node that needs an object floods a query, and a peer that has the object sends a reply.
FreeNet: a P2P network with anonymity and privacy (encrypted content).
Research prototypes: structured vs. unstructured, i.e., routing/lookup tables vs. random/broadcast search.
Examples: BitTorrent, Pastry, Chord, CAN, ...
BitTorrent
A file is divided into pieces, which are replicated in the BitTorrent network (called a swarm). Pieces are downloaded in random order.
A node becomes a source for a piece as soon as it downloads it; the more nodes download a piece, the more the piece is replicated.
To join the swarm, a peer contacts a tracker (server) to download a partial list of other peers.
Peers exchange bitmaps reporting the pieces they hold.
If a peer's neighbors do not have a particular piece, the peer can contact its tracker for more peers, or wait while downloading other pieces.
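A minimal sketch of the piece-selection step implied above, assuming bitmaps have already been exchanged (the function names and the random-order policy follow the slide; real clients use richer policies such as rarest-first).

```python
# Given our piece bitmap and a neighbor's bitmap, pick a piece we still need.
import random

def pieces_to_request(ours: list[bool], theirs: list[bool]) -> list[int]:
    """Indices of pieces the neighbor has and we are missing."""
    return [i for i, (mine, yours) in enumerate(zip(ours, theirs)) if yours and not mine]

def next_piece(ours: list[bool], theirs: list[bool]):
    """Pieces are downloaded in random order, as described on the slide."""
    candidates = pieces_to_request(ours, theirs)
    return random.choice(candidates) if candidates else None

# Example with 8 pieces
ours   = [True, False, False, True, False, False, False, False]
theirs = [True, True,  False, True, True,  False, False, True]
print(next_piece(ours, theirs))    # one of the pieces we lack that the neighbor holds
```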
BitTorrent
Newer versions of BitTorrent support trackerless swarms. Reason: the tracker can be a single point of failure and a performance bottleneck.
Peer finders (implemented in the client software) form an overlay network. Each finder belongs to a swarm, and the network has many interconnected swarms.
Each finder maintains a table of peers whose IDs are close to its own ID, plus a few whose IDs are distant.
To find a particular peer, a finder first sends the request to the peers in its table whose IDs are close to that of the target swarm. The request is routed toward the target progressively in terms of distance in the ID space.
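An illustrative sketch of one forwarding step in this lookup: a finder picks, from its own table, the peers whose IDs are closest to the target ID and forwards the request to them. The table layout and distance metric here are assumptions, not the actual BitTorrent DHT protocol.

```python
def closest_peers(table: list[int], target_id: int, count: int = 3) -> list[int]:
    """Return the `count` known peer IDs closest to target_id in the ID space."""
    return sorted(table, key=lambda pid: abs(pid - target_id))[:count]

# Each peer that receives the request repeats this step with its own table,
# so the request moves progressively closer to the target in the ID space.
table = [0x12, 0x47, 0x80, 0xC3, 0xF0]
print(closest_peers(table, target_id=0x84))   # closest first: 0x80, 0x47, 0xC3
```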
Structured P2P
Consistent hashing: map object names and node (IP) addresses onto the same ID space, typically arranged in a ring.
hash(obj_name) = obj_id
hash(node_addr) = node_id
An object is stored at the node whose node_id is closest to the obj_id.
A hash function turns a name or number into a (roughly) uniformly distributed random number, which is good for load balancing.
Locating an object is then a simple matter of invoking a hash function.
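A minimal consistent-hashing sketch of this mapping, assuming SHA-1 truncated to 128 bits as the hash and circular distance on the ring (function names are illustrative):

```python
import hashlib

def hash_id(text: str) -> int:
    """Map a name or address to a 128-bit id on the ring."""
    return int.from_bytes(hashlib.sha1(text.encode()).digest()[:16], "big")

def ring_distance(a: int, b: int, bits: int = 128) -> int:
    """Circular distance between two ids on a 2^bits ring."""
    d = abs(a - b)
    return min(d, (1 << bits) - d)

def responsible_node(obj_name: str, node_addrs: list[str]) -> str:
    """Return the node whose node_id is closest to the object's obj_id."""
    obj_id = hash_id(obj_name)
    return min(node_addrs, key=lambda addr: ring_distance(hash_id(addr), obj_id))

# Example: three nodes, one object
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
print(responsible_node("song.mp3", nodes))
```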
Common Issues
Organizing and maintaining the overlay network: node arrivals, node failures.
Resource allocation/load balancing.
Resource location.
Locality (network proximity).
Idea: a generic P2P substrate.
Object Distribution
Consistent hashing [Karger et al. '97]: a 128-bit circular id space (ids range from 0 to 2^128 - 1); nodeIds and objIds are uniform random.
Invariant: the node with the numerically closest nodeId maintains the object.
Each node has a randomly assigned 128-bit nodeId in a circular namespace.
Basic operation: a message with key X, sent by any Pastry node, is delivered to the live node with nodeId closest to X in at most log16 N steps (barring node failures).
Pastry uses a form of generalized hypercube routing, where the routing tables are initialized and updated dynamically.
Object Insertion/Lookup
A message with key X is routed to the live node with the nodeId closest to X: Route(X).
Problem: a complete routing table at every node is not feasible.
Routing Properties
log16 N routing steps (expected).
O(log N) routing state per node.
Leaf Sets
Each node maintains the IP addresses of the nodes with the L numerically closest larger and smaller nodeIds, respectively.
Used for: routing efficiency/robustness, fault detection (keep-alive), application-specific local coordination.
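A sketch of how such a leaf set could be computed on the circular id space, assuming L/2 entries on each side (consistent with the L/2-failure bound on the Routing slide); the function name and layout are illustrative.

```python
def leaf_set(node_id: int, all_ids: list[int], L: int = 8, bits: int = 128):
    """Return (smaller, larger): the L/2 numerically closest nodeIds on each side."""
    ring = 1 << bits
    others = [i for i in all_ids if i != node_id]
    # Distance going "down" the ring (toward smaller ids) and "up" (toward larger ids).
    smaller = sorted(others, key=lambda i: (node_id - i) % ring)[: L // 2]
    larger  = sorted(others, key=lambda i: (i - node_id) % ring)[: L // 2]
    return smaller, larger
```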
Routing Procedure
if (destination D is within range of our leaf set)
    forward to the numerically closest member
else
    let l = length of the prefix shared with D
    let d = value of the l-th digit in D's address
    if (R[l][d] exists)
        forward to R[l][d]
    else
        forward to a known node that
        (a) shares at least as long a prefix with D, and
        (b) is numerically closer to D than this node
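A runnable sketch of this procedure under simplifying assumptions (base-16 digits, a flat dict for the routing table, and a leaf-set range check that ignores ring wraparound); the names are illustrative, not the Pastry implementation.

```python
B = 4                      # bits per digit (b = 4), so digits are hexadecimal
DIGITS = 128 // B          # 32 hex digits in a 128-bit id

def digit(node_id: int, i: int) -> int:
    """Return the i-th most significant base-16 digit of a 128-bit id."""
    return (node_id >> ((DIGITS - 1 - i) * B)) & 0xF

def shared_prefix_len(a: int, b: int) -> int:
    """Number of leading base-16 digits a and b have in common."""
    for i in range(DIGITS):
        if digit(a, i) != digit(b, i):
            return i
    return DIGITS

def route(node_id, key, leaf_set, routing_table, known_nodes):
    """Pick the next hop for `key`, following the routing rule on the slide."""
    if key == node_id:
        return node_id
    # Case 1: key falls inside the leaf set range -> numerically closest node.
    if leaf_set and min(leaf_set) <= key <= max(leaf_set):
        return min(leaf_set + [node_id], key=lambda n: abs(n - key))
    # Case 2: use the routing table entry R[l][d] if it exists.
    l = shared_prefix_len(node_id, key)
    d = digit(key, l)
    next_hop = routing_table.get((l, d))
    if next_hop is not None:
        return next_hop
    # Case 3 (rare): any known node with an at-least-as-long shared prefix
    # that is numerically closer to the key than this node.
    for n in known_nodes:
        if shared_prefix_len(n, key) >= l and abs(n - key) < abs(node_id - key):
            return n
    return node_id    # this node is the closest live node it knows of
```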
Routing
Integrity of the overlay: guaranteed unless there are L/2 simultaneous failures of nodes with adjacent nodeIds.
Number of routing hops:
No failures: < log16 N expected, 128/b + 1 maximum.
During failure recovery: O(N) worst case; the average case is much better.
Node Addition
Node Departure (Failure)
Leaf set members exchange keep-alive messages.
Leaf set repair (eager): request the set from the farthest live node in the set.
Routing table repair (lazy): get table entries from peers in the same row, then from higher rows.
API
route(M, X): route message M to the node with nodeId numerically closest to X.
deliver(M): deliver message M to the application.
forwarding(M, X): message M is being forwarded towards key X.
newLeaf(L): report a change in leaf set L to the application.
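The same API rendered as a Python interface sketch; the method names follow the slide, while the class structure and upcall arrangement are assumptions for illustration.

```python
from abc import ABC, abstractmethod

class PastryApp(ABC):
    """Upcalls the overlay makes into the application layered on top of it."""
    @abstractmethod
    def deliver(self, msg) -> None:
        """Called when a message arrives at the node whose nodeId is closest to its key."""
    @abstractmethod
    def forwarding(self, msg, key) -> None:
        """Called at each intermediate node as the message is routed towards key."""
    @abstractmethod
    def newLeaf(self, leaf_set) -> None:
        """Called when this node's leaf set changes."""

class PastryNode:
    """Downcall the application makes into the overlay."""
    def route(self, msg, key) -> None:
        """Route msg to the live node with nodeId numerically closest to key."""
        ...
```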
PAST: Cooperative, Archival File Storage and Distribution
Layered on top of Pastry.
Strong persistence, high availability, scalability.
Reduced cost (no backup).
Efficient use of pooled resources.
PAST API
Insert: store a replica of a file at k diverse storage nodes.
Lookup: retrieve the file from a nearby live storage node that holds a copy.
Reclaim: free the storage associated with a file.
Files are immutable.
PAST: File Storage
Storage invariant: file "replicas" are stored on the k nodes with nodeIds closest to the fileId (k is bounded by the leaf set size), e.g., k = 4.
PAST file storage is mapped onto the Pastry overlay network by maintaining the invariant that replicas of a file are stored on the k nodes that are numerically closest to the file's numeric fileId.
During an insert operation, an insert request for the file is routed using the fileId as the key. The node closest to fileId replicates the file on the k-1 next-nearest nodes in the namespace.
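A small simulation of the storage invariant (not the PAST code): given a set of nodeIds, the file's replicas end up on the k nodes numerically closest to its fileId on the ring. The 8-bit id space in the example is only to keep the numbers readable.

```python
def replica_holders(file_id: int, node_ids: list[int], k: int = 4, bits: int = 128) -> list[int]:
    """Return the k nodeIds closest to file_id on the circular id space."""
    ring = 1 << bits
    def ring_dist(n: int) -> int:
        d = abs(n - file_id)
        return min(d, ring - d)
    return sorted(node_ids, key=ring_dist)[:k]

# The node closest to fileId (first in the list) stores the file and replicates it
# on the k-1 next-nearest nodes.
nodes = [0x10, 0x42, 0x77, 0x9B, 0xC0, 0xEE]
print(replica_holders(file_id=0x80, node_ids=nodes, k=4, bits=8))   # 0x77, 0x9B, 0x42, 0xC0
```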
PAST: File Retrieval
A file is located in log16 N steps (expected); the lookup usually finds the replica nearest to the client.
A lookup request is routed in at most log16 N steps to a node that stores a replica, if one exists. In practice, the node among the k replica holders that first receives the message serves the file. Furthermore, the network locality properties of Pastry ensure that this node is usually the one closest to the client in the network.