Published by Mavis King. Modified over 9 years ago.
1 Peer-to-Peer Systems
2 Introduction
What is a peer?
– One that is of equal standing with another
Peer-to-peer
– A way of structuring distributed applications
– Each node acts as both a client and a server
3 Client-server vs. peer-to-peer networks
Example: how to find an object in the network
Client-server approach
– Use a big server to store objects and provide a directory for lookup
Peer-to-peer approach
– Data are fully distributed
– Each peer acts as both a client and a server
– Lookup by asking other peers?
Client-server
– The client is dumb
– The server does most things, but…
Peer-to-peer
– The peers have equal functionality: client, server, router
4 Characteristics of P-to-P
Multiple peers participate in the network
The number of peers is large
Each peer contributes some shared resources
Distributed, decentralized
Self-controlled
Ad hoc participation
Dynamic
Resource sharing, cost sharing
5 Applications
File sharing: Napster, Gnutella
Instant messaging: ICQ
Gaming
Information hiding
Etc.
6 General assumption in P-to-P
7 Topology
Distributing objects, centralizing the directory
– Napster
– The most famous system; it motivated much of P2P research
Distributing objects without centralizing the directory
– Gnutella
  – No centralized directory servers
  – Pings the net to locate friends
  – File requests are broadcast to friends
  – When a provider is located, the file is transferred via HTTP
– Freenet
8 Distributing objects and directories
– Chord
– CAN
– Hypercube
– PRR
– Pastry
– Tapestry
– YAPPERS
– Etc.
Distributing objects and multiple servers
– Super-peer networks
9 Desirable properties
Deterministic location
– If an object exists anywhere in the network, it should be found
Routing locality
– Routes should have low stretch (route length close to the shortest-path distance)
Load balance
– The load of storing objects (or object locations) and routing information should be evenly distributed over the network nodes
Dynamic membership
– The network should adapt to joining and leaving nodes while maintaining the above properties
12 Gnutella: summary
Fully distributed
Simple, efficient, flexible queries
High network traffic
The cost of a search is unbounded
The lifetime of a message is unknown
– Only its hop count is known, not its duration
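The traffic cost of broadcasting requests can be illustrated with a small flooding simulation. This is a minimal sketch, not Gnutella's actual wire protocol: the overlay graph, the `flood_query` function and the explicit hop limit `ttl` are assumptions made for illustration.

```python
from collections import deque

def flood_query(graph, origin, has_file, ttl):
    """Flood a query from origin for at most ttl hops.

    Returns (providers found, number of messages sent).
    """
    seen = {origin}
    frontier = deque([(origin, ttl)])
    providers, messages = [], 0
    while frontier:
        node, hops_left = frontier.popleft()
        if node in has_file:
            providers.append(node)
        if hops_left == 0:
            continue
        for peer in graph[node]:
            messages += 1              # every forwarded copy costs one message
            if peer not in seen:       # duplicates are dropped by the receiver
                seen.add(peer)
                frontier.append((peer, hops_left - 1))
    return providers, messages

# Hypothetical overlay: A-B, A-C, B-D, C-D, D-E
OVERLAY = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
```

With this overlay, a query from A for a file held only by E finds nothing with a hop limit of 2 and needs a hop limit of 3, at the price of more messages, many of them duplicates.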
13 Freenet
Selective routing
– Queries for files follow a route biased by hints
Replication of data and clustering
– Key clustering
– Improves data availability
14 Chord
A distributed lookup protocol
The routing table is distributed
Given a key, it maps the key onto a node
15 Base Chord protocol
Consistent hashing
The consistent hash function assigns each node and key an m-bit identifier using a base hash function
– A node's identifier is chosen by hashing the node's IP address
– A key identifier is produced by hashing the key
– m must be large enough to make the probability of two nodes hashing to the same identifier negligible
– Identifiers are ordered on an identifier circle modulo 2^m
– Key k is assigned to the first node whose identifier is equal to or follows k in the identifier space; this node is called successor(k)
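The key-to-node mapping described on this slide can be sketched in a few lines. This is an illustration under stated assumptions: SHA-1 is used as the base hash function, m is set to a small value (16) for readability, and the names `chord_id` and `successor` are chosen here, not taken from the slides.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits (assumed small value for illustration)

def chord_id(value, m=M):
    """Hash a string (IP address or key) to an m-bit identifier."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def successor(key_id, node_ids):
    """First node identifier equal to or following key_id on the circle."""
    ring = sorted(node_ids)
    idx = bisect_left(ring, key_id)
    return ring[idx % len(ring)]  # wrap around past the largest identifier

# Usage: place three hypothetical nodes on the circle and locate a key.
nodes = [chord_id(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
owner = successor(chord_id("some-file.txt"), nodes)
```

The wrap-around in `successor` is what makes the identifier space a circle: a key larger than every node identifier is assigned to the node with the smallest identifier.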
17 What should be done when a node n joins or leaves the system?
Scalable key location
Routing information
– If each node is only aware of its successor on the circle, lookups are inefficient (a lookup may traverse all N nodes)
Each node n maintains a routing table with at most m entries, the finger table
– A node's finger table generally does not contain enough information to determine the successor of an arbitrary key k
– The finger pointers are at repeatedly doubling distances around the circle, so each forwarding step halves the distance to the target identifier
– A lookup therefore takes O(log N) hops
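The finger-table lookup can be simulated on a small circle. The sketch below assumes m = 6 and a hypothetical set of ten node identifiers; each node's fingers are recomputed from a global view of the ring, which a real deployment would not have. The function names are illustrative.

```python
from bisect import bisect_left

M = 6                      # identifier bits, so the circle has 2**M positions
SIZE = 2 ** M
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # hypothetical node IDs, sorted

def successor(ident, ring=RING):
    """First node whose identifier is equal to or follows ident on the circle."""
    idx = bisect_left(ring, ident % SIZE)
    return ring[idx % len(ring)]

def finger_table(n, ring=RING):
    """Entry i (0-based) points to successor(n + 2**i): doubling distances."""
    return [successor(n + 2 ** i, ring) for i in range(M)]

def between(x, a, b, right_closed):
    """True if x lies in the circular interval (a, b), or (a, b] if right_closed."""
    span = (b - a) % SIZE or SIZE   # a == b denotes the whole circle
    off = (x - a) % SIZE or SIZE    # x == a counts as one full turn away
    return off <= span if right_closed else off < span

def find_successor(start, key, ring=RING):
    """Route a lookup for key starting at node start; return (owner, hops)."""
    n, hops = start, 0
    while True:
        fingers = finger_table(n, ring)
        if between(key, n, fingers[0], right_closed=True):
            return fingers[0], hops
        # Forward to the farthest finger that still precedes the key.
        n = next(f for f in reversed(fingers)
                 if between(f, n, key, right_closed=False))
        hops += 1
```

Looking up key 54 from node 8 is forwarded to node 42, then to node 51, which knows that its successor 56 owns the key: two forwarding steps instead of walking the successor pointers through seven nodes.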
19 Node joins
Each node maintains a predecessor pointer
When node n joins:
– Initialize the predecessor and fingers of node n
– Update the fingers and predecessors of existing nodes to reflect the addition of n
  – Node n becomes the i-th finger of node p iff p precedes n by at least 2^(i-1) and the i-th finger of node p succeeds n
– Transfer and publish keys
21 Failures
When a node n fails, nodes whose finger tables include n must find n's successor
Each node maintains a "successor list" of its r nearest successors
Replication
22 CAN (Content-Addressable Network)
The entire CAN space is divided among the nodes currently in the system
When a new node joins:
– The new node must find a node already in the CAN
– Using the CAN routing mechanisms, it must find a node whose zone will be split
– The neighbors of the split zone must be notified so that routing can include the new node
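The zone split at join time can be sketched as follows. This is a simplified illustration: it keeps a global list of zones instead of routing to the target zone, it splits along the longest side, and it gives the joining node the half that contains its chosen point. All three choices, and the names `Zone` and `join`, are assumptions of this sketch.

```python
from dataclasses import dataclass

@dataclass
class Zone:
    owner: str
    lo: tuple  # lower corner (inclusive)
    hi: tuple  # upper corner (exclusive)

    def contains(self, point):
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))

    def volume(self):
        v = 1.0
        for l, h in zip(self.lo, self.hi):
            v *= h - l
        return v

def join(zones, new_owner, point):
    """Split the zone containing point; the new node takes the half holding it."""
    old = next(z for z in zones if z.contains(point))
    # Split along the longest side (ties go to the lowest dimension).
    dim = max(range(len(old.lo)), key=lambda d: old.hi[d] - old.lo[d])
    mid = (old.lo[dim] + old.hi[dim]) / 2
    lower = Zone(old.owner, old.lo, old.hi[:dim] + (mid,) + old.hi[dim + 1:])
    upper = Zone(old.owner, old.lo[:dim] + (mid,) + old.lo[dim + 1:], old.hi)
    if lower.contains(point):
        lower.owner = new_owner
    else:
        upper.owner = new_owner
    zones.remove(old)
    zones.extend([lower, upper])
    return lower, upper

# Usage: node A owns the whole unit square, then B and C join.
zones = [Zone("A", (0.0, 0.0), (1.0, 1.0))]
join(zones, "B", (0.8, 0.3))
join(zones, "C", (0.9, 0.9))
```

After both joins the zones still tile the unit square, and only the nodes adjacent to the split zone would need to update their neighbor sets.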
24 Node departure, recovery and CAN maintenance
One of the failed node's neighbors takes over the zone
The (key, value) pairs held by the departing node are lost until the state is refreshed by the holders of the data
Takeover algorithm
– Each neighbor of the failed node starts a takeover timer, running independently
– When its timer expires, the peer sends a TAKEOVER message conveying its own zone volume to all of the failed node's neighbors
– The volumes are compared: the node that is still alive and has the smallest zone volume is chosen
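The outcome of the takeover comparison can be sketched without modelling timers or messages. The function below only captures the selection rule (alive, smallest zone volume); the name `choose_takeover`, the input layout and the tie-break by node name are assumptions of this sketch.

```python
def choose_takeover(neighbors):
    """neighbors maps a node name to (alive, zone_volume).

    Returns the neighbor that takes over the failed node's zone,
    or None if no neighbor is alive.
    """
    alive = {name: vol for name, (ok, vol) in neighbors.items() if ok}
    if not alive:
        return None
    # Smallest zone volume wins; ties are broken by name (assumption).
    return min(alive, key=lambda name: (alive[name], name))

# Usage: C has the smallest zone but is itself down, so B takes over.
winner = choose_takeover({
    "A": (True, 0.25),
    "B": (True, 0.125),
    "C": (False, 0.0625),
})
```

Preferring the smallest zone keeps zone volumes, and therefore load, roughly balanced after the takeover.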
25 Design improvements
Multi-dimensional coordinate spaces
– Increasing the number of dimensions reduces the routing path length
RTT (round-trip time) weighted routing
– Aims at reducing the latency of individual hops along the path, not at reducing the path length
Overloading coordinate zones
– Allow multiple nodes to share the same zone
– A node maintains a list of its peers in addition to its neighbor list
– Advantages: reduced per-hop latency, improved fault tolerance
Multiple hash functions
– Improve data availability
– Map a single key onto k points (replication)
26 Hypercube routing
Node and object IDs are drawn from the same ID space, which can be thought of as a ring
Each node's ID is represented by d digits of base b
– Example: 32-bit ID => 8 hex digits (b = 16)
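The digit representation in the example can be checked directly. A small sketch, with `to_digits` as an illustrative helper name and an arbitrary 32-bit value as input:

```python
def to_digits(node_id, d=8, b=16):
    """Write node_id as d digits of base b, most significant digit first."""
    digits = []
    for _ in range(d):
        node_id, remainder = divmod(node_id, b)
        digits.append(remainder)
    return digits[::-1]

# Usage: a 32-bit ID becomes 8 hex digits.
digits = to_digits(0xCAFE1234)  # [12, 10, 15, 14, 1, 2, 3, 4]
```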
27 Neighbor table
Each node's neighbor table consists of d levels with b entries at each level
Each node also keeps track of its reverse neighbors
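A d-by-b neighbor table and digit-by-digit routing over it can be sketched as below. The slide does not say how entries are chosen, so this sketch assumes prefix matching (the entry at level i, digit j shares i leading digits with the node and has j as its next digit); schemes in this family may match suffixes instead. The node IDs and function names are hypothetical, and the global node list stands in for the join protocol that would normally fill the tables.

```python
NODES = ["0123", "0210", "3012", "3320", "3321"]  # hypothetical IDs, d = 4, b = 4

def neighbor_table(me, nodes, d=4, b=4):
    """table[level][digit]: a node sharing `level` leading digits with me
    whose next digit is `digit`, or None if no such node exists."""
    table = [[None] * b for _ in range(d)]
    for level in range(d):
        for digit in range(b):
            matches = [n for n in nodes
                       if n != me and n[:level] == me[:level]
                       and n[level] == str(digit)]
            if matches:
                table[level][digit] = min(matches)  # any match would do
    return table

def route(src, dst, nodes):
    """Resolve at least one more digit of dst per hop; return the visited nodes."""
    path = [src]
    while path[-1] != dst:
        me = path[-1]
        level = next(i for i in range(len(dst)) if me[i] != dst[i])
        path.append(neighbor_table(me, nodes)[level][int(dst[level])])
    return path
```

Routing from 0123 to 3320 first fixes the leading digit by moving to 3012, then fixes the second digit and arrives at 3320. Node 0123 is then a reverse neighbor of 3012, since 3012 appears in its table.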
28 Routing scheme
Example
Join protocol (index maintenance)
– Single join
– Multiple joins
– Sequential joins
– Concurrent joins
  – Independent joins
  – Dependent joins
29 YAPPERS: A Peer-to-Peer Lookup Service over Arbitrary Topology
Prasanna Ganesan, Qixiang Sun, Hector Garcia-Molina
IEEE INFOCOM 2003
30 Intro.
Build a small DHT consisting of nearby nodes, then provide an intelligent search mechanism that can traverse all the small DHTs
YAPPERS (Yet Another Peer-to-PEeR System) operates on top of an arbitrary overlay network
Lookup service
– Partial lookup
– Total lookup
31 Key concept
Each node and each key is assigned a color (white or gray in this example)
If a white node A wants to register a value for a white key, the pair can be stored at A itself
For a gray key and its value, A looks for a neighboring gray node
A query for a gray key needs to be forwarded only to gray nodes
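The coloring idea can be sketched with two colors. The slide does not say how colors are derived, so hashing the node address or the key is an assumption here, as are the names `color_of` and `register_at` and the example addresses.

```python
import hashlib

COLORS = ("white", "gray")  # two colors, as in the example above

def color_of(name):
    """Map a node address or a key to a color by hashing (assumed scheme)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return COLORS[digest[0] % len(COLORS)]

def register_at(node, neighbors, key):
    """Pick the node that stores key when `node` registers it."""
    wanted = color_of(key)
    if color_of(node) == wanted:
        return node                      # same color: store locally
    for peer in sorted(neighbors):
        if color_of(peer) == wanted:
            return peer                  # otherwise use a neighbor of that color
    return None                          # no such neighbor: a backup scheme is needed

# Usage with hypothetical addresses.
target = register_at("10.0.0.1", ["10.0.0.2", "10.0.0.3", "10.0.0.4"], "song.mp3")
```

Because a key's color is fixed by the key alone, a query for that key only has to visit nodes of the same color, which is what cuts the search cost.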
33 We call the nodes within h hops of a node its immediate neighborhood
The nodes within 2h+1 hops form its extended neighborhood
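Both neighborhoods can be computed with a breadth-first search bounded by a hop radius. A minimal sketch, assuming the overlay is given as an adjacency dictionary and that a node counts as part of its own neighborhood; the function name and the example graph are illustrative.

```python
from collections import deque

def neighborhood(graph, start, radius):
    """Set of nodes within `radius` hops of start (start included)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue                 # do not expand beyond the radius
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

# Hypothetical overlay: a path 0 - 1 - 2 - 3 - 4 - 5 - 6
PATH = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
h = 1
immediate = neighborhood(PATH, 0, h)          # within h hops
extended = neighborhood(PATH, 0, 2 * h + 1)   # within 2h+1 hops
```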
34 Basic algorithm
Consistency: if a node X is in two different neighborhoods IN(A) and IN(B), both A and B assign the same color to node X
Stability: X is assigned the same color regardless of how IN(A) changes dynamically when nodes enter or leave
– Stability reduces data relocation
The key assignment:
35 The immediate neighborhood
Multiple nodes in IN(X) have the same color?
– X may pick any one of these nodes to store the key
No node in IN(X) has color C?
– Use a backup assignment scheme
36 Backup assignment
When no node in IN(X) has color C_i, color C_i is assigned to a node with color C_{(i+1) mod b}
If there are multiple nodes of color C_{(i+1) mod b}, choose the node with the smallest IP address
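The backup rule can be written as a short function over the colors present in IN(X). This is a sketch under assumptions: colors are represented by their index 0..b-1, IP addresses are compared numerically, and when the next color is also missing the search continues to the following colors, which the slide does not specify. The function name and addresses are hypothetical.

```python
import ipaddress

def responsible_nodes(in_x, i, b):
    """in_x maps the IP of each node in IN(X) to its color index.

    Returns the nodes that may store keys of color i: every node of
    color i if there is one (X may pick any of them), otherwise the
    single backup node chosen by the rule above.
    """
    for step in range(b):
        color = (i + step) % b
        matches = [ip for ip, c in in_x.items() if c == color]
        if not matches:
            continue  # assumption: keep trying the following colors
        if step == 0:
            return sorted(matches, key=ipaddress.ip_address)
        return [min(matches, key=ipaddress.ip_address)]
    return []

# Usage: no node of color 0, two nodes of color 1, one node of color 2.
IN_X = {"10.0.0.10": 1, "10.0.0.9": 1, "10.0.0.3": 2}
backup = responsible_nodes(IN_X, 0, 3)   # ["10.0.0.9"]
```

Comparing parsed addresses rather than strings matters here: as text, "10.0.0.10" sorts before "10.0.0.9".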
37 In resolving the pitfalls mentioned above, our solution is no longer consistent and stable as envisioned earlier
By probabilistic analysis, it can be shown that if a node A has b log b nodes in IN(A), then with high probability there exists a node of each color
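One way to see why on the order of b log b nodes suffice is a union bound. The derivation below is a sketch, not the authors' analysis: it assumes each of the n nodes in IN(A) receives one of the b colors independently and uniformly at random, and it introduces a constant c > 1 that the slide does not mention.

```latex
\begin{align*}
\Pr[\text{color } C_i \text{ is missing from } IN(A)]
  &= \left(1 - \frac{1}{b}\right)^{n} \le e^{-n/b}, \\
\Pr[\text{some color is missing from } IN(A)]
  &\le b \, e^{-n/b}, \\
n = c \, b \ln b \;\Longrightarrow\;
\Pr[\text{some color is missing from } IN(A)]
  &\le b \cdot b^{-c} = b^{\,1-c}.
\end{align*}
```

For example, with c = 2 the probability that any color is missing is at most 1/b.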
38 Maintaining topology
Edge deletion: when deleting an edge (X, Y), both X and Y broadcast the deletion event to their surviving neighbors with a TTL of 2h
Edge insertion: when adding an edge (X, Y), a "trim" technique is performed by the nodes connected to X and Y
39 Enhancement
Solutions to the fringe node problem:
– Pruning: if X is a fringe node, X doesn't participate in YAPPERS directly; it selects a nearby high-connectivity node Y as its proxy
– Biased backup: forbid a node with a small immediate neighborhood from assigning backup colors to a node with a large immediate neighborhood
40 Requirements
Expressiveness
– Work in P2P search has focused on answering simple queries
Types of queries
– Key lookup
– Keyword
– Range query
– Aggregates
– SQL
41 Autonomy, Efficiency and Robustness
Autonomy
– The freedom of nodes to join and leave
Efficiency
– Bandwidth, processing power
Robustness
– Stability in the presence of failures
42 Comprehensiveness
Quality of service
– Number of results
– Response time
– Relevance