A Self-Repairing Peer-to-Peer System Resilient to Dynamic Adversarial Churn. Fabian Kuhn, Microsoft Research, Silicon Valley; Stefan Schmid, ETH Zurich; Roger Wattenhofer, ETH Zurich. Some slides taken from Stefan Schmid's presentation of his Master's thesis.
Churn Unlike servers, peers are transient! Machines are under the control of individual users (e.g., just connecting to download one file). Membership changes (joins and leaves) are called churn. In peer-to-peer systems, each machine acts as both client _and_ server at the same time. However, unlike real servers, the machines often connect to the network only for short periods of time, for example to download a file. Successful P2P systems have to cope with churn (i.e., guarantee correctness, efficiency, etc.)!
Churn characteristics Depends on the application (Skype vs. eMule vs. …). But: there may be dozens of membership changes per second! Peers may crash without notice! How can peers collaborate in spite of churn? Of course, not every P2P system has the same amount of churn. For example, a recent study shows that many users leave their Internet-telephony application (Skype) running for long time periods. On the other hand, one can imagine that in a file-sharing system such as eMule there is _more_ churn, as users only connect to download one or two files and then disconnect, for example because of legal concerns. So how can we perform reasonable tasks in spite of the churn?
Churn threatens the advantages of P2P (figure: a lot of churn). So, this is an example to motivate why we really have to cope with churn in a proactive fashion. Assume that the machines are arranged in a hypercubic topology. This topology is highly scalable, as each peer has logarithmic degree and the network diameter is also logarithmic, which allows for fast lookups. If many peers now join at this peer <point>, and all other peers crash, a very undesirable topology arises <point>: one peer is connected to all other ones and becomes a bottleneck. What can we guarantee in the presence of churn? We have to actively maintain P2P systems!
Goal of the paper Only a small number of P2P systems have been analyzed under churn! This paper presents techniques to: - Provably maintain P2P systems with desirable properties… - in spite of ongoing worst-case membership changes. Peer degree, network diameter, … Adversary continuously attacks the weakest part (The system is never fully repaired, but always fully functional)
How does Churn affect P2P systems? Objects may be lost when the host crashes Queries may not make it to the destination
Think about this What is the big deal about churn? Does not every P2P system define Join and Leave protocols? Well, the system eventually recovers, but during recovery, services may be affected. And objects not replicated are lost. Observe the difference between non-masking and masking fault tolerance. What we need is some form of masking tolerance.
Model for Dynamics We assume a worst-case perspective: adversary A(J, L) induces J joins and L leaves every round, anywhere in the system. We assume a synchronous model: time is divided into rounds. Further refinement: adversary A(J, L, r) induces J joins and L leaves every r rounds. The topology is assumed to be a hypercube, which has O(log n) degree and O(log n) diameter.
Topology Maintenance π1 π2 Challenges in maintaining the hypercube! How does peer 1 know that it should replace peer 2? How does it get there when there are concurrent joins and leaves? …
The Proposed Approach Simple idea: Simulate the topology! So the solution we propose is to use _several_ peers per node rather than just one. That is, we propose to take a classic topology and then simulate each vertex with _many_ peers. We can show that with this trick, the resulting structure again has the desirable properties of the original graph (small diameter, small degree), but _now it can also be maintained_! (Figure: several peers per node.)
General Recipe for Robust Topologies
1. Take a graph with desirable properties (low diameter, low peer degree, etc.).
2. Replace vertices by a set of peers.
3. Maintain it:
a. Permanently run a peer distribution algorithm which ensures that all vertices have roughly the same number of peers ("token distribution algorithm").
b. Estimate the total number of peers in the system and change the "dimension of the topology" accordingly ("information aggregation algorithm" and "scaling algorithm").
So our approach is as follows. First, we take a graph which features desirable properties. We then replace each node of this graph by a _set_ of peers. In order to maintain the network, we use _two algorithms_: <Click> The first algorithm makes sure that each node always has roughly the same number of peers, _regardless_ of the worst-case churn. In particular, each node always has at least one peer, but also not _too many_, in order to bound the _peer degree_. We call this the "peer distribution algorithm" or "token distribution algorithm". A second algorithm is used to change the dimension of the topology with respect to the number of peers in the system. E.g., when many peers leave, we reduce the number of nodes of the original graph. This is called the "information aggregation algorithm", as it estimates the total number of peers in the system. With these techniques, we get a network which has the same nice _properties_ as the original graph (diameter, degree, etc.), but which is also maintainable under churn. <Click>
Resulting structure has similar properties as the original graph (e.g., connectivity, degree, …), but is also maintainable under churn! There is always at least one peer per node (but not too many either).
Dynamic Token Distribution V = 11011 (a peers), W = 10010 (b peers). After one step of recovery, both V and W will contain (a+b)/2 peers. Try this once for each dimension of the hypercube (dimension exchange method).
Theorem The discrepancy φ is the maximum difference between the token counts of any pair of nodes. The goal is to reduce the discrepancy to 0. The previous step reduces φ to 0 for fractional tokens, but for a d-dimensional hypercube with integer tokens, φ = d in the worst case. In the presence of an A(J, L, 1) adversary, the proposed algorithm maintains the invariant φ ≤ 2J + 2L + d.
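The dimension exchange method and the notion of discrepancy can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: nodes are assumed to be indexed by their d-bit labels, tokens stand for peers, and the rounding rule (the lower-indexed node keeps the extra token) is an arbitrary choice made here. The function names are hypothetical.

```python
def dimension_exchange(tokens, d):
    """One pass over all d dimensions of a hypercube with 2**d nodes.

    tokens[v] is the number of peers (tokens) at node v, where v is the
    integer value of the node's d-bit label. Integer tokens only.
    """
    tokens = list(tokens)
    for i in range(d):
        for v in range(2 ** d):
            w = v ^ (1 << i)          # neighbor of v across dimension i
            if v < w:                 # handle each pair once
                total = tokens[v] + tokens[w]
                # Assumption: the lower-indexed node keeps the odd token.
                tokens[v] = (total + 1) // 2
                tokens[w] = total // 2
    return tokens


def discrepancy(tokens):
    """Maximum difference between the token counts of any two nodes."""
    return max(tokens) - min(tokens)


balanced = dimension_exchange([7, 0, 0, 0, 0, 0, 0, 0], d=3)
print(balanced, discrepancy(balanced))
```

With integer tokens the pass cannot always reach discrepancy 0 (seven tokens do not divide evenly over eight nodes), which is the effect the theorem accounts for with the additive term d.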
Information aggregation When the total number of peers N exceeds an upper bound, each node splits into two, and the dimension of the hypercube has to increase by 1. Similarly, when the total number of peers N falls below a lower bound, pairs of nodes in dimension (d-1) merge into one, and the dimension of the hypercube has to decrease by 1. Thus, the system needs a mechanism to keep track of N.
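One way to picture how every node could learn N is an all-to-all sum along the hypercube dimensions, followed by a threshold test. The sketch below assumes no churn during the aggregation, and the thresholds `low` and `high` (peers per node) are hypothetical parameters; the slides do not give the actual bounds.

```python
def aggregate_counts(local_counts, d):
    """After d exchange rounds, every node knows the total number of peers.

    local_counts[v] is the number of peers simulating node v of a
    d-dimensional hypercube. In round i each node adds the partial sum
    held by its neighbor across dimension i.
    """
    known = list(local_counts)
    for i in range(d):
        known = [known[v] + known[v ^ (1 << i)] for v in range(2 ** d)]
    return known


def scale_dimension(n_peers, d, low, high):
    """Return the new dimension given the estimated peer count.

    Assumption: grow when the average load exceeds `high` peers per node,
    shrink when it drops below `low` peers per node.
    """
    nodes = 2 ** d
    if n_peers > high * nodes:
        return d + 1
    if n_peers < low * nodes and d > 0:
        return d - 1
    return d


totals = aggregate_counts([3, 1, 4, 1], d=2)
print(totals)                                   # every entry is 9
print(scale_dimension(totals[0], 2, low=1, high=2))
```

Because every node ends up with the same total, all nodes reach the same scaling decision without further coordination, at least in this churn-free simplification.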
Simulated hypercube Given an adversary A(d+1, d+1, 6)*, the outdegree of every peer is bounded by O(log² N), and the diameter is bounded by O(log N). * The adversary inserts and deletes at most (d+1) peers during any time interval of 6 rounds.
Topology Only the core peers store data items. Despite churn, at least one peer in each core has to survive. (Figure: example topology for d=2, showing core and periphery.) Peers in each core are connected to one another and to the peers of the cores of the neighboring nodes. Q. What does the periphery node do?
6-round maintenance algorithm
The authors use six rounds for one dimension in each phase.
Round 1. Each node takes a snapshot of the active peers within itself.
Round 2. Exchange snapshots.
Round 3. Preparation for peer migration.
Round 4. Core sends IDs of new peers to periphery. Reduce dimension if necessary.
Round 5. Dimension growth and building new core (2d+3).
Round 6. Exchange information about the new core.
Further improvement: Pancake Graph (1) A robust system with degree and diameter O(log n / log log n): the pancake graph (most papers refer to Papadimitriou & Gates' contribution here)! Pancake of dimension d: d! nodes, each represented by a unique permutation {l1, …, ld} where li ∈ {1,…,d}. Two nodes u and v are adjacent iff u is a prefix-inversion of v. 4-dimensional pancake (figure: node 1234 and its neighbors 4321, 3214, 2134). So how is this pancake graph defined? A pancake graph of dimension d consists of d factorial many nodes. Each node is a permutation of the numbers 1 to d. Two nodes u and v are _adjacent_ if their labels are related by a _prefix inversion_.
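The adjacency rule can be made concrete with a short Python sketch. It assumes node labels are strings of single digits (so d ≤ 9) and that a prefix inversion reverses the first k symbols for some k ≥ 2; the function names are illustrative only. The breadth-first search is feasible only for small d, since the graph has d! nodes.

```python
from collections import deque


def pancake_neighbors(label):
    """All labels reachable from `label` by reversing a prefix of length >= 2."""
    return [label[:k][::-1] + label[k:] for k in range(2, len(label) + 1)]


def pancake_diameter(d):
    """Diameter of the d-dimensional pancake graph by brute-force BFS.

    Starting from a single node suffices because the graph is
    vertex-transitive (every node looks the same up to relabeling).
    """
    start = "".join(str(i) for i in range(1, d + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in pancake_neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


print(pancake_neighbors("1234"))   # ['2134', '3214', '4321']
print(pancake_diameter(4))
```

Each node has d − 1 neighbors, one per prefix length, which is where the small degree comes from.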
The Pancake Graph (2) Properties: node degree O(log n / log log n), diameter O(log n / log log n), where n is the total number of nodes. A factor log log n better than the hypercube! But: difficult graph (diameter unknown!). No other graph can have a smaller degree and a smaller diameter! The pancake graph has very nice properties. Concretely, each node has degree log/loglog in the total number of vertices, and also log/loglog diameter. So both properties are a factor of loglog n better than in a hypercube. Moreover, one can show that there is no other graph which has smaller diameter and degree! However, the pancake graph is a _difficult graph_. Besides its nice properties, this was also a _reason_ why we chose it. For example, it is still an unsolved problem to compute its diameter! [Note: But there are still fast routing algorithms which can be shown to be near optimal!]
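A short derivation may help to see where the log n / log log n figure comes from and in what sense it cannot be improved. It uses only the definition of the pancake graph (n = d! nodes, degree d − 1), Stirling's approximation, and a simple counting argument. The symbols Δ (maximum degree) and D (diameter) are introduced here for illustration.

```latex
\begin{align*}
n = d! \;&\Rightarrow\; \log n = \Theta(d \log d), \quad \log\log n = \Theta(\log d)
  \;\Rightarrow\; d = \Theta\!\left(\frac{\log n}{\log\log n}\right) \\
\text{degree} &= d - 1 = O\!\left(\frac{\log n}{\log\log n}\right), \qquad
\text{diameter} = O(d) = O\!\left(\frac{\log n}{\log\log n}\right) \\
\text{Any graph: } n &\le 1 + \Delta + \Delta^2 + \dots + \Delta^{D} \le (\Delta + 1)^{D}
  \;\Rightarrow\; D \ge \frac{\log n}{\log(\Delta + 1)} \\
\Delta = O\!\left(\frac{\log n}{\log\log n}\right) \;&\Rightarrow\;
  D = \Omega\!\left(\frac{\log n}{\log\log n}\right)
\end{align*}
```

So a graph whose degree is as small as the pancake's cannot have an asymptotically smaller diameter.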
Contributions Using peer distribution and information aggregation algorithms on the simulated pancake topology, the authors propose a DHT-based peer-to-peer system with: peer degree and lookup / network diameter in O(log n / log log n); robustness to ADV(O(log n / log log n), O(log n / log log n)); no data is ever lost! Asymptotically optimal! When we apply our techniques to the pancake graph, we get a peer-to-peer system where every peer can reach every other peer in log/loglog many rounds, and each peer also has at most log/loglog many neighbors! The topology is also robust to an adversary who can add and remove log/loglog peers per (communication) round. Note that this is asymptotically optimal: if more than log/loglog peers could be removed per time interval, a peer could always be isolated, since each peer has only log/loglog many neighbors. Finally, if we apply the standard distributed hash table approach and store data _redundantly_ at a node, we can guarantee that no data is lost in spite of the dynamics.
The Pancake System So here is a concrete example to give you an idea of the maintenance algorithms. Assume some peers join at this node and some peers leave at this node <point, clicks>. Then, in order to maintain the invariant that each node has at least one peer, we apply our peer distribution algorithm. Concretely, we move the additional peers to the sparse areas. <Click> Of course, only _distributing_ the peers among nodes is _not enough_. If _many_ peers leave the system <Click>, some nodes necessarily run out of peers. Therefore, our solution is to _reduce the dimension_ of the pancake if the total number of peers falls below a certain threshold.
Conclusion A nice model for understanding the effect of churn and dealing with it. But it is too simplistic. So in conclusion, with our algorithms, it is possible to maintain the quality (e.g., routing efficiency) of P2P systems in spite of worst-case membership changes. All you need is a base graph which you can simulate, plus a token distribution algorithm and an information aggregation algorithm on that topology. By simulating the base graph with many peers, it is often possible to adopt its properties. As we have seen for the pancake graph, this requires some additional tricks, such as building a grid inside a node rather than connecting the peers in the same node completely. Moreover, of course, in graphs where the diameter is much larger than the degree, simulation is more difficult and may entail a larger degree because of the discrepancies of the token distribution algorithm. With our techniques, we could create a robust peer-to-peer system which has both smaller degree and smaller diameter than the often-used hypercube. However, the pancake system is also much more difficult! The larger expansion is challenging during the dimension changes. This was also a reason for choosing this graph! We believe that the dynamics of P2P systems are still an interesting and important research area, as many systems today only apply _heuristics_, or are (in theory) only analysed for static environments. In particular, our approach is only a first step. For example, our system fails if an adversary can remove _more peers_ once in a while. That is, there is no _self-stabilizing_ mechanism.