Impact of Neighbor Selection on Performance and Resilience of Structured P2P Networks Sushma Maramreddy
Overlay networks An overlay network is a logical network built on top of one or more underlying networks (e.g., the Internet). The main purpose of these networks is to provide an effective means by which a huge number of computing resources are linked and accessed. Various distributed services can be built on top of these networks. Overlay networks are either structured or unstructured.
Overlay networks Contd.. Unstructured overlay network – a peer joins the network by connecting itself to any node in the network. Structured overlay network – a peer joins the network by connecting itself to some other well-defined peers using a logical identifier.
Authors Byung-Gon Chun, U.C. Berkeley; Ben Y. Zhao, U.C. Santa Barbara; John D. Kubiatowicz, U.C. Berkeley
Impact of neighbor selection on performance and Resilience Introduction Related work Details of neighbor selection Impact of cost functions on Performance and Resilience Conclusion
Introduction Structured overlay networks provide routing to endpoints or nodes inside the network requiring logarithmic steps at each node. Nodes choose their neighbors based on optimization metrics. A recent study by Gummadi has shown that neighbor selection based on network proximity significantly increases overall performance. Problem – network imbalance.
Introduction Contd.. To better model neighbor selection across the networks, a generalized cost model is presented. Most current protocols only consider network proximity in neighbor selection. This paper uses different models based on network proximity and network capacity, and studies the impact they have on lookup latency and static resilience in tree and ring geometries.
Related Work Closest work was done by Gummadi. The authors quantified the impact of routing geometry on performance and resilience. Albert shows a correlation between scale-free nature of networks and resilience to attacks and failures. Several researchers propose optimizing the overlay construction of structured overlays using network proximity, but generally ignore CPU load, storage and bandwidth capacity.
Structured Overlay Construction Each node chooses neighbors that meet logical identifier constraints (e.g., prefix matching or identifier range) and builds direct links to them. These constraints are flexible, so a number of nodes are eligible for each routing table entry. The neighbor selection problem is thus reduced to a cost minimization problem.
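The flexibility of the identifier constraint can be illustrated with a minimal sketch of prefix matching. The function name `candidates_for_entry`, the string identifiers, and the (level, digit) addressing of routing table entries are assumptions made for illustration, not the authors' code.

```python
def candidates_for_entry(node_id, level, digit, all_ids):
    """Return the IDs eligible for routing-table entry (level, digit) of node_id.

    An eligible ID shares the first `level` digits with node_id and has
    `digit` at position `level` (a prefix-matching constraint).
    """
    prefix = node_id[:level] + digit
    return [i for i in all_ids if i != node_id and i.startswith(prefix)]


# Hypothetical 4-digit identifiers.
ids = ["0123", "0200", "0231", "1000", "0210"]

# Entry (level=1, digit="2") of node 0123 accepts any ID starting with "02".
print(candidates_for_entry("0123", 1, "2", ids))  # ['0200', '0231', '0210']
```

Any of the three candidates satisfies the constraint, so a cost function is needed to pick among them.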
Cost Model Optimizing neighbor selection for node i means minimizing the sum of the costs from i to all other nodes. The cost from i to j consists of two factors: node cost and edge cost. Node cost is the cost incurred by intermediate nodes. Edge cost is the cost incurred by the network links.
Cost Model Contd.. N = network size; t(i, j) = traffic from i to j; Cp(i, j) = cost of the path from i to j; V(i, j) = intermediate nodes on the path from i to j
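The equations themselves did not survive on these slides, so the following is only a sketch of the model as described in words: it assumes the path cost is the sum of node costs over the intermediate nodes plus the sum of edge costs over the hops, and that the total cost for node i is the traffic-weighted sum of path costs. The function names and the dictionary-based inputs are hypothetical.

```python
def path_cost(path, node_cost, edge_cost):
    """Cost of one path given as a list of nodes [i, ..., j].

    Assumption: only intermediate nodes (not the endpoints) contribute
    node cost, and every hop contributes its edge cost.
    """
    intermediate = path[1:-1]
    hops = zip(path, path[1:])
    return (sum(node_cost[v] for v in intermediate)
            + sum(edge_cost[(a, b)] for a, b in hops))


def total_cost(i, paths, traffic, node_cost, edge_cost):
    """Traffic-weighted sum of path costs from node i to each destination j."""
    return sum(traffic[(i, j)] * path_cost(p, node_cost, edge_cost)
               for j, p in paths.items())


# Hypothetical costs in milliseconds.
node_cost = {"a": 1, "b": 2, "c": 3}
edge_cost = {("a", "b"): 10, ("b", "c"): 20}
paths = {"b": ["a", "b"], "c": ["a", "b", "c"]}
traffic = {("a", "b"): 1, ("a", "c"): 2}

print(total_cost("a", paths, traffic, node_cost, edge_cost))  # 1*10 + 2*(2+30) = 74
```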
Cost Model Contd.. This model captures the heterogeneity of node capacity, which is a function of bandwidth, computation power, disk access time and so on. For structured networks such as Chord, Pastry and Tapestry, the cost function is defined as follows.
Cost function in structured networks b = neighbor index; nb = neighbor indexed by b; Nb = number of neighbors; Rb = set of destinations routed through nb; cn(i) = node cost of node i; ce(e) = cost of edge e; ce(k, l) = edge cost between k and l.
Neighbor selections Four neighbor selection models Random - choose neighbors randomly Dist – neighbors physically closest in the network Cap – neighbors with smallest processing delay DistCap – neighbors with smallest combined delay (sum of node processing delay and overlay link delay)
Cost functions studied cn(i) - processing delay in node i; ce(i, nb) - direct overlay link delay between node i and node nb.
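The four selection models can be sketched as different ranking keys over the same candidate set. This is an illustrative sketch, not the authors' simulator: the function name `select_neighbor`, the dictionaries `proc_delay` and `link_delay`, and the delay values are all assumed.

```python
import random


def select_neighbor(i, candidates, model, proc_delay, link_delay, rng=random):
    """Pick one neighbor for node i from the eligible candidates.

    proc_delay[n]      -> processing delay of node n (node cost)
    link_delay[(i, n)] -> direct overlay link delay from i to n (edge cost)
    """
    if model == "Random":
        return rng.choice(candidates)
    if model == "Dist":
        key = lambda nb: link_delay[(i, nb)]
    elif model == "Cap":
        key = lambda nb: proc_delay[nb]
    elif model == "DistCap":
        key = lambda nb: proc_delay[nb] + link_delay[(i, nb)]
    else:
        raise ValueError(f"unknown model: {model}")
    return min(candidates, key=key)


# Hypothetical delays in seconds.
proc_delay = {"x": 0.5, "y": 0.1, "z": 0.2}
link_delay = {("i", "x"): 0.01, ("i", "y"): 0.9, ("i", "z"): 0.1}
cands = ["x", "y", "z"]

print(select_neighbor("i", cands, "Dist", proc_delay, link_delay))     # x
print(select_neighbor("i", cands, "Cap", proc_delay, link_delay))      # y
print(select_neighbor("i", cands, "DistCap", proc_delay, link_delay))  # z
```

In this example each model picks a different node, which shows why the choice of cost function changes the shape of the overlay.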
Simulation - set up Simulate Tapestry and Chord protocols as representatives of tree and ring structures. Simulations use 5100-node network topologies. Each node in Chord forwards messages to the live neighbor closest to the destination. A lookup fails if all neighbors before the destination in the namespace fail. For Tapestry, each node forwards messages to the first live neighbor matching one more prefix digit. If all primary and backup links in the routing entry fail, the lookup fails.
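The Chord forwarding and failure rule described above can be sketched as follows. This is a simplified illustration under assumptions: identifiers are small integers on a ring, every identifier is a node, the destination is itself a live node, and the names `clockwise` and `chord_lookup` are hypothetical.

```python
def clockwise(a, b, size):
    """Clockwise distance from a to b on a ring of `size` identifiers."""
    return (b - a) % size


def chord_lookup(src, dst, neighbors, live, size):
    """Return the list of hops from src to dst, or None if the lookup fails.

    neighbors maps a node id to its neighbor ids; live is the set of live nodes.
    """
    path, cur = [src], src
    while cur != dst:
        remaining = clockwise(cur, dst, size)
        # Live neighbors that do not overshoot the destination in the namespace.
        usable = [n for n in neighbors[cur]
                  if n in live and 0 < clockwise(cur, n, size) <= remaining]
        if not usable:
            return None  # every neighbor before the destination has failed
        cur = min(usable, key=lambda n: clockwise(n, dst, size))
        path.append(cur)
    return path


# Hypothetical 8-node ring with neighbors at distances 1, 2 and 4.
SIZE = 8
neighbors = {n: [(n + 2 ** k) % SIZE for k in range(3)] for n in range(SIZE)}

print(chord_lookup(0, 7, neighbors, set(range(SIZE)), SIZE))        # [0, 4, 6, 7]
print(chord_lookup(0, 7, neighbors, set(range(SIZE)) - {4}, SIZE))  # [0, 2, 6, 7]
print(chord_lookup(0, 7, neighbors, {0, 7}, SIZE))                  # None
```

The second call shows the route bending around a failed neighbor; the third shows the failure condition.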
Simulation results - Performance Two different distributions of node processing delay are used: Uniform and Bimodal. In Uniform, we assign the processing delay uniformly from a/10, 2a/10, …, a, where a is the max processing delay. In Bimodal, nodes are either fast or slow. Fast nodes can process 100 messages/sec and slow nodes process 1 message/sec.
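A minimal sketch of how the two delay distributions could be sampled. The function names are hypothetical, and the fraction of fast nodes is not given on the slide, so `fast_fraction` is an explicit assumption left to the caller; the per-message delay is taken as the reciprocal of the processing rate.

```python
import random


def uniform_delay(a, rng):
    """Pick a processing delay uniformly from {a/10, 2a/10, ..., a}."""
    return a * rng.randint(1, 10) / 10


def bimodal_delay(fast_fraction, rng, fast_rate=100.0, slow_rate=1.0):
    """Per-message delay in seconds for a node that is either fast or slow.

    fast_fraction is an assumed parameter (not stated in the source);
    rates are in messages per second, so delay = 1 / rate.
    """
    rate = fast_rate if rng.random() < fast_fraction else slow_rate
    return 1.0 / rate


rng = random.Random(42)
uniform_sample = [uniform_delay(0.5, rng) for _ in range(1000)]
bimodal_sample = [bimodal_delay(0.5, rng) for _ in range(1000)]
print(min(uniform_sample), max(uniform_sample))
print(sorted(set(bimodal_sample)))
```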
Simulation Results - Uniform
Simulation Results - Bimodal
Simulation Results – Static Resilience Measure resilience as the proportion of all pairs of live nodes that can still route to each other after an external event, either randomized node failures or targeted attacks. Assumptions - attacks focus on removing nodes with the highest in-degree in order to maximize damage to overall network reachability. Assume nodes have a uniform processing delay distribution with a=0.5s.
Static Resilience For Tapestry, examine the resilience of the base protocol, the base protocol plus additional backup routes (all chosen using the neighbor selection algorithms), and the base protocol plus backup routes chosen at random. For Chord, examine the base protocol and the base protocol plus sequential neighbors.
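The resilience metric and the targeted-attack assumption can be sketched as below. This is an illustration only: plain graph reachability over live nodes stands in for the protocol-specific routing rule, which is an assumption that overestimates what Chord or Tapestry routing can actually reach, and all function names and the toy overlay are hypothetical.

```python
from itertools import permutations


def in_degrees(neighbors):
    """Count how many nodes list each node as a neighbor."""
    deg = {n: 0 for n in neighbors}
    for nbrs in neighbors.values():
        for n in nbrs:
            deg[n] = deg.get(n, 0) + 1
    return deg


def targeted_failures(neighbors, k):
    """Nodes removed by an attack on the k highest in-degree nodes."""
    deg = in_degrees(neighbors)
    return set(sorted(deg, key=lambda n: (-deg[n], n))[:k])


def reachable(src, dst, neighbors, live):
    """Stand-in routing check: is dst reachable from src over live nodes?"""
    seen, stack = {src}, [src]
    while stack:
        cur = stack.pop()
        if cur == dst:
            return True
        for n in neighbors[cur]:
            if n in live and n not in seen:
                seen.add(n)
                stack.append(n)
    return False


def resilience(neighbors, failed):
    """Fraction of ordered pairs of live nodes that can still route."""
    live = {n for n in neighbors if n not in failed}
    pairs = list(permutations(sorted(live), 2))
    if not pairs:
        return 0.0
    ok = sum(1 for s, d in pairs if reachable(s, d, neighbors, live))
    return ok / len(pairs)


# Hypothetical directed overlay in which node "c" has the highest in-degree.
overlay = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
attacked = targeted_failures(overlay, 1)
print(attacked)                       # {'c'}
print(resilience(overlay, set()))     # 0.75
print(resilience(overlay, attacked))  # 1 of 6 pairs remains routable
```

Removing the single most referenced node drops the routable fraction from 9 of 12 pairs to 1 of 6, which mirrors the intuition behind attacking high in-degree nodes.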
Random node failures - Tapestry
Random Node failure - Chord
Targeted node attacks - Tapestry
Targeted node attacks - Chord
Simulation Results Attacking nodes with high in-degree affects network connectivity severely. Random shows the best attack tolerance among the neighbor selection models. DistCap has worse attack tolerance than Dist, although it has better performance. This result demonstrates a tradeoff between performance and attack resilience in structured P2P overlay construction.
Conclusion and Future Work Took a quantitative approach to examine the benefits and costs of considering network or physical characteristics in overlay construction. The choice of neighbor selection algorithm drives a tradeoff between performance and resilience. If high-degree nodes are attacked, the impact on network connectivity is severe. As future work, investigate the resilience of different geometries under different neighbor selection algorithms.