Self-stabilizing Overlay Networks Sukumar Ghosh University of Iowa Work in progress. Jointly with Andrew Berns and Sriram Pemmaraju (Talk at Michigan Technological University)

On Thursday, 16 August 2007, Skype had an outage, even though Skype is known to be a “self-healing” overlay network. Skype’s explanation: the disruption was triggered by a massive restart of users’ computers across the globe within a very short timeframe, as they rebooted after receiving a routine set of patches through Windows Update.

Overlay Network A logical network laid on top of the Internet.
[Figure: nodes A, B, and C on the Internet, with logical link AB and logical link BC]

The Formal Model Let V be a set of nodes. The function id : V → Z+ assigns a unique id to each node in V, and the function rs : V → {0, 1}* assigns a random bit string to each node in V. A family of overlay networks is a function ON : F → G, where F is the set of all triples λ = (V, id, rs) and G is the set of all directed graphs. The family associates a unique directed graph ON(λ) ∈ G with each labeled set λ = (V, id, rs) of nodes.

Structured vs. Unstructured Overlay Networks
Unstructured: No restriction on network topology. Examples: Gnutella, Kazaa, BitTorrent, Skype, etc.
Structured: Network topology satisfies specific invariants. Examples: Chord, CAN, Pastry, Skip Graph, etc.

The Challenge Can an overlay network restore its correct functionality from an arbitrary initial configuration? Bad configurations can be caused by failures, perturbations, selfish actions, malicious attacks.

Autonomic Systems Self-management is the holy grail of all complex dynamic systems.

Self-stabilizing systems (Convergence) Recover from any arbitrary initial configuration to a legal configuration in a bounded number of steps, and (Closure) remain in the legal configuration thereafter, until another failure or perturbation occurs.

Self-stabilizing Overlay Networks Can an overlay network restore its topology from an arbitrary initial configuration? Does it make sense in unstructured networks? Does it make sense in structured networks?

Related work
Self-stabilizing and Byzantine-tolerant overlay network. OPODIS 2007 [Dolev, Hoch, van Renesse]
A distributed polylog time algorithm for self-stabilizing SKIP graph. PODC 2009 [Jacob, Richa, Scheideler et al.]
Linearization: Locally self-stabilizing sorting in graphs. ALENEX 2007 [Onus, Richa, Scheideler]

Example: Linearization The ideal topology is a sorted list. The goal is to spontaneously recover to the ideal topology from an arbitrary connected topology (Onus, Richa, Scheideler, ALENEX 2007)

Self-stabilizing algorithm: Linearization Left and right neighbors:
– ‘w’ is a left neighbor of node ‘u’ if {u, w} ∈ E and w < u.
– ‘w’ is a right neighbor of node ‘u’ if {u, w} ∈ E and u < w.
[Figure: node u = 10 with left neighbors w1 = 2, w2 = 3, w3 = 6, w4 = 8 and right neighbors v1 = 19, v2 = 28, v3 = 30, v4 = 35]

Self-stabilizing algorithm: Linearization (The Algorithm) In each round, every node does the following:
– Convert its left neighbors into a sorted list.
– Convert its right neighbors into a sorted list.
Takes at most (n-2) rounds. Slide borrowed from Onus et al.
[Figure: node u = 10 with left neighbors w1 = 2, w2 = 3, w3 = 6, w4 = 8 and right neighbors v1 = 19, v2 = 28, v3 = 30, v4 = 35]
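The linearization step can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the algorithm of Onus et al. verbatim: rounds are synchronous, edges are undirected, every node chains all of its current neighbors, and the helper names `linearize_round` and `linearize` are made up for illustration. It does not attempt to reproduce the (n-2) round bound quoted on the slide.

```python
def linearize_round(nodes, edges):
    """One synchronous round: every node turns its sorted left neighbors,
    itself, and its sorted right neighbors into one chain of edges."""
    nbrs = {u: set() for u in nodes}
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    new_edges = set()
    for u in nodes:
        left = sorted(w for w in nbrs[u] if w < u)
        right = sorted(v for v in nbrs[u] if v > u)
        chain = left + [u] + right
        # Consecutive chain members become neighbors, stored as (smaller, larger).
        new_edges.update(zip(chain, chain[1:]))
    return new_edges


def linearize(nodes, edges, max_rounds=None):
    """Repeat rounds until the edge set stops changing (assumed round limit)."""
    edges = {tuple(sorted(e)) for e in edges}
    limit = max_rounds if max_rounds is not None else len(nodes) ** 2
    for _ in range(limit + 1):
        nxt = linearize_round(nodes, edges)
        if nxt == edges:
            return edges
        edges = nxt
    raise RuntimeError("did not converge within the round limit")
```

Starting from the star around u = 10 used in the figure, `linearize` ends with exactly the edges between consecutive ids, that is, the sorted list.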

Evolution of Skip Graph (Aspnes, Shah, SODA 2003) Search time is O(n) hops

SKIP Graph Node degree = O(log n), diameter = O(log n). Number of levels = O(log n). Search time is now O(log n) hops.
[Figure: skip graph lists at Level 0, Level 1, ...]
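To make the level structure concrete, here is a small sketch of how the lists of a skip graph follow from the random bit strings rs of the formal model: at level i, nodes whose bit strings share the same length-i prefix form one sorted list. The function names and the dictionary layout are illustrative assumptions, not taken from the talk.

```python
from collections import defaultdict


def skip_graph_lists(rs, max_level):
    """rs maps node id -> bit string. Returns {level: {prefix: sorted ids}}."""
    levels = {}
    for i in range(max_level + 1):
        groups = defaultdict(list)
        for node_id, bits in rs.items():
            groups[bits[:i]].append(node_id)   # same i-bit prefix, same list
        levels[i] = {prefix: sorted(ids) for prefix, ids in groups.items()}
    return levels


def skip_graph_edges(levels):
    """Link consecutive members of every list at every level."""
    edges = set()
    for groups in levels.values():
        for members in groups.values():
            edges.update(zip(members, members[1:]))
    return edges
```

With four nodes and bit strings 00, 10, 01, 11, level 0 is the full sorted list, level 1 splits it into two lists, and level 2 consists of singletons.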

SKIP Graph: the question Can we have a self-stabilizing skip graph that can spontaneously restore its topology starting from any “connected” initial configuration?

Why local checking is important Unless bad configurations are detected via local checking, periodic global snapshots are needed, which is disruptive for the system.

SKIP Graph is NOT locally checkable Self-stabilization requires local detection of errors, but certain failures are not locally checkable

SKIP+ graph Jacob, Richa, Scheideler et al. (PODC 2009) proposed a locally checkable version of SKIP Graph by adding a few extra edges to an existing Skip Graph. They called it a SKIP+ Graph. They presented an algorithm to stabilize such a topology in O(log² n) rounds with high probability. The algorithm is quite cumbersome. We try to devise a simpler and better solution.

Detectors: our first step
[Figure: a network with a node labeled “detector”]

Detector diameter The detector diameter of G is the maximum hop distance in G between any node and the closest detector.
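The definition can be computed with a multi-source breadth-first search started from all detectors at once. The sketch below assumes an undirected adjacency map and a hypothetical function name `detector_diameter`; the model in the talk uses directed graphs, so treat the undirected view as a simplification.

```python
from collections import deque


def detector_diameter(adj, detectors):
    """Maximum hop distance from any node to its closest detector.

    adj maps each node to an iterable of neighbors (assumed undirected).
    """
    dist = {d: 0 for d in detectors}
    queue = deque(detectors)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    if len(dist) < len(adj):
        raise ValueError("some node cannot reach any detector")
    return max(dist.values())
```

On a path of five nodes, a single detector at one end gives a detector diameter of 4, while a detector in the middle gives 2.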

Transitive Closure Framework Due to the local checkability property, in any faulty configuration there is at least one detector.

Transitive Closure Framework Theorem For a SKIP+ graph, the detector diameter D = O(log n)

Transitive Closure Framework

The neighbors of each detector become detectors in the next round. In O(log n) rounds, every node becomes a detector, and these detectors initiate the transitive closure process. After an additional O(log n) rounds, all nodes become connected with one another, and the topology becomes completely connected.
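The logarithmic bound on the second phase comes from distances roughly halving in every round. The following sketch simulates only that phase, assuming every node is already a detector, the graph is undirected, and each node connects to all neighbors of its neighbors in one synchronous round. The function names are illustrative.

```python
def transitive_closure_round(adj):
    """Every node adds an edge to each neighbor of its current neighbors."""
    new_adj = {u: set(nbrs) for u, nbrs in adj.items()}
    for u, nbrs in adj.items():
        for v in nbrs:
            new_adj[u] |= adj[v]
        new_adj[u].discard(u)             # no self-loops
    return new_adj


def rounds_to_clique(adj):
    """Count rounds until the topology is completely connected."""
    adj = {u: set(nbrs) for u, nbrs in adj.items()}
    n = len(adj)
    rounds = 0
    while any(len(nbrs) < n - 1 for nbrs in adj.values()):
        nxt = transitive_closure_round(adj)
        if nxt == adj:
            raise ValueError("graph is not connected")
        adj = nxt
        rounds += 1
    return rounds
```

A path of nine nodes has diameter 8, and the distances shrink 8, 4, 2, 1, so the clique is reached after three rounds.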

Transitive Closure Framework After all nodes become detectors and the topology eventually becomes completely connected, the nodes rebuild the correct topology using a REPAIR subroutine. REPAIR takes only one round.

The Repair Process Lemma If the network is completely connected and all nodes are detectors in round i, a legal overlay network will be built in round (i + 1), and no node will be a detector. Compare with Jacob et al.’s results.

Local checkability Let L define a correct configuration of an overlay network. Then the network is locally checkable when L = p_0 ∧ p_1 ∧ p_2 ∧ … ∧ p_{n-1}, where p_i is a local predicate involving process i and its immediate neighbors only. Most real-life networks are NOT locally checkable.

Example: a clique Theorem. A completely connected topology is locally checkable.
[Figure: a clique on nodes a, b, c]

Chord is not locally checkable
[Figure: a Chord ring and a loopy Chord ring]
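A small experiment illustrates why a loopy ring escapes local detection. The sketch assumes a deliberately simplified local predicate (each node only checks that its successor and predecessor pointers are mutually consistent) and ignores Chord fingers; the function names are hypothetical. The legal ring and the loopy ring both pass every local check, yet only a global count of wrap-arounds tells them apart.

```python
def ring_pointers(order):
    """Build successor and predecessor pointers from a cyclic visiting order."""
    n = len(order)
    succ = {order[i]: order[(i + 1) % n] for i in range(n)}
    pred = {order[(i + 1) % n]: order[i] for i in range(n)}
    return succ, pred


def local_checks_pass(succ, pred):
    """Each node checks only its own successor and predecessor pointers."""
    return all(pred[succ[u]] == u and succ[pred[u]] == u for u in succ)


def wraparounds(succ):
    """Global property: how often the ring passes the top of the id space."""
    return sum(1 for u, v in succ.items() if v < u)
```

For ids 0 to 7, the order 0, 1, ..., 7 wraps once, while the loopy order 0, 2, 4, 6, 1, 3, 5, 7 wraps twice even though `local_checks_pass` returns True for both.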

CAN is not locally checkable Content Addressable Network (CAN) on a 2D torus Replace the black edges by the red edges, and each column becomes a loopy chord ring

LCON: a locally checkable overlay network in a circular key space
[Figure: circular key space with N = 64; node 7 is marked]

LCON: a locally checkable overlay network in a circular key space
S-links for node u: one edge to each node in the range (u to u+s mod N).
D-links for node u: Succ(u+s mod N), Succ(u+2s mod N), …, Succ(u+(d-1)s mod N).
N_max = s × d. Let s = 16, d = 4.
[Figure: circular key space; node 7 is marked]
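The link rules can be turned into a short sketch. Two points are assumptions on my part: the S-link range is read as the open interval strictly between u and u+s, which matches the (d+s-2) neighbor count stated on the next slide, and Succ(k) is read as the first present node at or after key k on the ring. The function names are illustrative.

```python
from bisect import bisect_left


def succ(ids_sorted, key, N):
    """First present id at or after key, wrapping around the key space."""
    i = bisect_left(ids_sorted, key % N)
    return ids_sorted[i % len(ids_sorted)]


def lcon_neighbors(u, ids_sorted, N, s, d):
    """Return (S-links, D-links) of node u under the assumptions above."""
    present = set(ids_sorted)
    # S-links: every present node strictly between u and u + s (mod N).
    s_links = {(u + k) % N for k in range(1, s)} & present
    # D-links: successors of u + s, u + 2s, ..., u + (d-1)s (mod N).
    d_links = {succ(ids_sorted, u + j * s, N) for j in range(1, d)}
    d_links.discard(u)
    return s_links, d_links
```

With N = 64, s = 16, d = 4 and every id present, node 7 gets S-links 8 through 22 and D-links 23, 39, and 55, which is 16 + 4 - 2 = 18 neighbors in total.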

Observations Observation Each node in LCON has (d+s-2) neighbors. When d = s, the size of the neighborhood is O(sqrt N). Theorem The detector diameter of LCON is at most two.

Some properties of LCON Theorem. LCON is locally checkable. Main idea. Case 1. If the diameter is two, then every node can “see” every other node, and check if the topology is correct. Case 2. We show that if the diameter is greater than two, then there is at least one detector.

Self-stabilization of LCON The Transitive Closure Framework (TCF) will stabilize LCON in O(log N) time. But it may be a sledgehammer. What is the space complexity of stabilization using TCF?

Self-stabilization of LCON We have an algorithm customized for LCON that stabilizes LCON in polylog time, while the space complexity does not skyrocket to O(n)

Generalization of LCON Main idea: Consider a CAN-like topology on a d-dimensional torus. Convert the “ring” in each dimension into an LCON ring. It is only partially shown in the figure on a 2-dimensional torus. Each node has O(d·N^(1/(2d))) neighbors.
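One way to arrive at the neighbor bound, offered as a sketch under assumptions the slide does not spell out: the torus has equal side length in every dimension, and each ring uses balanced LCON parameters. To avoid a clash with the dimension d, the LCON parameters of a single ring are written s' and d' here.

```latex
\begin{align*}
  \text{nodes per ring:}\quad & n = N^{1/d} \\
  \text{balanced ring parameters:}\quad & s' = d' = \sqrt{n} = N^{1/(2d)} \\
  \text{neighbors per ring:}\quad & s' + d' - 2 = O\!\left(N^{1/(2d)}\right) \\
  \text{neighbors over all } d \text{ dimensions:}\quad & d \cdot O\!\left(N^{1/(2d)}\right) = O\!\left(d \, N^{1/(2d)}\right)
\end{align*}
```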

Conclusion
• A new problem of growing interest. We need efficient algorithms for stabilizing a variety of overlay topologies.
• The initial topology must be connected. Stabilization from a partitioned topology is impossible. Also, for a given (V, id, rs) the legal topology should be unique. Otherwise there will be an additional step for distributed consensus.
• Working on extending this to more fragile networks.

Questions?