Local Tolerance to Unbounded Byzantine Faults
Anish Arora, Ohio State University
Mikhail Nesterenko, Kent State University


Faults in a System of Large Scale
large system size presents unique challenges and opportunities for ensuring dependability
problem
- faults:
  - occur often
  - affect multiple components
  - interact unpredictably
- asynchronous execution model
- faults are spatially/temporally unbounded, complex & undetectable
opportunity
- a fault directly affects a region rather than the whole system
- if faults are contained, the rest of the system continues to function
[Figure: system regions labeled faulty, affected, and unaffected]

Difficulties Containing Unbounded Faults
lack of spatial bound
- arbitrary number of processes can be faulty
- cannot rely on limited scope of fault or number of faulty processes
lack of temporal bound
- faulty process behaves incorrectly arbitrarily long
- cannot wait until fault stops
consequently
- contain correctness and tolerance instead of faults
- use execution models that simplify such containment

Outline
containing correctness and tolerance: strict fault containment and strict stabilization
execution models and example programs
- reactive program: dining philosophers
- transformational execution models and programs
  - output dependent: k-independent set selection
  - output independent: lightweight spanner construction

Containing Correctness
address specification first: what does it mean for a system to be correct when an arbitrary portion of it is faulty?
problem: ordinarily, correctness of P depends on every process in the system conforming to spec or F
spec defines correct sequences for each process P
- a sequence involves states of P and possibly of other processes
a program is locally containing of faults of class F if there exists a constant l (containment radius) such that every P conforms to its spec whenever faulty processes are at least l hops away from P
[Figure: fault of class F, containment radius l, containment locality]
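The containment-radius definition above can be illustrated with a small sketch. It assumes the network is given as an adjacency map and that the set of faulty processes is known to the observer (in the setting of the talk faults are undetectable, so this is only a way to read the definition, not something a process could run). The names `hops_from_faulty` and `must_conform` are hypothetical.

```python
from collections import deque

def hops_from_faulty(adj, faulty):
    """Multi-source BFS: hop distance from each reachable process to the nearest faulty one."""
    dist = {p: 0 for p in faulty}
    queue = deque(faulty)
    while queue:
        p = queue.popleft()
        for q in adj[p]:
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return dist

def must_conform(adj, faulty, l):
    """Non-faulty processes whose nearest faulty process is at least l hops away.

    Under local containment with radius l, these are the processes that
    are required to conform to their spec.
    """
    dist = hops_from_faulty(adj, faulty)
    return {p for p in adj
            if p not in faulty and dist.get(p, float("inf")) >= l}

# Hypothetical example: a line of 7 processes 0 - 1 - ... - 6, process 3 faulty.
line = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
print(sorted(must_conform(line, {3}, 2)))
```

With radius 2, processes 2 and 4 sit inside the containment locality and are allowed to misbehave; the remaining non-faulty processes must satisfy their specs.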

Strict Fault Containment
a strict fault containing (SFC) program is locally containing of unbounded Byzantine faults
- a process satisfies its spec regardless of the actions of processes outside its locality
- an SFC program is containing of bounded and unbounded faults of any class
- for each P the spec can only mention processes inside the locality
- a problem lacking such specs (e.g. routing) does not have SFC solutions
[Figure: Byzantine fault]

Strict Stabilization
additional tolerance properties for faults within the locality of a strictly fault-containing program
- strict stabilization: stabilization from transient faults; regardless of actions outside the locality, each P eventually satisfies its spec

Outline
containing correctness and tolerance: strict fault containment and strict stabilization
execution models and example programs
- reactive program: dining philosophers
- transformational execution models and programs
  - output dependent: k-independent set selection
  - output independent: lightweight spanner construction

Dining Philosophers
problem definition
- network of processes, each may request to eat
- properties
  - mutual exclusion: no two neighbors eat together
  - liveness: each requesting process eats eventually
execution model
- interleaving
- communication via shared registers
- high atomicity
[Figure: cycle for a requesting process: thinking (T), hungry (H), eating (E)]

Solution to Dining Philosophers
priority-based actions
- if T & higher-priority neighbors are thinking → become hungry
- if H & no neighbors are eating → eat (ensures MX)
- if E & done → think & give priority to neighbors (ensures liveness)
results
- waiting chain ≤ 3
- optimal containment radius of 2
[Figure: chain of processes in states E, T, H, any, in order of decreasing priority]
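The three priority-based actions can be tried out in a small fault-free simulation. This is a sketch under several assumptions that are not in the slides: a ring topology, every process always requesting to eat, eating lasting exactly one step, initial priorities ordered by process id, and a round-robin scheduler standing in for the interleaving model. It does not model Byzantine behavior or the stabilizing parts of the authors' program; the class and function names are hypothetical.

```python
def make_ring(n):
    """Ring of n >= 3 processes, each adjacent to its two ring neighbors."""
    return {p: [(p - 1) % n, (p + 1) % n] for p in range(n)}

class Diners:
    def __init__(self, adj):
        self.adj = adj
        self.state = {p: "T" for p in adj}  # T thinking, H hungry, E eating
        # higher[{p, q}] names the endpoint that currently has priority.
        # Assumption: initial priorities follow ids, which is an acyclic order.
        self.higher = {frozenset((p, q)): min(p, q)
                       for p in adj for q in adj[p]}
        self.meals = {p: 0 for p in adj}

    def step(self, p):
        """Execute the enabled action of process p atomically, if any."""
        nbrs = self.adj[p]
        s = self.state[p]
        if s == "T":
            # Become hungry only if all higher-priority neighbors are thinking.
            if all(self.state[q] == "T" for q in nbrs
                   if self.higher[frozenset((p, q))] == q):
                self.state[p] = "H"
        elif s == "H":
            # Eat only if no neighbor is eating (mutual exclusion).
            if all(self.state[q] != "E" for q in nbrs):
                self.state[p] = "E"
                self.meals[p] += 1
        else:
            # Done eating: think and yield priority to every neighbor.
            self.state[p] = "T"
            for q in nbrs:
                self.higher[frozenset((p, q))] = q

    def exclusion_holds(self):
        return not any(self.state[p] == "E" and self.state[q] == "E"
                       for p in self.adj for q in self.adj[p])

def run(n=5, rounds=50):
    """Round-robin schedule; returns (mutual exclusion never violated, meal counts)."""
    d = Diners(make_ring(n))
    ok = True
    for _ in range(rounds):
        for p in range(n):
            d.step(p)
            ok = ok and d.exclusion_holds()
    return ok, d.meals

ok, meals = run()
print(ok, meals)
```

Yielding priority on every incident edge after eating keeps the priority order acyclic, which is what lets a waiting neighbor eventually get its turn.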

Fault Containment and Information Propagation
fault containment leverages the limit on information propagation
idea: abstract from the process of information propagation and highlight the result
[Figure: chain of processes a, b, c, d]
- process: a sends its info to b, b sends a's info to c, c sends a's info to d
- result: d reads from a

Execution Models
transformation program: given input, computes output (e.g. leader election)
models for transformation programs: each process reads from processes within range (finite distance)
- output dependent: each process reads all information within range, input and (atomically) output
- output independent: each process reads only input within range
  - every program in this model is strictly fault containing
[Figure: in the output-dependent model P reads input & output within range; in the output-independent model P reads input only]

k-Independent Set Selection (cf. [HHJS01])
problem: select a maximal subset of processes S such that, for each process in S, every other process of S is at least k hops away
solution actions
- if no member of S is less than k hops away → join S
- if there exists a member of S less than k hops away → leave S
observe
- only a faulty node P can make another process Q leave S
- if Q leaves S, it can make another process R join S
- hence the containment radius is 2k
[Figure: P joins S, Q (within k of P) leaves S, R (within k of Q) joins S; example of a 1-independent set]
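The join and leave actions can be exercised in a fault-free setting with the sketch below. It assumes a central round-robin scheduler (one process acts at a time, reading the current membership within range), a path topology, and an arbitrary initial membership; `within` and `stabilize` are hypothetical names, and nothing here models a Byzantine process.

```python
from collections import deque

def within(adj, src, k):
    """Processes other than src at hop distance less than k from src."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        p = queue.popleft()
        if dist[p] + 1 >= k:
            continue  # neighbors of p would be k or more hops from src
        for q in adj[p]:
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    del dist[src]
    return set(dist)

def stabilize(adj, k, members):
    """Apply the join/leave actions round-robin until no action is enabled."""
    S = set(members)
    near = {p: within(adj, p, k) for p in adj}
    changed = True
    while changed:
        changed = False
        for p in sorted(adj):
            conflict = bool(near[p] & S)
            if p in S and conflict:
                S.remove(p)   # a member of S is less than k hops away: leave
                changed = True
            elif p not in S and not conflict:
                S.add(p)      # no member of S less than k hops away: join
                changed = True
    return S, near

# Hypothetical example: path 0 - 1 - ... - 6, k = 3, everyone initially in S.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 7] for i in range(7)}
S, near = stabilize(path, 3, range(7))
print(sorted(S))
```

Under this scheduler a join never creates a conflict and a leave never adds one, so the loop ends in a set that is both k-independent and maximal.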

Outline
containing correctness and tolerance: strict fault containment and strict stabilization
execution models and example programs
- reactive program: dining philosophers
- transformational execution models and programs
  - output dependent: k-independent set selection
  - output independent: lightweight spanner construction
    - practical problem: fast routing tree construction in sensor networks
    - spanner construction with double range
    - spanner optimization with larger ranges

Experimental Platform: Wireless Sensors
- 4 MHz Atmel processor
- 8 KB of program memory
- 512 B of data memory
- 916 MHz single-channel, low-power radio
- 10 Kbps of raw bandwidth
- uniform antenna length & orientation
- TinyOS as the runtime system
- fresh AA batteries

Experiment: Fast Routing Tree Construction by Flooding [G+02]
- 156 nodes are arranged in a 13x12 grid on an open parking lot, with grid spacing of 2 feet
- the base station is placed in the middle of the base of the grid and starts the flooding
- each receiving node rebroadcasts the flood message immediately upon receipt and then squelches further broadcasts
  - the sender is selected as parent, so a routing tree to the base station is formed
expectation: a routing tree with relatively regular structure
- number of children, link length, path size, etc.
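For comparison with the measured trees, the expected "regular" tree can be sketched as a breadth-first flood over an idealized grid. The assumptions here are not from the experiment: a radio range of exactly one grid step (4-connected neighbors), lossless and collision-free delivery, 13 columns by 12 rows, and the base station at column 6 of row 0. The function name `flood_tree` is hypothetical.

```python
from collections import deque

def flood_tree(width, height, base):
    """Idealized flood: each node adopts the first sender it hears as parent."""
    parent = {base: None}
    hops = {base: 0}
    queue = deque([base])
    while queue:
        x, y = queue.popleft()
        # Rebroadcast once; grid neighbors that have no parent yet adopt (x, y).
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in parent:
                parent[(nx, ny)] = (x, y)
                hops[(nx, ny)] = hops[(x, y)] + 1
                queue.append((nx, ny))
    return parent, hops

parent, hops = flood_tree(13, 12, (6, 0))
print(len(parent), max(hops.values()))
```

In this idealization every link has length one grid step and every node is exactly one hop farther from the base station than its parent, which is the regularity the real radio environment fails to deliver.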

[Figure: routing tree after 1 hop, 2 hops, 3 hops, and in its final form; annotated anomalies: backward link, long link, straggler, clustering]

Problems and Solution Approach
problem: a routing tree constructed fast over the "raw" topology is inadequate
- uneven clustering (some nodes have too many neighbors)
- long links (possibly unreliable)
- suboptimal paths (backward links)
idea: pre-process the topology to mitigate the problem
- weigh links (by length, error rate, node degree, etc.)
- locally construct a connected but lightweight spanner
  - link weight may be reflexive (depend on the spanner itself, e.g. node degree)

Lightweight Spanner Construction Using 2k-Range
spanner: connected subgraph that includes all nodes (e.g. a spanning tree)
k-local spanner: there is a path within distance ≤ k to each neighbor
problem: given a weighted graph (all weights unique) and 2k-range, build a lightweight k-local spanner
solution: each process P computes the minimum spanning tree (MST) of the region of each process Q at distance no more than k, and selects the union of the MST edges incident to P
[Figure: P's 2k-range covers the k-region of every Q within k hops, so P can compute the MST of each such Q's region]
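One possible reading of this construction is sketched below. It assumes that "Q's region" means the subgraph induced by all nodes within k hops of Q, that distances are hop counts, and that the MST is computed with Kruskal's algorithm; the slides fix none of these details. The example graph and the names `ball`, `mst_edges`, and `local_spanner` are hypothetical.

```python
from collections import deque

def ball(adj, src, k):
    """Nodes within k hops of src, including src."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        p = queue.popleft()
        if dist[p] == k:
            continue
        for q in adj[p]:
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return set(dist)

def mst_edges(nodes, weight):
    """Kruskal on the subgraph induced by nodes; weights are assumed unique."""
    root = {v: v for v in nodes}

    def find(v):
        while root[v] != v:
            root[v] = root[root[v]]  # path halving
            v = root[v]
        return v

    chosen = set()
    for e in sorted((e for e in weight if e <= nodes), key=weight.get):
        u, v = e
        ru, rv = find(u), find(v)
        if ru != rv:
            root[ru] = rv
            chosen.add(e)
    return chosen

def local_spanner(adj, weight, k):
    """Union, over every P, of the MST edges incident to P in each nearby Q's region."""
    spanner = set()
    for p in adj:
        for q in ball(adj, p, k):
            tree = mst_edges(ball(adj, q, k), weight)
            spanner |= {e for e in tree if p in e}
    return spanner

# Hypothetical example: square 0-1-2-3 with a heavy diagonal 0-2.
weight = {frozenset(e): w for e, w in
          {(0, 1): 1, (1, 2): 2, (2, 3): 3, (3, 0): 4, (0, 2): 5}.items()}
adj = {v: sorted(u for e in weight if v in e for u in e if u != v) for v in range(4)}
print(sorted(tuple(sorted(e)) for e in local_spanner(adj, weight, 1)))
```

In this example k = 1 drops only the heavy diagonal, while k = 2 lets every process see the whole graph and the result collapses to the global MST. Because weights are unique, every global MST edge also belongs to the MST of any region containing it, so the union stays connected.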

Spanner Optimization Using Ranges > 2k
- each P computes the spanner's topology in the neighborhood with radius (range - k)
  - P knows the complete spanner in this region
- P iteratively repeats the procedure on the resultant spanner
[Figure: P can compute the MST for each process Q in the region of radius (range - k)]

Conclusion
complexity and scale of large systems force unorthodox approaches to faults
we explored the spatial dimension of fault tolerance to complex unbounded faults and exploited the lack of global information propagation
- stated necessary conditions and impossibility results
- gave first examples of programs
questions
- how to solve problems that do have global information propagation?
- is it possible to contain problems before they spread?