Local Tolerance to Unbounded Byzantine Faults
Anish Arora, Ohio State University
Mikhail Nesterenko, Kent State University



Tolerating Faults in Systems of Large Scale
Large system size presents unique challenges to ensuring dependability:
- faults occur often
- multiple regions can be affected by faults
- faults may interact unpredictably
- faults can be spatially or temporally unbounded and complex
How to tolerate such faults?
- localize tolerance to unbounded complex faults
[Figure: faulty processes and the region affected by them]

Execution Model & Example Problems
execution model
- asynchronous
- interleaving
- communication via shared registers
examples
- graph coloring: color (assign numbers to) the vertices of a graph so that the colors of adjacent ones do not match
  - if the graph has degree d, it can always be colored in d+1 colors
- routing: assign a parent to each process such that there is a path from each process to the sink (destination)
[Figure: routing tree rooted at the sink]
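The d+1 bound for graph coloring can be illustrated with a sequential greedy pass. This is only a minimal sketch, not the distributed program from the talk: the function name `greedy_coloring` and the adjacency-list representation are assumptions made for illustration. Each vertex has at most d neighbors, so at least one of the d+1 colors is always free.

```python
def greedy_coloring(graph):
    """Color vertices one at a time with the smallest color unused by colored neighbors.

    graph: dict mapping each vertex to a list of its neighbors (hypothetical format).
    A vertex of degree k sees at most k used colors, so a color in 0..k is free.
    """
    color = {}
    for v in graph:
        used = {color[u] for u in graph[v] if u in color}
        color[v] = next(c for c in range(len(graph[v]) + 1) if c not in used)
    return color


# A 5-cycle has degree 2, so three colors (0, 1, 2) suffice.
cycle = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(cycle)
```

On the 5-cycle the last vertex sees both colors 0 and 1 on its neighbors and takes the third color, which is exactly the case the d+1 bound covers.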

Outline
- fault containment & tolerance
  - strict fault containment
  - strict fault tolerance
    - strict stabilization
- examples of strictly fault tolerant programs
  - graph coloring
  - dining philosophers
  - routing
- limits of strict fault containment
- critique and further directions

Spatial Fault Hierarchy
- bounded faults: processes outside a certain locality of a fault perform correctly (according to specification)
- unbounded faults: a process performs correctly in spite of faults outside its locality
- unbounded Byzantine faults: each process behaves correctly regardless of actions outside its locality
If a program is tolerant to unbounded Byzantine faults, it is also tolerant to bounded and unbounded faults of any fault class.

Containment of Unbounded Faults
Proposition 4. P is strictly fault containing if there exists a constant l such that for each process p there exists an invariant I.p which is closed with respect to Byzantine actions of processes whose distance to p is greater than l.
- what is the form of this invariant?
- can it include variables outside the locality?
- can you always come up with an invariant of this form?
What does it mean for an individual process to perform correctly?

Tolerance Inside Locality
What if faults occur inside the containment locality?
- can achieve additional tolerance
- two process specifications
  - ideal (no faults)
  - tolerant (faults of some class present); example: safety is never violated
- which spec do processes outside the fault locality satisfy?

Strict Stabilization
- stabilization: special case of tolerant spec; eventual satisfaction of the ideal spec when (transient) faults stop occurring
- strict stabilization: process p eventually satisfies the ideal spec regardless of the behavior of processes outside its locality
- what is the difference between traditional stabilization and strict stabilization?
- is strict containment required for strict stabilization?
More formally: [formula not preserved in the transcript]

Vertex Coloring Program (PVC)
Lemma 2. When a node has a neighbor with a matching color, it can select a new color without affecting any of its neighbors.
Invariant: [formula not preserved in the transcript]
Theorem 1. PVC is strictly fault-containing and strictly stabilizing (with locality of 1).
[Figure: a Byzantine node and the nodes that may recolor following its actions]
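The transcript does not include the PVC program text, so the following is only a sketch of a recoloring rule in the spirit of Lemma 2, under stated assumptions: a correct node that clashes with a neighbor takes the smallest color not used by any neighbor, and one process moves at a time (interleaving). The names `recolor` and `simulate`, the random scheduler, and the path topology are hypothetical. The simulation lets one Byzantine node change its color arbitrarily and records which correct nodes ever recolor.

```python
import random


def recolor(graph, color, p):
    """Assumed local rule: on a clash, p takes the smallest color no neighbor uses."""
    neighbor_colors = {color[q] for q in graph[p]}
    if color[p] in neighbor_colors:
        # p has len(graph[p]) neighbors, so one of len(graph[p]) + 1 colors is free.
        color[p] = min(c for c in range(len(graph[p]) + 1) if c not in neighbor_colors)
        return True
    return False


def simulate(graph, color, byzantine, steps, seed=0):
    """Interleaved execution; the Byzantine node writes arbitrary colors."""
    rng = random.Random(seed)
    color = dict(color)
    changed = set()  # correct nodes that recolored at least once
    max_color = max(len(nbrs) for nbrs in graph.values())
    for _ in range(steps):
        p = rng.choice(sorted(graph))
        if p == byzantine:
            color[p] = rng.randint(0, max_color)
        elif recolor(graph, color, p):
            changed.add(p)
    return color, changed


# Path 0-1-2-3-4, properly colored, with node 0 Byzantine.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
start = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0}
final, changed = simulate(path, start, byzantine=0, steps=500)
```

Under this rule a recoloring node never picks a color held by a neighbor, so it cannot create a new clash. That is why only node 1, the direct neighbor of the Byzantine node, can ever appear in `changed`, consistent with a locality of 1.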

Dining Philosophers Problem (DP) [D72]
- graph of processes, each may request to eat
- properties
  - no two neighbors eat together
  - each requesting process eats eventually
[Figure: cycle of a requesting process: thinking (T), hungry (H), eating (E)]

DP: Fault-Free Operation [CM84]
actions:
- if thinking, needs to eat & all parents thinking -> become hungry
- if hungry & no neighbors eating -> eat
- when finished -> think & become child of each neighbor
[Figure: processes a, b, c with states T, H, E; b eats and gives up its privilege, then a and c eat]
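The three actions above can be sketched as a small state machine. This is an illustrative reading of the slide, not the authors' program: the class name `Diners`, the `parents` map (each process's higher-priority neighbors), and the assumption that every process always wants to eat are hypothetical. Actions are fired one at a time, matching the interleaving execution model.

```python
T, H, E = "T", "H", "E"  # thinking, hungry, eating


class Diners:
    def __init__(self, neighbors, parents):
        self.neighbors = neighbors  # process -> set of neighbors
        # process -> set of neighbors that currently have priority over it
        self.parents = {p: set(s) for p, s in parents.items()}
        self.state = {p: T for p in neighbors}

    def try_hungry(self, p):
        """if thinking, needs to eat & all parents thinking -> become hungry"""
        if self.state[p] == T and all(self.state[q] == T for q in self.parents[p]):
            self.state[p] = H
            return True
        return False

    def try_eat(self, p):
        """if hungry & no neighbors eating -> eat"""
        if self.state[p] == H and all(self.state[q] != E for q in self.neighbors[p]):
            self.state[p] = E
            return True
        return False

    def finish(self, p):
        """when finished -> think & become child of each neighbor"""
        if self.state[p] != E:
            return False
        self.state[p] = T
        for q in self.neighbors[p]:
            self.parents[p].add(q)      # q now has priority over p
            self.parents[q].discard(p)  # p no longer has priority over q
        return True


# Line a - b - c where b initially has priority over both neighbors.
d = Diners({"a": {"b"}, "b": {"a", "c"}, "c": {"b"}},
           {"a": {"b"}, "b": set(), "c": {"b"}})
d.try_hungry("b"); d.try_eat("b"); d.finish("b")   # b eats and gives up privilege
d.try_hungry("a"); d.try_hungry("c")
d.try_eat("a"); d.try_eat("c")                      # a and c eat
```

After `b` finishes it becomes a child of both neighbors, so `a` and `c` can become hungry and eat concurrently; they are not neighbors of each other, so safety holds.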

Dining Philosophers Program (PDP)
- a hungry faulty process may block immediate thinking neighbors
- an eating faulty process may block hungry neighbors and their thinking neighbors
[Figure: example state assignments (H, E, T) showing processes blocked by a faulty process]

Dining Philosophers Program (PDP)
Lemma 4. A non-Byzantine eating process eventually thinks.
Lemma 5. A hungry process whose immediate neighborhood is not Byzantine eventually eats.
Lemma 6. If a Byzantine process is at least 2 hops away, a thinking process eventually becomes hungry.
Invariant: [formula not preserved in the transcript]
Theorem 2. PDP is strictly fault-containing and strictly stabilizing (with locality of 2).

Limits of Containment
Theorem 3. The containment radius of a solution to an r-restrictive problem is at least r.
- graph coloring and dining philosophers are 1-restrictive
- routing is restrictive for arbitrary r
[Figure: σ is in p's spec; s1 and s2 differ in values of a process at least r away from p]

Critique and Further Research
- interesting and useful examples of strict containment
  - geometric spanners, spanners of fixed degree
  - low-atomicity dining philosophers
  - ??
- better bounds on containment
  - r-restriction is obvious but too crude a bound for containment
  - some non-containing problems appear "almost" the same as containing ones
  - example: maximal independent set is 1-containing; maximal independent set with distance of at most 2 is not containing for any l

DP: Handling Crashes [CS96]
- dynamic threshold: if a parent does not think then start to think
- unblocks processes 2 hops away from the faulty process
[Figure: crashed process a blocks neighbor b; b thinks and unblocks c; c eats. States of a, b, c in sequence: E H T, E T T, E T H, E T E]
[Figure: cycle of a requesting process with dynamic threshold: T, H, E, with a transition from H back to T if a parent is not T]
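The effect of the dynamic threshold can be sketched by adding one action to the fault-free rules: a hungry process whose parent is not thinking goes back to thinking. This is a hedged illustration of the slide's scenario only. The function `step`, the action ordering inside it, the `crashed` set, and the assumption that every process always wants to eat are all hypothetical, and the release action after eating is omitted for brevity.

```python
T, H, E = "T", "H", "E"  # thinking, hungry, eating


def step(state, neighbors, parents, p, crashed=frozenset()):
    """Fire the first enabled action of p and return its name, or None.

    Crashed processes never take a step and keep their last state.
    """
    if p in crashed:
        return None
    if state[p] == H and any(state[q] != T for q in parents[p]):
        state[p] = T  # dynamic threshold: a parent is not thinking, so yield
        return "yield"
    if state[p] == H and all(state[q] != E for q in neighbors[p]):
        state[p] = E
        return "eat"
    if state[p] == T and all(state[q] == T for q in parents[p]):
        state[p] = H  # assumes p always wants to eat
        return "hungry"
    return None


# Line a - b - c; a is b's parent, b is c's parent; a crashed while eating.
neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
parents = {"a": set(), "b": {"a"}, "c": {"b"}}
state = {"a": E, "b": H, "c": T}
crashed = frozenset({"a"})

trace = [step(state, neighbors, parents, p, crashed) for p in ["c", "b", "c", "c", "b"]]
```

In this run `c` is initially blocked by its hungry parent `b`, which can never eat next to the crashed eater `a`. Once `b` yields, `c` becomes hungry and eats, while `b` itself stays thinking, so only the immediate neighbor of the crashed process remains blocked.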