DISTRIBUTED SYSTEMS II FAULT-TOLERANT BROADCAST Prof Philippas Tsigas Distributed Computing and Systems Research Group.

Slides:



Advertisements
Similar presentations
Distributed systems Total Order Broadcast Prof R. Guerraoui Distributed Programming Laboratory.
Advertisements

Impossibility of Distributed Consensus with One Faulty Process
CS425/CSE424/ECE428 – Distributed Systems – Fall 2011 Material derived from slides by I. Gupta, M. Harandi, J. Hou, S. Mitra, K. Nahrstedt, N. Vaidya.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 6 Instructor: Haifeng YU.
1 Distributed systems Causal Broadcast Prof R. Guerraoui Distributed Programming Laboratory.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Reliable Multicast Steve Ko Computer Sciences and Engineering University at Buffalo.
CS542 Topics in Distributed Systems Diganta Goswami.
Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services Authored by: Seth Gilbert and Nancy Lynch Presented by:
Failure Detection The ping-ack failure detector in a synchronous system satisfies – A: completeness – B: accuracy – C: neither – D: both.
Byzantine Generals Problem: Solution using signed messages.
LEADER ELECTION CS Election Algorithms Many distributed algorithms need one process to act as coordinator – Doesn’t matter which process does the.
1 Principles of Reliable Distributed Systems Lecture 3: Synchronous Uniform Consensus Spring 2006 Dr. Idit Keidar.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 3 – Distributed Systems.
© nCode 2000 Title of Presentation goes here - go to Master Slide to edit - Slide 1 Reliable Communication for Highly Mobile Agents ECE 7995: Term Paper.
1 Fault-Tolerant Consensus. 2 Failures in Distributed Systems Link failure: A link fails and remains inactive; the network may get partitioned Crash:
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 5: Synchronous Uniform.
1 Principles of Reliable Distributed Systems Lecture 5: Failure Models, Fault-Tolerant Broadcasts and State-Machine Replication Spring 2005 Dr. Idit Keidar.
CPSC 668Set 15: Broadcast1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch.
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 4 – Consensus and reliable.
Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation 5: Reliable.
Systems of Distributed systems Module 2 - Distributed algorithms Teaching unit 2 – Properties of distributed algorithms Ernesto Damiani University of Bozen.
Distributed Systems Terminating Reliable Broadcast Prof R. Guerraoui Distributed Programming Laboratory.
Consensus and Related Problems Béat Hirsbrunner References G. Coulouris, J. Dollimore and T. Kindberg "Distributed Systems: Concepts and Design", Ed. 4,
1 A Modular Approach to Fault-Tolerant Broadcasts and Related Problems Author: Vassos Hadzilacos and Sam Toueg Distributed Systems: 526 U1580 Professor:
Distributed Algorithms – 2g1513 Lecture 9 – by Ali Ghodsi Fault-Tolerance in Distributed Systems.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 10 Instructor: Haifeng YU.
Fault-Tolerant Broadcast Terminology: broadcast(m) a process broadcasts a message to the others deliver(m) a process delivers a message to itself 1.
Lab 2 Group Communication Farnaz Moradi Based on slides by Andreas Larsson 2012.
Group Communication Group oriented activities are steadily increasing. There are many types of groups:  Open and Closed groups  Peer-to-peer and hierarchical.
Farnaz Moradi Based on slides by Andreas Larsson 2013.
Dealing with open groups The view of a process is its current knowledge of the membership. It is important that all processes have identical views. Inconsistent.
Hwajung Lee. A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types of groups:
CS 425/ECE 428/CSE424 Distributed Systems (Fall 2009) Lecture 9 Consensus I Section Klara Nahrstedt.
Totally Ordered Broadcast in the face of Network Partitions [Keidar and Dolev,2000] INF5360 Student Presentation 4/3-08 Miran Damjanovic
EEC 688/788 Secure and Dependable Computing Lecture 10 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory.
Chap 15. Agreement. Problem Processes need to agree on a single bit No link failures A process can fail by crashing (no malicious behavior) Messages take.
Building Dependable Distributed Systems, Copyright Wenbing Zhao
SysRép / 2.5A. SchiperEté The consensus problem.
Lecture 10: Coordination and Agreement (Chap 12) Haibin Zhu, PhD. Assistant Professor Department of Computer Science Nipissing University © 2002.
Agreement in Distributed Systems n definition of agreement problems n impossibility of consensus with a single crash n solvable problems u consensus with.
DISTRIBUTED SYSTEMS II FAULT-TOLERANT BROADCAST Prof Philippas Tsigas Distributed Computing and Systems Research Group.
Failure Detectors n motivation n failure detector properties n failure detector classes u detector reduction u equivalence between classes n consensus.
Fault-Tolerant Broadcast Terminology: broadcast(m) a process broadcasts a message to the others deliver(m) a process delivers a message to itself 1.
Failure detection The design of fault-tolerant systems will be easier if failures can be detected. Depends on the 1. System model, and 2. The type of failures.
Group Communication A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types.
EEC 688/788 Secure and Dependable Computing Lecture 10 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Reliable multicast Tolerates process crashes. The additional requirements are: Only correct processes will receive multicasts from all correct processes.
Distributed systems II Fault-Tolerant Broadcast
Exercises for Chapter 11: COORDINATION AND AGREEMENT
Distributed Systems Course Coordination and Agreement
CSE 486/586 Distributed Systems Reliable Multicast --- 1
Coordination and Agreement
Distributed systems Total Order Broadcast
Computer Science 425 Distributed Systems CS 425 / ECE 428 Fall 2013
Agreement Protocols CS60002: Distributed Systems
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Active replication for fault tolerance
EEC 688/788 Secure and Dependable Computing
Distributed systems Reliable Broadcast
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
Distributed systems Causal Broadcast
Distributed Systems Course Coordination and Agreement
CSE 486/586 Distributed Systems Reliable Multicast --- 2
CSE 486/586 Distributed Systems Reliable Multicast --- 1
Blockchains Lecture 6.
Presentation transcript:

DISTRIBUTED SYSTEMS II FAULT-TOLERANT BROADCAST Prof Philippas Tsigas Distributed Computing and Systems Research Group

Broadcast B A C m m deliver broadcast deliver

Fault-Tolerant Broadcast Terminology: broadcast(m) a process broadcasts a message to the others deliver(m) a process delivers a message to itself 3

Best-effort broadcast Reliable broadcast Uniform broadcast … P1 P2 P3 Broadcast abstractions

Modules of a process request(deliver) indication (deliver) indication request(deliver) indication (deliver) request(deliver) indication

Broadcast Models: Synchronous vs. asynchronous Types of process failures Types of communication failures Network topology Deterministic vs. randomized 6

Reliable Broadcast Three conditions sameAgreement: all correct processes eventually deliver same set of messages all messagesValidity: set of messages delivered by correct processes includes all messages broadcasted by correct processes at most once actually broadcastedIntegrity: each correct process P delivers a message from correct process Q at most once, and only if Q actually broadcasted it 7

Reliable Broadcast What about faulty processes? Definition: A property is uniform if faulty processes satisfy it as well. Uniform agreement: If a process (correct or faulty) delivers m, then all correct processes eventually deliver m. Uniform integrity: For every broadcasted message m, every process (correct or not) delivers m at most once, and only if some process has broadcasted m 8

Reliable Broadcast What about faulty processes? Definition: A property is uniform if faulty processes satisfy it as well. Uniform agreement: If a process (correct or faulty) delivers m, then all correct processes eventually deliver m. Uniform integrity: For every broadcasted message m, every process (correct or not) delivers m at most once, and only if some process has broadcasted m 9 Another way to express agreement

Reliable broadcast m1 crash p1 p2 p3 m2 delivery m2 crash delivery

Uniform reliable broadcast m1 crash p1 p2 p3 m2 delivery m2 crash delivery

Reliable Broadcast How can we implement Reliable Broadcast? Model Asynchronous Benign process and link failures only No network partitions 12

Reliable Broadcast Assume we have send(m) and receive(m) primitives Transmit and send messages across a link If P sends m to Q, and link correct, then Q eventually receives m For all m, Q receives m at most once from P, and only if P actually sent m 13

14 Reliability of one-to-one communication  validity : –any message in the outgoing message buffer is eventually delivered to the incoming message buffer;  integrity : –the message received is identical to one sent, and no messages are delivered twice.

15  validity : –any message in the outgoing message buffer is eventually delivered to the incoming message buffer;  integrity : –the message received is identical to one sent, and no messages are delivered twice. integrity  by use checksums, reject duplicates (e.g. due to retries).  If allowing for malicious users, use security techniques How do we achieve validity and integrity? validity - by use of acknowledgements and retries

Reliable Broadcast R-broadcast(m) uniquely tag m with sender and sequence number send(m) to all neighbours (including self) end R-broadcast R-deliver(m) upon receive(m) do if i have not already delivered m then if I am not the sender of m then send m to all neighbours endif deliver(m) endif end R-deliver 16

Reliable Broadcast In an asynchronous system o Where every two correct processes are connected via a path that never fails, o the previous algorithm implements reliable broadcast with uniform integrity: For every broadcasted message m, every process (correct or not) delivers m at most once, and only if some process broadcast m. 17

Algorithm idea (rb) m m p1 p2 p3 crash m m delivery

TODO  Prove Agreement, Validity and Integrity 19

Reliable Broadcast In an asynchronous system where every two correct processes are connected via a path that never fails, and receive omissionsonly receive omissions occur, then the algorithm satisfies uniform agreement: If a process (correct or faulty) delivers m, then all correct processes eventually deliver m. 20

TODO  Extend the previous Proof for Uniform Agreement 21

Ordering (1)  So far, we did not consider ordering among messages; In particular, we considered messages to be independent  Two messages from the same process might not be delivered in the order they were broadcast

FIFO ? p1 p2 p3 m2 delivery m1 delivery m2 delivery

FIFO Broadcast Same as reliable, plus All messages broadcast by same sender delivered in order sent 24

FIFO Broadcast msgBag=0 Next[Q]=1 for all processes Q F-broadcast(m) R-broadcast(m) F-deliver(m) upon R-deliver(m) do Q := sender(m) msgBag := msgBagU{m} while ( m´ in msgBag : sender(m´)=Q and seq (m´) = next[Q]) do F-deliver(m´) next[Q] := next[Q]+1 msgBag:=msqBag-{m´} endwhile 25

FIFO Broadcast Theorem 1: Given a reliable broadcast algorithm this algorithm is uniform FIFO. TODO: Prove it. Theorem 2: if the reliable broadcast algorithm satisfies uniform agreement, so does this algorithm. TODO: Prove it. 26

Limitations of FIFO Broadcast Scenario: User A broadcasts a message to a mailing list/Board B delivers that article B broadcasts reply C delivers B’s response without A´s original message and misinterprets the message 27

Intuitions  A message m1 that causes a message m2 might be delivered by some process after m2  Causal broadcast alleviates the need for the application to deal with such dependencies

Causality ? p1 p2 p3 m2 delivery m1 delivery m2

Causal Broadcast prevDel is sequence of messages that P C-delivered since its last C-broadcast C-broadcast(m) F-broadcast(prevDel●m) prevDel:=Ø C-deliever(m) upon F-deliever(m1,m2,...,ml) do for i in 1..l do if P has not previously C-delivered mi then C-deliver(mi) prevDel:=prevDel●mi 30

Causal Broadcast Theorem 1: If the FIFO broadcast algorithm is Uniform FIFO, this is a uniform causal broadcast algorithm. Theorem 2: if the FIFO broadcast satisfies Uniform Agreement, so does this one. 31

Limitation of Causal Broadcast Causal broadcast does not impose any order on unrelated messages. Two correct processes can deliver operations/request in different order. 32

33 Total, FIFO and causal ordering of multicast messages Figure Notice the consistent ordering of totally ordered messages T 1 and T 2. They are opposite to real time. The order can be arbitrary it need not be FIFO or causal and the causally related messages C 1 and C 3

Atomic Broadcast Requires that all correct processes deliver all messages in the same order. Implies that all correct processes see the same view of the world. 34

Atomic Broadcast Theorem: Atomic broadcast is impossible in asynchronous systems. Proof: Equivalent to consensus problem. 35