DISTRIBUTED SYSTEMS II FAULT-TOLERANT AGREEMENT II Prof Philippas Tsigas Distributed Computing and Systems Research Group.

Slides:



Advertisements
Similar presentations
CS542: Topics in Distributed Systems Distributed Transactions and Two Phase Commit Protocol.
Advertisements

Slides for Chapter 13: Distributed transactions
(c) Oded Shmueli Distributed Recovery, Lecture 7 (BHG, Chap.7)
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
Computer Science 425 Distributed Systems CS 425 / ECE 428 Consensus
DISTRIBUTED SYSTEMS II FAULT-TOLERANT AGREEMENT Prof Philippas Tsigas Distributed Computing and Systems Research Group.
Byzantine Generals Problem: Solution using signed messages.
Exercises for Chapter 17: Distributed Transactions
CIS 720 Concurrency Control. Timestamp-based concurrency control Assign a timestamp ts(T) to each transaction T. Each data item x has two timestamps:
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
Computer Science Lecture 18, page 1 CS677: Distributed OS Last Class: Fault Tolerance Basic concepts and failure models Failure masking using redundancy.
ICS 421 Spring 2010 Distributed Transactions Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 3/16/20101Lipyeow.
Systems of Distributed Systems Module 2 -Distributed algorithms Teaching unit 3 – Advanced algorithms Ernesto Damiani University of Bozen Lesson 6 – Two.
Computer Science Lecture 17, page 1 CS677: Distributed OS Last Class: Fault Tolerance Basic concepts and failure models Failure masking using redundancy.
Teaching material based on Distributed Systems: Concepts and Design, Edition 3, Addison-Wesley Copyright © George Coulouris, Jean Dollimore, Tim.
Non-blocking Atomic Commitment Aaron Kaminsky Presenting Chapter 6 of Distributed Systems, 2nd edition, 1993, ed. Mullender.
CS 603 Three-Phase Commit February 22, Centralized vs. Decentralized Protocols What if we don’t want a coordinator? Decentralized: –Each site broadcasts.
1 More on Distributed Coordination. 2 Who’s in charge? Let’s have an Election. Many algorithms require a coordinator. What happens when the coordinator.
Distributed Algorithms: Agreement Protocols. Problems of Agreement l A set of processes need to agree on a value (decision), after one or more processes.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 18: Replication Control All slides © IG.
1 ICS 214B: Transaction Processing and Distributed Data Management Distributed Database Systems.
Distributed Commit. Example Consider a chain of stores and suppose a manager – wants to query all the stores, – find the inventory of toothbrushes at.
CMPT Dr. Alexandra Fedorova Lecture XI: Distributed Transactions.
Distributed Systems Fall 2009 Distributed transactions.
CMPT Dr. Alexandra Fedorova Lecture XI: Distributed Transactions.
Distributed Commit Dr. Yingwu Zhu. Failures in a distributed system Consistency requires agreement among multiple servers – Is transaction X committed?
Distributed Transactions March 15, Transactions What is a Distributed Transaction?  A transaction that involves more than one server  Network.
DISTRIBUTED SYSTEMS II AGREEMENT (2-3 PHASE COM.) Prof Philippas Tsigas Distributed Computing and Systems Research Group.
Chapter 19 Recovery and Fault Tolerance Copyright © 2008.
Distributed Transactions Chapter 13
Lecture 12: Distributed transactions Haibin Zhu, PhD. Assistant Professor Department of Computer Science Nipissing University © 2002.
Distributed Transactions
Consensus and Its Impossibility in Asynchronous Systems.
CSE 486/586 CSE 486/586 Distributed Systems Concurrency Control Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Concurrency Control Steve Ko Computer Sciences and Engineering University at Buffalo.
Distributed Transactions CS425 /CSE424/ECE428 – Distributed Systems – Fall 2011 Nikita Borisov - UIUC Material derived from slides by I. Gupta, M. Harandi,
DISTRIBUTED SYSTEMS II FAULT-TOLERANT AGREEMENT Prof Philippas Tsigas Distributed Computing and Systems Research Group.
Operating Systems Distributed Coordination. Topics –Event Ordering –Mutual Exclusion –Atomicity –Concurrency Control Topics –Event Ordering –Mutual Exclusion.
Distributed Transaction Management, Fall 2002Lecture Distributed Commit Protocols Jyrki Nummenmaa
University of Tampere, CS Department Distributed Commit.
DISTRIBUTED SYSTEMS II AGREEMENT - COMMIT (2-3 PHASE COMMIT) Prof Philippas Tsigas Distributed Computing and Systems Research Group.
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
Distributed Transactions Chapter – Vidya Satyanarayanan.
Transactions and Concurrency Control. Concurrent Accesses to an Object Multiple threads Atomic operations Thread communication Fairness.
Committed:Effects are installed to the database. Aborted:Does not execute to completion and any partial effects on database are erased. Consistent state:
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Slides for Chapter 17: Distributed.
 2002 M. T. Harandi and J. Hou (modified: I. Gupta) Distributed Transactions.
IM NTU Distributed Information Systems 2004 Distributed Transactions -- 1 Distributed Transactions Yih-Kuen Tsay Dept. of Information Management National.
Revisiting failure detectors Some of you asked questions about implementing consensus using S - how does it differ from reaching consensus using P. Here.
A client transaction becomes distributed if it invokes operations in several different Servers There are two different ways that distributed transactions.
Fault Tolerance Chapter 7. Goal An important goal in distributed systems design is to construct the system in such a way that it can automatically recover.
Distributed DBMSPage © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture Distributed Database.
10-Jun-16COMP28112 Lecture 131 Distributed Transactions.
1 AGREEMENT PROTOCOLS. 2 Introduction Processes/Sites in distributed systems often compete as well as cooperate to achieve a common goal. Mutual Trust/agreement.
Recovery in Distributed Systems:
Outline Introduction Background Distributed DBMS Architecture
Outline Announcements Fault Tolerance.
CSE 486/586 Distributed Systems Concurrency Control --- 3
Slides for Chapter 14: Distributed transactions
Distributed Transactions
Exercises for Chapter 14: Distributed Transactions
Distributed Transactions
UNIVERSITAS GUNADARMA
Distributed Transactions
Distributed Systems Course Distributed transactions
Distributed Transactions
CIS 720 Concurrency Control.
CSE 486/586 Distributed Systems Concurrency Control --- 3
Last Class: Fault Tolerance
Presentation transcript:

DISTRIBUTED SYSTEMS II FAULT-TOLERANT AGREEMENT II Prof Philippas Tsigas Distributed Computing and Systems Research Group

Conditions for a solution for Byzantine faults Number of processes: n Maximum number of possibly failing processes: f Necessary and sufficient condition for a solution to Byzantine agreement: f<n/3 Minimal number of rounds in a deterministic solution: f+1  There exist randomized solutions with a lower expected number of rounds 2

Impossibility of 1-resilient 3-processor Agreement 3 A:V A =0 B:V B =0 C:V C =0 A´:V A´= 1 B´:V B´ =1 C´:V C´ =1 E1E1

Impossibility of 1-resilient 3-processor Agreement 4 A:V A =0 B:V B =0 C:V C =0 A´:V A´= 1 B´:V B´ =1 C´:V C´ =1 E0E0

Impossibility of 1-resilient 3-processor Agreement 5 A:V A =0 B:V B =0 C:V C =0 A´:V A´= 1 B´:V B´ =1 C´:V C´ =1 E1E1

Impossibility of 1-resilient 3-processor Agreement 6 A:V A =0 B:V B =0 C:V C =0 A´:V A´= 1 B´:V B´ =1 C´:V C´ =1 E2E2

Proof In E 0 A and B decide 0 In E 1 B´ and C´ decide 1 In E 2 C´ has to decide 1 and A has to decide 0, contradiction! 7

t-resilient algorithm requiring n 2 8 P1, P4,…P2,…P3,… P1, P2, P3, P4,...,Pn processors P´1= P´2= P´3= Distribute them Evenly to 3 logical Processes: P’1, P’2, P’3

t-resilient algorithm requiring n 2 9 P1, P4,…P2,…P3,… P1, P2, P3, P4,...,Pn processors P´1= P´2= P´3= |P´1|, |P´2|, |P´3| < t

t-resilient algorithm requiring n 2 10 P1, P4,…P2,…P3,… P1, P2, P3, P4,...,Pn processors P´1= P´2= P´3= In the system with 3 processors (P´1, P´2 and P´3) if one of them is faulty then at most t processors of the initial system are going to be faulty.

Proof  Run the solution to the problem (sysyem with n processes) at the 3 process system.  Ask P’1, P’2 and P’3 to simulate their respective substem and decide what the processes of its subsystem have decided.  Contradiction! This is a solution to a 3 processes system with one byzantine process. 11

”I can’t find a solution, I guess I’m just too dumb”  Picture from Computers and Intractability, by Garey and Johnson 12

”I can’t find an algorithm, because no such algorithm is possible”  Picture from Computers and Intractability, by Garey and Johnson 13

”I can’t find an algorithm, but neither can all these famous people.”  Picture from Computers and Intractability, by Garey and Johnson 14

o For a system with at most f processes crashing, the algorithm proceeds in f+1 rounds (with timeout), using basic multicast. o Values r i : the set of proposed values known to P i at the beginning of round r. o Initially Values 0 i = {} ; Values 1 i = {v i } for round = 1 to f+1 do multicast (Values r i – Values r-1 i ) Values r+1 i  Values r i for each V j received Values r+1 i = Values r+1 i  V j end d i = minimum(Values f+2 i ) Consensus in a Synchronous System with process crashing

Proof of Correctness Proof by contradiction.  Assume that two processes differ in their final set of values.  Assume that p i possesses a value v that p j does not possess.  A third process, p k, sent v to p i, and crashed before sending v to p j.  Any process sending v in the previous round must have crashed; otherwise, both p k and p j should have received v.  Proceeding in this way, we infer at least one crash in each of the preceding rounds.  But we have assumed at most f crashes can occur and there are f+1 rounds  contradiction.

Byzantine agreem. with authentication and Synchrony Every message carries a signature The signature of a loyal general cannot be forged Alteration of the contents of a signed message can be detected Every (loyal) general can verify the signature of any other (loyal) general Any number f of traitors can be allowed Commander is process 0 Structure of message from (and signed by) the commander, and subsequently signed and sent by lieutenants Li1, Li2,…: (v : s0 : si1: … : sik) Every lieutenant maintains a set of orders V Some choice function on V for deciding (e.g., majority, minimum) 17

 Algorithm in commander: send(v: s0)to every lieutenant –Algorithm in every lieutenant Li : upon receipt of (v : s0: si1: …. : sik) do if (v not in V) then V := V union {v} if (k < f) then for(j in {1,2,…,n-1} \{i,i1,…,ik}) do send(v: s0: si1: … : sik: i) to Lj If (Li will not receive any more messages) then decide(choice(V)) 18

19 Atomic commit protocols  transaction atomicity requires that at the end, –either all of its operations are carried out or none of them.  in a distributed transaction, the client has requested the operations at more than one server  one-phase atomic commit protocol –the coordinator tells the participants whether to commit or abort –what is the problem with that? –this does not allow one of the servers to decide to abort – it may have discovered a deadlock or it may have crashed and been restarted  two-phase atomic commit protocol –is designed to allow any participant to choose to abort a transaction –phase 1 - each participant votes. If it votes to commit, it is prepared. It cannot change its mind. In case it crashes, it must save updates in permanent store –phase 2 - the participants carry out the joint decision The decision could be commit or abort - participants record it in permanent store

20 Failure model for the commit protocols  Failure model for transactions –this applies to the two-phase commit protocol  Commit protocols are designed to work in –synchronous system, system failure when a msg does not arrive on time. –servers may crash but a new process whose state is set from information saved in permanent storage and information held by other processes. –messages may NOT be lost. –assume corrupt and duplicated messages are removed. –no byzantine faults – servers either crash or they obey their requests  2PC is an example of a protocol for reaching a consensus. –Chapter 11 says consensus cannot be reached in an asynchronous system if processes sometimes fail. –however, 2PC does reach consensus under those conditions. –because crash failures of processes are masked by replacing a crashed process with a new process whose state is set from information saved in permanent storage and information held by other processes.

21 Operations for two-phase commit protocol  participant interface - canCommit?, doCommit, doAbort coordinator interface - haveCommitted, getDecision canCommit?(trans)-> Yes / No Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote. doCommit(trans) Call from coordinator to participant to tell participant to commit its part of a transaction. doAbort(trans) Call from coordinator to participant to tell participant to abort its part of a transaction. haveCommitted(trans, participant) Call from participant to coordinator to confirm that it has committed the transaction. getDecision(trans) -> Yes / No Call from participant to coordinator to ask for the decision on a transaction after it has voted Yes but has still had no reply after some delay. Used to recover from server crash or delayed messages. Figure 13.4 This is a request with a reply These are asynchronous requests to avoid delays Asynchronous request

22 The two-phase commit protocol Figure 13.5 Phase 1 (voting phase): 1. The coordinator sends a canCommit? request to each of the participants in the transaction. 2. When a participant receives a canCommit? request it replies with its vote (Yes or No) to the coordinator. Before voting Yes, it prepares to commit by saving objects in permanent storage. If the vote is No the participant aborts immediately. Phase 2 (completion according to outcome of vote): 3. The coordinator collects the votes (including its own). w(a)If there are no failures and all the votes are Yes the coordinator decides to commit the transaction and sends a doCommit request to each of the participants. w(b)Otherwise the coordinator decides to abort the transaction and sends doAbort requests to all participants that voted Yes. 4. Participants that voted Yes are waiting for a doCommit or doAbort request from the coordinator. When a participant receives one of these messages it acts accordingly and in the case of commit, makes a haveCommitted call as confirmation to the coordinator.

Two-Phase Commit Protocol  canCommit? yes doCommit no doAbort canCommit?

TimeOut Protocol At step 2 and 3 no commit decision made OK to abort Coordinator will either not collect all commit votes or will vote for abort

TimeOut Protocol At step 4 o cohort cannot communicate with coordinator o Coordinator mayhave decided o Cohort must block until communication re- established o Might ask other cohorts

Restart Protocol  canCommit? yes doCommit no doAbort canCommit? If the site Has decided, it just picks up from where it left off Is a cohort that had not voted, it decides abort Is a cordinator that has not decided, it decides abort A cohort that crashed after voting commit, it must block until it discovers

Blocking  canCommit? yes doCommit no doAbort canCommit? Blocking can occur if: Coordinatoor crashes Cohort cannot communicate with coordinator Between 2 and 4

Three-Phase Commit Protocol  canCommit? yes precommit no doAbort canCommit? ack commit 5 6

Three-phase commit protocol 29

30 Performance of the two-phase commit protocol  if there are no failures, the 2PC involving N participants requires – N canCommit? messages and replies, followed by N doCommit messages.  the cost in messages is proportional to 3N, and the cost in time is three rounds of messages.  The haveCommitted messages are not counted –there may be arbitrarily many server and communication failures –2PC is is guaranteed to complete eventually, but it is not possible to specify a time limit within which it will be completed  delays to participants in uncertain state  some 3PCs designed to alleviate such delays they require more messages and more rounds for the normal case

Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct use by individual teachers. It may not be included in any product or employed in any service without the written permission of the authors. Viewing: These slides must be viewed in slide show mode. Teaching material based on Distributed Systems: Concepts and Design, Edition 3, Addison-Wesley Distributed Systems Course Distributed transactions 13.1 Introduction 13.2Flat and nested distributed transactions 13.3Atomic commit protocols 13.4Concurrency control in distributed transactions 13.5Distributed deadlocks 13.6Transaction recovery

Two-phase commit protocol for nested transactions  Recall Fig 13.1b, top-level transaction T and subtransactions T 1, T 2, T 11, T 12, T 21, T 22  A subtransaction starts after its parent and finishes before it  When a subtransaction completes, it makes an independent decision either to commit provisionally or to abort. –A provisional commit is not the same as being prepared: it is a local decision and is not backed up on permanent storage. –If the server crashes subsequently, its replacement will not be able to carry out a provisional commit.  A two-phase commit protocol is needed for nested transactions –it allows servers of provisionally committed transactions that have crashed to abort them when they recover.