Distributed Computing Group The Consensus Problem Roger Wattenhofer a lot of kudos to Maurice Herlihy and Costas Busch for some of their slides.

Slides:



Advertisements
Similar presentations
Introduction Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit TexPoint fonts used in EMF. Read the TexPoint manual.
Advertisements

The Art of Multiprocessor Programming Nir Shavit, Ori Shalev CS Spring 2007 (Based on the book by Herlihy and Shavit)
Impossibility of Distributed Consensus with One Faulty Process
1 © R. Guerraoui The Limitations of Registers R. Guerraoui Distributed Programming Laboratory.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Consensus Steve Ko Computer Sciences and Engineering University at Buffalo.
Teaser - Introduction to Distributed Computing
Mutual Exclusion The Art of Multiprocessor Programming Spring 2007.
6.852: Distributed Algorithms Spring, 2008 Class 7.
CPSC 668Set 18: Wait-Free Simulations Beyond Registers1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch.
Course Missive CS2750 Spring In Deo Speramus Brown.
Distributed Computing 8. Impossibility of consensus Shmuel Zaks ©
Announcements. Midterm Open book, open note, closed neighbor No other external sources No portable electronic devices other than medically necessary medical.
Distributed Algorithms – 2g1513 Lecture 10 – by Ali Ghodsi Fault-Tolerance in Asynchronous Networks.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 13: Impossibility of Consensus All slides © IG.
Consensus Hao Li.
DISTRIBUTED SYSTEMS II FAULT-TOLERANT AGREEMENT Prof Philippas Tsigas Distributed Computing and Systems Research Group.
Distributed Computing 8. Impossibility of consensus Shmuel Zaks ©
The Relative Power of Synchronization Operations The Art of Multiprocessor Programming Spring 2007.
Sergio Rajsbaum 2006 Lecture 3 Introduction to Principles of Distributed Computing Sergio Rajsbaum Math Institute UNAM, Mexico.
The Relative Power of Synchronization Operations Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Distributed Systems – Roger Wattenhofer –6/1 Part 2 Distributed Systems 2009.
ETH Zurich – Distributed Computing – Roger Wattenhofer Part 2: Fault-Tolerance Distributed Systems 2010.
Impossibility of Distributed Consensus with One Faulty Process Michael J. Fischer Nancy A. Lynch Michael S. Paterson Presented by: Oren D. Rubin.
Introduction Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Modified by Rajeev Alur for CIS 640 at Penn, Spring.
CPSC 668Set 11: Asynchronous Consensus1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch.
CPSC 668Set 11: Asynchronous Consensus1 CPSC 668 Distributed Algorithms and Systems Fall 2009 Prof. Jennifer Welch.
1 More on Distributed Coordination. 2 Who’s in charge? Let’s have an Election. Many algorithms require a coordinator. What happens when the coordinator.
Multiprocess Synchronization Algorithms ( ) Lecturer: Danny Hendler.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 19: Paxos All slides © IG.
On the Cost of Fault-Tolerant Consensus When There are no Faults Idit Keidar & Sergio Rajsbaum Appears in SIGACT News; MIT Tech. Report.
Halting Problem. Background - Halting Problem Common error: Program goes into an infinite loop. Wouldn’t it be nice to have a tool that would warn us.
1 © R. Guerraoui Seth Gilbert Professor: Rachid Guerraoui Assistants: M. Kapalka and A. Dragojevic Distributed Programming Laboratory.
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.
CIS 540 Principles of Embedded Computation Spring Instructor: Rajeev Alur
Lecture 8-1 Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2010 Indranil Gupta (Indy) September 16, 2010 Lecture 8 The Consensus.
Distributed Algorithms – 2g1513 Lecture 9 – by Ali Ghodsi Fault-Tolerance in Distributed Systems.
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Fall 2011 Prof. Jennifer Welch CSCE 668 Set 11: Asynchronous Consensus 1.
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Fall 2011 Prof. Jennifer Welch CSCE 668 Set 18: Wait-Free Simulations Beyond Registers 1.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 3 (26/01/2006) Instructor: Haifeng YU.
Consensus and Its Impossibility in Asynchronous Systems.
DISTRIBUTED SYSTEMS II FAULT-TOLERANT AGREEMENT Prof Philippas Tsigas Distributed Computing and Systems Research Group.
1 Chapter 9 Synchronization Algorithms and Concurrent Programming Gadi Taubenfeld © 2014 Synchronization Algorithms and Concurrent Programming Synchronization.
DISTRIBUTED ALGORITHMS AND SYSTEMS Spring 2014 Prof. Jennifer Welch Set 11: Asynchronous Consensus 1.
1 Consensus Hierarchy Part 1. 2 Consensus in Shared Memory Consider processors in shared memory: which try to solve the consensus problem.
CS294, Yelick Consensus revisited, p1 CS Consensus Revisited
1 Consensus Hierarchy Part 2. 2 FIFO (Queue) FIFO Object headtail.
Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Mutual Exclusion & Leader Election Steve Ko Computer Sciences and Engineering University.
Chap 15. Agreement. Problem Processes need to agree on a single bit No link failures A process can fail by crashing (no malicious behavior) Messages take.
Multiprocessor Synchronization Algorithms ( ) Lecturer: Danny Hendler.
Impossibility of Distributed Consensus with One Faulty Process By, Michael J.Fischer Nancy A. Lynch Michael S.Paterson.
Alternating Bit Protocol S R ABP is a link layer protocol. Works on FIFO channels only. Guarantees reliable message delivery with a 1-bit sequence number.
Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 9 Instructor: Haifeng YU.
Agenda  Quick Review  Finish Introduction  Java Threads.
CSE 486/586 CSE 486/586 Distributed Systems Leader Election Steve Ko Computer Sciences and Engineering University at Buffalo.
The Relative Power of Synchronization Operations Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.
The consensus problem in distributed systems
CSE 486/586 Distributed Systems Leader Election
When Is Agreement Possible
Part 2: Fault-Tolerance Distributed Systems 2011
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Alternating Bit Protocol
Distributed Consensus
CS Multiprocessor Synchronization
The Relative Power of Synchronization Methods
The Relative Power of Synchronization Operations
Presentation transcript:

Distributed Computing Group The Consensus Problem Roger Wattenhofer a lot of kudos to Maurice Herlihy and Costas Busch for some of their slides

Distributed Computing Group Roger Wattenhofer2 Sequential Computation memory object thread

Distributed Computing Group Roger Wattenhofer3 Concurrent Computation memory object threads

Distributed Computing Group Roger Wattenhofer4 Asynchrony Sudden unpredictable delays –Cache misses (short) –Page faults (long) –Scheduling quantum used up (really long)

Distributed Computing Group Roger Wattenhofer5 Model Summary Multiple threads –Sometimes called processes Single shared memory Objects live in memory Unpredictable asynchronous delays

Distributed Computing Group Roger Wattenhofer6 Road Map We are going to focus on principles –Start with idealized models –Look at a simplistic problem –Emphasize correctness over pragmatism –“Correctness may be theoretical, but incorrectness has practical impact”

Distributed Computing Group Roger Wattenhofer7 You may ask yourself … I’m no theory weenie - why all the theorems and proofs?

Distributed Computing Group Roger Wattenhofer8 Fundamentalism Distributed & concurrent systems are hard –Failures –Concurrency Easier to go from theory to practice than vice-versa

Distributed Computing Group Roger Wattenhofer9 The Two Generals Red army wins If both sides attack together

Distributed Computing Group Roger Wattenhofer10 Communications Red armies send messengers across valley

Distributed Computing Group Roger Wattenhofer11 Communications Messengers don’t always make it

Distributed Computing Group Roger Wattenhofer12 Your Mission Design a protocol to ensure that red armies attack simultaneously

Distributed Computing Group Roger Wattenhofer13 Real World Generals Date: Wed, 11 Dec :33: From: Friedemann Mattern To: Roger Wattenhofer Subject: Vorlesung Sie machen jetzt am Freitag, 08:15 die Vorlesung Verteilte Systeme, wie vereinbart. OK? (Ich bin jedenfalls am Freitag auch gar nicht da.) Ich uebernehme das dann wieder nach den Weihnachtsferien.

Distributed Computing Group Roger Wattenhofer14 Real World Generals Date: Mi :34 From: Roger Wattenhofer To: Friedemann Mattern Subject: Re: Vorlesung OK. Aber ich gehe nur, wenn sie diese nochmals bestaetigen... :-) Gruesse -- Roger Wattenhofer

Distributed Computing Group Roger Wattenhofer15 Real World Generals Date: Wed, 11 Dec :53: From: Friedemann Mattern To: Roger Wattenhofer Subject: Naechste Runde: Re: Vorlesung... Das dachte ich mir fast. Ich bin Praktiker und mache es schlauer: Ich gehe nicht, unabhaengig davon, ob Sie diese bestaetigen (beziehungsweise rechtzeitig erhalten). (:-)

Distributed Computing Group Roger Wattenhofer16 Real World Generals Date: Mi :01 From: Roger Wattenhofer To: Friedemann Mattern Subject: Re: Naechste Runde: Re: Vorlesung... Ich glaube, jetzt sind wir so weit, dass ich diese s in der Vorlesung auflegen werde...

Distributed Computing Group Roger Wattenhofer17 Real World Generals Date: Wed, 11 Dec :55: From: Friedemann Mattern To: Roger Wattenhofer Subject: Re: Naechste Runde: Re: Vorlesung... Kein Problem. (Hauptsache es kommt raus, dass der Prakiker am Ende der schlauere ist... Und der Theoretiker entweder heute noch auf das allerletzte Ack wartet oder wissend das das ja gar nicht gehen kann alles gleich von vornherein bleiben laesst... (:-))

Distributed Computing Group Roger Wattenhofer18 Theorem There is no non-trivial protocol that ensures the red armies attacks simultaneously

Distributed Computing Group Roger Wattenhofer19 Proof Strategy Assume a protocol exists Reason about its properties Derive a contradiction

Distributed Computing Group Roger Wattenhofer20 Proof 1.Consider the protocol that sends fewest messages 2.It still works if last message lost 3.So just don’t send it –Messengers’ union happy 4.But now we have a shorter protocol! 5.Contradicting #1

Distributed Computing Group Roger Wattenhofer21 Fundamental Limitation Need an unbounded number of messages Or possible that no attack takes place

Distributed Computing Group Roger Wattenhofer22 You May Find Yourself … I want a real-time YAFA compliant Two Generals protocol using UDP datagrams running on our enterprise-level fiber tachyion network...

Distributed Computing Group Roger Wattenhofer23 You might say I want a real-time YAFA compliant Two Generals protocol using UDP datagrams running on our enterprise-level fiber tachyion network... Yes, Ma’am, right away!

Distributed Computing Group Roger Wattenhofer24 You might say I want a real-time dot-net compliant Two Generals protocol using UDP datagrams running on our enterprise-level fiber tachyion network... Yes, Ma’am, right away! Advantage: Buys time to find another job No one expects software to work anyway Advantage: Buys time to find another job No one expects software to work anyway

Distributed Computing Group Roger Wattenhofer25 You might say I want a real-time dot-net compliant Two Generals protocol using UDP datagrams running on our enterprise-level fiber tachyion network... Yes, Ma’am, right away! Advantage: Buys time to find another job No one expects software to work anyway Disadvantage: You’re doomed Without this course, you may not even know you’re doomed Disadvantage: You’re doomed Without this course, you may not even know you’re doomed

Distributed Computing Group Roger Wattenhofer26 You might say I want a real-time YAFA compliant Two Generals protocol using UDP datagrams running on our enterprise-level fiber tachyion network... I can’t find a fault-tolerant algorithm, I guess I’m just a pathetic loser.

Distributed Computing Group Roger Wattenhofer27 You might say I want a real-time YAFA compliant Two Generals protocol using UDP datagrams running on our enterprise-level fiber tachyion network... I can’t find a fault-tolerant algorithm, I guess I’m just a pathetic loser. Advantage: No need to take course Advantage: No need to take course

Distributed Computing Group Roger Wattenhofer28 You might say I want a real-time YAFA compliant Two Generals protocol using UDP datagrams running on our enterprise-level fiber tachyion network... I can’t find a fault-tolerant algorithm, I guess I’m just a pathetic loser Advantage: No need to take course Advantage: No need to take course Disadvantage: Boss fires you, hires University St. Gallen graduate Disadvantage: Boss fires you, hires University St. Gallen graduate

Distributed Computing Group Roger Wattenhofer29 You might say I want a real-time YAFA compliant Two Generals protocol using UDP datagrams running on our enterprise-level fiber tachyion network... Using skills honed in course, I can avert certain disaster! Rethink problem spec, or Weaken requirements, or Build on different platform Using skills honed in course, I can avert certain disaster! Rethink problem spec, or Weaken requirements, or Build on different platform

Distributed Computing Group Roger Wattenhofer30 Consensus: Each Thread has a Private Input

Distributed Computing Group Roger Wattenhofer31 They Communicate

Distributed Computing Group Roger Wattenhofer32 They Agree on Some Thread’s Input 19

Distributed Computing Group Roger Wattenhofer33 Consensus is important With consensus, you can implement anything you can imagine… Examples: with consensus you can decide on a leader, implement mutual exclusion, or solve the two generals problem

Distributed Computing Group Roger Wattenhofer34 You gonna learn In some models, consensus is possible In some other models, it is not Goal of this and next lecture: to learn whether for a given model consensus is possible or not … and prove it!

Distributed Computing Group Roger Wattenhofer35 Consensus #1 shared memory n processors, with n > 1 Processors can atomically read or write (not both) a shared memory cell

Distributed Computing Group Roger Wattenhofer36 Protocol (Algorithm?) There is a designated memory cell c. Initially c is in a special state “?” Processor 1 writes its value v 1 into c, then decides on v 1. A processor j (j not 1) reads c until j reads something else than “?”, and then decides on that.

Distributed Computing Group Roger Wattenhofer37 Unexpected Delay Swapped out back at ???

Distributed Computing Group Roger Wattenhofer38 Heterogeneous Architectures ??? Pentium 286 yawn (1)

Distributed Computing Group Roger Wattenhofer39 Fault-Tolerance ???

Distributed Computing Group Roger Wattenhofer40 Consensus #2 wait-free shared memory n processors, with n > 1 Processors can atomically read or write (not both) a shared memory cell Processors might crash (halt) Wait-free implementation… huh?

Distributed Computing Group Roger Wattenhofer41 Wait-Free Implementation Every process (method call) completes in a finite number of steps Implies no mutual exclusion We assume that we have wait-free atomic registers (that is, reads and writes to same register do not overlap)

Distributed Computing Group Roger Wattenhofer42 A wait-free algorithm… There is a cell c, initially c=“?” Every processor i does the following r = Read(c); if (r == “?”) then Write(c, v i ); decide v i ; else decide r;

Distributed Computing Group Roger Wattenhofer43 Is the algorithm correct? time cell c ? ? ? ! 17!

Distributed Computing Group Roger Wattenhofer44 Theorem: No wait-free consensus ???

Distributed Computing Group Roger Wattenhofer45 Proof Strategy Make it simple –n = 2, binary input Assume that there is a protocol Reason about the properties of any such protocol Derive a contradiction

Distributed Computing Group Roger Wattenhofer46 Wait-Free Computation Either A or B “moves” Moving means –Register read –Register write A movesB moves

Distributed Computing Group Roger Wattenhofer47 The Two-Move Tree Initial state Final states

Distributed Computing Group Roger Wattenhofer48 Decision Values

Distributed Computing Group Roger Wattenhofer49 Bivalent: Both Possible 111 bivalent 10 0

Distributed Computing Group Roger Wattenhofer50 Univalent: Single Value Possible 111 univalent 10 0

Distributed Computing Group Roger Wattenhofer51 1-valent: Only 1 Possible valent 0 1

Distributed Computing Group Roger Wattenhofer52 0-valent: Only 0 possible valent 10 0

Distributed Computing Group Roger Wattenhofer53 Summary Wait-free computation is a tree Bivalent system states –Outcome not fixed Univalent states –Outcome is fixed –Maybe not “known” yet –1-Valent and 0-Valent states

Distributed Computing Group Roger Wattenhofer54 Claim Some initial system state is bivalent (The outcome is not always fixed from the start.)

Distributed Computing Group Roger Wattenhofer55 A 0-Valent Initial State All executions lead to decision of 0 0 0

Distributed Computing Group Roger Wattenhofer56 A 0-Valent Initial State Solo execution by A also decides 0 0

Distributed Computing Group Roger Wattenhofer57 A 1-Valent Initial State All executions lead to decision of 1 1 1

Distributed Computing Group Roger Wattenhofer58 A 1-Valent Initial State Solo execution by B also decides 1 1

Distributed Computing Group Roger Wattenhofer59 A Univalent Initial State? Can all executions lead to the same decision? 0 1

Distributed Computing Group Roger Wattenhofer60 State is Bivalent Solo execution by A must decide 0 Solo execution by B must decide 1 0 1

Distributed Computing Group Roger Wattenhofer61 0-valent Critical States 1-valent critical

Distributed Computing Group Roger Wattenhofer62 Critical States Starting from a bivalent initial state The protocol can reach a critical state –Otherwise we could stay bivalent forever –And the protocol is not wait-free

Distributed Computing Group Roger Wattenhofer63 From a Critical State c If A goes first, protocol decides 0 If B goes first, protocol decides 1 0-valent 1-valent

Distributed Computing Group Roger Wattenhofer64 Model Dependency So far, memory-independent! True for –Registers –Message-passing –Carrier pigeons –Any kind of asynchronous computation

Distributed Computing Group Roger Wattenhofer65 What are the Threads Doing? Reads and/or writes To same/different registers

Distributed Computing Group Roger Wattenhofer66 Possible Interactions x.read()y.read()x.write()y.write() x.read() ???? y.read() ???? x.write() ???? y.write() ????

Distributed Computing Group Roger Wattenhofer67 Reading Registers A runs solo, decides 0 B reads x 1 0 A runs solo, decides 1 c States look the same to A

Distributed Computing Group Roger Wattenhofer68 Possible Interactions x.read()y.read()x.write()y.write() x.read() no y.read() no x.write() no ?? y.write() no ??

Distributed Computing Group Roger Wattenhofer69 Writing Distinct Registers A writes yB writes x 10 c The song remains the same A writes y B writes x

Distributed Computing Group Roger Wattenhofer70 Possible Interactions x.read()y.read()x.write()y.write() x.read() no y.read() no x.write() no ? y.write() no ?

Distributed Computing Group Roger Wattenhofer71 Writing Same Registers States look the same to A A writes x B writes x 1 A runs solo, decides 1 c 0 A runs solo, decides 0 A writes x

Distributed Computing Group Roger Wattenhofer72 That’s All, Folks! x.read()y.read()x.write()y.write() x.read() no y.read() no x.write() no y.write() no

Distributed Computing Group Roger Wattenhofer73 Theorem It is impossible to solve consensus using read/write atomic registers –Assume protocol exists –It has a bivalent initial state –Must be able to reach a critical state –Case analysis of interactions Reads vs others Writes vs writes

Distributed Computing Group Roger Wattenhofer74 What Does Consensus have to do with Distributed Systems?

Distributed Computing Group Roger Wattenhofer75 We want to build a Concurrent FIFO Queue

Distributed Computing Group Roger Wattenhofer76 With Multiple Dequeuers!

Distributed Computing Group Roger Wattenhofer77 A Consensus Protocol 2-element array FIFO Queue with red and black balls 8 Coveted red ballDreaded black ball

Distributed Computing Group Roger Wattenhofer78 Protocol: Write Value to Array 01 0

Distributed Computing Group Roger Wattenhofer79 0 Protocol: Take Next Item from Queue 01 8

Distributed Computing Group Roger Wattenhofer80 01 Protocol: Take Next Item from Queue I got the coveted red ball, so I will decide my value I got the dreaded black ball, so I will decide the other’s value from the array 8

Distributed Computing Group Roger Wattenhofer81 Why does this Work? If one thread gets the red ball Then the other gets the black ball Winner can take her own value Loser can find winner’s value in array –Because threads write array before dequeuing from queue

Distributed Computing Group Roger Wattenhofer82 Implication We can solve 2-thread consensus using only –A two-dequeuer queue –Atomic registers

Distributed Computing Group Roger Wattenhofer83 Implications Assume there exists –A queue implementation from atomic registers Given –A consensus protocol from queue and registers Substitution yields –A wait-free consensus protocol from atomic registers contradiction

Distributed Computing Group Roger Wattenhofer84 Corollary It is impossible to implement a two- dequeuer wait-free FIFO queue with read/write shared memory. This was a proof by reduction; important beyond NP-completeness…