Strong Consistency & CAP Theorem

Strong Consistency & CAP Theorem
COS 418: Distributed Systems, Lecture 13
Michael Freedman

Consistency models
2PC / Consensus: Paxos / Raft
Eventual consistency: Dynamo

Consistency in Paxos/Raft
[Figure: consensus module, replicated log of operations (add, jmp, mov, shl), state machine; processes may be running on different machines]
Fault-tolerance / durability: don't lose operations
Consistency: ordering between (visible) operations
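To make the figure concrete, here is a minimal sketch (in Go, with invented type names) of the replicated-state-machine idea: the consensus module fixes one order of log entries, and every replica applies that same log in order, so all copies of the state machine stay identical.

```go
package main

import "fmt"

// Op is one entry in the replicated log (e.g., "add", "jmp", "mov", "shl").
type Op struct {
	Index int    // position chosen by the consensus module
	Name  string // operation applied to the state machine
}

// StateMachine applies committed log entries in index order.
type StateMachine struct {
	applied []string
	nextIdx int
}

// Apply executes entries strictly in log order; consensus guarantees every
// replica sees the same sequence, so every replica ends in the same state.
func (sm *StateMachine) Apply(op Op) {
	if op.Index != sm.nextIdx {
		panic("consensus must deliver entries in order, without gaps")
	}
	sm.applied = append(sm.applied, op.Name)
	sm.nextIdx++
}

func main() {
	log := []Op{{0, "add"}, {1, "jmp"}, {2, "mov"}, {3, "shl"}}

	// Two replicas applying the same committed log reach the same state.
	a, b := &StateMachine{}, &StateMachine{}
	for _, op := range log {
		a.Apply(op)
		b.Apply(op)
	}
	fmt.Println(a.applied) // [add jmp mov shl]
	fmt.Println(b.applied) // [add jmp mov shl]
}
```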

Correct consistency model?
[Figure: two clients, A and B; processes may be running on different machines]
Let's say A and B each send an op. Should:
All readers see A → B?
All readers see B → A?
Some see A → B and others B → A?

Paxos/Raft has strong consistency
Provides the behavior of a single copy of the object:
A read should return the most recent write
Subsequent reads should return the same value, until the next write
Telephone intuition:
Alice updates her Facebook post
Alice calls Bob on the phone: "Check my Facebook post!"
Bob reads Alice's wall, sees her post

Strong Consistency?
[Figure: one client issues write(A,1) and gets success; a second client then issues read(A) and gets 1]
The phone call ensures a happens-before relationship, even through "out-of-band" communication

Strong Consistency?
[Figure: same write(A,1) / read(A) exchange as above]
One cool trick: delay responding to writes/ops until properly committed
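A toy sketch of the "delay responding until committed" trick, using invented helper names: the leader only acknowledges a write to the client once a majority of replicas has accepted it. Real Paxos/Raft leaders also handle retries, leader changes, and failures, all elided here.

```go
package main

import "fmt"

// ack simulates one replica accepting a proposed write.
// Here every replica accepts; a real system would see failures and timeouts.
func ack(replica int, key string, val int) bool {
	return true
}

// write returns success to the client only after a majority of the n
// replicas have accepted the operation, i.e., once it is durably committed.
func write(n int, key string, val int) bool {
	acks := 0
	for r := 0; r < n; r++ {
		if ack(r, key, val) {
			acks++
		}
	}
	return acks >= n/2+1 // majority quorum
}

func main() {
	if write(3, "A", 1) {
		fmt.Println("write(A,1) committed; safe to tell the client 'success'")
	}
}
```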

Strong Consistency? This is buggy!
[Figure: write(A,1) acknowledged once committed; a read(A) served directly by a third replica returns 1]
It isn't sufficient to return the value held by a third node: that node doesn't know precisely when the op is "globally" committed
Instead: we need to actually order the read operation

Strong Consistency!
[Figure: write(A,1) and read(A) both routed through the leader]
Order all operations via (1) the leader and (2) consensus
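A minimal sketch of the fix, with hypothetical names: both writes and reads are sequenced through the leader's committed log, so a read can never return a value that is not yet globally committed.

```go
package main

import "fmt"

// Leader sequences every operation, reads included, through one committed
// log, so all clients observe a single total order.
type Leader struct {
	log  []string       // committed operations, in order
	data map[string]int // state machine: a key-value store
}

func NewLeader() *Leader {
	return &Leader{data: make(map[string]int)}
}

// Write is appended to the log and applied once committed
// (commitment via consensus is elided in this sketch).
func (l *Leader) Write(key string, val int) {
	l.log = append(l.log, fmt.Sprintf("write(%s,%d)", key, val))
	l.data[key] = val
}

// Read is also ordered in the log: it runs after every previously
// committed write, so it returns the most recent committed value.
func (l *Leader) Read(key string) int {
	l.log = append(l.log, fmt.Sprintf("read(%s)", key))
	return l.data[key]
}

func main() {
	l := NewLeader()
	l.Write("A", 1)
	fmt.Println(l.Read("A")) // 1: the read is ordered after the write
	fmt.Println(l.log)       // [write(A,1) read(A)]
}
```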

Strong consistency = linearizability
Linearizability (Herlihy and Wing, 1990):
All servers execute all ops in some identical sequential order
The global ordering preserves each client's own local ordering
The global ordering preserves the real-time guarantee:
All ops receive a global timestamp using a synchronized clock
If ts_op1(x) < ts_op2(y), then op1(x) precedes op2(y) in the sequence
For concurrent (overlapping) ops, either order is linearizable; all servers just need to agree (e.g., tie-break on processor ID)
Consequences:
Once a write completes, all later reads (by wall-clock start time) should return the value of that write or of a later write
Once a read returns a particular value, all later reads should return that value or the value of a later write
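The definition can be made executable for tiny histories. The sketch below (brute force, invented names) checks whether a history of register operations, each with a wall-clock start and end time, admits some total order that (1) respects real time, i.e., an operation that finished before another started must come first, and (2) is legal for a register, i.e., every read returns the most recently written value.

```go
package main

import "fmt"

// Op is one complete operation with wall-clock invocation/response times.
type Op struct {
	IsWrite    bool
	Val        int // value written, or value the read returned
	Start, End int // real-time interval during which the op was in flight
}

// respectsRealTime: if a finished strictly before b started, a must come first.
func respectsRealTime(order []Op) bool {
	for i := range order {
		for j := i + 1; j < len(order); j++ {
			if order[j].End < order[i].Start {
				return false
			}
		}
	}
	return true
}

// legalRegister: every read returns the most recently written value (0 initially).
func legalRegister(order []Op) bool {
	cur := 0
	for _, op := range order {
		if op.IsWrite {
			cur = op.Val
		} else if op.Val != cur {
			return false
		}
	}
	return true
}

// linearizable brute-forces all orderings of a (small) history.
func linearizable(ops []Op) bool { return permute(ops, 0) }

func permute(ops []Op, k int) bool {
	if k == len(ops) {
		return respectsRealTime(ops) && legalRegister(ops)
	}
	for i := k; i < len(ops); i++ {
		ops[k], ops[i] = ops[i], ops[k]
		if permute(ops, k+1) {
			return true
		}
		ops[k], ops[i] = ops[i], ops[k]
	}
	return false
}

func main() {
	// write(A,1) completes at time 2; a read starting at time 3 returns 0.
	bad := []Op{
		{IsWrite: true, Val: 1, Start: 0, End: 2},
		{IsWrite: false, Val: 0, Start: 3, End: 4},
	}
	// Here the read overlaps the write, so either order is acceptable.
	ok := []Op{
		{IsWrite: true, Val: 1, Start: 0, End: 4},
		{IsWrite: false, Val: 0, Start: 1, End: 3},
	}
	fmt.Println(linearizable(bad), linearizable(ok)) // false true
}
```

The first history is the buggy case from the earlier slide: the write completed, yet a strictly later read returned the old value, so no valid order exists. In the second history the read overlaps the write, so ordering the read first is acceptable.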

Intuition: real-time ordering
[Figure: write(A,1) committed and acknowledged; a later read(A) returns 1]
Once a write completes, all later reads (by wall-clock start time) should return the value of that write or of a later write
Once a read returns a particular value, all later reads should return that value or the value of a later write

Weaker: sequential consistency
Sequential consistency = linearizability minus the real-time ordering guarantee:
All servers execute all ops in some identical sequential order
The global ordering preserves each client's own local (program) ordering
With concurrent ops, "reordering" of ops with respect to real-time order is acceptable, but all servers must see the same order
In short: linearizability cares about real time; sequential consistency cares about program order

Sequential consistency
[Figure: write(A,1) acknowledged; a concurrent read(A) returns the old value]
In this example, the system orders read(A) before write(A,1)

Valid sequential consistency?
[Figure: two histories of writes W(x)a, W(x)b by P1/P2 and reads by P3/P4; one is valid, the other is not]
Why? In the invalid history, P3 and P4 don't agree on the order of ops. It doesn't matter when events took place on different machines, as long as the processes agree on the order.
What if P1 did both W(x)a and W(x)b? Then neither history would be valid, as (a) would no longer preserve P1's local ordering.

Tradeoffs are fundamental?
2PC / Consensus: Paxos / Raft
Eventual consistency: Dynamo

"CAP" Conjecture for Distributed Systems
From a keynote lecture by Eric Brewer (2000)
History: Brewer started Inktomi, an early Internet search site built on "commodity" clusters of computers
He used CAP to justify the "BASE" model: Basically Available, Soft-state services with Eventual consistency
Popular interpretation, 2-out-of-3:
Consistency (linearizability)
Availability
Partition tolerance: arbitrary crash/network failures

CAP Theorem: Proof (not consistent)
Gilbert, Seth, and Nancy Lynch. "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services." ACM SIGACT News 33.2 (2002): 51-59.

CAP Theorem: Proof (not available)

CAP Theorem: Proof (not partition tolerant)

CAP Theorem: AP or CP
Criticism: it's not really 2-out-of-3, because you can't "choose" to have no partitions
So the real choice is AP or CP

More tradeoffs: L vs. C
Low latency: can we speak to fewer than a quorum of nodes?
2PC: write N, read 1
Raft: write ⌊N/2⌋ + 1, read ⌊N/2⌋ + 1
General rule: |W| + |R| > N (see the sketch below)
L and C are fundamentally at odds
"C" here means linearizability, sequential consistency, serializability (more later)
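A small sketch of the general quorum condition |W| + |R| > N, with illustrative names: it holds exactly when every write quorum and every read quorum must overlap in at least one replica, which is what lets a read observe the latest committed write.

```go
package main

import "fmt"

// quorumsIntersect reports whether every write quorum of size w and every
// read quorum of size r out of n replicas must share at least one replica.
// That overlap is what guarantees a read sees the most recent committed write.
func quorumsIntersect(n, w, r int) bool {
	return w+r > n
}

func main() {
	n := 5
	// 2PC-style: write all, read any one (low read latency, writes block on every node).
	fmt.Println(quorumsIntersect(n, n, 1)) // true
	// Raft-style: majority writes and majority reads.
	fmt.Println(quorumsIntersect(n, n/2+1, n/2+1)) // true
	// Write a majority but read only one node: quorums may miss each other.
	fmt.Println(quorumsIntersect(n, n/2+1, 1)) // false: stale reads possible
}
```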

PACELC
If there is a partition (P): how does the system trade off A and C?
Else (no partition): how does the system trade off L and C?
Is there a useful system that switches?
Dynamo: PA/EL
"ACID" databases: PC/EC
http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html

More linearizable replication algorithms

Chain replication
[Figure: replicas arranged in a chain from head to tail; processes may be running on different machines]
Writes go to the head, which orders all writes
When a write reaches the tail, it is implicitly committed along the rest of the chain
Reads go to the tail, which orders reads with respect to committed writes
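A minimal sketch of the chain (invented names; failures and asynchrony elided): writes enter at the head and propagate down the chain in order; once a write reaches the tail it is committed, and reads are served by the tail so they see only committed writes.

```go
package main

import "fmt"

// Chain is an ordered list of replicas; index 0 is the head, the last is the tail.
type Chain struct {
	replicas []map[string]int
}

func NewChain(n int) *Chain {
	c := &Chain{}
	for i := 0; i < n; i++ {
		c.replicas = append(c.replicas, map[string]int{})
	}
	return c
}

// Write enters at the head and propagates down the chain in order.
// When it reaches the tail (the last replica), it is committed.
func (c *Chain) Write(key string, val int) {
	for _, r := range c.replicas {
		r[key] = val
	}
}

// Read is served by the tail, which only holds committed writes,
// so every read is ordered after all committed writes.
func (c *Chain) Read(key string) int {
	tail := c.replicas[len(c.replicas)-1]
	return tail[key]
}

func main() {
	c := NewChain(3)
	c.Write("A", 1)
	fmt.Println(c.Read("A")) // 1
}
```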

Chain replication for read-heavy workloads (CRAQ)
Goal: if all replicas have the same version, read from any one of them
Challenge: the replicas need to know that they have the correct version

Chain replication for read-heavy workloads (CRAQ)
Replicas maintain multiple versions of an object while it is "dirty", i.e., while it has uncommitted writes
Commitment is sent back "up" the chain after a write reaches the tail

Chain replication for read-heavy workloads (CRAQ)
A read of a dirty object must check with the tail for the proper version
This orders the read with respect to the global order, regardless of which replica handles it
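A sketch of CRAQ's read path under these rules, with hypothetical structure: a replica answers reads of clean objects locally, and for a dirty object it checks with the tail. (Real CRAQ asks the tail only for the committed version number and then serves that version from its own store; the sketch simply forwards the read.)

```go
package main

import "fmt"

// object holds the latest committed version of a key, plus a flag for
// whether an uncommitted (dirty) newer version is also present.
type object struct {
	committed int
	hasDirty  bool
}

// Replica is one node in the CRAQ chain.
type Replica struct {
	store map[string]*object
	tail  *Replica // nil if this replica is itself the tail
}

// Read serves clean objects locally; for dirty objects it checks with the
// tail, which orders the read with respect to the global commit order.
func (r *Replica) Read(key string) int {
	obj := r.store[key]
	if obj.hasDirty && r.tail != nil {
		return r.tail.Read(key) // version query to the tail (simplified)
	}
	return obj.committed
}

func main() {
	// The tail has already committed the write of 2.
	tail := &Replica{store: map[string]*object{"A": {committed: 2}}}
	// A mid-chain replica still considers the newer version dirty.
	mid := &Replica{
		store: map[string]*object{"A": {committed: 1, hasDirty: true}},
		tail:  tail,
	}
	fmt.Println(mid.Read("A")) // 2: the dirty read is resolved via the tail
}
```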

Performance: CR vs. CRAQ
[Figure: read throughput of CRAQ relative to CR, with 1x, 3x, and 7x markers]
R. van Renesse and F. B. Schneider. "Chain Replication for Supporting High Throughput and Availability." OSDI 2004.
J. Terrace and M. Freedman. "Object Storage on CRAQ: High-Throughput Chain Replication for Read-Mostly Workloads." USENIX ATC 2009.

Wednesday lecture: Causal Consistency