Life after CAP
Ali Ghodsi, alig@cs.berkeley.edu

CAP conjecture [reminder]
Can only have two of:
– Consistency
– Availability
– Partition-tolerance
Examples:
– Databases, 2PC, centralized algorithms (C & A)
– Distributed databases, majority protocols (C & P)
– DNS, Bayou (A & P)

CAP theorem
Formalization by Gilbert & Lynch. What does "impossible" mean?
– There exists an execution that violates one of C, A, or P
– It is not possible to guarantee that an algorithm has all three at all times
– Shard data with different CAP tradeoffs
– Detect partitions and weaken consistency

Partition-tolerance & availability
What is partition-tolerance?
– Consistency and availability are provided by the algorithm
– Partitions are external events (a scheduler/oracle)
– Partition-tolerance is really a failure model
– Partition-tolerance is equivalent to message-omission failures
In the CAP theorem:
– The proof rests on partitions that never heal
– Datacenters can guarantee recovery from partitions!
– Can guarantee that conflict resolution eventually happens

Availability
In the CAP theorem: "eventually get a response"
Too strong:
– Availability is typically probabilistic, e.g. 99.999% (five nines allows roughly five minutes of downtime per year)
Too weak:
– Availability typically comes with SLOs, e.g. respond within 1 second

How do we ensure consistency?
Main technique for staying consistent:
– The quorum principle
– Example: majority quorums (sketched below)
Always write to and read from a majority of nodes; any two majorities overlap, so at least one node in every read majority knows the most recent value.
[diagram: WRITE(v) to a majority, READ v from a majority; majority(9) = 5]

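A minimal sketch of the majority rule in Python (the Node class, the versioning scheme, and the single-writer assumption are illustrative, not from the talk):

```python
import random

class Node:
    """One replica holding a versioned value."""
    def __init__(self):
        self.version, self.value = 0, None

def write(nodes, value, version):
    # Contact any majority; every future read-majority overlaps this
    # one, so at least one contacted node has seen the write.
    for n in random.sample(nodes, len(nodes) // 2 + 1):
        if version > n.version:
            n.version, n.value = version, value

def read(nodes):
    # Contact any majority and return the highest-versioned value:
    # the overlap guarantees it includes the most recent write.
    sample = random.sample(nodes, len(nodes) // 2 + 1)
    return max(sample, key=lambda n: n.version).value

cluster = [Node() for _ in range(9)]   # majority(9) = 5, as on the slide
write(cluster, "v1", version=1)
print(read(cluster))                   # always prints "v1"
```
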
Quorum Principle
Majority quorum:
– Pro: tolerates up to ⌈N/2⌉ − 1 crashes
– Con: must read/write ⌊N/2⌋ + 1 values
Read/write quorums (Dynamo, ZooKeeper, Chain Replication):
– Read R nodes, write W nodes, such that R + W > N (and W > N/2, so two writes intersect)
– Pro: can tune the performance of reads vs. writes
– Con: availability can suffer
Maekawa quorums (sketched below):
– Arrange the nodes in an M×M grid
– Write to a row + column, read a column (they always overlap)
– Pro: only need to read/write O(√N) nodes
– Con: tolerates at most O(√N) crashes (without reconfiguration)
[diagram: a 3×3 grid of nodes P1–P9]

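A sketch of these grid quorums, assuming N = M×M nodes numbered 0..N−1 in row-major order (the encoding and function names are illustrative):

```python
def write_quorum(node, m):
    # The node's full row plus its full column: about 2*sqrt(N) nodes.
    row, col = divmod(node, m)
    return {row * m + c for c in range(m)} | {r * m + col for r in range(m)}

def read_quorum(col, m):
    # Any full column: sqrt(N) nodes. A column crosses every row, so it
    # intersects every write quorum (each contains a full row).
    return {r * m + col for r in range(m)}

m = 3                                   # the 3x3 grid P1..P9 on the slide
wq = write_quorum(4, m)                 # center node: row {3,4,5} + col {1,4,7}
rq = read_quorum(0, m)                  # leftmost column {0,3,6}
print(wq & rq)                          # {3}: the intersection is never empty
```
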
Probabilistic Quorums
Quorums of size α√N (α > 1) intersect with probability ≈ 1 − e^(−α²)
– Example: N = 16 nodes, quorum size 7 → intersects with ~95% probability, tolerates 9 failures
– Maekawa: N = 16 nodes, quorum size 7 → intersects with 100% probability, tolerates 4 failures
– Pro: small quorums, high fault-tolerance
– Con: quorums could fail to intersect; N usually needs to be large
A quick check of these numbers follows.

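A sketch reproducing the slide's arithmetic, using the standard probabilistic-quorum intersection estimate 1 − e^(−α²):

```python
import math

N, q = 16, 7
alpha = q / math.sqrt(N)            # 7/4 = 1.75
print(1 - math.exp(-alpha ** 2))    # ~0.953, i.e. the ~95% on the slide
print(N - q)                        # 9 crashes still leave one full quorum
```
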
Quorums and CAP
With quorums we can get:
– C & P: a partition can make the quorum unavailable
– C & A: in the absence of partitions, quorums give availability and atomicity
Decision faced when we fail to assemble a quorum [Brewer '11]:
– Sacrifice availability by waiting for the partition to merge
– Sacrifice atomicity by ignoring the quorum
Can we get CAP for weaker consistency?

What does atomicity really mean?
Linearization points:
– Read ops appear as if they happened immediately at all nodes, at some point between invocation and response
– Write ops appear as if they happened immediately at all nodes, at some point between invocation and response
[diagram: timelines of P1, P2, P3 showing W(5), W(6) and two reads, with invocation and response marked]

Definition of Atomicity
Same linearization-point definition as above, with the read values filled in:
[diagram: W(5) and W(6) with reads returning R:5 and R:6 — atomic]

Definition of Atomicity
[diagram 1: W(5) then W(6); the reads R:6 and R:5 overlap — atomic]
[diagram 2: W(5) then W(6); R:6 completes before R:5 starts — not atomic]
A brute-force check of both kinds of execution follows.

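The definition can be checked mechanically for small histories. Below is a brute-force sketch in Python; the encoding and the invocation/response timestamps are hypothetical reconstructions of the diagrams, not values from the talk:

```python
from itertools import permutations

def linearizable(history, initial=None):
    """history: tuples (proc, kind, value, invoke, respond), kind 'W'/'R'.
    True iff some total order respects real-time order and makes every
    read return the latest preceding write (register semantics)."""
    for order in permutations(history):
        pos = {op: i for i, op in enumerate(order)}
        # Real-time order: if a responds before b is invoked, a precedes b.
        if any(a[4] < b[3] and pos[a] > pos[b]
               for a in history for b in history):
            continue
        value, ok = initial, True
        for _, kind, v, _, _ in order:
            if kind == 'W':
                value = v
            elif v != value:        # a read must return the latest write
                ok = False
                break
        if ok:
            return True
    return False

# Atomic: R:5 overlaps W(6), so R:5 can linearize before W(6).
atomic = [('P1', 'W', 5, 0, 1), ('P2', 'W', 6, 2, 5),
          ('P3', 'R', 5, 3, 4), ('P1', 'R', 6, 6, 7)]
# Not atomic: R:6 completes strictly before R:5 starts.
not_atomic = [('P1', 'W', 5, 0, 1), ('P2', 'W', 6, 2, 3),
              ('P3', 'R', 6, 4, 5), ('P1', 'R', 5, 6, 7)]
print(linearizable(atomic), linearizable(not_atomic))  # True False
```
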
Atomicity too strong?
[diagram: W(5) then W(6); R:6 completes before R:5 — not atomic]
Are linearization points too strong?
– Why not just have R:5 appear atomically right after W(5)?
– Lamport: "What if P2's operator phones P1 and tells her 'I just read 6'?"

Atomicity too strong?
[diagram: the same not-atomic execution — but it is sequentially consistent]
Sequential consistency:
– Weaker than atomicity
– Removes the "real-time" requirement
– Any global ordering is OK, as long as it respects each process's local ordering
– Does Gilbert & Lynch's proof fall apart for sequential consistency?
Causal memory:
– Weaker than sequential consistency
– No need for a global view; each process can have a different view
– Local: reads and writes return immediately to the caller
– The CAP theorem does not apply to causal memory
[diagram: P1 does W(0) then R:1, P2 does W(1) then R:0 — causally consistent; see the check below]

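A small brute-force check (illustrative encoding) that the causal example above is not sequentially consistent: no interleaving that preserves each process's program order makes both reads legal.

```python
from itertools import permutations

P1 = [('W', 0), ('R', 1)]           # P1 writes 0, then reads 1
P2 = [('W', 1), ('R', 0)]           # P2 writes 1, then reads 0
ops = [('P1', 0), ('P1', 1), ('P2', 0), ('P2', 1)]

def legal(order):
    # Program order must be preserved within each process.
    for p in ('P1', 'P2'):
        if order.index((p, 0)) > order.index((p, 1)):
            return False
    # Every read must return the value of the latest preceding write.
    value = None
    for p, i in order:
        kind, v = (P1 if p == 'P1' else P2)[i]
        if kind == 'W':
            value = v
        elif v != value:
            return False
    return True

print(any(legal(o) for o in permutations(ops)))  # False: not seq. consistent
```

Each process in isolation still sees a causally valid view, which is why causal memory accepts this execution.
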
Going really weak
Eventual consistency:
– When the network is not partitioned, all nodes eventually have the same value
– I.e., don't be "consistent" at all times, only after partitions heal!
Based on a powerful technique: gossiping
– Periodically exchange "logs" with one random node
– Exchanges must be constant-sized packets
– Set reconciliation, Merkle trees, etc.
– Use (clock, node_id) to break ties between events in the log
Properties of gossiping (a toy round is sketched below):
– All nodes will have the same value in O(log N) time
– No positive-feedback cycles that congest the network

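A toy anti-entropy round illustrating both properties, assuming a single last-writer-wins register ordered by (clock, node_id) (the state encoding and constants are illustrative):

```python
import random

def gossip_round(states):
    # Each node exchanges state with one random peer; both keep the
    # entry with the highest (clock, node_id) — a constant-size message.
    n = len(states)
    for i in range(n):
        j = random.randrange(n)
        states[i] = states[j] = max(states[i], states[j])

# 64 nodes; node 0 holds a fresh write tagged (clock=1, node_id=0).
states = [(0, i, None) for i in range(64)]
states[0] = (1, 0, "new-value")

rounds = 0
while len(set(states)) > 1:         # run until every node agrees
    gossip_round(states)
    rounds += 1
print(rounds)                       # typically a handful of rounds: O(log N)
```
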
BASE
A catch-all for any consistency model C′ that enables C′-A-P:
– Eventual consistency
– PRAM consistency
– Causal consistency
Main ingredients:
– Stale data
– Soft state (regenerable state)
– Approximate answers

Summary
No need to ensure CAP at all times:
– Switch between algorithms, or satisfy different subsets at different times
Weaken the consistency model:
– Choose weaker consistency: causal memory (relatively strong) works around CAP
– Only be consistent when the network isn't partitioned: eventual consistency (very weak) works around CAP
Weaken partition-tolerance:
– Some environments never partition, e.g. datacenters
– Tolerate unavailability in small quorums
– Some environments have recovery guarantees (partitions heal within X hours): perform conflict resolution when they do

Related Work (ignored in talk)
PRAM consistency (Pipelined RAM):
– Weaker than causal consistency, and non-blocking
Eventual linearizability (PODC '10):
– Becomes atomic after quiescent periods
Gossiping & set reconciliation:
– Lots of related work