Download presentation
Presentation is loading. Please wait.
Published byRandolf Flowers Modified over 9 years ago
1
Conflict-free Replicated Data Types MARC SHAPIRO, NUNO PREGUIÇA, CARLOS BAQUERO AND MAREK ZAWIRSKI Presented by: Ron Zisman
2
Motivation Replication and Consistency - essential features of large distributed systems such as www, p2p, and cloud computing Lots of replicas Great for fault-tolerance and read latency × Problematic when updates occur Slow synchronization Conflicts in case of no synchronization 2
3
Motivation We look for an approach that: supports Replication guarantees Eventual Consistency is Fast and Simple Conflict-free objects = no synchronization whatsoever Is this practical? 3
4
Contributions Theory Strong Eventual Consistency (SEC) A solution to the CAP problem Formal definitions Two sufficient conditions Strong equivalence between the two Incomparable to sequential consistency Practice CRDTs = Convergent or Commutative Replicated Data Types Counters Set Directed graph 4
5
Strong Consistency Ideal consistency: all replicas know about the update immediately after it executes Preclude conflicts Replicas update in the same total order Any deterministic object Consensus Serialization bottleneck Tolerates < n/2 faults Correct, but doesn’t scale 5
6
Strong Consistency 6 Ideal consistency: all replicas know about the update immediately after it executes Preclude conflicts Replicas update in the same total order Any deterministic object Consensus Serialization bottleneck Tolerates < n/2 faults Correct, but doesn’t scale
7
Strong Consistency 7 Ideal consistency: all replicas know about the update immediately after it executes Preclude conflicts Replicas update in the same total order Any deterministic object Consensus Serialization bottleneck Tolerates < n/2 faults Correct, but doesn’t scale
8
Strong Consistency 8 Ideal consistency: all replicas know about the update immediately after it executes Preclude conflicts Replicas update in the same total order Any deterministic object Consensus Serialization bottleneck Tolerates < n/2 faults Correct, but doesn’t scale
9
Strong Consistency 9 Ideal consistency: all replicas know about the update immediately after it executes Preclude conflicts Replicas update in the same total order Any deterministic object Consensus Serialization bottleneck Tolerates < n/2 faults Correct, but doesn’t scale
10
Eventual Consistency Update local and propagate No foreground synch Eventual, reliable delivery On conflict Arbitrate Roll back Consensus moved to background Better performance × Still complex 10
11
Eventual Consistency 11 Update local and propagate No foreground synch Eventual, reliable delivery On conflict Arbitrate Roll back Consensus moved to background Better performance × Still complex
12
Eventual Consistency 12 Update local and propagate No foreground synch Eventual, reliable delivery On conflict Arbitrate Roll back Consensus moved to background Better performance × Still complex
13
Eventual Consistency 13 Update local and propagate No foreground synch Eventual, reliable delivery On conflict Arbitrate Roll back Consensus moved to background Better performance × Still complex
14
Eventual Consistency 14 Update local and propagate No foreground synch Eventual, reliable delivery On conflict Arbitrate Roll back Consensus moved to background Better performance × Still complex
15
Eventual Consistency 15 Update local and propagate No foreground synch Eventual, reliable delivery On conflict Arbitrate Roll back Consensus moved to background Better performance × Still complex
16
Eventual Consistency Reconcile 16 Update local and propagate No foreground synch Eventual, reliable delivery On conflict Arbitrate Roll back Consensus moved to background Better performance × Still complex
17
Strong Eventual Consistency Update local and propagate No synch Eventual, reliable delivery No conflict deterministic outcome of concurrent updates No consensus: ≤ n-1 faults Solves the CAP problem 17
18
Strong Eventual Consistency 18 Update local and propagate No synch Eventual, reliable delivery No conflict deterministic outcome of concurrent updates No consensus: ≤ n-1 faults Solves the CAP problem
19
Strong Eventual Consistency 19 Update local and propagate No synch Eventual, reliable delivery No conflict deterministic outcome of concurrent updates No consensus: ≤ n-1 faults Solves the CAP problem
20
Strong Eventual Consistency 20 Update local and propagate No synch Eventual, reliable delivery No conflict deterministic outcome of concurrent updates No consensus: ≤ n-1 faults Solves the CAP problem
21
Strong Eventual Consistency 21 Update local and propagate No synch Eventual, reliable delivery No conflict deterministic outcome of concurrent updates No consensus: ≤ n-1 faults Solves the CAP problem
22
Definition of EC Eventual delivery: An update delivered at some correct replica is eventually delivered to all correct replicas Termination: All method executions terminate Convergence: Correct replicas that have delivered the same updates eventually reach equivalent state Doesn’t preclude roll backs and reconciling 22
23
Definition of SEC 23 Eventual delivery: An update delivered at some correct replica is eventually delivered to all correct replicas Termination: All method executions terminate Strong Convergence: Correct replicas that have delivered the same updates have equivalent state
24
System model System of non- byzantine processes interconnected by an asynchronous network Partition-tolerance and recovery What are the two simple conditions that guarantee strong convergence? 24
25
Query Client sends the query to any of the replicas Local at source replica Evaluate synchronously, no side effects 25
26
Query Client sends the query to any of the replicas Local at source replica Evaluate synchronously, no side effects 26
27
Query Client sends the query to any of the replicas Local at source replica Evaluate synchronously, no side effects 27
28
State-based approach 28 payload set initial state query update merge
29
State-based replication 29
30
State-based replication 30
31
State-based replication 31
32
State-based replication 32
33
State-based replication 33
34
Semi-lattice 34
35
If: then replicas converge to LUB of last values 35 payload type forms a semi-lattice updates are increasing merge computes Least Upper Bound
36
Operation-based approach 36 payload set initial state query prepare-update effect-update delivery precondition
37
Operation-based replication 37 Local at source Precondition, compute Broadcast to all replicas
38
Operation-based replication 38 Local at source Precondition, compute Broadcast to all replicas Eventually, at all replicas: Downstream precondition Assign local replica
39
Operation-based replication 39 Local at source Precondition, compute Broadcast to all replicas Eventually, at all replicas: Downstream precondition Assign local replica
40
If: then replicas converge Liveness: all replicas execute all operations in delivery order where the downstream precondition (P) is true Safety: concurrent operations all commute 40
41
A state-based object can emulate an operation-based object, and vice-versa Use state-based reasoning and then covert to operation based for better efficiency 41
42
Comparison State-based Update ≠ merge operation Simple data types State includes preceding updates; no separate historical information Inefficient if payload is large File systems (NFS, Dynamo) Operation-based Update operation Higher level, more complex More powerful, more constraining Small messages Collaborative editing (Treedoc), Bayou, PNUTS State-based or op-based, as convenient 42
43
SEC is incomparable to sequential consistency 43
44
SEC is incomparable to sequential consistency 44
45
Example CRDTs Multi-master counter Observed-Remove Set Directed Graph 45
46
Multi-master counter 46
47
Multi-master counter 47
48
Set design alternatives Sequential specification: {true} add(e) {e ∈ S} {true} remove(e) {e ∈ S} Concurrent: {true} add(e) ║ remove(e) {???} linearizable? error state? last writer wins? add wins? remove wins? 48
49
Observed-Remove Set 49
50
Observed-Remove Set 50
51
Observed-Remove Set 51
52
Observed-Remove Set 52
53
Observed-Remove Set 53
54
Observed-Remove Set 54
55
Observed-Remove Set 55
56
Observed-Remove Set 56
57
OR-Set + Snapshot 57
58
Sharded OR-Set Very large objects Independent shards Static: hash, Dynamic: consensus Statically-Sharded CRDT Each shard is a CRDT Update: single shard No cross-object invariants A combination of independent CRDTs remains a CRDT Statically-Sharded OR-Set Combination of smaller OR-Sets Consistent snapshots: clock cross shards 58
59
Directed Graph – Motivation Design a web search engine compute page rank by a directed graph Efficiency and scalability Asynchronous processing Responsiveness Incremental processing, as fast as each page is crawled 59 Operations Find new pages: add vertex Parse page links: add/remove arc Add URLs of linked pages to be crawled: add vertex Deleted pages: remove vertex (lookup masks incident arcs) Broken links allowed: add arc works even if tail vertex doesn’t exist
60
Graph design alternatives 60
61
Directed Graph (op-based) Payload: OR-Set V (vertices), OR-Set A (arcs) 61
62
Directed Graph (op-based) Payload: OR-Set V (vertices), OR-Set A (arcs) 62
63
Summary Principled approach Strong Eventual Consistency Two sufficient conditions: State: monotonic semi-lattice Operation: commutativity Useful CRDTs Multi-master counter, OR-Set, Directed Graph 63
64
Future Work Theory Class of computations accomplished by CRDTs Complexity classes of CRDTs Classes of invariants supported by a CRDT CRDTs and self-stabilization, aggregation, and so on Practice Library implementation of CRDTs Supporting non-critical synchronous operations (commiting a state, global reset, etc) Sharding 64
65
Extras: MV-Register and the Shopping Cart Anomaly 65 MV-Register ≈ LWW-Set Register Payload = { (value, versionVector) } assign: overwrite value, vv++ merge: union of every element in each input set that is not dominated by an element in the other input set A more recent assignment overwrites an older one Concurrent assignments are merged by union (VC merge)
66
Extras: MV-Register and the Shopping Cart Anomaly 66 Shopping cart anomaly deleted element reappears MV-Register does not behave like a set Assignment is not an alternative to proper add and remove operations
67
The problem with eventual consistency jokes is that you can't tell who doesn't get it from who hasn't gotten it. 67
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.