Published by Mervin Hodge. Modified over 9 years ago.
1
CAP + Clocks Time keeps on slipping, slipping…
2
Logistics Last week’s slides online Sign up on Piazza now – No really, do it now Papers are loaded in HotCRP – Sign up for account at http://cs7780.ccs.neu.edu – I will make you a PC member so you can enter “reviews” – Do review preferences
3
Logistics (2) Some changes to the schedule – NENS field trip – Moved other stuff down Don’t forget about project proposals – Details on content on the website
4
Logistics (3) Audit policy – You must present at least one paper – You must read all of them and contribute to the discussion (no dead weight) – No project or project presentation requirement
5
CAP “Theorem” and Implications
6
Desirable properties of data systems ACID transactions – Atomic: Either all of the transaction happens or none of it does – Consistent: After a transaction, all integrity constraints (e.g., uniqueness properties) are maintained – Isolated: Concurrent transactions do not affect one another – Durable: Once complete, the transaction remains complete
7
Move to a distributed environment CAP – Consistency: updates to data are applied to all or none – Availability: must be able to access all data – Partition tolerance: failures can partition the network into subtrees Why should we want all three? Is it possible?
8
Eric Brewer’s CAP “theorem” The Brewer Conjecture – No system can simultaneously achieve C and A and P – Implication: must perform tradeoffs to obtain 2 at the expense of the 3rd – Never formally published as a paper by Brewer (presented in his PODC 2000 keynote), but widely recognized
9
CAP Examples
A+P: Availability – Client can always read. Impact of partitions – Not consistent
C+P: Consistency – Reads always return accurate results. Impact of partitions – No availability
[Figure, two panels (A+P and C+P): Write (key, 1), Replicate, (key, 2), Read. The A+P panel shows the read returning (key, 1); the C+P panel shows "Error: Service Unavailable".]
What about C+A? – Doesn’t really exist – Partitions are always possible – Tradeoffs must be made to cope with them
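The A+P and C+P cases on this slide can be sketched with a toy replica. This is only an illustration under stated assumptions: the `Replica` class, the `"AP"`/`"CP"` mode labels, and the `run` helper are hypothetical names, and the partition is modeled as a flag that silently drops replication messages.

```python
class ServiceUnavailable(Exception):
    """Raised by the C+P replica when it cannot confirm it is up to date."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "AP" or "CP" (hypothetical labels)
        self.store = {}
        self.partitioned = False  # True while cut off from the writer

    def apply_replication(self, key, value):
        # Replication messages are lost while the network is partitioned.
        if not self.partitioned:
            self.store[key] = value

    def read(self, key):
        # C+P: refuse to answer rather than risk returning stale data.
        if self.partitioned and self.mode == "CP":
            raise ServiceUnavailable("Error: Service Unavailable")
        # A+P: always answer, even if the value may be stale.
        return self.store.get(key)


def run(mode):
    replica = Replica(mode)
    replica.apply_replication("key", 1)  # Write (key, 1) is replicated
    replica.partitioned = True           # partition begins
    replica.apply_replication("key", 2)  # Write (key, 2) never arrives
    try:
        return replica.read("key")
    except ServiceUnavailable as err:
        return str(err)


ap_result = run("AP")  # stale value 1: available but not consistent
cp_result = run("CP")  # error string: consistent but not available
```

In this sketch the A+P replica answers with the old value, while the C+P replica gives up availability to avoid a wrong answer.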
10
Proof of Theorem Strong proof when no clocks – No way to reliably stay consistent What changes with clocks?
11
Weak consistency Use clocks, impose partial ordering – Allows a form of “eventual consistency” Key result that drives many of today’s systems – “return most of the data most of the time” – “usually correct results” Where is this not ok?
12
12 years later: Have the “rules” changed? What if partitions are rare? – Then most of the time C/A are maintained C, A, and P don’t need to be binary properties Choosing between C and A is not a fixed decision; it can change over time
13
How long is the partition? We can’t “choose” to never have P – But we can put constraints on how the system operates within some threshold time – Primary partitions enable partial availability – But require knowing invariants to reconcile consistency after partition is done
14
So what do you do when there is a partition? Detection Enter “partition mode” – Why? Recover after partition is done – What are the challenges?
15
Recovery We’ve all done this – (CVS, SVN, git, hg) How does it work? – Commutative operations – Commutative replicated data types Special data types will come up again and again
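One well-known commutative replicated data type is a grow-only counter. The sketch below is an assumption-laden illustration, not something taken from the slides: the `GCounter` class and its method names are hypothetical. Each node only increments its own slot, and merging takes the per-node maximum, so merges can be applied in any order and still converge.

```python
class GCounter:
    """Grow-only counter: one slot per node, merged by per-node maximum."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount=1):
        # A node only ever touches its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Per-node max is commutative, associative, and idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas accept updates independently during a partition.
a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)

# After the partition heals, each side merges the other's state.
a.merge(b)
b.merge(a)
b.merge(a)  # merging again changes nothing

total_a, total_b = a.value(), b.value()
```

Both replicas end with the same total without any coordination during the partition, which is the property that makes recovery simple for this kind of data type.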
16
Exercise Let’s match popular applications to consistency/availability models
17
Time and Clocks
18
Time It’s pretty important, eh? It marches on Only goes forward …
19
Time It defines a distributed system “…message transmission time is not negligible compared to the time between events in a single process”
20
Partial time ordering No way for all clocks to have exactly the same time all the time – Why? Instead, focus on a partial ordering – “happens before” relationship – use it to build an (arbitrary) total ordering – this does not use physical time What is the difference?
21
Scalar Clocks General technique described by Leslie Lamport (from 1978!!) – Explicitly maps out time as a sequence of version numbers at each participant Key idea – Each process maintains a counter (“clock”) – When it sends a message, it includes the counter – When receiving a message, each process updates its clock to be greater than the timestamp in the message
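The key idea on this slide can be sketched as follows. The `LamportClock` class and its method names are hypothetical, chosen for illustration. The `stamp` method pairs the counter with a process id, which is one common way to break ties and obtain an arbitrary total ordering without physical time.

```python
class LamportClock:
    def __init__(self, pid):
        self.pid = pid
        self.time = 0  # the scalar "clock"

    def tick(self):
        # Any local event advances the counter.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned counter travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Move strictly past both our own clock and the message timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time

    def stamp(self):
        # (counter, pid) pairs sort into a total order; pid breaks ties.
        return (self.time, self.pid)


p, q = LamportClock("p"), LamportClock("q")
p.tick()                    # local event at p
sent = p.send()             # p's counter when the message leaves
received = q.receive(sent)  # q's counter after receipt

# Events with equal counters are ordered by process id.
ordered = sorted([(2, "q"), (2, "p"), (1, "q")])
```

Here the receive event always gets a larger timestamp than the send event, so the scalar order never contradicts "happens before".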
22
Nice applications of Logical Clocks Fully distributed – Mutual exclusion – Fairness – Liveness
23
Physical clocks What do they give us?
24
Vector clocks The idea – A vector clock is a list of (node, counter) pairs – Every version of every object has one vector clock Detecting causality – If each of A’s counters is less than or equal to the corresponding counter in B, then A is an ancestor of B, and can be forgotten – Intuition: A was applied to every node before B was applied to any node. Therefore, A precedes B Use vector clocks to perform syntactic reconciliation
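The causality check can be sketched as a comparison of two clocks. This is a minimal illustration that assumes a clock is a Python dict from node name to counter, with a missing node treated as counter 0. The function names `descends` and `compare` are hypothetical.

```python
def descends(b, a):
    """True if every counter in a is <= the corresponding counter in b."""
    return all(b.get(node, 0) >= counter for node, counter in a.items())


def compare(a, b):
    a_le_b = descends(b, a)
    b_le_a = descends(a, b)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "a before b"   # a is an ancestor of b and can be forgotten
    if b_le_a:
        return "b before a"
    return "concurrent"       # neither precedes the other: must reconcile


ancestor = compare({"Sx": 1}, {"Sx": 2})
conflict = compare({"Sx": 2, "Sy": 1}, {"Sx": 2, "Sz": 1})
```

When the result is "concurrent", the clocks alone cannot pick a winner, so the two versions have to be reconciled some other way.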
25
Simple Vector Clock Example Key features – Writes always succeed – Reconcile on read Possible issues – Large vector sizes – Need to be trimmed Solution – Add timestamps – Trim oldest nodes – Can introduce error
[Figure: version history. Write by Sx: D1 ([Sx, 1]), D2 ([Sx, 2]). Write by Sy: D3 ([Sx, 2], [Sy, 1]). Write by Sz: D4 ([Sx, 2], [Sz, 1]). Read reconcile: D5 ([Sx, 2], [Sy, 1], [Sz, 1]).]
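The version history in this example can be reproduced with two small operations. This is a sketch under assumptions: clocks are dicts from node name to counter, a write increments the writing node's counter, and reconciliation takes the per-node maximum, which matches the D5 clock shown on the slide. The helper names `increment`, `merge`, and `dominates` are hypothetical.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a, b):
    """Per-node maximum of two clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def dominates(a, b):
    """True if a has seen everything b has."""
    return all(a.get(n, 0) >= c for n, c in b.items())


d1 = increment({}, "Sx")   # write by Sx
d2 = increment(d1, "Sx")   # write by Sx
d3 = increment(d2, "Sy")   # write by Sy, based on D2
d4 = increment(d2, "Sz")   # write by Sz, also based on D2

# D3 and D4 are siblings: neither clock dominates the other.
siblings = not dominates(d3, d4) and not dominates(d4, d3)

d5 = merge(d3, d4)         # read reconcile
```

The merged clock dominates both D3 and D4, so once D5 is stored both siblings can be dropped.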
26
Vector clock properties Isomorphic: Timestamps can be used to get causal ordering (and vice versa) Strong consistency Event counting Useful for distributed debugging, causal ordering, etc
27
Matrix time Add another dimension – Everyone has a consistent view of everyone else’s clock – Can be used to “prune” stale information
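One way to picture the pruning step is shown below. It is an illustration only, assuming each node keeps a matrix whose row j is its latest known copy of node j's vector clock. The `matrix` and `log` contents and the `known_by_all` helper are made up for the example. An event from node k is safe to discard once every row has a counter for k at least as large as the event's counter.

```python
def known_by_all(matrix, node):
    """Largest counter for `node` that every row is known to have reached."""
    return min(row.get(node, 0) for row in matrix.values())


# This node's view: row j is what it believes node j has seen.
matrix = {
    "A": {"A": 5, "B": 3, "C": 2},
    "B": {"A": 4, "B": 3, "C": 1},
    "C": {"A": 2, "B": 2, "C": 2},
}

# Buffered events as (origin node, counter) pairs.
log = [("A", 1), ("A", 2), ("A", 3), ("B", 3), ("C", 1)]

# Keep only events that some node may not have seen yet.
kept = [(n, c) for n, c in log if c > known_by_all(matrix, n)]
```

With these numbers, everyone is known to have seen A's events up to 2, B's up to 2, and C's up to 1, so only `("A", 3)` and `("B", 3)` still need to be retained.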
28
Take-aways Consistency, Availability, Partitions – Can we do them all together? – Depends on goals, time Time is relative – Order matters – Logical ordering and causality can give us total ordering consistency