Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 7/A Consistency And Replication Modified by Dr. Gheith Abandah
Overview
- Reasons for Replication
- Data-centric Consistency Models
  - Continuous Consistency
  - Consistent Ordering of Operations
- Client-centric Consistency Models
  - Eventual Consistency
  - Monotonic Reads
  - Monotonic Writes
  - Read Your Writes
  - Writes Follow Reads
- Replica Management
  - Replica-Server Placement
  - Content Replication and Placement
  - Content Distribution
- Consistency Protocols
  - Continuous Consistency
  - Primary-Based Protocols
  - Replicated-Write Protocols
  - Cache-Coherence Protocols
  - Implementing Client-Centric Consistency
Reasons for Replication
- Data are replicated to increase the reliability of a system.
- Replication for performance:
  - Scaling in numbers
  - Scaling in geographical area
- Caveat: the gain in performance must be weighed against the cost of the increased bandwidth needed to maintain the replicas.
Data-centric Consistency Models
Figure: The general organization of a logical data store, physically distributed and replicated across multiple processes.
Consistency model: A contract between a (distributed) data store and processes, in which the data store specifies precisely what the results of read and write operations are in the presence of concurrency.
Continuous Consistency (1)
Continuous consistency is often implemented as a toolkit that leaves the choice to the programmer.
Observation: We can actually talk about a degree of consistency:
- replicas may differ in their numerical value
- replicas may differ in their relative staleness
- replicas may differ in the number and order of performed update operations
Conit: Consistency unit; specifies the data unit over which consistency is to be measured.
Continuous Consistency (2)
Figure: Two replicas, A and B, of a conit that contains the variables x and y.
- Each replica maintains a vector clock.
- B sends A the operation x := x + 2, issued at time 5.
- A has made this operation permanent (it cannot be rolled back).
Continuous Consistency (3)
- A has three pending operations, so its order deviation is 3; B has two pending operations, so its order deviation is 2.
- A has missed one operation from B, yielding a maximum difference of 5 units, so its numerical deviation is (1, 5).
- B has missed three operations from A, so its numerical deviation is (3, 6).
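The counts in this example can be reproduced with a minimal sketch. It assumes each operation is identified by a (timestamp, origin) pair and that each replica tracks which operations it has committed and which are still tentative. The `Replica` class, the function names, and every timestamp other than 5 are hypothetical choices made for illustration. Only the two counts are computed; the "maximum difference" component of the numerical deviation is left out, because how it is weighed depends on the application.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    committed: set = field(default_factory=set)  # operations made permanent
    tentative: set = field(default_factory=set)  # applied locally, may still be rolled back

    def known(self):
        return self.committed | self.tentative


def order_deviation(replica):
    """Number of tentative (pending) operations at this replica."""
    return len(replica.tentative)


def missed_operations(replica, other):
    """Operations that originated at `other` and have not yet reached `replica`."""
    from_other = {op for op in other.known() if op[1] == other.name}
    return from_other - replica.known()


# Assumed operation ids: (timestamp, origin). Only (5, "B") comes from the slides.
replica_a = Replica("A", committed={(5, "B")},
                    tentative={(8, "A"), (12, "A"), (14, "A")})
replica_b = Replica("B", tentative={(5, "B"), (10, "B")})

print(order_deviation(replica_a), len(missed_operations(replica_a, replica_b)))
print(order_deviation(replica_b), len(missed_operations(replica_b, replica_a)))
```

With these assumed operation sets, A reports an order deviation of 3 and one missed operation, and B reports an order deviation of 2 and three missed operations, matching the first components of (1, 5) and (3, 6).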
Continuous Consistency (4)
Choosing the appropriate granularity for a conit: an example in which two replicas may differ by no more than one outstanding update.
(a) Large granularity: two updates lead to update propagation.
Continuous Consistency (5)
(b) Small granularity: no update propagation is needed (yet), but at the cost of larger housekeeping overhead.
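The effect of conit granularity in cases (a) and (b) can be sketched as follows. The sketch assumes two data items, here called x and y, each updated once, and a bound of one outstanding update per conit. The function name, the item names, and the conit names are hypothetical.

```python
from collections import Counter


def conits_needing_propagation(updated_items, conit_of, bound=1):
    """Return the conits whose number of outstanding updates exceeds the bound."""
    pending = Counter(conit_of[item] for item in updated_items)
    return {conit for conit, count in pending.items() if count > bound}


updates = ["x", "y"]  # one outstanding update on each item

# (a) Large granularity: both items belong to the same conit.
large = conits_needing_propagation(updates, {"x": "C", "y": "C"})

# (b) Small granularity: each item is its own conit.
small = conits_needing_propagation(updates, {"x": "Cx", "y": "Cy"})

print(large)  # conit C exceeds the bound, so its updates must be propagated
print(small)  # no conit exceeds the bound yet
```

The price of case (b) is that the replica now tracks two counters instead of one, which is the housekeeping overhead the slide refers to.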
Consistent Ordering of Operations
Models from the field of concurrent programming:
- Sequential consistency
- Causal consistency
- Grouping operations
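Sequential Consistency (1)
Figure: Example of the notation. Behavior of two processes operating on the same data item x. The horizontal axis is time.
In this notation, Wi(x)a denotes a write by process Pi of the value a to data item x, and Ri(x)b denotes a read of x by Pi that returns b.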
Sequential Consistency (2)
A data store is sequentially consistent when: The result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program.
Sequential Consistency (3)
(a) A sequentially consistent data store.
(b) A data store that is not sequentially consistent (P3 and P4 do not see the writes in the same order).
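One way to make the definition concrete is a brute-force check: try every interleaving that preserves each process's program order, and accept if some interleaving explains every value read. The sketch below does this for small histories. The function name and the tuple encoding of operations are hypothetical, and the two histories are assumed to match cases (a) and (b) of the figure, which is not reproduced in the text.

```python
def is_sequentially_consistent(histories, initial=None):
    """histories: one list per process of ("W", var, value) or ("R", var, value)."""
    n = len(histories)

    def search(positions, store):
        # positions[i] is the index of the next operation of process i.
        if all(positions[i] == len(histories[i]) for i in range(n)):
            return True
        for i in range(n):
            if positions[i] == len(histories[i]):
                continue
            kind, var, value = histories[i][positions[i]]
            advanced = positions[:i] + (positions[i] + 1,) + positions[i + 1:]
            if kind == "W":
                if search(advanced, {**store, var: value}):
                    return True
            elif store.get(var, initial) == value:
                # A read can be scheduled only if it returns the current value.
                if search(advanced, store):
                    return True
        return False

    return search((0,) * n, {})


# Assumed histories: P1 writes a, P2 writes b, P3 and P4 each read x twice.
case_a = [
    [("W", "x", "a")],
    [("W", "x", "b")],
    [("R", "x", "b"), ("R", "x", "a")],
    [("R", "x", "b"), ("R", "x", "a")],
]
case_b = [
    [("W", "x", "a")],
    [("W", "x", "b")],
    [("R", "x", "b"), ("R", "x", "a")],
    [("R", "x", "a"), ("R", "x", "b")],
]

result_a = is_sequentially_consistent(case_a)
result_b = is_sequentially_consistent(case_b)
print(result_a, result_b)
```

The search is exponential in the number of operations, so it is only meant for checking slide-sized examples, not for real traces.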
Sequential Consistency (4)
Example: Three concurrently executing processes.
Sequential Consistency (5)
Figure: Four valid execution sequences for the processes of the previous example. The vertical axis is time.
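The figure with the three processes is not reproduced in the text. The sketch below assumes the usual form of this example: x, y, and z start at 0, and each process assigns 1 to one variable and then prints the other two. Under that assumption it enumerates every execution sequence that preserves program order and records each sequence's signature, meaning the printed output of P1, P2, and P3 concatenated in that order. All names in the code are hypothetical.

```python
def interleavings(programs):
    """Yield every merge of the programs that preserves each program's own order."""
    if all(not program for program in programs):
        yield []
        return
    for i, program in enumerate(programs):
        if program:
            rest = programs[:i] + [program[1:]] + programs[i + 1:]
            for tail in interleavings(rest):
                yield [(i, program[0])] + tail


# Assumed programs: assign 1 to one variable, then print the other two.
PROGRAMS = [
    [("assign", "x"), ("print", "y", "z")],
    [("assign", "y"), ("print", "x", "z")],
    [("assign", "z"), ("print", "x", "y")],
]


def signature(schedule):
    """Concatenate the outputs of P1, P2, P3 produced by one execution sequence."""
    memory = {"x": 0, "y": 0, "z": 0}
    output = ["", "", ""]
    for pid, statement in schedule:
        if statement[0] == "assign":
            memory[statement[1]] = 1
        else:
            output[pid] = f"{memory[statement[1]]}{memory[statement[2]]}"
    return "".join(output)


schedules = list(interleavings(PROGRAMS))
signatures = {signature(schedule) for schedule in schedules}
print(len(schedules), sorted(signatures))
```

Under these assumptions there are 90 order-preserving sequences. The signature 000000 never occurs, because whichever print statement runs last sees all three assignments.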
Causal Consistency (1)
For a data store to be considered causally consistent, it must obey the following condition:
- Writes that are potentially causally related must be seen by all processes in the same order.
- Concurrent writes may be seen in a different order on different machines.
Causal Consistency (2)
Figure: This sequence is allowed with a causally consistent store, but not with a sequentially consistent store. W1(x)c and W2(x)b are concurrent.
Causal Consistency (3)
W1(x)a and W2(x)b are causally related.
(a) A violation of a causally consistent store.
Causal Consistency (4)
W1(x)a and W2(x)b are not causally related.
(b) A correct sequence of events in a causally consistent store.
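The difference between cases (a) and (b) can be sketched as a small check. It takes the pairs of writes that are causally related and, for each process, the order in which it observed the writes, and it reports the first process that observed a causally related pair in the wrong order. The function name and the observation lists are hypothetical; the lists assume that one reader sees b before a and the other sees a before b, since the figure itself is not reproduced in the text.

```python
def first_causal_violation(causal_pairs, observed):
    """Return (process, earlier, later) for the first violation found, else None.

    causal_pairs: list of (earlier, later) write labels, where `earlier`
                  causally precedes `later`.
    observed:     dict mapping a process name to the list of writes in the
                  order that process saw them.
    """
    for earlier, later in causal_pairs:
        for process, seen in observed.items():
            if earlier in seen and later in seen:
                if seen.index(later) < seen.index(earlier):
                    return process, earlier, later
    return None


# Assumed observations for the two reading processes.
observed = {
    "P3": ["W2(x)b", "W1(x)a"],
    "P4": ["W1(x)a", "W2(x)b"],
}

# (a) The writes are causally related: P3 sees them in the wrong order.
violation_a = first_causal_violation([("W1(x)a", "W2(x)b")], observed)

# (b) The writes are concurrent: no ordering constraint applies.
violation_b = first_causal_violation([], observed)

print(violation_a, violation_b)
```

The same observations are a violation in (a) and acceptable in (b); the only thing that changes is whether the two writes are causally related.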
Grouping Operations (1)
Definition:
- Accesses to synchronization variables are sequentially consistent.
- No access to a synchronization variable is allowed to be performed until all previous writes have completed everywhere.
- No data access is allowed to be performed until all previous accesses to synchronization variables have been performed.
Basic idea: You do not care whether the individual reads and writes of a series of operations are immediately known to other processes. You only want the effect of the series as a whole to be known.
Grouping Operations (2)
Figure: A valid event sequence for entry consistency.
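The basic idea of grouping can be sketched with a lock acting as the synchronization variable. In this single-threaded simulation, a process pulls the guarded data when it acquires the lock, works on a local copy, and publishes its writes only when it releases the lock. The classes `SharedData` and `Process` and their methods are hypothetical names, and the sketch is a simplification under these assumptions, not the protocol used by any particular system.

```python
import threading


class SharedData:
    """Data guarded by one synchronization variable (a lock)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.published = {}


class Process:
    def __init__(self, shared):
        self.shared = shared
        self.local = {}  # this process's possibly stale copy

    def acquire(self):
        self.shared.lock.acquire()
        self.local = dict(self.shared.published)  # bring the local copy up to date

    def write(self, key, value):
        self.local[key] = value  # not visible to others yet

    def read(self, key):
        return self.local.get(key)

    def release(self):
        self.shared.published.update(self.local)  # make the whole series visible
        self.shared.lock.release()


shared = SharedData()
p1, p2 = Process(shared), Process(shared)

p1.acquire()
p1.write("x", "a")
p1.write("y", "b")
before_release = p2.read("x")  # p2 has not synchronized, so it sees nothing
p1.release()

p2.acquire()
after_acquire = (p2.read("x"), p2.read("y"))  # the effect of the whole series
p2.release()

print(before_release, after_acquire)
```

Other processes never observe the intermediate state in which x has been written but y has not; they see either none of the series or all of it.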