Consistency and Replication


Consistency and Replication Chapter 14

Topics
- Reasons for replication
- Models of consistency
  - Data-centric consistency models: strict, linearizable, sequential, causal, FIFO, weak, release, entry
  - Client-centric consistency models: monotonic reads, monotonic writes, read-your-writes
- Protocols for achieving consistency: ROWA, read and write quorums

Replication
Reasons:
- Reliability: increases availability when servers crash.
- Performance: load balancing; scales with the size of the geographic region.
- Availability: a local server is more likely to be reachable.
When one copy is modified, all replicas have to be updated. The problem: how to keep the replicas consistent.

Object Replication
- Approach 1: the application is responsible for replication, so the application must handle the consistency issues itself.
- Approach 2: the system (middleware) handles replication, and consistency is handled by the middleware. This simplifies application development but makes object-specific solutions harder (e.g., CORBA).

Replication and Scaling
Replication and caching are used for system scalability. Multiple copies improve performance by reducing access latency, but add network overhead for maintaining consistency.
Example: an object is replicated N times, with read frequency R and write frequency W.
- If R <= W, the consistency overhead is high.
- If R >> W, replication makes sense.
Consistency maintenance is itself an issue: what semantics should be provided? Tight consistency would require globally synchronized clocks. The solution is to loosen the consistency requirements; a variety of consistency semantics is possible.
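The R-versus-W rule of thumb can be sketched as a toy message-count model. This is not from the slides; the function names and cost assumptions are illustrative: each write is assumed to cost N - 1 update messages to the other replicas, while without replication each read costs one remote fetch.

```python
# Hedged sketch: back-of-the-envelope message counts for replicating an
# object N times. All names and cost assumptions are illustrative.

def messages_with_replication(writes, n):
    # Every write must propagate to the other n - 1 replicas;
    # reads are served locally and cost no network messages.
    return writes * (n - 1)

def messages_without_replication(reads):
    # Every read must fetch from the single remote copy.
    return reads

def replication_pays_off(reads, writes, n):
    """True when replicating is cheaper than not replicating."""
    return messages_with_replication(writes, n) < messages_without_replication(reads)
```

Under this model, a read-heavy workload (R >> W) favors replication, while R <= W makes the consistency traffic dominate, matching the rule of thumb above.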

Consistency Models
A consistency model is a contract between the processes and the data store: if the processes follow the contract, the data store works correctly.

Data-Centric Consistency
Data-centric consistency judges the store against a globally aware standard: "I see all the data servers at once and determine whether they are properly consistent." It describes how all copies should be updated, regardless of whether any particular client sees those updates. Client-centric consistency, by contrast, is concerned only with what a particular client sees; an intelligent front end can restrict which servers certain clients may access, so that each client observes the desired consistency.

Strict Consistency Def.: Any read on a data item x returns a value corresponding to the result of the most recent write on x (regardless of which copy was written to). Behavior of two processes, operating on the same data item. (a) A strictly consistent store. (b) A store that is not strictly consistent.

Sequential Consistency (1)
Def.: The result of any execution is the same as if the operations by all processes on the data store were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program.
Sequential consistency is weaker than strict consistency: all processes see the same interleaving of operations.
(a) A sequentially consistent data store. (b) A data store that is not sequentially consistent.

Sequential Consistency (2)
Sequential consistency is comparable to serializability of transactions.
Process P1: x = 1; print(y, z);
Process P2: y = 1; print(x, z);
Process P3: z = 1; print(x, y);
- Any valid interleaving is allowed.
- All processes agree on the same interleaving.
- Each process preserves its program order.
- Nothing is said about the "most recent" write.

Sequential Consistency (3)
Four valid execution sequences for the processes of the previous slide (the vertical axis is time). "Prints" lists the outputs in the order they occur in time; the signature concatenates the outputs of P1, P2, and P3 in process order.
(a) Prints: 001011; Signature: 001011
(b) Prints: 101011; Signature: 101011
(c) Prints: 010111; Signature: 110101
(d) Prints: 111111; Signature: 111111
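Which signatures are sequentially consistent can be checked by brute force. The sketch below (illustrative, not from the slides) enumerates every interleaving of the six statements that respects each process's program order, simulates it against a single shared memory, and collects the attainable signatures (P1's output, then P2's, then P3's).

```python
from itertools import permutations

# Each process has two statements: a write of 1 to its variable, then a
# print of the other two variables. Program order must be preserved.
progs = {
    'P1': [('w', 'x'), ('p', ('y', 'z'))],
    'P2': [('w', 'y'), ('p', ('x', 'z'))],
    'P3': [('w', 'z'), ('p', ('x', 'y'))],
}

def signatures():
    """All print-signatures attainable under sequential consistency."""
    ops = [(p, i) for p in progs for i in range(2)]
    sigs = set()
    for order in permutations(ops):
        # Keep only interleavings that respect each process's program order.
        if any(order.index((p, 0)) > order.index((p, 1)) for p in progs):
            continue
        mem = {'x': 0, 'y': 0, 'z': 0}
        out = {p: '' for p in progs}
        for p, i in order:
            kind, arg = progs[p][i]
            if kind == 'w':
                mem[arg] = 1
            else:
                out[p] += str(mem[arg[0]]) + str(mem[arg[1]])
        sigs.add(out['P1'] + out['P2'] + out['P3'])
    return sigs
```

Running it confirms the four signatures of this slide are attainable, while 000000 (some print must follow all three writes) and 001001 (the FIFO example later in these slides) are not.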

Linearizability
Assumption: operations are timestamped (e.g., with Lamport timestamps).
Def.: The result of any execution is the same as if the operations by all processes on the data store were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program. In addition, if ts_OP1(x) < ts_OP2(y), then OP1(x) should precede OP2(y) in this sequence.
A linearizable data store is also sequentially consistent. Linearizability is weaker than strict consistency but stronger than sequential consistency: it adds a global timestamp requirement to sequential consistency.

Linearizability Neither of these is linearizable since W(x)a on P1 occurs before W(x)b on P2. In (a) P3 and P4 would have to see a before b.

Causal Consistency (1) Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines. Causal consistency is weaker than sequential consistency

Causal Consistency (2)
This sequence is allowed with a causally consistent store, but not with a sequentially or strictly consistent store. W2(x)b may depend on R2(x)a and therefore on W1(x)a; thus a must be seen before b at the other processes. W2(x)b and W1(x)c are concurrent.

Causal Consistency (3) A violation of a causally-consistent store: W2(x)b depends on W1(x)a. A correct sequence of events in a causally-consistent store.

FIFO Consistency (1) Writes done by a single process are seen by all other processes in the order in which they were issued. Writes from different processes may be seen in a different order by different processes. FIFO consistency is weaker than causal consistency. Simple implementation: tag each write by (Proc ID, seq #)

FIFO Consistency (2) A valid sequence of events of FIFO consistency

FIFO Consistency (3)
Process P1: x = 1; print(y, z);
Process P2: y = 1; print(x, z);
Process P3: z = 1; print(x, y);
Statement execution as seen by the three processes (the write-updates arriving from the other processes are interleaved into each view):
- P1's view: x = 1; print(y, z); y = 1; z = 1 — prints 00
- P2's view: x = 1; y = 1; print(x, z); z = 1 — prints 10
- P3's view: y = 1; z = 1; print(x, y); x = 1 — prints 01
Signature: 001001. This signature is not possible under sequential consistency, where all processes share a single view.

FIFO Consistency (4)
Process P1: x = 1; if (y == 0) kill(P2);
Process P2: y = 1; if (x == 0) kill(P1);
Under sequential consistency there are six possible statement orderings, and none of them kills both processes. Under FIFO consistency, both processes can get killed.
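The sequential-consistency half of this claim can be verified mechanically. The sketch below (illustrative names, not from the slides) enumerates every interleaving of the four statements that respects program order, skips the statements of an already-killed process, and collects the possible outcomes.

```python
from itertools import permutations

def run(order):
    """Execute one sequential interleaving; return the set of killed procs."""
    mem = {'x': 0, 'y': 0}
    killed = set()
    for proc, step in order:
        if proc in killed:
            continue                      # a killed process runs nothing further
        if step == 0:
            mem['x' if proc == 'P1' else 'y'] = 1
        elif proc == 'P1' and mem['y'] == 0:
            killed.add('P2')              # P1: if (y == 0) kill(P2)
        elif proc == 'P2' and mem['x'] == 0:
            killed.add('P1')              # P2: if (x == 0) kill(P1)
    return killed

def all_outcomes():
    """Outcomes over all interleavings that respect program order."""
    ops = [('P1', 0), ('P1', 1), ('P2', 0), ('P2', 1)]
    outcomes = set()
    for order in permutations(ops):
        if order.index(('P1', 0)) > order.index(('P1', 1)):
            continue
        if order.index(('P2', 0)) > order.index(('P2', 1)):
            continue
        outcomes.add(frozenset(run(order)))
    return outcomes
```

The enumeration yields exactly three outcomes: nobody killed, only P1 killed, or only P2 killed; never both. Under FIFO consistency each process may see the other's write late, so both conditions can evaluate true.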

Models Based on a Synchronization Operation
No consistency is enforced until a synchronization operation is performed; this operation can be issued after local reads and writes to propagate the changes throughout the system. Three such models:
- Weak consistency
- Release consistency
- Entry consistency

Weak Consistency (1)
It is often not necessary for other processes to see every individual write. Weak consistency enforces consistency on groups of operations, not on individual read/write statements. At a synchronization point:
- changes made to the local data store are propagated to the remote data stores, and
- changes made by remote data stores are imported.
Weak consistency is weaker than FIFO consistency.

Weak Consistency (2)
Properties:
- Accesses to synchronization variables associated with a data store are sequentially consistent (all processes see all operations on synchronization variables in the same order).
- No operation on a synchronization variable is allowed until all previous writes have completed everywhere (this guarantees that all writes have propagated).
- No read or write operation on data items is allowed until all previous operations on synchronization variables have been performed (when accessing data items, all previous synchronizations have completed).

Weak Consistency (3) A valid sequence of events for weak consistency. An invalid sequence for weak consistency.

Release Consistency (1)
A more efficient implementation than weak consistency, obtained by identifying critical regions:
- Acquire: ensure that all local copies of the data are brought up to date with the (released) remote ones.
- Release: data that has been changed is propagated out to the remote data stores.
An acquire does not guarantee that locally made changes are sent to other copies immediately, and a release does not necessarily import changes from other copies.

Release Consistency (2) A valid event sequence for release consistency.

Release Consistency (3)
Rules:
- Before a read or write operation on shared data is performed, all previous acquires done by the process must have completed successfully.
- Before a release is allowed to be performed, all previous reads and writes by the process must have completed.
- Accesses to synchronization variables are FIFO consistent (sequential consistency is not required).

Release Consistency (4)
Two implementations:
- Eager release consistency: the process doing the release pushes all the modified data out to all other processes.
- Lazy release consistency: no update messages are sent at release time; when another process does an acquire, it must obtain the most recent version of the data.
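The eager variant can be sketched in a few lines. This is a minimal illustration (class and method names are my own, not from the slides): writes inside the critical region stay in a local buffer and only become visible at the other replicas when release() pushes them out.

```python
# Hedged sketch of eager release consistency. Illustrative names only;
# a lazy variant would instead fetch the latest version on acquire().

class Replica:
    def __init__(self):
        self.store = {}

class EagerReleaseWriter:
    def __init__(self, replicas):
        self.replicas = replicas
        self.buffer = {}              # writes made inside the critical region

    def acquire(self):
        self.buffer = {}              # enter the critical region

    def write(self, key, value):
        self.buffer[key] = value      # not yet visible remotely

    def release(self):
        for r in self.replicas:       # eager: push everything out now
            r.store.update(self.buffer)
```

Between acquire() and release() the remote replicas see nothing; after release(), all of them hold the buffered writes, matching the rule that a release propagates the process's completed writes.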

Entry Consistency (1)
Every data item is associated with a synchronization variable, and each data item can be acquired and released individually, as in release consistency. An acquire (entry) obtains the most recent value of the item.
- Advantage: increased parallelism.
- Disadvantage: increased overhead.

Entry Consistency (2) A valid event sequence for entry consistency.

Summary of Consistency Models
(a) Consistency models not using synchronization operations:
- Strict: absolute time ordering of all shared accesses matters.
- Linearizability: all processes must see all shared accesses in the same order; accesses are furthermore ordered according to a (non-unique) global timestamp.
- Sequential: all processes see all shared accesses in the same order; accesses are not ordered in time.
- Causal: all processes see causally related shared accesses in the same order.
- FIFO: all processes see the writes of each single process in the order they were issued; writes from different processes may not always be seen in the same order.
(b) Models with synchronization operations:
- Weak: shared data can be counted on to be consistent only after a synchronization is done.
- Release: shared data are made consistent when a critical region is exited.
- Entry: shared data pertaining to a critical region are made consistent when the critical region is entered.

Client-Centric Consistency (1)
Strong consistency of the data store is often unnecessary; instead, consistency guarantees are given from a single client's perspective. Clients often tolerate inconsistencies (e.g., slightly out-of-date web pages).
Assumptions:
- A client may move to a different replica during a single session (or may be prevented from doing so).
- The data store is eventually consistent: total propagation of updates with consistent ordering.
Trade-off: consistency vs. availability.

Client Centric Consistency (2) The principle of a mobile user accessing different replicas of a distributed database.

Intuition
Assume the application is like a message board: when P1 reads x, it sees the history of values for x, and when P1 writes x, the new value is appended like a new message would be:
1: x = 2
2: x = 9
3: x = 5

Monotonic Reads (1)
If a process P1 reads the value of a data item x, any successive read of x by P1 will return that same value or a more recent one.

Monotonic Reads (2)
Definition: if read R1 occurs before R2 in a session, R1 accesses server S1 at time t1, and R2 accesses server S2 at time t2, then R2 sees the same value as R1 or a more recent one. R1 and R2 are operations by the same client.
Example: calendar updates.

Monotonic Reads (3)
One client, two servers. R1 and R2 are two reads by the same client; if R1 saw W1, then a later R2 should also see W1.
- Valid: S1 holds W0 and W1 when R1 is served there; S2 also holds W0 and W1 when R2 is served there.
- Invalid: S1 holds W0 and W1 when R1 is served, but S2 holds only W0, so R2 does not see W1.
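If each read returns the version of x it observed, checking this guarantee over a session is a one-liner. A hedged sketch (version numbers are my modeling assumption, not part of the slides):

```python
# Hedged sketch: monotonic reads over a session. Each entry is the version
# of x that a read returned, in session order; the versions must never
# go backwards.

def monotonic_reads_ok(session_read_versions):
    """True iff every read saw the same or a more recent version."""
    return all(a <= b for a, b in
               zip(session_read_versions, session_read_versions[1:]))
```

In the slide's terms, [1, 1] (R1 and R2 both see W1) is valid, while [1, 0] (R2 falls back to W0 after R1 saw W1) violates monotonic reads.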

Monotonic Writes (1)
A write operation by P1 on data item x is completed before any successive write of x by P1, even if P1 is now attached to a different server. (As before, think of the message-board model: reading x shows the history of its values.)

Monotonic Writes (2)
Definition: if write W1 precedes write W2 in a session, then for any server S2, if W2 is at S2, then W1 is also at S2 and W1 is ordered before W2. Like monotonic reads, except that here the writes force the consistency.
Example: software updates.

Monotonic Writes (3)
- Valid: W1 is applied at S1; S2 already holds W0 and W1 when W2 is applied there.
- Invalid: W2 is applied at S2, but W1 is not in the history of S2.
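The server-side check behind this figure can be sketched directly: before applying a session's next write, verify that all of that session's earlier writes already appear in the server's log, in session order (possibly interleaved with other writes such as W0). Function names are illustrative.

```python
# Hedged sketch: the monotonic-writes precondition at a server.

def earlier_writes_present(server_log, earlier_session_writes):
    """True iff the session's earlier writes appear in server_log as a
    subsequence (same relative order, other writes may interleave)."""
    it = iter(server_log)
    return all(w in it for w in earlier_session_writes)  # consumes the iterator

def apply_write(server_log, earlier_session_writes, new_write):
    """Apply new_write only if the monotonic-writes precondition holds."""
    if not earlier_writes_present(server_log, earlier_session_writes):
        raise RuntimeError("monotonic writes violated: earlier write missing")
    server_log.append(new_write)
    return server_log
```

With the slide's scenario: a server holding [W0, W1] may apply W2; a server holding only [W0] must first obtain W1.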

Read Your Writes (1)
Definition: if a client's read R follows its write W in a session, and R is performed at server S at time t, then W is at S at time t.
Example: password-update propagation. You update your password on one system in the design center and then move to another machine: "If I write something and then do a read, what I wrote should be there."

Read Your Writes (2)
- Valid: the client issues W1 and W2 at S1; S2 holds both writes when the client's read R is served there.
- Invalid: S2 does not yet hold the client's writes when R is served.
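A minimal check for this guarantee: the client carries the set of writes it issued in the session, and a server may serve its read only if it already holds all of them. An illustrative sketch, not from the slides:

```python
# Hedged sketch: the read-your-writes precondition at a server.

def can_serve_read(server_writes, client_session_writes):
    """True iff the server already holds every write the client issued
    in this session, so the client's read will reflect its own writes."""
    return set(client_session_writes) <= set(server_writes)
```

In the slide's scenario, a server holding {W1, W2} may serve the read; a server that has not yet received W2 must first catch up (or redirect the client).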

State vs. Operations
Design choices for update propagation:
- Propagate only a notification of an update (e.g., invalidation protocols).
- Transfer data from one copy to another.
- Propagate the update operation itself to the other copies (a.k.a. active replication).

Pull vs. Push Protocols
A comparison between push-based and pull-based protocols for multiple-client, single-server systems:

Issue                    | Push-based                               | Pull-based
-------------------------|------------------------------------------|------------------
State at server          | list of client replicas and caches       | none
Messages sent            | update (and possibly fetch update later) | poll and update
Response time at client  | immediate (or fetch-update time)         | fetch-update time

Epidemic Protocols
Useful for eventual consistency: propagate updates to all replicas in as few messages as possible.
Update propagation model:
- Infective: the node holds the update and is willing to spread it.
- Susceptible: the node is willing to accept the update.
- Removed: the node has the update but is no longer willing to spread it.
Anti-entropy: pick partner nodes at random. When nodes P and Q exchange updates, either
- P only pushes its own updates to Q,
- P only pulls new updates from Q, or
- P and Q send updates to each other (push-pull).
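Anti-entropy with push-pull can be simulated in a few lines. A hedged sketch (the round structure, seed, and node count are my assumptions, not from the slides): each round, every node contacts one random partner, and if either side holds the update, both do afterwards.

```python
import random

# Hedged sketch: anti-entropy with push-pull exchanges. Illustrative
# parameters; a fixed seed keeps the simulation deterministic.

def rounds_until_all_infected(n_nodes, seed=42):
    """Rounds of random pairwise push-pull until every node has the update."""
    rng = random.Random(seed)
    infected = {0}                        # node 0 starts out infective
    rounds = 0
    while len(infected) < n_nodes:
        rounds += 1
        for node in range(n_nodes):
            partner = rng.randrange(n_nodes)
            if partner != node and (node in infected or partner in infected):
                infected.update({node, partner})   # push-pull: both end up updated
    return rounds
```

Push-pull spreads an update to all replicas quickly (roughly logarithmically in the number of nodes), which is why it is attractive for eventual consistency.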

Consistency Protocol Implementations
- Primary-based (each data item has an associated primary): passive replication; queries (read-only operations) can be served by a backup.
- Replicated-write (write operations are carried out at multiple replicas; update anywhere): active replication; may use quorum-based protocols.

Primary-Based Replica Management
A primary-based protocol with a fixed server to which all read and write operations are forwarded. Propagation to the backups can be eager or lazy.

Primary-Backup Replication
- There is a single primary RM and one or more secondary (backup, slave) RMs.
- Front ends (FEs) communicate with the primary, which executes each operation and sends copies of the updated data to the backups.
- If the primary fails, one of the backups is promoted to act as the primary.
- This system implements linearizability, since the primary sequences all operations on the shared objects.
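The core of passive replication fits in a short sketch (class names are illustrative, not from the slides): the primary applies each write locally, then forwards it to every backup, so all replicas apply the same single sequence of operations.

```python
# Hedged sketch of passive (primary-backup) replication. Failure handling
# and primary election are omitted; names are illustrative.

class Backup:
    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        self.store[key] = value

class Primary(Backup):
    def __init__(self, backups):
        super().__init__()
        self.backups = backups

    def write(self, key, value):
        self.apply(key, value)        # the primary sequences every update
        for b in self.backups:        # then propagates it to each backup
            b.apply(key, value)

    def read(self, key):
        return self.store.get(key)
```

Because every update flows through the one primary, all replicas see the operations in the same order, which is what makes the scheme linearizable.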

Active Replication
- An FE multicasts each request to the group of RMs.
- The RMs process each request identically and reply.
- Requires totally ordered reliable multicast, so that all RMs perform the same operations in the same order.
This is Read One Write All (ROWA).

Alternative to ROWA
- Write quorums: any two write quorums must have a non-empty intersection.
- Read quorums: any read quorum must have a non-empty intersection with every write quorum.
- Write operation: write the new value to all copies in the write quorum and update the version number.
- Read operation: read all copies in the read quorum and pick the most recent one (copies must carry a write timestamp or version number).

Quorum-Based Protocols
Constraints on the read quorum NR and write quorum NW:
- NR + NW > N
- NW > N/2
Three examples of the voting algorithm:
(a) a correct choice of read and write sets;
(b) a choice that may lead to write-write conflicts;
(c) a correct choice known as ROWA (read one, write all).
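The two constraints translate directly into a validity check. A small sketch (the N = 12 example sizes follow the standard voting-algorithm illustration; the function name is mine):

```python
# Hedged sketch: checking candidate quorum sizes against the two
# constraints above.

def valid_quorum(n, n_read, n_write):
    """True iff every read quorum intersects every write quorum
    (NR + NW > N) and write quorums pairwise intersect (NW > N/2)."""
    return n_read + n_write > n and 2 * n_write > n
```

For N = 12: NR = 3, NW = 10 is a correct choice; NR = 7, NW = 6 satisfies the read-write constraint but allows write-write conflicts (NW <= N/2); NR = 1, NW = 12 is ROWA.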

More Info
- Causally consistent lazy replication: G. T. Wuu and A. J. Bernstein, "Efficient Solutions to the Replicated Log and Dictionary Problem" (1984).
- Readings on client-centric consistency: http://www.cs.ubc.ca/grads/resources/thesis/Nov03/Sunny_Ho.pdf

The End