Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5 DISTRIBUTED SYSTEMS.

Slides:



Advertisements
Similar presentations
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Advertisements

Replication. Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.
Consistency and Replication Chapter 7 Part II Replica Management & Consistency Protocols.
Principles and Paradigms Consistency and Replication
Consistency and Replication
Synchronization Chapter clock synchronization * 5.2 logical clocks * 5.3 global state * 5.4 election algorithm * 5.5 mutual exclusion * 5.6 distributed.
1 CS 194: Lecture 8 Consistency and Replication Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
Consistency and Replication (3). Topics Consistency protocols.
Consistency and Replication Chapter 6. Object Replication (1) Organization of a distributed remote object shared by two different clients.
Consistency and Replication. Replication of data Why? To enhance reliability To improve performance in a large scale system Replicas must be consistent.
Distributed Systems Fall 2010 Replication Fall 20105DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Distributed Systems Spring 2009
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Database Replication techniques: a Three Parameter Classification Authors : Database Replication techniques: a Three Parameter Classification Authors :
Computer Science Lecture 16, page 1 CS677: Distributed OS Last Class: Web Caching Use web caching as an illustrative example Distribution protocols –Invalidate.
CS 582 / CMPE 481 Distributed Systems
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Consistency.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Client-Centric.
Computer Science Lecture 16, page 1 CS677: Distributed OS Last Class:Consistency Semantics Consistency models –Data-centric consistency models –Client-centric.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Consistency and Replication Chapter 7
Consistency & Replication Chapter No. 6
1 6.4 Distribution Protocols Different ways of propagating/distributing updates to replicas, independent of the consistency model. First design issue.
Consistency and Replication Chapter Concepts Reasons for Replication Reliability Earthquake, flood Misoperation Performance Place copies of data.
Consistency and Replication
Consistency And Replication
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Consistency.
CS 5204 (FALL 2005)1 Leases: An Efficient Fault Tolerant Mechanism for Distributed File Cache Consistency Gray and Cheriton By Farid Merchant Date: 9/21/05.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Data Versioning Lecturer.
Consistency and Replication Chapter 6. Release Consistency (1) A valid event sequence for release consistency. Use acquire/release operations to denote.
Consistency and Replication. Replication of data Why? To enhance reliability To improve performance in a large scale system Replicas must be consistent.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency protocols.
Consistency and Replication Distributed Software Systems.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Consistency.
Replication (1). Topics r Why Replication? r Consistency Models – How do we reason about the consistency of the “global state”? m Data-centric consistency.
第5讲 一致性与复制 §5.1 副本管理 Replica Management §5.2 一致性模型 Consistency Models
Consistency and Replication Chapter 6 Presenter: Yang Jie RTMM Lab Kyung Hee University.
Replication (1). Topics r Why Replication? r System Model r Consistency Models – How do we reason about the consistency of the “global state”? m Data-centric.
Distributed Systems CS Consistency and Replication – Part IV Lecture 21, Nov 10, 2014 Mohammad Hammoud.
Chapter 7 Consistency And Replication (CONSISTENCY PROTOCOLS)
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Replication (1). Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Distributed Systems CS Consistency and Replication – Part IV Lecture 13, Oct 23, 2013 Mohammad Hammoud.
Chapter 7: Consistency & Replication IV - REPLICATION MANAGEMENT By Jyothsna Natarajan Instructor: Prof. Yanqing Zhang Course: Advanced Operating Systems.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Consistency and Replication Chapter 6 Presenter: Yang Jie RTMM Lab Kyung Hee University.
Highly Available Services and Transactions with Replicated Data Jason Lenthe.
9.2 SECURE CHANNELS JEJI RAMCHAND VEDULLAPALLI. Content Introduction Authentication Message Integrity and Confidentiality Secure Group Communications.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Consistency and Replication CSCI 6900/4900. FIFO Consistency Relaxes the constraints of the causal consistency “Writes done by a single process are seen.
OS2 –Sem 1, Rasool Jalili Consistency and Replication Chapter 6.
Consistency and Replication (1). Topics Why Replication? Consistency Models – How do we reason about the consistency of the “global state”? u Data-centric.
Consistency and Replication
Chapter 7: Consistency & Replication IV - REPLICATION MANAGEMENT -Sumanth Kandagatla Instructor: Prof. Yanqing Zhang Advanced Operating Systems (CSC 8320)
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
7.1. CONSISTENCY AND REPLICATION INTRODUCTION
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Distributed Systems CS
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Replica Placement Model: We consider objects (and don’t worry whether they contain just data or code, or both) Distinguish different processes: A process.
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Last Class: Web Caching
Presentation transcript:

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 7 Consistency And Replication (CONSISTENCY PROTOCOLS) 1

CONSISTENCY PROTOCOLS Continuous Consistency - Bounding Numerical Deviation - Bounding Staleness Deviations - Bounding Ordering Deviations Sequential Consistency - Primary-based Protocol - Remote-write Protocol - Local-write Protocol - Replicated-write Protocol - Active Replication - Majority Voting (Quorum-Based) Cache-Coherence Protocols Implementing Client-Centric Consistency Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

consistency protocol A consistency protocol describes an implementation of a specific consistency model. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Continuous Consistency Bounding Numerical Deviation: A solution for keeping the numerical deviation within bounds We concentrate on writes to a single data item x Each write W(x) has an associated weight that represents the numerical value by which x is updated, denoted as weight (W) For simplicity, we assume that weight (W) > 0. Each write W is initially submitted to one out of the N available replica servers, in which case that server becomes the write's origin, denoted as origin (W) each server S i will keep track of a log L i of writes that it has performed on its own local copy of x. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Continuous consistency: Numerical deviation Consider a data item x and let weight(W) denote the numerical change in its value after a write operation W. Assume that weight(W) > 0. W is initially forwarded to one of the N replicas, denoted as origin(W). TW[ i,j ] are the writes executed by server S i that originated from S j TW[ i, j ] = ∑{weight(W) | origin(W) = S j & W Є log(S i )} The goal is for any time t, to let the current value V i at server S i deviate within bounds from the actual value v(t) of x. This actual value is completely determined by all submitted writes Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Continuous consistency: Numerical deviation Actual value v(t) of x: N v(t) = v (0) + ∑ TW[k,k] k=1 value v i of x at replica i N v(i) = v (0) + ∑ TW[i,k] k=1 Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Continuous consistency: Numerical deviation for every server S i, we associate an upperbound δ i such that we need to enforce: v(t) – v i ≤ δi for every server S i when a server S i propagates a write originating from S j to S k the latter will be able to learn about the value TW [i,j] at the time the write was sent S k maintains a view TW k [ i, j ] of what it believes S i will have as value for TW[ i,j ]. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Continuous consistency: Numerical deviation Solution S ĸ sends operations from its log to S i when it sees that TW ĸ [ i, k ] is getting too far from TW[ k, k ], in particular, when TW[ k, k ] - TW ĸ [ i, k ] > δ i / (N -1) Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Continuous consistency: Numerical deviation when server S k notices that S i has not been keeping in the right pace with the updates that have been submitted to S k it forwards writes from its log to Si. This forwarding effectively advances the view TW ĸ [ i, k ] that S k has of TW[ i, k ], making the deviation TW[ i, k ]- TW ĸ [ i, k ] smaller. In particular, S k advances its view on TW[ i, k ]when an application submits a new write that would increase TW[ k, k ] - TW ĸ [ i, k ] beyond δ i / (N -1). Advancement always ensures that v(t) – v i ≤ δi Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Bounding Staleness Deviations There are many ways to keep the staleness of replicas within specified bounds One approach is to timestamp each submitted write by its origin server let server S k keep a real-time vector clock RVC k where: RVCk[i] = T(i) means that S k has seen all writes that have been submitted to S i up to time T(i) T(i) denotes the time local to S i Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Bounding Staleness Deviations If the clocks between the replica servers are loosely synchronized: Whenever server S k notes that T( k ) - RVC k [ i ] is about to exceed a specified limit it simply starts pulling in writes that originated from S i with a timestamp later than RVC k [ i ]· a replica server is responsible for keeping its copy of x up to date regarding writes that have been issued elsewhere Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Numerical VS Staleness Bounds Numerical bounds follows a push approach, by letting an origin server keep replicas up to date by forwarding writes Staleness Bounds follows a Pull approach, Replica servers pull in updates from origin servers Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Bounding Ordering Deviations ordering deviations in continuous consistency are caused by the fact that a replica server tentatively applies updates that have been submitted to it. each server will have a local queue of tentative writes for which the actual order in which they are to be applied to the local copy of x still needs to be determined The ordering deviation is bounded by specifying the maximal length of the queue of tentative writes enforce a globally consistent ordering of tentative writes Using primary-based or quorum-based protocols Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Primary-Based Protocols each data item x in the data store has an associated primary, which is responsible for coordinating write operations on x provide a straightforward implementation of sequential consistency all processes see all write operations in the same order, no matter which backup server they use to perform read operations A distinction can be made as to whether: - the primary is fixed at a remote server - write operations can be carried out locally after moving the primary to the process where the write operation is initiated Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Remote- Write Protocols all write operations need to be forwarded to a fixed single server Read operations can be carried out locally Also called (primary-backup protocol), Fig Disadvantage: it may take a relatively long time before the process that initiated the update is allowed to continue, an update is implemented as a blocking operation Alternative: nonblocking approach Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Remote-Write Protocols Figure The principle of a primary-backup protocol. 16

Remote- Write Protocols nonblocking approach: As soon as the primary has updated its local copy of x, it returns an acknowledgment. After that, it tells the backup servers to perform the update as well Advantage: write operations may speed up considerably Disadvantage: fault tolerance, updates may not be backed up by other servers Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Local- Write Protocols when a process wants to update data item x, it locates the primary copy of x, and subsequently moves it to its own location (Fig. 7-21) Advantage (in nonblocking protocol only): multiple, successive write operations can be carried out locally, while reading processes can still access their local copy updates are propagated to the replicas after the primary has finished with locally performing the updates Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Local-Write Protocols Figure Primary-backup protocol in which the primary migrates to the process wanting to perform an update. 19

Local- Write Protocols Example primary-backup protocol with local writes Mobile computing in disconnected mode (ship all relevant files to user before disconnecting, and update later on) While being disconnected, all update operations are carried out locally. while other processes can still perform read operations (but no updates). Later, when connecting again, updates are propagated from the primary to the backups, bringing the data store in a consistent state again Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Local- Write Protocols Example primary-backup protocol with local writes distributed file systems that require a high degree of fault tolerance a fixed central server through which all write operations take place The server temporarily allows one of the replicas to perform a series of local updates to speed up performance When the replica server is done, the updates are propagated to the central server, from where they are then distributed to the other replica servers Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Replicated-Write Protocols write operations can be carried out at multiple replicas instead of only one A distinction can be made between: - active replication, in which an operation is forwarded to all replicas - majority voting (Quorum-Based Protocols) Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Active Replication each replica has an associated process that carries out update operations operations need to be carried out in the same order everywhere. totally-ordered multicast mechanism. Such a multicast can be implemented using Lamport's logical clocks, Disadvantage: this implementation of multicasting does not scale well in large distributed systems. total ordering can be achieved using a central coordinator, also called a sequencer. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Active Replication first forward each operation to the sequencer, which assigns it a unique sequence number and subsequently forwards the operation to all replicas Operations are carried out in the order of their sequence number. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Quorum-Based Protocols A different approach to supporting replicated writes is to use voting clients to request and acquire the permission of multiple servers before either reading or writing a replicated data item Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Quorum-Based Protocols how the algorithm works? a file is replicated on N servers to update a file, a client must first contact at least half the servers plus one (a majority) and get them to agree to do the update. Once they have agreed, the file is changed and a new version number is associated with the new file. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Quorum-Based Protocols how the algorithm works? To read a replicated file, a client must also contact at least half the servers plus one and ask them to send the version numbers associated with the file. If all the version numbers are the same, this must be the most recent version Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Quorum-Based Protocols to read a file of which N replicas exist, a client needs to assemble a read quorum, an arbitrary collection of any N R servers, or more to modify a file, a write quorum of at least N w servers is required. The values of N R and N w are subject to the following two constraints: N R +N W > N and N W > N / 2 Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Quorum-Based Protocols Figure Three examples of the voting algorithm. (a) A correct choice of read and write set. (b) A choice that may lead to write-write conflicts. (c) A correct choice, known as ROWA (read one, write all). 29

Cache-Coherence Protocols cache-coherence protocols ensures that a cache is consistent with the server-initiated replicas coherence detection strategy (when) coherence enforcement strategy (How) Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Cache-Coherence Protocols Caches are controlled by clients instead of servers much research in the design and implementation of caches caching solutions differ in their coherence detection strategy, when inconsistencies are detected. Dynamic solutions: inconsistencies are detected at runtime. For example, a check is made with the server to see whether the cached data have been modified since they were cached In the case of distributed databases, dynamic detection- based protocols can be further classified by considering exactly when during a transaction the detection is done Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Implementing Client-Centric Consistency A Naive Implementation: each write operation W is assigned a globally unique identifier. Such an identifier is assigned by the server to which the write had been submitted (Origin of W). for each client, a track of two sets of writes: The read set for a client consists of the writes relevant for the read operations performed by a client The write set consists of the identifiers of the writes performed by the client. Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Implementing Client-Centric Consistency Monotonic-read consistency: When a client performs a read operation at a server, that server is handed the client's read set to check whether all the identified writes have taken place locally (The size of the set may introduce a performance problem) If not, it contacts the other servers to ensure that it is brought up to date before carrying out the read operation Alternatively, the read operation is forwarded to a server where the write operations have already taken place After the read operation is performed, the write operations that have taken place at the selected server and which are relevant for the read operation are added to the client‘s read set Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Implementing Client-Centric Consistency Monotonic-write consistency: When a client initiates a new write operation at a server, the server is handed over the client's write set It then ensures that the identified write operations are performed first and in the correct order After performing the new operation, that operation's write identifier is added to the write set Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Implementing Client-Centric Consistency read-your-writes consistency : requires that the server where the read operation is performed has seen all the write operations in the client's write set The writes can be fetched from other servers before the read operation is performed Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Implementing Client-Centric Consistency writes-follow-reads consistency: implemented by first bringing the selected server up to date with the write operations in the client's read set Then, later adding the identifier of the write operation to the write set, along with the identifiers in the read set Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Improving Efficiency read set and write set associated with each client can become very large Solution: a client's read and write operations are grouped into sessions A session is typically associated with an application: it is opened when the application starts and is closed when it exits Whenever a client closes a session, the sets are cleared Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved

Summary Consistency protocols describe specific implementations of consistency models a distinction can be made between primary-based protocols and replicated-write protocols In primary-based protocols, all update operations are forwarded to a primary copy that subsequently ensures the update is properly ordered and forwarded In replicated-write protocols, an update is forwarded to several replicas at the same time Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved