Fault Tolerance and Replication

Fault Tolerance and Replication
This PowerPoint presentation has been adapted from:
(1) web.njit.edu/~gblank/cis633/Lectures/Replication.ppt

Content
- Introduction
- System model and the role of group communication
- Fault-tolerant services
- Case study: Bayou and Coda
- Transactions with replicated data

Introduction
Replication duplicates limited or heavily loaded resources, both to share load and to ensure access after failures. Replication is important for performance enhancement, increased availability, and fault tolerance.

Introduction: Replication
Performance enhancement
- Data are replicated across several origin servers in the same domain.
- The workload is shared among the servers by binding all of the servers' IP addresses to the site's DNS name.
- This increases performance at little cost to the system.

Introduction: Replication
Increased availability
- Replication is a technique for automatically maintaining the availability of data despite server failures.
- If data are replicated at two or more failure-independent servers, client software may be able to access data at an alternative server should the default server fail or become unreachable.

Introduction: Replication
Fault tolerance
- Highly available data is not necessarily correct data: it may be out of date.
- A fault-tolerant service always guarantees the correctness of both the freshness of the data supplied to the client and the effects of the client's operations upon the data.

Introduction: Replication
Replication requirements:
- Transparency: users should not need to be aware that data is replicated, and the performance and utility of information retrieval should not differ noticeably from unreplicated data.
- Consistency: different copies of replicated data should be the same; when data are changed, the change is propagated to all replica servers.

Content
- Introduction
- System model and the role of group communication
- Fault-tolerant services
- Case study: Bayou and Coda
- Transactions with replicated data

System Model & the Role of Group Communication
Introduction
- The data in the system are composed of objects (e.g., files, components, Java objects).
- Each logical object is implemented by a collection of physical objects called replicas, each stored on a single computer.
- The replicas of a given object are not necessarily identical at any particular point in time: some replicas may have received updates that others have not.

System Model & the Role of Group Communication
[Figure: the basic architectural model for the management of replicated data — clients send requests and receive replies through front ends (FE), which communicate by message passing with the service's replica managers (RM).]

System Model & the Role of Group Communication
- Replica managers (RM): components that contain the replicas on a particular computer and perform operations upon them directly.
- Front ends (FE): components that handle clients' requests and communicate with one or more replica managers by message passing. A front end may be implemented in the client's address space, or it may be a separate process.

System Model & the Role of Group Communication
The five phases of a request upon replicated objects [Wiesmann et al. 2000]:
1. Request: the front end requests service from one or more RMs, either communicating through a single RM or multicasting to all of them.
2. Coordination: the RMs coordinate in preparation for executing the request; this may require agreeing on an ordering of operations.
3. Execution: the RMs execute the request (perhaps tentatively, so that its effects can be undone later).
4. Agreement: the RMs reach agreement on the effect of the request.
5. Response: one or more RMs pass a response back to the front end.

System Model & the Role of Group Communication
- Managing RMs through group communication is complex, especially in the case of dynamic groups, whose membership changes over time.
- A group membership service may be used to manage the addition and removal of replica managers, and to detect and recover from crashes and faults.

System Model & the Role of Group Communication
Tasks of a group membership service (illustrated in the sketch below):
- Provide an interface for group membership changes.
- Implement a failure detector.
- Notify members of group membership changes.
- Perform group address expansion for multicast delivery of messages.
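
To make these four tasks concrete, here is a minimal sketch of a membership-service interface in Python. All names (GroupMembershipService, report_failure, and so on) are hypothetical, and a real service must agree on group views across machines rather than mutate a local set:

```python
# Sketch of a group membership service. Names are illustrative only.

class GroupMembershipService:
    def __init__(self):
        self.members = set()     # the current group view
        self.listeners = []      # callbacks notified on view changes

    def join(self, process_id):
        """Interface for membership changes: a process joins the group."""
        self.members.add(process_id)
        self._notify()

    def leave(self, process_id):
        """A process leaves the group voluntarily."""
        self.members.discard(process_id)
        self._notify()

    def report_failure(self, process_id):
        """Called by a failure detector: drop a crashed member."""
        self.members.discard(process_id)
        self._notify()

    def expand(self, group_address):
        """Group address expansion: map a group address onto the set of
        current members, so a multicast can be delivered to each one."""
        return set(self.members)

    def _notify(self):
        """Notify members, via registered callbacks, of the new view."""
        view = frozenset(self.members)
        for on_view_change in self.listeners:
            on_view_change(view)
```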

System Model & The Role of Group Communication Join Group address expansion Multicast communication send Fail Group membership management Leave Process group

Content
- Introduction
- System model and the role of group communication
- Fault-tolerant services
- Case study: Bayou and Coda
- Transactions with replicated data

Fault-Tolerant Services
Introduction
- Replicating data and functionality at replica managers can provide a service that is correct despite process failures.
- A replicated service is correct if it keeps responding despite faults, and if clients cannot tell the difference between a service provided by replication and one provided with a single copy of the data.

Fault-Tolerant Services
Introduction
- The strictest correctness criterion for replicated objects is linearizability. Every operation is taken to be synchronous: a client waits for one operation to complete before starting another.
- A replicated shared object is linearizable if, for any execution, there is some interleaving of the clients' operations that meets the specification of a single correct copy of the object, and whose order is consistent with the real times at which the operations occurred.
- A replicated shared object is sequentially consistent if there is such an interleaving that meets the single-copy specification and whose order is consistent with the program order in which each individual client performed its operations (a weaker condition than real-time order).
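
To make the distinction concrete, here is a toy brute-force check on a two-operation history of a read/write register: client A writes 1 during the real-time interval [0, 2], then client B reads 0 during [3, 4]. The history is sequentially consistent but not linearizable. The encoding is invented for illustration; practical linearizability checkers are far more elaborate:

```python
# Toy linearizability / sequential-consistency check for a register.
from itertools import permutations

# Each operation: (invoke_time, response_time, kind, value).
history = [
    (0, 2, "write", 1),   # client A writes 1 during [0, 2]
    (3, 4, "read", 0),    # client B then reads 0 during [3, 4]
]

def legal_register(seq):
    """Single-correct-copy semantics: every read returns the most
    recently written value (initially 0)."""
    current = 0
    for (_, _, kind, value) in seq:
        if kind == "write":
            current = value
        elif value != current:
            return False
    return True

def respects_real_time(seq):
    """If one operation completed before another was invoked, it must
    also come first in the interleaving."""
    for i, op1 in enumerate(seq):
        for op2 in seq[i + 1:]:
            if op2[1] < op1[0]:   # op2 responded before op1 was invoked
                return False
    return True

def linearizable(h):
    return any(legal_register(s) and respects_real_time(s)
               for s in permutations(h))

def sequentially_consistent(h):
    # With one operation per client there is no per-client program
    # order to violate, so only single-copy semantics must hold.
    return any(legal_register(s) for s in permutations(h))

print(linearizable(history))             # False: reading 0 after the write of 1 breaks real-time order
print(sequentially_consistent(history))  # True: the interleaving (read 0, write 1) is legal
```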

Fault-Tolerant Services
Updates
- Read-only requests have no impact on the state of the replicated object.
- Update requests must be managed properly to avoid inconsistency.
- One strategy for avoiding inconsistency: make all updates to a primary copy of the data and copy it to the other replicas (passive replication). If the primary fails, one of the backups is promoted to act as primary.

Fault-Tolerant Services
Passive (primary-backup) replication
[Figure: the passive replication model — front ends communicate with a single primary replica manager, which propagates updates to the backup replica managers.]

Fault-Tolerant Services
Passive (primary-backup) replication
The sequence of events when a client requests an operation (sketched in code below):
1. Request: the front end issues the request, tagged with a unique identifier, to the primary replica manager.
2. Coordination: the primary processes requests atomically, checking each identifier to filter out duplicate requests.
3. Execution: the primary executes the request and stores the response.
4. Agreement: if the request is an update, the primary sends the update information to the backups, which apply it and acknowledge.
5. Response: the primary responds to the front end, which passes the response back to the client.
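
A minimal sketch of this sequence, assuming reliable in-order delivery between the primary and its backups; the class and method names are invented, and primary failure and backup promotion are left out:

```python
# Sketch of passive (primary-backup) replication.

class Backup:
    def __init__(self):
        self.state = {}

    def apply(self, key, value):
        """Agreement phase: install an update sent by the primary."""
        self.state[key] = value
        return "ack"

class Primary:
    def __init__(self, backups):
        self.state = {}
        self.backups = backups
        self.responses = {}       # request id -> cached response

    def handle(self, request_id, op, key, value=None):
        # Coordination: filter out duplicate requests by identifier.
        if request_id in self.responses:
            return self.responses[request_id]
        # Execution.
        if op == "read":
            response = self.state.get(key)
        else:                     # op == "write"
            self.state[key] = value
            # Agreement: push the update to every backup and await acks.
            for backup in self.backups:
                assert backup.apply(key, value) == "ack"
            response = "ok"
        # Response: cache it so a retransmitted request is not re-executed.
        self.responses[request_id] = response
        return response

primary = Primary([Backup(), Backup()])
print(primary.handle("req-1", "write", "x", 42))   # ok
print(primary.handle("req-1", "write", "x", 42))   # ok (duplicate, served from cache)
print(primary.handle("req-2", "read", "x"))        # 42
```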

Fault-Tolerant Services
Passive (primary-backup) replication
- It provides fault tolerance at a cost in performance: updating the replicas carries high overhead, so it performs worse than unreplicated objects.
- To mitigate this, allow read-only requests to be served by the backup RMs while sending all updates to the primary. This is of limited value for transaction processing systems, but very effective for decision support systems, whose requests are mostly read-only.

Fault-Tolerant Services
Active replication
[Figure: the active replication model — each front end multicasts requests to all replica managers, every one of which executes them.]

Fault-Tolerant Services
Active replication
The steps in active replication (sketched below):
1. Request: the front end attaches a unique identifier to the request and multicasts it to the RMs using a totally ordered, reliable multicast. The front end is assumed to fail only by crashing.
2. Coordination: every correct RM receives the request in the same total order.
3. Execution: every RM executes the request.
4. Agreement: no separate agreement phase is required, given the delivery guarantees of the multicast.
5. Response: each RM sends its response to the front end, which handles the responses according to the failure assumptions and the multicast algorithm.
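
A minimal sketch of these steps, modelling the totally ordered, reliable multicast as a single shared dispatcher (which trivially delivers requests to every RM in one global order); all names are invented for illustration:

```python
# Sketch of active replication over a stand-in total-order multicast.

class OrderedMulticast:
    """Stand-in for a totally ordered, reliable multicast: every RM
    sees the same sequence of requests in the same order."""
    def __init__(self):
        self.replicas = []

    def register(self, rm):
        self.replicas.append(rm)

    def multicast(self, request):
        return [rm.deliver(request) for rm in self.replicas]

class ReplicaManager:
    def __init__(self):
        self.balance = 0

    def deliver(self, request):
        """Every correct RM executes every request, deterministically."""
        op, amount = request
        if op == "deposit":
            self.balance += amount
        elif op == "withdraw":
            self.balance -= amount
        return self.balance

class FrontEnd:
    def __init__(self, mcast):
        self.mcast = mcast

    def invoke(self, request):
        responses = self.mcast.multicast(request)
        # Crash-only failure model: all responses agree, so take the
        # first. Under Byzantine failures we would vote on them instead.
        return responses[0]

mcast = OrderedMulticast()
for _ in range(3):
    mcast.register(ReplicaManager())
fe = FrontEnd(mcast)
print(fe.invoke(("deposit", 100)))   # 100 at every replica
print(fe.invoke(("withdraw", 30)))   # 70 at every replica
```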

Fault-Tolerant Services
Active replication
- The model assumes totally ordered, reliable multicast, which is equivalent to solving consensus; this requires either a synchronous system or a technique such as failure detectors in an asynchronous system.
- The model can be simplified if updates are assumed to be commutative, so that the effect of any two operations is the same in either order. For example, a bank account's daily deposits and withdrawals can be applied in any order unless the balance goes below zero; if the owner avoids overdrafts, the effects are commutative.
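
A quick check of the commutativity claim: modelling deposits and withdrawals as signed additions (and assuming no overdraft rule is triggered in any order), every ordering yields the same final balance:

```python
# Deposits and withdrawals commute when modelled as signed additions.
from itertools import permutations

updates = [+100, -30, +20]   # deposits positive, withdrawals negative
final_balances = set()
for order in permutations(updates):
    balance = 0
    for u in order:
        balance += u
    final_balances.add(balance)

print(final_balances)   # {90}: one final balance, whatever the order
```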

Content
- Introduction
- System model and the role of group communication
- Fault-tolerant services
- Case study: Bayou and Coda
- Transactions with replicated data

Case Study: Bayou and Coda
Introduction
- Both systems apply replication techniques to make services highly available, giving clients access to the service (with reasonable response times) for as much of the time as possible.
- Fault-tolerant systems send updates so that all correct RMs receive them as soon as possible, which may be unacceptably expensive for high-availability systems.
- It may be preferable to gain performance by propagating updates more slowly (but still acceptably), involving only a minimal set of RMs.
- Weaker consistency tends to require less agreement and provides more availability.

Case Study: Bayou and Coda
Bayou's approach to high availability:
- Users working in a disconnected fashion can make updates in any partition at any time, with the updates recorded at any replica manager.
- The replica managers are required to detect and manage conflicts when two partitions are rejoined and the updates are merged.
- Domain-specific policies, called operational transformations, are used to resolve conflicts, for example by giving priority to some partitions.

Case Study: Bayou and Coda
- Bayou holds its state values in a database that supports queries and updates. An update is a special case of a transaction, executed using the equivalent of a stored procedure so as to guarantee the ACID properties.
- Eventually every RM receives the same set of updates and applies them so that their databases become identical. However, since propagation is delayed, in an active system with a steady stream of updates the databases may never actually be identical at any one moment.

Case Study: Bayou and Coda
Bayou update resolution:
- Updates are marked as tentative when they are first applied to a database.
- Once coordination with the other RMs makes it possible to resolve conflicts and place the updates in a canonical order, they are committed; usually this is achieved by designating a primary RM.
- Once committed, updates remain applied in their allotted order.
- Every update includes a dependency check and a merge procedure.
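
A minimal sketch of tentative and committed updates, each carrying a dependency check and a merge procedure, in the spirit of Bayou's classic meeting-room example. The write format and all names are simplified inventions; Bayou's actual anti-entropy and commitment protocols are more involved:

```python
# Sketch of Bayou-style updates with dependency checks and merges.

class BayouReplica:
    def __init__(self):
        self.db = {}            # application state
        self.committed = []     # updates in canonical (committed) order
        self.tentative = []     # updates awaiting commitment

    def write(self, update):
        """Apply an update tentatively as soon as it arrives."""
        self.tentative.append(update)
        self._apply(update)

    def commit(self, update):
        """The primary has fixed this update's canonical position:
        move it to the committed list and replay everything."""
        self.tentative.remove(update)
        self.committed.append(update)
        self._replay()

    def _apply(self, update):
        # Dependency check: does the update's precondition still hold?
        if update["dep_check"](self.db):
            update["operation"](self.db)
        else:
            # Conflict detected: run the update's merge procedure.
            update["merge"](self.db)

    def _replay(self):
        """Re-derive the state: committed updates first, tentative after."""
        self.db = {}
        for u in self.committed + self.tentative:
            self._apply(u)

# Example: book a meeting room, falling back to a later slot on conflict.
booking = {
    "dep_check": lambda db: "room-A-10:00" not in db,
    "operation": lambda db: db.__setitem__("room-A-10:00", "alice"),
    "merge":     lambda db: db.setdefault("room-A-11:00", "alice"),
}
replica = BayouReplica()
replica.write(booking)
print(replica.db)   # {'room-A-10:00': 'alice'}
```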

Case Study: Bayou and Coda
[Figure: committed and tentative updates in Bayou — tentative updates are reordered into the canonical committed sequence as they are committed.]

Case Study: Bayou and Coda
- In Bayou, replication is not transparent to the application: knowledge of the application's semantics is required to increase data availability while maintaining a replication state that can be called eventually sequentially consistent.
- Disadvantages include increased complexity for application programmers and for users.
- The operational transformation approach is particularly suited to groupware, where workers access shared documents remotely.

Case Study: Bayou and Coda
- The Coda file system is a descendant of the Andrew File System (AFS), developed in a research project at Carnegie Mellon University.
- It addresses several requirements that AFS does not meet, particularly the requirement to provide high availability despite disconnected operation.
- With a growing number of AFS users working on laptops came the need to support disconnected use of replicated data and to increase performance and availability.

Case Study: Bayou and Coda
The Coda architecture:
- Coda runs Venus processes at the client computers and Vice processes at the file servers; the Vice processes are the replica managers.
- The set of servers holding replicas of a file volume is its volume storage group (VSG). A client accesses the subset it can currently reach, known as the available volume storage group (AVSG), which varies as servers become connected or disconnected.
- Updates are distributed by broadcasting them to the AVSG after a file is closed. If the AVSG is empty (disconnected operation), updated files are cached until the client reconnects.

Case Study: Bayou and Coda
- Coda uses an optimistic replication strategy: files can be updated while the network is partitioned or during disconnected operation.
- A Coda version vector (CVV) is a vector timestamp attached to each copy of a file; it is used at each site to determine whether there are any conflicts among updates at the time of reconnection. If there is no conflict, the missed updates are applied.
- Coda does not attempt to resolve conflicts. If there is a conflict, the file is marked inoperable and the owner of the file is notified. The check is done at the AVSG level, so conflicts may recur at the VSG level.
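
A sketch of the conflict test on Coda version vectors, assuming one counter per server in the volume storage group: two copies of a file conflict exactly when neither vector dominates the other, i.e., each side saw updates the other missed:

```python
# Conflict detection with Coda version vectors (CVVs).

def dominates(v1, v2):
    """v1 dominates v2 if every entry of v1 >= the matching entry of v2."""
    return all(a >= b for a, b in zip(v1, v2))

def compare(cvv_a, cvv_b):
    """Decide what to do with two copies of a file on reconnection."""
    if dominates(cvv_a, cvv_b):
        return "A is at least as new: propagate A's copy"
    if dominates(cvv_b, cvv_a):
        return "B is newer: propagate B's copy"
    # Neither dominates: both sides of the partition updated the file.
    # Coda marks the file inoperable and notifies its owner.
    return "conflict: mark file inoperable"

# Vectors over servers (S1, S2, S3). Both sides updated during a partition:
print(compare([2, 2, 1], [2, 1, 2]))   # conflict: mark file inoperable
# Only one side updated the file:
print(compare([3, 2, 2], [2, 2, 2]))   # A is at least as new: propagate A's copy
```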

Content
- Introduction
- System model and the role of group communication
- Fault-tolerant services
- Case study: Bayou and Coda
- Transactions with replicated data

Transactions with Replicated Data
Introduction
- From the client's point of view, transactions on replicated objects should appear the same as transactions on non-replicated objects: clients' transactions are interleaved in a serially equivalent manner.
- One-copy serializability: transactions performed on replicated objects have the same effect as if they had been performed one at a time on a single set of objects.

Transactions with Replicated Data
Three replication schemes for coping with network partitions:
- Available copies with validation: available copies replication is applied within each partition. When a partition is repaired, a validation procedure is applied and any inconsistencies are dealt with.
- Quorum consensus: a subgroup may continue to provide a service during a partition only if it has a quorum (sufficiently many members). When a partition is repaired (and when a replica manager restarts after a failure), replica managers bring their objects up to date by means of recovery procedures.
- Virtual partition: a combination of quorum consensus and available copies. If a virtual partition has a quorum, it can use available copies replication.

Transactions with Replicated Data
Available copies
- This scheme allows some RMs to be unavailable: reads may be performed at any available replica, while updates must be made at all available replicas, with provisions to restore and bring up to date an RM that has crashed.

Transactions with Replicated Data
Available copies
[Figure: an example of available copies replication — transactions read from one available replica and write to all available replicas.]

Transactions with Replicated Data
Available copies with validation
- An optimistic approach that allows updates to proceed in different partitions of a network.
- When the partition is repaired, conflicts must be detected and compensating actions taken.
- This approach is limited to situations in which such compensation is possible.

Transactions with Replicated Data
Quorum consensus
- A pessimistic approach to replicated transactions. A quorum is a subgroup of RMs large enough to have the right to carry out operations even while other RMs are unavailable. This limits updates to a single subset of the RMs, which bring the remaining RMs up to date after a partition is repaired.
- Gifford's file replication scheme assigns a number of votes to each copy of a replicated file. A read operation must assemble a read quorum of R votes and a write operation a write quorum of W votes, where W exceeds half the total votes and R + W exceeds the total, so that any read quorum overlaps any write quorum. The remaining RMs are brought up to date as a background task when they become available.
- Copies without enough votes to count toward a read quorum are weak copies, which may be read locally with limits assumed on their currency and quality.
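
A minimal sketch of Gifford-style weighted voting. The vote assignment and thresholds below are made-up examples; only the two constraints (W more than half the votes, R + W more than the total) come from the scheme itself:

```python
# Gifford-style weighted voting for quorum consensus.

votes = {"rm1": 2, "rm2": 1, "rm3": 1}   # example vote assignment
V = sum(votes.values())                  # total votes: 4
R, W = 2, 3                              # example read/write thresholds

assert W > V / 2      # any two write quorums intersect
assert R + W > V      # every read quorum sees at least one current copy

def has_quorum(reachable, threshold):
    """Can the reachable RMs muster enough votes to proceed?"""
    return sum(votes[rm] for rm in reachable) >= threshold

# rm3 is unreachable (e.g. on the far side of a partition):
print(has_quorum({"rm1", "rm2"}, R))   # True: 3 votes, reads may proceed
print(has_quorum({"rm1", "rm2"}, W))   # True: 3 votes, writes may proceed

# Only rm2 and rm3 are reachable:
print(has_quorum({"rm2", "rm3"}, R))   # True: 2 votes, reads may proceed
print(has_quorum({"rm2", "rm3"}, W))   # False: writes blocked here
```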

Transactions with Replicated Data
Virtual partition algorithm
- This approach combines quorum consensus (to handle partitions) with available copies (for faster read operations).
- A virtual partition is an abstraction of a real partition and contains a set of replica managers.

Transactions with Replicated Data
Virtual partition algorithm
[Figure: virtual partitions formed among the replica managers during a network partition.]

Transactions with Replicated Data
Virtual partition algorithm
Issues:
- If network partitioning is intermittent, different virtual partitions can form at different times, and overlapping virtual partitions would violate one-copy serializability.
- Virtual partitions are therefore created with logical timestamps, and replica managers join only the virtual partition with the higher timestamp, so that consistent (non-overlapping) virtual partitions are selected.

End of the Chapter …