CS 582 / CMPE 481 Distributed Systems

CS 582 / CMPE 481 Distributed Systems Replication (cont.)

Class Overview
- Group Model
- Replication Model
- Request Ordering
- Case study
  - Lazy replication
  - Transactions with replicated data

Lazy Replication
Design principle
- provide a method that supports causal ordering for consistency while offering better performance
- a request is executed at one replica; the other replicas are updated through the exchange of gossip messages (hence "lazy" replication)
Supported orderings
- causal ordering
- forced ordering (causal + total ordering)
- immediate ordering (sync ordering)
Characteristics
- it is a replication method, not a replication framework
- no client replication is considered
- claimed to be better suited for providing replicated services

Lazy Replication (cont.)
Operational model
- FE (front end)
  - a client's request always goes through an FE
  - the FE usually communicates with one replica (the most conveniently available one); if that replica fails or is heavily loaded, it contacts another
  - the FE keeps a vector timestamp
    - attaches it to each request
    - updates it with the new timestamp obtained from a result, or from contact with other FEs
- operation types
  - query: read-only operation
  - update: modification without reading the state
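The FE's vector-timestamp bookkeeping can be sketched in a few lines. This is a minimal illustration, not code from the paper; the names `FrontEnd` and `merge` are invented for the example. Merging is the elementwise maximum of the two vectors, so the FE's label always reflects every update it has observed.

```python
def merge(label_a, label_b):
    """Merge two vector timestamps by taking the elementwise maximum."""
    return [max(a, b) for a, b in zip(label_a, label_b)]

class FrontEnd:
    def __init__(self, n_replicas):
        # One counter per replica; initially nothing has been observed.
        self.label = [0] * n_replicas

    def on_reply(self, reply_label):
        # Merge the timestamp returned by a replica (or received from
        # another FE) into our label, so later requests carry it as a
        # causal dependency.
        self.label = merge(self.label, reply_label)

fe = FrontEnd(3)
fe.on_reply([1, 0, 2])
fe.on_reply([0, 3, 1])
print(fe.label)  # [1, 3, 2]
```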

Lazy Replication (cont.)
Replica state (kept for data consistency)
- value: the current state
- value timestamp: represents the previous updates reflected in the current value
- update log: kept for safety and for consistency among replicas
- replica timestamp: represents the updates that have been processed (received into the log)
- request ids: a unique id per update request, used to avoid re-executing an update

Lazy Replication (cont.)
Causal operation
- the FE informs replicas of the causal ordering of requests
  - every reply to an update request contains a unique id (uid)
  - every request carries a set of uids (a label) listing all the updates that should have happened before it
  - a query returns a value together with a label listing all the updates that should have happened before it
- the FE guarantees causality
  - the FE sends its label in every request and merges the uid or label in each reply into its own label (union of the two sets)
  - when the FE's client communicates with other clients, the FE merges the labels in messages from other FEs into its own label
- gossip message handling at a replica
  - gossip message = replica log + replica timestamp
  - tasks:
    - add the new information in the message to the replica's log
    - merge the replica's timestamp with the message's timestamp
    - apply any updates that have become stable and have not been executed before
    - discard update records from the log once they have been received by all replicas
    - discard redundant entries from the log and the request-id list
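The gossip-handling tasks above can be sketched as follows. This is an illustrative toy, assuming each log record carries its uid, its causal-dependency label, and the timestamp it advances the value to; the class and field names are invented for the example. An update is "stable" once its dependency label is covered by the value timestamp, i.e. everything it causally depends on has already been applied.

```python
def vt_merge(a, b):
    return [max(x, y) for x, y in zip(a, b)]

def vt_leq(a, b):
    return all(x <= y for x, y in zip(a, b))

class Replica:
    def __init__(self, n_replicas):
        self.value = []                     # toy state: list of applied ops
        self.value_ts = [0] * n_replicas    # updates reflected in the value
        self.replica_ts = [0] * n_replicas  # updates present in the log
        self.log = {}                       # uid -> (deps, rec_ts, op)
        self.executed = set()               # uids already applied

    def on_gossip(self, remote_log, remote_ts):
        # 1. Add records from the message that we have not seen before.
        for uid, rec in remote_log.items():
            self.log.setdefault(uid, rec)
        # 2. Merge our replica timestamp with the message's timestamp.
        self.replica_ts = vt_merge(self.replica_ts, remote_ts)
        # 3. Apply any updates that have become stable: all causal
        #    predecessors are already reflected in value_ts.
        progress = True
        while progress:
            progress = False
            for uid, (deps, rec_ts, op) in self.log.items():
                if uid not in self.executed and vt_leq(deps, self.value_ts):
                    self.value.append(op)
                    self.value_ts = vt_merge(self.value_ts, rec_ts)
                    self.executed.add(uid)
                    progress = True

r = Replica(2)
gossip_log = {"u1": ([0, 0], [1, 0], "set x=1"),
              "u2": ([1, 0], [1, 1], "set x=2")}  # u2 depends on u1
r.on_gossip(gossip_log, [1, 1])
print(r.value)  # ['set x=1', 'set x=2'] -- applied in causal order
```

Note that u2 cannot be applied before u1 even if it appears first in the message, which is exactly the stability condition the slide describes.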

Lazy Replication (cont.)
Forced ordering
- provided only for updates
- implementation: causal + total ordering
  - same as causal ordering, except that uids are totally ordered with respect to one another
  - one replica becomes the primary (sequencer); any replica can take over the role when the primary fails
- forced update operation
  - phase 1 (prepare)
    - assign a uid to the update
    - create a log record for the update and send it to the backup replicas, which in turn acknowledge receipt
  - phase 2 (commit)
    - when a majority of the backups acknowledge, the primary adds the record to its log, applies the update to the value, and replies to the FE
    - backups are informed of the commit in the next gossip message
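The two-phase forced update can be sketched with a simulated primary and backups. This is an invented illustration (the classes `Primary` and `Backup` are not from the paper); message passing is modeled as direct method calls, and backups learning of the commit via gossip is left out.

```python
class Backup:
    def __init__(self, up=True):
        self.up = up          # whether this backup is reachable
        self.log = []

    def prepare(self, record):
        # Phase 1 at a backup: store the record and acknowledge.
        if self.up:
            self.log.append(record)
            return True
        return False          # unreachable: no acknowledgement

class Primary:
    def __init__(self, backups):
        self.backups = backups
        self.seq = 0          # source of totally ordered uids
        self.log = []
        self.value = []

    def forced_update(self, op):
        # Phase 1 (prepare): assign a totally ordered uid and ship the
        # log record to the backups, collecting acknowledgements.
        self.seq += 1
        record = (self.seq, op)
        acks = sum(1 for b in self.backups if b.prepare(record))
        # Phase 2 (commit): once a majority of the backups acknowledge,
        # log the record, apply the update, and reply to the FE.
        # (Backups would learn of the commit in the next gossip message.)
        if acks > len(self.backups) // 2:
            self.log.append(record)
            self.value.append(op)
            return True
        return False

backups = [Backup(), Backup(), Backup(up=False)]
primary = Primary(backups)
print(primary.forced_update("deposit 10"))  # True: 2 of 3 backups ack
```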

Lazy Replication (cont.)
Immediate ordering
- provided only for updates
- implementation
  - one replica becomes the primary
  - the primary coordinates with all replicas to determine which updates precede the sync update, and generates its label
  - a three-phase operation is used so that a failure of the primary does not leave queries blocked
    - phase 1 (pre-prepare): ask every backup to send its log and timestamp; once a reply from every backup has arrived, stop responding to queries
    - phase 2 (prepare): create a log record for the immediate update and send it to the backups
    - phase 3 (commit): if a majority of the backups acknowledge receipt, the primary adds the record to its log, applies the update to the value, and replies to the FE; if a failure happens, the new primary carries out phase 2 again

Transactions with Replicated Data
- replicated transactions: transactions in which a physical copy of each logical data item is replicated at a group of servers (replicas)
- one-copy serializability
  - the effects of transactions performed by various clients on replicated data items are the same as if they had been performed one at a time on single data items
  - to achieve this, concurrency control mechanisms are applied to all of the replicas
- the 2PC protocol becomes a two-level nested 2PC protocol
  - phase 1: a worker forwards the "ready" message to its replicas and collects their answers
  - phase 2: a worker forwards the "commit" message to its replicas
- primary copy replication: concurrency control is applied only to the primary

Transactions with Replicated Data (cont.)
Available copies replication
- designed to allow for some replicas being unavailable
- a client's Read operation is performed on any one available copy, but a Write operation is performed on all available copies
- failures and recoveries of replicas must be serialized to support one-copy serializability
- local validation: a concurrency control mechanism that ensures no failure or recovery event appears to happen during the progress of a transaction

Transactions with Replicated Data (cont.)
Network partition
- can separate a group of replicas into subgroups between which communication is not possible
- it is assumed that the partition will eventually be repaired
Resolutions
- available copies with validation
- quorum consensus
- virtual partition

Transactions with Replicated Data (cont.)
Available copies with validation
- the available copies algorithm is applied within each partition
- after the partition is repaired, possibly conflicting transactions are validated
  - version vectors can be used to check the validity of separately committed data items
  - precedence graphs can be used to detect conflicts between Read and Write operations across partitions
- only feasible for applications where compensation is allowed
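The version-vector check used during validation can be illustrated briefly. This is a generic sketch of the standard comparison (function names invented): two versions of a data item conflict when neither vector dominates the other, meaning the copies were updated independently in different partitions.

```python
def dominates(a, b):
    """True if vector a reflects every update that b does."""
    return all(x >= y for x, y in zip(a, b))

def conflict(a, b):
    # Versions conflict when neither dominates the other: the item
    # was updated concurrently on both sides of the partition.
    return not dominates(a, b) and not dominates(b, a)

print(conflict([2, 0], [2, 1]))  # False: second version supersedes first
print(conflict([2, 0], [1, 1]))  # True: independent updates, must reconcile
```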

Transactions with Replicated Data (cont.)
Quorum Consensus [Gifford 1979]
- operations are allowed only when a certain number of replicas (a quorum) is available in the partition
- only one partition can allow operations to be committed, preventing transactions in different partitions from producing inconsistent results
- the other subgroups will have out-of-date copies (detected via version numbers or timestamps)
- quorum sizes
  - W > half the total votes
  - R + W > total number of votes for the group
  - any pair of a read quorum and a write quorum must contain common copies, so there are no conflicting operations on the same copy
- read operations
  - check that enough copies are available: ≥ R (some may be out of date)
  - perform the operation on an up-to-date copy
- write operations
  - check that enough up-to-date copies are available: ≥ W
  - perform the operation on all replicas
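The two quorum constraints can be checked with a tiny helper (the name `valid_quorum` is invented for this sketch). W greater than half the votes rules out two disjoint write quorums (write-write conflicts); R + W greater than the total guarantees every read quorum overlaps every write quorum, so a read always sees at least one up-to-date copy.

```python
def valid_quorum(r, w, total_votes):
    # W > half the votes: no two write quorums can be disjoint.
    # R + W > total votes: every read quorum overlaps every write quorum.
    return w > total_votes / 2 and r + w > total_votes

N = 5  # five replicas, one vote each
print(valid_quorum(3, 3, N))  # True: majority reads and writes
print(valid_quorum(1, 5, N))  # True: ROWA (read one, write all)
print(valid_quorum(2, 2, N))  # False: quorums need not overlap
```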

Transactions with Replicated Data (cont.)
Quorum Consensus Example: three configurations of the voting algorithm
- a correct choice of read and write quorums
- a choice that may lead to write-write conflicts
- a correct choice known as ROWA (read one, write all)

Transactions with Replicated Data (cont.)
Virtual partition
- a combination of quorum consensus (to cope with partitions) and the available copies algorithm (for inexpensive Read operations)
- to support one-copy serializability, a transaction aborts if a replica fails and the virtual partition changes during the progress of the transaction
- when a virtual partition is formed, all replicas must be brought up to date by copying from other replicas
- virtual partition creation
  - phase 1
    - the initiator sends a Join request, carrying a logical timestamp, to each potential member replica
    - each replica compares the timestamp with that of its current virtual partition: if the proposed timestamp is greater than the local one, it replies Yes; otherwise, No
  - phase 2
    - if the initiator receives sufficient Yes replies to form read and write quorums, it sends a confirmation message with the list of members
    - each member records the timestamp and the membership list