Replication (1). Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.

Slides:



Advertisements
Similar presentations
COMP 655: Distributed/Operating Systems Summer 2011 Dr. Chunbo Chu Week 7: Consistency 4/13/20151Distributed Systems - COMP 655.
Advertisements

Logical Clocks.
Linearizability Linearizability is a correctness criterion for concurrent object (Herlihy & Wing ACM TOPLAS 1990). It provides the illusion that each operation.
Replication. Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.
Consistency and Replication Chapter 7 Part II Replica Management & Consistency Protocols.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Consistency Steve Ko Computer Sciences and Engineering University at Buffalo.
Replication Management. Motivations for Replication Performance enhancement Increased availability Fault tolerance.
Consistency and Replication (3). Topics Consistency protocols.
Consistency and Replication. Replication of data Why? To enhance reliability To improve performance in a large scale system Replicas must be consistent.
Distributed Systems Fall 2010 Replication Fall 20105DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Database Replication techniques: a Three Parameter Classification Authors : Database Replication techniques: a Three Parameter Classification Authors :
CS 582 / CMPE 481 Distributed Systems
CMPT 431 Dr. Alexandra Fedorova Lecture XII: Replication.
Election AlgorithmsCS-4513 D-term Election Algorithms CS-4513 Distributed Computing Systems (Slides include materials from Operating System Concepts,
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Clock Synchronization and algorithm
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 18: Replication Control All slides © IG.
CMPT 401 Summer 2007 Dr. Alexandra Fedorova Lecture XII: Replication.
Logical Clocks (2). Topics r Logical clocks r Totally-Ordered Multicasting r Vector timestamps.
Consistency and Replication CSCI 4780/6780. Chapter Outline Why replication? –Relations to reliability and scalability How to maintain consistency of.
Consistency And Replication
Replication and Consistency. Reference The Dangers of Replication and a Solution, Jim Gray, Pat Helland, Patrick O'Neil, and Dennis Shasha. In Proceedings.
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Replication with View Synchronous Group Communication Steve Ko Computer Sciences and Engineering.
Consistent and Efficient Database Replication based on Group Communication Bettina Kemme School of Computer Science McGill University, Montreal.
Consistency and Replication. Replication of data Why? To enhance reliability To improve performance in a large scale system Replicas must be consistent.
Logical Clocks. Topics Logical clocks Totally-Ordered Multicasting Vector timestamps.
Lamport’s Logical Clocks & Totally Ordered Multicasting.
Replication (1). Topics r Why Replication? r Consistency Models – How do we reason about the consistency of the “global state”? m Data-centric consistency.
IM NTU Distributed Information Systems 2004 Replication Management -- 1 Replication Management Yih-Kuen Tsay Dept. of Information Management National Taiwan.
Replication (1). Topics r Why Replication? r System Model r Consistency Models – How do we reason about the consistency of the “global state”? m Data-centric.
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
Distributed Systems CS Consistency and Replication – Part IV Lecture 21, Nov 10, 2014 Mohammad Hammoud.
Distributed Systems Replication. Why Replication Replication is Maintenance of copies at multiple sites Enhancing Services by replicating data Performance.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Distributed Coordination. Turing Award r The Turing Award is recognized as the Nobel Prize of computing r Earlier this term the 2013 Turing Award went.
Distributed Systems CS Consistency and Replication – Part I Lecture 10, September 30, 2013 Mohammad Hammoud.
Chap 7: Consistency and Replication
Logical Clocks. Topics Logical clocks Totally-Ordered Multicasting Vector timestamps.
Replication and Group Communication. Management of Replicated Data FE Requests and replies C Replica C Service Clients Front ends managers RM FE RM Instructor’s.
Distributed Systems CS Consistency and Replication – Part IV Lecture 13, Oct 23, 2013 Mohammad Hammoud.
Chapter 7: Consistency & Replication IV - REPLICATION MANAGEMENT By Jyothsna Natarajan Instructor: Prof. Yanqing Zhang Course: Advanced Operating Systems.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Replication Steve Ko Computer Sciences and Engineering University at Buffalo.
Logical Clocks. Topics r Logical clocks r Totally-Ordered Multicasting.
Synchronization in Distributed File Systems Advanced Operating System Zhuoli Lin Professor Zhang.
Fault Tolerance (2). Topics r Reliable Group Communication.
Consistency and Replication (1). Topics Why Replication? Consistency Models – How do we reason about the consistency of the “global state”? u Data-centric.
PERFORMANCE MANAGEMENT IMPROVING PERFORMANCE TECHNIQUES Network management system 1.
Distributed Computing Systems Replication Dr. Sunny Jeong. Mr. Colin Zhang With Thanks to Prof. G. Coulouris,
Replication Chapter Katherine Dawicki. Motivations Performance enhancement Increased availability Fault Tolerance.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Consistency Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586 Distributed Systems Consistency --- 2
CS6320 – Performance L. Grewe.
Distributed Shared Memory
CSE 486/586 Distributed Systems Consistency --- 1
Consistency and Replication
Consistency Models.
7.1. CONSISTENCY AND REPLICATION INTRODUCTION
CSE 486/586 Distributed Systems Consistency --- 1
Active replication for fault tolerance
Distributed Systems CS
Lecture 21: Replication Control
Distributed Systems CS
Replica Placement Model: We consider objects (and don’t worry whether they contain just data or code, or both) Distinguish different processes: A process.
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Lecture 21: Replication Control
CSE 486/586 Distributed Systems Consistency --- 2
Network management system
CSE 486/586 Distributed Systems Consistency --- 1
Presentation transcript:

Replication (1)

Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures

Readings r Van Steen and Tanenbaum: 6.1, 6.2 and 6.3, 6.4 r Coulouris: 11,14

Why Replicate? r Fault-tolerance / High availability m As long as one replica is up, the service is available m Assume each of n replicas has same independent probability p to fail. Availability = 1 - pn Fault-Tolerance: Take-Over

Why Replicate? (II) r Fast local access m Client can always send requests to closest replica m Goal: no communication to remote replicas necessary during request execution m Goal: client experiences location transparency since all access is fast local access Fast local access Toronto Beijing Dubai

Why Replicate? r Scalability and load distribution m Requests can be distributed among replicas m Handle increasing load by adding new replicas to the system cluster instead of bigger server

Common Replication Examples r DNA naming service r Web browsers often locally store a copy of a previously fetched web page. m This is referred to as caching a web page. r Replication of a database r Replication of game state r Google File System (GFS)

System Model FE Requests and replies C Replica C Service Clients Front ends managers RM FE RM Clients’ request are handled by front ends. A front end makes replication transparent.

Developing Replication Systems r Consistency r Faults r Changes in Group Membership

Example of Data Inconsistency r Assume x = 6 r Client operations: m write(x = 5) m read (x) // should return 5 on a single- server system r Assume 3 replicas: R1,R2,R3

Example of Data Inconsistency r Client executes write (x = 5) r We want x = 5 at R1,R2,R3 r The write requires network communication m Assume a message is sent to each replica indicating that x=5 BTW, this is multicasting r The amount of time it takes for a message to get to a replica varies

Example of Data Inconsistency r R1 receives message m x is now 5 at R1 r A client executes read (x) m The read causes a message to be sent to R2 r If R2 hasn’t received the update message then R2 sends the old value of x (which is 6)

Example Scenario r Replicated accounts in New York(NY) and San Francisco(SF) r Two transactions occur at the same time and multicast m Current balance: $1,000 m Add $100 at SF m Add interest of 1% at NY m If not done in the same order at each site then one site will record a total amount of $1,111 and the other records $1,110.

Example Scenario r Updating a replicated database and leaving it in an inconsistent state.

Strict Consistency r Strict consistency is defined as follows: m Read is expected to return the value resulting from the most recent write operation m Assumes absolute global time m All writes are instantaneously visible to all r Suppose that process p i updates the value of x to 5 from 4 at time t 1 and multicasts this value to all replicas m Process p j reads the value of x at t 2 (t 2 > t 1 ). m Process p j should read x as 5 regardless of the size of the (t 2 -t 1 ) interval.

Strict Consistency r What if t 2 -t 1 = 1 nsec and the optical fiber between the host machines with the two processes is 3 meters. m The update message would have to travel at 10 times the speed of light m Not allowed by Einsten’s special theory of relativity. r Can’t have strict consistency

Sequential Consistency r Sequential Consistency: m The result of any execution is the same as if the (read and write) operations by all processes on the data were executed in some sequential order m Operations of each individual process appears in this sequence in the order specified by is programs.

Example Scenario r Banking operation r We must ensure that the two update operations are performed in the same order at each copy. r Although it makes a difference whether the deposit is processed before the interest update or the other way around, it does matter which order is followed from the point of view of consistency.

Causal Consistency r Causal Consistency: That if one update, U1, causes another update, U2, to occur then U1 should be executed before U2 at each copy. r Application: Bulletin board

FIFO Consistency r Writes done by a single process are seen by all other processes in the order in which they were issued r … but writes from different processes may be seen in a different order by different processes. r i.e., there are no guarantees about the order in which different processes see writes, except that two or more writes from a single source must arrive in order.

FIFO Consistency r Caches in web browsers m All updates are updated by page owner. m No conflict between two writes m Note: If a web page is updated twice in a very short period of time then it is possible that the browser doesn’t see the first update.

Implementation Option: Sequential Consistency r Have a centralized process that is a sequencer. r Each update request is sent to the sequencer which m Assigns the request a unique sequence number m Update request is forwarded to each replica m Operations are carried out in the order of their sequence number

What about the other models? r Good question r We will get to those later

Faults r The use of the sequencer is interesting but r What if the sequencer fails? m One technique is an election algorithm Selects a new sequencer Very well studied m However, we will see that Google’s File System (GFS) uses a different (and simpler) approach r What if one of the replica’s fail? m Lots of solutions

Design Considerations r Where to submit updates? m A designated server or any server? r When to propagate updates? r How many replicas to install?

Summary r We have given you a taste of some of the issues r Let’s look at how Google deals with these issues in its file system