Distributed Systems CS

Slides:



Advertisements
Similar presentations
COMP 655: Distributed/Operating Systems Summer 2011 Dr. Chunbo Chu Week 7: Consistency 4/13/20151Distributed Systems - COMP 655.
Advertisements

Replication. Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.
Consistency and Replication Chapter Topics Reasons for Replication Models of Consistency –Data-centric consistency models: strict, linearizable,
Consistency and Replication. Replication of data Why? To enhance reliability To improve performance in a large scale system Replicas must be consistent.
Distributed Systems Fall 2010 Replication Fall 20105DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Distributed Systems CS Consistency and Replication – Part II Lecture 11, Oct 10, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud.
Distributed Systems CS Consistency and Replication – Part I Lecture 10, Oct 5, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud.
Computer Science Lecture 14, page 1 CS677: Distributed OS Consistency and Replication Today: –Introduction –Consistency models Data-centric consistency.
CS 582 / CMPE 481 Distributed Systems Fault Tolerance.
CS 582 / CMPE 481 Distributed Systems
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Computer Science Lecture 14, page 1 CS677: Distributed OS Consistency and Replication Introduction Consistency models –Data-centric consistency models.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Consistency.
Distributed Systems CS Programming Models- Part V Replication and Consistency- Part I Lecture 18, Oct 29, 2014 Mohammad Hammoud 1.
Computer Science Lecture 16, page 1 CS677: Distributed OS Last Class:Consistency Semantics Consistency models –Data-centric consistency models –Client-centric.
Consistency and Replication CSCI 4780/6780. Chapter Outline Why replication? –Relations to reliability and scalability How to maintain consistency of.
Consistency And Replication
Distributed Systems CS Consistency and Replication – Part II Lecture 11, Oct 2, 2013 Mohammad Hammoud.
Practical Byzantine Fault Tolerance
Consistency and Replication. Replication of data Why? To enhance reliability To improve performance in a large scale system Replicas must be consistent.
IM NTU Distributed Information Systems 2004 Replication Management -- 1 Replication Management Yih-Kuen Tsay Dept. of Information Management National Taiwan.
Replication (1). Topics r Why Replication? r System Model r Consistency Models – How do we reason about the consistency of the “global state”? m Data-centric.
Linearizability Linearizability is a correctness criterion for concurrent object (Herlihy & Wing ACM TOPLAS 1990). It provides the illusion that each operation.
Distributed Systems CS Consistency and Replication – Part IV Lecture 21, Nov 10, 2014 Mohammad Hammoud.
Distributed Systems CS Consistency and Replication – Part I Lecture 10, September 30, 2013 Mohammad Hammoud.
Replication (1). Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.
Linearizability Linearizability is a correctness criterion for concurrent object (Herlihy & Wing ACM TOPLAS 1990). It provides the illusion that each operation.
Distributed Systems CS Consistency and Replication – Part IV Lecture 13, Oct 23, 2013 Mohammad Hammoud.
Hwajung Lee.  Improves reliability  Improves availability ( What good is a reliable system if it is not available?)  Replication must be transparent.
Linearizability Linearizability is a correctness criterion for concurrent object (Herlihy & Wing ACM TOPLAS 1990). It provides the illusion that each operation.
Replication Improves reliability Improves availability ( What good is a reliable system if it is not available?) Replication must be transparent and create.
Distributed Systems CS Consistency and Replication – Part I Lecture 11, Oct 19, 2015 Mohammad Hammoud.
Reliable multicast Tolerates process crashes. The additional requirements are: Only correct processes will receive multicasts from all correct processes.

CS6320 – Performance L. Grewe.
Distributed Systems – Paxos
Distributed Systems CS
CSE 486/586 Distributed Systems Consistency --- 1
Consistency and Replication
Consistency and Replication
Replication and Consistency
Implementing Consistency -- Paxos
Consistency Models.
Replication and Consistency
Distributed Systems CS
Outline Announcements Fault Tolerance.
Distributed Systems CS
Linearizability Linearizability is a correctness criterion for concurrent object (Herlihy & Wing ACM TOPLAS 1990). It provides the illusion that each operation.
DATA CENTRIC CONSISTENCY MODELS
CSE 486/586 Distributed Systems Consistency --- 1
Replication Improves reliability Improves availability
Consistency and Replication
Active replication for fault tolerance
Distributed Systems CS
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
Outline Announcements Lab2 Distributed File Systems 1/17/2019 COP5611.
Lecture 21: Replication Control
Causal Consistency and Two-Phase Commit
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Distributed Systems CS
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Outline Review of Quiz #1 Distributed File Systems 4/20/2019 COP5611.
EEC 688/788 Secure and Dependable Computing
Distributed Systems CS
Consistency and Replication
Distributed Systems (15-440)
Lecture 21: Replication Control
Implementing Consistency -- Paxos
Replication and Consistency
Presentation transcript:

Distributed Systems CS 15-440 Replication – Part I Lecture 23, December 03, 2018 Mohammad Hammoud

Today… Last Session: Caching (or client-side replication) – Part III Today’s Session: Replication (or more precisely, server-side replication) Data-Centric Consistency Models Announcements: Project 4 is out. It is due on December 12 by midnight PS5 is due on December 6 by midnight

Overview Motivation Consistency Models Data-Centric Consistency Models Today’s lecture Motivation Consistency Models Data-Centric Consistency Models Client-Centric Consistency Models Consistency Protocols Next lecture

Overview Motivation Consistency Models Data-Centric Consistency Models Client-Centric Consistency Models Consistency Protocols

Why Replication? Replication is necessary for: Improving performance A client can access nearby replicated copies and save latency Increasing the availability of services Replication can mask failures such as server crashes and network disconnection Enhancing the scalability of systems Requests to data can be distributed across many servers, which contain replicated copies of the data Securing against malicious attacks Even if some replicas are malicious, security of data can be guaranteed by relying on replicated copies at non-compromised servers

1. Replication for Improving Performance Example: Replication at secondary servers in Content Delivery Network (CDNs) Main Server Secondary Servers

2. Replication for High-Availability Example: Google File-System replicates data blocks at computers across different racks, clusters, and data-centers If one computer or a rack or a cluster crashes, blocks can still be accessed from other sources

3. Replication for Enhancing Scalability Distributing data across replicated servers helps in saving the main server from becoming a performance bottleneck Example: Content Delivery Networks can decrease load at main (primary) servers Main Server Replicated Servers

4. Replication for Securing Against Malicious Attacks If a minority of servers in a system are malicious, the non-malicious servers can outvote the malicious ones This technique can also be used to provide fault-tolerance against non- malicious but faulty servers Example: In a peer-to-peer system, peers can coordinate to prevent delivering faulty data to the requester 1 5 6 3 7 4 2 Number of servers with correct data outvote the faulty servers n = Servers with correct data = Servers with faulty data = Servers that do not have the requested data

Event 2 = Add interest of 5% Why Consistency? But (server-side) replication comes with a cost, which is the necessity for maintaining consistency (or more precisely consistent ordering of updates) Example: A Bank Database Event 1 = Add $1000 Event 2 = Add interest of 5% 1 2 Bal=2000 Bal=2100 Bal=1000 4 3 Bal=2050 Bal=1050 Bal=1000 Replicated Database

Overview Motivation Consistency Models Data-Centric Consistency Models Client-Centric Consistency Models Consistency Protocols

Maintaining Consistency of Replicated Data DATA-STORE Replica 1 Replica 2 Replica 3 Replica n x=5 x=2 x=0 x=2 x=5 x=0 x=0 x=5 x=2 x=0 x=5 x=2 Process 1 R(x)0 W(x)2 R(x)5 R(x)? Process 2 R(x)0 R(x)? R(x)2 Process 3 W(x)5 Strict Consistency Data is always fresh After a write operation, the update is propagated to all the replicas A read operation will result in reading the most recent write If read-to-write ratio is low, this leads to large overheads =Read variable x; Result is b = Write variable x; Result is b P1 =Process P1 =Timeline at P1 R(x)b W(x)b

Maintaining Consistency of Replicated Data (Cont’d) DATA-STORE Replica 1 Replica 2 Replica 3 Replica n x=0 x=0 x=2 x=0 x=2 x=5 x=0 x=2 x=0 x=3 x=2 Process 1 R(x)0 W(x)2 R(x)? R(x)5 Process 2 R(x)5 R(x)3 R(x)? Process 3 W(x)5 Loose Consistency Data might be stale A read operation may result in reading a value that was written long back Replicas are generally out-of-sync The replicas may sync at coarse grained time, thus reducing the overhead =Read variable x; Result is b = Write variable x; Result is b P1 =Process P1 =Timeline at P1 R(x)b W(x)b

Trade-offs in Maintaining Consistency Maintaining consistency should balance between the strictness of consistency versus efficiency (or performance) Good-enough consistency depends on your application Loose Consistency Strict Consistency Easier to implement, and is efficient Generally hard to implement, and is inefficient

Consistency Model A consistency model is a contract between: The process that wants to use the data and the data-store A consistency model states the level (or degree) of consistency provided by the data-store to the processes while reading and writing data

Types of Consistency Models Consistency models can be divided into two types: Data-Centric Consistency Models These models define how updates are propagated across the replicas to keep them consistent Client-Centric Consistency Models These models assume that clients connect to different replicas at different times They ensure that whenever a client connects to a replica, the replica is brought up to date with the replica that the client accessed previously

Overview Motivation Consistency Models Data-Centric Consistency Models Client-Centric Consistency Models Consistency Protocols

Consistent Ordering of Operations We need to express the semantics of parallel accesses when shared data are replicated Before updates at replicas are committed, all replicas shall reach an agreement on a global ordering of the updates That is, replicas in shared data-stores should agree on a consistent ordering of updates What consistent ordering of updates can replicas agree on?

Types of Ordering Below are three major types of orderings: Total Ordering Sequential Ordering Sequential Consistency Model Causal Ordering Causal Consistency Model

Types of Ordering Below are three major types of orderings: Total Ordering Sequential Ordering Sequential Consistency Model Causal Ordering Causal Consistency Model

Total Ordering What is total ordering? If process Pi sends a message mi and Pj sends mj, and if one correct process delivers mi before mj then every other correct process delivers mi before mj Messages can denote replica updates In the example Ex1, if P1 issues the operation m(1,1): x=x+1; and If P3 issues m(3,1): print(x); and P1 or P2 or P3 delivers m(3,1) before m(1,1) Then, at all replicas P1, P2, P3 the following order of operations are executed print(x); x=x+1; m(1,1) P1 P2 P3 m(3,1) Ex1: Total Order P1 P2 P3 m(1,1) m(3,1) Ex2: Not in Total Order

Types of Ordering Below are three major types of orderings: Total Ordering Sequential Ordering Sequential Consistency Model Causal Ordering Causal Consistency Model

Sequential Ordering What is sequential ordering? If a process Pi sends a sequence of messages m(i,1),...., m(i,ni), and Process Pj sends a sequence of messages m(j,1),...., m(j,nj), Then: At any process, the set of messages received are in some sequential order Messages from each individual process should appear in the same order sent by that process At every process, mi,1 should be delivered before mi,2, which should be delivered before mi,3 and so on... At every process, mj,1 should be delivered before mj,2, which should be delivered before mj,3 and so on... P1 P2 P3 m(1,1) m(3,1) m(3,2) m(1,2) m(3,3) Valid Sequential Orders P1 P2 P3 m(1,1) m(3,1) m(3,2) m(1,2) m(3,3) Invalid Sequential Orders, but Valid Total Order

Sequential Consistency (Cont’d) Consider three processes P1, P2 and P3 executing multiple instructions on three shared variables x, y and z Assume that x, y and z are set to zero at start There are many valid sequences in which operations can be executed, respecting sequential consistency (or program order) How can a program identify the wrong sequence among the following sequences: P1 P2 P3 x = 1 print (y,z) y = 1 print (x,z) z = 1 print (x,y) x = 1 print (y,z) y = 1 print (x,z) z = 1 print (x,y) x = 1 y = 1 print (x,z) print (y,z) z = 1 print (x,y) print (y,z) y = 1 print (x,y) x = 1 print (x,z) z = 1 y = 1 z = 1 print (x,y) print (x,z) x = 1 print (y,z) Output 001011 101011 000110 010111 Signature 001011 101011 001001 110101

Implications of Adopting A Sequential Consistency Model for Applications There might be several different sequentially consistent combinations of ordering Number of combinations for a total of n instructions = The contract between the process and the distributed data-store is that the process must accept all of the sequential orderings as valid results A process that works for some of the sequential orderings and not for others is INCORRECT 𝑂(𝑛!)

Types of Ordering Below are three major types of orderings: Total Ordering Sequential Ordering Sequential Consistency Model Causal Ordering Causal Consistency Model

Causal vs. Concurrent events Consider an interaction between processes P1 and P2 operating on replicated data x and y P1 W(x)a P1 W(x)a R(x)a W(y)b W(y)b R(x)a P2 P2 Events are causally related Events are not concurrent Computation of y at P2 may have depended on the value of x written by P1 Events are not causally related Events are concurrent Computation of y at P2 does not depend on the value of x written by P1 =Read variable x; Result is b = Write variable x; Result is b P1 =Process P1 =Timeline at P1 R(x)b W(x)b

Causal Ordering What is causal ordering? In Ex1: In Ex2: If process Pi sends a message mi and Pj sends mj, and if mimj (operator ‘’ is Lamport’s happened-before relation) then any correct process that delivers mj will deliver mi before mj In Ex1: m(1,1) and m(3,1) are in Causal Order m(1,1) and m(1,2) are in Causal Order In Ex2: m(1,1) and m(3,1) are NOT in Causal Order m(1,1) m(1,2) P1 P2 P3 m(3,1) Ex1: Valid Causal Orders P1 P2 P3 m(1,1) m(3,1) m(1,2) Ex2: Invalid Causal Order

Causal Consistency Model A data-store is causally consistent if: Writes that are potentially causally related are seen by all the processes in the same order But concurrent writes may be seen in a different order on different machines

Implications of adopting a Causally Consistent Data-store for Applications Processes have to keep track of which processes have seen which writes This requires maintaining a dependency graph between write and read operations Vector clocks provide a way to maintain causally consistent data-stores

Next Class Consistency Models Client-Centric Consistency Models Consistency Protocols We will study various implementations of consistency models