Distributed Systems CS 15-440 Consistency and Replication – Part I Lecture 10, September 30, 2013 Mohammad Hammoud.

Slides:



Advertisements
Similar presentations
COMP 655: Distributed/Operating Systems Summer 2011 Dr. Chunbo Chu Week 7: Consistency 4/13/20151Distributed Systems - COMP 655.
Advertisements

Replication. Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.
Consistency and Replication Chapter Introduction: replication and scalability 6.2 Data-Centric Consistency Models 6.3 Client-Centric Consistency.
Consistency and Replication
Study of Hurricane and Tornado Operating Systems By Shubhanan Bakre.
Distributed Systems Fall 2010 Replication Fall 20105DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Distributed Systems CS Consistency and Replication – Part II Lecture 11, Oct 10, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud.
Distributed Systems CS Consistency and Replication – Part III Lecture 12, Oct 12, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Distributed Systems CS Consistency and Replication – Part I Lecture 10, Oct 5, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud.
Computer Science Lecture 14, page 1 CS677: Distributed OS Consistency and Replication Today: –Introduction –Consistency models Data-centric consistency.
Computer Science Lecture 16, page 1 CS677: Distributed OS Last Class: Web Caching Use web caching as an illustrative example Distribution protocols –Invalidate.
Distributed Systems CS Google Chubby and Message Ordering Recitation 4, Sep 29, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud.
OCT1 Principles From Chapter One of “Distributed Systems Concepts and Design”
Distributed Systems CS Synchronization – Part II Lecture 8, Sep 28, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud.
Distributed Systems CS Case Study: Replication in Google Chubby Recitation 5, Oct 06, 2011 Majd F. Sakr, Vinay Kolar, Mohammad Hammoud.
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Computer Science Lecture 14, page 1 CS677: Distributed OS Consistency and Replication Introduction Consistency models –Data-centric consistency models.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Consistency.
Distributed Systems CS Programming Models- Part V Replication and Consistency- Part I Lecture 18, Oct 29, 2014 Mohammad Hammoud 1.
Computer Science Lecture 16, page 1 CS677: Distributed OS Last Class:Consistency Semantics Consistency models –Data-centric consistency models –Client-centric.
Lecture 37: Chapter 7: Multiprocessors Today’s topic –Introduction to multiprocessors –Parallelism in software –Memory organization –Cache coherence 1.
Distributed Systems (15-440) Mohammad Hammoud December 4 th, 2013.
Consistency and Replication CSCI 4780/6780. Chapter Outline Why replication? –Relations to reliability and scalability How to maintain consistency of.
Consistency And Replication
Distributed Shared Memory: A Survey of Issues and Algorithms B,. Nitzberg and V. Lo University of Oregon.
CH2 System models.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Chapter 7 Consistency.
Distributed Systems CS Consistency and Replication – Part II Lecture 11, Oct 2, 2013 Mohammad Hammoud.
Database Systems: Design, Implementation, and Management Tenth Edition Chapter 12 Distributed Database Management Systems.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
第5讲 一致性与复制 §5.1 副本管理 Replica Management §5.2 一致性模型 Consistency Models
Eduardo Gutarra Velez. Outline Distributed Filesystems Motivation Google Filesystem Architecture The Metadata Consistency Model File Mutation.
Replication (1). Topics r Why Replication? r System Model r Consistency Models – How do we reason about the consistency of the “global state”? m Data-centric.
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
Distributed Systems CS Consistency and Replication – Part IV Lecture 21, Nov 10, 2014 Mohammad Hammoud.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
12/17/2015Distributed Systems - Comp 6551 Consistency and Replication The problems we are trying to solve Types of consistency Approaches to propagation.
Replication (1). Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.
Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.
Distributed Systems CS Consistency and Replication – Part IV Lecture 13, Oct 23, 2013 Mohammad Hammoud.
Chapter 7: Consistency & Replication IV - REPLICATION MANAGEMENT By Jyothsna Natarajan Instructor: Prof. Yanqing Zhang Course: Advanced Operating Systems.
Client-Centric Consistency Models
Hwajung Lee.  Improves reliability  Improves availability ( What good is a reliable system if it is not available?)  Replication must be transparent.
Linearizability Linearizability is a correctness criterion for concurrent object (Herlihy & Wing ACM TOPLAS 1990). It provides the illusion that each operation.
Replication Improves reliability Improves availability ( What good is a reliable system if it is not available?) Replication must be transparent and create.
Consistency and Replication Chapter 6 Presenter: Yang Jie RTMM Lab Kyung Hee University.
Distributed Systems CS Consistency and Replication – Part III Lecture 13, Oct 26, 2015 Mohammad Hammoud.
Distributed Systems CS Consistency and Replication – Part I Lecture 11, Oct 19, 2015 Mohammad Hammoud.
Distributed Systems CS Consistency and Replication – Part IV Lecture 14, Oct 28, 2015 Mohammad Hammoud.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Consistency and Replication CSCI 6900/4900. FIFO Consistency Relaxes the constraints of the causal consistency “Writes done by a single process are seen.
Consistency and Replication (1). Topics Why Replication? Consistency Models – How do we reason about the consistency of the “global state”? u Data-centric.
PERFORMANCE MANAGEMENT IMPROVING PERFORMANCE TECHNIQUES Network management system 1.
Distributed Shared Memory
Distributed Systems CS
CSE 486/586 Distributed Systems Consistency --- 1
Chapter 7: Consistency & Replication IV - REPLICATION MANAGEMENT -Sumanth Kandagatla Instructor: Prof. Yanqing Zhang Advanced Operating Systems (CSC 8320)
Distributed Systems CS
Consistency Models.
Distributed Systems CS
CSE 486/586 Distributed Systems Consistency --- 1
Consistency and Replication
Distributed Systems CS
EEC 688/788 Secure and Dependable Computing
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
Distributed Systems CS
Distributed Systems CS
Distributed Systems (15-440)
Presentation transcript:

Distributed Systems CS Consistency and Replication – Part I Lecture 10, September 30, 2013 Mohammad Hammoud

Today…  Part I of the Lecture:  Synchronization: Mutual Exclusion and Election Algorithms  Part II of the Lecture:  Consistency and Replication  Motivation  Overview  Types of Consistency Models 2 A New Chapter

Why Replication? Replication is the process of maintaining the data at multiple computers Replication is necessary for: 1.Improving performance A client can access the replicated copy of the data that is near to its location 2.Increasing the availability of services Replication can mask failures such as server crashes and network disconnection 3.Enhancing the scalability of the system Requests to the data can be distributed to many servers which contain replicated copies of the data 4.Securing against malicious attacks Even if some replicas are malicious, secure data can be guaranteed to the client by relying on the replicated copies at the non-compromised servers 3

1. Replication for Improving Performance Example Applications Caching webpages at the client browser Caching IP addresses at clients and DNS Name Servers Caching in Content Delivery Network (CDNs) Commonly accessed contents, such as software and streaming media, are cached at various network locations 4 Main Server Replicated Servers

2. Replication for High-Availability Availability can be increased by storing the data at replicated locations (instead of storing one copy of the data at a server) Example: Google File-System replicates the data at computers across different racks, clusters and data-centers If one computer or a rack or a cluster crashes, then the data can still be accessed from another source 5

3. Replication for Enhancing Scalability Distributing the data across replicated servers helps in avoiding bottlenecks at the main server It balances the load between the main and the replicated servers Example: Content Delivery Networks decrease the load on main servers of the website 6 Main Server Replicated Servers

Replication for Securing Against Malicious Attacks If a minority of the servers that hold the data are malicious, the non- malicious servers can outvote the malicious servers, thus providing security The technique can also be used to provide fault-tolerance against non-malicious but faulty servers Example: In a peer-to-peer system, peers can coordinate to prevent delivering faulty data to the requester 7 n = Servers with correct data n = Servers with faulty data n = Servers that do not have the requested data Number of servers with correct data outvote the faulty servers

Why Consistency? In a DS with replicated data, one of the main problems is keeping the data consistent An example: In an e-commerce application, the bank database has been replicated across two servers Maintaining consistency of replicated data is a challenge 8 Bal=1000 Replicated Database Event 1 = Add $1000 Event 2 = Add interest of 5% Bal= Bal= Bal= Bal=2100

Overview of Consistency and Replication Consistency Models Data-Centric Consistency Models Client-Centric Consistency Models Replica Management When, where and by whom replicas should be placed? Which consistency model to use for keeping replicas consistent? Consistency Protocols We study various implementations of consistency models 9 Next lectures Today’s lecture

Overview Consistency Models Data-Centric Consistency Models Client-Centric Consistency Models Replica Management Consistency Protocols 10

Introduction to Consistency and Replication In a distributed system, shared data is typically stored in distributed shared memory, distributed databases or distributed file systems The storage can be distributed across multiple computers Simply, we refer to a series of such data storage units as data-stores Multiple processes can access shared data by accessing any replica on the data-store Processes generally perform read and write operations on the replicas Process 1 Process 2 Process 3 Local Copy Distributed data-store

Maintaining Consistency of Replicated Data 12 x=0 Replica 1Replica 2Replica 3Replica n Process 1 Process 2 Process 3 R(x)b =Read variable x; Result is b W(x)b = Write variable x; Result is b P1 =Process P1=Timeline at P1 R(x)0 W(x)2 x=2 R(x)?R(x)2 W(x)5 R(x)?R(x)5 x=5 DATA-STORE Strict Consistency Data is always fresh After a write operation, the update is propagated to all the replicas A read operation will result in reading the most recent write If there are occasional writes and reads, this leads to large overheads

Maintaining Consistency of Replicated Data (Cont’d) 13 x=0 Replica 1Replica 2Replica 3Replica n Process 1 Process 2 Process 3 R(x)b =Read variable x; Result is b W(x)b = Write variable x; Result is b P1 =Process P1=Timeline at P1 R(x)0 R(x)5 W(x)2 x=2 R(x)?R(x)3 W(x)5 R(x)?R(x)5 x=0x=5x=3 DATA-STORE Loose Consistency Data might be stale A read operation may result in reading a value that was written long back Replicas are generally out-of-sync The replicas may sync at coarse grained time, thus reducing the overhead

Trade-offs in Maintaining Consistency Maintaining consistency should balance between the strictness of consistency versus efficiency Good-enough consistency depends on your application 14 Strict Consistency Generally hard to implement, and is inefficient Loose Consistency Easier to implement, and is efficient

Consistency Model A consistency model is a contract between the process that wants to use the data, and the replicated data repository (or data-store) A consistency model states the level of consistency provided by the data-store to the processes while reading and writing the data 15

Types of Consistency Models Consistency models can be divided into two types: Data-Centric Consistency Models These models define how the data updates are propagated across the replicas to keep them consistent Client-Centric Consistency Models These models assume that clients connect to different replicas at different times The models ensure that whenever a client connects to a replica, the replica is brought up to date with the replica that the client accessed previously 16

Summary Replication is necessary for improving performance, scalability availability, and security Replicated data-stores should be designed after carefully evaluating the trade-offs between tolerable data inconsistency and efficiency Consistency Models describe the contract between the data- store and processes about what form of consistency to expect from the system Consistency models can be classified into two types: Data-Centric Consistency models Client-Centric Consistency models 17

Next Classes Data-Centric Consistency Models Sequential and Causal Consistency Models Client-Centric Consistency Models Eventual Consistency, Monotonic Reads, Monotonic Writes, Read Your Writes and Writes Follow Reads Replica Management Replica management studies: when, where and by whom replicas should be placed which consistency model to use for keeping replicas consistent Consistency Protocols We study various implementations of consistency models 18

References [1] Haifeng Yu and Amin Vahdat, “Design and evaluation of a conit-based continuous consistency model for replicated services” [2] load-on-the-web/ [3] [4] [5] [6] [7] 19