CIS 720 Replication

Replica Management

Three Subproblems
Your boss says to you, “Our system is too slow, make it faster.” You decide that replication of servers is the answer. What do you do next? What are the questions that need to be answered?
- Where to place servers?
- Where to place content?
- What replication algorithm to use?

Placing Servers
Given a set of N locations, how do you place the K servers?
- What are the goals?
- What is the metric that is being optimized?
One algorithm: each time you place a server, choose the location that minimizes the average remaining distance to clients.
- What is “distance”?
- Is “average” the right thing to minimize? What if one client accesses the service heavily and another only rarely?
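
The greedy idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: `dist[c][l]` is a hypothetical client-to-location distance matrix, and the optional `demand` list (not in the slides) weights each client by how often it accesses the service, which is one way to address the "one client accesses a lot" question.

```python
from typing import List, Optional, Sequence


def average_distance(dist: Sequence[Sequence[float]],
                     chosen: List[int],
                     demand: Sequence[float]) -> float:
    """Demand-weighted average distance from each client to its nearest chosen server."""
    total_demand = sum(demand)
    return sum(w * min(row[s] for s in chosen)
               for row, w in zip(dist, demand)) / total_demand


def greedy_placement(dist: Sequence[Sequence[float]],
                     k: int,
                     demand: Optional[Sequence[float]] = None) -> List[int]:
    """Place k servers one at a time, each time picking the location that
    minimizes the average remaining distance to clients."""
    n_clients = len(dist)
    n_locations = len(dist[0])
    if demand is None:
        demand = [1.0] * n_clients  # assumption: all clients equally active
    chosen: List[int] = []
    for _ in range(k):
        candidates = [loc for loc in range(n_locations) if loc not in chosen]
        best = min(candidates,
                   key=lambda loc: average_distance(dist, chosen + [loc], demand))
        chosen.append(best)
    return chosen


# Hypothetical example: three clients, three candidate locations.
dist = [[0, 5, 9],
        [5, 0, 4],
        [9, 4, 0]]
uniform = greedy_placement(dist, 2)
skewed = greedy_placement(dist, 1, demand=[1, 1, 10])
```

With uniform demand the first server goes to the middle location; when the third client dominates the load, the first server moves next to it. Greedy placement is a heuristic and is not guaranteed to find the best set of K locations.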

One-copy equivalence
Conditions to ensure one-copy equivalence:
- a read and a write operation cannot happen at the same time
- two write operations cannot happen at the same time

Quorum based protocols
Each copy p has a weight, weight(p).
For each data item d:
- read quorum r(d)
- write quorum w(d)
Read quorum = any set of copies whose combined weight is >= r(d)
Write quorum = any set of copies whose combined weight is >= w(d)

Example: replicas A, B, C with weights 3, 2, 2
Read quorum: 3 { (A), (B, C), (A, B), (A, C), (A, B, C) }
Write quorum: 4 { (A, B), (A, C), (B, C), (A, B, C) }
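
The quorum sets in this example can be enumerated mechanically. A minimal sketch, assuming the figure gives the weights A = 3, B = 2, C = 2 (the reading that matches the listed sets); the function name `quorums` is introduced here for illustration.

```python
from itertools import combinations


def quorums(weights, threshold):
    """Return every set of copies whose combined weight is >= threshold."""
    names = sorted(weights)
    result = []
    for size in range(1, len(names) + 1):
        for group in combinations(names, size):
            if sum(weights[p] for p in group) >= threshold:
                result.append(group)
    return result


weights = {"A": 3, "B": 2, "C": 2}   # assumed weights
read_quorums = quorums(weights, 3)   # r(d) = 3
write_quorums = quorums(weights, 4)  # w(d) = 4
```

Under these assumptions the function yields the same five read quorums and four write quorums listed on the slide.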

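To ensure one-copy equivalence, we use the following rules:
- r(d) + w(d) > total(d), so every read quorum shares at least one copy with every write quorum
- w(d) > total(d)/2, so any two write quorums share at least one copy
where total(d) = sum of the weights of all the replicas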

Example: replicas A, B, C with weights 3, 2, 2
Read quorum: 3
Write quorum: 4
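
The two rules can be checked directly against an example like this one. A sketch, again assuming weights A = 3, B = 2, C = 2; `check_one_copy_rules` is a hypothetical helper name. With r(d) = 3 and w(d) = 4 the total weight is 7, so r(d) + w(d) equals the total rather than exceeding it: the read quorum (A) and the write quorum (B, C) share no copy.

```python
def check_one_copy_rules(weights, r, w):
    """Evaluate both quorum rules for thresholds r and w."""
    total = sum(weights.values())
    return {
        "read_write_overlap": r + w > total,    # r(d) + w(d) > total(d)
        "write_write_overlap": w > total / 2,   # w(d) > total(d)/2
    }


weights = {"A": 3, "B": 2, "C": 2}  # assumed weights
as_given = check_one_copy_rules(weights, r=3, w=4)
adjusted = check_one_copy_rules(weights, r=4, w=4)

# A read quorum and a write quorum that do not intersect when r = 3, w = 4.
disjoint = set(("A",)) & set(("B", "C"))
```

Raising the read threshold to 4 (one possible adjustment, not taken from the slides) makes both rules hold.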

A timestamp for each variable is maintained at each replica.
To write x:
- lock a write quorum
- let max be the largest timestamp for x in the quorum
- write x with timestamp max + 1 to the quorum

To read x:
- lock a read quorum
- read data items from the read quorum
- return the value with the largest timestamp
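
The write and read procedures can be sketched with in-memory replicas. This is an illustration only: the `Replica` class, `pick_quorum`, and the convention that a missing variable has timestamp 0 are assumptions, and locking is left out so the timestamp logic stays visible.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    rid: str
    weight: int
    store: dict = field(default_factory=dict)  # variable name -> (timestamp, value)


def pick_quorum(available, threshold):
    """Take replicas in the given order until their combined weight reaches the threshold."""
    chosen, weight = [], 0
    for rep in available:
        chosen.append(rep)
        weight += rep.weight
        if weight >= threshold:
            return chosen
    raise RuntimeError("not enough replicas available to form a quorum")


def write(available, w, name, value):
    quorum = pick_quorum(available, w)
    # Largest timestamp for the variable in the quorum (0 if never written).
    max_ts = max(rep.store.get(name, (0, None))[0] for rep in quorum)
    for rep in quorum:
        rep.store[name] = (max_ts + 1, value)
    return max_ts + 1


def read(available, r, name):
    quorum = pick_quorum(available, r)
    # Return the value carrying the largest timestamp.
    _, value = max((rep.store.get(name, (0, None)) for rep in quorum),
                   key=lambda tv: tv[0])
    return value


a, b, c = Replica("A", 1), Replica("B", 1), Replica("C", 1)
write([a, b], 2, "x", 10)       # write quorum {A, B}, timestamp 1
write([b, c], 2, "x", 20)       # write quorum {B, C}, timestamp 2
latest = read([a, c], 2, "x")   # read quorum {A, C} sees the copy at C
```

Replica A still holds the old value, but because the read quorum overlaps the latest write quorum at C, the read returns the newest value.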

Lock granting rules
- Two or more read locks can be granted concurrently on a replica.
- Two write locks, or a read lock and a write lock, cannot be granted at the same time.
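
These rules amount to a shared/exclusive lock per replica. A minimal, non-blocking sketch; the class name `ReplicaLock` and its `try_acquire`/`release` methods are hypothetical, and a real lock manager would queue waiting requests rather than refuse them.

```python
class ReplicaLock:
    """Tracks read and write locks held on one replica."""

    def __init__(self):
        self.readers = 0
        self.writer = False

    def try_acquire(self, mode):
        if mode == "read":
            if self.writer:           # read conflicts with a held write lock
                return False
            self.readers += 1         # any number of readers may share
            return True
        if mode == "write":
            if self.writer or self.readers:  # write conflicts with everything
                return False
            self.writer = True
            return True
        raise ValueError("mode must be 'read' or 'write'")

    def release(self, mode):
        if mode == "read":
            self.readers -= 1
        elif mode == "write":
            self.writer = False
        else:
            raise ValueError("mode must be 'read' or 'write'")


lock = ReplicaLock()
first_read = lock.try_acquire("read")
second_read = lock.try_acquire("read")
write_while_reading = lock.try_acquire("write")
```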

Avoid deadlocks
Acquire locks in increasing order of replica ids.
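
Ordered acquisition can be sketched with ordinary mutexes. This assumes one exclusive `threading.Lock` per replica (a simplification of the read/write locks) and hypothetical helper names `lock_quorum` and `unlock_quorum`. Because every thread sorts its quorum before locking, no two threads can hold locks while waiting on each other in a cycle.

```python
import threading

locks = {rid: threading.Lock() for rid in ("A", "B", "C")}


def lock_quorum(replica_ids):
    """Acquire the locks for a quorum in increasing order of replica id."""
    ordered = sorted(replica_ids)
    for rid in ordered:
        locks[rid].acquire()
    return ordered


def unlock_quorum(ordered):
    for rid in reversed(ordered):
        locks[rid].release()


counter = 0


def worker(quorum, rounds):
    global counter
    for _ in range(rounds):
        held = lock_quorum(quorum)
        counter += 1              # protected: both quorums below include A
        unlock_quorum(held)


# The two threads name their quorums in different orders; sorting removes the conflict.
threads = [threading.Thread(target=worker, args=(q, 1000))
           for q in (["B", "A"], ["A", "C", "B"])]
for t in threads:
    t.start()
for t in threads:
    t.join()
```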

Common quorum protocols
Majority consensus: weight(p) = 1; N copies
- r(d) = floor(N/2) + 1; w(d) = floor(N/2) + 1
Read one/write all: weight(p) = 1
- r(d) = 1; w(d) = N
Write one/read all
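
The thresholds for majority consensus and read one/write all can be computed and checked against the one-copy rules. A sketch with hypothetical function names, taking N/2 as integer division and every weight as 1.

```python
def majority_consensus(n):
    """Return (r, w) for majority consensus over n unit-weight copies."""
    q = n // 2 + 1
    return q, q


def read_one_write_all(n):
    """Return (r, w) for read one/write all over n unit-weight copies."""
    return 1, n


def satisfies_rules(n, r, w):
    """Both quorum rules, with total(d) = n."""
    return r + w > n and w > n / 2


results = {
    n: (majority_consensus(n), read_one_write_all(n))
    for n in range(1, 8)
}
```

For example, with five copies majority consensus reads and writes three copies each, while read one/write all reads one copy and writes all five.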

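Fault tolerance
- Majority consensus: tolerates fewer than N/2 replica failures (at most ceil(N/2) - 1), since floor(N/2) + 1 replicas must remain reachable.
- Read one/write all: writes are blocked by any single failure; reads can proceed as long as one replica is available.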
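
The fault-tolerance figures follow from simple arithmetic: with unit weights, an operation that needs q of N copies survives N - q failures. A short sketch; `max_failures` is a name introduced here for illustration.

```python
def max_failures(n, quorum_size):
    """Largest number of failed copies that still leaves a quorum of the given size."""
    return n - quorum_size


n = 5
majority = n // 2 + 1
tolerance = {
    "majority read/write": max_failures(n, majority),
    "read one/write all: write": max_failures(n, n),
    "read one/write all: read": max_failures(n, 1),
}
```

For N = 5 this gives two tolerated failures under majority consensus, none for read one/write all writes, and four for read one/write all reads.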

Mesh-based quorums