Replication
March 16, 2001


Slide 2: What is Replication?
  - A technique for increasing availability, fault tolerance, and sometimes performance
  - Replication can occur at many levels:
      - Data
      - Computation
      - Communication
  - Issues: consistency and coordination

Slide 3: Replication Examples
  - Atomic writes via file logging
      - Performance hit: more than double the time to write
  - DNS vs. Grapevine
      - Consistency limits performance and scaling
  - RAID
      - Replication increases performance, but costs money
  - AFS
      - Possibility that a merge is not practical

Slide 4: Replication of Data in Transactions
  - One-copy semantics
      - To a transaction, a set of replicated objects should behave as one
  - Big tradeoff: performance under less than perfect conditions
      - Network failures
      - Server downtime

Slide 5: Replication Model
  - Simple: read-one/write-all
      - Use locks across all replicas to ensure consistency
  - Problem: availability
      - If a replica is down, its objects cannot be updated until it comes back up
      - Worse: if the network is partitioned, a whole set of objects is affected
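
The availability problem with read-one/write-all can be seen in a minimal sketch. The `Replica` class, the `up` flag, and the function names below are hypothetical, and the cross-replica locking mentioned on the slide is left out to keep the example short.

```python
class ReplicaDown(Exception):
    """Raised when an operation cannot reach the replicas it needs."""


class Replica:
    """Hypothetical in-memory replica; `up` models whether the server is reachable."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


def read_one(replicas, key):
    """Read-one: any single reachable replica can answer."""
    for replica in replicas:
        if replica.up:
            return replica.data.get(key)
    raise ReplicaDown("no replica available for read")


def write_all(replicas, key, value):
    """Write-all: refuse the update unless every replica is reachable."""
    down = [replica.name for replica in replicas if not replica.up]
    if down:
        raise ReplicaDown(f"cannot write {key!r}; replicas down: {down}")
    for replica in replicas:
        replica.data[key] = value


replicas = [Replica("a"), Replica("b"), Replica("c")]
write_all(replicas, "balance", 100)

replicas[1].up = False                                # one server goes down
value_during_outage = read_one(replicas, "balance")  # reads still succeed
try:
    write_all(replicas, "balance", 50)
    write_blocked = False
except ReplicaDown:
    write_blocked = True                              # updates are blocked
```

A single unreachable replica is enough to block every update to its objects, which is the motivation for the partition-tolerant schemes on the following slides.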

Slide 6: Network Failures
  - Failures partition a set of replicas
  - You have a variety of options to choose from:
      - Optimistic
      - Pessimistic
  - Key metrics for the choice:
      - Resource contention
      - Frequency and duration of network failures

Slide 7: Optimistic Approach
  - Allow any replica to act as a proxy for the whole
  - As partitions dissolve, merge updates
  - Use normal commit semantics if you can always merge safely
  - Alternative approach: manage inconsistencies outside of the transaction
      - e.g., ATMs, date book
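
As an illustration of merging after a partition dissolves, here is a sketch using the date book example. The three-way merge against a common `base` copy is an assumption chosen for illustration, not a method the slides prescribe. Keys are appointment slots, values are assumed never to be `None`, and all names are hypothetical.

```python
def merge_partitions(base, side_a, side_b):
    """Three-way merge of two copies that diverged from `base` during a partition.

    Returns (merged, conflicts). A slot changed on only one side takes that
    side's value. A slot changed differently on both sides is reported as a
    conflict and keeps its base value, to be resolved outside the transaction.
    """
    merged = dict(base)
    conflicts = {}
    for key in set(base) | set(side_a) | set(side_b):
        orig = base.get(key)      # None means "slot not present"
        a = side_a.get(key)
        b = side_b.get(key)
        if a == b:
            new = a               # both sides agree (or neither changed it)
        elif a == orig:
            new = b               # only side B changed it
        elif b == orig:
            new = a               # only side A changed it
        else:
            conflicts[key] = (a, b)
            continue
        if new is None:
            merged.pop(key, None)  # slot was deleted
        else:
            merged[key] = new
    return merged, conflicts


base = {"mon 9:00": "staff meeting"}
side_a = {"mon 9:00": "staff meeting", "tue 14:00": "dentist"}
side_b = {"mon 9:00": "staff meeting", "tue 14:00": "review", "wed 10:00": "lunch"}

merged, conflicts = merge_partitions(base, side_a, side_b)
```

Here the Wednesday entry merges cleanly, while the double-booked Tuesday slot is surfaced as a conflict for a person or application to resolve.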

Slide 8: Example: Available Copies Replication
  - Each replicated object is managed by a replica manager
  - Any available replica manager can process requests
  - Reads from any one replica are OK; writes propagate to all available replicas
  - Transactions can lock objects locally at replicas
  - Two-phase commit at the replica level ensures consistency among replicas
  - Requires the ability to detect server failures
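
A rough sketch of the available copies scheme follows. The `ReplicaManager` class and its methods are hypothetical, the `available` flag stands in for whatever failure detector the system uses, and local locking is omitted. The write path uses a simplified two-phase commit across the managers that are currently available.

```python
class ReplicaManager:
    """Hypothetical replica manager holding one copy of each object."""

    def __init__(self, name):
        self.name = name
        self.available = True   # stand-in for a failure detector's verdict
        self.committed = {}     # durable, visible values
        self.staged = {}        # values prepared but not yet committed

    def prepare(self, key, value):
        """Phase 1: stage the update and vote. This sketch always votes yes."""
        self.staged[key] = value
        return True

    def commit(self, key):
        """Phase 2: make the staged value visible."""
        self.committed[key] = self.staged.pop(key)

    def abort(self, key):
        self.staged.pop(key, None)


def read(managers, key):
    """Read from any one available replica manager."""
    for manager in managers:
        if manager.available:
            return manager.committed.get(key)
    raise RuntimeError("no replica manager available")


def write(managers, key, value):
    """Propagate a write to every available manager using two-phase commit."""
    live = [m for m in managers if m.available]
    if not live:
        raise RuntimeError("no replica manager available")
    if all(m.prepare(key, value) for m in live):
        for m in live:
            m.commit(key)
        return [m.name for m in live]
    for m in live:
        m.abort(key)
    return []


managers = [ReplicaManager("a"), ReplicaManager("b"), ReplicaManager("c")]
managers[2].available = False          # "c" is detected as failed

updated = write(managers, "x", 1)      # succeeds on the available copies
value = read(managers, "x")
```

Unlike read-one/write-all, the write succeeds even though one manager is down. The failed manager misses the update and would have to be brought up to date before serving reads again, which this sketch does not show.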

Slide 9: Pessimistic Approach: Quorum
  - One partition can be the proxy for the whole
  - Other partitions must wait for the network failure to be resolved
  - Each object replica has a value plus a timestamp of when the value was last updated
  - To read or write, you must have a quorum:
      - Writes: more than half the votes
      - Reads: more than (total votes minus the minimum needed to write)
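
The quorum rules above can be sketched as follows. This is a minimal illustration, assuming one vote per copy by default and a logical version counter standing in for the timestamp. The `Copy` class and all function names are hypothetical.

```python
def quorum_sizes(total_votes, write_quorum=None):
    """Return (read_quorum, write_quorum) satisfying the two rules on the slide."""
    if write_quorum is None:
        write_quorum = total_votes // 2 + 1          # smallest majority
    if 2 * write_quorum <= total_votes:
        raise ValueError("write quorum must be more than half the votes")
    read_quorum = total_votes - write_quorum + 1     # more than (total - write)
    return read_quorum, write_quorum


class Copy:
    """Hypothetical object replica: a value plus the version it was written at."""

    def __init__(self, votes=1):
        self.votes = votes
        self.reachable = True
        self.value = None
        self.timestamp = 0   # logical version counter (an assumption)


def gather(copies, needed_votes):
    """Collect reachable copies until their votes reach the quorum."""
    chosen, votes = [], 0
    for copy in copies:
        if copy.reachable:
            chosen.append(copy)
            votes += copy.votes
            if votes >= needed_votes:
                return chosen
    raise RuntimeError("no quorum: wait for the network failure to be resolved")


def quorum_write(copies, value, write_quorum):
    chosen = gather(copies, write_quorum)
    # Any write quorum overlaps the previous one, so it sees the latest version.
    version = max(copy.timestamp for copy in chosen) + 1
    for copy in chosen:
        copy.value, copy.timestamp = value, version


def quorum_read(copies, read_quorum):
    chosen = gather(copies, read_quorum)
    # Any read quorum overlaps the last write quorum; the newest copy wins.
    return max(chosen, key=lambda copy: copy.timestamp).value


copies = [Copy() for _ in range(5)]
r, w = quorum_sizes(5)                 # 5 votes: majority write quorum
quorum_write(copies, "v1", w)

copies[0].reachable = False            # two of the updated copies become unreachable
copies[1].reachable = False
latest = quorum_read(copies, r)        # still overlaps the write quorum

copies[2].reachable = False            # now only a minority is reachable
try:
    quorum_read(copies, r)
    minority_can_read = True
except RuntimeError:
    minority_can_read = False
```

Choosing a larger write quorum shrinks the read quorum, for example `quorum_sizes(5, 4)` gives a read quorum of 2, which trades write availability for cheaper reads.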

Slide 10: Virtual Partition
  - Combine quorum with available copies replication
  - A transaction acts in a virtual partition, which is artificially created
  - The virtual partition is the smallest quorum
  - Assumption: the virtual partition is small enough for available copies to succeed in practice

Slide 11: Putting It All Together
  - Transactions
      - Atomicity
      - Recoverability
  - Implemented on top of:
      - Locks (mutual exclusion algorithms)
      - Non-volatile storage with atomic writes (logging)
  - To achieve performance and availability:
      - Distribution and replication
  - Transaction implementations come in many flavors
Transaction Layers