Replication and Consistency. Reference The Dangers of Replication and a Solution, Jim Gray, Pat Helland, Patrick O'Neil, and Dennis Shasha. In Proceedings.

Replication and Consistency

Reference: The Dangers of Replication and a Solution, Jim Gray, Pat Helland, Patrick O'Neil, and Dennis Shasha. In Proceedings of the ACM SIGMOD International Conference on Management of Data, 1996.

Introduction
- For mobile nodes, replication makes it possible to read and update the database while disconnected from the network.

Eager Replication
- All replicas are synchronized to the same value immediately
[Figure: replicas R along a time axis]

Eager Replication
- All replicas are synchronized to the same value
- Update performance is lower and response time is longer
[Figure: replicas R along a time axis]

Lazy Replication
- One replica is updated by the transaction
- Replicas synchronize asynchronously
- Multiple versions of the data exist
[Figure: replicas R along a time axis]

Example
- Consider a joint checking account. Suppose that it has $1,000 in it.
- The account is replicated in three places: the wife's checkbook, the husband's checkbook, and the bank's ledger.
- Eager replication ensures that all three books have the same account balance.
  - It prevents the husband and wife from writing checks totaling more than $1,000.

Example
- Lazy replication allows both the husband and the wife to write checks totaling $1,000 each, for a total of $2,000 in withdrawals.
- When these checks arrive at the bank, or when the husband and wife communicate, someone or something reconciles the transactions.
- The bank does the reconciliation by rejecting updates that cause an overdraft.
- Lots of time may be spent reconciling.
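
The checkbook example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Replica` class and the `write_check` and `reconcile` names are hypothetical, and the bank is assumed to process checks in arrival order and to reject any check that would overdraw the account.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the account: a balance plus a log of local checks."""
    balance: int
    log: list = field(default_factory=list)

    def write_check(self, amount: int) -> bool:
        # A lazy replica only consults its own (possibly stale) balance.
        if amount > self.balance:
            return False
        self.balance -= amount
        self.log.append(amount)
        return True


def reconcile(bank_balance: int, checks: list) -> tuple:
    """Apply checks at the bank in arrival order; reject overdrafts."""
    accepted, rejected = [], []
    for amount in checks:
        if amount <= bank_balance:
            bank_balance -= amount
            accepted.append(amount)
        else:
            rejected.append(amount)
    return bank_balance, accepted, rejected


wife = Replica(balance=1000)
husband = Replica(balance=1000)

# Both checks succeed locally because neither checkbook sees the other.
wife_ok = wife.write_check(1000)
husband_ok = husband.write_check(1000)

# The bank later sees both checks and must reject one of them.
final_balance, accepted, rejected = reconcile(1000, wife.log + husband.log)
```

Under eager replication the second `write_check` would have seen a balance of zero and failed immediately; under lazy replication the failure surfaces only at reconciliation time.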

Example
- The database for a checking account is a single number and a log of updates to that number.
- Databases are usually more complex.
- Disconnected operation and message delays mean that lazy replication needs reconciliation more often.

Concurrency Anomaly in Lazy Replication
- Replica R': which version of the data should it see?
- If a committed transaction turns out to be 'wrong', there is a conflict
- Conflicts have to be reconciled
[Figure: replicas holding different versions (R, R', R'', R''') along a time axis]

Scaleup Pitfall
- When the nodes diverge hopelessly we get system delusion: the database is inconsistent and there is no obvious way to repair it
[Figure: replicas holding different versions (R, R', R'', R''') along a time axis]

Regulate Replica Updates
- Group: any node with a copy of an item can update it
  - Update anywhere
- Master: only the master can update the primary copy of an item. All other replicas are read-only, and all update requests are sent to the master.

Replication Strategies

Propagation vs. Ownership   Lazy                              Eager
Group                       N transactions, N object owners   1 transaction, N object owners
Master                      N transactions, 1 object owner    1 transaction, 1 object owner
Two tier                    N+1 transactions, 1 object owner; tentative local updates, eager base updates

Eager Replication and Mobile Nodes
- Reads on disconnected clients may return stale data
- Simple eager replication prohibits updates if any node is disconnected
[Figure: replicas R along a time axis]

Eager Replication and Mobile Nodes
- For high availability, eager replication systems allow updates among members of the cluster.
- When a node joins a cluster, the cluster sends the new node all replica updates since the node was disconnected.

Eager Replication and Mobile Nodes
- Even if all the nodes are connected all the time, updates may fail due to deadlocks that prevent serialization errors.
  - The probability of deadlocks, and consequently of failed transactions, rises very quickly with transaction size and with the number of nodes. It is estimated that a 10-fold increase in nodes gives a 1000-fold increase in failed transactions.
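
The 10-fold to 1000-fold estimate corresponds to cubic growth in the number of nodes. As a small worked example, assuming the failure rate scales as the cube of the node count with everything else held fixed (the exponent is inferred from the estimate quoted above, and `relative_failure_rate` is a hypothetical helper):

```python
def relative_failure_rate(node_factor: float, exponent: int = 3) -> float:
    """Factor by which failed transactions grow when nodes grow by node_factor.

    Assumes failures scale as nodes ** exponent with all else held fixed;
    exponent=3 is inferred from the 10-fold -> 1000-fold estimate.
    """
    return node_factor ** exponent


# Growth in failed transactions for 2x, 5x, and 10x as many nodes.
growth = {factor: relative_failure_rate(factor) for factor in (2, 5, 10)}
```

Under that assumption, even doubling the number of nodes multiplies failed transactions by eight.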

Lazy Replication and Mobile Nodes
- With lazy group replication, we have to wait for all nodes to come online to commit
- Lazy master replication cannot work for mobile nodes, because a network connection to the master is needed for a transaction to complete

Lazy Replication and Mobile Nodes
- Lazy group replication allows any node to update any local data.
- When the transaction commits, a transaction is sent to every other node to apply the root transaction's updates to the replicas at the destination node.
- Two nodes may race to update the same object. This must be detected and reconciled.

Lazy Replication and Mobile Nodes
- Timestamps are commonly used to detect and reconcile lazy-group transactional updates.
- Each object carries the timestamp of its most recent update.
- Each replica update carries the new value and is tagged with the old object timestamp.
- Each node detects incoming replica updates that would overwrite earlier committed updates.
- The node tests whether the local replica's timestamp and the update's old timestamp are equal.
- If so, the update is safe.

Lazy Replication and Mobile Nodes
- If the update is safe, the local replica's timestamp advances to the new transaction's timestamp and the object value is updated.
- If the current timestamp of the local replica does not match the old timestamp seen by the root transaction, then the update may be "dangerous".
  - The node rejects the incoming transaction and submits it for reconciliation.
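
The timestamp test described above can be sketched as follows. This is a minimal illustration, not code from the paper: `ReplicaUpdate`, `LocalObject`, `apply_update`, and the reconciliation queue are hypothetical names, and timestamps are assumed to be plain integers.

```python
from dataclasses import dataclass


@dataclass
class ReplicaUpdate:
    """A replica update as described above (field names are illustrative)."""
    old_timestamp: int   # timestamp the root transaction saw on the object
    new_timestamp: int   # timestamp of the root transaction itself
    new_value: str


@dataclass
class LocalObject:
    value: str
    timestamp: int


def apply_update(obj: LocalObject, update: ReplicaUpdate, reconcile_queue: list) -> bool:
    """Apply the update if it is safe; otherwise queue it for reconciliation."""
    if obj.timestamp == update.old_timestamp:
        # Safe: the root transaction saw the same version this replica holds.
        obj.value = update.new_value
        obj.timestamp = update.new_timestamp
        return True
    # Dangerous: it would overwrite an update the root transaction never saw.
    reconcile_queue.append(update)
    return False


replica = LocalObject(value="v0", timestamp=1)
pending = []

# Two nodes both updated the object starting from timestamp 1.
first = ReplicaUpdate(old_timestamp=1, new_timestamp=5, new_value="from node A")
second = ReplicaUpdate(old_timestamp=1, new_timestamp=6, new_value="from node B")

first_applied = apply_update(replica, first, pending)
second_applied = apply_update(replica, second, pending)
```

The first update matches the local timestamp and is applied; the second carries an old timestamp that no longer matches, so it is rejected and left in `pending` for reconciliation.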

Example Replication Scenario #1
- Replicated DNS servers
  - One primary DNS server
  - Multiple replicas
  - Replicas use zone transfers to get an up-to-date database from the primary server
  - The database is transferred every so often
  - The state is inconsistent between transfers
- Lazy master replication
[Figure: DNS1.UGA.EDU, DNS2.UGA.EDU, DNS3.UGA.EDU]

Example Replication #2
- Palm Pilot synchronization
- The database (your address book) lives in a PIM (Outlook, say), in Palm Desktop, and on your Palm device. Updates are allowed anywhere. You could authorize your secretary to add items to your Outlook.
- Lazy group update

Example Replication #3
- Gnutella: when you add a new song on your computer, when do the other nodes see it? Eventually.
- Lazy group update

Example Replication #4
- Newsgroups
- Everyone can post to a newsgroup. You post to comp.risks from UWO, and your friend posts at the same time from Toronto. A reader at Waterloo will see the two posts in some order (UWO first and then Toronto, or the other way around).
- Lazy group replication

Example Replication #5
- Distributed databases with ACID semantics
- Eager master

Convergence Property
- If no new transactions arrive and all the nodes are connected together, they will all converge to the same replicated state after exchanging replica updates
- Updates may be lost because newer updates overwrite them
- Commutative updates: incremental transformations that can be applied in any order
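
A rough illustration of why commutative updates help: incremental transformations give the same final state in every arrival order, while overwriting writes leave whichever value arrived last and lose the others. The helper names and the sample values below are made up for the example.

```python
from itertools import permutations


def apply_increments(balance: int, deltas) -> int:
    """Incremental updates: each one transforms the current value."""
    for delta in deltas:
        balance += delta
    return balance


def apply_overwrites(value: int, writes) -> int:
    """Overwriting updates: each one replaces the current value."""
    for new_value in writes:
        value = new_value
    return value


deltas = [+50, -20, +5]
# Final balances reached over every possible arrival order.
increment_results = {apply_increments(100, order) for order in permutations(deltas)}

writes = [150, 80, 105]
# Final values reached over every possible arrival order.
overwrite_results = {apply_overwrites(100, order) for order in permutations(writes)}
```

Every ordering of the increments ends at the same balance, so replicas converge without losing any update. The overwrites end at a value that depends on arrival order, so some convention (such as newest timestamp wins) is needed to converge, and the earlier writes are lost.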

Two-Tier Replication
- Mobile nodes
  - Disconnected most of the time.
  - Store a master version and a tentative version
    - The master version on a disconnected or lazy replica may be outdated
    - The most recent value due to local updates is maintained as a tentative value
- Base nodes
  - Always connected.
  - Store a replica of the database.
  - Items are mastered at base nodes.

Two-Tier Transaction
- Base transaction
  - Works only on master data
  - Produces new master data
- Tentative transaction
  - Works on local tentative data
  - Produces new tentative versions
  - Also produces a base transaction to be run at a later time on the base nodes
- Acceptance criteria for each transaction update
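
The tentative and base transaction flow can be sketched as below. This is a simplified illustration under several assumptions: a single integer item, debits as the only transaction type, and an acceptance criterion that the master balance must not go negative. `BaseNode`, `MobileNode`, `tentative_debit`, and `reconnect` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class BaseNode:
    """Always-connected node holding the master copy of one item."""
    master: int


@dataclass
class MobileNode:
    """Holds a possibly stale master version plus a tentative version."""
    master: int
    tentative: int
    pending: list = field(default_factory=list)  # base transactions to run later

    def tentative_debit(self, amount: int) -> None:
        # Work on local tentative data and remember the base transaction.
        self.tentative -= amount
        self.pending.append(amount)

    def reconnect(self, base: BaseNode, accept: Callable[[int], bool]) -> list:
        """Rerun pending transactions on master data; return the rejected ones."""
        rejected = []
        for amount in self.pending:
            result = base.master - amount
            if accept(result):
                base.master = result      # base transaction commits
            else:
                rejected.append(amount)   # fails the acceptance criterion
        self.pending.clear()
        # Refresh local copies from the base node's master state.
        self.master = self.tentative = base.master
        return rejected


base = BaseNode(master=100)
mobile = MobileNode(master=100, tentative=100)

# While the mobile node is disconnected, another transaction debits 40 at the base.
base.master -= 40

mobile.tentative_debit(70)   # looks fine against the stale local copy
mobile.tentative_debit(20)
tentative_before_sync = mobile.tentative

rejected = mobile.reconnect(base, accept=lambda balance: balance >= 0)
```

The debit of 70 succeeded tentatively but fails the acceptance criterion when rerun against the real master balance of 60, so it is returned to the caller; the debit of 20 commits, and the mobile node's copies are refreshed to the base state.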

Key Properties of Two-Tier Replication Schemes
- Mobile nodes may make tentative database updates
- Base transactions execute with single-copy serializability, so the master base system state is the result of a serializable execution
- A transaction becomes durable when the base transaction completes
- Replicas at all connected nodes converge to the base system state
- If all transactions commute, there are no reconciliations