

Bayou

References
- The Case for Non-transparent Replication: Examples from Bayou. Douglas B. Terry, Karin Petersen, Mike J. Spreitzer, and Marvin M. Theimer. IEEE Data Engineering, December 1998.
- Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System. Douglas B. Terry, Marvin M. Theimer, Karin Petersen, Alan J. Demers, Mike J. Spreitzer, and Carl H. Hauser. In ACM Symposium on Operating Systems Principles (SOSP ’95).

Introduction
- The scenario presented here is a little different from that in previous work
- Server replicas can be disconnected (temporarily) from the network
- There are conflicts in write updates that are best handled by the user
  - Can’t view data as just bits
- A read operation may result in stale data

Calendar Application
- Calendar updates made by several people
  - e.g., meeting room scheduling, or exec+admin
- Want to allow updates offline
- But conflicts can’t be prevented
- Two possibilities:
  - Disallow offline updates
  - Conflict resolution

Calendar Application
- Suppose we have two conference rooms of the same capacity. I want to schedule my meeting in one of the conference rooms. I don’t care which exact room it is.
- If two people reserve the same room at the same time, there is a conflict, but if they reserve the same room at different times or reserve different rooms at the same time, there is no conflict.
[Figure: rooms Rm1 and Rm2 plotted against time, showing a schedule with no conflict]
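
The conflict rule on this slide can be written as a small predicate. This is a minimal sketch, not Bayou code: the `Reservation` type, its field names, and the use of minutes since midnight are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    room: str
    start: int  # minutes since midnight (assumed unit)
    end: int    # exclusive end, same unit


def conflicts(a: Reservation, b: Reservation) -> bool:
    """Two reservations conflict only if they name the same room
    and their time intervals overlap."""
    same_room = a.room == b.room
    overlap = a.start < b.end and b.start < a.end
    return same_room and overlap
```

Same room at different times, or different rooms at the same time, both return `False`, matching the two no-conflict cases described above.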

Calendar Application
[Figure: two room-versus-time diagrams for Rm1 and Rm2, one showing no conflict and one showing a conflict]

Calendar Application
- We could lock the entire database
  - Not needed when there is no conflict
  - In the case of a conflict, there is an application-specific way to deal with it: we move one reservation to the other room
  - If the other room is reserved, we ask the user whether they can move the reservation to another acceptable time
  - Other resolution strategies:
    - Classes take priority over meetings
    - Admin meetings bump faculty meetings

Shared Mailbox
- Shared mailbox folders, shared between me and my 4 TAs.
- We all replicate the mailbox.
  - OP1: I see a mail from the class, respond to it and delete it.
  - OP2: The TA sees the same mail and files it in CS1026.
  - OP3: I see an email from a friend and file it as important.
[Figure: mailbox with folders CS1026, Important, Recruiting, Chocolate]

Shared Mailbox
- All of us operate on the same mailbox
- We could lock the entire mailbox before someone operates on it.
  - Can’t work when disconnected
  - Clearly not necessary for doing only one operation
  - For operations OP1 and OP2, it is not clear who should win: should the mail be deleted, or should it be filed in CS1026?

Two Approaches to Building Replicated Services
- Transparent replication system:
  - Allow systems that were developed assuming a central file system or database to run unchanged on top of a strongly consistent replicated storage system (as seen in OceanStore)
- Non-transparent replication system:
  - Relaxed consistency model: access-update-anywhere
  - Applications are involved in conflict detection and resolution, hence applications need to be modified (e.g., Bayou, the Coda file system)

Hypothesis
- Conflicts between application users are not easily handled transparently.
- Applications know best how to resolve conflicts
- The challenge is providing the right interface to support cooperation between applications and their data managers

Conflict Detection: Dependency Checks
- Each Write operation includes a dependency check consisting of an application-supplied query and its expected result.
- If the check fails, then the requested update is not performed and the server invokes a procedure to resolve the detected conflict.

Example of Bayou Write
- A Bayou Write is a 3-tuple: update, dependency check, and mergeproc
- For example:
  - Update: …
  - Dependency check: …
  - Mergeproc: …
- A different merge procedure could search for the next available time slot to schedule the meeting, which is an option a user might choose if any time would be satisfactory.
- Another is to allow the conflicts; sometimes users like conflicts

Conflict Resolution: Merge Procedure
- A merge procedure is run to resolve a detected conflict
- Merge procedures are written by application programmers
- Merge procedures are in the form of templates that are instantiated with the appropriate details
- In the case where automatic resolution is not possible, the merge procedure will still run to completion, but is expected to produce a revised update that logs the detected conflict in some fashion that will enable a person to resolve the conflict later.
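
The interplay of update, dependency check, and merge procedure can be sketched as follows. This is an illustrative model only, not Bayou's actual interface: `Calendar`, `BayouWrite`, `apply_write`, and `reserve` are hypothetical names, and the dependency check is modeled as a Python callable plus an expected result rather than a database query.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Slot = Tuple[str, int]  # (room, hour); a hypothetical calendar key


@dataclass
class Calendar:
    slots: Dict[Slot, str] = field(default_factory=dict)
    conflict_log: List[str] = field(default_factory=list)


@dataclass
class BayouWrite:
    update: Callable[[Calendar], None]       # the requested change
    dep_query: Callable[[Calendar], object]  # application-supplied query
    dep_expected: object                     # result the query must return
    mergeproc: Callable[[Calendar], None]    # runs if the check fails


def apply_write(db: Calendar, w: BayouWrite) -> bool:
    """Apply the update if the dependency check passes; otherwise run
    the merge procedure. Returns True when no conflict was detected."""
    if w.dep_query(db) == w.dep_expected:
        w.update(db)
        return True
    w.mergeproc(db)
    return False


def reserve(meeting: str, room: str, hour: int,
            alternates: List[Slot]) -> BayouWrite:
    """Build a write that books (room, hour), falls back to the first
    free alternate, and logs the conflict if nothing is free."""
    slot = (room, hour)

    def update(db: Calendar) -> None:
        db.slots[slot] = meeting

    def query(db: Calendar) -> bool:
        return slot in db.slots  # expected result: False (slot is free)

    def merge(db: Calendar) -> None:
        for alt in alternates:
            if alt not in db.slots:
                db.slots[alt] = meeting
                return
        db.conflict_log.append(meeting)  # left for a person to resolve

    return BayouWrite(update, query, False, merge)
```

Because `apply_write` depends only on the database contents and the write itself, two servers that apply the same writes in the same order reach the same state, which is the determinism the design relies on.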

Replica Consistency
- We may have multiple replicas, e.g.,
  - Calendar on a PDA; intermittent connectivity
  - Calendar on a server in the wired network
- Need conflict resolution as described earlier, but this in itself is not sufficient.
- Need a mechanism to deal with the variation among the multiple servers
- Want users to be able to continue after a write even if not all replicas have carried out the write operation.

Replica Consistency
- Bayou is designed for eventual consistency:
  - All servers receive all Writes via the pair-wise anti-entropy process (described later)
  - Two servers holding the same set of Writes will have the same data contents
- Strict bounds on Write propagation delays are not enforced.
- Bayou has these features:
  - Writes are performed in the same, well-defined order at all servers
  - The conflict detection and merge procedures are deterministic, so that servers resolve the same conflicts in the same manner

Replica Consistency
- When a Write is accepted by a Bayou server from a client, it is deemed tentative
- Timestamps for tentative Writes must monotonically increase at each server
- At some point each tentative Write should be marked as committed
- Must be able to undo Writes. Why?
  - Servers may receive Writes from clients and from other servers in an order that differs from the required execution order
  - Servers immediately apply all known Writes to their replicas

Anti-Entropy
- Anti-entropy refers to a model for propagating information between servers
- A server P picks another server Q at random to exchange updates
- Three approaches:
  - P only pushes its own updates to Q
  - P only pulls in new updates from Q
  - P and Q send updates to each other
- This model is used in Bayou for propagating writes (updates)
  - All replicas receive all updates via a chain of pair-wise interactions
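
The three exchange modes can be sketched with each server's state reduced to a set of update identifiers. This is a simplified model under stated assumptions: real Bayou servers exchange ordered write logs, and the function names and `mode` strings here are hypothetical.

```python
import random
from typing import Dict, Set


def anti_entropy(p: Set[str], q: Set[str], mode: str = "push-pull") -> None:
    """One pair-wise exchange between servers P and Q, in place."""
    if mode in ("push", "push-pull"):
        q |= p  # P sends what it has to Q
    if mode in ("pull", "push-pull"):
        p |= q  # P fetches what Q has


def gossip_until_converged(servers: Dict[str, Set[str]],
                           rng: random.Random,
                           max_rounds: int = 1000) -> int:
    """Repeat random pair-wise push-pull exchanges until every server
    holds every update; return the number of rounds used."""
    names = list(servers)
    everything = set().union(*servers.values())
    for rounds in range(1, max_rounds + 1):
        p, q = rng.sample(names, 2)
        anti_entropy(servers[p], servers[q])
        if all(s == everything for s in servers.values()):
            return rounds
    raise RuntimeError("did not converge within max_rounds")
```

No server ever needs to contact all others directly: an update accepted at one server reaches the rest through a chain of such pair-wise interactions.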

Example
- Suppose a user keeps the primary copy of his calendar with him on his laptop and allows others, such as a spouse or secretary, to keep secondary (mostly-read) copies.
- The user updates his own calendar; this is committed immediately.
- Updates by the spouse/secretary are tentative until anti-entropy takes place with the user. At this point, the user can commit them and propagate the commit order to the spouse/secretary during anti-entropy.

Ordering of Updates
- Maintain an ordered list of writes at each node
  - This will be referred to as the write log
  - Each write has a unique Write ID, which pairs a local-time-stamp with the identity of the accepting server
  - The local-time-stamp is assigned by the server that accepted the write from the front end (referred to as the accepting server)

Write Log Example
- Write from node A: asks for meeting M1 to occur at 10 AM, else 11 AM, in MC 316
- Write from node B: asks for meeting M2 to occur at 10 AM, else 11 AM, in MC 316
- Let’s agree to sort by write ID (e.g., …)
- As write operations spread from node to node, nodes may initially apply updates in different orders
- The dependency check and merge procedures do not necessarily take care of this
- Want writes for a particular data item to be performed in the same order at all replicas

Write Log Example
- Each newly seen write is merged into the write log
- The log is replayed
  - May cause the calendar displayed to the user to change!
  - i.e., all entries are tentative; nothing is stable unless an entry is committed.
- How do we get commits?
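
Merging into the write log and replaying it can be sketched as below. This is a toy model under stated assumptions: a write ID is taken to be a `(local timestamp, node)` pair, the timestamps 701 and 770 in the usage note are made up, and the "10 AM, else 11 AM" request is modeled as a list of preferred hours in a single room.

```python
import bisect
from typing import Dict, List, Tuple

WriteID = Tuple[int, str]                # (local timestamp, accepting node)
Write = Tuple[WriteID, str, List[int]]   # (id, meeting, preferred hours)


def replay(log: List[Write]) -> Dict[int, str]:
    """Rebuild the calendar from scratch by applying the log in order.
    Each meeting takes its first preferred hour that is still free."""
    calendar: Dict[int, str] = {}
    for _write_id, meeting, hours in log:
        for hour in hours:
            if hour not in calendar:
                calendar[hour] = meeting
                break
    return calendar


class Replica:
    def __init__(self) -> None:
        self.log: List[Write] = []
        self.calendar: Dict[int, str] = {}

    def receive(self, write: Write) -> None:
        """Merge a newly seen write into the log (sorted by write ID),
        then undo everything and replay the whole log."""
        if write in self.log:
            return
        bisect.insort(self.log, write)
        self.calendar = replay(self.log)
```

With `w_a = ((701, "A"), "M1", [10, 11])` and `w_b = ((770, "B"), "M2", [10, 11])`, a replica that sees `w_b` first initially shows M2 at 10 AM; once `w_a` arrives and the log is replayed, M1 takes 10 AM and M2 moves to 11 AM, which is the displayed calendar changing under the user.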

Criteria for Committing Writes
- For log entry X to be committed, everyone must agree on:
  - The total order of all previous committed entries
  - The fact that X is next in the total order
  - The fact that all uncommitted entries are “after” X

How Bayou Agrees on Total Order of Committed Writes
- One node is designated the primary replica
- The primary marks each write it receives with a permanent CSN (commit sequence number)
  - That write is committed
  - Complete timestamp is …
- Nodes exchange CSNs

How Bayou Agrees on Total Order of Committed Writes
- CSNs define a total order for committed writes
  - All nodes eventually agree on the total order
  - Uncommitted writes (these do not have CSNs) come after all committed writes
- Are there constraints on what the total order may look like?
  - Yes.
  - Assume that a user has issued two writes for a data collection
  - The order they were issued in should be reflected in the total order
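
The resulting log order, committed writes by CSN followed by tentative writes, can be expressed as a sort key. This is a sketch under assumptions: the `LogEntry` fields, the `Primary` class, and the choice to order tentative writes by `(local_ts, node)` are illustrative, and the timestamps in the usage note are made up.

```python
import math
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class LogEntry:
    local_ts: int              # timestamp from the accepting node
    node: str                  # accepting node
    csn: Optional[int] = None  # None while the write is tentative


def order_key(e: LogEntry) -> Tuple[float, int, str]:
    """Committed entries sort by CSN; tentative entries (no CSN)
    sort after all committed ones, by local timestamp and node."""
    return (math.inf if e.csn is None else e.csn, e.local_ts, e.node)


class Primary:
    """The single node that hands out commit sequence numbers."""

    def __init__(self) -> None:
        self.next_csn = 1

    def commit(self, e: LogEntry) -> LogEntry:
        committed = replace(e, csn=self.next_csn)
        self.next_csn += 1
        return committed
```

If the primary commits the write `(770, "B")` before it has heard of `(701, "A")` or `(650, "C")`, sorting with `order_key` places B's write first even though its local timestamp is the latest, so the committed order need not match timestamp order.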

Showing Users that Writes Have Committed
- Still not safe to show users that an appointment request has committed
- The entire log up to the newly committed entry must be committed
  - Otherwise there might be an earlier committed write that a node doesn’t know about!
  - …and upon learning about it, the node would have to re-run conflict resolution
- Result: a committed write is not stable unless the node has seen all prior committed writes

Commits
- A server R should keep track of the following for every other server X:
  - R.V[X] is the latest timestamp from server X that server R has seen
- When two servers connect, exchanging their version vectors allows them to identify the missing updates
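
A version-vector exchange can be sketched as follows. This is a minimal illustration, assuming writes are `(timestamp, accepting server)` pairs with positive timestamps and that each server's writes are always learned in timestamp order, so that "latest timestamp seen from X" implies all earlier writes from X have been seen too. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Write = Tuple[int, str]  # (timestamp, accepting server)


def version_vector(log: List[Write]) -> Dict[str, int]:
    """V[X] = latest timestamp seen among writes accepted by server X."""
    v: Dict[str, int] = {}
    for ts, server in log:
        v[server] = max(v.get(server, 0), ts)
    return v


def missing_for(receiver_v: Dict[str, int],
                sender_log: List[Write]) -> List[Write]:
    """Writes in the sender's log that the receiver has not yet seen,
    judged only from the receiver's version vector."""
    return sorted(w for w in sender_log if w[0] > receiver_v.get(w[1], 0))
```

The receiver only ships its vector, one number per server, and the sender answers with exactly the writes the receiver lacks, rather than the two sides comparing whole logs.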

Committed vs. Tentative Writes
- Can now show the user if a write has committed
  - When the node has seen every CSN up to that point, as guaranteed by the propagation protocol
- A slow or disconnected node cannot prevent commits!
  - The primary replica allocates CSNs; the global order of writes may not reflect real-time write times
- What about tentative writes, though: how do they behave, as seen by users?

Trimming the Log
- When nodes receive new CSNs, they can discard all committed log entries seen up to that point
  - The update protocol guarantees that CSNs are received in order
- Instead, keep a copy of the whole database as of the highest CSN
  - By definition, this is the official committed database
  - Everyone does (or will) agree on its contents
  - Its entries never need to go through conflict resolution again
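
Log trimming can be sketched as a replica that keeps only a committed snapshot plus its tentative writes. This is a toy model under assumptions: the class and method names are hypothetical, writes reuse the `(write ID, meeting, preferred hours)` shape with made-up timestamps, and in-order CSN delivery is enforced with an exception rather than a real propagation protocol.

```python
from typing import Dict, List, Tuple

Write = Tuple[Tuple[int, str], str, List[int]]  # (id, meeting, hours)


def apply(db: Dict[int, str], write: Write) -> None:
    """Give the meeting its first preferred hour that is still free."""
    _write_id, meeting, hours = write
    for hour in hours:
        if hour not in db:
            db[hour] = meeting
            return


class TrimmedReplica:
    def __init__(self) -> None:
        self.committed_db: Dict[int, str] = {}  # state as of highest CSN
        self.highest_csn = 0
        self.tentative: List[Write] = []        # kept sorted by write ID

    def accept_tentative(self, write: Write) -> None:
        if write not in self.tentative:
            self.tentative.append(write)
            self.tentative.sort()

    def learn_commit(self, csn: int, write: Write) -> None:
        """Fold a committed write into the snapshot and drop its log entry."""
        if csn != self.highest_csn + 1:
            raise ValueError("CSNs must arrive in order")
        apply(self.committed_db, write)
        self.highest_csn = csn
        if write in self.tentative:
            self.tentative.remove(write)

    def view(self) -> Dict[int, str]:
        """What the user sees: committed snapshot plus tentative writes."""
        db = dict(self.committed_db)
        for write in self.tentative:
            apply(db, write)
        return db
```

Only the tentative suffix is ever replayed; the committed snapshot is never revisited, which is why committed entries can be discarded from the log.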

Technology Impact
- TrueSync: end-to-end synchronization software and infrastructure solutions for the wireless Internet
- SyncML: the common language for synchronizing all devices and applications over any network
  - Ericsson, IBM, Lotus, Motorola, Nokia, Palm Inc., Psion, Starfish Software, etc. (614 companies)

Conclusions
- Differences from other replicated systems
  - Non-transparency
  - Application-specific conflict detection
  - Per-write conflict resolvers
  - Partial and multi-object updates
  - Tentative and stable resolutions
  - Security
- Future goals
  - Partial replication, policies for choosing servers for anti-entropy, building servers with conventional database managers, alternate data models, and finer-grained access control.