G22.3250-001 Robert Grimm New York University Bayou: A Weakly Connected Replicated Storage System.

Slides:



Advertisements
Similar presentations
My first computer: The Apple ][ It wanted to be programmed.
Advertisements

Eventual Consistency Jinyang. Sequential consistency Sequential consistency properties: –Latest read must see latest write Handles caching –All writes.
Linearizability Linearizability is a correctness criterion for concurrent object (Herlihy & Wing ACM TOPLAS 1990). It provides the illusion that each operation.
Replication Management. Motivations for Replication Performance enhancement Increased availability Fault tolerance.
“Managing Update Conflicts in Bayou, a Weekly Connected Replicated Storage System” Presented by - RAKESH.K.
G Robert Grimm New York University Disconnected Operation in the Coda File System.
1 CS 194: Lecture 10 Bayou, Brewer, and Byzantine.
Computer Science Lecture 16, page 1 CS677: Distributed OS Last Class: Web Caching Use web caching as an illustrative example Distribution protocols –Invalidate.
CS 582 / CMPE 481 Distributed Systems
“Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System ” Distributed Systems Κωνσταντακοπούλου Τζένη.
Flexible Update Propagation for Weakly Consistent Replication Karin Petersen, Mike K. Spreitzer, Douglas B. Terry, Marvin M. Theimer and Alan J. Demers.
Department of Electrical Engineering
Replication and Consistency CS-4513 D-term Replication and Consistency CS-4513 Distributed Computing Systems (Slides include materials from Operating.
Ordering of events in Distributed Systems & Eventual Consistency Jinyang Li.
Versioning and Eventual Consistency COS 461: Computer Networks Spring 2011 Mike Freedman 1.
Concurrency Control & Caching Consistency Issues and Survey Dingshan He November 18, 2002.
1 CS 194: Lecture 9 Bayou Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer Sciences University.
Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System D. B. Terry, M. M. Theimer, K. Petersen, A. J. Demers, M. J. Spreitzer.
Mobility Presented by: Mohamed Elhawary. Mobility Distributed file systems increase availability Remote failures may cause serious troubles Server replication.
Academic Year 2014 Spring. MODULE CC3005NI: Advanced Database Systems “DATABASE RECOVERY” (PART – 1) Academic Year 2014 Spring.
Peer-to-Peer in the Datacenter: Amazon Dynamo Aaron Blankstein COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101
Practical Replication. Purposes of Replication Improve Availability Replicated databases can be accessed even if several replicas are unavailable Improve.
Mobility in Distributed Computing With Special Emphasis on Data Mobility.
Replication and Consistency. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer,
Asynchronous Replication and Bayou. Asynchronous Replication client B Idea: build available/scalable information services with read-any-write-any replication.
Fault Tolerance via the State Machine Replication Approach Favian Contreras.
CS Storage Systems Lecture 14 Consistency and Availability Tradeoffs.
1 CS 268: Lecture 20 Classic Distributed Systems: Bayou and BFT Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
Feb 7, 2001CSCI {4,6}900: Ubiquitous Computing1 Announcements Tomorrow’s class is officially cancelled. If you need someone to go over the reference implementation.
Consistency And Replication
Bayou. References r The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer, and Marvin M. Theimer.
Replication and Consistency. Reference The Dangers of Replication and a Solution, Jim Gray, Pat Helland, Patrick O'Neil, and Dennis Shasha. In Proceedings.
1 Consistency of Replicated Data in Weakly Connected Systems CS444N, Spring 2002 Instructor: Mary Baker.
D u k e S y s t e m s Asynchronous/Causal Replication in Bayou Jeff Chase Duke University.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Data Versioning Lecturer.
Practical Byzantine Fault Tolerance
Asynchronous Replication and Bayou Jeff Chase CPS 212, Fall 2000.
IM NTU Distributed Information Systems 2004 Replication Management -- 1 Replication Management Yih-Kuen Tsay Dept. of Information Management National Taiwan.
Mobile File System Byung Chul Tak. AFS  Andrew File System Distributed computing environment developed at CMU provides transparent access to remote shared.
CSE 486/586 CSE 486/586 Distributed Systems Consistency Steve Ko Computer Sciences and Engineering University at Buffalo.
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
Feb 1, 2001CSCI {4,6}900: Ubiquitous Computing1 Eager Replication and mobile nodes Read on disconnected clients may give stale data Eager replication prohibits.
1 Multiversion Reconciliation for Mobile Databases Shirish Hemanath Phatak & B.R.Badrinath Presented By Presented By Md. Abdur Rahman Md. Abdur Rahman.
Write Conflicts in Optimistic Replication Problem: replicas may accept conflicting writes. How to detect/resolve the conflicts? client B client A replica.
Asynchronous Replication client B Idea: build available/scalable information services with read-any-write-any replication and a weak consistency model.
Client-Centric Consistency Models
Bayou: Replication with Weak Inter-Node Connectivity Brad Karp UCL Computer Science CS GZ03 / th November, 2007.
Consistency Guarantees Prasun Dewan Department of Computer Science University of North Carolina
Consistency and Replication Chapter 6 Presenter: Yang Jie RTMM Lab Kyung Hee University.
Eventual Consistency Jinyang. Review: Sequential consistency Sequential consistency properties: –All read/write ops follow some total ordering –Read must.
Highly Available Services and Transactions with Replicated Data Jason Lenthe.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Log Shipping, Mirroring, Replication and Clustering Which should I use? That depends on a few questions we must ask the user. We will go over these questions.
Consistency and Replication (1). Topics Why Replication? Consistency Models – How do we reason about the consistency of the “global state”? u Data-centric.
Mobility Victoria Krafft CS /25/05. General Idea People and their machines move around Machines want to share data Networks and machines fail Network.
Disk Cache Main memory buffer contains most recently accessed disk sectors Cache is organized by blocks, block size = sector’s A hash table is used to.
Nomadic File Systems Uri Moszkowicz 05/02/02.
Eventual Consistency: Bayou
Replication and Consistency
CSE 486/586 Distributed Systems Consistency --- 3
Replication and Consistency
EECS 498 Introduction to Distributed Systems Fall 2017
Eventual Consistency: Bayou
Fault-Tolerant State Machine Replication
Outline The Case for Non-transparent Replication: Examples from Bayou Douglas B. Terry, Karin Petersen, Mike J. Spreitzer, and Marvin M. Theimer. IEEE.
Replica Placement Model: We consider objects (and don’t worry whether they contain just data or code, or both) Distinguish different processes: A process.
Last Class: Web Caching
CSE 486/586 Distributed Systems Consistency --- 3
Replication and Consistency
Presentation transcript:

G Robert Grimm New York University Bayou: A Weakly Connected Replicated Storage System

Altogether Now: The Three Questions  What is the problem?  What is new or different?  What are the contributions and limitations?

Bayou from High Above  A replicated storage system  Designed with mobile computing in mind  Supports read/write anywhere  Makes very limited assumptions about connectivity  Provides eventual consistency  Exposes both tentative and stable data  Is not transparent to applications  Writes are ’s  Is centered around an epidemic anti-entropy protocol  One-way operation between pairs of servers  Propagation of writes  Constrained by accept order

The Target Environment  A worst-case scenario…  Mobile computers  Expensive connection time  Frequent disconnections  Computers never connected simultaneously  Though, Bayou also works in less bad environments  Considerable flexibility in setting anti-entropy policies  When to reconcile  With which replicas to reconcile  When to truncate the write-log  From which servers to create new replicas

Let’s Start from Scratch: Calendaring as an Example  Two main issues related to consistency  Ordering of operations  Detection and resolution of conflicts  The traditional solution: Lots of clients, one server  Ordering: One copy, server picks order  Conflicts: Server checks for conflicts, returns errors  So, why not use this approach?  Local access on personal devices  Intermittent connectivity with Internet  Intermittent connectivity with other users (Infrared, Bluetooth,…)

Straw Man: Swap/Sync Databases  May be resource intensive  Might require lots of network bandwidth  Hard to ensure consistency  There is no notion (of ordering) of operations  It is hard to automatically detect conflicts and resolve them  Problem: Viewing DB as collection of bits  Represents a snapshot in time  Solution: View DB in terms of updates  Operational: Read, think, make change  Well-ordered: Ensure that all replicas converge on same snapshot

Towards a More Good Solution  Maintain an ordered list of updates for each node  Enter the write log  Make sure every node has the same updates  Make sure every node applies updates in same order  Accept order, causal accept order, total order  Make sure that updates are deterministic  No access to local time, server name, rand(), …  Now, a sync does not merge databases, but lists  Much easier than merging of collections of bits

What about Ordering? Enter Session Guarantees  Observation: We very much care about ordering  Even for tentative operations  Read your writes: W  R  E.g., change password, log in  Monotonic reads: W  R1  R2  E.g., meetings stay in calendar, listed s readable  Write follows read: W1  R1  W2  W1  W2  E.g., newsgroup reply appears after original post  Monotonic writes: W1  W2  E.g., last text file edit survives

An Example Write  Also need timestamp:  Lamport clock for causal accept order

Propagating Writes  Unidirectional, peer-to-peer synchronization  By wired/wireless network, floppy disk, USB keychain,…  Updates may appear out of (total) order  E.g.,, ; node B receives  Need to be merged into log  Undo newer updates (e.g., )  Insert just received updates  Replay the log  User’s view of data (calendar) may change  But when everybody has seen all writes, everybody will agree

We Like Short Logs… Step 1: How about Commitment?  We need to know when everybody has seen a write  Lamport clocks preserve causal order, but don’t provide global consensus  We need a notion of commitment  For entry X to be committed, everyone must agree on  The total order of all previous writes  The fact that X is next in this total order  The fact that all tentative entries follow after X  “Any mechanism that stabilizes the position of a write in the log can be used.”

Bayou’s Civilized Commitment Procedure  Each data collection has one primary replica (why?)  Commits all writes for that collection  Marks each write with a commit sequence number (CSN)  Timestamp really is  Propagates commitments during anti-entropy  How to ensure that CSN order observes causal accept order?  Local time actually is Lamport (logical) time  Everybody propagates updates in order  As a result, primary sees updates in causal order and commits them in that order

We Like Short Logs… Step 2: Let’s Throw Writes Away!  Truncating the log  Tentative writes must never be discarded  May have to be undone and redone (due to reordering)  Committed writes may be discarded  But other replicas may not yet have seen them  So, keep some amount of history  But, where did the truncated writes go?  We don’t just have a log, but also an actual database  Also contains tentative writes  But all committed entries are marked as such (flag bit)  We track omitted sequence number (OSN)

Let’s Throw Writes Away! (cont.)  During anti-entropy, we may have to send DB (!)  If receiver’s CSN is smaller than sender’s OSN  I.e., if receiver’s head of log is before sender’s tail of log  Sender’s DB provides new starting point for receiver  Receiver discards committed writes  Receiver and send continue with rest of anti-entropy

Some More Details  Replicas can be added and removed dynamically  Addition/removal is relative to another replica  Replicas named relative to that replica (preserves causal order)  Access control performed at granularity of database  Based on public/private key cryptography  Checked by accepting and by committing replica  Accepting replica first-line defense against unauthorized access  Committing replica definitive authority

What Do You Think?