CS 194, Lecture 10: Bayou, Brewer, and Byzantine


1 CS 194: Lecture 10 Bayou, Brewer, and Byzantine

2 Agenda  Review of Bayou  Channeling Eric Brewer (CAP theorem)  A peek at fault tolerance

3 Review of Bayou With examples!

4 Why Bayou?  Eventual consistency: strongest scalable consistency model  But not strong enough for mobile clients -Accessing different replicas can lead to strange results -Application-independent conflict detection misses some conflicts and creates others falsely  Bayou was designed to move beyond eventual consistency -Session guarantees -Application-specific conflict detection and resolution

5 Bayou System Assumptions  Variable degrees of connectivity: -Connected, disconnected, and weakly connected  Variable end-node capabilities: -Workstations, laptops, PDAs, etc.  Availability crucial

6 Resulting Design Choices  Variable connectivity  Flexible update propagation -Incremental progress, pairwise communication  Variable end-nodes  Flexible notion of clients and servers -Some nodes keep state (servers), some don’t (clients) -Laptops could have both, PDAs probably just clients  Availability crucial  Must allow disconnected operation -Conflicts inevitable -Use application-specific conflict detection and resolution

7 Components of Design  Update propagation  Conflict detection  Conflict resolution  Session guarantees

8 Updates  Identified by a triple: -Commit-stamp -Time-stamp -Server-ID of accepting server  Updates are either committed or tentative -Commit-stamps increase monotonically -Tentative updates have commit-stamp=inf  Primary server does all commits: (why?) -It sets the commit-stamp -Commit-stamp different from time-stamp

9 Update Log  Update log in order: -Committed updates (in commit-stamp order) -Tentative updates (in time-stamp order)  Can truncate committed updates, and only keep db state -Why?  Clients can request two views: (or other app-specific views) -Committed view -Tentative view
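A hypothetical sketch (the names are mine, not Bayou's) of the update identifier and log order from the two slides above: sorting by the (commit-stamp, time-stamp, server-ID) triple, with tentative updates carrying an infinite commit-stamp, yields exactly the stated log order: committed updates in commit-stamp order, then tentative updates in time-stamp order.

```python
import math
from dataclasses import dataclass

@dataclass(order=True, frozen=True)
class Update:
    commit_stamp: float      # math.inf while the update is still tentative
    time_stamp: int
    server_id: str

    @property
    def tentative(self) -> bool:
        return math.isinf(self.commit_stamp)

# Sorting puts committed updates (finite commit-stamps) first,
# then tentative ones, ordered among themselves by time-stamp.
log = sorted([
    Update(math.inf, 3, "A"),    # tentative
    Update(1, 5, "P"),           # committed first
    Update(2, 9, "B"),           # committed second
])
```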

10 Tentative vs Committed Views  Committed view: -Updates will never be reordered -But may be substantially out-of-date  Tentative view: -Much more current -But updates might be reordered  Tradeoff is application-dependent: -Calendars: avoid tentative commitments, but don’t count on them -Weather: being current more important than permanence

11 Anti-Entropy Exchange  Each server keeps a version vector: -R.V[X] is the latest timestamp from server X that server R has seen  When two servers connect, exchanging the version vectors allows them to identify the missing updates  These updates are exchanged in the order of the logs, so that if the connection is dropped the crucial monotonicity property still holds -If a server X has an update accepted by server Y, server X has all previous updates accepted by that server
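A hypothetical sketch of this exchange (function and field names are illustrative): each server keeps a version vector mapping server IDs to the latest timestamp seen, sends the peer whatever its vector doesn't cover, and applies incoming updates in the sender's log order, so a dropped connection still leaves a per-server prefix of updates.

```python
def missing_updates(log, peer_vv):
    """Updates in our log that the peer's version vector doesn't cover."""
    return [u for u in log
            if u["time_stamp"] > peer_vv.get(u["server_id"], 0)]

def receive(log, vv, updates):
    """Apply updates in the sender's log order, advancing our vector."""
    for u in updates:
        if u["time_stamp"] > vv.get(u["server_id"], 0):
            log.append(u)
            vv[u["server_id"]] = u["time_stamp"]

# P and A exchange after writing independently (as in the example slides):
p_log = [{"server_id": "P", "time_stamp": t} for t in range(1, 9)]   # [8,0,0]
a_log = [{"server_id": "A", "time_stamp": t} for t in range(1, 11)]  # [0,10,0]
p_vv, a_vv = {"P": 8}, {"A": 10}

receive(a_log, a_vv, missing_updates(p_log, a_vv))
receive(p_log, p_vv, missing_updates(a_log, p_vv))
# Both servers now have version vector {P: 8, A: 10}
```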

12 Requirements for Eventual Consistency  Universal propagation: anti-entropy  Globally agreed ordering: commit-stamps  Determinism: writes do not involve information not contained in the log (no time-of-day, process-ID, etc.)

13 Example with Three Servers P [0,0,0] A [0,0,0] B [0,0,0] Version Vectors

14 All Servers Write Independently P [8,0,0] A [0,10,0] B [0,0,9]

15 P and A Do Anti-Entropy Exchange P [8,10,0] A [8,10,0] B [0,0,9] [8,0,0] [0,10,0]

16 P Commits Some Early Writes P [8,10,0] A [8,10,0] B [0,0,9] [8,10,0]

17 P and B Do Anti-Entropy Exchange P [8,10,9] A [8,10,0] B [8,10,9] [8,10,0] [0,0,9]

18 P Commits More Writes P [8,10,9] A [8,10,0] B [8,10,9]

19 Bayou Writes  Identifier (commit-stamp, time-stamp, server-ID)  Nominal value  Write dependencies  Merge procedure

20 Conflict Detection  Write specifies the data the write depends on: -Set X=8 if Y=5 and Z=3 -Set Cal(11:00-12:00)=dentist if Cal(11:00-12:00) is null

21 Conflict Resolution  Specified by merge procedure (mergeproc)  When conflict is detected, mergeproc is called -Move appointments to open spot on calendar -Move meetings to open room
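A hypothetical sketch combining the two slides above (all names here are illustrative, not Bayou's actual API): each write carries a dependency check, a query plus the result it expects, and a mergeproc; a replica applies the write only if the check passes, and otherwise runs the mergeproc.

```python
def apply_write(db, write):
    if write["query"](db) == write["expected"]:
        write["update"](db)        # dependency holds: apply the write
    else:
        write["mergeproc"](db)     # conflict: application-specific fix-up

# Book the 11:00 dentist slot; if it's taken, move to 12:00.
cal = {"11:00": "staff mtg", "12:00": None}
apply_write(cal, {
    "query":     lambda d: d["11:00"],
    "expected":  None,                                   # slot must be free
    "update":    lambda d: d.update({"11:00": "dentist"}),
    "mergeproc": lambda d: d.update({"12:00": "dentist"}),
})
# cal is now {"11:00": "staff mtg", "12:00": "dentist"}
```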

22 Session Guarantees  Ensured by client, not by distribution mechanism  Needed to ensure user sees sensible results  To implement, client records: -All writes during that session (write-set) -The writes relevant to each read (read-set) Must be supplied by server Can be approximated by version vector

23 The Four Session Guarantees
Guarantee             State updated   State checked
Read your writes      Write           Read
Monotonic reads       Read            Read
Writes follow reads   Read            Write
Monotonic writes      Write           Write

24 Example  Return to example with servers P, A, and B  Client attaches to server P with vector [8,3,5]  Client reads, with read-set {P6,A1,A2,B5}  Client writes, with timestamp P9  Client then detaches and reattaches to another server  For which of these vectors can client read or write?

25 What Reads/Writes are Allowed? Read-set {P6,A1,A2,B5}, Write-set P9
 [7,1,6]: Read Your Writes: No; Monotonic Reads: No; Writes Follow Reads: No; Monotonic Writes: No  No R, No W
 [7,4,6]: Read Your Writes: No; Monotonic Reads: Yes; Writes Follow Reads: Yes; Monotonic Writes: No  No R, No W

26 What Reads/Writes are Allowed? Read-set {P6,A1,A2,B5}, Write-set P9
 [9,3,4]: Read Your Writes: Yes; Monotonic Reads: No; Writes Follow Reads: No; Monotonic Writes: Yes  No R, No W
 [10,3,8]: Read Your Writes: Yes; Monotonic Reads: Yes; Writes Follow Reads: Yes; Monotonic Writes: Yes  R, W
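The checks behind the example above can be sketched as follows (a hypothetical implementation; the read-set and write-set are approximated by per-server maximum timestamps, as the session-guarantee slide allows):

```python
def covers(vv, deps):
    """True if version vector vv includes every timestamp in deps."""
    return all(vv.get(s, 0) >= t for s, t in deps.items())

def session_check(vv, read_set, write_set):
    ryw = covers(vv, write_set)   # Read Your Writes: server has my writes
    mr  = covers(vv, read_set)    # Monotonic Reads: server has what I've read
    wfr = covers(vv, read_set)    # Writes Follow Reads
    mw  = covers(vv, write_set)   # Monotonic Writes
    return {"R": ryw and mr, "W": wfr and mw}

read_set  = {"P": 6, "A": 2, "B": 5}   # per-server max of {P6, A1, A2, B5}
write_set = {"P": 9}
for vec in ([7, 1, 6], [7, 4, 6], [9, 3, 4], [10, 3, 8]):
    vv = dict(zip("PAB", vec))
    print(vec, session_check(vv, read_set, write_set))
# Only [10,3,8] permits both reads and writes, matching the slides.
```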

27 Channeling Eric Brewer Slightly more hair, much less wisdom

28 A Clash of Cultures  Classic distributed systems: focused on ACID semantics -A: Atomic -C: Consistent -I: Isolated -D: Durable  Modern Internet systems: focused on BASE -Basically Available -Soft-state (or scalable) -Eventually consistent

29 ACID vs BASE
ACID:  Strong consistency for transactions highest priority  Availability less important  Pessimistic  Rigorous analysis  Complex mechanisms
BASE:  Availability and scaling highest priorities  Weak consistency  Optimistic  Best effort  Simple and fast

30 Why the Divide?  What goals might you want from a shared-data system? -C, A, P  Strong Consistency: all clients see the same view, even in the presence of updates  High Availability: all clients can find some replica of the data, even in the presence of failures  Partition-tolerance: the system properties hold even when the system is partitioned

31 CAP Conjecture (later theorem)  You can only have two out of these three properties  The choice of which feature to discard determines the nature of your system

32 Consistency and Availability  Comment: -Providing transactional semantics requires all nodes to be in contact with each other  Examples: -Single-site and clustered databases -Other cluster-based designs  Typical Features: -Two-phase commit -Cache invalidation protocols -Classic DS style
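A minimal sketch of the two-phase commit feature listed above (class and method names are mine): the coordinator commits only if every participant votes yes, which is why this corner of CAP requires all nodes to be reachable; an unreachable participant would block the whole transaction.

```python
class Participant:
    def __init__(self, vote_yes=True):
        self.vote_yes = vote_yes
        self.state = "init"

    def prepare(self):             # phase 1: participant votes
        self.state = "prepared" if self.vote_yes else "aborted"
        return self.vote_yes

    def decide(self, commit):      # phase 2: coordinator's decision arrives
        self.state = "committed" if commit else "aborted"

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]   # phase 1: collect all votes
    decision = all(votes)                         # one "no" aborts everything
    for p in participants:                        # phase 2: broadcast outcome
        p.decide(decision)
    return decision
```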

33 Consistency and Partition-Tolerance  Comment: -If one is willing to tolerate system-wide blocking, then one can provide consistency even when there are temporary partitions  Examples: -Distributed databases -Distributed locking -Quorum (majority) protocols  Typical Features: -Pessimistic locking -Minority partitions unavailable -Also a common DS style -Voting vs. primary replicas

34 Partition-Tolerance and Availability  Comment: -Once consistency is sacrificed, life is easy….  Examples: -DNS -Web caches -Coda -Bayou  Typical Features: -TTLs and lease cache management -Optimistic updating with conflict resolution -This is the “Internet design style”

35 Techniques  Expiration-based caching: AP  Quorum/majority algorithms: CP  Two-phase commit: CA
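A hypothetical sketch of the first technique, expiration-based (TTL) caching in the style of a DNS cache: reads never block on the origin, at the price of answers that may be up to `ttl` seconds stale (the `now` parameter is only there to make the example deterministic).

```python
import time

class TTLCache:
    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}                  # key -> (value, expires_at)

    def put(self, key, value, now=None):
        now = time.time() if now is None else now
        self.entries[key] = (value, now + self.ttl)

    def get(self, key, now=None):
        """Return the cached value, or None if missing/expired (refetch)."""
        now = time.time() if now is None else now
        entry = self.entries.get(key)
        if entry is None or now >= entry[1]:
            return None
        return entry[0]

cache = TTLCache(ttl=30)
cache.put("www.example.com", "93.184.216.34", now=0)
cache.get("www.example.com", now=10)    # fresh: returns the cached address
cache.get("www.example.com", now=40)    # expired: None, go back to origin
```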

36 Byzantine

37 Failures  So far, we have assumed nodes are either up or down  But nodes are far more interesting than that!

38 Failure Models
Type of failure             Description
Crash failure               A server halts, but is working correctly until it halts
Omission failure            A server fails to respond to incoming requests
  Receive omission          A server fails to receive incoming messages
  Send omission             A server fails to send messages
Timing failure              A server's response lies outside the specified time interval
Response failure            The server's response is incorrect
  Value failure             The value of the response is wrong
  State transition failure  The server deviates from the correct flow of control
Arbitrary failure           A server may produce arbitrary responses at arbitrary times

39 Previous Algorithms  Only cope with crash failures  What happens if some other failure occurs?  Bayou as an example: -If a server lies about updates, the algorithm gets hopelessly confused  Generally, most other distributed protocols fail when faced with anything other than crash failures  Next: how to deal with a wider variety of failures

40 Same Dichotomy Exists  Classic Distributed Systems: -Byzantine Algorithms -Two-phase Commit  Internet style: -Checkable or “self-verifying” protocols -Very new field in Internet research -You now know as much as we do about it…..