Scalable Causal Consistency
COS 418: Distributed Systems, Lecture 14
Wyatt Lloyd
Consistency Hierarchy
- Linearizability (e.g., RAFT)
- Sequential Consistency
(CAP separates the levels above from those below)
- Causal+ Consistency (e.g., Bayou) [PRAM, 1988, Princeton]
- Eventual Consistency (e.g., Dynamo)
Part 1: More on consistency
Part 2: Scalable Causal Consistency
Last Time: Causal+ Consistency
- Partially orders all operations; does not totally order them
- Does not look like a single machine
- Guarantees: for each process, there exists an order of all writes plus that process's reads
- That order respects the happens-before (→) ordering of operations
- Plus: replicas converge to the same state, which makes it stronger than eventual consistency
Causal Consistency
- Writes that are potentially causally related must be seen by all processes in the same order.
- Concurrent writes may be seen in a different order by different processes.
- Concurrent: operations that are not causally related.
[Figure: processes P1, P2, P3 issuing operations a-g over physical time]
Causal Consistency
Which of these operation pairs are concurrent?
Operations: (a, b), (b, f), (c, f), (e, f), (e, g), (a, c), (a, e)
Concurrent? (a, b): N; (b, f): Y; the remaining pairs are worked through in lecture
[Figure: processes P1, P2, P3 issuing operations a-g over physical time]
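The concurrency question above can also be answered mechanically. Below is a minimal sketch, not from the lecture, that uses vector clocks to decide whether two operations are potentially causally related or concurrent; the stamps for a, b, and f are hypothetical, not read off the slide's figure.

```python
# Sketch: deciding "potentially causally related" vs. "concurrent" with
# vector clocks. Each process keeps one counter per process, incremented
# on its own events and merged on message receipt (reads-from).

def happens_before(vc_a, vc_b):
    """True if the event stamped vc_a happens-before the event stamped vc_b."""
    return all(a <= b for a, b in zip(vc_a, vc_b)) and vc_a != vc_b

def concurrent(vc_a, vc_b):
    """Events are concurrent iff neither happens-before the other."""
    return not happens_before(vc_a, vc_b) and not happens_before(vc_b, vc_a)

# Hypothetical stamps for three processes P1, P2, P3:
a = (1, 0, 0)   # P1's first write
b = (2, 0, 0)   # P1's second write: same process, so a -> b
f = (0, 0, 1)   # P3's write, issued with no knowledge of a or b

print(happens_before(a, b))  # True  (same-process order: not concurrent)
print(concurrent(b, f))      # True  (neither event saw the other)
```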
Causal Consistency: Quiz
- Valid under causal consistency. Why?
- W(x)b and W(x)c are concurrent, so all processes don't (need to) see them in the same order.
- P3 and P4 read the values 'a' and 'b' in that order because those writes are potentially causally related; there is no causal constraint on 'c'.
Sequential Consistency: Quiz
- Invalid under sequential consistency. Why?
- P3 and P4 see b and c in different orders.
- But this is fine for causal consistency: b and c are not causally related.
Causal Consistency
- A: Violation: W(x)b causally happens after W(x)a, yet the reads do not respect that order.
- B: Correct: P2 does not read the value a before W(x)b, so the two writes are concurrent.
Causal consistency within replicated systems
Implications of Laziness on Consistency
[Figure: three replicas, each with a consensus module, a log (add, jmp, mov, shl), and a state machine]
- Linearizability / sequential consistency: eager replication
- Trades off low latency for consistency
Implications of Laziness on Consistency
[Figure: three replicas, each with a log (add, jmp, mov, shl) and a state machine, but no consensus module]
- Causal consistency: lazy replication
- Trades off consistency for low latency
- Maintains local ordering when replicating
- Operations may be lost if a failure occurs before replication
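To make the eager/lazy distinction concrete, here is a toy sketch (my own, not code from the lecture, RAFT, or Bayou): an eager write blocks until every replica has applied the operation, while a lazy write applies locally, returns immediately, and ships the operation in the background.

```python
import threading, time

class Replica:
    def __init__(self):
        self.log = []
    def apply(self, op):
        self.log.append(op)

def eager_write(replicas, op):
    """Linearizable/sequential style: wait for every replica before returning."""
    for r in replicas:
        r.apply(op)              # in a real system: send, then block on acks
    return "ok"                  # higher latency, but replicas never lag

def lazy_write(local, remotes, op):
    """Causal style: apply locally, return, replicate in the background."""
    local.apply(op)
    def ship():
        time.sleep(0.1)          # background replication, preserving local order
        for r in remotes:
            r.apply(op)
    threading.Thread(target=ship, daemon=True).start()
    return "ok"                  # low latency; the op is lost if we fail before shipping

local, r1 = Replica(), Replica()
eager_write([local, r1], "add")
lazy_write(local, [r1], "jmp")
print(r1.log)        # likely ['add']: the lazy write has not arrived yet
time.sleep(0.2)
print(r1.log)        # ['add', 'jmp']: replicas converge once replication completes
```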
Consistency Hierarchy (recap)
Part 2: Scalable Causal Consistency
Consistency vs. Scalability
Scalability: adding more machines allows more data to be stored and more operations to be handled!

System     | Consistency  | Scalable?
Dynamo     | Eventual     | Yes
Bayou      | Causal       | No
COPS       | Causal+      | Yes (this lecture)
Paxos/RAFT | Linearizable | Next time!

It's time to think about scalability!
Don't Settle for Eventual: Scalable Causal Consistency for [Geo-Replicated] Storage with COPS
W. Lloyd, M. Freedman, M. Kaminsky, D. Andersen (SOSP 2011)
Geo-Replicated Storage: Serve User Requests Quickly
Inside the Datacenter
[Figure: a web tier in front of a storage tier sharded into key ranges A-F, G-L, M-R, S-Z; each shard replicates to the matching shard in a remote datacenter]
Trade-offs vs. (Stronger) Consistency
- Partition tolerance
- Availability
- Low latency
- Scalability
Scalability through Sharding
[Figure: one node holding the full keyspace A-Z, split into two shards (A-L, M-Z), then four (A-F, G-L, M-R, S-Z), then ten (A-C, D-F, G-J, K-L, M-O, P-S, T-V, W-Z)]
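As a concrete (hypothetical) illustration of key-range sharding, the sketch below routes a key to the node owning its range; the shard names and ranges mirror the slide, not any real COPS configuration.

```python
import bisect

# Sorted (first key of range, node name) pairs; more nodes means smaller ranges.
SHARDS = [("A", "shard-A-F"), ("G", "shard-G-L"),
          ("M", "shard-M-R"), ("S", "shard-S-Z")]

def shard_for(key: str) -> str:
    """Route a key to the shard whose range starts at or before it."""
    starts = [start for start, _ in SHARDS]
    i = bisect.bisect_right(starts, key[0].upper()) - 1
    return SHARDS[max(i, 0)][1]

print(shard_for("alice"))   # shard-A-F
print(shard_for("turing"))  # shard-S-Z
```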
Causality By Example
Three rules establish potential causality:
- Same process (thread of execution)
- Reads-from (message receipt)
- Transitivity
Example: remove boss from friends group; post to friends: "Time for a new job!"; friend reads the post.
[Figure: the boss is removed from the friends group before the "New Job!" post becomes visible]
Previous Causal Systems
- Bayou '94, TACT '00, PRACTI '06
- Log-exchange based: the log is a single serialization point
- Implicitly captures & enforces causal order
- Either loses cross-server causality OR limits scalability
[Figure: writes 1-4 exchanged as a single log between the local datacenter and a remote DC]
Scalability Key Idea
- Capture causality with explicit dependency metadata
- Enforce it with distributed verifications: delay exposing replicated writes until all of their dependencies are satisfied in the datacenter
[Figure: a write (3) is replicated with "after" metadata naming its dependency (1); the remote DC delays exposing 3 until 1 is present]
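A small illustration of this idea, in my own formulation rather than the paper's data structures: each replicated write carries explicit dependencies as (key, minimum version) pairs, and a datacenter exposes the write only once every dependency is already visible locally.

```python
from dataclasses import dataclass, field

@dataclass
class ReplicatedWrite:
    key: str
    version: int
    value: str
    deps: dict = field(default_factory=dict)   # key -> minimum version that must be visible

def can_expose(write, visible_versions):
    """visible_versions: key -> newest version already exposed in this datacenter."""
    return all(visible_versions.get(k, 0) >= v for k, v in write.deps.items())

# Write 3 on key "x" depends on write 1 on key "y" having been applied first.
w3 = ReplicatedWrite(key="x", version=3, value="post", deps={"y": 1})
print(can_expose(w3, {}))          # False: delay exposing it
print(can_expose(w3, {"y": 1}))    # True: dependency satisfied, expose now
```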
Our Architecture: Available and Low Latency
[Figure: clients talk only to the local datacenter's shards (A-F, G-L, M-R, S-Z); a second datacenter holds the same shards]
All operations are local, so the system is available and low latency.
COPS Architecture
[Figure: a client library in front of the local datacenter's shards (A-F, G-L, M-R, S-Z), with matching shards in the remote datacenter]
Read
[Figure: the client library sends a read to the local shard responsible for the key and returns the value to the client]
Write
- The client issues a write to the client library.
- The library converts it into a write_after: the write plus ordering ("after") metadata.
- The local shard applies the write and replicates it to the remote datacenter as a write_after.
[Figure: client library sends write_after to the local shard, which replicates the write_after to the remote shard]
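A minimal sketch of the client-library side of this slide; the class and method names (LocalServer, ClientLibrary, read, write_after) are my own stand-ins, and versions here are just a local counter rather than COPS's real timestamps. The library grows a dependency context from the client's reads and writes and attaches it as the "after" metadata of every put.

```python
class LocalServer:
    """Toy stand-in for the local datacenter's storage shards."""
    def __init__(self):
        self.store = {}        # key -> (version, value)
        self.clock = 0
    def read(self, key):
        return self.store.get(key, (0, None))
    def write_after(self, key, value, deps):
        self.clock += 1        # stand-in for a unique, monotonically increasing version
        self.store[key] = (self.clock, value)
        # Replication of (key, value, version, deps) to remote DCs would happen
        # asynchronously here.
        return self.clock

class ClientLibrary:
    def __init__(self, server):
        self.server = server
        self.context = {}      # dependencies observed so far: key -> version

    def get(self, key):
        version, value = self.server.read(key)
        if version:
            self.context[key] = max(self.context.get(key, 0), version)
        return value

    def put(self, key, value):
        deps = dict(self.context)                      # the "after" metadata
        version = self.server.write_after(key, value, deps)
        self.context = {key: version}                  # the new write subsumes its deps
        return version

lib = ClientLibrary(LocalServer())
lib.put("friends:turing", ["alonzo"])            # remove the boss
lib.put("status:turing", "Time for a new job!")  # carries a dependency on the friends write
```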
Replicated Write
- A replicated write_after(..., deps) arrives at the remote shard; each dependency names a locator key and a unique timestamp (e.g., deps = {A: 195, L: 337}).
- The shard issues a dep_check for each dependency (dep_check(A195), dep_check(L337)) to the local shard responsible for that key.
- Exposing values only after the dep_checks return ensures causal consistency.
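A hedged sketch of the receiving datacenter's side: dep_check and write_after come from the slide, but the Shard class shape, the shard_for lookup, and the polling loop are my assumptions. The shard holding a replicated write checks each dependency with the local shard responsible for that key and exposes the value only after every check returns.

```python
import time

class Shard:
    """Toy remote-datacenter shard; names and structure are assumptions."""
    def __init__(self):
        self.visible = {}                      # key -> (version, value) readable by clients

    def dep_check(self, key, version):
        """Return once this shard has exposed `key` at version >= `version`."""
        while self.visible.get(key, (0, None))[0] < version:
            time.sleep(0.01)                   # a real system blocks/queues instead of polling

    def apply_replicated(self, shard_for, key, version, value, deps):
        """Handle a replicated write_after; shard_for maps a key to its local shard."""
        for dep_key, dep_version in deps.items():
            shard_for(dep_key).dep_check(dep_key, dep_version)
        # Every dependency is already visible in this datacenter, so exposing is safe.
        self.visible[key] = (version, value)
```

In practice each apply_replicated would run concurrently, so one pending write does not stall the shard while its dependencies are still in flight.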
Basic Architecture Summary
- All operations are local, replicated in the background: availability and low latency
- Shard data across many nodes: scalability
- Control replication with dependencies: causal consistency
Scalable Causal+ from Fully Distributed Operation
[Figure: reads and write_afters are handled independently by each shard in each datacenter; there is no single serialization point]
Scalability
- Shard data for scalable storage
- A new distributed protocol for scalably applying writes across shards
- Also need a new distributed protocol for consistently reading data across shards…
Reads Aren't Enough
- Asynchronous read requests + distributed data = ??
- Independent reads to different shards can return a causally inconsistent mix of versions.
[Figure: the boss's web server reads Turing's friends list (an old version that still includes the boss) from one shard and the newer "New Job!" status from another, so the boss sees the post that removing him was meant to hide]
Read-Only Transactions
- Provide a consistent, up-to-date view of data across many servers.
[Figure: logical time versions of Alan's Friends (1 and 11), Alan's Status ("I <3 Job" at 2, "New Job!" at 19), and Alonzo's Friends (including Alan)]
More on transactions next time!
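The slide defers the details, but as a hedged preview here is a sketch in the spirit of a two-round read-only transaction (COPS-GT style); the store layout, the get_by_version semantics, and the key names (alan:friends, alan:status) are illustrative assumptions, not the paper's exact interfaces.

```python
# Multi-version store: key -> list of (version, value, deps), deps = {key: version}
STORE = {
    "alan:friends": [(1, ["boss", "alonzo"], {}), (11, ["alonzo"], {})],
    "alan:status":  [(2, "I <3 Job", {}),
                     (19, "New Job!", {"alan:friends": 11})],
}

def get_by_version(key, version=None):
    """Return (version, value, deps); the latest version if none is given."""
    versions = STORE[key]
    if version is None:
        return max(versions, key=lambda t: t[0])
    return next(t for t in versions if t[0] >= version)

def get_trans(keys):
    # Round 1: optimistically read the latest visible version of every key.
    results = {k: get_by_version(k) for k in keys}
    # Compute the causally-correct version each requested key must reach.
    ccv = {k: 0 for k in keys}
    for ver, _val, deps in results.values():
        for dep_key, dep_ver in deps.items():
            if dep_key in ccv:
                ccv[dep_key] = max(ccv[dep_key], dep_ver)
    # Round 2: re-fetch any key whose round-1 version is too old.
    for k in keys:
        if results[k][0] < ccv[k]:
            results[k] = get_by_version(k, ccv[k])
    return {k: results[k][1] for k in keys}

print(get_trans(["alan:friends", "alan:status"]))
# -> {'alan:friends': ['alonzo'], 'alan:status': 'New Job!'}
```

If the first round had returned the stale friends list (version 1), the status value's dependency on friends at version 11 would force the second round to re-fetch the friends list, restoring a causally consistent view.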
COPS Scaling Evaluation More servers => More operations/sec
COPS
- Scalable causal consistency
- Shard for scalable storage
- Distributed protocols for coordinating writes and reads
- Evaluation confirms scalability
- All operations handled in the local datacenter: availability and low latency
We're thinking scalably now! Next time: scalable strong consistency.