
Janus: Consolidating Concurrency Control and Consensus for Commits under Conflicts
Shuai Mu, Lamont Nelson, Wyatt Lloyd, Jinyang Li
New York University, University of Southern California

State of the Art for Distributed Transactions: Layer Concurrency Control on top of Consensus
Shard for scalability; geo-replicate each shard for fault tolerance (e.g., across California, Texas, and New York). A transaction protocol (e.g., 2PC) runs across the shards, layered on top of a consensus protocol (e.g., Paxos) that replicates each shard.
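The following is a minimal sketch of this layered design, under assumed interfaces (PaxosLog, Shard, and two_phase_commit are illustrative names, not from any real system): a 2PC coordinator runs across shards, and each shard independently replicates its prepare/commit/abort decisions through its own consensus log, paying wide-area round trips on top of 2PC's own round trips.

```python
class PaxosLog:
    """Stand-in for a per-shard Paxos log; in a geo-replicated
    deployment each append costs at least one wide-area round trip."""
    def __init__(self):
        self.entries = []

    def replicate(self, entry):
        self.entries.append(entry)   # real Paxos: wide-area RTT here
        return True

class Shard:
    def __init__(self, name):
        self.name, self.log, self.locks = name, PaxosLog(), set()

    def prepare(self, txn_id, keys):
        if self.locks & set(keys):   # a conflicting txn holds a lock
            return False             # -> vote to abort
        self.locks |= set(keys)
        return self.log.replicate(("prepare", txn_id))

    def commit(self, txn_id, keys):
        self.log.replicate(("commit", txn_id))
        self.locks -= set(keys)

    def abort(self, txn_id, keys):
        self.log.replicate(("abort", txn_id))
        self.locks -= set(keys)

def two_phase_commit(shards, txn_id, keys_by_shard):
    prepared = []
    for s in shards:                 # Phase 1: gather prepare votes
        if s.prepare(txn_id, keys_by_shard[s.name]):
            prepared.append(s)
        else:                        # any "no" vote aborts the txn
            for p in prepared:
                p.abort(txn_id, keys_by_shard[p.name])
            return "aborted"
    for s in shards:                 # Phase 2: commit everywhere
        s.commit(txn_id, keys_by_shard[s.name])
    return "committed"
```

Even in this toy version the round-trip structure is visible: each shard replicates once per 2PC phase, so the layered stack multiplies wide-area round trips.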

Latency Limitation: Multiple Wide-Area Round Trips from Layering
[Figure: a transaction (a++, b++) spanning shards in California, Texas, and New York pays separate wide-area round trips for the transaction protocol and for each shard's Paxos replication.]

Throughput Limitation: Conflicts Cause Aborts
[Figure: two concurrent transactions (a++, b++ and a*=2, b*=2) touch the same records across California, Texas, and New York; under the layered design the conflict forces one transaction to abort and retry.]

Goals: Fewer Wide-Area Round Trips and Commits Under Conflicts

  Best-case wide-area RTTs | Aborts under conflicts  | Commits under conflicts
  1                        | Tapir [SOSP’15] ...     | Janus
  ≥ 2                      | Spanner [OSDI’12] ...   | Calvin [SIGMOD’12] ...

Establish Order Before Execution to Avoid Aborts
- Designed for transactions with static read and write sets
- Structure a transaction as a set of stored-procedure pieces (e.g., a++ and b++)
- Servers establish a consistent ordering for pieces before executing them, so conflicting pieces (e.g., a++ vs. a*=2) are ordered rather than aborted
- Challenge: distributed ordering, to avoid a central bottleneck (see the sketch below)
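A sketch of what "static read and write sets" buys, with illustrative classes (Piece and Transaction are my names, not the paper's API): because every piece declares up front which key it touches, servers can detect conflicts and order transactions before running any code.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Piece:
    shard: str        # which shard executes this piece
    key: str          # record it touches (static, known up front)
    op: str           # stored-procedure body, e.g. "incr" or "double"

@dataclass
class Transaction:
    txn_id: int
    pieces: list = field(default_factory=list)

    def keys(self):
        return {p.key for p in self.pieces}

    def conflicts_with(self, other):
        # Two transactions conflict if their pieces share any key.
        return bool(self.keys() & other.keys())

# "a++; b++" and "a*=2; b*=2" expressed as piece sets:
t1 = Transaction(1, [Piece("s1", "a", "incr"),   Piece("s2", "b", "incr")])
t2 = Transaction(2, [Piece("s1", "a", "double"), Piece("s2", "b", "double")])
assert t1.conflicts_with(t2)   # conflicting txns get ordered, not aborted
```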

Establish Order for Transactions and Replication Together to Commit in 1 Wide-Area Round Trip
- The consistent ordering needed for transactions and for replication is the same!
- Layering establishes the same order twice, while Janus orders once: a single dependency order covers both conflicting pieces (a++ vs. a*=2) and the replicas of each record (a and its replica a')
- Challenge: making this ordering fault tolerant

Overview of the Janus Protocol
1. Pre-accept: send pieces to servers; establish an initial order using dependencies; detect conflicts
2. Accept (only if conflicts were detected): replicate the dependencies
3. Commit: establish the final ordering; execute pieces in that order
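A hedged sketch of the coordinator's control flow (the interface is assumed for illustration: each replica exposes pre_accept/accept/commit, as in the server sketch two slides below; quorum sizes, failure handling, and recovery are in the paper):

```python
def coordinate(txn_id, keys, replicas):
    # Pre-accept: 1 wide-area RTT; collect each replica's dependencies.
    dep_sets = [r.pre_accept(txn_id, keys) for r in replicas]

    if all(d == dep_sets[0] for d in dep_sets):
        final_deps = dep_sets[0]        # fast path: no conflicts detected
    else:
        # Conflicts: merge the dependencies by union and replicate the
        # merged set in an explicit Accept phase (a 2nd wide-area RTT).
        final_deps = set().union(*dep_sets)
        for r in replicas:
            r.accept(txn_id, final_deps)

    for r in replicas:
        r.commit(txn_id, final_deps)    # servers then execute in order
    return final_deps
```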

No Conflicts: Commit in 1 Wide-Area Round Trip
[Figure: pre-accept reaches servers A and B in California (1 local RTT) and in New York (1 wide-area RTT); the returned dependencies match, so the coordinator sends commit and every server executes. Total: 1 wide-area round trip to commit.]
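A sketch of the server side of pre-accept (illustrative names, not the paper's code): each server remembers, per key, the most recent conflicting transaction, and returns that set as the new transaction's dependencies. When every replica returns the same dependencies, the order is already durable and the fast path applies.

```python
class Server:
    def __init__(self):
        self.last_txn_on_key = {}   # key -> id of last txn touching it
        self.deps = {}              # txn_id -> ids it is ordered after

    def pre_accept(self, txn_id, keys):
        deps = set()
        for key in keys:
            prev = self.last_txn_on_key.get(key)
            if prev is not None and prev != txn_id:
                deps.add(prev)      # order txn_id after prev
            self.last_txn_on_key[key] = txn_id
        self.deps[txn_id] = deps
        return deps

# No conflicts: every replica reports identical (here, empty)
# dependencies, so the coordinator commits after this single round.
ca, ny = Server(), Server()
assert ca.pre_accept(1, {"a", "b"}) == ny.pre_accept(1, {"a", "b"})
```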

Conflicts: Commit in 2 Wide-Area Round Trips
[Figure: servers in California and New York return different dependencies from pre-accept. The coordinator merges them by taking their union (∪), replicates the merged dependencies in an Accept phase (the 2nd wide-area RTT), and then commits. Cycles in the dependency graph are ordered deterministically, so servers A and B in both regions execute in the same order.]
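The sketch below shows one way to turn a committed dependency graph into a deterministic execution order (illustrative, not the paper's code): collapse cycles into strongly connected components, execute components in topological order, and break ties inside a cycle with a deterministic rule (here, transaction id), so every replica executes identically.

```python
def execution_order(deps):
    """deps: txn_id -> set of txn_ids that must execute before it."""
    index, low, on_stack, stack = {}, {}, set(), []
    sccs = []
    counter = 0

    def strongconnect(v):               # Tarjan's SCC algorithm
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in deps.get(v, ()):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of an SCC
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in sorted(deps):              # sorted: deterministic traversal
        if v not in index:
            strongconnect(v)

    # Tarjan emits each SCC after everything it depends on, so emitting
    # SCCs in that order, with members sorted by txn id, yields the same
    # schedule on every replica.
    order = []
    for scc in sccs:
        order.extend(sorted(scc))       # tie-break cycles by txn id
    return order

# t1 and t2 conflict in both directions (a cycle); t3 depends on t1:
print(execution_order({1: {2}, 2: {1}, 3: {1}}))   # -> [1, 2, 3]
```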

Janus Achieves Fewer Wide-Area Round Trips and Commits Under Conflicts
- No conflicts: commit in 1 wide-area round trip; pre-accept alone is sufficient to ensure the same order under failures
- Conflicts: commit in 2 wide-area round trips; the accept phase replicates dependencies to ensure the same order under failures

The Janus Paper Includes Many More Details
- Full details of execution
- Quorum sizes
- Behavior under server failure
- Behavior under coordinator (client) failure
- Design extensions to handle dynamic read and write sets

Evaluation (https://github.com/NYU-NEWS/janus)
- Questions: throughput under conflicts; latency under conflicts; overhead when there are no conflicts
- Baselines: 2PL (2PC) layered on top of MultiPaxos, and TAPIR [SOSP’15]
- Testbed: EC2 (Oregon, Ireland, Seoul)

Janus Commits under Conflicts for High Throughput
[Figure: throughput comparison. Janus has no aborts; 2PL aborts due to conflicts at shards; Tapir aborts due to conflicts at both shards and replicas. TPC-C with 6 shards, 3-way geo-replicated (9 total servers), 1 warehouse per shard.]

Janus Commits under Conflicts for Low Latency
[Figure: latency comparison. 2PL and Tapir show high latency due to retries after aborts (2 wide-area round trips, or 2 round trips plus execution time, per attempt); Janus commits in 1 wide-area round trip in the best case. TPC-C with 6 shards, 3-way geo-replicated (9 total servers), 1 warehouse per shard.]

Janus Has Small Throughput Overhead When There Are Few Conflicts
[Figure: microbenchmark with 3 shards, 3-way replicated in a single data center (9 total servers). Janus's overhead under few conflicts comes from tracking dependencies, and as conflicts increase, from the accept phase plus increased dependency tracking; Tapir's overhead comes from retries after aborts.]

Related Work

                         Isolation Level       1 RTT   Commits under Conflicts
  Janus [OSDI’16]        Strict-Serializable     ✔               ✔
  Tapir [SOSP’15]        Strict-Serializable     ✔               ✖
  Rep.Commit [VLDB’13]   Strict-Serializable     ✖               ✖
  Calvin [SIGMOD’12]     Strict-Serializable     ✖               ✔
  Spanner [OSDI’12]      Strict-Serializable     ✖               ✖
  MDCC [EuroSys’13]      Read-Committed*         ✔               ✖
  COPS [SOSP’11]         Causal+                 ✔               ✔
  Eiger [NSDI’13]        Causal+                 ✔               ✔
  EPaxos [SOSP’13]       (consensus only)        ✔               ✔
  Rococo [OSDI’14]       Strict-Serializable     ✖               ✔

The closest related work is EPaxos [SOSP’13] (dependency-based consensus) and Rococo [OSDI’14] (dependency-based concurrency control), the two techniques Janus consolidates.

Conclusion
- Two limitations of layered transaction protocols: multiple wide-area round trips in the best case, and conflicts cause aborts
- Janus consolidates concurrency control and consensus: their ordering requirements are similar and can be combined!
- Establishing a single ordering with dependency tracking enables committing in 1 wide-area round trip in the best case and in 2 wide-area round trips under conflicts
- Evaluation: small throughput overhead when there are no conflicts; low latency and good throughput even with many conflicts