
Janus: Consolidating Concurrency Control and Consensus for Commits under Conflicts
Shuai Mu, Lamont Nelson, Wyatt Lloyd, Jinyang Li
New York University, University of Southern California

State of the Art for Distributed Transactions: Layer Concurrency Control on top of Consensus
Shard for scalability; geo-replicate each shard for fault tolerance (e.g., across California, Texas, and New York). A transaction protocol (e.g., 2PC) runs across the shards, layered on top of a consensus protocol (e.g., Paxos) that replicates each shard.
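The following is a minimal sketch of this layered design, under assumed interfaces (PaxosLog, Shard, and two_phase_commit are illustrative names, not from any real system): a 2PC coordinator runs across shards, and each shard independently replicates its prepare/commit/abort decisions through its own consensus log, paying wide-area round trips on top of 2PC's own round trips.

```python
class PaxosLog:
    """Stand-in for a per-shard Paxos log; in a geo-replicated
    deployment each append costs at least one wide-area round trip."""
    def __init__(self):
        self.entries = []

    def replicate(self, entry):
        self.entries.append(entry)   # real Paxos: wide-area RTT here
        return True

class Shard:
    def __init__(self, name):
        self.name, self.log, self.locks = name, PaxosLog(), set()

    def prepare(self, txn_id, keys):
        if self.locks & set(keys):   # a conflicting txn holds a lock
            return False             # -> vote to abort
        self.locks |= set(keys)
        return self.log.replicate(("prepare", txn_id))

    def commit(self, txn_id, keys):
        self.log.replicate(("commit", txn_id))
        self.locks -= set(keys)

    def abort(self, txn_id, keys):
        self.log.replicate(("abort", txn_id))
        self.locks -= set(keys)

def two_phase_commit(shards, txn_id, keys_by_shard):
    prepared = []
    for s in shards:                 # Phase 1: gather prepare votes
        if s.prepare(txn_id, keys_by_shard[s.name]):
            prepared.append(s)
        else:                        # any "no" vote aborts the txn
            for p in prepared:
                p.abort(txn_id, keys_by_shard[p.name])
            return "aborted"
    for s in shards:                 # Phase 2: commit everywhere
        s.commit(txn_id, keys_by_shard[s.name])
    return "committed"
```

Even in this toy version the round-trip structure is visible: each shard replicates once per 2PC phase, so the layered stack multiplies wide-area round trips.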

Latency Limitation: Multiple Wide-Area Round Trips from Layering
[Figure: a transaction (a++, b++) spanning shards in California, Texas, and New York pays separate wide-area round trips for the transaction protocol and for each shard's Paxos replication.]

Throughput Limitation: Conflicts Cause Aborts
[Figure: two concurrent transactions (a++, b++ and a*=2, b*=2) touch the same records across California, Texas, and New York; under the layered design the conflict forces one transaction to abort and retry.]

Goals: Fewer Wide-Area Round Trips and Commits Under Conflicts

  Best-case wide-area RTTs | Aborts under conflicts  | Commits under conflicts
  1                        | Tapir [SOSP’15] ...     | Janus
  ≥ 2                      | Spanner [OSDI’12] ...   | Calvin [SIGMOD’12] ...

Establish Order Before Execution to Avoid Aborts
- Designed for transactions with static read and write sets
- Structure a transaction as a set of stored-procedure pieces (e.g., a++ and b++)
- Servers establish a consistent ordering for pieces before executing them, so conflicting pieces (e.g., a++ vs. a*=2) are ordered rather than aborted
- Challenge: distributed ordering, to avoid a central bottleneck (see the sketch below)
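A sketch of what "static read and write sets" buys, with illustrative classes (Piece and Transaction are my names, not the paper's API): because every piece declares up front which key it touches, servers can detect conflicts and order transactions before running any code.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Piece:
    shard: str        # which shard executes this piece
    key: str          # record it touches (static, known up front)
    op: str           # stored-procedure body, e.g. "incr" or "double"

@dataclass
class Transaction:
    txn_id: int
    pieces: list = field(default_factory=list)

    def keys(self):
        return {p.key for p in self.pieces}

    def conflicts_with(self, other):
        # Two transactions conflict if their pieces share any key.
        return bool(self.keys() & other.keys())

# "a++; b++" and "a*=2; b*=2" expressed as piece sets:
t1 = Transaction(1, [Piece("s1", "a", "incr"),   Piece("s2", "b", "incr")])
t2 = Transaction(2, [Piece("s1", "a", "double"), Piece("s2", "b", "double")])
assert t1.conflicts_with(t2)   # conflicting txns get ordered, not aborted
```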

Establish Order for Transactions and Replication Together to Commit in 1 Wide-Area Round Trip
- The consistent ordering needed for transactions and for replication is the same!
- Layering establishes the same order twice, while Janus orders once: a single dependency order covers both conflicting pieces (a++ vs. a*=2) and the replicas of each record (a and its replica a')
- Challenge: making this ordering fault tolerant

Overview of the Janus Protocol
1. Pre-accept: send pieces to servers; establish an initial order using dependencies; detect conflicts
2. Accept (only if conflicts were detected): replicate the dependencies
3. Commit: establish the final ordering; execute pieces in that order
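A hedged sketch of the coordinator's control flow (the interface is assumed for illustration: each replica exposes pre_accept/accept/commit, as in the server sketch two slides below; quorum sizes, failure handling, and recovery are in the paper):

```python
def coordinate(txn_id, keys, replicas):
    # Pre-accept: 1 wide-area RTT; collect each replica's dependencies.
    dep_sets = [r.pre_accept(txn_id, keys) for r in replicas]

    if all(d == dep_sets[0] for d in dep_sets):
        final_deps = dep_sets[0]        # fast path: no conflicts detected
    else:
        # Conflicts: merge the dependencies by union and replicate the
        # merged set in an explicit Accept phase (a 2nd wide-area RTT).
        final_deps = set().union(*dep_sets)
        for r in replicas:
            r.accept(txn_id, final_deps)

    for r in replicas:
        r.commit(txn_id, final_deps)    # servers then execute in order
    return final_deps
```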

No Conflicts: Commit in 1 Wide-Area Round Trip
[Figure: pre-accept reaches servers A and B in California (1 local RTT) and in New York (1 wide-area RTT); the returned dependencies match, so the coordinator sends commit and every server executes. Total: 1 wide-area round trip to commit.]
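A sketch of the server side of pre-accept (illustrative names, not the paper's code): each server remembers, per key, the most recent conflicting transaction, and returns that set as the new transaction's dependencies. When every replica returns the same dependencies, the order is already durable and the fast path applies.

```python
class Server:
    def __init__(self):
        self.last_txn_on_key = {}   # key -> id of last txn touching it
        self.deps = {}              # txn_id -> ids it is ordered after

    def pre_accept(self, txn_id, keys):
        deps = set()
        for key in keys:
            prev = self.last_txn_on_key.get(key)
            if prev is not None and prev != txn_id:
                deps.add(prev)      # order txn_id after prev
            self.last_txn_on_key[key] = txn_id
        self.deps[txn_id] = deps
        return deps

# No conflicts: every replica reports identical (here, empty)
# dependencies, so the coordinator commits after this single round.
ca, ny = Server(), Server()
assert ca.pre_accept(1, {"a", "b"}) == ny.pre_accept(1, {"a", "b"})
```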

Conflicts: Commit in 2 Wide-Area Round Trips
[Figure: servers in California and New York return different dependencies from pre-accept. The coordinator merges them by taking their union (∪), replicates the merged dependencies in an Accept phase (the 2nd wide-area RTT), and then commits. Cycles in the dependency graph are ordered deterministically, so servers A and B in both regions execute in the same order.]
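The sketch below shows one way to turn a committed dependency graph into a deterministic execution order (illustrative, not the paper's code): collapse cycles into strongly connected components, execute components in topological order, and break ties inside a cycle with a deterministic rule (here, transaction id), so every replica executes identically.

```python
def execution_order(deps):
    """deps: txn_id -> set of txn_ids that must execute before it."""
    index, low, on_stack, stack = {}, {}, set(), []
    sccs = []
    counter = 0

    def strongconnect(v):               # Tarjan's SCC algorithm
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in deps.get(v, ()):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of an SCC
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in sorted(deps):              # sorted: deterministic traversal
        if v not in index:
            strongconnect(v)

    # Tarjan emits each SCC after everything it depends on, so emitting
    # SCCs in that order, with members sorted by txn id, yields the same
    # schedule on every replica.
    order = []
    for scc in sccs:
        order.extend(sorted(scc))       # tie-break cycles by txn id
    return order

# t1 and t2 conflict in both directions (a cycle); t3 depends on t1:
print(execution_order({1: {2}, 2: {1}, 3: {1}}))   # -> [1, 2, 3]
```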

Janus Achieves Fewer Wide-Area Round Trips and Commits Under Conflicts
- No conflicts: commit in 1 wide-area round trip; pre-accept alone is sufficient to ensure the same order under failures
- Conflicts: commit in 2 wide-area round trips; the accept phase replicates dependencies to ensure the same order under failures

The Janus Paper Includes Many More Details
- Full details of execution
- Quorum sizes
- Behavior under server failure
- Behavior under coordinator (client) failure
- Design extensions to handle dynamic read and write sets

Evaluation (https://github.com/NYU-NEWS/janus)
- Questions: throughput under conflicts; latency under conflicts; overhead when there are no conflicts
- Baselines: 2PL (2PC) layered on top of MultiPaxos, and TAPIR [SOSP’15]
- Testbed: EC2 (Oregon, Ireland, Seoul)

Janus Commits under Conflicts for High Throughput
[Figure: throughput comparison. Janus has no aborts; 2PL aborts due to conflicts at shards; Tapir aborts due to conflicts at both shards and replicas. TPC-C with 6 shards, 3-way geo-replicated (9 total servers), 1 warehouse per shard.]

Janus Commits under Conflicts for Low Latency
[Figure: latency comparison. 2PL and Tapir show high latency due to retries after aborts (2 wide-area round trips, or 2 round trips plus execution time, per attempt); Janus commits in 1 wide-area round trip in the best case. TPC-C with 6 shards, 3-way geo-replicated (9 total servers), 1 warehouse per shard.]

Janus Has Small Throughput Overhead When There Are Few Conflicts
[Figure: microbenchmark with 3 shards, 3-way replicated in a single data center (9 total servers). Janus's overhead under few conflicts comes from tracking dependencies, and as conflicts increase, from the accept phase plus increased dependency tracking; Tapir's overhead comes from retries after aborts.]

Related Work

                         Isolation Level       1 RTT   Commits under Conflicts
  Janus [OSDI’16]        Strict-Serializable     ✔               ✔
  Tapir [SOSP’15]        Strict-Serializable     ✔               ✖
  Rep.Commit [VLDB’13]   Strict-Serializable     ✖               ✖
  Calvin [SIGMOD’12]     Strict-Serializable     ✖               ✔
  Spanner [OSDI’12]      Strict-Serializable     ✖               ✖
  MDCC [EuroSys’13]      Read-Committed*         ✔               ✖
  COPS [SOSP’11]         Causal+                 ✔               ✔
  Eiger [NSDI’13]        Causal+                 ✔               ✔
  EPaxos [SOSP’13]       (consensus only)        ✔               ✔
  Rococo [OSDI’14]       Strict-Serializable     ✖               ✔

The closest related work is EPaxos [SOSP’13] (dependency-based consensus) and Rococo [OSDI’14] (dependency-based concurrency control), the two techniques Janus consolidates.

Conclusion
- Two limitations of layered transaction protocols: multiple wide-area round trips in the best case, and conflicts cause aborts
- Janus consolidates concurrency control and consensus: their ordering requirements are similar and can be combined!
- Establishing a single ordering with dependency tracking enables committing in 1 wide-area round trip in the best case and in 2 wide-area round trips under conflicts
- Evaluation: small throughput overhead when there are no conflicts; low latency and good throughput even with many conflicts