Sisi Duan Assistant Professor Information Systems

Presentation transcript:

IS 698/800-01: Advanced Distributed Systems. Separating Agreement from Execution. Sisi Duan, Assistant Professor, Information Systems, sduan@umbc.edu

Overview: Separating execution from agreement: why and how? Another look at Paxos and Fabric. Shuttle: Byzantine chain replication.

SMR and Blockchains: What we have talked about so far: SMR as a service, providing availability and integrity (total order of requests, consistency, ...) and storage (read and write, ...). The challenges in real applications: execution time and storage space.

Another look at Hyperledger Fabric

Another look at Paxos: a diagram closer to the original Paxos algorithm

HDFS

MapReduce

Separating execution from agreement: references
Wang, Yang, Lorenzo Alvisi, and Mike Dahlin. "Gnothi: Separating data and metadata for efficient and available storage replication." USENIX ATC, 2012.
Androulaki, Elli, et al. "Erasure-coded Byzantine storage with separate metadata." OPODIS. Springer, Cham, 2014.
Cachin, Christian, Dan Dobre, and Marko Vukolić. "Separating data and control: Asynchronous BFT storage with 2t+1 data replicas." Symposium on Self-Stabilizing Systems. Springer, Cham, 2014.
Yin, Jian, et al. "Separating agreement from execution for Byzantine fault tolerant services." ACM SIGOPS Operating Systems Review 37.5 (2003): 253-267.
Duan, Sisi, and Haibin Zhang. "Practical state machine replication with confidentiality." SRDS. IEEE, 2016.
Van Renesse, Robbert, Chi Ho, and Nicolas Schiper. "Byzantine chain replication." International Conference on Principles of Distributed Systems. Springer, Berlin, Heidelberg, 2012.

An example of separation: Cachin, Christian, Dan Dobre, and Marko Vukolić. "Separating data and control: Asynchronous BFT storage with 2t+1 data replicas." Symposium on Self-Stabilizing Systems. Springer, Cham, 2014.

Separating Execution from Agreement (Yin, Jian, et al. "Separating agreement from execution for Byzantine fault tolerant services." ACM SIGOPS Operating Systems Review 37.5 (2003): 253-267.): The agreement cluster generates an order for a request. The ordered request is sent to the execution cluster. What should be the proof? The execution cluster executes the requests and returns the result to the agreement cluster and then to the client. What should be the failure model for the agreement cluster? For the execution cluster?

Separating Execution from Agreement: (1) The client sends its request to the agreement cluster (AC). (2) The AC orders the request using a BFT protocol such as PBFT. (3) The AC generates a commit certificate (2f+1 signatures on commit messages) and sends it to the execution cluster (EC). (4) Each EC replica waits until all previous requests have been executed, then executes the request and generates a reply. (5) Reply certificate: t+1 signatures on the same reply, where t bounds the number of faulty EC replicas.
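The certificate check and the in-order execution rule from the steps above can be sketched as follows. This is a minimal illustration, not the protocol from the paper: HMAC tags stand in for digital signatures, and the names `AC_KEYS`, `ExecutionReplica`, and `make_cert` are hypothetical, as is the choice F = 1.

```python
import hashlib
import hmac

F = 1  # assumed bound on faulty agreement replicas; the AC then has 3F+1 members
# Hypothetical per-replica keys; a real system would use public-key signatures.
AC_KEYS = {f"ac{i}": f"ac-secret-{i}".encode() for i in range(3 * F + 1)}


def sign(key: bytes, seq: int, request: bytes) -> bytes:
    """Stand-in for a signature over the commit message (seq, request)."""
    return hmac.new(key, str(seq).encode() + b"|" + request, hashlib.sha256).digest()


def make_cert(seq: int, request: bytes, signers: list) -> dict:
    """Build a commit certificate: agreement-replica id -> signature."""
    return {rid: sign(AC_KEYS[rid], seq, request) for rid in signers}


def valid_commit_certificate(seq: int, request: bytes, cert: dict) -> bool:
    """Accept only if at least 2F+1 distinct AC replicas signed (seq, request)."""
    valid = sum(
        1
        for rid, sig in cert.items()
        if rid in AC_KEYS and hmac.compare_digest(sig, sign(AC_KEYS[rid], seq, request))
    )
    return valid >= 2 * F + 1


class ExecutionReplica:
    def __init__(self):
        self.next_seq = 1   # next sequence number to execute
        self.pending = {}   # certified requests waiting for earlier ones
        self.state = []     # toy application state: the executed log

    def deliver(self, seq: int, request: bytes, cert: dict) -> list:
        """Buffer a certified request, then execute everything now in order."""
        if seq < self.next_seq or not valid_commit_certificate(seq, request, cert):
            return []
        self.pending[seq] = request
        replies = []
        while self.next_seq in self.pending:
            req = self.pending.pop(self.next_seq)
            self.state.append(req)
            replies.append((self.next_seq, b"ok:" + req))
            self.next_seq += 1
        return replies
```

A request that arrives with a valid certificate but out of order is held back until every earlier sequence number has been executed, which matches step (4). The client-side check of the reply certificate (t+1 matching replies) is left out of this sketch.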

Separating Execution from Agreement: Benefits? Execution-intensive operations do not become the bottleneck. Storage space is not an issue. The agreement cluster does not need to know what is inside the request. Only a simple majority of the execution cluster is needed.

Separating Execution from Agreement: Confidentiality? Results from faulty replicas are sent to the clients as well.

Confidentiality: adding a privacy firewall. Instead of letting clients collect and combine the results, the filter nodes (F) do this for the clients.

Confidentiality: adding a privacy firewall. The firewall filters the results. How many filters (F) do we need in each row? How many rows do we need? Assumption: at most h failures in the firewall.

Confidentiality: The request and reply bodies are encrypted. Threshold signatures are used. Why?

The privacy firewall: There exists at least one correct path between the AC and the EC that consists only of correct filters. There exists one row that consists only of correct filter nodes, and the rows below it do not hold any information from any replica in the EC.
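The two properties above can be checked by brute force for small h. The sketch below assumes the layout used by Yin et al., a grid of (h+1) rows with (h+1) filters each, and assumes every filter is connected to all filters in the adjacent rows; the function names are hypothetical.

```python
from itertools import combinations


def has_correct_row(faulty: set, rows: int, cols: int) -> bool:
    """True if some row contains no faulty filter at all."""
    return any(all((r, c) not in faulty for c in range(cols)) for r in range(rows))


def has_correct_path(faulty: set, rows: int, cols: int) -> bool:
    """With full connectivity between adjacent rows (an assumption), a path of
    correct filters exists exactly when every row has a correct filter."""
    return all(any((r, c) not in faulty for c in range(cols)) for r in range(rows))


def grid_tolerates(h: int, rows: int, cols: int) -> bool:
    """Check both properties for every placement of at most h faulty filters."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    for k in range(h + 1):
        for placement in combinations(cells, k):
            faulty = set(placement)
            if not has_correct_row(faulty, rows, cols):
                return False
            if not has_correct_path(faulty, rows, cols):
                return False
    return True
```

For h = 2, `grid_tolerates(2, 3, 3)` holds, while dropping a row (`grid_tolerates(2, 2, 3)`) loses the all-correct row and dropping a column (`grid_tolerates(2, 3, 2)`) loses the correct path. This is the pigeonhole argument behind the answers to "how many rows" and "how many filters per row".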

Another look at confidentiality and safety (Duan, Sisi, and Haibin Zhang. "Practical state machine replication with confidentiality." SRDS. IEEE, 2016.): The requests and replies are encrypted? What is the issue? Encryption: deterministic or randomized? Deterministic encryption is not good. Randomized encryption breaks the threshold signature requirement, and the privacy firewall (PF) cannot filter.

Enabling confidentiality

Enabling Confidentiality: AEAD (soundness): an authenticated encryption scheme with associated data. Efficiency: symmetric-key-based authentication and encryption. Randomized BFT (request and reply privacy): random coins, with a request-specific random coin.
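The tension between deterministic and randomized encryption can be illustrated with a toy cipher. This is not the construction from the paper and not an AEAD: `toy_encrypt` is an insecure SHA-256 keystream cipher used only to show the behavior, and deriving the nonce from the request identifier is one possible reading of "request-specific random coin", labeled here as an assumption.

```python
import hashlib
import hmac
import os


def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Expand (key, nonce) into n pseudo-random bytes using SHA-256 in counter mode."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Toy stream cipher for illustration only. NOT secure and not authenticated."""
    ks = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, ks))


def deterministic(key: bytes, plaintext: bytes) -> bytes:
    """Fixed nonce: equal plaintexts give equal ciphertexts, which leaks equality."""
    return toy_encrypt(key, b"\x00" * 16, plaintext)


def randomized(key: bytes, plaintext: bytes) -> bytes:
    """Fresh nonce per call: replicas encrypting the same reply will not match."""
    return toy_encrypt(key, os.urandom(16), plaintext)


def with_shared_coin(key: bytes, coin_key: bytes, request_id: bytes, plaintext: bytes) -> bytes:
    """Assumed approach: all correct replicas derive the same per-request coin."""
    coin = hmac.new(coin_key, request_id, hashlib.sha256).digest()[:16]
    return toy_encrypt(key, coin, plaintext)
```

With `randomized`, two correct replicas produce different ciphertexts for the same reply, so a filter that waits for matching replies never sees a match. With a coin shared per request, correct replicas agree on the ciphertext while different requests still use different coins.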

Byzantine Chain Replication: Shuttle (Van Renesse, Robbert, Chi Ho, and Nicolas Schiper. "Byzantine chain replication." International Conference on Principles of Distributed Systems. Springer, Berlin, Heidelberg, 2012.): Another concept of separating roles. Configurable chain replication.

Replica states and transitions: States: PENDING, ACTIVE, IMMUTABLE. Transitions: orderCommand(p,C,s,o), becomeImmutable(p), switchConfig(C,C’,H), becomeActive(p,C,hist).

Olympus: the centralized oracle. A configuration service that generates configurations, issues inithist statements, and generates a new configuration upon receiving a reconfiguration-request statement.
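The replica life cycle can be sketched as a small state machine. The guards below are assumptions made for illustration (only ACTIVE replicas order commands, becomeActive is only allowed from PENDING, and a slot never changes once ordered); configurations, signatures, and switchConfig are omitted.

```python
from enum import Enum


class State(Enum):
    PENDING = "PENDING"
    ACTIVE = "ACTIVE"
    IMMUTABLE = "IMMUTABLE"


class Replica:
    def __init__(self):
        self.state = State.PENDING
        self.history = {}  # slot -> operation

    def become_active(self, hist: dict) -> None:
        """becomeActive: a PENDING replica is seeded with a history and starts serving."""
        if self.state is not State.PENDING:
            raise RuntimeError("becomeActive is only allowed from PENDING")
        self.history = dict(hist)
        self.state = State.ACTIVE

    def order_command(self, slot: int, op: str) -> None:
        """orderCommand: bind an operation to a slot; only ACTIVE replicas may do so."""
        if self.state is not State.ACTIVE:
            raise RuntimeError("orderCommand requires an ACTIVE replica")
        if self.history.get(slot, op) != op:
            raise RuntimeError("slot already holds a different operation")
        self.history[slot] = op

    def become_immutable(self) -> dict:
        """becomeImmutable: freeze the replica and report its history (wedged statement)."""
        self.state = State.IMMUTABLE
        return dict(self.history)
```

Once a replica is IMMUTABLE it refuses further orderCommand calls, which is what lets Olympus collect a stable history from it during reconfiguration.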

Shuttle: the failure-free case. (1) The client obtains a configuration from Olympus. (2) The client sends operation o to the head of the chain. (3) Each replica validates the request, applies o to obtain a result r, adds the request to the order proof, adds the result to the result proof, and forwards the message to the next replica. (4) The tail forwards the result proof to the client. (5) The result proof is also returned backwards along the chain.
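The failure-free pass along the chain can be sketched as below. Signatures are omitted, the key-value operations are a hypothetical application, and the names `Shuttle`, `ChainReplica`, and `run_chain` are illustrative rather than taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Shuttle:
    slot: int
    op: tuple
    order_proof: list = field(default_factory=list)   # (replica, slot, op) entries
    result_proof: list = field(default_factory=list)  # (replica, result) entries


class ChainReplica:
    def __init__(self, name: str):
        self.name = name
        self.store = {}    # toy key-value application state
        self.history = {}  # slot -> operation ordered in that slot

    def apply(self, op: tuple):
        kind, key, *rest = op
        if kind == "put":
            self.store[key] = rest[0]
            return "OK"
        return self.store.get(key)

    def handle(self, shuttle: Shuttle) -> Shuttle:
        # Validate: every earlier replica must have ordered the same (slot, op).
        for _, slot, op in shuttle.order_proof:
            if (slot, op) != (shuttle.slot, shuttle.op):
                raise ValueError("inconsistent order proof")
        if self.history.get(shuttle.slot, shuttle.op) != shuttle.op:
            raise ValueError("slot already holds a different operation")
        self.history[shuttle.slot] = shuttle.op
        result = self.apply(shuttle.op)
        shuttle.order_proof.append((self.name, shuttle.slot, shuttle.op))
        shuttle.result_proof.append((self.name, result))
        return shuttle


def run_chain(chain: list, slot: int, op: tuple) -> list:
    """Head-to-tail pass; the tail's result proof is what the client receives."""
    shuttle = Shuttle(slot, op)
    for replica in chain:
        shuttle = replica.handle(shuttle)
    return shuttle.result_proof
```

The client would accept the result only if enough entries in the result proof agree; that check, and the backward pass of the result proof, are not modeled here.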

Shuttle under failures: The client does not receive a response in time and retransmits the request. If a server has the result, it returns it to the client. If a server is immutable, it returns an error. The head: if it has the result, it sends it to the client; if it has ordered the request and is still waiting for the result to come back, it starts a timer; if it does not recognize the operation, it orders it again.

Shuttle under failures: If the timer expires at any replica, the replica sends a reconfiguration-request statement to Olympus. Olympus then (1) sends signed wedge requests to all the replicas, and correct replicas respond with wedged statements (becomeImmutable); (2) awaits responses from a quorum of replicas and constructs a history h by selecting the longest order proof for each slot; (3) allocates a new configuration C of replicas and seeds them with h and a configuration statement for C. The new replicas then become active (becomeActive).
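Step (2), building the history from wedged statements, can be sketched as follows. The data layout (replica id mapped to slot, operation, and order proof) and the `quorum` parameter are hypothetical, and the validation of each order proof's signatures is omitted.

```python
def build_history(wedged: dict, quorum: int) -> dict:
    """Select, for each slot, the operation backed by the longest order proof.

    wedged maps replica id -> {slot: (op, order_proof)}, where order_proof is
    the list of replicas that ordered op in that slot (assumed already verified).
    """
    if len(wedged) < quorum:
        raise ValueError("not enough wedged statements to form a quorum")
    best = {}
    for statement in wedged.values():
        for slot, (op, proof) in statement.items():
            if slot not in best or len(proof) > len(best[slot][1]):
                best[slot] = (op, proof)
    return {slot: op for slot, (op, _) in sorted(best.items())}
```

The resulting history h is what Olympus would use to seed the replicas of the new configuration before they become active.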

A comparison: Different primitives and purposes. Simple separation: performance, no confidentiality.

Conclusion: The concept is everywhere: Chubby, ZooKeeper, storage systems. Simple: it separates the design of agreement and storage. Performance: latency is obviously longer.

Conclusion: A centralized service is also a common concept, e.g., the Google File System master service. Pros: simple, efficient. Cons: single point of failure? How about frequent failures?