SOSP 2007 © 2007 Andreas Haeberlen, MPI-SWS 1 Practical accountability for distributed systems Andreas Haeberlen MPI-SWS / Rice University Petr Kuznetsov.

Presentation transcript:

SOSP 2007 © 2007 Andreas Haeberlen, MPI-SWS
Practical accountability for distributed systems
Andreas Haeberlen (MPI-SWS / Rice University), Petr Kuznetsov (MPI-SWS), Peter Druschel (MPI-SWS)

Motivation
Distributed state, incomplete information.
General case: multiple admins with different interests.

General faults occur in practice
Many faults are not 'fail-stop': the node is still running, but its behavior changes.
Examples: hardware malfunctions, misconfigurations, software modifications by users, hacker attacks, ...

Dealing with general faults is difficult
How to detect faults?
How to identify the faulty nodes?
How to convince others that a node is (not) faulty?
[Figure labels: incorrect message, responsible admin]

Learning from the 'offline' world
The offline world relies on accountability. Example: banks.
Requirement              Solution
Commitment               Signed receipts
Tamper-evident record    Double-entry bookkeeping
Inspections              Audits
Accountability can be used to detect, identify and convince.
But: existing fault-tolerance work has mostly focused on prevention.
Goal: a general and practical system for accountability.

Outline
Introduction
What is accountability?
How can we implement it?
How well does it work?

Ideal accountability
Recall that our goal is to detect faults, identify the faulty nodes, and convince others that a node is (or is not) faulty.
Can we build a system that provides the following guarantee?
Whenever a node is faulty in any way, the system generates a proof of misbehavior against that node.
Fault := node deviates from expected behavior.

Can we detect all faults?
Problem: faults that affect only a node's internal state. Detecting these would require online trusted probes at each node.
Focus on observable faults: faults that causally affect a correct node.
This allows us to detect faults without introducing any trusted components.
[Figure: nodes A and C, message X]

Can we always get a proof?
Problem: a he-said-she-said situation. A says "I sent X!" and B says "I never received X!"
Three possible causes: A never sent X, B refuses to accept X, or X was lost by the network.
We cannot get a proof of misbehavior!
Generalize to verifiable evidence: a proof of misbehavior, or a challenge that the node cannot answer.
What if, after a long time, no response has arrived? This does not prove the fault, but we can suspect the node.
[Figure: nodes A, B and C, message X]
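The distinction between a proof of misbehavior, an unanswered challenge and mere suspicion can be sketched in a few lines of Python. This is only an illustration: the class `EvidenceTracker`, the status names and the fixed `timeout` are assumptions made for this sketch and do not appear on the slide.

```python
from enum import Enum


class Status(Enum):
    TRUSTED = "trusted"      # no evidence against the node
    SUSPECTED = "suspected"  # a challenge has gone unanswered for too long
    EXPOSED = "exposed"      # a proof of misbehavior exists


class EvidenceTracker:
    """Hypothetical bookkeeping that one node keeps about another node."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.has_proof = False
        self.open_challenges = {}  # challenge id -> time it was issued

    def record_proof(self) -> None:
        # A proof of misbehavior is final: it can be checked by anyone.
        self.has_proof = True

    def issue_challenge(self, challenge_id: str, now: float) -> None:
        self.open_challenges[challenge_id] = now

    def record_response(self, challenge_id: str) -> None:
        # A valid response refutes the challenge, so the suspicion is dropped.
        self.open_challenges.pop(challenge_id, None)

    def status(self, now: float) -> Status:
        if self.has_proof:
            return Status.EXPOSED
        overdue = any(now - issued > self.timeout
                      for issued in self.open_challenges.values())
        return Status.SUSPECTED if overdue else Status.TRUSTED
```

Note that suspicion is reversible in this sketch, while a proof is not: a slow but correct node can always clear itself by answering.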

Practical accountability
We propose the following definition of a distributed system with accountability:
Whenever a fault is observed by a correct node, the system eventually generates verifiable evidence against a faulty node.
This is useful: any (!) fault that affects a correct node is eventually detected and linked to a faulty node.
It can be implemented in practice.

Outline
Introduction
What is accountability?
How can we implement it?
How well does it work?

Implementation: PeerReview
Adds accountability to a given system. Implemented as a library. Provides secure record, commitment, auditing, etc.
Assumptions:
1. System can be modeled as a collection of deterministic state machines
2. Nodes have reference implementations of the state machines
3. Correct nodes can eventually communicate
4. Nodes can sign messages

PeerReview from 10,000 feet
All nodes keep a log of their inputs and outputs, including all messages.
Each node has a set of witnesses, who audit its log periodically.
If the witnesses detect misbehavior, they generate evidence and make the evidence available to other nodes.
Other nodes check the evidence and report the fault.
[Figure: nodes A to E and message M; A's log, B's log, A's witnesses]

PeerReview detects tampering
What if a node modifies its log entries?
Log entries form a hash chain, inspired by secure histories [Maniatis02].
A signed hash of the log is included with every message, so the node commits to its current state and changes are evident.
[Figure: message and ACK between A and B; B's log with entries Send(X), Recv(Y), Send(Z), Recv(M) and hash chain H0 to H4; Hash(log)]
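A minimal Python sketch of a hash-chained log may make the tampering argument concrete. Everything here is an assumption for illustration: the entry encoding, the class and function names, and the use of an HMAC as a stand-in for a real digital signature (an HMAC is a shared-key MAC, so it does not give third-party verifiability).

```python
import hashlib
import hmac

H0 = b"\x00" * 32  # assumed well-known starting value of the chain


def entry_hash(prev_hash: bytes, seq: int, kind: str, content: bytes) -> bytes:
    """Hash of one log entry, chained to the hash of the previous entry."""
    h = hashlib.sha256()
    h.update(prev_hash)
    h.update(seq.to_bytes(8, "big"))
    h.update(kind.encode())
    h.update(hashlib.sha256(content).digest())
    return h.digest()


class Log:
    def __init__(self, key: bytes):
        self.key = key       # stand-in for the node's private signing key
        self.entries = []    # (seq, kind, content)
        self.hashes = [H0]   # hashes[i] covers entries 1..i

    def append(self, kind: str, content: bytes):
        seq = len(self.entries) + 1
        self.entries.append((seq, kind, content))
        self.hashes.append(entry_hash(self.hashes[-1], seq, kind, content))
        return self.authenticator()

    def authenticator(self):
        """(seq, top hash, 'signature'): sent along with every message."""
        top = self.hashes[-1]
        tag = hmac.new(self.key, top, hashlib.sha256).digest()
        return len(self.entries), top, tag


def matches_commitment(entries, committed_hash: bytes) -> bool:
    """Recompute the chain and compare it with a previously committed hash."""
    h = H0
    for seq, kind, content in entries:
        h = entry_hash(h, seq, kind, content)
    return hmac.compare_digest(h, committed_hash)
```

Because each hash covers the previous one, changing any earlier entry changes every later hash, so a log that was modified after the fact no longer matches the hash the node already signed.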

PeerReview detects inconsistencies
What if a node keeps multiple logs, or forks its log?
Check whether the signed hashes form a single hash chain.
[Figure: hash chain H0 to H4 ("View #1") and a fork H3', H4' ("View #2"); operations Create X, Read X, Read Z with results OK and Not found]
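The consistency check can be sketched as follows, assuming that witnesses have collected (sequence number, hash) pairs that the node signed at various times, and that the node has since presented one log for audit. The helper names and the entry encoding are hypothetical, and signature verification is left out to keep the sketch short.

```python
import hashlib

H0 = b"\x00" * 32  # assumed well-known starting value of the chain


def chain(entries):
    """Return [H0, H1, ..., Hn] for a list of log entries (bytes)."""
    hashes = [H0]
    for entry in entries:
        digest = hashlib.sha256(entry).digest()
        hashes.append(hashlib.sha256(hashes[-1] + digest).digest())
    return hashes


def inconsistent_authenticators(presented_log, authenticators):
    """Return every (seq, hash) pair that is not on the presented log's chain.

    A non-empty result means the node signed a hash belonging to a different
    log: the signed pair plus the presented log is evidence of a fork.
    """
    hashes = chain(presented_log)
    return [(seq, h) for seq, h in authenticators
            if seq >= len(hashes) or hashes[seq] != h]
```

For example, if a node shows one client a log ending in "Read X" and another client a log ending in "Read Z", the hash signed for the second view cannot lie on the chain of the first.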

PeerReview detects faults
How to recognize faults in a log?
Assumption: the node can be modeled as a deterministic state machine.
To audit a node: replay the inputs to a trusted copy of the state machine, then check the outputs against the log.
[Figure: network input and output recorded in the log; state machine with Module A and Module B; replayed outputs compared (=?) against the log]
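The replay audit can be sketched in Python under some simplifying assumptions that are not stated on the slide: the log is a list of ("in", msg) and ("out", msg) records, the reference implementation is a pure function `step(state, msg)` returning the new state and the expected outputs, and all outputs caused by an input are logged before the next input. The toy `counter_step` machine is purely illustrative.

```python
def audit(step, initial_state, log):
    """Replay logged inputs through a trusted reference state machine.

    Returns None if the log is consistent with the reference behavior,
    otherwise the index of the first record that deviates from it.
    """
    state = initial_state
    expected = []  # outputs the reference produced but the log has not shown yet
    for i, (kind, msg) in enumerate(log):
        if kind == "in":
            if expected:           # an expected output is missing from the log
                return i
            state, outputs = step(state, msg)
            expected = list(outputs)
        elif kind == "out":
            if not expected or expected.pop(0) != msg:
                return i           # unexpected or incorrect output
        else:
            raise ValueError(f"unknown record kind: {kind!r}")
    return None if not expected else len(log)


def counter_step(total, msg):
    """Toy deterministic state machine: keeps a running sum and reports it."""
    total += msg
    return total, [total]
```

Determinism is what makes this work: given the same inputs, a correct node must have produced exactly the outputs that the reference copy produces.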

PeerReview offers provable guarantees
PeerReview guarantees that:
1) Faults will be detected: if a node commits a fault and has a correct witness, then the witness obtains a proof of misbehavior (PoM), or a challenge that the faulty node cannot answer.
2) Good nodes cannot be accused: if a node is correct, there can never be a PoM against it, and it can answer any challenge.
Formal definitions and proof are in a technical report (TR).

Outline
Introduction
What is accountability?
How can we implement it?
How well does it work? Is it widely applicable? How much does it cost? Does it scale?

PeerReview is widely applicable
App #1: NFS server in the Linux kernel. Many small, latency-sensitive requests. Faults: tampering with files, lost updates.
App #2: Overlay multicast. Transfers a large volume of data. Faults: freeloading, tampering with content.
App #3: P2P. Complex, large, decentralized. Faults: denial of service, attacks on DHT routing.
Other faults listed: metadata corruption, incorrect access control, censorship.
More information in the paper.

How much does PeerReview cost?
The dominant cost depends on the number of witnesses W; there is an O(W^2) component.
[Figure: average traffic (Kbps/node) vs. number of witnesses, with W dedicated witnesses; components: baseline traffic, signatures and ACKs, checking logs]

Mutual auditing
A small probability of error is inevitable. Example: replication.
We can use this to optimize PeerReview: accept that an instance of a fault is found only with high probability.
A small random sample of peers is chosen as witnesses.
Asymptotic complexity: O(N^2) → O(log N)

PeerReview is scalable
Assumption: up to 10% of nodes can be faulty.
Probabilistic guarantees enable scalability.
Example: the system scales to over 10,000 nodes with P=
[Figure: average traffic (Kbps/node) vs. system size (nodes); curves for system w/o accountability, system + PeerReview (P= ) and system + PeerReview (P=1.0); DSL/cable upstream marked; growth labels O((log N)^2) and O(log N)]
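A back-of-the-envelope calculation illustrates why a small random witness set can suffice. The slides do not give a formula, so the model below is an assumption: each witness is drawn independently and uniformly from a population in which a fraction f of nodes is faulty, so the chance that a node has no correct witness at all is f^W. The function names are hypothetical.

```python
def prob_no_correct_witness(f: float, w: int) -> float:
    """Probability that all w independently drawn witnesses are faulty."""
    return f ** w


def min_witnesses(f: float, target: float) -> int:
    """Smallest w such that at least one witness is correct with
    probability >= target, under the independence assumption above."""
    if not (0 < f < 1 and 0 < target < 1):
        raise ValueError("f and target must lie strictly between 0 and 1")
    w = 1
    while 1 - prob_no_correct_witness(f, w) < target:
        w += 1
    return w
```

With f = 0.1, as in the slide's assumption, each additional witness cuts the failure probability by a factor of ten, which is why the required witness set stays small even for strict targets.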

Summary
Accountability is a new approach to handling faults in distributed systems: it detects faults, identifies the faulty nodes, and produces evidence.
Our practical definition of accountability: whenever a fault is observed by a correct node, the system eventually generates verifiable evidence against a faulty node.
PeerReview: a system that enforces accountability. It offers provable guarantees and is widely applicable.
Thank you!
Andreas Haeberlen received a SOSP travel scholarship, which was supported by Infosys.