Failure Resilience in the Peer-to-Peer System OceanStore
Speaker: Corinna Richter



Corinna Richter: Failure Resilience, 19/09/2015

Outline
- Introduction
- An Overview of OceanStore
- Failure Resilience in OceanStore
- Byzantine Fault Protocol
- Proactive Threshold Signatures
- Erasure Coding
- Summary
- Questions

Introduction
- Failure resilience: "A system responds according to the specification in spite of a limited number of faults"
- Availability
- Reliability
- How does this work in open peer-to-peer systems?
- Specific problems
- Solutions in OceanStore

OceanStore: Basics
- "Internet-scale, persistent data store designed for incremental scalability, secure sharing and long-term durability"
- The infrastructure is constantly changing and untrusted except in aggregate
[Figure: clients, inner ring, replicas and archival storage. Source: John Kubiatowicz]

OceanStore: Inner Ring
- Primary replica for one data object
- Serializes update actions for this object
- Checks the correctness of the update
- "Knows" the current version of the object
- Implemented by a group of servers: distributed load => no single point of failure
- What about correct decisions if some hosts are faulty?

Byzantine Fault Protocol - Problem
- Byzantine faults vs. fail-stop processes
- Fail-stop: omission, crash => no reaction
- Byzantine faults => reaction might be faulty
- How many faulty processes are tolerable?
- How can all correct processes (of the primary replica) reach the same decision?
- Illustration: the Byzantine Generals Problem

Byzantine Fault Problem - Model
- The Byzantine fault problem has a solution only if fewer than one third of the processes are faulty!
[Figure: two scenarios in which a commander and the processes P1 and P2 exchange "Go!" and "Stop!" messages. Who is the traitor?]

Byzantine Fault Problem - A "proof" by intuition
- Primary replica with f=3 faulty hosts, n=?
- 3 answers may be delayed and faulty => the client can't wait for more than n-3 messages.
- 3 of these n-3 messages may still be faulty => must have (n-3)-3 > 3 => n > 9
[Figure: a client sends update X to the primary replica]
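The same counting argument can be written out for a general number f of faulty hosts. This is only a restatement of the slide's intuition, not a formal proof; the symbols n and f are the ones used on the slide.

```latex
\begin{align*}
\text{replies the client can wait for} &= n - f \\
\text{correct replies among them (worst case)} &= (n - f) - f = n - 2f \\
n - 2f > f &\iff n > 3f \\
f = 3 &\implies n > 9, \text{ i.e. } n \ge 10 = 3f + 1
\end{align*}
```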

Byzantine Fault Protocol
- Example: order of updates - position of update X?
- Round 1: P1 sends its decision to the n-1 other processes
- Round 2: each of the n-1 processes sends the value it received to n-2 processes
- Round i: use the majority of round i-1
- After round f+1: P2: (i, i, k) => i, P3: (i, i, k) => i
[Figure: processes P1 to P4 exchanging the values i and k]
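The two-round example above (n=4, f=1) can be sketched as a small simulation. It is a minimal illustration under assumptions, not the protocol as implemented in OceanStore: the function names, the choice of P4 as the faulty process and the lie value "k" are all hypothetical.

```python
from collections import Counter

def majority(values):
    """Return the most frequent value; ties are broken by sorted order."""
    counts = Counter(values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)

def two_round_agreement(commander_value, lieutenants, faulty, lie="k"):
    # Round 1: the (correct) commander P1 sends its value to every lieutenant.
    received = {p: commander_value for p in lieutenants}
    decisions = {}
    for p in lieutenants:
        # Round 2: every other lieutenant relays what it received;
        # a faulty lieutenant relays `lie` instead (assumed behaviour).
        relayed = [lie if q in faulty else received[q]
                   for q in lieutenants if q != p]
        # Decision: majority over the own value and the relayed values.
        decisions[p] = majority([received[p]] + relayed)
    return decisions

decisions = two_round_agreement("i", ["P2", "P3", "P4"], faulty={"P4"})
print(decisions["P2"], decisions["P3"])
```

The correct processes P2 and P3 each see the vector (i, i, k) and decide on i, as on the slide; the decision computed for the faulty P4 is irrelevant.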

Byzantine Fault Problem - Solution in OceanStore
- How can a system guarantee this?
- Other systems: reboot of a secure partition at regular intervals
- OceanStore: dynamically exchange the servers of the inner ring
- Responsible Party
  - chooses the hosts for the inner ring
  - analyses the stability of the hosts
  - a system can have several Responsible Parties

BFP with signed messages: OceanStore
- Symmetric keys vs. asymmetric keys:
  - MACs for the internal communication of the inner ring
  - Public key for the communication with others
- Proactive threshold signatures:
  - One public key for all n hosts of the inner ring
  - Generate n=3f+1 private key shares
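The symmetric-key half of this slide can be illustrated with Python's standard hmac module. This is a sketch only: the shared key, the message contents and the helper names are hypothetical, and nothing here is taken from the OceanStore code.

```python
import hashlib
import hmac

# Hypothetical key shared by the hosts of the inner ring.
SHARED_KEY = b"hypothetical-inner-ring-key"

def tag(message, key=SHARED_KEY):
    """Compute a MAC over the message with the shared symmetric key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message, mac, key=SHARED_KEY):
    """Check a MAC in constant time."""
    return hmac.compare_digest(tag(message, key), mac)

msg = b"update X at position i"
mac = tag(msg)
print(verify(msg, mac))          # message accepted
print(verify(b"tampered", mac))  # message rejected
```

A MAC can only be checked by holders of the shared key, which fits the inner ring's internal traffic; parties outside the ring need a public key to check results, which is where the threshold signature comes in.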

Proactive Threshold Signatures - BFP in OceanStore
- f+1 private key shares are combined into a full signature
  - at least one of these messages comes from a correct host
  - all correct hosts work deterministically
- Exchange of the servers
  - no interruption: the public key stays unchanged
  - generate a new set of private key shares and delete the old set
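The "f+1 of n shares" idea and the share refresh can be illustrated with Shamir secret sharing. Note that this is a swapped-in, simpler technique: it shares a secret number rather than producing signatures, and it is not the threshold signature scheme used in OceanStore. The prime, the toy secret and all function names are assumptions made for this sketch.

```python
import random

PRIME = 2**127 - 1  # Mersenne prime used as the field modulus (toy choice)

def make_shares(secret, threshold, n, rng):
    """Split `secret` into n shares; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]

    def poly(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, poly(x)) for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 over the given shares."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * -xm) % PRIME
                den = (den * (xj - xm)) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret

f = 1
n = 3 * f + 1                 # n = 3f+1 hosts in the inner ring
rng = random.Random(0)        # not cryptographically secure; illustration only
secret = 123456789            # stands in for the ring's private key
shares = make_shares(secret, f + 1, n, rng)
refreshed = make_shares(secret, f + 1, n, rng)  # new share set, same secret
print(recover(shares[:f + 1]) == secret, recover(refreshed[1:3]) == secret)
```

Any f+1 shares from one set recover the secret, and a refreshed set encodes the same secret, which mirrors the point that the public key stays unchanged when the servers are exchanged.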

OceanStore: Update
[Figure labels: Primary Replica, Write object x, Other users, Archival storage, Secondary, Dissemination Tree. Source: S. Rhea, P. Eaton, D. Geels, H. Weatherspoon, B. Zhao, and J. Kubiatowicz]

Erasure Coding - Motivation
- Data availability must be guaranteed
  - Omission of hosts, crashes, etc.
- Redundancy of the data
  - replicated, distributed data storage on several hosts
- Problem of naive replication
  - inefficient with respect to the total storage consumed
=> Erasure coding

Erasure Coding
- Idea: divide one block of data into m fragments and encode these into n fragments (n > m).
- Distribute these n fragments arbitrarily across the hosts.
- m/n = r, the rate of encoding
- Storage costs are multiplied by n/m
- Example: m=16, n=32, r=1/2, storage costs x 2
[Figure: m=16 fragments are encoded into 32 fragments on distributed servers]
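The idea can be tried out with the smallest possible erasure code: m=2 data fragments plus one XOR parity fragment, so n=3 and r=2/3. This toy code is an assumption chosen for brevity and is not the Reed-Solomon code used in Pond; the function names are hypothetical.

```python
def xor(a, b):
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(block):
    """Split an even-length block into 2 fragments and add 1 parity fragment."""
    half = len(block) // 2
    a, b = block[:half], block[half:]
    return [a, b, xor(a, b)]          # n = 3 fragments from m = 2

def decode(fragments):
    """Rebuild the block from any 2 fragments, given as {index: bytes}."""
    if 0 in fragments and 1 in fragments:
        a, b = fragments[0], fragments[1]
    elif 0 in fragments:
        a = fragments[0]
        b = xor(a, fragments[2])      # b = a XOR parity
    else:
        b = fragments[1]
        a = xor(b, fragments[2])      # a = b XOR parity
    return a + b

block = b"oceanstore"                 # 10 bytes, even length assumed
frags = encode(block)
print(decode({0: frags[0], 2: frags[2]}) == block)
print(decode({1: frags[1], 2: frags[2]}) == block)
```

Storage grows by n/m = 1.5 here, while full replication that survives one lost host would need a factor of 2.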

Erasure Coding: Efficiency
- POND: Cauchy Reed-Solomon code with m = 16 and n = 32
- The data can be reconstructed from any m fragments
  - complex algorithm for (de-)coding
- Data availability is determined by the possible permutations of the fragments
  - increased by a factor of 4000 for n=32
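Why "any m of n" helps availability can be checked with a short calculation. The model below is an assumption made for illustration: every host is reachable independently with the same probability p, and the value p = 0.9 is hypothetical. It is not meant to reproduce the factor of 4000 quoted on the slide.

```python
from math import comb

def availability(n, m, p):
    """Probability that at least m of n fragments are reachable,
    assuming each host is up independently with probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m, n + 1))

p = 0.9                                  # hypothetical per-host availability
replicated = availability(2, 1, p)       # 2 full copies, storage cost x 2
erasure = availability(32, 16, p)        # m=16, n=32, storage cost x 2
print(replicated, erasure)
```

Both layouts cost twice the original storage, but under this model the erasure-coded block is unavailable only if more than 16 of the 32 hosts are down at once.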

Erasure Coding: Disadvantages
- The primary replica has to compute the encoding and decoding of the fragments
  - Very expensive operation!
- Decode only if there is no secondary replica for this object
  => whole-block caching

OceanStore: Dissemination Tree
- Tree structure for one data object
  - root: primary replica
  - nodes: secondary replicas in cache
- Publication of updates down the tree
- Self-organising structure
[Figure: primary replica at the root, secondary replicas below]
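Pushing an update down such a tree can be sketched in a few lines. The class and method names are hypothetical, and the sketch leaves out everything that makes the real tree self-organising.

```python
class Replica:
    """A node of the dissemination tree for one data object."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.version = 0

    def add_child(self, child):
        self.children.append(child)
        return child

    def publish(self, version):
        """Apply the update locally, then forward it to all children."""
        self.version = version
        for child in self.children:
            child.publish(version)

primary = Replica("primary")                       # root of the tree
s1 = primary.add_child(Replica("secondary-1"))
s2 = primary.add_child(Replica("secondary-2"))
s3 = s1.add_child(Replica("secondary-3"))          # second level
primary.publish(1)
print([r.version for r in (primary, s1, s2, s3)])
```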

Summary
- OceanStore: Internet-scale, global, persistent data store
- Interesting solutions for failure resilience in peer-to-peer systems
  - Proactive threshold signatures
  - Byzantine fault protocol
  - Erasure coding
- Results of a prototype implementation
  - Threshold signatures are not efficient to compute
- Further research based on the OceanStore API

Failure Resilience: Questions?