Ivy: A Read/Write Peer-to- Peer File System Authors: Muthitacharoen Athicha, Robert Morris, Thomer M. Gil, and Benjie Chen Presented by Saurabh Jha 1.

Slides:



Advertisements
Similar presentations
Andrew File System CSS534 ZACH MA. History  Originated in October 1982, by the Information Technology Center (ITC) formed with Carnegie Mellon and IBM.
Advertisements

Peer-to-Peer (P2P) Distributed Storage 1Dennis Kafura – CS5204 – Operating Systems.
SUNDR: Secure Untrusted Data Repository
Serverless Network File Systems. Network File Systems Allow sharing among independent file systems in a transparent manner Mounting a remote directory.
Ivy: A Read/Write P2P File System Athicha Muthitacharoan, Robert Morris, Thomer Gil and Benjie Chen Presented by Rachel Rubin CS 294-4, Fall 2003.
G Robert Grimm New York University Disconnected Operation in the Coda File System.
Ivy: A Read/Write Peer-to- Peer File System A.Muthitacharoen, R. Morris, T. Gil, and B. Chen Presented by: Matthew Allen.
Distributed File System: Design Comparisons II Pei Cao Cisco Systems, Inc.
A Peer-to-Peer File System OSCAR LAB. Overview A short introduction to peer-to-peer (P2P) Systems Ivy: a read/write P2P file system (OSDI’02)
Wide-area cooperative storage with CFS
Distributed File System: Data Storage for Networks Large and Small Pei Cao Cisco Systems, Inc.
Distributed File System: Design Comparisons II Pei Cao.
Northwestern University 2007 Winter – EECS 443 Advanced Operating Systems The Google File System S. Ghemawat, H. Gobioff and S-T. Leung, The Google File.
Case Study - GFS.
File Systems (2). Readings r Silbershatz et al: 11.8.
Distributed File Systems Sarah Diesburg Operating Systems CS 3430.
Network File Systems Victoria Krafft CS /4/05.
Distributed File Systems
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google∗
Distributed File Systems Concepts & Overview. Goals and Criteria Goal: present to a user a coherent, efficient, and manageable system for long-term data.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.
1 The Google File System Reporter: You-Wei Zhang.
Wide-Area Cooperative Storage with CFS Robert Morris Frank Dabek, M. Frans Kaashoek, David Karger, Ion Stoica MIT and Berkeley.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
1 JTE HPC/FS Pastis: a peer-to-peer file system for persistant large-scale storage Jean-Michel Busca Fabio Picconi Pierre Sens LIP6, Université Paris 6.
Distributed File Systems
Ivy: A Read/Write Peer-to-Peer File System A. Muthitacharoen, R. Morris, T. M. Gil, and B. Chen In Proceedings of OSDI ‘ Presenter : Chul Lee.
Distributed File Systems Overview  A file system is an abstract data type – an abstraction of a storage device.  A distributed file system is available.
Chapter 20 Distributed File Systems Copyright © 2008.
Chapter 10: File-System Interface Silberschatz, Galvin and Gagne ©2005 Operating System Concepts – 7 th Edition, Jan 1, 2005 Chapter 10: File-System.
CEPH: A SCALABLE, HIGH-PERFORMANCE DISTRIBUTED FILE SYSTEM S. A. Weil, S. A. Brandt, E. L. Miller D. D. E. Long, C. Maltzahn U. C. Santa Cruz OSDI 2006.
Advanced Computer Networks Topic 2: Characterization of Distributed Systems.
Introduction to DFS. Distributed File Systems A file system whose clients, servers and storage devices are dispersed among the machines of a distributed.
1 Chap8 Distributed File Systems  Background knowledge  8.1Introduction –8.1.1 Characteristics of File systems –8.1.2 Distributed file system requirements.
MapReduce and GFS. Introduction r To understand Google’s file system let us look at the sort of processing that needs to be done r We will look at MapReduce.
Chord+DHash+Ivy: Building Principled Peer-to-Peer Systems Robert Morris Joint work with F. Kaashoek, D. Karger, I. Stoica, H. Balakrishnan,
Paper Survey of DHT Distributed Hash Table. Usages Directory service  Very little amount of information, such as URI, metadata, … Storage  Data, such.
A Low-bandwidth Network File System Athicha Muthitacharoen et al. Presented by Matt Miller September 12, 2002.
Distributed File Systems Architecture – 11.1 Processes – 11.2 Communication – 11.3 Naming – 11.4.
Computer Science Lecture 19, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
CS425 / CSE424 / ECE428 — Distributed Systems — Fall 2011 Some material derived from slides by Prashant Shenoy (Umass) & courses.washington.edu/css434/students/Coda.ppt.
1 JTE HPC/FS Pastis: a peer-to-peer file system for persistant large-scale storage Jean-Michel Busca Fabio Picconi Pierre Sens LIP6, Université Paris 6.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Distributed File Systems Architecture – 11.1 Processes – 11.2 Communication – 11.3 Naming – 11.4.
© Oxford University Press 2011 DISTRIBUTED COMPUTING Sunita Mahajan Sunita Mahajan, Principal, Institute of Computer Science, MET League of Colleges, Mumbai.
Distributed File Systems Questions answered in this lecture: Why are distributed file systems useful? What is difficult about distributed file systems?
1 JTE HPC/FS Pastis: a peer-to-peer file system for persistant large-scale storage Jean-Michel Busca Fabio Picconi Pierre Sens LIP6, Université Paris 6.
Distributed File System. Outline Basic Concepts Current project Hadoop Distributed File System Future work Reference.
Distributed Systems: Distributed File Systems Ghada Ahmed, PhD. Assistant Prof., Computer Science Dept. Web:
The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Presenter: Chao-Han Tsai (Some slides adapted from the Google’s series lectures)
Computer Science Lecture 19, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
Truly Distributed File Systems Paul Timmins CS 535.
CalvinFS: Consistent WAN Replication and Scalable Metdata Management for Distributed File Systems Thomas Kao.
DISTRIBUTED FILE SYSTEM- ENHANCEMENT AND FURTHER DEVELOPMENT BY:- PALLAWI(10BIT0033)
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung
Slide credits: Thomas Kao
Distributed File Systems
Distributed P2P File System
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
Distributed File Systems
Distributed File Systems
CSE 451: Operating Systems Spring Module 21 Distributed File Systems
Distributed File Systems
Distributed File Systems
CSE 451: Operating Systems Winter Module 22 Distributed File Systems
CSE 451: Operating Systems Distributed File Systems
DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S
THE GOOGLE FILE SYSTEM.
Distributed File Systems
Distributed File Systems
Presentation transcript:

Ivy: A Read/Write Peer-to- Peer File System Authors: Muthitacharoen Athicha, Robert Morris, Thomer M. Gil, and Benjie Chen Presented by Saurabh Jha 1

Introduction - P2P Storage Systems Enables programs to store and access remote files exactly as they do local ones Goal: Build a Distributed File System on P2P model Allow users to share storage or files easily and arbitrarily The focus of this talk Ivy – A Read/Write Peer-to-Peer File Systems Presented by Saurabh Jha for CS UIUC 2

Outline Introduction Motivation for P2P DFS Challenges in building P2P DFS Ivy Introduction Design Semantics Experiments / Evaluation Pros/Cons Discussion Presented by Saurabh Jha for CS UIUC3

Motivation ModelExamples of File Systems Client Server ModelNFS, CIFS, AFS etc. Cluster of ServersZebraFS, xFS, Lustre etc. P2PCFS, Ivy etc. Client Server Model Cluster of Servers Model P2P Model - Single Point of Failure - Server performance bottleneck + Easy to setup - Dedicated - Too many design considerations + Scalable and Trusted - Too many design considerations - Hard to achieve popular file semantics + Not dedicated Presented by Saurabh Jha for CS UIUC 4

Challenges in Designing P2P Systems Peer issues Decentralization Churn Trust Accounting Difficult with multiple shared writers Difficult with churn and partitioning Difficult with trusted partners gone rogue Difficult to provide popular NFS like semantics for accessing files File issues Consistency/Integrity Persistence Security Transparency Presented by Saurabh Jha for CS UIUC5

Ivy A multi-user read/write P2P file system Log-based file system Logs stored in DHash distributed hash table Allows transparent access to files using NFS interface Users do not have to trust each other Allows to operate under partitioning Consistency and conflict resolution semantics defined Close to Open consistency of file data Figure - Ivy Software Architecture Solves Trust Issues NFS File Operation Semantics Handles P2P Part Presented by Saurabh Jha for CS UIUC6

Design Ivy uses set of logs One per participant Linked list of immutable records Both metadata and data Owner appends own log only but can read all of them Each file system has its own view block Created by participants or a new participant joins this view block View block is essentially a data structure pointing to all log heads log-head points to most recent log record Maintain private snapshot of the system Figure – Example Ivy View and Logs Presented by Saurabh Jha for CS UIUC7

Supporting File System Operations Table - Log Types A participant consult all logs to find relevant information Ivy uses version vector to order and create a consistent view of the file system Concurrent versions are resolved by ordering public keys Presented by Saurabh Jha for CS UIUC8

Cache Consistency An update operation of one Ivy participant is visible to another almost immediately except during partitioning The immutable nature of logs helps provide better consistency model than NFS Provides close-to-open consistency for file data Caches file blocks along with version vector Use this cached version to serve future requests if no change in the log heads Presented by Saurabh Jha for CS UIUC 9

Updates Concurrent Updates Concurrent updates are ordered using public keys When the write is complete [close()], participants agree on this order or the application manually resolves it Lack of serialization (as offered by centralized file system) can create conflicting views and unintended operations Exclusive creation of directories except during partitioning Partitioned Updates Ivy not aware of partitions, hence conflicting updates can be made Relies on DHash servers to make sure that updates are available to each peer [Not Guaranteed though!] Healing ensures that meta-data structure is consistent. May lead to lost updates. Presented by Saurabh Jha for CS UIUC10

Experiments Uses Modified Andrew Benchmark (MAB) for evaluation Create dir hierarchy, Copy files, Walk directory while reading attr for each file, read files, compile files into programs Experiment Configuration Mode Single Node WAN Parameters (varying) # Participants # Concurrent Writers Snapshot interval Presented by Saurabh Jha for CS UIUC11

Evaluation 386 NFS RPCs 508 Log Updates taking 7.2 seconds Inserts 8.8MB for 1.6 MB data 1 NFS requests causes 3 log head fetches Total Fetches: 3346 Performance decreases due to increased number of rtts’ ~ 3X Slower WANSingle Node Presented by Saurabh Jha for CS UIUC 12

Evaluation Many Logs/ One WriterMany DHash ServersMany Writers Little Impact: Logs are fetched in parallel Runtime growth due to increase in chord #packets used to coordinate to DHash servers Runtime growth due to increased number of log heads and cost of fetching Presented by Saurabh Jha for CS UIUC 13

Pros NFS like semantic, fully functional with existing OS’es No need for dedicated servers Provision for security, recovery and integrity Defined useful semantics No locks ! Presented by Saurabh Jha for CS UIUC14

Cons Huge Storage Requirement - Immutable and append on only logs What happens when participants run out of storage May be not, storage is cheap! 2 – 3 X slower, would you be really using it ? CVS Manual conflict resolutions! Scalability Issues Work more focused on showing the possibility of concept through implementation rather than design itself. Some choices are arbitrary Presented by Saurabh Jha for CS UIUC15

Discussion: How did Ivy handle these challenges? Unanswered Question No model to show the impact of node to node latency How does churn rate affect file system performance How to detect rogue agents? How much should each peer contribute? Manual conflict resolution, can there be some abstraction? Peer issues Decentralization Churn Trust Accounting Transparency Presented by Saurabh Jha for CS UIUC16

Comparison of P2P Storage Systems 17