Outline Distributed File Systems – continued 11/27/2018 COP5611.


Distributed File Systems
- A distributed file system is a resource management component of a distributed operating system.
- It implements a common file system shared by all the computers in the system.
- Two important goals:
  - Network transparency
  - High availability

Architecture

Architecture – cont.
- For performance reasons, distributed file systems are normally organized as a client-server architecture.
- File servers store files and perform storage and retrieval operations at clients' requests.
- The two most important components are:
  - Name server
  - Cache manager

Architecture – cont.

Architecture – cont.

Mounting
- Mounting binds different file systems together to form a single, hierarchically structured name space.
- It is widely used on UNIX machines, both local and distributed.
- In distributed file systems, file systems maintained by remote servers are mounted at the clients.
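As a rough illustration of how a client might map a local path onto a mounted remote file system, here is a minimal sketch. The mount table contents, the server names, and the function `resolve` are all hypothetical; real mount handling lives in the kernel and is considerably more involved.

```python
# Hypothetical mount table: local mount point -> (server, exported directory).
MOUNTS = {
    "/home": ("serverA", "/export/home"),
    "/usr/share/docs": ("serverB", "/export/docs"),
}

def resolve(path, mounts=MOUNTS):
    """Return (server, remote_path), or ("local", path) if nothing is mounted there."""
    best = None
    for mount_point in mounts:
        # A mount point matches if it is the path itself or one of its parents.
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point  # prefer the longest (most specific) match
    if best is None:
        return ("local", path)
    server, export = mounts[best]
    return (server, export + path[len(best):])
```

With this table, `resolve("/home/alice/notes.txt")` points at `serverA`, while `/etc/passwd` stays local. The user sees one name space either way.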

Mounting – cont.

Mounting – cont.

Mounting – cont.

Mounting – cont.

Automounting – cont.

Automounting – cont.

Caching
- Caching is commonly used in distributed file systems to reduce delays in accessing data.
- In file caching, a copy of data stored at a remote file server is brought to the client, reducing access delays due to network latency.
- The effectiveness of caching relies on temporal locality in programs.
- Files can also be cached at the server side.

Client Caching

Client Caching – cont.

Cache Consistency

Hints
- Hints are an alternative approach to caching: the cached data is treated as hints.
- Cached data is not guaranteed to be completely accurate.
- The cache consistency issue is ignored in this approach.
- This is useful for applications that can recover from invalid cached data.
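One way to picture hints is a cache whose entries are checked at the moment of use, with a fallback to an authoritative lookup when a hint turns out to be wrong. The class `HintCache` and its callback parameters are assumptions made for this sketch, not part of any particular system.

```python
class HintCache:
    """Cache whose entries are hints: cheap to use, but verified on use."""

    def __init__(self, authoritative_lookup):
        self._lookup = authoritative_lookup  # slow but always correct
        self._hints = {}

    def locate(self, name, still_valid):
        hint = self._hints.get(name)
        if hint is not None and still_valid(name, hint):
            return hint                      # hint was right: fast path
        fresh = self._lookup(name)           # hint missing or wrong: recover
        self._hints[name] = fresh
        return fresh
```

No invalidation messages are needed: a stale hint only costs one extra lookup. That is why the approach suits applications that can detect a bad answer.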

Bulk Data Transfer
- Bulk data transfer means transferring multiple data blocks instead of just the block being referenced by the client.
- It is motivated by spatial locality (blocks near the referenced one are likely to be needed soon) and by the fact that most files are accessed in their entirety.
- It reduces network communication overhead by reducing the per-block cost of executing communication protocols.
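The saving can be sketched by counting server requests for a sequential read. The function `count_requests` and its parameters are hypothetical and ignore everything except the number of requests issued.

```python
def count_requests(blocks_read, blocks_per_request):
    """Count server requests for a sequential read of blocks 0..blocks_read-1."""
    cached = set()
    requests = 0
    for block in range(blocks_read):
        if block not in cached:
            requests += 1
            # One request brings in the referenced block plus the ones after it.
            cached.update(range(block, block + blocks_per_request))
    return requests
```

Reading 64 blocks one block per request takes 64 requests. Fetching 8 blocks per request takes 8, so the fixed protocol cost is paid an eighth as often.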

Security

Naming in Distributed File Systems
- A name in a file system is a way to reference a file or a directory.
- Name resolution is the process of mapping a name to an object (or, in the case of replication, to multiple objects).
- A name space is a collection of names.

Naming in a Local File System

Naming in a Local File System – cont.

Naming in Distributed File Systems – cont.
- Three approaches to naming in distributed file systems:
  - Concatenate the host name to the names of files (the simplest scheme)
    - Not network transparent
    - Not location-independent
  - Mount remote directories onto local directories
    - Location transparent but not network transparent
  - A single global directory
    - Limited to a few cooperating computers

Naming in Distributed File Systems – cont.
- Contexts can be used to partition a file name space.
- A file name then consists of a context and a name local to that context.
- Name resolution involves interpreting the name within a context, which may invoke other contexts recursively.
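Recursive resolution within contexts can be sketched with nested dictionaries, where each dictionary plays the role of a context. The data in `root` and the function `resolve_name` are made up for illustration.

```python
def resolve_name(context, name):
    """Resolve 'a/b/c' by interpreting each component in the current context."""
    head, _, rest = name.partition("/")
    entry = context[head]             # raises KeyError if the name is unbound
    if rest == "":
        return entry
    # The entry is itself a context; resolve the remainder inside it.
    return resolve_name(entry, rest)

# Hypothetical name space: each nested dict is a context.
root = {"projects": {"dfs": {"notes.txt": "file#17"}}, "readme": "file#1"}
```

Resolving `projects/dfs/notes.txt` from `root` passes through three contexts before it reaches the object.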

Naming in Distributed File Systems – cont.
- Name servers are responsible for name resolution in distributed file systems.
- A name server is a process that maps names specified by clients to stored objects such as files and directories.
- A single name server vs. multiple name servers

Caches on Disk or Memory
- Cache in main memory vs. cache on a local disk
- Cache in main memory
  - Advantages
  - Disadvantages
- Cache on a local disk

Writing Policy
- The writing policy is closely related to cache consistency.
- It decides what to do when a cache block at the client is modified.
- Several policies are possible:
  - Write-through
  - Delayed writing: write back after some time
  - Delayed writing: write back when the file is closed
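The trade-off between write-through and write-on-close can be sketched with a toy server that counts the writes it receives. `Server` and `CachedFile` are hypothetical classes. The time-based delayed policy is not modelled here.

```python
class Server:
    def __init__(self):
        self.blocks = {}
        self.writes = 0           # number of write requests received

    def store(self, block_no, data):
        self.blocks[block_no] = data
        self.writes += 1

class CachedFile:
    def __init__(self, server, write_through):
        self.server = server
        self.write_through = write_through
        self.cache = {}
        self.dirty = set()        # modified blocks not yet sent to the server

    def write(self, block_no, data):
        self.cache[block_no] = data
        if self.write_through:
            self.server.store(block_no, data)   # server is updated immediately
        else:
            self.dirty.add(block_no)            # defer until close()

    def close(self):
        for block_no in sorted(self.dirty):
            self.server.store(block_no, self.cache[block_no])
        self.dirty.clear()
```

Three writes to the same block cost three server writes under write-through. With write-on-close they cost one, but the server's copy is stale until the file is closed.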

Cache Consistency
- Schemes to guarantee consistency:
  - Server-initiated approach
    - Servers inform the cache managers whenever the data in client caches becomes stale.
    - Cache managers can retrieve the new data when needed.
  - Client-initiated approach
    - Cache managers validate data with the server before returning it to the clients.
  - Limited caching
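The server-initiated and client-initiated approaches can be contrasted in a small sketch. All class names are hypothetical, the "network" is a plain method call, and version numbers stand in for whatever validation token a real system would use.

```python
class FileServer:
    def __init__(self):
        self.data = {}
        self.version = {}
        self.cachers = {}   # file name -> set of clients holding a cached copy

    def read(self, name, client=None):
        if client is not None:
            self.cachers.setdefault(name, set()).add(client)
        return self.data[name], self.version[name]

    def write(self, name, value):
        self.data[name] = value
        self.version[name] = self.version.get(name, 0) + 1
        # Server-initiated: tell every caching client its copy is stale.
        for client in self.cachers.pop(name, set()):
            client.invalidate(name)

class CallbackClient:
    """Relies on the server to invalidate stale entries."""
    def __init__(self, server):
        self.server, self.cache = server, {}

    def invalidate(self, name):
        self.cache.pop(name, None)

    def read(self, name):
        if name not in self.cache:
            self.cache[name], _ = self.server.read(name, client=self)
        return self.cache[name]

class ValidatingClient:
    """Client-initiated: checks the version with the server on every read."""
    def __init__(self, server):
        self.server, self.cache = server, {}

    def read(self, name):
        cached = self.cache.get(name)
        if cached is None or cached[1] != self.server.version[name]:
            cached = self.server.read(name)
            self.cache[name] = cached
        return cached[0]
```

The callback client contacts the server only after an invalidation, but the server must remember who caches what. The validating client keeps the server simpler at the cost of a check on each access.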

Availability
- Availability is an important issue in distributed file systems.
- Replication is the primary mechanism for enhancing the availability of files in distributed file systems.
- Replication
  - Unit of replication
  - Replica management

Scalability
- Scalability concerns how well the design can support a growing number of clients.
- Caching helps reduce client response time.
- Server-initiated cache invalidation
- Some clients can be used as servers.
- The structure of the server process also plays a major role in scalability.

Semantics
- The semantics of a file system characterize the effects of accesses on files.
- For example, a read operation should return the data stored by the latest write operation.
- Guaranteeing these semantics when caching is employed is difficult and expensive.

Case Studies
- Several distributed file systems have been developed. Aspects considered:
  - Architecture
  - Communication
  - Processes and their organization
  - Naming
  - Consistency
  - Caching and replication
  - Fault tolerance
  - Security

Case Studies – cont.
- These examples are taken from Tanenbaum and van Steen's book.
- They do not follow the main textbook because:
  - The material there is not up to date.
  - It does not provide enough detail.
- You need to read Sect. 9.5.

Sun Network File Systems
- Developed by Sun
- The first version was kept internal to Sun.
- The second version was incorporated in SunOS 2.0.
- Version 3 was released around 1994.
- Version 4 is under development to make NFS a true wide-area file system across the Internet.

Sun Network File Systems – cont.
- NFS is not a true file system.
- It is a collection of protocols that together provide clients with a model of a distributed file system.
- In some sense, it is middleware (like CORBA).
- NFS protocols are designed so that different implementations can easily interoperate.
- It can run on a heterogeneous collection of computers.

NFS Architecture (1)
[Figure: The remote access model. The upload/download model.]

NFS Architecture (2)

File System Model
Operation  v3   v4   Description
Create     Yes  No   Create a regular file
Create               Create a nonregular file
Link                 Create a hard link to a file
Symlink              Create a symbolic link to a file
Mkdir                Create a subdirectory in a given directory
Mknod                Create a special file
Rename               Change the name of a file
Rmdir                Remove an empty subdirectory from a directory
Open                 Open a file
Close                Close a file
Lookup               Look up a file by means of a file name
Readdir              Read the entries in a directory
Readlink             Read the path name stored in a symbolic link
Getattr              Read the attribute values for a file
Setattr              Set one or more attribute values for a file
Read                 Read the data contained in a file
Write                Write data to a file

Communication
[Figure: Reading data from a file in NFS version 3. Reading data using a compound procedure in version 4.]
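The point of a compound procedure can be sketched by counting round trips. `ToyServer` is a made-up stand-in, not the NFS wire protocol: it treats a separate lookup and read as two requests and a compound call as one. A real version 4 compound may carry other operations as well.

```python
class ToyServer:
    def __init__(self, files):
        self.files = files            # file name -> bytes
        self.round_trips = 0

    def _lookup(self, name):
        return "fh:" + name           # toy file handle

    def _read(self, handle, offset, count):
        data = self.files[handle[3:]]
        return data[offset:offset + count]

    def call(self, op, *args):
        """Version 3 style: each operation is its own request."""
        self.round_trips += 1
        return getattr(self, "_" + op)(*args)

    def compound(self, name, offset, count):
        """Version 4 style: lookup and read travel in a single request."""
        self.round_trips += 1
        handle = self._lookup(name)
        return self._read(handle, offset, count)
```

Over a wide-area link, where each round trip is expensive, halving the number of requests matters far more than it does on a LAN.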

Processes
- NFS is a traditional client-server system.
- In versions 2 and 3, NFS servers are stateless: they are not required to maintain any client state.
- The main advantage of the stateless approach is simplicity.
- However, some operations are intrinsically stateful, and they are implemented in a separate lock manager.
- Version 4 uses a stateful approach.

Naming (1)

Naming (2)

File Attributes (1)
Attribute  Description
TYPE       The type of the file (regular, directory, symbolic link)
SIZE       The length of the file in bytes
CHANGE     Indicator for a client to see if and/or when the file has changed
FSID       Server-unique identifier of the file's file system

File Attributes (2)
Attribute     Description
ACL           An access control list associated with the file
FILEHANDLE    The server-provided file handle of this file
FILEID        A file-system unique identifier for this file
FS_LOCATIONS  Locations in the network where this file system may be found
OWNER         The character-string name of the file's owner
TIME_ACCESS   Time when the file data were last accessed
TIME_MODIFY   Time when the file data were last modified
TIME_CREATE   Time when the file was created

Semantics of File Sharing (1)
- On a single processor, when a read follows a write, the value returned by the read is the value just written.
- In a distributed system with caching, obsolete values may be returned.

Semantics of File Sharing (2)
Method             Comment
UNIX semantics     Every operation on a file is instantly visible to all processes
Session semantics  No changes are visible to other processes until the file is closed
Immutable files    No updates are possible; simplifies sharing and replication
Transaction        All changes occur atomically
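The difference between the first two rows can be sketched with two kinds of file handle over one shared file. The class names are hypothetical and the model ignores concurrency entirely.

```python
class SharedFile:
    def __init__(self, contents=""):
        self.contents = contents

class UnixHandle:
    """UNIX semantics: every write is applied to the shared file at once."""
    def __init__(self, shared):
        self.shared = shared

    def write(self, text):
        self.shared.contents = text

    def read(self):
        return self.shared.contents

    def close(self):
        pass

class SessionHandle:
    """Session semantics: work on a private copy, publish it on close."""
    def __init__(self, shared):
        self.shared = shared
        self.copy = shared.contents

    def write(self, text):
        self.copy = text

    def read(self):
        return self.copy

    def close(self):
        self.shared.contents = self.copy
```

A reader with a `UnixHandle` sees another process's write immediately. With `SessionHandle`, the write becomes visible only after the writer closes the file, and only to sessions opened afterwards.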

File Locking in NFS (1)
Operation  Description
Lock       Creates a lock for a range of bytes
Lockt      Tests whether a conflicting lock has been granted
Locku      Removes a lock from a range of bytes
Renew      Renews the lease on a specified lock
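A byte-range lock table along these lines might look as follows. This is a simplified sketch: `LockTable` and its method signatures are assumptions, ranges are half-open, and leases (the Renew operation) are left out.

```python
class LockTable:
    """Toy byte-range lock table; ranges are half-open [start, end)."""

    def __init__(self):
        self.locks = []  # entries: (owner, start, end, exclusive)

    def _conflict(self, owner, start, end, exclusive):
        for other, s, e, other_exclusive in self.locks:
            overlaps = start < e and s < end
            # Two shared (read) locks never conflict, and in this simplified
            # model an owner never conflicts with itself.
            if overlaps and other != owner and (exclusive or other_exclusive):
                return True
        return False

    def lockt(self, owner, start, end, exclusive=True):
        """Test only: would this request conflict with a granted lock?"""
        return self._conflict(owner, start, end, exclusive)

    def lock(self, owner, start, end, exclusive=True):
        """Grant the lock and return True, or return False on conflict."""
        if self._conflict(owner, start, end, exclusive):
            return False
        self.locks.append((owner, start, end, exclusive))
        return True

    def locku(self, owner, start, end):
        """Remove the owner's lock on exactly this range."""
        self.locks = [l for l in self.locks if l[:3] != (owner, start, end)]
```

Note that `lockt` changes no state, which is what distinguishes testing for a conflict from actually requesting the lock.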

File Locking in NFS (2)
[Tables: (a) outcome (Succeed or Fail) of a requested access, given the current file denial state; (b) outcome of a requested file denial state, given the current access state. States: NONE, READ, WRITE, BOTH.]
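The two tables can be summarized by one rule, sketched below under the usual reading of share reservations: an open succeeds only if the requested access is not currently denied and the requested denial does not exclude access already granted. The names `Share` and `open_allowed` are hypothetical, and the individual table cells are not reproduced from the slide.

```python
from enum import Flag

class Share(Flag):
    NONE = 0
    READ = 1
    WRITE = 2
    BOTH = 3      # READ | WRITE

def open_allowed(req_access, req_deny, cur_access, cur_deny):
    """Decide whether an open request is compatible with the file's current state."""
    # (a) The requested access must not be denied by existing openers.
    access_ok = not (req_access & cur_deny)
    # (b) The requested denial must not exclude access already granted.
    deny_ok = not (req_deny & cur_access)
    return access_ok and deny_ok
```

For example, a request for READ access fails while the current denial state is READ or BOTH, and a request to deny WRITE fails while some client already has the file open for writing.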

Client Caching (1)

Client Caching (2)

RPC Failures

Security

Secure RPCs

Access Control
Operation          Description
Read_data          Permission to read the data contained in a file
Write_data         Permission to modify a file's data
Append_data        Permission to append data to a file
Execute            Permission to execute a file
List_directory     Permission to list the contents of a directory
Add_file           Permission to add a new file to a directory
Add_subdirectory   Permission to create a subdirectory in a directory
Delete             Permission to delete a file
Delete_child       Permission to delete a file or directory within a directory
Read_acl           Permission to read the ACL
Write_acl          Permission to write the ACL
Read_attributes    The ability to read the other basic attributes of a file
Write_attributes   Permission to change the other basic attributes of a file
Read_named_attrs   Permission to read the named attributes of a file
Write_named_attrs  Permission to write the named attributes of a file
Write_owner        Permission to change the owner
Synchronize        Permission to access a file locally at the server with synchronous reads and writes

Sun Network File Systems – cont.
- Setting up a Network File System on UNIX systems
  - NFS software must be installed.
  - Server side
    - Start the NFS daemons
    - Export directories: /etc/exports
  - Client side
    - /etc/fstab
    - mount -a

Coda Distributed File System
- Developed at CMU
- Now integrated with a number of UNIX systems
- Based on the Andrew File System (AFS)
- Design goals
  - Scalable
  - Secure
  - Highly available
  - High degree of naming and location transparency, so that the system appears to its users as a purely local file system

The Coda File System
Type of user   Description
Owner          The owner of a file
Group          The group of users associated with a file
Everyone       Any user or process
Interactive    Any process accessing the file from an interactive terminal
Network        Any process accessing the file via the network
Dialup         Any process accessing the file through a dialup connection to the server
Batch          Any process accessing the file as part of a batch job
Anonymous      Anyone accessing the file without authentication
Authenticated  Any authenticated user or process
Service        Any system-defined service process

Overview of Coda (1)

Overview of Coda (2)

Communication (1)

Communication (2)
[Figure: Sending invalidation messages one at a time. Sending invalidation messages in parallel.]

Naming

File Identifiers

Sharing Files in Coda

Transactional Semantics
File-associated data    Read?  Modified?
File identifier         Yes    No
Access rights
Last modification time
File length
File contents
The metadata read and modified for a store session type in Coda.

Client Caching

Server Replication

Disconnected Operation

Secure Channels (1)
[Figure: Mutual authentication in RPC2.]

Secure Channels (2)
[Figure: Setting up a secure channel between a (Venus) client and a Vice server in Coda.]

Access Control
Operation   Description
Read        Read any file in the directory
Write       Modify any file in the directory
Lookup      Look up the status of any file
Insert      Add a new file to the directory
Delete      Delete an existing file
Administer  Modify the ACL of the directory
Classification of file and directory operations recognized by Coda with respect to access control.

Plan 9: Resources Unified to Files

Communication
File    Description
ctl     Used to write protocol-specific control commands
data    Used to read and write data
listen  Used to accept incoming connection setup requests
local   Provides information on the caller's side of the connection
remote  Provides information on the other side of the connection
status  Provides diagnostic information on the current status of the connection
Files associated with a single TCP connection in Plan 9.

Processes

Naming

Overview of xFS

Processes (1)

Processes (2)

Naming
Data structure    Description
Manager map       Maps file ID to manager
Imap              Maps file ID to log address of file's inode
Inode             Maps block number (i.e., offset) to log address of block
File identifier   Reference used to index into manager map
File directory    Maps a file name to a file identifier
Log addresses     Triplet of stripe group ID, segment ID, and segment offset
Stripe group map  Maps stripe group ID to list of storage servers
Main data structures used in xFS.
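Read together, these structures form a chain of lookups from a file name to the storage servers holding a block. The sketch below walks that chain using plain dictionaries. All identifiers and values are invented for illustration, and the assumption that each manager keeps its own imap is a simplification.

```python
# All contents below are made-up illustrative values.
directory = {"/home/steen/mbox": 42}            # file name -> file identifier
manager_map = {42: "mgr1"}                      # file ID -> manager
imap = {"mgr1": {42: ("sg0", 7, 128)}}          # per manager: file ID -> log address of inode
log = {
    # Inode stored at a log address: block number -> log address of block.
    ("sg0", 7, 128): {0: ("sg1", 3, 0), 1: ("sg1", 3, 4096)},
}
stripe_group_map = {"sg0": ["s1", "s2"], "sg1": ["s3", "s4"]}

def locate_block(path, block_no):
    """Follow directory -> manager map -> imap -> inode -> stripe group map."""
    file_id = directory[path]
    manager = manager_map[file_id]
    inode_address = imap[manager][file_id]
    inode = log[inode_address]
    block_address = inode[block_no]       # (stripe group ID, segment ID, offset)
    stripe_group_id = block_address[0]
    return block_address, stripe_group_map[stripe_group_id]
```

Each step can in principle be served by a different machine, which is what lets the system operate without a central server.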

Overview of SFS
[Figure: The organization of SFS.]

Naming
/sfs/sfs.vu.sc.nl:ag62hty4wior450hdh63u623i4f0kqere/home/steen/mbox
- /sfs: fixed prefix
- LOC: sfs.vu.sc.nl
- HID: ag62hty4wior450hdh63u623i4f0kqere
- Pathname: /home/steen/mbox
A self-certifying pathname in SFS.
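Splitting such a pathname into its parts is straightforward, as the sketch below shows. The function `parse_sfs_path` is hypothetical. It only separates the components and does not verify the HID against the server in any way.

```python
def parse_sfs_path(pathname):
    """Split '/sfs/LOC:HID/path' into (LOC, HID, path on the server)."""
    prefix = "/sfs/"
    if not pathname.startswith(prefix):
        raise ValueError("not an SFS pathname")
    host_part, slash, rest = pathname[len(prefix):].partition("/")
    loc, colon, hid = host_part.partition(":")
    if not colon or not loc or not hid:
        raise ValueError("expected LOC:HID after /sfs/")
    return loc, hid, slash + rest

loc, hid, path = parse_sfs_path(
    "/sfs/sfs.vu.sc.nl:ag62hty4wior450hdh63u623i4f0kqere/home/steen/mbox"
)
```

Because the HID travels inside the name itself, a client needs no separate key directory to know which server identity it expects at LOC.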

Summary
Systems compared (columns): NFS, Coda, Plan 9, xFS, SFS
Design goals: Access transparency (NFS), High availability (Coda), Uniformity (Plan 9), Serverless system (xFS), Scalable security (SFS)
Access model: Remote, Up/Download, Log-based
Communication: RPC, Special, Active msgs
Client process: Thin/Fat, Fat, Thin, Medium
Server groups: No, Yes
Mount granularity: Directory, File system
Name space: Per client, Global, Per process
File ID scope: File server, Server
Sharing sem.: Session, Transactional, UNIX, N/S
Cache consist.: write-back, write-through
Replication: Minimal, ROWA, None, Striping
Fault tolerance: Reliable comm., Replication and caching
Recovery: Client-based, Reintegration, Checkpoint & write logs
Secure channels: Existing mechanisms, Needham-Schroeder, No pathnames, Self-cert.
Access control: Many operations, Directory operations, UNIX based, NFS based
A comparison between NFS, Coda, Plan 9, xFS, and SFS. N/S indicates that nothing has been specified.