Outline Distributed File Systems – continued 11/27/2018 COP5611.


1 Outline Distributed File Systems – continued 11/27/2018 COP5611

2 Distributed File Systems
A distributed file system is a resource management component of a distributed operating system.
It implements a common file system shared by all the computers in the system.
Two important goals:
- Network transparency
- High availability

3 Architecture

4 Architecture – cont.
For performance reasons, distributed file systems are normally organized as a client-server architecture.
File servers store files and perform storage and retrieval on clients' requests.
The two most important components are:
- Name server
- Cache manager

5 Architecture – cont.

6 Architecture – cont.

7 Mounting
Mounting is a way to bind different file systems together to form a single hierarchically structured name space.
It is widely used on both local and distributed UNIX machines.
In distributed file systems, file systems maintained by remote servers are mounted at the clients.
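
The idea of mounting remote file systems at the client can be sketched as a mount table with longest-prefix matching. This is only an illustration: the server names, export paths, and the `resolve` helper are hypothetical and do not correspond to any particular system's implementation.

```python
# Hypothetical client-side mount table: local mount point -> (server, exported directory).
MOUNT_TABLE = {
    "/home": ("fileserver1", "/export/home"),
    "/home/projects": ("fileserver2", "/export/projects"),
}


def resolve(path):
    """Return (server, remote_path) for a path, or (None, path) if it is local."""
    best = None
    for mount_point in MOUNT_TABLE:
        # A mount point matches if it equals the path or is a directory prefix of it.
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point  # keep the longest (most specific) match
    if best is None:
        return None, path
    server, export = MOUNT_TABLE[best]
    return server, export + path[len(best):]


print(resolve("/home/alice/notes.txt"))
print(resolve("/home/projects/dfs/a.c"))
print(resolve("/tmp/scratch"))
```

Choosing the longest matching mount point lets one remote file system be mounted inside another, which is how a single hierarchical name space is built from several servers.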

8 Mounting – cont.

9 Mounting – cont.

10 Mounting – cont.

11 Mounting – cont.

12 Automounting – cont.

13 Automounting – cont.

14 Caching
Caching is commonly used in distributed file systems to reduce delays in accessing data.
In file caching, a copy of the data stored at a remote file server is brought to the client, which reduces access delays due to network latency.
The effectiveness of caching relies on temporal locality in programs.
Files can also be cached at the server side.
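
A minimal sketch of client-side file caching, assuming a block-granularity cache with least-recently-used eviction. The class name, the `fake_server` function, and the capacity are illustrative assumptions, not taken from the slides.

```python
from collections import OrderedDict


class ClientBlockCache:
    """Keeps recently used remote blocks at the client (LRU eviction)."""

    def __init__(self, fetch_from_server, capacity=2):
        self.fetch_from_server = fetch_from_server
        self.capacity = capacity
        self.blocks = OrderedDict()  # (file name, block number) -> data
        self.remote_fetches = 0

    def read(self, file_name, block_no):
        key = (file_name, block_no)
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            return self.blocks[key]
        # Cache miss: pay the network cost once, then keep the copy.
        data = self.fetch_from_server(file_name, block_no)
        self.remote_fetches += 1
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data


def fake_server(file_name, block_no):
    """Stand-in for a remote read; a real client would issue a network request."""
    return f"{file_name}#{block_no}"


cache = ClientBlockCache(fake_server)
for _ in range(3):
    cache.read("a.txt", 0)
print(cache.remote_fetches)
```

Repeated reads of the same block go to the server only once, which is the temporal-locality effect the slide refers to.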

15 Client Caching

16 Client Caching – cont.

17 Cache Consistency

18 Hints
An alternative approach to caching:
- The cached data is treated as hints
- The cached data is not guaranteed to be completely accurate
- The cache consistency issue is ignored in this approach
- This is useful for applications that can recover from invalid cached data
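
One way to picture hints is a cached file location that is tried first and silently corrected when it turns out to be wrong. The sketch below assumes a hypothetical name-server mapping and two hypothetical servers; all names are made up for illustration.

```python
# Hypothetical authoritative mapping, normally held by a name server.
AUTHORITATIVE_LOCATION = {"report.txt": "server_b"}
# Hypothetical contents of each file server.
SERVER_FILES = {"server_a": {}, "server_b": {"report.txt": "contents"}}

# Client-side hints; this one is stale (the file has moved to server_b).
location_hints = {"report.txt": "server_a"}


def fetch(server, name):
    """Return the file data, or None if the server does not hold the file."""
    return SERVER_FILES[server].get(name)


def read_with_hint(name):
    hinted = location_hints.get(name)
    if hinted is not None:
        data = fetch(hinted, name)
        if data is not None:
            return data  # the hint was correct, no name-server lookup needed
    # The hint was missing or wrong: recover through the authoritative source.
    server = AUTHORITATIVE_LOCATION[name]
    location_hints[name] = server  # refresh the hint for next time
    return fetch(server, name)


print(read_with_hint("report.txt"))
print(location_hints["report.txt"])
```

The approach works here because a wrong hint is detectable (the hinted server does not have the file), so the client can recover without any consistency protocol.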

19 Bulk Data Transfer
Bulk data transfer moves multiple data blocks at once instead of just the block referenced by the client.
It is motivated by temporal locality and by the fact that most files are accessed in their entirety.
It reduces network communication overhead by reducing the cost of executing communication protocols.
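
The saving from bulk transfer can be illustrated by counting server requests for a sequential read. The function below is a sketch; the file size and transfer sizes are arbitrary example values.

```python
def count_requests(num_blocks, blocks_per_request):
    """Number of server requests needed to read a file sequentially."""
    cached = set()
    requests = 0
    for block in range(num_blocks):
        if block not in cached:
            requests += 1
            # One request brings in the referenced block plus the blocks after it.
            last = min(block + blocks_per_request, num_blocks)
            cached.update(range(block, last))
    return requests


print(count_requests(64, 1))  # one block per request
print(count_requests(64, 8))  # eight blocks per request
```

Each request carries a fixed protocol cost, so fetching eight blocks at a time pays that cost one eighth as often when the file is read in its entirety.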

20 Security

21 Naming in Distributed File Systems
A name in a file system is a way to reference a file or a directory.
Name resolution is the process of mapping a name to an object (or, in the case of replication, to multiple objects).
A name space is a collection of names.

22 Naming in a Local File System

23 Naming in a Local File System – cont.

24 Naming in Distributed File Systems – cont.
Three approaches to naming in distributed file systems:
- The simplest scheme is to concatenate the host name to the names of files
  - Not network transparent
  - Not location-independent
- Mounting remote directories onto local directories
  - Location transparent but not network transparent
- A single global directory
  - Limited to a few cooperating computers

25 Naming in Distributed File Systems – cont.
Contexts
- A context can be used to partition a file name space
- A file name then consists of a context and a name local to that context
- Name resolution interprets the name within a context, which may invoke other contexts recursively
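
Recursive resolution through contexts can be sketched as follows. The `Context` class and the example name space are hypothetical; a real system would resolve names across machines rather than in one process.

```python
class Context:
    """A context maps local names to files or to other contexts."""

    def __init__(self, entries):
        self.entries = entries

    def resolve(self, name):
        # Split off the first component; the rest is interpreted by the next context.
        head, _, rest = name.partition("/")
        target = self.entries[head]  # raises KeyError for an unknown name
        if not rest:
            return target
        if not isinstance(target, Context):
            raise KeyError(f"{head!r} is not a context")
        return target.resolve(rest)


# Hypothetical name space: a root context containing two nested contexts.
projects = Context({"dfs.txt": "file:dfs.txt"})
home = Context({"projects": projects, "todo": "file:todo"})
root = Context({"home": home})

print(root.resolve("home/projects/dfs.txt"))
```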

26 Naming in Distributed File Systems – cont.
Name servers are responsible for name resolution in distributed file systems.
A name server is a process that maps names specified by clients to stored objects such as files and directories.
Design choice: a single name server vs. multiple name servers.

27 Caches on Disk or Memory
Cache in main memory vs. cache on a local disk
- Cache in main memory
  - Advantages
  - Disadvantages
- Cache on a local disk

28 Writing Policy
The writing policy is related to cache consistency.
It decides what to do when a cache block at the client is modified.
Several different policies:
- Write-through
- Delayed writing, where modified blocks are written back after some time
- Delayed writing, where modified blocks are written back when the file is closed
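
The three policies differ in when a modified block reaches the server. The sketch below models that with a list standing in for the server; the class, the policy names as strings, and the `timer_tick` method are illustrative assumptions.

```python
class CachedFile:
    """Client-side view of one open file under a given writing policy."""

    def __init__(self, policy, server_log):
        self.policy = policy          # "write-through", "delayed", or "on-close"
        self.server_log = server_log  # list recording writes that reach the server
        self.dirty = {}               # block number -> latest unwritten data

    def write(self, block_no, data):
        if self.policy == "write-through":
            self.server_log.append((block_no, data))  # every write goes to the server
        else:
            self.dirty[block_no] = data  # later writes to the same block overwrite

    def _flush(self):
        for block_no, data in sorted(self.dirty.items()):
            self.server_log.append((block_no, data))
        self.dirty.clear()

    def timer_tick(self):
        """Called periodically; only the delayed policy writes back on a timer."""
        if self.policy == "delayed":
            self._flush()

    def close(self):
        """Any remaining modified blocks are written back when the file is closed."""
        self._flush()


log = []
f = CachedFile("on-close", log)
for version in ("v0", "v1", "v2"):
    f.write(0, version)
print(log)   # nothing has reached the server yet
f.close()
print(log)   # only the final contents of block 0 are sent
```

Delaying writes lets repeated updates to the same block collapse into one transfer, at the cost of a window in which the server copy is out of date.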

29 Cache Consistency
Schemes to guarantee consistency:
- Server-initiated approach
  - Servers inform the cache managers whenever the data in client caches becomes stale
  - Cache managers can retrieve the new data when needed
- Client-initiated approach
  - Cache managers validate data with the server before returning it to the clients
- Limited caching
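
The two approaches can be contrasted in a small in-process sketch. It assumes a version number per file and a callback list at the server; these class names and mechanisms are illustrative assumptions, not the design of a specific file system.

```python
class Server:
    def __init__(self):
        self.data = {}       # name -> (version, contents)
        self.callbacks = {}  # name -> cache managers holding a copy

    def read(self, name, cache_manager=None):
        if cache_manager is not None:
            # Remember who caches the file so it can be notified later.
            self.callbacks.setdefault(name, set()).add(cache_manager)
        return self.data[name]

    def version(self, name):
        return self.data[name][0]

    def write(self, name, contents):
        version = self.data.get(name, (0, None))[0] + 1
        self.data[name] = (version, contents)
        # Server-initiated: tell every registered cache manager its copy is stale.
        for manager in self.callbacks.pop(name, set()):
            manager.invalidate(name)


class ServerInitiatedCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def invalidate(self, name):
        self.cache.pop(name, None)

    def read(self, name):
        if name not in self.cache:  # fetch again only after an invalidation
            self.cache[name] = self.server.read(name, self)
        return self.cache[name][1]


class ClientInitiatedCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, name):
        cached = self.cache.get(name)
        # Client-initiated: validate with the server before returning cached data.
        if cached is None or cached[0] != self.server.version(name):
            self.cache[name] = self.server.read(name)
        return self.cache[name][1]


server = Server()
server.write("f", "v1")
a, b = ServerInitiatedCache(server), ClientInitiatedCache(server)
print(a.read("f"), b.read("f"))
server.write("f", "v2")
print(a.read("f"), b.read("f"))
```

The server-initiated variant requires the server to keep per-client state, while the client-initiated variant pays a validation message on every read.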

30 Availability
Availability is an important issue in distributed file systems.
Replication is the primary mechanism for enhancing the availability of files.
Replication issues:
- Unit of replication
- Replica management

31 Scalability
Scalability concerns how well the design supports a growing number of clients.
- Caching helps reduce client response time
- Server-initiated cache invalidation
- Some clients can be used as servers
- The structure of the server process also plays a major role in scalability

32 Semantics
The semantics of a file system characterize the effects of accesses on files.
For example, a read operation should return the data stored by the latest write operation.
Guaranteeing these semantics when caching is employed is difficult and expensive.

33 Case Studies
Several distributed file systems have been developed. Aspects covered:
- Architecture
- Communication
- Processes and their organization
- Naming
- Consistency
- Caching and replication
- Fault tolerance
- Security

34 Case Studies – cont.
These examples are taken from Tanenbaum and van Steen's book.
They do not follow the main textbook because:
- The material there is not up to date
- It does not provide enough detail
You need to read Sect. 9.5.

35 Sun Network File System
Developed by Sun
- The first version was kept internal to Sun
- The second version was incorporated in SunOS 2.0
- Version 3 was released around 1994
- Version 4 is under development to make NFS a true wide-area file system across the Internet

36 Sun Network File System – cont.
NFS is not a true file system.
It is a collection of protocols that together provide clients with a model of a distributed file system.
In some sense, it is middleware (like CORBA).
NFS protocols are designed so that different implementations can easily interoperate.
It can run on a heterogeneous collection of computers.

37 NFS Architecture (1)
The remote access model.
The upload/download model.

38 NFS Architecture (2)

39 File System Model
Operation | v3 | v4 | Description
Create | Yes | No | Create a regular file
Create | | | Create a nonregular file
Link | | | Create a hard link to a file
Symlink | | | Create a symbolic link to a file
Mkdir | | | Create a subdirectory in a given directory
Mknod | | | Create a special file
Rename | | | Change the name of a file
Rmdir | | | Remove an empty subdirectory from a directory
Open | | | Open a file
Close | | | Close a file
Lookup | | | Look up a file by means of a file name
Readdir | | | Read the entries in a directory
Readlink | | | Read the path name stored in a symbolic link
Getattr | | | Read the attribute values for a file
Setattr | | | Set one or more attribute values for a file
Read | | | Read the data contained in a file
Write | | | Write data to a file

40 Communication
Reading data from a file in NFS version 3.
Reading data using a compound procedure in version 4.
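
The difference between the two figures can be approximated by counting request/reply exchanges. The sketch below is a toy model, not the NFS wire protocol: `FakeServer`, its operation names, and the idea of a per-request "current handle" are simplified assumptions made for illustration.

```python
FILES = {"notes.txt": b"hello"}  # hypothetical server contents


class FakeServer:
    def __init__(self):
        self.round_trips = 0

    def _execute(self, op, arg, state):
        if op == "LOOKUP":
            if arg not in FILES:
                return False
            state["handle"] = arg  # becomes the current handle for later operations
            return True
        if op == "OPEN":
            return "handle" in state
        if op == "READ":
            handle = arg if arg is not None else state.get("handle")
            if handle not in FILES:
                return False
            state["data"] = FILES[handle]
            return True
        return False

    def call(self, operations):
        """One request/reply exchange carrying one or more operations."""
        self.round_trips += 1
        state = {}
        for op, arg in operations:
            if not self._execute(op, arg, state):
                break  # remaining operations in the request are not executed
        return state


def read_separate_calls(server, name):
    # Version 3 style: the lookup and the read are separate exchanges.
    handle = server.call([("LOOKUP", name)])["handle"]
    return server.call([("READ", handle)])["data"]


def read_compound(server, name):
    # Version 4 style: several operations are grouped into a single request.
    reply = server.call([("LOOKUP", name), ("OPEN", None), ("READ", None)])
    return reply["data"]


s3, s4 = FakeServer(), FakeServer()
print(read_separate_calls(s3, "notes.txt"), s3.round_trips)
print(read_compound(s4, "notes.txt"), s4.round_trips)
```

Grouping operations matters most on wide-area links, where each extra exchange adds a full network round trip.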

41 Processes
NFS is a traditional client-server system.
In versions 2 and 3, NFS servers are stateless: they are not required to maintain any client state.
The main advantage of the stateless approach is simplicity.
However, some operations are intrinsically stateful, and they are implemented in a separate lock manager.
Version 4 uses a stateful approach.

42 Naming (1)

43 Naming (2)

44 File Attributes (1)
Attribute | Description
TYPE | The type of the file (regular, directory, symbolic link)
SIZE | The length of the file in bytes
CHANGE | Indicator for a client to see if and/or when the file has changed
FSID | Server-unique identifier of the file's file system

45 File Attributes (2)
Attribute | Description
ACL | An access control list associated with the file
FILEHANDLE | The server-provided file handle of this file
FILEID | A file-system unique identifier for this file
FS_LOCATIONS | Locations in the network where this file system may be found
OWNER | The character-string name of the file's owner
TIME_ACCESS | Time when the file data were last accessed
TIME_MODIFY | Time when the file data were last modified
TIME_CREATE | Time when the file was created

46 Semantics of File Sharing (1)
On a single processor, when a read follows a write, the value returned by the read is the value just written.
In a distributed system with caching, obsolete values may be returned.

47 Semantics of File Sharing (2)
Method | Comment
UNIX semantics | Every operation on a file is instantly visible to all processes
Session semantics | No changes are visible to other processes until the file is closed
Immutable files | No updates are possible; simplifies sharing and replication
Transaction | All changes occur atomically
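
Session semantics can be sketched with a private copy taken at open time and written back at close. The `FileServer` and `Session` classes are hypothetical stand-ins used only to show the visibility rule.

```python
class FileServer:
    def __init__(self):
        self.files = {}  # name -> contents


class Session:
    """One open-to-close session on a file under session semantics."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.copy = server.files.get(name, "")  # private copy taken at open

    def read(self):
        return self.copy

    def write(self, data):
        self.copy = data  # visible only inside this session

    def close(self):
        self.server.files[self.name] = self.copy  # changes become visible now


server = FileServer()
server.files["f"] = "old"
writer = Session(server, "f")
reader = Session(server, "f")
writer.write("new")
print(reader.read())             # the concurrent reader still sees its own copy
writer.close()
print(Session(server, "f").read())  # a session opened after the close sees the update
```

Under UNIX semantics the first `print` would already show the new contents; under session semantics only sessions opened after the close do.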

48 File Locking in NFS (1)
Operation | Description
Lock | Creates a lock for a range of bytes
Lockt | Test whether a conflicting lock has been granted
Locku | Remove a lock from a range of bytes
Renew | Renew the lease on a specified lock
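
The first three operations can be modeled with a simple byte-range lock table. This is a sketch of the general technique, assuming shared (read) and exclusive (write) locks on half-open ranges; it is not the NFS protocol, and leases (the Renew operation) are not modeled.

```python
class LockTable:
    """Byte-range locks for one file; ranges are [start, end)."""

    def __init__(self):
        self.locks = []  # entries: (owner, start, end, exclusive)

    def lockt(self, owner, start, end, exclusive):
        """Test only: return a conflicting lock held by another owner, or None."""
        for held in self.locks:
            held_owner, held_start, held_end, held_exclusive = held
            overlaps = start < held_end and held_start < end
            # Two shared locks never conflict; any other overlap does.
            if held_owner != owner and overlaps and (exclusive or held_exclusive):
                return held
        return None

    def lock(self, owner, start, end, exclusive):
        """Create the lock if no conflicting lock exists; report success."""
        if self.lockt(owner, start, end, exclusive) is not None:
            return False
        self.locks.append((owner, start, end, exclusive))
        return True

    def locku(self, owner, start, end):
        """Remove the owner's lock on exactly this range."""
        self.locks = [l for l in self.locks if l[:3] != (owner, start, end)]


table = LockTable()
print(table.lock("A", 0, 100, exclusive=False))    # shared lock is granted
print(table.lock("B", 50, 150, exclusive=False))   # overlapping shared lock is fine
print(table.lock("C", 90, 110, exclusive=True))    # exclusive lock conflicts
```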

49 File Locking in NFS (2)
Two tables showing whether an open request succeeds or fails (cell values Succeed/Fail; states NONE, READ, WRITE, BOTH):
(a) Requested access against the current file denial state.
(b) Requested file denial state against the current access state.
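
The check behind such tables can be sketched with bit flags. The rule used here is an assumption about how access and denial states interact: a request fails if the access it asks for is currently denied, or if it wants to deny an access that is currently held. The code does not reproduce the cell values of the tables on the slide.

```python
NONE, READ, WRITE, BOTH = 0, 1, 2, 3  # BOTH is READ | WRITE


class OpenFileState:
    """Tracks the access and denial modes of the current opens of one file."""

    def __init__(self):
        self.opens = []  # entries: (access, deny)

    def open(self, access, deny):
        for current_access, current_deny in self.opens:
            # (a) the requested access must not be denied by an existing open
            # (b) the requested denial must not cover an access already granted
            if access & current_deny or deny & current_access:
                return False
        self.opens.append((access, deny))
        return True


state = OpenFileState()
print(state.open(READ, WRITE))   # read access, deny writers
print(state.open(WRITE, NONE))   # fails: writing is currently denied
print(state.open(READ, NONE))    # succeeds: reading is not denied
print(state.open(READ, READ))    # fails: others already hold read access
```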

50 Client Caching (1)

51 Client Caching (2)

52 RPC Failures

53 Security

54 Secure RPCs

55 Access Control
Operation | Description
Read_data | Permission to read the data contained in a file
Write_data | Permission to modify a file's data
Append_data | Permission to append data to a file
Execute | Permission to execute a file
List_directory | Permission to list the contents of a directory
Add_file | Permission to add a new file to a directory
Add_subdirectory | Permission to create a subdirectory in a directory
Delete | Permission to delete a file
Delete_child | Permission to delete a file or directory within a directory
Read_acl | Permission to read the ACL
Write_acl | Permission to write the ACL
Read_attributes | The ability to read the other basic attributes of a file
Write_attributes | Permission to change the other basic attributes of a file
Read_named_attrs | Permission to read the named attributes of a file
Write_named_attrs | Permission to write the named attributes of a file
Write_owner | Permission to change the owner
Synchronize | Permission to access a file locally at the server with synchronous reads and writes

56 Sun Network File System – cont.
Setting up a Network File System on UNIX systems
- The NFS software must be installed
- Server side
  - Start the NFS daemons
  - Export directories: /etc/exports
- Client side
  - /etc/fstab
  - mount -a

57 Coda Distributed File System
Developed at CMU
Now integrated with a number of UNIX systems
Based on the Andrew File System (AFS)
Design goals:
- Scalable
- Secure
- Highly available
- High degree of naming and location transparency, so that the system appears to its users as a purely local file system

58 The Coda File System
Type of user | Description
Owner | The owner of a file
Group | The group of users associated with a file
Everyone | Any user or process
Interactive | Any process accessing the file from an interactive terminal
Network | Any process accessing the file via the network
Dialup | Any process accessing the file through a dialup connection to the server
Batch | Any process accessing the file as part of a batch job
Anonymous | Anyone accessing the file without authentication
Authenticated | Any authenticated user or process
Service | Any system-defined service process

59 Overview of Coda (1)

60 Overview of Coda (2)

61 Communication (1)

62 Communication (2)
Sending an invalidation message one at a time.
Sending invalidation messages in parallel.

63 Naming

64 File Identifiers

65 Sharing Files in Coda

66 Transactional Semantics
File-associated data | Read? | Modified?
File identifier | Yes | No
Access rights | |
Last modification time | |
File length | |
File contents | |
The metadata read and modified for a store session type in Coda.

67 Client Caching

68 Server Replication

69 Disconnected Operation

70 Secure Channels (1)
Mutual authentication in RPC2.

71 Secure Channels (2)
Setting up a secure channel between a (Venus) client and a Vice server in Coda.

72 Access Control
Operation | Description
Read | Read any file in the directory
Write | Modify any file in the directory
Lookup | Look up the status of any file
Insert | Add a new file to the directory
Delete | Delete an existing file
Administer | Modify the ACL of the directory
Classification of file and directory operations recognized by Coda with respect to access control.

73 Plan 9: Resources Unified to Files

74 Communication
File | Description
ctl | Used to write protocol-specific control commands
data | Used to read and write data
listen | Used to accept incoming connection setup requests
local | Provides information on the caller's side of the connection
remote | Provides information on the other side of the connection
status | Provides diagnostic information on the current status of the connection
Files associated with a single TCP connection in Plan 9.

75 Processes

76 Naming

77 Overview of xFS.

78 Processes (1)

79 Processes (2)

80 Naming
Data structure | Description
Manager map | Maps file ID to manager
Imap | Maps file ID to log address of file's inode
Inode | Maps block number (i.e., offset) to log address of block
File identifier | Reference used to index into manager map
File directory | Maps a file name to a file identifier
Log addresses | Triplet of stripe group ID, segment ID, and segment offset
Stripe group map | Maps stripe group ID to list of storage servers
Main data structures used in xFS.
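
How these structures chain together when locating a data block can be sketched with plain dictionaries. All identifiers and values below are made up, and in a real xFS deployment the structures are distributed across machines; the sketch only follows the mappings named in the table.

```python
# Hypothetical contents of each structure.
file_directory = {"/docs/paper.txt": 17}        # file name -> file identifier
manager_map = {17: "manager-2"}                 # file ID -> manager
imap = {17: ("sg1", 4, 128)}                    # file ID -> log address of the inode
inodes = {                                      # inode: block number -> log address
    ("sg1", 4, 128): {0: ("sg1", 5, 0), 1: ("sg2", 9, 64)},
}
stripe_group_map = {                            # stripe group ID -> storage servers
    "sg1": ["storage-1", "storage-2"],
    "sg2": ["storage-3", "storage-4"],
}


def locate_block(path, block_no):
    """Follow the mappings from a file name to the servers holding one block."""
    file_id = file_directory[path]
    manager = manager_map[file_id]
    inode = inodes[imap[file_id]]
    # A log address is a triplet: stripe group ID, segment ID, segment offset.
    stripe_group, segment, offset = inode[block_no]
    return manager, stripe_group_map[stripe_group], segment, offset


print(locate_block("/docs/paper.txt", 1))
```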

81 Overview of SFS
The organization of SFS.

82 Naming
A self-certifying pathname in SFS, composed of /sfs, LOC, HID, and Pathname:
/sfs/sfs.vu.sc.nl:ag62hty4wior450hdh63u623i4f0kqere/home/steen/mbox
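
Splitting such a pathname into its parts is straightforward. The parser below is a sketch that only assumes the layout shown on the slide (`/sfs`, then `LOC:HID`, then the remaining path); `parse_sfs_path` is a hypothetical helper, and it does not verify the host identifier.

```python
def parse_sfs_path(path):
    """Split a self-certifying pathname into (LOC, HID, pathname)."""
    parts = path.split("/", 3)  # ['', 'sfs', 'LOC:HID', 'rest of the path']
    if len(parts) < 3 or parts[0] != "" or parts[1] != "sfs":
        raise ValueError("not an /sfs pathname")
    location, separator, host_id = parts[2].partition(":")
    if not separator or not location or not host_id:
        raise ValueError("expected LOC:HID after /sfs/")
    remainder = "/" + parts[3] if len(parts) == 4 else "/"
    return location, host_id, remainder


example = "/sfs/sfs.vu.sc.nl:ag62hty4wior450hdh63u623i4f0kqere/home/steen/mbox"
print(parse_sfs_path(example))
```

Because the server's identity travels inside the name itself, a client needs no separate key-distribution step before it can check which server it is talking to.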

83 Summary
A comparison between NFS, Coda, Plan 9, xFS, and SFS. N/S indicates that nothing has been specified.
Columns: NFS, Coda, Plan 9, xFS, SFS. Repeated entries are not preserved, so most rows list fewer than five values.
Design goals: Access transparency; High availability; Uniformity; Serverless system; Scalable security
Access model: Remote; Up/Download; Log-based
Communication: RPC; Special; Active msgs
Client process: Thin/Fat; Fat; Thin; Medium
Server groups: No; Yes
Mount granularity: Directory; File system
Name space: Per client; Global; Per process
File ID scope: File server; Server
Sharing sem.: Session; Transactional; UNIX; N/S
Cache consist.: write-back; write-through
Replication: Minimal; ROWA; None; Striping
Fault tolerance: Reliable comm.; Replication and caching
Recovery: Client-based; Reintegration; Checkpoint & write logs
Secure channels: Existing mechanisms; Needham-Schroeder; No pathnames; Self-cert.
Access control: Many operations; Directory operations; UNIX based; NFS based

