Chapter 8: Distributed File Systems


1 Chapter 8: Distributed File Systems
Speaker: 呂俊欣, 5/2/2003

2 Outline
- Introduction
- File service architecture
- Sun Network File System (NFS)
- Andrew File System (AFS)
- Recent advances
- Summary

3 File System
A file system:
- Is responsible for the organization, storage, retrieval, naming, sharing and protection of files
- Is designed to store and manage a large number of files, with facilities for creating, naming and deleting files
- Stores programs and data and makes them available as needed

4 Layered File System

5 Persistence
Probably one of the most important services provided by a file system is persistence:
- Files exist after the program, and even the computer, has terminated
- Files typically do not go away; they are persistent and exist between sessions
- In conventional systems, files are the only persistent objects

6 Distributed Objects
Using the OO paradigm, it is easy to build distributed systems:
- Place objects on different machines
- Systems have been developed that allow this: Java RMI, CORBA ORBs
Having a persistent object store would be useful:
- Java RMI activation daemon
- Certain ORB implementations

7 Properties of Storage Systems (Figure 8.1)
Types of consistency between copies: 1 = strict one-copy consistency; approx = approximate consistency; X = no automatic consistency.

Storage system                        Consistency   Example
Main memory                           1             RAM
File system                           1             UNIX file system
Distributed file system               approx        Sun NFS
Web                                   X             Web server
Distributed shared memory             approx        Ivy (Ch. 16)
Remote objects (RMI/ORB)              1             CORBA
Persistent object store               1             CORBA Persistent Object Service
Persistent distributed object store   approx        PerDiS, Khazana

8 File Model
Files contain both data and attributes:
- Data is a sequence of bytes accessible by read/write operations
- Attributes consist of a collection of information about the file

9 Common File Attributes
- File length
- Creation timestamp
- Read timestamp
- Write timestamp
- Attribute timestamp
- Reference count
- Owner
- File type
- Access control list
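
The attribute record above maps naturally onto a C struct. The sketch below is illustrative only; the field names and types are assumptions, not taken from any particular file system:

    /* Illustrative attribute record for one file; names and types are assumptions. */
    #include <sys/types.h>
    #include <time.h>

    struct file_attributes {
        off_t  length;           /* file length in bytes */
        time_t creation_time;    /* creation timestamp */
        time_t read_time;        /* last-read timestamp */
        time_t write_time;       /* last-write timestamp */
        time_t attr_time;        /* last attribute-change timestamp */
        int    reference_count;  /* number of names (links) referring to the file */
        uid_t  owner;            /* owning user */
        int    file_type;        /* regular file, directory, ... */
        /* the access control list is omitted; its representation varies by system */
    };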

10 UNIX file system operations (Figure 8.4)
filedes = open(name, mode)            Opens an existing file with the given name.
filedes = creat(name, mode)           Creates a new file with the given name.
                                      Both operations deliver a file descriptor referencing the open file. The mode is read, write or both.
status = close(filedes)               Closes the open file filedes.
count = read(filedes, buffer, n)      Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n)     Transfers n bytes to the file referenced by filedes from buffer.
                                      Both operations deliver the number of bytes actually transferred and advance the read-write pointer.
pos = lseek(filedes, offset, whence)  Moves the read-write pointer to offset (relative or absolute, depending on whence).
status = unlink(name)                 Removes the file name from the directory structure. If the file has no other names, it is deleted.
status = link(name1, name2)           Adds a new name (name2) for a file (name1).
status = stat(name, buffer)           Gets the file attributes for file name into buffer.

Example A: write a simple C program to copy a file using the UNIX file system operations shown in Figure 8.4. (Remember that read() returns 0 when you attempt to read beyond the end of the file.)

A solution:

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    #define BUFSIZE  1024
    #define READ     0        /* O_RDONLY */
    #define FILEMODE 0644

    void copyfile(const char *oldfile, const char *newfile)
    {
        char buf[BUFSIZE];
        int n = 1, fdold, fdnew;

        if ((fdold = open(oldfile, READ)) >= 0) {
            fdnew = creat(newfile, FILEMODE);
            while (n > 0) {
                n = read(fdold, buf, BUFSIZE);
                if (write(fdnew, buf, n) < 0)
                    break;
            }
            close(fdold);
            close(fdnew);
        } else {
            printf("Copyfile: couldn't open file: %s\n", oldfile);
        }
    }

    int main(int argc, char **argv)
    {
        copyfile(argv[1], argv[2]);
        return 0;
    }

11 File system modules

12 File Service
A file service allows for the storage and access of files on a network:
- Remote file access is identical to local file access
- Convenient for users who use different workstations
- Other services can be easily implemented on top of it
- Makes management and deployment easier and more economical
File systems were the first distributed systems to be developed.
A file service defines the service, not the implementation.

13 File Server
A file server is a process that runs on some machine and helps to implement the file service.
A system may have more than one file server.

14 File Service Models
- Upload/download model: whole files (e.g. ReadMe.txt) are transferred between client and server
- Remote access model: the file (e.g. ReadMe.txt) remains at the server and the client sends individual operations to it

15 File service Requirements
Transparency
- Access: programs are unaware of the fact that files are distributed
- Location: programs see a uniform file name space; they do not know, or care, where the files are physically located
- Mobility: programs do not need to be informed when files move (provided the name of the file remains unchanged)

16 Transparency Revisited
Location transparency: the path name gives no hint of where the file is physically located.
  \\redshirt\ptt\dos\filesystem.ppt
  The file is on redshirt, but where is redshirt?

17 File service Requirements cont.
Transparency (continued)
- Performance: satisfactory performance across a specified range of system loads
- Scaling: the service can be expanded to meet additional loads

18 File service Requirements cont.
Concurrent file updates
- Changes to a file by one program should not interfere with the operation of other clients simultaneously accessing the same file
- File-level or record-level locking
- Other forms of concurrency control to minimize contention

19 File Locking
- A lock cannot be granted if another process already holds a lock on the file (or block)
- The client's request is placed at the end of a FIFO queue
- As soon as the lock is released, the server grants the next lock to the client at the head of the queue
(Figure: clients' lock requests on a file are queued FIFO at the server while the file is locked; see the sketch below.)
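
As a rough illustration of the FIFO queueing described above, here is a minimal server-side sketch in C. The types and function names (file_lock, request_lock, release_lock) are hypothetical, not part of any real file service:

    #include <stdlib.h>

    struct lock_request {                 /* one waiting client */
        int client_id;
        struct lock_request *next;
    };

    struct file_lock {
        int holder;                       /* client holding the lock, -1 if free */
        struct lock_request *head, *tail; /* FIFO queue of waiting clients */
    };

    /* Grant the lock if the file is free; otherwise queue the request at the tail. */
    void request_lock(struct file_lock *fl, int client_id)
    {
        if (fl->holder < 0) {
            fl->holder = client_id;
            return;
        }
        struct lock_request *r = malloc(sizeof *r);
        r->client_id = client_id;
        r->next = NULL;
        if (fl->tail) fl->tail->next = r; else fl->head = r;
        fl->tail = r;
    }

    /* On release, grant the lock to the client at the head of the queue, if any. */
    void release_lock(struct file_lock *fl)
    {
        struct lock_request *r = fl->head;
        if (r) {
            fl->holder = r->client_id;
            fl->head = r->next;
            if (!fl->head) fl->tail = NULL;
            free(r);
        } else {
            fl->holder = -1;
        }
    }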

20 File service Requirements cont.
File replication
- A file may be represented by several copies of its contents at different locations
- Load-sharing between servers makes the service more scalable
- Local access has better response (lower latency)
- Fault tolerance
- Full replication is difficult to implement; caching (of all or part of a file) gives most of the benefits (except fault tolerance)

21 File service Requirements cont.
Hardware and software heterogeneity
- The service can be accessed by clients running on (almost) any OS or hardware platform
- The design must be compatible with the file systems of different OSes
- Service interfaces must be open: precise specifications of APIs are published

22 File service Requirements cont.
- Fault tolerance: the service can continue to operate in the face of client and server failures
- Consistency: UNIX one-copy update semantics
- Security: based on the identity of the user making the request; identities of remote users must be authenticated; privacy requires secure communication
- Efficiency: should offer facilities of at least the same power as those found in conventional systems

23 File Sharing Semantics
When more than one user shares a file, it is necessary to define the semantics of reading and writing.
For single-processor systems:
- The system enforces an absolute time ordering on all operations and always returns the result of the last operation
- Referred to as UNIX semantics

24 UNIX Semantics
(Figure: the original file contains "ab"; PID0 writes "c", and a subsequent read by PID1 returns "abc".)

25 Distributed (with client caching)
(Figure: Client 1 and Client 2 each cache a copy of the file "ab"; Client 1 writes "c" to its cached copy, but Client 2's read of its own cached copy still returns "ab", so the update is not seen.)

26 Summary
Method             Comment
UNIX semantics     Every operation on a file is instantly visible to all processes
Session semantics  No changes are visible to other processes until the file is closed
Immutable files    No updates are possible; simplifies sharing and replication
Transactions       All changes have the all-or-nothing property

27 Caching
Attempt to hold what is needed by the process in high-speed storage.
Parameters:
- What unit does the cache manage? Files, blocks, ...
- What do you do when the cache fills up? Replacement policy

28 Cache Consistency
The real problem with caching in distributed file systems is cache consistency:
- If two processes are caching the same file, how do the local copies find out about changes made to the file?
- When they close their files, who wins the race?
Client caching needs to be thought out carefully.

29 Cache Strategies
Method               Comment
Write-through        Changes to the file are sent to the server; works, but does not reduce write traffic
Delayed write        Changes are sent to the server periodically; better performance but possibly ambiguous semantics
Write on close       Changes are written when the file is closed; matches session semantics
Centralized control  The file server keeps track of who has which file open and for what purpose; UNIX semantics, but not robust and scales poorly

30 Replication
Multiple copies of files are maintained:
- Increases reliability by having several copies of a file
- Allows file access to occur even if a server is down
- Load-balancing across servers
Replication transparency: to what extent is the user aware that some files are replicated?

31 Types of Replication
(Figure: a client C and replica servers S0, S1, S2 under three schemes)
- Explicit file replication: the client updates each replica explicitly
- Lazy file replication: the client's update reaches one replica now and is propagated to the others later
- Group replication: the client's update is delivered to the replicas as a group

32 Update Protocols
Okay, so now we have replicas; how do we update them?
Primary copy replication
- The change is sent to the primary
- The primary sends the change to the secondary servers
- If the primary is down, no updates can be made
Voting
- The client must receive permission from multiple servers before making an update

33 File Service Architecture (Figure 8.5)
Flat file service operations: Read, Write, Create, Delete, GetAttributes, SetAttributes
Directory service operations: Lookup, AddName, UnName, GetNames
Structure: an application program on the client computer calls the client module, which invokes the flat file service and the directory service on the server computer.
If the UNIX file primitives were mapped directly onto file server operations, it would be impossible to support many of the requirements outlined on the previous slides. Instead, the file service is implemented in part by a module that runs in the client computer. The client module implements nearly all of the state-dependent functionality (open files, position of the read-write pointer, etc.), supporting fault tolerance, and it maintains a cache of recently used file data, giving efficiency, limited replication and scalability. It also translates the file operations performed by applications into the API of the file server (transparency and heterogeneity). It supplies authentication information and performs encryption when necessary, but access control checks must be made by the server.

34 System Modules
Flat file service
- Implements operations on the contents of files
- UFIDs are used to refer to files (think i-node)
Directory service
- Provides a mapping between text names and UFIDs
- Note that the name space is not necessarily flat
- Might be a client of the flat file service if it requires persistent storage
Client module
- Provides client access to the system

35 Server operations for the model file service (Figures 8.6 and 8.7)
Flat file service
  Read(FileId, i, n) -> Data        (i = position of first byte)
  Write(FileId, i, Data)
  Create() -> FileId
  Delete(FileId)
  GetAttributes(FileId) -> Attr
  SetAttributes(FileId, Attr)
Directory service
  Lookup(Dir, Name) -> FileId
  AddName(Dir, Name, File)
  UnName(Dir, Name)
  GetNames(Dir, Pattern) -> NameSeq
Notes:
- Open files and the read-write pointer are eliminated from this interface
- All flat file service (and directory service) operations are repeatable (idempotent), except Create
- Neither of the server modules maintains any state on behalf of the client (unless encryption is used)
- FileId: a unique identifier for files anywhere in the network, similar to the remote object references described earlier
- Pathname lookup: pathnames such as '/usr/bin/tar' are resolved by iterative calls to Lookup(), one call for each component of the path, starting with the ID of the root directory '/', which is known in every client (a sketch follows below)
Example B: show how each file operation of the program you wrote in Example A would be executed using the operations of the model file service in Figures 8.6 and 8.7.
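
The iterative pathname lookup described above can be sketched in C as follows. The FileId type, the Root constant and the Lookup() stub are assumed names standing in for the model directory-service operation; this is not a real client-module implementation:

    #include <string.h>

    typedef long FileId;
    extern FileId Root;                                  /* ID of '/', known to every client */
    extern FileId Lookup(FileId dir, const char *name);  /* remote invocation of the directory service */

    /* Resolve e.g. "/usr/bin/tar" with one Lookup() per path component. */
    FileId resolve(const char *pathname)
    {
        char buf[1024];
        strncpy(buf, pathname, sizeof buf - 1);
        buf[sizeof buf - 1] = '\0';

        FileId current = Root;
        for (char *part = strtok(buf, "/"); part != NULL; part = strtok(NULL, "/"))
            current = Lookup(current, part);
        return current;
    }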

36 Example B solution
Server operations for copyfile("/usr/include/glob.h", "/foo"):

fdold = open("/usr/include/glob.h", READ) - client module actions:
  FileId = Lookup(Root, "usr")        - remote invocation
  FileId = Lookup(FileId, "include")  - remote invocation
  FileId = Lookup(FileId, "glob.h")   - remote invocation
  The client module makes an entry in its open-files table with file = FileId, mode = READ and RWpointer = 0, and returns the table row number as the value for fdold.

fdnew = creat("/foo", FILEMODE) - client module actions:
  FileId = Create()                   - remote invocation
  AddName(Root, "foo", FileId)        - remote invocation
  SetAttributes(FileId, attributes)   - remote invocation
  The client module makes an entry in its open-files table with file = FileId, mode = WRITE and RWpointer = 0, and returns the table row number as the value for fdnew.

n = read(fdold, buf, BUFSIZE) - client module actions:
  Read(openfiles[fdold].file, openfiles[fdold].RWpointer, BUFSIZE) - remote invocation
  Increment the RWpointer in the open-files table by the number of bytes actually read and assign the resulting array of data to buf.

37 File Group
A collection of files that can be located on any server or moved between servers while maintaining the same names:
- Similar to a UNIX filesystem
- Helps with distributing the load of file serving between several servers
File groups have identifiers which are unique throughout the system (and hence, for an open system, they must be globally unique):
- Used to refer to file groups and files
- To construct a globally unique ID we use some unique attribute of the machine on which it is created, e.g. its IP address, even though the file group may move subsequently
File group ID: IP address (32 bits) + date (16 bits)
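
A minimal sketch of the file group identifier layout shown above; the struct and field names are illustrative:

    #include <stdint.h>

    /* Globally unique file group ID: the creating host's IPv4 address (32 bits)
       plus a 16-bit date field. The group may later move to another server,
       but the ID never changes. */
    struct file_group_id {
        uint32_t ip_address;
        uint16_t date;
    };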

38 NFS
NFS was originally designed and implemented by Sun Microsystems.
Three interesting aspects:
- Architecture
- Protocol (RFC 1094)
- Implementation
Sun's RPC system was developed for use in NFS; it can be configured to use UDP or TCP.

39 Overview
The basic idea is to allow an arbitrary collection of clients and servers to share a common file system:
- An NFS server exports one of its directories
- Clients access exported directories by mounting them
- To programs running on the client, there is almost no difference between local and remote files

40 NFS architecture (Figure 8.8)
(Figure: on the client computer, application programs issue UNIX system calls to the UNIX kernel; a virtual file system layer routes operations on local files to the UNIX file system and operations on remote files to the NFS client, which communicates with the NFS server on the server computer via the NFS protocol (remote operations); the server's virtual file system passes the requests on to its UNIX file system.)

41 Remote Procedure Call Model
1. Client stub is called; argument marshalling is performed
2. Local kernel is called with the network message
3. Network message is transferred to the remote host
4. Server stub is given the message; arguments are unmarshalled/converted
5. Remote procedure is executed with the arguments
6. Procedure return values are given back to the server stub
7. Server stub converts/marshals the values; the remote kernel is called
8. Network message is transferred back to the local host
9. Client stub receives the message from its kernel
10. Return values are given to the client from the stub
Protocols are independent of UNIX (XDR); implementations exist for Win95/NT, MacOS and Linux.

42 Virtual File System
Part of the UNIX kernel:
- Makes access to local and remote files transparent
- Translates between UNIX file identifiers and NFS file handles
- Keeps track of file systems that are currently available, both locally and remotely
NFS file handles contain: file system ID, i-node number, i-node generation number.
File systems are mounted; the VFS keeps a structure for each mounted file system.
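
The logical contents of an NFS file handle, as listed above, can be pictured with the following sketch. Real NFS handles are opaque byte arrays; the struct below just names the fields for illustration:

    #include <stdint.h>

    struct nfs_file_handle {
        uint32_t filesystem_id;     /* identifies the exported file system */
        uint32_t inode_number;      /* i-node of the file within that file system */
        uint32_t inode_generation;  /* detects reuse of an i-node number */
    };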

43 Virtual File System cont.

44 NFS architecture: does the implementation have to be in the system kernel?
No: there are examples of NFS clients and servers that run at application level, as libraries or processes (e.g. early Windows and MacOS implementations, current PocketPC, etc.).
But for a UNIX implementation there are advantages:
- Binary code compatible: no need to recompile applications
- Standard system calls that access remote files can be routed through the NFS client module by the kernel
- Shared cache of recently used blocks at the client
- A kernel-level server can access i-nodes and file blocks directly (but a privileged (root) application program could do almost the same)
- Security of the encryption key used for authentication
Although the NFS client and server modules reside in the kernel in standard UNIX implementations, it is perfectly possible to construct application-level versions (e.g. Windows 3.0, Win 95, early MacOS, PocketPC).

45 NFS server operations (simplified) - Figure 8.9
  read(fh, offset, count) -> attr, data
  write(fh, offset, count, data) -> attr
  create(dirfh, name, attr) -> newfh, attr
  remove(dirfh, name) -> status
  getattr(fh) -> attr
  setattr(fh, attr) -> attr
  lookup(dirfh, name) -> fh, attr
  rename(dirfh, name, todirfh, toname)
  link(newdirfh, newname, dirfh, name)
  readdir(dirfh, cookie, count) -> entries
  symlink(newdirfh, newname, string) -> status
  readlink(fh) -> string
  mkdir(dirfh, name, attr) -> newfh, attr
  rmdir(dirfh, name) -> status
  statfs(fh) -> fsstats
For comparison, the model flat file service: Read(FileId, i, n) -> Data; Write(FileId, i, Data); Create() -> FileId; Delete(FileId); GetAttributes(FileId) -> Attr; SetAttributes(FileId, Attr).
And the model directory service: Lookup(Dir, Name) -> FileId; AddName(Dir, Name, File); UnName(Dir, Name); GetNames(Dir, Pattern) -> NameSeq.
Notes:
- fh = file handle: filesystem identifier, i-node number, i-node generation
- read and write are idempotent
- getattr is used to implement fstat(); setattr is used to update the attributes
- NFS create adds the file to a directory, whereas the model file service uses two operations
Optional Class Exercise C: show how each file operation of the program that you wrote in Class Exercise A would be executed using the NFS server operations in Figure 8.9.
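
For the optional Exercise C, a hedged sketch of how the Example A copyfile() might map onto the Figure 8.9 operations is given below. The fh type and the stub prototypes (lookup, create, nfs_read, nfs_write) are assumed names for illustration, not the real Sun RPC interface:

    typedef struct { unsigned char data[32]; } fh;   /* opaque NFS file handle */

    extern fh  root_fh;                                        /* handle of the exported root */
    extern fh  lookup(fh dirfh, const char *name);             /* -> fh, attr */
    extern fh  create(fh dirfh, const char *name, int attr);   /* -> newfh, attr */
    extern int nfs_read(fh f, long offset, int count, char *buf);
    extern int nfs_write(fh f, long offset, int count, const char *buf);

    void copyfile_nfs(void)
    {
        char buf[1024];
        long rp = 0, wp = 0;
        int n = 1;

        /* open("/usr/include/glob.h", READ): one lookup per path component */
        fh src = lookup(lookup(lookup(root_fh, "usr"), "include"), "glob.h");
        /* creat("/foo", FILEMODE): NFS create both creates the file and adds the name */
        fh dst = create(root_fh, "foo", 0644);

        while (n > 0) {                       /* one read/write pair per block */
            n = nfs_read(src, rp, sizeof buf, buf);
            nfs_write(dst, wp, n, buf);
            rp += n;
            wp += n;
        }
        /* no close(): the server is stateless, so the client simply forgets the handles */
    }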

46 NFS Client Module
Part of the UNIX kernel:
- Allows user programs to access files via UNIX system calls without recompilation or relinking
- One module serves all user-level processes
- A shared cache holds recently used blocks
- The encryption key used for authentication of user IDs is kept in the kernel

47 NFS access control and authentication
- Stateless server, so the user's identity and access rights must be checked by the server on each request (in the local file system they are checked only on open())
- Every client request is accompanied by the userID and groupID (not shown in Figure 8.9 because they are inserted by the RPC system)
- The server is exposed to impostor attacks unless the userID and groupID are protected by encryption
- Kerberos has been integrated with NFS to provide a stronger and more comprehensive security solution (Kerberos is described in Chapter 7; kerberized NFS is described below)
(Stateless: that is, each time the client sends a request to the server, the connection is established anew.)

48 Access Control
NFS servers are stateless:
- The user's identity must be verified for each request
- The UNIX UID and GID of the user are used for authentication purposes
Does this scare you? (See kerberized NFS.)

49 Mount service
Mount operation: mount(remotehost, remotedirectory, localdirectory)
- The server maintains a table of clients who have mounted filesystems at that server
- Each client maintains a table of mounted file systems, each entry holding: <IP address, port number, file handle>
- Hard versus soft mounts
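
One way to picture the client-side table mentioned above is the sketch below; the struct and field names are illustrative, not an actual NFS data structure:

    #include <stdint.h>

    struct mount_entry {
        uint32_t server_ip;          /* IP address of the NFS server */
        uint16_t server_port;        /* port of the server's NFS service */
        unsigned char root_fh[32];   /* file handle of the remote mounted directory */
        char local_path[256];        /* local mount point, e.g. "/usr/staff" */
    };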

50 File Mounting
File mounting protocol:
- A client sends a path name to a server and requests permission to mount the directory
- If the request is legal, the server returns a file handle to the client
- The handle identifies the file system type, the disk, the i-node number of the directory, and security information
- Subsequent calls to read/write in that directory use the file handle

51 Mount service example
System call: mount("Server 1", "/nfs/users", "/usr/staff")
(Figure: the client's virtual file system asks Server 1 to check permission; the server returns a file handle, and the remote sub-tree /nfs/users on Server 1 then appears at /usr/staff in the client's name space.)

52 Local and Remote Access
Note: The file system mounted at /usr/students in the client is actually the sub-tree located at /export/people in Server 1; the file system mounted at /usr/staff in the client is actually the sub-tree located at /nfs/users in Server 2.

53 NFS path translation
Pathnames are translated in a step-by-step procedure by the client; the file handle obtained in one step is used as a parameter in the next lookup.

54 Automounting
Allows a number of remote directories to be associated with a local directory; nothing is mounted until a client tries to access a remote directory.
Advantages:
- No work needs to be done if the files are not accessed
- Some fault tolerance is provided

55 Server caching
- Read-ahead: fetches the pages following those that have been recently read
- Delayed write: doesn't write out disk blocks until the cache buffer is needed for something else; the UNIX sync flushes altered pages to disk every 30 seconds
- The NFS commit operation forces the blocks of a file to be written in delayed-write mode
- NFS also offers write-through caching: a block is written to disk before the reply is sent back to the client
What problems occur with delayed write? What problems occur with write-through?

56 Client caching (reads)
Client caching can result in inconsistent files. Why?
NFS uses timestamp-based validation of cache blocks:
- Tc is the time the block was last validated
- Tm is the time the block was last modified at the server
- t is the freshness interval (set adaptively for individual files, 3 to 30 seconds)
- T is the current time
A block is valid if (T - Tc < t), or if (T - Tc >= t and Tm_client = Tm_server).
A validation check is made at the client with each access.
When a new value of Tm is received for a file, it is applied to all of that file's cached blocks.
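
The validity rule above can be written directly as a small C predicate; the parameter names follow the slide and the function itself is just an illustrative sketch:

    #include <time.h>

    /* Returns 1 if a cached block may be used without refetching it. */
    int block_is_fresh(time_t T, time_t Tc, time_t t,
                       time_t Tm_client, time_t Tm_server)
    {
        if (T - Tc < t)                 /* validated recently enough */
            return 1;
        return Tm_client == Tm_server;  /* otherwise compare modification times */
    }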

57 Client caching (writes)
- Modified pages are marked as dirty and flushed at the next sync
- Bio-daemons (block input-output daemons) perform read-ahead and delayed write:
  - notified when a client reads a block, so that the following blocks can be fetched
  - notified when a client fills a block, so that it can be written out

58 Other NFS optimizations
- Sun RPC runs over UDP by default (it can use TCP if required)
- Uses the UNIX BSD Fast File System with 8-kbyte blocks
- reads() and writes() can be of any size (negotiated between client and server)
- The guaranteed freshness interval t is set adaptively for individual files to reduce the getattr() calls needed to update Tm
- File attribute information (including Tm) is piggybacked in replies to all file requests

59 NFS Summary
An excellent example of a simple, robust, high-performance distributed service. Achievement of transparencies (see Section 1.4.7):
- Access: excellent; the API is the UNIX system call interface for both local and remote files
- Location: not guaranteed but normally achieved; naming of filesystems is controlled by client mount operations, but transparency can be ensured by an appropriate system configuration
- Concurrency: limited but adequate for most purposes; when read-write files are shared concurrently between clients, consistency is not perfect
- Replication: limited to read-only file systems; for writable files, the Sun Network Information Service (NIS) runs over NFS and is used to replicate essential system files (see Chapter 14)

60 NFS summary - achievement of transparencies (continued)
- Failure: limited but effective; service is suspended if a server fails; recovery from failures is aided by the simple stateless design
- Mobility: hardly achieved; relocation of files is not possible; relocation of filesystems is possible, but requires updates to client configurations
- Performance: good; multiprocessor servers achieve very high performance, but for a single filesystem it is not possible to go beyond the throughput of a multiprocessor server
- Scaling: good; filesystems (file groups) may be subdivided and allocated to separate servers; ultimately, the performance limit is determined by the load on the server holding the most heavily used filesystem (file group)

61 Andrew File System (AFS)

62 Abstract
Distributed file systems such as the AT&T RFS system provide the same consistency semantics as a single-machine file system, often at great cost to performance. Distributed file systems such as Sun NFS provide good performance, but with extremely weak consistency guarantees.

63 Abstract
The designers of AFS believed that a compromise could be achieved between the two extremes. AFS attempts to provide useful file system consistency guarantees along with good performance.

64 Design Goals
Performance
- Minimize server load
- Minimize network communication
Consistency guarantees
- After a file system call completes, the resulting file system state is immediately visible everywhere on the network (with one exception, discussed later)
Scalability
- Provide the appearance of a single, unified file system to approximately 5000 client nodes connected on a single LAN
- A single file server should provide service to about 50 clients
UNIX support
- AFS is intended primarily for use by UNIX workstations

65 Influential Observations
Design goals were based in part on the following observations:
- Files are small; most are less than 10 kB
- Read operations on files are much more common than writes (about 6 times more common)
- Sequential access is common; random access is rare
- Most files are read and written by only one user; usually only one user modifies a shared file
- Files are referenced in bursts: if a file has been referenced recently, there is a high probability that it will be referenced again in the near future

66 Implementation Overview

67 Venus / Vice = Client / Server
Venus
- A user-level UNIX process that runs in each client computer and corresponds to the client module in the previous abstract model
Vice
- A user-level UNIX process that runs in each server machine
Threads
- Both Venus and Vice make use of a non-pre-emptive threads package to enable concurrent processing

68 File Name Space
Local files are used only for:
- Temporary files (/tmp)
- Processes that are essential for workstation startup
Other standard UNIX files (such as those normally found in /bin and /lib) are implemented as symbolic links from the local space to the shared space.
Users' directories are in the shared space, allowing users to access their files from any workstation.

69 File Name Space

70 System Call Interception
User programs use conventional UNIX pathnames to refer to files, but AFS uses fids (file identifiers) in the communication between Venus and Vice.
Local file system calls
- Handled as normal BSD UNIX file system calls
Remote file system calls
- In situations requiring non-local file operations, the BSD UNIX kernel has been modified to convert conventional UNIX pathnames to fids and forward the fids to Venus
- Venus then communicates directly with Vice using fids
- The BSD UNIX kernel below Vice was also modified so that Vice can perform file operations in terms of fids instead of conventional UNIX file descriptors
- Fid calculation on the client side minimizes server workload

71 System Call Interception

72 File Identifiers (FIDs)
Each file and directory in the shared file space is identified by a unique, 96-bit fid.
Example: whatfid . ./ProblemFile
  .:             1:
  ./ProblemFile: 1:
The File IDentifier (FID) output is composed of four numbers:
- The first number is the cell number; "1:" corresponds to the local cell
- The second number is the volume id number; use `vos listvldb` to find the corresponding volume name
- The third number is the vnode
- The fourth number is the uniquifier
AFS uses the third and fourth numbers to track the file's location in the cache.
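
A sketch of the fid fields in C; the field names are illustrative, and the cell number printed by whatfid is shown separately on the assumption that the 96-bit fid proper consists of the volume, vnode and uniquifier:

    #include <stdint.h>

    struct afs_fid {
        uint32_t volume;       /* volume id; `vos listvldb` maps it to a volume name */
        uint32_t vnode;        /* vnode number within the volume */
        uint32_t uniquifier;   /* distinguishes reuse of a vnode number */
    };
    /* whatfid prefixes these three numbers with a cell number; "1:" is the local cell. */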

73 File System Call Implementation

74 Cache Consistency - Callback Promise
A token issued by the Vice server that is the custodian of the file, guaranteeing that it will notify the Venus process when any other client modifies the file.
- Stored with the cached files on the client workstation disks
- Two states: valid and cancelled
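
A minimal sketch of how Venus might record the callback promise with each cached file; the names and the helper function are hypothetical:

    enum callback_state { CALLBACK_VALID, CALLBACK_CANCELLED };

    struct cached_file {
        char path[256];                /* local copy on the workstation disk */
        enum callback_state callback;  /* set to CANCELLED when Vice reports that
                                          another client has modified the file */
    };

    /* While the callback is valid, the cached copy can be opened without
       contacting the server; otherwise a fresh copy must be fetched. */
    int can_use_cached_copy(const struct cached_file *f)
    {
        return f->callback == CALLBACK_VALID;
    }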

75 Cache Consistency cont.
Callback benefits
- Results in communication between the client and server only when the file has been updated
- The client does not need to inform the server that it wants to open a file (likely for reading) if there is a valid copy on the client machine (the callback status is set to valid)

76 Cache Consistency cont.
Callback drawbacks
- The mechanism used in AFS-2 and later versions requires Vice servers to maintain some state on behalf of their Venus clients
- If clients in different workstations open, write, and close the same file concurrently, all but the update resulting from the last close will be silently lost (no error report is given); clients must implement concurrency control independently if they require it

77 AFS Links

78 Recent advances in file services
NFS enhancements
- WebNFS: the NFS server implements a web-like service on a well-known port. Requests use a 'public file handle' and a pathname-capable variant of lookup(). Enables applications to access NFS servers directly, e.g. to read a portion of a large file.
- One-copy update semantics (Spritely NFS, NQNFS): include an open() operation and maintain tables of open files at servers, which are used to prevent multiple writers and to generate callbacks to clients notifying them of updates. Performance was improved by a reduction in getattr() traffic.
Improvements in disk storage organization
- RAID (redundant arrays of inexpensive disks): improves performance and reliability by striping data redundantly across several disk drives
- Log-structured file storage: updated pages are stored contiguously in memory and committed to disk in large contiguous blocks (~1 Mbyte). File maps are modified whenever an update occurs. Garbage collection recovers disk space.

79 New design approaches
Distribute file data across several servers
- Exploits high-speed networks (ATM, Gigabit Ethernet)
- Layered approach; the lowest level is like a 'distributed virtual disk'
- Achieves scalability even for a single heavily-used file
'Serverless' architecture
- Exploits processing and disk resources in all available network nodes
- Service is distributed at the level of individual files
Examples:
- xFS (Section 8.5): an experimental implementation demonstrated a substantial performance gain over NFS and AFS
- Frangipani (Section 8.5): performance similar to local UNIX file access
- Tiger Video File System (see Chapter 15)
- Peer-to-peer systems: Napster, OceanStore (UCB), Farsite (MSR), Publius (AT&T Research); see the web for documentation on these very recent systems

80 New design approaches
Replicated read-write files
- High availability
- Disconnected working: re-integration after disconnection is a major problem if conflicting updates have occurred
- Examples: the Bayou system and the Coda system

81 Summary
- Sun NFS is an excellent example of a distributed service designed to meet many important design requirements
- Effective client caching can produce file service performance equal to or better than that of local file systems
- Consistency versus update semantics versus fault tolerance remains an issue
- Superior scalability can be achieved with whole-file serving (Andrew FS) or the distributed virtual disk approach
Future requirements:
- Support for mobile users, disconnected operation, automatic re-integration (cf. the Coda file system, Chapter 14)
- Support for data streaming and quality of service (cf. the Tiger file system, Chapter 15)

