DFS Design and Implementation Yang Wang
Review Characteristics of a DFS: a. Dispersed clients b. Dispersed files c. Multiplicity of Users d. Multiplicity of files
What and Why DFS? A Distributed File System (DFS) covers: transparency; name service; directory service; caching and replication; access control and protection. It is one of the two important components in any distributed computation.
Outline: Files and File Systems; File Mounting and Server Registration; Stateful and Stateless File Servers; File Access and Semantics of Sharing; Example and Research
Files and File systems. Files are named data objects. A file system is responsible for the naming, creation, deletion, retrieval, modification, and protection of the files in the system. Logical components of a file, as seen by users: file name, file attributes, data units.
Files and File systems. File name: a symbolic name – When a file is accessed, its symbolic name is mapped to a unique file id (ufid or file handle) that can locate the physical file – This mapping is the primary function of the Directory Service. File attributes: name, size, location, time, type, etc. Data units: organization – Flat structure: a stream of bytes or a sequence of blocks – Hierarchical structure: indexed records
Files and File systems. File access. Sequential access: a file position pointer indicates the position of the next data unit to be accessed. Direct access: fixed-size data units are referenced explicitly by their block numbers. Indexed sequential access: a key is associated with each data object, and a sequence of key/object pairs is stored in a large data block. An index is used to locate the block in which the pair resides, and the data in that block is then scanned until the key/object pair is found.
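The three access modes can be illustrated with a minimal in-memory sketch. The block size, the sample keys, and the helper names `read_block` and `lookup` are assumptions made for illustration; they do not come from any particular file system.

```python
import bisect
import io

BLOCK_SIZE = 4  # hypothetical fixed block size, kept small for illustration

# A "file" made of three fixed-size blocks held in memory.
f = io.BytesIO(b"AAAABBBBCCCC")

# Sequential access: the file position pointer advances after each read.
first = f.read(BLOCK_SIZE)
second = f.read(BLOCK_SIZE)

# Direct access: reference a fixed-size data unit by its block number.
def read_block(fileobj, block_no):
    fileobj.seek(block_no * BLOCK_SIZE)
    return fileobj.read(BLOCK_SIZE)

third = read_block(f, 2)

# Indexed sequential access: each block stores sorted key/object pairs,
# and the index holds the first key of every block.
blocks = [
    [("apple", 1), ("banana", 2)],
    [("cherry", 3), ("grape", 4)],
    [("melon", 5), ("peach", 6)],
]
index = [block[0][0] for block in blocks]

def lookup(key):
    # Use the index to pick the only block that can contain the key.
    pos = bisect.bisect_right(index, key) - 1
    if pos < 0:
        return None
    # Scan that block sequentially until the pair is found.
    for k, obj in blocks[pos]:
        if k == key:
            return obj
    return None
```

For example, `lookup("grape")` consults the index, selects the second block, and scans it to return the stored object.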
One Example: UNIX. UNIX files are streams of characters to application programs and sequences of fixed-size logical blocks to the file system. Both sequential and direct access methods are supported. Other access methods can be built on top of the flat file structure.
Major Components in a file system
Directory service: name resolution, addition and deletion of files
Authorization service: capability and/or access control list
File service (transaction): concurrency and replication management
File service (basic): read/write files and get/set attributes
System service: device, cache, and block management
Directory Service Directories are files that contain names and addresses of other files and subdirectories. Mapping and locating Search for a file Create a file Delete a file List a directory Rename a file Traverse the file system
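A minimal sketch of these directory operations, assuming an in-memory tree in which each directory is a dictionary mapping a name to a subdirectory or to a unique file id. The class `DirectoryService` and its method names are hypothetical and chosen only to mirror the operations listed above.

```python
import itertools

class DirectoryService:
    """Toy in-memory directory tree (hypothetical, for illustration only).

    A directory is a dict mapping a name either to a subdirectory (dict)
    or to a unique file id (int).
    """

    def __init__(self):
        self.root = {}
        self._ufids = itertools.count(1)

    @staticmethod
    def _split(path):
        return [p for p in path.split("/") if p]

    def _dir(self, parts):
        """Traverse from the root to the directory named by parts."""
        node = self.root
        for name in parts:
            node = node[name]  # KeyError if a component is missing
            if not isinstance(node, dict):
                raise NotADirectoryError(name)
        return node

    def create(self, path):
        *dirs, name = self._split(path)
        node = self.root
        for d in dirs:  # create missing parent directories on the way
            node = node.setdefault(d, {})
        if name in node:
            raise FileExistsError(path)
        node[name] = next(self._ufids)
        return node[name]

    def lookup(self, path):
        """Map a symbolic path name to its unique file id."""
        *dirs, name = self._split(path)
        return self._dir(dirs)[name]

    def delete(self, path):
        *dirs, name = self._split(path)
        del self._dir(dirs)[name]

    def rename(self, old, new):
        ufid = self.lookup(old)
        self.delete(old)
        *dirs, name = self._split(new)
        self._dir(dirs)[name] = ufid

    def list_dir(self, path):
        return sorted(self._dir(self._split(path)))

ds = DirectoryService()
ufid = ds.create("/chow/book/dsm.txt")
ds.create("/chow/paper.txt")
ds.rename("/chow/paper.txt", "/chow/draft.txt")
listing = ds.list_dir("/chow")
```

Renaming changes only the directory entry; the file keeps its ufid, which is why the mapping can be updated without touching the data.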
Authorization Service File access must be regulated to ensure security Types of access – Read – Write – Execute – Append – Delete – List
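One way to regulate these access types is an access control list checked before every operation. The sketch below is an assumption-laden illustration: the `Access` flag names follow the list above, while the users, the file id, and the helper `is_allowed` are hypothetical.

```python
from enum import Flag, auto

class Access(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()
    APPEND = auto()
    DELETE = auto()
    LIST = auto()

# Hypothetical access control list: (user, ufid) -> granted access types.
acl = {
    ("alice", 7): Access.READ | Access.WRITE,
    ("bob", 7): Access.READ,
}

def is_allowed(user, ufid, wanted):
    """Return True only if every requested access type has been granted."""
    granted = acl.get((user, ufid), Access(0))
    return (granted & wanted) == wanted

alice_can_write = is_allowed("alice", 7, Access.WRITE)
bob_can_write = is_allowed("bob", 7, Access.WRITE)
```

A user with no entry for a file is granted nothing, so the default is to deny.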
File Service – Basic Operations
Create – Allocate space – Make an entry in the directory
Write – Search the directory – The write takes place at the location of the write pointer
Read – Search the directory – The read takes place at the location of the read pointer
Reposition within file (file seek) – Set the current file pointer to a given value
Delete – Search the directory – Release all file space
Truncate – Reset the file to length zero
Open(Fi) – Search the directory structure – Move the content of the directory entry to memory
Close(Fi) – Move the content in memory to the directory structure on disk
Get/set file attributes
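On a local file system these basic operations map closely onto the low-level calls in Python's `os` module. The sketch below runs them against a temporary file; the file name `demo.txt` is an arbitrary choice for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "demo.txt")

    # Create and open: makes a directory entry and returns a descriptor.
    fd = os.open(path, os.O_CREAT | os.O_RDWR)

    # Write: takes place at the current file pointer (offset 0 here).
    os.write(fd, b"hello world")

    # Reposition within the file, then read from the new position.
    os.lseek(fd, 6, os.SEEK_SET)
    tail = os.read(fd, 5)

    # Get attributes, truncate to length zero, and get attributes again.
    size_before = os.fstat(fd).st_size
    os.ftruncate(fd, 0)
    size_after = os.fstat(fd).st_size

    # Close the descriptor, then delete the file and its directory entry.
    os.close(fd)
    os.remove(path)
    exists_after_delete = os.path.exists(path)
```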
File service-Transaction Concurrency and replication management (Section 6.3.3)
System Service System services are a file system’s interface to the hardware and are transparent to users of the file system – Mapping of logical to physical block addresses – Interfacing to services at the device level for file space allocation/deallocation – Actual read/write file operations – Caching for performance enhancement – Replicating for reliability improvement
Interaction among services in a DFS [Figure labels: Clients, Directory services, Authorization services, File services, System services]
Organization of data files in a file system
Outline: Files and File Systems; File Mounting and Server Registration; Stateful and Stateless File Servers; File Access and Semantics of Sharing; Example and Research
File Mounting and Server Registration Attach a remote named file system to the client’s file system hierarchy at the position pointed to by a path name (mounting point) – A mounting point is usually a leaf of the directory tree that contains only an empty subdirectory Once files are mounted, they are accessed by using the concatenated logical path names without referencing either the remote hosts or local devices – Location transparency – The linked information (mount table) is kept until they are unmounted
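File Mounting Example [Figure: the local client's tree (root, chow, paper, book) and the remote server's tree (root, OS, DFS, DSM). The server exports OS and the client mounts it at /chow/book, so the remote file /OS/DSM is accessed locally as /chow/book/DSM.]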
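The path translation in the mounting example above can be sketched with a small mount table. The host name `fileserver` and the function `resolve` are hypothetical; only the paths /chow/book/DSM and /OS/DSM come from the example.

```python
# Hypothetical mount table: local mount point -> (remote host, exported directory).
mount_table = {
    "/chow/book": ("fileserver", "/OS"),
}

def resolve(path, table):
    """Translate a local logical path into (host, path on that host).

    Returns (None, path) when no mount point covers the path, meaning
    the file is local.
    """
    # Prefer the longest matching mount point so nested mounts would work.
    for mount_point in sorted(table, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            host, remote_root = table[mount_point]
            return host, remote_root + path[len(mount_point):]
    return None, path

remote = resolve("/chow/book/DSM", mount_table)
local = resolve("/chow/paper", mount_table)
```

The client names the file only by its concatenated logical path; the host appears solely inside the mount table, which is what provides location transparency.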
File mounting and Server Registration Mounting Strategy – Explicit mounting: clients make explicit mounting system calls whenever a mount is desired – Boot mounting: a set of file servers is prescribed and all mountings are performed at the client’s boot time – Auto-mounting: mounting of the servers is implicitly done on demand when a file is first opened by a client
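Auto-mounting can be sketched as a lookup table that is consulted on every open and filled in lazily. This is a simplified illustration rather than the behavior of any real automounter; the class `AutoMounter`, the host name `fileserver`, and the map contents are assumptions.

```python
class AutoMounter:
    """Mounts a server the first time a path under its mount point is opened."""

    def __init__(self, auto_map):
        self.auto_map = auto_map  # mount point -> (host, exported directory)
        self.mount_table = {}     # mounts actually performed so far
        self.log = []

    def open(self, path):
        for mount_point, target in self.auto_map.items():
            if path == mount_point or path.startswith(mount_point + "/"):
                if mount_point not in self.mount_table:
                    # First access under this mount point: mount on demand.
                    self.mount_table[mount_point] = target
                    self.log.append(
                        f"mounted {target[0]}:{target[1]} on {mount_point}"
                    )
                host, remote_root = target
                return host, remote_root + path[len(mount_point):]
        return None, path  # not covered by the map: treat as a local file

am = AutoMounter({"/chow/book": ("fileserver", "/OS")})
mounts_before = dict(am.mount_table)
first = am.open("/chow/book/DSM")   # triggers the mount
second = am.open("/chow/book/DFS")  # reuses the existing mount
```

Explicit and boot mounting would fill the same mount table, only at different times: on a client call or once at startup.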
A Simple Automounter for NFS
Server Registration The mounting protocol is not transparent – the initial mounting requires knowledge of the location of file servers Server registration – File servers register their services, and clients consult with the registration server before mounting – Clients broadcast mounting requests, and file servers respond to client’s requests
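Both registration approaches can be sketched in a few lines. The class `RegistrationServer`, the function `broadcast_mount_request`, and the host names are hypothetical; a real system would exchange these messages over the network.

```python
class RegistrationServer:
    """Toy registry in which file servers advertise exported directories."""

    def __init__(self):
        self._exports = {}  # exported directory -> list of hosts

    def register(self, export, host):
        self._exports.setdefault(export, []).append(host)

    def lookup(self, export):
        return list(self._exports.get(export, []))

def broadcast_mount_request(export, servers):
    """Alternative without a registry: every server that exports the
    requested directory answers the broadcast."""
    return [host for host, exports in servers.items() if export in exports]

# Approach 1: servers register, and clients consult the registry before mounting.
registry = RegistrationServer()
registry.register("/OS", "serverA")
registry.register("/OS", "serverB")
hosts = registry.lookup("/OS")

# Approach 2: the client broadcasts and collects the responses.
responders = broadcast_mount_request(
    "/OS", {"serverA": {"/OS"}, "serverC": {"/home"}}
)
```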
Outline: Files and File Systems; File Mounting and Server Registration; Stateful and Stateless File Servers; File Access and Semantics of Sharing; Example and Research
Stateful and stateless File Servers State information: opened files and their clients; file descriptors and file handles; current file position pointers; mounting info; lock status; session keys; cache or buffer
Stateful and stateless File Servers A file server is called stateful if it maintains internally some of the state information and stateless if it maintains none at all. Stateless file server – When a client sends a request to a server, the server carries out the request, sends the reply, and then removes from its internal tables all information about the request – Between requests, no client-specific information is kept on the server – Each request must be self-contained: full file name, offset, and so on. Stateful file server – File servers maintain state information about clients between requests
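The difference shows up directly in the request interface. In the sketch below, which uses hypothetical class names and an in-memory file store, the stateless server needs the file name and offset in every request, while the stateful server remembers the open file and its position pointer between requests.

```python
FILES = {"/notes.txt": b"distributed file systems"}  # hypothetical file store

class StatelessServer:
    """Keeps nothing between requests: each read is self-contained."""

    def read(self, name, offset, count):
        return FILES[name][offset:offset + count]

class StatefulServer:
    """Keeps an open-file table with a position pointer per handle."""

    def __init__(self):
        self.open_files = {}  # handle -> [file name, current position]
        self._next_handle = 1

    def open(self, name):
        handle = self._next_handle
        self._next_handle += 1
        self.open_files[handle] = [name, 0]
        return handle

    def read(self, handle, count):
        name, pos = self.open_files[handle]
        data = FILES[name][pos:pos + count]
        self.open_files[handle][1] = pos + len(data)  # advance the pointer
        return data

    def close(self, handle):
        del self.open_files[handle]

stateless = StatelessServer()
a = stateless.read("/notes.txt", 0, 11)
b = stateless.read("/notes.txt", 11, 5)  # the client tracks the offset itself

stateful = StatefulServer()
h = stateful.open("/notes.txt")
c = stateful.read(h, 11)
d = stateful.read(h, 5)                  # the server tracks the offset
stateful.close(h)
```

If the stateful server crashes, its open-file table is lost and clients must reopen their files; the stateless server has nothing to recover, at the cost of longer requests.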
Comparing
Outline: Files and File Systems; File Mounting and Server Registration; Stateful and Stateless File Servers; File Access and Semantics of Sharing; Example and Research
File Access and Semantics of sharing File sharing means multiple clients can access the same file at the same time. Such sharing results from either overlapping or interleaving access operations. Overlapping access: implies multiple copies of the same file – Space multiplexing of the file – Cache or replication – Coherency control: managing accesses to the replicas, to provide a coherent view of the shared file – Desirable to guarantee the atomicity of updates (to all copies) Interleaving access: due to multiple granularities of data access operations – Time multiplexing of the file – Simple read/write, Transaction, Session – Concurrency control: how to prevent one execution sequence from interfering with the others when they are interleaved and how to avoid inconsistent or erroneous results
Space Multiplexing Remote access: no file data is kept in the client machine. Each access request is transmitted directly to the remote file server through the underlying network. Cache access: a small part of the file data is maintained in a local cache. A write operation or cache miss results in a remote access and an update of the cache. One example: IBM’s DCE Distributed File Service (client caching); see [4, 2006]. Download/upload access: the entire file is downloaded for local accesses. A remote access or upload is performed when updating the remote file
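The cache access mode described above can be sketched as a write-through client cache. The classes `RemoteServer` and `CachingClient` are hypothetical, and the request counter stands in for network traffic; this is not how DCE DFS is implemented.

```python
class RemoteServer:
    """Stand-in for the remote file server; counts the requests it receives."""

    def __init__(self):
        self.blocks = {}
        self.requests = 0

    def read(self, key):
        self.requests += 1
        return self.blocks.get(key, b"")

    def write(self, key, data):
        self.requests += 1
        self.blocks[key] = data

class CachingClient:
    """Keeps a small part of the file data in a local cache."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, key):
        if key not in self.cache:
            # Cache miss: remote access, then update the cache.
            self.cache[key] = self.server.read(key)
        return self.cache[key]

    def write(self, key, data):
        # Write-through: remote access, then update the cache.
        self.server.write(key, data)
        self.cache[key] = data

server = RemoteServer()
server.blocks[("report", 0)] = b"draft"
client = CachingClient(server)

miss = client.read(("report", 0))  # remote access
hit = client.read(("report", 0))   # served from the local cache
requests_after_reads = server.requests
client.write(("report", 0), b"final")
requests_after_write = server.requests
```

With pure remote access every read would reach the server; with download/upload access the whole file would be fetched once and written back later.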
Time Multiplexing Simple RW: each read/write operation is an independent request/response access to the file server Transaction RW: a sequence of read and write operations is treated as a fundamental unit of file access (to the same file) – ACID properties Session RW: a sequence of transaction and simple RW operations
Space and time concurrencies of file accesses
Semantics of sharing Unix semantics: the result of a write is propagated to the file and its copies immediately so that reads will return the “latest” value of the file. No delay of a write is imposed except unavoidable network delays. Subsequent accesses from the client that has issued the write must wait for the write to complete. The primary objective is to maintain currency of the data. Transaction semantics: the results of writes are tentatively stored in working storage and committed permanently only when some consistency constraints are met at the end of a transaction. The primary objective is to maintain consistency of the data. Session semantics: writes to a file are performed on a working copy and the result is made permanent only at the close of the session. The primary objective is to maintain efficiency of data accesses.
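The contrast between Unix and session semantics can be made concrete with a minimal single-process sketch. The classes `SharedFile`, `UnixHandle`, and `SessionHandle` are hypothetical and ignore networking, replicas, and concurrency control.

```python
class SharedFile:
    """The permanent copy of a file shared by several clients."""

    def __init__(self, data=b""):
        self.data = data

class UnixHandle:
    """Unix semantics: a write is visible to every reader immediately."""

    def __init__(self, shared):
        self.shared = shared

    def write(self, data):
        self.shared.data = data

    def read(self):
        return self.shared.data

class SessionHandle:
    """Session semantics: writes go to a working copy until close."""

    def __init__(self, shared):
        self.shared = shared
        self.working = shared.data  # working copy taken at open

    def write(self, data):
        self.working = data

    def read(self):
        return self.working

    def close(self):
        self.shared.data = self.working  # result made permanent at close

shared = SharedFile(b"v1")
session = SessionHandle(shared)
reader = UnixHandle(shared)

session.write(b"v2")
seen_during_session = reader.read()  # the working copy is not visible yet
session.close()
seen_after_close = reader.read()
```

Transaction semantics would add a validation step at commit time, so the working copy is installed only if the consistency constraints hold.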
Outline: Files and File Systems; File Mounting and Server Registration; Stateful and Stateless File Servers; File Access and Semantics of Sharing; Example and Research
DFS Architecture – NFS Example
Deployment of NFS
Configuration Server: 1. Edit /etc/exports to list the directories to share. 2. Start the portmap, nfslock, and nfs services. Run showmount -e to show what is being shared. Client: 1. Start portmap. 2. mount -t nfs IP:/var/nfs /mnt/nfs
New Research 1. P2P file systems. 2. Load balancing for DFS servers. [5, 2006]
References 1. RFC 2. RFC 3. Randy Chow, Theodore Johnson, Distributed Operating Systems and Algorithms, Addison-Wesley. 4. Pierre Boulet, Distributed File Systems, Masters Informatique TIIR et IAGL, September 28, 2006.
References 5. Glagoleva, 2000. A load balancing tool based on mining access patterns for Distributed File System servers. System Sciences, HICSS, Proceedings of the 35th Annual Hawaii International Conference.
Thanks