Published by Mariah James. Modified over 9 years ago.
1
Satish Puri
2
File and File System concept
File Mounting
Stateful/Stateless server concept
Current work and Future work
3
Files are named data objects. Files hold structured data that is used by programs but is not part of the programs themselves.
The file system is responsible for the naming, creation, deletion, retrieval, modification, and protection of files in the system.
Logical components of a file, as seen by users:
◦ File name
◦ File attributes
◦ Data units
4
UNIX files are streams of characters for application programs and sequences of fixed-size logical blocks for the file system.
Both sequential and direct access methods are supported.
Other access methods can be built on top of the flat file structure.
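The two views of a UNIX file described above (a character stream for programs, fixed-size logical blocks for the file system) are connected by simple integer arithmetic. The sketch below is illustrative only: the 4096-byte block size and the function names `locate` and `blocks_for_range` are assumptions, not taken from the slides.

```python
from typing import Tuple

BLOCK_SIZE = 4096  # assumed block size in bytes; real file systems vary


def locate(offset: int, block_size: int = BLOCK_SIZE) -> Tuple[int, int]:
    """Map a byte offset in the stream to (logical block number, offset inside that block)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, block_size)


def blocks_for_range(offset: int, length: int, block_size: int = BLOCK_SIZE) -> range:
    """Return the logical blocks touched when accessing `length` bytes starting at `offset`."""
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)
```

Direct access to byte 5000 therefore lands in logical block 1, while a sequential reader simply advances the offset and crosses block boundaries without noticing them.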
5
Directory service: name resolution, addition and deletion of files
Authorization service: capability and/or access control list
File service
◦ Transaction: concurrency and replication management
◦ Basic: read/write files and get/set attributes
System service: device, cache, and block management
6
Directories are files that contain names and addresses of other files and subdirectories.
◦ Mapping and locating
◦ Search for a file
◦ Create a file
◦ Delete a file
◦ List a directory
◦ Rename a file
◦ Traverse the file system
7
File access must be regulated to ensure security.
Types of access
◦ Read
◦ Write
◦ Execute
◦ Append
◦ Delete
◦ List
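One way to regulate the access types listed above is an access control list that records, per user, which kinds of access are granted. The sketch below is a minimal illustration; the `Access` flag names mirror the slide, while the `ACL` contents, the user names, and `is_allowed` are hypothetical.

```python
from enum import Flag, auto


class Access(Flag):
    """The access types named on the slide, as combinable flags."""
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()
    APPEND = auto()
    DELETE = auto()
    LIST = auto()


# Hypothetical access control list for a single file: user -> granted access.
ACL = {
    "alice": Access.READ | Access.WRITE | Access.APPEND,
    "bob": Access.READ,
}


def is_allowed(acl: dict, user: str, requested: Access) -> bool:
    """Grant the request only if every requested access type is in the user's entry."""
    granted = acl.get(user, Access(0))  # unknown users get no access
    return (granted & requested) == requested
```

A request is checked as a whole, so `is_allowed(ACL, "bob", Access.READ | Access.WRITE)` is refused even though bob may read.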
8
Create
◦ Allocate space
◦ Make an entry in the directory
Write
◦ Search the directory
◦ The write takes place at the location of the write pointer
Read
◦ Search the directory
◦ The read takes place at the location of the read pointer
Reposition within file (file seek)
◦ Set the current file pointer to a given value
Delete
◦ Search the directory
◦ Release all file space
Truncate
◦ Reset the file to length zero
Open(Fi)
◦ Search the directory structure
◦ Move the content of the directory entry to memory
Close(Fi)
◦ Move the content in memory to the directory structure on disk
Get/set file attributes
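The operations above map closely onto the POSIX-style calls exposed by Python's `os` module. The following sketch walks through create, open, write, seek, read, truncate, close, and delete on a scratch file; the function name `demo_file_operations` and the file name `example.dat` are made up for illustration.

```python
import os
import tempfile


def demo_file_operations(directory: str) -> tuple:
    """Exercise the basic file operations on a scratch file inside `directory`."""
    path = os.path.join(directory, "example.dat")  # hypothetical file name

    # Create + open: make a directory entry and obtain a file descriptor.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.write(fd, b"hello world")    # write at the current file pointer
        os.lseek(fd, 6, os.SEEK_SET)    # reposition (seek) to byte 6
        os.write(fd, b"files")          # overwrite the bytes "world"
        os.lseek(fd, 0, os.SEEK_SET)    # rewind before reading
        data = os.read(fd, 100)         # read from the current file pointer
        os.ftruncate(fd, 0)             # truncate: reset the file to length zero
        size_after_truncate = os.fstat(fd).st_size  # get a file attribute
    finally:
        os.close(fd)                    # close: release the descriptor

    os.unlink(path)                     # delete: remove the directory entry
    return data, size_after_truncate, os.path.exists(path)
```

Calling it with a temporary directory, for example `demo_file_operations(tempfile.mkdtemp())`, leaves no file behind.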
9
System services are a file system's interface to the hardware and are transparent to users of the file system.
◦ Mapping of logical to physical block addresses
◦ Interfacing to services at the device level for file space allocation/de-allocation
◦ Actual read/write file operations
◦ Caching for performance enhancement
◦ Replicating for reliability improvement
11
Attach a remote named file system to the client's file system hierarchy at the position pointed to by a path name.
◦ A mounting point is usually a leaf of the directory tree that contains only an empty subdirectory
Once files are mounted, they are accessed by using the concatenated logical path names, without referencing either the remote hosts or local devices.
◦ Location transparency
◦ The linked information (mount table) is kept until the files are unmounted
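The mount table mentioned above can be pictured as a map from mount points to remote locations, consulted on every path lookup. The sketch below is a toy model, not how any particular operating system implements mounting; the class `MountTable`, its methods, and the server and path names in the usage note are all hypothetical.

```python
import posixpath


class MountTable:
    """Toy mount table: mount point -> (server, remote path)."""

    def __init__(self):
        self._mounts = {}

    def mount(self, mount_point: str, server: str, remote_path: str) -> None:
        key = posixpath.normpath(mount_point)
        self._mounts[key] = (server, posixpath.normpath(remote_path))

    def unmount(self, mount_point: str) -> None:
        del self._mounts[posixpath.normpath(mount_point)]

    def resolve(self, path: str) -> tuple:
        """Translate a logical path into (server, path on that server)."""
        path = posixpath.normpath(path)
        best = None
        for mount_point in self._mounts:
            prefix = mount_point.rstrip("/") + "/"
            if path == mount_point or path.startswith(prefix):
                # Prefer the longest (most specific) matching mount point.
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            return ("local", path)  # not under any mount point
        server, remote_root = self._mounts[best]
        rest = path[len(best):].lstrip("/")
        return (server, posixpath.join(remote_root, rest) if rest else remote_root)
```

After `table.mount("/mnt/projects", "fileserver1", "/export/projects")`, a client that opens `/mnt/projects/a/b.txt` never names the server itself; `resolve` supplies it, which is the location transparency the slide refers to.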
12
Different clients may perceive a different file system view.
◦ To achieve a global file system view, the system administrator (SA) enforces mounting rules
Export: a file server restricts/allows the mounting of all or parts of its file system to a predefined set of hosts.
◦ The information is kept in the server's export file
File system mounting:
◦ Explicit mounting: clients make explicit mounting system calls whenever one is desired
◦ Boot mounting: a set of file servers is prescribed and all mountings are performed at the client's boot time
◦ Auto-mounting: mounting of the servers is implicitly done on demand when a file is first opened by a client
13
The mounting protocol is not transparent: the initial mounting requires knowledge of the location of file servers.
Server registration
◦ File servers register their services, and clients consult with the registration server before mounting
◦ Clients broadcast mounting requests, and file servers respond to clients' requests
15
State information
◦ Opened files and their clients
◦ File descriptors and file handles
◦ Current file position pointers
◦ Mounting information
◦ Lock status
◦ Session keys
◦ Cache or buffer
16
Stateful: a file server maintains internally some of the state information.
Stateless: a file server maintains none at all.
Stateful file server: the file server maintains state information about clients between requests.
Stateless file server: when a client sends a request to the server, the server carries out the request, sends the reply, and then removes from its internal tables all information about the request.
◦ Between requests, no client-specific information is kept on the server
◦ Each request must be self-contained: full file name and offset…
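The difference shows up directly in the shape of the read request. In the sketch below, which is a simplified in-memory model with hypothetical class names and a made-up `FILES` table, the stateful server remembers the open file and its position pointer, while the stateless server expects the file name and offset in every request.

```python
FILES = {"/docs/a.txt": b"abcdefghij"}  # hypothetical server-side file contents


class StatefulServer:
    """Keeps per-client state (open handles and file positions) between requests."""

    def __init__(self, files: dict):
        self._files = files
        self._open = {}          # handle -> [path, current position]
        self._next_handle = 1

    def open(self, path: str) -> int:
        if path not in self._files:
            raise FileNotFoundError(path)
        handle = self._next_handle
        self._next_handle += 1
        self._open[handle] = [path, 0]
        return handle

    def read(self, handle: int, count: int) -> bytes:
        path, position = self._open[handle]
        data = self._files[path][position:position + count]
        self._open[handle][1] = position + len(data)  # server advances the pointer
        return data

    def close(self, handle: int) -> None:
        del self._open[handle]


class StatelessServer:
    """Keeps nothing between requests; each request is self-contained."""

    def __init__(self, files: dict):
        self._files = files

    def read(self, path: str, offset: int, count: int) -> bytes:
        return self._files[path][offset:offset + count]
```

With the stateless server, the client must track its own position: two consecutive 4-byte reads become `read(path, 0, 4)` and `read(path, 4, 4)`. If the stateful server loses its `_open` table in a crash, every outstanding handle becomes invalid, whereas the stateless server has nothing to recover.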
19
Overlapping access: multiple copies of the same file
◦ Space multiplexing of the file
◦ Cache or replication
◦ Coherency control: managing accesses to the replicas, to provide a coherent view of the shared file
◦ Desirable to guarantee the atomicity of updates (to all copies)
Interleaving access: multiple granularities of data access operations
◦ Time multiplexing of the file
◦ Simple read/write, Transaction, Session
◦ Concurrency control: how to prevent one execution sequence from interfering with the others when they are interleaved and how to avoid inconsistent or erroneous results
20
Remote access: no file data is kept in the client machine. Each access request is transmitted directly to the remote file server through the underlying network.
Cache access: a small part of the file data is maintained in a local cache. A write operation or cache miss results in a remote access and an update of the cache.
Download/upload access: the entire file is downloaded for local accesses. A remote access or upload is performed when updating the remote file.
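The cache access model described above can be sketched as a client that serves reads from a local cache and goes to the server on a miss or a write. This is a minimal write-through illustration with no eviction and no coherency control; `RemoteServer`, `CachingClient`, and the `requests` counter are hypothetical names introduced only to make remote accesses visible.

```python
class RemoteServer:
    """Stand-in for a remote file server that stores numbered blocks."""

    def __init__(self):
        self.blocks = {}
        self.requests = 0  # counts remote accesses

    def read_block(self, number: int) -> bytes:
        self.requests += 1
        return self.blocks.get(number, b"")

    def write_block(self, number: int, data: bytes) -> None:
        self.requests += 1
        self.blocks[number] = data


class CachingClient:
    """Keeps a small part of the file data locally."""

    def __init__(self, server: RemoteServer):
        self.server = server
        self.cache = {}

    def read(self, number: int) -> bytes:
        if number not in self.cache:
            # Cache miss: remote access, then update the cache.
            self.cache[number] = self.server.read_block(number)
        return self.cache[number]

    def write(self, number: int, data: bytes) -> None:
        # Write: remote access (write-through), then update the cache.
        self.server.write_block(number, data)
        self.cache[number] = data
```

Repeated reads of the same block cost one remote access in total. Pure remote access corresponds to a client with no `cache` at all, and download/upload access to one that fetches every block up front.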
21
Lakshman, A. and Malik, P., Cassandra: a decentralized structured storage system, ACM SIGOPS Operating Systems Review, volume 44, number 2, pages 35-40, 2010
-> Facebook and Twitter use Cassandra (a distributed storage system)
-> Used for inbox search for about 800 million active users
-> The cluster of computers uses regular commodity hardware that is prone to failure
22
Shvachko, K., Kuang, H., Radia, S. and Chansler, R., The hadoop distributed file system, Symposium on Mass Storage Systems and Technologies, pages 1-10, 2010
Borthakur, D., The hadoop distributed file system: Architecture and design, Hadoop Project Website, 2007
-> HDFS is a file system for Hadoop
-> Designed to run on low-cost hardware
-> Highly fault-tolerant and suitable for large data sets
-> Hardware failure is the norm rather than the exception
-> Moving computation is cheaper than moving data
-> Emphasis on high throughput of data
23
Ungureanu, C., Atkin, B., Aranya, A., Gokhale, S., Rago, S., Cakowski, G., Dubnicki, C. and Bohra, A., HydraFS: a high-throughput file system for the HYDRAstor content-addressable storage system, Proceedings of the 8th USENIX conference on File and storage technologies, 2010
-> Content-addressable storage (CAS)
-> Stores information that can be retrieved based on its content, not its storage location
-> HydraFS is built on top of CAS
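The idea of retrieving data by content rather than by location can be shown with a store that uses a hash of each block as its address. This is a generic sketch of content-addressable storage, not HYDRAstor's or HydraFS's actual design; the class `ContentStore` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the address of a block is a hash of its bytes."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # SHA-256 is an assumed choice of hash function for this sketch.
        address = hashlib.sha256(data).hexdigest()
        self._blocks.setdefault(address, data)  # identical content is stored once
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]

    def __len__(self) -> int:
        return len(self._blocks)
```

Writing the same bytes twice returns the same address and keeps a single copy, so duplicate data is eliminated as a side effect of the addressing scheme.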
24
DFS at Exascale
Today (2011): Petascale computing
◦ O(10K) nodes and O(100K) cores
Near future (~2018): Exascale computing
◦ ~1M nodes (100X)
◦ ~1B processor-cores/threads (10000X)
Ioan Raicu, Pete Beckman, Ian Foster, Making a Case for Distributed File Systems at Exascale, ACM Workshop on Large-scale System and Application Performance (LSAP), 2011