Published by Daniella Knight. Modified over 7 years ago.
1
Chapter 11 & 12: File System Interface and Implementation
Modified by Dr. Neerja Mhaskar for CS 3SH3
2
File-System Structure
File system resides on secondary storage (disks).
It provides the user interface to storage, and its algorithms and data structures map the logical file system onto physical secondary storage.
It provides efficient and convenient access to the disk by allowing data to be stored, located, and retrieved easily.
The logical storage unit in a file system is called a file.
I/O transfers are performed in blocks; each block is stored on one or more sectors (a sector is usually 512 bytes).
The file system is organized into layers. Layering is useful for reducing complexity and redundancy, but it adds overhead and can decrease performance.
3
File-System Structure Cont…
Many different types of file systems exist, sometimes several within one operating system, each with its own format.
UFS (Unix File System) – UNIX
Extended file system (ext3 and ext4) – Linux
FAT, FAT32, NTFS – Windows
4
File
A file is associated with the following:
File attributes – name, identifier (e.g. inode number), type, etc.
File operations – create, write, read, delete, truncate, and reposition within a file.
File type – executable (.bin, .exe), text file (.txt, .doc).
File structure / internal file structure – files have an internal structure that OSes and programs use to work with them.
Supporting many different file structures increases the size of the OS and makes it harder to manage and work with. Therefore, Unix treats all files as a sequence of bytes.
All operating systems must support the executable file structure in order to load and run programs.
Disk files are accessed in units of physical blocks (typically 512 bytes). Internally, files are organized in logical units (>= 1 byte).
The number of logical units that fit into one physical block determines the packing. Because the last block of a file is rarely full, packing suffers from internal fragmentation.
5
File System Layers
I/O control level – transfers information between main memory and the disk system.
Consists of device drivers (which manage I/O devices) and interrupt handlers.
Device drivers act as translators: they take high-level commands such as “retrieve block 123” and output low-level, hardware-specific commands to the hardware controller.
Basic file system – issues generic commands to the appropriate device driver to read and write physical blocks on the disk.
Manages memory buffers and caches. Buffers hold data in transit; caches hold frequently used data.
6
File System Layers (Cont.)
File organization module
Keeps track of files, their logical blocks, and their physical blocks.
Translates logical block addresses to physical block addresses.
Manages free space and disk allocation.
Logical file system
Manages metadata (all information related to the file-system structure except the actual data).
Manages the directory structure.
Maintains file structure via file-control blocks. A file-control block (FCB) (an inode in Unix file systems) contains all the information about the file (e.g. ownership, permissions, and location of the file contents).
Is also responsible for protection.
7
Disk Structure
A disk can be used in its entirety for a file system.
Alternatively, a physical disk can be broken up into multiple partitions, slices, or mini-disks, each of which becomes a virtual disk and can have its own file system (or be used for raw storage, swap space, etc.).
Or, multiple physical disks can be combined into one volume, i.e. a larger virtual disk, with its own file system spanning the physical disks.
Each volume that contains a file system also tracks that file system’s information in a device directory or volume table of contents.
8
A Typical File-system Organization
9
Directories
Directories enable segregating files into groups and managing them as groups.
Operations performed on a directory: create, delete, rename, and search for a file; list the directory; traverse the file system.
Unix treats directories and files as the same kind of entity, but Windows treats them as different entities.
Various schemes exist for defining the logical structure of a directory: single-level, two-level, tree-structured, etc.
10
Directory Structure
A collection of nodes containing information about all files.
Both the directory structure and the files reside on disk.
[Figure: a directory whose nodes point to files F1, F2, F3, F4, …, Fn]
11
Directory Organization - Single-Level Directory
A single directory for all users
Naming problem
Grouping problem
12
Directory Organization - Two-Level Directory
Separate directory for each user
A path name is needed to access another user’s files
Different users can have files with the same name
Efficient searching
No grouping capability
13
Directory Organization - Tree-Structured Directories
Allows users to create their own subdirectories and to organize their files
Efficient searching and grouping capability
Most common
14
Directory Organization - Acyclic-Graph Directories
Have shared subdirectories and files
15
Directory Organization - General Graph Directory
The issue with acyclic-graph directories is cycle detection: the system must check that a new link does not create a cycle. General graph directories allow cycles.
16
Directory Implementation
Linear list of file names with pointers to the data blocks
Simple to program
Time-consuming to execute, because every lookup is a linear search
Hash table – a linear list combined with a hash data structure
Decreases directory search time
Issue: collisions – situations where two file names hash to the same location
Resolve collisions by maintaining each hash entry as a linked list and adding each new entry to that list
Only good if entries are fixed size
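The collision-handling idea can be sketched in a few lines of Python. This is only an illustration, not how any particular file system does it: the class name `HashedDirectory`, the toy hash function (sum of byte values), and the use of a Python list as the "linked list" in each bucket are all assumptions made for the example.

```python
class HashedDirectory:
    """Toy directory mapping file names to inode numbers (hypothetical)."""

    def __init__(self, num_buckets=8):
        # One chain (here a Python list) per hash-table slot.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, name):
        # Deliberately simple hash so collisions are easy to provoke:
        # names with the same bytes in a different order collide.
        return sum(name.encode("utf-8")) % len(self.buckets)

    def add(self, name, inode):
        chain = self.buckets[self._index(name)]
        for i, (existing, _) in enumerate(chain):
            if existing == name:
                chain[i] = (name, inode)  # overwrite an existing entry
                return
        chain.append((name, inode))       # collision or empty slot: extend chain

    def lookup(self, name):
        # Only the one chain is searched, not the whole directory.
        for existing, inode in self.buckets[self._index(name)]:
            if existing == name:
                return inode
        return None
```

With this toy hash, `"ab"` and `"ba"` land in the same slot, yet both remain retrievable because the slot holds a chain rather than a single entry.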
17
File System Mounting
Just as a file must be opened before it is used, a file system must be mounted before it is available on the system.
More specifically, the directory structure may be built out of multiple volumes, which must be mounted to make them available within the file-system name space.
An unmounted file system (i.e., Fig. (b)) is mounted at a mount point: the location within the file structure where the file system is to be attached. In UNIX the mount point is a directory (typically empty).
18
Partitions and Mounting
A partition can be a volume containing a file system (“cooked”) or raw – just a sequence of blocks with no file system.
The boot block can point to the boot volume, or to a boot loader: a set of blocks that contain enough code to know how to load the kernel from the file system.
It can also point to a boot-management program for multi-OS booting.
The root partition contains the OS; other partitions can hold other OSes, other file systems, or be raw.
The root partition is mounted at boot time; other partitions can be mounted automatically or manually.
At mount time, file-system consistency is checked: is all metadata correct?
If not, fix it and try again.
If yes, add the file system to the mount table and allow access.
19
Question: What problems could occur if a system allowed a file system to be mounted simultaneously at more than one location?
20
File-System Implementation
The file-system implementation consists of three major layers:
First layer – the file-system interface, based on the open(), read(), write(), and close() calls and on file descriptors.
Second layer – the virtual file system (VFS) layer.
Third layer – the layer implementing the file-system type or the remote-file-system protocol.
21
File-System Implementation – First layer
Several on-disk and in-memory structures are used to implement a file system (in particular, to implement the file-system operations).
On-disk structures:
Boot control block (boot block in Unix) – one per volume. Contains the information needed by the system to boot an OS from that volume. Needed only if the volume contains an OS, and is usually the first block of the volume.
Volume control block (superblock in UNIX) – one per volume. Contains volume (or partition) details (e.g. total number of blocks, number of free blocks, block size, free-block pointers or array).
Directory structure – one per file system. Used to organize the files. In a Unix file system, it contains file names and their associated inode numbers.
22
File-System Implementation (Cont.)
File Control Block (FCB) – one per file. Contains details about the file (e.g. inode number, permissions, size, dates).
[Figure: a typical file-control block]
23
In-Memory File System Structures
In-memory structures are used for both file-system management and performance improvement via caching.
Mount table – contains information about each mounted volume.
In-memory directory-structure cache – holds the directory information of recently accessed directories.
System-wide open-file table – contains a copy of the FCB of each open file, and tracks the number of processes that have the file open.
Per-process open-file table – contains a pointer to the appropriate entry (for the file) in the system-wide open-file table, plus some other fields.
Buffers – hold file-system blocks while they are being read from disk or written to disk.
24
In-Memory File System Structures Cont…
open(filename) – the system call returns a pointer to the appropriate entry (matching the filename) in the per-process open-file table. All subsequent file operations use this pointer. UNIX refers to it as a file descriptor (an integer value).
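On a UNIX-like system the descriptor can be observed directly. The sketch below uses Python's `os` module, which wraps the underlying open(), read(), and close() calls; the helper name `read_via_descriptor` is made up for this example, and the path passed to it is assumed to name an existing readable file.

```python
import os

def read_via_descriptor(path, nbytes):
    """Open `path`, read up to `nbytes` bytes, and return (fd, data)."""
    fd = os.open(path, os.O_RDONLY)   # returns a small non-negative integer
    try:
        # Every later operation names the file by descriptor, not by path.
        data = os.read(fd, nbytes)
        return fd, data
    finally:
        os.close(fd)                  # releases the per-process table entry
```

The returned `fd` is only an index that was valid while the file was open; after `os.close` the same number may be handed out again by a later `os.open`.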
25
In-Memory File System Structures
In-memory file-system structures. (a) File open. (b) File read
26
Virtual File Systems (VFS) – Second Layer
VFS provides a common interface to multiple different file-system types.
Thus, it separates file-system-generic operations from their implementation details.
It provides a unique identifier (vnode) for files across the entire name space, including across file systems of different types. (Note that UNIX inodes are unique only within a single file system.)
Therefore, it provides a mechanism for uniquely representing a file throughout a network.
27
Virtual File Systems (Cont.)
28
Access Methods
The information in a file can be accessed in several ways.
Sequential access: information in the file is processed in order, one record after the other.
Direct access: information in the file is accessed in no particular order.
29
Allocation Methods - Contiguous
An allocation method refers to how disk blocks are allocated for files. Three allocation methods are in common use: contiguous, linked, and indexed.
Contiguous allocation – each file occupies a set of contiguous blocks.
Simple – only the starting location (block #) and length (number of blocks) are required.
Problems include finding space for a file, knowing the file size in advance, external fragmentation, and the need for compaction, either off-line (downtime) or on-line.
Many newer file systems (e.g., Veritas File System) use a modified contiguous allocation scheme.
Extent-based file systems allocate disk blocks in extents (contiguous runs of disk blocks). Extents, rather than individual blocks, are the unit of allocation, and a file consists of one or more extents.
30
Contiguous Allocation
Mapping from logical address to physical address
Block size = $512 = 2^9$ bytes; LA = logical address
$Q = \lfloor LA / 512 \rfloor$ (integer division)
$R = LA \bmod 512$ (remainder)
Block to be accessed = Q + starting address
Displacement into block = R
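The mapping is a single integer division. A minimal Python sketch follows; the function name `contiguous_map` and the bounds check against the file's length are additions for illustration, while the 512-byte block size comes from the slide.

```python
BLOCK_SIZE = 512  # bytes per block, as on the slide

def contiguous_map(logical_address, start_block, length_blocks):
    """Return (physical block, displacement) for a contiguously allocated file."""
    if logical_address < 0:
        raise ValueError("logical address must be non-negative")
    q, r = divmod(logical_address, BLOCK_SIZE)  # Q and R from the slide
    if q >= length_blocks:
        raise ValueError("address lies beyond the end of the file")
    return start_block + q, r
```

For a file that starts at block 20, logical address 1300 gives Q = 2 and R = 276, so the byte lives at displacement 276 in block 22.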
31
Allocation Methods - Linked
Linked allocation – each file is a linked list of blocks.
Each block contains a pointer to the next block; the file ends at a nil pointer.
No external fragmentation, therefore no need for compaction.
The free-space management system is called when a new block is needed.
Efficiency can be improved by clustering blocks into groups, but this increases internal fragmentation.
Reliability can be a problem (a pointer may be lost or damaged).
Locating a block can take many I/Os and disk seeks.
32
Linked Allocation
Each file is a linked list of disk blocks; the blocks may be scattered anywhere on the disk.
Pointer = 4 bytes, so a 512-byte block holds 508 bytes of file data.
Mapping: $Q = \lfloor LA / 508 \rfloor$, $R = LA \bmod 508$
Block to be accessed is the Qth block in the linked chain of blocks representing the file.
Displacement into block = R + 4 (the offset skips the 4-byte pointer).
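The cost of random access under linked allocation shows up clearly in code: reaching the Qth block requires Q pointer hops. In the hedged sketch below, the disk is modelled as a dictionary `next_block` from block number to the next block number (or `None` at end of file); that representation and the function name `linked_map` are assumptions for the example, and the pointer is assumed to sit in the first 4 bytes of each block.

```python
BLOCK_SIZE = 512
POINTER_SIZE = 4
DATA_PER_BLOCK = BLOCK_SIZE - POINTER_SIZE  # 508 bytes of file data per block

def linked_map(logical_address, first_block, next_block):
    """Return (physical block, displacement) by walking the chain."""
    if logical_address < 0:
        raise ValueError("logical address must be non-negative")
    q, r = divmod(logical_address, DATA_PER_BLOCK)
    block = first_block
    for _ in range(q):               # one hop (one disk read) per block skipped
        block = next_block[block]
        if block is None:
            raise ValueError("address lies beyond the end of the file")
    return block, r + POINTER_SIZE   # skip over the pointer field
```

For the chain 9 → 16 → 1, logical address 1100 gives Q = 2 and R = 84, so the byte is at displacement 88 in block 1, reached only after reading blocks 9 and 16.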
33
Linked Allocation
34
Allocation Methods – FAT
FAT (File Allocation Table) – a variation of linked allocation.
The beginning of the volume holds a table, indexed by block number.
The directory entry contains the block # of the first block of the file.
The table entry indexed by that block # contains the block # of the next block in the file.
The chain continues until it reaches the last block, whose table entry has a special end-of-file value.
An unused block is indicated by a table value of 0.
Much like a linked list, but faster on disk and cacheable.
New block allocation is simple.
Can result in a significant number of disk head seeks unless the FAT is cached.
Random-access time is better than with linked allocation.
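Following a FAT chain touches only the table, never the data blocks, which is why a cached FAT makes random access cheap. The sketch below models the table as a Python list; the value 0 for an unused block is from the slide, whereas the sentinel `END_OF_FILE = -1`, the function name `fat_chain`, and the convention that block 0 is never a data block are assumptions for illustration.

```python
FREE = 0           # table value for an unused block (from the slide)
END_OF_FILE = -1   # assumed sentinel; real FAT variants use reserved values

def fat_chain(fat, first_block):
    """Return the list of block numbers that make up one file."""
    chain = []
    block = first_block
    while block != END_OF_FILE:
        # A free entry inside a chain, or a chain longer than the table,
        # means the table is corrupt (for example, it contains a loop).
        if block == FREE or len(chain) >= len(fat):
            raise ValueError("corrupt FAT chain")
        chain.append(block)
        block = fat[block]     # table lookup replaces a disk read
    return chain
```

The Qth logical block of a file is then simply `fat_chain(fat, first_block)[q]`, found without reading any of the preceding data blocks.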
35
File-Allocation Table
36
Allocation Methods - Indexed
Indexed allocation
Each file has its own index block(s) of pointers to its data blocks.
The ith entry in the index block points to the ith block of the file.
The directory contains the address of the index block.
[Figure: logical view]
37
Indexed Allocation (Cont.)
Need index table.
Supports random access.
Dynamic access without external fragmentation, but has the overhead of the index block.
The number of entries in the index block depends on the size of the block and the size of the pointer holding block addresses.
If block size = 512 bytes and pointer size = 4 bytes, the index block holds 512/4 = 128 entries.
Mapping from logical to physical: a file of at most 64 KB (128 × 512 bytes) needs only one block for its index table.
$Q = \lfloor LA / 512 \rfloor$ = displacement into index table
$R = LA \bmod 512$ = displacement into block
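With a single index block the lookup is one division plus one table access. The following Python sketch assumes the index block is available as a list of data-block numbers; the name `indexed_map` and the bounds check are illustrative additions, and the sizes are the ones used on the slide.

```python
BLOCK_SIZE = 512
POINTER_SIZE = 4
ENTRIES_PER_INDEX = BLOCK_SIZE // POINTER_SIZE   # 128 pointers per index block

def indexed_map(logical_address, index_block):
    """Return (physical block, displacement) using a single index block."""
    if logical_address < 0:
        raise ValueError("logical address must be non-negative")
    q, r = divmod(logical_address, BLOCK_SIZE)
    if q >= min(len(index_block), ENTRIES_PER_INDEX):
        raise ValueError("address lies beyond the end of the file")
    return index_block[q], r   # Q indexes the table, R indexes the block
```

Unlike the linked scheme, the cost does not grow with Q: any block of the file is one table access away. The maximum file size here is `ENTRIES_PER_INDEX * BLOCK_SIZE` = 65,536 bytes.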
38
Example of Indexed Allocation
39
Indexed Allocation Cont…
What happens if a file is so large that the index block runs out of space to hold data-block addresses? Various schemes deal with this:
Linked scheme – link together several index blocks. The last entry of an index block either holds the address of the next index block (for a large file) or is NULL (for a small file).
Multilevel scheme – a variant of the linked representation that uses a first-level index block to point to a set of second-level index blocks, which in turn point to the file blocks. This approach can be continued to a third, fourth, or n-th level, depending on the desired maximum file size.
Two-level index: a 4 KB block stores 1,024 four-byte pointers, so the outer index reaches $1{,}024 \times 1{,}024 = 1{,}048{,}576$ data blocks and a file size of up to 4 GB.
Combined scheme – uses direct blocks (addresses of the data blocks) and indirect blocks (addresses of the index blocks).
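The two-level arithmetic can be checked with a short Python sketch. The function name `two_level_indices` is made up for the example; the 4 KB block and 4-byte pointer are the figures quoted on the slide.

```python
BLOCK_SIZE = 4096
POINTER_SIZE = 4
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE      # 1,024 pointers per index block

MAX_DATA_BLOCKS = PTRS_PER_BLOCK * PTRS_PER_BLOCK
MAX_FILE_SIZE = MAX_DATA_BLOCKS * BLOCK_SIZE

def two_level_indices(logical_address):
    """Split an address into (outer index, inner index, byte offset)."""
    if not 0 <= logical_address < MAX_FILE_SIZE:
        raise ValueError("address outside the two-level index range")
    block_no, offset = divmod(logical_address, BLOCK_SIZE)
    outer, inner = divmod(block_no, PTRS_PER_BLOCK)
    return outer, inner, offset
```

`MAX_DATA_BLOCKS` evaluates to 1,048,576 and `MAX_FILE_SIZE` to 4 GiB; a lookup reads the outer index block, then one inner index block, then the data block.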
40
Indexed Allocation – Multi-Level Scheme
41
Combined Scheme: UNIX UFS
42
Performance
The best method depends on the file access type.
Contiguous – great for sequential and random access, for relatively small files. However, it suffers from external fragmentation.
Linked – good for sequential access to large files, but inefficient for random access.
Indexed – more complex. Good for both sequential and random access, for large files.
Declare the access type at creation -> select either contiguous or linked.
43
Performance (Cont.)
Adding instructions to the execution path to save one disk I/O is reasonable.
Intel Core i7 Extreme Edition 990x (2011) at 3.46 GHz = 159,000 MIPS
Typical disk drive at 250 I/Os per second: 159,000 MIPS / 250 = about 636 million instructions during one disk I/O
Fast SSDs provide 60,000 IOPS: 159,000 MIPS / 60,000 = 2.65 million instructions during one disk I/O
44
Free-Space Management
File system maintains a free-space list to track available blocks/clusters.
Bit vector or bit map over n blocks, numbered 0, 1, 2, …, n-1:
bit[i] = 1 if block[i] is free, 0 if block[i] is occupied
Simple, and easy to find the first free block or n consecutive free blocks.
Needs to be kept in memory for efficiency, and also written to disk occasionally – why?
45
Free-Space Management (Cont.)
The bit vector is inefficient in that it requires a lot of space.
Example: block size = 4 KB = $2^{12}$ bytes; disk size = $2^{40}$ bytes (1 terabyte)
$n = 2^{40} / 2^{12} = 2^{28}$ bits (or 32 MB)
With clusters of 4 blocks -> 8 MB of memory
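Both the search operations and the space estimate are easy to express in Python. This is a sketch under simplifying assumptions: the bitmap is a plain list of 0/1 integers rather than packed machine words, and the helper names are invented for the example. The convention 1 = free follows the slides.

```python
def first_free(bitmap):
    """Index of the first free block (bit == 1), or None if the disk is full."""
    for i, bit in enumerate(bitmap):
        if bit == 1:
            return i
    return None

def first_free_run(bitmap, n):
    """Start index of the first run of n consecutive free blocks, or None."""
    run_start, run_len = None, 0
    for i, bit in enumerate(bitmap):
        if bit == 1:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == n:
                return run_start
        else:
            run_len = 0
    return None

def bitmap_bytes(disk_bytes, block_bytes, blocks_per_cluster=1):
    """Size of the bitmap in bytes: one bit per block (or per cluster)."""
    units = disk_bytes // (block_bytes * blocks_per_cluster)
    return units // 8
```

`bitmap_bytes(2**40, 2**12)` gives 32 MB and `bitmap_bytes(2**40, 2**12, 4)` gives 8 MB, matching the figures above.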
46
Linked Free Space List on Disk
Linked list – keeps track of all free blocks. Traversing the list or finding a contiguous run of a given size is not easy (each block must be read, which requires substantial I/O time), but fortunately these operations are not needed frequently. Generally the system just adds and removes single blocks at the beginning of the list.
47
Free-Space Management (Cont.)
Grouping
Modify the linked list to store the addresses of the next n-1 free blocks in the first free block, plus a pointer to the next block that contains free-block pointers.
Counting
Space is frequently used and freed contiguously, with contiguous allocation, extents, or clustering.
So keep the address of the first free block and a count of the free blocks that follow it.
The free-space list then has entries containing addresses and counts.
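The counting representation can be illustrated by compressing a list of free block numbers into (address, count) entries. In this Python sketch the function name `to_extents` is hypothetical, and the count is taken to include the first block of each run, which is one possible convention rather than the only one.

```python
def to_extents(free_blocks):
    """Compress free block numbers into (first block, run length) entries."""
    extents = []
    for block in sorted(set(free_blocks)):
        if extents and extents[-1][0] + extents[-1][1] == block:
            start, count = extents[-1]
            extents[-1] = (start, count + 1)   # block extends the current run
        else:
            extents.append((block, 1))         # block starts a new run
    return extents
```

Seven free blocks 2, 3, 4, 5, 8, 9, and 17 collapse to three entries: (2, 4), (8, 2), and (17, 1). The more contiguous the free space, the shorter the list.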
48
Free-Space Management (Cont.)
Space Maps
Sun's ZFS file system was designed for HUGE numbers and sizes of files, directories, and even file systems. The resulting data structures could be VERY inefficient if not implemented carefully.
ZFS divides device space into metaslab units and manages the metaslabs. A given volume can contain hundreds of metaslabs.
Each metaslab has an associated space map.
The space map uses the counting technique, but records the information to a log file rather than to the file system.
An in-memory space map is constructed from the log data using a balanced-tree data structure.
The combination of the in-memory tree and the on-disk log provides very fast and efficient management of these very large files and free blocks.
49
Sharing and Protection
Sharing of files on multi-user systems is desirable. Enabling it means storing more information about each file.
Sharing may be done through a protection scheme.
Files must be kept safe for reliability (against accidental damage) and protection (against deliberate malicious access).
Types of access: Read, Write, Execute, Append, Delete, and List.
UNIX uses a set of 9 access-control bits, in three groups of three. These correspond to R (read), W (write), and X (execute) permissions for each of the Owner, Group, and Others.
In addition to the above, Unix has other special bits to control access.
50
Access Lists and Groups in Linux
Mode of access: R (read), W (write), and X (execute).
Three classes of users on Unix / Linux: Owner, Group, and Others.
Each class of user is given RWX permissions using bits (0 = deny access, 1 = grant access).
File/directory permissions are set using the chmod command.
Example: suppose we want to set the permissions below on a file named test.txt:
                      R W X
a) owner access    7  1 1 1
b) group access    6  1 1 0
c) public access   1  0 0 1
The command in UNIX is: chmod 761 test.txt
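The octal digits are just three 3-bit groups, which a few lines of Python can decode. The helper name `mode_to_rwx` is invented for this sketch; it handles only the nine basic permission bits and ignores the special bits.

```python
def mode_to_rwx(mode):
    """Render the nine permission bits of `mode` as an rwx string."""
    out = []
    for shift in (6, 3, 0):                  # owner, group, others
        bits = (mode >> shift) & 0b111       # isolate one 3-bit group
        out.append("r" if bits & 0b100 else "-")
        out.append("w" if bits & 0b010 else "-")
        out.append("x" if bits & 0b001 else "-")
    return "".join(out)
```

`mode_to_rwx(0o761)` returns `"rwxrw---x"`: full access for the owner, read and write for the group, execute only for others. In Python the same mode could be applied to a file with `os.chmod(path, 0o761)`.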
51
Question: What access rights does the command chmod 751 test.txt specify on the file test.txt?
52
End of Chapter 11 & 12