Chapter 11: File System Implementation

Topics: File-System Structure; File-System Implementation; Directory Implementation; Allocation Methods.
Chapter 11.2: Free-Space Management; Recovery; Log-Structured File System.
Chapter 12 (an introduction): Overview of Mass Storage; Disk; Magnetic Tapes.

Objectives
Here we discuss secondary storage, namely the disk, which holds files permanently if desired. There are many issues in file storage and access, so we need to understand how files are structured, how they are accessed, how disk space is allocated and freed when no longer needed, and how other parts of the operating system interface to secondary storage. Essentially, this is a chapter about implementation.

File-System Structure
Unlike tapes and other storage devices, data on disk can be rewritten in place. Also unlike sequential storage media such as magnetic tape, a disk lets us access any file, and any record within a file, either sequentially or randomly.
Data transfer: as discussed, input and output to and from a disk is performed in blocks. This is a hardware constraint, although we have some control over it, as we shall discuss. Block size can vary considerably, from a very small block (the book offers 32 bytes) to thousands of bytes (I have personally seen blocks of bytes).
A disk must be organized, and it typically (not always) has a file system that provides this organization to facilitate storage and retrieval.

File System Structure - Layers
Like many other topics we study, such as communications protocols, the file system is layered: each layer builds on the features of the layers below it while hiding their details. The layers start from primitive operations and features.
Lowest level: I/O control, which consists of device drivers and interrupt handlers. Device drivers are very low-level programs made up of specialized commands that transfer data between primary memory and (typically) a disk or other storage device. Inputs to a device driver are relatively high level, such as 'retrieve block 123', but its outputs are very specific instructions to the I/O control unit specifying which device to act on and what to do (read, write, etc.). These instructions are interpreted by the hardware; they are low-level operations of the Test-and-Set kind that deal with timing, checking whether the device is busy, and more.

Structure of a Layered File System
Inputs to the device drivers are, as stated, relatively high level. The basic file system issues requests to read or write a physical block by supplying its numeric disk address, such as drive 1, cylinder 73, track 2, sector 10. Of course, this raises the question: where does the basic file system get that numeric disk address?
The file-organization module knows the location of the file, its type of organization and allocation, and the file's logical and physical blocks. It translates (maps) logical block addresses into physical block numbers, which reflect how the disk is organized, and passes them to the basic file system.
Given a symbolic file name, such as junk.txt, the logical file system (LFS) manages the directory structure and supplies the information the file-organization module needs for its mappings. The LFS maintains file structure via file control blocks, which contain file information such as ownership, permissions, and location.
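To make the last translation step concrete, here is a minimal sketch of turning a linear physical block number into a cylinder/track/sector address. The function name, the 0-based numbering, the one-block-per-sector simplification, and the geometry (4 tracks per cylinder, 32 sectors per track) are all assumptions chosen for illustration; real drivers and disks differ.

```python
from typing import NamedTuple


class DiskAddress(NamedTuple):
    cylinder: int
    track: int
    sector: int


def block_to_address(block: int, tracks_per_cylinder: int,
                     sectors_per_track: int) -> DiskAddress:
    """Map a linear physical block number to (cylinder, track, sector).

    Assumes one block per sector and 0-based numbering for all fields.
    """
    if block < 0:
        raise ValueError("block number must be non-negative")
    blocks_per_cylinder = tracks_per_cylinder * sectors_per_track
    cylinder, offset = divmod(block, blocks_per_cylinder)
    track, sector = divmod(offset, sectors_per_track)
    return DiskAddress(cylinder, track, sector)


# Hypothetical geometry: 4 tracks per cylinder, 32 sectors per track.
print(block_to_address(9418, tracks_per_cylinder=4, sectors_per_track=32))
```

With this assumed geometry, block 9418 lands on cylinder 73, track 2, sector 10, the address used as the example above.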

A Typical File Control Block
A file control block (FCB) is a file system structure in which the state of an open file is maintained (Wikipedia). A full FCB is generally about 36 bytes long. The meaning of the individual fields differs from one OS to another. The drawing below is from your book. Note the next slide:

Another format of a File Control Block
Offset (hex)  Size      Contents
00            1 byte    Drive number: 0 for default, 1 for A:, 2 for B:, ...
01            8 bytes   File name (with the file type below, this forms the full file name)
09            3 bytes   File type
0C            20 bytes  Implementation dependent; should be initialized to zero before the FCB is opened
20            1 byte    Record number in the current section of the file; used when performing sequential access
21            3 bytes   Record number to use when performing random access
Note this is 36 bytes!
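As a rough illustration, the layout above can be described with Python's struct module. This is a sketch under assumptions: the last field is taken to be 3 bytes (which is what the 36-byte total implies), names are assumed to be space-padded ASCII, and the random-access record number is assumed to be little-endian. The helper name parse_fcb is hypothetical.

```python
import struct

# Field sizes follow the layout above; "<" means no padding between fields.
# drive(1) name(8) type(3) reserved(20) current record(1) random record(3)
FCB_FORMAT = "<B8s3s20sB3s"


def parse_fcb(raw: bytes) -> dict:
    """Decode one 36-byte FCB into a dictionary of named fields."""
    if len(raw) != struct.calcsize(FCB_FORMAT):
        raise ValueError("an FCB in this layout is exactly 36 bytes")
    drive, name, ftype, _reserved, cur_rec, rand_rec = struct.unpack(FCB_FORMAT, raw)
    return {
        "drive": drive,
        "name": name.decode("ascii").rstrip(),      # assumed space-padded
        "type": ftype.decode("ascii").rstrip(),
        "current_record": cur_rec,
        "random_record": int.from_bytes(rand_rec, "little"),  # assumed byte order
    }


# Build a sample FCB for JUNK.TXT on drive A: and decode it again.
raw = struct.pack(FCB_FORMAT, 1, b"JUNK    ", b"TXT", bytes(20), 0,
                  (5).to_bytes(3, "little"))
print(len(raw), parse_fcb(raw))
```

The field sizes add up to 1 + 8 + 3 + 20 + 1 + 3 = 36 bytes, matching the note on the slide.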

Layered File Systems
The nice thing about a layered file system is that other file systems on the same computing system can share the same low-level modules, such as the basic file system code and the I/O control, since these are very low level, machine dependent, and generally file-system independent. This is important because most operating systems support more than one file system. Other devices, such as CD-ROMs, have different kinds of file systems that use standard formats to which CD-ROM manufacturers subscribe. The same goes for jump drives, etc. All operating systems, however, have at least one disk-based file system. Unix uses its own: the Unix File System (UFS).

Some Standard File System Implementation Terminology
We know that applications use system calls such as open() and close() in order to access data in files. Now we need to look at the structures and operations needed to implement all this. A number of data structures and algorithms are needed to implement a file system, and these are very OS-dependent, but there are some common definitions that must be understood. On disk, the file system must certainly contain information on how to boot the operating system stored there, its size in bytes, the number and location of blocks, the directory structure, and many supporting files that are not necessarily part of the kernel. Consider the following:

Some Standard File System Implementation Terminology
Boot control block (per volume): normally the first physical block of a volume. It contains the information the system needs to boot the operating system from this volume. If the volume holds no operating system, this first block is typically empty.
Volume control block (per volume): describes the number of blocks, the size of the blocks, the free-block count and free-block pointers, and the free-FCB count and FCB pointers. (All this also applies to partitions, if the volume is partitioned.)
Directory: a directory structure is used to organize the files in the file system.
Per-file FCB: the FCB, as shown, contains many details about the file, such as permissions, ownership, size, and the location of its data blocks. In the Unix file system (UFS), the FCB is called the inode.

In the Unix World: FCBs in the Unix world are called inodes, as stated
The directory structure in the Unix file system holds file names and inode numbers.
Inode: in Unix, the inode is an element of an on-disk array (called the i-list); one is allocated to every unique file at the time it is created. An inode contains the file's attributes (file size in bytes, etc.). When a file is opened for an operation (such as read), its inode is copied from disk into a slot in an inode table kept in main memory so that the file's attributes can be accessed quickly.
Inode number: a two-byte index value into the i-list (or inode table), used to access the inode for a file.
Inode table: the table (array) of inodes in main memory that holds the inodes of all open files.
Let's now look at the general case of the in-memory activities and data structures that work together to facilitate input and output.

In-Memory File System Structures
A number of data structures are needed for managing the file system (and for improving performance via caching!):
The in-memory mount table contains information on each mounted volume; it records the active file systems.
The in-memory directory-structure cache holds the directory information of recently accessed directories. This cache is essential for good performance.
The system-wide open-file table contains a copy of the FCB for each open file, and more.
The per-process open-file table contains, among other items, a pointer to the appropriate entry in the system-wide open-file table for each file the process has open.

In-Memory File System Structures Overview
Consider creating a new file. Recall that in our layered file system an application program calls the logical file system, via system calls, when it wants to create a file. The system calls pass the file name to the logical file system (LFS), along with other information such as the desired type of access.
The LFS, which knows the format of the directory structures, creates a new FCB for the new file. It then reads the appropriate directory into memory, updates it with the new file name and the FCB it just created, and writes it back to the disk. This is shown on the next slide.
Depending on the operation (a read, a write, etc.), the LFS then calls the file-organization module to map the directory I/O requests into disk block numbers. The disk block numbers are passed to the basic file system, as previously discussed, and then on to the I/O control system to be carried out.

In-Memory File System Structures Creating the File
[Figure: in-memory file-system structures, in two panels: creating a file, and reading a file.]

In-Memory File System Structures Reading the file
When a file is opened, an entry is made in the per-process open-file table, with a pointer to the entry in the system-wide open-file table. (See the lower part of the figure.) The open() call returns a pointer to the appropriate entry in the per-process open-file table, and all file operations, such as reads, writes, and closes, are then performed via this pointer.
(As a side note, the file name itself need not be part of the open-file table, as the system has no use for it once the FCB has been located on disk. It could be cached, though, to save time on subsequent opens of the same file, so that the directory search does not have to be repeated.)
The name given to the entry in the per-process table varies: Unix refers to it as a file descriptor; Windows refers to it as a file handle. As long as the file is not closed, all file operations are done through the open-file table.
[Figure: reading a file; the file descriptor / file handle is used here.]

In-Memory File System Structures Closing the File
When a process no longer needs the file and closes it via a system call, the file handle (file descriptor) entry in the per-process table is removed for that process, and the open count in the system-wide entry is decremented by one. If no other processes are using the file, any updated data about the file is copied back to the disk-based directory structure, and the entry in the system-wide open-file table is then removed.
(An important side note: the updates take place in memory, and some file-based parameters (time of access, records in the file, etc.) are not written back to disk until the file is closed. This significantly affects recovery and backup, as we shall see later in this chapter.)
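The open and close bookkeeping described above can be sketched as a toy model. Everything here is an assumption made for illustration: the class and attribute names are hypothetical, the "disk" is a plain dictionary mapping file names to FCBs, and the per-process entry stores the file name as a stand-in for a pointer to the system-wide entry.

```python
class OpenFileTables:
    """Toy model of the system-wide and per-process open-file tables."""

    def __init__(self, directory):
        self.directory = directory   # simulated disk: file name -> FCB (a dict)
        self.system_wide = {}        # file name -> {"fcb": ..., "open_count": n}
        self.per_process = {}        # pid -> {descriptor: file name}

    def open(self, pid, name):
        entry = self.system_wide.get(name)
        if entry is None:
            # First opener: copy the FCB from the simulated on-disk directory.
            entry = {"fcb": dict(self.directory[name]), "open_count": 0}
            self.system_wide[name] = entry
        entry["open_count"] += 1
        table = self.per_process.setdefault(pid, {})
        fd = 0
        while fd in table:           # hand out the lowest unused descriptor
            fd += 1
        table[fd] = name
        return fd

    def close(self, pid, fd):
        name = self.per_process[pid].pop(fd)
        entry = self.system_wide[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # Last closer: write the cached FCB back, then drop the entry.
            self.directory[name] = entry["fcb"]
            del self.system_wide[name]


tables = OpenFileTables({"junk.txt": {"size": 120}})
fd_a = tables.open(pid=1, name="junk.txt")
fd_b = tables.open(pid=2, name="junk.txt")
tables.system_wide["junk.txt"]["fcb"]["size"] = 200  # update is made in memory only
tables.close(1, fd_a)
size_while_open = tables.directory["junk.txt"]["size"]   # on-disk copy is still stale
tables.close(2, fd_b)
size_after_close = tables.directory["junk.txt"]["size"]  # written back at the last close
print(size_while_open, size_after_close)
```

The gap between the two printed sizes is the recovery concern raised in the side note: until the last close, the on-disk copy lags behind the in-memory FCB.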

Partitions and Mounting
As we have said, disks can be sliced and diced in a number of ways. A disk can be partitioned into many parts, and a volume can span multiple partitions on multiple disks. The latter is addressed under RAID in Chapter 12.
A partition can either contain a file system or contain no file system, in which case it is 'raw'. A raw partition is used where no file system is appropriate: Unix swap space often uses a raw partition, and some databases use raw disk to speed up certain database operations. This is discussed in considerable detail in the next chapter.

At boot time, the system does not yet have a file system, so boot information is stored in a separate partition in its own format. Boot information is usually a sequential series of blocks that is loaded into memory as an image, and this image starts at a fixed location, such as the first byte of the first block. (More on this later.)
Dual boots: the boot image can contain more than the instructions for booting one specific operating system; it can hold booting information for multiple operating systems. PCs and other systems can be dual-booted, with multiple operating systems installed on the same machine. A boot loader understands that multiple file systems and multiple operating systems can occupy the boot space.

Last thoughts: At boot time, we also load the OS kernel and other supporting system files for whatever operating system we desire. We can manually mount other partitions later or have them automatically mounted at boot time as well.

11.3 Directory Implementation
So how do we manage the directory, and how do we allocate space for it? How is it organized and accessed?
1. Linear list of file names with pointers to the data blocks. This is the simplest approach: simple to program but time-consuming to execute.
Creating a new file requires a directory search for duplicates; the new entry is then added at the end.
Delete works similarly: we must search for the entry and release its allocated space. To reuse the directory slot, we can move the last directory entry over the entry being deleted, or mark the entry as unused (a logical delete) so that the space can be reused for the next file creation.
Search times can be very poor, especially for a large directory, so some systems use a software cache to store the most recently used directory information.
A sorted list helps: it decreases search time because a binary search can be used, but keeping the directory in sorted order imposes considerable overhead. Some implementations use a B-tree, which keeps the directory sorted without a separate sort step, at the cost of considerable complexity (more later).
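A minimal sketch of the linear-list directory, assuming each entry is just a (file name, first block) pair held in an unsorted Python list. The class and method names are hypothetical. It shows the two points made above: create needs a full search for duplicates, and delete can fill the hole with the last entry.

```python
class LinearDirectory:
    """Unsorted list of (file name, first block) entries."""

    def __init__(self):
        self.entries = []

    def _find(self, name):
        # Linear scan: cost grows with the number of entries.
        for i, (entry_name, _) in enumerate(self.entries):
            if entry_name == name:
                return i
        return -1

    def create(self, name, first_block):
        if self._find(name) != -1:       # duplicate check needs a full search
            raise FileExistsError(name)
        self.entries.append((name, first_block))

    def lookup(self, name):
        i = self._find(name)
        if i == -1:
            raise FileNotFoundError(name)
        return self.entries[i][1]

    def delete(self, name):
        i = self._find(name)
        if i == -1:
            raise FileNotFoundError(name)
        self.entries[i] = self.entries[-1]   # move the last entry into the hole
        self.entries.pop()


d = LinearDirectory()
for name, block in [("a.txt", 1), ("b.txt", 2), ("c.txt", 3)]:
    d.create(name, block)
d.delete("a.txt")
print(d.entries)
```

After the delete, the entry for c.txt occupies the slot that a.txt used to hold, so the list stays compact without shifting every later entry.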

Directory Implementation
2. Hash table: a linear list combined with a hash data structure. A hash algorithm computes a table location from the file name (or other parameters) and returns a pointer to the file name in the linear list.
Pro: hash algorithms are fast, so search time decreases significantly.
Con: hash tables are normally fixed in size. We want to avoid resizing and rebuilding the table when it needs to grow, because the hash function depends on the table size: if the table is enlarged, the function must change and the existing entries must be reorganized into the new table.
Collisions are situations where two file names hash to the same location. There are many methods for handling them. Often a chained-overflow scheme is used, in which each table address is a base entry and colliding entries are linked to it in a singly linked list. Searching a chain takes time, but it is still clearly better than searching the whole linear list.
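The chained approach can be sketched as follows. This is an illustration under assumptions: Python lists stand in for the singly linked chains, the hash function (sum of the name's byte values modulo the table size) is a deliberately simple stand-in for whatever a real file system uses, and the class name is hypothetical.

```python
class HashedDirectory:
    """Fixed-size hash table; collisions are chained in per-bucket lists."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _bucket(self, name):
        # Simple deterministic hash, chosen only so collisions are easy to show.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def create(self, name, first_block):
        chain = self._bucket(name)
        if any(n == name for n, _ in chain):
            raise FileExistsError(name)
        chain.append((name, first_block))

    def lookup(self, name):
        for n, block in self._bucket(name):   # search only one chain
            if n == name:
                return block
        raise FileNotFoundError(name)


d = HashedDirectory()
d.create("ab", 5)
d.create("ba", 9)     # same byte sum as "ab", so it collides and is chained
print(d.lookup("ab"), d.lookup("ba"))
```

Both names land in the same bucket, yet each lookup scans only that short chain rather than the whole directory.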

11.4 Allocation Methods
An allocation method refers to how the disk blocks that store file data (records) are allocated to files. There are three primary approaches:
Contiguous allocation
Linked allocation
Indexed allocation

Contiguous Allocation
Each file occupies a set of contiguous blocks on the disk. The blocks have a linear ordering, so when head movement (a disk seek) is needed at all, it is only to the next track (or cylinder). The number of disk seeks is therefore minimal, since all blocks are kept together.
The directory entry holds only the address of the first block and the number of blocks. This is all that is needed.
File access is very straightforward. For sequential access, the file system keeps track of the address of the last block it referenced and can readily read the next block (see the FCB format). For random access to logical block i of a file that starts at block b, we can go directly to block b + i.
Biggest problem: file growth. Is a totally new space required, or is there some other mechanism? This is covered ahead. Extents may help, but growth remains a significant problem.

Contiguous Allocation of Disk Space
The figure shows the starting block number and the number of blocks for each file. For example, 'count' starts at block 0, and 'mail' starts at block 19 and runs for six blocks. All allocations are contiguous!
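The address calculation for contiguous allocation is a single addition, as this small sketch shows. The function name is hypothetical; the directory entry is assumed to hold only a start block and a length, and logical blocks are numbered from 0.

```python
def contiguous_block(start: int, length: int, i: int) -> int:
    """Disk block holding logical block i of a contiguously allocated file."""
    if not 0 <= i < length:
        raise IndexError("logical block outside the file")
    return start + i


# 'mail' starts at block 19 and is six blocks long.
print(contiguous_block(start=19, length=6, i=3))
```

For mail, logical block 3 is disk block 19 + 3 = 22; asking for logical block 6 or beyond is rejected because the file has only six blocks.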

Contiguous Allocation
Finding space: both first fit and best fit work reasonably well, with first fit generally a bit better. (We will see ahead how the system keeps track of available blocks.) Worst fit is undesirable in terms of both time and storage utilization.
All contiguous allocation schemes suffer from external fragmentation, which can be a major or a minor problem in managing the overall disk resource.
The downside: most installations schedule downtime during periods of low system usage, when the disk can be compacted and the external fragments brought together. This can be done off-line, which is generally best. Users get a warning of impending non-availability (at 3 a.m., say): save your files, the system will not be available for a while. This is also called 'periodic maintenance', 'system saves', and more. More later.
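The two placement strategies can be compared with a short sketch. It assumes free space is tracked as a list of (start, length) holes, which is only one possible representation; the hole list below is made up for the example.

```python
def first_fit(holes, size):
    """Return the start of the first hole with at least `size` blocks, or None."""
    for start, length in holes:
        if length >= size:
            return start
    return None


def best_fit(holes, size):
    """Return the start of the smallest hole that still fits, or None."""
    fitting = [(length, start) for start, length in holes if length >= size]
    return min(fitting)[1] if fitting else None


# Hypothetical free runs, as (start block, length in blocks).
holes = [(2, 4), (9, 10), (25, 3), (30, 5)]
print(first_fit(holes, 3), best_fit(holes, 3))
```

For a 3-block request, first fit stops at the hole starting at block 2 (wasting one block of it), while best fit scans every hole and picks the exact-size hole at block 25. That full scan is why best fit tends to cost more time per allocation.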

Extent-Based Systems
How much space is needed for the file? Oftentimes we do not know, and often a file cannot be extended in place. So, what to do?
We can take the system offline, allocate more space, and then restart the system. This is very costly in run time.
We can overestimate the required space. This can be very wasteful, especially if the extra space requested is never used.
We can find a larger space, copy the file into it, and release the old space. But this involves down time, possibly rerunning a process, and other management considerations.
Some systems use extent-based file systems, which allocate disk blocks in extents. An extent is a contiguous run of disk blocks. A file consists of a basic allocation plus one or more extents, which are linked in as needed.
IBM uses a SPACE parameter: a process requests an original allocation of, say, 10 tracks and 2 possible extents of one track each. Ten tracks are allocated, and two are held in reserve and used if needed.
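A sketch of how a logical block number might be resolved when a file is a basic allocation plus extents. The representation (a list of (start, length) runs in file order, measured in blocks rather than tracks) and all the numbers are assumptions for illustration; this is not how any particular IBM system stores its extent information.

```python
def extent_block(extents, i):
    """Map logical block i to a disk block for a file stored as a list of
    (start, length) extents, given in file order."""
    if i < 0:
        raise IndexError("logical block must be non-negative")
    for start, length in extents:
        if i < length:
            return start + i     # block lies inside this extent
        i -= length              # skip past this extent
    raise IndexError("logical block beyond the last extent")


# Hypothetical file: a 10-block primary allocation plus two 1-block extents.
file_extents = [(100, 10), (340, 1), (512, 1)]
print(extent_block(file_extents, 9), extent_block(file_extents, 10))
```

Logical block 9 is the last block of the primary allocation (disk block 109), and logical block 10 spills into the first extent (disk block 340). Within each extent the arithmetic is the same as for contiguous allocation.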

Linked Allocation
Linked allocation avoids the problems of the contiguous allocation scheme. Each file is a linked list of disk blocks, and the blocks may be scattered anywhere on the disk. The directory points to the first block, and each block points to the next. (Of course, the links take up some of the space in each block.)
For a new file, the system creates a new entry in the directory; no final size is needed. The pointer is set to null, and each request for space requires the space management system to find a free block and link it in.
There is no external fragmentation, and the file can grow. The disk never needs to be compacted because of this kind of allocation.
Major disadvantage: it is effective only for sequential access. To reach a given block we must follow the pointers from the start of the file until we find it, so it is not efficient if we need a direct-access capability. The pointers also take up space, which adds up.

Linked Allocation
Often, clusters of blocks are allocated rather than single blocks. The pointers then occupy much less space, and efficiency improves because the blocks of a cluster are located together. Clusters are used in most systems.
There are inherent dangers in linked allocation, chiefly dropping a pointer. A bad pointer could link into a protected area, could link into some other file, or could simply lose your data!
Potential solution 1 (often used): a doubly linked list.
Potential solution 2: store the file name and relative block number in each block. But this requires more space, and these links add up!

Linked Allocation
Note: only the starting location is stored in the directory. All else is linked!

Linked Allocation (Cont.)
Many disks use a FAT (File Allocation Table), which is a section of the disk located at the beginning of each volume. The directory has one entry per file, and this entry points into the FAT. The FAT is indexed by block number, and each entry contains the block number of the next block in the file. The entry for a file's final block holds a special end-of-file mark (see next slide). Unused blocks have a table value of 0.
When more space is needed for a file, the file management system finds an available block (value 0 in the FAT), writes that block number into the entry that held the end-of-file mark, and puts the end-of-file mark in the new block's entry.
Downside: this scheme may result in a lot of disk head movement, which definitely slows things down. Solution: cache the FAT.
Advantage: random access is greatly improved, because the location of any block can be found by reading the FAT alone, particularly if the FAT is in cache.
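A toy FAT makes the chain-following and the append step concrete. This is a sketch under assumptions: the table is a Python list indexed by block number, 0 marks a free entry as described above, -1 is an assumed end-of-file mark, and block 0 is treated as reserved so that 0 is never a valid "next block" value. The table contents are made up.

```python
FREE, EOF = 0, -1   # 0 = free entry (as in the text); -1 = assumed end-of-file mark


def file_blocks(fat, first):
    """Follow the chain for one file, starting from its directory entry."""
    chain, block = [], first
    while block != EOF:
        chain.append(block)
        block = fat[block]
    return chain


def append_block(fat, first):
    """Extend a file by one block; returns the block that was added."""
    # Block 0 is reserved in this sketch, so the search starts at 1.
    new = next((b for b in range(1, len(fat)) if fat[b] == FREE), None)
    if new is None:
        raise OSError("no free blocks")
    last = file_blocks(fat, first)[-1]
    fat[last] = new      # the old end-of-file entry now points at the new block
    fat[new] = EOF       # the new block becomes the end of the file
    return new


# Hypothetical 8-entry FAT holding one file: 2 -> 5 -> 3 -> EOF.
fat = [FREE, FREE, 5, EOF, FREE, 3, FREE, FREE]
before = file_blocks(fat, first=2)
added = append_block(fat, first=2)
after = file_blocks(fat, first=2)
print(before, added, after)
```

Note that finding the third block of the file touches only the table, never the data blocks, which is why a cached FAT makes random access so much cheaper than plain linked allocation.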

File-Allocation Table

Indexed Allocation
With linked allocation we have no external fragmentation problem and no size declaration problem, but without a FAT we also have no direct access, because the pointers to the blocks are stored within the blocks themselves and must be retrieved one at a time.
Indexed allocation brings all the pointers (links) together into an index block. Each file has its own index, built as an array of block addresses. To access logical block i, we look up entry i of the index, which gives the disk location of that block.
Indexed allocation supports direct access with no external fragmentation. Any free block will do when a block needs to be added to the file.
Pointer overhead is greater than with linked allocation because each file has a separate index, which itself occupies at least one block of disk storage. (Of course, it can be cached during use, perhaps.)
So how large should the index block be? We want it small, since every indexed file has one, but we also want enough entries to support large files. If one block is not enough, we may need to link several index blocks. There are several implementations of this, as we shall see.
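A minimal sketch of a single-index-block file, assuming the index is simply an array whose entry i holds the disk address of logical block i. The class name, the capacity parameter, and the block numbers are hypothetical.

```python
class IndexedFile:
    """One index block holding an array of data-block addresses."""

    def __init__(self, max_entries):
        self.max_entries = max_entries   # capacity of the single index block
        self.index = []                  # index[i] = disk block of logical block i

    def add_block(self, disk_block):
        if len(self.index) == self.max_entries:
            raise OSError("index block is full; a larger scheme is needed")
        self.index.append(disk_block)    # any free block will do

    def block_of(self, i):
        if not 0 <= i < len(self.index):
            raise IndexError("logical block outside the file")
        return self.index[i]             # direct access: one index lookup


f = IndexedFile(max_entries=4)
for disk_block in (9, 16, 1, 10):        # scattered blocks, no contiguity needed
    f.add_block(disk_block)
print(f.block_of(2))
```

The capacity check is where the sizing question bites: once the single index block is full, the file cannot grow unless index blocks are linked or arranged in levels.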

Example of Indexed Allocation

Structure of the Index Block
Linked scheme: the index is usually one block long, but we can link several index blocks together for particularly large files.
Multilevel index: the first-level index block contains only pointers to second-level index blocks, which in turn point to the data blocks. IBM uses this organization for its indexed sequential files, which it calls Key Sequenced Data Sets (KSDS). It calls the outermost level the index set, followed by the sequence set, followed by the data themselves, organized into what it calls control areas and control intervals. Note: a two-level index allows a file size of up to 4 GB (with 4 KB blocks).
Combined scheme (used by Unix): keeps the first set of pointers of the index block in the file's inode (FCB to the rest of us). This scheme involves a number of direct and indirect blocks, and we will not spend time on it.
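The 4 GB figure can be checked with a little arithmetic. The sketch below assumes 4-byte block pointers, which the slide does not state; with that assumption a 4 KB index block holds 1,024 pointers, two levels address 1,024 × 1,024 data blocks, and each data block is 4 KB.

```python
def max_file_size(block_size, pointer_size, levels):
    """Largest file addressable through `levels` levels of index blocks."""
    pointers_per_block = block_size // pointer_size
    return pointers_per_block ** levels * block_size


# Assumed: 4 KB blocks and 4-byte pointers.
size = max_file_size(block_size=4096, pointer_size=4, levels=2)
print(size, size // 2**30, "GB")
```

That is 2^20 data blocks of 2^12 bytes each, or 2^32 bytes, which is the 4 GB quoted above. A different pointer size would change the result.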

Indexed Allocation – Mapping (Cont.)
[Figure: two-level index mapping, from the outer index through the index table to the file.]

[Figure: KSDS layout. The index component consists of the index set and, below it, the sequence sets; the data component consists of control areas, each divided into control intervals. Key values are extremely exaggerated.]

Performance
The choice of an allocation method depends largely on how the data need to be accessed.
Contiguous allocation: requires only one access to get a data block. We keep the initial address in memory and calculate disk addresses from it.
Linked allocation: we keep the address of the next block in memory and can read it directly, which is fine for sequential access. Major disadvantage: no efficient random access; reaching a specific block may well require multiple reads.
Some systems therefore use contiguous allocation for files that need direct access and linked allocation for files that are accessed sequentially. The type of access must be declared when the file is created. Sequential files are linked; direct-access files are contiguous and support both direct and sequential access.

Performance - 2
Indexed allocation: if the index is in memory, accesses are quick, but retaining the index in memory requires space. If space is available, this is good. If not, reaching a data block takes two I/Os, one for the index and one for the data, which is not desirable. With multiple index blocks, more reads may be needed. Performance with indexed allocation depends on the index structure, the size of the file, and the position of the desired block. Caching the index block(s) helps significantly if space is available.
There are a number of other approaches to optimization. Your book notes that it is often not unreasonable to add thousands of extra instructions to the operating system to save just a few disk-head movements. “Furthermore, this disparity is increasing over time, to the point where hundreds of thousands of instructions reasonably could be used to optimize head movements.” Discuss.
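As a rough way to compare the three schemes, the sketch below counts the disk reads needed to fetch one logical block. It is a simplified model built on assumptions: the file's starting address is already in memory, plain linked allocation is used with no FAT, nothing is cached unless stated, and seek distance is ignored. The function and parameter names are hypothetical.

```python
def reads_to_reach(i, method, index_levels=1, index_cached=False):
    """Disk reads needed to fetch logical block i (0-based) of a file whose
    starting address is already in memory."""
    if method == "contiguous":
        return 1                      # address is computed, then one read
    if method == "linked":
        return i + 1                  # must read blocks 0..i to follow the pointers
    if method == "indexed":
        # One read per uncached index level, plus the data block itself.
        return 1 if index_cached else index_levels + 1
    raise ValueError(f"unknown method: {method}")


for method in ("contiguous", "linked", "indexed"):
    print(method, reads_to_reach(99, method))
```

For the hundredth block of a file, this model gives 1 read for contiguous, 100 for linked, and 2 for a single uncached index block, which is the pattern described in the two performance slides.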

End of Chapter 11.1
