Today's topics: File System Implementation

Data structures for files
Unix files, inodes

Review: Representing an Open File (3)
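[Figure: the per-process file descriptor table points into the system file table (ref count, access, file pointer, inode pointer), which points into the active inode table; below these sit the buffer cache and the disk. fdr refers to an entry with ref count 1, access r, file pointer 10; fdrw refers to a separate entry with ref count 1, access rw, file pointer 20. Both entries point to the same inode.]

In this slide we see the effect of two opens of the same file within the same process.

    fdrw = open("x", O_RDWR);
    fdr = open("x", O_RDONLY);
    write(fdrw, buf, 20);
    read(fdr, buf2, 10);

Copyright © 2002 Thomas W. Doeppner. All rights reserved.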

[Figure: file descriptor table, system file table, and active inode table. fdrw and fdrw2 both refer to one system-file-table entry with ref count 2, access rw, file pointer 20.]

The dup system call causes two file descriptors to refer to the same file table entry and hence share the offset.

    fdrw = open("x", O_RDWR);
    fdrw2 = dup(fdrw);
    write(fdrw, buf, 20);
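The difference between two opens and a dup can be observed from user code by asking each descriptor for its current offset. The following is a minimal sketch, not part of the original slides: the helper names (`offset_of`, `two_opens`, `dup_shares_offset`) are made up for illustration, and O_CREAT|O_TRUNC is added to the first open only so the example works on a fresh file.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: current file offset of fd (seek by 0 from here). */
static off_t offset_of(int fd) { return lseek(fd, 0, SEEK_CUR); }

/* Two independent opens: each gets its own system-file-table entry,
 * so each has its own offset. Returns 0 on success, -1 on error. */
int two_opens(const char *path, off_t *off_rw, off_t *off_r)
{
    assert(path != NULL && off_rw != NULL && off_r != NULL);
    char buf[20] = {0}, buf2[10];
    int fdrw = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fdrw < 0) return -1;
    int fdr = open(path, O_RDONLY);
    if (fdr < 0) { close(fdrw); return -1; }
    int ok = write(fdrw, buf, 20) == 20 && read(fdr, buf2, 10) == 10;
    *off_rw = offset_of(fdrw);
    *off_r = offset_of(fdr);
    close(fdr);
    close(fdrw);
    return ok ? 0 : -1;
}

/* dup: both descriptors name the same entry, so the offset is shared. */
int dup_shares_offset(const char *path, off_t *off_rw, off_t *off_rw2)
{
    assert(path != NULL && off_rw != NULL && off_rw2 != NULL);
    char buf[20] = {0};
    int fdrw = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fdrw < 0) return -1;
    int fdrw2 = dup(fdrw);
    if (fdrw2 < 0) { close(fdrw); return -1; }
    int ok = write(fdrw, buf, 20) == 20;
    *off_rw = offset_of(fdrw);
    *off_rw2 = offset_of(fdrw2);
    close(fdrw2);
    close(fdrw);
    return ok ? 0 : -1;
}
```

With two opens the offsets come back as 20 and 10, matching the file pointers in the first figure; with dup both come back as 20.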

[Figure: file descriptor table, system file table, and active inode table after a fork. The entry for fdr now has ref count 2 (access r, file pointer 10); the entry shared by fdrw and fdrw2 now has ref count 4 (access rw, file pointer 20).]

If our process executes a fork system call, creating a child process, the child is given a file-descriptor table that's a copy of its parent's. Of course, the reference counts on the system-file-table entries are increased appropriately.

    fork();
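Because the child's descriptor table is a copy that points at the same system-file-table entries, a write in the child moves the offset the parent sees. A minimal sketch of that effect follows; the function name and the choice to have the child write 20 bytes are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes 20 bytes through the inherited descriptor; the parent
 * then reports the offset it observes on its own copy of the descriptor.
 * Returns that offset, or -1 on error. */
off_t offset_after_child_write(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fd); return -1; }
    if (pid == 0) {
        /* Child: same file table entry as the parent, hence same offset. */
        char buf[20] = {0};
        _exit(write(fd, buf, sizeof buf) == 20 ? 0 : 1);
    }

    int status;
    off_t off = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        off = lseek(fd, 0, SEEK_CUR);   /* parent never wrote, yet offset moved */
    close(fd);
    return off;
}
```

The parent performs no I/O of its own, yet the offset it reads back is 20.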

Review: I/O Redirection
% who > file &

    if (fork() == 0) {
        char *args[] = {"who", 0};
        close(1);
        open("file", O_WRONLY|O_CREAT|O_TRUNC, 0666);
        execv("who", args);
        printf("you screwed up\n");
        exit(1);
    }

This is an example of what a shell might do to handle I/O redirection: it first creates a new process in which to run a command (“who”, in this case). In the new process it closes file descriptor 1 (standard output, to which normal output is written). It then opens “file”. The arguments indicate that “file” is only to be written to, that any prior contents are erased, and that if “file” didn’t already exist, it is created (this is what the O_CREAT flag requests) with read and write permission for all, subject to the process’s umask. Because open returns the lowest available file descriptor, and file descriptor 0 is not available (it’s assigned to standard input), file descriptor 1 is assigned to “file”. Assuming that execv succeeds, when “who” runs, its output is written to “file”.

Note that the parent process does not wait for its child to terminate; it goes on to execute further commands. (This behavior occurs because we’ve placed an “&” at the end of the command line.)

Note the args argument to execv: by convention, the first argument to each command is the name of the command (“who” in this case). To indicate that there are no further arguments, a zero is supplied. Since execv does not search PATH, its first argument must be a path to the program; “who” stands in for that path here.

Note that we aren’t checking for errors: this is only because doing so would cause the resulting code not to fit in the slide. You should always check for errors.
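Since the slide's code leaves out error checks, here is a sketch of the same close-then-open redirection with the checks filled in. It is an illustration under stated assumptions, not the shell's actual code: the function name `run_redirected` is made up, execvp is swapped in for execv so the command is found via PATH, and the parent waits for the child (the foreground case, without “&”).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with standard output redirected to outfile and waits for it.
 * Returns the command's exit status, or -1 if it could not be run or waited for. */
int run_redirected(char *const argv[], const char *outfile)
{
    assert(argv != NULL && argv[0] != NULL && outfile != NULL);

    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return -1; }

    if (pid == 0) {
        if (close(1) < 0) { perror("close"); _exit(126); }
        /* open returns the lowest free descriptor, which is 1 as long as
         * descriptor 0 (standard input) is still in use. */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0) { perror("open"); _exit(126); }
        if (fd != 1) {
            fprintf(stderr, "expected descriptor 1, got %d\n", fd);
            _exit(126);
        }
        execvp(argv[0], argv);      /* only returns on failure */
        perror("execvp");
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) { perror("waitpid"); return -1; }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with `{"who", NULL}` and `"file"` corresponds to `who > file`. To mimic the “&” form, the parent would skip the waitpid call.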


Unix File System – Inodes
Data structure on disk; one inode per file

Example inode:
  Owner: snt
  Group: cpre308
  Type: regular file
  Perms: rwxr-xr-x
  Accessed: oct pm
  Modified: ….
  Inode modified: …
  Size: bytes
  Disk addresses

Structure of a disk drive

File system layout
A disk is divided into partitions
Each partition can have its own file system
Sector 0 of the disk is the Master Boot Record (MBR)
  Contains the partition table
    Starting and ending addresses of each partition
    One partition is marked active
The first block of a partition is called the boot block
  Contains a program that loads the operating system
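To make the partition table concrete, here is a sketch that decodes one in C. The slides do not give the on-disk layout; the offsets below are those of the conventional PC MBR format (four 16-byte entries at byte 446, a 0x55 0xAA signature at bytes 510 and 511, status byte 0x80 for the active partition), and the struct and function names are made up. That format stores each partition's first sector and sector count, from which the ending address follows as first + count - 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SECTOR_SIZE = 512, TABLE_OFFSET = 446, ENTRY_SIZE = 16, NUM_ENTRIES = 4 };

struct partition {
    int      active;        /* nonzero if this is the active (bootable) partition */
    uint8_t  type;          /* partition type code */
    uint32_t first_sector;  /* starting address (LBA) */
    uint32_t num_sectors;   /* length in sectors */
};

/* Reads a 32-bit little-endian value. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decodes the partition table in sector 0.
 * Returns the index of the active partition, -1 if none is marked active,
 * or -2 if the boot signature is missing. */
int parse_mbr(const uint8_t sector[SECTOR_SIZE], struct partition out[NUM_ENTRIES])
{
    assert(sector != NULL && out != NULL);
    if (sector[510] != 0x55 || sector[511] != 0xAA)
        return -2;

    int active = -1;
    for (int i = 0; i < NUM_ENTRIES; i++) {
        const uint8_t *e = sector + TABLE_OFFSET + i * ENTRY_SIZE;
        out[i].active = (e[0] == 0x80);
        out[i].type = e[4];
        out[i].first_sector = le32(e + 8);
        out[i].num_sectors = le32(e + 12);
        if (out[i].active && active < 0)
            active = i;
    }
    return active;
}
```

A boot loader following the scheme on the slide would read the block at `first_sector` of the active partition (its boot block) and run the program found there.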

Implementing files: goal
Disk = (long) sequence of blocks
Keep track of the blocks associated with a file

Methods
  Contiguous allocation
  Linked list allocation
  Linked list allocation with a table in memory
  I-nodes

Contiguous Allocation
All disk blocks of a file are allocated sequentially
Advantages
  (Very) fast reads
  Useful for read-only file systems (CD-ROM)
  Keeping track of the blocks of a file is easy
Problems
  Fragmentation with deletes
  File growth might be expensive
[Figure: file A lies between X and B; when A expands it is moved past B, leaving a hole where it used to be]

Linked List of Blocks
Sequential access is fast, random access is slow

File Allocation Tables (FAT)
One entry per physical disk block; FAT can be in main memory
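A FAT keeps the linked list of blocks but moves the "next" pointers out of the data blocks into one table, so following the chain needs no disk reads once the table is in memory. A minimal sketch, assuming the table is an array of ints indexed by disk block number and that -1 marks the end of a chain (both are illustrative choices, as is the function name):

```c
#include <assert.h>
#include <stddef.h>

#define FAT_EOF (-1)   /* assumed end-of-chain marker */

/* Returns the disk block that holds block n (0-based) of the file whose
 * chain starts at first_block, or FAT_EOF if the file has fewer blocks.
 * Random access still walks n links, but entirely in memory. */
int fat_block(const int *fat, size_t fat_len, int first_block, int n)
{
    assert(fat != NULL && n >= 0);
    int b = first_block;
    while (n-- > 0 && b != FAT_EOF) {
        assert(b >= 0 && (size_t)b < fat_len);  /* chain must stay inside the table */
        b = fat[b];
    }
    return b;
}
```

For a file occupying disk blocks 4, 7, 2, 10, 12 in that order (made-up numbers), the table holds fat[4] = 7, fat[7] = 2, fat[2] = 10, fat[10] = 12, fat[12] = FAT_EOF, and `fat_block(fat, len, 4, 3)` yields 10.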

Disk Map in Unix

[Figure: the disk map of an inode, an array of 13 block pointers]

The purpose of the disk-map portion of the inode is to represent where the blocks of a file are on disk. I.e., it maps block numbers relative to the beginning of a file into block numbers relative to the beginning of the file system. Each block is 1024 (1K) bytes long. (It was 512 bytes long in the original Unix file system.) The data structure allows fast access when a file is accessed sequentially, and, with the help of caching, reasonably fast access when the file is used for paging (and other “random” access).

The disk map consists of 13 pointers to disk blocks, numbered 0 through 12. The first 10 point to the first 10 blocks of the file, so the first 10KB of a file are accessed directly. If the file is larger than 10KB, then pointer number 10 points to a disk block called an indirect block. This block contains up to 256 (4-byte) pointers to data blocks (i.e., 256KB of data). If the file is bigger than this (256KB + 10KB = 266KB), then pointer number 11 points to a double indirect block containing 256 pointers to indirect blocks, each of which contains 256 pointers to data blocks (64MB of data). If the file is bigger than this (64MB + 256KB + 10KB), then pointer number 12 points to a triple indirect block containing up to 256 pointers to double indirect blocks, each of which contains up to 256 pointers to single indirect blocks, each of which contains up to 256 pointers to data blocks (potentially 16GB, although the real limit is 2GB, since the file size, a signed number of bytes, must fit in a 32-bit word).

This data structure allows the efficient representation of sparse files, i.e., files whose content is mainly zeros. Consider, for example, the effect of creating an empty file and then writing one byte at location 2,000,000,000. Only four disk blocks are allocated to represent this file: a triple indirect block, a double indirect block, a single indirect block, and a data block. All pointers in the disk map, except for the last one, are zero. All bytes up to the last one read as zero. This is because a zero pointer is treated as if it points to a block containing all zeros: a zero pointer to an indirect block is treated as if it pointed to an indirect block filled with zero pointers, each of which is treated as if it pointed to a data block filled with zeros. However, one must be careful about copying such a file, since commands such as cp and tar actually attempt to write all the zero blocks!

Optimization for Sparse Files
Suppose a file was large, but mostly zeros
Could be produced using lseek and write
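A sketch of how lseek and write can produce such a file: seek past the end of an empty file and write a single byte. The function name is made up, and whether the skipped range really occupies no disk blocks depends on the file system in use.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Creates path, seeks to offset, and writes one byte there, leaving a hole
 * in front of it. Fills *st with the resulting file status.
 * Returns 0 on success, -1 on error. */
int make_sparse(const char *path, off_t offset, struct stat *st)
{
    assert(path != NULL && st != NULL && offset >= 0);
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    int ok = lseek(fd, offset, SEEK_SET) == offset  /* move past end of file */
          && write(fd, "x", 1) == 1                 /* file size becomes offset + 1 */
          && fstat(fd, st) == 0;
    close(fd);
    return ok ? 0 : -1;
}
```

Afterwards `st->st_size` is offset + 1, while `st->st_blocks` (counted in 512-byte units) is typically far smaller than the size would suggest on file systems that leave holes unallocated.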

Additional Enhancements
Performance depends on: how many disk accesses are needed to read a file?
Store some data in the inode itself
  Perhaps the whole file will fit in!
  Need only 1 disk access for a small file
Increase block size

Implementing directories
Goal: keep track of which files are in a directory

Unix Directory (V7)
Directories are files whose data is a list of filenames & inodes
Max filename size = 14 chars

  filename (14 bytes)   inode number (2 bytes)
  .                     12
  ..                    14
  etc                   134
  mail                  346
  crash                 5
  init                  175
  mount                 586

Example inode:
  Owner: snt
  Group: cpre308
  Type: regular file
  Perms: rwxr-xr-x
  Accessed: oct pm
  Modified: ….
  Inode modified: …
  Size: bytes
  Disk addresses
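The 16-byte entry format lends itself to a simple linear search. A minimal sketch follows, with made-up names (`v7_dirent`, `dir_lookup`). Two details come from V7 itself rather than from the slide: the inode number is stored before the name, and inode number 0 marks an unused entry. Rejecting over-long names instead of truncating them is a simplification.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NAME_LEN = 14 };

struct v7_dirent {
    uint16_t ino;             /* inode number; 0 marks an unused entry */
    char     name[NAME_LEN];  /* NUL-padded; no terminator if exactly 14 chars */
};

/* Searches n directory entries for name.
 * Returns the inode number, or 0 if the name is not present. */
uint16_t dir_lookup(const struct v7_dirent *dir, size_t n, const char *name)
{
    assert(dir != NULL && name != NULL);
    if (strlen(name) > NAME_LEN)
        return 0;                       /* simplification: no truncation */
    for (size_t i = 0; i < n; i++)
        if (dir[i].ino != 0 && strncmp(dir[i].name, name, NAME_LEN) == 0)
            return dir[i].ino;
    return 0;
}
```

With the entries shown on the slide, looking up "mail" returns 346 and looking up ".." returns 14.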

The UNIX V7 File System
The steps in looking up /usr/ast/mbox
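The lookup proceeds one path component at a time: start at the root directory, find "usr" there to get its inode number, read that directory, find "ast", and so on until the last component. The sketch below walks a toy in-memory file system rather than a real disk. The `toy_dir` table stands in for "read the inode, then read its directory blocks", and all names and inode numbers are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NAME_LEN = 14 };

struct dirent16 { uint16_t ino; char name[NAME_LEN]; };

/* Toy stand-in for a directory stored on disk: its inode number and entries. */
struct toy_dir { uint16_t ino; const struct dirent16 *entries; size_t count; };

static const struct toy_dir *find_dir(const struct toy_dir *dirs, size_t ndirs,
                                      uint16_t ino)
{
    for (size_t i = 0; i < ndirs; i++)
        if (dirs[i].ino == ino)
            return &dirs[i];
    return NULL;
}

/* Resolves an absolute path, starting from inode `root`.
 * Returns the inode number of the final component, or 0 if lookup fails. */
uint16_t namei(const struct toy_dir *dirs, size_t ndirs, uint16_t root,
               const char *path)
{
    assert(dirs != NULL && path != NULL && path[0] == '/');
    uint16_t cur = root;
    while (*path != '\0') {
        while (*path == '/') path++;        /* skip separators */
        size_t len = strcspn(path, "/");    /* length of the next component */
        if (len == 0) break;                /* trailing slash or end of path */
        if (len > NAME_LEN) return 0;

        const struct toy_dir *d = find_dir(dirs, ndirs, cur);
        if (d == NULL) return 0;            /* current inode is not a directory */

        uint16_t next = 0;
        for (size_t i = 0; i < d->count; i++) {
            const struct dirent16 *e = &d->entries[i];
            if (e->ino != 0 && strncmp(e->name, path, len) == 0 &&
                (len == NAME_LEN || e->name[len] == '\0'))
                next = e->ino;              /* exact match on this component */
        }
        if (next == 0) return 0;            /* component not found */
        cur = next;
        path += len;
    }
    return cur;
}
```

With a root directory mapping "usr" to inode 6, a /usr directory mapping "ast" to inode 26, and a /usr/ast directory mapping "mbox" to inode 60, `namei` on "/usr/ast/mbox" visits three directories and returns 60. On a real disk each step costs at least one inode read and one directory block read.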
